While competing methods such as gradient descent for constrained optimization require a projection step back to the feasible set in each iteration, the Frank–Wolfe algorithm only needs the solution of a linear problem over the same set in each iteration, and automatically stays in the feasible set. The convergence of the Frank–Wolfe algorithm is sublinear in general: the gap between the objective value and the optimum is O(1/k) after k iterations, provided the gradient is Lipschitz continuous with respect to some norm.
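In symbols, writing <math>D</math> for the feasible set, one iteration of the method can be stated as follows, where the step size <math>\gamma_k = \tfrac{2}{k+2}</math> is one standard choice, given here for concreteness:

:<math>s_k = \arg\min_{s \in D} \langle s, \nabla f(x_k) \rangle, \qquad x_{k+1} = x_k + \gamma_k\,(s_k - x_k),</math>

and with an <math>L</math>-Lipschitz gradient this step size yields a guarantee of the form <math>f(x_k) - f(x^*) \le \tfrac{2 L \,\operatorname{diam}(D)^2}{k+2}</math>.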
The same convergence rate can also be shown if the sub-problems are only solved approximately. The iterates of the algorithm can always be represented as a sparse convex combination of the extreme points of the feasible set, which has contributed to the popularity of the algorithm for sparse greedy optimization in machine learning and signal processing.
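For illustration, the following is a minimal Python sketch of such a sparse greedy iteration, assuming a simple quadratic objective over the probability simplex; the objective, the names, and the choice of feasible set are illustrative assumptions, not prescribed by the text above. On the simplex the linear subproblem is solved by inspecting the gradient, and every iterate is by construction a convex combination of the (at most k+1) extreme points selected so far.

<syntaxhighlight lang="python">
import numpy as np

def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex: a linear
    function is minimized at the vertex e_i with the smallest gradient entry."""
    s = np.zeros_like(grad)
    s[np.argmin(grad)] = 1.0
    return s

def frank_wolfe(grad_f, lmo, x0, num_iters=1000):
    x = x0
    for k in range(num_iters):
        s = lmo(grad_f(x))        # linear problem over the feasible set; no projection
        gamma = 2.0 / (k + 2.0)   # standard step size behind the O(1/k) rate
        x = x + gamma * (s - x)   # convex combination, so x stays feasible
    return x

# Illustrative objective f(x) = ||x - b||^2 with gradient 2 (x - b).
b = np.array([0.6, 0.3, -0.2])
x = frank_wolfe(lambda x: 2.0 * (x - b), lmo_simplex, x0=np.ones(3) / 3.0)
</syntaxhighlight>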
The algorithm has also been used, for example, for the optimization of minimum-cost flows in transportation networks. If the feasible set is given by a set of linear constraints, then the subproblem to be solved in each iteration becomes a linear program.
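As a sketch of this case, the per-iteration subproblem over a polyhedron <math>\{x : Ax \le b\}</math> can simply be handed to a linear programming solver; here scipy.optimize.linprog is used, and the constraint matrices below, encoding the unit square, are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import linprog

def lmo_polytope(grad, A_ub, b_ub):
    """One Frank–Wolfe subproblem: minimize grad . x subject to A_ub x <= b_ub."""
    res = linprog(c=grad, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    return res.x

# Illustrative feasible set: the unit square 0 <= x_i <= 1 as linear constraints.
A_ub = np.vstack([np.eye(2), -np.eye(2)])
b_ub = np.array([1.0, 1.0, 0.0, 0.0])
s = lmo_polytope(np.array([1.0, -1.0]), A_ub, b_ub)  # a vertex, here (0, 1)
</syntaxhighlight>

Since a linear program attains its minimum at a vertex of the polyhedron, the oracle returns an extreme point of the feasible set, which is what makes the sparse representation of the iterates mentioned above possible.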
While the worst-case convergence rate of O(1/k) cannot be improved in general, faster convergence can be obtained for special problem classes, such as some strongly convex problems.

==Lower bounds on the solution value, and primal-dual analysis==