Several algorithms are available to solve knapsack problems, based on the dynamic programming approach, the
branch and bound approach or
hybridizations of both approaches.
Dynamic programming in-advance algorithm The
unbounded knapsack problem (
UKP) places no restriction on the number of copies of each kind of item. Besides, here we assume that x_i > 0 : m[w']= \max \left( \sum_{i=1}^n v_i x_i \right) : subject to \sum_{i=1}^n w_i x_i \leq w' and x_i > 0 Observe that m[w] has the following properties: 1. m[0]=0\,\! (the sum of zero items, i.e., the summation of the
empty set). 2. m[w]=\max (v_1+m[w-w_1],v_2+m[w-w_2],...,v_n+m[w-w_n]) , w_i \leq w, where v_i is the value of the i-th kind of item. The second property needs to be explained in detail. During the process of the running of this method, how do we get the weight w? There are only i ways and the previous weights are w-w_1, w-w_2,..., w-w_i where there are total i kinds of different item (by saying different, we mean that the weight and the value are not completely the same). If we know each value of these i items and the related maximum value previously, we just compare them to each other and get the maximum value ultimately and we are done. Here the maximum of the empty set is taken to be zero. Tabulating the results from m[0] up through m[W] gives the solution. Since the calculation of each m[w] involves examining at most n items, and there are at most W values of m[w] to calculate, the running time of the dynamic programming solution is
O(nW). Dividing w_1,\,w_2,\,\ldots,\,w_n,\,W by their
greatest common divisor is a way to improve the running time. Even if
P≠NP, the O(nW) complexity does not contradict the fact that the knapsack problem is
NP-complete, since W, unlike n, is not polynomial in the length of the input to the problem. The length of the W input to the problem is proportional to the number of bits in W, \log W, not to W itself. However, since this runtime is
pseudopolynomial, this makes the (decision version of the) knapsack problem a
weakly NP-complete problem.
0-1 knapsack problem A similar dynamic programming solution for the 0-1 knapsack problem also runs in pseudo-polynomial time. Assume w_1,\,w_2,\,\ldots,\,w_n,\, W are strictly positive integers. Define m[i,w] to be the maximum value that can be attained with weight less than or equal to w using items up to i (first i items). We can define m[i,w] recursively as follows:
(Definition A) • m[0,\,w]=0 • m[i,\,w]=m[i-1,\,w] if w_i > w\,\! (the new item is more than the current weight limit) • m[i,\,w]=\max(m[i-1,\,w],\,m[i-1,w-w_i]+v_i) if w_i \leqslant w. The solution can then be found by calculating m[n,W]. To do this efficiently, we can use a table to store previous computations. The following is
pseudocode for the dynamic program: // Input: // Values (stored in array v) // Weights (stored in array w) // Number of distinct items (n) // Knapsack capacity (W) // NOTE: The array "v" and array "w" are assumed to store all relevant values starting at index 1. array m[0..n, 0..W]; for j from 0 to W do: m[0, j] := 0 for i from 1 to n do: m[i, 0] := 0 for i from 1 to n do: for j from 1 to W do: if w[i] > j then: m[i, j] := m[i-1, j] else: m[i, j] := max(m[i-1, j], m[i-1, j-w[i + v[i]) This solution will therefore run in O(nW) time and O(nW) space. (If we only need the value m[n,W], we can modify the code so that the amount of memory required is O(W) which stores the recent two lines of the array "m".) However, if we take it a step or two further, we should know that the method will run in the time between O(nW) and O(2^n). From
Definition A, we know that there is no need to compute all the weights when the number of items and the items themselves that we chose are fixed. That is to say, the program above computes more than necessary because the weight changes from 0 to W often. From this perspective, we can program this method so that it runs recursively. // Input: // Values (stored in array v) // Weights (stored in array w) // Number of distinct items (n) // Knapsack capacity (W) // NOTE: The array "v" and array "w" are assumed to store all relevant values starting at index 1. Define value[n, W] Initialize all value[i, j] = -1 Define m:=(i,j) // Define function m so that it represents the maximum value we can get under the condition: use first i items, total weight limit is j { if i == 0 or j j then: // item cannot fit in the bag value[i, j] = value[i-1, j] else: if (value[i-1, j-w[i == -1) then: // m[i-1,j-w[i has not been calculated, we have to call function m m(i-1, j-w[i]) value[i, j] = max(value[i-1,j], value[i-1, j-w[i + v[i]) } Run m(n, W) For example, there are 10 different items and the weight limit is 67. So, \begin{align} &w[ 1]= 23 ,w[ 2]= 26,w[ 3]= 20,w[ 4]= 18,w[ 5]= 32,w[ 6]= 27,w[ 7]= 29,w[ 8]= 26,w[ 9]= 30,w[ 10]= 27 \\ &v[ 1]=505 ,v[ 2]=352,v[ 3]=458,v[ 4]=220,v[ 5]=354,v[ 6]=414,v[ 7]=498,v[ 8]=545,v[ 9]=473,v[ 10]=543 \\ \end{align} If you use above method to compute for m(10,67), you will get this, excluding calls that produce m(i,j) = 0: \begin{align} &m(10, 67) = 1270\\ &m(9, 67) = 1270, m(9, 40) = 678\\ &m(8, 67) = 1270, m(8, 40) = 678, m(8, 37) = 545\\ &m(7, 67) = 1183, m(7, 41) = 725, m(7, 40) = 678, m(7, 37) = 505\\ &m(6, 67) = 1183, m(6, 41) = 725, m(6, 40) = 678, m(6, 38) = 678, m(6, 37) = 505\\ &m(5, 67) = 1183, m(5, 41) = 725, m(5, 40) = 678, m(5, 38) = 678, m(5, 37) = 505\\ &m(4, 67) = 1183, m(4, 41) = 725, m(4, 40) = 678, m(4, 38) = 678, m(4, 37) = 505, m(4, 35) = 505\\ &m(3, 67) = 963, m(3, 49) = 963, m(3, 41) = 505, m(3, 40) = 505, m(3, 38) = 505, m(3, 37) = 505, m(3, 35) = 505, m(3, 23) = 505, m(3, 22) = 458, m(3, 20) = 458\\ &m(2, 67) = 857, m(2, 49) = 857, m(2, 47) = 505, m(2, 41) = 505, m(2, 40) = 505, m(2, 38) = 505, m(2, 37) = 505, m(2, 35) = 505, m(2, 29) = 505, m(2, 23) = 505\\ &m(1, 67) = 505, m(1, 49) = 505, m(1, 47) = 505, m(1, 41) = 505, m(1, 40) = 505, m(1, 38) = 505, m(1, 37) = 505, m(1, 35) = 505, m(1, 29) = 505, m(1, 23) = 505\\ \end{align} Besides, we can break the recursion and convert it into a tree. Then we can cut some leaves and use
parallel computing to expedite the running of this method. To find the actual subset of items, rather than just their total value, we can run this after running the function above: /** * Returns the indices of the items of the optimal knapsack. * i: We can include items 1 through i in the knapsack * j: maximum weight of the knapsack */ function knapsack(i: int, j: int): Set { if i == 0 then: return {} if m[i, j] > m[i-1, j] then: return {i} ∪ knapsack(i-1, j-w[i]) else: return knapsack(i-1, j) } knapsack(n, W)
Meet-in-the-middle Another algorithm for 0-1 knapsack, discovered in 1974 and sometimes called "meet-in-the-middle" due to parallels to
a similarly named algorithm in cryptography, is exponential in the number of different items but may be preferable to the DP algorithm when W is large compared to
n. In particular, if the w_i are nonnegative but not integers, we could still use the dynamic programming algorithm by scaling and rounding (i.e. using
fixed-point arithmetic), but if the problem requires d fractional digits of precision to arrive at the correct answer, W will need to be scaled by 10^d, and the DP algorithm will require O(W10^d) space and O(nW10^d) time.
algorithm Meet-in-the-middle
is input: A set of items with weights and values.
output: The greatest combined value of a subset. partition the set {1...
n} into two sets
A and
B of approximately equal size compute the weights and values of all subsets of each set
for each subset
of A do find the subset of
B of greatest value such that the combined weight is less than
W keep track of the greatest combined value seen so far The algorithm takes O(2^{n/2}) space, and efficient implementations of step 3 (for instance, sorting the subsets of B by weight, discarding subsets of B which weigh more than other subsets of B of greater or equal value, and using
binary search to find the best match) result in a runtime of O(n2^{n/2}). As with the
meet in the middle attack in cryptography, this improves on the O(n2^n) runtime of a naive brute force approach (examining all subsets of \{1...n\}), at the cost of using exponential rather than constant space (see also
baby-step giant-step). The current state of the art improvement to the meet-in-the-middle algorithm, using insights from Schroeppel and Shamir's Algorithm for Subset Sum, provides as a corollary a
randomized algorithm for Knapsack which preserves the O^{*}(2^{n/2}) (up to polynomial factors) running time and reduces the space requirements to O^{*}(2^{0.249999n}) (see Corollary 1.4). In contrast, the best known
deterministic algorithm runs in O^{*}(2^{n/2}) time with a slightly worse space complexity of O^{*}(2^{n/4}).
Approximation Algorithms As for most NP-complete problems, it may be enough to find workable solutions even if they are not optimal. Preferably, however, the approximation comes with a guarantee of the difference between the value of the solution found and the value of the optimal solution. As with many useful but computationally complex algorithms, there has been substantial research on creating and analyzing algorithms that approximate a solution. The knapsack problem, though NP-Hard, is one of a collection of algorithms that can still be approximated to any specified degree. This means that the problem has a polynomial time approximation scheme. To be exact, the knapsack problem has a
fully polynomial time approximation scheme (FPTAS).
Greedy approximation algorithm George Dantzig proposed a
greedy approximation algorithm to solve the unbounded knapsack problem. His version sorts the items in decreasing order of value per unit of weight, v_1/w_1\ge\cdots\ge v_n/w_n. It then proceeds to insert them into the sack, starting with as many copies as possible of the first kind of item until there is no longer space in the sack for more. Provided that there is an unlimited supply of each kind of item, if m is the maximum value of items that fit into the sack, then the greedy algorithm is guaranteed to achieve at least a value of m/2. For the bounded problem, where the supply of each kind of item is limited, the above algorithm may be far from optimal. Nevertheless, a simple modification allows us to solve this case: Assume for simplicity that all items individually fit in the sack (w_i \le W for all i). Construct a solution S_1 by packing items greedily as long as possible, i.e. S_1=\left\{1,\ldots,k\right\} where k=\textstyle\max_{1\le k'\le n}\textstyle\sum_{i=1}^{k'} w_i\le W. Furthermore, construct a second solution S_2=\left\{k+1\right\} containing the first item that did not fit. Since S_1\cup S_2 provides an upper bound for the
LP relaxation of the problem, one of the sets must have value at least m/2; we thus return whichever of S_1 and S_2 has better value to obtain a 1/2-approximation. It can be shown that the average performance converges to the optimal solution in distribution at the error rate n^{-1/2}
Fully polynomial time approximation scheme The
fully polynomial time approximation scheme (FPTAS) for the knapsack problem takes advantage of the fact that the reason the problem has no known polynomial time solutions is because the profits associated with the items are not restricted. If one rounds off some of the least significant digits of the profit values then they will be bounded by a polynomial and 1/ε where ε is a bound on the correctness of the solution. This restriction then means that an algorithm can find a solution in polynomial time that is correct within a factor of (1-ε) of the optimal solution. {H} = -\sum_{i=1}^n v_i x_i + P \left(\sum_{i=1}^n w_i x_i - W\right)^2, where P is the penalty constant which is determined by case-specific fine-tuning.
Dominance relations Solving the unbounded knapsack problem can be made easier by throwing away items which will never be needed. For a given item i, suppose we could find a set of items J such that their total weight is less than the weight of i, and their total value is greater than the value of i. Then i cannot appear in the optimal solution, because we could always improve any potential solution containing i by replacing i with the set J. Therefore, we can disregard the i-th item altogether. In such cases, J is said to
dominate i. (Note that this does not apply to bounded knapsack problems, since we may have already used up the items in J.) Finding dominance relations allows us to significantly reduce the size of the search space. There are several different types of
dominance relations, which all satisfy an inequality of the form: \qquad \sum_{j \in J} w_j\,x_j \ \le \alpha\,w_i, and \sum_{j \in J} v_j\,x_j \ \ge \alpha\,v_i\, for some x \in Z _+^n where \alpha\in Z_+ \,,J\subsetneq N and i\not\in J. The vector x denotes the number of copies of each member of J. ;Collective dominance: The i-th item is
collectively dominated by J, written as i\ll J, if the total weight of some combination of items in J is less than
wi and their total value is greater than
vi. Formally, \sum_{j \in J} w_j\,x_j \ \le w_i and \sum_{j \in J} v_j\,x_j \ \ge v_i for some x \in Z _+^n , i.e. \alpha=1. Verifying this dominance is computationally hard, so it can only be used with a dynamic programming approach. In fact, this is equivalent to solving a smaller knapsack decision problem where V = v_i, W = w_i, and the items are restricted to J. ;Threshold dominance: The i-th item is
threshold dominated by J, written as i\prec\prec J, if some number of copies of i are dominated by J. Formally, \sum_{j \in J} w_j\,x_j \ \le \alpha\,w_i, and \sum_{j \in J} v_j\,x_j \ \ge \alpha\,v_i\, for some x \in Z _+^n and \alpha\geq 1. This is a generalization of collective dominance, first introduced in and used in the EDUK algorithm. The smallest such \alpha defines the
threshold of the item i, written t_i =(\alpha-1)w_i. In this case, the optimal solution could contain at most \alpha-1 copies of i. ;Multiple dominance: The i-th item is
multiply dominated by a single item j, written as i\ll_{m} j, if i is dominated by some number of copies of j. Formally, w_j\,x_j \ \le w_i, and v_j\,x_j \ \ge v_i for some x_j \in Z _+ i.e. J=\{j\}, \alpha=1, x_j=\lfloor \frac{w_i}{w_j}\rfloor. This dominance could be efficiently used during preprocessing because it can be detected relatively easily. ;Modular dominance: Let b be the
best item, i.e. \frac{v_b}{w_b}\ge\frac{v_i}{w_i}\, for all i. This is the item with the greatest density of value. The i-th item is
modularly dominated by a single item j, written as i\ll_\equiv j, if i is dominated by j plus several copies of b. Formally, w_j+tw_b \le w_i, and v_j +tv_b \ge v_i i.e. J=\{b,j\}, \alpha=1, x_b=t, x_j=1. == Variations ==