The token bucket algorithm can be conceptually understood as follows:
* A token is added to the bucket every 1/r seconds.
* The bucket can hold at most b tokens. If a token arrives when the bucket is full, it is discarded.
* When a packet (network layer PDU) of n bytes arrives:
** if at least n tokens are in the bucket, n tokens are removed from the bucket, and the packet is sent to the network;
** if fewer than n tokens are available, no tokens are removed from the bucket, and the packet is considered to be non-conformant.
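The steps above can be sketched in Python. This is an illustrative sketch, not a reference implementation: rather than adding one token every 1/r seconds, it accumulates the tokens earned since the last check, which is equivalent and avoids a timer. The class and method names are hypothetical.

```python
import time

class TokenBucket:
    """Illustrative token bucket: fill rate r tokens/second, capacity b.

    Tokens are accumulated lazily on each check instead of one every
    1/r seconds; any tokens beyond the capacity b are discarded.
    """

    def __init__(self, r, b):
        self.r = r           # token fill rate (tokens per second)
        self.b = b           # bucket capacity (maximum tokens)
        self.tokens = b      # start with a full bucket
        self.last = time.monotonic()

    def conforms(self, n):
        """If the packet of n bytes conforms, remove n tokens and return True;
        otherwise leave the bucket untouched and return False."""
        now = time.monotonic()
        # Credit tokens for the elapsed time, capped at the capacity b.
        self.tokens = min(self.b, self.tokens + (now - self.last) * self.r)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n  # conformant: packet is sent to the network
            return True
        return False          # non-conformant: no tokens are removed

bucket = TokenBucket(r=100, b=500)
print(bucket.conforms(400))  # True: the full bucket covers 400 tokens
print(bucket.conforms(400))  # False: only ~100 tokens remain
```

A caller would then forward, queue, or drop the packet depending on the returned conformance, as described under "Uses" below.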
== Variations ==
Implementers of this algorithm on platforms lacking the clock resolution necessary to add a single token to the bucket every 1/r seconds may want to consider an alternative formulation. Given the ability to update the token bucket every S milliseconds, the number of tokens to add every S milliseconds is (r*S)/1000.
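The batched-update formula is straightforward to express directly; the function name below is hypothetical:

```python
def tokens_per_update(r, s_ms):
    """Tokens to add at each update when the bucket is refreshed every
    s_ms milliseconds at a fill rate of r tokens/second: (r * s_ms) / 1000."""
    return (r * s_ms) / 1000

# At r = 250 tokens/s with a 40 ms update timer, each tick adds 10 tokens.
print(tokens_per_update(250, 40))  # 10.0
```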
== Properties ==

=== Average rate ===
Over the long run the output of conformant packets is limited by the token rate, r.
=== Burst size ===
Let M be the maximum possible transmission rate in bytes/second. Then

:<math>T_\text{max} = \begin{cases} b/(M - r) & \text{if } r < M \\ \infty & \text{otherwise} \end{cases}</math>

is the maximum burst time, that is, the time for which the rate M is fully utilized. The maximum burst size is thus

:<math>B_\text{max} = T_\text{max} \cdot M</math>
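As a worked sketch of these two formulas (the function name and parameter values are illustrative), note that the burst size can exceed the bucket depth b, because tokens continue to arrive at rate r while the burst is being transmitted:

```python
def max_burst(b, r, M):
    """Maximum burst time and size for bucket depth b (bytes),
    token rate r (bytes/s), and peak transmission rate M (bytes/s)."""
    if r >= M:
        # The bucket refills at least as fast as it drains: unbounded burst.
        return float('inf'), float('inf')
    t_max = b / (M - r)        # time for which the rate M is fully utilized
    return t_max, t_max * M    # maximum burst time and maximum burst size

t, size = max_burst(b=10_000, r=1_000_000, M=2_000_000)
print(t)     # 0.01 second burst
print(size)  # 20000.0 bytes, i.e. twice the 10,000-byte bucket depth
```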
== Uses ==
The token bucket can be used in either traffic shaping or traffic policing. In traffic policing, nonconforming packets may be discarded (dropped) or may be reduced in priority (for downstream traffic management functions to drop if there is congestion). In traffic shaping, packets are delayed until they conform. Traffic policing and traffic shaping are commonly used to protect the network against excess or excessively bursty traffic; see bandwidth management and congestion avoidance. Traffic shaping is commonly used in the network interfaces in hosts to prevent transmissions from being discarded by traffic management functions in the network.

The token bucket algorithm is also used in controlling database IO flow. There, the limit applies neither to IOPS nor to bandwidth alone, but to a linear combination of both. By defining tokens as the normalized sum of an IO request's weight and its length, the algorithm ensures that the time derivative of this sum stays below the required threshold.

== Comparison to leaky bucket ==