Speedup can be defined for two different types of quantities: latency and throughput.
Latency of an architecture is the reciprocal of the execution speed of a task:
: L = \frac{1}{v} = \frac{T}{W},
where
• v is the execution speed of the task;
• T is the execution time of the task;
• W is the execution workload of the task.
Throughput of an architecture is the execution rate of a task:
: Q = \rho vA = \frac{\rho AW}{T} = \frac{\rho A}{L},
where
• ρ is the execution density (e.g., the number of stages in an instruction pipeline for a pipelined architecture);
• A is the execution capacity (e.g., the number of processors for a parallel architecture).

Latency is often measured in seconds per unit of execution workload. Throughput is often measured in units of execution workload per second. Another unit of throughput is instructions per cycle (IPC) and its reciprocal, cycles per instruction (CPI), is another unit of latency. Speedup is dimensionless and defined differently for each type of quantity so that it is a consistent metric.
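
The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a measurement tool: the function names, the default values of 1 for ρ and A, and the sample numbers (1000 units of workload, 2 seconds, 4 processors) are assumptions chosen only to make the arithmetic easy to follow.

```python
def latency(exec_time: float, workload: float) -> float:
    """L = T / W, in seconds per unit of execution workload."""
    return exec_time / workload


def throughput(exec_time: float, workload: float,
               density: float = 1.0, capacity: float = 1.0) -> float:
    """Q = rho * A * W / T, in units of execution workload per second."""
    return density * capacity * workload / exec_time


# Hypothetical task: 1000 units of workload executed in 2 seconds,
# on an architecture with density rho = 1 and capacity A = 4.
L = latency(exec_time=2.0, workload=1000.0)
Q = throughput(exec_time=2.0, workload=1000.0, density=1.0, capacity=4.0)

print(L)  # 0.002 seconds per unit of workload
print(Q)  # 2000.0 units of workload per second
```

With these numbers, Q also equals ρA/L = 4/0.002, which matches the last form of the throughput formula.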
===Speedup in latency===
Speedup in latency is defined by the following formula:
: S_\text{latency} = \frac{L_1}{L_2} = \frac{T_1W_2}{T_2W_1},
where
• S_\text{latency} is the speedup in latency of architecture 2 with respect to architecture 1;
• L_1 is the latency of architecture 1;
• L_2 is the latency of architecture 2.

Speedup in latency can be predicted from Amdahl's law or Gustafson's law.
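
As a small sketch of the latency speedup definition, the following Python function evaluates the ratio of the two latencies directly from execution times and workloads. The function name and the sample measurements (the same workload taking 8 seconds on architecture 1 and 2 seconds on architecture 2) are hypothetical.

```python
def speedup_latency(t1: float, w1: float, t2: float, w2: float) -> float:
    """S_latency = L1 / L2 = (T1 * W2) / (T2 * W1)."""
    return (t1 * w2) / (t2 * w1)


# Hypothetical measurements: the same workload of 100 units takes
# 8 seconds on architecture 1 and 2 seconds on architecture 2.
s_latency = speedup_latency(t1=8.0, w1=100.0, t2=2.0, w2=100.0)
print(s_latency)  # 4.0
```

When both architectures run the same workload, W_1 and W_2 cancel and the speedup reduces to the ratio of execution times, T_1/T_2.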
===Speedup in throughput===
Speedup in throughput is defined by the formula:
: S_\text{throughput} = \frac{Q_2}{Q_1} = \frac{\rho_2A_2T_1W_2}{\rho_1A_1T_2W_1} = \frac{\rho_2A_2}{\rho_1A_1}S_\text{latency},
where
• S_\text{throughput} is the speedup in throughput of architecture 2 with respect to architecture 1;
• Q_1 is the throughput of architecture 1;
• Q_2 is the throughput of architecture 2.

==Examples==
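
As a first illustration, the throughput speedup formula above can be evaluated in Python. This is a sketch under stated assumptions: the function name and all input values are hypothetical, with architecture 2 having four times the execution capacity of architecture 1 and halving the execution time for the same workload.

```python
def speedup_throughput(t1: float, w1: float, rho1: float, a1: float,
                       t2: float, w2: float, rho2: float, a2: float) -> float:
    """S_throughput = (rho2 * A2) / (rho1 * A1) * S_latency."""
    s_latency = (t1 * w2) / (t2 * w1)  # S_latency = L1 / L2
    return (rho2 * a2) / (rho1 * a1) * s_latency


# Hypothetical architectures: same workload (100 units), same density,
# capacity 1 versus 4, execution time 8 s versus 4 s.
s_throughput = speedup_throughput(
    t1=8.0, w1=100.0, rho1=1.0, a1=1.0,
    t2=4.0, w2=100.0, rho2=1.0, a2=4.0,
)
print(s_throughput)  # 8.0
```

Here the latency speedup is 2 and the capacity ratio is 4, so the throughput speedup is their product, 8.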