In a multiprocessor system, task parallelism is achieved when each processor executes a different thread (or process) on the same or different data. The threads may execute the same or different code. In the general case, different execution threads communicate with one another as they work, but this is not a requirement. Communication usually takes place by passing data from one thread to the next as part of a workflow.

As a simple example, consider code running on a 2-processor system (CPUs "a" and "b") in a parallel environment, where we wish to perform tasks "A" and "B". We can tell CPU "a" to do task "A" and CPU "b" to do task "B" simultaneously, thereby reducing the run time of the execution. The tasks can be assigned using conditional statements, as described below.

Task parallelism emphasizes the distributed (parallelized) nature of the processing (i.e. threads), as opposed to the data (data parallelism). Most real programs fall somewhere on a continuum between task parallelism and data parallelism.
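A minimal sketch of this idea in C++, assuming two illustrative functions taskA and taskB (not from the original text): each task is launched on its own thread, and the operating system may schedule the two threads on different processors.

<syntaxhighlight lang="cpp">
#include <iostream>
#include <thread>

// Hypothetical independent tasks "A" and "B".
void taskA() { std::cout << "task A done\n"; }
void taskB() { std::cout << "task B done\n"; }

int main() {
    // Launch each task on its own thread; the OS is free to run
    // them simultaneously on different CPUs (e.g. "a" and "b").
    std::thread a(taskA);
    std::thread b(taskB);

    // Wait for both tasks to finish before continuing.
    a.join();
    b.join();
    return 0;
}
</syntaxhighlight>

Here the two threads execute different code on potentially different data, which is the defining feature of task parallelism.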
Thread-level parallelism (TLP) is the parallelism inherent in an application that runs multiple threads at once. This type of parallelism is found largely in applications written for commercial servers, such as databases. By running many threads at once, these applications are able to tolerate the high amounts of I/O and memory-system latency their workloads can incur: while one thread is delayed waiting for a memory or disk access, other threads can do useful work.

The exploitation of thread-level parallelism has also begun to make inroads into the desktop market with the advent of multi-core microprocessors. This has occurred because, for various reasons, it has become increasingly impractical to increase either the clock speed or the instructions per clock of a single core. If this trend continues, new applications will have to be designed to utilize multiple threads in order to benefit from the increase in potential computing power. This contrasts with previous microprocessor innovations, in which existing code was automatically sped up by running it on a newer, faster computer.

==Example==
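The sketch below, again in C++, illustrates the conditional assignment of tasks described above. It models each CPU as a thread running the same worker function; the worker function and the single-character CPU labels are illustrative assumptions, not a definitive implementation.

<syntaxhighlight lang="cpp">
#include <thread>

void taskA() { /* work for task "A" */ }
void taskB() { /* work for task "B" */ }

// Every CPU runs the same code, but a conditional statement
// selects a different task depending on the CPU's identity.
void worker(char cpu) {
    if (cpu == 'a') {
        taskA();   // CPU "a" performs task "A"
    } else if (cpu == 'b') {
        taskB();   // CPU "b" performs task "B"
    }
}

int main() {
    std::thread a(worker, 'a');
    std::thread b(worker, 'b');
    a.join();
    b.join();
    return 0;
}
</syntaxhighlight>

Both threads execute the same program, but the conditional ensures that different work is performed depending on which processor (here, which thread) runs it.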