Multiprocessor system with a shared memory closely connected to the processors.

A symmetric multiprocessing system is a system with centralized shared memory, called main memory (MM), operating under a single operating system with two or more homogeneous processors.

There are two types of systems:
• Uniform memory-access (UMA) system
• NUMA system

=== Uniform memory access (UMA) system ===
• Heterogeneous multiprocessing system
• Symmetric multiprocessing system (SMP)

=== Heterogeneous multiprocessor system ===
A heterogeneous multiprocessing system contains multiple, but not homogeneous, processing units: central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), or any type of application-specific integrated circuit (ASIC). The system architecture allows any accelerator (for instance, a graphics processor) to operate at the same processing level as the system's CPU.

=== Symmetric multiprocessor system ===
A symmetric multiprocessor (SMP) system is a pool of two or more homogeneous processors running under a single operating system (OS) with a centralized, shared main memory. Each processor, executing different programs and working on different sets of data, can share common resources (memory, I/O devices, the interrupt system, and so on) that are connected using a system bus, a crossbar, a mix of the two, or an address bus combined with a data crossbar. Each processor has its own cache memory that acts as a bridge between the processor and main memory. The function of the cache is to reduce the need for main-memory data accesses, thus reducing system-bus traffic. Because every processor reaches the same centralized memory over the same interconnect, memory-access time is uniform (UMA).
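
The effect of private caches on shared-bus traffic can be illustrated with a toy model. The sketch below is not a description of any real SMP implementation: the class names (`SharedBus`, `CachedProcessor`), the direct-mapped cache organization, and the workload parameters are all assumptions chosen for illustration. It models reads only, so cache coherence between processors is deliberately left out.

```python
"""Toy model: private caches reduce traffic on a shared SMP bus (reads only)."""


class SharedBus:
    """Shared path to main memory; counts every transfer that crosses it."""

    def __init__(self, memory_size):
        self.memory = [0] * memory_size
        self.transfers = 0

    def read(self, address):
        self.transfers += 1
        return self.memory[address]


class CachedProcessor:
    """Processor with a private direct-mapped cache (hypothetical model)."""

    def __init__(self, bus, cache_lines):
        self.bus = bus
        self.cache_lines = cache_lines
        self.tags = [None] * cache_lines  # address currently held by each line
        self.data = [0] * cache_lines

    def read(self, address):
        if self.cache_lines == 0:
            # No cache at all: every read goes to main memory over the bus.
            return self.bus.read(address)
        line = address % self.cache_lines
        if self.tags[line] != address:
            # Cache miss: fetch from main memory over the shared bus.
            self.data[line] = self.bus.read(address)
            self.tags[line] = address
        return self.data[line]


def bus_traffic(cache_lines, processors=4, passes=10, working_set=8):
    """Each processor repeatedly reads its own working set; return bus transfers."""
    bus = SharedBus(memory_size=processors * working_set)
    cpus = [CachedProcessor(bus, cache_lines) for _ in range(processors)]
    for index, cpu in enumerate(cpus):
        base = index * working_set
        for _ in range(passes):
            for offset in range(working_set):
                cpu.read(base + offset)
    return bus.transfers


print(bus_traffic(cache_lines=0))  # 320: every read crosses the bus
print(bus_traffic(cache_lines=8))  # 32: only the first pass misses
```

With these assumed parameters, a cache large enough to hold each processor's working set cuts bus transfers by a factor of ten, because only the first pass over the data misses.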

=== cc-NUMA system ===
The SMP system is known to have limited scalability. To overcome this limitation, the architecture called "cc-NUMA" (cache-coherent non-uniform memory access) is normally used. The main characteristic of a cc-NUMA system is a shared global memory that is distributed across the nodes. A processor's access to the memory of a remote component subsystem, or "node", is slower than its access to local memory, which is why the memory access is "non-uniform".

A cc-NUMA system is a cluster of SMP systems, each called a "node", which can have a single processor, a multi-core processor, or a mix of the two, of one or another kind of architecture. The nodes are connected via a high-speed "connection network", which can be a "link" (a single or double-reverse ring, a multi-ring, point-to-point connections, or a mix of these, e.g. IBM Power Systems), a bus interconnection (e.g. NUMAq), a "crossbar", a "segmented bus" (NUMA Bull HN ISI, ex Honeywell), a "mesh router", etc.

cc-NUMA is also called "distributed shared memory" (DSM) architecture.

The difference in access times between local and remote memory can be as much as an order of magnitude, depending on the kind of connection network used (faster with a segmented bus, crossbar, or point-to-point interconnection; slower with serial ring connections).
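
The non-uniform access cost can be made concrete with a small latency model. This is a minimal sketch under stated assumptions: the latency constants (100 ns local, 1000 ns remote, i.e. one order of magnitude apart), the contiguous split of the global address space across nodes, and the function names are all hypothetical and not taken from any particular machine.

```python
"""Toy cc-NUMA latency model: global memory split evenly across nodes."""

LOCAL_NS = 100    # assumed latency of an access to the node's own memory
REMOTE_NS = 1000  # assumed latency of an access to another node's memory


def home_node(address, node_memory_size):
    """Return the node whose local memory holds this global address."""
    return address // node_memory_size


def access_latency(cpu_node, address, node_memory_size):
    """Latency seen by a processor on cpu_node for a single access."""
    if home_node(address, node_memory_size) == cpu_node:
        return LOCAL_NS
    return REMOTE_NS


def average_latency(cpu_node, addresses, node_memory_size):
    """Mean latency over a sequence of accesses issued from cpu_node."""
    total = sum(access_latency(cpu_node, a, node_memory_size) for a in addresses)
    return total / len(addresses)


# A processor on node 0 makes three local accesses and one remote access.
print(average_latency(0, [0, 100, 500, 2000], node_memory_size=1024))  # 325.0
```

Even with only one access in four going to a remote node, the average latency in this model is more than three times the local latency, which is why keeping data local matters so much on such systems.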

(Figure: Examples of interconnection)

To reduce this remote-access penalty, a large remote cache (see Remote cache) is normally used. With this solution, the cc-NUMA system becomes very close to a large SMP system.

== Tightly-coupled versus loosely-coupled architecture ==