== Computing node ==
The computing nodes are custom ASICs with about fifty million transistors each. They are largely assembled from existing IBM building blocks: each is built around a 500 MHz PowerPC 440 core with 4 MB of on-chip DRAM, a memory controller for external DDR SDRAM, system I/O for internode communications, and dual built-in Ethernet. Each computing node is capable of 1 double-precision Gflops, and has one DIMM socket holding between 128 and 2048 MB of 333 MHz ECC DDR SDRAM.
== Internode communication ==
Each node can send data to and receive data from each of its twelve nearest neighbors in a six-dimensional mesh at 500 Mbit/s per channel, giving a total off-node bandwidth of 12 Gbit/s. Each of these 24 channels has DMA access to the other nodes' on-chip DRAM or external SDRAM. In practice, only four dimensions are used to form a communications sub-torus, while the remaining two dimensions are used to partition the system. The operating system communicates with the computing nodes over the Ethernet network, which is also used for diagnostics, configuration, and communication with disk storage.
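The nearest-neighbor wiring described above can be illustrated with a short sketch. This is not QCDOC code; it is a generic model of neighbor addressing on a periodic mesh (torus), with hypothetical dimension sizes, showing why a node has two neighbors per dimension: 12 in the full 6-D mesh, 8 in a 4-D sub-torus.

```python
# Sketch of nearest-neighbor addressing on a periodic mesh (torus).
# Dimension sizes below are illustrative, not actual machine geometry.
def torus_neighbors(coord, dims):
    """Return the 2 * len(dims) nearest neighbors of `coord`, with wraparound."""
    neighbors = []
    for axis in range(len(dims)):
        for step in (-1, +1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # periodic boundary
            neighbors.append(tuple(n))
    return neighbors

# A 4-D sub-torus gives each node 8 neighbors; the full 6-D mesh gives 12.
print(len(torus_neighbors((0, 0, 0, 0), (4, 4, 4, 8))))             # 8
print(len(torus_neighbors((0, 0, 0, 0, 0, 0), (4, 4, 4, 8, 2, 2)))) # 12
```

The wraparound (`%`) is what distinguishes a torus from a plain mesh: edge nodes have the same number of neighbors as interior ones, which keeps communication patterns uniform across the machine.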
== Mechanical design ==
Two nodes are placed together on a daughter card, along with one DIMM socket and a 4:1 Ethernet hub for off-card communications. The daughter cards have two connectors: one carrying the internode communications network and one carrying power, Ethernet, clock, and other housekeeping facilities. Thirty-two daughter cards are placed in two rows on a motherboard that supports 800 Mbit/s off-board Ethernet communications. Eight motherboards are placed in crates, with two backplanes supporting four motherboards each. Each crate holds 512 processor nodes and a 2^6 hypercube communications network. One node consumes about 5 W of power, and each crate is air- and water-cooled. A complete system can consist of any number of crates, for a total of up to several tens of thousands of nodes.
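The packaging hierarchy can be sanity-checked with simple arithmetic using only the figures stated above (the per-crate power total is a derived estimate for the nodes alone, not a stated system figure):

```python
# Packaging arithmetic from the figures in the text.
nodes_per_card = 2
cards_per_motherboard = 32
motherboards_per_crate = 8

nodes_per_crate = nodes_per_card * cards_per_motherboard * motherboards_per_crate
print(nodes_per_crate)  # 512, matching the stated crate size

# Derived estimate: node power only, excluding hubs, backplanes, and cooling.
watts_per_node = 5
print(nodes_per_crate * watts_per_node)  # 2560 W per crate
```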
== Operating system ==
The QCDOC runs a custom-built operating system, QOS, which facilitates booting, runtime, monitoring, and diagnostics, and simplifies management of the large number of computing nodes. It uses a custom embedded kernel and provides single-process POSIX ("unix-like") compatibility through the Cygnus newlib library. The kernel includes a specially written UDP/IP stack and an NFS client for disk access. The operating system also maintains system partitions, so several users can have access to separate parts of the system for different applications. Each partition runs only one client application at any given time; any multitasking is scheduled by the host controller system, a regular computer with a large number of Ethernet ports connecting to the QCDOC.

== See also ==