Each Origin 2000 module is based on nodes that are plugged into a
backplane. Each module can contain up to four node boards, two router boards and twelve
XIO options. The modules are then mounted inside a deskside enclosure or a rack. Deskside enclosures can only contain one module, while racks can contain two. In configurations with more than two modules, multiple racks are used. Figures specified are for maximum configurations. The
Origin 200 uses some of the same architectural components, but in a very different physical realization that is not scalable.
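The packaging limits above (four node boards per module, two modules per rack) lend themselves to a quick sizing calculation. The following is a minimal sketch, assuming every node board carries two processors; the function names are illustrative and not taken from any SGI tool.

```python
import math

NODES_PER_MODULE = 4      # a module holds up to four node boards
MODULES_PER_RACK = 2      # a rack holds two modules (a deskside holds one)
PROCESSORS_PER_NODE = 2   # assumption: every node board is fully populated


def modules_needed(processors: int) -> int:
    """Smallest number of modules that can house the given processor count."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    nodes = math.ceil(processors / PROCESSORS_PER_NODE)
    return math.ceil(nodes / NODES_PER_MODULE)


def racks_needed(processors: int) -> int:
    """Smallest number of racks, ignoring the single-module deskside option."""
    return math.ceil(modules_needed(processors) / MODULES_PER_RACK)
```

Under these assumptions a 64-processor system needs eight modules in four racks, while an 8-processor system fits in one module and could therefore use a deskside enclosure instead.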
==Architecture==
An Origin 2000 system is composed of nodes linked together by an interconnection network. It uses a
distributed shared memory architecture, sometimes called Scalable Shared-Memory Multiprocessing (S2MP). The Origin 2000 uses
NUMAlink (originally named CrayLink) for its system interconnect. The nodes are connected to router boards, which use NUMAlink cables to connect to other nodes through their routers. The Origin 2000's network topology is a bristled fat
hypercube. In configurations with more than 64 processors, a hierarchical fat hypercube network topology is used instead. Additional NUMAlink cables, called Xpress links, can be installed between unused Standard Router ports to reduce latency and increase bandwidth. Xpress links can only be used in systems that have 16 or 32 processors, as these are the only configurations whose network topology leaves router ports free to be used in this way. The architecture has its roots in the
DASH project at
Stanford University, led by
John L. Hennessy, which included two of the Origin designers.
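The hypercube topology mentioned above can be illustrated with a short sketch. In a hypercube, each router has a binary address and is linked to every router whose address differs in exactly one bit. The code below models only the router-to-router links of a plain hypercube; the "bristled" and "fat" refinements, the router numbering and the function names are not described in this article and are assumptions made for illustration.

```python
def hypercube_neighbors(router_id: int, dimension: int) -> list[int]:
    """Return the routers directly linked to router_id in a hypercube."""
    if not 0 <= router_id < (1 << dimension):
        raise ValueError("router_id out of range for this dimension")
    # Flipping one address bit at a time yields each neighbor.
    return [router_id ^ (1 << bit) for bit in range(dimension)]


def hop_count(source: int, destination: int) -> int:
    """Minimum router-to-router hops: the number of differing address bits."""
    return bin(source ^ destination).count("1")


if __name__ == "__main__":
    # Hypothetical example: 8 routers arranged as a 3-dimensional hypercube.
    for router in range(8):
        print(router, hypercube_neighbors(router, 3))
```

One consequence of this structure is that the worst-case hop count equals the dimension, so doubling the number of routers adds only one hop to the longest path.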
==Router boards==
There are four different router boards used by the Origin 2000. Each successive router board allows a larger number of nodes to be connected.
===Null Router===
The Null Router connects two nodes in the same module. A system using the Null Router cannot be expanded, as there are no external connectors.

===Star Router===
The Star Router can connect up to four nodes. It must always be used in conjunction with a Standard Router to function correctly.

===Standard Router (Rack Router)===
The Standard Router can connect up to 32 nodes. It contains an application-specific integrated circuit (ASIC) known as the scalable pipelined interconnect for distributed endpoint routing (SPIDER), which serves as a router for the NUMAlink network. The SPIDER ASIC has six ports, each with a pair of unidirectional links, connected to a crossbar which enables the ports to communicate with one another.

===Meta Router (Cray Router)===
The Meta Router is used in conjunction with Standard Routers to connect more than 32 nodes. It can connect up to 64 nodes.
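The node limits given for the four router boards can be summarized as a lookup. This is only a sketch of the limits as stated above: it returns the smallest listed option that covers a given node count, and it does not claim to reproduce how SGI actually configured systems (a small system could, for instance, use a Standard Router to remain expandable). The table and function names are hypothetical.

```python
# (maximum node count, router boards involved), smallest option first.
ROUTER_OPTIONS = [
    (2, ("Null Router",)),
    (4, ("Star Router", "Standard Router")),
    (32, ("Standard Router",)),
    (64, ("Standard Router", "Meta Router")),
]


def router_boards_for(node_count: int) -> tuple[str, ...]:
    """Return the smallest listed router option that covers node_count."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    for limit, boards in ROUTER_OPTIONS:
        if node_count <= limit:
            return boards
    raise ValueError("more than 64 nodes exceeds the limits described here")
```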
==Nodes==
Each Origin 2000 node fits on a single 16" by 11" printed circuit board that contains one or two processors, the main memory, the directory memory and the Hub ASIC. The node board plugs into the backplane through a 300-pad CPOP (Compression Pad-on-Pad) connector. The connector combines two connections, one to the NUMAlink router network and another to the XIO I/O subsystem.
===Processor===
Each processor and its secondary cache are contained on a HIMM (Horizontal Inline Memory Module) daughter card that plugs into the node board. At the time of introduction, the Origin 2000 used the IP27 board, featuring one or two R10000 processors clocked at 180 MHz, each with a 1 MB secondary cache. A high-end model with two 195 MHz R10000 processors and 4 MB secondary caches was also available. In February 1998, the IP31 board was introduced with two 250 MHz R10000 processors and 4 MB secondary caches. Later, the IP31 board was upgraded to support two 300, 350 or 400 MHz R12000 processors. The 300 and 400 MHz models had 8 MB secondary caches, while the 350 MHz model had 4 MB secondary caches. Near the end of the Origin 2000's life, a variant of the IP31 board that could use the 500 MHz R14000 with 8 MB secondary caches was made available.
===Main memory and directory memory===
Each node board can support a maximum of 4 GB of memory through 16 DIMM slots by using proprietary ECC SDRAM DIMMs with capacities of 16, 32, 64 and 256 MB. Because the memory bus is 144 bits wide (128 bits for data and 16 bits for ECC), memory modules are inserted in pairs. To support the Origin 2000 distributed shared memory model, the memory modules are proprietary and include directory memory, which records the contents of remote caches in order to maintain cache coherency. This built-in directory memory supports up to 32 processors. Additional directory memory is required in configurations with more than 32 processors. The additional directory memory is contained on proprietary DIMMs that are inserted into eight DIMM slots set aside for its use.
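The capacity and pairing figures above can be checked with a few lines of arithmetic. The sketch below assumes that the two DIMMs in a pair have the same capacity and that each DIMM supplies half of the 144-bit bus; neither detail is stated in the text, and the names are illustrative.

```python
DIMM_SLOTS = 16
DIMM_SIZES_MB = (16, 32, 64, 256)
BUS_WIDTH_BITS = 144
DATA_BITS = 128
ECC_BITS = 16
DIMMS_PER_PAIR = 2
# Assumption: each DIMM of a pair drives half of the memory bus.
BITS_PER_DIMM = BUS_WIDTH_BITS // DIMMS_PER_PAIR


def node_memory_mb(pair_sizes_mb: list[int]) -> int:
    """Total main memory for a node, given the DIMM size used in each pair."""
    max_pairs = DIMM_SLOTS // DIMMS_PER_PAIR
    if len(pair_sizes_mb) > max_pairs:
        raise ValueError(f"at most {max_pairs} DIMM pairs fit on a node board")
    for size in pair_sizes_mb:
        if size not in DIMM_SIZES_MB:
            raise ValueError(f"unsupported DIMM size: {size} MB")
    # Each entry stands for two identical DIMMs.
    return sum(DIMMS_PER_PAIR * size for size in pair_sizes_mb)
```

Filling all eight pairs with 256 MB DIMMs gives 4096 MB, which matches the stated 4 GB maximum.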
===Hub ASIC===
The Hub ASIC interfaces the processors, memory and XIO to the NUMAlink 2 system interconnect. The ASIC contains five major sections: the crossbar (referred to as the "XB"), the I/O interface (referred to as the "II"), the network interface (referred to as the "NI"), the processor interface (referred to as the "PI") and the memory and directory interface (referred to as the "DM"), which also serves as the memory controller. The interfaces communicate with each other via FIFO buffers that are connected to the crossbar. When two processors are connected to the Hub ASIC, the node does not behave in an SMP fashion. Instead, the two processors operate separately and their buses are multiplexed over the single processor interface. This was done to save pins on the Hub ASIC. The Hub ASIC is clocked at 100 MHz and contains 900,000 gates fabricated in a five-layer metal process.
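The arrangement of four interfaces exchanging messages through FIFO buffers and a crossbar can be pictured with a toy model. This is a conceptual sketch only: the message format, the one-message-per-interface step and the class name are assumptions, and the real Hub's arbitration and buffering are not described here.

```python
from collections import deque

# The four interfaces named in the text; the crossbar (XB) connects them.
INTERFACES = ("II", "NI", "PI", "DM")


class HubCrossbarModel:
    """Toy model: each interface has an outbound and an inbound FIFO."""

    def __init__(self) -> None:
        self.outbound = {name: deque() for name in INTERFACES}
        self.inbound = {name: deque() for name in INTERFACES}

    def send(self, source: str, destination: str, payload: str) -> None:
        """Queue a message in the source interface's outbound FIFO."""
        if source not in self.outbound or destination not in self.inbound:
            raise ValueError("unknown interface")
        self.outbound[source].append((destination, payload))

    def step(self) -> int:
        """Move at most one message per interface across the crossbar."""
        moved = 0
        for source in INTERFACES:
            if self.outbound[source]:
                destination, payload = self.outbound[source].popleft()
                self.inbound[destination].append((source, payload))
                moved += 1
        return moved
```

In this model, a processor request to local memory would travel from PI to DM, while a request for remote memory would travel from PI to NI.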
==I/O subsystem==
The I/O subsystem is based around the Crossbow (Xbow) ASIC, which shares many similarities with the SPIDER ASIC. Since the Xbow ASIC is intended for use with the simpler XIO protocol, its hardware is also simpler, allowing the ASIC to feature eight ports, compared with the SPIDER ASIC's six ports. Two of the ports connect to the node boards, and the remaining six to XIO cards. While the I/O subsystem's native bus is XIO, PCI-X and VME64 buses can also be used, provided by XIO bridges.

An IO6 base I/O board is present in every system. It is an XIO card that provides:
• 1 10/100BASE-TX port
• 2 serial ports provided by dual UARTs
• 1 internal Fast 20 UltraSCSI single-ended port
• 1 external wide UltraSCSI single-ended port
• 1 real-time interrupt output for frame sync
• 1 real-time interrupt input (edge triggered)
• Flash PROM, NVRAM and real-time clock

The IO6G (G for Graphics) provided the ports above plus two additional serial ports and keyboard and mouse ports. The IO6G was required on systems with Onyx graphics pipes (cards) in order to connect a keyboard and mouse.
==Notes==