BladeSymphony 1000 Architecture

White Paper

Figure 6. Hitachi Node Controller connects multiple server blades

Dividing the SMP system across several server blades solves the memory bus contention problem
by virtue of the distributed design. A processor's access to its on-board memory incurs no
penalty. The two processors (four cores) can access up to 64 GB at the full speed of local memory.
When a processor needs data that is not contained in its locally attached memory, its node controller
must contact the node controller that owns the data in order to retrieve it. The latency for retrieving
that data is therefore higher than for retrieving data from local memory. Because remote memory takes
longer to access than local memory, this design is known as a non-uniform memory access (NUMA)
architecture. The advantage of non-uniform memory is the ability to scale to a larger number of
processors within a single system image while still allowing for the speed of local memory access.
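
The trade-off between local and remote access can be illustrated with a simple weighted average. The following is a minimal sketch, not a model of the BladeSymphony 1000 itself: the function name and the two latency figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def effective_latency_ns(local_ns, remote_ns, local_fraction):
    """Average access latency when local_fraction of accesses hit local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Hypothetical latencies for illustration only (not measured values).
LOCAL_NS = 100.0
REMOTE_NS = 200.0

for fraction in (1.0, 0.75, 0.25):
    average = effective_latency_ns(LOCAL_NS, REMOTE_NS, fraction)
    print(f"{fraction:.0%} local accesses -> {average} ns on average")
```

Under these assumed numbers, the average latency rises as the share of local accesses falls, so keeping data close to the processor that uses it matters.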

While there is a penalty for accessing remote memory, a number of operating systems have been
enhanced to improve the performance of NUMA system designs. These operating systems take into
account where data is located when scheduling tasks to run on CPUs, using the CPU closest to the
data where possible. Some operating systems are able to rearrange the location of data in memory to
move it closer to the processors where it is needed. For operating systems that are not NUMA aware,
the BladeSymphony 1000 offers a number of memory interleaving options that can improve performance.
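
The two approaches in this paragraph can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the white paper does not describe the interleaving options in detail, so the round-robin mapping, the 4 KB granule, and both function names are hypothetical.

```python
from collections import Counter


def interleaved_node(address, granule_bytes, node_count):
    """Round-robin interleave: consecutive granules land on consecutive nodes."""
    return (address // granule_bytes) % node_count


def preferred_node(page_nodes):
    """NUMA-aware placement: pick the node that holds most of a task's pages.

    Ties are broken in favor of the lowest node number.
    """
    if not page_nodes:
        raise ValueError("page_nodes must not be empty")
    counts = Counter(page_nodes)
    return min(counts, key=lambda node: (-counts[node], node))


# Assumed values: 4 KB granule, four nodes, a 16-granule buffer.
GRANULE = 4096
NODES = 4
placement = [interleaved_node(i * GRANULE, GRANULE, NODES) for i in range(16)]
print(Counter(placement))          # each node holds an equal share of the buffer
print(preferred_node([0, 2, 2, 3]))  # node 2 holds most of this task's pages
```

With interleaving, every processor sees the same mix of local and remote accesses regardless of where the task runs. A NUMA-aware operating system instead tries to run the task on the node that already holds most of its data.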

Each Node Controller can connect to up to three other Node Controllers, providing a point-to-point
connection between every pair of Node Controllers. The advantage of point-to-point connections is
that they eliminate the shared bus, which would be prone to contention, and also eliminate the
crossbar switch, which reduces contention compared to a bus but adds complexity and latency. A remote
memory access is streamlined because it only needs to pass through two Node Controllers. This results
in lower latency compared to other SMP systems.
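
The point-to-point layout described above can be checked with a small sketch. It assumes a fully connected set of four Node Controllers, which follows from each controller linking to up to three others. The function names are hypothetical and the code models only the topology, not the hardware protocol.

```python
from itertools import combinations


def full_mesh_links(node_count):
    """List every point-to-point link in a fully connected set of nodes."""
    return list(combinations(range(node_count), 2))


def controllers_on_path(source, target):
    """Node Controllers a memory access passes through in a full mesh."""
    return [source] if source == target else [source, target]


links = full_mesh_links(4)
print(len(links))                 # number of point-to-point links
print(controllers_on_path(0, 3))  # a remote access touches two controllers
print(controllers_on_path(2, 2))  # a local access stays on one controller
```

Because every pair of controllers has a direct link, no remote access has to be forwarded through a third controller or a central switch.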

[Figure: block diagram of the Node Controller interconnect. Labels recovered from the diagram:
  Node Controller (NDC), shown four times; CC-NUMA, point to point, low latency
  Memory Controller (MC), shown twice; DDR2 Memory, shown twice
  L3 Cache Copy Tag
  PCI Bridge; PCI Slots
  PCI-Express (4 Lane) and PCI Bus, labeled 2 GB/s x3
  Processor Bus: 6.4 GB/s (FSB 400 MHz), 10.6 GB/s (FSB 667 MHz)
  Memory Bus: 4.8 GB/s (FSB 400 MHz), 5.3 GB/s (FSB 667 MHz)
  Node Bandwidth: 4.8 GB/s (FSB 400 MHz), 5.3 GB/s (FSB 667 MHz)]
