Figure 11 – AMD ATHLON 64 User Manual


Chapter 3

Analysis and Recommendations

31

Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems

40555

Rev. 3.00

June 2006

However, as shown in Figure 11 on page 31, when both threads are write-only, the 0 hop-1 hop and
0 hop-2 hop cases are faster than the 0 hop-0 hop case.

Figure 11. Both Write-Only Threads Running on Node 0 (Different Cores) on an Idle System

When a single thread reads locally, it generates a memory bandwidth load of 1.64 GB/s. Assuming a
sustained memory bandwidth of 70% of the theoretical maximum of 6.4 GB/s (PC3200 DDR
memory), the cumulative bandwidth demanded by two read-only threads does not exceed the
sustained memory bandwidth on that node and hence the local or 0 hop-0 hop case is the fastest.

However, when a single thread writes locally it generates a memory bandwidth load of 2.98 GB/s.
This is because each write in this test case results in a cache line eviction and thus generates twice the
memory traffic generated by a read. The cumulative memory bandwidth demanded by 2 write-only
threads now exceeds the sustained memory bandwidth on that node. The 0 hop-0 hop case now incurs
the penalty of saturating the memory bandwidth on that node. For detailed analysis, refer to Section
A.4 on page 42.
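
The saturation argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the figures quoted in the text (6.4 GB/s theoretical maximum, an assumed 70% sustained fraction, 1.64 GB/s per read-only thread, 2.98 GB/s per write-only thread), and the function and constant names are hypothetical, not taken from the test program.

```python
# Figures quoted in the text above, all in GB/s.
THEORETICAL_PEAK = 6.4      # PC3200 DDR theoretical maximum
SUSTAINED_FRACTION = 0.70   # assumed sustainable share of the peak
READ_LOAD = 1.64            # one thread reading locally
WRITE_LOAD = 2.98           # one thread writing locally


def sustained_bandwidth(peak=THEORETICAL_PEAK, fraction=SUSTAINED_FRACTION):
    """Return the assumed sustained memory bandwidth of one node in GB/s."""
    return peak * fraction


def saturates(per_thread_load, threads=2):
    """Return True if the combined demand exceeds the sustained bandwidth."""
    return per_thread_load * threads > sustained_bandwidth()


if __name__ == "__main__":
    print(f"sustained bandwidth: {sustained_bandwidth():.2f} GB/s")
    for name, load in (("read-only", READ_LOAD), ("write-only", WRITE_LOAD)):
        demand = 2 * load
        print(f"two {name} threads: {demand:.2f} GB/s, "
              f"saturates node: {saturates(load)}")
```

Under these assumptions the sustained bandwidth is about 4.48 GB/s: two read-only threads demand 3.28 GB/s and stay below it, while two write-only threads demand 5.96 GB/s and exceed it.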

It is useful to study whether this observation is also applicable under a variable background load.

One would expect that, if the memory bandwidth demanded of the remote node were increased, at
some point the 0 hop-1 hop case would become as slow as, and perhaps slower than, the
0 hop-0 hop case for the write-only threads.
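
As a rough way to reason about this expectation, the sketch below estimates how much background load the remote node could absorb before a single remote write-only thread pushes it past its sustained bandwidth. It is a deliberately simplified model, not a measurement: it assumes the same 70% sustained fraction of 6.4 GB/s and the 2.98 GB/s write load quoted earlier, it ignores interconnect and memory-controller effects, and the function names are hypothetical.

```python
# Assumed figures, taken from the text earlier in this section (GB/s).
SUSTAINED = 0.70 * 6.4   # assumed sustained bandwidth of one node
WRITE_LOAD = 2.98        # load generated by one write-only thread


def remote_headroom(background_load):
    """Bandwidth left on the remote node after the write-only thread
    and a given background load (GB/s); negative means oversubscribed."""
    return SUSTAINED - WRITE_LOAD - background_load


def remote_saturated(background_load):
    """True if the remote node's demand exceeds its sustained bandwidth."""
    return remote_headroom(background_load) < 0


if __name__ == "__main__":
    for background in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5):
        print(f"background {background:.1f} GB/s -> "
              f"headroom {remote_headroom(background):+.2f} GB/s, "
              f"saturated: {remote_saturated(background)}")
```

In this toy model the remote node has roughly 1.5 GB/s of headroom on an idle system. The measured behavior depends on factors the model leaves out.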

The same two write-only threads as before are running on node 0, going through the following cases:

Both threads access local memory.

First thread accesses local memory and second thread accesses memory that is remote by one hop.

First thread accesses local memory and second thread accesses memory that is remote by two hops.

[Figure 11 chart: "Total Time for both threads (write-write)". Vertical axis: 0 to 1.8 in steps of 0.2. Bar labels, in order: 0.0.w.0 0.1.w.0 (0 Hop, 0 Hop); 0.0.w.0 0.1.w.1 (0 Hop, 1 Hop); 0.0.w.0 0.1.w.2 (0 Hop, 1 Hop); 0.0.w.0 0.1.w.3 (0 Hop, 2 Hop). Data labels, in order of appearance: 147%, 126%, 125%, 136%.]
