Dell Emulex Family of Adapters User Manual

Page 656


Emulex Drivers for Windows User Manual

P010077-01A Rev. A

3. Configuration

NIC Driver Configuration


To improve network and CPU performance for heavy network loads under these
conditions, you may have to make an appropriate NUMA CPU selection. For example,
in Windows Server 2012 R2, you can use the Task Manager to adjust the “Set Affinity”
property to bind the application to a specific NUMA node for maximum network
performance and CPU efficiency.
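
Binding an application to a NUMA node amounts to restricting it to that node's logical processors. The following is a minimal sketch, assuming a hypothetical two-node topology; the node-to-processor mapping and the function names are illustrative and are not read from the system or taken from the driver. It builds the bitmask that corresponds to selecting one node's processors in the “Set Affinity” dialog.

```python
# Hypothetical topology: NUMA node -> logical processor indices.
# On a real server, obtain this mapping from the operating system.
NUMA_TOPOLOGY = {
    0: [0, 1, 2, 3],
    1: [4, 5, 6, 7],
}


def affinity_mask(node, topology=NUMA_TOPOLOGY):
    """Return a bitmask with one bit set per logical processor in the node."""
    mask = 0
    for cpu in topology[node]:
        mask |= 1 << cpu
    return mask


def cpus_in_mask(mask):
    """Return the logical processor indices selected by an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


if __name__ == "__main__":
    for node in sorted(NUMA_TOPOLOGY):
        print(f"node {node}: mask {affinity_mask(node):#x}")
```

In this sketch each node yields a mask that covers only its own processors, so an application bound with that mask never runs on the other node.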

Checksum Offloading and Large Send Offloading (LSO)

The adapter supports IP, TCP, and UDP checksum offloading. Offloading is enabled by
default for all of these protocols. You can disable offloading through the Windows
Device Manager Advanced Properties. Disabling checksum offloading is only useful for
packet sniffing applications, such as Ethereal or Microsoft Network Monitor, running
on the local system where the adapter is installed and monitored. When packets are
sniffed, transmit packets may appear to have incorrect checksums because the hardware
has not yet calculated them.

The adapter supports transmit LSO, which allows the TCP stack to send one large block
of data that the hardware then segments into multiple TCP packets. LSO is recommended
for performance, but it can be disabled for packet sniffing applications. LSO sends
appear as giant packets in the packet sniffer because the hardware has not yet
segmented them.
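
To illustrate why a sniffer sees one giant packet, the sketch below splits a single large send into segment-sized chunks, which is the payload-level effect of LSO. It is an illustration only: the function name, the 64 KB block size, and the 1460-byte maximum segment size (a common value for a 1500-byte MTU) are assumptions, and real hardware also replicates and adjusts the packet headers for each segment.

```python
def segment_payload(payload: bytes, mss: int):
    """Split one large send into (offset, chunk) pairs of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [(off, payload[off:off + mss]) for off in range(0, len(payload), mss)]


if __name__ == "__main__":
    big_send = bytes(64 * 1024)  # one 64 KB block handed down by the TCP stack
    segments = segment_payload(big_send, mss=1460)  # assumed MSS
    print(len(segments), "segments; last one carries", len(segments[-1][1]), "bytes")
```

A sniffer that captures above the hardware sees the single 64 KB block; the wire carries the individual segments.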

Note: On Windows Server 2012, Recv Segment Coalescing is enabled by default. You
must disable Recv Segment Coalescing if you want to set the Checksum Offload
setting to anything other than enabled.

For information on modifying the CheckSum Offload or Large Send Offload parameter,
see “Configuring NIC Driver Options” on page 589.
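
The incorrect transmit checksums that a local sniffer reports can be reproduced with the standard Internet checksum (RFC 1071), which IPv4 headers use. The sketch below is illustrative only: the sample header bytes are an assumption chosen for the example, and the manual does not describe how the adapter hardware computes checksums. A header whose checksum field is still zero, as it is before the hardware fills it in, fails verification; once the field is filled, verification yields zero.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement Internet checksum (RFC 1071) of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample 20-byte IPv4 header with the checksum field (bytes 10-11) left at zero,
# as a sniffer would capture it before the hardware computes the checksum.
UNFILLED = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")

if __name__ == "__main__":
    checksum = internet_checksum(UNFILLED)
    filled = UNFILLED[:10] + checksum.to_bytes(2, "big") + UNFILLED[12:]
    # A receiver verifies by checksumming the whole header; 0 means valid.
    print("unfilled header verifies:", internet_checksum(UNFILLED) == 0)
    print("filled header verifies:  ", internet_checksum(filled) == 0)
```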

Receive Side Scaling (RSS) for Non-Offloaded IP/TCP Network Traffic

The adapter can process TCP receive packets on multiple processors in parallel. This is
ideal for applications that are CPU limited. Typically, these applications have
numerous client TCP connections that may be short-lived. Web servers and database
servers are prime examples. RSS typically increases the number of transactions per
second for these applications.
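
The way receive traffic is spread across processors can be sketched as a hash of each connection's addresses and ports that selects a CPU from a lookup table, so that all packets of one connection land on the same processor. The sketch below uses a Toeplitz hash, which RSS implementations on Windows commonly use; the manual does not state which hash this adapter applies. The key, the 16-entry table, the four-CPU layout, and all function names are assumptions made for illustration.

```python
import ipaddress
import struct

HASH_KEY = bytes(range(1, 41))        # arbitrary 40-byte key, illustration only
INDIRECTION_TABLE = [0, 1, 2, 3] * 4  # 16 entries mapping to 4 hypothetical CPUs


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 31:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    bit_index = 0
    for byte in data:
        for bit in range(7, -1, -1):  # most significant bit first
            if byte & (1 << bit):
                # XOR in the 32-bit window of the key that starts at bit_index.
                result ^= (key_int >> (key_bits - 32 - bit_index)) & 0xFFFFFFFF
            bit_index += 1
    return result


def flow_bytes(src_ip, dst_ip, src_port, dst_port) -> bytes:
    """Pack an IPv4 TCP connection's addresses and ports as hash input."""
    return (ipaddress.IPv4Address(src_ip).packed
            + ipaddress.IPv4Address(dst_ip).packed
            + struct.pack("!HH", src_port, dst_port))


def cpu_for_flow(src_ip, dst_ip, src_port, dst_port) -> int:
    """Pick a CPU using the low bits of the hash as a table index."""
    h = toeplitz_hash(HASH_KEY, flow_bytes(src_ip, dst_ip, src_port, dst_port))
    return INDIRECTION_TABLE[h & (len(INDIRECTION_TABLE) - 1)]


if __name__ == "__main__":
    for port in range(50000, 50008):  # eight hypothetical client connections
        print(port, "-> CPU", cpu_for_flow("192.0.2.10", "192.0.2.1", port, 80))
```

Because the CPU depends only on the connection's addresses and ports, many short-lived client connections tend to spread across processors while each connection stays on one.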

Understanding RSS

To better understand RSS, it helps to understand the interrupt mechanism used in the
network driver. Without RSS, a network driver receives an interrupt when a network
packet arrives. This interrupt may occur on any CPU, or it may be limited to a set of
CPUs for a given device, depending on the server architecture. The network driver
launches one deferred procedure call (DPC) that runs on the same CPU as the interrupt.
Only one DPC ever runs at a time. In contrast, with RSS enabled, the network driver
launches multiple parallel DPCs on different CPUs.

For example, on a four-processor server that interrupts all processors, without RSS the
DPC jumps from CPU to CPU, but it only runs on one CPU at a time. Each processor is
busy only 25 percent of the time. The total reported CPU usage of the system is about 25
percent (perhaps more if other applications are also using the CPU). This is a sign that
