Dell Emulex Family of Adapters User Manual

Page 658


Emulex Drivers for Windows User Manual

P010077-01A Rev. A

3. Configuration

NIC Driver Configuration


TCP timers, including delayed ACK, push, retransmit, and keep-alive, are implemented in hardware. This reduces host CPU usage.

Retransmits are handled entirely in hardware.

Packetizing data, including segmenting, checksums, and CRC, is supported.

The network driver should use send and receive buffers that are larger than 1 MB for maximum efficiency.

The driver provides efficient parallel processing of multiple TCP connections on multi-CPU systems.

The adapter receive path is zero-copy for applications that prepost receive buffers or that issue a socket read before the data arrives. Ideal applications use Microsoft's Winsock2 Asynchronous Sockets API, which allows posting multiple receive buffers with asynchronous completions, and posting multiple send operations with asynchronous completions. Applications that do not prepost receive buffers may incur the penalty of the data copy, and the performance improvement is significantly less noticeable.

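
The idea of issuing the read before the data arrives can be sketched in Python with asyncio. This is only an illustration: the function names receive_preposted and demo are hypothetical, a local socket pair stands in for a real TCP connection, and whether a pending read actually reaches the adapter's zero-copy path depends on the operating system and driver, which this sketch does not verify. On Windows, asyncio's default event loop is built on overlapped I/O.

```python
import asyncio
import socket


async def receive_preposted(sock: socket.socket, size: int) -> bytes:
    """Issue a read into a buffer that is allocated before any data arrives."""
    loop = asyncio.get_running_loop()
    buf = bytearray(size)  # receive buffer exists before the peer sends
    nbytes = await loop.sock_recv_into(sock, buf)
    return bytes(buf[:nbytes])


async def demo() -> bytes:
    """Post the receive first, then send, using a local socket pair."""
    sender, receiver = socket.socketpair()
    sender.setblocking(False)
    receiver.setblocking(False)
    try:
        loop = asyncio.get_running_loop()
        # Start the receive so that the read is pending when data shows up.
        pending = asyncio.create_task(receive_preposted(receiver, 4096))
        await asyncio.sleep(0)  # give the receive task a chance to post its read
        await loop.sock_sendall(sender, b"hello")
        return await pending
    finally:
        sender.close()
        receiver.close()
```

Calling asyncio.run(demo()) runs the exchange once. An application that posts its read only after being notified that data is available would not follow this pattern.
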
Applications that transmit large amounts of data show excellent CPU efficiency using TCP offload. TCP offload allows the network driver to accept large buffers of data to transmit. Each buffer requires roughly the same amount of host processing as a single TCP packet does for non-offloaded traffic. The entire process of packetizing the data, processing the incoming acknowledgments, and potentially retransmitting any lost data is handled by the hardware.
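
As a rough illustration of the claim that one offloaded buffer costs the host about as much as one non-offloaded packet, the following sketch counts how many TCP segments a single buffer would otherwise become. The function name segments_for_buffer is hypothetical, and the MSS of 1460 bytes is an assumption (a 1500-byte Ethernet MTU with IPv4 and no TCP options); the actual MSS depends on the network.

```python
def segments_for_buffer(buffer_bytes: int, mss: int = 1460) -> int:
    """Return how many TCP segments are needed to carry buffer_bytes.

    mss is the assumed maximum segment size in bytes.
    """
    if buffer_bytes < 0 or mss <= 0:
        raise ValueError("buffer_bytes must be >= 0 and mss must be > 0")
    # Ceiling division without floating point.
    return (buffer_bytes + mss - 1) // mss


one_mb = 1024 * 1024
print(segments_for_buffer(one_mb))  # 719 segments for a 1 MB buffer
```

Under these assumptions, a 1 MB send that the host hands to the adapter as one buffer replaces about 719 per-packet operations in a non-offloaded stack.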

TCP Offload Exclusions

Microsoft provides a method to exclude certain applications from being offloaded to the adapter. Certain types of applications do not benefit from TCP offload. These include applications whose TCP connections are short-lived, transfer small amounts of data at a time, exhibit fragmentation from end to end, or make use of IP options.

If an application sends less data than the maximum segment size (MSS), the driver, like most TCP stacks, uses a Nagling algorithm. Nagling reduces the number of TCP packets on the network by combining small application sends into one larger TCP packet. Nagling typically reduces the performance of a single connection to allow greater overall performance for a large group of connections.

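
The packet reduction described above can be shown with a deliberately simplified model. The function coalesce_sends is hypothetical and only packs consecutive small sends into segments of at most one MSS; a real Nagle implementation also depends on whether unacknowledged data is outstanding and on timers, which this sketch ignores. The MSS of 1460 bytes is an assumption.

```python
def coalesce_sends(send_sizes, mss=1460):
    """Pack consecutive application sends into segments of at most mss bytes.

    Simplified model: ignores ACK timing and timers, so it shows only the
    effect on packet count, not on latency.
    """
    segments = []
    pending = 0  # bytes waiting to be sent
    for size in send_sizes:
        pending += size
        while pending >= mss:
            segments.append(mss)  # emit a full-sized segment
            pending -= mss
    if pending:
        segments.append(pending)  # flush the final partial segment
    return segments


small_sends = [100] * 20  # twenty 100-byte application sends
print(len(small_sends), "sends become", len(coalesce_sends(small_sends)), "segments")
```

In this model, twenty 100-byte sends leave the host as two segments instead of twenty packets.
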
During Nagling, a single connection may have long pauses (200 ms) between sending subsequent packets, as the driver waits for more data from the application to append to the packet. An application can disable Nagling by setting the TCP_NODELAY socket option. TCP offload does not improve the performance of connections that use Nagling, because their performance is intentionally limited by the Nagling algorithm. Telnet and SSH consoles are examples of connections that typically use Nagling.

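
A minimal sketch of disabling Nagling from an application follows. The socket option is spelled TCP_NODELAY in Winsock and in Python's socket module; the helper name open_low_latency_socket is hypothetical.

```python
import socket


def open_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagling disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A nonzero value turns the option on, which disables the Nagle algorithm.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

The option applies per socket, so it must be set on each connection that needs small sends to go out without delay.
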
Windows Server has not optimized the connection offload path, so some applications that use numerous short-lived TCP connections do not show a performance improvement using TCP offload.

Windows Server provides control over the applications and TCP ports that are eligible for TCP offload using the netsh tool. Refer to the Microsoft documentation for these netsh commands:
