Receive side scaling (rss), Analyzing performance issues – Dell Emulex Family of Adapters User Manual

Emulex Drivers Version 10.2 for Linux User Manual

P010081-01A Rev. A

3. Configuration

Network Performance Tuning

For receive interrupts, disable AIC (it is enabled by default) and set the interrupt delay duration using ethtool. For example, to disable AIC and set a constant RX interrupt delay of 8 microseconds, run

ethtool -C eth<N> adaptive-rx off rx-usecs 8

where <N> is the number of the Ethernet interface you are working on.
If your application requires low or predictable latency, Emulex recommends that you turn off AIC and set rx-usecs to 0.
For transmit interrupts, the default interrupt delay duration is 96 microseconds. You can change this value using ethtool. For example, to set the TX interrupt delay to 64 microseconds, run

ethtool -C eth<N> tx-usecs 64

where <N> is the number of the Ethernet interface you are working on.
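To confirm that a coalescing change took effect, you can read the settings back with ethtool -c. The following is a minimal sketch, not part of the Emulex tooling: coalesce_value is a hypothetical helper name, eth0 is a placeholder interface, and the "name: value" line format it parses is an assumption about the ethtool -c output on your system.

```shell
# Hypothetical helper: print the value of one coalescing parameter.
# Reads "ethtool -c" style text on stdin and assumes "name: value" lines.
coalesce_value() {
    awk -F':[ \t]*' -v key="$1" '$1 == key { print $2 }'
}

# Example (eth0 is a placeholder interface name):
#   ethtool -c eth0 | coalesce_value rx-usecs
```

If the helper prints nothing, the parameter name was not found in the output, which may mean the driver does not report that setting.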

Receive Side Scaling (RSS)

Distributing incoming traffic across several receive rings, each with its own interrupt vector, spreads the receive processing across several CPU cores. This can reduce packet drops and improve the packet rate in certain applications. RSS is enabled in non-SR-IOV and non-multichannel configurations. In multichannel configurations, RSS is enabled in the first section of each port.

Analyzing Performance Issues

MSI-X interrupts are required for RSS to work. If your motherboard and operating system version support MSI-X, the Ethernet driver automatically uses MSI-X interrupts. If there are not enough MSI-X vectors available, the Ethernet driver uses INTx interrupts, which may decrease performance. The proc node /proc/interrupts shows the interrupts and their types.
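As a rough illustration of reading /proc/interrupts for one interface, the sketch below prints each matching IRQ number, a guessed interrupt type, and the last column of the line. The helper name irq_types and the interface name eth0 are hypothetical. The classification is an assumption: any line containing "MSI" is reported as MSI (the proc node does not always distinguish MSI from MSI-X), and everything else is reported as INTx.

```shell
# Hypothetical helper: list the interrupts that mention an interface.
# $1 is a plain substring to look for (for example eth0); stdin is text in
# /proc/interrupts format. Note that "eth1" also matches "eth10".
irq_types() {
    awk -v ifc="$1" 'index($0, ifc) > 0 {
        irq = $1
        sub(/:$/, "", irq)                      # "45:" becomes "45"
        kind = ($0 ~ /MSI/) ? "MSI" : "INTx"    # assumed classification
        print irq, kind, $NF
    }'
}

# Example (eth0 is a placeholder interface name):
#   irq_types eth0 < /proc/interrupts
```

Several MSI lines for one interface suggest that multiple vectors, and therefore multiple receive rings, are in use. A single INTx line suggests the driver fell back to legacy interrupts.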
The Linux performance utility “top” can monitor CPU utilization while you troubleshoot performance issues. A low idle CPU percentage on any CPU core indicates an excessive processing load on that core. The proc node /proc/interrupts shows the distribution of the interrupts across the CPU cores. If you see too many interrupts per second directed to one CPU, check whether the irqbalance program is running. The irqbalance program is normally started at system boot. In some cases, you can get better performance by disabling irqbalance and distributing interrupts manually. To distribute the interrupt load across the available CPU cores, set the CPU affinity of each interrupt vector by writing a CPU mask to the proc node /proc/irq/<int-vector>/smp_affinity.
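The smp_affinity mask is a hexadecimal bitmap in which bit N selects CPU N. The following sketch computes such a mask from a list of CPU numbers. The function name cpu_mask is hypothetical, and the sketch assumes CPU numbers 0 through 31, because larger systems write the mask as comma-separated 32-bit groups, which this helper does not produce.

```shell
# Hypothetical helper: build a hex affinity mask from CPU numbers (0-31 assumed).
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))   # set the bit for this CPU
    done
    printf '%x\n' "$mask"
}

# Example (run as root; <int-vector> is the IRQ number from /proc/interrupts):
#   cpu_mask 2 > /proc/irq/<int-vector>/smp_affinity
```

For example, cpu_mask 0 2 prints 5 (binary 101), which allows the interrupt on CPU 0 and CPU 2. The kernel also provides /proc/irq/<int-vector>/smp_affinity_list, which accepts a CPU list such as 0,2 directly.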
Use the netstat command to look for excessive TCP retransmits or packet drops in the network stack.
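One possible way to narrow the netstat output is to filter its protocol statistics (netstat -s) for retransmit and drop counters. This is only a sketch: net_trouble is a hypothetical helper name, and the exact counter wording varies between netstat versions, so the pattern matches on the fragments "retrans" and "drop" rather than on full counter names.

```shell
# Hypothetical helper: keep only the retransmit and drop counters.
# Reads "netstat -s" style text on stdin.
net_trouble() {
    grep -iE 'retrans|drop'
}

# Example:
#   netstat -s | net_trouble
```

Running the command twice, a few seconds apart under load, shows whether the counters are still increasing.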
In systems with more than one NUMA node, you can get better performance by pinning interrupts to the NUMA node local to the PCIe device.
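To find the CPUs that are local to the adapter, you can read the device's NUMA node and that node's CPU list from sysfs. The sketch below expands a kernel-style CPU list such as 0-3,8 into individual CPU numbers. The function name expand_cpulist and the interface name eth0 are hypothetical, and the sysfs paths in the comments are assumptions about your kernel; numa_node may also report -1 when the platform does not expose locality.

```shell
# Hypothetical helper: expand a cpulist such as "0-3,8" into "0 1 2 3 8".
expand_cpulist() {
    out=""
    old_ifs=$IFS
    IFS=,
    for range in $1; do              # split the list on commas
        first=${range%-*}            # "0-3" gives 0; "8" gives 8
        last=${range#*-}             # "0-3" gives 3; "8" gives 8
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            out="$out${out:+ }$cpu"
            cpu=$((cpu + 1))
        done
    done
    IFS=$old_ifs
    echo "$out"
}

# Example (eth0 is a placeholder interface name):
#   node=$(cat /sys/class/net/eth0/device/numa_node)
#   expand_cpulist "$(cat /sys/devices/system/node/node$node/cpulist)"
```

The resulting CPU numbers are candidates for the interrupt affinity of the adapter's vectors.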
Use the -S option of ethtool to see all statistics counters maintained by the Ethernet driver. Excessive drop or error counters indicate a bad link or defective hardware. See Table E-1, Ethtool -S Option Statistics, on page 975, and Table E-2,
