Performance and advanced functionality
Intel 82574L Gigabit Ethernet NIC, a PCI Express ×1 card, which provides two hardware receive queues[5]
Multiqueue NICs provide multiple transmit and receive queues, allowing packets received by the NIC to be assigned to one of its receive queues. Each receive queue is assigned to a separate interrupt; by routing each of those interrupts to a different CPU or core, the processing of interrupt requests triggered by the network traffic received by a single NIC can be distributed among multiple cores, which improves interrupt-handling performance. Usually, a NIC distributes incoming traffic between the receive queues using a hash function, while the separate interrupts can be routed to different CPUs/cores either automatically by the operating system, or manually by configuring the IRQ affinity.[6][7]
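The two mechanisms described above, hash-based queue selection and IRQ affinity, can be illustrated with a minimal sketch. It rests on several assumptions: the `Flow` type and both function names are hypothetical, CRC-32 is only a stand-in for whatever hash function a given NIC implements in hardware, and the bitmask helper is limited to CPUs 0 to 31. On Linux, a hexadecimal CPU bitmask of this kind is the format written to `/proc/irq/<N>/smp_affinity`.

```python
import zlib
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """Hypothetical flow identifier: the addresses and ports of a connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def rx_queue_for_flow(flow: Flow, num_queues: int) -> int:
    """Pick a receive queue by hashing the flow's addresses and ports.

    Packets of the same flow always hash to the same queue, so they are
    processed in order by the same CPU/core.
    """
    if num_queues <= 0:
        raise ValueError("num_queues must be positive")
    key = f"{flow.src_ip}:{flow.src_port}>{flow.dst_ip}:{flow.dst_port}".encode()
    # Assumption: CRC-32 stands in for the NIC's own hardware hash function.
    return zlib.crc32(key) % num_queues


def irq_affinity_mask(cpus: Iterable[int]) -> str:
    """Return a hexadecimal CPU bitmask in which bit n set means CPU n is allowed."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPUs 0-31")
        mask |= 1 << cpu
    return format(mask, "x")
```

For example, `irq_affinity_mask([1, 3])` yields `a`, a mask that allows the interrupt to be delivered to CPUs 1 and 3 only.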
The hardware-based distribution of the interrupts, described above, is referred to as receive-side scaling (RSS).[8]:82 Purely software implementations also exist, such as receive packet steering (RPS) and receive flow steering (RFS).[6] Further performance improvements can be achieved by routing the interrupt requests to the CPUs/cores executing the applications that are the ultimate destinations for the network packets that generated the interrupts. Taking application locality into account in this way yields higher overall performance, reduced latency and better hardware utilization, owing to better use of CPU caches and fewer required context switches. Examples of such implementations are RFS[6] and Intel Flow Director.[8]:98,99[9][10][11]
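The idea of steering by application locality can be sketched as a small lookup table. This is a simplified toy model, not the data structure used by RFS or Intel Flow Director: the class and method names are hypothetical, and flows are assumed to be already reduced to an integer hash.

```python
from typing import Dict


class FlowSteeringTable:
    """Toy model: remember which CPU last consumed each flow's data."""

    def __init__(self, num_cpus: int) -> None:
        if num_cpus <= 0:
            raise ValueError("num_cpus must be positive")
        self.num_cpus = num_cpus
        self._last_cpu: Dict[int, int] = {}

    def record_consumer(self, flow_hash: int, cpu: int) -> None:
        """Note that the application reading this flow is running on `cpu`."""
        if not 0 <= cpu < self.num_cpus:
            raise ValueError("cpu out of range")
        self._last_cpu[flow_hash] = cpu

    def cpu_for_packet(self, flow_hash: int) -> int:
        """Steer to the consuming application's CPU if it is known.

        Unknown flows fall back to plain hash-based distribution.
        """
        return self._last_cpu.get(flow_hash, flow_hash % self.num_cpus)
```

With four CPUs, a flow whose hash is 10 is first steered to CPU 2 by the fallback; once `record_consumer(10, 3)` has been called, its packets are steered to CPU 3, where the receiving application's data is more likely to be in cache.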
With multiqueue NICs, additional performance improvements can be achieved by distributing outgoing traffic among different transmit queues. By assigning different transmit queues to different CPUs/cores, contention inside the operating system can be avoided; this approach is usually referred to as transmit packet steering (XPS).[6]
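A minimal sketch of such an assignment follows, assuming a simple round-robin policy (the function name and the policy are illustrative, not prescribed by any particular operating system). Each transmit queue gets a hexadecimal bitmask of the CPUs allowed to use it; on Linux, masks of this form are what the per-queue `xps_cpus` files under `/sys/class/net/<dev>/queues/tx-<n>/` accept.

```python
from typing import List


def xps_cpu_masks(num_cpus: int, num_tx_queues: int) -> List[str]:
    """Assign CPUs to transmit queues round-robin.

    Returns one hexadecimal CPU bitmask per queue; bit n set means
    CPU n transmits through that queue.
    """
    if num_cpus <= 0 or num_tx_queues <= 0:
        raise ValueError("counts must be positive")
    masks = [0] * num_tx_queues
    for cpu in range(num_cpus):
        # Each CPU maps to exactly one queue, so no two queues share a CPU.
        masks[cpu % num_tx_queues] |= 1 << cpu
    return [format(mask, "x") for mask in masks]
```

With four CPUs and two transmit queues this gives the masks `5` (CPUs 0 and 2) and `a` (CPUs 1 and 3). When there are at least as many queues as CPUs, every CPU gets a queue of its own, which is the case where contention between CPUs on a queue is avoided entirely.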
Some products feature NIC partitioning (NPAR, also known as port partitioning) that uses SR-IOV to divide a single 10 Gigabit Ethernet NIC into multiple discrete virtual NICs with dedicated bandwidth, which are presented to the firmware and operating system as separate PCI device functions.[12][13]
TCP offload engine is a technology used in some NICs to offload processing of the entire TCP/IP stack to the network controller. It is primarily used with high-speed network interfaces, such as Gigabit Ethernet and 10 Gigabit Ethernet, for which the processing overhead of the network stack becomes significant.[14]
Some NICs offer integrated field-programmable gate arrays (FPGAs) for user-programmable processing of network traffic before it reaches the host computer, allowing for significantly reduced latencies in time-sensitive workloads. Moreover, some NICs offer complete low-latency TCP/IP stacks running on integrated FPGAs in combination with userspace libraries that intercept networking operations usually performed by the operating system kernel; Solarflare’s open-source OpenOnload network stack that runs on Linux is an example. This kind of functionality is usually referred to as user-level networking.[15][16][17]