As traffic and data types proliferate across the network, IT planners and administrators must find new ways to improve traffic efficiency to preserve application performance and maintain acceptable user-experience levels. Modern network adapters provide an array of enhancements designed for exactly that purpose when they're used with operating systems such as Windows Server 2012 R2.
As with most new technologies, network performance enhancements should be approached in a systematic manner -- often starting in a controlled lab environment -- to gain experience in deployment and management.
Kernel-mode remote DMA
Direct memory access (DMA) is an old idea: move data from one point in memory to another with far less intervention from the operating system or higher layers of the software stack. That low reliance on software has also had a profound impact on network efficiency. Remote DMA (RDMA) extends the technique to network devices, enabling direct data moves between two different servers with minimal involvement from either system's software stack. Instead, the NIC hardware handles the transfers internally.
The latest iteration places lightweight RDMA code directly in the Windows Server 2012 R2 kernel; this is called kernel-mode remote DMA (kRDMA). It allows applications to communicate almost directly with the computer's network hardware for better workload performance and lower latency. The kRDMA functionality is a relatively new addition to the network adapter feature set and requires a NIC that expressly supports it, sometimes called an RDMA-enabled NIC, or RNIC.
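The efficiency gap that DMA exploits can be illustrated with a small Python analogy -- this is not an RDMA API, and the function names here are invented for illustration. A per-byte copy loop stands in for CPU-mediated transfers that climb the software stack, while a single bulk buffer move stands in for a transfer handled by hardware outside that stack.

```python
# Analogy only: Python exposes no RDMA interface. The contrast below --
# software touching every byte versus one bulk buffer-to-buffer move --
# mirrors the work DMA (and RDMA between servers) takes off the CPU.

def cpu_mediated_copy(src: bytes) -> bytes:
    """Copy one byte at a time: the software stack handles every byte."""
    dst = bytearray(len(src))
    for i, b in enumerate(src):
        dst[i] = b
    return bytes(dst)

def dma_style_copy(src: bytes) -> bytes:
    """Copy in one bulk operation: the 'hardware' moves the whole block."""
    dst = bytearray(len(src))
    dst[:] = src                    # single move, no per-byte software loop
    return bytes(dst)

payload = b"network payload " * 1000
assert cpu_mediated_copy(payload) == dma_style_copy(payload) == payload
```

The result is identical either way; what differs is how much of the work the general-purpose CPU must perform, which is the point of offloading transfers to the NIC.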
Receive segment coalescing
Each packet a traditional NIC receives requires CPU intervention to strip the frame from the underlying data segment and then move the data segment to a buffer. The CPU must unpack each piece of data and pass it up the software stack to the application that needs it. This works, but the heavy CPU demand -- especially for receive-intensive network applications -- can limit the server's scalability.
Receive segment coalescing (RSC) is a type of offload technology Windows Server 2012 R2 supports. It uses the NIC to strip data from each incoming packet and combine or coalesce received segments into a single larger packet. The NIC then sends the coalesced packet to the application. The net result is that the CPU requires far less receive-side intervention, allowing the CPU to work on more productive tasks and support greater scalability in the server.
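The coalescing step can be sketched in a few lines of Python -- a conceptual model of what the RSC-capable NIC does in hardware, not a real driver interface; the function name and 64 KB limit are illustrative assumptions.

```python
from typing import List

def coalesce_segments(segments: List[bytes], max_size: int = 65535) -> List[bytes]:
    """Merge adjacent received segments into larger packets (RSC-style),
    so the stack hands far fewer buffers up to the application."""
    coalesced: List[bytes] = []
    current = bytearray()
    for seg in segments:
        # Flush the current packet if adding this segment would overflow it.
        if current and len(current) + len(seg) > max_size:
            coalesced.append(bytes(current))
            current = bytearray()
        current.extend(seg)
    if current:
        coalesced.append(bytes(current))
    return coalesced

segments = [b"x" * 1460 for _ in range(100)]   # 100 MSS-sized segments
merged = coalesce_segments(segments)
assert b"".join(merged) == b"".join(segments)  # payload is unchanged
assert len(merged) < len(segments)             # far fewer hand-offs
```

In this sketch, 100 wire-sized segments collapse into just three large packets: the data is untouched, but the CPU faces a fraction of the per-packet processing.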
RSC requires an RSC-capable network adapter card. In virtualized environments, the NIC should also support single-root I/O virtualization, or SR-IOV. Unfortunately, RSC cannot handle IPsec-encrypted traffic or protocols other than Transmission Control Protocol (TCP).
Remember that RSC supports only the receive side of network traffic, so it has no effect on outgoing traffic. It won't help Web servers or other transmit-intensive applications. However, a complementary technology such as large send offload, or LSO, can boost efficiency in the server's outgoing network traffic.
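LSO is essentially RSC in reverse: the application hands the stack one large buffer, and the NIC slices it into wire-sized frames instead of making the CPU do so. A minimal Python sketch of that segmentation step, with an invented function name and the standard 1,460-byte TCP maximum segment size assumed for illustration:

```python
from typing import List

def lso_segment(large_send: bytes, mss: int = 1460) -> List[bytes]:
    """Split one large application buffer into wire-sized segments,
    the way an LSO-capable NIC does in hardware instead of the CPU."""
    return [large_send[i:i + mss] for i in range(0, len(large_send), mss)]

buffer = b"y" * 64000                       # one large send from the app
frames = lso_segment(buffer)
assert b"".join(frames) == buffer           # payload is preserved
assert all(len(f) <= 1460 for f in frames)  # every frame fits on the wire
```

The CPU issues one large send; the splitting into 44 individual frames happens below it, freeing cycles on transmit just as RSC does on receive.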
TCP loopback optimization
TCP is a verbose protocol, relying heavily on handshakes between sending and receiving endpoints to ensure reliable communication. Network applications use TCP to communicate on the open network, and many also use TCP loopback for reliable communication between processes on the same server; this is called interprocess communication, or IPC. Despite TCP's benefits, that handshaking can add latency and limit the performance of enterprise applications, especially when they interoperate with other processes on the same server.
Windows Server 2012 R2 includes a TCP loopback optimization feature designed to shorten the path needed for handshakes. Normally, a TCP loopback round trip passes from the application layer to the TCP layer to the IP layer and back. The optimized path forgoes the IP layer, which can be processing-intensive. Handshakes still occur, but the shorter path means less latency and better application performance. Optimization is enabled on a per-connection (per-socket) basis. Virtualization is supported, but VM-to-VM optimization is not, so it cannot boost communication between two VMs on the same server.
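The IPC pattern the optimization targets looks like this in Python -- two processes (modeled here as threads for brevity) exchanging data over 127.0.0.1. Note that the code itself is ordinary loopback sockets: on Windows the fast path is an opt-in below the sockets API, so application logic like this is unchanged by the optimization.

```python
import socket
import threading

def echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo its data back over loopback."""
    conn, _ = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

# Bind to the loopback address; port 0 lets the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

server = threading.Thread(target=echo_server, args=(listener,))
server.start()

# The "client" process talks to the "server" process over TCP loopback.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"handshake over loopback")
    reply = client.recv(1024)

server.join()
listener.close()
assert reply == b"handshake over loopback"
```

Every byte here traverses the local TCP stack twice; shortening that path below the sockets layer is exactly where the loopback optimization earns its latency savings for chatty same-server IPC.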