While working on a Hyper-V 3.0 deployment recently, I noticed that the virtual machines on every newly built hypervisor showed unusually high latency when pinging anything on the local network. Pinging the same destinations from the hypervisors themselves returned perfectly reasonable 1ms times, but pings from inside the virtual machines varied from 1ms to well over 200ms.
I started digging, initially suspecting the clock synchronization issue that we used to see on both AMD and Intel chips, specifically on HP DL servers, back in the Windows Server 2003 days. The issue is not very well documented, but one of the few Microsoft articles that attempts to explain it is here. The solution to that issue was to add the /USEPMTIMER switch to the boot.ini file. It used to fix negative ping times on AMD chips as well as lengthy ping times on Intel chips.
Windows Server 2012 does not have a boot.ini file, so the way to apply this setting is to run the following command from an elevated command prompt or PowerShell session:
bcdedit.exe /set USEPLATFORMCLOCK on
(This explains the right side of the screenshot above – boot configuration is shown “before” and “after” running this command).
Unfortunately, applying this configuration at the hypervisor level and rebooting the server made no difference to ping times from the VMs (guest partitions).
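Since the setting did not help, it may be worth confirming that it was actually applied and then reverting it. A minimal sketch, assuming an elevated PowerShell session and that the value was set on the current boot entry (which is what the command above does when no entry identifier is given):

```powershell
# Show the current boot entry and look for the useplatformclock value.
# The braces are quoted so PowerShell does not treat them as a script block.
bcdedit.exe /enum '{current}' | Select-String -Pattern 'useplatformclock'

# Remove the value again to return to the default behavior.
# A reboot is needed before the change takes effect.
bcdedit.exe /deletevalue useplatformclock
```

If the first command prints nothing, the value is not set on the current boot entry.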
Virtual Machine Queues
Virtual machine queues (VMQ) were introduced in Windows Server 2008 R2. The purpose of this feature is to improve network performance of virtual machines that receive a lot of inbound traffic, by giving them more direct access to the hardware NIC. To quote directly from Microsoft:
“When VMQ is enabled, a dedicated queue is established on the physical network adapter for each virtual network adapter that has requested a queue. As packets arrive for a virtual network adapter, the physical network adapter places them in that network adapter’s queue. When packets are indicated up, all the packet data in the queue is delivered directly to the virtual network adapter. Packets arriving for virtual network adapters that don’t have a dedicated queue, as well as all multicast and broadcast packets, are delivered to the virtual network in the default queue. The virtual network handles routing of these packets to the appropriate virtual network adapters as it normally would.”
When the hypervisors were built, the VMQ features of the network card on the physical host were enabled to achieve better VM network performance. I disabled the performance features one at a time (VMQ, TCP Chimney, and Receive Side Scaling) and found that VMQ was the root cause of the high ping latency. As soon as VMQ was disabled at the parent partition's NIC driver level, VM ping times dropped to a steady <1ms reading.
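On Windows Server 2012 the same change can also be made from PowerShell with the NetAdapter cmdlets, rather than through the driver's advanced properties. This is a sketch under assumptions, not necessarily the exact steps used here: the adapter name "Ethernet 1" is hypothetical, so list the real names with Get-NetAdapter first.

```powershell
# Hypothetical adapter name; replace with one reported by Get-NetAdapter.
$nic = 'Ethernet 1'

# Show whether VMQ is currently enabled on each physical adapter.
Get-NetAdapterVmq | Format-Table Name, InterfaceDescription, Enabled

# Disable VMQ on the chosen adapter.
# The adapter may restart briefly, so expect a short network interruption.
Disable-NetAdapterVmq -Name $nic

# Confirm the change.
Get-NetAdapterVmq -Name $nic | Format-Table Name, Enabled
```

Enable-NetAdapterVmq reverses the change, which is handy for retesting once a newer driver is available.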
The hardware/software affected by this issue:
- HP DL 360p Gen8 server chassis
- 4-port 1Gbps Broadcom NIC (HP Ethernet 1Gb 4-port 331FLR Adapter)
- Windows Server 2012 Datacenter Edition
- Broadcom’s NIC driver dated October 26 2012, version 18.104.22.168 (link to HP site)
Bottom line, always perform at least basic QA of your builds before putting them into production. Some hardware (and more likely drivers) may cause performance issues instead of providing a performance boost - especially in early releases.
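A basic latency check of the kind that would have caught this can be sketched in a few lines of PowerShell, run from inside a guest VM. The target address and the 5ms threshold below are hypothetical placeholders, and the ResponseTime property assumes Windows PowerShell (newer PowerShell versions name it Latency).

```powershell
# Hypothetical target on the local network and an arbitrary threshold.
$target      = '192.168.1.1'
$thresholdMs = 5

# Send 20 pings and summarize the response times in milliseconds.
$stats = Test-Connection -ComputerName $target -Count 20 |
    Measure-Object -Property ResponseTime -Minimum -Maximum -Average

"min={0}ms max={1}ms avg={2:N1}ms" -f $stats.Minimum, $stats.Maximum, $stats.Average

# Flag the build if any single reply was slower than the threshold.
if ($stats.Maximum -gt $thresholdMs) {
    Write-Warning "Latency to $target peaked at $($stats.Maximum)ms"
}
```

Running the same check from the hypervisor and from a guest makes it easy to see whether the virtualization layer is adding the delay.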