The Impact of Jumbo Frames on vMotion Performance
My relationship with jumbo frames typically results in a Captain Picard facepalm. They sound like a great idea and make sense on paper, but often end up crashing and burning into a pile of fail. If you are not familiar with the term, a jumbo frame is any layer 2 Ethernet frame with a payload larger than 1500 bytes. To enable jumbo frames, increase the Maximum Transmission Unit (MTU) on every device in the network path from source to destination.
Classically, one of the big issues with supporting jumbo frames is getting every network component set to the desired MTU, typically 9000. In a brand new environment the setup is trivial – just set the MTU on all devices, and if you missed one, oops, no big deal – go fix it. In a legacy environment running production workloads, making the change can be a little more stressful. Before putting in that effort, I want a solid use case showing substantial improvement, such as a 10% increase in performance.
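For reference, here is a minimal PowerCLI sketch of what that change looks like on the vSphere side. The vCenter address and vSwitch name below are placeholders rather than my actual lab config, and the physical switch still has to be configured for jumbo frames separately.

Connect-VIServer -Server "vcenter.glacier.local"   # placeholder vCenter name

$esxHost = Get-VMHost -Name "esx1.glacier.local"

# Raise the MTU on the vSwitch that carries vMotion (the name is a placeholder).
Get-VirtualSwitch -VMHost $esxHost -Name "vSwitch1" |
    Set-VirtualSwitch -Mtu 9000 -Confirm:$false

# Raise the MTU on the vMotion-enabled vmkernel ports as well.
Get-VMHostNetworkAdapter -VMHost $esxHost -VMKernel |
    Where-Object { $_.VMotionEnabled } |
    Set-VMHostNetworkAdapter -Mtu 9000 -Confirm:$false

Once everything is set, a quick end-to-end check from the ESXi shell verifies that 9000-byte frames actually make it across without fragmenting:

vmkping -d -s 8972 <destination vMotion IP>

The 8972-byte payload plus the ICMP and IP headers adds up to a 9000-byte packet; if the ping fails, something in the path is still at 1500.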
This got me thinking about jumbo frames for VMware vMotion traffic, which in turn led me to test performance on a 1 GbE network in the lab to see if the highest MTU size, 9000, would show any improvement when shuffling around a VM workload. To make things even more interesting, I ran the test on a multi-NIC vMotion network.
Note: Jumbo frames for vMotion are both supported and recommended, per this KB – specifically the line, “Use of Jumbo Frames is recommended for best vMotion performance.”
Testing Baseline
A pair of identical hosts running ESXi 5.1 build 1021289 from the Wahl Network lab was used for this test. Each host has a pair of 1 GbE uplinks configured for multi-NIC vMotion to an HP V1910 switch. No workloads were running on the hosts other than a single VM with 2 vCPUs and 24 GB of RAM running Windows Server 2008 R2 Enterprise. All other workloads were powered down and no significant traffic, other than the test itself, was on the switch.
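As a sanity check (this snippet is my own addition, not part of the test itself), the vMotion vmkernel ports on both hosts can be listed along with their MTU via PowerCLI:

Get-VMHost -Name "esx1.glacier.local","esx2.glacier.local" |
    Get-VMHostNetworkAdapter -VMKernel |
    Where-Object { $_.VMotionEnabled } |
    Select-Object VMHost, Name, IP, PortGroupName, Mtu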
The test VM had Prime95 configured to consume 20 GB of RAM by two worker threads. Below is the exact configuration of the torture test.
And here is a picture of the test in progress.
All vMotions were performed programmatically via PowerCLI. The code basically identifies the VM, figures out which of the two hosts it is on, migrates it to the other one, and then queries the task start and finish time to determine how many seconds it took.
# Identify the VM and figure out which host it currently lives on.
$vm = Get-VM -Name "Prime95"
If ($vm.Host -match "esx1") { $target = (Get-VMHost -Name "esx2.glacier.local") }
Else { $target = (Get-VMHost -Name "esx1.glacier.local") }

# Migrate the VM to the other host.
Write-Host "Starting migration now..."
Move-VM -VM $vm -Destination $target -Confirm:$false

# Grab the most recent task and report how long the migration took.
$tasks = Get-Task
$span = New-TimeSpan -Start ($tasks[-1].StartTime) -End ($tasks[-1].FinishTime)
Write-Host "Task completed in" $span.TotalSeconds "seconds"
Test 1 – vMotion, No Jumbo Frames (MTU 1500)
The first set of tests was completed with the standard MTU value of 1500. Below is a sample of the ESXTOP data during a migration.
The vMotion was performed 10 times. I removed the highest and lowest speed results, with the remaining 8 tests clocking in as follows:
Test | Time (in seconds)
---|---
1 | 162.00
2 | 162.00
3 | 169.00
4 | 177.00
5 | 185.00
6 | 186.00
7 | 186.00
8 | 186.00
This puts the average migration time at roughly 177 seconds.
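For anyone who wants to check the math, averaging the eight samples above is a quick PowerShell one-liner:

(162, 162, 169, 177, 185, 186, 186, 186 | Measure-Object -Average).Average   # 176.625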
Test 2 – vMotion, With Jumbo Frames (MTU 9000)
The next set of tests was completed with a jumbo MTU value of 9000. Below is a sample of the ESXTOP data during a migration.
The vMotion was performed 10 times. I removed the highest and lowest speed results, with the remaining 8 tests clocking in as follows:
Test | Time (in seconds)
---|---
1 | 171.00
2 | 172.00
3 | 174.00
4 | 178.00
5 | 179.00
6 | 180.00
7 | 191.00
8 | 195.00
This puts average migration time right at 180 seconds.
Test 3 – vMotion with Prime95 Halted
As a further attempt to control the test, I halted all of the Prime95 worker threads. vMotion speeds, both with and without jumbo frames, were nearly identical at 19 seconds. The margin of difference was roughly 0.5 seconds in favor of a 1500 MTU, which is not enough to call it either way. It could simply have been rounding, or latency in the task reporting back to the VMkernel.
Thoughts
The test shows that enabling jumbo frames (MTU 9000) actually hurt vMotion performance by roughly 3.5 seconds on average for a VM with 20 GB of very active RAM in my lab. I think this underlines the need to test the effects of a change before implementing it, to see whether any positive gains can be achieved in your specific environment. If you're using 10 GbE, test this out at your own line speed, as jumbo frames tend to have a bit more success with a larger pipe.