I recently had the chance to use LoginVSI to compare similar-generation AMD and Intel blades from a single vendor, to see whether there is any great difference between the two chipsets when deploying server-based computing environments on Citrix XenApp.
This produced some really interesting results… Allow me to introduce ‘The Citrix XenApp Showdown: AMD Opteron 6278 vs Intel Xeon E5-2670’
These processors were chosen because they power two similar-spec HPC blades from the same vendor. However, a July 2012 AMD document titled HPC Processor Comparison makes clear that the Opteron 6278 has both a lower price and a lower SPECint_rate2006 benchmark score than the Intel Xeon E5-2670.
The test was conducted on identical infrastructure, i.e. the same storage, the same 10Gb network, the same Citrix XenApp 6.5 infrastructure and the same PVS 6.1 image. It uses the well-known benchmarking tool LoginVSI to produce a VSImax score, which indicates how many concurrent users each blade can safely handle before the user experience deteriorates.
The environment used for the test includes:
- ESXi 5.1
- Citrix XenApp 6.5
- Citrix PVS 6.1
- LoginVSI 4.0.4
- Virtual Machines running Windows Server 2008 R2 SP1
In the left corner we have the AMD blade based on the Bulldozer 6200 Opteron Processor:
- AMD Processor – Opteron 6278 Dual 16 core 2.4 GHz with 256GB memory
In the right corner we have the Intel blade based on the Sandy Bridge E5-2600 Xeon processor:
- Intel Processor – Xeon E5-2670 Dual 8 Core 2.6 GHz with 256GB memory
At a high level the blades are pretty similar: dual sockets, similar-generation processors released within 3 months of each other in 2012, similar clock speeds and the same number of logical processors. The architectures of the AMD and Intel blades are, however, quite different.
The AMD blade has a big advantage with a total of 32 physical cores, but it has no Hyper-Threading equivalent, so it presents 32 logical CPUs to ESXi. The Intel blade has 16 physical cores, but with Hyper-Threading enabled it also presents 32 logical CPUs to ESXi.
Because the AMD blade offers twice as many cores as the Intel blade, ESXi reports almost twice the available GHz on the AMD blade (right) as on the Intel blade (left), as shown in the picture below.
ESXi reports 16 × 2.599 ≈ 41.6 GHz for the Intel blade and 32 × 2.4 = 76.8 GHz for the AMD blade, so at face value you would expect the AMD blade to offer almost double the performance.
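As a rough illustration of where those numbers come from, here is a minimal Python sketch. The `Blade` class and its field names are my own and purely illustrative; it assumes the aggregate GHz figure is simply physical cores multiplied by clock speed, which is what the figures above imply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blade:
    """Illustrative model of a blade as the hypervisor sees it (hypothetical class)."""
    name: str
    sockets: int
    cores_per_socket: int
    threads_per_core: int  # 2 with Hyper-Threading enabled, otherwise 1
    clock_ghz: float

    @property
    def physical_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def logical_cpus(self) -> int:
        return self.physical_cores * self.threads_per_core

    @property
    def aggregate_ghz(self) -> float:
        # Assumption: only physical cores count towards GHz;
        # Hyper-Threading adds logical CPUs, not clock cycles.
        return self.physical_cores * self.clock_ghz


amd = Blade("Opteron 6278", sockets=2, cores_per_socket=16,
            threads_per_core=1, clock_ghz=2.4)
intel = Blade("Xeon E5-2670", sockets=2, cores_per_socket=8,
              threads_per_core=2, clock_ghz=2.599)

for blade in (amd, intel):
    print(f"{blade.name}: {blade.logical_cpus} logical CPUs, "
          f"{blade.aggregate_ghz:.1f} GHz")

ghz_ratio = amd.aggregate_ghz / intel.aggregate_ghz
print(f"AMD/Intel GHz ratio: {ghz_ratio:.2f}")
```

Both blades come out at 32 logical CPUs, while the GHz ratio lands a little under 2, which is the "almost double" on paper.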
I’m not going to go into too much detail here, as this is well documented elsewhere, but essentially AMD and Intel have come up with two different approaches to solving the same problem – CPU under-utilisation. Intel rely on a single complex core, and try to increase its utilisation by supplying it with two threads concurrently through Hyper-Threading.
AMD chose to split the core in two: rather than having one complex core, they opted for two simpler cores with shared components, each with its own execution thread. This is how AMD are able to offer 16-core processors against Intel’s 8-core processors, with almost twice the available GHz.
Each approach clearly has its own benefits and drawbacks… There is a comparison of the two processors available at cpuboss.com. To summarise, the Intel Xeon processor is more expensive, has a higher clock speed and more L3 cache (20MB vs 15.3MB), while the AMD processor is cheaper, has 8 times the L2 cache (16MB vs 2MB) and double the cores.
But which is more suited to Citrix XenApp?
Medium Workload Test
The first test run was the default LoginVSI Medium Workload Test.
Each blade was configured according to Citrix best practices: 8x 2008 R2 SP1 VMs with 4 vCPUs each, so that the number of vCPUs equals the number of logical CPUs (8 × 4 = 32). Memory was configured at 16GB per VM for a total of 128GB per blade.
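The sizing above is easy to sanity-check. This is a minimal Python sketch using the figures from this post; the function and constant names are hypothetical and only there for illustration.

```python
LOGICAL_CPUS_PER_BLADE = 32   # both blades present 32 logical CPUs to ESXi
MEMORY_GB_PER_BLADE = 256     # installed memory per blade


def summarise_vm_layout(vm_count: int, vcpus_per_vm: int,
                        memory_gb_per_vm: int) -> dict:
    """Total up vCPU and memory allocation for identical XenApp VMs on one blade."""
    total_vcpus = vm_count * vcpus_per_vm
    total_memory_gb = vm_count * memory_gb_per_vm
    return {
        "total_vcpus": total_vcpus,
        # 1.0 means no CPU overcommit: one vCPU per logical CPU.
        "vcpu_per_logical_cpu": total_vcpus / LOGICAL_CPUS_PER_BLADE,
        "total_memory_gb": total_memory_gb,
        "unallocated_memory_gb": MEMORY_GB_PER_BLADE - total_memory_gb,
    }


layout = summarise_vm_layout(vm_count=8, vcpus_per_vm=4, memory_gb_per_vm=16)
print(layout)
```

With 8 VMs of 4 vCPUs and 16GB each, the ratio is exactly one vCPU per logical CPU and half of the blade's memory is left unallocated.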
Shown below is the AMD blade with a Medium Workload VSImax score of 83. Note the high VSIbase score (4633), which indicates the performance of the system with no load on the environment. The lower the VSIbase score the better the performance, and it is used to determine the performance threshold.
There are a high number of maximum response time spikes (in red). The user experience starts to suffer almost immediately, and the maximum response times start to spike and exceed 6000ms after only 24 users have logged on (3 users per VM). The VSImax score indicates that you would be hard pressed to run more than 10 users per VM, which is pretty poor.
Shown below is the Intel blade with a Medium Workload VSImax score of 134. No official VSImax score was reached, although there is a blue X indicating VSImax at 150 users; subtracting the 16 stuck sessions gives a corrected VSImax score of 134. For anyone with doubts, this is an accurate figure based on other medium workload tests that we ran.
In comparison to the AMD Opteron 6278, note the much lower VSIbase score for the Intel Xeon E5-2670 (2217) indicating better system performance and the complete lack of high maximum response times indicating a more reliable user experience. Maximum response times only start to exceed 6000ms around the 90 user mark indicating the blade is able to process user logons and run applications in the background consistently. 134 users equals a much more respectable 16 users per VM for the Intel blade.
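The corrected score and the per-VM figures are simple arithmetic, sketched below in Python. The helper names are hypothetical, and rounding users per VM down to a whole user is my own assumption about how the per-VM numbers were derived.

```python
VM_COUNT = 8  # XenApp VMs per blade in this test


def corrected_vsimax(reported_vsimax: int, stuck_sessions: int) -> int:
    """Subtract stuck sessions from the reported VSImax marker."""
    return reported_vsimax - stuck_sessions


def users_per_vm(vsimax: int, vm_count: int = VM_COUNT) -> int:
    """Whole users per VM, rounded down (assumed rounding)."""
    return vsimax // vm_count


amd_medium = 83
intel_medium = corrected_vsimax(reported_vsimax=150, stuck_sessions=16)

print(f"Intel corrected VSImax: {intel_medium}")
print(f"AMD users per VM: {users_per_vm(amd_medium)}")
print(f"Intel users per VM: {users_per_vm(intel_medium)}")
```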
Conclusion: There is a pretty impressive 51-user increase in user density between the AMD and Intel blades on a Medium Workload (83 vs 134). In other words, if you replace your AMD blades with comparable Intel blades you are looking at a 61% gain in user density for medium workload users. For a blade with half the number of cores and roughly half the GHz, that is quite impressive and a massive endorsement of the Intel chipset architecture.
Heavy Workload Test
I re-ran the tests with a LoginVSI Heavy Workload. Again each blade was configured according to Citrix best practices: 8x 2008 R2 SP1 VMs with 4 vCPUs each, so that the number of vCPUs equals the number of logical CPUs. Memory was configured at 16GB per VM for a total of 128GB per blade.
The VSImax results get really interesting with the LoginVSI heavy workload test. Here is a summary of the LoginVSI workloads. The Heavy workload is “higher on memory and CPU consumption because more applications are running in the background.”
Shown below is the AMD blade with a Heavy Workload VSImax score of 61. As expected, the VSImax score drops due to the heavier workload. Note the similarly high VSIbase score to the previous AMD test and how maximum response times start to exceed 6000ms after only 26 users. A VSImax score of 61 is a maximum of 7 users per VM. We’re heading into really poor territory now.
Shown below is the Intel blade with a Heavy Workload VSImax score of 129. This is a drop of only 5 users from the Medium workload test, which is remarkable. The Intel blade appears to perform better when the workload is increased. Maximum response times have improved and only exceed 6000ms at 90 users (and never exceed 10000ms, unlike the medium workload test). A VSImax score of 129 ensures that the number of users per VM remains at 16 even on a heavy workload.
Conclusion: The difference between the two results is startling. The high frequency of maximum response time spikes in the AMD test shows how the blade is simply struggling to cope with the task of processing user logons and launching and using standard desktop applications.
These numbers are hard to believe, but increasing the workload shows an even bigger gap between the AMD and Intel blades. There is now a 68 user increase in user density by moving from AMD to Intel. If you have a higher proportion of heavy users in your environment, you will see even greater gains by moving from AMD to Intel. In this case you are looking at a 111% gain in user density with comparable Intel hardware.
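For anyone who wants to check the heavy workload figures, here is a minimal Python sketch of the calculation. The function name is hypothetical; the gain is expressed relative to the AMD score.

```python
def density_gain(baseline_vsimax: int, new_vsimax: int) -> tuple[int, float]:
    """Return (extra users, percentage gain) relative to the baseline blade."""
    extra_users = new_vsimax - baseline_vsimax
    gain_percent = 100.0 * extra_users / baseline_vsimax
    return extra_users, gain_percent


# Heavy workload VSImax scores from this test.
amd_heavy, intel_heavy = 61, 129

extra_users, gain_percent = density_gain(amd_heavy, intel_heavy)
print(f"Heavy workload: +{extra_users} users, {gain_percent:.0f}% gain")
```

The same function can be pointed at the medium workload scores to compare the two gaps side by side.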
The clear winner here, by a large margin, is the Intel Sandy Bridge Xeon E5-2670 processor blade. Although the Intel blade will be more expensive due to the more expensive processor, it more than pays for itself by offering a far higher user density and a surprising ability to cope with heavy workloads.
I’m still scratching my head here, as on paper the AMD blade appears to offer a decent price/performance alternative to the Intel blade, but the results do not support this. Although it offers twice the number of cores and almost doubles the GHz available to the hypervisor, it is not able to translate this into a similar user experience. Although the Intel has a higher SPECint_rate2006 benchmark score, I never thought this would translate into a user density increase of more than double (111%) when testing with LoginVSI.
I would be interested to do a comparison between two blades where the AMD blade has the higher SPECint_rate2006 benchmark score, to see at what level a lower-spec Intel blade can outperform its AMD rival. My guess is that even the entry-level Xeon E5-2620 (SPECint_rate2006 score 396) would be able to match the top-of-the-range Opteron 6284 SE (SPECint_rate2006 score 573).
As the workload gets heavier, the results skew even more in Intel’s favour. A heavier workload need not necessarily come from your users’ behaviour. It has been documented by Citrix and ProjectVRC that moving from Office 2010 to Office 2013 results in a 20-30% increase in the user workload. After reviewing these results I know which processor I would rather have in my SBC environment.
In other words, choosing Intel over AMD not only provides better user density, lower CapEx and OpEx costs (due to the smaller infrastructure footprint, licensing, etc.) and an improved ability to cope with heavier workloads, but can also provide some future-proofing if you are planning on upgrading to Office 2013.
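To put the footprint argument into numbers, here is a rough Python sketch. The 1000-user population is entirely hypothetical, the function name is my own, and sizing right up to VSImax leaves no headroom, so treat the output as a lower bound on blade count rather than a design figure.

```python
def blades_required(users: int, vsimax_per_blade: int) -> int:
    """Smallest whole number of blades that can host the given user count."""
    return -(-users // vsimax_per_blade)  # integer ceiling division


HYPOTHETICAL_USERS = 1000  # illustrative only, not from the test

# VSImax scores measured in this test.
VSIMAX = {
    "medium": {"AMD": 83, "Intel": 134},
    "heavy": {"AMD": 61, "Intel": 129},
}

for workload, scores in VSIMAX.items():
    for vendor, score in scores.items():
        count = blades_required(HYPOTHETICAL_USERS, score)
        print(f"{workload:>6} workload, {vendor:<5}: {count} blades")
```

Fewer blades also means fewer hypervisor licences, less rack space and less power, which is where the OpEx side of the argument comes from.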
Clearly the AMD Bulldozer architecture has some advantages over the Intel Sandy Bridge, but server based computing (SBC) is not one of them.
Steer clear if you can.