Citrix XenApp: Is it worth upgrading to B200 M3 to improve user density?

We currently run Cisco UCS B200 M2 blades in our XenApp cluster. Now that the M2s are end of life and we are beginning to procure newer Cisco UCS B200 M3s, I am starting to wonder what the benefits would be of moving our XenApp cluster to the newer Intel E5-2600 processor family. I would expect a decent increase, as the M3 adds 8 logical CPUs and can therefore support two additional 4 vCPU XenApp VMs per blade (Citrix recommends 4 vCPU VMs and aligning the number of vCPUs to logical CPUs), so I’m hoping to increase user density by 30 users per host.

But how to know for sure? All things being equal (same storage, same XenApp environment, same PVS image, same network), we have Intel Xeon X5680 vs Xeon E5-2680, 12 vs 16 pCPU, 24 vs 32 logical CPUs and 39.888 GHz vs 43.184 GHz. Even so, it’s difficult to quantify the increase in user density without running live users on the blade or using software that can calculate the additional number of users that can be accommodated without compromising the user experience.
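The headline numbers above are simple arithmetic, and a short Python sketch makes it easy to redo them for another blade. The `Blade` class and its field names are my own illustration, not part of any Cisco or Citrix tool; the clock speeds (3.324 GHz and 2.699 GHz) are the values reported for our blades, and the sketch assumes Hyper-Threading is enabled and that vCPUs are mapped one-to-one onto logical CPUs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blade:
    """Hypothetical helper describing one blade's CPU resources."""
    name: str
    sockets: int
    cores_per_socket: int
    clock_ghz: float           # per-core clock as reported for our blades
    threads_per_core: int = 2  # assumption: Hyper-Threading is enabled

    @property
    def physical_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def logical_cpus(self) -> int:
        return self.physical_cores * self.threads_per_core

    @property
    def aggregate_ghz(self) -> float:
        # Total clock across all physical cores, rounded to MHz precision.
        return round(self.physical_cores * self.clock_ghz, 3)

    def xenapp_vms(self, vcpus_per_vm: int = 4) -> int:
        # Assumption: one vCPU per logical CPU, no overcommit.
        return self.logical_cpus // vcpus_per_vm


m2 = Blade("B200 M2 (X5680)", sockets=2, cores_per_socket=6, clock_ghz=3.324)
m3 = Blade("B200 M3 (E5-2680)", sockets=2, cores_per_socket=8, clock_ghz=2.699)

for blade in (m2, m3):
    print(
        f"{blade.name}: {blade.physical_cores} pCPU, "
        f"{blade.logical_cpus} logical CPUs, "
        f"{blade.aggregate_ghz} GHz, "
        f"{blade.xenapp_vms()} x 4 vCPU VMs"
    )
```

With 4 vCPU VMs this gives 6 VMs on the M2 and 8 on the M3, which is where the expectation of two extra XenApp VMs per blade comes from.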

Into the ring enters “LoginVSI”, the de facto load testing tool for virtual desktop environments. LoginVSI generates a VSImax score, which is the maximum number of user workloads your VM, blade or environment can support before the user experience degrades. We will use the same LoginVSI test and the same applications to create a VSImax baseline on the B200 M2, then compare this figure to the VSImax generated on the B200 M3 to calculate the increase in user density that the M3 can safely accommodate.

In the left corner is the Cisco UCS B200 M2, a half-width 2-socket blade based on Intel’s Xeon 5600 series (Westmere) processors. In our case we are running two 6-core X5680 processors at 3.324 GHz, rated at 130W, with 12MB cache and 1333MHz DDR3 DIMMs, for a total of 12 cores and 39.888 GHz. In the right corner is the Cisco UCS B200 M3, a half-width 2-socket blade based on Intel’s Sandy Bridge E5-2600 processors. Our test blade has two 8-core E5-2680 processors at 2.699 GHz, rated at 130W, with 20MB cache and 1600MHz DDR3 DIMMs, for a total of 16 cores and 43.184 GHz.

Shown below is a LoginVSI 150 user test with a Medium No Flash workload on a single B200 M2 running 6x XenApp VMs with 4vCPU and 12GB RAM each. The image below shows a VSImax score of 105, which is very similar to our current real user load per blade. As you can see the user experience degrades quite rapidly after 100 users.

Figure: B200 M2 X5680 – Medium No Flash VSImax

The same test was run against a B200 M3 E5-2680 with 16 cores at 2.699 GHz, for a total of 32 logical CPUs and 43.184 GHz.

Shown below is the same LoginVSI 150 user test with a Medium No Flash workload on a single B200 M3, but this time running 8x XenApp VMs with 4vCPU and 12GB RAM each. The image below shows a VSImax score of 141, with little degradation until the 120 user mark, meaning the host was able to safely handle an additional 36 users without compromising the user experience. Not bad.

Figure: B200 M3 E5-2680 – Medium No Flash VSImax
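One way to sanity-check these two runs is to look at users per XenApp VM rather than per blade. The sketch below is only a rough illustration using the VSImax scores and VM counts from the two Medium No Flash tests; the function name is my own, and it assumes LoginVSI spread sessions evenly across the VMs.

```python
def users_per_vm(vsimax: int, vm_count: int) -> float:
    """Average VSImax users carried by each XenApp VM on the blade."""
    return vsimax / vm_count


# (VSImax, number of 4 vCPU XenApp VMs) from the Medium No Flash tests.
medium_no_flash = {
    "B200 M2": (105, 6),
    "B200 M3": (141, 8),
}

for blade, (vsimax, vm_count) in medium_no_flash.items():
    print(f"{blade}: {users_per_vm(vsimax, vm_count):.1f} users per VM")
```

Both blades land at roughly 17.5 users per VM, which suggests the gain comes mostly from fitting two more VMs on the M3 rather than from each VM carrying more users.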

What happens when the workload is increased?

I ran the tests again with a LoginVSI Medium Workload. Here is a good link for the differences in LoginVSI workloads.

With a Medium Workload the VSImax on the M2 drops to 86, a drop of 19 users.

Figure: B200 M2 X5680 – Medium VSImax

The VSImax on the M3 drops to 123, a similar drop of 18 users. The difference between the two blades improves slightly to 37 users, which is 43% more than the M2 VSImax of 86 (or, put another way, 30% of the M3 score of 123).

Figure: B200 M3 150 user – Medium VSImax
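When comparing the four VSImax scores it matters which number you divide by. As a minimal sketch (the function and variable names are my own), the snippet below works out the extra users on the M3 for each workload, both as a percentage increase over the M2 baseline and as a share of the M3 score.

```python
def uplift(baseline: int, new: int) -> tuple[int, float, float]:
    """Return (extra users, % increase over baseline, extra users as % of new)."""
    extra = new - baseline
    pct_over_baseline = 100 * extra / baseline
    pct_of_new = 100 * extra / new
    return extra, pct_over_baseline, pct_of_new


# VSImax scores from the four test runs: (B200 M2, B200 M3).
vsimax = {
    "Medium No Flash": (105, 141),
    "Medium": (86, 123),
}

for workload, (m2_score, m3_score) in vsimax.items():
    extra, over_m2, of_m3 = uplift(m2_score, m3_score)
    print(
        f"{workload}: +{extra} users, "
        f"{over_m2:.0f}% over the M2, {of_m3:.0f}% of the M3 score"
    )
```

Measured against the M2 baseline the increase is about 34% and 43%; the 25% and 30% figures are the same extra users expressed as a share of the M3 score.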

Analysing the Results:

Has the user (and host) density increased? Indeed it has: the B200 M3 has improved the VSImax over the B200 M2 by 34% with a Medium No Flash workload (105 to 141) and by 43% with a Medium workload (86 to 123); therefore user and host density has increased.
Is it worth it? That depends on your environment and your phase of deployment. If your M2s are due to be replaced, this 34-43% increase will make quite a big difference if you have thousands of XenApp workloads to support. If your M2s still have some life left in them, but you are looking at procuring new hardware to support additional XenApp workloads, then factor in an additional 34-43% users per blade with the Cisco UCS B200 M3.
What is interesting is that the number of users per physical core has changed very little.
  • B200 M2: 105 Medium No Flash users divided by 12 pCPU is 8.75 users per core.
  • B200 M3: 141 Medium No Flash users divided by 16 pCPU is 8.8 users per core.
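The same division can be repeated for the Medium workload. This is just a sketch built on the VSImax scores and core counts from our tests, with names of my own choosing:

```python
PHYSICAL_CORES = {"B200 M2": 12, "B200 M3": 16}

# VSImax per blade for each LoginVSI workload we tested.
VSIMAX = {
    "Medium No Flash": {"B200 M2": 105, "B200 M3": 141},
    "Medium": {"B200 M2": 86, "B200 M3": 123},
}


def users_per_core(workload: str, blade: str) -> float:
    """VSImax users divided by the blade's physical core count."""
    return VSIMAX[workload][blade] / PHYSICAL_CORES[blade]


for workload in VSIMAX:
    for blade in PHYSICAL_CORES:
        print(f"{workload:<16} {blade}: {users_per_core(workload, blade):.2f} users per core")
```

With the heavier Medium workload the M3 actually comes out slightly ahead per core (about 7.7 vs 7.2), so the per-core result is not specific to the No Flash run.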

It’s impressive that the M3 is able to support the same number of users per core as the M2, given that CTX129761 states that “processor speed has a direct impact on the number of XenApp users that can be supported per processor.” In our test the M3 is clocked substantially lower than the M2 (2.7 GHz vs 3.33 GHz), so Intel processors are definitely becoming more efficient per clock cycle.
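A rough way to put a number on that efficiency is users per aggregate GHz. This is my own back-of-the-envelope metric rather than anything LoginVSI or Citrix reports, and it uses the Medium No Flash VSImax scores with the total GHz figures quoted for each blade.

```python
def users_per_ghz(vsimax: int, aggregate_ghz: float) -> float:
    """VSImax users supported per GHz of total physical core clock."""
    return vsimax / aggregate_ghz


m2_efficiency = users_per_ghz(105, 39.888)  # B200 M2, X5680
m3_efficiency = users_per_ghz(141, 43.184)  # B200 M3, E5-2680

print(f"B200 M2: {m2_efficiency:.2f} users per GHz")
print(f"B200 M3: {m3_efficiency:.2f} users per GHz")
print(f"M3 vs M2: {100 * (m3_efficiency / m2_efficiency - 1):.0f}% more users per GHz")
```

Treat this as indicative only: clock speed is not the only thing that changed between the two blades (cache size and memory speed also differ), so the gain cannot be attributed to the cores alone.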

Bring on 12 core… 🙂