Citrix XenApp Showdown: AMD Opteron 6278 vs Intel Xeon E5-2670


I recently had the chance to compare similar-generation AMD and Intel blade offerings from a single vendor using LoginVSI, to see whether there was any great difference between the two available chipsets when deploying server-based computing (SBC) environments running Citrix XenApp.

This produced some really interesting results… Allow me to introduce ‘The Citrix XenApp Showdown: AMD Opteron 6278 vs Intel Xeon E5-2670’.

These processors were chosen as they are two similarly specced HPC blades from the same vendor. However, it is clear from a July 2012 AMD document titled HPC Processor Comparison that the Opteron 6278 has both a lower price and a lower SPECint_rate2006 benchmark score than the Intel Xeon E5-2670.

AMD Positioning Guidance

AMD HPC Processor Comparison

This test has been conducted on identical infrastructure, i.e. the same storage, same 10Gb network, same Citrix XenApp 6.5 infrastructure and same PVS 6.1 image, and uses the well-known benchmarking tool LoginVSI to produce a VSImax score that determines how many concurrent users each blade can safely handle before the user experience deteriorates.

The environment used for the test includes:

  • ESXi 5.1
  • Citrix XenApp 6.5
  • Citrix PVS 6.1
  • LoginVSI 4.0.4
  • Virtual Machines running Windows Server 2008 R2 SP1

In the left corner we have the AMD blade based on the Bulldozer 6200 Opteron Processor:

  • AMD Processor – Opteron 6278, dual 16-core, 2.4 GHz, with 256GB memory

In the right corner we have the Intel blade based on the Sandy Bridge E5-2600 Xeon processor:

  • Intel Processor – Xeon E5-2670, dual 8-core, 2.6 GHz, with 256GB memory

At a high level the blades are pretty similar: dual sockets, similar-generation processors released within three months of each other in 2012, similar clock speeds and the same number of logical processors. The architecture of the AMD and Intel blades is, however, quite different.

The AMD blade has a big advantage with a total of 32 physical cores, but without any Hyper-Threading equivalent it presents 32 logical cores to ESXi. The Intel blade has 16 physical cores, but with Hyper-Threading enabled this also gives 32 logical CPUs available to ESXi.

Because AMD is able to offer twice as many cores as the Intel blade, ESXi reports almost twice the GHz available on the AMD blade (R) compared with the Intel blade (L), as shown in the picture below.

AMD vs Intel Total GHz

Intel (L) 41.5 GHz vs AMD (R) 76.8GHz total available

ESXi reports 16 × 2.599 GHz = 41.5 GHz for the Intel blade and 32 × 2.4 GHz = 76.8 GHz for the AMD blade, so on face value you would expect the AMD blade to offer almost double the performance.
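As a quick sanity check on those figures, here is a minimal Python sketch that reproduces the totals ESXi reports; nothing is assumed beyond the socket counts, core counts and clock speeds quoted above.

```python
# Total compute capacity each blade presents to ESXi, from the specs above.
blades = {
    "Intel Xeon E5-2670": {"sockets": 2, "cores_per_socket": 8,  "ghz_per_core": 2.599},
    "AMD Opteron 6278":   {"sockets": 2, "cores_per_socket": 16, "ghz_per_core": 2.4},
}

for name, b in blades.items():
    cores = b["sockets"] * b["cores_per_socket"]
    total_ghz = cores * b["ghz_per_core"]
    print(f"{name}: {cores} physical cores, {total_ghz:.2f} GHz total")

# Intel Xeon E5-2670: 16 physical cores, 41.58 GHz total
# AMD Opteron 6278: 32 physical cores, 76.80 GHz total
```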

I’m not going to go into too much detail here as this is well documented elsewhere, but essentially AMD and Intel have come up with two different approaches to solving the same problem – CPU under-utilisation. Intel relies on a single complex core and tries to increase its utilisation by feeding it two concurrent threads via Hyper-Threading.

AMD chose to split the core in two: rather than having one complex core, they opted for two simpler cores with shared components, each with its own execution thread. This is how AMD is able to offer 16-core blades versus Intel’s 8-core blades, with twice the available GHz.

Each approach clearly has its own benefits and drawbacks… There is a comparison between the processors available at cpuboss.com. To summarise: the Intel Xeon is more expensive, has a higher clock speed and more L3 cache than the AMD (20MB vs 15.3MB), while the AMD is cheaper, has 8x the L2 cache (16MB vs 2MB) and double the cores.

But which is more suited to Citrix XenApp?

Medium Workload Test

The first test run was the default LoginVSI Medium Workload Test.

Each blade was configured according to Citrix best practice: 8x Windows Server 2008 R2 SP1 VMs with 4 vCPU each, so that the total number of vCPUs equals the number of logical CPUs. Memory was configured at 16GB per VM for a total of 128GB per blade.
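For anyone wanting to reproduce the sizing, here is a minimal sketch of the vCPU-to-logical-CPU alignment described above; the numbers come straight from the configuration, nothing else is assumed.

```python
# Citrix guidance followed here: total vCPUs across the VMs = logical CPUs on the host.
logical_cpus = 32      # both blades present 32 logical CPUs to ESXi
vcpus_per_vm = 4       # 4 vCPU per XenApp VM
ram_per_vm_gb = 16

vms_per_blade = logical_cpus // vcpus_per_vm
print(f"VMs per blade:  {vms_per_blade}")                       # 8
print(f"Total vCPUs:    {vms_per_blade * vcpus_per_vm}")        # 32, matching logical CPUs
print(f"Total VM RAM:   {vms_per_blade * ram_per_vm_gb} GB")    # 128 GB of the 256 GB installed
```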

Shown below is the AMD blade with a Medium Workload VSImax score of 83. Note the high VSIbase score (4633), which indicates the performance of the system with no load on the environment; the lower the score, the better the performance, and it is used to determine the performance threshold.

There are a high number of maximum response times (in red). The user experience starts to suffer almost immediately, and the maximum response times start to spike and exceed 6000ms after only 24 users have logged on (3 users per VM). The VSImax score indicates that you would be hard pressed to run more than 10 users per VM, which is pretty poor.

AMD Medium Workload

AMD Opteron 6278 Medium Workload

Shown below is the Intel blade test with a Medium Workload and a VSImax score of 134. No official VSImax was reached, although the blue X indicates VSImax at 150 users; less the 16 stuck sessions, that equals a corrected VSImax score of 134. For anyone with doubts, this is an accurate figure based on other medium workload tests that we ran.

In comparison to the AMD Opteron 6278, note the much lower VSIbase score for the Intel Xeon E5-2670 (2217) indicating better system performance and the complete lack of high maximum response times indicating a more reliable user experience. Maximum response times only start to exceed 6000ms around the 90 user mark indicating the blade is able to process user logons and run applications in the background consistently. 134 users equals a much more respectable 16 users per VM for the Intel blade.

Intel Medium Workload

Intel Xeon E5-2670 Medium Workload

Conclusion: There is a pretty impressive 51-user increase in user density between the AMD and Intel blades on a Medium workload. In other words, if you replace your AMD blades with comparable Intel hardware you are looking at a 61% gain in user density with medium workload users. For a blade with half the number of cores and half the GHz that is quite impressive, and a massive endorsement of the Intel chipset architecture.

Heavy Workload Test

I re-ran the tests with a LoginVSI Heavy Workload. Again each blade was configured according to Citrix best practice: 8x Windows Server 2008 R2 SP1 VMs with 4 vCPU each, so that the total number of vCPUs equals the number of logical CPUs. Memory was configured at 16GB per VM for a total of 128GB per blade.

The VSImax results get really interesting with the LoginVSI heavy workload test. Here is a summary of the LoginVSI workloads. The Heavy workload is “higher on memory and CPU consumption because more applications are running in the background.”

Shown below is the AMD blade with a Heavy Workload VSImax score of 61. As expected, the VSImax score drops due to the heavier workload. Note the similarly high VSIbase score compared with the previous AMD test, and how maximum response times start to exceed 6000ms after only 26 users. A VSImax score of 61 is a maximum of 7 users per VM. We’re heading into really poor territory now.

Heavy Workload - AMD

AMD Opteron 6278 Heavy Workload

Shown below is the Intel blade test with a Heavy Workload VSImax score of 129. This is a drop of only 5 users from the Medium workload test, which is remarkable; the Intel blade appears to cope well when the workload is increased. Maximum response times have improved, only exceeding 6000ms at around 90 users (and never exceeding 10000ms, unlike the medium workload test). A VSImax score of 129 means the number of users per VM remains at 16 even on a heavy workload.

Heavy Workload - Intel

Intel Xeon E5-2670 Heavy Workload

Conclusion: The difference between the two results is startling. The high frequency of maximum response times in the AMD test shows how the blade is simply struggling to cope with the task of processing user logons and launching and using standard desktop applications.

These numbers are hard to believe, but increasing the workload shows an even bigger gap between the AMD and Intel blades. There is now a 68 user increase in user density by moving from AMD to Intel. If you have a higher proportion of heavy users in your environment, you will see even greater gains by moving from AMD to Intel. In this case you are looking at a 111% gain in user density with comparable Intel hardware.
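For clarity, here is how the density gains quoted in the two conclusions are derived from the VSImax scores in the charts; a minimal sketch, using only the figures already reported.

```python
# User density gain of the Intel blade over the AMD blade, per workload.
results = {
    "Medium": {"amd": 83, "intel": 134},
    "Heavy":  {"amd": 61, "intel": 129},
}

for workload, r in results.items():
    extra_users = r["intel"] - r["amd"]
    gain_pct = extra_users / r["amd"] * 100
    print(f"{workload}: +{extra_users} users ({gain_pct:.0f}% gain over AMD)")

# Medium: +51 users (61% gain over AMD)
# Heavy: +68 users (111% gain over AMD)
```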

Summary

The clear winner here, by a large margin, is the Intel Sandy Bridge Xeon E5-2670 blade. Although the Intel blade will be more expensive due to the pricier processor, it more than pays for itself by offering far higher user density and a surprising ability to cope with heavy workloads.

VSIMax Summary

VSIMax Summary

I’m still scratching my head here, as on paper the AMD blade appears to offer a decent performance/price alternative to the Intel blade, but the results do not support this. Although it offers twice the number of cores and almost double the GHz available to the hypervisor, it is not able to translate this into a comparable user experience. Although the Intel has a higher SPECint_rate2006 benchmark score, I never thought this would translate into user density more than doubling (a 111% increase) when testing with LoginVSI.

I would be interested to do a comparison between two blades where the AMD blade has a higher SpecInt_rate2006 benchmark score to see at what level a lower Intel spec blade can outperform its AMD rival. My guess is that even the entry level Xeon E5-2620 (SPECint_rate2006 score 396) would be able to match the top of the range Opteron 6284 SE (SPECint_rate2006 score 573).

As the workload gets heavier, the results skew even further in Intel’s favour. A heavier workload need not necessarily come from your users’ behaviour: it has been documented by Citrix and Project VRC that moving from Office 2010 to Office 2013 results in a 20-30% increase in user workload. After reviewing these results I know which processor I would rather have in my SBC environment.

In other words, choosing Intel over AMD not only provides better user density, lower CapEx and OpEx (due to the smaller infrastructure footprint, licensing, etc.) and an improved ability to cope with heavier workloads, but also provides some future-proofing if you are planning to upgrade to Office 2013.

Clearly the AMD Bulldozer architecture has some advantages over the Intel Sandy Bridge, but server based computing (SBC) is not one of them.

Steer clear if you can.

LoginVSI: The importance of setting the correct logon interval


When you run LoginVSI load tests you want to make sure that the host has 30 seconds between each user logon, to give it enough time to process the logon and to stop logon activity from skewing host resources and your VSImax results.

User logons are resource intensive, so when trying to generate a VSImax score it is important to ensure that user logons do not negatively influence the test. VSImax scores are calculated by creating a baseline using the first 15 logons; if the baseline is artificially high because too many logons were being processed, then your VSImax score will be artificially low.

Additionally, if you have auto-logoff enabled it is important to leave it at around the 600-second mark.

Test 1 – 30 second interval:

Shown below is a 150 user test using two launchers. Each launcher is configured to launch 75 users at 60-second intervals, ensuring the host only processes one user logon every 30 seconds. CPU and memory show a uniform progression in resource usage as users are logged in over 75 minutes. This is what you want to see to ensure you generate an accurate VSImax figure.

LoginVSI 30 second interval

Host CPU and Memory – 30 second interval

Test 2 – 15 second interval:

Here is the same 150 user test using two launchers, but each launcher is launching 75 users at 30-second intervals, so the host is processing a logon every 15 seconds. CPU and memory climb rapidly as all users are logged in over 37.5 minutes. In the time it takes to log on all users in the test above, this test has already finished and users are already logging off under the auto-logoff timeout.

15 second interval Host CPU and Memory

Host CPU and Memory – 15 second interval

How to calculate the required session interval?

When using parallel launchers, use the built in Session Calculator link in the Interval Settings section on the Test Configuration page.
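If you want to sanity-check the calculator's output, the underlying arithmetic is straightforward. Here is a minimal sketch, assuming the parallel-launcher setup used in these tests, i.e. all launchers start together and use the same per-launcher interval.

```python
# Per-host logon interval and total logon window for parallel LoginVSI launchers.
def logon_schedule(total_users, launchers, launcher_interval_s):
    host_interval_s = launcher_interval_s / launchers   # one logon hits the host every N seconds
    logon_window_s = total_users * host_interval_s      # time until the last user has logged on
    return host_interval_s, logon_window_s

for launcher_interval in (60, 30):
    host_interval, window = logon_schedule(total_users=150, launchers=2,
                                           launcher_interval_s=launcher_interval)
    print(f"Launcher interval {launcher_interval}s -> host sees a logon every "
          f"{host_interval:.0f}s, all users logged on after {window:.0f}s ({window / 60:.1f} min)")

# Launcher interval 60s -> host sees a logon every 30s, all users logged on after 4500s (75.0 min)
# Launcher interval 30s -> host sees a logon every 15s, all users logged on after 2250s (37.5 min)
```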

The difference in timeframe is shown below: Test 1 used 4500 seconds and Test 2 used 2250 seconds.

Test 1 – 30 second interval:

Session Calculator - 30 second interval

Session Calculator – 30 second interval

Test 2 – 15 second interval:

Session Calculator - 15 second interval

Session Calculator – 15 second interval

Impact on VSImax:

And finally the difference in VSImax score. First Test 1 with a VSImax of 105:

VSImax - 30 second interval

VSImax – 30 second interval

Second Test 2, with a VSImax of 89:

150 users - 15 second interval

VSImax – 15 second interval

Summary:

The golden rule is to ensure that each physical host only has to handle one session logon every 30 seconds. Sure, it will take a lot longer when running large tests, but it will help to keep your results accurate and consistent.

Citrix XenApp: Is it worth upgrading to B200 M3 to improve user density?


We are currently running Cisco UCS B200 M2 blades in our XenApp cluster. Now that the M2s are end of life and we are beginning to procure newer Cisco UCS B200 M3s, I am starting to wonder what the benefits would be of moving our XenApp cluster to the newer Intel E5-2600 processor family. I would expect a decent increase, as the M3 adds 8 logical CPUs and so can support two additional 4 vCPU XenApp VMs per blade (the Citrix recommendation is to use 4 vCPU VMs and align the number of vCPUs to logical CPUs)…. so I’m hoping to increase user density by around 30 users per host.
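As a rough back-of-the-envelope check on that expectation, here is a minimal sketch; the users-per-VM figure is my own ballpark assumption (about 15 per 4 vCPU XenApp VM, in line with the +30 user estimate above), not a measured value.

```python
# Expected extra XenApp capacity from the B200 M3's additional logical CPUs.
m2_logical_cpus = 24    # 12 cores with Hyper-Threading
m3_logical_cpus = 32    # 16 cores with Hyper-Threading
vcpus_per_vm = 4        # Citrix recommendation: 4 vCPU VMs, aligned to logical CPUs
users_per_vm = 15       # assumption: rough users per 4 vCPU XenApp VM

extra_vms = (m3_logical_cpus - m2_logical_cpus) // vcpus_per_vm
print(f"Extra VMs per blade:  {extra_vms}")                  # 2
print(f"Expected extra users: {extra_vms * users_per_vm}")   # ~30
```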

But how do we know for sure? Well, all things being equal, i.e. same storage, same XenApp environment, same PVS image, same network, we have the Intel Xeon X5680 vs the Xeon E5-2680: 12 vs 16 physical cores, 24 vs 32 logical CPUs and 39.888 GHz vs 43.184 GHz. But it’s difficult to quantify the increase in user density without running live users on the blade or using software that can calculate the additional number of users that can be accommodated without compromising the user experience.

Into the ring enters “LoginVSI“… the de facto load testing tool for virtual desktop environments. LoginVSI generates a VSImax score, which is the maximum number of user workloads your VM/blade/environment can support before the user experience degrades. We will use the same LoginVSI test and the same applications to create a VSImax baseline on the B200 M2, then compare this figure to the VSImax generated on the B200 M3 to calculate the increase in user density that can be safely accommodated by the M3.

In the left corner is the Cisco UCS B200 M2, a half-width two-socket blade based on Intel’s Westmere 5600 series Xeon. In our case we are running X5680s: 2 sockets with 6 cores each at 3.324 GHz, rated at 130W, with 12MB cache and 1333MHz DDR3 DIMMs, for a total of 12 cores and 39.888 GHz. In the right corner is the Cisco UCS B200 M3, a half-width two-socket blade based on Intel’s Sandy Bridge E5-2600 Xeon. Our test blade runs E5-2680s: 2 sockets with 8 cores each at 2.699 GHz, rated at 130W, with 20MB cache and 1600MHz DDR3 DIMMs, for a total of 16 cores and 43.184 GHz.

Shown below is a LoginVSI 150 user test with a Medium No Flash workload on a single B200 M2 running 6x XenApp VMs with 4vCPU and 12GB RAM each. The image below shows a VSImax score of 105, which is very similar to our current real user load per blade. As you can see the user experience degrades quite rapidly after 100 users.

B200 M2 X5680 VSImax

B200 M2 X5680 – Medium No Flash VSImax

The same test was run against a B200 M3 E5-2680 with 16 cores at 2.699 GHz, for a total of 32 logical CPUs and 43.184 GHz.

Shown below is the same LoginVSI 150 user test with a Medium No Flash workload on a single B200 M3, but this time running 8x XenApp VMs with 4vCPU and 12GB RAM each. The image below shows a VSImax score of 141, with little degradation until the 120 user mark, meaning the host was able to safely handle an additional 36 users without compromising the user experience. Not bad.

B200 M3 VSImax

B200 M3 E5-2680 – Medium No Flash VSImax

What happens when the workload is increased?

I ran the tests again with a LoginVSI Medium Workload. Here is a good link for the differences in LoginVSI workloads.

With a Medium Workload the VSImax on the M2 drops to 86, a drop of 19 users.

B200 M2 Medium Workload

B200 M2 X5680 – Medium VSImax

The VSImax on the M3 drops to 123, a similar drop of 18 users. The difference between them improves slightly to 37 users, or 43% over the M2 VSImax.

B200 M3 150 user - Medium VSImax

B200 M3 150 user – Medium VSImax

Analysing the Results:

Has the user (and host) density increased? Indeed it has: the B200 M3 improved the VSImax by 34% with a Medium No Flash workload and 43% with a Medium workload over the B200 M2, so user/host density has clearly increased.
Is it worth it? That depends on your environment and your phase of deployment. If your M2s are due to be replaced, this 34-43% increase will make quite a big difference if you have thousands of XenApp workloads to support. If your M2s still have some life left in them but you are looking at procuring new hardware to support additional XenApp workloads, then factor in an additional 34-43% users per blade with the Cisco UCS B200 M3.
What is interesting is that the number of users per physical core has changed very little (see the quick check after the list):
  • B200 M2: 105 Medium No Flash users divided by 12 pCPU is 8.75 users per core.
  • B200 M3: 141 Medium No Flash users divided by 16 pCPU is 8.8 users per core.
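A minimal sketch of that per-core calculation, using only the Medium No Flash VSImax figures above:

```python
# Users per physical core for each blade (Medium No Flash workload).
blades = {
    "B200 M2 (X5680)":   {"vsimax": 105, "physical_cores": 12},
    "B200 M3 (E5-2680)": {"vsimax": 141, "physical_cores": 16},
}

for name, b in blades.items():
    print(f"{name}: {b['vsimax'] / b['physical_cores']:.2f} users per core")

# B200 M2 (X5680): 8.75 users per core
# B200 M3 (E5-2680): 8.81 users per core
```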

It’s impressive that the M3 is able to support the same number of users per core as the M2, given that CTX129761 states that “processor speed has a direct impact on the number of XenApp users that can be supported per processor.” In our test the M3 is clocked substantially lower than the M2 (2.7 GHz vs 3.33 GHz), so Intel processors are definitely becoming more efficient.

Bring on 12 core…🙂

Trend Micro Deep Security and Citrix XenApp: The effect of Agentless AV on VSImax


I’ve been doing some benchmarking recently on the 2-socket, 6-core, 3.3GHz B200 M2s used in our dedicated XenApp cluster (each ESXi host providing a total of 39.888GHz) to quantify the impact of AV protection on VSImax. (If you haven’t heard of LoginVSI before, it is a load testing tool for virtual desktop environments. VSImax is the maximum number of user workloads your environment can support before the user experience degrades – response times over 4 seconds – and is a great benchmark as it can be used across different platforms.)

We use Trend Micro Deep Security 9.1 in our environment, providing agentless anti-malware protection for our XenApp VMs. The Deep Security Virtual Appliance provides the real-time scanning via the vShield Endpoint API, using a custom XenApp policy that includes all the anti-virus best practices for Citrix XenApp and Citrix PVS.

Test Summary:

  1. Testing Tool: LoginVSI 3.6 with Medium No Flash workload
  2. Citrix XenApp anti-malware policy: Real Time Scanning enabled with all the best practice directory, file and extension exclusions set as well as the recommendation to disable Network Directory Scan and only scan files on Write.
  3. Deep Security Virtual Appliance (DSVA): Deployed with the default settings: 2vCPU, 2GB RAM, no CPU reservation and a 2 GB memory reservation.

Shown below is a LoginVSI 150 user test with a medium (no Flash) workload on a single B200 M2 running 6x VMs with 4vCPU and 12GB RAM each with agentless protection disabled. The image below shows a VSImax score of 105, which is very similar to our current real user load per blade.

VSIMax with No AV

VSIMax with No AV

Shown below is the same 150 user test with a medium (No Flash) workload on a single B200 M2 running 6x VMs with 4vCPU and 12GB RAM each with agentless anti malware protection enabled. The image below shows a VSImax score of 101.

VSIMax with AV

VSIMax with AV

The impact on VSImax with Deep Security agentless protection enabled is only 4 users per blade, which is only a 3.8% user penalty. Shown below is the CPU MHz usage of the DSVA during the LoginVSI test. CPU usage peaks at 550MHz, which is roughly 1.4% of the total MHz available on the host (39888MHz). An acceptable penalty to keep our security boys happy!

DSVA CPU MHz

DSVA CPU MHz
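For reference, the overhead figures quoted above break down as follows; a minimal sketch using only the numbers already given.

```python
# Cost of enabling Deep Security agentless AV, from the test results above.
vsimax_no_av, vsimax_av = 105, 101
host_mhz_total, dsva_mhz_peak = 39888, 550

user_penalty_pct = (vsimax_no_av - vsimax_av) / vsimax_no_av * 100
dsva_cpu_pct = dsva_mhz_peak / host_mhz_total * 100
print(f"User density penalty: {vsimax_no_av - vsimax_av} users ({user_penalty_pct:.1f}%)")
print(f"DSVA peak CPU usage:  {dsva_cpu_pct:.1f}% of host capacity")

# User density penalty: 4 users (3.8%)
# DSVA peak CPU usage:  1.4% of host capacity
```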

Improving Citrix PVS 6.1 write cache performance on ESXi 5 with WcHDNoIntermediateBuffering


I’ve been doing a lot of Citrix XenApp 6.5 and PVS 6.1 performance tuning in our ESXi 5 environment recently. This post is about an interesting Citrix PVS registry setting that is no longer enabled by default in PVS 6.1. Credit to Citrix guru Alex Crawford for alerting me to this.

The setting is called WcHDNoIntermediateBuffering – there is an article, CTX126042, on the Citrix website, but it is out of date and only applies to PVS 5.x.

What I noticed in our ESXi 5 environment was that if you compared an IOmeter test on the write cache volume with the same test on the read-only PVS image C:, you would see a huge IO penalty incurred when writes are redirected by PVS to the .vdiskcache file. In my testing with IOmeter, I would regularly achieve ~27000 IOPS (shown below) with a VDI test on the persistent disk.

Persistent Disk IO without PVS

Persistent Disk IO without PVS

When the same test was run against the read-only C: and the PVS driver had to intercept every write and redirect it to the .vdiskcache file, IOPS would drop to around 1000 – a 27x reduction, which is a pretty massive penalty.

WcHDNoIntermediateBuffering Disabled

WcHDNoIntermediateBuffering Disabled

Clearly this bottleneck affects write cache performance and latency, and directly impacts write-intensive operations such as user logons and application launches, which in turn degrades the user experience.

WcHDNoIntermediateBuffering enables or disables intermediate buffering, which aims to improve system performance. In PVS 5.x, if no registry value was set (the default), PVS used an algorithm based on the free space available on the write cache volume to determine whether the setting was enabled.

This is no longer the case: in PVS 6.x the algorithm is gone and WcHDNoIntermediateBuffering is disabled unless explicitly enabled. I have confirmed this with Citrix Technical Support. Why was it changed? Not sure – probably too onerous for Citrix to support. Here are two current articles relating to issues with the setting: CTX131112 and CTX128038.

With PVS 6.1 the behaviour of the “HKLM\SYSTEM\CurrentControlSet\Services\BNIStack\Parameters\WcHDNoIntermediateBuffering” value is as follows:

  • No value present – Disabled
  • REG_DWORD=0 – Disabled
  • REG_DWORD=1 – Disabled
  • REG_DWORD=2 – Enabled

As you can see, the default behaviour is now Disabled, and the only way to enable WcHDNoIntermediateBuffering is to set the value to 2.
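If you want to script the change rather than edit the registry by hand, something like the following works from inside the vDisk image while it is writable (private/maintenance mode). This is a minimal sketch using Python's standard winreg module, not an official Citrix procedure; run it elevated, and the target device still needs a reboot for the BNIStack driver to pick up the change.

```python
# Set WcHDNoIntermediateBuffering = 2 (Enabled) for Citrix PVS 6.x.
# Run as Administrator inside the vDisk image while it is writable.
import winreg

KEY_PATH = r"SYSTEM\CurrentControlSet\Services\BNIStack\Parameters"

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                        winreg.KEY_SET_VALUE) as key:
    # Per the table above: no value, 0 or 1 = Disabled; 2 = Enabled.
    winreg.SetValueEx(key, "WcHDNoIntermediateBuffering", 0, winreg.REG_DWORD, 2)

print("WcHDNoIntermediateBuffering set to 2 - reboot the target device to apply.")
```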

In testing in our ESXi 5 environment, with XenApp VMs on version 8 virtual hardware and an eager-zeroed persistent disk on a SAS storage pool behind the paravirtual SCSI adapter, I saw a roughly 20x increase in IO with WcHDNoIntermediateBuffering enabled. Throughput with the setting enabled is 76% of the true IO of the disk, which is a much more manageable penalty.

WcHDNoIntermediateBuffering Enabled

WcHDNoIntermediateBuffering Enabled

Enabling WcHDNoIntermediateBuffering increased IOPS in our IOmeter VDI tests from 1000 IOPS to over 20000 IOPS – a pretty massive 20x increase.

Bottom Line: While CPU will be the bottleneck in most XenApp environments, if you are looking for an easy win, enabling this setting will align write cache IO performance closer to the true IO of your disk, eliminating a write cache bottleneck and improving the user experience on your PVS clients. We’ve rolled this into production without any issues and I recommend you do too.

Update 15/08/2013: Since upgrading to PVS 6.1 HF16 I’ve not seen any deterioration in IOmeter tests between our persistent disk and the read-only C:\. This may be due to improvements in HF16 or to changes in our XenApp image, but it is good news nonetheless, as there is now no IO penalty on the system drive with WcHDNoIntermediateBuffering enabled.

Recreating the test in your environment:

I used a simple VDI test to produce these results: 80% writes / 20% reads, 100% random IO, 4KB block size, run for 15 minutes.

Follow these instructions to run the same test:

  1. Download the attachment and rename it to iometer.icf.
  2. Spin up your XenApp image in standard mode
  3. Install IOmeter
  4. Launch IOmeter
  5. Open iometer.icf
  6. Select the computer name
  7. Select your Disk Target (C:, D:, etc)
  8. Click Go
  9. Save Results
  10. Monitor the Results Display to see Total I/O per second

VCE vBlock – 1st Year in Review


Well, we have just passed a year of vBlock ownership, and it has been a rather painless year.

Our vBlock was one of the first out there, delivered in November 2011. I wanted to provide some pros and cons of vBlock ownership. Some of the themes are not vBlock specific, but worth bearing in mind because there will always be a gap between what you hear from pre-sales and what the reality is.

Pros:

VCE – The company has been constantly improving which is good to see. Not content to rest on their laurels, they really have grabbed the bull by the horns and they are innovating in a lot of areas.

vBlock – The concept of the vBlock itself deserves a mention. VCE are definitely on the right path… it’s like the first-generation Model T Ford. I’m sure old Henry had hundreds of suppliers providing the components for his Model T; he came along with assembly-line production and put it all together, and that is what is happening over at VCE. Over time I’m hoping that the integration between components will become more and more seamless as the demand for pre-configured virtualisation platforms grows and the designers behind each of the components are forced to work closer together.

Management and Support – If you have a bloated IT support team in a large, sprawling organisation, a vBlock can help reduce your head count by simplifying your environment. One thing converged infrastructure platforms are good for is breaking down the traditional support silos around storage, network, compute and virtualisation. When all the components are so tightly integrated, your siloed operations teams morph into one.

Compatibility Matrix – This has to be the biggest selling point in my book: it takes away the pain of ensuring compatibility between so many different components. The VCE matrix is far more stringent than individual vendor product testing and therefore far more trustworthy. Try getting a complete infrastructure upgrade across storage, network, compute and virtualisation components through your change management team in a single weekend – it’s not going to happen unless it’s been pre-tested.

Single line of support – Being able to call a single number when there is any issue, immensely simplifies fault finding and problem resolution. Worth it alone just for this and the matrix.

Single pane of glass – This is where UIMp is starting to come into its own. It’s been a long road, but the future looks good. VCE’s goal is to replace each of the individual management consoles so that VCE customers can use UIMp for all their automated provisioning. When it works, it really does simplify provisioning.

Customer Advocate – In my experience the customer advocate offers great value: extremely useful when managing high severity incidents and ensuring your environment remains up to date and in support, with regular service reviews, and providing an easy path into VCE to organise training sessions, source bodies to fill gaps in support, get a direct line to escalation engineers and deal with any queries and questions you may have about your environment.

Cons:

The AMP – The major design flaw in the AMP for me is the 1Gb network. Data transfers between VMs in our 10Gb service cluster can achieve 300 Mbps; as soon as the AMP is involved it drops to 30 Mbps. Really annoying, and what sits in the AMP? vCenter, which is used to import virtual machines. Let’s say you are doing a migration of 1000 VMs… that 30 Mbps is going to get really annoying, and it has.

Cost – The vBlock hardware isn’t so bad, but what really surprised me is the number and cost of the licenses. Want to add a UCS blade? No problem, that will be £5k for the blade and about £3k for the licenses – UCS, UIMp, VNX, vSphere, etc. It all adds up pretty quickly, so ensuring you adequately size your UCS blades up front, i.e. with plenty of memory and CPU, is really important.

Management & Support – Converged infrastructure platforms require a lot of ongoing support and management. This is not an issue limited to VCE; it’s just the nature of the beast. If you have an immature IT organisation and have had a fairly piecemeal IT infrastructure and support team up until now, you will be in for a shock when you purchase a converged infrastructure platform. There’s no doubt a vBlock is an excellent product, but it’s excellent because it uses the latest and greatest, which can be complex. It also comprises multiple products from three different vendors – EMC, Cisco and VMware – so you need the right skillset to manage it, which can be expensive to find and train; it takes at least a year for someone to become familiar with all components of the vBlock. You’re always going to have employees with core skills like virtualisation, storage, network and compute, but you do want people to broaden their skills and be comfortable with the entire stack.

Integration between products – See above: multiple products from three different vendors. At the moment the VCE wrapper is just that – little more than a well designed wrapper, lots of testing and a single line of support. OK, so EMC own VMware, but it seems to make little difference; EMC can’t even align products within their own company, so how on earth can they expect to align products with a subsidiary? If the vBlock is going to be a single-vendor product, then all three vendors need to invest in closer co-operation to align product lifecycles and integration. VMware releases vCenter 5.1 and PowerPath has to release an emergency patch to support it? Going back to my Model T analogy, the vBlock is never going to become a real Model T until Cisco buys EMC, or EMC drops Cisco and starts making the compute/network components itself. Not so far fetched.

Complexity – The VCE wrapper hasn’t changed the complexity (the same is true of HP or FlexPod). This is another myth: “We’ve made it simple!” Er, no, you haven’t – you’ve just done all the design work and testing for us. Until the integration above takes place, which will allow for simplification of the overall package, it’s going to remain just a wrapper around an extremely complex piece of kit. VCE have focused efforts on improving UIMp to simplify vBlock provisioning and management through a single interface, but really these are just band-aids while the individual components are made by separate companies.

Patching – Even though there is a compatibility matrix, which does the integration and regression testing for you, it still doesn’t take away the pain/effort of actually deploying the patches. Having a vBlock doesn’t mean there is no patching required. This is a common pre-sales myth: ‘Don’t worry about it, we’ll do all the patching for you.’ Sure, but at what cost? Security patches, bug fixes and feature enhancements come out more or less monthly, and this has to be factored into your budget and overtime costs.

Monitoring and Reporting – This is a pain, and I know there are plans afoot at VCE to simplify it, but currently there is no single management point you can query to monitor the vitals of a vBlock. If you want to know the status of UCS it’s UCS Manager, the VNX it’s Unisphere, ESXi it’s vCenter, and so on. For example, you buy vCOps, but that only plugs into vCenter, so you only see the resources that have been assigned to vCenter. Getting a helicopter view of the entire vBlock from a single console is impossible. UIMp gives you a bit of a storage overview – available vs provisioned – but not much more than that. So you end up buying tactical solutions for each of the individual components, like VNX Monitoring and Reporting. Hopefully soon we will be able to query a single device and get up-to-date health checks and alerting for all vBlock components.

Niggles – There have been a few small niggles, mainly issues between vCenter/Cisco Nexus 1000V and vCenter/VNX 7500, but overall, for the amount of kit we purchased, it has not been bad. I think a lot of these issues had to do with vCenter 5/ESXi 5; as soon as Update 1 came out, everything settled down. Note to self: don’t be too quick to upgrade to vCenter 6/ESXi 6!