Improving Citrix PVS 6.1 write cache performance on ESXi 5 with WcHDNoIntermediateBuffering

I’ve been doing a lot of Citrix XenApp 6.5 and PVS 6.1 performance tuning in our ESXi 5 environment recently. This post is about an interesting Citrix PVS registry setting that is no longer enabled by default in PVS 6.1. Credit to Citrix guru Alex Crawford for alerting me to this.

The setting is called WcHDNoIntermediateBuffering. Citrix still publishes an article about it, CTX126042, but that article is out of date and applies only to PVS 5.x.

In our ESXi 5 environment, I noticed that comparing an IOmeter test on the write cache volume (the persistent disk) with the same test on the read-only C: of the PVS image showed a huge IO penalty when PVS redirects writes to the .vdiskcache file. Running a VDI test directly against the persistent disk in IOmeter, I would regularly achieve ~27,000 IOPS (shown below).

Persistent Disk IO without PVS

When the same test was run against the read-only C:, where the PVS driver has to intercept every write and redirect it to the .vdiskcache file, IOPS dropped to about 1,000. That is a 27-fold reduction, which is a pretty massive penalty.

WcHDNoIntermediateBuffering Disabled

Clearly this bottleneck affects write cache performance and latency. It directly hits write-intensive operations such as user logon and application launch, which in turn degrades the user experience.

WcHDNoIntermediateBuffering enables or disables intermediate buffering, with the aim of improving system performance. In PVS 5.x, if no registry value was set (the default), PVS used an algorithm based on the free space available on the write cache volume to decide whether the setting was enabled.

This is no longer the case. In PVS 6.x, WcHDNoIntermediateBuffering stays disabled unless you explicitly enable it, and I have confirmed this with Citrix Technical Support. Why was it disabled? I'm not sure, but it was probably too onerous for Citrix to support. Two current articles, CTX131112 and CTX128038, describe issues with the setting.

With PVS 6.1 the behaviour of the “HKLM\SYSTEM\CurrentControlSet\Services\BNIStack\Parameters\WcHDNoIntermediateBuffering” value is as follows:

  • No value present (Disabled)
  • REG_DWORD=0 (Disabled)
  • REG_DWORD=1 (Disabled)
  • REG_DWORD=2 (Enabled)

As you can see the default behaviour is now disabled and the only way to enable WcHDNoIntermediateBuffering is to set the value to 2.
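The mapping above can be sketched in a few lines of Python. This is only an illustration, not a Citrix tool: the helper names (`is_enabled`, `reg_file_text`, `read_current_value`) are my own, and the only facts taken from this post are the key path, the value name, and the rule that 2 is the sole enabling value. The registry read uses Python's standard `winreg` module, so that part only works on Windows.

```python
"""Sketch: interpret and generate the WcHDNoIntermediateBuffering setting for PVS 6.1."""

KEY_PATH = r"SYSTEM\CurrentControlSet\Services\BNIStack\Parameters"
VALUE_NAME = "WcHDNoIntermediateBuffering"


def is_enabled(value):
    """PVS 6.1 behaviour as listed above: only REG_DWORD=2 enables the setting.

    Pass None to represent "no value present".
    """
    return value == 2


def reg_file_text(value=2):
    """Build the text of a .reg file that sets the value under HKLM."""
    return (
        "Windows Registry Editor Version 5.00\r\n\r\n"
        f"[HKEY_LOCAL_MACHINE\\{KEY_PATH}]\r\n"
        f'"{VALUE_NAME}"=dword:{value:08x}\r\n'
    )


def read_current_value():
    """Read the value on a Windows machine; returns None if it is absent."""
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # Print a .reg file body that enables the setting (value 2).
    print(reg_file_text(2))
```

Saving the printed text as a .reg file and importing it in the image is one way to apply the value; running `is_enabled(read_current_value())` on a target device is a quick check that it is actually set.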

In testing in our ESXi 5 environment, with XenApp VMs on virtual hardware version 8, an eager-zeroed persistent disk on a SAS storage pool and the paravirtual SCSI adapter, I saw a 20x increase in IO with WcHDNoIntermediateBuffering enabled. Throughput with WcHDNoIntermediateBuffering enabled is 76% of the true IO of the disk, which is a much more manageable penalty.

WcHDNoIntermediateBuffering Enabled

Enabling WcHDNoIntermediateBuffering increased IOPS in our IOmeter VDI tests from 1,000 IOPS to over 20,000 IOPS, a pretty massive 20x increase.
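For anyone who wants to check the ratios, here is the arithmetic as a short Python sketch. It uses the rounded figures quoted in this post, not the exact IOmeter readings, so the last number comes out slightly below the 76% mentioned earlier (the enabled result was "over" 20,000 IOPS).

```python
# Rounded IOPS figures quoted in this post; the exact measured values differ slightly.
baseline_iops = 27000   # persistent disk, no PVS redirection
disabled_iops = 1000    # read-only C:, WcHDNoIntermediateBuffering disabled
enabled_iops = 20000    # read-only C:, WcHDNoIntermediateBuffering enabled ("over 20,000")

penalty_disabled = baseline_iops / disabled_iops      # how much slower than the raw disk
gain_from_enabling = enabled_iops / disabled_iops     # improvement from the registry change
share_of_raw_disk = enabled_iops / baseline_iops * 100

print(f"Penalty with setting disabled: {penalty_disabled:.0f}x")
print(f"Gain from enabling:            {gain_from_enabling:.0f}x")
print(f"Share of raw disk IOPS:        {share_of_raw_disk:.0f}%")
```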

Bottom Line: While CPU will be the bottleneck in most XenApp environments, if you are looking for an easy win, enabling this setting will align write cache IO performance closer to the true IO of your disk, eliminating a write cache bottleneck and improving the user experience on your PVS clients. We’ve rolled this into production without any issues and I recommend you do too.

Update 15/08/2013: Since upgrading to PVS 6.1 HF16, I’ve no longer seen any deterioration in IOmeter tests between our persistent disk and the read-only C:. This may be due to improvements in HF16 or to changes in our XenApp image, but it is good news nonetheless, as there is now no IO penalty on the system drive with WcHDNoIntermediateBuffering enabled.

Recreating the test in your environment:

I used a simple VDI test to produce these results: 80% writes / 20% reads, 100% random IO, 4 KB transfers, run for 15 minutes.

Follow these instructions to run the same test:

  1. Download the attachment and rename it to iometer.icf.
  2. Spin up your XenApp image in standard mode
  3. Install IOmeter
  4. Launch IOmeter
  5. Open iometer.icf
  6. Select the computer name
  7. Select your Disk Target (C:, D:, etc)
  8. Click Go
  9. Save Results
  10. Monitor the Results Display to see Total I/O per second
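
If you just want a feel for the access pattern used in this test (80% writes / 20% reads, 100% random, 4 KB), it can be roughly approximated in Python. This is a sketch of my own and not a replacement for IOmeter: the function name and file path are hypothetical, it runs a single thread with one outstanding IO, and the operating system's file cache is still in play, so the numbers are not comparable with the IOmeter results above.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096          # 4 KB transfers, as in the test above
WRITE_FRACTION = 0.8       # 80% writes / 20% reads


def run_mixed_io(path, file_size, duration_s, seed=None):
    """Issue random 4 KB reads and writes against `path`.

    Returns (operations per second, fraction of operations that were writes).
    """
    rng = random.Random(seed)
    blocks = file_size // BLOCK_SIZE
    payload = os.urandom(BLOCK_SIZE)

    # Pre-size the test file so every random offset is valid.
    with open(path, "wb") as f:
        f.truncate(blocks * BLOCK_SIZE)

    ops = writes = 0
    # buffering=0 disables Python's own buffer only; the OS cache still applies.
    with open(path, "r+b", buffering=0) as f:
        start = time.perf_counter()
        deadline = start + duration_s
        while time.perf_counter() < deadline:
            f.seek(rng.randrange(blocks) * BLOCK_SIZE)   # 100% random offsets
            if rng.random() < WRITE_FRACTION:
                f.write(payload)
                writes += 1
            else:
                f.read(BLOCK_SIZE)
            ops += 1
        elapsed = time.perf_counter() - start

    return ops / elapsed, writes / max(ops, 1)


if __name__ == "__main__":
    # Hypothetical target: point this at the drive you want to exercise.
    target = os.path.join(tempfile.gettempdir(), "iotest.bin")
    iops, write_share = run_mixed_io(target, file_size=64 * 1024 * 1024, duration_s=10)
    print(f"{iops:.0f} ops/sec, {write_share:.0%} writes")
    os.remove(target)
```

Treat the output as a relative indicator only, for example to compare C: against the persistent disk on the same VM before and after the registry change.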