Improving Citrix PVS 6.1 write cache performance on ESXi 5 with WcHDNoIntermediateBuffering


I’ve been doing a lot of Citrix XenApp 6.5 and PVS 6.1 performance tuning in our ESXi 5 environment recently. This post is about an interesting Citrix PVS registry setting that is no longer enabled by default in PVS 6.1. Credit to Citrix guru Alex Crawford for alerting me to this.

The setting is called WcHDNoIntermediateBuffering. There is an article on the Citrix website, CTX126042, but it is out of date and only applies to PVS 5.x.

What I noticed in our ESXi 5 environment was that if you compared an IOmeter test on the write cache volume against the same test on the PVS image’s read-only C:, there was a huge IO penalty incurred when PVS redirected writes to the .vdiskcache file. In my IOmeter testing, I would regularly achieve ~27,000 IOPS (shown below) with a VDI test on the persistent disk.

Persistent Disk IO without PVS

When the same test was run against the read-only C: and the PVS driver had to intercept every write and redirect it to the .vdiskcache file, IOPS dropped to around 1,000 (roughly a 27x penalty), which is pretty massive.

WcHDNoIntermediateBuffering Disabled

Clearly this bottleneck hurts write cache performance and latency, and directly affects write-intensive operations such as user logons and application launches, which in turn degrades the user experience.

WcHDNoIntermediateBuffering enables or disables intermediate buffering, which aims to improve system performance. In PVS 5.x, if no registry value was set (the default), PVS used an algorithm based on the free space available on the write cache volume to decide whether the setting should be enabled.

This is no longer the case: in PVS 6.x that automatic behaviour is gone and WcHDNoIntermediateBuffering is disabled unless you explicitly enable it. I have confirmed this with Citrix Technical Support. Why was it changed? I’m not sure; probably it was too onerous for Citrix to support. Here are two current articles relating to issues with the setting: CTX131112 and CTX128038.

With PVS 6.1 the behaviour of the “HKLM\SYSTEM\CurrentControlSet\Services\BNIStack\Parameters\WcHDNoIntermediateBuffering” value is as follows:

  • No value present (Disabled)
  • REG_DWORD=0 (Disabled)
  • REG_DWORD=1 (Disabled)
  • REG_DWORD=2 (Enabled)

As you can see, the default behaviour is now disabled, and the only way to enable WcHDNoIntermediateBuffering is to set the value to 2.
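For reference, here is a quick sketch of how you could set the value with PowerShell on your gold image (the vDisk needs to be in private mode for the change to persist, as discussed in the comments below). The registry path is the one given above; an elevated PowerShell session is assumed.

    # Run in an elevated PowerShell session on the gold image (vDisk in private mode).
    # Creates the value if it is missing, or overwrites it if it already exists.
    New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Services\BNIStack\Parameters" `
        -Name WcHDNoIntermediateBuffering -PropertyType DWord -Value 2 -Force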

In testing in our ESXi 5 environment, with XenApp VMs running on virtual hardware version 8, an eager-zeroed persistent disk on a SAS storage pool and the paravirtual SCSI (PVSCSI) adapter, I saw a more than 20x increase in IO with WcHDNoIntermediateBuffering enabled. Throughput with WcHDNoIntermediateBuffering enabled is around 76% of the true IO of the disk, which is a much more manageable penalty.

WcHDNoIntermediateBuffering Enabled

Enabling WcHDNoIntermediateBuffering increased IOPS in our IOmeter VDI tests from around 1,000 to over 20,000, a pretty massive 20x increase.

Bottom Line: While CPU will be the bottleneck in most XenApp environments, if you are looking for an easy win, enabling this setting will bring write cache IO performance closer to the true IO of your disk, eliminating a write cache bottleneck and improving the user experience on your PVS clients. We’ve rolled this into production without any issues, and I recommend you do too.

Update 15/08/2013: Since upgrading to PVS 6.1 HF16, I have not seen any deterioration in IOmeter tests between our persistent disk and the read-only C:\. This may be due to improvements in HF16 or to changes in our XenApp image, but it is good news nonetheless, as there is now no IO penalty on the system drive with WcHDNoIntermediateBuffering enabled.

Recreating the test in your environment:

I used a simple VDI test to produce these results: 80% writes / 20% reads, 100% random IO, 4KB blocks, run for 15 minutes.

Follow these instructions to run the same test:

  1. Download the attachment and rename it to iometer.icf.
  2. Spin up your XenApp image in standard mode.
  3. Install IOmeter.
  4. Launch IOmeter.
  5. Open iometer.icf.
  6. Select the computer name.
  7. Select your Disk Target (C:, D:, etc.).
  8. Click Go.
  9. Save the results.
  10. Monitor the Results Display to see Total I/O per second.
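
Before starting the test on a streamed target, it may also be worth confirming the registry value actually survived into the image you are booting. A minimal check, assuming the same registry path as above:

    # Read the value back on the booted target device; expect 2 if the setting took effect.
    (Get-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Services\BNIStack\Parameters" `
        -Name WcHDNoIntermediateBuffering).WcHDNoIntermediateBuffering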

17 responses to “Improving Citrix PVS 6.1 write cache performance on ESXi 5 with WcHDNoIntermediateBuffering”

  1. Interesting setting! Can you provide the IOmeter settings you used? Just to have a comparison between our environment and your statistics. Also, when we perform the IOmeter tests on the C:\ volume, IOmeter hangs at “preparing drives”. Did you change anything in order to run the tests on C:\?

    • Indeed, it is a very interesting setting. Hopefully you see similar results.

      The test was a 4K 80% writes / 20% reads test with 100% random IO, to mimic a typical XenApp\VDI workload. I’ll try to upload it to the post.

      By default, IOmeter’s Maximum Disk Size is configured to 0, so it will try to fill up the disk with its test file, which can take forever if you have lots of free space. I used a 1GB disk size in my tests to limit the size of the test file.

      Change the Maximum Disk Size to 2048000 and see how you get on.
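
      For reference, IOmeter’s Maximum Disk Size is specified in 512-byte sectors, so 2,048,000 sectors × 512 bytes works out to roughly 1GB for the test file.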

  2. Pingback: RT @IngmarVerheij: Improving Citrix PVS 6.1 write… | The Architect's Experience

  3. Pingback: Best Practices: Atlantis ILIO and Citrix PVS (Provisioning Services) | Atlantis Computing Blog

  4. Hello, I’m curious how you set up your VM with the ParaVirtual SCSI controller. Normally, my PVS targets have the LSI Logic controller connected to the write cache drive, and when the OS boots, the C drive is the streamed OS and the D drive is the write cache. However, if I take that same machine and switch the controller to ParaVirtual, only the provisioned C drive shows up. The D drive (write cache) shows up as disabled, and Disk Admin shows it as “Disabled by policy”.

    Thoughts?

    Thanks for your help.

    • Hmmmmmm, I can’t remember us having any issues. The PVSCSI driver is included with VMware Tools, so as long as that is up to date, I can’t think of why it isn’t working.

      Have you run a PVSCSI disk on your image to make sure the driver is installed and the hard disk is working before capturing the image?

      • Ah, that’s probably what it is. I’ll move the vdisk to private mode and try it that way. Any idea what the PVSCSI controller shows up as in Windows under Device Manager? Also, this is the only blog I’ve seen that actually recommends using the PVSCSI controller for PVS targets. For PVS vDisk stores I have read about it a little, but for the targets, not so much.

  5. Pingback: Improving Citrix PVS 6.1 write cache performance on ESXi 5 with WcHDNoIntermediateBuffering | blocksandbytes | My Blog

  6. Probably a ‘dumb’ question, but is the intermediate buffering registry key set on the PVS Server itself or on my Gold Image? This is unclear in all of Citrix’s articles as well as all the blogs I’ve read.

    Secondly:
    http://support.citrix.com/article/CTX126042
    The article above says to change the vDisk image to private mode first; is that really necessary? I believe I’d have to shut down all my session hosts in order to change the vDisk to private mode.

    Thirdly: regarding the mention of “eager zero”, are you speaking of the eager zero setting on thin-provisioned VMDKs in the VMware properties of the guest, or an eager zero setting elsewhere? I can’t find any eager zero related settings in PVS.

    My environment is 6 XenApp 7.1 session hosts that are derived from a PVS hosted read-only vDisk.

    Thanks for answering this. If you ever have an Exchange question let me know I’ll hook you up.

    Chris

  7. Hey Chris,

    This registry setting is set on your gold image\vDisk, which is why the vDisk must be put into private mode as per the CTX126042 article.

    If you didn’t put the image into private mode to update it and seal it afterwards, the setting would never take effect, as it would be lost on every reboot. You could try applying the setting via Group Policy Preferences, but the disk would initialise before GPP applied the registry value, so you would never see the benefit.

    Yes, you would eventually have to reboot all your session hosts to apply the setting, but this can be phased in with PVS image version control. Most PVS customers reboot their session hosts on a daily\weekly schedule, so the impact on users of phasing in a new vDisk/gold image should be minimal.

    Yes, the eager zero setting in the vSphere VM datastore settings. There are three VMDK disk formats: thin, lazy zeroed and eager zeroed. I dislike thin VMDKs, as most backend storage pools are already thin provisioned; thin provisioning twice makes capacity planning very difficult, and most of the time customers only become aware they are running out of space after the fact. Lazy zeroed is preferred for most virtual machines, as it doesn’t zero out the VMDK (and the datastore) like eager zeroed does and so doesn’t use up space on thin-provisioned storage pools, but it still allows accurate reporting of datastore usage on your hypervisor.

    From experience, I now always recommend the PVSCSI adapter and eager-zeroed VMDKs for latency-sensitive applications like Citrix, SQL and Exchange, where disk performance and scalability are critical.
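
    If it helps, here is a rough PowerCLI sketch of how I would attach an eager-zeroed write cache disk on a PVSCSI controller to a target VM. The VM name and the 20GB size are just placeholders for your own environment, and it assumes you are already connected with Connect-VIServer.

        # Placeholder VM name - substitute your own target device
        $vm = Get-VM -Name "XENAPP01"

        # Add the write cache disk as eager-zeroed thick
        $wcDisk = New-HardDisk -VM $vm -CapacityGB 20 -StorageFormat EagerZeroedThick

        # Move the new disk onto a paravirtual SCSI controller
        New-ScsiController -HardDisk $wcDisk -Type ParaVirtual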

    As per the CTX article, test on your image with IOmeter before and after to make sure the changes you make to your environment show an improvement.

    Gareth

    • And it is fairly easy to convert a thin disk to eager zeroed: browse the datastore, right-click the .vmdk file and select Inflate. Make sure the VM is powered off and there are no active snapshots.
