Deep Security 8 and DSAFILTER_HEAP_MAX_SIZE


I have been having major reliability issues with Deep Security 8 for the last 2 weeks.

There is a lot of confusion out there with the recommended settings for a stable Deep Security 8 environment, which isn’t helped when the vendor doesn’t have any KB articles or updated documentation publicly available. It appears they would rather their customers test their products for them and keep their teething issues with ESXi 5.0 under wraps. See my last post here.

This brings me to my issues with setting DSAFILTER_HEAP_MAX_SIZE on each ESXi host. There is a recommended workaround in KB1055625, “How to enable the Deep Security Virtual Appliance (DSVA) to support more than 25 virtual machines on ESX.” What most people don’t realise is that this workaround is recommended for Trend DS 7.0/7.5, not Trend DS 8.

The article states that DSAFILTER_HEAP_MAX_SIZE should be roughly 1MB for each VM planned. This results in most customers setting a HEAP_MAX_SIZE of around 40 – 50MB. That is about a tenth of what it should be for DS 8 according to Trend Engineering (512MB)!

Here is a quick summary:

I set DSAFILTER_HEAP_MAX_SIZE to 50MB on all my ESXi hosts. In theory, according to the knowledge base article, this should be sufficient for 40 VMs.

For Deep Security 8, Trend actually recommend not changing this value, as the dsafilter is supposed to dynamically adjust the HEAP_MAX_SIZE according to the number of VMs on the host. If you have set the value manually as per the KB article, OR you encounter symptoms such as VMs losing connectivity intermittently and then the hosts PSODing after 30 minutes to 2 hours, then the most likely cause is an inadequate HEAP_MAX_SIZE and the filter driver running out of memory.

Here are the calculations from Trend Engineering. They have used a figure of 70 VMs but state “Even if you are not planning on running as many VM’s it is still suggested to set the value to 512mb.”

—————————————————————————————————————————————–

We use 432 bytes per TCP connection, so for the calculation let’s round up to 512 bytes per TCP connection.

Assume the default value for Maximum TCP Connections: <SystemSetting name="maxConnectionsTcp" value="10000" />

With this, 512 bytes * 10000 connections = 5MB per VM, so we need 70 * 5MB = 350MB minimum of visor memory for the connection tables alone.

In addition to this we will need ~70MB to run the FD, so it comes to 420MB.

We recommend bumping to 512MB and verifying the results, e.g. esxcfg-module -s DSAFILTER_HEAP_MAX_SIZE=536870912 dvfilter-dsa

To verify the setting, execute: % esxcfg-module -g dvfilter-dsa

The setting will not take effect until the driver is reloaded. Reloading will require either a reboot of ESX (best option) or unloading/loading of the driver.

———————————————————————————————————————————————-
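The sizing arithmetic quoted above can be sketched in a few lines of shell. This is only an illustration of the numbers in Trend’s email; the function names heap_estimate_mb and mb_to_bytes are my own and are not part of any Trend or VMware tool.

```sh
#!/bin/sh
# Illustrative sketch of the quoted sizing calculation.
# Function names and defaults are assumptions made for this example.

# heap_estimate_mb VMS [CONNS_PER_VM] [BYTES_PER_CONN] [FD_OVERHEAD_MB]
# Prints the estimated minimum heap in MB. Uses 1 MB = 1,000,000 bytes and
# integer division, which reproduces the "5MB per VM" rounding in the quote
# (512 * 10000 = 5,120,000 bytes, truncated to 5).
heap_estimate_mb() {
    vms=$1
    conns=${2:-10000}
    bytes_per_conn=${3:-512}
    overhead_mb=${4:-70}
    per_vm_mb=$(( (conns * bytes_per_conn) / 1000000 ))
    echo $(( vms * per_vm_mb + overhead_mb ))
}

# mb_to_bytes MB
# Converts a size in binary megabytes to the byte value passed to esxcfg-module.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}
```

For 70 VMs with the defaults, heap_estimate_mb 70 gives 420, and mb_to_bytes 512 gives the 536870912 used in the esxcfg-module command above.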

If you are running DS 8 and ESXi 5.0 and you have already implemented the MOD_TIMER=0 fix, the next step is to manually set DSAFILTER_HEAP_MAX_SIZE to 512MB.

This solved the issue for me and we haven’t had a single crash since.

About frikking time!

———————————————————————————————————————————————-

Update 30/05/2012

Michael Gioia from Trend contacted me to give me a more detailed analysis of why this issue occurs.

First of all, there is now a Trend KB article for this issue: http://esupport.trendmicro.com/solution/en-us/1060125.aspx

It sounds like there are a few unhappy customers out there. In Trend’s defence, the point Michael put across is that it is very difficult to ensure 100% compatibility between two products that are integrated in the kernel, such as the ESXi hypervisor and the Trend filter.

This anomaly is only supposed to occur in very rare circumstances, when the filter driver is under severe memory exhaustion (circa 4 bytes).

In theory this issue shouldn’t occur with many customers then, but to be honest my setup was not anything special or extreme, so I’d be surprised if more customers weren’t affected by this.

On the plus side, the developers should have released a fix in 7.5 SP4 and 8.0 SP1 which are both available now.

I still have to verify this – until I am confident the issue is resolved my HEAP_MAX_SIZE settings will remain!

 


Known Issue! PSOD with Security 8 and ESXi 5.0


It looks like there is a known issue with Trend Deep Security 8 and ESXi 5.0 that causes PSOD. I cannot find a KB article yet, so am documenting it here. Hopefully this will help some people who don’t have a Trend support contract.

There is a known issue with ESXi 5.0 and Deep Security 8.0. A number of customers are experiencing ESXi system crashes (purple screen of death). By default the Deep Security Filter Driver will attempt to multiplex a single kernel timer across all virtual machines, to ensure they perform a maintenance task every 30 seconds.

This appears to be creating the instability issues and causing the system crashes as using a single timer across all VMs is complex to manage and implement.

The workaround is to disable this setting, so that the maintenance tasks execute without the timer. They run periodically anyway when the system processes packets, so there is no impact from making this change.

  1. SSH to the ESXi host and check whether you have modified DSAFILTER_HEAP_MAX_SIZE, the Filter Driver heap memory size, by running: % esxcfg-module -g dvfilter-dsa
  2. If you have not configured the DSAFILTER_HEAP_MAX_SIZE value, just set DSAFILTER_MOD_TIMER_ENABLED to 0 with the following command: % esxcfg-module -s DSAFILTER_MOD_TIMER_ENABLED=0 dvfilter-dsa
  3. If you have configured the DSAFILTER_HEAP_MAX_SIZE value, use the following command to preserve your existing setting: % esxcfg-module -s "DSAFILTER_HEAP_MAX_SIZE=<value that you got from the last query> DSAFILTER_MOD_TIMER_ENABLED=0" dvfilter-dsa
  4. You should now see DSAFILTER_MOD_TIMER_ENABLED=0 in the options value when you run % esxcfg-module -g dvfilter-dsa
  5. Reboot the ESXi server for the changes to take effect. Note: The setting will not take effect until the driver is reloaded. Reloading will require a reboot (best option) of ESXi or unloading/loading of the driver.
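Steps 1 to 3 boil down to “keep any existing heap setting and add the timer option”. Here is a rough sketch of that logic. The helper name build_dsa_options is made up for this example, and it assumes the output of esxcfg-module -g contains the option string, possibly wrapped in quotes, so check the output on your own host before trusting it.

```sh
#!/bin/sh
# Illustrative sketch only: build_dsa_options is a hypothetical helper.
# Assumption: the text passed in is the output of
#   esxcfg-module -g dvfilter-dsa
# and any existing option appears as DSAFILTER_HEAP_MAX_SIZE=<bytes>,
# possibly inside single or double quotes.

# build_dsa_options "CURRENT_OUTPUT"
# Prints the option string to set: the existing heap size (if any)
# followed by DSAFILTER_MOD_TIMER_ENABLED=0.
build_dsa_options() {
    current=$1
    # Strip quotes, split on spaces, keep the first heap-size token if present.
    heap=$(printf '%s\n' "$current" | tr -d "'\"" | tr ' ' '\n' \
        | grep '^DSAFILTER_HEAP_MAX_SIZE=' | head -n 1)
    if [ -n "$heap" ]; then
        echo "$heap DSAFILTER_MOD_TIMER_ENABLED=0"
    else
        echo "DSAFILTER_MOD_TIMER_ENABLED=0"
    fi
}

# Possible use on a host (not run here):
#   opts=$(build_dsa_options "$(esxcfg-module -g dvfilter-dsa)")
#   esxcfg-module -s "$opts" dvfilter-dsa
```

Building the string first and printing it means you can eyeball the result before passing it to esxcfg-module -s, which replaces the whole option string rather than appending to it.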

vSphere 5, vShield 5, Trend DS 8 (vBlock 300HX) Upgrade


Call this the perfect storm upgrade. If you have to perform a vSphere 5, vShield 5 and Trend DS 8 upgrade (whether or not you happen to have a vBlock 300HX), read the following for what TO do and what NOT to do!

The main caveats to remember when performing this upgrade are:

  • vShield Endpoint v3.x and vShield Endpoint v5.x are NOT compatible.
  • You cannot upgrade to the latest VMware Tools if you have the old endpoint thin agent installed on your Windows VMs. It has to be removed first.

Your final approach will depend on whether you are upgrading your hosts with VUM or rebuilding them via ISO. I took the ISO route as I thought it would be cleaner.

Before we get started, there is some documentation you should read:

  1. vSphere 5 Upgrade Guide including vCenter, ESXi
  2. vShield 5 Quick Start guide
  3. Trend Manager 8 Getting Started Guide

Step-by-Step Deployment Guide:

I’ll tell you what you should do to avoid the pain and suffering I went through. If you prefer testing the upgrade on a single host to ensure the process works, update accordingly. It will still work.

  1. Upgrade Trend Manager to v8
  2. Power off all your VMs except the Trend appliances.
  3. De-activate your Trend Appliances from Trend Manager
    • You should see the Trend service account in Virtual Center updating the configuration (.vmx) files of all your VMs.
    • Confirm all VFILE line entries have been removed from the VMs’ .vmx files before continuing
  4. Power off and delete your Trend appliances from Virtual Center
  5. Put all hosts into Maintenance mode.
  6. Remove Virtual Center from Trend Manager.
  7. Login and un-register vShield Manager 4.1 from Virtual Center
    • Power off vShield Manager 4.1
  8. Disconnect and remove all hosts from cluster
  9. Upgrade Virtual Center to v5
    • If any of your hosts are disconnected during the upgrade, just reconnect them.
  10. Upgrade VMware Update Manager to v5
  11. Deploy vShield Manager v5
  12. Register vShield Manager v5 with Virtual Center
  13. Rebuild hosts manually with vanilla ISO
    • Setup management IP address on each host
  14. Add hosts back into the cluster
  15. Patch hosts with VUM and apply any host profiles
  16. Add hosts back to the 1000V if present
    • Setup all vDS virtual adapters
  17. Add virtual center back into the Trend Manager
  18. Deploy vShield Endpoint v5 driver to all hosts
    • Ensure vShield Manager is reporting Endpoint is installed before continuing
  19. Deploy Trend 8 dvfilter-dsa to all hosts via Trend Manager
    • Ensure Trend Manager is reporting hosts are prepared before continuing
  20. Deploy and activate all Trend 8 virtual appliances
    • Ensure all virtual appliances are reporting as ‘vShield Endpoint: Registered’
  21. Power on your VMs
  22. Remove vShield Endpoint Thin Agent from all your Windows VMs and reboot
  23. Upgrade VMware Tools on all your VMs, ensuring vShield option is selected. Reboot required.
  24. Confirm all VMs are protected by the local virtual appliance. Anti-malware should report ‘real time’.
  25. Update all your DRS groups as all the hosts and appliances will have been removed.

If you want to upgrade, rather than rebuild, do the following between steps 3 and 4:

  1. Uninstall Trend filter (dvfilter-dsa) from all hosts
  2. Uninstall Endpoint v3 filter (epsec_vfile) from all hosts

and upgrade vShield Manager instead of deploying a new version. Refer to Page 29 of the vShield Quick Start Guide.

Things to Watch Out For:

Steps 2 and 3 are crucial.

Step 2 – vShield Endpoint v3 includes a loadable kernel module (LKM) called VFILE, which loads into the kernel on a vSphere 4.1 host at boot up. Whenever a VM is powered on by a host running the VFILE LKM, the virtual machine’s .vmx file is updated with the following two line entries:

VFILE.globaloptions = "svmip=169.254.50.39 svmport=8888"
scsi0:0.filters = "VFILE"

vShield Endpoint v5 does not do this! No VFILE LKM is loaded and no VFILE line entries are added to the .vmx files of the VMs. Therefore, if you do not correctly decommission vShield Endpoint v3, your VMs will not power on when they land on your vSphere 5 hosts.

This is implied in the vShield 5 Quick Start guide on Page 31 under ‘Upgrading vShield Endpoint’:

2. Deactivate all Trend DSVAs. This is required to remove vShield related VFILE filter entries from the virtual machines.

What they don’t tell you there, though, is that all your VMs must be powered off. When I de-activated my Trend appliances while the VMs were on, the .vmx files were simply updated again immediately afterwards!

If you missed that step the first time around, you’ll have to manually update the .vmx file of every virtual machine to remove the VFILE line entries as per KB1030463.
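If you are facing a lot of .vmx files, checking for and stripping the two VFILE entries can be scripted. The following is only a sketch under my own assumptions: the function names are made up, it matches the two line formats shown earlier in this post, and it expects the VM to be powered off. It is not the procedure from KB1030463, so follow the KB for the supported steps.

```sh
#!/bin/sh
# Illustrative sketch: has_vfile_entries and strip_vfile_entries are
# hypothetical helpers, not part of any VMware or Trend tooling.
# Assumption: the entries look like the two lines shown above, i.e.
#   VFILE.globaloptions = "..."
#   scsi0:0.filters = "VFILE"

# has_vfile_entries FILE
# Succeeds (exit 0) if the .vmx file still contains VFILE entries.
has_vfile_entries() {
    grep -q -e '^VFILE\.' -e '\.filters *= *"VFILE"' "$1"
}

# strip_vfile_entries FILE
# Saves a copy as FILE.bak, then rewrites FILE without the VFILE entries.
strip_vfile_entries() {
    vmx=$1
    cp "$vmx" "$vmx.bak" || return 1
    # grep exits non-zero when no lines survive the filter, so ignore its status.
    grep -v -e '^VFILE\.' -e '\.filters *= *"VFILE"' "$vmx.bak" > "$vmx" || true
}
```

On a host you would point these at each VM’s configuration file under /vmfs/volumes, for example running has_vfile_entries first to list which VMs still need attention, and only then stripping. The .bak copy is there so you can put the original back if something looks wrong.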

Steps 6 and 17 – If you don’t remove and re-add Virtual Center from Trend Manager after you have installed vShield Manager 5, your DS virtual appliances will not register with vShield Endpoint.

Step 11 – The first time I deployed vShield Manager 5 I didn’t have any issues, although I did have to re-deploy it a second time as it stopped synchronising with vCenter. Unfortunately it then no longer recognised that vShield Endpoint was installed and I had to rebuild all my hosts.

Besides these issues, things went relatively smoothly. It’s just a matter of time.

Good Luck!

Trend Micro Deep Security v8 is out Friday


With the release of Trend Micro Deep Security v8 out this Friday the 27th January and Trend Manager v8 already available for download, I thought I would document my current list of issues I hope will be fixed in the new release.

Areas that could be improved:

  • When you Prepare an ESX Host, the window does not mention which host you are preparing. This gets extremely confusing when you are preparing a large cluster: immediately after preparing the ESX host, the next action is to deploy the virtual appliance, and with no indicator of which host you are working on you don’t know which virtual appliance you are deploying or what to name it.
  • Leading on from the point above, it would be great if this process could be automated. Why do you have to manually deploy the filter to a single host at a time? You should be able to select a cluster and select Deploy filter. Their solution doesn’t scale well! I pity the fool who has a 50-node cluster to roll Trend out to.
  • It would be great to be able to deploy the Trend agent (and now that I mention it, vShield Endpoint agent) to VMs from within Trend Manager. Maybe I missed something here, but I don’t think that is a feature currently.
  • Every time I vMotion a VM the status changes to ‘virtual machine unprotected during move to another ESX.’ I spend my whole time clearing ‘warnings/errors’, as there is often a spurious message displayed, which means you cannot see the current status of the VM. There really needs to be an extra column for Alerts to separate these messages from the Status column, as these messages often have no bearing on the status.
  • In the quick start guide there is no mention of the DRS rules or groups that should be configured to ensure that the virtual appliances remain on the correct hosts as well as the preferred HA settings to ensure the virtual appliances are left ‘powered on’ under the isolation response settings.
  • The current version of DS does not support wildcards so you have to exclude the whole folder (D:\WINDOWS\NTDS) — you cannot for instance, exclude NTDS*.* from the D:\WINDOWS\NTDS folder. This is AV 101. Not sure why it wasn’t included!

I welcome any additions to this list!

Trend Micro Deep Security Installation


I’ve spent the last couple of weeks installing and reinstalling Trend Micro Deep Security 7.5 SP3 after we rebuilt our ESX clusters. I found the Trend Micro Deep Security 7.5 SP3 Quick Start guide a rather poor guide to installing Trend. It was not very helpful; thankfully I had some pre-sales assistance.

My first impressions are that it is pretty cool, but a bit of a pain in the arse to get configured as it requires so much work to be done on each host. If you have a large cluster it can get tiresome repeating the installation on multiple hosts. I think the weakness in vSphere 4.1 is vShield Endpoint. It seems rather flaky. I have had to rebuild a number of hosts because vShield Endpoint wouldn’t install correctly.

Also I am continuously getting EPSec VM, EPSec SVM or EPSec host errors and I don’t know why. There are no errors in the Trend Manager, and they seem to come and go like the wind with little indication of what caused the alert.

Anyway, here are a couple of points worth noting:

  • The only vShield component you need installed on your ESX hosts for Trend DS is vShield Endpoint. You don’t need to push out any other components via vShield Manager – i.e. vShield Zones or vShield Edge Port Group Isolation.
  • When you activate the DSVA appliance, it is registering itself with vShield as a Security VM. If you happen to roll out vShield Endpoint after you have installed Trend (deployed the Trend filter to every host and deployed and activated all appliances), you must re-activate all your appliances.
  • You must install the vShield Endpoint agent on all your VMs to gain anti-malware protection and you cannot push this out via vShield Manager or the Trend Manager. It’s a manual install – a real pain.
  • You only need to install the Trend agent on your VMs if you want Log Inspection or Integrity Monitoring. Anti-malware protection is available without the Trend VM agent.
  • With Trend 7.5 SP3 you won’t be able to provide Anti-malware protection on physical servers.
  • When Trend 8 is released at the end of January 2012, you will be able to deploy anti-malware protection to your Windows servers, and Trend 8 SP1 will provide anti-malware protection for physical Linux servers.
  • You need to create DRS Groups and Rules to ensure that each virtual appliance stays on its own host.
  • You will also need to modify the HA Virtual Machine settings so the virtual appliances are set to a restart priority of high and an isolation response of leave powered on.

I’m looking forward to repeating the install again when we upgrade to vSphere 5 in February.

Enjoy.