Trend DS 8 Feature #873 – 300 VMs are not protected?


Argh, well I was performing some security hardening last week. One of my tasks was to tidy up the Administrators group in vCenter.

Yes, dangerous, I know, and it looks like a few service accounts were left without vCenter admin rights for a while.

My fault completely, I had no one to blame this time, but I wasn’t expecting the fallout. All my other applications were fine (VUM, SRM, vShield, VC Ops, etc.), but not Trend.

I got an email alert that 300 VMs were not protected. That took me by surprise. Must be some sort of mistake, so I logged in to the Trend Manager, and every single virtual machine and every single appliance was unmanaged.

WTF!

Looks like Trend DSM shat itself without admin privileges. According to Trend, apparently it’s expected behaviour. Sounds like pretty shite expected behaviour to me!

Thankfully re-activating all virtual appliances and VMs only took a few minutes, but then I noticed that none of the virtual appliances were updating. A quick check of my relay groups showed they had no members.

I deactivated and reactivated my relays again. No change. Relay groups still empty. I deactivated, uninstalled the agent, rebooted, reinstalled the agent, activated again. No change. My internal, DMZ and even the default relay groups all remained empty with no members. WTF?

Then I installed the relay agent on brand new servers to see if they would come up in the Default Relay Group when they were activated. Nope, nothing. Weird.

At this stage I started to panic and raised a call with Trend. While investigating the issue further, we found that on the System-Updates view the relays were being shown, but when viewing the Relay Groups in the System Settings-Updates tab, they were not showing any members.

So it looks like there was some issue with the relay groups I had created. To fix the issue I had to deactivate all relays, set all VMs and virtual appliances to the Default Relay Group so I could delete my custom relay groups, and then deactivate and reactivate the agents.

Finally the relays appeared as members in the Default Relay Group and I could re-create the Internal and DMZ Relay groups and assign the members to the correct groups to recreate my update hierarchy. Lastly my virtual appliances were assigned to the internal relay group and they were able to pick up the latest definitions.

So, you’ve been warned. Trend DSM needs admin rights all the time!

Deep Security 8 SP1 Upgrade


As you guys and girls may be aware, Trend DS 8 SP1 has been out since the 30th April.

DS 8 SP1 promises support for wildcard exclusions and also adds Linux support via an agent for on-demand scanning (no real-time scanning yet).

There is also the added benefit of fixing the HEAP_MAX_SIZE PSOD issue, but I am still waiting for confirmation on this.

We’ve been having a few ongoing issues with our Trend environment, mainly due to a lack of care and attention since I installed 7.5 SP1 and upgraded to DS 8. Also, Trend is not the easiest beast to get up and running correctly. A lot of this is down to the documentation. The install guide (Getting Started?) is too simplistic and the Best Practice documentation is confidential (go figure!), so I would definitely recommend professional services if you are thinking about buying Trend DS. And on the plus side you get someone to blame if anything goes wrong!

I thought the release of 8 SP1 would be a good opportunity to get the Trend boys onsite to blow away the existing DSM + database and install DS 8.0 SP1 from scratch.

Bear in mind this was a live cluster, so we effectively split the cluster in half and kept one half on DS 8 (with all the live VMs) and the other half was upgraded to DS 8 SP1.

We deployed a new VM, installed DSM 8 SP1 on a new database, prepared the ESXi hosts and deployed the new virtual appliances. Once the infrastructure was configured, the existing virtual machines were vMotioned onto the DS 8 SP1 hosts that were managed by the new 8 SP1 DSM.

This was a little tricky as you effectively had two DSMs in operation on a single cluster, which is not recommended for long! The key to managing the VMs was to change the view to sort by host, then you could easily ignore all the unmanaged VMs on the half of the hosts that were not prepared.

Once the VMs were vMotioned across, we waited 5 minutes for their config to update (to ensure they no longer thought they were being protected by a DS 8 appliance) and then activated them on the new DS 8 SP1 virtual appliances on the new DSM.

After all the VMs were activated we could upgrade the remaining ESXi hosts and re-enable DRS to spread the VMs back across the cluster.

All in all it was a painless upgrade with no downtime and on the plus side Trend is looking much better.

If you have been through a few iterations of Trend DS and you’re having issues with high maintenance, VMs being unprotected, appliances going offline, etc., I recommend this approach to clear out your infrastructure and database and start off fresh.

Yes, you have to reconfigure your alerting and security profiles, but it’s a small price to pay for a healthy, stable environment.

DS 8 SP1 — well recommended!

— UPDATE 11/06/2012 —

I have had confirmation from Trend that the HEAP_MAX_SIZE issue has been resolved in DS 8 SP1, but for now I’ve left the HEAP_MAX_SIZE variable set on all my ESXi hosts, as it is still unclear in my mind whether this setting is no longer needed.

 

Deep Security 8 and DSAFILTER_HEAP_MAX_SIZE


I have been having major reliability issues with Deep Security 8 for the last 2 weeks.

There is a lot of confusion out there with the recommended settings for a stable Deep Security 8 environment, which isn’t helped when the vendor doesn’t have any KB articles or updated documentation publicly available. It appears they would rather their customers test their products for them and keep their teething issues with ESXi 5.0 under wraps. See my last post here.

This brings me to my issues with setting the DSAFILTER_HEAP_MAX_SIZE on each ESXi host. There is a recommended workaround mentioned in KB1055625, “How to enable the Deep Security Virtual Appliance (DSVA) to support more than 25 virtual machines on ESX”. What most people don’t realise is that this is a recommended workaround for Trend DS 7.0/7.5, not Trend DS 8.

The article states that the DSAFILTER_HEAP_MAX_SIZE should be roughly 1MB for each VM planned. This results in most customers setting a value for the HEAP_MAX_SIZE of around 40 to 50MB. This is one tenth of what it should be for DS 8 according to Trend Engineering (512MB)!
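
To put rough numbers on that gap, here is a minimal Python sketch. It assumes the KB rule of thumb really is a flat 1MB per planned VM and takes the 512MB figure as quoted by Trend Engineering. The function and constant names are made up for illustration and are not part of any Trend tooling.

```python
# Assumption: the DS 7.0/7.5 KB rule of thumb is a flat 1 MB of heap per planned VM.
KB_MB_PER_VM = 1
# Figure quoted by Trend Engineering for DS 8.
DS8_RECOMMENDED_MB = 512


def kb_heap_mb(planned_vms: int) -> int:
    """Heap size in MB that the old KB rule of thumb would give."""
    return planned_vms * KB_MB_PER_VM


def fraction_of_ds8_figure(heap_mb: int) -> float:
    """How much of the DS 8 figure a given heap size actually covers."""
    return heap_mb / DS8_RECOMMENDED_MB


if __name__ == "__main__":
    heap = kb_heap_mb(50)
    print(f"KB rule for 50 VMs: {heap} MB")
    print(f"Fraction of the DS 8 figure: {fraction_of_ds8_figure(heap):.2f}")
```

So a host sized by the old rule for 50 VMs ends up with roughly a tenth of the heap that DS 8 apparently wants.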

Here is a quick summary:

I set a DSAFILTER_HEAP_MAX_SIZE of 50MB on all my ESXi hosts. In theory, according to the knowledge base article, this should be sufficient for 40 VMs.

For Deep Security 8, Trend actually recommend not changing this value, as the dsafilter is supposed to dynamically adjust the HEAP_MAX_SIZE according to the number of VMs on the host. If you have set the value manually as per the KB article OR you encounter symptoms such as VMs losing connectivity intermittently and then the hosts PSODing after 30 minutes to 2 hours, then it’s most likely an inadequate HEAP_MAX_SIZE and the filter driver is running out of memory.

Here are the calculations from Trend Engineering. They have used a calculation of 70 VMs but state “Even if you are not planning on running as many VM’s it is still suggested to set the value to 512mb.”

—————————————————————————————————————————————–

We use 432 bytes per TCP connection, so for the mathematical calculation let’s round up to 512 bytes per TCP allocation.

If using the default values for the Maximum TCP Connections: <SystemSetting name="maxConnectionsTcp" value="10000" />

With this, 512 bytes * 10000 connections = 5MB per VM, so we need 70 * 5MB = 350MB minimum of visor memory for the connection tables only.

In addition to this we will need ~ 70 MB to run the FD so it comes down to 420 MB.

We recommend bumping to 512MB and verifying the results, e.g. esxcfg-module -s DSAFILTER_HEAP_MAX_SIZE=536870912 dvfilter-dsa

To verify the setting, execute: % esxcfg-module -g dvfilter-dsa

The setting will not take effect until the driver is reloaded. Reloading will either require a reboot (best option) of ESX.

———————————————————————————————————————————————-
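
If you want to rerun Trend’s arithmetic for your own VM count, here is a rough Python sketch of the calculation quoted above. The constants come straight from the quote (512 bytes per connection, 10000 connections, roughly 70MB for the filter driver). The function names are my own and purely illustrative, and the per-VM figure is rounded to whole megabytes the same way the quote rounds 5,120,000 bytes to 5MB.

```python
BYTES_PER_TCP_CONNECTION = 512   # 432 bytes rounded up, as in the quoted calculation
MAX_TCP_CONNECTIONS = 10_000     # default maxConnectionsTcp value
FILTER_DRIVER_OVERHEAD_MB = 70   # approximate memory needed to run the filter driver


def per_vm_connection_table_mb() -> int:
    """Connection table size per VM, rounded to whole MB like the quote does."""
    # 512 * 10000 = 5,120,000 bytes, which the quoted calculation treats as 5 MB.
    return round(BYTES_PER_TCP_CONNECTION * MAX_TCP_CONNECTIONS / 1_000_000)


def minimum_heap_mb(vm_count: int) -> int:
    """Minimum heap in MB: connection tables for every VM plus driver overhead."""
    return vm_count * per_vm_connection_table_mb() + FILTER_DRIVER_OVERHEAD_MB


def mb_to_bytes(megabytes: int) -> int:
    """Convert MB to bytes using 1 MB = 1024 * 1024 bytes."""
    return megabytes * 1024 * 1024


def heap_command(megabytes: int) -> str:
    """Build the esxcfg-module command line in the form shown in the quote."""
    return (
        "esxcfg-module -s "
        f"DSAFILTER_HEAP_MAX_SIZE={mb_to_bytes(megabytes)} dvfilter-dsa"
    )


if __name__ == "__main__":
    print(f"Minimum for 70 VMs: {minimum_heap_mb(70)} MB")
    print(heap_command(512))
```

For 70 VMs this lands on the same 420MB as the quote, and 512MB converts to the 536870912 bytes used in the esxcfg-module command. Treat it as a sanity check on the numbers only, not as sizing guidance from Trend.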

If you are running DS 8 and ESXi 5.0 and you have already implemented the MOD_TIMER=0 fix, the next step will be to manually set the DSAFILTER_HEAP_MAX_SIZE to 512MB.

This solved the issue for me and we haven’t had a single crash since.

About frikking time!

———————————————————————————————————————————————-

Update 30/05/2012

Michael Gioia from Trend contacted me to give me a more detailed analysis of why this issue occurs.

First of all, there is a Trend KB article out for this issue: http://esupport.trendmicro.com/solution/en-us/1060125.aspx

It sounds like there are a few unhappy customers out there. In Trend’s defence, the point put across by Michael is that it is very difficult to ensure 100% compatibility between two products that are integrated in the kernel, such as the ESXi hypervisor and the Trend filter.

This anomaly is only supposed to occur in very rare circumstances when the filter driver is under severe memory exhaustion (circa 4 bytes).

In theory this issue shouldn’t occur with many customers then, but to be honest my setup was not anything special or extreme, so I’d be surprised if more customers weren’t affected by this.

On the plus side, the developers should have released a fix in 7.5 SP4 and 8.0 SP1 which are both available now.

I still have to verify this – until I am confident the issue is resolved my HEAP_MAX_SIZE settings will remain!