Friday, March 28, 2008

Wicked Problem with EMC NS, iSCSI and VMware

Wanted to let some engineers know about a “bug” that exists in the EMC DART code for the Celerra (NS) series of NAS.

There is a bug where, if an iSCSI read request larger than 1 MB is made to a Celerra, the data mover may crash on the request. The iSCSI service cannot process the request, which results in a kernel panic on the data mover. The data mover then fails over, but because of the nature of the crash, the greater-than-1-MB read request still has to be processed. That can crash the second data mover as well, or in some cases cause complete loss of data, since the ESX server still has writes waiting to be processed.

There is also a HUGE possibility with ESX 3.5 that an HA event will be triggered and the VMs on that 3.5 node will be powered down. If you have a mixture of ESX 3.0.x and 3.5 nodes in an HA cluster, this creates a huge amount of confusion, because the HA event is not seen by the 3.0 hosts. In the case of one customer, this in fact led to 17 VMs being corrupted, with over 6 of them needing DR performed on them to recover.

According to VMware, 3.5 was made much more sensitive to storage problems (???) and will force an HA event. Now, the customer in this case was using QLogic iSCSI HBAs, so ESX is not aware of the underlying network calls being processed. In VMware's defense, the code treats the adapter the same as a Fibre Channel HBA, so the timing behaves as if it were an actual SAN, and ESX therefore doesn't wait out the 45-second failover period of the data movers. According to an escalation engineer within VMware, this timing cannot be changed.

The cure is to patch the DART code on the NS so that it doesn't panic on read requests greater than 1 MB.

I’d like to point out how easily a read greater than 1 MB can be generated. VMFS formats a LUN in 1 MB blocks or larger. Windows NTFS formats its file system with 4k blocks. We all know that when a file larger than 4k is written to the file system, it must take up at least 2 blocks; so a 6k file actually takes up 8k of space on the file system. When the OS writes and the next contiguous block is occupied, Windows (Linux, whatever) writes to the next available free block. So when FileA is 6k, it gets written to block 23, and the next available block is block 238,654. When the OS needs to read that file, it has to read both blocks. This is fragmentation. It takes a while to spin the disk, and hence we get slower performance.

Well, since those 2 blocks live inside the VMDK file on VMFS, ESX has to read two 1 MB blocks to service the request for its guest VM. Boom, kernel panic!
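If you want to see the arithmetic, here is a quick back-of-the-envelope sketch in Python (the block sizes are the NTFS and VMFS defaults mentioned above; everything else is just illustration):

import math

NTFS_BLOCK = 4 * 1024          # NTFS default cluster size (4 KB)
VMFS_BLOCK = 1 * 1024 * 1024   # smallest VMFS block size (1 MB)

def blocks_needed(file_bytes, block_size):
    # A file always occupies whole blocks, rounded up
    return math.ceil(file_bytes / block_size)

# A 6 KB file needs two 4 KB blocks, i.e. 8 KB on disk
print(blocks_needed(6 * 1024, NTFS_BLOCK))   # -> 2

# If those two guest blocks land in two different VMFS blocks,
# servicing one guest I/O touches 2 x 1 MB = 2 MB of VMFS data --
# comfortably over the 1 MB threshold that panics the data mover
print(2 * VMFS_BLOCK)                        # -> 2097152 bytes (2 MB)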

This problem was not found for this particular customer because the VMs did not have fragmentation at the beginning of the implementation. Now that the systems have run for a period of time, though, fragmentation has built up within both their file servers and database servers, and it has led to 2 major outages in exactly 10 days. They have thus decided, even though EMC claims the problem will not continue, to break the CLARiiON CX backend out and trash the Celerra NAS head. I really can't say I blame them. EMC did not replace the first data mover that failed, nor were they able to determine the problem in the 10 days between the failures. So when the failure hit yesterday, there was no standby data mover to fail over to. They were down for 10 hours. This customer is a bank and is very dependent on the services their infrastructure provides.

To make things worse, they did have a hot site configured for host-based replication, where their existing VMs were to be mirrored. However, about 2 months ago the NS in that facility suffered from this very same problem and has not been completely repaired. They are going to break the CX out there as well.

So at any rate, I thought you guys should know. I have to admit I am personally a bit hesitant about pushing ESX with iSCSI out on these devices. It's not solid, and EMC is HIGHLY unresponsive about resolving the issues when they do occur. When the sales rep is the one calling you to say "hey, I noticed your NAS was down" 2 hours after it occurred (which, I do have to say, was awesome of him to do) and tech support still has not called the customer, there is a huge lack of communication, or of ability to fulfill the service requests that EMC receives. I know this part is a rant, and everyone has some problems, but it's not like we're buying a $10k Kia Rio... OK, I'll shut up now.

Monday, March 10, 2008

Exchange 2007 Restore of Hub Transport

Here is another undocumented bug in doing a restore of an Exchange 2007 server:

I was doing a /M:RecoverServer on an Exchange 2007 server that had failed. The server had Service Pack 1 for Exchange installed.

When doing the recovery (after I re-installed the OS, of course), setup complained about some registry entries not being there while it was recovering the Hub Transport role. Truly interesting, as I was in the middle of recovering that very service!!

It complained that these keys did not exist:

HKLM\Software\Microsoft\Exchange\Transport
HKLM\Software\Microsoft\Exchange\Pickup

Simple enough to move forward, though...

Just create the keys, then redo the /M:RecoverServer, and all will be well!
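You can add the keys by hand in regedit, but if you'd rather script it, here's a minimal sketch using Python's standard winreg module (run it elevated on the Exchange box; the key paths are the two from the error above, everything else is just illustration):

import winreg  # standard library, Windows only

# The two keys that setup complained about
MISSING_KEYS = (
    r"Software\Microsoft\Exchange\Transport",
    r"Software\Microsoft\Exchange\Pickup",
)

for subkey in MISSING_KEYS:
    # CreateKey opens the key if it already exists and creates it if not
    key = winreg.CreateKey(winreg.HKEY_LOCAL_MACHINE, subkey)
    winreg.CloseKey(key)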

Exchange 2007 and Exclaimer

So I ran across this horrible problem this past week while doing an Exchange 2003 to Exchange 2007 migration. My customer used Exclaimer heavily for auto-replies and for creating automatic signatures for the users within the organization.

After installing this app and then trying to configure the Captaris RightFax Connector for Exchange 2007, I ran into a case where, when sending to an address formatted as:

[Fax:user@2125555555]

Exchange generated an undeliverable (NDR) message saying it could not be delivered.

After working with the great folks at Captaris (read as: some jerk on the phone who constantly spoke over you and did not really want to help anyway) and Microsoft, we discovered that Exclaimer re-writes the IMCEA address that's generated when sending to the FAX: or RFAX: address space into the opposite case from what it needs to be. In other words, it takes IMCEA and makes it imcea.

Ready for the kicker? Exchange 2007 treats that address as case sensitive. The only solution was to uninstall Exclaimer and move on for now. The customer accepted the problem.
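To make the case-sensitivity point concrete, here's a toy comparison in Python. The address below is made up, and real IMCEA encapsulation escapes more characters than this; it's purely to show why lowercasing the prefix breaks the match:

# Toy illustration only -- the address is invented for this example
expected  = "IMCEAFAX-user+402125555555@contoso.com"  # the case the connector expects
rewritten = expected.lower()                          # what Exclaimer handed back

# Case-sensitive comparison (Exchange 2007's behavior, per this issue)
print(expected == rewritten)            # -> False: routing to the fax connector fails

# A case-insensitive comparison would still match
print(expected.lower() == rewritten)    # -> True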

Just wanted to save you peeps some time!!