Hey fellas. Ran into this error last night and finally got it resolved. I've seen dozens of posts on this same issue with no answers, so please feel free to repost this on any forum you like. Note that I'm an EMC storage guy; the ESX side of the house is not my realm.
We had 59 guests "freeze" for 5-10 seconds.
VMkernel:
Aug 6 09:56:43 pdvesx12 vmkernel: 79:18:38:56.771 cpu12:1229)SCSI: 638: Queue for device vml.02000000006006016093f02100ec41893fa943de11524149442035 is being blocked to check for hung SP.
Aug 6 09:56:52 pdvesx12 vmkernel: 79:18:39:05.585 cpu15:1482)<4>lpfc0:0754:FPe:SCSI timeout Data: xc6aa280 x98 x29157be0 xec
win2k3sp1 guests:
The device, \Device\Scsi\symmpi1, is not ready for access yet.
Linux Guests:
Aug 5 15:44:30 pdlnetnag01 kernel: sd 0:0:0:0: SCSI error: return code = 0x00000008
Aug 5 15:44:30 pdlnetnag01 nagios: Error: Unable to create temp file for writing status data!
Aug 5 15:44:30 pdlnetnag01 kernel: ReiserFS: dm-1: warning: clm-6006: writing inode 110849 on readonly FS
This was a 3-node ESX cluster. After massive log review, switch dumps, grabs, and a WebEx, the answer is...
One fiber cable was not fully seated in the switch.
I'm posting this so no one else has to make a 10-hour run at something so simple, yet it trashed an entire 3-node cluster. The switch port errors were still being generated after maintenance mode, so the port error counters are a good thing to check.
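If you want a quick way to see whether your own hosts are logging the same two vmkernel messages quoted above, here is a rough Python sketch. It only counts lines containing the two message fragments from my excerpts ("is being blocked to check for hung SP" and "SCSI timeout"), grouped by host name. The function name `count_signatures` and the labels `hung_sp_check` and `scsi_timeout` are made up for this example, and the syslog prefix pattern is an assumption based on the lines pasted here; other ESX versions or HBA drivers may word things differently.

```python
import re
from collections import Counter

# Message fragments copied from the two vmkernel lines quoted in this post.
# The dictionary keys are just labels invented for this sketch.
SIGNATURES = {
    "hung_sp_check": "is being blocked to check for hung SP",
    "scsi_timeout": "SCSI timeout",
}

# Assumed syslog prefix, based on the excerpts:
#   "Aug 6 09:56:43 pdvesx12 vmkernel: ..."
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+(\S+)\s+vmkernel:"
)


def count_signatures(lines):
    """Count matching vmkernel lines, keyed by (host, signature label)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a vmkernel line in the expected format
        host = match.group(4)
        for label, fragment in SIGNATURES.items():
            if fragment in line:
                counts[(host, label)] += 1
    return counts
```

To use it, open whatever vmkernel log file your host keeps and pass the file object in, e.g. `count_signatures(open(path, errors="replace"))`, then print the counter. A pile of hits on one host is only a hint to go look at that host's fabric path and the switch port error counters; it does not prove a cabling problem by itself.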