Hello Everyone,
I need assistance with a vSAN alert.
On one of our clusters we are getting the error "Virtual SAN device is under permanent failure", with the following health checks failing:
- Failed : Physical disk
- Failed : Component metadata health
- Failed : Overall disks health
I have gone through a couple of KB articles and community threads:
VSAN health check - component metadata health
Component metadata health check fails with invalid state error (2145347) | VMware KB
ESXi host:
VMware ESXi 6.0.0 build-3620759
VMware ESXi 6.0.0 Update 2
vSAN Version:
Name : VMware-vsan-health Relocations: (not relocatable)
Version : 6.2.0 Vendor: VMware, Inc.
Release : 3547697 Build Date: Sat Feb 13 03:04:16 2016
Install Date: Thu Oct 13 18:12:01 2016 Build Host: sc-bld-lin1268.eng.vmware.com
Group : Applications/Management Source RPM: VMware-vsan-health-6.2.0-3547697.src.rpm
Size : 52872114 License: commercial
Signature : (none)
Summary : VMware Virtual SAN Health Service
Description :
VMware Virtual SAN Health Service
Distribution: (none)



vmkernel.log
2017-04-24T10:17:07.853Z cpu16:42460)PLOG: PLOG_QuiesceDevice:8531: : Got quiesce reason 1 on disk naa.600605b00991a3f0202de2c45f900beb:2 5296f94a-d540-efa9-e0e4-d7a2788d97ce
2017-04-24T10:17:07.853Z cpu7:33656)PLOG: PLOG_CleanupElevator:1473: Waiting for Elevator from UUID 5296f94a-d540-efa9-e0e4-d7a2788d97ce
2017-04-24T10:17:07.863Z cpu32:2341680)WARNING: LSOM: LSOMEventNotify:6450: Virtual SAN device 5296f94a-d540-efa9-e0e4-d7a2788d97ce has gone offline.
2017-04-24T10:17:09.857Z cpu4:33662)PLOG: PLOGGarbageCollectDevice:1542: Throttled: Device naa.600605b00991a3f0202de2c45f900beb:1 5296f94a-d540-efa9-e0e4-d7a2788d97ce is prepared to delete
2017-04-24T10:17:09.857Z cpu4:33662)PLOG: PLOG_FreeDevice:325: PLOG in-mem device 0x430cdf26f030 naa.600605b00991a3f0202de2c45f900beb:1 0x419 5296f94a-d540-efa9-e0e4-d7a2788d97ce is being freed SSD 52cec8b9-4703-a9ad-aa5b-eaccb9b6f0e8
2017-04-24T10:17:09.867Z cpu9:33662)PLOG: PLOG_FreeDevice:325: PLOG in-mem device 0x430cdf270070 naa.600605b00991a3f0202de2c45f900beb:2 0x41d 5296f94a-d540-efa9-e0e4-d7a2788d97ce is being freed SSD 52cec8b9-4703-a9ad-aa5b-eaccb9b6f0e8
2017-04-24T10:17:11.369Z cpu36:41665)PLOG: PLOGNotifyDisks:4010: MD 3 with UUID 5296f94a-d540-efa9-e0e4-d7a2788d97ce with state 0 formatVersion 4 backing SSD 52cec8b9-4703-a9ad-aa5b-eaccb9b6f0e8 notified
2017-04-24T10:17:11.418Z cpu0:7034782)PLOG: PLOGGetRecoveredState:6637: Last LSN recoverd 5296f94a-d540-efa9-e0e4-d7a2788d97ce 46544828
2017-04-24T10:17:12.421Z cpu0:7034782)PLOG: PLOG_OpenDevHandles:1228: Registered APD callback for naa.600605b00991a3f0202de2c45f900beb:2 5296f94a-d540-efa9-e0e4-d7a2788d97ce
2017-04-24T10:17:12.424Z cpu0:7034782)PLOG: PLOG_OpenDevHandles:1228: Registered APD callback for naa.600605b00991a3f0202de2c45f900beb:2 5296f94a-d540-efa9-e0e4-d7a2788d97ce
2017-04-24T10:17:12.425Z cpu0:7034782)PLOG: PLOGInitAndAnnounceMD:6987: Successfully announced VSAN MD (naa.600605b00991a3f0202de2c45f900beb:2) with UUID 5296f94a-d540-efa9-e0e4-d7a2788d97ce
2017-04-24T10:17:12.530Z cpu26:43820)WARNING: LSOM: LSOMEventNotify:6440: Virtual SAN device 5296f94a-d540-efa9-e0e4-d7a2788d97ce is under permanent error.
2017-04-24T10:17:07.853Z cpu8:7034742)PLOG: PLOGValidateDiskGroupOpFn:1415: Issuing PLOG Op DISKGROUP UNMOUNT for MD :naa.600605b00991a3f0202de2c45f900beb
2017-04-24T10:17:07.853Z cpu16:42460)PLOG: PLOG_QuiesceDevice:8531: : Got quiesce reason 1 on disk naa.600605b00991a3f0202de2c45f900beb:2 5296f94a-d540-efa9-e0e4-d7a2788d97ce
2017-04-24T10:17:07.853Z cpu32:41665)LSOM: LSOMEventNotify:6413: Throttled: Waiting for component cleanup
2017-04-24T10:17:07.853Z cpu7:33656)PLOG: PLOG_CleanupElevator:1473: Waiting for Elevator from UUID 5296f94a-d540-efa9-e0e4-d7a2788d97ce
2017-04-24T10:17:07.863Z cpu32:2341680)WARNING: LSOM: LSOMEventNotify:6450: Virtual SAN device 5296f94a-d540-efa9-e0e4-d7a2788d97ce has gone offline.
2017-04-24T10:17:07.863Z cpu32:2341680)LSOM: LSOMEventNotify:6519: Throttled: Waiting for open component countto drop to zero
2017-04-24T10:17:07.872Z cpu29:36378)PLOG: PLOGIsPlogUnloading:100: Elevator exit for device is set
2017-04-24T10:17:07.872Z cpu29:36378)PLOG: PLOGElevBaseHandler:617: Elevator exiting due to unload operation
2017-04-24T10:17:07.974Z cpu8:33711)Global: Virsto_DetachInstance:301: INFO: Detaching Virsto Instance 0x430b680a9060 from PLOG device
2017-04-24T10:17:08.855Z cpu21:33659)PLOG: PLOG_CleanupDefence:6346: Waiting for defence task for naa.600605b00991a3f0202de2c45f900beb:1
2017-04-24T10:17:09.856Z cpu21:33659)Destroyed VSAN Slab PLOGIORetry_slab_0000000000 (maxCount=0 failCount=0)
2017-04-24T10:17:09.857Z cpu21:33659)Destroyed VSAN Slab PLOGIORetry_slab_0000000001 (maxCount=1 failCount=0)
2017-04-24T10:17:09.857Z cpu21:33659)ScsiEvents: 353: EventSubsystem: Device Events, Event Mask: 20, Parameter: 0x430cdde547e0, UnRegistered!
2017-04-24T10:17:09.857Z cpu3:7034742)PLOG: PLOGValidateDiskGroupOpFn:1415: Issuing PLOG Op DISKGROUP UNMOUNT for MD :naa.600605b00991a3f0202de2c45f900beb
2017-04-24T10:17:09.857Z cpu4:33662)PLOG: PLOGGarbageCollectDevice:1542: Throttled: Device naa.600605b00991a3f0202de2c45f900beb:1 5296f94a-d540-efa9-e0e4-d7a2788d97ce is prepared to delete
2017-04-24T10:17:09.857Z cpu4:33662)PLOG: PLOG_FreeDevice:325: PLOG in-mem device 0x430cdf26f030 naa.600605b00991a3f0202de2c45f900beb:1 0x419 5296f94a-d540-efa9-e0e4-d7a2788d97ce is being freed SSD 52cec8b9-4703-a9ad-aa5b-eaccb9b6f0e8
2017-04-24T10:17:09.857Z cpu4:33662)PLOG: PLOG_FreeDevice:496: Throttled: Waiting for ops to complete on device: 0x430cdf26f030 naa.600605b00991a3f0202de2c45f900beb:1
2017-04-24T10:17:09.867Z cpu9:33662)PLOG: PLOG_FreeDevice:325: PLOG in-mem device 0x430cdf270070 naa.600605b00991a3f0202de2c45f900beb:2 0x41d 5296f94a-d540-efa9-e0e4-d7a2788d97ce is being freed SSD 52cec8b9-4703-a9ad-aa5b-eaccb9b6f0e8
2017-04-24T10:17:09.867Z cpu9:33662)PLOG: PLOG_FreeDevice:454: Unregistering diskAttrHandle:0x430cdf2708b0 on disk naa.600605b00991a3f0202de2c45f900beb
2017-04-24T10:17:09.867Z cpu9:33662)LSOMCommon: LSOM_UnregisterDiskAttrHandle:136: DiskAttrHandle:0x430cdf2708b0 is removed from moduleID 86 for disk:naa.600605b00991a3f0202de2c45f900beb
2017-04-24T10:17:09.868Z cpu9:33662)Destroyed VSAN Slab PLOGIORetry_slab_0000000000 (maxCount=26 failCount=0)
2017-04-24T10:17:09.868Z cpu9:33662)Destroyed VSAN Slab PLOGIORetry_slab_0000000001 (maxCount=9 failCount=0)
2017-04-24T10:17:09.868Z cpu9:33662)ScsiEvents: 353: EventSubsystem: Device Events, Event Mask: 20, Parameter: 0x430cdf2720d0, UnRegistered!
2017-04-24T10:17:09.906Z cpu28:33528)WARNING: DVFilter: 1181: Couldn't enable keepalive: Not supported
2017-04-24T10:17:09.982Z cpu46:7034760)VSAN Device Monitor: Successfully unmounted failed VSAN disk naa.600605b00991a3f0202de2c45f900beb
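For anyone triaging a similar log, here is a rough Python sketch that pulls only the LSOM device-state warnings out of vmkernel.log lines, so the "has gone offline" and "is under permanent error" events for a disk UUID are easy to spot. The regular expression is an assumption based only on the two message shapes in the excerpt above, and the function name lsom_events is made up for illustration; other builds may log these events differently.

```python
import re

# Matches WARNING lines from LSOM that report a vSAN device state change.
# Assumption: the pattern covers only the message shapes seen in this excerpt.
LSOM_EVENT = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)WARNING: LSOM: LSOMEventNotify:\d+: "
    r"Virtual SAN device (?P<uuid>[0-9a-f-]{36}) (?P<state>.+?)\.?$"
)


def lsom_events(lines):
    """Return (timestamp, device UUID, state text) for each LSOM device warning."""
    events = []
    for line in lines:
        m = LSOM_EVENT.match(line.strip())
        if m:
            events.append((m.group("ts"), m.group("uuid"), m.group("state")))
    return events


# Three lines copied from the log above; the DVFilter warning should be skipped.
sample = [
    "2017-04-24T10:17:07.863Z cpu32:2341680)WARNING: LSOM: LSOMEventNotify:6450: "
    "Virtual SAN device 5296f94a-d540-efa9-e0e4-d7a2788d97ce has gone offline.",
    "2017-04-24T10:17:09.906Z cpu28:33528)WARNING: DVFilter: 1181: "
    "Couldn't enable keepalive: Not supported",
    "2017-04-24T10:17:12.530Z cpu26:43820)WARNING: LSOM: LSOMEventNotify:6440: "
    "Virtual SAN device 5296f94a-d540-efa9-e0e4-d7a2788d97ce is under permanent error.",
]

for ts, uuid, state in lsom_events(sample):
    print(ts, uuid, state)
```

To run it against a real file, pass the open file handle instead of the sample list, for example lsom_events(open("vmkernel.log")), using whatever path the log was copied to.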
Regards,
Ali