Sure. Let me know how it goes.
Regards,
Bharath G
Original Message:
Sent: Apr 25, 2025 09:50 AM
From: Jennifer Nguyen
Subject: lost connectivity to storage device affected datastores unknown
Hi Bharath,
No worries. Pure suggested that we upgrade our FA to the latest version. We did that last weekend. Going to be monitoring it for a few weeks. Will keep y'all updated.
Original Message:
Sent: Apr 23, 2025 03:12 PM
From: Bharath Kumar G
Subject: lost connectivity to storage device affected datastores unknown
Hi Jennifer,
Sorry, I wasn't around for a few days.
I don't see the attachment for some reason.
Path-related issues are usually straightforward. ESXi has storage drivers that record the response from the storage subsystem and mark the relevant path by failing the command with response H:0x1. This can be viewed in the vmkernel log.
The hostd log will record "Path redundancy to storage device naa.################################ degraded. Path vmhba3:C0:T1:L7 is down. Affected datastores: Datastore1.".
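
To make the message format above easier to work with when many alarms come in, here is a minimal Python sketch that splits a "Path redundancy ... degraded" message into the device ID and the adapter, channel, target and LUN parts of the path name. The function name `parse_path_down`, the `PathDownEvent` type and the regular expression are my own illustration, not anything shipped with ESXi, and the pattern assumes the wording shown in this thread with an naa. device identifier. The sample string is the alarm text quoted further down the thread.

```python
import re
from typing import NamedTuple, Optional


class PathDownEvent(NamedTuple):
    """Fields pulled out of one 'Path redundancy ... degraded' message."""
    device: str
    adapter: str
    channel: int
    target: int
    lun: int
    datastores: str


# Assumption: the message wording matches what is quoted in this thread,
# and the device is an naa.<hex> identifier.
PATH_DOWN = re.compile(
    r"Path redundancy to storage device\s+(?P<device>naa\.[0-9a-fA-F]+)\s*degraded\.\s+"
    r"Path\s+(?P<adapter>vmhba\d+):C(?P<channel>\d+):T(?P<target>\d+):L(?P<lun>\d+)\s+is down\.\s+"
    r"Affected datastores:\s*(?P<datastores>[^.]*)"
)


def parse_path_down(message: str) -> Optional[PathDownEvent]:
    """Return the parsed event, or None if the line is not a path-down message."""
    m = PATH_DOWN.search(message)
    if m is None:
        return None
    return PathDownEvent(
        device=m["device"],
        adapter=m["adapter"],
        channel=int(m["channel"]),
        target=int(m["target"]),
        lun=int(m["lun"]),
        # Text up to the first period; a datastore name containing a
        # period would be cut short by this simple pattern.
        datastores=m["datastores"].strip(' "'),
    )


if __name__ == "__main__":
    sample = (
        "Path redundancy to storage device naa.624a93709cac436e0b074d33000f4019 "
        "degraded. Path vmhba5:C0:T692:L237 is down. Affected datastores: Unknown.."
    )
    print(parse_path_down(sample))
```

Grouping the parsed events by adapter and target may help show whether the alarms cluster on a particular HBA or target number.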
Are you able to file a Broadcom VMware ticket by any chance?
If not, please see if you can upload the hostd and vmkernel logs of the ESXi host on which the alert appeared.
Regards,
Bharath G
Original Message:
Sent: Apr 09, 2025 05:01 PM
From: Jennifer Nguyen
Subject: lost connectivity to storage device affected datastores unknown
Thank you for the quick response. I ran esxcfg-mpath -bd naa.624a93709cac436e0b074d33000f4019 on host-04 and all paths show as active. I also ran it against 11 other storage devices (1 local) and they are all showing as active.
This is happening on all 13 of our hosts, on different vmhbas, same storage device and different LUNs.
For example: last night, we received 2 alarms for host 4, on vmhba 4 and 5, same storage device, same LUN. But on Sunday, we received alarms for multiple hosts, different vmhba, storage device, different LUN.
I actually have a host log from one of the last alarms. Please see the attachment.
We have been working with Pure on the issue. They did not see any issues on their end.
Original Message:
Sent: Apr 09, 2025 04:15 PM
From: Bharath Kumar G
Subject: lost connectivity to storage device affected datastores unknown
That's strange. Could you SSH to the ESXi host that the alarm is referring to, navigate to /var/log, run ls and provide me a screenshot of what you see?
If you do not find the 3 log files, please do the following:
This would give you all the paths ESXi was able to pick up over the PSA from the storage array:
esxcfg-mpath -bd naa.624a93709cac436e0b074d33000f4019
See if any of the paths listed shows anything but active, e.g., dead.
As your LUNs have multiple paths configured for redundancy, you are unlikely to see any effect on your workloads from this message.
https://knowledge.broadcom.com/external/article/318935/path-redundancy-to-storage-device-degrad.html
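
If the path listing is long, the "anything but active" check can be scripted. Below is a rough Python sketch that scans saved esxcfg-mpath -b output and reports every path whose state is not active. It assumes each path line starts with the runtime name (vmhbaX:C#:T#:L#) and carries a state:<value> token, which is how I recall the brief output looking; please verify against your own output. The function name `non_active_paths` and the sample text in the test data are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed line shape (verify against real output):
#   vmhba4:C0:T0:L7 LUN:7 state:active fc Adapter: ... Target: ...
PATH_LINE = re.compile(
    r"^\s*(?P<path>vmhba\d+:C\d+:T\d+:L\d+)\b.*?\bstate:(?P<state>\w+)"
)


def non_active_paths(mpath_output: str) -> List[Tuple[str, str]]:
    """Return (runtime name, state) for every path not reported as active."""
    flagged = []
    for line in mpath_output.splitlines():
        m = PATH_LINE.search(line)
        if m and m["state"] != "active":
            flagged.append((m["path"], m["state"]))
    return flagged


if __name__ == "__main__":
    import sys

    # Usage idea: save the command output to a file and pipe it in.
    for path, state in non_active_paths(sys.stdin.read()):
        print(f"{path} is {state}")
```

Note that on arrays that present standby paths, a standby state is flagged here too, so read the result with your array's multipathing mode in mind.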
This would tell you whether or not there is a VMFS partition on the LUN:
partedUtil getptbl /vmfs/devices/disks/naa.624a93709cac436e0b074d33000f4019
If you see a partition labelled vmfs, you can use the following command to find out which datastore the alert is about:
esxcli storage vmfs extent list | grep naa.624a93709cac436e0b074d33000f4019
This message is usually caused by an issue at Layer 1.
Capture the timestamp and hostname per vCenter and engage your fabric and storage vendors to see if they observe any anomalies at the time of the message.
Once you have the report, you can file a ticket with the Broadcom VMware-Storage Support team if you'd like, or we may continue to investigate over this thread.
Original Message:
Sent: Apr 09, 2025 03:19 PM
From: Jennifer Nguyen
Subject: lost connectivity to storage device affected datastores unknown
Hi Kumar,
There are no logs for the 3 that you mentioned when I looked in the /var/log directory.
We have a primary and a failover site; the issue is on the failover site:
2 datacenters
2 Pure Storage
2 Cisco B200M5s
ESXi hosts are all 8.0 3d
The storage adapters are a shared connection from the chassis; the FIs have 2 physical connections that are shared between blades. These are not standalone servers, it's shared. If it were a physical connection issue, it would affect all blades.
The alerts that we are getting from vCenter are for paths that do not exist, which is why the datastore shows as 'unknown':
Alarm alarm.StorageConnectivityAlarm on Host host1_name
because Path redundancy to storage device naa.624a93709cac436e0b074d33000f4019 degraded. Path vmhba5:C0:T692:L237 is down. Affected datastores: Unknown..
Not sure why the T: (Target) is so high in the numbering...
Seems like there's a cached path somewhere that is triggering the alarms.
When we do get the alerts, there is nothing down, all Hosts, VMs, vNic, storage adapters, datastores, etc. are green and good. Why is it that we don't get the alerts for the primary site but just for the failover site? They have the same setup and configuration.
Things tried:
Rescanned datastores
Rescanned storage adapters
Disabled then re-enabled different alarms: vSAN online health alarm 'Disks usage on storage controller' (vCenter level), Cannot connect to storage (vCenter level), Datastore usage on disk (vCenter level)
SFP health check was done by Pure Support
Any idea on this is greatly appreciated.
Original Message:
Sent: Apr 07, 2025 06:53 PM
From: Bharath Kumar G
Subject: lost connectivity to storage device affected datastores unknown
Hi Jennifer,
We should be able to tell from taking a look at the logs.
Can you upload the hostd, vmkernel and vobd logs?
Original Message:
Sent: Apr 07, 2025 11:33 AM
From: Jennifer Nguyen
Subject: lost connectivity to storage device affected datastores unknown
Hi Dave,
Were you able to get this issue resolved? I have the same exact issue.
Original Message:
Sent: Aug 04, 2010 02:51 PM
From: DAMahoney
Subject: lost connectivity to storage device affected datastores unknown
Hello,
I am receiving the error "lost connectivity to storage device affected datastores unknown" once a day, at different times every day. Sometimes it identifies which datastore is affected; so far it is only happening to two datastores. I have been running vSphere for about 10 months without issues and haven't changed/patched anything recently. I have 2 hosts and 2 EqualLogic PS5000 SANs connected through iSCSI. Screenshots of my virtual switches are attached if that helps.
Thank you for your assistance