VMware vSphere

  • 1.  Replacing ESXi host to SAN network interconnect switch

    Posted Jan 20, 2025 09:01 PM

    We are in the midst of replacing the aging switches that connect our SAN NFS environment to our ESXi 8 hosts. We have made several attempts to no avail. The most recent attempt got very close, but we ultimately had to back it out.

    Here is a high-level outline of our process:

    • Shut down the VMs.
    • Disable HA in vCenter.
    • Edit the virtual switch networking to add the new vmnics and remove the old vmnics.
    • Remove the old DAC cables between the SAN units and the old switch, then connect the new switches to the SAN units with DAC cables. Note: during this period, the NFS volumes are still listed on the ESXi hosts, but no volume data or size information appears.
    • Test basic connectivity to and from the hosts and the SAN; all tests pass as expected.
    • Observe the NFS volumes; they initially appear to be present and intact.

    At this point, we assume that we can start bringing VMs back online. When we attempt to bring a single VM online, it hangs and is not able to power on. A further look at the NFS volumes shows that some are available, but others show a size of "0", and we are unable to browse files from the individual ESXi host management interface. We end up having to back the cutover out and move back to the original config. Systems restore fully within 30 seconds of reconnecting.

    We don't place the hosts in maintenance mode or shut them down at any point during this process. We also do not take the NFS volumes offline. We are wondering whether this is a step we need to consider, as it may force the connections to come up clean. We are also fighting the config on the new switches, which are a pair of Dell S5248 units configured with VLT between them. We are considering ripping the VLT out and treating them as two independent switches.

    Any thoughts or ideas on this process?



  • 2.  RE: Replacing ESXi host to SAN network interconnect switch

    Posted Jan 21, 2025 02:29 AM

    This sounds like an MTU issue to me. Could it be that you used jumbo frames on the old switches and the new switches are still at MTU 1500?

    regards, raoul.




  • 3.  RE: Replacing ESXi host to SAN network interconnect switch

    Posted Jan 22, 2025 08:29 AM

    Recommendations

    1. Plan a Clean Cutover:

      • Place ESXi hosts in maintenance mode before starting the process.
      • Ensure no VMs are powered on and temporarily disconnect NFS volumes from the ESXi hosts before replacing the switches.
      • This ensures there are no active connections to NFS volumes, preventing inconsistent access issues.
    2. Connectivity Tests Before Re-enabling Volumes:

      • Verify paths and test connectivity between hosts and NFS volumes using tools like vmkping.
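      • If using jumbo frames (MTU 9000), also run a full-size test with the don't-fragment flag set (vmkping -s 8972 -d <NFS IP>), so that a smaller MTU anywhere on the path shows up as a failure.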
    3. VLT Configuration:

      • Temporarily disable VLT and configure the switches as independent units to simplify the setup. Once the migration works correctly, you can consider enabling VLT again.
    4. Allow Time for Stabilization:

      • Ensure ESXi hosts fully recognize the paths to NFS volumes after reconnection. This might take time depending on the environment.
    5. Log Review:

      • Check vCenter event logs and ESXi logs (e.g., /var/log/vobd.log and /var/log/hostd.log) for errors related to NFS.
    6. Advanced NFS Configuration:

      • Consider increasing retry and timeout values in the NFS mount options to provide greater resilience during network interruptions.
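
    For the jumbo frame test in recommendation 2, the payload size 8972 is the MTU minus the packet headers. The following is only a rough sketch of that arithmetic, assuming IPv4 with no IP options (20-byte IP header plus 8-byte ICMP header). The function names, the vmk1 interface, and the 192.0.2.10 address are hypothetical placeholders, not values from this environment.

```python
# Sketch: derive the vmkping payload size for a given MTU.
# Assumption: IPv4 without IP options, so 20 bytes of IP header
# plus 8 bytes of ICMP header are added to the payload.
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} is too small for an IPv4 ICMP echo")
    return mtu - overhead


def vmkping_command(mtu: int, target: str, vmk: str = "") -> str:
    """Build a vmkping command line with the don't-fragment flag (-d).

    `vmk` optionally names the VMkernel interface to send from (-I).
    """
    parts = ["vmkping"]
    if vmk:
        parts += ["-I", vmk]
    parts += ["-s", str(max_ping_payload(mtu)), "-d", target]
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical storage VMkernel port and NFS server address.
    print(vmkping_command(9000, "192.0.2.10", vmk="vmk1"))
    # The same test at the default MTU, for comparison.
    print(vmkping_command(1500, "192.0.2.10", vmk="vmk1"))
```

    If the 1472-byte test passes but the 8972-byte test fails, something on the path (vmkernel port, virtual switch, physical switch port, or SAN interface) is most likely not set for jumbo frames.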

    Best Regards