ESXi-Arm Fling

  • 1.  VMs on Fling become read only

    Posted Apr 12, 2023 10:59 PM

    Greetings Esteemed Colleagues;

        I hope someone can give me some guidance.  I have a two-cluster vCenter with a 3-node x86 cluster and a 2-node ARM cluster running on 8 GB Raspberry Pis.  I'm running vCenter 7.0d and build 18427252 of the ESXi-Arm Fling.  All my VM storage comes from a Synology NAS with 11 TB of available storage, and both clusters use the same storage.  My x86 cluster is running CentOS or RHEL VMs; my ARM cluster is running Ubuntu ARM VMs and Raspbian VMs.

        What vexes me is that my VMs on the ARM cluster will often go into read-only mode.  Most of the time they'll recover with a reboot, but occasionally I get file system corruption and have to either restore from a snapshot or start over with a fresh template.  I've never had any VMs in my x86 cluster go into read-only mode.

        From experience I know that should storage become briefly unavailable, Linux will go into read-only mode as a matter of self-preservation, but I don't think that's what is happening here, since the ARM VMs go RO on a fairly regular basis and I've never had that happen with my x86 VMs.

        Between the ARM Fling's problems running with vCenter 7.0.1 and newer and this RO issue, I had nearly abandoned the project, but I now find that I would like to experiment with Docker and Kubernetes in a mixed-architecture environment, so I back-rev'd my vCenter to 7.0d to restore HA to my ARM cluster.  I'm a member of VMUG, so I get my licensing through them, which saves me from reinstalling the ESXi-Arm Fling every few months.

        Has anybody else experienced this issue and is there a solution?

     

    Thanks

    -Bob

     



  • 2.  RE: VMs on Fling become read only

    Broadcom Employee
    Posted Apr 13, 2023 12:10 AM

    Hi Bob, thanks for testing the ESXi Arm Fling!

    As you mentioned, Linux goes RO when it detects an I/O error during disk access. I see two possible causes for it:

    • the virtual device emulation. You can try switching between SATA, NVMe and PVSCSI (depending on which drivers your Linux came with). What are your VMs currently using?
    • the ESXi network stack, as your disks are shared from the NAS.
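
    On the first point, one way to see which controller types a VM is configured with is to look at its .vmx file on the datastore. A rough sketch, assuming the usual .vmx keys (`sata0.present`, `nvme0.present`, `scsi0.present`, `scsi0.virtualDev`); the function name `controllers` is just made up for this example:

```python
import re

# Matches lines such as: scsi0.virtualDev = "pvscsi"  or  sata0.present = "TRUE"
# Disk entries like sata0:0.fileName are deliberately not matched.
_KEY = re.compile(
    r'^\s*(sata|nvme|scsi)(\d+)\.(present|virtualDev)\s*=\s*"([^"]*)"',
    re.IGNORECASE,
)


def controllers(vmx_text):
    """Map controller name (e.g. 'scsi0') to a type label, given .vmx text."""
    present = {}
    scsi_kind = {}
    for line in vmx_text.splitlines():
        m = _KEY.match(line)
        if not m:
            continue
        bus, idx, key, value = (
            m.group(1).lower(), m.group(2), m.group(3).lower(), m.group(4),
        )
        name = f"{bus}{idx}"
        if key == "present":
            present[name] = value.upper() == "TRUE"
        elif bus == "scsi":
            # virtualDev names the SCSI adapter model, e.g. "pvscsi"
            scsi_kind[name] = value.lower()

    result = {}
    for name, is_present in present.items():
        if not is_present:
            continue
        if name.startswith("scsi"):
            result[name] = scsi_kind.get(name, "scsi (type not stated)")
        elif name.startswith("sata"):
            result[name] = "sata-ahci"
        else:
            result[name] = "nvme"
    return result
```

    Feeding it the contents of a VM's .vmx file returns something like a dict of controller names to types, which makes it easy to compare VMs before and after changing the controller.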

    The Arm architecture has some particularities and the fling may have a few bugs left in either of those areas.

     

    To help troubleshoot, there are a couple of things you can provide:

    • Your VM's "dmesg" output after the system went RO. If you can still ssh into it, run dmesg and save the output. Linux will almost certainly have logged what went wrong and triggered it to remount the filesystems read-only.
    • A support bundle from the ESXi host, when the issue appeared in a guest. See https://kb.vmware.com/s/article/1010705 on how to get it.
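
    For the guest side, something along these lines could pull out the relevant bits. It is only a sketch: the list of kernel-log patterns is my own guess at what is worth looking for, not an exhaustive set, and the function names are made up for this example.

```python
import re
import subprocess

# Kernel-log fragments that often appear around a read-only remount.
# Assumption: this list is illustrative only; your kernel may log other messages.
SUSPECT = re.compile(
    r"I/O error|Remounting filesystem read-only|EXT4-fs error"
    r"|hard resetting link|failed command",
    re.IGNORECASE,
)


def read_only_mounts(mounts_text):
    """Return (device, mountpoint) pairs mounted read-only, from /proc/mounts text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mountpoint, _fstype, options = fields[:4]
        # Skip pseudo filesystems and loop devices (e.g. snap squashfs images),
        # which are read-only by design.
        if not device.startswith("/dev/") or device.startswith("/dev/loop"):
            continue
        if "ro" in options.split(","):
            found.append((device, mountpoint))
    return found


def suspicious_lines(dmesg_text):
    """Return the kernel-log lines that match one of the SUSPECT patterns."""
    return [line for line in dmesg_text.splitlines() if SUSPECT.search(line)]


def report():
    """Print read-only mounts and matching dmesg lines on the running guest."""
    with open("/proc/mounts") as f:
        for device, mountpoint in read_only_mounts(f.read()):
            print(f"read-only: {device} on {mountpoint}")
    # dmesg may need root on systems that restrict kernel-log access.
    log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    for line in suspicious_lines(log):
        print(line)
```

    Calling report() inside the guest right after it goes RO prints to stdout, so the output can still be captured over ssh even though the root filesystem is no longer writable. The full, unfiltered dmesg output is still the most useful thing to attach.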

     

    Cheers,

    Cyprien



  • 3.  RE: VMs on Fling become read only

    Posted Apr 13, 2023 12:29 AM

    If I understand your question correctly, my VMs are using SATA AHCI.  I'll see what other options are available.

     

    Cheers

    -Bob