VMware vSphere

  • 1.  VCSA HA eth0 and services don't start after reboot

    Posted Jan 13, 2017 01:42 PM

    Hi,

    I configured a VCSA 6.5 HA deployment in our test environment. All three nodes were up and healthy, and vCenter showed them green in the HA status. First problem: after a simultaneous reboot of all three machines, the eth0 interface of the active and passive nodes stays down. Second problem: even after configuring eth0 manually and regaining IP connectivity on the public interface, the VCSA services can't be started:

    Command> service-control --status
    Running:
    vmware-statsmonitor vmware-vcha vmware-vmon
    Stopped:
    applmgmt lwsmd pschealth vmafdd vmcad vmcam vmdird vmdnsd vmonapi vmware-cis-license vmware-cm vmware-content-library vmware-eam vmware-imagebuilder vmware-mbcs vmware-netdumper vmware-perfcharts vmware-psc-client vmware-rbd-watchdog vmware-rhttpproxy vmware-sca vmware-sps vmware-sts-idmd vmware-stsd vmware-updatemgr vmware-vapi-endpoint vmware-vpostgres vmware-vpxd vmware-vpxd-svcs vmware-vsan-health vmware-vsm vsphere-client vsphere-ui

    Command> service-control --start --all
    Perform start operation. vmon_profile=HACore, svc_names=None, include_coreossvcs=True, include_leafossvcs=False
    2017-01-13T13:30:25.695Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'lwsmd']
    2017-01-13T13:30:25.698Z   Done running command
    Service lwsmd startup type is not automatic. Skip
    2017-01-13T13:30:25.701Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmafdd']
    2017-01-13T13:30:25.703Z   Done running command
    Service vmafdd startup type is not automatic. Skip
    2017-01-13T13:30:25.705Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmdird']
    2017-01-13T13:30:25.707Z   Done running command
    Service vmdird startup type is not automatic. Skip
    2017-01-13T13:30:25.710Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmcad']
    2017-01-13T13:30:25.712Z   Done running command
    Service vmcad startup type is not automatic. Skip
    2017-01-13T13:30:25.714Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmware-sts-idmd']
    2017-01-13T13:30:25.716Z   Done running command
    Service vmware-sts-idmd startup type is not automatic. Skip
    2017-01-13T13:30:25.719Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmware-stsd']
    2017-01-13T13:30:25.721Z   Done running command
    Service vmware-stsd startup type is not automatic. Skip
    2017-01-13T13:30:25.723Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmdnsd']
    2017-01-13T13:30:25.726Z   Done running command
    Service vmdnsd startup type is not automatic. Skip
    2017-01-13T13:30:25.728Z   Running command: ['/usr/bin/systemctl', 'is-enabled', u'vmware-psc-client']
    2017-01-13T13:30:25.730Z   Done running command
    Service vmware-psc-client startup type is not automatic. Skip
    Successfully started vmon services. Profile HACore.
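
    Every service the start operation passes over is reported with the line "startup type is not automatic. Skip". To list those services without reading the whole log, a small parser is enough. This is only a minimal Python sketch: `skipped_services` is a hypothetical helper name, not part of service-control, and it assumes the log wording shown in this post.

```python
import re

# Matches lines such as: "Service lwsmd startup type is not automatic. Skip"
# (wording taken from the service-control output in this post).
SKIP_RE = re.compile(r"^\s*Service (\S+) startup type is not automatic\. Skip\s*$")


def skipped_services(log_text):
    """Return the names of services the start operation skipped, in log order."""
    names = []
    for line in log_text.splitlines():
        match = SKIP_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    sample = (
        "2017-01-13T13:30:25.698Z   Done running command\n"
        "Service lwsmd startup type is not automatic. Skip\n"
        "Service vmafdd startup type is not automatic. Skip\n"
        "Successfully started vmon services. Profile HACore.\n"
    )
    print(skipped_services(sample))
```

    Feeding it the full output of the start command would return the eight services named in the "Skip" lines.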

    /etc/systemd/network/10-eth0.network.manual:

    [Match]
    Name=eth0

    [Network]
    Gateway=10.45.128.1
    Address=10.45.128.32/24
    DHCP=no

    [DHCP]
    UseDNS=false

    /etc/systemd/network/10-eth1.network:

    [Match]
    Name=eth1

    [Network]
    Address=192.168.64.204/23
    DHCP=no

    [DHCP]
    UseDNS=false

    I then shut down the passive node to get the active and witness nodes up again, but the problems persist. Pinging between the HA interfaces of the active and witness nodes works. Why are both eth0 interfaces down after boot, and why can't I start the services?

    edit:

    networkctl status eth0

    ● 2: eth0
           Link File: /usr/lib/systemd/network/99-default.link
        Network File: n/a
                Type: ether
               State: routable (unmanaged)
                Path: pci-0000:03:00.0
              Driver: vmxnet3
              Vendor: VMware
               Model: VMXNET3 Ethernet Controller
          HW Address: 00:0c:29:7e:7e:08 (VMware, Inc.)
                 MTU: 1500
             Address: 10.45.128.32
             Gateway: 10.45.128.1 (ICANN, IANA Department)

    "Network File: n/a" and "State: unmanaged"? But I do have the file /etc/systemd/network/10-eth0.network.manual, which was created by vCenter. How can I fix this?



  • 2.  RE: VCSA HA eth0 and services don't start after reboot

    Posted Feb 08, 2017 10:43 AM

    Exactly the same thing happened to me after a failed HA cluster installation.

    I destroyed the failed cluster, deleted the passive and witness nodes, restarted the active node, and lost the management IP.

    I did some experiments with /etc/systemd/network/10-eth0.network.manual:

    root@vcsa1 [ /etc/systemd/network ]# mv 10-eth0.network.manual 10-eth0.network

    root@vcsa1 [ /etc/systemd/network ]# systemctl restart systemd-networkd

    root@vcsa1 [ /etc/systemd/network ]# networkctl

    Got this:

    IDX LINK             TYPE               OPERATIONAL SETUP
      1 lo               loopback           carrier     unmanaged
      2 eth0             ether              routable    configured


    eth0 is up, pings fine.
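
    That fits how systemd-networkd selects its configuration: it only treats files whose names end in .network as network definitions, so a file called 10-eth0.network.manual is not picked up as one. A minimal Python sketch of that naming rule follows; `split_by_suffix` and `scan` are hypothetical helper names, and the sketch only checks file names, it does not parse the files or ask networkd anything.

```python
from pathlib import Path


def split_by_suffix(names):
    """Split file names into (loaded, ignored) using the '.network' suffix rule.

    'ignored' only means "not read as a .network definition"; other suffixes
    such as .link or .netdev have their own meaning and are not handled here.
    """
    names = list(names)  # allow generators; we iterate twice
    loaded = sorted(n for n in names if n.endswith(".network"))
    ignored = sorted(n for n in names if not n.endswith(".network"))
    return loaded, ignored


def scan(directory="/etc/systemd/network"):
    """Apply the suffix rule to the regular files in a directory."""
    return split_by_suffix(p.name for p in Path(directory).iterdir() if p.is_file())


if __name__ == "__main__":
    # File names taken from this thread.
    loaded, ignored = split_by_suffix(
        ["10-eth0.network.manual", "10-eth1.network"]
    )
    print("read as .network files:", loaded)
    print("not read as .network files:", ignored)
```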

    After a reboot the network is down again, and 10-eth0.network has been renamed back to 10-eth0.network.manual.

    Fine.

    root@vcsa1 [ /etc/systemd/network ]# cp 10-eth0.network.manual 20-eth0.network

    root@vcsa1 [ /etc/systemd/network ]# systemctl restart systemd-networkd


    Network is up.

    I rebooted the VCSA while pinging it from outside. It starts up, I see a couple of ping replies, and then nothing again.

    The copied file 20-eth0.network is still there, but something brings the network down during boot.

    root@vcsa1 [ /etc/systemd/network ]# systemctl restart systemd-networkd

    Makes it work again.
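
    As a stopgap while troubleshooting (not a fix), that check-and-restart can be scripted. The Python sketch below parses the `networkctl` link list and restarts systemd-networkd only when the link's SETUP column is not "configured". The function names are hypothetical, the column layout is assumed to match the output shown earlier in this post, and the script has to run as root on the appliance.

```python
import subprocess


def link_setup_states(networkctl_output):
    """Map link name -> (operational, setup) from `networkctl` list output.

    Assumes the five-column layout shown in this post:
    IDX LINK TYPE OPERATIONAL SETUP. Header and summary lines are skipped.
    """
    states = {}
    for line in networkctl_output.splitlines():
        parts = line.split()
        if len(parts) == 5 and parts[0].isdigit():
            states[parts[1]] = (parts[3], parts[4])
    return states


def restart_networkd_if_needed(link="eth0"):
    """Restart systemd-networkd when `link` is not reported as configured.

    Returns True if a restart was triggered, False otherwise.
    """
    output = subprocess.check_output(
        ["networkctl", "--no-pager", "--no-legend"], universal_newlines=True
    )
    setup = link_setup_states(output).get(link, (None, None))[1]
    if setup != "configured":
        subprocess.check_call(["systemctl", "restart", "systemd-networkd"])
        return True
    return False


if __name__ == "__main__":
    print("restarted:", restart_networkd_if_needed("eth0"))
```

    Whatever takes the interface down during boot is still there, so this only hides the symptom.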

    Just out of curiosity:

    root@vcsa1 [ /etc/systemd/network ]# /etc/rc.d/init.d/network start

    Starting network (via systemctl):  Job for network.service failed because the control process exited with error code. See "systemctl status network.service" and "journalctl -xe" for details.

                                                              [FAILED]


    root@vcsa1 [ /etc/systemd/network ]# systemctl status network.service


    ● network.service - LSB: Bring up/down networking
       Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: enabled)
       Active: failed (Result: exit-code) since Wed 2017-02-08 10:28:37 UTC; 10s ago
         Docs: man:systemd-sysv-generator(8)
      Process: 2345 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=6)

    Feb 08 10:28:37 vcsa1.lab.local systemd[1]: Starting LSB: Bring up/down networking...
    Feb 08 10:28:37 vcsa1.lab.local systemd[1]: network.service: Control process exited, code=exited status=6
    Feb 08 10:28:37 vcsa1.lab.local systemd[1]: Failed to start LSB: Bring up/down networking.
    Feb 08 10:28:37 vcsa1.lab.local systemd[1]: network.service: Unit entered failed state.
    Feb 08 10:28:37 vcsa1.lab.local systemd[1]: network.service: Failed with result 'exit-code'.

    I really want to fix this, since it's the second time I have redeployed the VCSA and run into the same problem.

    Anyone?

    UPDATE.

    It seems that the vCenter HA cluster had not been destroyed correctly.

    Destroying it manually on the active node did the trick:

    root@vcsa1 [ ~ ]# destroy-vcha
    Caution: This will remove all vCenter HA related configuration from the current node and it cannot be reused to form a vCenter HA cluster unless this is the Active node.
    Confirm to proceed? (y/n): y
    logs available at: /var/log/vmware/vcha
    2017-02-08T13:31:44.935Z   Successfully updated starttype: DISABLED for service vcha
    2017-02-08T13:31:50.644Z   Running command: ['/usr/lib/applmgmt/networking/bin/firewall-reload']
    2017-02-08T13:31:50.778Z   Done running command
    Skip not found service - vmware-stsd
    Skip not found service - vmware-sts-idmd
    Skip not found service - vmdnsd
    Skip not found service - vmdird
    Skip not found service - vmcad
    Skip not found service - vmware-psc-client

    Reboot and you're done.



  • 3.  RE: VCSA HA eth0 and services don't start after reboot

    Posted Apr 12, 2017 06:39 PM

    +1 for

    # destroy-vcha 



  • 4.  RE: VCSA HA eth0 and services don't start after reboot

    Posted Nov 08, 2017 02:44 AM

    THANK YOU!!!!! I thought I'd be proactive and deploy VCSA HA. What a nightmare! VMware needs to pull this feature until it's fixed. I lost eth0 like you guys and spent the last couple of hours troubleshooting. I was about to rebuild the VCSA when I came across this post. Saved the day :)



  • 5.  RE: VCSA HA eth0 and services don't start after reboot

    Posted Feb 20, 2024 11:35 AM