Hello all,
I am trying to figure out an issue with one of our ESXi hosts.
I was in vCenter clearing alerts when, all of a sudden, one of the hosts showed "Not responding".
I hopped on the ESXi host directly and sure enough it was up and running. Everything seemed fine, but when I checked the VMs I noticed I wasn't able to open the console. I just got a spinning hourglass.
After checking a few things, I tried to disconnect and reconnect the host, but the reconnect failed at around 80% with a timeout.
The host then showed a disconnected state in vCenter, so I removed it from inventory and tried to re-add it.
Now vCenter flat out refused the connection, saying the host is not reachable. I went back to the host and the web GUI was returning the error below:
503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http16LocalServiceSpecE:0x000000d09c704f80] _serverNamespace = / action = Allow _port = 8309)
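For what it's worth, one way to check whether anything is actually accepting connections on the port named in that error is a small script like the one below. This is only a sketch: it assumes Python is available wherever it is run and that the endpoint is on the loopback address, and `port_is_open` is a made-up helper name, not part of any VMware tooling.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8309 is the port from the 503 error above; 127.0.0.1 is an assumption.
    print("port 8309 accepting connections:", port_is_open("127.0.0.1", 8309))
```

A result of False would only mean that nothing accepted a TCP connection on that port at that moment; it does not say why.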
I was still able to SSH into the machine, and I tried restarting the hostd and vpxa services per the VMware articles, but that did not help. vpxa initially failed to restart because of a missing watchdog.pid file, even though the file is there in the directory.
I then restarted all the management services with services.sh restart. Everything restarts, but I still cannot connect to the host via the web GUI or vCenter.
I was thinking about shutting down all the VMs, but the command to list the VMs over SSH returns connection refused.
[root@myhost:/etc/init.d]vim-cmd vmsvc/getallvms
Failed to login: Connection refused: The remote service is not running, OR is overloaded, OR a firewall is rejecting connections.
I am at a complete loss, as the guest VMs are running fine but I cannot get the host to restart the management consoles or connect via vCenter.
Please help.