Hi!
I have a host (ESXi 6.0) that is disconnected from vCenter, so I started investigating the problem.
The hostd service doesn't start:
[root@esx-35:/var/run/vmware] /etc/init.d/hostd restart
watchdog-hostd: PID file /var/run/vmware/watchdog-hostd.PID does not exist
watchdog-hostd: Unable to terminate watchdog: No running watchdog process for hostd
sh: you need to specify whom to kill
Ramdisk 'hostd' with estimated size of 1803MB already exists
[root@esx-35:/var/run/vmware] /opt/vmware/vpxa/bin/vmware-watchdog -r hostd
-sh: /opt/vmware/vpxa/bin/vmware-watchdog: not found
[root@esx-35:/var/run/vmware] /sbin/watchdog.sh -r hostd
(no output)
[root@esx-35:/var/run/vmware] ls -l vmware-hostd.PID watchdog-hostd.PID
ls: watchdog-hostd.PID: No such file or directory
-rw-r--r-- 1 root root 8 Jan 15 13:46 vmware-hostd.PID
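Since vmware-hostd.PID exists but watchdog-hostd.PID does not, one thing worth checking is whether the PID recorded in that file still belongs to a running process or is left over from a dead hostd. Below is a minimal sketch of such a check, not an official VMware procedure. The function name is_stale_pidfile is made up for illustration, and it assumes the ESXi shell supports "kill -0" (signal 0 sends nothing and only tests whether the process exists).

```shell
# Hypothetical helper (not part of ESXi): report whether a PID file is stale.
# Return codes: 0 = stale (no such process), 1 = process is alive, 2 = file missing.
is_stale_pidfile() {
    pidfile="$1"

    # No PID file at all: nothing to compare against.
    [ -f "$pidfile" ] || return 2

    pid=$(cat "$pidfile")

    # Empty or non-numeric content cannot refer to a live process.
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;
    esac

    # Signal 0 performs only the existence/permission check.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

For example, running "is_stale_pidfile /var/run/vmware/vmware-hostd.PID; echo $?" would print 0 if the file is stale, 1 if the recorded process is still alive, and 2 if the file is missing.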
hostd.log:
2019-01-15T10:00:16.218Z error hostd[676C1B70] [Originator@6876 sub=SoapAdapter.HTTPService.HttpConnection] Failed to read header on stream <io_obj p:0x667580a4, h:36, <TCP '0.0.0.0:0'>, <TCP '0.0.0.0:0'>>: N7Vmacore15SystemExceptionE(Connection reset by peer)
2019-01-15T10:02:59.391Z error hostd[67CC4B70] [Originator@6876 sub=SoapAdapter.HTTPService.HttpConnection] Failed to read header on stream <io_obj p:0x6665fc6c, h:31, <TCP '0.0.0.0:0'>, <TCP '0.0.0.0:0'>>: N7Vmacore15SystemExceptionE(Connection reset by peer)
2019-01-15T10:03:24.108Z error hostd[674BAB70] [Originator@6876 sub=Solo.VmwareCLI opID=esxcli-22-71b7 user=root] GetPrimitiveParam: Cannot find (help)
2019-01-15T10:03:24.408Z error hostd[674FBB70] [Originator@6876 sub=Solo.VmwareCLI opID=esxcli-a0-71cb user=root] GetPrimitiveParam: Cannot find (help)
2019-01-15T10:03:24.926Z error hostd[67CC4B70] [Originator@6876 sub=SoapAdapter.HTTPService.HttpConnection] Failed to read header on stream <io_obj p:0x67a3a74c, h:34, <TCP '0.0.0.0:0'>, <TCP '0.0.0.0:0'>>: N7Vmacore15SystemExceptionE(Connection reset by peer)
2019-01-15T10:03:24.977Z error hostd[67CC4B70] [Originator@6876 sub=Solo.VmwareCLI opID=esxcli-e7-71db user=root] GetPrimitiveParam: Cannot find (help)
2019-01-15T14:47:58.335Z warning -[FFA75B20] [Originator@6876 sub=Default] Estimated fds limit 4864 > 4096 max supported by setrlimit. Setting fds limit to 4096
2019-01-15T14:47:58.336Z warning hostd[FFA75B20] [Originator@6876 sub=Default] Unrecognized log/level '' using 'info'
2019-01-15T14:47:58.380Z warning hostd[FFA75B20] [Originator@6876 sub=Hostsvc] Removing duplicate pools.xml entry 'resourcePool[0003]'
2019-01-15T14:47:58.380Z warning hostd[FFA75B20] [Originator@6876 sub=Hostsvc] Destroying unregistered VMkernel resource group 'host/user/pool2/pool1'
2019-01-15T14:47:58.386Z warning hostd[FFA75B20] [Originator@6876 sub=Hostsvc] Destroying unregistered VMkernel resource group 'host/user/pool2/pool1/vmx.15702277'
2019-01-15T14:47:58.386Z warning hostd[FFA75B20] [Originator@6876 sub=Hostsvc] Destroying unregistered VMkernel resource group 'host/user/pool2/pool1/vmx.15702277/worldGroup.15702277'
I found these KB articles:
https://kb.vmware.com/s/article/1005566
https://kb.vmware.com/s/article/1003490?1=
In my case I use LACP. In KB 1003490 I see this:
- If LACP is enabled and configured, do not restart management services using services.sh command. Instead restart independent services using the /etc/init.d/module restart command.
I ran the "services.sh restart" command on this host and on other hosts. The other hosts are fine, but this one has gone crazy :)
Any ideas?
P.S. I can't reboot the host.