VMware vSphere

 View Only
  • 1.  my virtual machine on esxi down after 1h

    Posted Apr 06, 2021 07:55 PM

    I have a virtual server in ESXI version 6.7 that has been running for so long that I can not access it automatically after an hour or two in a few days.
    Ping from the IP of this server gives time out
    Also, when I open the console in the browser, no page appears to me and it is black.
    I looked at the VMware.log file and saw the text below
    Please help fix this problem

     

    2021-04-06T18:09:54.194Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:10:54.619Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:11:54.621Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:12:54.695Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:13:54.694Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:14:54.694Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:15:54.688Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:16:54.676Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:17:54.687Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:18:54.688Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:19:54.687Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:20:54.689Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:21:54.688Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:22:54.682Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:23:54.685Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:24:54.685Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:25:54.738Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:26:56.734Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:27:56.724Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:28:57.292Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:29:57.418Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:30:57.538Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:31:57.658Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T18:32:57.808Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T19:09:04.650Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox timed out.
    2021-04-06T19:09:19.158Z| vcpu-0| I125: Tools: Tools heartbeat timeout.
    2021-04-06T19:09:19.158Z| vcpu-0| I125: Tools: Running status rpc handler: 1 => 0.
    2021-04-06T19:09:19.158Z| vcpu-0| I125: Tools: Changing running status: 1 => 0.
    2021-04-06T19:09:24.654Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox timed out.
    2021-04-06T19:09:24.654Z| vmx| I125: GuestRpc: app toolbox's second ping timeout; assuming app is down
    2021-04-06T19:09:24.656Z| vmx| I125: GuestRpc: Reinitializing Channel 0(toolbox)
    2021-04-06T19:09:24.656Z| vmx| I125: GuestMsg: Channel 0, Cannot unpost because the previous post is already completed
    2021-04-06T19:26:43.213Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-06T19:26:45.817Z| vmx| I125: VigorTransportProcessClientPayload: opID=esxui-b809-85e2 seq=1920878: Receiving MKS.IssueTicket request.
    2021-04-06T19:26:45.817Z| vmx| I125: SOCKET creating new socket listening on /var/run/vmware/ticket/5c88c1cc3a32bd3c
    2021-04-06T19:26:45.818Z| vmx| I125: SOCKET 7121 (80) creating new listening socket on port -1
    2021-04-06T19:26:45.818Z| vmx| I125: Issuing new webmks ticket 5c88c1... (120 seconds)
    2021-04-06T19:26:45.818Z| vmx| I125: VigorTransport_ServerSendResponse opID=esxui-b809-85e2 seq=1920878: Completed MKS request with messages.
    2021-04-06T19:26:45.913Z| mks| I125: Accepting connection for webmks ticket 5c88c1...
    2021-04-06T19:26:45.913Z| mks| I125: Expiring webmks ticket 5c88c1...
    2021-04-06T19:26:45.913Z| mks| I125: SOCKET 7122 (117) AsyncTCPSocketSetOption: Option layer/level [6], option/name [1]: successfully set OS option for TCP socket.
    2021-04-06T19:26:45.913Z| mks| W115: SOCKET 7123 (117) unable to determine remote IP address
    2021-04-06T19:26:45.914Z| mks| I125: SOCKET 7122 (117) AsyncTCPSocketSetOption: sendLowLatencyMode set to [1].
    2021-04-06T19:26:45.914Z| mks| I125: SOCKET 7123 (117) Creating VNC remote connection.
    2021-04-06T19:26:45.914Z| mks| I125: MKSControlMgr: New VNC connection 87
    2021-04-06T19:26:45.914Z| svga| I125: VNCENCODE 7123 VNCEncode: VNCEncode_ServerSetTopology - original root: (0, 0) size: (603, 499)
    2021-04-06T19:26:45.914Z| svga| I125: VNCENCODE 7123 VNCEncode: Number of screens changed from 0 to 1
    2021-04-06T19:26:45.914Z| svga| I125: VNCENCODE 7123 VNCEncode: screen: 0 BoundingBox: (603x499) Screen (603x499) @ (0,0) bytesPerLine: 2412
    2021-04-06T19:26:46.050Z| mks| W115: VNCENCODE 7123 JPEG quality levels (min, mid, max). Input: (25, 35, 90) Clamped: (25, 35, 90)
    2021-04-06T19:26:46.050Z| mks| I125: VNCENCODE 7123 VNCEncodeChooseRegionEncoder: region encoder adaptive. Screen 1/1 @ Resolution: 603 x 499
    2021-04-06T19:26:46.343Z| vmx| I125: Tools_SetGuestResolution: Sending rpcMsg = Resolution_Set 603 499
    2021-04-06T19:26:51.498Z| mks| I125: SOCKET 7123 (117) Received websocket close frame with empty status code
    2021-04-06T19:26:51.498Z| mks| I125: SOCKET 7123 (117) Sending websocket close frame, status code = 1000
    2021-04-06T19:26:51.498Z| mks| I125: SOCKET 7123 (117) VNC Remote Disconnect: socket closed.
    2021-04-06T19:26:51.499Z| mks| I125: MKSControlMgr: Remove VNC connection 87

    Moderator edit by wila: Moved post to VMware ESXi discussions



  • 2.  RE: my virtual machine on esxi down after 1h

    Posted Apr 07, 2021 06:14 AM

    It could be many things... Is it a Windows VM? Or Linux? What is the hardware version of the VM? What is the version of VMware Tools? What did you already try to solve it?



  • 3.  RE: my virtual machine on esxi down after 1h

    Broadcom Employee
    Posted Apr 07, 2021 09:08 AM

    ESXi can't reach Tools, the guest has hung:

    2021-04-06T19:09:04.650Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox timed out.
    2021-04-06T19:09:19.158Z| vcpu-0| I125: Tools: Tools heartbeat timeout.
    2021-04-06T19:09:19.158Z| vcpu-0| I125: Tools: Running status rpc handler: 1 => 0.
    2021-04-06T19:09:19.158Z| vcpu-0| I125: Tools: Changing running status: 1 => 0.
    2021-04-06T19:09:24.654Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox timed out.
    2021-04-06T19:09:24.654Z| vmx| I125: GuestRpc: app toolbox's second ping timeout; assuming app is down

    Is there any indication before the svga / mks logs? Anything in the vmkernel logs?

    If this isn't happening to any other VM on that host / datastore then you should look for a in-guest problem. You can send an NMI and see whether it still crashes, if not you can suspend the guest, convert the memory to a core file and debug it.



  • 4.  RE: my virtual machine on esxi down after 1h

    Posted Apr 07, 2021 08:27 PM

    Hi
    I have several virtual machines on this ESXI version 6.7 update version 3 On one of these machines I installed some network traffic monitoring software and then deleted it. For about 3 days now, this server has been down every 1 to 2 hours, and I have to reset the server to fix it, which means that the guest OS section is down. Now I will put the relevant logs in the following section for viewing


    In the review I did, my server went down at 22:37


    Solutions that I tried and did not fix:
    1- I added a new network card
    2. I reinstalled VMware Tools
    3- I executed the SFC /scannow command

    My server is a windows server 2008



  • 5.  RE: my virtual machine on esxi down after 1h

    Posted Apr 07, 2021 08:28 PM

    vmware.log

    2021-04-07T22:52:04.175Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T22:53:04.184Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T22:54:04.200Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T22:55:04.250Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T22:56:04.769Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T22:57:05.303Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T22:58:05.876Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T22:59:06.425Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T23:27:13.433Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T23:37:15.196Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox timed out.
    2021-04-07T23:37:20.025Z| vcpu-0| I125: Tools: Tools heartbeat timeout.
    2021-04-07T23:37:20.025Z| vcpu-0| I125: Tools: Running status rpc handler: 1 => 0.
    2021-04-07T23:37:20.025Z| vcpu-0| I125: Tools: Changing running status: 1 => 0.
    2021-04-07T23:37:35.201Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox timed out.
    2021-04-07T23:37:35.201Z| vmx| I125: GuestRpc: app toolbox's second ping timeout; assuming app is down
    2021-04-07T23:37:35.203Z| vmx| I125: GuestRpc: Reinitializing Channel 0(toolbox)
    2021-04-07T23:37:35.203Z| vmx| I125: GuestMsg: Channel 0, Cannot unpost because the previous post is already completed
    2021-04-07T23:38:19.550Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
    2021-04-07T23:38:21.292Z| vmx| I125: VigorTransportProcessClientPayload: opID=esxui-b214-ccab seq=1979170: Receiving MKS.IssueTicket request.
    2021-04-07T23:38:21.293Z| vmx| I125: SOCKET creating new socket listening on /var/run/vmware/ticket/145557bd1eebe415
    2021-04-07T23:38:21.293Z| vmx| I125: SOCKET 481 (134) creating new listening socket on port -1
    2021-04-07T23:38:21.293Z| vmx| I125: Issuing new webmks ticket 145557... (120 seconds)
    2021-04-07T23:38:21.293Z| vmx| I125: VigorTransport_ServerSendResponse opID=esxui-b214-ccab seq=1979170: Completed MKS request with messages.
    2021-04-07T23:38:21.332Z| mks| I125: Accepting connection for webmks ticket 145557...
    2021-04-07T23:38:21.332Z| mks| I125: Expiring webmks ticket 145557...
    2021-04-07T23:38:21.332Z| mks| I125: SOCKET 482 (135) AsyncTCPSocketSetOption: Option layer/level [6], option/name [1]: successfully set OS option for TCP socket.
    2021-04-07T23:38:21.332Z| mks| W115: SOCKET 483 (135) unable to determine remote IP address
    2021-04-07T23:38:21.333Z| mks| I125: SOCKET 482 (135) AsyncTCPSocketSetOption: sendLowLatencyMode set to [1].
    2021-04-07T23:38:21.333Z| mks| I125: SOCKET 483 (135) Creating VNC remote connection.
    2021-04-07T23:38:21.334Z| mks| I125: MKSControlMgr: New VNC connection 1
    2021-04-07T23:38:21.334Z| svga| I125: VNCENCODE 483 VNCEncode: VNCEncode_ServerSetTopology - original root: (0, 0) size: (640, 480)
    2021-04-07T23:38:21.334Z| svga| I125: VNCENCODE 483 VNCEncode: Number of screens changed from 0 to 1
    2021-04-07T23:38:21.334Z| svga| I125: VNCENCODE 483 VNCEncode: screen: 0 BoundingBox: (640x480) Screen (640x480) @ (0,0) bytesPerLine: 2560
    2021-04-07T23:38:21.445Z| mks| W115: VNCENCODE 483 JPEG quality levels (min, mid, max). Input: (25, 35, 90) Clamped: (25, 35, 90)
    2021-04-07T23:38:21.445Z| mks| I125: VNCENCODE 483 VNCEncodeChooseRegionEncoder: region encoder adaptive. Screen 1/1 @ Resolution: 640 x 480
    2021-04-07T23:38:21.752Z| vmx| I125: Tools_SetGuestResolution: Sending rpcMsg = Resolution_Set 640 480
    2021-04-07T23:38:23.590Z| mks| I125: SOCKET 483 (135) Received websocket close frame with empty status code
    2021-04-07T23:38:23.590Z| mks| I125: SOCKET 483 (135) Sending websocket close frame, status code = 1000
    2021-04-07T23:38:23.590Z| mks| I125: SOCKET 483 (135) VNC Remote Disconnect: socket closed.


  • 6.  RE: my virtual machine on esxi down after 1h

    Posted Apr 07, 2021 08:29 PM

    vmkernel.log

     

    2021-04-07T23:11:38.027Z cpu24:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:12:47.629Z cpu27:20495944)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-07T23:12:53.631Z cpu8:20495946)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:16:38.025Z cpu27:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:17:49.612Z cpu29:20496091)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-07T23:17:55.616Z cpu12:20496093)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:20:38.027Z cpu25:2097985)DVFilter: 5963: Checking disconnected filters for timeouts
    2021-04-07T23:21:38.027Z cpu33:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:22:51.803Z cpu9:20496244)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-07T23:22:57.809Z cpu22:20496246)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:26:38.025Z cpu46:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:27:53.612Z cpu24:20496406)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3d90)does not support unmap
    2021-04-07T23:27:59.618Z cpu10:20496408)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:29:21.169Z cpu3:16017957 opID=58515efd)World: 11943: VC opID esxui-183a-cbf1 maps to vmkernel opID 58515efd
    2021-04-07T23:29:21.169Z cpu3:16017957 opID=58515efd)NVDManagement: 1461: No nvdimms found on the system
    2021-04-07T23:30:38.027Z cpu25:2097985)DVFilter: 5963: Checking disconnected filters for timeouts
    2021-04-07T23:31:38.026Z cpu26:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:32:55.530Z cpu3:20496572)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-07T23:33:01.535Z cpu3:20496574)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:36:38.026Z cpu29:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:37:57.557Z cpu16:20496714)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3b70)does not support unmap
    2021-04-07T23:38:03.561Z cpu37:20496716)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:40:38.027Z cpu25:2097985)DVFilter: 5963: Checking disconnected filters for timeouts
    2021-04-07T23:41:38.025Z cpu37:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:42:59.831Z cpu20:20496871)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-07T23:43:05.836Z cpu17:20496873)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:46:38.025Z cpu35:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:48:01.809Z cpu7:20496998)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3bb0)does not support unmap
    2021-04-07T23:48:07.812Z cpu46:20497000)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:48:37.532Z cpu25:20494767)VSCSI: 6602: handle 8409(vscsi0:0):Destroying Device for world 20494752 (pendCom 0)
    2021-04-07T23:48:37.533Z cpu25:20494767)NetPort: 1580: disabled port 0x200001e
    2021-04-07T23:48:37.598Z cpu39:20494767)VSCSI: 3810: handle 8410(vscsi0:0):Creating Virtual Device for world 20494752 (FSS handle 751848245) numBlocks=209715200 (bs=512)
    2021-04-07T23:48:37.598Z cpu39:20494767)VSCSI: 273: handle 8410(vscsi0:0):Input values: res=0 limit=-2 bw=-1 Shares=1000
    2021-04-07T23:48:37.598Z cpu39:20494767)NetPort: 1359: enabled port 0x200001e with mac 00:0c:29:c1:a4:58
    2021-04-07T23:49:27.003Z cpu5:2097641)WARNING: smartpqi: pqisrc_taskMgmt:1288 :[92:0.0][1,0,2]:IN :(virt reset)
    2021-04-07T23:49:27.003Z cpu5:2097641)smartpqi: pqisrc_taskMgmt:1408: (virt reset)	DONE
    2021-04-07T23:49:27.003Z cpu5:2097641)WARNING: WARN: smartpqi: pqisrc_taskMgmt:1455: OUT 0
    2021-04-07T23:50:38.027Z cpu25:2097985)DVFilter: 5963: Checking disconnected filters for timeouts
    2021-04-07T23:51:38.026Z cpu25:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:53:03.671Z cpu23:20497151)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-07T23:53:09.671Z cpu14:20497153)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-07T23:56:38.025Z cpu46:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-07T23:58:05.638Z cpu23:20497318)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-07T23:58:11.641Z cpu38:20497320)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-08T00:00:38.027Z cpu25:2097985)DVFilter: 5963: Checking disconnected filters for timeouts
    2021-04-08T00:03:07.558Z cpu45:20497645)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-08T00:03:13.563Z cpu30:20497647)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-08T00:06:38.025Z cpu24:2097914)NMP: nmp_ResetDeviceLogThrottling:3580: Error status H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0 from dev "mpx.vmhba0:C0:T3:L0" occurred 1 times(of 1 commands)
    2021-04-08T00:08:09.357Z cpu22:20497792)Unmap6: 7146: [Unmap] 'datastore1':device(0x43063d4a3ca0)does not support unmap
    2021-04-08T00:08:15.361Z cpu18:20497794)J6: 2651: 'datastore1': Exiting async journal replay manager world
    2021-04-08T00:10:38.027Z cpu38:2097985)DVFilter: 5963: Checking disconnected filters for timeouts