vCenter

  • 1.  Infamous black screen when opening console

    Posted Sep 19, 2014 02:00 PM

    Hi.

    I ran into a problem today I can't figure out, and searching some of the fora, it seems others have had this problem with no obvious fix. So here's another attempt at flushing this out.

    I am unable to open a console window properly for some VMs. I get a "black screen" that appears to be unresponsive to keyboard and mouse. However, investigation shows:

    1) keystrokes and mouse are making it to the VM. (This was verified by someone else who is able to open the console normally and saw the keystrokes and mouse actions I was generating).

    2) I'm able to open consoles fine on one vCenter, but not another.

    3) I'm having the same problem whether using the Windows vSphere Client or the Web Client, so it's not just a browser issue.

    4) I'm seeing the problem for all VMs in the one setup, whether running Windows or Linux.

    5) It's just me (my particular setup), as two colleagues are able to open consoles to the same machines just fine.

    6) I'm doing this from Windows 7 running under Linux KVM (RHEL), using Firefox 31.1.0.

    7) If I run Chrome directly under Linux (on the same laptop) I can open a console successfully, but since a Chrome console doesn't pass control characters, that solution isn't really a usable option for me.

    8) Running vCenter/ESXi 5.5

    Poking around:

    > VSphere 5.5 VM Console screen black | VMware Communities

    > https://communities.vmware.com/message/2363368

    The solution there involved IPv6, but we are not using IPv6 and none of the machines in question have IPv6 addresses.

    Other related reports point to the DNS entries needing to resolve properly, which is not the case here (plus, you typically get errors in that case).

    Any suggestions?

    Thanks!

    Thomas



  • 2.  RE: Infamous black screen when opening console

    Posted Sep 22, 2014 09:03 AM

    Any firewall, port filtering, NAT or something similar between vClient and vCenter/ESXi hosts?



  • 3.  RE: Infamous black screen when opening console

    Posted Sep 22, 2014 05:45 PM

    > Any firewall, port filtering, NAT or something similar between  vClient and vCenter/ESXi hosts?

    None that I think are relevant.

    There is a NAT running between the Windows-VM and the rest of the world. No known FW getting in the way.

    Testing some more, just now I was able to get a proper working console using Chrome (under the Windows VM). At the same time, from the same VM, neither the console under FF nor the Windows client worked. I just see the black screen.

    Also, I did some additional testing that showed that even though the screen is black, the TCP connection to the ESXi host where the VM/Console resides is being opened correctly and data is flowing in both directions (per tcpdump) as I type or move the mouse. So I doubt it's a FW/NAT issue. I believe that running a console uses a single TCP connection to the host where the VM resides.



  • 4.  RE: Infamous black screen when opening console

    Posted Sep 23, 2014 11:20 AM

    It seems to be a one-way console.

    I suggest you test the vSphere Client (not the Web Client) connected directly to the ESXi host and check whether the console works fine.



  • 5.  RE: Infamous black screen when opening console

    Posted Sep 25, 2014 07:41 PM

    I've narrowed down the problem I am having. The TCP connection for the console window is getting hosed in a weird way. I'm able to reliably reproduce the problem when using VPN software X, but the problem does not happen when using VPN software Y.

    Wireshark has shown the following:

    1) Machine A (vSphere Client) opens a TCP connection to machine B (ESXi host), and they successfully exchange some data in both directions. Then B sends a big chunk of data (more than one segment, i.e., the initial console screen) to A, part of which doesn't arrive (Wireshark reports "previous segment not captured"). A then responds with a Selective ACK, which B apparently doesn't deal with properly. (Or the SACK gets mangled before B sees it. Or something -- I don't know what is really happening here, but from this point on the connection is wedged.)

    2) Once A has returned a SACK, B continues to retransmit data, but it retransmits from the SACK point, not the ACK point. So A continues to respond with an ACK of X (since it is missing a chunk), whereas B retransmits at a sequence point greater than X, and the missing data never gets retransmitted.

    I would bet that the problem has to do with the VPN software, rather than with ESXi's ability to handle SACK. But that is just a guess.

    I googled for stuff like "ESXi SACK problems" but nothing came up. Are there any known issues with ESXi and SACK?



  • 6.  RE: Infamous black screen when opening console

    Posted Sep 26, 2014 04:08 AM

    Heh, I liked your zeal to narrow down the problem. Based on the initial post, I thought the problem might be with the way networking is being handled inside KVM on RHEL.

    Like you said, it might be a problem with the VPN software. Did you raise a support ticket for this?

    Regards

    Girish



  • 7.  RE: Infamous black screen when opening console

    Posted Sep 26, 2014 07:05 AM

    MTU problem? Try to force a lower MTU just to be sure.



  • 8.  RE: Infamous black screen when opening console

    Posted Sep 26, 2014 07:55 PM

    More narrowing down of the problem. I used pktcap-uw on the ESXi host to get a packet trace, and did the same on my laptop.

    Looking at the traces, I see the following "interesting" behavior (courtesy of tcpdump):

    > 06:18:14.490938 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [.], ack 1361, win 129, length 0

    > 06:18:14.490963 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [P.], seq 3141:3178, ack 1361, win 129, length 37

    > 06:18:14.534533 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [P.], seq 3178:3231, ack 1361, win 129, length 53

    > 06:18:14.534556 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [P.], seq 3231:4772, ack 1361, win 129, length 1541

    Here, ESXi sends out a segment with more than 1500 bytes of payload (length 1541), even though the MTU on vmknic0 is 1500.
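
    As a quick sanity check on that observation, here is a minimal sketch of the arithmetic. It assumes IPv4 with no IP options and a 20-byte TCP header (the data segments in the trace show no TCP options); the function names are illustrative only.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options on data segments


def max_tcp_payload(mtu: int) -> int:
    """Largest TCP payload that fits in a single IP packet of size `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER


def exceeds_mtu(payload_len: int, mtu: int) -> bool:
    """True if a TCP payload of this length cannot fit in one packet."""
    return payload_len > max_tcp_payload(mtu)


print(max_tcp_payload(1500))    # 1460
print(exceeds_mtu(1541, 1500))  # True
```

    Under these assumptions, a 1541-byte payload cannot be a single on-the-wire packet at MTU 1500, so the capture point is probably seeing the data before it is split.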

    On my laptop, I do receive:

    > 11:23:06.957635 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [P.], seq 4553:4772, ack 1361, win 129, length 219

    I.e., the packet that was sent was apparently resegmented into two TCP segments (i.e., two separate IP packets) but only the second one arrived. Presumably TCP offload is doing resegmentation on outbound.
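
    To illustrate the resegmentation, here is a small sketch that splits a sequence range into chunks of at most `mss` bytes. The value 1322 is an assumption, inferred from the length of the retransmitted segment later in the trace, not something the trace states directly.

```python
def split_segment(start: int, end: int, mss: int) -> list[tuple[int, int]]:
    """Split the half-open byte range [start, end) into chunks of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [(s, min(s + mss, end)) for s in range(start, end, mss)]


# 1322 is a hypothetical segment size inferred from the trace.
print(split_segment(3231, 4772, 1322))  # [(3231, 4553), (4553, 4772)]
```

    With that assumed size, the split lands on 4553, which matches the 219-byte segment (4553:4772) that the laptop did receive.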

    > 06:18:14.586327 IP laptop.ipfltbcst > esxi.ideafarm-door: Flags [P.], seq 1361:1398, ack 3178, win 16483, length 37

    > 06:18:14.618077 IP laptop.ipfltbcst > esxi.ideafarm-door: Flags [.], ack 3231, win 16470, options [nop,nop,sack 1 {4553:4772}], length 0

    Laptop returns SACK saying it got sequence 4553:4772, and is missing 3231-4552 (1322 bytes)
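
    The ACK/SACK pair above can be reproduced with a minimal receiver model. This is only a sketch of how a cumulative ACK and SACK blocks follow from the byte ranges received, not how any particular TCP stack implements it. Ranges are half-open, as in tcpdump's `seq a:b` notation, and the function names are made up for illustration.

```python
def ack_and_sack(received, initial_seq):
    """Return (cumulative_ack, sack_blocks) for half-open (start, end) ranges."""
    ack = initial_seq
    sack = []
    for start, end in sorted(received):
        if start <= ack:
            # Contiguous with what we already have: advance the ACK point.
            ack = max(ack, end)
        elif sack and start <= sack[-1][1]:
            # Extends the previous out-of-order block.
            sack[-1] = (sack[-1][0], max(sack[-1][1], end))
        else:
            # New out-of-order block beyond a hole.
            sack.append((start, end))
    return ack, sack


def first_hole(ack, sack):
    """Return (start, end, size) of the gap before the first SACK block, or None."""
    if not sack:
        return None
    return ack, sack[0][0], sack[0][0] - ack


# Ranges taken from the trace: two small segments, then only the tail arrives.
ack, sack = ack_and_sack([(3141, 3178), (3178, 3231), (4553, 4772)], 3141)
print(ack, sack)              # 3231 [(4553, 4772)]
print(first_hole(ack, sack))  # (3231, 4553, 1322)
```

    The hole is 1322 bytes, the same length as the segment ESXi later retransmits (seq 3231:4553).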

    > 06:18:14.620168 IP laptop.ipfltbcst > esxi.ideafarm-door: Flags [P.], seq 1398:1435, ack 3231, win 16470, length 37

    > 06:18:14.620184 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [.], ack 1435, win 129, length 0

    > 06:18:15.141979 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [.], seq 3231:4553, ack 1435, win 129, length 1322

    > 06:18:15.991972 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [.], seq 3231:4553, ack 1435, win 129, length 1322

    ESXi is retransmitting the missing bytes, but the packet is not getting delivered to laptop. The above two packets do not show up in the trace on the laptop.

    > 06:18:16.447390 IP laptop.ipfltbcst > esxi.ideafarm-door: Flags [P.], seq 1435:1488, ack 3231, win 16470, length 53

    > 06:18:16.448857 IP laptop.ipfltbcst > esxi.ideafarm-door: Flags [P.], seq 1488:1541, ack 3231, win 16470, length 53

    Laptop has more data to send, does so (but still ACKing 3231)...

    > 06:18:16.448884 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [.], ack 1541, win 128, length 0

    Esxi ACKs the received data...

    The remainder of the trace shows ESXi continuing to retransmit the missing segment, but I never receive it at my laptop. Somewhere between the ESXi outbound interface and my laptop the packet is lost.

    At this point, I'm not sure I have the ability to debug/trace further. There could be an issue with the ESXi TCP offload, or there could be some random issue along the path between the ESXi host and my laptop.
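
    The comparison of the two traces can be made mechanical. The sketch below parses tcpdump text output and reports byte ranges that the sender emitted but the receiver never captured. It compares by sequence number rather than timestamp, since the two clocks clearly differ (06:18 vs 11:23), and it assumes both captures show the same relative sequence numbers. The helper names are hypothetical; the sample lines are copied from the traces above.

```python
import re

# Matches data-bearing tcpdump lines such as "... IP a > b: Flags [P.], seq 1:2, ..."
SEG = re.compile(r"IP (\S+) > (\S+): Flags \[[^\]]*\], seq (\d+):(\d+)")


def data_segments(trace, src):
    """Collect half-open (start, end) seq ranges sent by hosts starting with `src`."""
    segs = []
    for line in trace.splitlines():
        m = SEG.search(line)
        if m and m.group(1).startswith(src):
            segs.append((int(m.group(3)), int(m.group(4))))
    return segs


def merge(ranges):
    """Merge overlapping or adjacent ranges into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def undelivered(sender_trace, receiver_trace, src):
    """Byte ranges present in the sender's capture but absent from the receiver's."""
    seen = merge(data_segments(receiver_trace, src))
    missing = []
    for start, end in merge(data_segments(sender_trace, src)):
        cursor = start
        for s, e in seen:
            if e <= cursor or s >= end:
                continue  # no overlap with the part still unaccounted for
            if s > cursor:
                missing.append((cursor, s))
            cursor = max(cursor, e)
        if cursor < end:
            missing.append((cursor, end))
    return missing


ESXI_TRACE = """\
06:18:14.534556 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [P.], seq 3231:4772, ack 1361, win 129, length 1541
06:18:15.141979 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [.], seq 3231:4553, ack 1435, win 129, length 1322
06:18:15.991972 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [.], seq 3231:4553, ack 1435, win 129, length 1322
"""

LAPTOP_TRACE = """\
11:23:06.957635 IP esxi.ideafarm-door > laptop.ipfltbcst: Flags [P.], seq 4553:4772, ack 1361, win 129, length 219
"""

print(undelivered(ESXI_TRACE, LAPTOP_TRACE, "esxi"))  # [(3231, 4553)]
```

    Running the same comparison on captures taken at intermediate hops (where that is possible) would narrow down where the range disappears.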

    A couple of other things I tried:

    1) Drop the MTU on the ESXi host to something like 1200. When I did that, the problem went away. :)

    2) From the ESXi host, I ran ping to my laptop with various packet sizes around the size of the packet that was getting lost in the above trace. No packets were lost, and no fragmentation was observed.
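
    For the ping test, the sizes have to be chosen so that the resulting IP packet matches the lost TCP packet, since ping sizes are normally given as ICMP payload bytes. The sketch below shows one way to work that out, assuming IPv4 without options, a 20-byte TCP header and an 8-byte ICMP echo header. It does not account for any VPN encapsulation overhead, and the exact sizes used in the test above are not known.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_tcp_segment(tcp_payload: int) -> int:
    """ICMP payload size that yields the same IP packet size as the TCP segment."""
    ip_packet = tcp_payload + TCP_HEADER + IPV4_HEADER
    return ip_packet - IPV4_HEADER - ICMP_HEADER


def sweep(tcp_payload: int, margin: int = 2) -> list[int]:
    """Payload sizes to try, centred on the size equivalent to the lost segment."""
    centre = icmp_payload_for_tcp_segment(tcp_payload)
    return list(range(centre - margin, centre + margin + 1))


print(icmp_payload_for_tcp_segment(1322))  # 1334
print(sweep(1322))                         # [1332, 1333, 1334, 1335, 1336]
```

    Even with matching sizes, a clean ping result only shows that ICMP packets of that size get through; something on the path may still treat the TCP stream differently.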

    So, I'm now stuck at trying to understand where the missing packet is getting lost, and why.

    BTW, ESXi setup is:

    ~ # vmware -v

    VMware ESXi 5.5.0 build-1623387

    Host: IBM System x3650M3

    Network Adaptors: Broadcom NetXtreme II BCM5709