VMware vSphere

  • 1.  Problem with NVidia GRID K2 Card

    Posted Dec 01, 2015 03:14 PM

    Here is my setup:
    Dell PowerEdge 630
    (1) GRID K2
    Running VMware ESXi 5.5
    Server is hosting (2) Windows 2008 VMs for XenApp

    I set up the card as passthrough in ESXi. I assigned one PCI GPU to Server A and the other PCI GPU to Server B.

    Server A is having no issues.
    Server B sees video freezes when users run programs that start using the GPU. I also see the same issue if I RDP into the VM and run GPU-Z.

    In the Event Viewer I get error entries for Display with the following:

    Display driver nvlddmkm stopped responding and has successfully recovered.

    At first I thought it was a bad GRID card. Dell sent me a new one and the problem went away. Three weeks later the same problem returned, only on Server B. Now I'm thinking it's not a hardware issue but something else. I built a brand new VM, shut down Server B, and assigned its GPU to the new VM. The problem followed.

    Has anybody else seen this? Or better yet, does anybody know how to resolve it?



  • 2.  RE: Problem with NVidia GRID K2 Card

    Posted Dec 03, 2015 02:59 PM

    Hello there,

    Driver freezes happen most often with bad drivers for your GPU. Is the driver version consistent across these two VMs? Have you tried making a clone of the first, non-problematic VM (customizing it afterwards with, say, sysprep, of course) and running some stress tests on the second GPU?



  • 3.  RE: Problem with NVidia GRID K2 Card

    Posted Dec 03, 2015 03:10 PM

    I have tried multiple drivers and still have the same issues. The drivers were the same on both VMs, but only the second was having the issues. I have since tried multiple newer drivers with no improvement.

    I have not tried to clone the first VM. The only thing I have tried is building a brand new VM, and the problem followed to it.

    I did find something out the other night. I shut down both VMs, restarted the ESXi host, and then powered everything back on. Since doing this I have not received any video freezes or Display errors. So now it almost seems like it could be something with the ESXi host? But I'm not exactly sure what would be causing it on the host.



  • 4.  RE: Problem with NVidia GRID K2 Card

    Posted Dec 04, 2015 08:58 AM

    Hmm, is your ESXi host up to date with the latest patches? Also, if the issue occurs again, could you try posting esxtop output with the VM expanded ('e' key, navigate to the VM and press Enter)? Maybe there is a world in the group that has some sort of resource leak, which can cause lockups and misbehavior.

    Also, /vmfs/volumes/<vmname>/vmware.log, plus vmkernel.log and vmkwarning.log from /var/log, would be very helpful if they cover the time of the occurrence.
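
    If the logs are large, trimming them to a window around the freeze makes them easier to post. Below is a rough sketch of one way to do that, not an official VMware tool. It assumes each log entry starts with an ISO 8601 UTC timestamp such as 2015-12-01T15:14:02.123Z (check this against your own files); the function names and the five-minute default are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(line):
    """Return the leading UTC timestamp of a log line, or None if absent.

    Assumes the line starts with a token like 2015-12-01T15:14:02.123Z.
    """
    token = line.split(" ", 1)[0]
    try:
        parsed = datetime.strptime(token, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def lines_near(lines, incident, minutes=5):
    """Keep the lines logged within +/- `minutes` of `incident`.

    Lines without a timestamp (continuations) follow the decision
    made for the most recent timestamped line.
    """
    window = timedelta(minutes=minutes)
    keep = False
    selected = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            keep = abs(ts - incident) <= window
        if keep:
            selected.append(line)
    return selected
```

    To use it, copy vmkernel.log off the host, open it in Python, pass the file object to lines_near() together with the time of the freeze as a timezone-aware UTC datetime, and write the returned lines to a new file.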