ESXi


Passing through Tesla k80 Issue...

brayxu


  • 1.  Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 04:36 AM

    After adding an Nvidia Tesla K80M as a PCI pass-through device to a guest OS, the guest OS failed to start.

    Here is the related message in the vmware.log.

    vmx| I120: PCIPassthru: total number of pages needed (4206592) exceeds limit (917504), failing

    I added pciHole.start = "2048" to the vmx file, but it had no effect.

    Here are the vmx file and the log file. Thanks!
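
    For readers trying to interpret the numbers in that log line, here is a minimal sketch that converts the two page counts into GiB. It assumes the log counts 4 KiB pages, which the message itself does not state; the helper name pages_to_gib is purely illustrative.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; the log line does not say so


def pages_to_gib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a page count into GiB (2**30 bytes)."""
    return pages * page_size / 2**30


needed = pages_to_gib(4206592)  # "total number of pages needed"
limit = pages_to_gib(917504)    # "limit"
print(f"needed: {needed:.2f} GiB, limit: {limit:.2f} GiB")
# prints: needed: 16.05 GiB, limit: 3.50 GiB
```

    Under that assumption the device needs roughly 16 GiB of mapped address space while the VM only allows 3.5 GiB, which is consistent with a limit that sits below the 4 GiB boundary.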



  • 2.  RE: Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 05:55 AM

    Looks like a memory issue. According to the documentation, both the processor and the platform need to support I/O DMA remapping: http://us.download.nvidia.com/Windows/Quadro_Certified/350.12/350.12-win8-win7-winvista-quadro-grid-release-notes.pdf Can you please check that?

    One more thing to try is to set a memory reservation for the VM and then try again.



  • 3.  RE: Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 07:43 AM

    There are a few things to try, but let's start with the simple ones:

    Upgrade the BIOS on the host to the latest one.

    Upgrade the firmware on the GPU to the latest one.

    In the BIOS, look for a setting similar to “Enable >4G Decode”, “Enable 64-bit MMIO”, or “Above 4G Decoding”.

         It should be set to “Disabled”

    Install the latest ESXi patches.

    Create a new VM using hardware version 10 and reserve all memory for the VM. (There is no need to edit the vmx file anymore.)

    Install Windows 7.

    Pass the GPU through to the VM.


    If there are still problems, post the vmware.log again.



  • 4.  RE: Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 02:17 PM

    Hi, the machine is a Dell R730, and its BIOS is the latest version.

    In the BIOS, I set "Memory Mapped I/O above 4GB" to disabled; after that, the R730 could not boot successfully.

    So I cannot go on to the next step. Thanks!

    By the way, the R730 has two Tesla K80s installed; each Tesla K80 has two GPU chips and 24GB of graphics memory.

    Perhaps it cannot support two K80s, or 24GB of memory?



  • 5.  RE: Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 02:31 PM

    Attached is the R730's error screen.



  • 6.  RE: Passing through Tesla k80 Issue...

    Posted Nov 24, 2015 04:34 PM

    Hi brayxu,

    Did you manage to solve this problem, or is it still open?

    I have the same hardware (R730 + one Tesla K80) and I'm interested to hear if anyone has solved this problem by now (although I don't have high hopes after looking around on the internet for a very long time).

    Thanks,

    Geert.



  • 7.  RE: Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 01:56 PM

    Thanks for http://us.download.nvidia.com/Windows/Quadro_Certified/350.12/350.12-win8-win7-winvista-quadro-grid-release-notes.pdf

    1. In this document, the K80 is listed as supported for device passthrough with ESXi.

    2. In the "Known Issues" section: VMware • PCI I/O hole may need to be changed for Windows 64-bit VMs. Windows 64-bit VMs may require that you edit the VM configuration file to configure a larger PCI I/O hole for the GPU.

       I have set the PCI I/O hole to 2048, but that did not solve this problem.



  • 8.  RE: Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 08:00 AM

    Please check whether Nviaxxxxx is compatible with the current version of ESXi.



  • 9.  RE: Passing through Tesla k80 Issue...

    Posted Aug 24, 2015 01:56 PM

    Hi, what is Nviaxxxxx? Thanks!



  • 10.  RE: Passing through Tesla k80 Issue...

    Posted Feb 09, 2016 07:22 PM

    A previous version of this post included advice to add two VMX file entries (efi.legacyBoot.enabled and efi.bootOrder) as part of the solution. These two settings should NOT be used. Instead, follow the directions below.

    --------

    You should be able to pass a single GPU (that is, half of a K80) to a VM running on ESX 6 by creating an EFI-bootable VM, doing an EFI installation of your guest OS, and then adding the following to the VM's VMX file.

    pciPassthru.use64bitMMIO="TRUE"

    Trying to pass more than one of these GPUs into the same VM will currently hit a platform memory limit, and the VM will fail to boot. (NOTE: This limit has been removed in ESX 6.5.)

    A smaller card like the K2 does not have this issue: GPGPU Blog Entry

    If the above does not work for you, send me an email directly at "simons at vmware dot com". In either case, please share your experience with others on the thread.

    And if you have any other questions about running HPC applications in a VMware environment, I'd be happy to hear from you directly.

    If you are interested in learning more of what we've been doing related to HPC, you can check out our HPC entries on the VMware CTO blog site here: HPC Blog Entries

    Josh Simons

    High Performance Computing

    Office of the CTO

    VMware, Inc.



  • 11.  RE: Passing through Tesla k80 Issue...

    Posted Oct 14, 2016 01:35 PM

    Hello, I have the same issue. When applying these lines in the vmx of a Windows 10 Pro VM, the machine no longer starts.

    When using just pciPassthru.use64bitMMIO="TRUE", I can detect the new hardware in Windows 10, but the NVIDIA installation of "356.54-tesla-desktop-win10-64bit-international-whql.exe" never finishes.



  • 12.  RE: Passing through Tesla k80 Issue...

    Posted Jan 31, 2017 02:08 PM

    This works, thanks for the tip.



  • 13.  RE: Passing through Tesla k80 Issue...

    Posted Apr 04, 2017 10:40 PM

    Hi, I was wondering how you got it to work?

    I have a Tesla P100 GPU and I'm trying to pass it through to a VM on ESXi 6, which runs on a Dell PowerEdge R730.

    Adding the parameter doesn't seem to work for me. The GPU can be added to the vSphere passthrough list (Advanced Settings). After that, I set up a Windows 10 VM, installed it on EFI, and used your parameter, "pciPassthru.use64bitMMIO". Windows sees an unknown 3D Video Controller, both before and after installing VMware Tools. The Nvidia Tesla drivers don't install: the installer says this version of Windows is not supported and the graphics card can't be found, even though the driver came from Nvidia for Windows 10 and the GPU was added to the VM as a PCI device. I would truly appreciate any help, as I couldn't find much information online.



  • 14.  RE: Passing through Tesla k80 Issue...

    Posted Feb 02, 2022 04:30 PM

    I'm running into this same issue.  After adding a Tesla V100 GPU, I get:

    vmx| | I005: PCIPassthru: Device 0000:c8:00.0 barIndex 0 type 2 realaddr 0xe8000000 size 16777216 flags 0
    vmx| | I005: PCIPassthru: Device 0000:c8:00.0 barIndex 1 type 3 realaddr 0x38d800000000 size 17179869184 flags 12
    vmx| | I005: PCIPassthru: Device 0000:c8:00.0 barIndex 3 type 3 realaddr 0x38dc00000000 size 33554432 flags 12
    vmx| | I005: PCIPassthru: Device has PCI Express Cap Version 2(size 60)
    vmx| | I005: PCIPassthru: Registered a PCI device for 0000:c8:00.0 vIRQ 0x11, physical MSI = Enabled (vmmInt = Enabled), IntrPin = 1
    vmx| | I005: PCIPassthru: total number of pages needed (4206592) exceeds limit (917504), failing
    vmx| | I005: Module 'DevicePowerOn' power on failed.

    Any ideas?
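
    As a cross-check on the log above, here is a small sketch that sums the three BAR sizes and converts the total into pages. It assumes 4 KiB pages (not stated in the log); the regular expression and the helper name bar_sizes are illustrative only, written against the line format shown in this post.

```python
import re

# BAR lines copied from the log above (the "vmx| | I005:" prefix is dropped).
LOG = """\
PCIPassthru: Device 0000:c8:00.0 barIndex 0 type 2 realaddr 0xe8000000 size 16777216 flags 0
PCIPassthru: Device 0000:c8:00.0 barIndex 1 type 3 realaddr 0x38d800000000 size 17179869184 flags 12
PCIPassthru: Device 0000:c8:00.0 barIndex 3 type 3 realaddr 0x38dc00000000 size 33554432 flags 12
"""

PAGE_SIZE = 4096  # assumption: 4 KiB pages
BAR_RE = re.compile(r"barIndex (\d+) .* size (\d+) flags")


def bar_sizes(log: str) -> dict[int, int]:
    """Map each BAR index found in the log text to its size in bytes."""
    return {int(m.group(1)): int(m.group(2)) for m in BAR_RE.finditer(log)}


sizes = bar_sizes(LOG)
total_bytes = sum(sizes.values())
total_pages = total_bytes // PAGE_SIZE
print(sizes)
print(total_pages, "pages needed vs. limit of", 917504)
```

    With 4 KiB pages, the three BARs (16 MiB + 16 GiB + 32 MiB) add up to 4206592 pages, the same figure the failing line reports, so the 16 GiB BAR 1 is what pushes the device over the limit.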



  • 15.  RE: Passing through Tesla k80 Issue...

    Posted Aug 17, 2022 08:37 PM

    You need to add another parameter for this to work; refer to the article below.

    pciPassthru.64bitMMIOSizeGB = ????

    https://earlruby.org/2022/02/calculating-the-value-for-64bitmmiosizegb/
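
    For illustration only, here is a rough sketch of one rule of thumb for picking that value: add up the memory of every GPU passed to the VM and take the next power of two above the total. This rule is an assumption on my part and is not taken from this thread, so check the linked article before relying on it. The function name mmio_size_gb and the 16 GB example card are hypothetical.

```python
def mmio_size_gb(gpu_memory_gb: list[int]) -> int:
    """Next power of two strictly greater than the total GPU memory (in GB).

    Assumed rule of thumb, not confirmed by this thread.
    """
    total = sum(gpu_memory_gb)
    size = 1
    while size <= total:
        size *= 2
    return size


# Hypothetical example: a single GPU with 16 GB of memory.
size = mmio_size_gb([16])
print('pciPassthru.use64bitMMIO = "TRUE"')
print(f'pciPassthru.64bitMMIOSizeGB = "{size}"')
```

    The two printed lines use the vmx keys already mentioned in this thread; only the computed number depends on the assumed rule.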