Hello all,
I work in the Windows kernel and have been kernel debugging VMware Workstation VMs for 20 years. This year, things have gone sour: my VMs crash numerous times a day with NMI_HARDWARE_FAILURE, often during boot but at other times as well. This only happens when I'm kernel debugging.
Host: Windows 11 24H2, VBS Enabled
Target VM: Windows 11, Windows Server 2025
Debugging transport: TCP/IP
WinDbg version: I've tried various versions, including the latest WDK version and the latest Windows Store version
Anyone have any thoughts on what this might be?
Here's more info:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
NMI_HARDWARE_FAILURE (80)
This is typically due to a hardware malfunction. The hardware supplier should
be called.
Arguments:
Arg1: 00000000004f4454, 'TDO'
Arg2: 0000000000000010, Status Byte
Arg3: 0000000000000000
Arg4: 0000000000000000
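
For anyone reading along, Arg1 is an ASCII tag stored little-endian, and Arg2 is reported as a raw status byte. The sketch below decodes both. It assumes (unconfirmed) that the status byte is the value read from System Control Port B (I/O port 0x61), where bits 7 and 6 conventionally flag a parity/SERR# or I/O channel check NMI. The helper names and the bit labels are illustrative, not taken from the debugger output.

```python
# Hypothetical helpers for reading the bugcheck 0x80 arguments shown above.

# Conventional bit meanings for System Control Port B (I/O port 0x61).
# ASSUMPTION: Arg2 ("Status Byte") is a read of this port.
PORT_61_BITS = {
    0: "timer 2 gate",
    1: "speaker data enable",
    2: "parity check enable/clear",
    3: "channel check enable/clear",
    4: "refresh toggle",
    5: "timer 2 output",
    6: "channel check (IOCHK#) NMI",
    7: "parity check (SERR#) NMI",
}


def decode_tag(value: int) -> str:
    """Interpret a bugcheck argument as a little-endian ASCII tag."""
    raw = value.to_bytes(8, "little").rstrip(b"\x00")
    return raw.decode("ascii")


def decode_status(status: int) -> list[str]:
    """Return the names of the bits that are set in the status byte."""
    return [name for bit, name in PORT_61_BITS.items() if status & (1 << bit)]


print(decode_tag(0x4F4454))   # TDO
print(decode_status(0x10))    # ['refresh toggle']
```

Under that assumption, 0x10 has neither bit 6 nor bit 7 set, so neither of the two conventional NMI sources is flagged.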
Debugging Details:
------------------
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 5718
Key : Analysis.Elapsed.mSec
Value: 20603
Key : Analysis.IO.Other.Mb
Value: 6
Key : Analysis.IO.Read.Mb
Value: 9
Key : Analysis.IO.Write.Mb
Value: 31
Key : Analysis.Init.CPU.mSec
Value: 1296
Key : Analysis.Init.Elapsed.mSec
Value: 23162
Key : Analysis.Memory.CommitPeak.Mb
Value: 97
Key : Analysis.Version.DbgEng
Value: 10.0.27829.1001
Key : Analysis.Version.Description
Value: 10.2503.24.01 amd64fre
Key : Analysis.Version.Ext
Value: 1.2503.24.1
Key : Bugcheck.Code.KiBugCheckData
Value: 0x80
Key : Bugcheck.Code.LegacyAPI
Value: 0x80
Key : Bugcheck.Code.TargetModel
Value: 0x80
Key : Failure.Bucket
Value: 0x80_4F4454_GenuineIntel_NOERRREC_VRF_IMAGE_GenuineIntel.sys
Key : Failure.Hash
Value: {eb133242-18d9-0441-17b8-cde57cbe4235}
Key : Hypervisor.Enlightenments.Value
Value: 12576
Key : Hypervisor.Enlightenments.ValueHex
Value: 0x3120
Key : Hypervisor.Flags.AnyHypervisorPresent
Value: 1
Key : Hypervisor.Flags.ApicEnlightened
Value: 0
Key : Hypervisor.Flags.ApicVirtualizationAvailable
Value: 0
Key : Hypervisor.Flags.AsyncMemoryHint
Value: 0
Key : Hypervisor.Flags.CoreSchedulerRequested
Value: 0
Key : Hypervisor.Flags.CpuManager
Value: 0
Key : Hypervisor.Flags.DeprecateAutoEoi
Value: 1
Key : Hypervisor.Flags.DynamicCpuDisabled
Value: 0
Key : Hypervisor.Flags.Epf
Value: 0
Key : Hypervisor.Flags.ExtendedProcessorMasks
Value: 0
Key : Hypervisor.Flags.HardwareMbecAvailable
Value: 0
Key : Hypervisor.Flags.MaxBankNumber
Value: 0
Key : Hypervisor.Flags.MemoryZeroingControl
Value: 0
Key : Hypervisor.Flags.NoExtendedRangeFlush
Value: 1
Key : Hypervisor.Flags.NoNonArchCoreSharing
Value: 0
Key : Hypervisor.Flags.Phase0InitDone
Value: 1
Key : Hypervisor.Flags.PowerSchedulerQos
Value: 0
Key : Hypervisor.Flags.RootScheduler
Value: 0
Key : Hypervisor.Flags.SynicAvailable
Value: 1
Key : Hypervisor.Flags.UseQpcBias
Value: 0
Key : Hypervisor.Flags.Value
Value: 536632
Key : Hypervisor.Flags.ValueHex
Value: 0x83038
Key : Hypervisor.Flags.VpAssistPage
Value: 1
Key : Hypervisor.Flags.VsmAvailable
Value: 0
Key : Hypervisor.RootFlags.AccessStats
Value: 0
Key : Hypervisor.RootFlags.CrashdumpEnlightened
Value: 0
Key : Hypervisor.RootFlags.CreateVirtualProcessor
Value: 0
Key : Hypervisor.RootFlags.DisableHyperthreading
Value: 0
Key : Hypervisor.RootFlags.HostTimelineSync
Value: 0
Key : Hypervisor.RootFlags.HypervisorDebuggingEnabled
Value: 0
Key : Hypervisor.RootFlags.IsHyperV
Value: 0
Key : Hypervisor.RootFlags.LivedumpEnlightened
Value: 0
Key : Hypervisor.RootFlags.MapDeviceInterrupt
Value: 0
Key : Hypervisor.RootFlags.MceEnlightened
Value: 0
Key : Hypervisor.RootFlags.Nested
Value: 0
Key : Hypervisor.RootFlags.StartLogicalProcessor
Value: 0
Key : Hypervisor.RootFlags.Value
Value: 0
Key : Hypervisor.RootFlags.ValueHex
Value: 0x0
Key : SecureKernel.HalpHvciEnabled
Value: 0
Key : WER.OS.Branch
Value: ge_release
Key : WER.OS.Version
Value: 10.0.26100.1
BUGCHECK_CODE: 80
BUGCHECK_P1: 4f4454
BUGCHECK_P2: 10
BUGCHECK_P3: 0
BUGCHECK_P4: 0
FAULTING_THREAD: ffff9d8c83865280
PROCESS_NAME: System
STACK_TEXT:
ffffd681`ce96c2f8 fffff805`b1265b02 : 00000000`00000000 00000000`00000080 00000000`00000023 fffff805`b136f2a0 : nt!DbgBreakPointWithStatus
ffffd681`ce96c300 fffff805`b126502c : 00000000`00000003 ffffd681`ce96c460 00000000`00000000 00000000`00000000 : nt!KiBugCheckDebugBreak+0x12
ffffd681`ce96c360 fffff805`b11b06d7 : 00000000`00000000 00000000`00000000 ffff9d8c`8b3d9010 ffff9d8c`8aff4460 : nt!KeBugCheck2+0xb2c
ffffd681`ce96caf0 fffff805`b11f8132 : 00000000`00000080 00000000`004f4454 00000000`00000010 00000000`00000000 : nt!KeBugCheckEx+0x107
ffffd681`ce96cb30 fffff805`b11f2729 : ffff9d8c`8aff4000 00000000`00001000 00000000`00001000 fffff805`b113d9ed : nt!HalpNMIHalt+0x2e
ffffd681`ce96cb70 fffff805`4233124b : 00000000`00000000 ffffd681`ce96cc39 ffff9d8c`8aff4488 fffff805`b1abfeb0 : nt!HalBugCheckSystem+0x69
ffffd681`ce96cbb0 fffff805`b138c65a : 00000000`00000000 ffffd681`ce96cc39 ffff9d8c`8aff4488 00000000`00000000 : PSHED!PshedBugCheckSystem+0xb
ffffd681`ce96cbe0 fffff805`b11f7f0b : fffff805`b1c7e610 fffff805`b1c7e610 fffff805`b1abfeb0 00000000`0000005c : nt!WheaReportHwError+0x32c62a
ffffd681`ce96cca0 fffff805`b1269f8f : fffff805`b1c7e680 ffffd681`ce96cd10 00000000`00000003 ffffd681`ce96cd10 : nt!HalHandleNMI+0x14b
ffffd681`ce96ccd0 fffff805`b13616c2 : 00000000`00000000 ffffd681`ce96ced0 00000000`00000000 ffffd681`ce8e4180 : nt!KiProcessNMI+0xff
ffffd681`ce96cd10 fffff805`b136142e : 0004eb1f`0004eab0 00000000`00000000 ffffd681`ce96ced0 00000000`00000000 : nt!KxNmiInterrupt+0x82
ffffd681`ce96ce50 fffff805`b135053f : fffff805`b11a3d2b 00000000`04eee7a0 ffffbb8f`0527dae0 fffff09d`e5c439a1 : nt!KiNmiInterrupt+0x26e
ffffbb8f`0527d9a8 fffff805`b11a3d2b : 00000000`04eee7a0 ffffbb8f`0527dae0 fffff09d`e5c439a1 ffffd681`ce8ec9c0 : nt!HalProcessorIdle+0xf
ffffbb8f`0527d9b0 fffff805`b11a4301 : 00000000`ffffffff ffffbb8f`00000000 ffff9d8c`8b5ec010 ffffd681`ce8e4180 : nt!PpmIdleDefaultExecute+0x2b
ffffbb8f`0527d9e0 fffff805`b1114ac0 : ffffd681`ce8e4180 ffffd681`ce8e4180 ffffbb8f`0527dbd9 ffffd681`ce8ec9c0 : nt!PpmIdleExecuteTransition+0x5a9
ffffbb8f`0527db70 fffff805`b1356674 : ffffd681`ce8e4180 ffffd681`ce8e4100 00000000`00000000 00000000`00000000 : nt!PoIdle+0x1c0
ffffbb8f`0527dc40 00000000`00000000 : ffffbb8f`0527e000 ffffbb8f`05278000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x54
MODULE_NAME: GenuineIntel
IMAGE_NAME: GenuineIntel.sys
STACK_COMMAND: .process /r /p 0xfffff805b1c7ef80; .thread 0xffff9d8c83865280 ; kb
FAILURE_BUCKET_ID: 0x80_4F4454_GenuineIntel_NOERRREC_VRF_IMAGE_GenuineIntel.sys
OS_VERSION: 10.0.26100.1
BUILDLAB_STR: ge_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {eb133242-18d9-0441-17b8-cde57cbe4235}
Followup: MachineOwner
---------
3: kd> !whea
Error Source Table @ fffff805b1b9e7f8
5 Error Sources
Error Source 0 @ ffff9d8c89ee4b60
Notify Type : Unknown
Type : 0x10 (Invalid)
Error Count : 0
Record Count : 1
Record Length : 2e08
Error Records : wrapper @ ffff9d8c89be6000 record @ ffff9d8c89be6028
Descriptor : @ ffff9d8c89ee4bc0
Length : 3cc
Max Raw Data Length : d2c
Num Records To Preallocate : 1
Max Sections Per Record : 3
Error Source ID : 1
Flags : 00000000
Error Source 1 @ ffff9d8c8b3c2a60
Notify Type : MCE (INT18)
Type : 0x0 (MCE)
Error Count : 0
Record Count : 4
Record Length : 3528
Error Records : wrapper @ ffff9d8c8b7c3000 record @ ffff9d8c8b7c3028
: wrapper @ ffff9d8c8b7c6528 record @ ffff9d8c8b7c6550
: wrapper @ ffff9d8c8b7c9a50 record @ ffff9d8c8b7c9a78
: wrapper @ ffff9d8c8b7ccf78 record @ ffff9d8c8b7ccfa0
Descriptor : @ ffff9d8c8b3c2ac0
Length : 3cc
Max Raw Data Length : 141
Num Records To Preallocate : 4
Max Sections Per Record : a
Error Source ID : 2
Flags : 80000000
Error Source 2 @ ffff9d8c83794a10
WHEA_NOTIFICATION_DESCRIPTOR @ 0xffff9d8c83794aa0
Notify Type : Polled
Type : 0x1 (CMC)
Error Count : 0
Record Count : 3
Record Length : 3528
Error Records : wrapper @ ffff9d8c8b6d7000 record @ ffff9d8c8b6d7028
: wrapper @ ffff9d8c8b6da528 record @ ffff9d8c8b6da550
: wrapper @ ffff9d8c8b6dda50 record @ ffff9d8c8b6dda78
Descriptor : @ ffff9d8c83794a70
Length : 3cc
Max Raw Data Length : 141
Num Records To Preallocate : 3
Max Sections Per Record : a
Error Source ID : 3
Flags : 80000000
Error Source 3 @ ffff9d8c8b3d9010
Notify Type : NMI (INT2)
Type : 0x3 (NMI)
Error Count : 1
Record Count : 1
Record Length : 6c0
Error Records : wrapper @ ffff9d8c8aff4460 record @ ffff9d8c8aff4488
Descriptor : @ ffff9d8c8b3d9070
Length : 3cc
Max Raw Data Length : 100
Num Records To Preallocate : 1
Max Sections Per Record : 3
Error Source ID : 4
Flags : 80000000
Error Source 4 @ ffff9d8c8b3d8010
Notify Type : Polled
Type : 0x7 (BOOT)
Error Count : 0
Record Count : 0
Record Length : 0
Error Records :
Descriptor : @ ffff9d8c8b3d8070
Length : 3cc
Max Raw Data Length : 1000
Num Records To Preallocate : 1
Max Sections Per Record : 8
Error Source ID : 5
Flags : 80000000
@ NMI Error
Error Source 3 @ ffff9d8c8b3d9010
Notify Type : NMI (INT2)
Type : 0x3 (NMI)
Error Count : 1
Record Count : 1
Record Length : 6c0
Error Records : wrapper @ ffff9d8c8aff4460 record @ ffff9d8c8aff4488
Descriptor : @ ffff9d8c8b3d9070
Length : 3cc
Max Raw Data Length : 100
Num Records To Preallocate : 1
Max Sections Per Record : 3
Error Source ID : 4
Flags : 80000000
@ Let's look at the NMI record
3: kd> dx (_WHEA_ERROR_RECORD*) 0xffff9d8c8aff4488
(_WHEA_ERROR_RECORD*) 0xffff9d8c8aff4488 : 0xffff9d8c8aff4488 [Type: _WHEA_ERROR_RECORD *]
[+0x000] Header [Type: _WHEA_ERROR_RECORD_HEADER]
[+0x080] SectionDescriptor [Type: _WHEA_ERROR_RECORD_SECTION_DESCRIPTOR [1]]
3: kd> dx -r1 (*((ntkrnlmp!_WHEA_ERROR_RECORD_HEADER *)0xffff9d8c8aff4488))
(*((ntkrnlmp!_WHEA_ERROR_RECORD_HEADER *)0xffff9d8c8aff4488)) [Type: _WHEA_ERROR_RECORD_HEADER]
[+0x000] Signature : 0x52455043 [Type: unsigned long]
[+0x004] Revision [Type: _WHEA_REVISION]
[+0x006] SignatureEnd : 0xffffffff [Type: unsigned long]
[+0x00a] SectionCount : 0x2 [Type: unsigned short]
[+0x00c] Severity : WheaErrSevFatal (1) [Type: _WHEA_ERROR_SEVERITY]
[+0x010] ValidBits [Type: _WHEA_ERROR_RECORD_HEADER_VALIDBITS]
[+0x014] Length : 0x1dc [Type: unsigned long]
[+0x018] Timestamp [Type: _WHEA_TIMESTAMP]
[+0x020] PlatformId : {00000000-0000-0000-0000-000000000000} [Type: _GUID]
[+0x030] PartitionId : {00000000-0000-0000-0000-000000000000} [Type: _GUID]
[+0x040] CreatorId : {CF07C4BD-B789-4E18-B3C4-1F732CB57131} [Type: _GUID]
[+0x050] NotifyType : {5BAD89FF-B7E6-42C9-814A-CF2485D6E98A} [Type: _GUID]
[+0x060] RecordId : 0x1dbf1d2c1e7d9de [Type: unsigned __int64]
[+0x068] Flags [Type: _WHEA_ERROR_RECORD_HEADER_FLAGS]
[+0x06c] PersistenceInfo [Type: _WHEA_PERSISTENCE_INFO]
[+0x074] OsBuildNumber : 0x0 [Type: unsigned long]
[+0x078] Reserved2 [Type: unsigned char [8]]
[+0x074] Reserved [Type: unsigned char [12]]
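
As a sanity check on the header above, here is a minimal sketch that decodes the Signature field and names the two non-zero header GUIDs. The GUID names are the ones I believe the UEFI CPER appendix and the WDK headers use; treat them as my reading rather than something the debugger reported. The function names are made up for this example.

```python
import struct
import uuid

# GUID -> name, as I understand them from the CPER spec and WDK headers.
KNOWN_HEADER_GUIDS = {
    uuid.UUID("cf07c4bd-b789-4e18-b3c4-1f732cb57131"): "WHEA_RECORD_CREATOR_GUID (Microsoft)",
    uuid.UUID("5bad89ff-b7e6-42c9-814a-cf2485d6e98a"): "NMI_NOTIFY_TYPE_GUID",
}


def decode_signature(value: int) -> str:
    """Render the 32-bit Signature field as the ASCII bytes stored in memory."""
    return struct.pack("<I", value).decode("ascii")


def guid_name(text: str) -> str:
    """Look up a GUID printed by dx (braces and upper case are accepted)."""
    return KNOWN_HEADER_GUIDS.get(uuid.UUID(text), "unknown")


print(decode_signature(0x52455043))                          # CPER
print(guid_name("{CF07C4BD-B789-4E18-B3C4-1F732CB57131}"))
print(guid_name("{5BAD89FF-B7E6-42C9-814A-CF2485D6E98A}"))
```

So the header looks like a well-formed CPER record whose notify type agrees with the NMI error source.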
@ There are two sections
3: kd> dx ((_WHEA_ERROR_RECORD*) 0xffff9d8c8aff4488)->SectionDescriptor[0]
((_WHEA_ERROR_RECORD*) 0xffff9d8c8aff4488)->SectionDescriptor[0] [Type: _WHEA_ERROR_RECORD_SECTION_DESCRIPTOR]
[+0x000] SectionOffset : 0x110 [Type: unsigned long]
[+0x004] SectionLength : 0xc0 [Type: unsigned long]
[+0x008] Revision [Type: _WHEA_REVISION]
[+0x00a] ValidBits [Type: _WHEA_ERROR_RECORD_SECTION_DESCRIPTOR_VALIDBITS]
[+0x00b] Reserved : 0x0 [Type: unsigned char]
[+0x00c] Flags [Type: _WHEA_ERROR_RECORD_SECTION_DESCRIPTOR_FLAGS]
[+0x010] SectionType : {9876CCAD-47B4-4BDB-B65E-16F193C4F3DB} [Type: _GUID]
[+0x020] FRUId : {00000000-0000-0000-0000-000000000000} [Type: _GUID]
[+0x030] SectionSeverity : WheaErrSevInformational (3) [Type: _WHEA_ERROR_SEVERITY]
[+0x034] FRUText : "" [Type: char [20]]
3: kd> dx ((_WHEA_ERROR_RECORD*) 0xffff9d8c8aff4488)->SectionDescriptor[1]
((_WHEA_ERROR_RECORD*) 0xffff9d8c8aff4488)->SectionDescriptor[1] [Type: _WHEA_ERROR_RECORD_SECTION_DESCRIPTOR]
[+0x000] SectionOffset : 0x1d0 [Type: unsigned long]
[+0x004] SectionLength : 0xc [Type: unsigned long]
[+0x008] Revision [Type: _WHEA_REVISION]
[+0x00a] ValidBits [Type: _WHEA_ERROR_RECORD_SECTION_DESCRIPTOR_VALIDBITS]
[+0x00b] Reserved : 0x0 [Type: unsigned char]
[+0x00c] Flags [Type: _WHEA_ERROR_RECORD_SECTION_DESCRIPTOR_FLAGS]
[+0x010] SectionType : {E71254E7-C1B9-4940-AB76-909703A4320F} [Type: _GUID]
[+0x020] FRUId : {00000000-0000-0000-0000-000000000000} [Type: _GUID]
[+0x030] SectionSeverity : WheaErrSevFatal (1) [Type: _WHEA_ERROR_SEVERITY]
[+0x034] FRUText : "" [Type: char [20]]
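
The offsets in the two descriptors can be cross-checked against the header. The sketch below uses only numbers visible in the dx output (SectionDescriptor at +0x80, FRUText at +0x34 with 20 bytes, header Length 0x1dc) and checks that the sections are packed back to back. The section type names in the table are my understanding of the CPER GUIDs, with the second one taken from the DecodeWheaRecord project; the function name is hypothetical.

```python
# Values read from the dx output above.
HEADER_SIZE = 0x80               # SectionDescriptor array starts at +0x080
DESCRIPTOR_SIZE = 0x34 + 20      # FRUText is the last field: +0x34, char[20]
RECORD_LENGTH = 0x1DC            # Header.Length
SECTIONS = [(0x110, 0xC0), (0x1D0, 0x0C)]   # (SectionOffset, SectionLength)

# Section type names as I understand them (not printed by the debugger).
SECTION_TYPES = {
    "9876ccad-47b4-4bdb-b65e-16f193c4f3db": "processor generic error section",
    "e71254e7-c1b9-4940-ab76-909703a4320f": "WHEA_NMI_ERROR_SECTION",
}


def layout_is_contiguous(header_size, descriptor_size, sections, record_length):
    """Check that sections follow the descriptors with no gaps or overlap."""
    expected = header_size + descriptor_size * len(sections)
    for offset, length in sections:
        if offset != expected:
            return False
        expected = offset + length
    return expected == record_length


print(hex(HEADER_SIZE + DESCRIPTOR_SIZE * len(SECTIONS)))   # 0x110
print(layout_is_contiguous(HEADER_SIZE, DESCRIPTOR_SIZE, SECTIONS, RECORD_LENGTH))  # True
```

The record is internally consistent, so the 12-byte second section really is all the NMI-specific data there is.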
@ What is the associated data? Unfortunately, nothing useful...
@ https://github.com/ralish/DecodeWheaRecord
@ e71254e7-c1b9-4940-ab76-909703a4320f = WHEA_NMI_ERROR_SECTION
3: kd> dx (_WHEA_NMI_ERROR_SECTION*) (0xffff9d8c8aff4488+0x1d0)
(_WHEA_NMI_ERROR_SECTION*) (0xffff9d8c8aff4488+0x1d0) : 0xffff9d8c8aff4658 [Type: _WHEA_NMI_ERROR_SECTION *]
[+0x000] Data [Type: unsigned char [8]]
[+0x008] Flags [Type: _WHEA_NMI_ERROR_SECTION_FLAGS]
3: kd> dx -r1 (*((PSHED!unsigned char (*)[8])0xffff9d8c8aff4658))
(*((PSHED!unsigned char (*)[8])0xffff9d8c8aff4658)) [Type: unsigned char [8]]
[0] : 0x10 [Type: unsigned char]
[1] : 0x0 [Type: unsigned char]
[2] : 0x0 [Type: unsigned char]
[3] : 0x0 [Type: unsigned char]
[4] : 0x0 [Type: unsigned char]
[5] : 0x0 [Type: unsigned char]
[6] : 0x0 [Type: unsigned char]
[7] : 0x0 [Type: unsigned char]
3: kd> dx -r1 (*((PSHED!_WHEA_NMI_ERROR_SECTION_FLAGS *)0xffff9d8c8aff4660))
(*((PSHED!_WHEA_NMI_ERROR_SECTION_FLAGS *)0xffff9d8c8aff4660)) [Type: _WHEA_NMI_ERROR_SECTION_FLAGS]
[+0x000 ( 0: 0)] HypervisorError : 0x0 [Type: unsigned long]
[+0x000 (31: 1)] Reserved : 0x0 [Type: unsigned long]
[+0x000] AsULONG : 0x0 [Type: unsigned long]
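
To make the section contents explicit, here is a small sketch that redoes the address arithmetic and unpacks the 12 bytes shown above as 8 data bytes followed by a 32-bit flags word. The layout is copied from the dx output; the function name and the dictionary keys are invented for this example.

```python
import struct

RECORD_BASE = 0xFFFF9D8C8AFF4488   # record address from !whea
SECTION_OFFSET = 0x1D0             # SectionDescriptor[1].SectionOffset


def parse_nmi_section(raw: bytes) -> dict:
    """Unpack a 12-byte NMI section: unsigned char Data[8], then ULONG Flags."""
    data, flags = struct.unpack("<8sI", raw)
    return {
        "data": data,
        "hypervisor_error": bool(flags & 1),   # bit 0
        "reserved": flags >> 1,                # bits 31:1
    }


section_va = RECORD_BASE + SECTION_OFFSET
# Bytes as dumped above: Data = 10 00 00 00 00 00 00 00, Flags = 0.
raw = bytes([0x10, 0, 0, 0, 0, 0, 0, 0]) + (0).to_bytes(4, "little")
section = parse_nmi_section(raw)

print(hex(section_va))                # 0xffff9d8c8aff4658
print(hex(section_va + 8))            # 0xffff9d8c8aff4660 (Flags)
print(hex(section["data"][0]))        # 0x10
print(section["hypervisor_error"])    # False
```

The only non-zero byte is Data[0] = 0x10, which is the same value as bugcheck Arg2, and HypervisorError is clear.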
-------------------------------------------