Fusion

[BUG] Architectural Performance Bottleneck in DirectX-to-Metal Layer with High-Frequency API Calls

    Posted 25 days ago

    Bug Report: Architectural Performance Bottleneck in DirectX-to-Metal Layer with High-Frequency API Calls

    Report Date: July 12, 2025

    Summary: A deterministically reproducible workload in the DirectX 11 game Anonymous;Code causes a severe frame rate drop on VMware Fusion for Apple Silicon. Diagnostics indicate that the cause is not a GPU execution stall or a memory limit, but CPU-side driver overhead in the DirectX-to-Metal translation layer. The problem appears when the application's draw call count per frame rises sharply, which suggests the layer struggles with high-frequency command submission. Comparative testing shows the same degradation in competing virtualization software, so this report may serve as a useful test case for the maturity of the translation layer.

    1. Environment Configuration

    • Host Machine: MacBook Pro (M4 Pro)

    • Host OS: macOS 15.5 (24F74)

    • Virtualization Software: VMware Fusion Professional Version 13.6.3 (24585314)

    • Guest OS: Windows 11 Pro for ARM, version 22H2

    2. Virtual Machine Configuration

    • Processors: 1 Core

    • RAM: 8 GB

    • Graphics Memory: 2 GB

    • 3D Acceleration: Enabled

    • Game: Anonymous;Code (Steam Version, x86 32-bit executable)

    Note: Performance was best with a single core. Increasing the core count to 2 or 4 decreased performance, which suggests cross-core synchronization overhead when the driver is under heavy load.

    3. Steps to Reproduce

    The issue is 100% reproducible with the attached game save file.

    1. Configure a VM with the settings listed above.

    2. Install Steam and Anonymous;Code.

    3. Place the attached save file (acode_bug_report_save.dat) into the game's save directory (C:\Users\[Your Username]\Documents\MAGES\ANONYMOUS_CODE_STEAM).

    4. Launch the game and load the provided save file. The save is at the beginning of the problematic scene.

    Expected Behavior: The scene should render at or near 60 FPS, consistent with other scenes in the game.

    Actual Behavior: The frame rate immediately drops to 13-22 FPS and stays there. Guest OS CPU usage spikes only during this scene.
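
    Step 3 above can be scripted inside the guest. The following is a minimal Python sketch, not part of the original report: the function name install_save and the backup behavior are illustrative choices, and it assumes the Documents folder sits directly under the user profile, which may not hold if the folder is redirected (for example by OneDrive).

```python
import shutil
import tempfile
from pathlib import Path

# Relative save location taken from step 3 of the report.
SAVE_SUBDIR = Path("Documents") / "MAGES" / "ANONYMOUS_CODE_STEAM"


def install_save(save_file: Path, home: Path) -> Path:
    """Copy save_file into the game's save directory under home.

    Assumption: Documents lives directly under the user profile.
    Returns the path of the installed copy.
    """
    target_dir = home / SAVE_SUBDIR
    # Create the directory in case the game has not been launched yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / save_file.name
    if target.exists():
        # Keep a backup rather than silently overwriting an existing save.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(save_file, target)
    return target


if __name__ == "__main__":
    # Dry run against a temporary directory so no real save is touched.
    with tempfile.TemporaryDirectory() as tmp:
        fake_home = Path(tmp) / "home"
        src = Path(tmp) / "acode_bug_report_save.dat"
        src.write_bytes(b"placeholder")
        print(install_save(src, fake_home))
```

    For a real run, call install_save(Path("acode_bug_report_save.dat"), Path.home()) from the folder containing the downloaded save.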

    4. Diagnostic Analysis and Key Findings

    This issue has been debugged extensively with RenderDoc. The findings point to CPU-side driver overhead rather than a GPU execution problem.

    • Trigger: The performance drop occurs when the game engine renders a scene with two specific high-fidelity animated characters at the same time. To handle their complex layering, the engine switches to a multi-pass compositing technique.

    • The Bottleneck - "Death by a Thousand Cuts": This technique raises the number of API calls per frame from a baseline of ~120 in a normal scene to over 400 in the problematic scene. Each individual draw call executes quickly on the GPU, but the volume of commands appears to overwhelm the CPU side of the DirectX-to-Metal translation layer. The GPU is effectively left waiting for the CPU to feed it commands.

    • Control Group Confirmation: This is not an issue with the rendering technique in principle. The game Robotics;Notes Elite, from the same developer, uses a similar 3D character rendering method. A RenderDoc capture of a complex scene in that game shows ~169 draw calls, and the scene runs at a steady 60 FPS with minimal CPU usage. This indicates that the VMware translation layer can handle the method itself, but not the much higher call volume that Anonymous;Code generates.

    • Cross-Platform Verification: The same performance degradation occurs when this scene is tested in Parallels Desktop. This indicates that the issue is not a VMware-specific bug, but a broader architectural challenge for current DirectX-to-Metal translation under this pathological, high-frequency workload.
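
    The figures above can be turned into a rough per-call time budget. The sketch below is an illustration added for clarity, not a measurement: it assumes the entire frame time is spent on API call submission, so the results are upper bounds on the average cost per call. The call counts (~120, ~169, over 400) and frame rates (60 and 13-22 FPS) come from the report, and the helper names are hypothetical.

```python
def frame_time_ms(fps: float) -> float:
    """Time available for one frame at the given frame rate, in milliseconds."""
    return 1000.0 / fps


def per_call_budget_us(fps: float, calls_per_frame: int) -> float:
    """Average time per API call, in microseconds, if the whole frame
    were spent submitting calls (an upper bound, see assumption above)."""
    return frame_time_ms(fps) * 1000.0 / calls_per_frame


# (label, frames per second, API calls per frame) taken from the report.
SCENARIOS = [
    ("normal scene at 60 FPS", 60, 120),
    ("control game at 60 FPS", 60, 169),
    ("slow scene, needed for 60 FPS", 60, 400),
    ("slow scene, observed at 22 FPS", 22, 400),
    ("slow scene, observed at 13 FPS", 13, 400),
]

if __name__ == "__main__":
    for label, fps, calls in SCENARIOS:
        budget = per_call_budget_us(fps, calls)
        print(f"{label:32s} {frame_time_ms(fps):6.2f} ms/frame "
              f"{budget:7.1f} us/call")
```

    Under that assumption, holding 60 FPS at 400 calls leaves roughly 42 µs per call on average, while the observed 13-22 FPS corresponds to at most roughly 114-192 µs per call.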

    5. Attachments Available

    I have a complete set of diagnostic files ready, including:

    • A game save file to reproduce the issue instantly.

    • RenderDoc captures (.rdc) of both a normal frame and the slow frame.

    • A RenderDoc capture from a control game (Robotics;Notes Elite) for comparison.

    • A full VMware Support Bundle (.zip) with detailed system logs.

    Due to the sensitive nature of the support bundle, I will not post these files publicly. I can provide all attachments directly to a verified VMware employee via private message or email upon request.