VMware Cloud Foundation

[SOLVED] vCenter PBM InvalidLogin / SPS Login Fail caused by vsan.health certificate mismatch

    Posted 7 days ago

    Environment

    vCenter Server (VMCA certificates, embedded PSC)


    Issue

    We experienced persistent authentication-related issues in vCenter:

    • VM creation failed with:

      A general system error occurred: PBM error occurred during PreCreateCheckCallback
      pbm.fault.InvalidLogin
      
    • Logs showed:

      AcquireToken exception: InvalidCredentials
      vim.fault.InvalidLogin
      
    • Storage Policy Service (SPS) errors:

      Login to PBM failed
      
    • Additionally, after some time:

      https://<vcenter>/lookupservice/sdk became inaccessible
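A minimal sketch of how the error signatures listed above could be counted in collected log text. The signature strings are copied from this post; the function name, the label names, and the sample lines are hypothetical and only illustrate the idea. Which log files contain these messages on a given appliance is not covered here.

```python
import re
from collections import Counter

# Error signatures quoted in the post above. The dictionary keys are
# hypothetical labels chosen for this sketch.
SIGNATURES = {
    "pbm_invalid_login": re.compile(r"pbm\.fault\.InvalidLogin"),
    "vim_invalid_login": re.compile(r"vim\.fault\.InvalidLogin"),
    "acquire_token": re.compile(r"AcquireToken exception: InvalidCredentials"),
    "sps_login": re.compile(r"Login to PBM failed"),
}


def count_signatures(lines):
    """Return a Counter of how many lines match each known signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines for illustration only, not real vCenter log output.
    sample = [
        "example-1 AcquireToken exception: InvalidCredentials",
        "example-2 vim.fault.InvalidLogin",
        "example-3 unrelated message",
        "example-4 Login to PBM failed",
    ]
    for name, hits in sorted(count_signatures(sample).items()):
        print(name, hits)
```

Running this periodically against saved log excerpts would show whether the failures return a few hours after a service restart, which is the pattern described under Behavior.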
      

    Behavior

    • Restarting services (vmware-vpxd, vmware-sps) temporarily resolved the issue

    • However, the problem reappeared after a few hours


    Troubleshooting Steps

    • Initially identified NTP synchronization issues

      • Time drift caused token validation problems (STS / SSO related)

      • Fixed NTP configuration and ensured proper synchronization with domain controllers

    • Verified certificates (no expiration issues)

    • Ran LSDoctor → no issues detected

    • Restarted services → only temporary improvement
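To make the time-drift check concrete, here is a small sketch of the standard NTP clock offset and round-trip delay calculation from the four timestamps of one request/response exchange. The function names and the `max_skew_seconds` parameter are hypothetical; the acceptable skew for token validation is not stated in this post, so it is left as an input rather than a fixed value.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Compute clock offset and round-trip delay from one NTP exchange.

    t0: client send time (client clock)
    t1: server receive time (server clock)
    t2: server transmit time (server clock)
    t3: client receive time (client clock)
    All values are in seconds. A positive offset means the client clock
    is behind the server clock.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def drift_exceeds(offset_seconds, max_skew_seconds):
    """Return True when the absolute offset is larger than the allowed skew.

    max_skew_seconds is an assumption supplied by the caller.
    """
    return abs(offset_seconds) > max_skew_seconds


if __name__ == "__main__":
    # Illustrative timestamps only: the client is 5 seconds behind the server.
    offset, delay = ntp_offset_and_delay(100.0, 105.25, 105.5, 100.75)
    print("offset:", offset, "delay:", delay)
    print("exceeds 1s skew:", drift_exceeds(offset, 1.0))
```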


    Root Cause

    After fixing NTP, the issue persisted.

    Using vCert, we identified:

    com.vmware.vsan.health (Machine SSL) → MISMATCH
    

    This indicated that the vsan-health extension was registered with a certificate thumbprint that did not match the current Machine SSL certificate.

    Even though vSAN was not actively used, the extension is still registered in vCenter and is part of the internal trust chain.
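The kind of comparison behind a MISMATCH result can be sketched as follows: hash the DER form of a certificate and compare it with the thumbprint stored in a registration. This is not how vCert itself is implemented. The function names are hypothetical, and SHA-1 with colon-separated hex is an assumption about the stored format, so the hash algorithm is left as a parameter.

```python
import hashlib
import ssl


def normalize_thumbprint(value):
    """Strip colons, spaces and dashes and uppercase the hex digits."""
    return "".join(ch for ch in value if ch not in ": -").upper()


def pem_thumbprint(pem_text, algorithm="sha1"):
    """Return the uppercase hex digest of a PEM certificate's DER bytes.

    The default algorithm is an assumption; pass "sha256" if the
    registration stores SHA-256 thumbprints.
    """
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    return hashlib.new(algorithm, der).hexdigest().upper()


def thumbprint_matches(pem_text, registered, algorithm="sha1"):
    """Compare a certificate against a registered thumbprint string."""
    return pem_thumbprint(pem_text, algorithm) == normalize_thumbprint(registered)
```

Here `pem_text` would be the current Machine SSL certificate in PEM form and `registered` the thumbprint recorded for the extension. How those two values are retrieved from the appliance is outside this sketch.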


    Resolution

    We resolved the issue by:

    1. Fixing NTP synchronization (critical first step)

    2. Running vCert:

      Option 6 – Reset all certificates with VMCA-signed certificates
      
    3. Regenerating Machine SSL and Solution User certificates

    4. Updating extension thumbprints

    5. Restarting all VMware services

    6. Leaving the STS signing certificate unchanged (replacing it was not required)


    Result

    After the fix:

    com.vmware.vsan.health → MATCHES
    
    • PBM errors disappeared

    • SPS login failures stopped

    • Lookup service remained stable

    • No recurrence after several hours


    Key Takeaway

    This issue was caused by a combination of NTP drift and an extension certificate thumbprint mismatch:

    • NTP issues broke token validation

    • Certificate mismatch caused persistent authentication failures


    Recommendation

    If you see:

    • PBM InvalidLogin errors

    • SPS authentication failures

    • Issues that return a few hours after a service restart

    Check both:

    1. NTP synchronization

    2. Extension certificate thumbprints (especially vsan.health)



