VMware Tanzu Kubernetes Grid Integrated Edition


TKC pods fail to mount persistentvolume: Watch on virtualmachine "..." timed out


    Posted Apr 07, 2025 12:28 PM

    We have a TKC cluster in vSphere Tanzu and we are trying to run pods that mount volumes provisioned by the default vsphere-csi-driver.

    After some infrastructure instability, the TKC nodes were rebooted, and some pods now fail to remount their volumes.

    Running kubectl describe pod on one of the affected pods gives us the following error messages.

    Events:
      Type     Reason              Age                From                     Message
      ----     ------              ----               ----                     -------
      Normal   Scheduled           31m                default-scheduler        Successfully assigned logging/graylog-es-data-0 to tkc-3rd01-md-0-2rccf-x6d26-v2xfq
      Warning  FailedAttachVolume  31m                attachdetach-controller  Multi-Attach error for volume "pvc-b5092e1c-4dc8-4ed1-aab1-660551f2944e" Volume is already exclusively attached to one node and can't be attached to another
      Warning  FailedAttachVolume  23m (x9 over 25m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-b5092e1c-4dc8-4ed1-aab1-660551f2944e" : rpc error: code = Internal desc = observed Error: "failed to attach cns volume" is set on the volume "43bf6b28-a5f3-4593-9f8c-e80d5c2643da-b5092e1c-4dc8-4ed1-aab1-660551f2944e" on virtualmachine "tkc-3rd01-md-0-2rccf-x6d26-v2xfq"
      Warning  FailedAttachVolume  15m                attachdetach-controller  AttachVolume.Attach failed for volume "pvc-b5092e1c-4dc8-4ed1-aab1-660551f2944e" : volume attachment is being deleted
      Warning  FailedAttachVolume  11m (x2 over 13m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-b5092e1c-4dc8-4ed1-aab1-660551f2944e" : rpc error: code = Internal desc = observed Error: "failed to detach cns volume" is set on the volume "43bf6b28-a5f3-4593-9f8c-e80d5c2643da-b5092e1c-4dc8-4ed1-aab1-660551f2944e" on virtualmachine "tkc-3rd01-md-0-2rccf-x6d26-v2xfq"
      Warning  FailedAttachVolume  52s (x8 over 21m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-b5092e1c-4dc8-4ed1-aab1-660551f2944e" : rpc error: code = Internal desc = Watch on virtualmachine "tkc-3rd01-md-0-2rccf-x6d26-v2xfq" timed out
    

    I guess the error comes from the VolumeAttachment object itself. Its status shows:

    Status:
      Attach Error:
        Message:  rpc error: code = Internal desc = Watch on virtualmachine "tkc-3rd01-md-0-2rccf-x6d26-v2xfq" timed out
        Time:     2025-04-04T23:28:54Z
      Attached:   false
    Events:       <none>
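
    To check whether other VolumeAttachment objects for the same PV are stuck as well, the following is a minimal Python sketch, not a definitive tool. It assumes the output of `kubectl get volumeattachment -o json` has been captured as a string, and it relies only on the standard VolumeAttachment fields (`spec.source.persistentVolumeName`, `spec.nodeName`, `status.attached`, `status.attachError`, `status.detachError`). The function name `failed_attachments` is hypothetical.

```python
import json


def failed_attachments(volumeattachment_list_json, pv_name):
    """Return (name, node, message) for each VolumeAttachment of pv_name
    that is not attached. The input is the JSON text of a VolumeAttachment list."""
    items = json.loads(volumeattachment_list_json).get("items", [])
    results = []
    for item in items:
        spec = item.get("spec", {})
        status = item.get("status", {})
        # Skip attachments that belong to a different PersistentVolume.
        if spec.get("source", {}).get("persistentVolumeName") != pv_name:
            continue
        # Skip attachments that report success.
        if status.get("attached"):
            continue
        # Prefer the attach error; fall back to the detach error if present.
        error = status.get("attachError") or status.get("detachError") or {}
        results.append(
            (item["metadata"]["name"], spec.get("nodeName"), error.get("message", ""))
        )
    return results
```

    Called with the captured JSON and the PV name "pvc-b5092e1c-4dc8-4ed1-aab1-660551f2944e", it would list each unattached VolumeAttachment for that volume with its node and last error, which may show whether a stale attachment to the previous node still exists.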
    

    At this point, we are not sure how to troubleshoot these errors. Any advice is welcome.

    Best regards.