VMware vSphere


Issues Enabling Workload Management with vSphere 7

daphnissov, Sep 08, 2020 05:48 PM

elihuj, Sep 08, 2020 06:03 PM

  • 1.  Issues Enabling Workload Management with vSphere 7

    Posted Sep 08, 2020 04:27 PM

    I am attempting to set up Workload Management in a greenfield vSphere 7 environment with NSX-T, and it keeps hanging at "Error configurating cluster NIC on master VM. This operation is part of API server configuration and will be retried". I see the following in the wcpsvc.log file:

    2020-09-08T16:16:54.416Z error wcp [opID=5f57bd08-domain-c8] Failed to create cluster network interface for MasterNode: VirtualMachine:vm-88. Err: Unauthorized

    2020-09-08T16:16:54.416Z error wcp [opID=5f57bd08-domain-c8] Error configuring cluster NIC on master VM vm-88: Unauthorized

    2020-09-08T16:16:54.416Z error wcp [opID=5f57bd08-domain-c8] Error configuring API server on cluster domain-c8 Error configuring cluster NIC on master VM. This operation is part of API server configuration and will be retried.
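
    For anyone trying to isolate the same entries, a small filter along these lines may help. It is only a sketch, assuming a bash shell on the vCenter appliance and the line format shown in the excerpt above. The function name wcp_errors is hypothetical, and the log path in the usage comment is an assumption to verify on your own appliance.

```shell
# Print only the error-level wcp lines for one cluster from a wcpsvc log file.
# Usage: wcp_errors <logfile> <cluster-id>
# The function name is hypothetical; the expected line format is
#   <timestamp> error wcp [opID=<id>-<cluster-id>] <message>
wcp_errors() {
    local logfile="$1" cluster="$2"
    # First keep error-level wcp lines, then keep those whose opID ends in the cluster ID.
    grep -F ' error wcp ' "$logfile" | grep -F -- "-${cluster}]"
}

# Example (the log path is an assumption):
# wcp_errors /var/log/vmware/wcp/wcpsvc.log domain-c8
```

    Matching on the closing bracket keeps domain-c8 from also matching domain-c80 and similar IDs.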

    My vCenter and NSX deployments are on the same Layer 2 segment. NSX-T is currently functioning, with connectivity validated from a logical segment out to the Internet. I have also validated that the MTU is 1600 throughout the environment.



  • 2.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Sep 08, 2020 05:48 PM

    Are your hosts also running ESXi 7?



  • 3.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Sep 08, 2020 06:03 PM

    Yes, ESXi 7 build 16324942.



  • 4.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Sep 13, 2020 04:30 AM

    Hi elihuj,

    Make sure the edge nodes are deployed at medium size or larger (I suggest large if you have the resources available), as the load balancer that gets deployed is medium-sized.



  • 5.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Sep 16, 2020 08:42 PM

    Hello VirtualizingStuff, thank you for the reply. I did deploy a Large Edge, but unfortunately that was not the fix. I tried it again, and for whatever reason it succeeded all the way through.



  • 6.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Nov 25, 2020 07:34 AM

    Hello,

    I have the same issue. NSX-T 3.1, VMware ESXi 7.0.1 build 17168206, vCenter build 17004997.

    In the NSX-T Manager alarms there is one open issue while Workload Management hangs. I'm using three NSX Manager appliances.

    Manager Node has detected the NCP is down or unhealthy.

    Entity name: domain-c11:a83fdad6-c5e1-472e-a47b-d670fb2dd1c3

    I noticed that this entity does not exist. I'm very new to NSX-T, so I don't know whether this error is relevant.

    The transport node and Edge node tunnels are fine, if I'm right.

    [Attachments: nsxt-01.PNG, nsxt-02.PNG]

    Please advise where I should look for the root cause. Thank you.



  • 7.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Dec 18, 2020 06:29 PM

    This error seems common as I see lots of people having the same issue. I wonder if anyone at VMware knows how to troubleshoot it?

     



  • 8.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Dec 31, 2020 10:53 PM

    The two most common causes are:

    1. Trust is not enabled in the Compute Manager for this vCenter in NSX.

    2. Time between vCenter and NSX is not in sync.
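
    For the second point, one rough way to quantify the drift is to capture the Unix epoch time on each appliance at about the same moment and compare the two values. The helper below is a minimal sketch under that assumption. The name clock_skew is hypothetical, and how you collect the two timestamps depends on the shell access you have on each system.

```shell
# Absolute difference, in seconds, between two Unix epoch timestamps.
# Usage: clock_skew <epoch-a> <epoch-b>
clock_skew() {
    local a="$1" b="$2" diff
    diff=$(( a - b ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    echo "$diff"
}

# Example, with values gathered by running `date -u +%s` on each system
# (the numbers here are placeholders):
# clock_skew 1609455180 1609455215
```

    Any delay between taking the two readings is included in the result, so treat small values as noise.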

     



  • 9.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jan 01, 2021 04:08 PM

    Can you please get the NCP log:

    kubectl -n vmware-system-nsx logs <ncp-pod-name> -p
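
    If the pod name is not known, it can be looked up first. The sketch below assumes the NCP pod's name contains "ncp", which is a guess about naming rather than something confirmed in this thread. The helper name ncp_pods is hypothetical.

```shell
# Read `kubectl get pods -o name` output on stdin (lines like "pod/<name>")
# and print the bare names of pods whose name contains "ncp".
ncp_pods() {
    sed 's|^pod/||' | grep -i 'ncp'
}

# Example: fetch the previous container's log for each matching pod.
# for p in $(kubectl -n vmware-system-nsx get pods -o name | ncp_pods); do
#     kubectl -n vmware-system-nsx logs "$p" -p
# done
```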

    When you enabled WCP, did you enter "corp.local" as the master DNS?



  • 10.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jan 04, 2021 09:32 AM

    Usually this kind of error occurs when the master and worker DNS are configured to be the same.
    The master DNS should be reachable from the management network, and the worker DNS should be reachable from the workload network.
    If both DNS servers are the same, that server needs to be reachable from both networks (management and workload).
    To cross-check network reachability:
    - Connect to the Kubernetes API master VM
    - Run the commands below:
    1) ping -I eth0 <masterDNS>
    2) ping -I eth1 <workerDNS>
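
    The two checks above can also be wrapped in a small helper that stops after a few packets and prints a one-line verdict. This is a sketch, assuming a Linux ping that supports -c and -I. The function name check_reachable and the PING_BIN override are hypothetical additions for illustration.

```shell
# Ping a target through a specific interface and report the result.
# Usage: check_reachable <interface> <target>
# PING_BIN may be set to substitute another command (for example in a dry run);
# it defaults to the system ping.
check_reachable() {
    local iface="$1" target="$2"
    if "${PING_BIN:-ping}" -c 3 -I "$iface" "$target" >/dev/null 2>&1; then
        echo "$iface -> $target: reachable"
    else
        echo "$iface -> $target: UNREACHABLE"
        return 1
    fi
}

# Example (replace the placeholders with your DNS server addresses):
# check_reachable eth0 <masterDNS>
# check_reachable eth1 <workerDNS>
```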



  • 11.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jan 04, 2021 07:35 PM

    OK, this may be an issue. I am not well versed in the networking going on here. I am not sure how to assign IP addresses to the Ingress and Egress CIDRs; I assume that is what you mean by "worker". I understand these need to be routable, but I can't figure out what VLAN they are on. I also don't have the capability to do BGP, and I am not sure how to enter a route to these addresses. I can't even figure out what the interfaces to the T0 and T1 routers are. I understand networking, just not NSX-T.

     



  • 12.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jan 27, 2021 03:37 AM

     

    Hi,

    Do you know of any other way to log in to the supervisor VM?

    I had the same issue, "Error configurating cluster NIC on master VM", so the "workload management" -> "namespaces" web page hangs at "workload management is still being configured. Please check back later".

    I believe this hang is preventing me from downloading and installing the k8s CLI tool to connect to the control plane VMs.

    By the way, do the DNS records need to be created for the master and worker before the Workload Management cluster is deployed?

    Thanks



  • 13.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jan 27, 2021 06:03 AM

    To log in to the Supervisor master VM:

    - SSH into the vCenter and enable the shell (if required)

    - Run "/usr/lib/vmware-wcp/decryptK8Pwd.py" to get the IP address and password for the Supervisor Cluster (SC) master VM.

    E.g.:

    # /usr/lib/vmware-wcp/decryptK8Pwd.py
    Read key from file
    Connected to PSQL
    Cluster: domain-c8:2bcXXXX
    IP: 10.xx.xx.xx
    PWD: xxxxxxxxxxx

    # ssh root@10.xx.xx.xx

    Type "yes" and provide the PWD above.

     

    After connecting to the supervisor master VM, run the earlier "ping" commands to check master/worker DNS connectivity, then check node status with "kubectl get nodes" and system pod status with "kubectl get pods -A" for troubleshooting.
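
    On a busy cluster the pod listing is long, so it can help to show only the pods that are not healthy. The filter below is a sketch that relies on the default column layout of "kubectl get pods -A" (NAMESPACE, NAME, READY, STATUS, ...). The function name unhealthy_pods is hypothetical.

```shell
# Read `kubectl get pods -A` output on stdin and print the pods whose STATUS
# column is neither Running nor Completed, as "<namespace>/<name> <status>".
unhealthy_pods() {
    # NR > 1 skips the header row; $4 is the STATUS column.
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Example:
# kubectl get pods -A | unhealthy_pods
```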

    >> Do the DNS records need to be created for the master & worker before the deployment of workload management cluster?

    It depends entirely on your network, but for the master, use the management DNS directly.



  • 14.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jan 28, 2021 05:35 AM

    That's useful!

    I discovered that a pod, "tmc-agent-installer-1611810900-8n776", is in Error status and another pod, "vsphere-csi-controller-6687dc774f-xnbfq", is in CrashLoopBackOff status on the master.

    I hadn't created DNS records for the master/worker yet, so the ping was unsuccessful.

    All three masters are in "Ready" status (per "kubectl get nodes"), so I can only assume that the hang I mentioned earlier was due to some other, unknown reason...

    Thanks!



  • 15.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jan 27, 2021 09:48 PM

    For those who are interested, I had to get BGP working on the ToR switch to get Workload Management to install. Maybe you can get by without it, but it didn't work for me. Just Sayin'

     



  • 16.  RE: Issues Enabling Workload Management with vSphere 7



  • 17.  RE: Issues Enabling Workload Management with vSphere 7

    Posted Jun 30, 2022 10:04 AM

    In my case, the problem was that NSX-T was not able to connect to the compute manager.

    This was the fix in my case (not an IP connectivity problem): https://www.ibm.com/support/pages/after-vcenter-upgrade-connection-status-compute-manager-shows-down-nsx-t-manager