VMware Tanzu Kubernetes Grid Integrated Edition


 PKS deployed service is refusing connection

Jim Song posted Mar 19, 2020 05:17 PM

I'm following the official guide to deploy a sample nginx on PKS, and the service appears to be up:

ubuntu@opsmgr26:~$ kubectl get svc
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.100.200.1     <none>        443/TCP        5d5h
nginx        NodePort    10.100.200.182   <none>        80:32005/TCP   8h

But when I try to access it with "curl http://192.168.180.13:32005", the connection is refused:

curl: (7) Failed to connect to 192.168.180.13 port 32005: Connection refused

 

I logged in to the worker node (192.168.180.13) with bosh ssh and confirmed that port 32005 is open and listening:

kube-prox 11833         root  12u IPv4 9746833   0t0 TCP *:32005 (LISTEN)

 

Any suggestions on how I could resolve the issue?

Broadcom Employee Daniel Lynch

Not sure which procedure you are following, so I will paste one that we can both work from.

 

deployment manifest

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      run: my-nginx
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80

create the service

kubectl expose deployment/my-nginx --type="NodePort"

 

test using any worker IP

 

~> kubectl get service
NAME       TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
my-nginx   NodePort   10.100.200.116   <none>        80:31178/TCP   29s

~> kubectl get nodes -o wide
NAME                                   STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP     OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
521d5ed4-aab4-47c8-8d29-df463ebd3e9f   Ready    <none>   12d   v1.15.5   10.193.71.167   10.193.71.167   Ubuntu 16.04.6 LTS   4.15.0-88-generic   docker://18.9.9
543e2574-3bc8-47bd-9b72-d60332180de4   Ready    <none>   12d   v1.15.5   10.193.71.169   10.193.71.169   Ubuntu 16.04.6 LTS   4.15.0-88-generic   docker://18.9.9
d3f1edfb-5eda-4644-acb1-69f3df0546d8   Ready    <none>   12d   v1.15.5   10.193.71.168   10.193.71.168   Ubuntu 16.04.6 LTS   4.15.0-88-generic   docker://18.9.9

~> curl http:10.193.71.169:31178
curl: (3) Port number ended with '.'
~> curl http://10.193.71.169:31178
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

 

 

On the worker, you should see all the firewall (iptables) rules in place for the service's node port.

FROM KUBECTL
~> kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE     IP             NODE                                   NOMINATED NODE   READINESS GATES
my-nginx-756fb87568-65hjq   1/1     Running   0          9m36s   10.200.10.8    543e2574-3bc8-47bd-9b72-d60332180de4   <none>           <none>
my-nginx-756fb87568-d5dhj   1/1     Running   0          9m36s   10.200.87.18   521d5ed4-aab4-47c8-8d29-df463ebd3e9f   <none>           <none>

FROM WORKER NODE
worker/8a77d684-4c16-48b5-afba-2e710fa00e20:~# iptables -t nat -L | egrep 31178
KUBE-MARK-MASQ  tcp  --  anywhere             anywhere             /* danl/my-nginx: */ tcp dpt:31178
KUBE-SVC-NDDILDPES4JGN5FS  tcp  --  anywhere             anywhere             /* danl/my-nginx: */ tcp dpt:31178

worker/8a77d684-4c16-48b5-afba-2e710fa00e20:~# iptables -t nat -L KUBE-SVC-NDDILDPES4JGN5FS
Chain KUBE-SVC-NDDILDPES4JGN5FS (2 references)
target     prot opt source               destination
KUBE-SEP-DPI3UUA3D6OYEI6O  all  --  anywhere             anywhere             statistic mode random probability 0.50000000000
KUBE-SEP-HXCXGU5NLVDSGUSD  all  --  anywhere             anywhere

worker/8a77d684-4c16-48b5-afba-2e710fa00e20:~# iptables -t nat -L KUBE-SEP-DPI3UUA3D6OYEI6O
Chain KUBE-SEP-DPI3UUA3D6OYEI6O (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  10.200.10.8          anywhere
DNAT       tcp  --  anywhere             anywhere             tcp to:10.200.10.8:80

 

 

 

Broadcom Employee Daniel Lynch

If all that is in place and you are still getting connection refused, then either IP `192.168.180.13` is not a worker VM, or there is a firewall between where you are running the curl command and the worker node that is blocking the port.
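
One further case worth ruling out, offered as a general Kubernetes note rather than something stated in this reply: if none of the service's pods is ready, the service has no endpoints, and with kube-proxy in iptables mode the node port can refuse connections even though the port shows as listening. A minimal sketch of that check follows. The helper name `service_has_endpoints` is hypothetical, and it assumes kubectl is already pointed at the cluster and namespace in question.

```shell
# Hypothetical helper: report whether a Service has any ready pod IPs behind it.
# Assumes kubectl is configured for the target cluster and namespace.
service_has_endpoints() {
  svc="$1"
  # Collect the IPs of all ready addresses listed in the Endpoints object.
  ips=$(kubectl get endpoints "$svc" -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -n "$ips" ]; then
    echo "endpoints for $svc: $ips"
  else
    echo "no ready endpoints for $svc"
    return 1
  fi
}
```

For example, running `service_has_endpoints nginx` would print the pod IPs behind the nginx service, or report that there are none.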

Jim Song

Thanks for the information. By checking the pod details:

kubectl describe pod nginx-844664bf47-cv864

I found the following error:

Events:
  Type     Reason   Age                       From                                           Message
  ----     ------   ----                      ----                                           -------
  Warning  Failed   36m (x1127 over 4d2h)     kubelet, 97e19e62-34e2-4aba-9fbe-3802041cd50d  Error: ErrImagePull
  Normal   Pulling  31m (x1128 over 4d2h)     kubelet, 97e19e62-34e2-4aba-9fbe-3802041cd50d  Pulling image "nginx:1.13-alpine"
  Normal   BackOff  16m (x25459 over 4d2h)    kubelet, 97e19e62-34e2-4aba-9fbe-3802041cd50d  Back-off pulling image "nginx:1.13-alpine"
  Warning  Failed   6m17s (x25501 over 4d2h)  kubelet, 97e19e62-34e2-4aba-9fbe-3802041cd50d  Error: ImagePullBackOff

I used bosh ssh to log in to one worker node, manually ran "sudo docker pull nginx:1.13-alpine", and got this error:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Yet dockerd appears to be running locally:

worker/31f784ba-b3dc-48d1-94b0-bfa40a29045e:~$ sudo lsof -i -n -P | grep -i listen
bosh-agen   752 root    6u  IPv4   22954      0t0  TCP 127.0.0.1:2825 (LISTEN)
sshd        789 root    3u  IPv4 4209089      0t0  TCP *:22 (LISTEN)
bosh-dns-  9140 vcap    3u  IPv4   29506      0t0  TCP *:8853 (LISTEN)
bosh-dns   9208 vcap    3u  IPv4   29532      0t0  TCP 169.254.0.2:53 (LISTEN)
bosh-dns   9208 vcap    5u  IPv4   29534      0t0  TCP 127.0.0.1:53080 (LISTEN)
bosh-dns   9208 vcap    6u  IPv4   29536      0t0  TCP 169.254.0.2:53 (LISTEN)
rpcbind   10590 root    8u  IPv4   33807      0t0  TCP *:111 (LISTEN)
monit     11493 root    4u  IPv4   34075      0t0  TCP 127.0.0.1:2822 (LISTEN)
dockerd   11584 root    6u  IPv4   34937      0t0  TCP 127.0.0.1:4243 (LISTEN)
kubelet   12165 root    9u  IPv4   35919      0t0  TCP 127.0.0.1:32831 (LISTEN)
kubelet   12165 root   25u  IPv4   36525      0t0  TCP 127.0.0.1:10248 (LISTEN)
kubelet   12165 root   27u  IPv4   36530      0t0  TCP *:10250 (LISTEN)
kube-prox 12234 root   11u  IPv4   35139      0t0  TCP 127.0.0.1:10249 (LISTEN)
kube-prox 12234 root   12u  IPv4   36137      0t0  TCP *:32347 (LISTEN)
kube-prox 12234 root   13u  IPv4   35146      0t0  TCP *:10256 (LISTEN)
kube-prox 12234 root   14u  IPv4 3066632      0t0  TCP *:32005 (LISTEN)

Did I do anything wrong here?

 

Broadcom Employee Daniel Lynch

The user you are logged in as might not have the right env vars set. You should have the following:

~# env | egrep DOCKER
DOCKER_HOST=unix:///var/vcap/sys/run/docker/docker.sock
DOCKER_SOCK=/var/vcap/sys/run/docker/docker.sock
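
One possible contributing factor, stated here as an assumption rather than a confirmed diagnosis: the failing pull was run through sudo, and sudo resets most environment variables by default on many systems, so the docker client may fall back to /var/run/docker.sock even when DOCKER_HOST is set for the login user. A minimal sketch that passes the socket explicitly with the client's `-H` option follows. The function name `docker_pull_via_sock` is hypothetical, and the default path is the one shown in the env output above.

```shell
# Hypothetical helper: pull an image while telling the docker client
# explicitly which daemon socket to use, instead of relying on DOCKER_HOST
# surviving the switch to root.
docker_pull_via_sock() {
  image="$1"
  # Fall back to the socket path from the env output if DOCKER_SOCK is unset.
  sock="${DOCKER_SOCK:-/var/vcap/sys/run/docker/docker.sock}"
  sudo docker -H "unix://$sock" pull "$image"
}
```

For example, `docker_pull_via_sock nginx:1.13-alpine` would attempt the same pull against the BOSH-managed docker socket.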

 

Jim Song

Actually, I do have the above env vars:

worker/c89621cf-5a5b-4a02-a88f-35480c1400bc:~$ env | egrep DOCKER
DOCKER_HOST=unix:///var/vcap/sys/run/docker/docker.sock
DOCKER_SOCK=/var/vcap/sys/run/docker/docker.sock

I think the following messages from the "kubectl describe pod nginx-844664bf47-8t9t6" output indicate that the issue may be caused by network or DNS configuration. I'm still working on how to fix it:

Events:
  Type     Reason     Age                    From                                           Message
  ----     ------     ----                   ----                                           -------
  Normal   Scheduled  7h52m                  default-scheduler                              Successfully assigned default/nginx-844664bf47-8t9t6 to d019fed2-8f6f-4376-9140-a614b5786ca6
  Warning  Failed     7h52m                  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Failed to pull image "nginx:1.13-alpine": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.180.1:53: read udp 192.168.180.11:34745->192.168.180.1:53: i/o timeout
  Warning  Failed     7h51m                  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Failed to pull image "nginx:1.13-alpine": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.180.1:53: read udp 192.168.180.11:40959->192.168.180.1:53: i/o timeout
  Warning  Failed     7h50m                  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Failed to pull image "nginx:1.13-alpine": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.180.1:53: read udp 192.168.180.11:55813->192.168.180.1:53: i/o timeout
  Normal   Pulling    7h50m (x4 over 7h52m)  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Pulling image "nginx:1.13-alpine"
  Warning  Failed     7h50m (x4 over 7h52m)  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Error: ErrImagePull
  Warning  Failed     7h50m                  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Failed to pull image "nginx:1.13-alpine": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.180.1:53: server misbehaving
  Normal   BackOff    7h49m (x6 over 7h51m)  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Back-off pulling image "nginx:1.13-alpine"
  Warning  Failed     7h49m (x7 over 7h51m)  kubelet, d019fed2-8f6f-4376-9140-a614b5786ca6  Error: ImagePullBackOff
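
To list only the warning events for one pod without the rest of the describe output, a filter along these lines may help. This is a sketch: the function name `pod_warnings` is hypothetical, and it assumes the pod lives in the namespace kubectl currently targets.

```shell
# Hypothetical helper: show only Warning events recorded for a given pod,
# oldest first. Assumes kubectl targets the pod's namespace.
pod_warnings() {
  pod="$1"
  kubectl get events \
    --field-selector "involvedObject.name=$pod,type=Warning" \
    --sort-by=.lastTimestamp
}
```

For example, `pod_warnings nginx-844664bf47-8t9t6` would narrow the output to the failed image pulls shown above.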

Jim Song

By the way, when I execute "kubectl apply -f nginx.yml", the actual "docker pull nginx:1.13-alpine" is executed on the worker node VM, correct? I only see dockerd running on the worker nodes; there's no docker daemon running on the master node. Am I getting this right?

Broadcom Employee Daniel Lynch

Looking at the error more closely, it appears your worker node cannot look up the Docker registry hostname. That is a DNS issue.

dial tcp: lookup registry-1.docker.io on 192.168.180.1:53: server misbehaving
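
A quick way to confirm this from the worker node is to try resolving the registry hostname with the system resolver. A minimal sketch follows, assuming `getent` is available on the worker (it ships with glibc on Ubuntu). The function name `check_registry_dns` is hypothetical, and the check exercises the node's resolver configuration, which is not necessarily identical to the lookup path dockerd uses.

```shell
# Hypothetical helper: check whether a hostname resolves on this node.
# Defaults to the Docker Hub registry host named in the error message.
check_registry_dns() {
  host="${1:-registry-1.docker.io}"
  if getent hosts "$host" >/dev/null 2>&1; then
    echo "resolved: $host"
  else
    echo "cannot resolve: $host"
    return 1
  fi
}
```

Running `check_registry_dns` with no argument tests the registry host; passing another hostname helps tell a general DNS outage apart from a problem with one name.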

 

Jim Song

Yes, it was a DNS issue. After it was resolved, I redeployed the service, and its status is now "Running". I can also access it through curl. Thanks for the help, Dan.

ubuntu@opsmgr26:~$ kubectl get pod
NAME                     READY   STATUS    RESTARTS   AGE
nginx-844664bf47-fmdmn   1/1     Running   0          76m
nginx-844664bf47-hs4hn   1/1     Running   0          76m
ubuntu@opsmgr26:~$ kubectl get svc
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.100.200.1     <none>        443/TCP        93m
nginx        NodePort    10.100.200.229   <none>        80:30839/TCP   77m