Kubernetes Recipe - Basic setup not working
-
New to K8s. I'm simply using the recipe to deploy a small 3 node cluster. The master and workers came up successfully and I can ssh into each node using the debian user and key. IPs are supplied to the nodes via DHCP. From the master I can check node status, but `<none>` is shown for the ROLES of the workers:
```
$ kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   22m   v1.19.2
node-1   Ready    <none>   15m   v1.19.2
node-2   Ready    <none>   15m   v1.19.2
node-3   Ready    <none>   15m   v1.19.2
```
Again, I've never used K8s before, but this doesn't seem right?
Next, I create a deployment:
```
debian@master:~$ kubectl create deployment kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1
deployment.apps/kubernetes-bootcamp created
```
Next, I verify, but the AVAILABLE column shows 0:
```
$ kubectl get deployments
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
kubernetes-bootcamp   0/1     1            0           14m
```
Pod is pending also:
```
$ kubectl get pods
NAME                                   READY   STATUS    RESTARTS   AGE
kubernetes-bootcamp-57978f5f5d-t6bsm   0/1     Pending   0          16m
```
I also see taints but am unsure what's needed to resolve the issue:
```
$ kubectl describe pods
Name:           kubernetes-bootcamp-57978f5f5d-t6bsm
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=kubernetes-bootcamp
                pod-template-hash=57978f5f5d
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/kubernetes-bootcamp-57978f5f5d
Containers:
  kubernetes-bootcamp:
    Image:        gcr.io/google-samples/kubernetes-bootcamp:v1
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bqs24 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-bqs24:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bqs24
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  63s (x13 over 17m)  default-scheduler  0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had taint {node.kubernetes.io/disk-pressure: }, that the pod didn't tolerate.
```
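For anyone hitting the same events: the `disk-pressure` taint is added by the kubelet when a node's filesystem runs low on space, and it clears itself once space is freed. A generic way to see which taint is blocking scheduling (node names here are assumed from the output above; these commands need a working cluster):
```
# List each node's taint keys, then inspect the conditions on one worker;
# look for DiskPressure=True in the Conditions section.
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
kubectl describe node node-1 | grep -A 8 'Conditions:'
```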
Checking network interfaces, I have the following (from the master):
```
# ip addr sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 02:a1:71:79:d0:1c brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.168/24 brd 192.168.1.255 scope global dynamic eth0
       valid_lft 4886sec preferred_lft 4886sec
    inet6 fe80::a1:71ff:fe79:d01c/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:e5:e1:84:8a brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
4: kube-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether ce:7d:de:2f:39:8b brd ff:ff:ff:ff:ff:ff
    inet6 fe80::cc7d:deff:fe2f:398b/64 scope link
       valid_lft forever preferred_lft forever
```
Kind of lost; not sure what to do next or how to resolve this.
Any tips greatly appreciated!
BTW - I absolutely love xcp-NG and XOA. One of the best switches I've made in a long time.
-
Ping @BenjiReis
-
Hello,
Can the master communicate with the other nodes? What IP ranges did you use for kubernetes CIDR & for your VM IPs?
Thanks
-
@BenjiReis, yes, all of the nodes could communicate with each other. The CIDR supplied in the template was in a different range that didn't overlap any existing ones (i.e. completely different). This morning I had to move forward, so I manually deployed a 10 node cluster (what a pain!) and had to abandon the template for now. I do plan to try this again, as I'm certain I may have been doing something wrong, and it would make my life easier if I can get it working. One thing I noticed this morning is that the master was out of disk space: only 4 GB was allocated, which seems pretty low. I'll try again sometime.
-
Ok, I'll try on my end to see if I encounter a similar error. Let me know how your next try goes.
Regards
-
I have exactly the same problem. I tried different CIDRs, changing it manually in the kube-controller-manager and then restarting the kubelet service.
```
$ kubectl get pods -A -o wide
NAMESPACE     NAME                                   READY   STATUS              RESTARTS   AGE     IP             NODE     NOMINATED NODE   READINESS GATES
default       kubernetes-bootcamp-57978f5f5d-mx8sd   0/1     Pending             0          6m52s   <none>         <none>   <none>           <none>
kube-system   coredns-f9fd979d6-8jjbs                0/1     ContainerCreating   0          83m     <none>         node-1   <none>           <none>
kube-system   coredns-f9fd979d6-mn4d8                0/1     ContainerCreating   0          83m     <none>         node-1   <none>           <none>
kube-system   etcd-master                            1/1     Running             1          83m     192.168.1.52   master   <none>           <none>
kube-system   kube-apiserver-master                  1/1     Running             1          83m     192.168.1.52   master   <none>           <none>
kube-system   kube-controller-manager-master         1/1     Running             0          35m     192.168.1.52   master   <none>           <none>
kube-system   kube-proxy-84k8x                       1/1     Running             1          79m     192.168.1.55   node-2   <none>           <none>
kube-system   kube-proxy-f5shp                       1/1     Running             1          79m     192.168.1.53   node-1   <none>           <none>
kube-system   kube-proxy-qg4bk                       1/1     Running             1          83m     192.168.1.52   master   <none>           <none>
kube-system   kube-proxy-whcwv                       1/1     Running             1          79m     192.168.1.54   node-3   <none>           <none>
kube-system   kube-router-7zmtb                      0/1     CrashLoopBackOff    22         79m     192.168.1.55   node-2   <none>           <none>
kube-system   kube-router-llgqk                      0/1     CrashLoopBackOff    23         79m     192.168.1.53   node-1   <none>           <none>
kube-system   kube-router-q4m5d                      0/1     CrashLoopBackOff    22         79m     192.168.1.54   node-3   <none>           <none>
kube-system   kube-router-xs696                      0/1     CrashLoopBackOff    33         83m     192.168.1.52   master   <none>           <none>
kube-system   kube-scheduler-master                  1/1     Running             1          83m     192.168.1.52   master   <none>           <none>

debian@master:~$ kubectl -n kube-system logs -f kube-router-7zmtb
I1015 09:06:02.910389       1 kube-router.go:231] Running /usr/local/bin/kube-router version v1.1.0-dirty, built on 2020-10-02T22:14:14+0000, go1.13.13
F1015 09:06:03.015240       1 network_routes_controller.go:1060] Failed to get pod CIDR from node spec. kube-router relies on kube-controller-manager to allocate pod CIDR for the node or an annotation `kube-router.io/pod-cidr`.
Error: node.Spec.PodCIDR not set for node: node-2

debian@master:~$ kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' -A

debian@master:~$ kubectl cluster-info dump -o yaml | grep -i cidr | grep \\\-\\\-
      - --allocate-node-cidrs=true
      - --cluster-cidr=10.96.0.0/22
      - --node-cidr-mask-size=25

debian@master:~$ kubectl logs pod/kube-controller-manager-master -n kube-system
E1015 08:35:32.193635       1 controller_utils.go:248] Error while processing Node Add: failed to allocate cidr from cluster cidr at idx:0: CIDR allocation failed; there are no remaining CIDRs left to allocate in the accepted range
```
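A note on the numbers in that output: with `--cluster-cidr=10.96.0.0/22` and `--node-cidr-mask-size=25` the allocator has 2^(25-22) = 8 node subnets, so 4 nodes should not exhaust the range by count alone. One plausible suspect (an assumption, not confirmed here) is that 10.96.0.0/22 sits inside 10.96.0.0/12, which is kubeadm's default service cluster IP range. A small bash sketch of the arithmetic:

```shell
#!/usr/bin/env bash
# Hypothetical sanity check of the CIDR values above (not part of the recipe).

ip2int() {  # dotted-quad -> 32-bit integer
  local IFS=.
  read -r a b c d <<<"$1"
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

contains() {  # contains OUTER/PREFIX INNER/PREFIX -> yes/no
  local outer=${1%/*} outer_len=${1#*/}
  local inner=${2%/*} inner_len=${2#*/}
  local mask=$(( 0xFFFFFFFF << (32 - outer_len) & 0xFFFFFFFF ))
  if (( inner_len >= outer_len )) && \
     (( ($(ip2int "$inner") & mask) == ($(ip2int "$outer") & mask) )); then
    echo yes
  else
    echo no
  fi
}

# 2^(node-cidr-mask-size - cluster-cidr prefix) node subnets are available:
echo "node subnets in 10.96.0.0/22 at /25: $(( 1 << (25 - 22) ))"

# The configured pod CIDR falls entirely inside the default service CIDR:
echo "10.96.0.0/22 inside 10.96.0.0/12: $(contains 10.96.0.0/12 10.96.0.0/22)"
```

If that overlap is the cause, picking a pod CIDR outside 10.96.0.0/12 (e.g. something in 10.244.0.0/16, which many CNI examples use) would be worth trying.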
-
Did the manual change solve the issue?
-
@BenjiReis
No, the manual change didn't solve the problem.
-
Did you make sure /proc/sys/net/bridge/bridge-nf-call-iptables is set to 1? Our implementation uses kube-router, which requires this setting.
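For reference, a sketch of how that flag is typically enabled (assumed setup steps, not taken from the recipe; needs root, and the br_netfilter module must be loaded for the bridge sysctls to exist):
```
# Load the bridge netfilter module and enable the sysctl for this boot:
modprobe br_netfilter
sysctl -w net.bridge.bridge-nf-call-iptables=1
# Persist both across reboots (file names are just a convention):
printf 'br_netfilter\n' > /etc/modules-load.d/k8s.conf
printf 'net.bridge.bridge-nf-call-iptables = 1\n' > /etc/sysctl.d/99-k8s.conf
```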
-
@BenjiReis said in Kubernetes Recipe - Basic setup not working:
/proc/sys/net/bridge/bridge-nf-call-iptables
Just checked:
```
debian@master:~$ more /proc/sys/net/bridge/bridge-nf-call-iptables
1
```
-
Ok thanks, then I don't understand why there's a problem, unfortunately...
The recipe follows this doc to create the cluster: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
@suaro how does your manual install differ from the doc used by the recipe? Perhaps we can dig there.
-
And the pod network is made with kube-router: https://github.com/cloudnativelabs/kube-router/blob/master/docs/kubeadm.md
-
I'm going to have a look into it!
Thanks, and hopefully I'll find the culprit, so the recipe can be updated with some nice new ingredients!
-
Thanks
I'll try to investigate myself as well when I can. Do not hesitate to come back here if you find anything.
-
I got it working. It seems the podCIDR wasn't set. Setting it manually by patching the nodes worked for me:
```
for node in master node-1 node-2 node-3; do
  kubectl patch node $node -p '{"spec":{"podCIDR":"10.96.0.0/12"}}'
done
```
Not sure if this is a problem with the recipe, or a bug in kube-router/kube-controller-manager. Anyway, I have my cluster up and running now!
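In case it helps others, a quick way to confirm the patch took (same kind of jsonpath query as earlier in the thread; needs a working cluster):
```
# Print each node name with its podCIDR; after the patch none should be empty.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
```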
-
It seems that Debian Buster has some problems with Kubernetes. While this base setup works, one should also ensure that every tool uses the legacy iptables. If not, pods will not be able to reach the Kubernetes API... And then... failure all over!
So we also need:
```
update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
update-alternatives --set arptables /usr/sbin/arptables-legacy
update-alternatives --set ebtables /usr/sbin/ebtables-legacy
```