Kubernetes Recipe - Basic setup not working
-
New to K8s. I'm simply using the recipe to deploy a small 3 node cluster. The master and workers came up successfully and I can ssh into each node using the debian user and key. IPs are supplied to the nodes via DHCP. From the master I can check node status, but `<none>` is shown for the ROLES of the workers:
```
$ kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   22m   v1.19.2
node-1   Ready    <none>   15m   v1.19.2
node-2   Ready    <none>   15m   v1.19.2
node-3   Ready    <none>   15m   v1.19.2
```
Again, I've never used K8s before, but this doesn't seem right?
Next, I create a deployment:
```
debian@master:~$ kubectl create deployment kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1
deployment.apps/kubernetes-bootcamp created
```
Next, I verify, but the AVAILABLE column shows 0:
```
$ kubectl get deployments
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
kubernetes-bootcamp   0/1     1            0           14m
```
Pod is pending also:
```
$ kubectl get pods
NAME                                   READY   STATUS    RESTARTS   AGE
kubernetes-bootcamp-57978f5f5d-t6bsm   0/1     Pending   0          16m
```
I also see taints but am unsure what's needed to resolve the issue:
```
$ kubectl describe pods
Name:           kubernetes-bootcamp-57978f5f5d-t6bsm
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=kubernetes-bootcamp
                pod-template-hash=57978f5f5d
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/kubernetes-bootcamp-57978f5f5d
Containers:
  kubernetes-bootcamp:
    Image:        gcr.io/google-samples/kubernetes-bootcamp:v1
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bqs24 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-bqs24:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bqs24
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  63s (x13 over 17m)  default-scheduler  0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) had taint {node.kubernetes.io/disk-pressure: }, that the pod didn't tolerate.
```
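For anyone hitting the same events: the `disk-pressure` taint is added by the kubelet when a node's filesystem runs low on space, and it clears itself once space is freed. A generic way to see which taint is blocking scheduling (node names here are assumed from the output above; these commands need a working cluster):
```
# List each node's taint keys, then inspect the conditions on one worker;
# look for DiskPressure=True in the Conditions section.
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
kubectl describe node node-1 | grep -A 8 'Conditions:'
```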
Checking network interfaces, I have the following (from the master):
```
# ip addr sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 02:a1:71:79:d0:1c brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.168/24 brd 192.168.1.255 scope global dynamic eth0
       valid_lft 4886sec preferred_lft 4886sec
    inet6 fe80::a1:71ff:fe79:d01c/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:e5:e1:84:8a brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
4: kube-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether ce:7d:de:2f:39:8b brd ff:ff:ff:ff:ff:ff
    inet6 fe80::cc7d:deff:fe2f:398b/64 scope link
       valid_lft forever preferred_lft forever
```
Kind of lost; not sure what to do next or how to resolve this.
Any tips greatly appreciated!
BTW - I absolutely love xcp-NG and XOA. One of the best switches I've made in a long time.
-
Ping @BenjiReis
-
Hello,
Can the master communicate with the other nodes? What IP ranges did you use for kubernetes CIDR & for your VM IPs?
Thanks
-
@BenjiReis, yes, all of the nodes could communicate with each other. The CIDR supplied in the template was in a different range that didn't overlap any existing ones (i.e. completely different). This morning I had to move forward, so I manually deployed a 10 node cluster (what a pain!) and had to abandon the template for now. I do plan to try this again, as I'm certain I may have been doing something wrong, and it would make my life easier if I can get it working. One thing I noticed this morning is that the master was out of disk space: only 4 GB was allocated, which seems pretty low. I'll try again sometime.
-
Ok, I'll try on my end to see if I encounter a similar error. Let me know how your next try goes.
Regards
-
I have exactly the same problem. I tried different CIDRs, changing it manually in the kube-controller-manager and then restarting the kubelet service.
```
$ kubectl get pods -A -o wide
NAMESPACE     NAME                                   READY   STATUS              RESTARTS   AGE     IP             NODE     NOMINATED NODE   READINESS GATES
default       kubernetes-bootcamp-57978f5f5d-mx8sd   0/1     Pending             0          6m52s   <none>         <none>   <none>           <none>
kube-system   coredns-f9fd979d6-8jjbs                0/1     ContainerCreating   0          83m     <none>         node-1   <none>           <none>
kube-system   coredns-f9fd979d6-mn4d8                0/1     ContainerCreating   0          83m     <none>         node-1   <none>           <none>
kube-system   etcd-master                            1/1     Running             1          83m     192.168.1.52   master   <none>           <none>
kube-system   kube-apiserver-master                  1/1     Running             1          83m     192.168.1.52   master   <none>           <none>
kube-system   kube-controller-manager-master         1/1     Running             0          35m     192.168.1.52   master   <none>           <none>
kube-system   kube-proxy-84k8x                       1/1     Running             1          79m     192.168.1.55   node-2   <none>           <none>
kube-system   kube-proxy-f5shp                       1/1     Running             1          79m     192.168.1.53   node-1   <none>           <none>
kube-system   kube-proxy-qg4bk                       1/1     Running             1          83m     192.168.1.52   master   <none>           <none>
kube-system   kube-proxy-whcwv                       1/1     Running             1          79m     192.168.1.54   node-3   <none>           <none>
kube-system   kube-router-7zmtb                      0/1     CrashLoopBackOff    22         79m     192.168.1.55   node-2   <none>           <none>
kube-system   kube-router-llgqk                      0/1     CrashLoopBackOff    23         79m     192.168.1.53   node-1   <none>           <none>
kube-system   kube-router-q4m5d                      0/1     CrashLoopBackOff    22         79m     192.168.1.54   node-3   <none>           <none>
kube-system   kube-router-xs696                      0/1     CrashLoopBackOff    33         83m     192.168.1.52   master   <none>           <none>
kube-system   kube-scheduler-master                  1/1     Running             1          83m     192.168.1.52   master   <none>           <none>

debian@master:~$ kubectl -n kube-system logs -f kube-router-7zmtb
I1015 09:06:02.910389       1 kube-router.go:231] Running /usr/local/bin/kube-router version v1.1.0-dirty, built on 2020-10-02T22:14:14+0000, go1.13.13
F1015 09:06:03.015240       1 network_routes_controller.go:1060] Failed to get pod CIDR from node spec. kube-router relies on kube-controller-manager to allocate pod CIDR for the node or an annotation `kube-router.io/pod-cidr`.
Error: node.Spec.PodCIDR not set for node: node-2

debian@master:~$ kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' -A

debian@master:~$ kubectl cluster-info dump -o yaml | grep -i cidr | grep \\\-\\\-
      - --allocate-node-cidrs=true
      - --cluster-cidr=10.96.0.0/22
      - --node-cidr-mask-size=25

debian@master:~$ kubectl logs pod/kube-controller-manager-master -n kube-system
E1015 08:35:32.193635       1 controller_utils.go:248] Error while processing Node Add: failed to allocate cidr from cluster cidr at idx:0: CIDR allocation failed; there are no remaining CIDRs left to allocate in the accepted range
```
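A note on the numbers in that output: with `--cluster-cidr=10.96.0.0/22` and `--node-cidr-mask-size=25` the allocator has 2^(25-22) = 8 node subnets, so 4 nodes should not exhaust the range by count alone. One plausible suspect (an assumption, not confirmed here) is that 10.96.0.0/22 sits inside 10.96.0.0/12, which is kubeadm's default service cluster IP range. A small bash sketch of the arithmetic:

```shell
#!/usr/bin/env bash
# Hypothetical sanity check of the CIDR values above (not part of the recipe).

ip2int() {  # dotted-quad -> 32-bit integer
  local IFS=.
  read -r a b c d <<<"$1"
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

contains() {  # contains OUTER/PREFIX INNER/PREFIX -> yes/no
  local outer=${1%/*} outer_len=${1#*/}
  local inner=${2%/*} inner_len=${2#*/}
  local mask=$(( 0xFFFFFFFF << (32 - outer_len) & 0xFFFFFFFF ))
  if (( inner_len >= outer_len )) && \
     (( ($(ip2int "$inner") & mask) == ($(ip2int "$outer") & mask) )); then
    echo yes
  else
    echo no
  fi
}

# 2^(node-cidr-mask-size - cluster-cidr prefix) node subnets are available:
echo "node subnets in 10.96.0.0/22 at /25: $(( 1 << (25 - 22) ))"

# The configured pod CIDR falls entirely inside the default service CIDR:
echo "10.96.0.0/22 inside 10.96.0.0/12: $(contains 10.96.0.0/12 10.96.0.0/22)"
```

If that overlap is the cause, picking a pod CIDR outside 10.96.0.0/12 (e.g. something in 10.244.0.0/16, which many CNI examples use) would be worth trying.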
-
Did the manual change solve the issue?
-
@BenjiReis
No, the manual change didn't solve the problem.
-
Did you make sure /proc/sys/net/bridge/bridge-nf-call-iptables is set to 1? Our implementation uses kube-router, which requires this setting.
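For reference, a sketch of how that flag is typically enabled (assumed setup steps, not taken from the recipe; needs root, and the br_netfilter module must be loaded for the bridge sysctls to exist):
```
# Load the bridge netfilter module and enable the sysctl for this boot:
modprobe br_netfilter
sysctl -w net.bridge.bridge-nf-call-iptables=1
# Persist both across reboots (file names are just a convention):
printf 'br_netfilter\n' > /etc/modules-load.d/k8s.conf
printf 'net.bridge.bridge-nf-call-iptables = 1\n' > /etc/sysctl.d/99-k8s.conf
```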
-
@BenjiReis said in Kubernetes Recipe - Basic setup not working:
/proc/sys/net/bridge/bridge-nf-call-iptables
Just checked:
```
debian@master:~$ more /proc/sys/net/bridge/bridge-nf-call-iptables
1
```
-
Ok thanks, then I don't understand why there's a problem, unfortunately...
The recipe follows this doc to create the cluster: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
@suaro how does your manual install differ from the doc used by the recipe? Perhaps we can dig there.
-
And the pod network is made with kube-router: https://github.com/cloudnativelabs/kube-router/blob/master/docs/kubeadm.md
-
I'm going to have a look into it!
Thanks, and hopefully I'll find the culprit, so the recipe can be updated with some nice new ingredients!
-
Thanks
I'll try to investigate myself as well when I can. Do not hesitate to come back here if you find anything.
-
I got it working. It seems the podCIDR wasn't set. Setting it manually by patching the nodes worked for me:
```
for node in master node-1 node-2 node-3; do
  kubectl patch node $node -p '{"spec":{"podCIDR":"10.96.0.0/12"}}'
done
```
Not sure if this is a problem with the recipe, or a bug in kube-router/kube-controller-manager. Anyway, I have my cluster up and running now!
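In case it helps others, a quick way to confirm the patch took (same kind of jsonpath query as earlier in the thread; needs a working cluster):
```
# Print each node name with its podCIDR; after the patch none should be empty.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
```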
-
It seems that Debian Buster has some problems with Kubernetes. While this base setup works, one should also ensure that every tool uses the legacy iptables. If not, pods will not be able to reach the Kubernetes API... And then... failure all over!
So we also need:
```
update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
update-alternatives --set arptables /usr/sbin/arptables-legacy
update-alternatives --set ebtables /usr/sbin/ebtables-legacy
```