Networking

Welcome to the jungle!

Deployments

Custom DNS nameservers

To specify custom nameservers, use the following template:

...
spec:
  containers:
    - name: ........
      image: ..........
...
  dnsPolicy: "None" # override all DNS settings
  dnsConfig:
    nameservers:
      - 8.8.8.8
      - 208.67.222.222
...
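
As a complete, hedged example (the pod name and the busybox image are placeholders; 8.8.8.8 and 208.67.222.222 are the same public resolvers as above):

apiVersion: v1
kind: Pod
metadata:
  name: dns-example # placeholder name
spec:
  containers:
    - name: dns-example
      image: busybox
      command: ["sleep", "3600"]
  dnsPolicy: "None" # ignore the cluster DNS settings entirely
  dnsConfig:
    nameservers:
      - 8.8.8.8
      - 208.67.222.222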

References

https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config

DNS troubleshooting

kubectl create namespace my-namespace
kubectl -n my-namespace create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF

Then run:

kubectl -n my-namespace exec -ti busybox -- nslookup kubernetes.default
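
If the lookup fails, it can also help to inspect the pod's resolver configuration and to check that the cluster DNS pods are healthy (the k8s-app=kube-dns label also matches CoreDNS in most installs):

kubectl -n my-namespace exec -ti busybox -- cat /etc/resolv.conf
kubectl -n kube-system get pods -l k8s-app=kube-dns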

Port forwarding

port-forward

kubectl -n MY-NAMESPACE port-forward deployment/MY-DEPLOYMENT 8001:8001
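
The local and remote ports do not have to match, and port-forward also accepts services and plain pods, e.g. (MY-SERVICE and MY-POD are placeholders):

kubectl -n MY-NAMESPACE port-forward service/MY-SERVICE 8080:80
kubectl -n MY-NAMESPACE port-forward pod/MY-POD 8001:8001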

proxy

kubectl proxy

Then access:

http://127.0.0.1:8001/api/v1/namespaces/MY-NAMESPACE/services/MY-SERVICE/proxy/
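
If the service speaks HTTPS or exposes a named port, the proxy path accepts an optional scheme prefix and port-name suffix, e.g. (MY-PORT-NAME is a placeholder):

http://127.0.0.1:8001/api/v1/namespaces/MY-NAMESPACE/services/https:MY-SERVICE:MY-PORT-NAME/proxy/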

Known errors and solutions

Calico node 'XXXX' is already using the IPv4 address X.X.X.X

Problem

When using Kubespray, you remove a node:

ansible-playbook -i inventory/custom/hosts.ini remove-node.yml \
  -b -v --extra-vars "node=worker-006"

Then you try to scale the cluster again, using the same node IP but a different hostname.

The "calico-node" pod fails with the following error:

2019-03-28 19:53:36.194 [INFO][10] startup.go 251: Early log level set to info
2019-03-28 19:53:36.194 [INFO][10] startup.go 267: Using NODENAME environment for node name
2019-03-28 19:53:36.194 [INFO][10] startup.go 279: Determined node name: fm-mcd-006
2019-03-28 19:53:36.472 [INFO][10] startup.go 101: Skipping datastore connection test
2019-03-28 19:53:36.608 [INFO][10] startup.go 352: Building new node resource Name="fm-mcd-006"
2019-03-28 19:53:36.608 [INFO][10] startup.go 367: Initialize BGP data
2019-03-28 19:53:36.608 [INFO][10] startup.go 456: Using IPv4 address from environment: IP=61.1.2.6
2019-03-28 19:53:36.610 [INFO][10] startup.go 489: IPv4 address 61.1.2.6 discovered on interface fm-k8s
2019-03-28 19:53:36.610 [INFO][10] startup.go 432: Node IPv4 changed, will check for conflicts
2019-03-28 19:53:36.701 [WARNING][10] startup.go 861: Calico node 'worker-006' is already using the IPv4 address 61.1.2.6.
2019-03-28 19:53:36.701 [WARNING][10] startup.go 1058: Terminating
Calico node failed to start

Solution

SSH to the master node, query etcd, and check for the keys related to the "old" hostname:

export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
export ETCDCTL_CERT=/etc/ssl/etcd/ssl/node-k8smaster.pem
export ETCDCTL_KEY=/etc/ssl/etcd/ssl/node-k8smaster-key.pem
etcdctl get / --prefix --keys-only | grep "fm-doc-001"

Sample output:

...
/calico/ipam/v2/host/fm-doc-001/ipv4/block/10.233.114.0-26
/calico/ipam/v2/host/fm-doc-001/ipv4/block/10.233.114.128-26
/calico/ipam/v2/host/fm-doc-001/ipv4/block/10.233.114.64-26
/calico/resources/v3/projectcalico.org/felixconfigurations/node.fm-doc-001
/calico/resources/v3/projectcalico.org/nodes/fm-doc-001
/calico/resources/v3/projectcalico.org/profiles/kns.nmp-fm-doc-001
...

Review the entries, then delete them:

etcdctl get / --prefix --keys-only | grep "fm-doc-001" | xargs -I {} etcdctl del {}
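
To confirm the cleanup worked, re-run the same query; it should now return nothing:

etcdctl get / --prefix --keys-only | grep "fm-doc-001"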

Delete the "calico-node" pod on the new node; it should then start normally.
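
A hedged one-liner for that, assuming Calico runs in kube-system with its default labels and the new node is named fm-mcd-006 as in the log above:

# assumes the default k8s-app=calico-node label and the node name from the log
kubectl -n kube-system delete pod -l k8s-app=calico-node --field-selector spec.nodeName=fm-mcd-006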

If that does not work, try the following steps (a sketch follows the list):

  • Remove the "old" node;

  • Delete its etcd entries;

  • Scale the cluster with the new node hostname.
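
With Kubespray, those steps might look like this (scale.yml is Kubespray's standard scale playbook; the inventory path and the worker-006 hostname reuse the values above, and the ETCDCTL_* variables are assumed to still be exported):

# remove the old node, then purge its etcd keys, then scale back up
ansible-playbook -i inventory/custom/hosts.ini remove-node.yml \
  -b -v --extra-vars "node=worker-006"
etcdctl get / --prefix --keys-only | grep "worker-006" | xargs -I {} etcdctl del {}
ansible-playbook -i inventory/custom/hosts.ini scale.yml -b -v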

network plugin is not ready: cni config uninitialized

Problem

When trying to scale the cluster, the new node got disconnected in the middle of the process. When it came back, one of the CNI containers was returning the following error(s):

...
network plugin is not ready: cni config uninitialized
...
/var/lib/calico/nodename: no such file or directory
...

Solution

SSH to the node you are trying to add, become root, and create the following file:

nano /var/lib/calico/nodename

The content of the file should be the node's hostname.
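
For example, this one-liner does the same thing, assuming the Kubernetes node name matches the machine hostname:

hostname > /var/lib/calico/nodename # assumes the k8s node name equals the machine hostname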

The CNI container should come up. If not, try deleting the pod (its controller, a DaemonSet in Calico's case, will recreate it automatically).
