Tips for debugging and configuring a Kubernetes cluster.
Kubernetes Cheatsheet
Get a shell to a running container (see https://kubernetes.io/docs/tasks/debug/debug-application/get-shell-running-container/ for details):
kubectl exec --stdin --tty your_pod_name_here -- /bin/bash
# Example output
root@oauth-77dfd7f8f-7lksn:/usr/local/tomcat# ll
total 128
drwxr-xr-x. 1 root root 21 Jan 5 15:59 webapps/
drwxr-xr-x. 7 root root 81 Jun 10 2021 webapps.dis
# You can run these example commands inside the container
ls /
cat /proc/mounts
cat /proc/1/maps
Kubernetes container connect to external server
To let pods reach a server outside the cluster (e.g. a database) under a stable in-cluster DNS name, see: https://kubernetes.io/docs/concepts/services-networking/service/
# Solution: define a Service of type ExternalName.
apiVersion: v1
kind: Service
metadata:
  name: database-service
  namespace: prod
spec:
  type: ExternalName
  externalName: my.database.example.com
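Pods can then reach the database as database-service.prod.svc.cluster.local (or simply database-service from within the prod namespace). A quick sanity check from inside a pod, assuming the container image ships nslookup:

```shell
# Should return a CNAME record pointing at my.database.example.com
kubectl exec your_pod_name_here -- nslookup database-service.prod
```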
Get pod environment variables
# Get pod environment variables, e.g. those containing SERVICE
kubectl exec your_pod_name_here -- printenv | grep SERVICE
# Check kubernetes dns
kubectl get services kube-dns --namespace=kube-system
# Get kubernetes node details, status
kubectl get nodes -o wide
# Check calico network status
kubectl get pods -n calico-system
Kubernetes pod cannot communicate with the outside world
Issue: a pod is unable to reach hosts outside the cluster, e.g.:
kubectl exec --stdin --tty your_pod_name_here -- /bin/bash
root@your_pod_name_here:/usr/local/tomcat# curl -v google.com
* Could not resolve host: google.com
* Closing connection 0
Working solution: set the hostNetwork property to true (credit: https://discuss.kubernetes.io/t/pod-access-outside-cluster/11784/9):
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: oauth-example
spec:
  selector:
    matchLabels:
      app: oauth-example
  replicas: 1
  template:
    metadata:
      labels:
        app: oauth-example
    spec:
      containers:
      - name: example
        image: rhel8/example:latest
        ports:
        - name: secured
          containerPort: 8443
        env:
        - name: env
          value: prod
      hostNetwork: true
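One caveat: with hostNetwork: true the pod inherits the node's network and DNS settings, so cluster-internal service names may stop resolving. If the pod still needs cluster DNS, also set dnsPolicy (a sketch of the relevant pod spec fields; the image name is illustrative):

```
    spec:
      hostNetwork: true
      # Without this, a hostNetwork pod uses the node's resolv.conf and
      # cannot resolve *.svc.cluster.local service names.
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: example
        image: rhel8/example:latest
```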
Kubernetes get pod DNS configuration
kubectl exec your_pod_name_here -- cat /etc/resolv.conf
Sample output:
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
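The search list and ndots:5 come from the cluster DNS defaults (names with fewer than 5 dots are tried against each search domain first). These can be overridden per pod via spec.dnsConfig; a sketch, with illustrative values:

```
spec:
  dnsPolicy: "None"          # ignore cluster DNS defaults entirely
  dnsConfig:
    nameservers:
    - 10.96.0.10
    searches:
    - prod.svc.cluster.local
    options:
    - name: ndots
      value: "2"             # fewer search-list expansions for external names
```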
Custom upstream DNS servers via ConfigMap: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
---
# For clusters running kube-dns; CoreDNS is configured via a Corefile instead.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["8.8.8.8","8.8.4.4"]
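Note that upstreamNameservers is only read by kube-dns; clusters running CoreDNS set upstream resolvers with a forward directive in the coredns ConfigMap's Corefile. A minimal sketch (addresses illustrative):

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        # Forward queries not answered in-cluster to these resolvers
        forward . 8.8.8.8 8.8.4.4
        cache 30
    }
```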
Check that the CoreDNS pods are running
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
Check for errors in the DNS pod
kubectl logs --namespace=kube-system -l k8s-app=kube-dns
Kubernetes default Corefile configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
# Kubernetes frequently used commands
# Restart all pods of a deployment (rolling restart)
kubectl rollout restart deployment <deployment name>
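Related rollout subcommands are useful alongside the restart (standard kubectl commands):

```shell
# Watch the restart until all replicas are ready again
kubectl rollout status deployment <deployment name>
# Roll back to the previous revision if the restart surfaces a bad change
kubectl rollout undo deployment <deployment name>
```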
Kubernetes taint node
# Taint all nodes (append a trailing "-" to the taint to remove it)
kubectl taint nodes --all node.kubernetes.io/unreachable:NoSchedule
# Fix a worker node stuck in Ready,SchedulingDisabled status
kubectl uncordon <node-name>
Kubernetes Drain a node
# Kubernetes Drain a node
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# Tell Kubernetes to resume scheduling on the node.
kubectl uncordon <node-name>