Building a self-hosted Apache Kafka server on a Kubernetes cluster running on Red Hat servers is a robust and scalable solution. This guide will walk you through the process, focusing on best practices and leveraging Kubernetes’ strengths.
We’ll primarily use Strimzi, which is a Kubernetes Operator that simplifies the deployment and management of Kafka on Kubernetes. Strimzi is widely adopted and recommended for production Kafka deployments on Kubernetes.
Here’s a breakdown of the steps:
1. Prerequisites: Red Hat Servers & Kubernetes Cluster
Before deploying Kafka, you need a functioning Kubernetes cluster running on your Red Hat servers.
1.1 Red Hat Server Preparation
- Operating System: Red Hat Enterprise Linux (RHEL 8 or 9) or CentOS Stream 8/9.
- System Updates:
sudo dnf update -y
sudo reboot # If kernel updated
- Firewall: Ensure necessary ports are open for Kubernetes components (typical kubeadm ports are shown in the sketch after this list).
sudo systemctl enable --now firewalld
# Allow SSH
sudo firewall-cmd --permanent --add-service=ssh
# Allow KubeAPI server, Kubelet ports etc. (specific to your K8s setup)
# If using OpenShift, it manages its own firewall rules
sudo firewall-cmd --reload
- SELinux: SELinux can be tricky. While for a quick test you might disable it (setenforce 0 and edit /etc/selinux/config), for production it's best to configure proper policies or use an environment like OpenShift that handles container SELinux contexts.
- Swap: Kubernetes generally requires swap to be disabled.
sudo swapoff -a
# To make it persistent across reboots: comment out swap lines in /etc/fstab
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
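For a kubeadm-based cluster, the ports below are the ones upstream Kubernetes documents. This is a sketch to adapt to your topology; the control-plane ports go on the control-plane node, the NodePort range on workers:

sudo firewall-cmd --permanent --add-port=6443/tcp        # Kubernetes API server
sudo firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd client/peer
sudo firewall-cmd --permanent --add-port=10250/tcp       # kubelet API (all nodes)
sudo firewall-cmd --permanent --add-port=10257/tcp       # kube-controller-manager
sudo firewall-cmd --permanent --add-port=10259/tcp       # kube-scheduler
sudo firewall-cmd --permanent --add-port=30000-32767/tcp # NodePort range (worker nodes)
sudo firewall-cmd --reload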
1.2 Kubernetes Cluster Setup on Red Hat
You have several options for setting up Kubernetes on RHEL:
Option A: OpenShift Container Platform (Recommended for RHEL Environments)
OpenShift is Red Hat’s enterprise-grade Kubernetes distribution, offering enhanced security, developer tools, and a more integrated experience.
- Advantages: Enterprise support, integrated registry, CI/CD, monitoring, advanced networking (OVN-Kubernetes), and built-in StorageClasses via OpenShift Container Storage (OCS, now OpenShift Data Foundation), which is Ceph-based.
- Installation: OpenShift installation is complex and often automated.
- Installer-Provisioned Infrastructure (IPI): The installer deploys the underlying infrastructure (e.g., on AWS, Azure, vSphere) and then OpenShift.
- User-Provisioned Infrastructure (UPI): You prepare the underlying RHEL VMs/bare metal, and then the OpenShift installer deploys OpenShift on top.
- Resources: Refer to the official Red Hat OpenShift documentation for detailed installation guides: OpenShift Docs
Option B: Kubeadm (Open Source Kubernetes)
If you prefer pure upstream Kubernetes:
- Container Runtime: Install containerd (recommended).
  - Refer to the Kubernetes documentation for containerd installation on RHEL/CentOS: Install containerd
- Kubeadm, Kubelet, Kubectl:
sudo dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
sudo systemctl enable --now kubelet
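Note: the install above assumes the Kubernetes package repository is already configured. A minimal sketch using the upstream pkgs.k8s.io repo (the v1.29 stream is an assumption; substitute your target release):

cat <<'EOF' | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF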
- Initialize Control Plane: On your master node:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 # For Flannel, adjust for others
# After init, follow the printed instructions to set up kubectl for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
- Install CNI (Container Network Interface):
  - Flannel:
    kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
  - Calico: Refer to Calico docs.
- Join Worker Nodes: On worker nodes, run the kubeadm join command provided by kubeadm init (its general shape is shown in the sketch after this list).
- StorageClass: You'll need a PersistentVolume provisioner.
- Rook-Ceph: For highly available, distributed storage: Rook.io
- NFS CSI: If you have an NFS server: NFS CSI Driver
- Local Path Provisioner: For single-node or testing (not production): Local Path Provisioner
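The join command printed by kubeadm init has this general shape (placeholders only, not real values):

sudo kubeadm join <CONTROL_PLANE_IP>:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<HASH>
# If you lost the output, regenerate it on the control plane:
sudo kubeadm token create --print-join-command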
1.3 Essential Tools
- kubectl: For interacting with the Kubernetes cluster (installed with kubeadm).
- helm: For deploying Strimzi and Kafka.
  curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
  chmod 700 get_helm.sh
  ./get_helm.sh
- git: To clone Strimzi examples (optional, but helpful).
2. Deploying Strimzi Kafka Operator
Strimzi makes deploying and managing Kafka on Kubernetes easy by using the Operator pattern.
2.1 Add Strimzi Helm Repository
helm repo add strimzi https://strimzi.io/charts/
helm repo update
2.2 Install Strimzi Operator
The Strimzi Operator watches for Custom Resources (CRs) such as Kafka, KafkaTopic, and KafkaUser, and manages the actual Kafka and ZooKeeper deployments.
You can install it in a specific namespace (e.g., kafka-operator) or cluster-wide. For simplicity, let’s install it in its own namespace and grant it cluster-wide permissions to manage Kafka resources across all namespaces.
kubectl create namespace kafka-operator
helm install strimzi-kafka-operator strimzi/strimzi-kafka-operator \
--namespace kafka-operator \
--set watchAnyNamespace=true \
--set createClusterRoles=true
- watchAnyNamespace=true: Allows the operator to manage Kafka clusters in any namespace.
- createClusterRoles=true: Ensures the necessary RBAC roles are created for cluster-wide management.
Verify Operator Deployment:
kubectl get pods -n kafka-operator
# You should see 'strimzi-cluster-operator-...' running
3. Deploying Kafka Cluster with Strimzi
Now that the operator is running, you can define your Kafka cluster using a Kafka Custom Resource.
3.1 Choose a StorageClass
Before defining Kafka, ensure you have a StorageClass defined in your Kubernetes cluster. Kafka brokers and ZooKeeper nodes need persistent storage.
- Check existing StorageClasses:
  kubectl get sc
- Identify a default or create one: On OpenShift, ocs-storagecluster-cephfs or ocs-storagecluster-ceph-rbd might be available. For kubeadm, if you installed Rook-Ceph, it will have rook-ceph-block or similar. If you're using local-path-provisioner for testing, its name is typically standard or local-path. You can also mark a class as the cluster default, as shown below.
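If the class you plan to use isn't the cluster default, marking it as such is a one-liner (sketch; replace the class name):

kubectl patch storageclass <YOUR_STORAGE_CLASS_NAME> -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'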
3.2 Create the Kafka Custom Resource (CR)
Create a YAML file (e.g., my-kafka-cluster.yaml) to define your Kafka cluster. This example uses three Kafka brokers and three ZooKeeper nodes, with SCRAM-SHA-512 authentication and TLS on the external listener.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
name: my-kafka-cluster
namespace: kafka
spec:
kafka:
version: 3.6.1 # Specify your desired Kafka version
replicas: 3
listeners:
# Plain listener for internal K8s communication
- name: plain
port: 9092
type: internal
tls: false
# TLS listener for internal K8s communication (recommended for production)
- name: tls
port: 9093
type: internal
tls: true
# External listener for clients outside the Kubernetes cluster
- name: external
port: 9094
type: route # Use 'route' for OpenShift, 'loadbalancer' or 'nodeport' for others
tls: true
authentication:
type: scram-sha-512 # Recommended authentication
    rack:
      topologyKey: topology.kubernetes.io/zone # Enables rack awareness; Strimzi sets broker.rack per broker from this node label
config:
offsets.topic.replication.factor: 3
transaction.state.log.replication.factor: 3
transaction.state.log.min.isr: 2
      log.message.format.version: "3.6"
      inter.broker.protocol.version: "3.6"
auto.create.topics.enable: "false" # Prevent accidental topic creation
storage:
type: persistent-claim
class: <YOUR_STORAGE_CLASS_NAME> # e.g., 'ocs-storagecluster-ceph-rbd', 'standard', 'rook-ceph-block'
size: 100Gi # Adjust size as needed per broker
deleteClaim: false # Set to true to delete PVCs when Kafka is deleted
resources:
requests:
memory: 4Gi
cpu: 2000m
limits:
memory: 4Gi
cpu: 2000m
zookeeper:
replicas: 3
storage:
type: persistent-claim
class: <YOUR_STORAGE_CLASS_NAME> # Same as Kafka storage class
size: 10Gi
deleteClaim: false
resources:
requests:
memory: 2Gi
cpu: 1000m
limits:
memory: 2Gi
cpu: 1000m
entityOperator:
topicOperator: {} # Manages Kafka topics via KafkaTopic CRs
userOperator: {} # Manages Kafka users via KafkaUser CRs
Key Configuration Details:
- apiVersion: kafka.strimzi.io/v1beta2: The Strimzi API version.
- metadata.name: my-kafka-cluster: Name of your Kafka cluster.
- namespace: kafka: Create this namespace: kubectl create namespace kafka.
- kafka.version: Specify your desired Kafka version.
- kafka.replicas: Number of Kafka brokers (typically 3 or more for production).
- listeners:
  - plain: For applications inside the Kubernetes cluster, often used for simplicity (e.g., testing or internal microservices).
  - tls: Also for applications inside the cluster, but with TLS encryption (recommended).
  - external: For applications outside the Kubernetes cluster.
    - type: route: For OpenShift, exposes via an OpenShift Route.
    - type: loadbalancer: If your Kubernetes cluster provides a cloud LoadBalancer (e.g., AWS, Azure, GCP). For on-prem, MetalLB can provide this.
    - type: nodeport: Exposes via a static port on each node. Clients need to know the node IPs and ports. Less ideal for production with multiple brokers.
  - authentication.type: scram-sha-512: Strongly recommended for external access. Strimzi will create secrets for users.
- rack.topologyKey: Enables rack awareness. Strimzi sets broker.rack for each broker from the node's topology.kubernetes.io/zone label, which also informs anti-affinity and zone-aware replica placement.
- kafka.config: Standard Kafka broker configurations. auto.create.topics.enable: "false" is a good production practice.
- kafka.storage: Define persistent storage for Kafka logs.
  - type: persistent-claim: Uses Kubernetes PVCs.
  - class: Crucial! Replace <YOUR_STORAGE_CLASS_NAME> with an available StorageClass.
  - size: Allocate enough storage.
  - deleteClaim: false: Prevents PVCs from being deleted if the Kafka CR is deleted (data safety).
- zookeeper: Similar configuration for ZooKeeper (or use KRaft if supported by your Kafka and Strimzi versions and you configure it).
- entityOperator: Strimzi's operators for managing Kafka topics and users via Kubernetes CRs.
3.3 Deploy the Kafka Cluster
kubectl create namespace kafka # If you haven't already
kubectl apply -f my-kafka-cluster.yaml
Verify Cluster Deployment:
It will take some time for the pods to spin up, PVCs to be provisioned, and Kafka to become ready.
kubectl get kafka -n kafka
kubectl get pods -n kafka
kubectl get pvc -n kafka
kubectl get svc -n kafka
kubectl get routes -n kafka # If using OpenShift and 'route' listener
Wait until all Kafka and ZooKeeper pods are in a Running state. The kubectl get kafka -n kafka command should show READY status.
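Rather than polling, you can block until the cluster reports Ready (the timeout value is arbitrary):

kubectl wait kafka/my-kafka-cluster --for=condition=Ready --timeout=600s -n kafka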
4. Managing Kafka Resources (Topics, Users)
Strimzi’s Entity Operator includes a Topic Operator and a User Operator to manage Kafka topics and users directly through Kubernetes Custom Resources.
4.1 Create a Kafka Topic
Create a KafkaTopic CR (e.g., my-topic.yaml):
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
name: my-first-topic
namespace: kafka
labels:
strimzi.io/cluster: my-kafka-cluster # Link to your Kafka cluster
spec:
partitions: 3
replicas: 3
config:
retention.ms: 604800000 # 7 days
segment.bytes: 1073741824 # 1GB
Apply it:
kubectl apply -f my-topic.yaml
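To confirm the Topic Operator reconciled the topic, check the CR status (the output shape below is approximate):

kubectl get kafkatopic my-first-topic -n kafka
# NAME             CLUSTER            PARTITIONS   REPLICATION FACTOR   READY
# my-first-topic   my-kafka-cluster   3            3                    True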
4.2 Create a Kafka User
Create a KafkaUser CR (e.g., my-user.yaml) to define a user with SCRAM-SHA-512 authentication and ACLs.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
name: my-app-user
namespace: kafka
labels:
strimzi.io/cluster: my-kafka-cluster # Link to your Kafka cluster
spec:
authentication:
type: scram-sha-512
authorization:
type: simple
acls:
# Allow producing to 'my-first-topic'
- resource:
type: topic
name: my-first-topic
patternType: literal
operation: Write
host: "*"
# Allow consuming from 'my-first-topic'
- resource:
type: topic
name: my-first-topic
patternType: literal
operation: Read
host: "*"
- resource:
type: topic
name: my-first-topic
patternType: literal
operation: Describe
host: "*"
# Required for consumer groups
- resource:
type: group
name: my-consumer-group
patternType: literal
operation: Read
host: "*"
Apply it:
kubectl apply -f my-user.yaml
Strimzi will create a Kubernetes Secret containing the username and password for this user.
kubectl get secret my-app-user -n kafka -o yaml
# You'll find the 'password' key in base64-encoded format; the username is the KafkaUser name, 'my-app-user'.
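Before wiring up external clients, you can smoke-test from inside the cluster over the plain internal listener (which has no authentication in this example). The image tag below is an assumption; match it to your Strimzi and Kafka versions:

kubectl -n kafka run kafka-producer -ti --rm=true --restart=Never \
  --image=quay.io/strimzi/kafka:0.39.0-kafka-3.6.1 \
  -- bin/kafka-console-producer.sh --bootstrap-server my-kafka-cluster-kafka-bootstrap:9092 --topic my-first-topic

In a second terminal, consume what you type:

kubectl -n kafka run kafka-consumer -ti --rm=true --restart=Never \
  --image=quay.io/strimzi/kafka:0.39.0-kafka-3.6.1 \
  -- bin/kafka-console-consumer.sh --bootstrap-server my-kafka-cluster-kafka-bootstrap:9092 --topic my-first-topic --from-beginning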
5. Connecting Clients to Kafka
The way clients connect depends on the external listener type you chose.
5.1 For OpenShift (type: route)
- Get the Bootstrap URL:
oc get route my-kafka-cluster-kafka-external-bootstrap -n kafka -o jsonpath='{.spec.host}{"\n"}'
# Example output: my-kafka-cluster-kafka-external-bootstrap-kafka.apps.mycluster.example.com
- Get User Credentials:
kubectl get secret my-app-user -n kafka -o jsonpath='{.data.password}' | base64 -d
# And the username is 'my-app-user'
- Get CA Certificate: Strimzi automatically generates TLS certificates. Clients need the CA cert to trust the Kafka brokers.
kubectl get secret my-kafka-cluster-cluster-ca-cert -n kafka -o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt
- Client Configuration Example (Java/Spring Boot properties):
Note: For OpenShift Routes, the port is usually 443; the Route passes the TLS connection through to the brokers rather than terminating it.
spring.kafka.bootstrap-servers=<YOUR_BOOTSTRAP_URL>:443
spring.kafka.security.protocol=SASL_SSL
spring.kafka.ssl.trust-store-location=file:/path/to/ca.crt
spring.kafka.ssl.trust-store-type=PEM # Or JKS if you convert (see below)
spring.kafka.properties.sasl.mechanism=SCRAM-SHA-512
spring.kafka.properties.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="my-app-user" password="<YOUR_PASSWORD>";
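If your client stack expects a JKS truststore rather than PEM, one way to convert the CA cert (keytool ships with the JDK; the store password is your choice):

keytool -importcert -alias strimzi-cluster-ca -file ca.crt -keystore truststore.jks -storepass <STORE_PASSWORD> -noprompt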
5.2 For Kubeadm (type: loadbalancer or type: nodeport)
- LoadBalancer:
  - Get the External IP:
    kubectl get svc my-kafka-cluster-kafka-external-bootstrap -n kafka
  - Use that IP and port 9094.
- NodePort:
  - Get NodePort:
    kubectl get svc my-kafka-cluster-kafka-external-bootstrap -n kafka -o jsonpath='{.spec.ports[?(@.name=="external")].nodePort}{"\n"}'
  - Clients need any node IP and the nodePort. Strimzi exposes a single bootstrap service plus a separate NodePort per broker.
  - Bootstrap Servers: Use NODE_IP:NODE_PORT for the bootstrap server; the client will then discover all brokers on their respective NodePorts.
The authentication and TLS steps for getting the user password and CA certificate remain the same as for OpenShift (get from secrets my-app-user and my-kafka-cluster-cluster-ca-cert).
6. Monitoring and Management
- Kafka Exporter: Strimzi includes kafka-exporter, which provides Prometheus metrics for Kafka brokers and topics; it is enabled via the Kafka CR (see the sketch after this list). You can integrate this with Prometheus and Grafana for comprehensive monitoring.
- Kafka UIs:
- Kafka UI or AKHQ (formerly KafkaHQ): Popular web-based UIs for managing Kafka clusters. You can deploy one as another Kubernetes deployment.
- Control Center (Confluent): More feature-rich but requires a license for advanced features.
- Strimzi Documentation: The official Strimzi documentation is an excellent resource for advanced configurations, upgrades, and troubleshooting.
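For reference, Kafka Exporter is enabled by adding a kafkaExporter block to the Kafka CR. A minimal sketch (the catch-all regexes are illustrative; narrow them for large clusters):

spec:
  # ...existing kafka and zookeeper sections...
  kafkaExporter:
    topicRegex: ".*"  # export metrics for all topics
    groupRegex: ".*"  # export metrics for all consumer groups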
7. Security Best Practices
- TLS Everywhere: Always use TLS for communication between clients and brokers, and ideally for inter-broker communication too (Strimzi handles this by default).
- Authentication & Authorization: Use SCRAM-SHA-512 or mTLS for authentication and ACLs (configured via KafkaUser CRs) for fine-grained authorization.
- Network Policies: Implement Kubernetes Network Policies to restrict network access to your Kafka pods (a sketch follows this list).
- Resource Limits: Set CPU and memory limits/requests for Kafka and ZooKeeper pods to prevent resource contention.
- SELinux: If not using OpenShift, ensure your host SELinux policies are correctly configured to allow container operations without disabling it entirely.
- Secrets Management: Use Kubernetes Secrets to store sensitive data like user passwords. Strimzi does this automatically.
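Strimzi generates NetworkPolicies for its listeners, so the idiomatic way to restrict access is networkPolicyPeers on each listener in the Kafka CR. A sketch for the tls listener (the kafka-client namespace label is a hypothetical convention you would apply to trusted client namespaces):

    listeners:
      - name: tls
        port: 9093
        type: internal
        tls: true
        networkPolicyPeers:
          - namespaceSelector:
              matchLabels:
                kafka-client: "true"  # hypothetical label on trusted namespaces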
8. Scalability and Resilience
- Strimzi Scaling: You can scale your Kafka brokers by modifying the replicas field in the Kafka CR (e.g., the one-line patch after this list). Strimzi handles the rolling updates.
- Topic Partitions & Replicas: Design your topics with appropriate numbers of partitions and replication factors (e.g., 3 replicas for fault tolerance).
- Anti-Affinity: Strimzi automatically configures anti-affinity rules to ensure Kafka and ZooKeeper pods are spread across different nodes for high availability.
- Persistent Storage: Using reliable persistent storage (like Rook-Ceph or cloud block storage) is critical for data durability.
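For instance, scaling to five brokers can be a one-line patch. Note that Strimzi rolls the new brokers in, but existing partitions are not rebalanced onto them automatically; Strimzi's Cruise Control integration can handle that:

kubectl patch kafka my-kafka-cluster -n kafka --type merge -p '{"spec":{"kafka":{"replicas":5}}}'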
Conclusion
Building a self-hosted Kafka server on a Red Hat Kubernetes cluster with Strimzi provides a powerful, scalable, and manageable messaging platform. While the initial setup involves a few steps, Strimzi significantly simplifies the ongoing operations, making it an excellent choice for production environments. Always refer to the official Strimzi and Kubernetes documentation for the most up-to-date and detailed information.

