Apache Kafka is a distributed streaming platform developed by the Apache Software Foundation and written in Java and Scala. Light Platform uses Kafka as the message broker for its event-based frameworks (light-eventuate-4j, light-tram-4j, and light-saga-4j). This document shows how to install Kafka 2.0 on Ubuntu 18.04 VMs to form a three-node cluster.
You need three KVM boxes with 32GB of memory each and sudo access to all the VMs. In the following steps, we use test1, test2, and test3 as the host names. First, let’s log in to test1 and install everything.
Install OpenJDK 8
Update and upgrade packages
sudo apt update
sudo apt upgrade
sudo apt install openjdk-8-jdk -y
In this step, we will install Apache Kafka using the binary files that can be downloaded from the Kafka website, then configure it to run as a non-root user.
Add a new user named kafka
sudo useradd -d /opt/kafka -s /bin/bash kafka
sudo passwd kafka
Go to /opt and download Kafka.
cd /opt
sudo wget http://www.apache.org/dist/kafka/2.0.0/kafka_2.11-2.0.0.tgz
Create a new kafka folder
sudo mkdir -p /opt/kafka
Extract the kafka_*.tgz file to the kafka directory and change the owner of the directory to the kafka user and group.
Now log in as the kafka user and edit the zookeeper.properties configuration.
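The commands for this step are not listed. A minimal sketch follows, assuming the archive from the download step sits in /opt and that Kafka should live directly under /opt/kafka, so that /opt/kafka/bin exists. The function name extract_kafka is hypothetical, and the commands are wrapped in a function so that nothing runs until you call it.

```shell
# Assumption: /opt/kafka_2.11-2.0.0.tgz was downloaded in the previous step
# and /opt/kafka already exists.
extract_kafka() {
  # --strip-components=1 drops the archive's top-level kafka_2.11-2.0.0/ directory,
  # so bin/ and config/ end up directly under /opt/kafka.
  sudo tar -xzf /opt/kafka_2.11-2.0.0.tgz -C /opt/kafka --strip-components=1
  # Hand the whole tree to the kafka user and group.
  sudo chown -R kafka:kafka /opt/kafka
}
```

Call extract_kafka once on each node after the download has finished.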
su - kafka
#add here more servers if you want
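The contents of zookeeper.properties are not shown here. As a rough sketch, a three-node ensemble file could be generated as below. The dataDir matches the /var/zookeeper directory used in this guide; the tickTime, initLimit, and syncLimit values and the 2888:3888 peer ports are common defaults and are assumptions, not settings taken from this guide. The function name zookeeper_properties is hypothetical.

```shell
# Print a zookeeper.properties for the hosts given as arguments.
zookeeper_properties() {
  local id host
  echo "dataDir=/var/zookeeper"   # directory that also holds the myid file
  echo "clientPort=2181"          # port used by the zookeeper-shell command in this guide
  echo "tickTime=2000"            # assumed value, in milliseconds
  echo "initLimit=10"             # assumed value, in ticks
  echo "syncLimit=5"              # assumed value, in ticks
  id=1
  for host in "$@"; do
    # server.N must match the number stored in that host's myid file.
    echo "server.${id}=${host}:2888:3888"
    id=$((id + 1))
  done
}

zookeeper_properties test1 test2 test3
```

Redirecting the output to config/zookeeper.properties in the Kafka directory on every node would give all three hosts the same ensemble definition.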
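Now create a myid file under /var/zookeeper on each host and write a unique id into it. The id must match the host's server.N entry in zookeeper.properties. Run only one of the following lines on each host, so that test1, test2, and test3 get the ids 1, 2, and 3 respectively.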
echo "1" > /var/zookeeper/myid
echo "2" > /var/zookeeper/myid
echo "3" > /var/zookeeper/myid
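Because the host names in this guide end in the node number, the id can also be derived from the host name instead of being typed by hand. This is only a sketch under that naming assumption, and the helper name myid_for_host is hypothetical.

```shell
# Map a short host name such as "test2" to its id by keeping only the trailing digits.
myid_for_host() {
  local host="$1"
  # Remove the longest prefix that ends in a non-digit character.
  echo "${host##*[!0-9]}"
}

# Example use on a node (not executed here):
#   myid_for_host "$(hostname -s)" > /var/zookeeper/myid
```

The helper prints an empty line for a host name without trailing digits, so check the file before starting Zookeeper.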
Now edit the server.properties configuration.
su - kafka
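Update the following configuration. Keep each comment on its own line: in a properties file, a # that follows a value is read as part of the value, not as a comment.
# for the test1 host, increase for test2 and test3
broker.id=0
# directory we created in the previous step
log.dirs=/opt/kafka-logs
# change the IP for test2 and test3
advertised.listeners=PLAINTEXT://18.104.22.168:9092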
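Since only the broker id and the advertised IP differ between nodes, the per-node settings can be sketched as a small generator. The zookeeper.connect line is an assumption (the three host names on client port 2181) and is not part of the settings listed in this guide; the function name broker_overrides is hypothetical.

```shell
# Print the per-node server.properties settings for a broker id and an IP address.
broker_overrides() {
  local broker_id="$1" ip="$2"
  printf 'broker.id=%s\n' "$broker_id"
  printf 'log.dirs=/opt/kafka-logs\n'
  printf 'advertised.listeners=PLAINTEXT://%s:9092\n' "$ip"
  # Assumption: every broker reaches the ensemble through these host names.
  printf 'zookeeper.connect=test1:2181,test2:2181,test3:2181\n'
}

broker_overrides 0 18.104.22.168
```

On test2 and test3, the same call with the next broker id and that host's IP produces the matching values.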
Configure Kafka and Zookeeper as Services
Exit the kafka session and go to the ‘/lib/systemd/system’ directory and create a new service file ‘zookeeper.service’.
sudo vi zookeeper.service
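The contents of the service file are not shown. A minimal sketch of a unit that runs the bundled Zookeeper as the kafka user could look like the following, assuming Kafka is installed directly under /opt/kafka. The Description text and the Restart policy are illustrative choices, and the function name zookeeper_unit is hypothetical.

```shell
# Print a systemd unit for the Zookeeper instance bundled with Kafka.
zookeeper_unit() {
  cat <<'EOF'
[Unit]
Description=Zookeeper bundled with Apache Kafka
After=network.target

[Service]
Type=simple
User=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

zookeeper_unit
```

A kafka.service file would follow the same pattern, with kafka-server-start.sh and config/server.properties in ExecStart and an After=zookeeper.service ordering. After saving the files, sudo systemctl daemon-reload followed by sudo systemctl enable and sudo systemctl start for each service brings them up.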
You should see the producer input immediately, starting from the beginning.
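The test behind this observation is not spelled out. One way to run it with the scripts shipped in Kafka 2.0 is sketched below, assuming Kafka lives under /opt/kafka and using a hypothetical topic named test. The function names are hypothetical too, and nothing runs until a function is called.

```shell
KAFKA_HOME="/opt/kafka"   # assumed install location
TOPIC="test"              # hypothetical topic name

# Create a topic that is replicated to all three brokers.
create_topic() {
  "${KAFKA_HOME}/bin/kafka-topics.sh" --create --zookeeper localhost:2181 \
    --replication-factor 3 --partitions 1 --topic "${TOPIC}"
}

# Type messages into this terminal; each line becomes one record.
start_producer() {
  "${KAFKA_HOME}/bin/kafka-console-producer.sh" --broker-list localhost:9092 \
    --topic "${TOPIC}"
}

# Run on another node; --from-beginning replays records already in the topic.
start_consumer() {
  "${KAFKA_HOME}/bin/kafka-console-consumer.sh" --bootstrap-server localhost:9092 \
    --topic "${TOPIC}" --from-beginning
}
```

Running create_topic once, then start_producer on one node and start_consumer on another, shows whether messages flow across the cluster.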
Both the Zookeeper and Kafka clusters are now running on three nodes, and they are exposed to the Internet. We need to secure both.
For now, we are going to block access to Zookeeper and Kafka from the Internet. Let’s enable the firewall with ufw, allowing ssh first so that the remote session is not locked out.
sudo ufw status
sudo ufw allow ssh
sudo ufw enable
sudo ufw allow from 22.214.171.124
sudo ufw allow from 126.96.36.199
sudo ufw allow from 188.8.131.52
This means that the cluster can only be accessed from these three nodes. Applications must either be deployed to the three nodes, or the firewall must be opened individually for each host that runs them.
To check how many brokers are in the cluster, switch to the kafka user and run the following command. The output should include the list of registered broker ids.
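The rules above open every port to the listed addresses. A narrower variant restricts each source to the ports it needs. The sketch below only prints the ufw commands so that they can be reviewed first. It assumes the conventional Zookeeper ports 2181, 2888, and 3888, the Kafka port 9092, and an application that talks only to the brokers; the function names are hypothetical.

```shell
# Print port-scoped rules for cluster members (Zookeeper and Kafka ports).
peer_rules() {
  local ip
  for ip in "$@"; do
    echo "sudo ufw allow proto tcp from ${ip} to any port 2181,2888,3888,9092"
  done
}

# Print a rule for an application host that only needs the broker port.
client_rule() {
  echo "sudo ufw allow proto tcp from $1 to any port 9092"
}

peer_rules 22.214.171.124 126.96.36.199 188.8.131.52
```

Calling client_rule with the address of an application host prints the one extra rule needed to let that host reach the brokers.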
su - kafka
bin/zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"