- Launch an EC2 instance, create a new key pair, and download the .pem key file into a folder.
- Open 4 terminals and navigate into that folder in all 4.
- SSH into the EC2 instance in all 4 terminals (the command is shown in AWS under {instance_name} -> Connect -> SSH client)
i. chmod 400 "{key-pair-name}.pem" <br />
ii. ssh -i "{key-pair-name}.pem" ec2-user@ec2-{ip_address}.compute-1.amazonaws.com
- Download Kafka onto the EC2 instance in one of the terminals
i. Go to https://kafka.apache.org/downloads <br />
ii. Right-click on the latest version of Kafka and copy the link address <br />
iii. wget {copied address} <br />
iv. ls (shows the compressed Kafka .tgz archive inside the instance) <br />
v. tar -xvf kafka_{version}.tgz
- Install Java on the EC2 instance
i. sudo yum search all java-1.8.0 <br />
ii. sudo yum install java-1.8.0-amazon-corretto.x86_64 (not the -devel package) <br />
iii. java -version (to check version number)
- In all 4 terminals, cd kafka_{version} (run ls to see the exact directory name)
- By default, the Kafka server advertises the instance's private address, which clients outside AWS cannot reach. Change server.properties so that it advertises the public IP
i. sudo nano config/server.properties <br />
ii. Uncomment advertised.listeners and change its address to the public IP address of the instance
(found in AWS under {instance_name} -> Public IPv4 address) <br />
advertised.listeners=PLAINTEXT://{public_ip}:9092
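
The manual edit above can also be scripted. The following is a minimal sketch, not part of the original steps: the function name `set_advertised_listener` is hypothetical, and it works on the text of server.properties only, so reading and writing the file is left to you. It assumes the setting appears at most once, either commented out or active.

```python
import re


def set_advertised_listener(properties_text: str, public_ip: str, port: int = 9092) -> str:
    """Return properties_text with advertised.listeners pointing at public_ip."""
    new_line = f"advertised.listeners=PLAINTEXT://{public_ip}:{port}"
    # Match the setting whether or not it is commented out with a leading '#'.
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*advertised\.listeners[ \t]*=.*$", re.MULTILINE
    )
    if pattern.search(properties_text):
        # Replace only the first match so every other line stays untouched.
        return pattern.sub(lambda _match: new_line, properties_text, count=1)
    # The key is missing entirely: append it on its own line.
    needs_newline = bool(properties_text) and not properties_text.endswith("\n")
    return properties_text + ("\n" if needs_newline else "") + new_line + "\n"


# Example input modelled on the commented-out line described in the step above.
# 203.0.113.10 is a documentation-only address, not a real instance.
sample = (
    "listeners=PLAINTEXT://:9092\n"
    "#advertised.listeners=PLAINTEXT://your.host.name:9092\n"
)
print(set_advertised_listener(sample, "203.0.113.10"))
```

Restart the Kafka server after changing the file so the new value takes effect.
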
- Change the security group of the EC2 instance by adding an inbound rule that lets our local machine connect
i. In AWS, go to {instance_name} -> Security -> security groups link -> Edit inbound rules ->
Add rule: Type = All traffic, Source = Anywhere-IPv4
- Terminal 1: Start ZooKeeper
i. bin/zookeeper-server-start.sh config/zookeeper.properties
- Terminal 2: Start Kafka server
i. export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M" <br />
ii. bin/kafka-server-start.sh config/server.properties
If the server fails to start, delete the old Kafka log directory (/tmp/kafka-logs with the default configuration) and start it again
i. cd /tmp <br />
ii. rm -r kafka-logs
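
Once the server is running, it can help to confirm from your local machine that the inbound rule and the public address actually let you reach port 9092. This is a small sketch using only the Python standard library. The helper name `broker_reachable` is hypothetical, and a successful TCP connection shows only that the port is open, not that Kafka is healthy.

```python
import socket


def broker_reachable(host: str, port: int = 9092, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Usage (replace the placeholder with the instance's public IPv4 address):
#   print(broker_reachable("{public_ip}"))
```

If this returns False while the server is running, the security group rule and the advertised.listeners value are the first things to recheck.
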
- Terminal 3: Create the topic and start the producer
i. bin/kafka-topics.sh --create --topic {topic_name} --bootstrap-server {Public Ipv4 Addr}:9092 --replication-factor 1 --partitions 1 <br />
ii. bin/kafka-console-producer.sh --topic {topic_name} --bootstrap-server {Public Ipv4 Addr}:9092
- Terminal 4: Start the consumer
i. bin/kafka-console-consumer.sh --topic {topic_name} --bootstrap-server {Public Ipv4 Addr}:9092
- Open a new terminal and start Jupyter Notebook <br />
jupyter notebook
- Create 2 new Python notebooks and name them "KafkaProducer" and "KafkaConsumer"
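
As a starting point for the two notebooks, here is a hedged sketch. It assumes the third-party kafka-python package (`pip install kafka-python`), which these steps do not name, so treat that choice as an assumption. The helper names and the example record fields are hypothetical. Messages are encoded as JSON, which is only one possible format.

```python
import json


def serialize(record: dict) -> bytes:
    """Encode a dict as UTF-8 JSON bytes, since Kafka transports raw bytes."""
    return json.dumps(record).encode("utf-8")


def deserialize(payload: bytes) -> dict:
    """Decode UTF-8 JSON bytes back into a dict."""
    return json.loads(payload.decode("utf-8"))


def make_producer(bootstrap_server: str):
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    return KafkaProducer(
        bootstrap_servers=[bootstrap_server],
        value_serializer=serialize,
    )


def make_consumer(topic: str, bootstrap_server: str):
    from kafka import KafkaConsumer

    return KafkaConsumer(
        topic,
        bootstrap_servers=[bootstrap_server],
        value_deserializer=deserialize,
    )


# In the "KafkaProducer" notebook (placeholders as in the steps above):
#   producer = make_producer("{Public Ipv4 Addr}:9092")
#   producer.send("{topic_name}", value={"symbol": "EXAMPLE", "price": 1.0})
#   producer.flush()
#
# In the "KafkaConsumer" notebook:
#   for message in make_consumer("{topic_name}", "{Public Ipv4 Addr}:9092"):
#       print(message.value)
```

Messages sent this way should also show up in the console consumer in Terminal 4, which is a quick way to check the notebook against the setup above.
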