Lambda Architecture
This project was carried out by Iheb KOTORSI & Dhia ZNAIDI and is part of the EU H: Advanced Big Data Architectures of our studies at IMT Atlantique.
Getting started
During this project, we used a cluster of Raspberry Pi 4 nodes.
If you want to deploy the code on your own cluster, please change the IP addresses in the code accordingly.
Starting Kafka broker
$ zookeeper-server-start.sh /opt/kafka/conf/zookeeper.properties
$ kafka-server-start.sh /opt/kafka/conf/server.properties
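Both commands above run in the foreground, which is inconvenient over a single SSH session. As a minimal sketch, the two start scripts also accept a `-daemon` flag that runs them in the background. The function name `start_kafka_stack`, the `KAFKA_CONF` variable and the 5-second pause are assumptions made for illustration, not part of the project.

```shell
# Directory holding the two properties files (same path as in the commands above).
KAFKA_CONF="${KAFKA_CONF:-/opt/kafka/conf}"

# Hypothetical helper: start ZooKeeper, then the Kafka broker, both in the background.
start_kafka_stack() {
    zookeeper-server-start.sh -daemon "$KAFKA_CONF/zookeeper.properties"
    # Arbitrary pause so ZooKeeper is accepting connections before the broker starts.
    sleep 5
    kafka-server-start.sh -daemon "$KAFKA_CONF/server.properties"
}
```

Call `start_kafka_stack` once the Kafka scripts are on your PATH.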
Starting Apache Spark
$ $SPARK_HOME/sbin/start-all.sh
Starting Apache HDFS
$ start-dfs.sh
Scheduling the Spark script with crontab
$ crontab -e
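`crontab -e` opens your crontab in an editor, where each line is a schedule followed by a command. The sketch below prints one possible entry that you can paste in. The hourly schedule, the script path `/home/pi/lambda/batch_job.py` and the log file are hypothetical placeholders, not the values used in this project.

```shell
# Resolve spark-submit to an absolute path now, because cron jobs usually
# run without SPARK_HOME in their environment. /opt/spark is an assumed fallback.
SPARK_SUBMIT="${SPARK_HOME:-/opt/spark}/bin/spark-submit"

# Hypothetical locations of the batch script and its log file.
JOB_SCRIPT="/home/pi/lambda/batch_job.py"
JOB_LOG="/tmp/batch_job.log"

# Fields: minute hour day-of-month month day-of-week, then the command.
# "0 * * * *" runs the job at minute 0 of every hour.
CRON_LINE="0 * * * * $SPARK_SUBMIT $JOB_SCRIPT >> $JOB_LOG 2>&1"

echo "$CRON_LINE"
```

Add any `spark-submit` options your script needs (for example the master URL) before the script path.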
Starting Cassandra
$ cassandra -f
$ cqlsh your_cluster_ip_address
cqlsh> CREATE KEYSPACE test WITH replication = {'class': your_strategy , 'replication_factor': your_replication_factor };
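For a small test cluster, the placeholders above could be filled in as follows. This is only a sketch: `SimpleStrategy` with a replication factor of 1 is an illustrative choice, not necessarily the setting used in this project, and the `CASSANDRA_HOST` variable and `create_keyspace` function are hypothetical names. `IF NOT EXISTS` is added so that the command can be re-run safely.

```shell
# Address of one Cassandra node; replace with your_cluster_ip_address.
CASSANDRA_HOST="${CASSANDRA_HOST:-127.0.0.1}"

# Illustrative values: SimpleStrategy, one replica per row.
CQL="CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"

# Hypothetical helper: cqlsh -e executes the given statement and exits.
create_keyspace() {
    cqlsh "$CASSANDRA_HOST" -e "$CQL"
}
```

Call `create_keyspace` once Cassandra is up. With several nodes you would typically raise the replication factor.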
Launching the Twitter API v2
Having obtained your tokens, you can launch the .ipynb file and start