The system simulates VISA-transaction data management for analytics. It must receive data from an input data stream (Main.py) that simulates real-time data arrival. The data must be distributed to two types of clients with different needs:
- An internal analysis team: they can run geographical queries or use a GIS interface (such as QGIS).
- External analysis companies: they need data streams composed of one or more shop categories matching their interests (see the consumer sketch below).
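For an external company, one way to compose a per-interest stream is to subscribe a single Kafka consumer to several shop-category topics at once. A minimal sketch using kafka-python; the category names "food" and "travel" are hypothetical examples, and the broker address is the one used in the commands below:

    import json
    from kafka import KafkaConsumer

    # Subscribe to every shop-category topic the company is interested in
    # ("food" and "travel" are hypothetical category names).
    consumer = KafkaConsumer(
        "food", "travel",
        bootstrap_servers="192.168.1.28:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )

    for message in consumer:
        # Each message is one transaction belonging to a subscribed category.
        print(message.topic, message.value)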
Start the MongoDB server:
mongod --port 27018 --dbpath /Users/[your-username]/Documents/hackathon/mongo_data --replSet "hackathon"
Connect to server:
mongo --port 27018
Make the MongoDB node primary (in the mongo shell):
rs.initiate()
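To verify the replica set from Python, a minimal sketch with pymongo; the port and replica set name match the commands above, everything else is illustrative:

    from pymongo import MongoClient

    # Connect to the single-node replica set started above.
    client = MongoClient("localhost", 27018, replicaSet="hackathon")

    # replSetGetStatus raises an error if the replica set is not initiated yet.
    status = client.admin.command("replSetGetStatus")
    print(status["myState"])  # 1 means this node is PRIMARY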
Start ZooKeeper:
zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties
Start the Kafka server:
kafka-server-start /usr/local/etc/kafka/server.properties
Create topic:
kafka-topics --create --zookeeper 192.168.1.28:2181 --replication-factor 1 --partitions 1 --topic nome_topic
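Topics can also be created from Python instead of the CLI, which is convenient when one topic per shop category is needed. A sketch using kafka-python's admin client; the category list is a hypothetical example:

    from kafka.admin import KafkaAdminClient, NewTopic

    admin = KafkaAdminClient(bootstrap_servers="192.168.1.28:9092")

    # One topic per shop category (hypothetical category names).
    categories = ["food", "travel", "electronics"]
    admin.create_topics([
        NewTopic(name=c, num_partitions=1, replication_factor=1)
        for c in categories
    ])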
Producer console:
kafka-console-producer --broker-list 192.168.1.28:9092 --topic nome_topic
Consumer console:
kafka-console-consumer --bootstrap-server 192.168.1.28:9092 --topic nome_topic --from-beginning
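The same smoke test can be done from Python instead of the console tools. A minimal sketch with kafka-python; the topic name and payload are placeholders:

    import json
    from kafka import KafkaProducer, KafkaConsumer

    producer = KafkaProducer(
        bootstrap_servers="192.168.1.28:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    producer.send("nome_topic", {"hello": "kafka"})
    producer.flush()

    consumer = KafkaConsumer(
        "nome_topic",
        bootstrap_servers="192.168.1.28:9092",
        auto_offset_reset="earliest",  # same effect as --from-beginning
        consumer_timeout_ms=5000,      # stop iterating after 5 s of silence
    )
    for message in consumer:
        print(message.value)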
Start the producer script:
python /Users/[your-username]/Documents/code_kafka_changestreams/kafkaproducer.py
Start the consumer script:
python /Users/[your-username]/Documents/code_kafka_changestreams/kafkaconsumer.py
data_in
mongo_in
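As a rough idea of the pattern behind kafkaproducer.py in this setup: watch a MongoDB change stream and forward each inserted document to Kafka. A hedged sketch; the database, collection, and topic names here are assumptions, not necessarily the script's actual ones:

    import json
    from pymongo import MongoClient
    from kafka import KafkaProducer

    client = MongoClient("localhost", 27018, replicaSet="hackathon")
    collection = client["hackathon"]["transactions"]  # assumed names

    producer = KafkaProducer(
        bootstrap_servers="192.168.1.28:9092",
        value_serializer=lambda v: json.dumps(v, default=str).encode("utf-8"),
    )

    # Change streams require a replica set, which is why rs.initiate() ran above.
    with collection.watch() as stream:
        for change in stream:
            if change["operationType"] == "insert":
                producer.send("nome_topic", change["fullDocument"])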
You can find the instructions in the following file:
./SQL/postegre_creation.sql
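The SQL file can be applied with psql, or from Python with psycopg2. A minimal sketch; the connection parameters are placeholders for your local setup:

    import psycopg2

    # Connection parameters are assumptions; adjust to your local PostgreSQL.
    conn = psycopg2.connect(dbname="hackathon", user="postgres",
                            password="postgres", host="localhost")
    with conn, conn.cursor() as cur:
        # Execute the whole creation script in one transaction.
        with open("./SQL/postegre_creation.sql") as f:
            cur.execute(f.read())
    conn.close()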
To run the pipeline, start:
- Main.py from its folder (this simulates data arrival)
- postgresql.postgreconsumer.py (reads the Kafka stream)
- postegresql.analysis_society.py (uses the raw interface to execute the 3 given queries)
- mongo_router.router.py (classifies the input stream into shop_category topics; see the sketch after this list)
- mongo_router.receive_interests (loads the given_user stream into MongoDB)
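As a hedged sketch of the routing step performed by mongo_router.router.py: consume the raw input stream and republish each transaction to the topic named after its shop category. The input topic name and the shop_category field name below are assumptions:

    import json
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        "data_in",  # assumed name of the raw input topic
        bootstrap_servers="192.168.1.28:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    producer = KafkaProducer(
        bootstrap_servers="192.168.1.28:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    for message in consumer:
        tx = message.value
        # Route each transaction to the topic of its shop category
        # ("shop_category" is an assumed field name).
        producer.send(tx["shop_category"], tx)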