HongZhen Xie: 773383
Dong Gao: 795622
NanJiang Li: 741524
KaiLe Wei: 812381
Chuang Ying: 844566
Figure 1 shows the architecture of the tweet harvesting and scenario analysis application, which consists of the stages listed below.
Figure 1: Architecture
Launching instances on NeCTAR.
Scripted deployment on the instances to set up all necessary software.
Tweet harvesting through the Twitter API and storage on a CouchDB cluster.
Scenario analysis with MapReduce and the necessary libraries.
Presentation on web-based pages.
This folder contains the files used for the launching and deploying process.
File setupbash.sh: a bash script that runs the launching and deploying process.
File dbSetup.yaml: a YAML file for ansible-playbook that deploys the database instances on NeCTAR.
File serSetup.yaml: a YAML file for ansible-playbook that deploys the processing server instances on NeCTAR.
File hosts: stores the instances' configuration information.
File nectarConf.sh: configuration of the NeCTAR project, used by Ansible to connect to the server.
File Connect.key: the private key used for connecting to the system.
Running setupbash.sh automatically launches a new harvest server or database instance and deploys the required software on it.
For how to create a CouchDB cluster, some useful information is available at:
https://medium.com/linagora-engineering/setting-up-a-couchdb-2-cluster-on-centos-7-8cbf32ae619f
A few things to pay attention to:
- Before running this script, make sure that Ansible and OpenStack have been correctly installed.
- Make sure that the environment variable ANSIBLE_HOST_KEY_CHECKING has been set to false. More information: http://stackoverflow.com/questions/32297456/how-to-ignore-ansible-ssh-authenticity-checking
Otherwise, the SSH connections may be refused.
- In this script, the availability zone of a new instance is preset to ‘Melbourne’, which guarantees that the volume and the instance are in the same zone.
If there are not enough hosts available in this zone, the launch process may fail.
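The host key checking note above can be sketched in Python as follows. This is a minimal sketch, not the content of setupbash.sh: the helper names build_playbook_run and run_playbook are hypothetical, and it assumes the playbook is run against the hosts inventory with Connect.key as the private key.

```python
import os
import subprocess


def build_playbook_run(playbook, inventory="hosts", key_file="Connect.key"):
    """Return (command, env) for one ansible-playbook run (hypothetical helper)."""
    env = dict(os.environ)
    # Ansible reads this variable; "False" disables SSH host key checking.
    env["ANSIBLE_HOST_KEY_CHECKING"] = "False"
    command = [
        "ansible-playbook",
        "-i", inventory,             # inventory file listing the instances
        "--private-key", key_file,   # key used for the SSH connection
        playbook,
    ]
    return command, env


def run_playbook(playbook):
    """Run the playbook; raises CalledProcessError if it fails."""
    command, env = build_playbook_run(playbook)
    subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    command, _ = build_playbook_run("dbSetup.yaml")
    print(" ".join(command))
```

Setting the variable only in the child process environment keeps the relaxed host key check limited to the deployment run.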
main.py was written by Chuang Ying.
This Python file harvests tweets: it uses the REST API to find new users and their earlier tweets, uses the Streaming API to collect current relevant tweets, and saves the tweets into CouchDB.
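The storage step can be sketched as below. This is not the code of main.py. It only illustrates one common way to avoid storing the same tweet twice when two harvesting paths feed one database: use the tweet's id_str as the CouchDB document _id, so a second insert is rejected with HTTP 409. The URL, the database name "tweets", and the helper names are assumptions, and authentication is left out.

```python
import json
import urllib.error
import urllib.request

COUCHDB_URL = "http://localhost:5984"  # assumption: local node on the default port
DB_NAME = "tweets"                     # hypothetical database name


def tweet_to_doc(tweet):
    """Copy the tweet and use its id_str as the CouchDB document _id."""
    doc = dict(tweet)
    doc["_id"] = tweet["id_str"]
    return doc


def build_put_request(doc):
    """Build a PUT /{db}/{docid} request without sending it."""
    url = f"{COUCHDB_URL}/{DB_NAME}/{doc['_id']}"
    body = json.dumps(doc).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


def save_tweet(tweet):
    """Return True if stored, False if a document with this id already exists."""
    request = build_put_request(tweet_to_doc(tweet))
    try:
        with urllib.request.urlopen(request) as response:
            return response.status in (201, 202)
    except urllib.error.HTTPError as error:
        if error.code == 409:  # conflict: the tweet was harvested before
            return False
        raise
```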
A series of sentiment scenarios was implemented by Dong Gao, including sentiment score analysis for each city, a Time-Happiness scenario for cities, and a Factors-Happiness scenario for suburbs.
This part consists of five Python files:
sentiment_analysis.py: takes a tweet JSON file as input and outputs a sentiment tag.
coordinate_tweets.py: filters out the tweets that have a coordinate attribute.
reverse_geocode.py: takes a coordinate as input and outputs the corresponding suburb.
suburb_analysis.py: takes a tweet as input and outputs the corresponding suburb and sentiment score.
result_plots.py: draws plots of the final results using the matplotlib library.
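Two of the steps above, selecting tweets with coordinates and averaging scores per suburb, can be sketched as follows. This is an illustration, not the code of the five files. The function names are hypothetical, and the sketch assumes the standard Twitter JSON layout, in which "coordinates" is either null or a GeoJSON point stored as [longitude, latitude].

```python
from collections import defaultdict


def has_coordinates(tweet):
    """True if the tweet carries a GeoJSON point in its 'coordinates' field."""
    point = tweet.get("coordinates")
    return bool(point) and point.get("type") == "Point"


def lon_lat(tweet):
    """Return (longitude, latitude); Twitter stores points in that order."""
    lon, lat = tweet["coordinates"]["coordinates"]
    return lon, lat


def average_scores(scored):
    """Map an iterable of (suburb, score) pairs to {suburb: mean score}."""
    totals = defaultdict(lambda: [0.0, 0])
    for suburb, score in scored:
        totals[suburb][0] += score
        totals[suburb][1] += 1
    return {suburb: total / count for suburb, (total, count) in totals.items()}
```

The (suburb, score) pairs would come from a reverse geocoding step and a sentiment scoring step, which are not shown here.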
Figure 2: Average Sentiment Score
Figure 3: Sentiment Percentage Statistic
Figure 4: Time Score Statistic
The Cultural Integration Scenario was implemented by KaiLe Wei.
This part includes one Python file, cultural.py, and one text file, keywords.txt.
The Python file implements the cultural integration scenario, and the text file stores the keywords it uses.
There are two executable functions: culture_per(tweet,result) for single tweet analysis, and culture_file() for JSON file analysis.
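A keyword-based analysis of a single tweet could look like the sketch below. It is not the implementation in cultural.py. The function names are hypothetical, and the sketch assumes that the keyword file holds one single-word keyword per line and that the result argument is a dictionary of running counts.

```python
import re


def load_keywords(path):
    """Read one keyword per line, lower-cased, ignoring blank lines."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def matched_keywords(text, keywords):
    """Return the keywords that appear as whole words in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return words & keywords


def tally_tweet(tweet, keywords, result):
    """Update the running counts in result with one tweet and return it."""
    hits = matched_keywords(tweet.get("text", ""), keywords)
    result["total"] = result.get("total", 0) + 1
    if hits:
        result["matched"] = result.get("matched", 0) + 1
    return result
```

Matching on whole words avoids counting a keyword that only occurs inside a longer word, at the cost of missing multi-word keywords.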
Figure 5: Chart of Ra and Rb(b)
The Alcohol-Tobacco Scenario was implemented by NanJiang Li.
This part includes one Python file, smoke_drink.py, and one text file, smoke.txt.
The Python file implements the alcohol-tobacco scenario, and the text file stores the keywords it uses.
There are two executable functions: smoke_Drink_per(tweet,result) for single tweet analysis, and smoke_Drink() for JSON file analysis.
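The file-level analysis can be sketched as a single pass over a tweet file. This is an illustration under assumptions, not the code of smoke_drink.py: the function names are hypothetical, the input is assumed to hold one JSON object per line, and a tweet counts as matching when any keyword occurs as a substring of its text.

```python
import json


def iter_tweets(path):
    """Yield tweet dictionaries from a line-delimited JSON file, skipping bad lines."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip().rstrip(",")
            if not line:
                continue
            try:
                tweet = json.loads(line)
            except json.JSONDecodeError:
                continue  # not valid JSON, e.g. a truncated line
            if isinstance(tweet, dict):
                yield tweet


def keyword_rate(path, keywords):
    """Return the share of tweets whose text contains at least one keyword."""
    total = matched = 0
    for tweet in iter_tweets(path):
        total += 1
        text = tweet.get("text", "").lower()
        if any(keyword in text for keyword in keywords):
            matched += 1
    return matched / total if total else 0.0
```

Reading line by line keeps memory use constant, which matters when the harvested file is too large to load at once.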
Figure 6: Female Rate & Negative Rate
This folder contains the code used for building the web interface.
The web interface uses a RESTful framework and is implemented in Python with Django.
To run this web interface:
The Django package is required, and the Apache service should be running.
Before running, add the server's IP address to ALLOWED_HOSTS in the Django settings.
To start the interface, run: python manage.py runserver
The interface can be visited at: 115.146.91.76:8000
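The host setting mentioned above presumably corresponds to Django's ALLOWED_HOSTS list in settings.py. A minimal fragment, assuming the interface is served from the address given above, could look like this; the two local entries are an addition for testing on the server itself.

```python
# settings.py (fragment)
# Hosts that Django will accept in the HTTP Host header.
ALLOWED_HOSTS = [
    "115.146.91.76",  # public address of the instance, as given above
    "localhost",      # assumption: convenient for local testing
    "127.0.0.1",
]
```

When the Django development server is used, it listens only on 127.0.0.1 by default, so it has to be started as python manage.py runserver 0.0.0.0:8000 to be reachable from other machines.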