GitHub - michellejohnstone/e-mission-server: The backend (server) code for the e-mission server

e-mission is a project to gather data about user travel patterns using phone apps, and use them to provide an personalized carbon footprint, and aggregate them to make data available to urban planners and transportation engineers.

It has two components, the backend server and the phone apps. This is the backend server - the phone apps are available in the e-mission-phone repo

The current build status is:

The backend in turn consists of two parts - a summary of their code structure is shown below. - The webapp supports a REST API, and accesses data from the database to fulfill the queries. A set of background scripts pull the data from external sources, and preprocessing results ensures reasonable performance.

Dependencies:

Database:

Install Mongodb
Windows: mongodb appears to be installed as a service on Windows devices and it starts automatically on reboot
OSX: You want to install homebrew and then use homebrew to install mongodb. Follow these instruction on how to do so ---> (http://docs.mongodb.org/manual/tutorial/install-mongodb-on-os-x/)
Ubuntu: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/
Start it at the default port

$ mongod

Python distribution

We will use a distribution of python that is optimized for scientific computing. The anaconda distribution is available for a wide variety of platforms and includes the python scientific computing libraries (numpy/scipy/scikit-learn) along with native implementations for performance. Using the distribution avoids native library inconsistencies between versions.

The distribution also includes its own version of pip, and a separate package management tool called 'conda'.

Make sure you install Python 2.7 because some libraries used in this code repository do not support Python 3.5 yet.

After you install the anaconda distribution, please ensure that it is in your path, and you are using the anaconda versions of common python tools such as python and pip, e.g.

$ which python
/Users/shankari/OSS/anaconda/bin/python

$ which pip
/Users/shankari/OSS/anaconda/bin/pip

Python dependencies:

$ pip install -r requirements.txt 
# If you are running this in production over SSL, copy over the cherrypy-wsgiserver
$ cp api/wsgiserver2.py <dist-packages>/cherrypy/wsgiserver/wsgiserver2.py

Javascript dependencies

Run "bower install" instead if you are prompted password for 'https://github.com' after running "bower update".

$ cd webapp
$ bower update

Development:

In order to test out changes to the webapp, you should make the changes locally, test them and then push. Then, deployment is as simple as pulling from the repo to the real server and changing the config files slightly.

Here are the steps for doing this:

On OSX, start the database (Note: mongodb appears to be installed as a service on Windows devices and it starts automatically on reboot).
```
 $ mongod
```

Copy the following sample files. You should also configure the servers and keys in them if you wish to test the associated features, but can leave them filled with dummy values if you don't.

 # For the location -> name reverse lookup. Client will lookup if not populated.
 $ cp conf/net/ext_service/nominatim.json.sample conf/net/ext_service/nominatim.json
 # Store entries to the stats database. Currently required, dependency should be removed soon
 $ cp conf/net/int_service/giles_conf.json.sample conf/net/int_service/giles_conf.json
 # Game integration.
 $ cp conf/net/ext_service/habitica.json.sample conf/net/int_service/habitica.json

Start the server

 $ ./e-mission-py.bash emission/net/api/cfc_webapp.py

Test your connection to the server

Using a web browser, go to http://localhost:8080
Using the iOS emulator, connect to http://localhost:8080
Using the android emulator:
- change server.host in conf/net/api/webserver.conf to 0.0.0.0, and
- connect the app to the special IP for the current host in the android emulator - 10.0.2.2

Loading test data

You may also want to load some test data.

Sample timeline data from the data collection eval is available in the data-collection-eval repo.
You can choose to load either android data results_dec_2015/ucb.sdb.android.{1,2,3}/timeseries/* or iOS data results_dec_2015/ucb.sdb.ios.{1,2,3}/timeseries/*
Data is loaded using the bin/debug/load_timeline_for_day_and_user.py. It requires a timeline file and a user that the timeline is being loaded as. If you wish to view this timeline in the UI after processing it, you need to login with this email.
```
     $ cd ..../e-mission-server
     $ ./e-mission-py.bash bin/debug/load_timeline_for_day_and_user.py /tmp/data-collection-eval/results_dec_2015/ucb.sdb.android.1/timeseries/active_day_2.2015-11-27 shankari@eecs.berkeley.edu
```
Note that loading the data retains the object IDs. This means that if you load the same data twice with different user IDs, then only the second one will stick. In other words, if you load the file as user1@foo.edu and then load the same file as user2@foo.edu, you will only have data for user2@foo.edu in the database. This can be overwritten using the --make-new flag - e.g.
```
     $ ./e-mission-py.bash bin/debug/load_timeline_for_day_and_user.py -n /tmp/data-collection-eval/results_dec_2015/ucb.sdb.android.1/timeseries/active_day_2.2015-11-27 shankari@eecs.berkeley.edu
```

Creating fake user data

You may need a larger or more diverse set of data than the given test data supplies. To create it you can run the trip generation script included in the project.

The script works by creating random noise around starting and ending points of trips.

You can fill out options for the new data in emission/simulation/input.json. The different options are as follows

radius - the number of kilometers of randomization around starting and ending points (the amount of noise)
starting centroids - addresses you want trips to start around, as well as a weight defining the relative probability a trip will start there
ending centroids - addresses you want trips to end around, as well as a weight defining the relative probability a trip will end there
modes - the relative probability a user will take a trip with the given mode
number of trips - the amount of trips the simulation should create

run the script with

$ python emission/simulation/trip_gen.py <user_name>

Because this user data is specifically designed to test our tour model creation, you can create fake tour models easily by running the make_tour_model_from_fake_data function in emission/storage/decorations/tour_model_queries.py

Running the analysis pipeline

Once you have loaded the timeline, you probably want to segment it into trips and sections, smooth the sections, generate a timeline, etc. We have a unified script to do all of those, called the intake pipeline. You can run it like this.

$ ./e-mission-py.bash bin/intake_stage.py

Once the script is done running, places, trips, sections and stops would have been generated and stored in their respective mongodb tables, and the timelines for the last 7 days have been stored in the usercache.

We also do some modelling on the generated data. This is much more time-intensive than the intake, but also does not need to run at the same frequency as the intake pipeline. So it is pulled out to its own pipeline. If you want to work on the modelling, you need to run this pipeline as well.

$ ./e-mission-py.bash bin/model_stage.py

Running unit tests

Make sure that the anaconda python is in your path

 $ which python
 /Users/shankari/OSS/anaconda/bin/python

Run all tests.
```
 $ ./runAllTests.sh
```
If you get import errors, you may need to add the current directory to PYTHONPATH.
```
 $ PYTHONPATH=. ./runAllTests.sh
```

Analysis

Several exploratory analysis scripts are checked in as ipython notebooks into emission/analysis/notebooks. All data in the notebooks is from members of the research team who have provided permission to use it. The results in the notebooks cannot be replicated in the absence of the raw data, but they can be run on data collected from your own instance as well.

The notebooks are occasionally modified and simplified as code is moved out of them into utility functions. Original versions of the notebooks can be obtained by looking at other notebooks with the same name, or by looking at the history of the notebooks.

JS Testing

From the webapp directory

$ npm install karma --save-dev
$ npm install karma-jasmine karma-chrome-launcher --save-dev

Write tests in www/js/test If you're interested in having karma in your path and globally set, run

$ npm install -g karma-cli

To run tests if you have karma globally set, run

$ karma start my.conf.js

in the webapp directory. If you didn't run the -g command, you can run tests with

$ ./node_modules/karma/bin/karma start

in the webapp directory

TROUBLESHOOTING:

If a python execution fails to import a module, make sure to add current directory to your PYTHONPATH.
If starting the server gives a CONNECTION_ERROR, make sure MongoDB is actively running when you attempt to start the server.
After running MongoDB, if you get an error that says dbpath does not exist (on Windows) or Data directory /data/db not found (on Mac), make sure to manually create the data directory as follows.
```
 on Windows
 % md c:\data\db\  
```

or

    on Mac (the user account running mongod must have read and write permissions for the data directory)
    $ mkdir -p /data/db
    $ chmod 777 /data/db

Design decisions:

This site is currently designed to support commute mode tracking and aggregation. There is a fair amount of backend work that is more complex than just reading and writing data from a database. So we are not using any of the specialized web frameworks such as django or rails.

Instead, we have focused on developing the backend code and exposing it via a simple API. I have maintained separation between the backend code and the API glue so that we can swap out the API glue later if needed.

The API glue is currently Bottle, which is a single file webapp framework. I chose Bottle because it was simple, didn't use a lot of space, and because it wasn't heavy weight, could easily be replaced with something more heavyweight later.

The front-end is javascript based. In order to be consistent with the phone, it also uses angular + ionic. javascript components are largely managed using bower.

Deployment:

If you want to use this for anything other than deployment, you should really run it over SSL. In order to make the development flow smoother, if the server is running over HTTP as opposed to HTTPS, it has no security. The JWT basically consists of the user email in plain text. This means that anybody who knows a users' email access can download their detailed timeline. This is very bad.

If you are using this to store real, non-test data, use SSL right now

If you are running this in production, you should really run it over SSL. We use cherrypy to provide SSL support. The default version of cherrypy in the anaconda distribution had some issues, so I've checked in a working version of the wsgiserver file.

TODO: clean up later

$ cp api/wsgiserver2.py <dist-packages>/cherrypy/wsgiserver/wsgiserver2.py

Also, now that we decode the JWTs locally, we need to use the oauth2client, which requires the PyOpenSSL library. This can be installed on ubuntu using the python-openssl package, but then it is not accessible using the anaconda distribution. In order to enable it for the conda distribution as well, use

$ conda install pyopenssl

Also, installing via requirements.txt does not appear to install all of the requirements for the google-api-client. If you get mysterious "Invalid token" errors for tokens that are correctly validated by the backup URL method, try to uninstall and reinstall with the --update option.

$ pip uninstall google-api-python-client
$ pip install --upgrade google-api-python-client

If you are running the server on shared, cloud infrastructure such as AWS, then note that the data is accessible by AWS admins by directly looking at the disk. In order to avoid this, you want to encrypt the disk. You can do this by:

using an encrypted EBS store, but this doesn't appear to allow you to specify your own encryption key
using a normal drive that is encrypted using cryptfs (http://sleepyhead.de/howto/?href=cryptpart, https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_a_non-root_file_system). The standard AWS ubuntu AMI appears to have LUKS enabled, so you can follow the instructions with LUKS.

ubuntu@ip-10-203-173-119:/home/e-mission$ cryptsetup --help | grep luks
                                      for luksFormat
  -M, --type=STRING                   Type of device metadata: luks, plain,
...

In either of these cases, you need to reconfigure mongod.conf to point to data and log directories in the encrypted volume.

Getting keys

If you are associated with the e-mission project and will be integrating with our server, then you can get the key files from: https://repo.eecs.berkeley.edu/git/users/shankari/e-mission-keys.git

If not, please get your own copies of the following keys:

Google Developer Console (stored in conf/net/keys.json)
- iOS key (ios_client_key)
- webApp key (client_key)
Parse (coming soon)

Name		Name	Last commit message	Last commit date
Latest commit History 1,866 Commits
bin		bin
conf		conf
docs		docs
emission		emission
figs		figs
front		front
webapp		webapp
.gitignore		.gitignore
License.txt		License.txt
OpenSourceLicenses.txt		OpenSourceLicenses.txt
README.md		README.md
Timeseries_Sample.ipynb		Timeseries_Sample.ipynb
e-mission-ipy.bash		e-mission-ipy.bash
e-mission-py.bash		e-mission-py.bash
modeify.md		modeify.md
requirements.txt		requirements.txt
runAllTests.bat		runAllTests.bat
runAllTests.sh		runAllTests.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Dependencies:

Database:

Python distribution

Python dependencies:

Javascript dependencies

Development:

Loading test data

Creating fake user data

Running the analysis pipeline

Running unit tests

Analysis

JS Testing

TROUBLESHOOTING:

Design decisions:

Deployment:

Getting keys

About

Releases

Packages

Languages

License

michellejohnstone/e-mission-server

Folders and files

Latest commit

History

Repository files navigation

Dependencies:

Database:

Python distribution

Python dependencies:

Javascript dependencies

Development:

Loading test data

Creating fake user data

Running the analysis pipeline

Running unit tests

Analysis

JS Testing

TROUBLESHOOTING:

Design decisions:

Deployment:

Getting keys

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages