
Deploying the application on a server

Introduction

This document illustrates the steps to configure and install the dependencies required to run the vlabs-analytics-service application.

Install dependent Python packages

Here we use the setuptools module from the standard library to write a setup.py file, which installs all the Python library dependencies required to run the application.

from setuptools import setup

requires = [
    'flask',
    'flask-cors',
    'flask-testing',
    'requests',
    'pyyaml',
    'GitPython',
    'gunicorn'
]

setup(
    name='vlabs-analytics-service',
    version='v0.0.1',
    install_requires=requires
)
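To install these dependencies, run python setup.py install from the directory that contains setup.py, as is done in the development-environment steps below.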

WSGI configuration

  • The Web Server Gateway Interface (WSGI) is a specification for a simple and universal interface between web servers and web applications or frameworks for the Python programming language.
  • This application runs behind the nginx web server.
  • The following code snippet in wsgi.py connects nginx to the application built with Flask, Python's micro web framework.
import sys, os

sys.path.insert(0, "/usr/share/nginx/html/")
from runtime.rest.app import create_app
from runtime.config import flask_app_config as config

application = create_app(config)
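For a quick local check, the same application object can also be served with Flask's built-in development server instead of gunicorn (a minimal sketch, assuming create_app returns a standard Flask app and that this is run from the directory containing wsgi.py):

from wsgi import application

# Serve the WSGI app with Flask's development server (development use only)
if __name__ == "__main__":
    application.run(host="127.0.0.1", port=5000)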

Make analytics a service

description "Gunicorn application server running analytics-service"

start on runlevel [2345]
stop on runlevel [!2345]

respawn
setuid root
setgid www-data

chdir /usr/share/nginx/html/deployment
exec gunicorn --workers 3 --bind unix:analytics-service.sock -m 007 wsgi
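This snippet is an Upstart job definition; saved as, for example, /etc/init/analytics-service.conf (an assumed path), the service can be managed with sudo service analytics-service start and sudo service analytics-service stop. The --bind unix:analytics-service.sock option makes gunicorn listen on a Unix socket inside the chdir directory, which the nginx configuration below proxies to.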

Nginx socket configuration

server {
    listen 80;
    server_name localhost;

    location / {
        include proxy_params;
        proxy_pass http://unix:/usr/share/nginx/html/deployment/analytics-service.sock;
    }
}
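Note that the socket path in proxy_pass must match the socket created by the gunicorn --bind option above: the service chdirs to /usr/share/nginx/html/deployment, so the relative unix:analytics-service.sock resolves to /usr/share/nginx/html/deployment/analytics-service.sock. After placing this server block in the nginx configuration, reload nginx for it to take effect.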

Run vlabs-analytics-service in a development environment

  1. Install pip and nginx server
    sudo apt-get update
    sudo apt-get install python-pip python-dev nginx
        
  2. Install the virtualenv package
    sudo pip install virtualenv
        
  3. Create a virtual environment for python
    virtualenv analytics-service
        
  4. Activate the virtual environment
    source analytics-service/bin/activate
        
  5. Clone the repository
    git clone https://github.com/vlead/vlabs-analytics-service
        
  6. Check out the develop branch and build the sources
    cd vlabs-analytics-service
    git checkout develop
    make readtheorg=true
        
  7. Install pre-requisites inside the virtual environment
    cd build/code/deployment
    python setup.py install
        
  8. Export PYTHONPATH as build/code to run the application
    cd build/code
    export PYTHONPATH=$(pwd)
        
  9. Configure application variables in runtime/config/system_config.py
    # Application URL
    APP_URL = "http://localhost:5000"
        
    # Configure key
    KEY = "defaultkey"
        
    # Lab Data Service URL
    LDS_URL = "http://lds.vlabs.ac.in"
        
    # Analytics database (i.e. elasticsearch) URL
    ANALYTICS_DB_URL = "http://192.168.33.3"
        
    # Analytics database (i.e. elasticsearch) indexes & doc_types to store the
    # analytics data
    ## Index to store vlabs analytics
    VLABS_USAGE = "vlabs"
        
    ## Types to store openedx & nonopenedx usages
    OPENEDX_USAGE = "openedx_usage"
    NONOPENEDX_USAGE = "nonopenedx_usage"
        
    # Path to the analytics file of nonopenedx labs, which is copied from the
    # stats.vlabs.ac.in server
    NONOPENEDX_USAGE_INFO_FILE_PATH = "/home/sripathi/output.txt"
        
    ### Credentials to analytics-db
    USER="username"
    PASSWORD="password"
        
  10. Run the flask application server
    cd build/code/runtime/rest
    python app.py
        
  11. Access the application in a browser
    firefox http://localhost:5000
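Once the server is up, a quick smoke test can also be run from Python (a sketch; the exact endpoints and expected responses depend on the service's REST API):

import requests

# Expect an HTTP response from the locally running flask server
r = requests.get("http://localhost:5000")
print(r.status_code)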
        

Integration with other services

vlabs-analytics-service aggregates analytics data from different services. This is achieved by setting up a cron job on each service that pushes its analytics to the analytics-db via REST APIs at regular intervals; the subsections below describe this setup for each service.

stats-server

  1. This server contains all the analytics (usage, hits and visits) of the labs running on the nonopenedx platform
  2. Usage, hits and visits of labs running on the nonopenedx platform are processed by an Erlang program, and the resulting statistics are stored in the **output.txt** file on the **stats.vlabs.ac.in** server; this happens at a regular interval of every 2 hours
  3. Every line in the output.txt file has the following format (see the parsing sketch after this list)
    lab_id, lab_name, hits, visits, usages
        
  4. Location of output.txt on the server
    cd /root/
        
  5. Set up a cron job (a system-crontab entry, hence the root user field) that copies the source file /root/output.txt on the stats.vlabs.ac.in server to the destination path /root/nonopenedx-usage.txt on the vlabs-analytics-service server every 3 hours
    0 */3 * * * root rsync -avz /root/output.txt root@vlabs-analytics.vlabs.ac.in:/root/nonopenedx-usage.txt
        
  6. Ensure that the value of the configuration variable NONOPENEDX_USAGE_INFO_FILE_PATH in runtime/config/system_config.py is the same as the destination path of step 5
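A minimal sketch of turning one line of output.txt into a record, assuming the comma-separated format shown in step 3 (parse_usage_line is a hypothetical helper; lab names are assumed not to contain commas, and the sample values are made up):

# Parse one comma-separated line of output.txt into a dict (hypothetical helper)
def parse_usage_line(line):
    lab_id, lab_name, hits, visits, usages = [f.strip() for f in line.split(",")]
    return {
        "lab_id": lab_id,
        "lab_name": lab_name,
        "hits": int(hits),
        "visits": int(visits),
        "usages": int(usages),
    }

print(parse_usage_line("lab01, Sample Lab, 120, 45, 30"))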

feedback-service

Set up a cron job on feedback.vlabs.ac.in

Openedx-platform VM running vlabs (http://vlabs.ac.in)

Setup logstash service

Installation

Pre-requisites: Java version 8

Java version 8 is the pre-requisite to install logstash

sudo apt-add-repository ppa:webupd8team/java -y
sudo apt-get update -y
echo 'oracle-java8-installer shared/accepted-oracle-license-v1-1 select true' | sudo debconf-set-selections
sudo apt-get install oracle-java8-installer -y
Install logstash

Download and install the Public Signing Key:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

You may need to install the apt-transport-https package on Debian before proceeding:

sudo apt-get install apt-transport-https

Run sudo apt-get update so the repository is ready for use, then install logstash with:

sudo apt-get update && sudo apt-get install logstash
Run logstash

To run the logstash service

service logstash start

Configuration file

  • Configuration to dump the login and logout nginx server logs into the analytics-db (i.e. elasticsearch) database service. A sample log line and the fields extracted by the grok pattern are shown after the snippet.
  • Copy the code snippet below into /etc/logstash/conf.d/analytics.conf
    input {
       file {
           path => "/home/sripathi/test-logs.log"
           start_position => "beginning"
       }
    }        
    filter {
    
          grok {
                match => ["message", "%{IP:clientip} \- \- \[%{MONTHDAY:day}/%{MONTH:month}/%{YEAR:year}\:%{TIME:time} \+%{INT:zone}\]  \"%{WORD:method} %{URIPATHPARAM:api_endpoint} %{URIPROTO:protocol}/%{NUMBER:version}\" %{INT:status_code} %{INT:byte} %{NUMBER:byte1} \"%{URI:referrer}"]
    
               }
               geoip {
               source => "clientip"
               }
           if [month] == "Jan" {
            mutate { replace => { "month" => "01" } } 
           }
           else if [month] == "Feb" {
            mutate { replace => { "month" => "02" } } 
           }
           else if [month] == "Mar" {
            mutate { replace => { "month" => "03" } } 
           }
           else if [month] == "Apr" {
            mutate { replace => { "month" => "04" } } 
           }
           else if [month] == "May" {
            mutate { replace => { "month" => "05" } } 
           }
           else if [month] == "Jun" {
            mutate { replace => { "month" => "06" } } 
           }
           else if [month] == "Jul" {
            mutate { replace => { "month" => "07" } } 
           }
           else if [month] == "Aug" {
            mutate { replace => { "month" => "08" } } 
           }
           else if [month] == "Sep" {
            mutate { replace => { "month" => "09" } } 
           }
           else if [month] == "Oct" {
            mutate { replace => { "month" => "10" } } 
           }    
           else if [month] == "Nov" {
            mutate { replace => { "month" => "11" } } 
           }
           else {
            mutate { replace => { "month" => "12" } } 
           }    
           mutate {
    
             add_field => {
                           "date" => "%{year}-%{month}-%{day}"
                           }
             remove_field => ["year", "month", "day", "path", "host"]
            }
          if [api_endpoint] == "/dashboard" {
    
                if [status_code] != "200" {
                   drop {}
                }
                else if [referrer] != "https://vlabs.ac.in/login?next=/dashboard" {
                     drop {}
                }
          }
          else if [api_endpoint] == "/logout" {
              if [status_code] != "302" or [referrer] == "https://vlabs.ac.in/" {
                 drop {}
              }
    
          }
          else {
               drop {}
          }
    
    }
    
    output {
    
           elasticsearch {
                hosts => "192.168.33.3:80"
                user => "user"
                password => "pswd"
                index => "vlabs"
                document_type => "openedx_user_session_analytics_%{date}"
            }
       }
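For reference, here is a hypothetical nginx access-log line in the shape the grok pattern above expects, together with the main fields it extracts (a sketch; the IP address, date, sizes and referrer are made-up values):

# Hypothetical access-log line matching the grok pattern above
log_line = ('10.0.0.1 - - [05/Mar/2018:10:15:30 +0530]  '
            '"GET /dashboard HTTP/1.1" 200 1024 0.003 '
            '"https://vlabs.ac.in/login?next=/dashboard"')

# Main fields the filter extracts (before the mutate/geoip steps):
# clientip=10.0.0.1, day=05, month=Mar, year=2018, time=10:15:30,
# method=GET, api_endpoint=/dashboard, protocol=HTTP, version=1.1,
# status_code=200, byte=1024, byte1=0.003,
# referrer=https://vlabs.ac.in/login?next=/dashboard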
    
        

Script to get openedx user analytics

  1. This script gets the user analytics (registered, active and inactive users) from the openedx mysql database and forms a json record.
  2. It also pushes the json data obtained in step (1) into the analytics database (i.e. elasticsearch)
#!/usr/bin/python                                                                                                   
import MySQLdb
import json
import datetime
import requests

cursor = None
db = None
analytics_db_url = "http://192.168.33.3"
analytics_db_user = "<username>"
analytics_db_password = "<password>"

mysql_db_url = "localhost"
mysql_user = "<username>"
mysql_password = "<password>"
mysql_db = "edxapp"

def connect_db():
  try:
      global db
      global cursor
      db = MySQLdb.connect(mysql_db_url, mysql_user, mysql_password, mysql_db)
      cursor = db.cursor()

  except Exception as e:
      print "Exception = %s" % (str(e))
      exit(1)

def dis_connect_db():
    try:
        db.close()
    except Exception as e:
        print "Exception = %s" % (str(e))
        exit(1)

def get_users_count(query):
    try:
        cursor.execute(query)
        results = cursor.fetchall()
        users_count = 0
        for row in results:
            users_count = row[0]
        return int(users_count)
    except Exception as e:
        print "Error: unable to fetch data: %s" % (str(e))
        exit(1)

def push_data_to_analytics_db(data_dict):
    index = "vlabs"
    doc_type = "openedx_user_analytics"
    date = data_dict['date']
    analytics_db_api = "%s/%s/%s/%s" % \
                       (analytics_db_url, index, doc_type, date)
    auth = (analytics_db_user, analytics_db_password)
    headers = {'Content-Type': 'application/json'}
    try:
        r = requests.post(analytics_db_api, auth=auth, \
                          data=json.dumps(data_dict), headers=headers)
        if r.status_code == 201 or r.status_code == 200:
            print "data_dict is added = %s" % (data_dict)
        else:
            print "Error in adding data_dic = %s" % (data_dict)

    except Exception as e:
        print "Exception = %s" % (str(e))
        exit(1)

if __name__== "__main__":

    ### Connect to mysql database
    connect_db()

    ### Form SQL Queries
    active_users_query = "SELECT count(*) FROM auth_user WHERE is_active=1"
    inactive_users_query = "SELECT count(*) FROM auth_user WHERE is_active=0"
    registered_users_query = "SELECT count(*) FROM auth_user"

    ### Get users count 
    active_users = get_users_count(active_users_query)
    inactive_users = get_users_count(inactive_users_query)
    registered_users = get_users_count(registered_users_query)

    ### Form json dict
    today_date = str(datetime.datetime.today()).split()[0]
    data_dict = {
        "date" : today_date,
        "registered_users" : registered_users,
        "active_users" : active_users,
        "inactive_users" : inactive_users
    }
    ### disconnect database connection
    dis_connect_db()

    ### push data to analytics db
    push_data_to_analytics_db(data_dict)
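The JSON document pushed to the analytics database therefore has the keys date, registered_users, active_users and inactive_users, with the date (YYYY-MM-DD) acting as the document id. To keep the data current, the script can be scheduled from cron on the openedx VM, for example once a day; the path used in the crontab entry is simply wherever the script is saved.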

Tangle
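The code blocks of this document are assembled into the deployment package through org-babel noweb references: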

print "deployment package"
<<logged_in_users>>
<<openedx_user_analytics>>