We have developed an application for the collection, management, and analysis of crowdsourced HTTP traffic data. The application has two roles, users and an administrator, each with access to different operations.
The login page is shown below. Depending on whether a user or the administrator logs in, the corresponding page is displayed.
Registration in the system: The user registers and gains access to the system by choosing a username and password and providing an email address. The password must be at least 8 characters long and contain at least one capital letter, one number, and one symbol (e.g. # $ * & @).
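The password rule can be checked with a few regular expressions. The following is a minimal sketch rather than the application's actual code: the function name `validatePassword` is hypothetical, and it assumes that a "symbol" is any character that is not an ASCII letter, a digit, or whitespace.

```javascript
// Hypothetical helper that checks the password rule described above.
// Assumption: a "symbol" is any character that is not an ASCII letter,
// a digit, or whitespace.
function validatePassword(password) {
  const errors = [];
  if (password.length < 8) errors.push("at least 8 characters");
  if (!/[A-Z]/.test(password)) errors.push("at least one capital letter");
  if (!/[0-9]/.test(password)) errors.push("at least one number");
  if (!/[^A-Za-z0-9\s]/.test(password)) errors.push("at least one symbol");
  return { valid: errors.length === 0, errors };
}
```

Returning the list of unmet requirements, instead of a single boolean, lets the registration form tell the user exactly what is missing.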
Once logged in, the user can manage their profile, upload and visualize HAR file data.
Profile management: The user can change their username or password and see basic statistics about the data they have uploaded (date of last upload, number of records).
Upload data: The user selects a HAR file from their computer. The file is processed locally (using JavaScript) to remove sensitive data, after which the user has two options:
a) Upload it to the system
b) Save the edited file locally.
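The local clean-up step could look roughly like the sketch below. The text does not say which fields count as sensitive, so the choices here are assumptions: cookies, authorization headers, query strings, POST bodies, and response bodies are dropped, and URLs are reduced to origin plus path. The names `sanitizeHar`, `stripUrl`, and `cleanHeaders` are hypothetical; the field names (`log.entries`, `request.headers`, `postData`, `content.text`) follow the HAR 1.2 format.

```javascript
// Assumption: these headers are the ones treated as sensitive.
const SENSITIVE_HEADERS = new Set([
  "cookie", "set-cookie", "authorization", "proxy-authorization",
]);

// Keep only origin + path; query string and fragment may carry tokens.
function stripUrl(url) {
  try {
    const u = new URL(url);
    return u.origin + u.pathname;
  } catch (err) {
    return null; // unparseable URL: drop it rather than risk leaking it
  }
}

function cleanHeaders(headers) {
  return (headers || []).filter(
    (h) => !SENSITIVE_HEADERS.has(String(h.name).toLowerCase())
  );
}

// Returns a sanitized deep copy; the parsed input object is not mutated.
function sanitizeHar(har) {
  const copy = JSON.parse(JSON.stringify(har));
  for (const entry of copy.log.entries) {
    const { request, response } = entry;
    request.url = stripUrl(request.url);
    request.headers = cleanHeaders(request.headers);
    request.cookies = [];
    request.queryString = [];
    delete request.postData;
    response.headers = cleanHeaders(response.headers);
    response.cookies = [];
    if (response.content) delete response.content.text;
  }
  return copy;
}
```

The result of `JSON.stringify(sanitizeHar(har))` can then either be sent to the server or offered to the user as a local download.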
If the file is uploaded to the system, the server processes the data further so that only the desired fields are stored, in an appropriate format. In addition, the IP address of the uploading user is looked up through an API to automatically discover the user's connectivity provider, and this information is saved in the database along with the records.
Data visualization: The user can see on a map the locations of the IPs to which they have sent HTTP requests. Specifically, a heatmap shows the distribution of the number of records related to HTML, PHP, ASP, and JSP web objects (or bare domains, without a path).
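Preparing the heatmap input amounts to filtering the records and counting them per location. The sketch below is one possible reading, under stated assumptions: each record is a hypothetical object `{ url, lat, lng }` whose server IP has already been geolocated, and a record is considered relevant when its URL path ends in `.html`, `.php`, `.asp`, or `.jsp`, or when the URL has no path at all. The names `isPageLike` and `heatmapPoints` are illustrative.

```javascript
const PAGE_EXTENSIONS = [".html", ".php", ".asp", ".jsp"];

// True for HTML/PHP/ASP/JSP objects and for bare domains (no path).
function isPageLike(url) {
  let path;
  try {
    path = new URL(url).pathname.toLowerCase();
  } catch (err) {
    return false; // ignore records with unparseable URLs
  }
  if (path === "/" || path === "") return true;
  return PAGE_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Aggregates matching records into one weighted point per location.
function heatmapPoints(records) {
  const counts = new Map();
  for (const r of records) {
    if (!isPageLike(r.url)) continue;
    const key = `${r.lat},${r.lng}`;
    const point = counts.get(key) || { lat: r.lat, lng: r.lng, count: 0 };
    point.count += 1;
    counts.set(key, point);
  }
  return [...counts.values()];
}
```

Each returned point carries a `count` that a map heatmap layer can use as its intensity.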
Display Basic Information: The administrator sees relevant information on one page, in tables and / or graphs according to:
a) The number of registered users.
b) The number of entries in the database per request type (method).
c) The number of entries in the database per response code (status).
d) The number of unique domains that exist in the database.
e) The number of unique connectivity providers in the database.
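The five figures listed above are simple counts and group-by aggregations. The sketch below computes them in memory to make the definitions concrete; the record shape `{ method, status, domain, isp }` and the names `countBy` and `basicStats` are assumptions, and a real deployment would more likely push these aggregations into database queries.

```javascript
// Counts items per key, e.g. records per HTTP method.
function countBy(items, keyOf) {
  const counts = {};
  for (const item of items) {
    const key = keyOf(item);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Hypothetical record shape: { method, status, domain, isp }.
function basicStats(users, records) {
  return {
    registeredUsers: users.length,
    entriesPerMethod: countBy(records, (r) => r.method),
    entriesPerStatus: countBy(records, (r) => r.status),
    uniqueDomains: new Set(records.map((r) => r.domain)).size,
    uniqueProviders: new Set(records.map((r) => r.isp)).size,
  };
}
```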
Response Time Analysis: A configurable diagram shows the average response time of requests (Y axis) per hour of the day [0-24] (X axis). The diagram can be filtered by the following parameters:
a) Web object type (select one or more CONTENT-TYPE or all).
b) Day of the week (Monday - Sunday or all).
c) HTTP method type on request (select one or more, or all).
d) Connectivity provider (e.g. "Wind", "Cosmote", or all).
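The data series behind this diagram can be sketched as a filtered average over 24 hourly buckets. The record shape `{ startedDateTime, wait, contentType, method, isp }` is an assumption, as is the use of UTC for the hour and weekday; `wait` stands for whatever response-time value the application stores, in milliseconds. An omitted or empty filter means "all".

```javascript
// A missing or empty filter list accepts every value ("all").
function matches(allowed, value) {
  return !allowed || allowed.length === 0 || allowed.includes(value);
}

// Returns 24 values: the mean response time per hour, or null if no data.
// Assumption: hours and weekdays are taken in UTC (0 = Sunday ... 6 = Saturday).
function averageResponseTimeByHour(records, filters = {}) {
  const sums = new Array(24).fill(0);
  const counts = new Array(24).fill(0);
  for (const r of records) {
    const started = new Date(r.startedDateTime);
    if (Number.isNaN(started.getTime())) continue; // skip invalid timestamps
    if (!matches(filters.contentTypes, r.contentType)) continue;
    if (!matches(filters.days, started.getUTCDay())) continue;
    if (!matches(filters.methods, r.method)) continue;
    if (!matches(filters.isps, r.isp)) continue;
    const hour = started.getUTCHours();
    sums[hour] += r.wait;
    counts[hour] += 1;
  }
  return sums.map((sum, hour) => (counts[hour] > 0 ? sum / counts[hour] : null));
}
```

For example, `averageResponseTimeByHour(records, { days: [1], methods: ["GET"] })` would restrict the series to GET requests made on Mondays. Hours without data are returned as `null` so the chart can show a gap instead of a misleading zero.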
HTTP header analysis: The administrator has access to a page that presents, in tables and / or graphs, information about cache usage. More specifically:
a. Histogram of TTL distribution of web objects in response, by CONTENT-TYPE (select one or more CONTENT-TYPE or all).
b. Percentage of max-stale and min-fresh directives over the total number of requests per CONTENT-TYPE (select one or more CONTENT-TYPE or all).
c. Percentage of cacheability directives (public, private, no-cache, no-store) on the total of responses per CONTENT-TYPE (select one or more CONTENT-TYPE or all).
All of the above graphs and tables can be filtered by connectivity provider (e.g. "Wind", "Cosmote", or all).
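All three views rest on parsing the Cache-Control header. The sketch below shows one way to do that; it is not taken from the application. It assumes headers are available as a plain object with lower-case names, and it assumes TTL means the HTTP freshness lifetime: `max-age` when present, otherwise `Expires` minus `Date`. The parser is deliberately simple and does not handle commas inside quoted directive arguments. All function names are hypothetical.

```javascript
// Parses "public, max-age=3600" into { public: true, "max-age": "3600" }.
function parseCacheControl(value) {
  const directives = {};
  for (const part of String(value || "").split(",")) {
    const [rawName, rawArg] = part.trim().split("=");
    if (!rawName) continue;
    directives[rawName.toLowerCase()] =
      rawArg === undefined ? true : rawArg.replace(/"/g, "");
  }
  return directives;
}

// TTL in seconds, or null when the response gives no explicit lifetime.
function ttlSeconds(headers) {
  const cc = parseCacheControl(headers["cache-control"]);
  if (/^\d+$/.test(cc["max-age"])) return Number(cc["max-age"]);
  const expires = Date.parse(headers["expires"]);
  const date = Date.parse(headers["date"]);
  if (!Number.isNaN(expires) && !Number.isNaN(date)) {
    return Math.max(0, (expires - date) / 1000);
  }
  return null;
}

// Percentage of header sets whose Cache-Control contains the directive.
function directiveShare(headerSets, directive) {
  if (headerSets.length === 0) return 0;
  const hits = headerSets.filter(
    (h) => directive in parseCacheControl(h["cache-control"])
  ).length;
  return (100 * hits) / headerSets.length;
}
```

`directiveShare` works on request headers for `max-stale` and `min-fresh` and on response headers for `public`, `private`, `no-cache`, and `no-store`. Filtering by CONTENT-TYPE and connectivity provider would be applied to the records before these functions are called.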
Data visualization: The administrator can see on a map the locations of the IPs to which users have sent HTTP requests. Specifically, one marker appears per IP, with lines connecting each user's source city to each marker. The color of each line is adjusted according to the number of requests made to that IP, normalized to the number of requests made to the most popular IP.
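The normalization can be illustrated with a short sketch. The color scale is an assumption, since the text does not specify one: here the ratio of each IP's request count to the maximum is mapped onto a hue from green (rarely requested) to red (the most popular IP). The function name `lineColors` and the input shape, an object mapping IP to request count, are hypothetical.

```javascript
// Maps each IP's request count to a CSS color, normalized to the busiest IP.
// Assumption: hue 120 (green) for ratio 0 down to hue 0 (red) for ratio 1.
function lineColors(countsByIp) {
  const max = Math.max(0, ...Object.values(countsByIp));
  const colors = {};
  for (const [ip, count] of Object.entries(countsByIp)) {
    const ratio = max > 0 ? count / max : 0; // normalized to 0..1
    const hue = Math.round(120 * (1 - ratio));
    colors[ip] = `hsl(${hue}, 100%, 45%)`;
  }
  return colors;
}
```

Because every count is divided by the maximum, the most popular IP always receives the same color regardless of how much data has been collected.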