Skip to content

polcak/linking

Repository files navigation

linking

Copyright (C) 2017 Libor Polčák ipolcak@fit.vutbr.cz

This is a README file for linking -- a tool for linking identities.

Usage

usage: linking.py [-h] [--graph_file GRAPH_FILE] [--scope {1,2,3,4,5,6}]
                  [--begintime BEGINTIME] [--endtime ENDTIME]
                  [--timescope {1,2}] [--max_inaccuracy MAX_INACCURACY]
                  [--components] [--add_self]
                  inputid

Identity linking software

positional arguments:
  inputid               The input id (type: id).

optional arguments:
  -h, --help            show this help message and exit
  --graph_file GRAPH_FILE, -g GRAPH_FILE
                        Input graph file with identities.
  --scope {1,2,3,4,5,6}, -s {1,2,3,4,5,6}
                        The linking scope (1-6):
                        1~ Constraints revealing components of partial
                           identity aka Other corresponding identifiers.
                        2~ Constraints revealing partial identities of
                           specific computer aka Identifiers of
                           a specific computer.
                        3~ Constraints revealing partial identities of
                           computers where specific user authenticated
                           or logged in.
                        4~ Constraints revealing identifiers of all
                           users accessing specific resource.
                        5~ Constraints revealing all user accounts
                           logged in or authenticated from computer or
                           set of computers.
                        6~ Constraints revealing all accessed resources.
  --begintime BEGINTIME, -b BEGINTIME
                        Begin time for which to perform linkage (local TZ).
  --endtime ENDTIME, -e ENDTIME
                        End time for which to perform linkage (local TZ).
  --timescope {1,2}, -t {1,2}
                        Time scope (1-2):
                        1~ All edges on the path have to be valid during
                           the whole period.
                        2~ All edges on the path have to be valid at
                           least once during the period [-b, -e] and
                           the period during the previous identifier
                           is valid on the path. 
  --max_inaccuracy MAX_INACCURACY, -i MAX_INACCURACY
                        Maximal path inaccuracy.
  --components, -c      Compute the number of components in the graph.
  --add_self, -a        Add the input node to the output set
usage: log2gml.py [-h] [--dhcp DHCP_LOG,YEAR,LEASE_PERIOD]
                  [--graph_file GRAPH_FILE] [--clf CLF_LOG,SERVER_FQDN]
                  output_graph_file

Log to GML graph convertor

positional arguments:
  output_graph_file     The Output graph file with identities.

optional arguments:
  -h, --help            show this help message and exit
  --dhcp DHCP_LOG,YEAR,LEASE_PERIOD, -d DHCP_LOG,YEAR,LEASE_PERIOD
                        ISC DHCP log file(s) and parameters:
                        file_name,year,lease_period(seconds).
  --graph_file GRAPH_FILE, -g GRAPH_FILE
                        Input graph file(s) in the GML format used by
                        linked.py.
  --clf CLF_LOG,SERVER_FQDN, -c CLF_LOG,SERVER_FQDN
                        Common/combined log format log file(s) used by HTTP(s)
                        servers, e.g. Apache, and the server FQDN.

Note that log2gml.py supports multiple instances of --dhcp, --graph_file, and --clf.

Conversion of log files to GML

The utility log2gml.py can convert log files to GML files compatible with linking.py. So far ISC DHCP daemon and HTTP common/combined log format are supported. Additionally, log2gml can merge multiple GML files into a single GML file.

Feel free to develop additional convertors for different log file formats.

DHCP conversion example:

./log2gml.py -d examples/log/dhcpd-anon.log,2017,7200 network.gml

CLF conversion example based on files from Security Repo by Mike Sconzo that is licensed under a Creative Commons Attribution 4.0 International License:

wget http://www.secrepo.com/self.logs/access.log.2017-01-01.gz
gunzip access.log.2017-01-01.gz
wget http://www.secrepo.com/self.logs/access.log.2017-01-02.gz
gunzip access.log.2017-01-02.gz
./log2gml.py -c access.log.2017-01-01,www.secrepo.com -c access.log.2017-01-02,www.secrepo.com secrepo.gml

Merging:

./log2gml.py -g network.gml -g secrepo.gml combined.gml

Of course, you do not nedd to create the temporary GML files if you do not need them:

./log2gml.py -d examples/log/dhcpd-anon.log,2017,7200 -c access.log.2017-01-01,www.secrepo.com -c access.log.2017-01-02,www.secrepo.com combined.gml

Subsequently, you can use linking.py, for example, as follows:

./linking.py -g combined.gml "URL: www.secrepo.com/self.logs/access.log.2015-02-13.gz" -s 8
IPv4: 46.229.168.69

Getting GML data from PCF

Use convert_pcf_gml.py.

usage: convert_pcf_gml.py [-h] active graph_file

This program converts PCF active.xml into an GML graph compatible with the input of linking.py

positional arguments: active Input active.xml. graph_file Output graph file with identities.

optional arguments: -h, --help show this help message and exit

Query examples

For some query examples, have a look to the examples/test.sh file.

Required libraries

Project history

TBD

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages