Softcompexam

Marco Lorusso's exam code for the Software and Computing for Nuclear and Subnuclear Physics course.

Description

The project consists of a small application that implements two different approaches to simulate the bunch-crossing (BX) assignment performed by the TwinMux board in the muon trigger system of CMS:

  • A Deep Neural Network multi-class classifier developed with the Keras library: 7 input nodes -> 2 hidden layers of densely connected nodes (8 -> 8) -> 9 output nodes corresponding to the 9 possible BX assignments (a minimal sketch is given below).

  • A boosted decision tree provided by the XGBoost library.

In both cases the layouts and hyper-parameters were chosen after trying several configurations and keeping the ones that gave the highest test accuracy on my machine.
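As an illustration of the first approach, here is a minimal Keras sketch of the 7 -> 8 -> 8 -> 9 layout described above; the activation functions, optimizer and loss are assumptions made for the example, not necessarily the ones used by the script.

# Minimal sketch of the 7 -> 8 -> 8 -> 9 layout described above.
# Activations, optimizer and loss are assumed for illustration.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def build_bx_classifier():
    model = Sequential([
        Dense(8, activation="relu", input_shape=(7,)),  # first hidden layer
        Dense(8, activation="relu"),                    # second hidden layer
        Dense(9, activation="softmax")                  # one output node per BX class
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model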

The data used for this analysis was collected by the CMS detector during the proton-proton running of the Large Hadron Collider at a centre-of-mass energy of √s = 13 TeV at the beginning of September 2018, and it corresponds to around 400 pb^(-1) of integrated luminosity. After being stored, the data was further processed with the CMS offline software to reconstruct physics objects; in particular, track segments were built within single DT chambers. A further event selection, requiring pairs of oppositely charged muons with an invariant mass above 60 GeV, was applied. The resulting dataset was saved in a flat ROOT tree format for easier analysis and then parsed into a CSV file, keeping only the relevant variables, so that it can be handled with the DataFrame functionality of the Pandas module.
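As a quick illustration, the resulting CSV can be loaded into a DataFrame as in the sketch below, assuming the default dataset hosted on GitHub and no special parsing options:

import pandas as pd

# Load the flat CSV derived from the ROOT tree into a Pandas DataFrame
# (the URL below is the default dataset used in the examples).
url = "https://raw.githubusercontent.com/DrWatt/softcomp/master/datatree.csv"
df = pd.read_csv(url)
print(df.head())  # quick look at the first rows and the available variables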

The script, written in Python 3, can be executed from the command line by typing ./project.py with the following flags and options:

  • --data to specify the URL or path of the dataset in CSV format (if unspecified, a default dataset is used);
  • --xgb to activate the XGBoost model;
  • --nn to activate the NN built with Keras;
  • --xgparams to specify the path of a JSON file containing the following hyper-parameters (see the example sketched after this list):
    • max_depth of the tree;
    • eta, i.e. the learning rate;
    • subsample of the total dataset to use for each iteration;
    • eval_metric the metrics used to quantify the validity of the model;
    • silent verbosity setting;
    • objective the learning objective of the model;
    • num_class number of classes the algorithm has to deal with and predict;
    • seed seed of the random generator;
    • num_parallel_tree number of parallel trees, used to activate the Random Forest technique.
  • --nnparams to set the number of epochs, the batch size and the validation split used by the NN (values separated by spaces; set a parameter to 0 to use its default value);
  • --modelupload to specify the URL or path of a pretrained model in joblib format.
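As an illustration, a params.json file for --xgparams could be produced as sketched below; the values shown are placeholders chosen for the example, not the project's defaults.

import json

# Hypothetical hyper-parameter values; replace them with your own configuration.
params = {
    "max_depth": 6,                 # maximum depth of each tree
    "eta": 0.3,                     # learning rate
    "subsample": 0.8,               # fraction of the dataset used per iteration
    "eval_metric": "merror",        # multi-class classification error rate
    "silent": 1,                    # verbosity setting
    "objective": "multi:softmax",   # multi-class learning objective
    "num_class": 9,                 # one class per possible BX assignment
    "seed": 42,                     # seed of the random generator
    "num_parallel_tree": 1          # > 1 activates the Random Forest technique
}

with open("params.json", "w") as f:
    json.dump(params, f, indent=4)

These keys follow the standard XGBoost training parameters, so the file content can be passed on to the booster unchanged.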

Examples

For a thorough explanation of how to use this application, consult the How-to guide.

Here are some examples of using the script from the command line:

$ ./project.py --xgb --data "https://raw.githubusercontent.com/DrWatt/softcomp/master/datatree.csv"

This starts training with XGBoost on data downloaded from the provided URL, using default parameters.

$ ./project.py --nn --data "https://raw.githubusercontent.com/DrWatt/softcomp/master/datatree.csv"

This starts training the Keras neural network on data downloaded from the provided URL, using default parameters.

$ ./project.py --xgb --data "https://raw.githubusercontent.com/DrWatt/softcomp/master/datatree.csv" --xgparams "params.json"

This starts training with XGBoost on data downloaded from the provided URL, with hyper-parameters taken from the JSON configuration file.

$ ./project.py --nn --data "https://raw.githubusercontent.com/DrWatt/softcomp/master/datatree.csv" --nnparams 5 5 0.5

This starts training the Keras neural network on data downloaded from the provided URL, with 5 epochs, a batch size of 5 and a validation split of 0.5.
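For reference, these three values map onto a standard Keras fit call roughly as in the self-contained sketch below; random numbers stand in for the real dataset, and the model layout repeats the one shown in the Description.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

# Random stand-in data: 7 input variables and 9 possible BX classes.
X = np.random.rand(1000, 7)
y = to_categorical(np.random.randint(9, size=1000), num_classes=9)

model = Sequential([Dense(8, activation="relu", input_shape=(7,)),
                    Dense(8, activation="relu"),
                    Dense(9, activation="softmax")])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

model.fit(X, y,
          epochs=5,              # first value passed to --nnparams
          batch_size=5,          # second value: batch size
          validation_split=0.5)  # third value: fraction of data held out for validation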

The ROOT tree used to populate the test dataset can be found here. For more information about XGBoost and Keras, you can find their documentation, respectively, here and here. If you want to know more about the Muon Trigger system at CMS, you can have a look at my bachelor's thesis.
