Repo for FCC study analysis. It centres on a parametric neural network used for signal extraction.
1. Clone this repository
```
git clone git@github.com:Teddy-Curtis/FCC-scan.git
```
2. Install dependencies
All the dependencies are stored in `environment.yml` and can be installed using conda, mamba or micromamba. NOTE: If you are using conda and it is taking forever to install the environment, then follow the instructions in this link: https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community .
I used conda, but if you want to install with mamba or micromamba instead, just replace `conda` with `mamba` or `micromamba` in the commands below. Note: you will also have to make the same swap in the `setup.sh` file.
To install with conda:

```
# First clean conda, press 'y' for each option to delete the stored tarballs etc.
conda clean -a
# Now create the environment
conda env create -f environment.yml
```
If you are having problems with storage during the `conda env create` step, then create a directory in your EOS area, called e.g. NEWTMP, and set it as the temporary directory before running `conda env create` again:

```
mkdir YOUR_EOS_DIRECTORY/NEWTMP
export TMPDIR="YOUR_EOS_DIRECTORY/NEWTMP"; conda env create -f environment.yml
```
Once the environment has been created successfully, activate it:

```
conda activate fcc-study
```
Next, run `setup.sh`, which will install mplhep and pytorch and also copy all of the data files.

NOTE: For the GPU build of pytorch to be installed (and not the CPU variant), you need to run `./setup.sh` on a system with a GPU. For Imperial that means first ssh'ing to the gpu00 node and then running:

```
./setup.sh
```
3. Install fcc_study
To install the fcc_study library, run

```
pip install -e .
```
That's it!
To run the network, all you really need to do is make edits to `main.py`. First, at the start of `main.py` you will see:
```python
######################## Define Hyperparams and Model #########################
base_run_dir = "runs/fcc_scan"
run_loc = getRunLoc(base_run_dir)
```
Set `base_run_dir` to the directory where the training and testing outputs will be saved. Note that if you don't change `base_run_dir`, that is fine: a new subdirectory is created under it for each run, named run1, run2, run3, and so on.
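As a rough illustration of that behaviour, `getRunLoc` presumably does something like the minimal sketch below (assumed logic for illustration only, not the actual fcc_study implementation):

```python
import os

def get_run_loc_sketch(base_run_dir: str) -> str:
    # Assumed behaviour: pick the next free runN subdirectory.
    # The real getRunLoc in fcc_study may differ in detail.
    os.makedirs(base_run_dir, exist_ok=True)
    n = 1
    while os.path.exists(os.path.join(base_run_dir, f"run{n}")):
        n += 1
    run_loc = os.path.join(base_run_dir, f"run{n}")
    os.makedirs(run_loc)
    return run_loc
```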
Next, you will want to put all of the signal samples into the samples dictionary, along with all the background samples. This is of the form:
```python
samples = {
    "backgrounds": {
        "p8_ee_ZZ_ecm240": {
            "files": ["p8_ee_ZZ_ecm240.root"],
            "xs": 1.35899
        },
        "wzp6_ee_eeH_ecm240": {
            "files": ["wzp6_ee_eeH_ecm240.root"],
            "xs": 0.0071611
        }
    },
    "signal": {
        "BP1": {
            "files": ["e240_bp1_h2h2ll.root", "e240_bp1_h2h2llvv.root"],
            "masses": [80, 150],
            "xs": 0.0069
        },
        "BP2": {
            "files": ["e240_bp2_h2h2ll.root", "e240_bp2_h2h2llvv.root"],
            "masses": [80, 160],
            "xs": 0.005895
        },
    },
    "Luminosity": 500,
    "test_size": 0.25  # e.g. 0.2 means 20% of data used for test set
}
```
So here you need to fill in several things:

- Put all of the backgrounds under `backgrounds`. For each background you need to give the list of files (if it's just one file, still put it in a list as shown) and the cross-section, `xs`. Note that all the background names in the dictionary need to be unique (here they are `p8_ee_ZZ_ecm240` and `wzp6_ee_eeH_ecm240`).
- Do the same for the signal samples, but also include mH and mA for each one in `masses`: `"masses" : [mH, mA]`. Each signal point needs to have a unique name (here `BP1` and `BP2`); see the sketch after this list.
- Put the correct luminosity; here I just put 500, but you can change that.
- Pick how large you want the test dataset to be: 0.25 means 25% of the full dataset is used for the test dataset.
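For example, adding a third signal point might look like this (the file names, masses, and cross-section below are purely hypothetical):

```python
samples["signal"]["BP3"] = {
    "files": ["e240_bp3_h2h2ll.root", "e240_bp3_h2h2llvv.root"],  # hypothetical files
    "masses": [100, 180],  # [mH, mA], hypothetical values
    "xs": 0.004,           # hypothetical cross-section
}
```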
EXTRA NOTE: I think if you are running this on lxplus and on the CERN batch system, then you should put in the xrootd file location instead, so that you can read the file from the batch system.
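A file entry would then look something like this (the redirector and EOS path below are illustrative assumptions, not paths from this repo):

```python
samples["backgrounds"]["p8_ee_ZZ_ecm240"]["files"] = [
    # Hypothetical xrootd URL; replace with the actual location of your file.
    "root://eosuser.cern.ch//eos/user/y/yourname/FCC/p8_ee_ZZ_ecm240.root"
]
```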
Note that this uses the cross-section to get the sample weight with the following formula:

```python
weight = xs * lumi / n_samples
```
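As a concrete worked example with the numbers from the dictionary above (and assuming `n_samples` is the total number of events in the sample, which is an assumption on my part):

```python
xs = 1.35899            # cross-section of p8_ee_ZZ_ecm240 from the dictionary above
lumi = 500              # "Luminosity" from the dictionary above
n_samples = 1_000_000   # hypothetical number of events in the sample

weight = xs * lumi / n_samples  # per-event weight, here about 0.00068
```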
Next, you want to fill in all of the hyperparameters for training. Really the only thing you should have to change here is the number of epochs; the rest you can leave as is.
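For illustration only, the hyperparameter block might look something like the sketch below; the key names here are assumptions, so check `main.py` for the actual ones:

```python
hyperparams = {
    # Hypothetical keys; check main.py for the real names and defaults.
    "epochs": 100,          # usually the only value you need to change
    "batch_size": 512,
    "learning_rate": 1e-3,
}
```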
Once it has run, the output is under `runs`.