Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Pythia 8.3.10 to chromo #174

Open
wants to merge 31 commits into
base: main
Choose a base branch
from
Open

Conversation

gaudu
Copy link
Collaborator

@gaudu gaudu commented Sep 11, 2023

Attempt to solve #171 with @HDembinski

@afedynitch
Copy link
Member

Thanks for starting this! What about this cached directory? How big are the files? Does it cleanup/update itself?

@gaudu
Copy link
Collaborator Author

gaudu commented Sep 11, 2023

Thanks for starting this! What about this cached directory? How big are the files? Does it cleanup/update itself?

For each (projectile, target, energy) combination, you will produce three files:

  • .sigfit is no bigger than 50 bytes.
  • .mpi and .sasd.mpi increase in size while increasing energy and atomic number of either projectile or target, as would the overall number of interactions from the initial collision. The sizes of the .mpi files, from my personal experience, do not exceed ~150 kB, while the .sasd.mpi files are usually less than ~100 kB.

For example, a proton boosted against a target of oxygen-16 at rest would produce files from the following sizes:

pO @ 10^3 GeV pO @ 10^7 GeV pO @ 10^11 GeV
.sigfit 45 bytes 46 bytes 42 bytes
.mpi 26 kB 104 kB 122 kB
.sasd.mpi 18 kB 70 kB 80 kB

There is currently nothing in place for the files to cleanup or update themselves, since they are supposed to be generated once by Pythia (according to the user's settings) and kept for future runs using the same (projectile, target, energy) combination. I would need to check if the files are generated in the working directory of the user or in the /examples folder of their Pythia8 installation.

@HDembinski
Copy link
Collaborator

Thanks for starting this! What about this cached directory? How big are the files? Does it cleanup/update itself?

We also don't have cleanup routines for the other cached files in place. For example if you update chromo and it needs to download a newer zip file from our server, the old one is not deleted, as far as I am aware.

We should probably wipe the iamdata directory everytime a download of the zip file is triggered, which would do the correct cleanup automatically.

@HDembinski
Copy link
Collaborator

HDembinski commented Sep 11, 2023

@afedynitch @jncots This PR is failing on CI because we are don't have the new zip file for this Pythia8 version uploaded yet. Is it possible to add a new file without making a new release?

Edit: actually it fails even before that because some build code expects that urqmd is in the model list. For development we took all the other models out. @gaudu could you please restore the file default_models.cfg to its original state? And commit?

@afedynitch
Copy link
Member

afedynitch commented Sep 12, 2023

Is it possible to add a new file without making a new release?

Yes, it's possible. @gaudu, please attach the file here if it's not too big

@afedynitch
Copy link
Member

a proton boosted against a target of oxygen-16 at rest would produce files from the following sizes

The files are so small that I would rather prefer precomputing all of the common combinations and distribute these as binaries using our zip downloads. Otherwise testing on public workers may take too long. What do you think about this @gaudu & @HDembinski ?

@gaudu
Copy link
Collaborator Author

gaudu commented Sep 12, 2023

Is it possible to add a new file without making a new release?

Yes, it's possible. @gaudu, please attach the file here if it's not too big

  1. Here is the zip file, including the pdfdata and xmldoc of Pythia 8.3.10 (size of 13,2 MB which I hope is not too big for this post).
    Pythia8_v003.zip
  2. One also needs to update the Pythia8_v003 text file which will appear in /chromo/src/chromo/iamdata/Pythia8 with the link to the .zip file before merging again!

@gaudu
Copy link
Collaborator Author

gaudu commented Sep 12, 2023

The files are so small that I would rather prefer precomputing all of the common combinations and distribute these as binaries using our zip downloads.

I would not mind generating few files if needed, it really never took that long for me. @afedynitch Could you provide the "common" (projectile, target, center-of-mass energy) combinations you would like to have run?

@HDembinski
Copy link
Collaborator

Is it possible to add a new file without making a new release?

Yes, it's possible. @gaudu, please attach the file here if it's not too big

1. Here is the zip file, including the pdfdata and xmldoc of Pythia 8.3.10 (size of 13,2 MB which I hope is not too big for this post).
   [Pythia8_v003.zip](https://github.com/impy-project/chromo/files/12588226/Pythia8_v003.zip)

2. One also needs to update the `Pythia8_v003` text file which will appear in /chromo/src/chromo/iamdata/Pythia8 with the link to the .zip file before merging again!

@afedynitch @jncots You can ignore this, I independently created the zip and uploaded it to the right place.

@HDembinski
Copy link
Collaborator

HDembinski commented Sep 12, 2023

The files are so small that I would rather prefer precomputing all of the common combinations and distribute these as binaries using our zip downloads.

I would not mind generating few files if needed, it really never took that long for me. @afedynitch Could you provide the "common" (projectile, target, center-of-mass energy) combinations you would like to have run?

@afedynitch We can add some precomputed file to the Pythia8 zip file, for example those used in the unit tests, so that the unit tests run faster. However, we cannot cover all use cases of everyone, because we cannot predict how people use Chromo. Therefore, it is anyway needed to generate these cache files on demand. I see no issue with that and it is quite nice to have this functionality in place, as it speeds up Pythia8 a lot when you use it the second time for the same inputs. In practice that happens a lot.

To elaborate why we cannot precompute everything: there is a large but countable set of input particles, but the cache files also depend on ECM, and that's a real number, so it is unbounded.

@HDembinski
Copy link
Collaborator

HDembinski commented Sep 12, 2023

Good news, the tests are all green, except an unrelated Sibyll23StarMixed failure.

FAILED tests/test_final_state.py::test_generator[Sibyll23StarMixed] - AssertionError: Sibyll23StarMixed zero counts for Omega~+, Xi~0, Xi~+, Xi-, Xi0, Omega-2112 27491 False, 130 27728 False, 211 244736 False, 310 27554 False, 321 28965 False, -3334 0 False, -3322 0 False, -3312 0 False, -3222 1795 False, -3122 5876 False, -3112 1762 False, 3112 1725 False, 3122 8684 False, 3222 3300 False, 3312 0 False, 3322 0 False, 3334 0 False, -321 26585 False, -211 233154 False, -2112 16256 False__: [ True  True  True  True  True False False False  True  True  True  True

This test also fails locally. @afedynitch @jncots How could it happen that this test was committed? We should never merge PRs with failing tests.

In my opinion, this PR is a ready to be merged. Having the ability to simulate ion collisions with Pythia8 is a big improvement. Also thanks to the new caching, Pythia8 runs much faster.

There are some issues with the cross section calculation that Chloe made me aware of, which should be tackled in another PR. What to precompute and to put in iamdata is also open, but we can iterate on that independently of this PR.

@HDembinski
Copy link
Collaborator

HDembinski commented Sep 12, 2023

For the record, locally on my computer also another test fails in addition to the Sibyll23StarMixed one:

FAILED tests/test_final_state.py::test_generator[Sibyll23StarMixed] - AssertionError: Sibyll23StarMixed zero counts for Omega~+, Xi~0, Xi~+, Xi-, Xi0, Omega-2112 27491 False, ...
FAILED tests/test_final_state.py::test_generator[UrQMD34] - AssertionError: UrQMD34 zero counts for 2112 1705 False, 130 1812 False, 211 15775 False, 310 1756 False,...
E       AssertionError: UrQMD34 zero counts for 2112 1705 False, 130 1812 False, 211 15775 False, 310 1756 False, 321 1898 False, -3334 1 False, -3322 21 False, -3312 26 False, -3222 73 False, -3122 247 False, -3112 71 False, 3112 96 False, 3122 395 False, 3222 108 False, 3312 19 False, 3322 28 False, 3334 1 True, -321 1735 False, -211 14987 False, -2112 976 False__: [ True  True  True  True  True  True  True  True  True  True  True  True
E           True  True  True  True False  True  True  True]
E       assert False

@afedynitch
Copy link
Member

How could it happen that this test was committed? We should never merge PRs with failing tests.

The PR testing this was merged into main before SIBYLL* was merged, or the other way around. We fixed it in #170 . @jncots is about to merge it into main

@afedynitch
Copy link
Member

Hi @gaudu,

could you merge from main and see if the tests pass? We merged all of the features we wanted.
Then we go through a more thorough review of your PR.

@afedynitch afedynitch mentioned this pull request Oct 5, 2023
@gaudu
Copy link
Collaborator Author

gaudu commented Oct 16, 2023

Hi @gaudu,

could you merge from main and see if the tests pass? We merged all of the features we wanted. Then we go through a more thorough review of your PR.

Hi @afedynitch,

I am still pretty new to git, just to make sure I understood you correctly. Does this mean I should merge my working branch (gaudu/chromo/pythia8310) into my main branch (gaudu/chromo/main) to see if the tests pass, and then you will go through the review of this PR?

@afedynitch
Copy link
Member

Ah, I see.

please merge our main into the main of your fork. Then merge your main into your branch. This PR should then update automatically.

Probably, you can merge our main directly into your branch but I remember having some weird problems doing this.

If you succeed, then the conflicts message above should disappear.

@afedynitch
Copy link
Member

Oh, once you’ve done it, ‘install -e .’ locally and run ‘pytest -vv’. It should be better to run pytest locally after merging into your main to make sure that the errors you encounter later are not from your configuration.

Copy link
Member

@afedynitch afedynitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. Thanks for the progress. I left a couple of questions about things I’d like to understand better.

src/chromo/models/pythia8.py Outdated Show resolved Hide resolved
src/chromo/models/pythia8.py Outdated Show resolved Hide resolved
src/chromo/models/pythia8.py Show resolved Hide resolved
src/chromo/util.py Show resolved Hide resolved
src/cpp/_pythia8.cpp Show resolved Hide resolved
@HDembinski
Copy link
Collaborator

HDembinski commented Nov 7, 2023

@gaudu @afedynitch I am trying to summarise the todolist for accepting this patch. Is this ok?

  • Make caching optional via setting in Pythia8 init
  • Generate hash from the config settings of Pythia8 and generate the filename of the cache files from this hash. This ensure that caches are not falsly used when the user changes physics parameters in Pythia8
  • Generate a warning when too many cache files are generated (more than 1000?) and stop to generate new cache files then.

Copy link
Member

@afedynitch afedynitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is much improved and I'm happy with the featureset. The tests requrie a bit of work and still contain unfinished TODOs.

# Nuclei are supported in principle, but generation is very slow.
# Support for more nuclei can be added with ParticleData.addParticle.
_targets = _projectiles | {
_targets = standard_projectiles | {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I've asked this already but what exactly is the reason for this explicit list and not just Nuclei()?

config.append("HeavyIon:SigFitDefPar = 10.79,1.75,0.30,0.0,0.0,0.0,0.0,0.0")
if (kin.p1.A or 0) > 1 or (kin.p2.A or 1) > 1:
config += [
# similar to MultipartonInteractions:reuseInit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here I'm struggling to follow the logic. If there is a file available for this combination of hA or AA, why is the caching switched off?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can find more information about this topic following this website (https://pythia.org/latest-manual/HeavyIons.html) and scrolling to the section for "Subcollision models".

It seems the default Pythia 8 settings, for the SubCollisionModel in Angantyr, is "HeavyIon:SigFitDefPar = 17.24,2.15,0.33,0.0,0.0,0.0,0.0,0.0". While the one, you highlighted in the code, might have been computed for another (heavy ion) run that I showed to Hans during a previous call (so most likely for negative pion boosted for 158 GeV/c onto a fixed carbon target using Angantyr).

@@ -161,28 +181,60 @@ def _cross_section(self, kin=None):
diffractive_axb=st.sigmaAXB,
)

# TODO use pythia.info.sigmaGen(i) for cross-sections including nucleus in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text is somewhat difficult to follow. Is it possible to word it a bit better?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try to rephrase it here, or would you rather I try to modify the TODO itself in the code?

# to speed up subsequent runs
cache_prefix = str(
_cache_base_dir()
/ "Pythia8"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the base directory be an optional parameter? If that would be possible then the cache files could be stored as part of the test suite and then nuclei could be tested much faster.


event = run_collision(2000 * GeV, "p", (4, 2))
assert event.pid[0] == 2212
@pytest.mark.parametrize("projectile", ("pi-", "K+", "p", "He"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use precomputed cache here? The tests with He take very long time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this linked to the four files you updated in your previous commit "Add reference files for Pythia8 with He projectile"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I meant that this test should use the cache files supplied with the binary data package in iamdata as @HDembinski suggested.

m = Pythia8(evt_kin, seed=1, cache=True) # False
config_lines = "\n".join(m._config)
assert "MultipartonInteractions:reuseInit = 3" in config_lines # 0
# TODO Test for the cache file production for pp@10GeV when we have agreed on the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name choice is fine I guess, so the test can be completed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to implement a second part to the test where it asserts the path of the produced cache file, using the name choice made in pythia8.py, and the file size to be non zero to make sure the file is not empty.

config_lines = "\n".join(m._config)
assert "HeavyIon:SasdMpiReuseInit = 3" in config_lines # 0
assert "HeavyIon:SigFitReuseInit = 3" in config_lines # 0
# TODO Test for the cache file production for pHe@2TeV when we have agreed on the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. The name is fine. Regarding the caching request further up...I think that this test should be the only one generating and using the cache file. All other tests involving nuclei, should come with caches distributed.

These cache files should be in tests/Pythia8_caches/...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a slightly different opinion expressed in a comment below, I think the place for the natural place for these hard wired cache files is still in iamdata

@gaudu
Copy link
Collaborator Author

gaudu commented Dec 1, 2023

I made few modifications in pythia8.py and test_pythia8.py recently which were unsuccessful at passing the macos and unbuntu tests at the step pythin -m pytest -vv -n 3 which is most likely pointing at failing one of the new test I implemented. I am trying to understand what I did that could have this outcome at the moment starting from:
FAILED tests/test_pythia8.py::test_cache_file - AttributeError: 'Pythia8' object has no attribute 'generate'
FAILED tests/test_pythia8.py::test_cache_file_heavy_ion - AttributeError: 'Pythia8' object has no attribute 'generate'

On another note, I would like to discuss where exactly we want to have/not have the cache files for (a) the tests, (b) the users? Mostly to know how to answer to @afedynitch comments:

Can the base directory be an optional parameter? If that would be possible then the cache files could be stored as part of the test suite and then nuclei could be tested much faster.

Regarding the caching request further up...I think that this test should be the only one generating and using the cache file. All other tests involving nuclei, should come with caches distributed.
These cache files should be in tests/Pythia8_caches/...

@HDembinski
Copy link
Collaborator

HDembinski commented Dec 5, 2023

I have been silent here for a while, but I kinda like the idea of making the cache directory a parameter. Since we anyway want to allow users to turn this feature off and instead of passing a boolean we might as well pass a string (the path) or None to deactivate the caching feature.

On the other hand, I don't see a fundamental difference between these cache files that we generate ourselves and the hardwired ones that we distributed in iamdata. So for testing, I would add the cache files needed for tests to iamdata and make it part of the data that is automatically downloaded from github.

@afedynitch
Copy link
Member

I was not aware of the distribution of some default cache files within the binary package... in that case I agree that the natural place for these files to live is within the iamdata directory.


event = run_collision(2000 * GeV, "p", (4, 2))
assert event.pid[0] == 2212
@pytest.mark.parametrize("projectile", ("pi-", "K+", "p", "He"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I meant that this test should use the cache files supplied with the binary data package in iamdata as @HDembinski suggested.

cache_file_path = os.path.join(
_cache_base_dir(), "Pythia8", "cache_2212_2212_10000.mpi"
)
assert os.path.exists(cache_file_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a set of default cache files are provided in iamdata, this test would just check if these files are there or not. This has to be made sure that this test generates something that's not in the default files.

assert "HeavyIon:SigFitReuseInit = 3" in config_lines

cache_file_path = os.path.join(
_cache_base_dir(), "Pythia8", "cache_2212_1000020040_2000000.mpi"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these generated files are very simple. It would be better to actually compare the contents of the sigfit file by comparing open(fname,"r").read() of the newly generated file with concrete content such as

1 2000 2000
18.3948 
1.90724 
0.357502 

This would allow to track (partially) if cache files need updating after Pythia version updates or so.

# This test is very slow the first time when Pythia is computing
# some cached results. Afterwards it runs super fast.
ecm = 2000 * GeV
event = run_collision(ecm, projectile, "He")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the cache should be enabled using files distributed via the binary package in iamdata.

src/chromo/models/pythia8.py Outdated Show resolved Hide resolved
Copy link
Member

@afedynitch afedynitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a few items to go. Please also pull from the latest master after #189 is merged.

@HDembinski
Copy link
Collaborator

HDembinski commented Dec 5, 2023

@afedynitch

I was not aware of the distribution of some default cache files within the binary package... in that case I agree that the natural place for these files to live is within the iamdata directory.

Sorry, I think I was not clear. Currently, we don't have cache files for the new mpi-stuff in the iamdata directory of Pythia8, but I think that's the place where we should put those cache that we want to use for our unit tests. If we go for that , we have to generate these files and then add them to a new version of the iamdata_v001.zip, and change our code that it finds them.

What we have in there currently is a Pythia8 directory with data files that Pythia8 needs, like the information about particles known to Pythia8, etc. These are not strictly cache files, but data files that Pythia8 needs. I think that the new cache files would nicely fit in there, as they are also "data".

This is analog to QGSJetII, where the corresponding iamdata directory essentially contains a huge binary cache file that QGSJetII can in principle re-create, but we never do that as it takes too long.

gaudu and others added 20 commits January 30, 2024 15:36
…nmatched version numbers : in code 8.310 but in XML 8.308.
…he file production using cache flag during init + tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants