-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add possibility to copy and not merge some keys by lstchain_merge_hdf5_files #1174
Conversation
Hi |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1174 +/- ##
==========================================
+ Coverage 74.39% 74.40% +0.01%
==========================================
Files 129 129
Lines 13195 13206 +11
==========================================
+ Hits 9816 9826 +10
- Misses 3379 3380 +1 ☔ View full report in Codecov by Sentry. |
parser.add_argument( | ||
'--keys-to-copy', | ||
nargs="+", default=[''], | ||
help='List of duplicated keys to be copied and not to be merged' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be a command line argument. We know which keys are event-wise and should be merged and which are global so shouldn't be merged.
Let's define this in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this would be the best, but since this seems a general merging function I was not sure we wanted to specify it inside the function. If we agree on that I prefer it also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a general merging function for our data, we do know all possible keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reopening this conversation because I think it is not resolved yet. Having this command line argument makes it a bit confusing at the same time as having a set of default keys to be copied. Do we really need this additional option? I think they can be simply defined in lstchain/io/io.py
This special handling of some keys is also needed to fix #1152 So we should just make a set of keys that are global and are copied from the first file (potentially checking that they are equal in all following files) and keys that are "event-like" so need to be merged. |
parser.add_argument( | ||
'--keys-to-copy', | ||
nargs="+", default=[''], | ||
help='List of duplicated keys to be copied and not to be merged' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reopening this conversation because I think it is not resolved yet. Having this command line argument makes it a bit confusing at the same time as having a set of default keys to be copied. Do we really need this additional option? I think they can be simply defined in lstchain/io/io.py
lstchain/io/io.py
Outdated
@@ -332,15 +337,33 @@ def auto_merge_h5files( | |||
else: | |||
keys = set(nodes_keys) | |||
|
|||
copy_keys = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be this simple one liner:
keys_to_copy = set() if keys_to_copy is None else keys_to_copy.intersection(keys)
Hi @maxnoe , @morcuended , I added the default keys, but I prefer to keep the trailet in case of tests on new implementations. Do not hesitate to add changes if you have strong opinions on this. We need to merge this PR before next release, which should be done soon. |
@maxnoe , @morcuended are you ok with the changes so we can move on with this PR as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me
This is in particular necessary for the calibration keys in DL1 and DL2 files, which are the same for all subruns and should not be merged but simply copied from the first file of the list.
I wonder if we can already add them as default in the script...