Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writing multiple podio collections with the same name in DD4hep... #1111

Closed
1 task
wdconinc opened this issue Aug 12, 2022 · 10 comments · Fixed by #1108
Closed
1 task

Writing multiple podio collections with the same name in DD4hep... #1111

wdconinc opened this issue Aug 12, 2022 · 10 comments · Fixed by #1108

Comments

@wdconinc
Copy link
Contributor

I don't really know where to post this issue exactly, since it seems to fall at the intersection of DD4hep, EDM4hep, and podio, but here goes... This is an issue reported on (if not properly reported on github) at one of the key4hep meetings in the spring. I'm also sorta debugging this again while I'm writing it, so forgive me for labeling this [DRAFT].

In our DD4hep detector description, we use different detector sections to describe logically similar detectors. For example, the various endcap planes in our tracker are different in size and chip arrangement, yet they share all other characteristics and should logically be expected to write to a single hit collection, e.g. TrackerEndcapHits.

However, when writing to a single collection from multiple detectors, these collections are not read correctly by our reconstruction code (the actual issue, I'll come back to this).

So, as an ugly workaround we write to a slightly differently name collection for each detector (e.g. TrackerEndcapHits1, TrackerEndcapHits2, etc..., see these readout definitions). In reconstruction we then collate these again with a SimTrackHitsCollector. This is not really scalable, since our reconstruction steering scripts now have to depend on exactly how many planes we have (something we are still determining).

What appears to happen when we write to the single hits collection (as e.g. in this branch) is that we get simulated output files that contain a single collection, but the counters are off.

  • add details on how the output files are 'off'.
@andresailer
Copy link
Member

I am wondering if that is fixed with #922 already?

@wdconinc
Copy link
Contributor Author

Yep, fixed by #922. Sorry for the delay in confirming :-D

@wdconinc
Copy link
Contributor Author

wdconinc commented May 5, 2023

I guess we should reopen this issue for the podio frames writer since moving to frames has caused regressions in our simulations.

I guess it was still a TODO item: https://github.com/AIDASoft/podio/blob/705721d53bcdd68c265351b86d4973c5bca3c0f9/include/podio/Frame.h#L410-L424

@tmadlener
Copy link
Contributor

Ah, sorry about that. Hopefully this will get better with AIDASoft/podio#412. On the other hand, even with that storing collections with the same name will not work, and we will most likely just be able to raise an exception at the earliest possible moment.

@wdconinc
Copy link
Contributor Author

wdconinc commented May 6, 2023

So support for multiple collections with the same name is not in the plans anymore and we should plan to rework everything again?

@hegner
Copy link

hegner commented May 6, 2023

@wdconinc - I probably was not there when the discussion took place. So the difference in the collections is the type?

@wdconinc
Copy link
Contributor Author

wdconinc commented May 6, 2023

Not even the type is different. Essentially we use the following in DD4hep (but for 10 endcap layers instead of the 2 shown here):

  <detectors>
    <detector
      id="TrackerEndcapN_1_ID"
      name="MiddleTrackerEndcapN"
      type="epic_TrapEndcapTracker"
      readout="TrackerEndcapHits"/>
    <detector
      id="TrackerEndcapN_2_ID"
      name="OuterTrackerEndcapN"
      type="epic_TrapEndcapTracker"
      readout="TrackerEndcapHits"/>
  </detectors>
  <readouts>
    <readout name="TrackerEndcapHits">
      <segmentation type="CartesianGridXZ" grid_size_x="0.010*mm" grid_size_z="0.010*mm" />
      <id>system:8,layer:4,module:12,sensor:2,x:32:-16,z:-16</id>
    </readout>
  </readouts>

Our tracking detector layers all use the same geometry plugin, readout name, and hit collection (just different dimensions, positions, etc, which are not simply a different radius and for ACTS geometry requirements cannot be in a single detector).

In the past this worked (i.e. the hits in all layers are written to the same collection), then it didn't, then it worked again, and now it doesn't again. Since the sensor segmentation, digitization, basically entire processing chain, are identical for all these detectors, we want to limit repeating ourselves by having a separate (identical) readout for each layer, then a separate (identical) digitization algorithm, etc... though that level of duplication is the workaround here.

@andresailer andresailer reopened this May 7, 2023
@tmadlener
Copy link
Contributor

OK. So the switching between working and not working was / is definitely a problem and not really intended. IMHO the intended behavior (from a podio point of view) is the one that we have right now, because otherwise we would need to make it possible to change collections that are already in a Frame, something we cannot really do without giving up a few nice properties for multi threading.

On the other hand, I would say this is something that DD4hep should be able to handle, because that has more control over when it puts a collection into a Frame and it should be possible to keep collections open until the end of an event and only then put them into the Frame that is written.

@wdconinc
Copy link
Contributor Author

wdconinc commented May 7, 2023

Right now, in DD4hep, we call saveEvent once for every hit collection:

void Geant4Output2EDM4hep::saveCollection(OutputContext<G4Event>& /*ctxt*/, G4VHitsCollection* collection) {

That is where the m_frame gets the collection put in:

m_frame.put(std::move(sthc), colName);

Except for MCParticles, which is put in on commit:

void Geant4Output2EDM4hep::commit( OutputContext<G4Event>& /* ctxt */) {

I wonder if we can upgrade the local edm4hep::SimTrackerHitCollection sthc in saveEvent to a class member map<string,edm4hep::SimTrackerHitCollection> Geant4Output2EDM4hep::m_simTrackerHitCollectionMap keyed on colName (and similar with calorimeter hits and contributions). On commit we can still std::move if we use map::extract (C++17), https://en.cppreference.com/w/cpp/container/map/extract, until the map is empty.

If this sounds reasonable, I'll take a stab at it (... probably after CHEP, or at least after my talk is done).

@wdconinc
Copy link
Contributor Author

wdconinc commented May 7, 2023

(feel free to transfer the issue to DD4hep where it may now belong)

@andresailer andresailer transferred this issue from AIDASoft/podio May 8, 2023
@wdconinc wdconinc changed the title [DRAFT] Writing multiple podio collections with the same name in DD4hep... Writing multiple podio collections with the same name in DD4hep... May 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants