Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with regexSensitiveDetector and volume IDs #1319

Open
lopezzot opened this issue Aug 26, 2024 · 24 comments
Open

Problem with regexSensitiveDetector and volume IDs #1319

lopezzot opened this issue Aug 26, 2024 · 24 comments
Labels

Comments

@lopezzot
Copy link

lopezzot commented Aug 26, 2024

Dear All,

I need to ask about an issue with the recent regexSensitiveDetector. I am using it on lxplus alma9 machines sourcing the key4hep nightlies. The geometry I am developing has several sensitive volumes (optical fibers) hence the memory footprint is huge (in the configuration used below it takes about 6 GB (4 from geometry and 2 from sensitive detectors)). Using regexSensitiveDetector helps a lot and the memory footprint reduces to 4 GB (i.e. only the geometry contribution).

Unfortunately, when using regexSensitiveDetector I find a problem using the method volumeID(aStep) inside Geant4SensitiveAction<>::process(). When using standard sensitive detectors, volumeID(aStep) returns the correct volume ID number; but when I use regexSensitiveDetector it returns 0.

The only difference when using regexSensitiveDetector is: marking optical fibers as "non sensitive" in the compact file and adding in the steering file

SIM.geometry.regexSensitiveDetector['DREndcapTubes'] = {'Match': ['DRETS'],'OutputLevel': 4,}

Please let me know if I am doing anything wrong here. Thank you!

Reproducer using lxplus alma9 machines.

Using standard sensitive detector (action):

git clone git@github.com:lopezzot/DREndcapTube.git
cd DREndcapTube/
git switch lp-standardsd
mkdir build; cd build
source /cvmfs/sw-nightlies.hsf.org/key4hep/setup.sh
cmake -DCMAKE_INSTALL_PREFIX=../install/ ..
make install
cd ..
cd install/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/lib64
cd ..
ddsim --steeringFile scripts/steer_e-_10GeV_SDAction.py

In the sensitive detector action process method I added this printout:

--> DREndcapTubes: track info: 
----> Track #: 76442 Step #: 23 Volume: DRETSclad_CLog_0 
--> DREndcapTubes:: position info: 
----> x: -9.99946y: 901.616z: 2524.28
--> DREndcapTubes: particle info: 
----> Particle gamma Dep(MeV) 0 Mat Fluorinated_Polymer Vol DRETSclad_CLog_0
--> DREndcapTubes: physVolID info: 
----> Volume ID 1402451615784 Endcap ID 0 Stave ID 10
----> Tower ID 22 Air ID 0
----> Col ID 69 Row ID 13
----> Clad ID 1 Core ID 0 Cherenkov ID 1

Note here "Volume ID" is not 0 (and correct).

If the same is repeated using regexSensitiveDetector

cd build
rm -rf *
git switch lp-regexsearchsd
cmake -DCMAKE_INSTALL_PREFIX=../install/ ..
make install
cd ../install
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/lib64
cd ..
ddsim --steeringFile scripts/steer_e-_10GeV_SDAction.py

the printout becomes

--> DREndcapTubes: track info: 
----> Track #: 1 Step #: 40 Volume: DRETSclad_CLog_0 
--> DREndcapTubes:: position info: 
----> x: 0.164063y: 900.053z: 2509.6
--> DREndcapTubes: particle info: 
----> Particle e- Dep(MeV) 0.0257735 Mat Fluorinated_Polymer Vol DRETSclad_CLog_0
--> DREndcapTubes: physVolID info: 
----> Volume ID Geant4VolumeManager INFO  +++   Bad volume Geant4 Path: /Tank_0/phiER_9/towerLog_22/capillary_CLog_4275/DRETSclad_CLog_0
0 Endcap ID 0 Stave ID 0
----> Tower ID 0 Air ID 0
----> Col ID 0 Row ID 0
----> Clad ID 0 Core ID 0 Cherenkov ID 0

Note here "Volume ID" is 0, and the volID() method throws a warning Geant4VolumeManager INFO +++ Bad volume Geant4 Path: /Tank_0/phiER_9/towerLog_22/capillary_CLog_4275/DRETSclad_CLog_0.

@MarkusFrankATcernch
Copy link
Contributor

MarkusFrankATcernch commented Aug 26, 2024

Lorenzo,
First of all, I am not sure the above links are safe.
Edit by Andre: I deleted the posted comments.

Now to your problem:
The concept of using volume- and cell-IDs is not compatible with the concept of using the regexSensitiveDetector.
Volume- and cell-IDs require the DDG4 volume manager to be executed which leads to the huge memory usage you observe.
Consequently for subdetectors using regexSensitiveDetector these functions cannot be used at all.
You need to find a different way to get hold of volume IDs e.g. by navigating in the G4 touchable history and matching to
the DD4hep volumes to extract each volume id and then combining the for each hit.
There is unfortunately no free lunch in life.

@AIDASoft AIDASoft deleted a comment Aug 26, 2024
@AIDASoft AIDASoft deleted a comment Aug 26, 2024
@AIDASoft AIDASoft deleted a comment Aug 26, 2024
@AIDASoft AIDASoft deleted a comment Aug 26, 2024
@AIDASoft AIDASoft deleted a comment Aug 26, 2024
@lopezzot
Copy link
Author

lopezzot commented Aug 27, 2024

Hello Markus,

thank you!
I understand I cannot use volumeID() method for this subdetector. Just a clarification, when you write

navigating in the G4 touchable history and matching to the DD4hep volumes to extract each volume id

you mean navigating the Geant4 history and get Geant4 copynumbers (i.e. step->GetPreStepPoint()->GetTouchable()->GetCopyNumber()) instead of using dd4hep volumeID()) ? In other words, would Geant4 copynumbers assigned in geometry construction still work if regexSensitiveDetector is used?

@MarkusFrankATcernch
Copy link
Contributor

Hi Lorenzo,

  1. When building the subdetector in the detector constructor you give the volume placements the so-called volIDs using PlacedVolume::addPhysVolID(const std::string& name, int value). This is as usual.
  2. From the translation of DD4hep to Geant4 you can access the mapping between DD4hep(TGeo) entities and Geant4 entities
    using: Geant4GeometryInfo& p = Geant4Mapping::instance();. See the proper include files for details.
  3. in the touchable history you should be able to navigate all the placed volumes up to the world volume.
    Using the chain of placed volumes in Geant4 (G4VPhysicalVolume) you get the corresponding PlacedVolume and from this the the volIDs. There may be some extra gymnastics if the PlacedVolume is an Assembly.
  4. you combine the voilIDs and like this obtain the volumeID of the chain where the hit is.

This is nothing else what the VolumeManager does ,but the VolumeManager caches all this information together with the transformation from the sensitive placements to the world. Here you would do this calculation every time you need the volumeID and/or the transformation to the world or the parent volume.

Another possibility would be to have some alternative VolumeManager plugin only providing the absolute minimum of information necessary - hence also reducing the need to cache memory.

You see there are multiple possibilities to address this problem - depending on the specific needs and requirements. The existing VolumeManager is only one.

@lopezzot
Copy link
Author

Hello Markus,

thank you once again!
If I got it correctly when using regexSensitiveDetector you are suggesting something like this to access volIDs

G4VPhysicalVolume* PhysVol = aStep->GetPreStepPoint()->GetTouchableHandle()->GetVolume();
Geant4Mapping& Mapping = Geant4Mapping::instance();
PlacedVolume PlacedVol = Mapping.placement(PhysVol);
const PlacedVolumeExtension::VolIDs& TestIDs = PlacedVol.volIDs();
auto it = TestIDs.find("name");
std::cout<<it->first<<" "<<it->second<<std::endl;

this works in my simulation, however it makes it about 350 times slower.

@MarkusFrankATcernch
Copy link
Contributor

Yes. Something like this.
At this very stage however, some thinking has to enter the game like

  • Where are the expensive lookups, how can these be optimized ?
  • Is my subdetector structure really optimal ?
  • ... name it ...
    but clearly, the brute force approaches do not work in certain schemes anymore.

@MarkusFrankATcernch
Copy link
Contributor

P.S. Another possibility would e.g. be to play with the copy numbers provided there are no assemblies in the volume chain.
Then the copy numbers should be the same in TGeo and Geant4 and the resulting volume ID of the path
could be constructed without lookups. That would be fast .....

@lopezzot
Copy link
Author

P.S. Another possibility would e.g. be to play with the copy numbers provided there are no assemblies in the volume chain. Then the copy numbers should be the same in TGeo and Geant4 and the resulting volume ID of the path could be constructed without lookups. That would be fast .....

Yes, I was thinking about this possibility (I do not use assemblies in the dd4hep geometry constructor). One could use the "Geant4" copynumbers which are fast to access (step->GetPreStepPoint()->GetTouchable()->GetCopyNumber()). If this is done on the entire volume chain up to the world volume, then the dd4hep::BitFieldCoder could be used to create the volID using all the copynumbers. This would be identical to the volID from the usual DD4hep volumeID(step) method, correct?

The only question I have about this method is as follows.
In the geometry constructor copynumbers are assigned as parameters of the PlaceVolume() method (similar to Geant4 style). While the DD4hep volID is assigned with PlacedVolume.addPhysVolID("string",int) and I can concatenate multiple volIDs (e.g. PlacedVolume.addPhysVolID().addPhysVolID()). If I want to make sure that the copynumber is identical to the volID, I can only set one volID per placed volume (i.e. doing .addPhysVolID() only once on each placed volume). Otherwise, the volID created with the BitFieldCoder using all the copynumbers chain would never be identical to the dd4hep one.
Do you agree with this?

Thank you

@MarkusFrankATcernch
Copy link
Contributor

You have to be careful here:

  • copy numbers have nothing to do with the volIDs in DD4hep.
  • TGeo assigns automatically a copy number, which is the number of the daughter volume in the parent.
    It is a sequence from 0...n and not changeable. When translating to Geant4 I re-use this numbers (to be checked).
    So whatever you come up with, you have to live with these conditions.
    There is no freedom then anymore.

@lopezzot
Copy link
Author

lopezzot commented Aug 29, 2024

Ok... then what you meant with this

P.S. Another possibility would e.g. be to play with the copy numbers provided there are no assemblies in the volume chain.
Then the copy numbers should be the same in TGeo and Geant4 and the resulting volume ID of the path
could be constructed without lookups. That would be fast .....

is that I can construct some ID using the copynumbers, but there is no way to make this ID identical to the volID of DD4hep. Correct?

Please allow another question here. If copynumbers are assigned "automatically" and are "not changeable". Why does such a method exist in DD4hep?

/** Daughter placements with user supplied copy number for the daughter volume  */
/// Place daughter volume. The position and rotation are the identity
PlacedVolume placeVolume(const Volume& volume, int copy_no) const;

@MarkusFrankATcernch
Copy link
Contributor

I think I should remove this placement operation from DD4hep.
If you use placeVolume(const Volume& volume, int copy_no) and have "holes" in the copy numbers TGeo will have problems looping over daughters by index.
You will simply bomb....

@lopezzot
Copy link
Author

lopezzot commented Aug 29, 2024

Setting aside the timing bomb method PlacedVolume placeVolume(const Volume& volume, int copy_no) const. Let me try to summarize.

Simulating highly-granular calorimeters with DD4hep standard sensitive detector (action) will cache some information that allows the volumeID(step) and cellID(step) methods to work and to be quick. Unfortunately, this results in a large memory footprint that might be unacceptable (in my case 11 GB on top of the geometry footprint). regexSensitiveDetector allows the usage of a sensitive detector (action) without caching all this information (the 11 GB in my case just disappear), but the volumeID() method will not work. You might still access volIDs with something like

G4VPhysicalVolume* PhysVol = aStep->GetPreStepPoint()->GetTouchableHandle()->GetVolume();
Geant4Mapping& Mapping = Geant4Mapping::instance();
PlacedVolume PlacedVol = Mapping.placement(PhysVol);
const PlacedVolumeExtension::VolIDs& TestIDs = PlacedVol.volIDs();
auto it = TestIDs.find("name");
std::cout<<it->first<<" "<<it->second<<std::endl;

but this makes the simulation slower by order of magnitudes (another no-no).
On top of it, there is no way to recreate the dd4hep volIDs exploiting copynumbers, as such should not be interpreted as Geant4 copynumbers, i.e. an int to be associated at will to the placed volume, but they are automatically created by DD4hep and represent the number of daughters (or something close to it).

Unfortunately, this seems like a strong limitation. My geometry represents a calorimeter with 30 million volumes, but using proper nesting I managed to have in-memory less than 1 million placed volumes. I agree this is huge but I do not consider it out-of-this-world.

@lopezzot
Copy link
Author

Actually, I am observing something different about placeVolume(const Volume& volume, int copy_no).

I am setting some copynumbers with it to 99. Then inside Geant4SensitiveAction<>::process() step->GetPreStepPoint()->GetTouchable()->GetCopyNumber() correctly returns 99. It seems it is working as in Geant4...

Are you sure about this?

TGeo assigns automatically a copy number, which is the number of the daughter volume in the parent.
It is a sequence from 0...n and not changeable. When translating to Geant4 I re-use this numbers (to be checked).

@MarkusFrankATcernch
Copy link
Contributor

If you have non-secutive copy numbers typical loops like

  int num = nodes ? nodes->GetEntries() : 0;
  for (int i = 0; i < num; ++i)
    auto* n = (TGeoNode*)nodes->At(i);
  }

Did not work for me at some point in time......At least I once ran into SEGV's.
This is then a problem.

I cannot verify this now in the TGeoVolume code. You may perhaps give it a try.
But be vigilant against this problem - I may have not spotted the relevant code sections.

@s6anloes
Copy link

nodes would be something like a vector of volumes in this case?
It's not clear to me how this is related to the volume copy number. The vector would simply be filled one by one and the copy number would be irrelevant, no?
Unless it's more like a std::map<int, Volume> where the key is the copy number?

@MarkusFrankATcernch
Copy link
Contributor

Nodes would be something like "TObjArray" in this case. Discussion is void. We have no handle on internals of TGeo.

@lopezzot
Copy link
Author

Hello,

trying to add something on placeVolume(const Volume& volume, int copy_no).

This method adds a node to the parent volume (m_element)

PlacedVolume Volume::placeVolume(const Volume& volume, int copy_no) const {
  return _addNode(m_element, volume, copy_no, detail::matrix::_identity());
}

where _addNode essentially does

PlacedVolume _addNode(TGeoVolume* par, TGeoVolume* daughter, int id, TGeoMatrix* transform) {
  TGeoVolume* parent = par;
  // a lot of things
  /* n = */ parent->AddNode(daughter, id, transform);
  // some more
}

and AddNode of TGeoVolume class does

void TGeoVolume::AddNode(TGeoVolume *vol, Int_t copy_no, TGeoMatrix *mat, Option_t * /*option*/)
{
   // some things
   TGeoNodeMatrix *node = 0;
   node = new TGeoNodeMatrix(vol, matrix);
   // some things on node
   node->SetNumber(copy_no);
}

DD4hep users can set TGeoVolume's copy numbers at will, there is no check on this throughout the code. If this copy number is used internally in TGeo functionalities then there is a problem.

One last thing, I can confirm that Geant4 physical volume copynumbers created in the geometry conversion are copied from their TGeo counterpart (whatever this number is). DD4hep users can check this with something like this

auto cpno = aStep->GetPreStepPoint()->GetTouchable()->GetCopyNumber();
G4VPhysicalVolume* PhysVol = aStep->GetPreStepPoint()->GetTouchableHandle()->GetVolume();
Geant4Mapping& Mapping = Geant4Mapping::instance();
const PlacedVolume& PlacedVol = Mapping.placement(PhysVol);
std::cout<<"dd4hep copyNumber "<<PlacedVol.copyNumber()<<" g4 copynumber "<<cpno<<std::endl;

in my simulation, they are identical.

Regardless of this issue, I still have the problem I summarized a few messages above. The simulation I am doing is part of the CERN FCCee program, I cannot give up on this (yet). Please let me know what could be done.

@MarkusFrankATcernch
Copy link
Contributor

OK. But then your problem should be sort of solved:
You define the volID of a volume as being the copy number and set these values directly in the geometry constructor.

Then, during simulation you use the touchable history, walk up the path, logically 'or' the copy numbers and
what you get is the volume id of the sensitive volume in question - job done.

Whether to implement this as a utility is another question to be discussed, but the mechanism should work as described.
Do I miss here something?

@lopezzot
Copy link
Author

Hi, yes this is what I had in mind. And I thought that by keeping the copynumbers and the volIDs identical in the geometry constructor, one could even use the dd4hep::BitFieldCoder to recreate the dd4hep volID using the full history of copynumbers (to be checked if this is possible).

However, I kind of understood from your comments before that setting copynumbers is dangerous and you experienced problems with it in the past.

Did not work for me at some point in time......At least I once ran into SEGV's. This is then a problem.

If you use placeVolume(const Volume& volume, int copy_no) and have "holes" in the copy numbers TGeo will have problems looping over daughters by index. You will simply bomb....

Would you consider this approach safe or not?

Thank you.

@MarkusFrankATcernch
Copy link
Contributor

MarkusFrankATcernch commented Aug 31, 2024

Concerning the crashes:

I am not perfect, maybe I made a mistake in the past and remembered the failure with the copy numbers this as folklore....

Concerning volids:

The BitFieldDecoder takes 64 bits, copy numbers are 32 bits (aka int). You can only store a part
of the bitfield in the copy number. You have to restruct yourself to 32 bits for the VolumeID

Hence:
no system-id, and no high part of the field (which are usually anyhow only local coordinates on the sensitive volume).
The CellID you could then still implement using local coordinates on the sensitive volume. which would not be part of the copy number.

P.S. If you could come up with a sort of generic utility of common interest for this kind of sensitive detector implementation I would like to include it in the DD4hep code. All this sounds really interesting.

@lopezzot
Copy link
Author

Dear All,

Apologies for the long delay.
I reconstructed the 64-bit volID using only Geant4 copynumbers in the Geant4SensitiveAction<>::process() method. Essentially it is just bit-shifting operations divided into two steps:

  1. In the detector construction (create_detector()method).
    When volumes are placed I use the placeVolume() method in which an int is passed, this int will be the Geant4 copynumber to be accessed during the simulation from the step object. If the PlacedVolume only assigns one volID, i.e. .addPhysVolID() is called only once, I make sure the copynumber set is identical to the volID set (and I also make sure this number does not need more than 32 bits).
    If the PlacedVolume has more than one call to .addPhysVolID() I divide the 32 bits of the copynumber to allocate all the volIDs. For instance, when tubes of my calorimeter are placed, I assign two IDs corresponding to the raw and column (.addPhysVolID("row",k).addPhysVolID("col",j)). Then I divide the copynumber into a first 16 bits corresponding to the raw and a second 16 bits corresponding to the column. A simple function like this does the job
unsigned int CalcCpNo32bits(int id1, int id2)
{
  if (id1 > 0xFFF || id2 > 0xFFF){
    throw std::invalid_argument("row and col IDs should not take more than 16 bits");
  }
  unsigned int CpNo = (id1 << 16) | id2; 
  return CpNo;
}
  1. During the step processing (process() method) .
    Here I access every copynumber from the step starting with the current volume and navigating up to the uppermost volume (using aStep->GetPreStepPoint()->GetTouchable()->GetCopyNumber(n)). The copynumbers corresponding to a single ID are simply casted to unsigned int. The copynumbers which have been split to allocate multiple IDs are divided with bit masks. In the case of my tubes, something like this
auto TubeID = aStep->GetPreStepPoint()->GetTouchable()->GetCopyNumber(2);
unsigned int ColumnID = TubeID >> 16;
unsigned int RawID = TubeID & 0xFFFF;

Then, I can allocate 64 bits to reconstruct the DD4hep volID. For instance, having in my case this structure <id>stave:10,tower:6,air:1,col:16,row:16,clad:1,core:1,cherenkov:1</id> I do

uint64_t VolID = 0;
VolID |= (uint64_t)StaveID << 0;
VolID |= (uint64_t)TowerID << 10;
VolID |= (uint64_t)0 << 16;
VolID |= (uint64_t)ColumnID << 17;
VolID |= (uint64_t)RawID << 33;
VolID |= (uint64_t)1 << 49;
VolID |= (uint64_t)secondBitValue << 50;
VolID |= (uint64_t)firstBitValue << 51;

This VolID is identical to the one I get with the DD4hep method volumeID(aStep) when regexSensitiveDetector is not used.
With this trick it is possible to use regexSensitiveDetector without allocating a lot of memory for the sensitive volumes but still having access to the correct volID which can be used at later stages like in the analysis.

Now, can this become a generic utility? I am not sure at the moment... for sure some gymnastic is needed and also checks that the bit-shifting operations stay within boundaries.

FYI @s6anloes @SanghyunKo

@andresailer
Copy link
Member

Hi @lopezzot

There is a tool for the volumeID gymnastics, the BitFieldCoder.

https://dd4hep.web.cern.ch/dd4hep/reference/classdd4hep_1_1DDSegmentation_1_1BitFieldCoder.html#details

VolumeID VolID = 0;
BitFieldCoder bc("stave:10,tower:6,air:1,col:16,row:16,clad:1,core:1,cherenkov:1");
bc.set(VolID, "stave" , StaveID);
bc.set(VolID, "tower" , TowerID);
//...

Though it doesn't do range checks

@lopezzot
Copy link
Author

Hi,

I believe the last part when the volID is created could use the BitFieldCoder, thanks!

However, when I have to split the copynumber into two sets of 16-bits in the geometry construction, there should be range check with some protections like the one I added.

My two cents.

@andresailer
Copy link
Member

(.addPhysVolID("row",k).addPhysVolID("col",j)).

You mean add a addPhysVolID function that takes a vector of pairs or tuples (name, value, number of bits?) and do the magic bit shifting stuff?

@lopezzot
Copy link
Author

Sort of, yes. A function that takes a vector of (name, value, bits) and creates an int with the values inserted in the right bit position. This int can be used as the copynumber for the G4 geometry. Ideally bit ranges should be checked when creating the int.

Then one would need a function to be used in the SDAction that takes the copynumber and splits it into values. The BitFieldCoder does this over 64-bits, if it can be used also on 32-bits then we have this function already implemented.

I don't know if this should be a generic utility of just be left to the users to implement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants
@andresailer @MarkusFrankATcernch @lopezzot @s6anloes and others