Add LCIO to EDM4hep conversion functionality (#11)
* Add LCIO to EDM4hep conversion functionality

* Convert ParticleIDs as part of the ReconstructedParticles

* Add library for basic comparisons of LCIO and EDM4hep

* Add a standalone executable for converting LCIO to EDM4hep

* Add executable for comparing lcio and edm4hep file contents

* Add .vscode folder to .gitignore

* adding a conversion of the EventHeader and its data

* Create README.md

* Added a conversion for LCFloatVec and LCIntVec Collections.

* conversion of the event parameters

* Conversion of information in LCRunHeader

* Conversion of ParticleIDs of the Cluster Collection

---------

Co-authored-by: Finn Johannsen <finn.johannsen@desy.de>
Co-authored-by: jmcarcell <jmcarcell@users.noreply.github.com>
Co-authored-by: FinnJohannsen <57949495+FinnJohannsen@users.noreply.github.com>
4 people committed Jun 13, 2023
1 parent 00b0112 commit b50bef5
Showing 13 changed files with 2,508 additions and 3 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -34,4 +34,7 @@ install*
*.app

# CMake generated
Testing
Testing

# Tooling
.vscode
6 changes: 6 additions & 0 deletions CMakeLists.txt
@@ -26,5 +26,11 @@ find_package(LCIO REQUIRED)
find_package(EDM4HEP REQUIRED)

add_subdirectory(k4EDM4hep2LcioConv)
add_subdirectory(standalone)

include(CTest)
if (BUILD_TESTING)
add_subdirectory(tests)
endif()

include(cmake/k4EDM4hep2LcioConvCreateConfig.cmake)
125 changes: 125 additions & 0 deletions doc/LCIO2EDM4hep.md
@@ -0,0 +1,125 @@
# Standalone conversion from LCIO to EDM4hep
The `lcio2edm4hep` executable reads LCIO (`.slcio`) files and converts their
contents into EDM4hep. Each `LCEvent` of the input file is put into a
`podio::Frame` in the output file (under the `events` category). The most basic
usage is simply

```bash
lcio2edm4hep <input.slcio> <output.edm4hep.root>
```

## Patching missing collections on the fly
A major difference between LCIO and EDM4hep is that in LCIO an `LCEvent` can
effectively have arbitrary contents, whereas in EDM4hep the assumption is that
each event consists of the same collections (even if some of them are empty).
Hence, it is necessary either to ensure that all events in the LCIO file have
the same contents or to give `lcio2edm4hep` some additional information such
that it can patch in potentially missing collections on the fly. This additional
information comes in the form of a third argument to `lcio2edm4hep`: a file
listing the names and types of the superset of all collections appearing in at
least one event in the input LCIO file. Each collection takes a single line
containing the name first and then its type, e.g.

```
SETSpacePoints TrackerHit
RecoMCTruthLink LCRelation[ReconstructedParticle,MCParticle]
```
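
To make the line format concrete, here is a minimal sketch of how such a file could be parsed. It is not the parser used by `lcio2edm4hep`; `PatchEntry`, `parsePatchLine` and `parsePatchText` are hypothetical names, and the sketch assumes that name and type are separated by whitespace and that the bracketed `LCRelation` types contain no spaces, as in the example above.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line of the patch file (not a library type).
struct PatchEntry {
  std::string name;     // collection name, e.g. "SETSpacePoints"
  std::string type;     // LCIO type, e.g. "TrackerHit" or "LCRelation"
  std::string fromType; // only filled for LCRelation[From,To]
  std::string toType;   // only filled for LCRelation[From,To]
};

// Parse a single "<name> <type>" line. Returns std::nullopt if the line is
// blank or malformed.
std::optional<PatchEntry> parsePatchLine(const std::string& line) {
  std::istringstream stream(line);
  PatchEntry entry;
  if (!(stream >> entry.name >> entry.type)) {
    return std::nullopt;
  }
  // Split "LCRelation[From,To]" into the base type and the two related types.
  const auto open = entry.type.find('[');
  if (open != std::string::npos) {
    const auto comma = entry.type.find(',', open);
    const auto close = entry.type.rfind(']');
    if (comma == std::string::npos || close == std::string::npos || close < comma) {
      return std::nullopt;
    }
    entry.fromType = entry.type.substr(open + 1, comma - open - 1);
    entry.toType = entry.type.substr(comma + 1, close - comma - 1);
    entry.type = entry.type.substr(0, open);
  }
  // Postcondition: the stored type no longer carries a bracketed suffix.
  assert(entry.type.find('[') == std::string::npos);
  return entry;
}

// Parse the full text of a patch file, skipping blank or malformed lines.
std::vector<PatchEntry> parsePatchText(const std::string& text) {
  std::vector<PatchEntry> entries;
  std::istringstream stream(text);
  std::string line;
  while (std::getline(stream, line)) {
    if (auto entry = parsePatchLine(line)) {
      entries.push_back(*entry);
    }
  }
  return entries;
}
```

Calling `parsePatchLine` on the second example line would yield the name `RecoMCTruthLink`, the type `LCRelation`, and the related types `ReconstructedParticle` and `MCParticle`.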

The easiest way to obtain such a file is to use the `check_missing_cols`
executable that comes with LCIO using the `--minimal` flag. The output of this
can be directly consumed by `lcio2edm4hep`.

#### Example:
1. Get the patch file
```bash
check_missing_cols --minimal \
/pnfs/desy.de/ilc/prod/ilc/mc-2020/ild/rec/250-SetA/higgs/ILD_l5_o2_v02/v02-02-01/00015671/000/rv02-02-01.sv02-02-01.mILD_l5_o2_v02.E250-SetA.I402005.Pe3e3h.eL.pR.n000_002.d_rec_00015671_493.slcio \
> patch.txt
```
2. Pass it to `lcio2edm4hep`
```bash
lcio2edm4hep \
/pnfs/desy.de/ilc/prod/ilc/mc-2020/ild/rec/250-SetA/higgs/ILD_l5_o2_v02/v02-02-01/00015671/000/rv02-02-01.sv02-02-01.mILD_l5_o2_v02.E250-SetA.I402005.Pe3e3h.eL.pR.n000_002.d_rec_00015671_493.slcio \
Output.root \
patch.txt
```
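
Conceptually, patching means that every event ends up with the full superset of collections, with empty ones filled in where the input event lacked them. The following is a minimal sketch of that idea using toy types only; `ToyCollection`, `ToyEvent` and `patchMissing` are hypothetical and do not reflect how `lcio2edm4hep` implements the patching internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a typed collection (not an LCIO or EDM4hep type).
struct ToyCollection {
  std::string type;
  std::vector<int> elements;
};

// Toy event: collection name -> collection.
using ToyEvent = std::map<std::string, ToyCollection>;
using NameAndType = std::pair<std::string, std::string>;

// Add an empty collection of the listed type for every name in the superset
// that the event does not contain yet. Existing collections are left
// untouched. Returns the number of collections that were patched in.
std::size_t patchMissing(ToyEvent& event, const std::vector<NameAndType>& superset) {
  std::size_t nPatched = 0;
  for (const auto& [name, type] : superset) {
    // try_emplace only inserts if the key is not present yet
    if (event.try_emplace(name, ToyCollection{type, {}}).second) {
      ++nPatched;
    }
    // Postcondition: every listed collection now exists in the event.
    assert(event.count(name) == 1);
  }
  return nPatched;
}
```

After patching, all events expose the same set of collection names, which is the assumption EDM4hep makes about events.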

### Converting `LCRelation` collections
For collections of `LCRelation` type it is necessary to define the `FromType` and
`ToType` as well, as otherwise the converter will not be able to create the
correct EDM4hep file. The `check_missing_cols` executable will try to determine
these types from the collection parameters and will warn if it cannot do so for
certain collections. In this case it is the **user's responsibility to provide
the missing types**, as otherwise the conversion will simply skip these
collections, or potentially even crash.


# Library usage of the conversion functions
The conversion functions are designed to also be usable as a library. The conversion is a two-step process: step one converts the data, and step two resolves the relations and fills the subset collections.

## Converting collections (data)
The main entry point is `convertCollection` which will automatically dispatch to
the correct conversion function depending on the type information that is stored
in the input `LCCollection`. It is also possible to access the individual
conversion functions for each type. All of the conversion functions take a map
of LCIO to EDM4hep objects of their specific type that will be filled during the
conversion. For convenience, all necessary maps are bundled in the
`LcioEdmTypeMapping` struct.

## Handling relations
**Once all necessary collections have been converted, it is necessary to resolve
the relations between the objects.** This is done using the `resolveRelations`
function. This will again dispatch to the correct relation resolving function
for the corresponding types, which can also be invoked directly.
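
The reason for the two steps can be illustrated with a self-contained toy model. This is only a sketch of the pattern, not the library code: `SrcParticle`, `DstParticle`, `ObjectMap`, `convertData` and `resolveToyRelations` are hypothetical names. Step one copies the plain data and records which source object became which target object; step two uses that record to translate relations, which only works for objects that have already been converted.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Toy source object with a pointer relation to another source object.
struct SrcParticle {
  double energy;
  const SrcParticle* parent; // may be nullptr
};

// Toy target object; the relation is an index into the target collection.
struct DstParticle {
  double energy;
  int parentIndex; // -1 means "no relation"
};

// Maps each converted source object to the index of its target counterpart.
using ObjectMap = std::unordered_map<const SrcParticle*, int>;

// Step 1: convert the data only and fill the source -> target mapping.
std::vector<DstParticle> convertData(const std::vector<SrcParticle>& src, ObjectMap& mapping) {
  std::vector<DstParticle> dst;
  dst.reserve(src.size());
  for (const auto& particle : src) {
    mapping[&particle] = static_cast<int>(dst.size());
    dst.push_back({particle.energy, -1});
  }
  return dst;
}

// Step 2: resolve relations using the mapping filled in step 1. Relations to
// objects that were never converted stay unresolved (-1).
void resolveToyRelations(const std::vector<SrcParticle>& src, std::vector<DstParticle>& dst,
                         const ObjectMap& mapping) {
  assert(src.size() == dst.size());
  for (std::size_t i = 0; i < src.size(); ++i) {
    if (src[i].parent == nullptr) {
      continue;
    }
    const auto it = mapping.find(src[i].parent);
    if (it != mapping.end()) {
      dst[i].parentIndex = it->second;
    }
  }
}
```

Resolving relations before all data has been converted would leave relations to not-yet-converted objects unset, which is why the resolving step has to come last.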

## Handling of subset collections
Subset collections are handled similarly to relations using the function
`fillSubset`. Internally this simply forwards to `handleSubsetColl`, which
handles all the type details and can also be used directly.

## Handling of `LCRelation`s
`LCRelation`s only exist in LCIO and their conversion is limited to what is
available in EDM4hep. They use the `"FromType"` and `"ToType"` collection
parameters to get the necessary type information.

The AssociationCollections in EDM4hep are then created using `createAssociations`.

## Converting entire events
Converting an entire event can be done by calling `convertEvent`. This can also
be used as an example to guide the implementation of custom conversions using
the available functionality.

## Converting Event parameters
This can be done by calling `convertObjectParameters`, which puts all the event parameters into the passed `podio::Frame`.

## Subtle differences between LCIO and EDM4hep
There are a few small differences between LCIO and EDM4hep that shine through in the conversion:

- `CaloHitContributions` are part of the SimCalorimeterHits in LCIO while being their own data type in EDM4hep. They are created by [`createCaloHitContributions`](../k4EDM4hep2LcioConv/include/k4EDM4hep2LcioConv/k4Lcio2EDM4hepConv.h).
- The event information is part of the `LCEvent` in LCIO, whereas EDM4hep has a separate EventHeader collection. It can be created using [`EventHeaderCollection`](../k4EDM4hep2LcioConv/include/k4EDM4hep2LcioConv/k4Lcio2EDM4hepConv.h) and is stored under the name `"EventHeader"`.
- Particle IDs are converted during the conversion of the ReconstructedParticle collection.

## Example for a ReconstructedParticle Collection
```cpp
#include "k4EDM4hep2LcioConv/k4Lcio2EDM4hepConv.h"

// the struct defined in the header file is used for the maps linking Lcio particles
// to their EDM counterparts.

auto typeMapping = LcioEdmTypeMapping{};

// We assume that this is a collection of ReconstructedParticles!
EVENT::LCCollection* lcCollection;

// Convert the data
auto edmCollections = convertReconstructedParticle("name",
                                                   lcCollection,
                                                   typeMapping.recoParticles,
                                                   typeMapping.particleIDs);

// Resolve relations (only converted objects will be available)
// This has to be called at the very end, after all collection data has been
// converted
resolveRelations(typeMapping);
```
7 changes: 5 additions & 2 deletions k4EDM4hep2LcioConv/CMakeLists.txt
@@ -1,5 +1,7 @@
add_library(k4EDM4hep2LcioConv SHARED
src/k4EDM4hep2LcioConv.cpp)
src/k4EDM4hep2LcioConv.cpp
src/k4Lcio2EDM4hepConv.cpp
)

target_include_directories(k4EDM4hep2LcioConv PUBLIC
${LCIO_INCLUDE_DIRS}
@@ -12,7 +14,8 @@ target_link_libraries(k4EDM4hep2LcioConv PUBLIC

set_target_properties(${CMAKE_PROJECT_NAME}
PROPERTIES
PUBLIC_HEADER "include/${CMAKE_PROJECT_NAME}/k4EDM4hep2LcioConv.h")
PUBLIC_HEADER "include/${CMAKE_PROJECT_NAME}/k4EDM4hep2LcioConv.h;include/${CMAKE_PROJECT_NAME}/k4Lcio2EDM4hepConv.h"
)

install(TARGETS k4EDM4hep2LcioConv
EXPORT ${CMAKE_PROJECT_NAME}Targets