How to Run the MDF2CSV Program#
The MDF2CSV program allows users to convert MDF files into CSV files for further analysis, using algorithms written in the Allen framework.
Danger
The Allen MDF2CSV program is currently available, but it has not been extensively tested. Its functionality is being maintained in case it is needed in the future.
The CSV dumping algorithms in Allen#
A dedicated sequence, persistence_csv (implemented in persistence_csv.py), runs six new algorithms, listed in the table below. They are defined in Allen/host/persistence/.
| Algorithm | Output file name | A row corresponds to … |
|---|---|---|
|  |  | Velo cluster |
|  |  | UT hit |
|  |  | SciFi hit |
|  |  | MC particle |
|  |  | An MC particle-hit relation |
|  |  | MC vertex |
The hits_velo.csv, hits_ut.csv, hits_scifi.csv and mc_particles.csv files produced by the MDF2CSV program have columns that should be similar to those generated by the XDIGI2CSV algorithm.
You can find detailed descriptions of these CSV files and their columns on the CSV Files and Columns page.
Note
Allen, the software used to dump the contents of MDF files, cannot output the MC hits (mchits_{detector} files).
Apart from the CSV files mentioned above, the MDF2CSV program can also dump two additional CSV files:
- mc_hit_links.csv: provides an alternative way to link MC particles to their hits. This file lets you associate an mcid with the lhcbids of the hits left by the MC particle.
- mc_vertices.csv: contains the positions of the MC primary vertices of every event.
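As an illustration of how mc_hit_links.csv could be used, the sketch below groups the lhcbids by mcid with awk. It assumes the file is comma-separated and starts with a header line containing columns named mcid and lhcbid (the names used in the description above; the exact header is not confirmed here). The function name links_by_particle is hypothetical. If the file also carries an event identifier and mcids are only unique within an event, the grouping key would need to include that column as well.

```shell
# Hypothetical helper: print one line per MC particle with the lhcbids of its hits.
# Assumes a comma-separated file whose header contains "mcid" and "lhcbid".
links_by_particle() {
  awk -F',' '
    NR == 1 {
      # Map each header name to its column index.
      for (i = 1; i <= NF; i++) col[$i] = i
      if (!("mcid" in col) || !("lhcbid" in col)) {
        print "expected mcid and lhcbid columns in the header" > "/dev/stderr"
        exit 1
      }
      next
    }
    {
      id = $(col["mcid"])
      if (!(id in hits)) {
        # First hit of this particle: remember the order of appearance.
        order[++n] = id
        hits[id] = $(col["lhcbid"])
      } else {
        hits[id] = hits[id] " " $(col["lhcbid"])
      }
    }
    END {
      for (k = 1; k <= n; k++) print order[k] ": " hits[order[k]]
    }
  ' "$1"
}

# Usage (path is a placeholder):
#   links_by_particle your/outdir/mc_hit_links.csv
```

Looking the columns up by header name rather than by position keeps the sketch independent of the column order, which is not specified on this page.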
How to run#
Prerequisites#
First, you need to clone and build Allen. Refer to the Allen Documentation, more specifically the Build Allen page.
Run without the XDIGI2CSV repository#
You need to configure the sequence using environment variables:
- export MDF2CSV_INCLUDE_HITS_VELO=true: dump hits_velo.csv
- export MDF2CSV_INCLUDE_HITS_UT=true: dump hits_ut.csv
- export MDF2CSV_INCLUDE_HITS_SCIFI=true: dump hits_scifi.csv
- export MDF2CSV_INCLUDE_MC_PARTICLES=true: dump mc_particles.csv
- export MDF2CSV_INCLUDE_MC_VERTICES=true: dump mc_vertices.csv
- export MDF2CSV_INCLUDE_MC_HIT_LINKS=true: dump mc_hit_links.csv
- export MDF2CSV_OUTDIR=your/outdir/: configure the output directory. If not specified, the output is saved in Allen/output/csv.
- export MDF2CSV_ERASE=true: same as erase for the XDIGI2CSV algorithm, that is, whether to erase an existing CSV file.
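To avoid typing each export by hand, the variables above can be set from a short list of output names. The following is a minimal sketch: the helper name mdf2csv_enable and the output directory are hypothetical, and it assumes that a flag left unset means the corresponding file is not dumped, which this page does not state explicitly.

```shell
# Hypothetical helper: turn output names such as "hits_velo" into the
# corresponding MDF2CSV_INCLUDE_* environment variables.
mdf2csv_enable() {
  for name in "$@"; do
    # hits_velo -> HITS_VELO
    upper="$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')"
    export "MDF2CSV_INCLUDE_${upper}=true"
  done
}

# Example: dump only the Velo hits and the MC particles.
mdf2csv_enable hits_velo mc_particles
export MDF2CSV_OUTDIR="$PWD/csv_output/"   # placeholder output directory
export MDF2CSV_ERASE=true

# Show the resulting configuration before launching Allen.
env | grep '^MDF2CSV_' | sort
```

Printing the variables at the end is only a sanity check; the Allen command itself is run afterwards from the same shell session so that it inherits them.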
Once the environment variables are set, move to the Allen/build directory and execute the following command:
/toolchain/wrapper ./Allen --sequence persistence_csv \
--mdf ../input/minbias/mdf/MiniBrunel_2018_MinBias_FTv4_DIGI_retinacluster_v1.mdf
Here, the sequence to run is persistence_csv. The --mdf option specifies the input MDF file to convert.
Using the XDIGI2CSV repository#
You can configure the MDF2CSV algorithm using a YAML configuration file such as the one provided in jobs/examples/mdf2csv/mdf2csv.yaml:
allen_input: # Input to an Allen algorithm
  indir: ../xdigi2mdf/xdigi2mdf-smog2 # Input directory
computing:
  program: mdf2csv # the algorithm to run, here MDF2CSV
mdf2csv:
  algos: # the list of algorithms to run
    - hits_velo
    - hits_ut
    - hits_scifi
    - mc_particles
    - mc_vertices
    - mc_hit_links
  erase: true # whether to erase an existing CSV file
output:
  outdir: "mdf2csv" # the output directory
You can then run it using the command:
./run/run.py -c jobs/examples/mdf2csv/mdf2csv.yaml
This is equivalent to running the command:
./run/mdf2csv/run.py -c jobs/examples/mdf2csv/mdf2csv.yaml
The Allen input can be configured in two ways. You can either provide an input directory:
allen_input:
  indir: /input/directory/
  mdf_filename: "dumped_mdf.mdf" # MDF file name in `indir`. Default is `*.mdf`
  geo_dirname: "geometry" # Geometry directory name in `indir`
This is practical because you only have to provide the input directory, rather than two separate paths for the MDF files and the geometry directory.
Alternatively, you can provide the MDF file(s) and geometry directory separately using:
allen_input:
  paths:
    - /path/to/mdf1.mdf
    - /path/to/mdf2.mdf
    - {XDIGI2CSV_REPO}/data/mdf3.mdf
    - ./some/relative/path/to/mdf4.mdf
  geodir: /path/to/geodir/
This provides more flexibility and allows you to specify the paths of individual MDF files and the geometry directory separately.
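When the list of MDF files is long, the allen_input block can be generated rather than written by hand. The sketch below prints such a block from a geometry directory and a list of files. The function name write_allen_input is hypothetical, and the layout assumes that paths and geodir are both nested directly under allen_input, as the example above suggests.

```shell
# Hypothetical helper: print an allen_input block listing individual MDF files.
# $1 is the geometry directory; the remaining arguments are MDF file paths.
write_allen_input() {
  if [ "$#" -lt 2 ]; then
    echo "usage: write_allen_input GEODIR MDF_FILE..." >&2
    return 1
  fi
  geodir="$1"
  shift
  printf 'allen_input:\n'
  printf '  paths:\n'
  for mdf in "$@"; do
    printf '    - %s\n' "$mdf"
  done
  printf '  geodir: %s\n' "$geodir"
}

# Usage (paths are placeholders):
#   write_allen_input /path/to/geodir/ /path/to/*.mdf > allen_input.yaml
```

The output only covers the allen_input section; the remaining sections of the configuration file (computing, mdf2csv, output) still have to be added as shown earlier.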