How to Run the MDF2CSV Program

The MDF2CSV program allows users to convert MDF files into CSV files for further analysis, using algorithms written in the Allen framework.

Danger

The Allen MDF2CSV program is currently available, but it has not been extensively tested. Its functionality is being maintained in case it is needed in the future.

The CSV dumping algorithms in Allen

A dedicated sequence, persistence_csv (implemented in persistence_csv.py), runs six new algorithms, listed in the table below. They are defined in Allen/host/persistence/.

| Algorithm | Output file name | A row corresponds to … |
|---|---|---|
| host_velo_persistence_t | hits_velo.csv | Velo cluster |
| host_ut_persistence_t | hits_ut.csv | UT hit |
| host_scifi_persistence_t | hits_scifi.csv | SciFi hit |
| host_mc_particles_persistence_t | mc_particles.csv | MC particle |
| host_mc_hit_links_persistence_t | mc_hit_links.csv | A MC particle - hit relation |
| host_mc_vertices_persistence_t | mc_vertices.csv | MC vertex |

The hits_velo.csv, hits_ut.csv, hits_scifi.csv and mc_particles.csv files produced by the MDF2CSV program should have columns similar to those generated by the XDIGI2CSV algorithm. Detailed descriptions of these CSV files and their columns are given on the CSV Files and Columns page.

Note

Allen, the software used to dump the contents of MDF files, cannot output the MC hits (mchits_{detector} files).

Apart from the above-mentioned CSV files, the MDF2CSV program can also dump two additional CSV files:

  • mc_hit_links.csv: provides an alternative way to link MC particles to their hits. This file lets you associate each mcid with the lhcbids of the hits left by that MC particle.

  • mc_vertices.csv: contains the positions of the MC primary vertices of every event.
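As an illustration of how mc_hit_links.csv could be used, here is a minimal sketch that groups hit IDs by MC particle using only the Python standard library. The column names `event`, `mcid` and `lhcbid` are assumptions made for this example, as are the sample values; check the header of your own file before relying on them.

```python
import csv
import io
from collections import defaultdict


def group_hits_by_particle(csv_file, event_col="event", mcid_col="mcid",
                           lhcbid_col="lhcbid"):
    """Map each (event, mcid) pair to the list of lhcbids of its hits.

    The column names are assumptions; adapt them to the actual CSV header.
    """
    links = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (int(row[event_col]), int(row[mcid_col]))
        links[key].append(int(row[lhcbid_col]))
    return dict(links)


# Tiny in-memory stand-in for mc_hit_links.csv (made-up values).
sample = io.StringIO(
    "event,mcid,lhcbid\n"
    "0,1,1001\n"
    "0,1,1002\n"
    "0,2,2001\n"
)
hits_by_particle = group_hits_by_particle(sample)
print(hits_by_particle)
```

With a real file, the same function can be called as `group_hits_by_particle(open("mc_hit_links.csv", newline=""))`. Keying on the event as well as the mcid is a precaution in case particle IDs are only unique within an event.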

How to run

Prerequisites

First, you need to clone and build Allen. Refer to the Allen Documentation, more specifically the Build Allen page.

Run without the XDIGI2CSV repository

You need to configure the sequence using environment variables:

  • export MDF2CSV_INCLUDE_HITS_VELO=true: dump hits_velo.csv

  • export MDF2CSV_INCLUDE_HITS_UT=true: dump hits_ut.csv

  • export MDF2CSV_INCLUDE_HITS_SCIFI=true: dump hits_scifi.csv

  • export MDF2CSV_INCLUDE_MC_PARTICLES=true: dump mc_particles.csv

  • export MDF2CSV_INCLUDE_MC_VERTICES=true: dump mc_vertices.csv

  • export MDF2CSV_INCLUDE_MC_HIT_LINKS=true: dump mc_hit_links.csv

  • export MDF2CSV_OUTDIR=your/outdir/: set the output directory. If not specified, the output is saved in Allen/output/csv.

  • export MDF2CSV_ERASE=true: same as erase for the XDIGI2CSV algorithm, that is, whether to erase an existing CSV file.

Once the environment variables are set, move to the Allen/build directory and execute the following command:

/toolchain/wrapper ./Allen --sequence persistence_csv \
    --mdf ../input/minbias/mdf/MiniBrunel_2018_MinBias_FTv4_DIGI_retinacluster_v1.mdf 

Here, the sequence to run is persistence_csv. The --mdf option specifies the input MDF file to convert.
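If you prefer to script this step, the following sketch assembles the environment variables and the command line shown above from Python. The helper names (`build_env`, `build_command`, `run_mdf2csv`) are hypothetical and not part of Allen; the variable names, the wrapper path and the sequence name are taken from this page. Only the outputs you request are set to `true`; the behaviour of other values is not covered here.

```python
import os
import subprocess

# Environment variable that enables each CSV output (names from the list above).
INCLUDE_VARS = {
    "hits_velo": "MDF2CSV_INCLUDE_HITS_VELO",
    "hits_ut": "MDF2CSV_INCLUDE_HITS_UT",
    "hits_scifi": "MDF2CSV_INCLUDE_HITS_SCIFI",
    "mc_particles": "MDF2CSV_INCLUDE_MC_PARTICLES",
    "mc_vertices": "MDF2CSV_INCLUDE_MC_VERTICES",
    "mc_hit_links": "MDF2CSV_INCLUDE_MC_HIT_LINKS",
}


def build_env(outputs, outdir=None, erase=False, base_env=None):
    """Return a copy of the environment with the MDF2CSV variables set."""
    env = dict(os.environ if base_env is None else base_env)
    for name in outputs:
        env[INCLUDE_VARS[name]] = "true"
    if outdir is not None:
        env["MDF2CSV_OUTDIR"] = outdir
    if erase:
        env["MDF2CSV_ERASE"] = "true"
    return env


def build_command(mdf_path, wrapper="/toolchain/wrapper"):
    """Return the Allen command line for the persistence_csv sequence."""
    return [wrapper, "./Allen", "--sequence", "persistence_csv",
            "--mdf", mdf_path]


def run_mdf2csv(mdf_path, outputs, build_dir="Allen/build", **env_options):
    """Run Allen from its build directory (hypothetical convenience wrapper)."""
    subprocess.run(build_command(mdf_path),
                   env=build_env(outputs, **env_options),
                   cwd=build_dir, check=True)
```

For example, `run_mdf2csv("../input/minbias/mdf/MiniBrunel_2018_MinBias_FTv4_DIGI_retinacluster_v1.mdf", ["hits_velo", "mc_particles"], erase=True)` would dump two CSV files, assuming Allen has been built in `Allen/build`.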

Using the XDIGI2CSV repository

You can configure the MDF2CSV algorithm using a YAML configuration file such as the one provided in jobs/examples/mdf2csv/mdf2csv.yaml:

allen_input: # Input to an Allen algorithm
  indir: ../xdigi2mdf/xdigi2mdf-smog2 # Input directory
computing:
  program: mdf2csv # the algorithm to run, here MDF2CSV
mdf2csv:
  algos:  # the list of algorithms to run
    - hits_velo
    - hits_ut
    - hits_scifi
    - mc_particles
    - mc_vertices
    - mc_hit_links
  erase: true # whether to erase an existing CSV file
output:
  outdir: "mdf2csv"  # the output directory

You can then run it using the command:

./run/run.py -c jobs/examples/mdf2csv/mdf2csv.yaml

This is equivalent to running the command:

./run/mdf2csv/run.py -c jobs/examples/mdf2csv/mdf2csv.yaml

The Allen input can be configured in two ways. You can provide a single input directory that contains both the MDF file(s) and the geometry directory:

allen_input:
    indir: /input/directory/
    mdf_filename: "dumped_mdf.mdf" # MDF file name in `indir`. Default is `*.mdf`
    geo_dirname: "geometry" # Geometry directory name in `indir`

This is practical because you just have to provide the input directory and don’t have to provide two different paths, one for the MDF files and one for the geometry directory.

Alternatively, you can provide the MDF file(s) and geometry directory separately using:

allen_input:
    paths:
    - /path/to/mdf1.mdf
    - /path/to/mdf2.mdf
    - "{XDIGI2CSV_REPO}/data/mdf3.mdf" # quoted: a plain YAML scalar cannot start with "{"
    - ./some/relative/path/to/mdf4.mdf
    geodir: /path/to/geodir/

This provides more flexibility and allows you to specify the paths of individual MDF files and the geometry directory separately.
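To make the difference between the two layouts concrete, here is a minimal sketch of how such a configuration could be turned into a list of MDF paths and a geometry directory. It is an illustration only, not the repository's actual code: `resolve_allen_input` is a hypothetical helper, the expansion of `{XDIGI2CSV_REPO}` to the repository path is an assumption, and since the default for `geo_dirname` is not stated here, the sketch returns `None` when it is omitted.

```python
import glob
import os


def resolve_allen_input(allen_input, repo_dir="."):
    """Return (mdf_paths, geodir) for either form of the allen_input section.

    Hypothetical helper: the real program may resolve its input differently.
    """
    if "indir" in allen_input:
        # Form 1: everything lives under a single input directory.
        indir = allen_input["indir"]
        pattern = allen_input.get("mdf_filename", "*.mdf")
        mdf_paths = sorted(glob.glob(os.path.join(indir, pattern)))
        geo_dirname = allen_input.get("geo_dirname")
        geodir = os.path.join(indir, geo_dirname) if geo_dirname else None
    else:
        # Form 2: MDF files and geometry directory are given separately.
        mdf_paths = [path.replace("{XDIGI2CSV_REPO}", repo_dir)
                     for path in allen_input["paths"]]
        geodir = allen_input["geodir"]
    return mdf_paths, geodir
```

Calling it with `{"paths": ["{XDIGI2CSV_REPO}/data/mdf3.mdf"], "geodir": "/path/to/geodir/"}` and `repo_dir="/repo"` yields `(["/repo/data/mdf3.mdf"], "/path/to/geodir/")`.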