Configuration Files and Helper Scripts#
This guide explains how to configure the programs using configuration files. It does not explain how the programs work or how to run them on your data.
If you want to view all the options that can be set in a YAML
configuration file, along with their descriptions, the easiest way is to refer to the
setup/config_default.yaml file, which contains the default configuration.
Configuration files#
Brief point on YAML files#
YAML (YAML Ain't Markup Language, originally Yet Another Markup Language) is a data serialisation language commonly used for configuration files. You can refer to this course for a small introduction to YAML.
When loaded in Python, a YAML file gives a dictionary whose keys are strings and whose values can be str, float, int, bool, list, dict or None.
Here is an example:
key_string1: string # str
key_string2: 'string' # str
key_string3: "string" # str
key_int: 1 # int
key_float: 1. # float
key_bool1: true # bool
key_bool2: false # bool
key_list1: [el1, el2, el3] # list
key_list2: # list
  - el1
  - el2
  - el3
key_dict1: {subkey1: subvalue1, subkey2: subvalue2} # dict
key_dict2: # dict
  subkey1: subvalue1
  subkey2: subvalue2
key_none: null # `None`
This example shows different data types that can be used in YAML files.
Note that the hash symbol (#) is used to indicate comments in the YAML file.
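For reference, the example above would be expected to load into a dictionary equivalent to the literal below. This is a hand-written sketch that assumes a YAML 1.1 loader such as PyYAML's safe_load (this guide does not say which parser the repository uses), so no YAML library is needed to run it.

```python
# Hand-written Python equivalent of the YAML example above.
# Assumption: a YAML 1.1 loader such as PyYAML's safe_load is used.
example = {
    "key_string1": "string",   # unquoted, single-quoted and double-quoted
    "key_string2": "string",   # scalars all become plain str
    "key_string3": "string",
    "key_int": 1,
    "key_float": 1.0,          # written "1." in the YAML file
    "key_bool1": True,
    "key_bool2": False,
    "key_list1": ["el1", "el2", "el3"],   # inline (flow) form
    "key_list2": ["el1", "el2", "el3"],   # block form, same result
    "key_dict1": {"subkey1": "subvalue1", "subkey2": "subvalue2"},
    "key_dict2": {"subkey1": "subvalue1", "subkey2": "subvalue2"},
    "key_none": None,          # written "null" in the YAML file
}

# Print the Python type obtained for each key.
for key, value in example.items():
    print(f"{key}: {type(value).__name__}")
```

The inline and block notations for lists and dictionaries are interchangeable: they differ only in how the file reads.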
Note
In this repository, YAML files are used instead of JSON files because YAML allows comments to be included in the file.
Configuration and default configuration#
The programs in this repository are fully configurable using YAML files. The YAML files are divided into sections, as shown below:
section1:
  param1.1: value1.1
  param1.2: value1.2
section2:
  param2.1: value2.1
  param2.2: value2.2
This allows scripts used to run the programs (see next section) to load only the sections and parameters that are needed.
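As a rough illustration of loading only the needed sections, the sketch below filters an already-loaded configuration dictionary. The function name select_sections is hypothetical and is not taken from the repository.

```python
def select_sections(config, sections):
    """Return a new dict holding only the requested top-level sections.

    Hypothetical helper: the repository's own loading code may differ.
    """
    missing = [name for name in sections if name not in config]
    if missing:
        raise KeyError(f"Unknown section(s): {missing}")
    return {name: config[name] for name in sections}


# The two-section layout shown above, as it would look once loaded.
config = {
    "section1": {"param1.1": "value1.1", "param1.2": "value1.2"},
    "section2": {"param2.1": "value2.1", "param2.2": "value2.2"},
}

needed = select_sections(config, ["section2"])
print(needed)
```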
The default sections, parameters, and their default values are defined in the YAML file
setup/config_default.yaml. Default values can be overridden by creating and writing to
setup/config.yaml.
The purpose of each section and parameter can be found in definitions/dconfig.py.
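The way setup/config.yaml overrides the defaults can be pictured as a recursive dictionary merge. The sketch below assumes a per-parameter override in which lists are replaced rather than concatenated; the guide does not spell out these details, and both merge_configs and the values used are illustrative only.

```python
import copy


def merge_configs(base, override):
    """Return a copy of `base` where values from `override` take precedence.

    Nested dictionaries (sections) are merged key by key; any other value,
    including a list, is replaced as a whole. This is an assumed behaviour.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values, not the real content of config_default.yaml.
defaults = {
    "xdigi2csv": {"detectors": ["velo"], "extended": False},
    "output": {"outdir": "out"},
}
user = {"xdigi2csv": {"extended": True}}  # stand-in for setup/config.yaml

config = merge_configs(defaults, user)
print(config)
```

Parameters that the user file does not mention keep their default values, and the defaults dictionary itself is left unmodified.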
The available sections are:
build: the path to the build directories of Moore and Allen standalone. By default, this points to the repositories set up in my AFS public space.
global: general variables
moore_input: parameters used to configure the input of a Moore algorithm
output: parameters used to configure the output directory
computing: parameters used to divide a job into sub-jobs using ganga or HTCondor
In addition, each program in this repository has its own section in the configuration file: xdigi2csv, xdigi2root, xdigi2mdf and mdf2csv.
Run a program#
Helper scripts to run the programs are located in the run folder:
run/moore/run.py is used to run one of the Moore programs (XDIGI2CSV, XDIGI2ROOT and XDIGI2MDF). You can select the appropriate program by running either ./run/moore/run.py xdigi2csv, ./run/moore/run.py xdigi2root or ./run/moore/run.py xdigi2mdf.
./run/mdf2csv/run.py is used to run the MDF2CSV program.
Parse arguments#
To configure a program, you can also pass the parameters listed in setup/config_default.yaml as command-line arguments.
For instance, if you execute ./run/moore/run.py xdigi2csv -h,
you will see the relevant parameters from config_default.yaml listed as options.
This means you can set (or override) the parameter values on the command line.
For example, you can configure the XDIGI2CSV program by running:
./run/moore/run.py xdigi2csv --detectors velo ut --extended true --paths /path/to/xdigi/file --outdir output/
To ensure that all the parameters in setup/config_default.yaml can be
passed on the command line, the following rules apply:
If the parameter is a boolean such as extended, --extended or --extended true sets the parameter to True, while --extended false sets it to False.
If the parameter is a str such as outdir, --outdir with no argument value sets the parameter to None.
If the parameter is a list such as detectors, the elements of the list are given separated by spaces.
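These three rules can be reproduced with the standard argparse module, as in the sketch below. It is only an illustration: the guide does not say how the run scripts build their parser, and the helper names str2bool and parse are hypothetical. The option names are the ones used in this guide.

```python
import argparse


def str2bool(text):
    """Convert the strings 'true' and 'false' (any case) to a bool."""
    lowered = text.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {text!r}")


parser = argparse.ArgumentParser(prog="run.py")
# default=SUPPRESS leaves an option out of the result when it is not passed,
# so only explicitly given options would override the configuration.
# Boolean: a bare --extended uses const=True; otherwise the value is parsed.
parser.add_argument("--extended", nargs="?", const=True, type=str2bool,
                    default=argparse.SUPPRESS)
# String: a bare --outdir uses const=None.
parser.add_argument("--outdir", nargs="?", const=None,
                    default=argparse.SUPPRESS)
# List: one or more space-separated values.
parser.add_argument("--detectors", nargs="+", default=argparse.SUPPRESS)


def parse(argv):
    """Parse a list of command-line tokens into a plain dict."""
    return vars(parser.parse_args(argv))


print(parse(["--detectors", "velo", "ut", "--extended", "true",
             "--outdir", "output/"]))
print(parse(["--extended"]))
print(parse(["--extended", "false"]))
print(parse(["--outdir"]))
```

Using nargs="?" with a const value is what lets the same option work both with and without an explicit value.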
Using a configuration file#
Passing all the arguments through the command line is not very practical.
For this reason, the scripts accept a -c (or --config) parameter that takes a
configuration file. The previous command is equivalent to
./run/moore/run.py xdigi2csv --config local_config.yaml
where the content of local_config.yaml is
xdigi2csv:
  detectors:
    - velo
    - ut
  extended: true
moore_input:
  paths: /path/to/xdigi/file
output:
  outdir: output/
You can still use the command line to override the arguments in local_config.yaml, e.g.,
./run/moore/run.py xdigi2csv --config local_config.yaml --extended false # finally set extended to `False`
Concretely, local_config.yaml overrides the parameters in config.yaml and config_default.yaml.
Important
Relative paths in a YAML file are always expressed RELATIVE TO this YAML file.
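A minimal sketch of this rule, assuming that "relative to this YAML file" means relative to the directory containing it. The function name and the paths are hypothetical.

```python
from pathlib import Path


def resolve_relative_to_config(path_value, config_file):
    """Resolve a path found in a YAML file against that file's directory.

    Absolute paths are returned unchanged. Hypothetical helper, not the
    repository's own implementation.
    """
    path = Path(path_value)
    if path.is_absolute():
        return path
    return Path(config_file).parent / path


# Hypothetical locations, for illustration only.
print(resolve_relative_to_config("output/", "/home/user/configs/local_config.yaml"))
print(resolve_relative_to_config("/path/to/xdigi/file", "/home/user/configs/local_config.yaml"))
```

Under this reading, a relative outdir written in a configuration file does not depend on the directory from which the script is launched.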
Use several configuration files#
--config can take a list of configuration files, so you can also run
./run/moore/run.py xdigi2csv --config local_config1.yaml local_config2.yaml
where local_config1.yaml is
xdigi2csv:
  detectors:
    - velo
    - ut
  extended: true
moore_input:
  paths: /path/to/xdigi/file
and local_config2.yaml is
xdigi2csv:
  extended: true
output:
  outdir: output/
The parameters in local_config2.yaml override the parameters in local_config1.yaml.
It is also possible to include local_config1.yaml in local_config2.yaml:
include:
  - local_config1.yaml
xdigi2csv:
  extended: true
output:
  outdir: output/
and execute ./run/moore/run.py xdigi2csv -c local_config2.yaml. In this case,
the arguments in local_config2.yaml override the ones in local_config1.yaml.
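The include mechanism can be sketched as follows: included files are merged first, in order, and the including file is applied last so that its values win. This is an assumed reading of the behaviour described above, not the repository's implementation. To stay free of a YAML dependency, the files are represented by an in-memory mapping from file name to already-parsed content, and extended is set to false in the second file only so that the override is visible. The sketch does not guard against circular includes.

```python
import copy


def merge_configs(base, override):
    """Recursively merge two dicts; values from `override` take precedence."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def load_with_includes(name, files):
    """Resolve the `include` key of `files[name]` (hypothetical helper).

    `files` stands in for the file system: it maps a file name to the
    dictionary that parsing that YAML file would give.
    """
    content = copy.deepcopy(files[name])
    merged = {}
    # Included files are applied first, in the order they are listed.
    for included in content.pop("include", []):
        merged = merge_configs(merged, load_with_includes(included, files))
    # The including file is applied last, so it overrides its includes.
    return merge_configs(merged, content)


files = {
    "local_config1.yaml": {
        "xdigi2csv": {"detectors": ["velo", "ut"], "extended": True},
        "moore_input": {"paths": "/path/to/xdigi/file"},
    },
    "local_config2.yaml": {
        "include": ["local_config1.yaml"],
        "xdigi2csv": {"extended": False},  # changed to show the override
        "output": {"outdir": "output/"},
    },
}

result = load_with_includes("local_config2.yaml", files)
print(result)
```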