tools package#
Package that contains helper functions used within this repository.
tools.configmanager module#
Module to handle the configuration of the repository (the YAML files in the setup/ folder).
- tools.configmanager.assert_config_validity(config)[source]#
Check that the configuration has the correct formatting.
- Parameters:
config (dict) – configuration as configured in YAML files.
- tools.configmanager.filter_config(config, sections=None, constraints=None)[source]#
Filter the configuration to keep only what is necessary for the algorithm to run.
- Parameters:
config – actual configuration
sections – parameter sections to add
constraints – associates a section with the parameters to include. If the section is not in this dictionary, all the parameters are included.
- Returns:
Filtered configuration
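The filtering described above can be illustrated with a minimal sketch. It is not the repository's implementation: the name filter_config_sketch, the example keys, and the handling of sections=None (assumed here to keep every section) are assumptions made for illustration. The configuration is assumed to be a dictionary of sections, each mapping parameter names to values.

```python
from typing import Dict, Iterable, Optional


def filter_config_sketch(
    config: Dict[str, dict],
    sections: Optional[Iterable[str]] = None,
    constraints: Optional[Dict[str, Iterable[str]]] = None,
) -> Dict[str, dict]:
    """Keep the requested sections and, within them, the allowed parameters."""
    constraints = constraints or {}
    # Assumption: sections=None means "keep every section".
    kept_sections = list(config) if sections is None else list(sections)
    filtered: Dict[str, dict] = {}
    for section in kept_sections:
        if section not in config:
            continue  # Assumption: sections absent from the config are skipped.
        params = config[section]
        if section in constraints:
            # Only the listed parameters are kept for a constrained section.
            allowed = set(constraints[section])
            params = {key: value for key, value in params.items() if key in allowed}
        filtered[section] = dict(params)
    return filtered


# Hypothetical configuration, made up for this example.
example_config = {
    "build": {"moore": "lb-run:v1", "platform": "some-platform"},
    "output": {"outdir": "out"},
}
filtered_config = filter_config_sketch(
    example_config, sections=["build"], constraints={"build": ["moore"]}
)
print(filtered_config)
```

A section without an entry in constraints keeps all of its parameters, which matches the rule stated for the constraints parameter.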
- tools.configmanager.group_argparse_args(parser_args, sections, constraints=None, ignore=None)[source]#
Group the argparse parameters by section, as expected for a configuration.
- Parameters:
parser_args (argparse.Namespace | dict) – result of the parse_args method
sections (typing.Iterable[str]) – parameter sections to include
constraints (typing.Optional[typing.Dict[str, typing.Iterable[str]]]) – associates a section with the parameters to include. If the section is not in this dictionary, all the parameters are included.
ignore (typing.Optional[typing.List[str]]) – list of variables to ignore in the grouping
- Return type:
dict
- Returns:
Configuration described by parser_args
- tools.configmanager.import_config(path, resolve_relative_paths=True)[source]#
Import the configuration and assert that its format is valid. Also supports an include: section where a YAML file can be included.
- Parameters:
path (str) – path to the YAML configuration file
resolve_relative_paths (bool) – whether to resolve the relative paths using the resolve_paths_in_config() function
- Return type:
dict
- Returns:
Configuration
- tools.configmanager.import_default_config(repo, sections=None, constraints=None)[source]#
Import the default configuration.
- Parameters:
repo – path to the root of the repository
Notes
The default configuration is in <repo>/setup/config_default.yaml. The custom configuration is in <repo>/setup/config.yaml
- tools.configmanager.merge_configs(configs)[source]#
Merge configurations. Each configuration is given either by its path (in which case it is loaded) or as the actual dictionary.
- Parameters:
configs (typing.List[str | dict]) – list of configs or their paths
- Return type:
dict
- Returns:
Merged configuration, obtained by overriding the configurations one by one
- tools.configmanager.override_nested_dict(dict_to_insert, dict_to_override, inplace=False)[source]#
Override a nested dictionary with another nested dictionary.
- Parameters:
dict_to_insert (dict) – dictionary to add to dict_to_override
dict_to_override (dict) – base dictionary which dict_to_insert is added to
inplace (bool) – whether to override in place
- Return type:
dict | None
- Returns:
Overridden dictionary if not inplace
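A minimal sketch of such an override, under stated assumptions, is given below. It is not the repository's code: the name override_nested_dict_sketch is hypothetical, and it assumes that nested dictionaries are merged key by key while any other value simply replaces the existing one.

```python
import copy
from typing import Optional


def override_nested_dict_sketch(
    dict_to_insert: dict, dict_to_override: dict, inplace: bool = False
) -> Optional[dict]:
    """Recursively write the values of dict_to_insert into dict_to_override."""
    # Work on a deep copy unless the caller asked for an in-place update.
    target = dict_to_override if inplace else copy.deepcopy(dict_to_override)
    for key, value in dict_to_insert.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dictionaries: merge them key by key.
            override_nested_dict_sketch(value, target[key], inplace=True)
        else:
            # Assumption: anything else replaces the existing value.
            target[key] = copy.deepcopy(value)
    return None if inplace else target


base = {"a": {"x": 1, "y": 2}, "b": 3}
new = {"a": {"y": 20, "z": 30}}
merged = override_nested_dict_sketch(new, base)
print(merged)  # {'a': {'x': 1, 'y': 20, 'z': 30}, 'b': 3}
```

With inplace=True the sketch modifies dict_to_override directly and returns None, mirroring the dict | None return type documented above.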
- tools.configmanager.prepare_parser(parser, sections, program_args=None, constraints=None, with_config=True)[source]#
Add sections to a parser.
- Parameters:
parser (argparse.ArgumentParser | str) – parser to complete (type argparse.ArgumentParser) or description of the parser to create (if type str)
sections (typing.Iterable[str]) – parameter sections to add
program_args (typing.Optional[dict]) – Dictionary that specifies how to configure the subparsers that define which program to use. It contains the key-value dest that specifies which program to run, and the possible programs sections (as a program is also associated with a section). The other key-value pairs (such as help) are passed to the add_subparsers() method.
constraints (typing.Optional[typing.Dict[str, typing.Iterable[str]]]) – associates a section with the parameters to include. If the section is not in this dictionary, all the parameters are included.
with_config (bool) – whether to add the --config argument.
- Return type:
argparse.ArgumentParser | None
- Returns:
The parser that was created, if parser is of type str
- tools.configmanager.resolve_paths_in_config(config, config_path)[source]#
If relative paths or paths with wildcards are given in a configuration file:
Replace the wildcards by the actual environment variable values.
Express the paths relative to the directory where the YAML configuration file was.
The function operates in place.
- Parameters:
config (dict) – configuration in config_path
config_path (str) – path to the YAML file that contained config
Notes
This is applied to the options specified in definitions.dconfig.path_in_config
- tools.configmanager.return_config_from_parser(repo, sections, program_args=None, constraints=None, parse_known_args=False)[source]#
Load the default configuration. Create a parser that accepts one or more YAML configuration files as well as arguments to change the configuration. Return the configuration altered by the parsed arguments.
- Parameters:
repo – path to the root of the repository
sections – parameter sections to add
constraints – associates a section with the parameters to include. If the section is not in this dictionary, all the parameters are included.
program_args – Dictionary that specifies how to configure the subparsers that define which program to use. It contains the key-value dest that specifies which program to run, and the possible programs sections (as a program is also associated with a section). The other key-value pairs (such as help) are passed to the add_subparsers() method.
parse_known_args – whether to only parse known arguments. The other arguments that are unknown given sections and constraints will be passed to the other scripts.
return_last_config_path – whether to return the last configuration path that was used. The latter is used to determine the output directory in the case where auto_output_mode is set to same.
- Returns:
Configuration. If parse_known_args is set to True, also returns the arguments that were not parsed. If return_last_config_path is set, also returns the path of the last configuration file that was used (or None if there was no configuration file)
- tools.configmanager.str2bool(value)[source]#
Correct parsing of a boolean.
- Parameters:
value (str) – value to parse
- Return type:
bool
- Returns:
True if value represents True (yes, true, …). False if value represents False (no, false, …)
Notes
Taken from https://stackoverflow.com/questions/15008758
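For illustration, a parser of this kind can be sketched as follows and plugged into argparse through the type argument. This is a sketch under assumptions, not the module's code: the name str2bool_sketch, the exact set of accepted spellings, and the --erase option are chosen for the example.

```python
import argparse


def str2bool_sketch(value):
    """Parse common textual spellings of a boolean."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    # Assumption: these are the accepted spellings; the real list may differ.
    if lowered in {"yes", "true", "t", "y", "1"}:
        return True
    if lowered in {"no", "false", "f", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {value!r}.")


# Hypothetical option, used only to show the function as an argparse type.
parser = argparse.ArgumentParser()
parser.add_argument("--erase", type=str2bool_sketch, default=True)
args = parser.parse_args(["--erase", "no"])
print(args.erase)  # False
```

Using a dedicated function avoids the pitfall of type=bool, for which any non-empty string, including "false", evaluates to True.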
- tools.configmanager.update_config(config, base_config, inplace=False)[source]#
Update the configuration with another configuration
- Parameters:
config (str | dict) – path to the new YAML configuration or actual configuration
base_config (str | dict) – base configuration that will be updated
- Return type:
dict | None
- Returns:
Updated configuration if inplace is set to False
tools.envvar module#
Helper functions to interface with environment variables.
- tools.envvar.get_environment_variable(env_var_name, is_bool=False)[source]#
Get the value of an environment variable.
- Parameters:
env_var_name (str) – name of the environment variable to load
is_bool (bool) – whether the environment variable is assumed to be a boolean variable (either false or true). In this case, a boolean is returned.
- Raises:
AssertionError – the environment variable does not exist
AssertionError – is_bool is set but the environment variable is neither true nor false
- Return type:
Union[bool, str]
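The documented behaviour can be approximated with the sketch below. The name get_environment_variable_sketch and the SKETCH_* variables are hypothetical, and the sketch assumes that only the exact lowercase strings true and false are accepted when is_bool is set.

```python
import os
from typing import Union


def get_environment_variable_sketch(env_var_name: str, is_bool: bool = False) -> Union[bool, str]:
    """Read an environment variable, optionally interpreting it as a boolean."""
    if env_var_name not in os.environ:
        raise AssertionError(f"The environment variable {env_var_name} does not exist")
    value = os.environ[env_var_name]
    if not is_bool:
        return value
    # Assumption: only the exact strings "true" and "false" are valid.
    if value not in ("true", "false"):
        raise AssertionError(f"{env_var_name} must be 'true' or 'false', got {value!r}")
    return value == "true"


# Hypothetical variables, set here only so that the example is self-contained.
os.environ["SKETCH_FLAG"] = "true"
os.environ["SKETCH_DIR"] = "/tmp/data"
print(get_environment_variable_sketch("SKETCH_FLAG", is_bool=True))  # True
print(get_environment_variable_sketch("SKETCH_DIR"))  # /tmp/data
```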
- tools.envvar.resolve_wildcards(strings)[source]#
Replace wildcards or placeholders in strings by the corresponding environment variables.
- Parameters:
strings (str | typing.List[str]) – a string or list of strings
- Return type:
str | typing.List[str]
- Returns:
string or list of strings with any placeholders replaced by the corresponding environment variable.
Notes
An error is raised if not all the placeholders are replaced
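The placeholder syntax is not specified on this page. The sketch below therefore assumes the ${NAME} syntax of string.Template from the standard library; the name resolve_wildcards_sketch, the SKETCH_DATA variable and the choice of ValueError for unresolved placeholders are likewise assumptions.

```python
import os
from string import Template
from typing import List, Union


def resolve_wildcards_sketch(strings: Union[str, List[str]]) -> Union[str, List[str]]:
    """Replace ${NAME} placeholders by the values of the environment variables."""
    if isinstance(strings, str):
        try:
            # substitute() raises KeyError when a placeholder has no value.
            return Template(strings).substitute(os.environ)
        except KeyError as error:
            # Assumption: the error type is illustrative, not the documented one.
            raise ValueError(f"Unresolved placeholder {error} in {strings!r}") from error
    return [resolve_wildcards_sketch(string) for string in strings]


# Hypothetical variable, set here only so that the example is self-contained.
os.environ["SKETCH_DATA"] = "/data"
single = resolve_wildcards_sketch("${SKETCH_DATA}/run1")
several = resolve_wildcards_sketch(["${SKETCH_DATA}/a", "plain/path"])
print(single)  # /data/run1
print(several)  # ['/data/a', 'plain/path']
```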
tools.inoutconfig module#
Tools to configure the input and output paths.
- class tools.inoutconfig.MooreInputConfig(config, repo, return_paths=False)[source]#
Bases:
object
This context manager configures the input needed for Moore, given the configuration.
It returns the path to the python file that is needed, that is, python_input in the configuration, or a custom python file that configures the input in the case where bookkeeping_path or paths are used. In the latter case, it creates the necessary temporary configuration file that the custom python input file will read to configure the input.
When the context manager is exited, the environment variables and temporary files are deleted.
- config#
configuration of moore_input
- repo#
path to the root of the repository
- return_paths#
whether to return the paths as well
- tools.inoutconfig.ban_storage_elements(banned_storage_elements, paths, xml_catalog_path)[source]#
Remove the physical links to banned storage elements in the XML catalog file.
- Parameters:
banned_storage_elements (List[str]) – storage elements that must not be used
paths (List[str]) – list of paths that may include LFN paths
xml_catalog_path (str) – path to the XML catalog to alter
- Return type:
List[str]
- Returns:
List of LFNs to remove as they are only stored on banned storage elements
Notes
All this is EXTREMELY ugly but I couldn’t find another way of removing storage elements while keeping XML files.
- tools.inoutconfig.generate_catalog(paths, use_ganga=False)[source]#
Get the PFNs given LFNs.
- Parameters:
paths (typing.Iterable[str]) – list of paths, which can contain LFNs (Logical File Names). LFNs must start with LFN:
use_ganga (bool) – whether to use ganga
- Return type:
tempfile.NamedTemporaryFile | tempfile.TemporaryDirectory | None
- Returns:
Temporary file that contains the XML catalog of the LFNs, or None if no LFNs were found.
Notes
This function uses ganga.
- tools.inoutconfig.get_allen_input(indir=None, mdf_filename=None, geo_dirname=None, paths=None, geodir=None)[source]#
Get the MDF input paths and the geometry directory from the configuration. There are 2 ways of specifying an Allen input:
With indir, mdf_filename and geo_dirname. This is practical for files generated by the xdigi2mdf program because you just need to specify the input directory.
With paths and geodir.
- Parameters:
indir (typing.Optional[str]) – input directory where the MDF files are
mdf_filename (typing.Optional[str]) – MDF file name in indir
geo_dirname (typing.Optional[str]) – geometry directory name in indir
paths (typing.Optional[str]) – list of MDF paths
geodir (typing.Optional[str]) – input geometry directory
- Return type:
typing.Tuple[typing.List[str], str | None]
- Returns:
List of MDF input files and the geometry directory.
- tools.inoutconfig.get_bookkeeping_lfns(bookkeeping_path, start_index=0, nb_files=-1)[source]#
Get the LFNs associated with a bookkeeping path.
- Parameters:
bookkeeping_path (str) – path in the Dirac Bookkeeping browser
start_index (int) – index of the first LFN to retrieve
nb_files (int) – number of LFNs to retrieve
- Return type:
List[str]
- Returns:
List of LFNs from start_index to start_index + nb_files
Notes
This function uses ganga.
- tools.inoutconfig.get_moore_build(moore_build, platform=None)[source]#
Get what to run in order to have access to the Moore build
- Parameters:
moore_build (str) – value of the build/moore option
platform (str | None) – platform of the build to use (within lb-run or the local stack)
- Return type:
typing.List[str]
- Returns:
If moore_build starts with lb-run:, it is interpreted as lb-run Moore/{version} and the latter is returned. Otherwise, moore_build is returned as is
- tools.inoutconfig.get_moore_input_config(input_config)[source]#
Return the configuration dictionary of the input file for Moore, in the case where python_input is not used and the generic python Moore input file is used instead.
- Parameters:
input_config (dict) – section moore_input of the configuration
- Return type:
dict | None
- Returns:
Configuration dictionary of the input file for Moore, or None if a python file is used as input.
- tools.inoutconfig.get_outdir(config, datatype=None)[source]#
Get the output path according to the output mode that was chosen.
- Parameters:
config (dict) – current configuration
datatype (Optional[str]) – type of the data (e.g., csv). This is used if auto_output_mode is set to eos.
- Return type:
str
- Returns:
output path
tools.xdigi2csvtools module#
Module that contains functions to configure the algorithms to execute, to get the hits, MC particles and MC hits from a (X)DIGI file.
- tools.xdigi2csvtools.configure_algos(stack, algonames, outdir='persistence_csv', retina_clusters=True, extended=False, all_mc_particles=False, erase=True)[source]#
Configure the algorithms to run by changing the context (essentially through a series of algo.bind(...) calls).
- Parameters:
stack (ExitStack) – context manager. Modified in place
algonames (List[str]) – list of the algorithms to run. This list is needed in order to configure only the algorithms that will be run.
retina_clusters (bool) – whether to use Retina clusters
extended (bool) – whether to erase existing CSV files at the same locations
outdir (Optional[str]) – path to the directory where the CSV files are saved
erase (bool) – whether to erase an existing CSV file
- tools.xdigi2csvtools.get_algo_sequence(algonames)[source]#
Get the list of algorithms to run to persist the hits and MC particles into CSV files, according to the configuration given as input.
- Parameters:
algonames (List[str]) – list of the names of the algorithms to run
- Return type:
Callable[[], Reconstruction]
- Returns:
A function that returns the list of the algorithms, wrapped in a Moore Reconstruction object
- tools.xdigi2csvtools.get_algonames(detectors, dump_event_info=False, dump_mc_hits=False)[source]#
Obtain the list of the names of the algorithms to run.
- Parameters:
detectors (List[str]) – list of the detectors for which the hits and MC particles are dumped
dump_event_info (bool) – whether to dump event_info.csv
dump_mc_hits (bool) – whether to dump the MC hits
- Return type:
List[str]
- Returns:
List of the names of the algorithms to run
tools.tcomputing module#
- tools.tcomputing.add_environment_variables_to_copy(environment_variables, backend)[source]#
Add in-place the environment variables to copy.
- Parameters:
environment_variables (Dict[str, str]) – dictionary of environment variables that is going to be altered in place.
backend (str) – backend used for the computation
- tools.tcomputing.submit_job(script_path, args, nb_files=None, nb_files_per_job=-1, backend='condor', logdir=None, max_runtime=None, max_transfer_output_mb=None)[source]#
Submit a job in HTCondor or ganga.
- Parameters:
script_path (str) – path to the script to run
args (list) – list of the arguments to pass to the script
nb_files (Optional[int]) – number of files in total
nb_files_per_job (int) – number of files per job
backend (str) – condor, ganga-local or ganga-dirac. Only the condor backend is supposed to work properly
logdir (Optional[str]) – path where to save the log
max_runtime (Optional[int]) – time within which the job should be run. This is used by the condor backend
max_transfer_output_mb (Optional[int]) – MAX_TRANSFER_OUTPUT_MB HTCondor parameter. This is used by the condor backend
tools.tconversion module#
Tools to convert a CSV file into another format.
- tools.tconversion.convert_all_table_paths(indir, outformat, outdir=None, outcompression=None, informat='csv', incompression=None, keep_original=True, verbose=True)[source]#
Convert all the CSV-like files in a given directory into another format and/or compression.
- Parameters:
indir (str) – input directory where the files to convert are
outformat (str) – output format
outdir (Optional[str]) – output directory. If not given, it is indir
outcompression (Optional[str]) – compression of the output files
informat (str) – format of the input files
incompression (Optional[str]) – compression of the input files. Only used to figure out the extension
keep_original (bool) – whether to keep the original files. If set to False, the files are removed
verbose (bool) – whether to print some information
tools.tpaths module#
Module to manage paths.
- tools.tpaths.expand_paths(paths)[source]#
Expand the paths that contain a * in a list of paths.
- Parameters:
paths (List[str]) – list of paths. The paths that contain a star * are expanded using the python glob.glob() function
**kwargs – passed to glob.glob()
- Return type:
List[str]
- Returns:
List of paths, where the paths that contain * have been expanded
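The expansion can be sketched with glob.glob() as follows. The name expand_paths_sketch is hypothetical, and sorting the matches is an assumption added to make the result deterministic, since glob.glob() does not guarantee an order.

```python
import glob
import os
import tempfile
from typing import List


def expand_paths_sketch(paths: List[str], **kwargs) -> List[str]:
    """Expand the paths that contain a star and keep the others untouched."""
    expanded: List[str] = []
    for path in paths:
        if "*" in path:
            # Assumption: matches are sorted so that the output is reproducible.
            expanded.extend(sorted(glob.glob(path, **kwargs)))
        else:
            expanded.append(path)
    return expanded


# Build a throwaway directory with two CSV files to expand against.
with tempfile.TemporaryDirectory() as tmpdir:
    for name in ("a.csv", "b.csv"):
        open(os.path.join(tmpdir, name), "w").close()
    pattern = os.path.join(tmpdir, "*.csv")
    result = expand_paths_sketch([pattern, "keep.txt"])
    expected = [os.path.join(tmpdir, "a.csv"), os.path.join(tmpdir, "b.csv"), "keep.txt"]

print(result == expected)  # True
```

Paths without a star are passed through unchanged, even if they do not exist on disk.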
- tools.tpaths.resolve_relative_path(path, reference)[source]#
If path is relative, turn it into an absolute path w.r.t. reference. If it is an absolute path, just return the path itself.
- tools.tpaths.resolve_relative_paths(paths, reference)[source]#
Same as resolve_relative_path() but the input can be one path or a list of paths.
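Both path helpers can be sketched with os.path. The names carrying the _sketch suffix are hypothetical, and the sketch assumes that reference is an absolute directory path and that the result is normalised with os.path.normpath().

```python
import os
from typing import List, Union


def resolve_relative_path_sketch(path: str, reference: str) -> str:
    """Return path unchanged if absolute, else join it to reference."""
    if os.path.isabs(path):
        return path
    # Assumption: the joined path is normalised (e.g. "a/../b" becomes "b").
    return os.path.normpath(os.path.join(reference, path))


def resolve_relative_paths_sketch(
    paths: Union[str, List[str]], reference: str
) -> Union[str, List[str]]:
    """Same as above, for one path or a list of paths."""
    if isinstance(paths, str):
        return resolve_relative_path_sketch(paths, reference)
    return [resolve_relative_path_sketch(path, reference) for path in paths]


# Hypothetical reference directory, made absolute for the example.
reference_dir = os.path.abspath("setup")
absolute_path = os.path.abspath("other.csv")
resolved = resolve_relative_paths_sketch(["data/x.csv", absolute_path], reference_dir)
print(resolved)
```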