9.2.8. API Functions#
Main API functions#
Provide imports for api sub-package.
- junifer.api.collect(storage)#
Collect and store data.
- Parameters:
- storage: dict
Storage to use. Must have a key ``kind`` with the kind of storage to use. All other keys are passed to the storage init function.
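As a minimal sketch of the convention described above (not junifer's actual implementation), the following shows how a storage dict could be split into its ``kind`` and the remaining init arguments. The helper `split_kind`, the storage kind "SQLiteFeatureStorage" and the `uri` key are illustrative assumptions.

```python
from __future__ import annotations

from typing import Any


def split_kind(config: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Separate the ``kind`` key from the remaining init arguments.

    Hypothetical helper for illustration; not part of junifer.
    """
    if "kind" not in config:
        raise ValueError("The configuration must have a 'kind' key.")
    # Copy so the caller's dict is left untouched.
    params = dict(config)
    kind = params.pop("kind")
    return kind, params


# Assumed example values: the storage kind and the ``uri`` key are illustrative.
storage = {"kind": "SQLiteFeatureStorage", "uri": "/tmp/features.sqlite"}
kind, init_params = split_kind(storage)
print(kind)
print(init_params)
```

The same dict shape is what `junifer.api.collect(storage)` expects as its single argument.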
- junifer.api.queue(config, kind, jobname='junifer_job', overwrite=False, elements=None, **kwargs)#
Queue a job to be executed later.
- Parameters:
- config: dict
The configuration to be used for queueing the job.
- kind: {“HTCondor”, “GNUParallelLocal”}
The kind of job queue system to use.
- jobname: str, optional
The name of the job (default “junifer_job”).
- overwrite: bool, optional
Whether to overwrite if job directory already exists (default False).
- elements: str or tuple or list of str or tuple, optional
Element(s) to process. Will be used to index the DataGrabber (default None).
- **kwargs: dict
The keyword arguments to pass to the job queue system.
- Raises:
ValueError
If ``kind`` is invalid or if the ``jobdir`` exists and ``overwrite = False``.
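The two error conditions above can be sketched as follows. This is a stand-in written for illustration, not junifer's internal check: the function name `check_queue_request`, the temporary job directory and the unsupported kind "SLURM" are assumptions.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Queue systems listed in the parameter description above.
VALID_KINDS = ("HTCondor", "GNUParallelLocal")


def check_queue_request(kind: str, jobdir: Path, overwrite: bool = False) -> None:
    """Raise ValueError for the two conditions documented for ``queue``.

    Hypothetical stand-in; junifer performs its own checks internally.
    """
    if kind not in VALID_KINDS:
        raise ValueError(f"Unknown queue kind: {kind!r}")
    if jobdir.exists() and not overwrite:
        raise ValueError(f"Job directory {jobdir} exists and overwrite is False.")


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "junifer_job"
    existing.mkdir()

    # Try an invalid kind, an existing directory, and an allowed overwrite.
    errors = []
    for kind, overwrite in [("SLURM", False), ("HTCondor", False), ("HTCondor", True)]:
        try:
            check_queue_request(kind, existing, overwrite=overwrite)
            errors.append(None)
        except ValueError as err:
            errors.append(str(err))

print([e is not None for e in errors])
```

Only the last combination passes, because the job directory already exists and `overwrite=True` permits reusing it.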
- junifer.api.run(workdir, datagrabber, markers, storage, preprocessors=None, elements=None)#
Run the pipeline on the selected element.
- Parameters:
- workdir: str or pathlib.Path
Directory where the pipeline will be executed.
- datagrabber: dict
DataGrabber to use. Must have a key ``kind`` with the kind of DataGrabber to use. All other keys are passed to the DataGrabber init function.
- markers: list of dict
List of markers to extract. Each marker is a dict with at least two keys: ``name`` and ``kind``. The ``name`` key is used to name the output marker. The ``kind`` key is used to specify the kind of marker to extract. The rest of the keys are used to pass parameters to the marker calculation.
- storage: dict
Storage to use. Must have a key ``kind`` with the kind of storage to use. All other keys are passed to the storage init function.
- preprocessors: list of dict, optional
List of preprocessors to use. Each preprocessor is a dict with at least a key ``kind`` specifying the preprocessor to use. All other keys are passed to the preprocessor init function (default None).
- elements: str or tuple or list of str or tuple, optional
Element(s) to process. Will be used to index the DataGrabber (default None).
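To make the expected argument shapes concrete, here is a hedged sketch that assembles the arguments for `run` as plain dicts and checks the marker entries for the two required keys. The helper `check_markers` and every kind, path and extra key ("MyDataGrabber", "MyMarker", "MyStorage", `datadir`, `method`, `uri`) are placeholders invented for the example.

```python
from __future__ import annotations

from typing import Any


def check_markers(markers: list[dict[str, Any]]) -> list[str]:
    """Return marker names after checking the two required keys.

    Hypothetical helper for illustration; not part of junifer.
    """
    names = []
    for marker in markers:
        missing = {"name", "kind"} - marker.keys()
        if missing:
            raise ValueError(f"Marker is missing keys: {sorted(missing)}")
        names.append(marker["name"])
    return names


# All kinds, paths and extra keys below are assumed placeholders.
run_kwargs = {
    "workdir": "/tmp/junifer_work",
    "datagrabber": {"kind": "MyDataGrabber", "datadir": "/data/study"},
    "markers": [
        {"name": "mean_signal", "kind": "MyMarker", "method": "mean"},
        {"name": "std_signal", "kind": "MyMarker", "method": "std"},
    ],
    "storage": {"kind": "MyStorage", "uri": "/tmp/features.db"},
    "preprocessors": None,
    "elements": ["sub-01", ("sub-02", "ses-01")],
}

marker_names = check_markers(run_kwargs["markers"])
print(marker_names)
# With junifer installed and these kinds registered, the call would be:
# junifer.api.run(**run_kwargs)
```

The keys of `run_kwargs` match the parameter names in the signature, so the dict can be unpacked directly into the call.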
Decorators#
Provide decorators for api.
- junifer.api.decorators.register(step, name, klass)#
Register a function to be used in a pipeline step.
- Parameters:
- Raises:
ValueError
If the ``step`` is invalid.
- junifer.api.decorators.register_datagrabber(klass)#
Register DataGrabber.
Registers the DataGrabber so it can be used by name.
- Parameters:
- klass: class
The class of the DataGrabber to register.
- Returns:
- klass: class
The unmodified input class.
Notes
It should only be used as a decorator.
- junifer.api.decorators.register_datareader(klass)#
Register DataReader.
Registers the DataReader so it can be used by name.
- Parameters:
- klass: class
The class of the DataReader to register.
- Returns:
- klass: class
The unmodified input class.
Notes
It should only be used as a decorator.
- junifer.api.decorators.register_marker(klass)#
Marker registration decorator.
Registers the marker so it can be used by name.
- Parameters:
- klass: class
The class of the marker to register.
- Returns:
- klass: class
The unmodified input class.
- junifer.api.decorators.register_preprocessor(klass)#
Preprocessor registration decorator.
Registers the preprocessor so it can be used by name.
- Parameters:
- klass: class
The class of the preprocessor to register.
- Returns:
- klass: class
The unmodified input class.
- junifer.api.decorators.register_storage(klass)#
Storage registration decorator.
Registers the storage so it can be used by name.
- Parameters:
- klass: class
The class of the storage to register.
- Returns:
- klass: class
The unmodified input class.
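The registration decorators in this section all follow the same pattern: store the class under a name and return it unmodified. The sketch below reproduces that pattern with a stand-in registry so it runs without junifer. The `_REGISTRY` dict, the step names, the use of the class name as the registry key and the `MeanSignal` class are assumptions, not junifer internals.

```python
from __future__ import annotations

# Step names here are assumptions chosen to mirror the decorators above.
_REGISTRY: dict[str, dict[str, type]] = {
    "datagrabber": {},
    "datareader": {},
    "preprocessing": {},
    "marker": {},
    "storage": {},
}


def register(step: str, name: str, klass: type) -> None:
    """Store ``klass`` under ``name`` for the given pipeline step."""
    if step not in _REGISTRY:
        raise ValueError(f"Invalid step: {step!r}")
    _REGISTRY[step][name] = klass


def register_marker(klass: type) -> type:
    """Register a marker class by its class name and return it unchanged."""
    register(step="marker", name=klass.__name__, klass=klass)
    return klass


@register_marker
class MeanSignal:
    """Toy marker used only for this illustration."""

    def compute(self, values: list[float]) -> float:
        return sum(values) / len(values)


# The class can now be looked up by name and instantiated.
marker_cls = _REGISTRY["marker"]["MeanSignal"]
print(marker_cls is MeanSignal)
print(marker_cls().compute([1.0, 2.0, 3.0]))
```

Because the decorator returns the input class unchanged, the decorated class behaves exactly as if it had not been decorated; the only side effect is the registry entry.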
Queue Context#
Provide imports for queue context sub-package.
- class junifer.api.queue_context.GnuParallelLocalAdapter(job_name, job_dir, yaml_config_path, elements, pre_run=None, pre_collect=None, env=None, verbose='info', submit=False)#
Class for generating commands for GNU Parallel (local).
- Parameters:
- job_name: str
The job name.
- job_dir: pathlib.Path
The path to the job directory.
- yaml_config_path: pathlib.Path
The path to the YAML config file.
- elements: list of str or tuple
Element(s) to process. Will be used to index the DataGrabber.
- pre_run: str or None, optional
Extra shell commands to source before the run (default None).
- pre_collect: str or None, optional
Extra bash commands to source before the collect (default None).
- env: dict, optional
The Python environment configuration. If None, will run without a virtual environment of any kind (default None).
- verbose: str, optional
The level of verbosity (default “info”).
- submit: bool, optional
Whether to submit the jobs (default False).
- Raises:
ValueError
If ``env`` is invalid.
See also
QueueContextAdapter
The base class for QueueContext.
HTCondorAdapter
The concrete class for queueing via HTCondor.
Initialize the class.
- collect()#
Return collect commands.
- elements()#
Return elements to run.
- pre_collect()#
Return pre-collect commands.
- pre_run()#
Return pre-run commands.
- prepare()#
Prepare assets for submission.
- run()#
Return run commands.
- class junifer.api.queue_context.HTCondorAdapter(job_name, job_dir, yaml_config_path, elements, pre_run=None, pre_collect=None, env=None, verbose='info', cpus=1, mem='8G', disk='1G', extra_preamble=None, collect='yes', submit=False)#
Class for generating queueing scripts for HTCondor.
- Parameters:
- job_name: str
The job name to be used by HTCondor.
- job_dir: pathlib.Path
The path to the job directory.
- yaml_config_path: pathlib.Path
The path to the YAML config file.
- elements: list of str or tuple
Element(s) to process. Will be used to index the DataGrabber.
- pre_run: str or None, optional
Extra bash commands to source before the run (default None).
- pre_collect: str or None, optional
Extra bash commands to source before the collect (default None).
- env: dict, optional
The Python environment configuration. If None, will run without a virtual environment of any kind (default None).
- verbose: str, optional
The level of verbosity (default “info”).
- cpus: int, optional
The number of CPU cores to use (default 1).
- mem: str, optional
The size of memory (RAM) to use (default “8G”).
- disk: str, optional
The size of disk (HDD or SSD) to use (default “1G”).
- extra_preamble: str or None, optional
Extra commands to pass to HTCondor (default None).
- collect: {“yes”, “on_success_only”, “no”}, optional
Whether to submit “collect” task for junifer (default “yes”). Valid options are:
  - “yes”: Submit “collect” task and run even if some of the jobs fail.
  - “on_success_only”: Submit “collect” task and run only if all jobs succeed.
  - “no”: Do not submit “collect” task.
- submit: bool, optional
Whether to submit the jobs. In any case, .dag files will be created for submission (default False).
- Raises:
ValueError
If ``collect`` is invalid or if ``env`` is invalid.
See also
QueueContextAdapter
The base class for QueueContext.
GnuParallelLocalAdapter
The concrete class for queueing via GNU Parallel (local).
Initialize the class.
- collect()#
Return collect commands.
- dag()#
Return HTCondor DAG commands.
- pre_collect()#
Return pre-collect commands.
- pre_run()#
Return pre-run commands.
- prepare()#
Prepare assets for submission.
- run()#
Return run commands.
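The three documented values of the ``collect`` parameter can be restated as a small decision function. This is only a sketch of the documented semantics; the helper `collect_should_run` is hypothetical and says nothing about how the adapter encodes the behaviour in the files it generates.

```python
from __future__ import annotations

# The three options listed for the ``collect`` parameter.
VALID_COLLECT = ("yes", "on_success_only", "no")


def collect_should_run(collect: str, job_succeeded: list[bool]) -> bool:
    """Decide whether the "collect" task runs, per the option descriptions.

    Hypothetical helper that only restates the documented semantics.
    """
    if collect not in VALID_COLLECT:
        raise ValueError(f"Invalid value for collect: {collect!r}")
    if collect == "no":
        return False
    if collect == "on_success_only":
        return all(job_succeeded)
    return True  # "yes": run even if some jobs failed


outcomes = [True, False, True]  # one of three jobs failed
decisions = {opt: collect_should_run(opt, outcomes) for opt in VALID_COLLECT}
print(decisions)
```

With one failed job, only “yes” leads to a collect run; “on_success_only” would require every entry in `outcomes` to be True.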
- class junifer.api.queue_context.QueueContextAdapter#
Abstract base class for queue context adapter.
For every interface that is required, one needs to provide a concrete implementation of this abstract class.
- abstract collect()#
Return collect commands.
- abstract pre_collect()#
Return pre-collect commands.
- abstract pre_run()#
Return pre-run commands.
- abstract prepare()#
Prepare assets for submission.
- abstract run()#
Return run commands.
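A concrete adapter has to implement all five abstract methods listed above. The sketch below mirrors that contract with a stand-in base class so it runs without junifer. The class names, the constructor arguments, the `str` return types and the echoed shell commands are assumptions made for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class QueueContextAdapterSketch(ABC):
    """Stand-in mirroring the five abstract methods listed above."""

    @abstractmethod
    def pre_run(self) -> str: ...

    @abstractmethod
    def run(self) -> str: ...

    @abstractmethod
    def pre_collect(self) -> str: ...

    @abstractmethod
    def collect(self) -> str: ...

    @abstractmethod
    def prepare(self) -> None: ...


class EchoAdapter(QueueContextAdapterSketch):
    """Toy adapter that only echoes what it would do (hypothetical)."""

    def __init__(self, job_name: str, elements: list[str]) -> None:
        self.job_name = job_name
        self.elements = elements
        self.script: list[str] = []

    def pre_run(self) -> str:
        return "#!/bin/bash"

    def run(self) -> str:
        return "\n".join(f"echo run {self.job_name} {e}" for e in self.elements)

    def pre_collect(self) -> str:
        return "#!/bin/bash"

    def collect(self) -> str:
        return f"echo collect {self.job_name}"

    def prepare(self) -> None:
        # Assemble the commands in the order they would be executed.
        self.script = [self.pre_run(), self.run(), self.pre_collect(), self.collect()]


adapter = EchoAdapter("junifer_job", ["sub-01", "sub-02"])
adapter.prepare()
print("\n".join(adapter.script))
```

Leaving out any of the five methods in a subclass makes instantiation fail with a `TypeError`, which is how an abstract base class enforces the interface.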