9.2.8. API Functions#
Main API functions#
Provide imports for api sub-package.
- junifer.api.collect(storage)#
Collect and store data.
- Parameters:
- storage
dict Storage to use. Must have a key “kind” with the kind of storage to use. All other keys are passed to the storage init function.
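The storage argument follows a pattern that recurs throughout this API: one “kind” key selects the class, and the remaining keys become init arguments. The following is a minimal sketch of that split in plain Python. It is not taken from junifer itself; the helper split_kind and the values “ExampleStorage” and “uri” are hypothetical placeholders.

```python
def split_kind(spec):
    """Separate the "kind" entry from the remaining init arguments."""
    if "kind" not in spec:
        raise ValueError('the specification needs a "kind" key')
    # Everything except "kind" would be forwarded to the init function.
    init_kwargs = {key: value for key, value in spec.items() if key != "kind"}
    return spec["kind"], init_kwargs


# Hypothetical storage specification; real kinds and keys depend on the
# storage classes that are registered.
storage = {"kind": "ExampleStorage", "uri": "features.db"}
kind, init_kwargs = split_kind(storage)
print(kind, init_kwargs)
```

The same dict would be passed unchanged as the storage argument of junifer.api.collect.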
- junifer.api.queue(config, kind, jobname='junifer_job', overwrite=False, elements=None, **kwargs)#
Queue a job to be executed later.
- Parameters:
- config
dict The configuration to be used for queueing the job.
- kind
{“HTCondor”, “GNUParallelLocal”} The kind of job queue system to use.
- jobname
str, optional The name of the job (default “junifer_job”).
- overwrite
bool, optional Whether to overwrite if job directory already exists (default False).
- elements
str or tuple or list of str or tuple, optional Element(s) to process. Will be used to index the DataGrabber (default None).
- **kwargs
dict The keyword arguments to pass to the job queue system.
- Raises:
ValueError If kind is invalid or if the jobdir exists and overwrite = False.
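The two error conditions listed under Raises can be illustrated with a small stand-alone check. This is a sketch of the documented behaviour under stated assumptions, not junifer's implementation; the function check_queue_request is hypothetical, and only the two kind values come from the signature above.

```python
import tempfile
from pathlib import Path

VALID_KINDS = ("HTCondor", "GNUParallelLocal")


def check_queue_request(jobdir, kind, overwrite=False):
    """Raise ValueError in the two situations listed under Raises."""
    if kind not in VALID_KINDS:
        raise ValueError(f"Invalid queue kind: {kind!r}")
    if Path(jobdir).exists() and not overwrite:
        raise ValueError(f"Job directory {jobdir} exists and overwrite is False")


with tempfile.TemporaryDirectory() as tmp:
    # tmp already exists, so it stands in for a job directory from an earlier call.
    check_queue_request(tmp, "HTCondor", overwrite=True)  # accepted
    try:
        check_queue_request(tmp, "HTCondor")  # overwrite defaults to False
    except ValueError as err:
        print(f"rejected: {err}")
```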
- junifer.api.run(workdir, datagrabber, markers, storage, preprocessors=None, elements=None)#
Run the pipeline on the selected element.
- Parameters:
- workdir
str or pathlib.Path Directory where the pipeline will be executed.
- datagrabber
dict DataGrabber to use. Must have a key “kind” with the kind of DataGrabber to use. All other keys are passed to the DataGrabber init function.
- markers
list of dict List of markers to extract. Each marker is a dict with at least two keys: “name” and “kind”. The “name” key is used to name the output marker. The “kind” key is used to specify the kind of marker to extract. The rest of the keys are used to pass parameters to the marker calculation.
- storage
dict Storage to use. Must have a key “kind” with the kind of storage to use. All other keys are passed to the storage init function.
- preprocessors
list of dict, optional List of preprocessors to use. Each preprocessor is a dict with at least a key “kind” specifying the preprocessor to use. All other keys are passed to the preprocessor init function (default None).
- elements
str or tuple or list of str or tuple, optional Element(s) to process. Will be used to index the DataGrabber (default None).
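As an illustration of how the arguments of junifer.api.run fit together, the sketch below assembles them in a dict and checks the one structural rule stated above, that each marker has a “name” and a “kind”. All kinds, paths and extra keys are hypothetical placeholders, and check_markers is an illustrative helper rather than part of junifer.

```python
def check_markers(markers):
    """Verify that every marker dict carries the two required keys."""
    for marker in markers:
        missing = {"name", "kind"} - set(marker)
        if missing:
            raise ValueError(f"marker {marker} lacks keys: {sorted(missing)}")


# All kinds, paths and extra keys below are hypothetical placeholders.
run_kwargs = {
    "workdir": "/tmp/junifer_work",
    "datagrabber": {"kind": "ExampleDataGrabber", "datadir": "/data"},
    "markers": [
        {"name": "mean_signal", "kind": "ExampleMarker", "atlas": "example"},
    ],
    "storage": {"kind": "ExampleStorage", "uri": "features.db"},
    "preprocessors": None,
    # A single element may be a str or a tuple; several go in a list.
    "elements": [("sub-01", "ses-01"), ("sub-02", "ses-01")],
}
check_markers(run_kwargs["markers"])
# With junifer installed, the call would then be: junifer.api.run(**run_kwargs)
```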
Decorators#
Provide decorators for api.
- junifer.api.decorators.register(step, name, klass)#
Register a function to be used in a pipeline step.
- Parameters:
- Raises:
ValueError If the step is invalid.
- junifer.api.decorators.register_datagrabber(klass)#
Register DataGrabber.
Registers the DataGrabber so it can be used by name.
- Parameters:
- klass: class
The class of the DataGrabber to register.
- Returns:
- klass: class
The unmodified input class.
Notes
It should only be used as a decorator.
- junifer.api.decorators.register_datareader(klass)#
Register DataReader.
Registers the DataReader so it can be used by name.
- Parameters:
- klass: class
The class of the DataReader to register.
- Returns:
- klass: class
The unmodified input class.
Notes
It should only be used as a decorator.
- junifer.api.decorators.register_marker(klass)#
Marker registration decorator.
Registers the marker so it can be used by name.
- Parameters:
- klass: class
The class of the marker to register.
- Returns:
- klass: class
The unmodified input class.
- junifer.api.decorators.register_preprocessor(klass)#
Preprocessor registration decorator.
Registers the preprocessor so it can be used by name.
- Parameters:
- klass: class
The class of the preprocessor to register.
- Returns:
- klass: class
The unmodified input class.
- junifer.api.decorators.register_storage(klass)#
Storage registration decorator.
Registers the storage so it can be used by name.
- Parameters:
- klass: class
The class of the storage to register.
- Returns:
- klass: class
The unmodified input class.
Queue Context#
Provide imports for queue context sub-package.
- class junifer.api.queue_context.GnuParallelLocalAdapter(job_name, job_dir, yaml_config_path, elements, pre_run=None, pre_collect=None, env=None, verbose='info', submit=False)#
Class for generating commands for GNU Parallel (local).
- Parameters:
- job_name
str The job name.
- job_dir
pathlib.Path The path to the job directory.
- yaml_config_path
pathlib.Path The path to the YAML config file.
- elements
list of str or tuple Element(s) to process. Will be used to index the DataGrabber.
- pre_run
str or None, optional Extra shell commands to source before the run (default None).
- pre_collect
str or None, optional Extra bash commands to source before the collect (default None).
- env
dict, optional The Python environment configuration. If None, will run without a virtual environment of any kind (default None).
- verbose
str, optional The level of verbosity (default “info”).
- submit
bool, optional Whether to submit the jobs (default False).
- Raises:
ValueError If env is invalid.
See also
QueueContextAdapter The base class for QueueContext.
HTCondorAdapter The concrete class for queueing via HTCondor.
Initialize the class.
- collect()#
Return collect commands.
- elements()#
Return elements to run.
- pre_collect()#
Return pre-collect commands.
- pre_run()#
Return pre-run commands.
- prepare()#
Prepare assets for submission.
- run()#
Return run commands.
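To make the constructor arguments of GnuParallelLocalAdapter concrete, the sketch below gathers them with the documented types, in particular pathlib.Path for the two path arguments. The job name, directory layout and element tuples are hypothetical, and the adapter itself is not instantiated so that the example runs without junifer.

```python
from pathlib import Path

job_dir = Path("junifer_jobs") / "example_job"  # hypothetical location

adapter_kwargs = {
    "job_name": "example_job",
    "job_dir": job_dir,
    "yaml_config_path": job_dir / "config.yaml",
    "elements": [("sub-01", "ses-01"), ("sub-02", "ses-01")],
    "pre_run": None,
    "pre_collect": None,
    "env": None,  # None: run without a virtual environment of any kind
    "verbose": "info",
    "submit": False,
}
# With junifer installed, the adapter would be created with:
# junifer.api.queue_context.GnuParallelLocalAdapter(**adapter_kwargs)
print(adapter_kwargs["yaml_config_path"])
```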
- class junifer.api.queue_context.HTCondorAdapter(job_name, job_dir, yaml_config_path, elements, pre_run=None, pre_collect=None, env=None, verbose='info', cpus=1, mem='8G', disk='1G', extra_preamble=None, collect='yes', submit=False)#
Class for generating queueing scripts for HTCondor.
- Parameters:
- job_name
str The job name to be used by HTCondor.
- job_dir
pathlib.Path The path to the job directory.
- yaml_config_path
pathlib.Path The path to the YAML config file.
- elements
list of str or tuple Element(s) to process. Will be used to index the DataGrabber.
- pre_run
str or None, optional Extra bash commands to source before the run (default None).
- pre_collect
str or None, optional Extra bash commands to source before the collect (default None).
- env
dict, optional The Python environment configuration. If None, will run without a virtual environment of any kind (default None).
- verbose
str, optional The level of verbosity (default “info”).
- cpus
int, optional The number of CPU cores to use (default 1).
- mem
str, optional The size of memory (RAM) to use (default “8G”).
- disk
str, optional The size of disk (HDD or SSD) to use (default “1G”).
- extra_preamble
str or None, optional Extra commands to pass to HTCondor (default None).
- collect
{“yes”, “on_success_only”, “no”}, optional Whether to submit “collect” task for junifer (default “yes”). Valid options are:
- “yes”: Submit “collect” task and run even if some of the jobs fail.
- “on_success_only”: Submit “collect” task and run only if all jobs succeed.
- “no”: Do not submit “collect” task.
- submit
bool, optional Whether to submit the jobs. In any case, .dag files will be created for submission (default False).
- Raises:
ValueError If collect is invalid or if env is invalid.
See also
QueueContextAdapter The base class for QueueContext.
GnuParallelLocalAdapter The concrete class for queueing via GNU Parallel (local).
Initialize the class.
- collect()#
Return collect commands.
- dag()#
Return HTCondor DAG commands.
- pre_collect()#
Return pre-collect commands.
- pre_run()#
Return pre-run commands.
- prepare()#
Prepare assets for submission.
- run()#
Return run commands.
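The three values accepted by the collect parameter of HTCondorAdapter differ only in when the “collect” task runs. The following sketch restates those rules as a plain function. It models the documented semantics only and says nothing about how HTCondor or junifer enforces them; should_run_collect and the results list are hypothetical.

```python
def should_run_collect(collect, job_succeeded):
    """Decide whether the "collect" task runs, given per-job success flags."""
    if collect == "yes":
        return True  # runs even if some of the jobs failed
    if collect == "on_success_only":
        return all(job_succeeded)  # runs only if every job succeeded
    if collect == "no":
        return False  # no "collect" task is submitted
    raise ValueError(f"Invalid value for collect: {collect!r}")


results = [True, False, True]  # hypothetical outcome: one job failed
print(should_run_collect("yes", results))              # True
print(should_run_collect("on_success_only", results))  # False
print(should_run_collect("no", results))               # False
```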
- class junifer.api.queue_context.QueueContextAdapter#
Abstract base class for queue context adapter.
For every interface that is required, one needs to provide a concrete implementation of this abstract class.
- abstract collect()#
Return collect commands.
- abstract pre_collect()#
Return pre-collect commands.
- abstract pre_run()#
Return pre-run commands.
- abstract prepare()#
Prepare assets for submission.
- abstract run()#
Return run commands.
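The contract of an abstract base class like QueueContextAdapter can be sketched with Python's abc module. AdapterInterface below is a stand-in that declares the same five abstract methods, so the example runs without junifer; EchoAdapter and its return values are hypothetical, and returning the commands as strings is an assumption.

```python
from abc import ABC, abstractmethod


class AdapterInterface(ABC):
    """Stand-in that declares the five abstract methods listed above."""

    @abstractmethod
    def pre_run(self):
        """Return pre-run commands."""

    @abstractmethod
    def run(self):
        """Return run commands."""

    @abstractmethod
    def pre_collect(self):
        """Return pre-collect commands."""

    @abstractmethod
    def collect(self):
        """Return collect commands."""

    @abstractmethod
    def prepare(self):
        """Prepare assets for submission."""


class EchoAdapter(AdapterInterface):
    """Hypothetical adapter whose commands are plain echo statements."""

    def __init__(self):
        self.prepared = False

    def pre_run(self):
        return "echo 'before run'"

    def run(self):
        return "echo 'run'"

    def pre_collect(self):
        return "echo 'before collect'"

    def collect(self):
        return "echo 'collect'"

    def prepare(self):
        # A real adapter would write scripts or submit files here.
        self.prepared = True


adapter = EchoAdapter()
adapter.prepare()
print(adapter.run())
```

Leaving out any one of the five methods in a subclass makes instantiation fail with TypeError, which is how the abstract base class enforces a complete implementation.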