9.2.8. API Functions

Main API functions

Provide imports for api sub-package.

junifer.api.collect(storage)

Collect and store data.

Parameters:

storage: dict: Storage to use. Must have a key kind with the kind of storage to use. All other keys are passed to the storage init function.
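
As a minimal sketch of the storage dict described above: the kind SQLiteFeatureStorage and its uri argument are assumptions chosen for illustration, so substitute whichever storage kind is registered in your installation. The helper split_kind is hypothetical and only illustrates how the kind key is separated from the keys that are forwarded to the storage init function.

```python
# Storage specification: "kind" selects the storage class, every other key
# is forwarded to that class's init function.
# Assumption: "SQLiteFeatureStorage" and its "uri" argument are illustrative.
storage = {
    "kind": "SQLiteFeatureStorage",
    "uri": "features.sqlite",
}


def split_kind(spec):
    """Hypothetical helper: separate "kind" from the init arguments."""
    if "kind" not in spec:
        raise ValueError('The specification must have a "kind" key.')
    init_kwargs = {key: value for key, value in spec.items() if key != "kind"}
    return spec["kind"], init_kwargs


kind, init_kwargs = split_kind(storage)
print(kind)         # SQLiteFeatureStorage
print(init_kwargs)  # {'uri': 'features.sqlite'}

# With junifer installed and the storage already populated, the call itself
# would be:
#     from junifer.api import collect
#     collect(storage)
```

The same "kind plus init arguments" shape is used by the datagrabber, markers, and preprocessors parameters of run.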

junifer.api.queue(config, kind, jobname='junifer_job', overwrite=False, elements=None, **kwargs)

Queue a job to be executed later.

Parameters:

config: dict: The configuration to be used for queueing the job.
kind: {“HTCondor”, “GNUParallelLocal”}: The kind of job queue system to use.
jobname: str, optional: The name of the job (default “junifer_job”).
overwrite: bool, optional: Whether to overwrite if job directory already exists (default False).
elements: str or tuple or list of str or tuple, optional: Element(s) to process. Will be used to index the DataGrabber (default None).
**kwargs: dict: The keyword arguments to pass to the job queue system.

Raises:

ValueError: If kind is invalid, or if the job directory exists and overwrite is False.
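
The two error conditions above can be sketched as a pre-flight check. This is not junifer's implementation: check_queue_args is a hypothetical helper, and the job directory is passed in explicitly because the text above does not say where queue creates it.

```python
import tempfile
from pathlib import Path

# The two kinds listed in the parameter description above.
VALID_KINDS = ("HTCondor", "GNUParallelLocal")


def check_queue_args(kind, jobdir, overwrite=False):
    """Hypothetical pre-flight check for the documented ValueError cases."""
    if kind not in VALID_KINDS:
        raise ValueError(f"Invalid kind {kind!r}; expected one of {VALID_KINDS}.")
    if Path(jobdir).exists() and not overwrite:
        raise ValueError(f"Job directory {jobdir} exists and overwrite is False.")


errors = []
with tempfile.TemporaryDirectory() as existing_jobdir:
    attempts = [
        ("SLURM", existing_jobdir, True),             # rejected: invalid kind
        ("HTCondor", existing_jobdir, False),         # rejected: directory exists
        ("GNUParallelLocal", existing_jobdir, True),  # accepted
    ]
    for kind, jobdir, overwrite in attempts:
        try:
            check_queue_args(kind, jobdir, overwrite=overwrite)
        except ValueError as err:
            errors.append(str(err))

print(len(errors))  # 2
```

Running a check like this before calling queue lets a script fail early with a clear message instead of part-way through job preparation.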

junifer.api.run(workdir, datagrabber, markers, storage, preprocessors=None, elements=None)

Run the pipeline on the selected element(s).

Parameters:

workdir: str or pathlib.Path: Directory where the pipeline will be executed.
datagrabber: dict: DataGrabber to use. Must have a key kind with the kind of DataGrabber to use. All other keys are passed to the DataGrabber init function.
markers: list of dict: List of markers to extract. Each marker is a dict with at least two keys: name and kind. The name key is used to name the output marker. The kind key is used to specify the kind of marker to extract. The rest of the keys are used to pass parameters to the marker calculation.
storage: dict: Storage to use. Must have a key kind with the kind of storage to use. All other keys are passed to the storage init function.
preprocessors: list of dict, optional: List of preprocessors to use. Each preprocessor is a dict with at least a key kind specifying the preprocessor to use. All other keys are passed to the preprocessor init function (default None).
elements: str or tuple or list of str or tuple, optional: Element(s) to process. Will be used to index the DataGrabber (default None).
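
The parameters above might be assembled as follows. This is a sketch under stated assumptions: every kind value, the extra marker keys (parcellation, method), the storage uri, the working directory, and the element names are illustrative and not confirmed by the text above. The helper missing_marker_keys is hypothetical and only checks the documented requirement that each marker has a name and a kind.

```python
from pathlib import Path

# Assumption: all "kind" values and their extra keys are illustrative only.
run_kwargs = {
    "workdir": Path("/tmp/junifer_work"),
    "datagrabber": {"kind": "DataladAOMICID1000"},
    "markers": [
        {
            "name": "Schaefer100x17_mean",
            "kind": "ParcelAggregation",
            "parcellation": "Schaefer100x17",
            "method": "mean",
        },
    ],
    "storage": {"kind": "SQLiteFeatureStorage", "uri": "features.sqlite"},
    "preprocessors": None,
    "elements": ["sub-0001", "sub-0002"],
}


def missing_marker_keys(markers):
    """Hypothetical check: report markers lacking "name" or "kind"."""
    problems = []
    for index, marker in enumerate(markers):
        missing = sorted({"name", "kind"} - set(marker))
        if missing:
            problems.append((index, missing))
    return problems


problems = missing_marker_keys(run_kwargs["markers"])
bad = missing_marker_keys([{"kind": "ParcelAggregation"}])
print(problems)  # []
print(bad)       # [(0, ['name'])]


def run_pipeline():
    """Call junifer.api.run; requires junifer and the kinds above to exist."""
    from junifer.api import run

    run(**run_kwargs)
```

run_pipeline is defined but not called here, so the sketch can be read and checked without junifer or the dataset being available.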

Decorators

Provide decorators for api.

junifer.api.decorators.register(step, name, klass)

Parameters:

step: str: Name of the step.
name: str: Name of the function.
klass: class: Class to be registered.

Raises:

ValueError: If the step is invalid.
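
To illustrate the contract above (a class stored under a name for a given step, with a ValueError for an unknown step), here is a stand-in registry. It is not junifer's implementation: register_sketch, the registry dict, and the list of step names are all assumptions made for this example.

```python
# Assumption: these step names are illustrative, not confirmed by the text above.
VALID_STEPS = ("datagrabber", "datareader", "preprocessing", "marker", "storage")

# One name -> class mapping per step.
registry = {step: {} for step in VALID_STEPS}


def register_sketch(step, name, klass):
    """Stand-in for register(): store klass under name for the given step."""
    if step not in VALID_STEPS:
        raise ValueError(f"Invalid step {step!r}; expected one of {VALID_STEPS}.")
    registry[step][name] = klass


class MyMarker:
    """Placeholder class used only for this example."""


register_sketch("marker", "MyMarker", MyMarker)

try:
    register_sketch("plotting", "MyMarker", MyMarker)
    invalid_step_rejected = False
except ValueError:
    invalid_step_rejected = True

print(registry["marker"]["MyMarker"] is MyMarker)  # True
print(invalid_step_rejected)                       # True
```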

junifer.api.decorators.register_datagrabber(klass)

Registers the DataGrabber so it can be used by name.

Parameters:

klass: class: The class of the DataGrabber to register.

Returns:

klass: class: The unmodified input class.

Notes

It should only be used as a decorator.
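
The decorator form and the "unmodified input class" return value can be sketched with a stand-in. register_datagrabber_sketch and the datagrabbers dict are hypothetical, and registering under the class name is an assumption; with junifer installed, the real register_datagrabber would be placed above the class definition in the same way.

```python
# Hypothetical registry standing in for junifer's internal one.
datagrabbers = {}


def register_datagrabber_sketch(klass):
    """Stand-in decorator: record the class by name, return it unchanged."""
    # Assumption: the class name is the name it is registered under.
    datagrabbers[klass.__name__] = klass
    return klass


@register_datagrabber_sketch
class MyDataGrabber:
    """Placeholder DataGrabber used only for this example."""


# Because the decorator returns the class unmodified, the decorated name
# still refers to the very same class object.
print(datagrabbers["MyDataGrabber"] is MyDataGrabber)  # True
```

The other registration decorators in this section document the same return value, so the same pattern applies to them.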

junifer.api.decorators.register_datareader(klass)

Registers the DataReader so it can be used by name.

Parameters:

klass: class: The class of the DataReader to register.

Returns:

klass: class: The unmodified input class.

Notes

It should only be used as a decorator.

junifer.api.decorators.register_marker(klass)

Marker registration decorator.

Registers the marker so it can be used by name.

Parameters:

klass: class: The class of the marker to register.

Returns:

klass: class: The unmodified input class.

junifer.api.decorators.register_preprocessor(klass)

Preprocessor registration decorator.

Registers the preprocessor so it can be used by name.

Parameters:

klass: class: The class of the preprocessor to register.

Returns:

klass: class: The unmodified input class.

junifer.api.decorators.register_storage(klass)

Storage registration decorator.

Registers the storage so it can be used by name.

Parameters:

klass: class: The class of the storage to register.

Returns:

klass: class: The unmodified input class.

Queue Context

Provide imports for queue context sub-package.

class junifer.api.queue_context.GnuParallelLocalAdapter(job_name, job_dir, yaml_config_path, elements, pre_run=None, pre_collect=None, env=None, verbose='info', submit=False)

Class for generating commands for GNU Parallel (local).

Parameters:

job_name: str: The job name.
job_dir: pathlib.Path: The path to the job directory.
yaml_config_path: pathlib.Path: The path to the YAML config file.
elements: list of str or tuple: Element(s) to process. Will be used to index the DataGrabber.
pre_run: str or None, optional: Extra shell commands to source before the run (default None).
pre_collect: str or None, optional: Extra bash commands to source before the collect (default None).
env: dict, optional: The Python environment configuration. If None, will run without a virtual environment of any kind (default None).
verbose: str, optional: The level of verbosity (default “info”).
submit: bool, optional: Whether to submit the jobs (default False).

Raises:

ValueError: If env is invalid.
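
The constructor arguments above could be prepared like this. The directory layout, the config file name, and the element tuples are assumptions for illustration only. env is left as None because the structure of a valid env dict is not described above.

```python
from pathlib import Path

job_name = "junifer_job"
# Assumption: the directory layout and file name below are illustrative only.
job_dir = Path("junifer_jobs") / job_name

adapter_kwargs = {
    "job_name": job_name,
    "job_dir": job_dir,
    "yaml_config_path": job_dir / "config.yaml",
    # Each element is a str or a tuple, matching how the DataGrabber is indexed.
    "elements": [("sub-01", "ses-01"), ("sub-02", "ses-01")],
    "pre_run": None,
    "pre_collect": None,
    "env": None,        # None: run without a virtual environment of any kind
    "verbose": "info",
    "submit": False,    # documented default: do not submit the jobs
}

# Quick sanity check on the element types before handing them over.
elements_ok = all(isinstance(e, (str, tuple)) for e in adapter_kwargs["elements"])
print(elements_ok)  # True


def make_adapter():
    """Instantiate the adapter; requires junifer to be installed."""
    from junifer.api.queue_context import GnuParallelLocalAdapter

    return GnuParallelLocalAdapter(**adapter_kwargs)
```

make_adapter is defined but not called, so the sketch does not depend on junifer being importable.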