4.3. Data Grabber

4.3.1. Description

A DataGrabber is an object that provides an interface to the datasets you want to work with in junifer. Every concrete implementation of a DataGrabber knows the structure of a particular dataset and thus lets you fetch specific elements of interest from it. It adds the path key to each data type in the Data object.

DataGrabbers are intended to be used as context managers. When used within a context, a DataGrabber takes care of any steps needed before and after interacting with the dataset, for example, downloading the data and cleaning up afterwards. As the interface is consistent, you always use the same procedure to interact with a DataGrabber, whichever dataset it wraps.
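The context-manager behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not junifer's implementation: ToyDataGrabber, its constructor argument, and the fake files it creates are all assumptions made for the example. The "download" step is simulated by creating empty files in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


class ToyDataGrabber:
    """Hypothetical stand-in for a DataGrabber; not part of junifer."""

    def __init__(self, subjects):
        self.subjects = list(subjects)
        self.datadir = None

    def __enter__(self):
        # Pre step: simulate fetching the dataset by creating empty files.
        self.datadir = Path(tempfile.mkdtemp())
        for sub in self.subjects:
            anat = self.datadir / sub / "anat"
            anat.mkdir(parents=True)
            (anat / f"{sub}_T1w.nii.gz").touch()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Post step: remove the local copy, even if the with-block raised.
        shutil.rmtree(self.datadir)
        self.datadir = None

    def __iter__(self):
        # Iterating yields the elements of the dataset (here: subjects).
        return iter(self.subjects)

    def __getitem__(self, element):
        # Each data type carries a "path" key pointing at the file.
        path = self.datadir / element / "anat" / f"{element}_T1w.nii.gz"
        return {"T1w": {"path": path}}


with ToyDataGrabber(["sub-01", "sub-02"]) as dg:
    # Inside the context the files exist and can be looked up per element.
    found = {element: dg[element]["T1w"]["path"].exists() for element in dg}
    workdir = dg.datadir

print(found)
print(workdir.exists())  # the temporary copy is gone after the context
```

The calling code never deals with setup or cleanup itself; a different dataset would be handled by swapping in another class with the same four methods.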

For example, a concrete implementation of DataladDataGrabber can provide junifer with data from a Datalad dataset. DataGrabbers are not limited to Datalad datasets, however; they can work with any dataset.

If you want to use one of the DataGrabbers that junifer already provides, please go to Built-in Pipeline Components. If you want to implement your own DataGrabber, you need to provide a concrete implementation of one of the abstract base classes junifer provides.
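To make "a concrete implementation of an abstract base class" tangible, here is a small sketch built on Python's abc module. It is an assumption-laden illustration: ToyBaseDataGrabber and InMemoryDataGrabber are hypothetical names, and only two method names (get_elements and get_item) are borrowed from the BaseDataGrabber description in Section 4.3.2. The signatures of junifer's real base class may differ, so check its API reference before subclassing it.

```python
from abc import ABC, abstractmethod


class ToyBaseDataGrabber(ABC):
    """Hypothetical abstract base; junifer's BaseDataGrabber may differ."""

    @abstractmethod
    def get_elements(self):
        """Return every element the dataset offers."""

    @abstractmethod
    def get_item(self, element):
        """Return the data description for one element."""

    def __iter__(self):
        # Shared behaviour built on top of the abstract methods.
        yield from self.get_elements()

    def __getitem__(self, element):
        return self.get_item(element)


class InMemoryDataGrabber(ToyBaseDataGrabber):
    """Concrete subclass backed by a plain dict (illustration only)."""

    def __init__(self, table):
        self._table = dict(table)

    def get_elements(self):
        return sorted(self._table)

    def get_item(self, element):
        return {"T1w": {"path": self._table[element]}}


# The abstract class itself cannot be instantiated.
try:
    ToyBaseDataGrabber()
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False

dg = InMemoryDataGrabber({"sub-02": "b.nii.gz", "sub-01": "a.nii.gz"})
elements = list(dg)
item = dg["sub-01"]
print(abstract_instantiated, elements, item)
```

The base class owns the consistent interface (iteration and item access), while each subclass only supplies the dataset-specific parts.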

4.3.2. Base Classes

In this section, we showcase different abstract and concrete base classes you might want to use to implement your own DataGrabber.

Name: BaseDataGrabber
Description: The abstract base class providing you an interface to implement your own DataGrabber. You should try to avoid using this directly and instead use PatternDataGrabber or DataladDataGrabber. To build your own custom low-level DataGrabber, you need to override the get_elements_keys, get_elements and get_item methods, and most of the time you should also override other existing methods like __enter__ and __exit__.

Name: PatternDataGrabber
Description: It implements functionality to help you define the pattern of the dataset you want to get. For example, you know that T1w images are found in a directory following the pattern: {subject}/anat/{subject}_T1w.nii.gz inside of the dataset. Now you can provide this to the PatternDataGrabber and it will be able to get the file.

Name: DataladDataGrabber
Description: It implements functionality to deal with Datalad datasets. Specifically, the __enter__ and __exit__ methods take care of cloning and removing the Datalad dataset.

Name: PatternDataladDataGrabber
Description: It is a combination of PatternDataGrabber and DataladDataGrabber. This is probably the class you are looking for when using Datalad datasets.
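The pattern idea behind PatternDataGrabber can be illustrated with plain string formatting. The sketch below is not how junifer implements it: resolve and discover are hypothetical helpers, and the example assumes that each subject is a top-level directory of the dataset. Only the pattern itself, {subject}/anat/{subject}_T1w.nii.gz, comes from the description above.

```python
import tempfile
from pathlib import Path

# Pattern taken from the PatternDataGrabber description.
PATTERN = "{subject}/anat/{subject}_T1w.nii.gz"


def resolve(datadir, pattern, **element):
    """Fill the placeholders and return the path if the file exists."""
    path = Path(datadir) / pattern.format(**element)
    if not path.exists():
        raise FileNotFoundError(path)
    return path


def discover(datadir, pattern):
    """List subjects whose file matches the pattern.

    Assumes every subject is a top-level directory of the dataset.
    """
    subjects = []
    for child in sorted(Path(datadir).iterdir()):
        if (Path(datadir) / pattern.format(subject=child.name)).exists():
            subjects.append(child.name)
    return subjects


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny fake dataset: two subjects with a T1w image, one without.
    for sub in ("sub-01", "sub-02"):
        anat = Path(tmp) / sub / "anat"
        anat.mkdir(parents=True)
        (anat / f"{sub}_T1w.nii.gz").touch()
    (Path(tmp) / "sub-03").mkdir()

    subjects = discover(tmp, PATTERN)
    t1w = resolve(tmp, PATTERN, subject="sub-01")
    relative = t1w.relative_to(tmp).as_posix()

print(subjects)
print(relative)
```

A single pattern string is enough both to locate the file for a known element and to find out which elements the dataset actually contains.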