9.1.1. Data Grabbers#
Provide imports for datagrabber sub-package.
- class junifer.datagrabber.BaseDataGrabber(types, datadir)#
Abstract base class for DataGrabber.
For every interface that is required, one needs to provide a concrete implementation of this abstract class.
- Parameters:
- types: list of str
The types of data to be grabbed.
- datadir: str or pathlib.Path
The directory where the data is / will be stored.
- Attributes:
- datadir: pathlib.Path
Get data directory path.
- property datadir: Path#
Get data directory path.
- Returns:
pathlib.Path
Path to the data directory. Can be overridden by subclasses.
- filter(selection)#
Filter elements to be grabbed.
- abstract get_element_keys()#
Get element keys.
For each item in the element tuple passed to __getitem__(), this method returns the corresponding key(s).
- abstract get_elements()#
Get elements.
- Returns:
list
List of elements that can be grabbed. The elements can be strings, tuples or any object that will then be used as a key to index the DataGrabber.
- abstract get_item(**element)#
Get the specified item from the dataset.
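The contract described above (three abstract methods plus indexing by element) can be illustrated with a small stand-in. This is a minimal sketch, not junifer's implementation: `SketchBaseGrabber`, `InMemoryGrabber`, the `"path"` key and the way `__getitem__` pairs tuple items with keys are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class SketchBaseGrabber(ABC):
    """Hypothetical stand-in mirroring the documented interface."""

    def __init__(self, types, datadir):
        self.types = list(types)
        self._datadir = Path(datadir)

    @property
    def datadir(self):
        # Path to the data directory; subclasses may override this.
        return self._datadir

    @abstractmethod
    def get_element_keys(self):
        """Return the key name for each item of an element tuple."""

    @abstractmethod
    def get_elements(self):
        """Return the list of elements that can be grabbed."""

    @abstractmethod
    def get_item(self, **element):
        """Return the data for one element, given as keyword arguments."""

    def __getitem__(self, element):
        # Assumption: a bare value is treated as a one-item tuple, and
        # tuple items are paired positionally with the element keys.
        if not isinstance(element, tuple):
            element = (element,)
        named = dict(zip(self.get_element_keys(), element))
        return self.get_item(**named)


class InMemoryGrabber(SketchBaseGrabber):
    """Concrete example that only builds paths and never touches disk."""

    def get_element_keys(self):
        return ["subject", "session"]

    def get_elements(self):
        return [("sub-01", "ses-01"), ("sub-02", "ses-01")]

    def get_item(self, subject, session):
        return {
            t: {"path": self.datadir / subject / session / f"{t}.nii.gz"}
            for t in self.types
        }
```

Indexing `InMemoryGrabber(["BOLD"], "/tmp/data")[("sub-01", "ses-01")]` then calls `get_item(subject="sub-01", session="ses-01")`.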
- class junifer.datagrabber.DMCC13Benchmark(datadir=None, types=None, sessions=None, tasks=None, phase_encodings=None, runs=None, native_t1w=False)#
Concrete implementation for datalad-based data fetching of DMCC13.
- Parameters:
- datadir: str or Path or None, optional
The directory where the datalad dataset will be cloned. If None, the datalad dataset will be cloned into a temporary directory (default None).
- types: {“BOLD”, “BOLD_confounds”, “T1w”, “VBM_CSF”, “VBM_GM”, “VBM_WM”} or a list of the options, optional
DMCC data types. If None, all available data types are selected (default None).
- sessions: {“ses-wave1bas”, “ses-wave1pro”, “ses-wave1rea”} or list of the options, optional
DMCC sessions. If None, all available sessions are selected (default None).
- tasks: {“Rest”, “Axcpt”, “Cuedts”, “Stern”, “Stroop”} or list of the options, optional
DMCC task sessions. If None, all available task sessions are selected (default None).
- phase_encodings: {“AP”, “PA”} or list of the options, optional
DMCC phase encoding directions. If None, all available phase encodings are selected (default None).
- runs: {“1”, “2”} or list of the options, optional
DMCC runs. If None, all available runs are selected (default None).
- native_t1w: bool, optional
Whether to use T1w in native space (default False).
- Raises:
ValueError
If an invalid value is passed for sessions, tasks, phase_encodings or runs.
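The "one option, a list of options, or None for all" convention used by these parameters, together with the ValueError on invalid values, could be implemented along the following lines. This is a sketch under assumptions: `normalise_options` is a hypothetical helper, not part of junifer, and the error message wording is invented.

```python
def normalise_options(value, allowed, name):
    """Return the selected options as a list; None selects everything."""
    if value is None:
        return list(allowed)
    if isinstance(value, str):
        # A single option is accepted and wrapped in a list.
        value = [value]
    invalid = [v for v in value if v not in allowed]
    if invalid:
        raise ValueError(
            f"Invalid value(s) for {name}: {invalid}. "
            f"Allowed values: {list(allowed)}"
        )
    return list(value)


# Session names taken from the parameter description above.
SESSIONS = ("ses-wave1bas", "ses-wave1pro", "ses-wave1rea")
```

Calling `normalise_options("ses-wave1bas", SESSIONS, "sessions")` yields a one-item list, while an unknown session name raises ValueError.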
- get_elements()#
Implement fetching list of subjects in the dataset.
- get_item(subject, session, task, phase_encoding, run)#
Index one element in the dataset.
- Parameters:
- subject: str
The subject ID.
- session: {“ses-wave1bas”, “ses-wave1pro”, “ses-wave1rea”}
The session to get.
- task: {“Rest”, “Axcpt”, “Cuedts”, “Stern”, “Stroop”}
The task to get.
- phase_encoding: {“AP”, “PA”}
The phase encoding to get.
- run: {“1”, “2”}
The run to get.
- Returns:
- out: dict
Dictionary of paths for each type of data required for the specified element.
- class junifer.datagrabber.DataladAOMICID1000(datadir=None, types=None, native_t1w=False)#
Concrete implementation for datalad-based data fetching of AOMIC ID1000.
- Parameters:
- datadir: str or Path or None, optional
The directory where the datalad dataset will be cloned. If None, the datalad dataset will be cloned into a temporary directory (default None).
- types: {“BOLD”, “BOLD_confounds”, “T1w”, “VBM_CSF”, “VBM_GM”, “VBM_WM”, “DWI”} or a list of the options, optional
AOMIC data types. If None, all available data types are selected (default None).
- native_t1w: bool, optional
Whether to use T1w in native space (default False).
- class junifer.datagrabber.DataladAOMICPIOP1(datadir=None, types=None, tasks=None, native_t1w=False)#
Concrete implementation for pattern-based data fetching of AOMIC PIOP1.
- Parameters:
- datadir: str or Path or None, optional
The directory where the datalad dataset will be cloned. If None, the datalad dataset will be cloned into a temporary directory (default None).
- types: {“BOLD”, “BOLD_confounds”, “T1w”, “VBM_CSF”, “VBM_GM”, “VBM_WM”, “DWI”} or a list of the options, optional
AOMIC data types. If None, all available data types are selected (default None).
- tasks: {“restingstate”, “anticipation”, “emomatching”, “faces”, “gstroop”, “workingmemory”} or list of the options, optional
AOMIC PIOP1 task sessions. If None, all available task sessions are selected (default None).
- native_t1w: bool, optional
Whether to use T1w in native space (default False).
- Raises:
ValueError
If an invalid value is passed for tasks.
- get_elements()#
Implement fetching list of subjects in the dataset.
- class junifer.datagrabber.DataladAOMICPIOP2(datadir=None, types=None, tasks=None, native_t1w=False)#
Concrete implementation for pattern-based data fetching of AOMIC PIOP2.
- Parameters:
- datadir: str or Path or None, optional
The directory where the datalad dataset will be cloned. If None, the datalad dataset will be cloned into a temporary directory (default None).
- types: {“BOLD”, “BOLD_confounds”, “T1w”, “VBM_CSF”, “VBM_GM”, “VBM_WM”, “DWI”} or a list of the options, optional
AOMIC data types. If None, all available data types are selected (default None).
- tasks: {“restingstate”, “stopsignal”, “workingmemory”} or list of the options, optional
AOMIC PIOP2 task sessions. If None, all available task sessions are selected (default None).
- native_t1w: bool, optional
Whether to use T1w in native space (default False).
- Raises:
ValueError
If an invalid value is passed for tasks.
- class junifer.datagrabber.DataladDataGrabber(rootdir='.', datadir=None, uri=None, **kwargs)#
Abstract base class for datalad-based data fetching.
Defines a DataGrabber that gets data from a datalad sibling.
- Parameters:
- rootdir: str or pathlib.Path, optional
The path within the datalad dataset to the root directory (default “.”).
- datadir: str or pathlib.Path or None, optional
The directory where the datalad dataset will be cloned. If None, the datalad dataset will be cloned into a temporary directory (default None).
- uri: str or None, optional
URI of the datalad sibling (default None).
- **kwargs
Keyword arguments passed to superclass.
See also
BaseDataGrabber
Abstract base class for DataGrabber.
PatternDataGrabber
Concrete implementation for pattern-based data fetching.
PatternDataladDataGrabber
Concrete implementation for pattern and datalad based data fetching.
Notes
This class is intended to be used as a superclass of a subclass with multiple inheritance.
Methods
- install: Installs (clones) the datalad dataset into the datadir. This method is called automatically when the datagrabber is used within a context.
- remove: Removes the datalad dataset from the datadir. This method is called automatically when the datagrabber is used within a context.
- cleanup()#
Cleanup the datalad dataset.
- property datadir: Path#
Get data directory path.
- Returns:
pathlib.Path
Path to the data directory.
- install()#
Install the datalad dataset into the datadir.
- Raises:
ValueError
If the dataset is already installed but with a different ID.
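The note that install and remove are called automatically "within a context" refers to Python's context-manager protocol. The sketch below shows only that mechanism and makes several assumptions: `SketchInstallable` is hypothetical, it creates a plain directory instead of cloning with datalad, and it deletes only temporary directories it created itself, which may differ from what junifer does.

```python
import shutil
import tempfile
from pathlib import Path


class SketchInstallable:
    """Hypothetical stand-in: a directory is created instead of a clone."""

    def __init__(self, datadir=None):
        self._requested = None if datadir is None else Path(datadir)
        self._datadir = None
        self._is_temporary = False

    @property
    def datadir(self):
        return self._datadir

    def install(self):
        if self._requested is None:
            # No datadir given: fall back to a temporary directory.
            self._datadir = Path(tempfile.mkdtemp())
            self._is_temporary = True
        else:
            self._requested.mkdir(parents=True, exist_ok=True)
            self._datadir = self._requested

    def remove(self):
        # Sketch-specific choice: only delete what install() created.
        if self._is_temporary and self._datadir is not None:
            shutil.rmtree(self._datadir, ignore_errors=True)
        self._datadir = None
        self._is_temporary = False

    def __enter__(self):
        self.install()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.remove()
        return False  # do not swallow exceptions
```

Inside `with SketchInstallable() as grabber:` the directory exists; once the block exits it has been removed again.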
- class junifer.datagrabber.DataladHCP1200(datadir=None, tasks=None, phase_encodings=None, ica_fix=False)#
Concrete implementation for datalad-based data fetching of HCP1200.
- Parameters:
- datadir: str or Path or None, optional
The directory where the datalad dataset will be cloned. If None, the datalad dataset will be cloned into a temporary directory (default None).
- tasks: {“REST1”, “REST2”, “SOCIAL”, “WM”, “RELATIONAL”, “EMOTION”, “LANGUAGE”, “GAMBLING”, “MOTOR”} or list of the options or None, optional
HCP task sessions. If None, all available task sessions are selected (default None).
- phase_encodings: {“LR”, “RL”} or list of the options or None, optional
HCP phase encoding directions. If None, both will be used (default None).
- ica_fix: bool, optional
Whether to retrieve data that was processed with ICA+FIX. Only “REST1” and “REST2” tasks are available with ICA+FIX (default False).
- Raises:
ValueError
If an invalid value is passed for tasks or phase_encodings.
- class junifer.datagrabber.HCP1200(datadir, tasks=None, phase_encodings=None, ica_fix=False)#
Concrete implementation for pattern-based data fetching of HCP1200.
- Parameters:
- datadir: str or Path, optional
The directory where the data is / will be stored.
- tasks: {“REST1”, “REST2”, “SOCIAL”, “WM”, “RELATIONAL”, “EMOTION”, “LANGUAGE”, “GAMBLING”, “MOTOR”} or list of the options or None, optional
HCP task sessions. If None, all available task sessions are selected (default None).
- phase_encodings: {“LR”, “RL”} or list of the options or None, optional
HCP phase encoding directions. If None, both will be used (default None).
- ica_fix: bool, optional
Whether to retrieve data that was processed with ICA+FIX. Only “REST1” and “REST2” tasks are available with ICA+FIX (default False).
- Raises:
ValueError
If an invalid value is passed for tasks or phase_encodings.
- get_elements()#
Implement fetching list of elements in the dataset.
- Returns:
list
The list of elements that can be grabbed in the dataset.
- get_item(subject, task, phase_encoding)#
Implement single element indexing in the database.
- class junifer.datagrabber.MultipleDataGrabber(datagrabbers, **kwargs)#
Concrete implementation for multi sourced data fetching.
Implements a DataGrabber which can be used to fetch data from multiple DataGrabbers.
- Parameters:
- datagrabbers: list of DataGrabber-like objects
The DataGrabbers to use for fetching data.
- **kwargs
Keyword arguments passed to superclass.
- get_element_keys()#
Get element keys.
For each item in the element tuple passed to __getitem__(), this method returns the corresponding key(s).
- get_elements()#
Get elements.
- Returns:
list
List of elements that can be grabbed. The elements can be strings, tuples or any object that will then be used as a key to index the DataGrabber. The element should be present in all of the related DataGrabbers.
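The requirement that an element be present in all related DataGrabbers amounts to an intersection over the per-grabber element lists. A minimal sketch follows, assuming elements are hashable; `common_elements` is a hypothetical helper and says nothing about how junifer itself handles elements missing from one source.

```python
def common_elements(element_lists):
    """Return elements found in every list, ordered as in the first list."""
    if not element_lists:
        return []
    shared = set(element_lists[0])
    for elements in element_lists[1:]:
        shared &= set(elements)
    # Preserve the order of the first list for reproducible output.
    return [e for e in element_lists[0] if e in shared]
```

For two sources listing `["sub-01", "sub-02", "sub-03"]` and `["sub-03", "sub-01"]`, only `sub-01` and `sub-03` remain.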
- class junifer.datagrabber.PatternDataGrabber(types, patterns, replacements, datadir, confounds_format=None)#
Concrete implementation for pattern-based data fetching.
Implements a DataGrabber that understands patterns to grab data.
- Parameters:
- types: list of str
The types of data to be grabbed.
- patterns: dict
Data type patterns as a dictionary. It has the following schema:
  - "T1w": { "mandatory": ["pattern", "space"], "optional": [] }
  - "T2w": { "mandatory": ["pattern", "space"], "optional": [] }
  - "BOLD": { "mandatory": ["pattern", "space"], "optional": ["mask_item"] }
  - "Warp": { "mandatory": ["pattern", "src", "dst"], "optional": [] }
  - "BOLD_confounds": { "mandatory": ["pattern", "format"], "optional": [] }
  - "VBM_GM": { "mandatory": ["pattern", "space"], "optional": [] }
  - "VBM_WM": { "mandatory": ["pattern", "space"], "optional": [] }
Basically, for each data type, one needs to provide mandatory keys and can choose to also provide optional keys. The value for each key is a string. So, one needs to provide necessary data types as a dictionary, for example:
    {
        "BOLD": {
            "pattern": "...",
            "space": "...",
        },
        "T1w": {
            "pattern": "...",
            "space": "...",
        },
        "Warp": {
            "pattern": "...",
            "src": "...",
            "dst": "...",
        }
    }
taken from HCP1200.
- replacements: str or list of str
Replacements in the pattern key of each data type. The value needs to be a list of all possible replacements.
- datadir: str or pathlib.Path
The directory where the data is / will be stored.
- confounds_format: {“fmriprep”, “adhoc”} or None, optional
The format of the confounds for the dataset (default None).
- Raises:
ValueError
If confounds_format is invalid.
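The mandatory/optional schema given for patterns lends itself to a simple pre-flight check before constructing a grabber. The following is a sketch only: `SCHEMA` copies the table above, while `check_patterns` and its messages are hypothetical and do not reproduce junifer's own validation.

```python
# Schema copied from the parameter description above.
SCHEMA = {
    "T1w": {"mandatory": ["pattern", "space"], "optional": []},
    "T2w": {"mandatory": ["pattern", "space"], "optional": []},
    "BOLD": {"mandatory": ["pattern", "space"], "optional": ["mask_item"]},
    "Warp": {"mandatory": ["pattern", "src", "dst"], "optional": []},
    "BOLD_confounds": {"mandatory": ["pattern", "format"], "optional": []},
    "VBM_GM": {"mandatory": ["pattern", "space"], "optional": []},
    "VBM_WM": {"mandatory": ["pattern", "space"], "optional": []},
}


def check_patterns(types, patterns):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    for dtype in types:
        if dtype not in SCHEMA:
            problems.append(f"{dtype}: unknown data type")
            continue
        if dtype not in patterns:
            problems.append(f"{dtype}: no entry in patterns")
            continue
        spec = SCHEMA[dtype]
        given = set(patterns[dtype])
        missing = [k for k in spec["mandatory"] if k not in given]
        extra = sorted(given - set(spec["mandatory"]) - set(spec["optional"]))
        if missing:
            problems.append(f"{dtype}: missing mandatory key(s) {missing}")
        if extra:
            problems.append(f"{dtype}: unexpected key(s) {extra}")
    return problems
```

An empty return value means every requested type has all mandatory keys and no keys outside the schema.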
- get_element_keys()#
Get element keys.
For each item in the “element” tuple, this function returns the corresponding key, that is, the replacements of patterns defined in the constructor.
- get_elements()#
Implement fetching list of elements in the dataset.
It will use regex to search for “replacements” in the “patterns” and return the intersection of the results for each type i.e., build a list of elements that have all the required types.
- Returns:
list
The list of elements that can be grabbed in the dataset.
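The regex-and-intersection idea described for get_elements can be sketched as follows. Several things here are assumptions rather than documented behaviour: placeholders are written `{name}`, every pattern contains every replacement, a placeholder never spans a path separator, and the files are given as relative POSIX-style strings. `pattern_to_regex` and `find_elements` are hypothetical names.

```python
import re


def pattern_to_regex(pattern, replacements):
    """Turn "{subject}/anat/{subject}_T1w.nii.gz" into a compiled regex."""
    regex = re.escape(pattern)
    for name in replacements:
        placeholder = re.escape("{" + name + "}")
        # First occurrence captures the value, later ones must repeat it.
        regex = regex.replace(placeholder, f"(?P<{name}>[^/]+)", 1)
        regex = regex.replace(placeholder, f"(?P={name})")
    return re.compile(regex)


def find_elements(files, patterns, replacements):
    """Return elements for which every data type has a matching file."""
    per_type = []
    for spec in patterns.values():
        regex = pattern_to_regex(spec["pattern"], replacements)
        found = set()
        for path in files:
            match = regex.fullmatch(path)
            if match:
                found.add(tuple(match.group(r) for r in replacements))
        per_type.append(found)
    if not per_type:
        return []
    # Keep only elements available for all requested data types.
    return sorted(set.intersection(*per_type))
```

An element that has a T1w file but no BOLD file is dropped when both types are requested.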
- get_item(**element)#
Implement single element indexing for the datagrabber.
This method constructs a real path to the requested item’s data, by replacing the patterns with actual values passed via **element.
- Parameters:
- element: dict
The element to be indexed. The keys must be the same as the replacements.
- Returns:
dict
Dictionary of dictionaries for each type of data required for the specified element.
- Raises:
RuntimeError
If more than one file matches for a data type’s pattern or if no file matches for a data type’s pattern or if file cannot be accessed for an element.
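The path construction and the RuntimeError conditions described for get_item can be illustrated with a short sketch. It assumes `{name}` placeholders filled with `str.format`, and it stores the resolved file under a `"path"` key alongside the other keys of each data type; both are assumptions, and `resolve_item` is a hypothetical function rather than junifer's code.

```python
from pathlib import Path


def resolve_item(datadir, patterns, **element):
    """Build one dictionary per data type with the resolved file path."""
    out = {}
    for dtype, spec in patterns.items():
        relative = spec["pattern"].format(**element)
        # glob() also handles patterns that still contain wildcards.
        matches = sorted(Path(datadir).glob(relative))
        if not matches:
            raise RuntimeError(f"No file matches {relative!r} for {dtype}")
        if len(matches) > 1:
            raise RuntimeError(f"More than one file matches {relative!r} for {dtype}")
        extra = {k: v for k, v in spec.items() if k != "pattern"}
        out[dtype] = {"path": matches[0], **extra}
    return out
```

The keyword arguments must use the same names as the replacements, for example `resolve_item(datadir, patterns, subject="sub-01")`.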
- class junifer.datagrabber.PatternDataladDataGrabber(**kwargs)#
Concrete implementation for pattern and datalad based data fetching.
Implements a DataGrabber that gets data from a datalad sibling, interpreting patterns.
- Parameters:
- types: list of str
The types of data to be grabbed.
- patterns: dict
Data type patterns as a dictionary. It has the following schema:
  - "T1w": { "mandatory": ["pattern", "space"], "optional": [] }
  - "T2w": { "mandatory": ["pattern", "space"], "optional": [] }
  - "BOLD": { "mandatory": ["pattern", "space"], "optional": ["mask_item"] }
  - "Warp": { "mandatory": ["pattern", "src", "dst"], "optional": [] }
  - "BOLD_confounds": { "mandatory": ["pattern", "format"], "optional": [] }
  - "VBM_GM": { "mandatory": ["pattern", "space"], "optional": [] }
  - "VBM_WM": { "mandatory": ["pattern", "space"], "optional": [] }
Basically, for each data type, one needs to provide mandatory keys and can choose to also provide optional keys. The value for each key is a string. So, one needs to provide necessary data types as a dictionary, for example:
    {
        "BOLD": {
            "pattern": "...",
            "space": "...",
        },
        "T1w": {
            "pattern": "...",
            "space": "...",
        },
        "Warp": {
            "pattern": "...",
            "src": "...",
            "dst": "...",
        }
    }
taken from HCP1200.
- replacements: str or list of str
Replacements in the pattern key of each data type. The value needs to be a list of all possible replacements.
- confounds_format: {“fmriprep”, “adhoc”} or None, optional
The format of the confounds for the dataset (default None).
- datadir: str or pathlib.Path or None, optional
The directory where the datalad dataset will be cloned. If None, the datalad dataset will be cloned into a temporary directory (default None).
- rootdir: str or pathlib.Path, optional
The path within the datalad dataset to the root directory (default “.”).
- uri: str or None, optional
URI of the datalad sibling (default None).
See also
DataladDataGrabber
Abstract base class for datalad-based data fetching.
PatternDataGrabber
Concrete implementation for pattern-based data fetching.