.. include:: ../links.inc

.. _extending_preprocessors:

Creating Preprocessors
======================

As mentioned in the introduction, ``junifer`` does not do traditional MRI
preprocessing, but it can perform minimal preprocessing of the data that the
DataGrabber provides, for example, smoothing after confound regression or
transforming data to subject-native space before feature extraction. A few
Preprocessors are already available and we are constantly adding new ones, but
if you need something specific, you can create your own Preprocessor.

While implementing your own Preprocessor, you always need to inherit from
:class:`.BasePreprocessor` and implement a few methods and class attributes:

#. (optional) ``validate_preprocessor_params``: The method to perform logical
   validation of parameters (if required).
#. ``preprocess``: The method that, given the data, preprocesses it.

As an example, we will develop a ``NilearnSmoothing`` Preprocessor, which
smooths the data using :func:`nilearn.image.smooth_img`. This is often
desirable when your data is preprocessed using ``fMRIPrep``, as ``fMRIPrep``
does not perform smoothing.

.. _extending_preprocessors_input_output:

Step 1: Configure input
-----------------------

In this step, we define the input data types of the Preprocessor. For input,
we can accept ``T1w``, ``T2w`` and ``BOLD`` :ref:`data types ` and thus
declare them in a class attribute:

.. code-block:: python

    _VALID_DATA_TYPES = [DataType.T1w, DataType.T2w, DataType.BOLD]

.. _extending_preprocessors_init:

Step 2: Initialise the Preprocessor
-----------------------------------

Now we need to define the parameters of our Preprocessor class. Our class will
have the following arguments:

1. ``fwhm``: The smoothing strength as a full-width at half maximum (in
   millimetres). Since we depend on :func:`nilearn.image.smooth_img`, we pass
   the value to it.
2. ``on``: The data type we want the Preprocessor to work on. If the user does
   not specify it, the Preprocessor will work on all the data types given by
   the ``_VALID_DATA_TYPES`` attribute.

.. attention::

   Only basic types (*int*, *bool* and *str*), lists, tuples and dictionaries
   are allowed as parameters. This is because the parameters are stored in
   JSON format, and JSON only supports these types.

As :class:`.BasePreprocessor` already defines ``on``, we only need to define
the other parameter:

.. code-block:: python

    from typing import Literal

    from numpy.typing import ArrayLike

    ...

    fwhm: int | float | ArrayLike | Literal["fast"] | None

    ...

.. caution::

   Parameters of the Preprocessor must be stored as object attributes without
   using ``_`` as a prefix. Any attribute that starts with ``_`` is not
   considered a parameter and is not stored as part of the metadata of the
   Preprocessor.

.. _extending_preprocessors_preprocess:

Step 3: Preprocess the data
---------------------------

Finally, we will write the actual logic of the Preprocessor. This method will
be called by ``junifer`` when needed, using the data provided by the
DataGrabber, as configured by the user. The method ``preprocess`` has two
arguments:

* ``input``: A dictionary with the data to be used by the Preprocessor. This
  will be the corresponding element in the :ref:`Data Object` already indexed.
  Thus, the dictionary has at least two keys: ``data`` and ``path``. The first
  one contains the data, while the second one contains the path to the data.
  The dictionary can also contain other keys, depending on the data type.
* ``extra_input``: The rest of the :ref:`Data Object`. This is useful if you
  want to use other data (e.g., ``Warp`` can be used to provide the
  transformation matrix file for transformation to subject-native space).

.. code-block:: python

    from typing import Any

    from nilearn import image as nimg

    ...

    def preprocess(
        self,
        input: dict[str, Any],
        extra_input: dict[str, Any] | None = None,
    ) -> dict[str, Any]:
        # Replace the image with its smoothed version and return the
        # updated dictionary
        input["data"] = nimg.smooth_img(imgs=input["data"], fwhm=self.fwhm)
        return input

    ...

Step 4: Finalise the Preprocessor
---------------------------------

Now we just need to combine everything we have above and add a couple of other
things to get our Preprocessor ready.

First, we specify the :ref:`dependencies ` for our class, which are the
packages required by the class. This is used for validation before running, to
ensure that all the packages are installed, and also to keep track of the
dependencies and their versions in the metadata. We define it using a class
attribute like so:

.. code-block:: python

    _DEPENDENCIES = {"nilearn"}

Then, we just need to register the Preprocessor using the
``@register_preprocessor`` decorator, and our final code should look like
this:

.. code-block:: python

    from collections.abc import Sequence
    from typing import Any, ClassVar, Literal

    from junifer.api.decorators import register_preprocessor
    from junifer.datagrabber import DataType
    from junifer.preprocess import BasePreprocessor
    from nilearn import image as nimg
    from numpy.typing import ArrayLike


    @register_preprocessor
    class NilearnSmoothing(BasePreprocessor):

        # Packages required by this Preprocessor
        _DEPENDENCIES = {"nilearn"}
        # Data types this Preprocessor can work on
        _VALID_DATA_TYPES: ClassVar[Sequence[DataType]] = [
            DataType.T1w,
            DataType.T2w,
            DataType.BOLD,
        ]

        # Smoothing strength, passed on to nilearn.image.smooth_img
        fwhm: int | float | ArrayLike | Literal["fast"] | None

        def preprocess(
            self,
            input: dict[str, Any],
            extra_input: dict[str, Any] | None = None,
        ) -> dict[str, Any]:
            input["data"] = nimg.smooth_img(imgs=input["data"], fwhm=self.fwhm)
            return input

.. _extending_preprocessors_template:

Template for a custom Preprocessor
----------------------------------

.. code-block:: python

    from junifer.api.decorators import register_preprocessor
    from junifer.preprocess import BasePreprocessor


    @register_preprocessor
    class TemplatePreprocessor(BasePreprocessor):

        # TODO: add the dependencies
        _DEPENDENCIES = {}
        # TODO: add the inputs
        _VALID_DATA_TYPES = []

        # TODO: define preprocessor-specific parameters

        # optional
        def validate_preprocessor_params(self):
            # TODO: add validation logic for preprocessor parameters
            pass

        def preprocess(self, input, extra_input=None):
            # TODO: add the preprocessor logic
            return input
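
The optional ``validate_preprocessor_params`` hook in the template is left
empty. As a rough illustration of the kind of logical validation it could
hold for the ``fwhm`` parameter of ``NilearnSmoothing``, here is a minimal
sketch in plain Python. The helper ``check_fwhm`` is hypothetical (it is not
part of ``junifer`` or ``nilearn``), and the rules it enforces (``None``,
``"fast"``, a non-negative number, or three non-negative numbers) are an
assumption made for this example, not the library's own checks.

```python
from collections.abc import Sequence
from numbers import Real


def _is_valid_width(value):
    """Return True for a non-negative real number (bools are rejected)."""
    return (
        isinstance(value, Real)
        and not isinstance(value, bool)
        and value >= 0
    )


def check_fwhm(fwhm):
    """Raise ValueError if ``fwhm`` is not a plausible smoothing width.

    Hypothetical helper for illustration only; the accepted values are an
    assumption of this sketch.
    """
    # None means "no smoothing" and needs no further checks
    if fwhm is None:
        return
    # The only accepted string is the "fast" shortcut
    if isinstance(fwhm, str):
        if fwhm != "fast":
            raise ValueError(f"Unknown fwhm option: {fwhm!r}")
        return
    # A single width applied along every axis
    if _is_valid_width(fwhm):
        return
    # One width per spatial axis (list or tuple of length 3)
    if isinstance(fwhm, Sequence):
        if len(fwhm) == 3 and all(_is_valid_width(v) for v in fwhm):
            return
        raise ValueError("fwhm sequence must hold 3 non-negative numbers")
    raise ValueError(f"Invalid fwhm value: {fwhm!r}")
```

Inside a Preprocessor, ``validate_preprocessor_params`` could then simply call
``check_fwhm(self.fwhm)``, so that a misconfigured value is reported before
any data is processed.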