4. Overview of available Pipeline Steps

The following is a list of all available steps, by name, that can be used to create a pipeline. The overview is sorted by the type of step: Transformers or Models (Estimators).
The column ‘Name (str)’ gives the string name of
the respective step, i.e. how it must be specified when passed to e.g. the
PipelineCreator. The column ‘Description’ gives a short
description of what the step does. The column ‘Class’ either indicates the
underlying scikit-learn class of the respective pipeline step, together with
a link to that class in the scikit-learn documentation (follow the link to
see the valid parameters), or indicates the class in
the Julearn code, so one can take a closer look at it in Julearn’s
Reference.
For feature transformations, the Transformers have to be used
with the PipelineCreator; for target transformations, with the
TargetPipelineCreator.
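Each string name simply selects one of the underlying classes listed below. As a rough illustration of what a pipeline built from these steps amounts to, here is the plain scikit-learn equivalent of chaining a standard scaler with an SVM classifier (a sketch only; the step labels in the code are illustrative, and Julearn's PipelineCreator adds column-type handling on top of this):

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy data: 100 samples, 5 features, binary target.
X, y = make_classification(n_samples=100, n_features=5, random_state=42)

# Equivalent of a pipeline with a z-scoring step followed by an SVM.
pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC())])
pipe.fit(X, y)
print(pipe.score(X, y))  # in-sample accuracy
```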
4.1. Transformers

Scalers

| Name (str) | Description | Class |
|---|---|---|
| `zscore` | Remove mean and scale to unit variance | `StandardScaler` |
| `scaler_robust` | Remove median and scale to IQR | `RobustScaler` |
| `scaler_minmax` | Scale to a given range | `MinMaxScaler` |
| `scaler_maxabs` | Scale by maximum absolute value | `MaxAbsScaler` |
| `scaler_normalizer` | Normalize samples to unit norm | `Normalizer` |
| `scaler_quantile` | Transform to a uniform or normal distribution (robust) | `QuantileTransformer` |
| `scaler_power` | Gaussianise data | `PowerTransformer` |
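As a small standalone demonstration of the two most common scalers: `StandardScaler` removes the column mean and scales to unit variance, while `RobustScaler` uses the median and IQR instead, making it less sensitive to outliers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, RobustScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=3.0, size=(200, 2))

# StandardScaler: subtract the column mean, divide by the column std.
Xz = StandardScaler().fit_transform(X)
print(Xz.mean(axis=0))  # ~0
print(Xz.std(axis=0))   # ~1

# RobustScaler: subtract the median, divide by the IQR.
Xr = RobustScaler().fit_transform(X)
print(np.median(Xr, axis=0))  # ~0
```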
Feature Selection

| Name (str) | Description | Class |
|---|---|---|
| `select_univariate` | Univariate feature selection with a configurable strategy | `GenericUnivariateSelect` |
| `select_percentile` | Rank and select by percentile | `SelectPercentile` |
| `select_k` | Rank and select the K best | `SelectKBest` |
| `select_fdr` | Select based on an estimated false discovery rate (FDR) | `SelectFdr` |
| `select_fpr` | Select based on a false positive rate (FPR) threshold | `SelectFpr` |
| `select_fwe` | Select based on a family-wise error (FWE) threshold | `SelectFwe` |
| `select_variance` | Remove low-variance features | `VarianceThreshold` |
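For instance, the K-best selector ranks features by a univariate score and keeps only the top K (a minimal sketch using scikit-learn's `SelectKBest` with the ANOVA F-score):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# 20 features, of which only 5 are informative.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=42)

# Keep the 5 features with the highest ANOVA F-score.
selector = SelectKBest(score_func=f_classif, k=5)
X_new = selector.fit_transform(X, y)
print(X_new.shape)  # (200, 5)
```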
DataFrame operations

| Name (str) | Description | Class |
|---|---|---|
| `confound_removal` | Remove confounds from features by subtracting, from each feature, its prediction given all confounds. By default this equals “independently regressing out the confounds from the features” | `ConfoundRemover` |
| `drop_columns` | Drop columns from the DataFrame | `DropColumns` |
| `change_column_types` | Change the type of a column in a DataFrame | `ChangeColumnTypes` |
| `filter_columns` | Filter columns in a DataFrame | `FilterColumns` |
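The default confound-removal behavior can be sketched in plain scikit-learn: fit one linear model per feature with the confounds as inputs, and keep the residuals. This is a simplified illustration of the idea, not Julearn's implementation:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
confound = rng.normal(size=(300, 1))   # one confound variable
signal = rng.normal(size=(300, 2))     # true feature signal
X = signal + 2.0 * confound            # features contaminated by the confound

# For each feature, predict it from the confounds and subtract the prediction.
X_clean = np.column_stack([
    col - LinearRegression().fit(confound, col).predict(confound)
    for col in X.T
])

# The cleaned features are linearly uncorrelated with the confound.
corr = np.corrcoef(X_clean[:, 0], confound[:, 0])[0, 1]
print(corr)  # ~0
```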
Decomposition

| Name (str) | Description | Class |
|---|---|---|
| `pca` | Principal Component Analysis | `PCA` |
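As a quick refresher on what the PCA step does, scikit-learn's `PCA` projects the data onto the directions of maximal variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))

# Project the 10-dimensional data onto its 3 leading principal components.
pca = PCA(n_components=3)
X_red = pca.fit_transform(X)
print(X_red.shape)                           # (100, 3)
print(pca.explained_variance_ratio_.sum())   # fraction of variance retained
```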
Custom

| Name (str) | Description | Class |
|---|---|---|
| `cbpm` | Connectome-Based Predictive Modeling (CBPM) | `CBPM` |
4.2. Models (Estimators)

Support Vector Machines

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| `svm` | Support Vector Machine | `SVC` / `SVR` | Y | Y | Y |
Ensemble

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| `rf` | Random Forest | `RandomForestClassifier` / `RandomForestRegressor` | Y | Y | Y |
| `et` | Extra-Trees | `ExtraTreesClassifier` / `ExtraTreesRegressor` | Y | Y | Y |
| `adaboost` | AdaBoost | `AdaBoostClassifier` / `AdaBoostRegressor` | Y | Y | Y |
| `bagging` | Bagging | `BaggingClassifier` / `BaggingRegressor` | Y | Y | Y |
| `gradientboost` | Gradient Boosting | `GradientBoostingClassifier` / `GradientBoostingRegressor` | Y | Y | Y |
| `stacking` | Stacking | `StackingClassifier` / `StackingRegressor` | Y | Y | Y |
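As an example of using one of these estimators on its own, here is scikit-learn's random forest evaluated with 5-fold cross-validation (a standalone sketch of the class the random forest step wraps):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# A random forest: an ensemble of decision trees trained on bootstrap samples.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())  # mean cross-validated accuracy
```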
Gaussian Processes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| `gauss` | Gaussian Process | `GaussianProcessClassifier` / `GaussianProcessRegressor` | Y | Y | Y |
Linear Models

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| `logit` | Logistic Regression (aka logit, MaxEnt) | `LogisticRegression` | Y | Y | N |
| `logitcv` | Logistic Regression CV (aka logit, MaxEnt) | `LogisticRegressionCV` | Y | Y | N |
| `linreg` | Least-squares Linear Regression | `LinearRegression` | N | N | Y |
| `ridge` | Linear least squares with l2 regularization | `RidgeClassifier` / `Ridge` | Y | Y | Y |
| `ridgecv` | Ridge regression with built-in cross-validation | `RidgeClassifierCV` / `RidgeCV` | Y | Y | Y |
| `sgd` | Linear model fitted by minimizing a regularized empirical loss with SGD | `SGDClassifier` / `SGDRegressor` | Y | Y | Y |
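To make the ridge entries concrete: `Ridge` fits least squares with an l2 penalty on the coefficients, and `RidgeCV` additionally selects the penalty strength by built-in cross-validation (a standalone scikit-learn sketch):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, RidgeCV

X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

# Ridge: least squares plus an l2 penalty of strength alpha.
model = Ridge(alpha=1.0).fit(X, y)
print(model.score(X, y))  # R^2 on the training data

# RidgeCV: chooses alpha from the given grid via cross-validation.
model_cv = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(X, y)
print(model_cv.alpha_)  # the selected penalty strength
```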
Naive Bayes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| `nb_bernoulli` | Naive Bayes for multivariate Bernoulli models | `BernoulliNB` | Y | Y | N |
| `nb_categorical` | Naive Bayes for categorical features | `CategoricalNB` | Y | Y | N |
| `nb_complement` | Complement Naive Bayes | `ComplementNB` | Y | Y | N |
| `nb_gaussian` | Gaussian Naive Bayes | `GaussianNB` | Y | Y | N |
| `nb_multinomial` | Naive Bayes for multinomial models | `MultinomialNB` | Y | Y | N |
Dynamic Selection

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| `ds` | Support for DESlib models | `DynamicSelection` | Y | Y | Y |
Dummy

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| `dummy` | Use simple rules, ignoring the features | `DummyClassifier` / `DummyRegressor` | Y | Y | Y |
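Dummy estimators are useful as chance-level baselines: any real model should beat them. For example, scikit-learn's `DummyClassifier` with the `"most_frequent"` strategy always predicts the majority class, regardless of the features:

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Imbalanced toy labels: 80% class 0, 20% class 1.
y = np.array([0] * 80 + [1] * 20)
X = np.zeros((100, 1))  # features are ignored by design

# "most_frequent" always predicts the majority class (here: 0).
baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
print(baseline.score(X, y))  # 0.8
```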