4. Overview of available Pipeline Steps

The following is an overview of all available steps that can be used to build a pipeline by name, grouped by the type of the step: Transformers or Models (Estimators). The column ‘Name (str)’ gives the string name of the respective step, i.e. how it has to be specified when passed to e.g. the PipelineCreator. The column ‘Description’ briefly describes what the step does. The column ‘Class’ either indicates the underlying scikit-learn class of the respective pipeline step, together with a link to that class in the scikit-learn documentation (follow the link to see the valid parameters), or indicates the class in the Julearn code, so one can have a closer look at it in Julearn’s Reference. For feature transformations, the Transformers have to be used with the PipelineCreator; for target transformations, use the TargetPipelineCreator.
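As a rough illustration of what a pipeline built from these named steps corresponds to under the hood, the following sketch chains a scaler (a Transformer) with an estimator (a Model) using scikit-learn directly. The step names and parameter choices here are illustrative, not taken from this document:

```python
# Illustrative sketch: a pipeline of named steps is, under the hood,
# a scikit-learn Pipeline chaining transformers with a final estimator.
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=5, random_state=42)

pipe = Pipeline([
    ("scaler", StandardScaler()),  # "Removing mean and scale to unit variance"
    ("model", SVC()),              # the final estimator (a Model step)
])
pipe.fit(X, y)
print(pipe.score(X, y))  # training accuracy
```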
4.1. Transformers
Scalers

| Name (str) | Description | Class |
|---|---|---|
| | Removing mean and scale to unit variance | |
| | Removing median and scale to IQR | |
| | Scale to a given range | |
| | Scale by max absolute value | |
| | Normalize to unit norm | |
| | Transform to uniform or normal distribution (robust) | |
| | Gaussianise data | |
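To illustrate the difference between two of the scalers above, the following sketch applies their underlying scikit-learn classes to data with an outlier (the data and class choices are for illustration only):

```python
# Illustrative sketch: standard scaling (mean/variance) vs. robust
# scaling (median/IQR) on data containing an outlier.
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])  # one outlier

X_std = StandardScaler().fit_transform(X)  # removes mean, scales to unit variance
X_rob = RobustScaler().fit_transform(X)    # removes median, scales by IQR

print(X_std.mean())      # ~0 (mean centered)
print(np.median(X_rob))  # ~0 (median centered, less sensitive to the outlier)
```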
Feature Selection

| Name (str) | Description | Class |
|---|---|---|
| | Univariate feature selection with a configurable strategy | |
| | Rank and select percentile | |
| | Rank and select K | |
| | Select based on estimated FDR | |
| | Select based on FPR threshold | |
| | Select based on FWE threshold | |
| | Remove low variance features | |
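Two of the selection strategies above can be sketched with their underlying scikit-learn classes; the dataset and parameters here are illustrative assumptions:

```python
# Illustrative sketch: "rank and select K" (SelectKBest) and
# "remove low variance features" (VarianceThreshold).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif

X, y = make_classification(
    n_samples=100, n_features=10, n_informative=3, random_state=0
)

# Keep the 3 features ranked highest by the ANOVA F-test
X_k = SelectKBest(f_classif, k=3).fit_transform(X, y)

# Remove zero-variance (constant) features
X_const = np.hstack([X, np.zeros((100, 1))])  # append a constant column
X_var = VarianceThreshold(threshold=0.0).fit_transform(X_const)

print(X_k.shape, X_var.shape)
```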
DataFrame operations

| Name (str) | Description | Class |
|---|---|---|
| | Removing confounds from features by subtracting the prediction of each feature given all confounds. By default, this is equivalent to “independently regressing out the confounds from the features” | |
| | Drop columns from the dataframe | |
| | Change the type of a column in a dataframe | |
| | Filter columns in a dataframe | |
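The confound-removal idea described above can be sketched with plain scikit-learn: predict each feature from the confounds, then keep only the residuals. The actual step has its own class in Julearn; the model choice and data here are illustrative assumptions:

```python
# Illustrative sketch of confound removal: subtract from each feature
# the prediction of that feature given the confound(s).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
confound = rng.normal(size=(200, 1))
# Features linearly contaminated by the confound, plus noise
X = 2.0 * confound + rng.normal(size=(200, 2))

# Regress the confound out of each feature and keep the residuals
X_clean = X - LinearRegression().fit(confound, X).predict(confound)

# The residuals are linearly uncorrelated with the confound
print(np.corrcoef(confound[:, 0], X_clean[:, 0])[0, 1])  # ~0
```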
Decomposition

| Name (str) | Description | Class |
|---|---|---|
| | Principal Component Analysis | |
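A minimal sketch of the decomposition step via its underlying scikit-learn class (data and number of components chosen for illustration):

```python
# Illustrative sketch: project features onto the top 2 principal components.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                     # (100, 2)
print(pca.explained_variance_ratio_)  # variance fraction per component
```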
Custom

| Name (str) | Description | Class |
|---|---|---|
| | Connectome-based Predictive Modeling (CBPM) | |
4.2. Models (Estimators)

Support Vector Machines
Ensemble

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Random Forest | | Y | Y | Y |
| | Extra-Trees | | Y | Y | Y |
| | AdaBoost | | Y | Y | Y |
| | Bagging | | Y | Y | Y |
| | Gradient Boosting | | Y | Y | Y |
| | Stacking | | Y | Y | Y |
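Two of the ensemble estimators above, sketched via their scikit-learn classes; the base estimators and parameters of the stacking example are illustrative assumptions:

```python
# Illustrative sketch: Random Forest, and Stacking (base estimators
# combined through a final meta-estimator).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=120, n_features=8, random_state=0)

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

stack = StackingClassifier(
    estimators=[("svm", SVC()), ("logit", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(),
).fit(X, y)

print(rf.score(X, y), stack.score(X, y))
```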
Gaussian Processes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Gaussian Process | | Y | Y | Y |
Linear Models

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Logistic Regression (aka logit, MaxEnt) | | Y | Y | N |
| | Logistic Regression CV (aka logit, MaxEnt) | | Y | Y | N |
| | Least squares regression | | N | N | Y |
| | Linear least squares with L2 regularization | | Y | Y | Y |
| | Ridge regression with built-in cross-validation | | Y | Y | Y |
| | Linear model fitted by minimizing a regularized empirical loss with SGD | | Y | Y | Y |
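One classification model and one regression model from the table above, sketched via their scikit-learn classes (the synthetic data and coefficients are illustrative assumptions):

```python
# Illustrative sketch: least squares regression recovers a known linear
# relationship; logistic regression handles a binary target.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

# Regression: noise-free target with known coefficients
y_reg = X @ np.array([1.0, -2.0, 0.5])
coef = LinearRegression().fit(X, y_reg).coef_
print(coef)  # ~ [1.0, -2.0, 0.5]

# Binary classification: label depends on the sign of the first feature
y_clf = (X[:, 0] > 0).astype(int)
clf = LogisticRegression(max_iter=1000).fit(X, y_clf)
print(clf.score(X, y_clf))
```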
Naive Bayes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Naive Bayes for multivariate Bernoulli models | | Y | Y | N |
| | Naive Bayes for categorical features | | Y | Y | N |
| | Complement Naive Bayes | | Y | Y | N |
| | Gaussian Naive Bayes | | Y | Y | N |
| | Naive Bayes for multinomial models | | Y | Y | N |
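A minimal sketch of one of the variants above, Gaussian Naive Bayes, on a toy two-class problem (the data is an illustrative assumption):

```python
# Illustrative sketch: Gaussian Naive Bayes on two well-separated
# Gaussian blobs.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(-3, 1, size=(50, 2)),  # class 0 blob
    rng.normal(3, 1, size=(50, 2)),   # class 1 blob
])
y = np.array([0] * 50 + [1] * 50)

nb = GaussianNB().fit(X, y)
print(nb.score(X, y))  # near 1.0 on well-separated classes
```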
Dynamic Selection

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Support for DESlib models | | Y | Y | Y |
Dummy

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Use simple rules (without features) | | Y | Y | Y |
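The dummy estimator's "simple rules (without features)" can be sketched via its scikit-learn class: it ignores the input features entirely, which makes it a useful chance-level baseline:

```python
# Illustrative sketch: a DummyClassifier baseline that ignores the
# features and always predicts the most frequent class.
import numpy as np
from sklearn.dummy import DummyClassifier

X = np.zeros((10, 2))  # features are ignored by the dummy estimator
y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

dummy = DummyClassifier(strategy="most_frequent").fit(X, y)
print(dummy.predict(X))   # all 0 (the majority class)
print(dummy.score(X, y))  # 0.7, the majority-class rate
```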