7. Overview of available Pipeline Steps#
The following is a list of all available steps that can be used to create a pipeline by name. The overview is sorted based on the type of the step: Transformers or Models (Estimators).
The column Name refers to the string name of the respective step, i.e. how it should be specified when passed to, e.g., the PipelineCreator.
The column Description gives a short description of what the step does.
The column Class indicates either the underlying scikit-learn class of the respective pipeline step, together with a link to the class in the scikit-learn documentation (follow the link to see the valid parameters), or the class in julearn, so one can have a closer look at it in julearn's API Reference.
For feature transformations, the Transformers are to be used with the PipelineCreator, and for target transformations, the Transformers are to be used with the TargetPipelineCreator.
7.1. Transformers#
Scalers#
| Name | Description | Class |
|---|---|---|
| | Remove the mean and scale to unit variance | |
| | Remove the median and scale to the IQR | |
| | Scale to a given range | |
| | Scale by the maximum absolute value | |
| | Normalize to unit norm | |
| | Transform to a uniform or normal distribution (robust) | |
| | Gaussianise data | |
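To make the scaler descriptions concrete, the following is a minimal standard-library sketch of the first four transformations applied to a single feature column. It is not julearn's or scikit-learn's implementation. The function names and the example data are hypothetical, and the choice of the population standard deviation and of linearly interpolated quartiles is an assumption of this sketch.

```python
from statistics import mean, median, pstdev, quantiles


def zscore(values):
    """Remove the mean and scale to unit (population) variance."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def robust_scale(values):
    """Remove the median and scale by the interquartile range (IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


def minmax_scale(values, low=0.0, high=1.0):
    """Map the smallest value to `low` and the largest to `high`."""
    v_min, v_max = min(values), max(values)
    return [low + (v - v_min) * (high - low) / (v_max - v_min) for v in values]


def maxabs_scale(values):
    """Divide every value by the largest absolute value."""
    biggest = max(abs(v) for v in values)
    return [v / biggest for v in values]


# Hypothetical feature column, used only for illustration.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
for scaler in (zscore, robust_scale, minmax_scale, maxabs_scale):
    print(scaler.__name__, scaler(feature))
```

The robust variant is less sensitive to outliers because a single extreme value shifts the median and the IQR far less than it shifts the mean and the standard deviation.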
Feature Selection#
| Name | Description | Class |
|---|---|---|
| | Univariate feature selection with a configurable strategy | |
| | Rank and select percentile | |
| | Rank and select K | |
| | Select based on estimated FDR | |
| | Select based on FPR threshold | |
| | Select based on FWE threshold | |
| | Remove low variance features | |
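As an illustration of two of the selection strategies above, here is a small standard-library sketch of "rank and select K" and "remove low variance features". The function names, the scores, and the data are hypothetical. In practice the scores would come from a univariate statistical test, and the selection would be done by the underlying selector class rather than by code like this.

```python
from statistics import pvariance


def select_k(scores, k):
    """Return the indices of the k highest-scoring features, in column order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def remove_low_variance(columns, threshold=0.0):
    """Return the indices of columns whose population variance exceeds `threshold`."""
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


# Hypothetical per-feature scores (higher means more informative).
scores = [0.2, 3.1, 0.9, 2.4]
print(select_k(scores, k=2))

# Hypothetical data, one list per feature column.
columns = [[1.0, 1.0, 1.0], [0.0, 1.0, 2.0], [5.0, 5.0, 5.1]]
print(remove_low_variance(columns))
print(remove_low_variance(columns, threshold=0.01))
```

With the default threshold of zero, only constant columns are dropped; raising the threshold also removes columns that vary very little.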
DataFrame operations#
| Name | Description | Class |
|---|---|---|
| | Remove confounds from features by subtracting the prediction of each feature given all confounds. By default this is equal to “independently regressing out the confounds from the features” | |
| | Drop columns from the DataFrame | |
| | Change the type of a column in a DataFrame | |
| | Filter columns in a DataFrame | |
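The confound removal description ("subtracting the prediction of each feature given all confounds") can be sketched for the simplest case. The example below assumes a single confound and an ordinary least-squares fit, and it treats each feature independently. The names `regress_out`, `age`, `f1`, and `f2` are hypothetical, and this is not julearn's implementation.

```python
from statistics import mean


def regress_out(feature, confound):
    """Return the residuals of a least-squares fit of `feature` on one confound."""
    c_mean, f_mean = mean(confound), mean(feature)
    cov = sum((c - c_mean) * (f - f_mean) for c, f in zip(confound, feature))
    var = sum((c - c_mean) ** 2 for c in confound)
    slope = cov / var
    intercept = f_mean - slope * c_mean
    # Subtract the prediction of the feature given the confound.
    return [f - (slope * c + intercept) for c, f in zip(confound, feature)]


# Hypothetical confound and features, used only for illustration.
age = [20.0, 30.0, 40.0, 50.0]
features = {
    "f1": [1.0, 2.5, 2.0, 4.0],            # partly explained by age
    "f2": [0.1 * a + 1.0 for a in age],    # fully explained by age
}

# Each feature is regressed on the confound independently of the others.
cleaned = {name: regress_out(col, age) for name, col in features.items()}
for name, residuals in cleaned.items():
    print(name, residuals)
```

After this step the residuals are uncorrelated with the confound, so a feature that is a purely linear function of the confound (here `f2`) is reduced to approximately zero.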
Decomposition#
| Name | Description | Class |
|---|---|---|
| | Principal Component Analysis | |
Custom#
| Name | Description | Class |
|---|---|---|
| | Connectome-based Predictive Modeling (CBPM) | |
7.2. Models (Estimators)#
Support Vector Machines#
Ensemble#
| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Random Forest | | Y | Y | Y |
| | Extra-Trees | | Y | Y | Y |
| | AdaBoost | | Y | Y | Y |
| | Bagging | | Y | Y | Y |
| | Gradient Boosting | | Y | Y | Y |
| | Stacking | | Y | Y | Y |
Gaussian Processes#
| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Gaussian Process | | Y | Y | Y |
Linear Models#
| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Logistic Regression (aka logit, MaxEnt) | | Y | Y | N |
| | Logistic Regression CV (aka logit, MaxEnt) | | Y | Y | N |
| | Least Squares regression | | N | N | Y |
| | Linear least squares with L2 regularization | RidgeClassifier and | Y | Y | Y |
| | Ridge regression with built-in cross-validation | | Y | Y | Y |
| | Linear model fitted by minimizing a regularized empirical loss with SGD | | Y | Y | Y |
Naive Bayes#
| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Multivariate Bernoulli models | | Y | Y | N |
| | Categorical features | | Y | Y | N |
| | Complement Naive Bayes | | Y | Y | N |
| | Gaussian Naive Bayes | | Y | Y | N |
| | Multinomial models | | Y | Y | N |
Dynamic Selection#
| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Support for DESlib models | | Y | Y | Y |
Dummy#
| Name | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Use simple rules (without features) | | Y | Y | Y |