4. Overview of available Pipeline Steps

The following is an overview of all available steps that can be used to build a pipeline by name, grouped by the type of the step: Transformers or Models (Estimators). The column ‘Name (str)’ gives the string name of the respective step, i.e. how it has to be specified when passed to e.g. the PipelineCreator. The column ‘Description’ briefly describes what the step does. The column ‘Class’ either indicates the underlying scikit-learn class of the respective pipeline step, together with a link to that class in the scikit-learn documentation (follow the link to see the valid parameters), or indicates the class in the Julearn code, so one can have a closer look at it in Julearn’s Reference. For feature transformations, the Transformers have to be used with the PipelineCreator; for target transformations, use the TargetPipelineCreator.
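As a rough illustration of what a pipeline built from these named steps corresponds to under the hood, the following sketch chains a scaler (a Transformer) with an estimator (a Model) using scikit-learn directly. The step names and parameter choices here are illustrative, not taken from this document:

```python
# Illustrative sketch: a pipeline of named steps is, under the hood,
# a scikit-learn Pipeline chaining transformers with a final estimator.
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=5, random_state=42)

pipe = Pipeline([
    ("scaler", StandardScaler()),  # "Removing mean and scale to unit variance"
    ("model", SVC()),              # the final estimator (a Model step)
])
pipe.fit(X, y)
print(pipe.score(X, y))  # training accuracy
```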
4.1. Transformers
Scalers

| Name (str) | Description | Class |
|---|---|---|
| | Removing mean and scale to unit variance | |
| | Removing median and scale to IQR | |
| | Scale to a given range | |
| | Scale by max absolute value | |
| | Normalize to unit norm | |
| | Transform to uniform or normal distribution (robust) | |
| | Gaussianise data | |
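To illustrate the difference between two of the scalers above, the following sketch applies their underlying scikit-learn classes to data with an outlier (the data and class choices are for illustration only):

```python
# Illustrative sketch: standard scaling (mean/variance) vs. robust
# scaling (median/IQR) on data containing an outlier.
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])  # one outlier

X_std = StandardScaler().fit_transform(X)  # removes mean, scales to unit variance
X_rob = RobustScaler().fit_transform(X)    # removes median, scales by IQR

print(X_std.mean())      # ~0 (mean centered)
print(np.median(X_rob))  # ~0 (median centered, less sensitive to the outlier)
```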
Feature Selection

| Name (str) | Description | Class |
|---|---|---|
| | Univariate feature selection with a configurable strategy | |
| | Rank and select percentile | |
| | Rank and select K | |
| | Select based on estimated FDR | |
| | Select based on FPR threshold | |
| | Select based on FWE threshold | |
| | Remove low variance features | |
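Two of the selection strategies above can be sketched with their underlying scikit-learn classes; the dataset and parameters here are illustrative assumptions:

```python
# Illustrative sketch: "rank and select K" (SelectKBest) and
# "remove low variance features" (VarianceThreshold).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif

X, y = make_classification(
    n_samples=100, n_features=10, n_informative=3, random_state=0
)

# Keep the 3 features ranked highest by the ANOVA F-test
X_k = SelectKBest(f_classif, k=3).fit_transform(X, y)

# Remove zero-variance (constant) features
X_const = np.hstack([X, np.zeros((100, 1))])  # append a constant column
X_var = VarianceThreshold(threshold=0.0).fit_transform(X_const)

print(X_k.shape, X_var.shape)
```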
DataFrame operations

| Name (str) | Description | Class |
|---|---|---|
| | Removing confounds from features by subtracting the prediction of each feature given all confounds. By default, this is equivalent to “independently regressing out the confounds from the features” | |
| | Drop columns from the dataframe | |
| | Change the type of a column in a dataframe | |
| | Filter columns in a dataframe | |
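The confound-removal idea described above can be sketched with plain scikit-learn: predict each feature from the confounds, then keep only the residuals. The actual step has its own class in Julearn; the model choice and data here are illustrative assumptions:

```python
# Illustrative sketch of confound removal: subtract from each feature
# the prediction of that feature given the confound(s).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
confound = rng.normal(size=(200, 1))
# Features linearly contaminated by the confound, plus noise
X = 2.0 * confound + rng.normal(size=(200, 2))

# Regress the confound out of each feature and keep the residuals
X_clean = X - LinearRegression().fit(confound, X).predict(confound)

# The residuals are linearly uncorrelated with the confound
print(np.corrcoef(confound[:, 0], X_clean[:, 0])[0, 1])  # ~0
```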
Decomposition

| Name (str) | Description | Class |
|---|---|---|
| | Principal Component Analysis | |
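A minimal sketch of the decomposition step via its underlying scikit-learn class (data and number of components chosen for illustration):

```python
# Illustrative sketch: project features onto the top 2 principal components.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                     # (100, 2)
print(pca.explained_variance_ratio_)  # variance fraction per component
```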
Custom

| Name (str) | Description | Class |
|---|---|---|
| | Connectome-based Predictive Modeling (CBPM) | |
4.2. Models (Estimators)

Support Vector Machines
Ensemble

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Random Forest | | Y | Y | Y |
| | Extra-Trees | | Y | Y | Y |
| | AdaBoost | | Y | Y | Y |
| | Bagging | | Y | Y | Y |
| | Gradient Boosting | | Y | Y | Y |
| | Stacking | | Y | Y | Y |
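Two of the ensemble estimators above, sketched via their scikit-learn classes; the base estimators and parameters of the stacking example are illustrative assumptions:

```python
# Illustrative sketch: Random Forest, and Stacking (base estimators
# combined through a final meta-estimator).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=120, n_features=8, random_state=0)

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

stack = StackingClassifier(
    estimators=[("svm", SVC()), ("logit", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(),
).fit(X, y)

print(rf.score(X, y), stack.score(X, y))
```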
Gaussian Processes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Gaussian Process | | Y | Y | Y |
Linear Models

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Logistic Regression (aka logit, MaxEnt) | | Y | Y | N |
| | Logistic Regression CV (aka logit, MaxEnt) | | Y | Y | N |
| | Least squares regression | | N | N | Y |
| | Linear least squares with L2 regularization | | Y | Y | Y |
| | Ridge regression with built-in cross-validation | | Y | Y | Y |
| | Linear model fitted by minimizing a regularized empirical loss with SGD | | Y | Y | Y |
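One classification model and one regression model from the table above, sketched via their scikit-learn classes (the synthetic data and coefficients are illustrative assumptions):

```python
# Illustrative sketch: least squares regression recovers a known linear
# relationship; logistic regression handles a binary target.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

# Regression: noise-free target with known coefficients
y_reg = X @ np.array([1.0, -2.0, 0.5])
coef = LinearRegression().fit(X, y_reg).coef_
print(coef)  # ~ [1.0, -2.0, 0.5]

# Binary classification: label depends on the sign of the first feature
y_clf = (X[:, 0] > 0).astype(int)
clf = LogisticRegression(max_iter=1000).fit(X, y_clf)
print(clf.score(X, y_clf))
```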
Naive Bayes

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Naive Bayes for multivariate Bernoulli models | | Y | Y | N |
| | Naive Bayes for categorical features | | Y | Y | N |
| | Complement Naive Bayes | | Y | Y | N |
| | Gaussian Naive Bayes | | Y | Y | N |
| | Naive Bayes for multinomial models | | Y | Y | N |
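A minimal sketch of one of the variants above, Gaussian Naive Bayes, on a toy two-class problem (the data is an illustrative assumption):

```python
# Illustrative sketch: Gaussian Naive Bayes on two well-separated
# Gaussian blobs.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(-3, 1, size=(50, 2)),  # class 0 blob
    rng.normal(3, 1, size=(50, 2)),   # class 1 blob
])
y = np.array([0] * 50 + [1] * 50)

nb = GaussianNB().fit(X, y)
print(nb.score(X, y))  # near 1.0 on well-separated classes
```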
Dynamic Selection

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Support for DESlib models | | Y | Y | Y |
Dummy

| Name (str) | Description | Class | Binary | Multiclass | Regression |
|---|---|---|---|---|---|
| | Use simple rules (without features) | | Y | Y | Y |
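The dummy estimator's "simple rules (without features)" can be sketched via its scikit-learn class: it ignores the input features entirely, which makes it a useful chance-level baseline:

```python
# Illustrative sketch: a DummyClassifier baseline that ignores the
# features and always predicts the most frequent class.
import numpy as np
from sklearn.dummy import DummyClassifier

X = np.zeros((10, 2))  # features are ignored by the dummy estimator
y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

dummy = DummyClassifier(strategy="most_frequent").fit(X, y)
print(dummy.predict(X))   # all 0 (the majority class)
print(dummy.score(X, y))  # 0.7, the majority-class rate
```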