Available Pipeline Steps

The steps listed below are available out of the box. Each one can be referred to by its name (a string) when creating a pipeline.

Features Preprocessing

Scalers

Name (str)         Description                                             Class
-----------------  ------------------------------------------------------  -------------------
zscore             Remove the mean and scale to unit variance              StandardScaler
scaler_robust      Remove the median and scale to the IQR                  RobustScaler
scaler_minmax      Scale to a given range                                  MinMaxScaler
scaler_maxabs      Scale by the maximum absolute value                     MaxAbsScaler
scaler_normalizer  Normalize each sample to unit norm                      Normalizer
scaler_quantile    Transform to a uniform or normal distribution (robust)  QuantileTransformer
scaler_power       Gaussianise data                                        PowerTransformer
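
To make the first three descriptions concrete, the following is a minimal pure-Python sketch of the arithmetic behind zscore, scaler_robust and scaler_minmax on a single feature. It is an illustration only, not the code the listed classes run: the function names are hypothetical, the standard deviation is the population one, and the quartiles come from statistics.quantiles with method="inclusive" (other quartile conventions give slightly different values).

```python
import statistics


def zscore(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Note: a constant feature has std == 0 and would raise ZeroDivisionError.
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Subtract the median and divide by the interquartile range (IQR)."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


def minmax_scale(values, low=0.0, high=1.0):
    """Map the observed minimum to `low` and the observed maximum to `high`."""
    vmin, vmax = min(values), max(values)
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]


# Toy feature, used only to exercise the three functions.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
print(zscore(feature))
print(robust_scale(feature))
print(minmax_scale(feature))
```

The robust variant depends only on the median and the quartiles, so a single extreme value shifts it far less than it shifts the z-score.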

Feature Selection

Name (str)         Description                                             Class
-----------------  ------------------------------------------------------  -----------------------
select_univariate  Univariate feature selector with configurable strategy  GenericUnivariateSelect
select_percentile  Rank features and select the top percentile             SelectPercentile
select_k           Rank features and select the top K                      SelectKBest
select_fdr         Select based on estimated FDR                           SelectFdr
select_fpr         Select based on FPR threshold                           SelectFpr
select_fwe         Select based on FWE threshold                           SelectFwe
select_variance    Remove low-variance features                            VarianceThreshold
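
The ranking-based selectors share one pattern: score every feature, then keep a subset. Below is a minimal sketch of the idea behind select_k and select_variance. It assumes the per-feature scores have already been computed by some scoring function; the scores and columns are made-up toy values, and the function names are hypothetical rather than part of any library.

```python
import statistics


def select_k(scores, k):
    """Return the names of the k highest-scoring features."""
    # sorted() is stable, so tied features keep their original order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def select_variance(columns, threshold=0.0):
    """Keep features whose population variance exceeds `threshold`."""
    return [
        name
        for name, values in columns.items()
        if statistics.pvariance(values) > threshold
    ]


# Toy inputs, for illustration only.
scores = {"f1": 0.2, "f2": 5.1, "f3": 1.7}
columns = {"f1": [1.0, 1.0, 1.0], "f2": [0.0, 1.0, 2.0]}

print(select_k(scores, k=2))
print(select_variance(columns))
```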

Confound Removal

Name (str)       Description                                              Class
---------------  -------------------------------------------------------  ----------------------------------
remove_confound  Remove confounds from the features by subtracting the    confounds.DataFrameConfoundRemover
                 prediction of each feature given all confounds. By
                 default, this is equivalent to independently regressing
                 the confounds out of the features.
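
As a rough illustration of what "subtracting the prediction of each feature given all confounds" means, here is a minimal pure-Python sketch for the simplest case: one confound and an ordinary least-squares line with an intercept. This is an assumption made for the example, not a description of how the class above is implemented; the helper regress_out and the toy data are hypothetical.

```python
def regress_out(feature, confound):
    """Return the feature minus its least-squares prediction from one confound."""
    n = len(feature)
    mean_f = sum(feature) / n
    mean_c = sum(confound) / n
    # Ordinary least-squares slope and intercept for feature ~ confound.
    cov = sum((c - mean_c) * (f - mean_f) for c, f in zip(confound, feature))
    var = sum((c - mean_c) ** 2 for c in confound)
    slope = cov / var
    intercept = mean_f - slope * mean_c
    # The residual is what remains after removing the confound's contribution.
    return [f - (intercept + slope * c) for f, c in zip(feature, confound)]


# Toy data, for illustration only.
confound = [20.0, 30.0, 40.0, 50.0]
features = {
    "feature_a": [41.0, 61.0, 81.0, 101.0],  # exactly 2 * confound + 1
    "feature_b": [3.0, 1.0, 4.0, 1.0],
}

# Each feature is handled independently of the others.
cleaned = {name: regress_out(values, confound) for name, values in features.items()}
for name, values in cleaned.items():
    print(name, [round(v, 3) for v in values])
```

A feature that is fully explained by the confound (feature_a here) is reduced to residuals of approximately zero, while the residuals of any other feature are uncorrelated with the confound.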

Decomposition

Name (str)  Description                   Class
----------  ----------------------------  -----
pca         Principal Component Analysis  PCA

Target Preprocessing

Target Scalers

Name (str)  Description                                 Class
----------  ------------------------------------------  --------------
zscore      Remove the mean and scale to unit variance  StandardScaler

Target Confound Removal

Name (str)       Description                                              Class
---------------  -------------------------------------------------------  ---------------------
remove_confound  Remove confounds from the target by subtracting the      TargetConfoundRemover
                 prediction of the target given all confounds. By
                 default, this is equivalent to regressing the confounds
                 out of the target.

Models

Support Vector Machines

Name (str)  Description             Class        Binary  Multiclass  Regression
----------  ----------------------  -----------  ------  ----------  ----------
svm         Support Vector Machine  SVC and SVR  Y       Y           Y

Ensemble

Name (str)     Description        Class                                                     Binary  Multiclass  Regression
-------------  -----------------  --------------------------------------------------------  ------  ----------  ----------
rf             Random Forest      RandomForestClassifier and RandomForestRegressor          Y       Y           Y
et             Extra-Trees        ExtraTreesClassifier and ExtraTreesRegressor              Y       Y           Y
adaboost       AdaBoost           AdaBoostClassifier and AdaBoostRegressor                  Y       Y           Y
bagging        Bagging            BaggingClassifier and BaggingRegressor                    Y       Y           Y
gradientboost  Gradient Boosting  GradientBoostingClassifier and GradientBoostingRegressor  Y       Y           Y

Gaussian Processes

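Name (str)  Description       Class                                                   Binary  Multiclass  Regression
----------  ----------------  ------------------------------------------------------  ------  ----------  ----------
gauss       Gaussian Process  GaussianProcessClassifier and GaussianProcessRegressor  Y       Y           Y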

Linear Models

Name (str)  Description                                      Class                           Binary  Multiclass  Regression
----------  -----------------------------------------------  ------------------------------  ------  ----------  ----------
logit       Logistic Regression (aka logit, MaxEnt)          LogisticRegression              Y       Y           N
logitcv     Logistic Regression CV (aka logit, MaxEnt)       LogisticRegressionCV            Y       Y           N
linreg      Least-squares regression                         LinearRegression                N       N           Y
ridge       Linear least squares with L2 regularization      RidgeClassifier and Ridge       Y       Y           Y
ridgecv     Ridge regression with built-in cross-validation  RidgeClassifierCV and RidgeCV   Y       Y           Y
sgd         Linear model fitted by minimizing a regularized  SGDClassifier and SGDRegressor  Y       Y           Y
            empirical loss with SGD

Naive Bayes

Name (str)      Description                                    Class          Binary  Multiclass  Regression
--------------  ---------------------------------------------  -------------  ------  ----------  ----------
nb_bernoulli    Naive Bayes for multivariate Bernoulli models  BernoulliNB    Y       Y           N
nb_categorical  Naive Bayes for categorical features           CategoricalNB  Y       Y           N
nb_complement   Complement Naive Bayes                         ComplementNB   Y       Y           N
nb_gaussian     Gaussian Naive Bayes                           GaussianNB     Y       Y           N
nb_multinomial  Naive Bayes for multinomial models             MultinomialNB  Y       Y           N

Dummy

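Name (str)  Description                                          Class                               Binary  Multiclass  Regression
----------  ---------------------------------------------------  ----------------------------------  ------  ----------  ----------
dummy       Predict using simple rules that ignore the features  DummyClassifier and DummyRegressor  Y       Y           Y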
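
The Binary, Multiclass and Regression columns of the model tables above can be transcribed into a small lookup, which is handy for checking a model name against a problem type before building a pipeline. This is only a sketch: MODEL_SUPPORT, supports and models_for are hypothetical helpers, and the problem-type labels are simply the lowercase column headers, not necessarily the strings the library itself expects.

```python
ALL_TYPES = frozenset({"binary", "multiclass", "regression"})
CLASSIFICATION = frozenset({"binary", "multiclass"})
REGRESSION = frozenset({"regression"})

# Transcribed from the Y/N columns of the model tables.
MODEL_SUPPORT = {
    "svm": ALL_TYPES,
    "rf": ALL_TYPES,
    "et": ALL_TYPES,
    "adaboost": ALL_TYPES,
    "bagging": ALL_TYPES,
    "gradientboost": ALL_TYPES,
    "gauss": ALL_TYPES,
    "logit": CLASSIFICATION,
    "logitcv": CLASSIFICATION,
    "linreg": REGRESSION,
    "ridge": ALL_TYPES,
    "ridgecv": ALL_TYPES,
    "sgd": ALL_TYPES,
    "nb_bernoulli": CLASSIFICATION,
    "nb_categorical": CLASSIFICATION,
    "nb_complement": CLASSIFICATION,
    "nb_gaussian": CLASSIFICATION,
    "nb_multinomial": CLASSIFICATION,
    "dummy": ALL_TYPES,
}


def supports(model_name, problem_type):
    """Return True if the tables mark the model as usable for the problem type."""
    return problem_type in MODEL_SUPPORT[model_name]


def models_for(problem_type):
    """Return the sorted model names usable for the given problem type."""
    return sorted(
        name for name, types in MODEL_SUPPORT.items() if problem_type in types
    )


print(supports("logit", "regression"))
print(models_for("regression"))
```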