Tuning Hyperparameters using Bayesian Search

This example uses the fmri dataset to perform a simple binary classification with a Support Vector Machine classifier, tunes its hyperparameters with a Bayesian search, and then inspects the selected model.

References

Waskom, M.L., Frank, M.C., Wagner, A.D. (2016). Adaptive engagement of cognitive control in context-dependent decision-making. Cerebral Cortex.

# Authors: Federico Raimondo <f.raimondo@fz-juelich.de>
# License: AGPL

import numpy as np
from seaborn import load_dataset

from julearn import run_cross_validation
from julearn.utils import configure_logging, logger
from julearn.pipeline import PipelineCreator

Set the logging level to info to see extra information.

configure_logging(level="INFO")
/home/runner/work/julearn/julearn/julearn/utils/logging.py:66: UserWarning: The '__version__' attribute is deprecated and will be removed in MarkupSafe 3.1. Use feature detection, or `importlib.metadata.version("markupsafe")`, instead.
  vstring = str(getattr(module, "__version__", None))
2024-10-23 11:29:18,052 - julearn - INFO - ===== Lib Versions =====
2024-10-23 11:29:18,052 - julearn - INFO - numpy: 1.26.4
2024-10-23 11:29:18,052 - julearn - INFO - scipy: 1.14.1
2024-10-23 11:29:18,052 - julearn - INFO - sklearn: 1.5.2
2024-10-23 11:29:18,052 - julearn - INFO - pandas: 2.2.3
2024-10-23 11:29:18,052 - julearn - INFO - julearn: 0.3.5.dev16
2024-10-23 11:29:18,052 - julearn - INFO - ========================

Set the random seed so that the example is reproducible.

np.random.seed(42)

Load the dataset.

df_fmri = load_dataset("fmri")
df_fmri.head()
  subject  timepoint event    region    signal
0     s13         18  stim  parietal -0.017552
1      s5         14  stim  parietal -0.080883
2     s12         18  stim  parietal -0.081033
3     s11         18  stim  parietal -0.046134
4     s10         18  stim  parietal -0.037970


Pivot the dataframe from long to wide format, so that the signal of each region becomes its own feature column.

df_fmri = df_fmri.pivot(
    index=["subject", "timepoint", "event"], columns="region", values="signal"
)

df_fmri = df_fmri.reset_index()
df_fmri.head()
region subject  timepoint event   frontal  parietal
0           s0          0   cue  0.007766 -0.006899
1           s0          0  stim -0.021452 -0.039327
2           s0          1   cue  0.016440  0.000300
3           s0          1  stim -0.021054 -0.035735
4           s0          2   cue  0.024296  0.033220
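
The reshaping step can be illustrated on a toy table. The following is a minimal sketch, assuming only pandas is installed; the names `long_df` and `wide_df` are hypothetical, and the four rows are built from the first two rows of the output shown for the real dataset.

```python
import pandas as pd

# Toy long-format table: one row per (subject, timepoint, event, region).
long_df = pd.DataFrame(
    {
        "subject": ["s0", "s0", "s0", "s0"],
        "timepoint": [0, 0, 0, 0],
        "event": ["cue", "cue", "stim", "stim"],
        "region": ["frontal", "parietal", "frontal", "parietal"],
        "signal": [0.007766, -0.006899, -0.021452, -0.039327],
    }
)

# Spread the values of "region" into columns, keeping one row per
# (subject, timepoint, event) combination, then restore a flat index.
wide_df = long_df.pivot(
    index=["subject", "timepoint", "event"], columns="region", values="signal"
).reset_index()

print(wide_df)
```

After the pivot, each row is one sample and the columns "frontal" and "parietal" can be used directly as features, with "event" as the target.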


Following the hyperparameter tuning example, we will now use a Bayesian search to find the best hyperparameters for the SVM model.

X = ["frontal", "parietal"]
y = "event"

creator1 = PipelineCreator(problem_type="classification")
creator1.add("zscore")
creator1.add(
    "svm",
    kernel=["linear"],
    C=(1e-6, 1e3, "log-uniform"),
)

creator2 = PipelineCreator(problem_type="classification")
creator2.add("zscore")
creator2.add(
    "svm",
    kernel=["rbf"],
    C=(1e-6, 1e3, "log-uniform"),
    gamma=(1e-6, 1e1, "log-uniform"),
)

search_params = {
    "kind": "bayes",
    "cv": 2,  # to speed up the example
    "n_iter": 10,  # 10 iterations of bayesian search to speed up example
}


scores, estimator = run_cross_validation(
    X=X,
    y=y,
    data=df_fmri,
    model=[creator1, creator2],
    cv=2,  # to speed up the example
    search_params=search_params,
    return_estimator="final",
)

print(scores["test_score"].mean())
2024-10-23 11:29:18,060 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-23 11:29:18,061 - julearn - INFO - Step added
2024-10-23 11:29:18,061 - julearn - INFO - Adding step svm that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-23 11:29:18,061 - julearn - INFO - Setting hyperparameter kernel = linear
2024-10-23 11:29:18,061 - julearn - INFO - Tuning hyperparameter C = (1e-06, 1000.0, 'log-uniform')
2024-10-23 11:29:18,061 - julearn - INFO - Step added
2024-10-23 11:29:18,061 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-23 11:29:18,061 - julearn - INFO - Step added
2024-10-23 11:29:18,061 - julearn - INFO - Adding step svm that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-23 11:29:18,061 - julearn - INFO - Setting hyperparameter kernel = rbf
2024-10-23 11:29:18,061 - julearn - INFO - Tuning hyperparameter C = (1e-06, 1000.0, 'log-uniform')
2024-10-23 11:29:18,061 - julearn - INFO - Tuning hyperparameter gamma = (1e-06, 10.0, 'log-uniform')
2024-10-23 11:29:18,061 - julearn - INFO - Step added
2024-10-23 11:29:18,061 - julearn - INFO - ==== Input Data ====
2024-10-23 11:29:18,061 - julearn - INFO - Using dataframe as input
2024-10-23 11:29:18,062 - julearn - INFO -      Features: ['frontal', 'parietal']
2024-10-23 11:29:18,062 - julearn - INFO -      Target: event
2024-10-23 11:29:18,062 - julearn - INFO -      Expanded features: ['frontal', 'parietal']
2024-10-23 11:29:18,062 - julearn - INFO -      X_types:{}
2024-10-23 11:29:18,062 - julearn - WARNING - The following columns are not defined in X_types: ['frontal', 'parietal']. They will be treated as continuous.
/home/runner/work/julearn/julearn/julearn/prepare.py:509: RuntimeWarning: The following columns are not defined in X_types: ['frontal', 'parietal']. They will be treated as continuous.
  warn_with_log(
2024-10-23 11:29:18,062 - julearn - INFO - ====================
2024-10-23 11:29:18,062 - julearn - INFO -
2024-10-23 11:29:18,063 - julearn - INFO - = Model Parameters =
2024-10-23 11:29:18,063 - julearn - INFO - Tuning hyperparameters using bayes
2024-10-23 11:29:18,063 - julearn - INFO - Hyperparameters:
2024-10-23 11:29:18,063 - julearn - INFO -      svm__C: (1e-06, 1000.0, 'log-uniform')
2024-10-23 11:29:18,063 - julearn - INFO - Hyperparameter svm__C is log-uniform float [1e-06, 1000.0]
2024-10-23 11:29:18,064 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-23 11:29:18,064 - julearn - INFO - Search Parameters:
2024-10-23 11:29:18,064 - julearn - INFO -      cv: KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-23 11:29:18,064 - julearn - INFO -      n_iter: 10
2024-10-23 11:29:18,064 - julearn - INFO - ====================
2024-10-23 11:29:18,064 - julearn - INFO -
2024-10-23 11:29:18,065 - julearn - INFO - = Model Parameters =
2024-10-23 11:29:18,065 - julearn - INFO - Tuning hyperparameters using bayes
2024-10-23 11:29:18,065 - julearn - INFO - Hyperparameters:
2024-10-23 11:29:18,065 - julearn - INFO -      svm__C: (1e-06, 1000.0, 'log-uniform')
2024-10-23 11:29:18,065 - julearn - INFO -      svm__gamma: (1e-06, 10.0, 'log-uniform')
2024-10-23 11:29:18,065 - julearn - INFO - Hyperparameter svm__C is log-uniform float [1e-06, 1000.0]
2024-10-23 11:29:18,066 - julearn - INFO - Hyperparameter svm__gamma is log-uniform float [1e-06, 10.0]
2024-10-23 11:29:18,066 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-23 11:29:18,066 - julearn - INFO - Search Parameters:
2024-10-23 11:29:18,067 - julearn - INFO -      cv: KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-23 11:29:18,067 - julearn - INFO -      n_iter: 10
2024-10-23 11:29:18,067 - julearn - INFO - ====================
2024-10-23 11:29:18,067 - julearn - INFO -
2024-10-23 11:29:18,067 - julearn - INFO - = Model Parameters =
2024-10-23 11:29:18,067 - julearn - INFO - Tuning hyperparameters using bayes
2024-10-23 11:29:18,067 - julearn - INFO - Hyperparameters list:
2024-10-23 11:29:18,067 - julearn - INFO -      Set 0
2024-10-23 11:29:18,067 - julearn - INFO -              svm__C: Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-23 11:29:18,067 - julearn - INFO -              set_column_types: [SetColumnTypes(X_types={})]
2024-10-23 11:29:18,067 - julearn - INFO -              zscore: [StandardScaler()]
2024-10-23 11:29:18,068 - julearn - INFO -              svm: [SVC(kernel='linear')]
2024-10-23 11:29:18,068 - julearn - INFO -      Set 1
2024-10-23 11:29:18,068 - julearn - INFO -              svm__C: Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-23 11:29:18,068 - julearn - INFO -              svm__gamma: Real(low=1e-06, high=10.0, prior='log-uniform', transform='identity')
2024-10-23 11:29:18,068 - julearn - INFO -              set_column_types: [SetColumnTypes(X_types={})]
2024-10-23 11:29:18,068 - julearn - INFO -              zscore: [StandardScaler()]
2024-10-23 11:29:18,068 - julearn - INFO -              svm: [SVC()]
2024-10-23 11:29:18,068 - julearn - INFO - Hyperparameter svm__C as is Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-23 11:29:18,068 - julearn - INFO - Hyperparameter set_column_types as is [SetColumnTypes(X_types={})]
2024-10-23 11:29:18,069 - julearn - INFO - Hyperparameter zscore as is [StandardScaler()]
2024-10-23 11:29:18,069 - julearn - INFO - Hyperparameter svm as is [SVC(kernel='linear')]
2024-10-23 11:29:18,069 - julearn - INFO - Hyperparameter svm__C as is Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-23 11:29:18,069 - julearn - INFO - Hyperparameter svm__gamma as is Real(low=1e-06, high=10.0, prior='log-uniform', transform='identity')
2024-10-23 11:29:18,069 - julearn - INFO - Hyperparameter set_column_types as is [SetColumnTypes(X_types={})]
2024-10-23 11:29:18,069 - julearn - INFO - Hyperparameter zscore as is [StandardScaler()]
2024-10-23 11:29:18,069 - julearn - INFO - Hyperparameter svm as is [SVC()]
2024-10-23 11:29:18,069 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-23 11:29:18,070 - julearn - INFO - Search Parameters:
2024-10-23 11:29:18,070 - julearn - INFO -      cv: KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-23 11:29:18,070 - julearn - INFO -      n_iter: 10
2024-10-23 11:29:18,078 - julearn - INFO - ====================
2024-10-23 11:29:18,078 - julearn - INFO -
2024-10-23 11:29:18,078 - julearn - INFO - = Data Information =
2024-10-23 11:29:18,078 - julearn - INFO -      Problem type: classification
2024-10-23 11:29:18,079 - julearn - INFO -      Number of samples: 532
2024-10-23 11:29:18,079 - julearn - INFO -      Number of features: 2
2024-10-23 11:29:18,079 - julearn - INFO - ====================
2024-10-23 11:29:18,079 - julearn - INFO -
2024-10-23 11:29:18,079 - julearn - INFO -      Number of classes: 2
2024-10-23 11:29:18,079 - julearn - INFO -      Target type: object
2024-10-23 11:29:18,079 - julearn - INFO -      Class distributions: event
cue     266
stim    266
Name: count, dtype: int64
2024-10-23 11:29:18,080 - julearn - INFO - Using outer CV scheme KFold(n_splits=2, random_state=None, shuffle=False) (incl. final model)
2024-10-23 11:29:18,080 - julearn - INFO - Binary classification problem detected.
0.656015037593985
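
The tuples passed for C and gamma, such as (1e-6, 1e3, "log-uniform"), describe a range together with a log-uniform prior: values are spread evenly across orders of magnitude rather than evenly across the raw range. The following is a minimal standard-library sketch of that prior only, not of the Bayesian search itself, which additionally uses earlier results to choose the next candidate. The helper `sample_log_uniform` and the variable names are hypothetical.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw one value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
c_samples = [sample_log_uniform(1e-6, 1e3, rng) for _ in range(10_000)]

# The range [1e-6, 1e3] spans nine decades and [1e-6, 1e-3] covers three of
# them, so roughly one third of the draws should fall below 1e-3.
share_small = sum(c < 1e-3 for c in c_samples) / len(c_samples)
print(f"share of draws below 1e-3: {share_small:.2f}")
```

With a plain uniform prior on the same range, almost no draws would land below 1e-3, which is why a log-uniform prior is the usual choice for scale parameters such as C and gamma.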

It seems that we might have found a better model, but which one is it?

print(estimator.best_params_)
OrderedDict([('set_column_types', SetColumnTypes(X_types={})), ('svm', SVC()), ('svm__C', 0.0018082604408073564), ('svm__gamma', 1.6437581151471767), ('zscore', StandardScaler())])
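
The printed parameters answer the question: the selected SVC() is shown without a kernel argument, which means it uses the scikit-learn default kernel ("rbf"), and svm__gamma is only tuned in the second pipeline (creator2). The following is a minimal sketch of reading such a result programmatically. It assumes the estimator objects are replaced by plain strings so that it runs without julearn; the names `best_params`, `tuned`, and `selected_kernel` are hypothetical.

```python
from collections import OrderedDict

# Values copied from the printed output above; the estimator objects are
# replaced by plain strings so the sketch has no third-party dependencies.
best_params = OrderedDict(
    [
        ("set_column_types", "SetColumnTypes(X_types={})"),
        ("svm", "SVC()"),
        ("svm__C", 0.0018082604408073564),
        ("svm__gamma", 1.6437581151471767),
        ("zscore", "StandardScaler()"),
    ]
)

# Keys of the form "<step>__<parameter>" are the tuned hyperparameters.
tuned = {name: value for name, value in best_params.items() if "__" in name}

# Only the RBF search space tunes gamma, so its presence identifies the winner.
selected_kernel = "rbf" if "svm__gamma" in tuned else "linear"

print(tuned)
print(selected_kernel)
```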

Total running time of the script: (0 minutes 3.860 seconds)

Copyright © 2023, Authors of julearn