Tuning Hyperparameters using Bayesian Search

This example uses the fmri dataset to perform a simple binary classification with a Support Vector Machine classifier, tuning its hyperparameters with a Bayesian search and then inspecting the selected model.

References

Waskom, M.L., Frank, M.C., Wagner, A.D. (2016). Adaptive engagement of cognitive control in context-dependent decision-making. Cerebral Cortex.

# Authors: Federico Raimondo <f.raimondo@fz-juelich.de>
# License: AGPL

import numpy as np
from seaborn import load_dataset

from julearn import run_cross_validation
from julearn.utils import configure_logging, logger
from julearn.pipeline import PipelineCreator

Set the logging level to info to see extra information.

configure_logging(level="INFO")
/home/runner/work/julearn/julearn/julearn/utils/logging.py:66: UserWarning: The '__version__' attribute is deprecated and will be removed in MarkupSafe 3.1. Use feature detection, or `importlib.metadata.version("markupsafe")`, instead.
  vstring = str(getattr(module, "__version__", None))
2024-10-17 14:15:40,195 - julearn - INFO - ===== Lib Versions =====
2024-10-17 14:15:40,195 - julearn - INFO - numpy: 1.26.4
2024-10-17 14:15:40,195 - julearn - INFO - scipy: 1.14.1
2024-10-17 14:15:40,195 - julearn - INFO - sklearn: 1.5.2
2024-10-17 14:15:40,195 - julearn - INFO - pandas: 2.2.3
2024-10-17 14:15:40,195 - julearn - INFO - julearn: 0.3.4
2024-10-17 14:15:40,195 - julearn - INFO - ========================

Set the random seed so that the example is reproducible.

np.random.seed(42)

Load the dataset.

df_fmri = load_dataset("fmri")
df_fmri.head()
  subject  timepoint event    region    signal
0     s13         18  stim  parietal -0.017552
1      s5         14  stim  parietal -0.080883
2     s12         18  stim  parietal -0.081033
3     s11         18  stim  parietal -0.046134
4     s10         18  stim  parietal -0.037970


Reshape the dataframe into wide format, so that each brain region becomes its own column and each row is one sample.

df_fmri = df_fmri.pivot(
    index=["subject", "timepoint", "event"], columns="region", values="signal"
)

df_fmri = df_fmri.reset_index()
df_fmri.head()
region subject  timepoint event   frontal  parietal
0           s0          0   cue  0.007766 -0.006899
1           s0          0  stim -0.021452 -0.039327
2           s0          1   cue  0.016440  0.000300
3           s0          1  stim -0.021054 -0.035735
4           s0          2   cue  0.024296  0.033220
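The reshaping step can be tried on a small table first. The following is a minimal sketch, assuming pandas 1.1 or later (needed for a list-valued `index` in `pivot`); the toy values and the names `long_df` and `wide_df` are made up for illustration and are not part of the fmri dataset.

```python
import pandas as pd

# Toy long-format table: one row per (subject, timepoint, event, region).
# The values are invented for illustration only.
long_df = pd.DataFrame(
    {
        "subject": ["s0", "s0", "s0", "s0"],
        "timepoint": [0, 0, 0, 0],
        "event": ["cue", "cue", "stim", "stim"],
        "region": ["frontal", "parietal", "frontal", "parietal"],
        "signal": [0.1, 0.2, 0.3, 0.4],
    }
)

# Each unique (subject, timepoint, event) combination becomes one row,
# and each region becomes one feature column holding the signal.
wide_df = long_df.pivot(
    index=["subject", "timepoint", "event"], columns="region", values="signal"
).reset_index()

print(wide_df)
```

After the pivot, the two region columns can be used as features and `event` as the target, which is the layout the rest of this example relies on.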


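Following the hyperparameter tuning example, we will now use a Bayesian search to find the best hyperparameters for the SVM model. Each tuned hyperparameter is given as a (low, high, prior) tuple; with the "log-uniform" prior, the range is explored on a logarithmic scale.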

X = ["frontal", "parietal"]
y = "event"

creator1 = PipelineCreator(problem_type="classification")
creator1.add("zscore")
creator1.add(
    "svm",
    kernel=["linear"],
    C=(1e-6, 1e3, "log-uniform"),
)

creator2 = PipelineCreator(problem_type="classification")
creator2.add("zscore")
creator2.add(
    "svm",
    kernel=["rbf"],
    C=(1e-6, 1e3, "log-uniform"),
    gamma=(1e-6, 1e1, "log-uniform"),
)

search_params = {
    "kind": "bayes",
    "cv": 2,  # to speed up the example
    "n_iter": 10,  # 10 iterations of bayesian search to speed up example
}


scores, estimator = run_cross_validation(
    X=X,
    y=y,
    data=df_fmri,
    model=[creator1, creator2],
    cv=2,  # to speed up the example
    search_params=search_params,
    return_estimator="final",
)

print(scores["test_score"].mean())
2024-10-17 14:15:40,205 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-17 14:15:40,205 - julearn - INFO - Step added
2024-10-17 14:15:40,205 - julearn - INFO - Adding step svm that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-17 14:15:40,205 - julearn - INFO - Setting hyperparameter kernel = linear
2024-10-17 14:15:40,205 - julearn - INFO - Tuning hyperparameter C = (1e-06, 1000.0, 'log-uniform')
2024-10-17 14:15:40,205 - julearn - INFO - Step added
2024-10-17 14:15:40,206 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-17 14:15:40,206 - julearn - INFO - Step added
2024-10-17 14:15:40,206 - julearn - INFO - Adding step svm that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2024-10-17 14:15:40,206 - julearn - INFO - Setting hyperparameter kernel = rbf
2024-10-17 14:15:40,206 - julearn - INFO - Tuning hyperparameter C = (1e-06, 1000.0, 'log-uniform')
2024-10-17 14:15:40,206 - julearn - INFO - Tuning hyperparameter gamma = (1e-06, 10.0, 'log-uniform')
2024-10-17 14:15:40,206 - julearn - INFO - Step added
2024-10-17 14:15:40,206 - julearn - INFO - ==== Input Data ====
2024-10-17 14:15:40,206 - julearn - INFO - Using dataframe as input
2024-10-17 14:15:40,206 - julearn - INFO -      Features: ['frontal', 'parietal']
2024-10-17 14:15:40,206 - julearn - INFO -      Target: event
2024-10-17 14:15:40,206 - julearn - INFO -      Expanded features: ['frontal', 'parietal']
2024-10-17 14:15:40,206 - julearn - INFO -      X_types:{}
2024-10-17 14:15:40,206 - julearn - WARNING - The following columns are not defined in X_types: ['frontal', 'parietal']. They will be treated as continuous.
/home/runner/work/julearn/julearn/julearn/prepare.py:509: RuntimeWarning: The following columns are not defined in X_types: ['frontal', 'parietal']. They will be treated as continuous.
  warn_with_log(
2024-10-17 14:15:40,207 - julearn - INFO - ====================
2024-10-17 14:15:40,207 - julearn - INFO -
2024-10-17 14:15:40,208 - julearn - INFO - = Model Parameters =
2024-10-17 14:15:40,208 - julearn - INFO - Tuning hyperparameters using bayes
2024-10-17 14:15:40,208 - julearn - INFO - Hyperparameters:
2024-10-17 14:15:40,208 - julearn - INFO -      svm__C: (1e-06, 1000.0, 'log-uniform')
2024-10-17 14:15:40,208 - julearn - INFO - Hyperparameter svm__C is log-uniform float [1e-06, 1000.0]
2024-10-17 14:15:40,209 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-17 14:15:40,209 - julearn - INFO - Search Parameters:
2024-10-17 14:15:40,209 - julearn - INFO -      cv: KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-17 14:15:40,209 - julearn - INFO -      n_iter: 10
2024-10-17 14:15:40,209 - julearn - INFO - ====================
2024-10-17 14:15:40,209 - julearn - INFO -
2024-10-17 14:15:40,210 - julearn - INFO - = Model Parameters =
2024-10-17 14:15:40,210 - julearn - INFO - Tuning hyperparameters using bayes
2024-10-17 14:15:40,210 - julearn - INFO - Hyperparameters:
2024-10-17 14:15:40,210 - julearn - INFO -      svm__C: (1e-06, 1000.0, 'log-uniform')
2024-10-17 14:15:40,210 - julearn - INFO -      svm__gamma: (1e-06, 10.0, 'log-uniform')
2024-10-17 14:15:40,210 - julearn - INFO - Hyperparameter svm__C is log-uniform float [1e-06, 1000.0]
2024-10-17 14:15:40,210 - julearn - INFO - Hyperparameter svm__gamma is log-uniform float [1e-06, 10.0]
2024-10-17 14:15:40,211 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-17 14:15:40,211 - julearn - INFO - Search Parameters:
2024-10-17 14:15:40,211 - julearn - INFO -      cv: KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-17 14:15:40,211 - julearn - INFO -      n_iter: 10
2024-10-17 14:15:40,212 - julearn - INFO - ====================
2024-10-17 14:15:40,212 - julearn - INFO -
2024-10-17 14:15:40,212 - julearn - INFO - = Model Parameters =
2024-10-17 14:15:40,212 - julearn - INFO - Tuning hyperparameters using bayes
2024-10-17 14:15:40,212 - julearn - INFO - Hyperparameters list:
2024-10-17 14:15:40,212 - julearn - INFO -      Set 0
2024-10-17 14:15:40,212 - julearn - INFO -              svm__C: Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-17 14:15:40,212 - julearn - INFO -              set_column_types: [SetColumnTypes(X_types={})]
2024-10-17 14:15:40,212 - julearn - INFO -              zscore: [StandardScaler()]
2024-10-17 14:15:40,212 - julearn - INFO -              svm: [SVC(kernel='linear')]
2024-10-17 14:15:40,212 - julearn - INFO -      Set 1
2024-10-17 14:15:40,213 - julearn - INFO -              svm__C: Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-17 14:15:40,213 - julearn - INFO -              svm__gamma: Real(low=1e-06, high=10.0, prior='log-uniform', transform='identity')
2024-10-17 14:15:40,213 - julearn - INFO -              set_column_types: [SetColumnTypes(X_types={})]
2024-10-17 14:15:40,213 - julearn - INFO -              zscore: [StandardScaler()]
2024-10-17 14:15:40,213 - julearn - INFO -              svm: [SVC()]
2024-10-17 14:15:40,213 - julearn - INFO - Hyperparameter svm__C as is Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-17 14:15:40,213 - julearn - INFO - Hyperparameter set_column_types as is [SetColumnTypes(X_types={})]
2024-10-17 14:15:40,213 - julearn - INFO - Hyperparameter zscore as is [StandardScaler()]
2024-10-17 14:15:40,213 - julearn - INFO - Hyperparameter svm as is [SVC(kernel='linear')]
2024-10-17 14:15:40,214 - julearn - INFO - Hyperparameter svm__C as is Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2024-10-17 14:15:40,214 - julearn - INFO - Hyperparameter svm__gamma as is Real(low=1e-06, high=10.0, prior='log-uniform', transform='identity')
2024-10-17 14:15:40,214 - julearn - INFO - Hyperparameter set_column_types as is [SetColumnTypes(X_types={})]
2024-10-17 14:15:40,214 - julearn - INFO - Hyperparameter zscore as is [StandardScaler()]
2024-10-17 14:15:40,214 - julearn - INFO - Hyperparameter svm as is [SVC()]
2024-10-17 14:15:40,214 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-17 14:15:40,214 - julearn - INFO - Search Parameters:
2024-10-17 14:15:40,214 - julearn - INFO -      cv: KFold(n_splits=2, random_state=None, shuffle=False)
2024-10-17 14:15:40,214 - julearn - INFO -      n_iter: 10
2024-10-17 14:15:40,223 - julearn - INFO - ====================
2024-10-17 14:15:40,223 - julearn - INFO -
2024-10-17 14:15:40,223 - julearn - INFO - = Data Information =
2024-10-17 14:15:40,223 - julearn - INFO -      Problem type: classification
2024-10-17 14:15:40,223 - julearn - INFO -      Number of samples: 532
2024-10-17 14:15:40,223 - julearn - INFO -      Number of features: 2
2024-10-17 14:15:40,223 - julearn - INFO - ====================
2024-10-17 14:15:40,223 - julearn - INFO -
2024-10-17 14:15:40,223 - julearn - INFO -      Number of classes: 2
2024-10-17 14:15:40,223 - julearn - INFO -      Target type: object
2024-10-17 14:15:40,224 - julearn - INFO -      Class distributions: event
cue     266
stim    266
Name: count, dtype: int64
2024-10-17 14:15:40,224 - julearn - INFO - Using outer CV scheme KFold(n_splits=2, random_state=None, shuffle=False) (incl. final model)
2024-10-17 14:15:40,224 - julearn - INFO - Binary classification problem detected.
0.656015037593985
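The log above shows that each tuple such as (1e-6, 1e3, "log-uniform") is turned into a real-valued search dimension with a log-uniform prior. The following is a minimal sketch of what that prior means, using only the standard library; `sample_log_uniform` is a hypothetical helper written for this illustration, and it draws plain random samples rather than reproducing how the Bayesian search actually proposes candidates.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)
draws = [sample_log_uniform(1e-6, 1e3, rng) for _ in range(10_000)]

# The range spans nine orders of magnitude, six of them below 1.0, so
# roughly two thirds of the draws are expected to fall below 1.0.
share_below_one = sum(d < 1.0 for d in draws) / len(draws)
print(f"share of draws below 1.0: {share_below_one:.2f}")
```

With a plain uniform prior on the same range, almost all draws would land above 1.0, which is why a logarithmic scale is the usual choice for parameters like C and gamma.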

It seems that we might have found a better model, but which one is it?

print(estimator.best_params_)
OrderedDict([('set_column_types', SetColumnTypes(X_types={})), ('svm', SVC()), ('svm__C', 0.0018082604408073564), ('svm__gamma', 1.6437581151471767), ('zscore', StandardScaler())])
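The selected set contains svm__gamma, which only the second pipeline (creator2) tunes, and SVC() printed without arguments uses scikit-learn's default kernel, "rbf". The search therefore picked the RBF pipeline. The following is a minimal sketch of reading this off programmatically; `tuned_values` and `selected_kernel` are hypothetical helpers, not julearn functions, and the estimator objects are replaced by strings so that the snippet runs on its own.

```python
from collections import OrderedDict

# Stand-in for estimator.best_params_ as printed above. The estimator
# objects are replaced by their printed form (plain strings) so that the
# sketch runs without julearn or scikit-learn installed.
best_params = OrderedDict(
    [
        ("set_column_types", "SetColumnTypes(X_types={})"),
        ("svm", "SVC()"),
        ("svm__C", 0.0018082604408073564),
        ("svm__gamma", 1.6437581151471767),
        ("zscore", "StandardScaler()"),
    ]
)


def tuned_values(params):
    """Return only the 'step__parameter' entries, i.e. the tuned values."""
    return {name: value for name, value in params.items() if "__" in name}


def selected_kernel(params):
    """Infer the winning pipeline: here only the RBF space tunes gamma."""
    return "rbf" if "svm__gamma" in params else "linear"


print(selected_kernel(best_params))
print(tuned_values(best_params))
```

The gamma-based check is specific to this example, where the two search spaces differ in whether gamma is tuned. With the real object, inspecting the kernel attribute of the "svm" entry would be the more direct route.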

Total running time of the script: (0 minutes 4.007 seconds)

Copyright © 2023, Authors of julearn