Tuning Hyperparameters using Bayesian Search
This example uses the fmri dataset to perform a simple binary classification
with a Support Vector Machine classifier, tunes its hyperparameters using
Bayesian search, and inspects the selected model.
References
Waskom, M.L., Frank, M.C., Wagner, A.D. (2016). Adaptive engagement of cognitive control in context-dependent decision-making. Cerebral Cortex.
# Authors: Federico Raimondo <f.raimondo@fz-juelich.de>
# License: AGPL
import numpy as np
from seaborn import load_dataset
import sklearn
from julearn import run_cross_validation
from julearn.utils import configure_logging, logger
from julearn.pipeline import PipelineCreator
Set the logging level to info to see extra information.
configure_logging(level="INFO")
2026-05-29 20:46:16,635 - julearn - INFO - ===== Lib Versions =====
2026-05-29 20:46:16,636 - julearn - INFO - numpy: 2.4.6
2026-05-29 20:46:16,636 - julearn - INFO - scipy: 1.17.1
2026-05-29 20:46:16,636 - julearn - INFO - sklearn: 1.8.0
2026-05-29 20:46:16,636 - julearn - INFO - pandas: 3.0.3
2026-05-29 20:46:16,636 - julearn - INFO - julearn: 0.3.5
2026-05-29 20:46:16,636 - julearn - INFO - ========================
Disable metadata routing to avoid errors due to BayesSearchCV being used.
sklearn.set_config(enable_metadata_routing=False)
Set the random seed so that the example is reproducible.
np.random.seed(42)
Load the dataset.
df_fmri = load_dataset("fmri")
df_fmri.head()
Pivot the dataframe from long to wide format, so that each region becomes a feature column.
df_fmri = df_fmri.pivot(
    index=["subject", "timepoint", "event"], columns="region", values="signal"
)
df_fmri = df_fmri.reset_index()
df_fmri.head()
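To make the reshaping concrete, here is a minimal plain-Python sketch of the same long-to-wide idea. The rows and signal values are made up for illustration and are not taken from the fmri dataset; pandas' pivot does the equivalent work on the real dataframe.

```python
# Made-up long-format rows: one row per (subject, timepoint, event, region).
long_rows = [
    {"subject": "s0", "timepoint": 0, "event": "cue", "region": "frontal", "signal": 0.1},
    {"subject": "s0", "timepoint": 0, "event": "cue", "region": "parietal", "signal": 0.2},
    {"subject": "s0", "timepoint": 0, "event": "stim", "region": "frontal", "signal": 0.3},
    {"subject": "s0", "timepoint": 0, "event": "stim", "region": "parietal", "signal": 0.4},
]

# Group rows by the index columns and spread each region into its own column.
wide = {}
for row in long_rows:
    key = (row["subject"], row["timepoint"], row["event"])
    wide.setdefault(key, {})[row["region"]] = row["signal"]

# Flatten the grouped data back into one record per index key.
wide_rows = [
    {"subject": s, "timepoint": t, "event": e, **regions}
    for (s, t, e), regions in wide.items()
]

for record in wide_rows:
    print(record)
```

Each wide record now holds both a frontal and a parietal value, which is the layout needed to use the two regions as features and the event as the target.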
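Following the hyperparameter tuning example, we will now use a Bayesian search to find the best hyperparameters for the SVM model. We define two pipelines, one with a linear kernel and one with an RBF kernel, and pass both to run_cross_validation so that the search covers both hyperparameter spaces.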
X = ["frontal", "parietal"]
y = "event"
creator1 = PipelineCreator(problem_type="classification")
creator1.add("zscore")
creator1.add(
    "svm",
    kernel=["linear"],
    C=(1e-6, 1e3, "log-uniform"),
)
creator2 = PipelineCreator(problem_type="classification")
creator2.add("zscore")
creator2.add(
    "svm",
    kernel=["rbf"],
    C=(1e-6, 1e3, "log-uniform"),
    gamma=(1e-6, 1e1, "log-uniform"),
)
search_params = {
    "kind": "bayes",
    "cv": 2,  # to speed up the example
    "n_iter": 10,  # 10 iterations of bayesian search to speed up example
}
scores, estimator = run_cross_validation(
    X=X,
    y=y,
    data=df_fmri,
    model=[creator1, creator2],
    cv=2,  # to speed up the example
    search_params=search_params,
    return_estimator="final",
)
print(scores["test_score"].mean())
2026-05-29 20:46:16,650 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:16,650 - julearn - INFO - Step added
2026-05-29 20:46:16,650 - julearn - INFO - Adding step svm that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:16,650 - julearn - INFO - Setting hyperparameter kernel = linear
2026-05-29 20:46:16,650 - julearn - INFO - Tuning hyperparameter C = (1e-06, 1000.0, 'log-uniform')
2026-05-29 20:46:16,651 - julearn - INFO - Step added
2026-05-29 20:46:16,651 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:16,651 - julearn - INFO - Step added
2026-05-29 20:46:16,651 - julearn - INFO - Adding step svm that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:16,651 - julearn - INFO - Setting hyperparameter kernel = rbf
2026-05-29 20:46:16,651 - julearn - INFO - Tuning hyperparameter C = (1e-06, 1000.0, 'log-uniform')
2026-05-29 20:46:16,651 - julearn - INFO - Tuning hyperparameter gamma = (1e-06, 10.0, 'log-uniform')
2026-05-29 20:46:16,651 - julearn - INFO - Step added
2026-05-29 20:46:16,652 - julearn - INFO - ==== Input Data ====
2026-05-29 20:46:16,652 - julearn - INFO - Using dataframe as input
2026-05-29 20:46:16,652 - julearn - INFO - Features: ['frontal', 'parietal']
2026-05-29 20:46:16,652 - julearn - INFO - Target: event
2026-05-29 20:46:16,652 - julearn - INFO - Expanded features: ['frontal', 'parietal']
2026-05-29 20:46:16,652 - julearn - INFO - X_types:{}
2026-05-29 20:46:16,652 - julearn - WARNING - The following columns are not defined in X_types: ['frontal', 'parietal']. They will be treated as continuous.
/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/julearn/prepare.py:592: RuntimeWarning: The following columns are not defined in X_types: ['frontal', 'parietal']. They will be treated as continuous.
warn_with_log(
2026-05-29 20:46:16,653 - julearn - INFO - ====================
2026-05-29 20:46:16,653 - julearn - INFO -
2026-05-29 20:46:16,654 - julearn - INFO - = Model Parameters =
2026-05-29 20:46:16,655 - julearn - INFO - Tuning hyperparameters using bayes
2026-05-29 20:46:16,655 - julearn - INFO - Hyperparameters:
2026-05-29 20:46:16,655 - julearn - INFO - svm__C: (1e-06, 1000.0, 'log-uniform')
2026-05-29 20:46:16,655 - julearn - INFO - Hyperparameter svm__C is log-uniform float [1e-06, 1000.0]
2026-05-29 20:46:16,656 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2026-05-29 20:46:16,657 - julearn - INFO - Search Parameters:
2026-05-29 20:46:16,657 - julearn - INFO - cv: KFold(n_splits=2, random_state=None, shuffle=False)
2026-05-29 20:46:16,657 - julearn - INFO - n_iter: 10
2026-05-29 20:46:16,657 - julearn - INFO - ====================
2026-05-29 20:46:16,657 - julearn - INFO -
2026-05-29 20:46:16,658 - julearn - INFO - = Model Parameters =
2026-05-29 20:46:16,658 - julearn - INFO - Tuning hyperparameters using bayes
2026-05-29 20:46:16,658 - julearn - INFO - Hyperparameters:
2026-05-29 20:46:16,659 - julearn - INFO - svm__C: (1e-06, 1000.0, 'log-uniform')
2026-05-29 20:46:16,659 - julearn - INFO - svm__gamma: (1e-06, 10.0, 'log-uniform')
2026-05-29 20:46:16,659 - julearn - INFO - Hyperparameter svm__C is log-uniform float [1e-06, 1000.0]
2026-05-29 20:46:16,660 - julearn - INFO - Hyperparameter svm__gamma is log-uniform float [1e-06, 10.0]
2026-05-29 20:46:16,661 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2026-05-29 20:46:16,661 - julearn - INFO - Search Parameters:
2026-05-29 20:46:16,661 - julearn - INFO - cv: KFold(n_splits=2, random_state=None, shuffle=False)
2026-05-29 20:46:16,661 - julearn - INFO - n_iter: 10
2026-05-29 20:46:16,662 - julearn - INFO - ====================
2026-05-29 20:46:16,662 - julearn - INFO -
2026-05-29 20:46:16,663 - julearn - INFO - = Model Parameters =
2026-05-29 20:46:16,663 - julearn - INFO - Tuning hyperparameters using bayes
2026-05-29 20:46:16,663 - julearn - INFO - Hyperparameters list:
2026-05-29 20:46:16,663 - julearn - INFO - Set 0
2026-05-29 20:46:16,663 - julearn - INFO - svm__C: Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2026-05-29 20:46:16,664 - julearn - INFO - set_column_types: [SetColumnTypes(X_types={})]
2026-05-29 20:46:16,664 - julearn - INFO - zscore: [StandardScaler()]
2026-05-29 20:46:16,664 - julearn - INFO - svm: [SVC(kernel='linear')]
2026-05-29 20:46:16,664 - julearn - INFO - Set 1
2026-05-29 20:46:16,665 - julearn - INFO - svm__C: Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2026-05-29 20:46:16,665 - julearn - INFO - svm__gamma: Real(low=1e-06, high=10.0, prior='log-uniform', transform='identity')
2026-05-29 20:46:16,665 - julearn - INFO - set_column_types: [SetColumnTypes(X_types={})]
2026-05-29 20:46:16,665 - julearn - INFO - zscore: [StandardScaler()]
2026-05-29 20:46:16,666 - julearn - INFO - svm: [SVC()]
2026-05-29 20:46:16,666 - julearn - INFO - Hyperparameter svm__C as is Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2026-05-29 20:46:16,667 - julearn - INFO - Hyperparameter set_column_types as is [SetColumnTypes(X_types={})]
2026-05-29 20:46:16,667 - julearn - INFO - Hyperparameter zscore as is [StandardScaler()]
2026-05-29 20:46:16,667 - julearn - INFO - Hyperparameter svm as is [SVC(kernel='linear')]
2026-05-29 20:46:16,667 - julearn - INFO - Hyperparameter svm__C as is Real(low=1e-06, high=1000.0, prior='log-uniform', transform='identity')
2026-05-29 20:46:16,668 - julearn - INFO - Hyperparameter svm__gamma as is Real(low=1e-06, high=10.0, prior='log-uniform', transform='identity')
2026-05-29 20:46:16,668 - julearn - INFO - Hyperparameter set_column_types as is [SetColumnTypes(X_types={})]
2026-05-29 20:46:16,668 - julearn - INFO - Hyperparameter zscore as is [StandardScaler()]
2026-05-29 20:46:16,669 - julearn - INFO - Hyperparameter svm as is [SVC()]
2026-05-29 20:46:16,669 - julearn - INFO - Using inner CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
2026-05-29 20:46:16,669 - julearn - INFO - Search Parameters:
2026-05-29 20:46:16,669 - julearn - INFO - cv: KFold(n_splits=2, random_state=None, shuffle=False)
2026-05-29 20:46:16,669 - julearn - INFO - n_iter: 10
2026-05-29 20:46:16,685 - julearn - INFO - ====================
2026-05-29 20:46:16,685 - julearn - INFO -
2026-05-29 20:46:16,685 - julearn - INFO - = Data Information =
2026-05-29 20:46:16,685 - julearn - INFO - Problem type: classification
2026-05-29 20:46:16,686 - julearn - INFO - Number of samples: 532
2026-05-29 20:46:16,686 - julearn - INFO - Number of features: 2
2026-05-29 20:46:16,686 - julearn - INFO - ====================
2026-05-29 20:46:16,686 - julearn - INFO -
2026-05-29 20:46:16,687 - julearn - INFO - Number of classes: 2
2026-05-29 20:46:16,687 - julearn - INFO - Target type: str
2026-05-29 20:46:16,688 - julearn - INFO - Class distributions: event
cue 266
stim 266
Name: count, dtype: int64
2026-05-29 20:46:16,689 - julearn - INFO - Using outer CV scheme KFold(n_splits=2, random_state=None, shuffle=False) (incl. final model)
2026-05-29 20:46:16,689 - julearn - WARNING - The kind of values in y (str) is not suitable for a classification. Values should be numeric.
/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/julearn/prepare.py:408: RuntimeWarning: The kind of values in y (str) is not suitable for a classification. Values should be numeric.
warn_with_log(
2026-05-29 20:46:16,690 - julearn - INFO - Binary classification problem detected.
0.656015037593985
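The (low, high, "log-uniform") tuples used for C and gamma appear in the log as Real(..., prior='log-uniform') objects. As a rough illustration of what such a prior means, the sketch below draws values whose logarithm is uniform between log(low) and log(high). It uses only the standard library; the helper name sample_log_uniform is hypothetical, and this is not the code that julearn or the underlying search library runs.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw one value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)

# Same bounds as the C hyperparameter in the example above.
samples = [sample_log_uniform(1e-6, 1e3, rng) for _ in range(10_000)]

# The range spans nine decades, so each decade should receive roughly 1/9 of the draws.
lowest_decade = sum(s < 1e-5 for s in samples)
highest_decade = sum(s >= 1e2 for s in samples)
print(lowest_decade, highest_decade)
```

With a plain uniform prior over the same range, almost every draw would fall in the top decade. The log-uniform prior spreads the search evenly across orders of magnitude, which is usually what is wanted for parameters such as C and gamma.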
It seems that we might have found a better model, but which one is it?
print(estimator.best_params_)
OrderedDict([('set_column_types', SetColumnTypes(X_types={})), ('svm', SVC()), ('svm__C', 0.0018082604408073564), ('svm__gamma', 1.6437581151471767), ('zscore', StandardScaler())])
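One way to read this output: only the second search space tunes gamma, and the svm entry is SVC() with its default RBF kernel, so the selected model comes from the RBF pipeline (creator2). The sketch below spells out that reasoning with plain Python sets. The dictionary labels are illustrative, the parameter names are copied from the log, and the estimator objects are left out.

```python
# Tuned parameter names per search space, copied from "Set 0" and "Set 1" in the log.
search_spaces = {
    "creator1 (linear kernel)": {"svm__C"},
    "creator2 (rbf kernel)": {"svm__C", "svm__gamma"},
}

# Numeric entries of best_params_ as printed above (estimator objects omitted).
best_params = {
    "svm__C": 0.0018082604408073564,
    "svm__gamma": 1.6437581151471767,
}

# The winning space is the one whose tuned parameter names match the best parameters.
tuned = set(best_params)
winner = [name for name, keys in search_spaces.items() if keys == tuned]
print(winner)
```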