.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/advanced/run_hyperparameter_tuning.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_advanced_run_hyperparameter_tuning.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_advanced_run_hyperparameter_tuning.py:


Tuning Hyperparameters
=======================

This example uses the 'fmri' dataset, performs a simple binary classification
using a Support Vector Machine classifier, and analyses the model.

References
----------
Waskom, M.L., Frank, M.C., Wagner, A.D. (2016). Adaptive engagement of
cognitive control in context-dependent decision-making. Cerebral Cortex.


.. include:: ../../links.inc

.. GENERATED FROM PYTHON SOURCE LINES 17-26

.. code-block:: default

    # Authors: Federico Raimondo
    #
    # License: AGPL
    import numpy as np
    from seaborn import load_dataset

    from julearn import run_cross_validation
    from julearn.utils import configure_logging


.. GENERATED FROM PYTHON SOURCE LINES 27-28

Set the logging level to info to see extra information.

.. GENERATED FROM PYTHON SOURCE LINES 28-30

.. code-block:: default

    configure_logging(level='INFO')


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2022-12-08 10:45:56,370 - julearn - INFO - ===== Lib Versions =====
    2022-12-08 10:45:56,370 - julearn - INFO - numpy: 1.23.5
    2022-12-08 10:45:56,370 - julearn - INFO - scipy: 1.9.3
    2022-12-08 10:45:56,370 - julearn - INFO - sklearn: 1.0.2
    2022-12-08 10:45:56,370 - julearn - INFO - pandas: 1.4.4
    2022-12-08 10:45:56,370 - julearn - INFO - julearn: 0.2.7
    2022-12-08 10:45:56,371 - julearn - INFO - ========================


.. GENERATED FROM PYTHON SOURCE LINES 31-32

Set the random seed so that the example is reproducible.

.. GENERATED FROM PYTHON SOURCE LINES 32-35

.. code-block:: default

    np.random.seed(42)


.. GENERATED FROM PYTHON SOURCE LINES 36-37

Load the dataset.

.. GENERATED FROM PYTHON SOURCE LINES 37-40

.. code-block:: default

    df_fmri = load_dataset('fmri')
    print(df_fmri.head())


.. rst-class:: sphx-glr-script-out

.. code-block:: none

      subject  timepoint event    region    signal
    0     s13         18  stim  parietal -0.017552
    1      s5         14  stim  parietal -0.080883
    2     s12         18  stim  parietal -0.081033
    3     s11         18  stim  parietal -0.046134
    4     s10         18  stim  parietal -0.037970


.. GENERATED FROM PYTHON SOURCE LINES 41-42

Reshape the dataframe into the right format: one row per subject, timepoint,
and event, with one column per region.

.. GENERATED FROM PYTHON SOURCE LINES 42-50

.. code-block:: default

    df_fmri = df_fmri.pivot(
        index=['subject', 'timepoint', 'event'],
        columns='region',
        values='signal')

    df_fmri = df_fmri.reset_index()
    print(df_fmri.head())


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    region subject  timepoint event   frontal  parietal
    0           s0          0   cue  0.007766 -0.006899
    1           s0          0  stim -0.021452 -0.039327
    2           s0          1   cue  0.016440  0.000300
    3           s0          1  stim -0.021054 -0.035735
    4           s0          2   cue  0.024296  0.033220


.. GENERATED FROM PYTHON SOURCE LINES 51-52

Let's make a first attempt and use a linear SVM with the default parameters.

.. GENERATED FROM PYTHON SOURCE LINES 52-61

.. code-block:: default

    model_params = {'svm__kernel': 'linear'}
    X = ['frontal', 'parietal']
    y = 'event'
    scores = run_cross_validation(
        X=X, y=y, data=df_fmri, model='svm', preprocess_X='zscore',
        model_params=model_params)

    print(scores['test_score'].mean())


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2022-12-08 10:45:56,387 - julearn - INFO - Using default CV
    2022-12-08 10:45:56,387 - julearn - INFO - ==== Input Data ====
    2022-12-08 10:45:56,387 - julearn - INFO - Using dataframe as input
    2022-12-08 10:45:56,387 - julearn - INFO - Features: ['frontal', 'parietal']
    2022-12-08 10:45:56,388 - julearn - INFO - Target: event
    2022-12-08 10:45:56,388 - julearn - INFO - Expanded X: ['frontal', 'parietal']
    2022-12-08 10:45:56,388 - julearn - INFO - Expanded Confounds: []
    2022-12-08 10:45:56,388 - julearn - INFO - ====================
    2022-12-08 10:45:56,389 - julearn - INFO -
    2022-12-08 10:45:56,389 - julearn - INFO - ====== Model ======
    2022-12-08 10:45:56,389 - julearn - INFO - Obtaining model by name: svm
    2022-12-08 10:45:56,389 - julearn - INFO - ===================
    2022-12-08 10:45:56,389 - julearn - INFO -
    2022-12-08 10:45:56,389 - julearn - INFO - = Model Parameters =
    2022-12-08 10:45:56,389 - julearn - INFO - Setting hyperparameter svm__kernel = linear
    2022-12-08 10:45:56,390 - julearn - INFO - ====================
    2022-12-08 10:45:56,390 - julearn - INFO -
    2022-12-08 10:45:56,390 - julearn - INFO - CV interpreted as RepeatedKFold with 5 repetitions of 5 folds
    0.5765508728619291


.. GENERATED FROM PYTHON SOURCE LINES 62-64

The score is not very good. Let's see if there is an optimal regularization
parameter (C) for the linear SVM.

.. GENERATED FROM PYTHON SOURCE LINES 64-76

.. code-block:: default

    model_params = {
        'svm__kernel': 'linear',
        'svm__C': [0.01, 0.1],
        'cv': 2}  # CV=2 to speed up the example
    X = ['frontal', 'parietal']
    y = 'event'
    scores, estimator = run_cross_validation(
        X=X, y=y, data=df_fmri, model='svm', preprocess_X='zscore',
        model_params=model_params, return_estimator='final')

    print(scores['test_score'].mean())


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2022-12-08 10:45:56,873 - julearn - INFO - Using default CV
    2022-12-08 10:45:56,873 - julearn - INFO - ==== Input Data ====
    2022-12-08 10:45:56,873 - julearn - INFO - Using dataframe as input
    2022-12-08 10:45:56,873 - julearn - INFO - Features: ['frontal', 'parietal']
    2022-12-08 10:45:56,873 - julearn - INFO - Target: event
    2022-12-08 10:45:56,873 - julearn - INFO - Expanded X: ['frontal', 'parietal']
    2022-12-08 10:45:56,873 - julearn - INFO - Expanded Confounds: []
    2022-12-08 10:45:56,874 - julearn - INFO - ====================
    2022-12-08 10:45:56,874 - julearn - INFO -
    2022-12-08 10:45:56,874 - julearn - INFO - ====== Model ======
    2022-12-08 10:45:56,874 - julearn - INFO - Obtaining model by name: svm
    2022-12-08 10:45:56,874 - julearn - INFO - ===================
    2022-12-08 10:45:56,874 - julearn - INFO -
    2022-12-08 10:45:56,874 - julearn - INFO - = Model Parameters =
    2022-12-08 10:45:56,874 - julearn - INFO - Setting hyperparameter svm__kernel = linear
    2022-12-08 10:45:56,875 - julearn - WARNING - `cv` should not be directly provided in the`model_params` anymore. This functionality willbe removed in the next version of julearn.Please use `cv` inside of `search_params` instead
    2022-12-08 10:45:56,875 - julearn - INFO - Tunning hyperparameters using grid
    2022-12-08 10:45:56,875 - julearn - INFO - Hyperparameters:
    2022-12-08 10:45:56,875 - julearn - INFO - svm__C: [0.01, 0.1]
    2022-12-08 10:45:56,875 - julearn - INFO - Using scikit-learn CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:45:56,875 - julearn - INFO - Search Parameters:
    2022-12-08 10:45:56,876 - julearn - INFO - cv: KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:45:56,876 - julearn - INFO - scoring: None
    2022-12-08 10:45:56,876 - julearn - INFO - ====================
    2022-12-08 10:45:56,876 - julearn - INFO -
    2022-12-08 10:45:56,876 - julearn - INFO - CV interpreted as RepeatedKFold with 5 repetitions of 5 folds
    0.575591606418621


.. GENERATED FROM PYTHON SOURCE LINES 77-78

This did not change much. Let's explore other kernels too.

.. GENERATED FROM PYTHON SOURCE LINES 78-89

.. code-block:: default

    model_params = {
        'svm__kernel': ['linear', 'rbf', 'poly'],
        'svm__C': [0.01, 0.1],
        'cv': 2}  # CV=2 to speed up the example
    X = ['frontal', 'parietal']
    y = 'event'
    scores, estimator = run_cross_validation(
        X=X, y=y, data=df_fmri, model='svm', preprocess_X='zscore',
        model_params=model_params, return_estimator='final')

    print(scores['test_score'].mean())


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2022-12-08 10:45:59,161 - julearn - INFO - Using default CV
    2022-12-08 10:45:59,161 - julearn - INFO - ==== Input Data ====
    2022-12-08 10:45:59,161 - julearn - INFO - Using dataframe as input
    2022-12-08 10:45:59,161 - julearn - INFO - Features: ['frontal', 'parietal']
    2022-12-08 10:45:59,161 - julearn - INFO - Target: event
    2022-12-08 10:45:59,161 - julearn - INFO - Expanded X: ['frontal', 'parietal']
    2022-12-08 10:45:59,161 - julearn - INFO - Expanded Confounds: []
    2022-12-08 10:45:59,162 - julearn - INFO - ====================
    2022-12-08 10:45:59,162 - julearn - INFO -
    2022-12-08 10:45:59,162 - julearn - INFO - ====== Model ======
    2022-12-08 10:45:59,162 - julearn - INFO - Obtaining model by name: svm
    2022-12-08 10:45:59,162 - julearn - INFO - ===================
    2022-12-08 10:45:59,162 - julearn - INFO -
    2022-12-08 10:45:59,162 - julearn - INFO - = Model Parameters =
    2022-12-08 10:45:59,162 - julearn - WARNING - `cv` should not be directly provided in the`model_params` anymore. This functionality willbe removed in the next version of julearn.Please use `cv` inside of `search_params` instead
    2022-12-08 10:45:59,162 - julearn - INFO - Tunning hyperparameters using grid
    2022-12-08 10:45:59,162 - julearn - INFO - Hyperparameters:
    2022-12-08 10:45:59,162 - julearn - INFO - svm__kernel: ['linear', 'rbf', 'poly']
    2022-12-08 10:45:59,162 - julearn - INFO - svm__C: [0.01, 0.1]
    2022-12-08 10:45:59,163 - julearn - INFO - Using scikit-learn CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:45:59,163 - julearn - INFO - Search Parameters:
    2022-12-08 10:45:59,163 - julearn - INFO - cv: KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:45:59,163 - julearn - INFO - scoring: None
    2022-12-08 10:45:59,163 - julearn - INFO - ====================
    2022-12-08 10:45:59,163 - julearn - INFO -
    2022-12-08 10:45:59,163 - julearn - INFO - CV interpreted as RepeatedKFold with 5 repetitions of 5 folds
    0.7116487391994357


.. GENERATED FROM PYTHON SOURCE LINES 90-91

It seems that we might have found a better model, but which one is it?

.. GENERATED FROM PYTHON SOURCE LINES 91-93

.. code-block:: default

    print(estimator.best_params_)


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    {'svm__C': 0.1, 'svm__kernel': 'rbf'}


.. GENERATED FROM PYTHON SOURCE LINES 94-96

Now that we know that an RBF kernel is better, let's test different *gamma*
parameters.

.. GENERATED FROM PYTHON SOURCE LINES 96-110

.. code-block:: default

    model_params = {
        'svm__kernel': 'rbf',
        'svm__C': [0.01, 0.1],
        'svm__gamma': [1e-2, 1e-3],
        'cv': 2}  # CV=2 to speed up the example
    X = ['frontal', 'parietal']
    y = 'event'
    scores, estimator = run_cross_validation(
        X=X, y=y, data=df_fmri, model='svm', preprocess_X='zscore',
        model_params=model_params, return_estimator='final')

    print(scores['test_score'].mean())
    print(estimator.best_params_)


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2022-12-08 10:46:05,065 - julearn - INFO - Using default CV
    2022-12-08 10:46:05,065 - julearn - INFO - ==== Input Data ====
    2022-12-08 10:46:05,065 - julearn - INFO - Using dataframe as input
    2022-12-08 10:46:05,066 - julearn - INFO - Features: ['frontal', 'parietal']
    2022-12-08 10:46:05,066 - julearn - INFO - Target: event
    2022-12-08 10:46:05,066 - julearn - INFO - Expanded X: ['frontal', 'parietal']
    2022-12-08 10:46:05,066 - julearn - INFO - Expanded Confounds: []
    2022-12-08 10:46:05,066 - julearn - INFO - ====================
    2022-12-08 10:46:05,066 - julearn - INFO -
    2022-12-08 10:46:05,067 - julearn - INFO - ====== Model ======
    2022-12-08 10:46:05,067 - julearn - INFO - Obtaining model by name: svm
    2022-12-08 10:46:05,067 - julearn - INFO - ===================
    2022-12-08 10:46:05,067 - julearn - INFO -
    2022-12-08 10:46:05,067 - julearn - INFO - = Model Parameters =
    2022-12-08 10:46:05,067 - julearn - INFO - Setting hyperparameter svm__kernel = rbf
    2022-12-08 10:46:05,068 - julearn - WARNING - `cv` should not be directly provided in the`model_params` anymore. This functionality willbe removed in the next version of julearn.Please use `cv` inside of `search_params` instead
    2022-12-08 10:46:05,068 - julearn - INFO - Tunning hyperparameters using grid
    2022-12-08 10:46:05,068 - julearn - INFO - Hyperparameters:
    2022-12-08 10:46:05,068 - julearn - INFO - svm__C: [0.01, 0.1]
    2022-12-08 10:46:05,068 - julearn - INFO - svm__gamma: [0.01, 0.001]
    2022-12-08 10:46:05,068 - julearn - INFO - Using scikit-learn CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:46:05,068 - julearn - INFO - Search Parameters:
    2022-12-08 10:46:05,068 - julearn - INFO - cv: KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:46:05,068 - julearn - INFO - scoring: None
    2022-12-08 10:46:05,068 - julearn - INFO - ====================
    2022-12-08 10:46:05,068 - julearn - INFO -
    2022-12-08 10:46:05,068 - julearn - INFO - CV interpreted as RepeatedKFold with 5 repetitions of 5 folds
    0.47479104214424267
    {'svm__C': 0.01, 'svm__gamma': 0.001}


.. GENERATED FROM PYTHON SOURCE LINES 111-113

It seems that without tuning the gamma parameter we had a better accuracy.
Let's add the default value and see what happens.

.. GENERATED FROM PYTHON SOURCE LINES 113-127

.. code-block:: default

    model_params = {
        'svm__kernel': 'rbf',
        'svm__C': [0.01, 0.1],
        'svm__gamma': [1e-2, 1e-3, 'scale'],
        'cv': 2}  # CV=2 to speed up the example
    X = ['frontal', 'parietal']
    y = 'event'
    scores, estimator = run_cross_validation(
        X=X, y=y, data=df_fmri, model='svm', preprocess_X='zscore',
        model_params=model_params, return_estimator='final')

    print(scores['test_score'].mean())
    print(estimator.best_params_)


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2022-12-08 10:46:09,469 - julearn - INFO - Using default CV
    2022-12-08 10:46:09,469 - julearn - INFO - ==== Input Data ====
    2022-12-08 10:46:09,469 - julearn - INFO - Using dataframe as input
    2022-12-08 10:46:09,469 - julearn - INFO - Features: ['frontal', 'parietal']
    2022-12-08 10:46:09,469 - julearn - INFO - Target: event
    2022-12-08 10:46:09,470 - julearn - INFO - Expanded X: ['frontal', 'parietal']
    2022-12-08 10:46:09,470 - julearn - INFO - Expanded Confounds: []
    2022-12-08 10:46:09,470 - julearn - INFO - ====================
    2022-12-08 10:46:09,470 - julearn - INFO -
    2022-12-08 10:46:09,470 - julearn - INFO - ====== Model ======
    2022-12-08 10:46:09,471 - julearn - INFO - Obtaining model by name: svm
    2022-12-08 10:46:09,471 - julearn - INFO - ===================
    2022-12-08 10:46:09,471 - julearn - INFO -
    2022-12-08 10:46:09,471 - julearn - INFO - = Model Parameters =
    2022-12-08 10:46:09,471 - julearn - INFO - Setting hyperparameter svm__kernel = rbf
    2022-12-08 10:46:09,472 - julearn - WARNING - `cv` should not be directly provided in the`model_params` anymore. This functionality willbe removed in the next version of julearn.Please use `cv` inside of `search_params` instead
    2022-12-08 10:46:09,472 - julearn - INFO - Tunning hyperparameters using grid
    2022-12-08 10:46:09,472 - julearn - INFO - Hyperparameters:
    2022-12-08 10:46:09,472 - julearn - INFO - svm__C: [0.01, 0.1]
    2022-12-08 10:46:09,472 - julearn - INFO - svm__gamma: [0.01, 0.001, 'scale']
    2022-12-08 10:46:09,472 - julearn - INFO - Using scikit-learn CV scheme KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:46:09,472 - julearn - INFO - Search Parameters:
    2022-12-08 10:46:09,472 - julearn - INFO - cv: KFold(n_splits=2, random_state=None, shuffle=False)
    2022-12-08 10:46:09,472 - julearn - INFO - scoring: None
    2022-12-08 10:46:09,472 - julearn - INFO - ====================
    2022-12-08 10:46:09,472 - julearn - INFO -
    2022-12-08 10:46:09,472 - julearn - INFO - CV interpreted as RepeatedKFold with 5 repetitions of 5 folds
    0.7074977958032092
    {'svm__C': 0.1, 'svm__gamma': 'scale'}


.. GENERATED FROM PYTHON SOURCE LINES 128-129

So what was the best ``gamma`` in the end?

.. GENERATED FROM PYTHON SOURCE LINES 129-130

.. code-block:: default

    print(estimator.best_estimator_['svm']._gamma)


.. rst-class:: sphx-glr-script-out

.. code-block:: none

    0.5


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 19.355 seconds)


.. _sphx_glr_download_auto_examples_advanced_run_hyperparameter_tuning.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: run_hyperparameter_tuning.py <run_hyperparameter_tuning.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: run_hyperparameter_tuning.ipynb <run_hyperparameter_tuning.ipynb>`


.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
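A note on the final ``0.5``: scikit-learn documents ``gamma='scale'`` as ``1 / (n_features * X.var())``. Because the example z-scores its two features, the variance of the training data is 1, so the rule gives ``1 / (2 * 1) = 0.5``. The sketch below reproduces that arithmetic with the standard library only. The helper names ``zscore`` and ``scale_gamma`` and the toy values are hypothetical and are not part of julearn or scikit-learn.

```python
from statistics import fmean, pvariance


def zscore(column):
    """Standardize one feature to zero mean and unit population variance."""
    mean = fmean(column)
    std = pvariance(column) ** 0.5
    return [(value - mean) / std for value in column]


def scale_gamma(columns):
    """Apply the documented 'scale' rule: 1 / (n_features * variance of all values)."""
    n_features = len(columns)
    all_values = [value for column in columns for value in column]
    return 1.0 / (n_features * pvariance(all_values))


# Toy stand-ins for the two z-scored features (hypothetical values, not the fmri data).
frontal = zscore([0.008, -0.021, 0.016, -0.021, 0.024])
parietal = zscore([-0.007, -0.039, 0.000, -0.036, 0.033])

gamma = scale_gamma([frontal, parietal])
print(round(gamma, 6))
```

Under this reading, the grid values ``1e-2`` and ``1e-3`` were far smaller than the value that ``'scale'`` resolves to, which may explain why the search restricted to those two values scored worse.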