6.4. Inspecting Models
Understanding the internals of machine learning models is essential for interpreting their behavior and gaining insight into their predictions. By inspecting the parameters and hyperparameters of a trained model, we can identify the features that most strongly influence its output and explore how the model works. By analyzing the performance of each model across different iterations and hyperparameter settings, we can assess the variability across models and spot patterns that help interpret their outputs. Together, these insights support informed decisions about a model's deployment.
In this context, we will explore how to perform model inspection in julearn. julearn provides an intuitive suite of tools for model inspection and interpretation. In particular, we will focus on how to inspect models in julearn's nested cross-validation workflow. With these techniques, we can gain a better understanding of how a model works and identify any patterns or anomalies that could affect its performance. This knowledge can help us deploy models more effectively and interpret their outputs with greater confidence.
Let’s start by importing some useful utilities:
from pprint import pprint
import seaborn as sns
import numpy as np
from sklearn.model_selection import RepeatedKFold
from julearn import run_cross_validation
from julearn.pipeline import PipelineCreator
from julearn.utils import configure_logging
Now, let's configure julearn's logger to get some output as the pipeline runs, and load some toy data to play with. In this example, we will use the penguins dataset and classify the penguin species based on the continuous measures in the dataset.
configure_logging(level="INFO")
# get some data
penguins_df = sns.load_dataset("penguins")
penguins_df = penguins_df.drop(columns=["island", "sex"])
penguins_df = penguins_df.query("species != 'Chinstrap'").dropna()
penguins_df["species"] = penguins_df["species"].replace(
{"Adelie": 0, "Gentoo": 1}
)
features = [x for x in penguins_df.columns if x != "species"]
2024-05-03 15:22:14,978 - julearn - INFO - ===== Lib Versions =====
2024-05-03 15:22:14,979 - julearn - INFO - numpy: 1.26.4
2024-05-03 15:22:14,979 - julearn - INFO - scipy: 1.13.0
2024-05-03 15:22:14,979 - julearn - INFO - sklearn: 1.4.2
2024-05-03 15:22:14,979 - julearn - INFO - pandas: 2.1.4
2024-05-03 15:22:14,979 - julearn - INFO - julearn: 0.3.2
2024-05-03 15:22:14,979 - julearn - INFO - ========================
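As a quick, optional sanity check (a small addition, not part of the original example), we can look at the shape and class balance of the prepared data:

# optional sanity check on the prepared data: 274 rows remain after
# dropping Chinstrap and missing values; 4 feature columns plus the target
print(penguins_df.shape)
print(penguins_df["species"].value_counts())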
We are going to use a fairly simple pipeline, in which we z-score the features and then apply a support vector classifier to classify species.
# create model
pipeline_creator = PipelineCreator(problem_type="classification", apply_to="*")
pipeline_creator.add("zscore")
pipeline_creator.add("svm", kernel="linear", C=np.geomspace(1e-2, 1e2, 5))
print(pipeline_creator)
2024-05-03 15:22:15,440 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'*'}; pattern=.*>
2024-05-03 15:22:15,440 - julearn - INFO - Step added
2024-05-03 15:22:15,440 - julearn - INFO - Adding step svm that applies to ColumnTypes<types={'*'}; pattern=.*>
2024-05-03 15:22:15,440 - julearn - INFO - Setting hyperparameter kernel = linear
2024-05-03 15:22:15,440 - julearn - INFO - Tuning hyperparameter C = [1.e-02 1.e-01 1.e+00 1.e+01 1.e+02]
2024-05-03 15:22:15,440 - julearn - INFO - Step added
PipelineCreator:
Step 0: zscore
estimator: StandardScaler()
apply to: ColumnTypes<types={'*'}; pattern=.*>
needed types: ColumnTypes<types={'*'}; pattern=.*>
tuning params: {}
Step 1: svm
estimator: SVC(kernel='linear')
apply to: ColumnTypes<types={'*'}; pattern=.*>
needed types: ColumnTypes<types={'*'}; pattern=.*>
tuning params: {'svm__C': array([1.e-02, 1.e-01, 1.e+00, 1.e+01, 1.e+02])}
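As a brief aside, the grid of C values was generated with np.geomspace, which spaces points evenly on a logarithmic scale; this is why the log above lists candidates spanning four orders of magnitude:

# np.geomspace returns log-spaced values between the two endpoints
print(np.geomspace(1e-2, 1e2, 5))
# [1.e-02 1.e-01 1.e+00 1.e+01 1.e+02]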
Once this is set up, we can simply call julearn's run_cross_validation(). Notice how we set the return_inspector parameter to True. Importantly, we also have to set the return_estimator parameter to "all". This is because julearn's Inspector extracts all relevant information from the estimators after the pipeline has been run. Running the pipeline will take a little while in our example:
scores, final_model, inspector = run_cross_validation(
X=features,
y="species",
data=penguins_df,
model=pipeline_creator,
seed=200,
cv=RepeatedKFold(n_repeats=10, n_splits=5, random_state=200),
return_estimator="all",
return_inspector=True,
)
2024-05-03 15:22:15,441 - julearn - INFO - Setting random seed to 200
2024-05-03 15:22:15,442 - julearn - INFO - ==== Input Data ====
2024-05-03 15:22:15,442 - julearn - INFO - Using dataframe as input
2024-05-03 15:22:15,442 - julearn - INFO - Features: ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']
2024-05-03 15:22:15,442 - julearn - INFO - Target: species
2024-05-03 15:22:15,442 - julearn - INFO - Expanded features: ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']
2024-05-03 15:22:15,442 - julearn - INFO - X_types:{}
2024-05-03 15:22:15,442 - julearn - WARNING - The following columns are not defined in X_types: ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']. They will be treated as continuous.
/home/runner/work/julearn/julearn/julearn/prepare.py:505: RuntimeWarning: The following columns are not defined in X_types: ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']. They will be treated as continuous.
warn_with_log(
2024-05-03 15:22:15,443 - julearn - INFO - ====================
2024-05-03 15:22:15,443 - julearn - INFO -
2024-05-03 15:22:15,444 - julearn - INFO - = Model Parameters =
2024-05-03 15:22:15,444 - julearn - INFO - Tuning hyperparameters using grid
2024-05-03 15:22:15,444 - julearn - INFO - Hyperparameters:
2024-05-03 15:22:15,444 - julearn - INFO - svm__C: [1.e-02 1.e-01 1.e+00 1.e+01 1.e+02]
2024-05-03 15:22:15,444 - julearn - INFO - Using inner CV scheme KFold(n_splits=5, random_state=None, shuffle=False)
2024-05-03 15:22:15,445 - julearn - INFO - Search Parameters:
2024-05-03 15:22:15,445 - julearn - INFO - cv: KFold(n_splits=5, random_state=None, shuffle=False)
2024-05-03 15:22:15,445 - julearn - INFO - ====================
2024-05-03 15:22:15,445 - julearn - INFO -
2024-05-03 15:22:15,445 - julearn - INFO - = Data Information =
2024-05-03 15:22:15,445 - julearn - INFO - Problem type: classification
2024-05-03 15:22:15,445 - julearn - INFO - Number of samples: 274
2024-05-03 15:22:15,445 - julearn - INFO - Number of features: 4
2024-05-03 15:22:15,445 - julearn - INFO - ====================
2024-05-03 15:22:15,445 - julearn - INFO -
2024-05-03 15:22:15,445 - julearn - INFO - Number of classes: 2
2024-05-03 15:22:15,445 - julearn - INFO - Target type: int64
2024-05-03 15:22:15,446 - julearn - INFO - Class distributions: species
0 151
1 123
Name: count, dtype: int64
2024-05-03 15:22:15,446 - julearn - INFO - Using outer CV scheme RepeatedKFold(n_repeats=10, n_splits=5, random_state=200)
2024-05-03 15:22:15,446 - julearn - INFO - Binary classification problem detected.
2024-05-03 15:22:31,430 - julearn - INFO - Fitting final model
After this is done, we can use the inspector to look at the final model's parameters, but also at the parameters of the individual models from each fold of the cross-validation. The final model can be inspected using the .model attribute. For example, to get a quick overview of the model parameters, run:
# pprint (imported above) produces a more readable output than plain print
pprint(inspector.model.get_params())
{'cv': KFold(n_splits=5, random_state=None, shuffle=False),
'error_score': nan,
'estimator': Pipeline(steps=[('set_column_types', SetColumnTypes(X_types={})),
('zscore',
JuColumnTransformer(apply_to=ColumnTypes<types={'*'}; pattern=.*>,
copy=True, name='zscore',
transformer=StandardScaler(),
with_mean=True, with_std=True)),
('svm',
WrapModel(C=1.0, apply_to=ColumnTypes<types={'*'}; pattern=.*>,
break_ties=False, cache_size=200, class_weight=None,
coef0=0.0, decision_function_shape='ovr', degree=3,
gamma='scale', kernel='linear', max_iter=-1,
model=SVC(kernel='linear'), probability=False,
random_state=None, shrinking=True, tol=0.001,
verbose=False))]),
'estimator__memory': None,
'estimator__set_column_types': SetColumnTypes(X_types={}),
'estimator__set_column_types__X_types': {},
'estimator__set_column_types__row_select_col_type': None,
'estimator__set_column_types__row_select_vals': None,
'estimator__steps': [('set_column_types', SetColumnTypes(X_types={})),
('zscore',
JuColumnTransformer(apply_to=ColumnTypes<types={'*'}; pattern=.*>, copy=True,
name='zscore', transformer=StandardScaler(), with_mean=True,
with_std=True)),
('svm',
WrapModel(C=1.0, apply_to=ColumnTypes<types={'*'}; pattern=.*>,
break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='scale',
kernel='linear', max_iter=-1, model=SVC(kernel='linear'),
probability=False, random_state=None, shrinking=True, tol=0.001,
verbose=False))],
'estimator__svm': WrapModel(C=1.0, apply_to=ColumnTypes<types={'*'}; pattern=.*>,
break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='scale',
kernel='linear', max_iter=-1, model=SVC(kernel='linear'),
probability=False, random_state=None, shrinking=True, tol=0.001,
verbose=False),
'estimator__svm__C': 1.0,
'estimator__svm__apply_to': ColumnTypes<types={'*'}; pattern=.*>,
'estimator__svm__break_ties': False,
'estimator__svm__cache_size': 200,
'estimator__svm__class_weight': None,
'estimator__svm__coef0': 0.0,
'estimator__svm__decision_function_shape': 'ovr',
'estimator__svm__degree': 3,
'estimator__svm__gamma': 'scale',
'estimator__svm__kernel': 'linear',
'estimator__svm__max_iter': -1,
'estimator__svm__model': SVC(kernel='linear'),
'estimator__svm__needed_types': None,
'estimator__svm__probability': False,
'estimator__svm__random_state': None,
'estimator__svm__shrinking': True,
'estimator__svm__tol': 0.001,
'estimator__svm__verbose': False,
'estimator__verbose': False,
'estimator__zscore': JuColumnTransformer(apply_to=ColumnTypes<types={'*'}; pattern=.*>, copy=True,
name='zscore', transformer=StandardScaler(), with_mean=True,
with_std=True),
'estimator__zscore__apply_to': ColumnTypes<types={'*'}; pattern=.*>,
'estimator__zscore__copy': True,
'estimator__zscore__name': 'zscore',
'estimator__zscore__needed_types': None,
'estimator__zscore__row_select_col_type': None,
'estimator__zscore__row_select_vals': None,
'estimator__zscore__transformer': StandardScaler(),
'estimator__zscore__with_mean': True,
'estimator__zscore__with_std': True,
'n_jobs': None,
'param_grid': {'svm__C': array([1.e-02, 1.e-01, 1.e+00, 1.e+01, 1.e+02])},
'pre_dispatch': '2*n_jobs',
'refit': True,
'return_train_score': False,
'scoring': None,
'verbose': 0}
This will print out a dictionary containing all the parameters of the final selected estimator. Similarly, we can also get an overview of the fitted parameters:
pprint(inspector.model.get_fitted_params())
{'set_column_types__column_mapper_': {'bill_depth_mm': 'bill_depth_mm__:type:__continuous',
'bill_length_mm': 'bill_length_mm__:type:__continuous',
'body_mass_g': 'body_mass_g__:type:__continuous',
'flipper_length_mm': 'flipper_length_mm__:type:__continuous'},
'set_column_types__feature_names_in_': Index(['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g'], dtype='object'),
'svm__model_': SVC(C=0.01, kernel='linear'),
'zscore__column_transformer_': ColumnTransformer(remainder='passthrough',
transformers=[('zscore', StandardScaler(),
<julearn.base.column_types.make_type_selector object at 0x7fd4866bea70>)],
verbose_feature_names_out=False),
'zscore__feature_names_in_': array(['bill_length_mm__:type:__continuous',
'bill_depth_mm__:type:__continuous',
'flipper_length_mm__:type:__continuous',
'body_mass_g__:type:__continuous'], dtype=object),
'zscore__mean_': array([ 42.70291971, 16.83613139, 202.17883212, 4318.06569343]),
'zscore__n_features_in_': 4,
'zscore__n_samples_seen_': 274,
'zscore__scale_': array([ 5.18607683, 2.00973207, 15.02045287, 834.40628575]),
'zscore__var_': array([2.68953929e+01, 4.03902299e+00, 2.25614004e+02, 6.96233850e+05])}
Again, this will print out quite a lot. What if we just want to look at a specific parameter? This somewhat depends on the underlying structure and attributes of the estimators or transformers used, and will likely require some interactive exploration. But the inspector makes it quite easy to explore your final model interactively. For example, to see which sample means were used to z-score the features in the final model, we can run:
print(inspector.model.get_fitted_params()["zscore__mean_"])
[ 42.70291971 16.83613139 202.17883212 4318.06569343]
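As a plausibility check (an addition to the example), these fitted means should coincide with the raw per-column means of the full dataset, because the final model is refit on all 274 samples after the search:

# the final model is refit on the full dataset, so the fitted means
# should match the raw column means (in the same feature order)
print(penguins_df[features].mean().to_numpy())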
In addition, it can sometimes be very useful to know what predictions were made in each individual train-test split of the cross-validation. This is where the .folds attribute comes in handy. This attribute has a .predict() method that makes it very easy to display the predictions made for each sample in each test fold and in each repeat of the cross-validation. It returns a DataFrame with each row corresponding to a sample and each column corresponding to a repeat of the cross-validation. Simply run:
fold_predictions = inspector.folds.predict()
print(fold_predictions)
repeat0_p0 repeat1_p0 repeat2_p0 ... repeat8_p0 repeat9_p0 target
0 0 0 0 ... 0 0 0
1 0 0 0 ... 0 0 0
2 0 0 0 ... 0 0 0
3 0 0 0 ... 0 0 0
4 0 0 0 ... 0 0 0
.. ... ... ... ... ... ... ...
269 1 1 1 ... 1 1 1
270 1 1 1 ... 1 1 1
271 1 1 1 ... 1 1 1
272 1 1 1 ... 1 1 1
273 1 1 1 ... 1 1 1
[274 rows x 11 columns]
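Because the target is included as the last column, we can, for instance, compute a per-repeat accuracy directly from this DataFrame. This is a small sketch based on the column layout shown above:

# compare each repeat's predictions against the target column
repeat_cols = [c for c in fold_predictions.columns if c != "target"]
for col in repeat_cols:
    acc = (fold_predictions[col] == fold_predictions["target"]).mean()
    print(f"{col}: accuracy = {acc:.3f}")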
The .folds attribute is actually an iterator that can iterate over every single fold used in the cross-validation. It yields an instance of a FoldsInspector, which can then be used to explore each model that was fitted during cross-validation. For example, we can collect the C parameters that were selected in each outer fold of our nested cross-validation. That way, we can assess how much that particular parameter varies across folds:
c_values = []
# iterate over the outer folds; each fold inspector exposes the fitted
# model, from which we read the C selected by the inner grid search
for fold_inspector in inspector.folds:
    fold_model = fold_inspector.model
    c_values.append(
        fold_model.get_fitted_params()["svm__model_"].get_params()["C"]
    )
By printing out the unique values in the c_values list, we realize that there was actually not much variance across models. In fact, only one parameter value was ever selected. This may indicate that this is in fact the optimal value, or it may point to a problem with our search grid: since 0.01 is the smallest value in the grid, the true optimum might lie below the range we searched.
print(set(c_values))
{0.01}
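Since 0.01 sits at the lower edge of the grid we searched, one way to rule out a grid problem would be to repeat the search with a grid extended below that value. A sketch, mirroring the pipeline setup above (the wider_creator name and extended grid are illustrative, not part of the original example):

# hypothetical wider grid reaching below the previously selected 0.01
wider_creator = PipelineCreator(problem_type="classification", apply_to="*")
wider_creator.add("zscore")
wider_creator.add("svm", kernel="linear", C=np.geomspace(1e-4, 1e2, 7))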
As you can see, the inspector provides a set of powerful tools to look at exactly what happened in your pipeline and its performance evaluation. It can help you interpret your models, understand your results, and identify problems if there are any. Model inspection is a valuable asset in the deployment of machine learning models, ensuring transparency, interpretability, and reliable decision-making. With julearn's model inspection capabilities, you can confidently navigate the complexities of machine learning models and harness their full potential in real-world applications.
Total running time of the script: (0 minutes 16.989 seconds)