Custom Scoring Function for Regression

This example uses the ‘diabetes’ data from the scikit-learn datasets and performs a regression analysis with a ridge regression model. For scoring, it uses metrics provided by scikit-learn and julearn, plus a custom metric defined by the user.

# Authors: Shammi More <s.more@fz-juelich.de>
#          Federico Raimondo <f.raimondo@fz-juelich.de>
#
# License: AGPL

import pandas as pd
import scipy.stats
from sklearn.datasets import load_diabetes

from sklearn.metrics import make_scorer
from julearn.scoring import register_scorer

from julearn import run_cross_validation
from julearn.utils import configure_logging

Set the logging level to INFO to see extra information.

configure_logging(level="INFO")
2026-05-29 20:46:34,163 - julearn - INFO - ===== Lib Versions =====
2026-05-29 20:46:34,164 - julearn - INFO - numpy: 1.25.2
2026-05-29 20:46:34,164 - julearn - INFO - scipy: 1.16.3
2026-05-29 20:46:34,164 - julearn - INFO - sklearn: 1.3.0
2026-05-29 20:46:34,164 - julearn - INFO - pandas: 2.0.3
2026-05-29 20:46:34,164 - julearn - INFO - julearn: 0.3.1.dev0
2026-05-29 20:46:34,164 - julearn - INFO - ========================

Load the diabetes data from sklearn as a pandas dataframe.

features, target = load_diabetes(return_X_y=True, as_frame=True)

The dataset contains ten variables (age, sex, body mass index, average blood pressure, and six blood serum measurements, s1-s6) for 442 diabetes patients, together with a quantitative measure of disease progression one year after baseline, which is the target we want to predict.

print("Features: \n", features.head())  # type: ignore
print("Target: \n", target.describe())  # type: ignore
Features:
         age       sex       bmi        bp        s1        s2        s3        s4        s5        s6
0  0.038076  0.050680  0.061696  0.021872 -0.044223 -0.034821 -0.043401 -0.002592  0.019907 -0.017646
1 -0.001882 -0.044642 -0.051474 -0.026328 -0.008449 -0.019163  0.074412 -0.039493 -0.068332 -0.092204
2  0.085299  0.050680  0.044451 -0.005670 -0.045599 -0.034194 -0.032356 -0.002592  0.002861 -0.025930
3 -0.089063 -0.044642 -0.011595 -0.036656  0.012191  0.024991 -0.036038  0.034309  0.022688 -0.009362
4  0.005383 -0.044642 -0.036385  0.021872  0.003935  0.015596  0.008142 -0.002592 -0.031988 -0.046641
Target:
 count    442.000000
mean     152.133484
std       77.093005
min       25.000000
25%       87.000000
50%      140.500000
75%      211.500000
max      346.000000
Name: target, dtype: float64

Let’s combine the features and the target in one dataframe, and define X and y as the names of the feature columns and the target column.

data_diabetes = pd.concat([features, target], axis=1)  # type: ignore

X = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]
y = "target"

Train a ridge regression model with cross-validation and use the (negated) mean absolute error for scoring.

scores, model = run_cross_validation(
    X=X,
    y=y,
    data=data_diabetes,
    preprocess="zscore",
    problem_type="regression",
    model="ridge",
    return_estimator="final",
    scoring="neg_mean_absolute_error",
)
2026-05-29 20:46:34,184 - julearn - INFO - ==== Input Data ====
2026-05-29 20:46:34,185 - julearn - INFO - Using dataframe as input
2026-05-29 20:46:34,185 - julearn - INFO -      Features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
2026-05-29 20:46:34,185 - julearn - INFO -      Target: target
2026-05-29 20:46:34,185 - julearn - INFO -      Expanded features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
2026-05-29 20:46:34,185 - julearn - INFO -      X_types:{}
2026-05-29 20:46:34,186 - julearn - WARNING - The following columns are not defined in X_types: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']. They will be treated as continuous.
/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmp4590_ea2/julearn/utils/logging.py:238: RuntimeWarning: The following columns are not defined in X_types: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']. They will be treated as continuous.
  warn(msg, category=category)
2026-05-29 20:46:34,187 - julearn - INFO - ====================
2026-05-29 20:46:34,187 - julearn - INFO -
2026-05-29 20:46:34,187 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:34,187 - julearn - INFO - Step added
2026-05-29 20:46:34,187 - julearn - INFO - Adding step ridge that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:34,187 - julearn - INFO - Step added
2026-05-29 20:46:34,188 - julearn - INFO - = Model Parameters =
2026-05-29 20:46:34,188 - julearn - INFO - ====================
2026-05-29 20:46:34,188 - julearn - INFO -
2026-05-29 20:46:34,188 - julearn - INFO - = Data Information =
2026-05-29 20:46:34,188 - julearn - INFO -      Problem type: regression
2026-05-29 20:46:34,188 - julearn - INFO -      Number of samples: 442
2026-05-29 20:46:34,188 - julearn - INFO -      Number of features: 10
2026-05-29 20:46:34,189 - julearn - INFO - ====================
2026-05-29 20:46:34,189 - julearn - INFO -
2026-05-29 20:46:34,189 - julearn - INFO -      Target type: float64
2026-05-29 20:46:34,189 - julearn - INFO - Using outer CV scheme KFold(n_splits=5, random_state=None, shuffle=False)

The scores dataframe has all the values for each CV split.

print(scores.head())
   fit_time  score_time  test_score  n_train  n_test  repeat  fold                          cv_mdsum
0  0.009189    0.004312  -43.104359      353      89       0     0  b10eef89b4192178d482d7a1587a248a
1  0.008167    0.003268  -44.861364      353      89       0     1  b10eef89b4192178d482d7a1587a248a
2  0.007646    0.003907  -47.981407      354      88       0     2  b10eef89b4192178d482d7a1587a248a
3  0.006877    0.002668  -42.956254      354      88       0     3  b10eef89b4192178d482d7a1587a248a
4  0.007527    0.004122  -42.419886      354      88       0     4  b10eef89b4192178d482d7a1587a248a
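
The n_train and n_test columns follow from how an unshuffled 5-fold split divides 442 samples. A minimal sketch in plain Python, assuming the scikit-learn KFold rule that the first n_samples % n_splits folds receive one extra test sample (the helper name fold_sizes is hypothetical, not part of julearn or scikit-learn):

```python
def fold_sizes(n_samples: int, n_splits: int) -> list[tuple[int, int]]:
    """Return (n_train, n_test) per fold for an unshuffled K-fold split."""
    base, extra = divmod(n_samples, n_splits)
    sizes = []
    for fold in range(n_splits):
        # The first `extra` folds get one additional test sample.
        n_test = base + 1 if fold < extra else base
        sizes.append((n_samples - n_test, n_test))
    return sizes


for fold, (n_train, n_test) in enumerate(fold_sizes(442, 5)):
    print(fold, n_train, n_test)
```

The sizes it produces agree with the n_train and n_test values in the table above.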

Mean absolute error averaged across the CV folds. The scorer returns the negated error, so we multiply by -1 to recover the error itself.

print(scores["test_score"].mean() * -1)  # type: ignore
44.264653948271885
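
The sign flip exists because scikit-learn scorers follow a "greater is better" convention, so error metrics are exposed in negated form. A small sketch of that convention in plain Python, using made-up numbers rather than the diabetes data (both helper functions are hand-written stand-ins, not the library implementations):

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def neg_mean_absolute_error(y_true, y_pred):
    """Same metric with the sign flipped, so larger values mean a better fit."""
    return -mean_absolute_error(y_true, y_pred)


# Illustrative values only.
y_true = [100.0, 150.0, 200.0]
good_pred = [110.0, 140.0, 205.0]  # small errors
bad_pred = [160.0, 90.0, 120.0]    # large errors

print(neg_mean_absolute_error(y_true, good_pred))
print(neg_mean_absolute_error(y_true, bad_pred))
```

With the negated form, the better predictions get the higher score, which lets model selection code always maximize.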

Now do the same thing, but use both the mean absolute error and the squared Pearson product-moment correlation coefficient as scoring functions.

scores, model = run_cross_validation(
    X=X,
    y=y,
    data=data_diabetes,
    preprocess="zscore",
    problem_type="regression",
    model="ridge",
    return_estimator="final",
    scoring=["neg_mean_absolute_error", "r2_corr"],
)
2026-05-29 20:46:34,272 - julearn - INFO - ==== Input Data ====
2026-05-29 20:46:34,272 - julearn - INFO - Using dataframe as input
2026-05-29 20:46:34,272 - julearn - INFO -      Features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
2026-05-29 20:46:34,272 - julearn - INFO -      Target: target
2026-05-29 20:46:34,272 - julearn - INFO -      Expanded features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
2026-05-29 20:46:34,272 - julearn - INFO -      X_types:{}
2026-05-29 20:46:34,273 - julearn - WARNING - The following columns are not defined in X_types: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']. They will be treated as continuous.
/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmp4590_ea2/julearn/utils/logging.py:238: RuntimeWarning: The following columns are not defined in X_types: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']. They will be treated as continuous.
  warn(msg, category=category)
2026-05-29 20:46:34,274 - julearn - INFO - ====================
2026-05-29 20:46:34,274 - julearn - INFO -
2026-05-29 20:46:34,274 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:34,274 - julearn - INFO - Step added
2026-05-29 20:46:34,274 - julearn - INFO - Adding step ridge that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:34,275 - julearn - INFO - Step added
2026-05-29 20:46:34,275 - julearn - INFO - = Model Parameters =
2026-05-29 20:46:34,275 - julearn - INFO - ====================
2026-05-29 20:46:34,275 - julearn - INFO -
2026-05-29 20:46:34,275 - julearn - INFO - = Data Information =
2026-05-29 20:46:34,276 - julearn - INFO -      Problem type: regression
2026-05-29 20:46:34,276 - julearn - INFO -      Number of samples: 442
2026-05-29 20:46:34,276 - julearn - INFO -      Number of features: 10
2026-05-29 20:46:34,276 - julearn - INFO - ====================
2026-05-29 20:46:34,276 - julearn - INFO -
2026-05-29 20:46:34,276 - julearn - INFO -      Target type: float64
2026-05-29 20:46:34,276 - julearn - INFO - Using outer CV scheme KFold(n_splits=5, random_state=None, shuffle=False)

The scores dataframe still has one row per CV split, but the two scores now appear under the column names ‘test_neg_mean_absolute_error’ and ‘test_r2_corr’.

print(scores[["test_neg_mean_absolute_error", "test_r2_corr"]].mean())
test_neg_mean_absolute_error   -44.264654
test_r2_corr                     0.486498
dtype: float64
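
Since r2_corr is the squared Pearson correlation, it is not the same quantity as the coefficient of determination that scikit-learn reports under the name "r2": the squared correlation is unaffected by a constant offset in the predictions, while the coefficient of determination penalizes it. A sketch with made-up numbers and hand-written helpers (the function names are hypothetical):

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def r2_determination(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values: predictions are perfectly correlated but biased by +10.
y_true = [1.0, 2.0, 3.0, 4.0]
y_shifted = [t + 10.0 for t in y_true]

print(pearson_r(y_true, y_shifted) ** 2)    # squared correlation
print(r2_determination(y_true, y_shifted))  # coefficient of determination
```

The squared correlation stays at its maximum for the biased predictions, whereas the coefficient of determination becomes strongly negative, so the two scores answer different questions.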

If we want to define a custom scoring metric, we need to define a function that takes the actual and the predicted values as input (in that order) and returns a value. In this case, we want to compute the Pearson correlation coefficient (r).

def pearson_scorer(y_true, y_pred):
    return scipy.stats.pearsonr(  # type: ignore
        y_true.squeeze(), y_pred.squeeze()
    )[0]

Before using it, we need to convert it to a sklearn scorer and register it with julearn.

register_scorer(scorer_name="pearsonr", scorer=make_scorer(pearson_scorer))
2026-05-29 20:46:34,351 - julearn - INFO - registering scorer named pearsonr
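
To see what the conversion step does, the sketch below wraps a Pearson metric with make_scorer and calls the result directly. It uses only scikit-learn, SciPy and NumPy; the synthetic data and the names pearson_sklearn_scorer, X_demo, y_demo and model_demo are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer


def pearson_scorer(y_true, y_pred):
    """Pearson r between true and predicted values."""
    return pearsonr(np.ravel(y_true), np.ravel(y_pred))[0]


# Correlation is a "higher is better" metric, so the default
# greater_is_better=True applies; a loss would pass greater_is_better=False.
pearson_sklearn_scorer = make_scorer(pearson_scorer)

# Synthetic, noise-free linear data (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5])

model_demo = Ridge().fit(X_demo, y_demo)
# A scorer is called with (estimator, X, y), not with (y_true, y_pred):
# it runs the prediction itself and then applies the metric.
r_value = pearson_sklearn_scorer(model_demo, X_demo, y_demo)
print(r_value)
```

The metric function only ever sees targets and predictions, while the scorer object produced by make_scorer takes care of calling the fitted model.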

Now we can use it as another scoring metric.

scores, model = run_cross_validation(
    X=X,
    y=y,
    data=data_diabetes,
    preprocess="zscore",
    problem_type="regression",
    model="ridge",
    return_estimator="final",
    scoring=["neg_mean_absolute_error", "r2_corr", "pearsonr"],
)
2026-05-29 20:46:34,351 - julearn - INFO - ==== Input Data ====
2026-05-29 20:46:34,352 - julearn - INFO - Using dataframe as input
2026-05-29 20:46:34,352 - julearn - INFO -      Features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
2026-05-29 20:46:34,352 - julearn - INFO -      Target: target
2026-05-29 20:46:34,352 - julearn - INFO -      Expanded features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
2026-05-29 20:46:34,352 - julearn - INFO -      X_types:{}
2026-05-29 20:46:34,353 - julearn - WARNING - The following columns are not defined in X_types: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']. They will be treated as continuous.
/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmp4590_ea2/julearn/utils/logging.py:238: RuntimeWarning: The following columns are not defined in X_types: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']. They will be treated as continuous.
  warn(msg, category=category)
2026-05-29 20:46:34,354 - julearn - INFO - ====================
2026-05-29 20:46:34,354 - julearn - INFO -
2026-05-29 20:46:34,354 - julearn - INFO - Adding step zscore that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:34,354 - julearn - INFO - Step added
2026-05-29 20:46:34,354 - julearn - INFO - Adding step ridge that applies to ColumnTypes<types={'continuous'}; pattern=(?:__:type:__continuous)>
2026-05-29 20:46:34,355 - julearn - INFO - Step added
2026-05-29 20:46:34,355 - julearn - INFO - = Model Parameters =
2026-05-29 20:46:34,355 - julearn - INFO - ====================
2026-05-29 20:46:34,355 - julearn - INFO -
2026-05-29 20:46:34,356 - julearn - INFO - = Data Information =
2026-05-29 20:46:34,356 - julearn - INFO -      Problem type: regression
2026-05-29 20:46:34,356 - julearn - INFO -      Number of samples: 442
2026-05-29 20:46:34,356 - julearn - INFO -      Number of features: 10
2026-05-29 20:46:34,356 - julearn - INFO - ====================
2026-05-29 20:46:34,356 - julearn - INFO -
2026-05-29 20:46:34,356 - julearn - INFO -      Target type: float64
2026-05-29 20:46:34,356 - julearn - INFO - Using outer CV scheme KFold(n_splits=5, random_state=None, shuffle=False)
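
By the same naming pattern as before, the custom score would be expected under a ‘test_pearsonr’ column of the scores dataframe. As a rough cross-check that does not depend on julearn, the sketch below runs a comparable multi-metric cross-validation in plain scikit-learn. It assumes that a StandardScaler followed by Ridge with default settings approximates the "zscore" and "ridge" steps; it is not claimed to reproduce the julearn results exactly, and the variable names are illustrative.

```python
from scipy.stats import pearsonr
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def pearson_scorer(y_true, y_pred):
    """Pearson r between true and predicted values."""
    return pearsonr(y_true, y_pred)[0]


X_arr, y_arr = load_diabetes(return_X_y=True)

# Standardize the features, then fit ridge regression (scikit-learn defaults).
pipeline = make_pipeline(StandardScaler(), Ridge())

cv_results = cross_validate(
    pipeline,
    X_arr,
    y_arr,
    cv=KFold(n_splits=5),
    # The dict keys become the suffix of the "test_<name>" result entries.
    scoring={
        "neg_mean_absolute_error": "neg_mean_absolute_error",
        "pearsonr": make_scorer(pearson_scorer),
    },
)

for name in ("test_neg_mean_absolute_error", "test_pearsonr"):
    print(name, cv_results[name].mean())
```

Each entry of cv_results holds one value per fold, so taking the mean summarizes a metric across the cross-validation.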
