.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/03_complex_models/run_generate_target.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_03_complex_models_run_generate_target.py: Target Generation ================= This example uses the ``iris`` dataset and tests a regression model in which the target variable is generated from some of the features within the cross-validation procedure. We will generate the target variable using PCA on the petal features. Then, we will evaluate whether a regression model can predict the generated target from the sepal features. .. include:: ../../links.inc .. GENERATED FROM PYTHON SOURCE LINES 13-21 .. code-block:: Python # Authors: Federico Raimondo # License: AGPL from seaborn import load_dataset from julearn import run_cross_validation from julearn.pipeline import PipelineCreator from julearn.utils import configure_logging .. GENERATED FROM PYTHON SOURCE LINES 22-23 Set the logging level to debug to see extra information. .. GENERATED FROM PYTHON SOURCE LINES 23-25 .. code-block:: Python configure_logging(level="DEBUG") .. rst-class:: sphx-glr-script-out .. code-block:: none 2026-05-29 20:46:12,581 - julearn - INFO - ===== Lib Versions ===== 2026-05-29 20:46:12,581 - julearn - INFO - numpy: 2.4.6 2026-05-29 20:46:12,581 - julearn - INFO - scipy: 1.17.1 2026-05-29 20:46:12,581 - julearn - INFO - sklearn: 1.8.0 2026-05-29 20:46:12,582 - julearn - INFO - pandas: 3.0.3 2026-05-29 20:46:12,582 - julearn - INFO - julearn: 0.3.5 2026-05-29 20:46:12,582 - julearn - INFO - ======================== .. GENERATED FROM PYTHON SOURCE LINES 26-29 .. code-block:: Python df_iris = load_dataset("iris") .. 
GENERATED FROM PYTHON SOURCE LINES 30-32 We pass all four measurements as features: the sepal features will serve as predictors and the petal features will be used to generate the target. Since the target is not a column in the data, we set ``y`` to the placeholder ``"__generated__"``. .. GENERATED FROM PYTHON SOURCE LINES 32-43 .. code-block:: Python X = ["sepal_length", "sepal_width", "petal_length", "petal_width"] y = "__generated__" # to indicate to julearn that the target will be generated # Define our feature types X_types = { "sepal": ["sepal_length", "sepal_width"], "petal": ["petal_length", "petal_width"], } .. GENERATED FROM PYTHON SOURCE LINES 44-47 We now use a ``PipelineCreator`` to create the pipeline that will generate the target. This special pipeline should be configured as a "transformer" and apply to the "petal" feature type. .. GENERATED FROM PYTHON SOURCE LINES 47-54 .. code-block:: Python target_creator = PipelineCreator(problem_type="transformer", apply_to="petal") target_creator.add("pca", n_components=2) # Select only the first component target_creator.add("pick_columns", keep="pca__pca0") .. rst-class:: sphx-glr-script-out .. code-block:: none 2026-05-29 20:46:12,585 - julearn - INFO - Adding step pca that applies to ColumnTypes 2026-05-29 20:46:12,585 - julearn - INFO - Setting hyperparameter n_components = 2 2026-05-29 20:46:12,586 - julearn - DEBUG - Getting estimator from string: pca 2026-05-29 20:46:12,586 - julearn - INFO - Step added 2026-05-29 20:46:12,586 - julearn - INFO - Adding step pick_columns that applies to ColumnTypes 2026-05-29 20:46:12,586 - julearn - INFO - Setting hyperparameter keep = pca__pca0 2026-05-29 20:46:12,586 - julearn - DEBUG - Getting estimator from string: pick_columns 2026-05-29 20:46:12,587 - julearn - INFO - Step added .. GENERATED FROM PYTHON SOURCE LINES 55-59 We now create the pipeline that will be used to predict the target. This will be a regression pipeline. The step immediately before the model should be ``generate_target``, applied to the "petal" features and using the ``target_creator`` pipeline as the transformer. .. 
GENERATED FROM PYTHON SOURCE LINES 59-64 .. code-block:: Python creator = PipelineCreator(problem_type="regression") creator.add("zscore", apply_to="*") creator.add("generate_target", apply_to="petal", transformer=target_creator) creator.add("linreg", apply_to="sepal") .. rst-class:: sphx-glr-script-out .. code-block:: none 2026-05-29 20:46:12,587 - julearn - INFO - Adding step zscore that applies to ColumnTypes 2026-05-29 20:46:12,587 - julearn - DEBUG - Getting estimator from string: zscore 2026-05-29 20:46:12,587 - julearn - INFO - Step added 2026-05-29 20:46:12,588 - julearn - INFO - Adding step generate_target that applies to ColumnTypes 2026-05-29 20:46:12,588 - julearn - INFO - Setting hyperparameter transformer = PipelineCreator: Step 0: pca estimator: PCA(n_components=2) apply to: ColumnTypes needed types: ColumnTypes tuning params: {} Step 1: pick_columns estimator: PickColumns(keep='pca__pca0') apply to: ColumnTypes needed types: ColumnTypes tuning params: {} 2026-05-29 20:46:12,589 - julearn - DEBUG - Special step is generate_target 2026-05-29 20:46:12,589 - julearn - INFO - Step added 2026-05-29 20:46:12,589 - julearn - INFO - Adding step linreg that applies to ColumnTypes 2026-05-29 20:46:12,589 - julearn - DEBUG - Getting estimator from string: linreg 2026-05-29 20:46:12,590 - julearn - INFO - Step added .. GENERATED FROM PYTHON SOURCE LINES 65-66 We finally evaluate the model within the cross validation. .. GENERATED FROM PYTHON SOURCE LINES 66-77 .. code-block:: Python scores, model = run_cross_validation( X=X, y=y, X_types=X_types, data=df_iris, model=creator, return_estimator="final", cv=2, ) print(scores["test_score"]) # type: ignore .. rst-class:: sphx-glr-script-out .. 
code-block:: none 2026-05-29 20:46:12,590 - julearn - INFO - ==== Input Data ==== 2026-05-29 20:46:12,591 - julearn - INFO - Using dataframe as input 2026-05-29 20:46:12,591 - julearn - INFO - Features: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'] 2026-05-29 20:46:12,591 - julearn - INFO - Target: __generated__ 2026-05-29 20:46:12,591 - julearn - INFO - Expanded features: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'] 2026-05-29 20:46:12,591 - julearn - INFO - X_types:{'sepal': ['sepal_length', 'sepal_width'], 'petal': ['petal_length', 'petal_width']} 2026-05-29 20:46:12,593 - julearn - INFO - Target will be generated 2026-05-29 20:46:12,593 - julearn - INFO - ==================== 2026-05-29 20:46:12,593 - julearn - INFO - 2026-05-29 20:46:12,593 - julearn - DEBUG - Generating pipeline from PipelineCreator or list of them 2026-05-29 20:46:12,593 - julearn - DEBUG - Creating pipeline 2026-05-29 20:46:12,593 - julearn - DEBUG - Ensuring target generator pipeline 2026-05-29 20:46:12,593 - julearn - DEBUG - Creating pipeline 2026-05-29 20:46:12,594 - julearn - DEBUG - Creating a pipeline with no model added 2026-05-29 20:46:12,594 - julearn - DEBUG - Adding transformer pca 2026-05-29 20:46:12,594 - julearn - DEBUG - Estimator: PCA(n_components=2) 2026-05-29 20:46:12,594 - julearn - DEBUG - Params to tune: {} 2026-05-29 20:46:12,594 - julearn - DEBUG - Adding transformer pick_columns 2026-05-29 20:46:12,595 - julearn - DEBUG - Estimator: PickColumns(keep='pca__pca0') 2026-05-29 20:46:12,595 - julearn - DEBUG - Params to tune: {} 2026-05-29 20:46:12,595 - julearn - INFO - = Model Parameters = 2026-05-29 20:46:12,595 - julearn - INFO - ==================== 2026-05-29 20:46:12,595 - julearn - INFO - 2026-05-29 20:46:12,595 - julearn - DEBUG - Pipeline created 2026-05-29 20:46:12,596 - julearn - DEBUG - Target generator pipeline created 2026-05-29 20:46:12,596 - julearn - DEBUG - Adding transformer zscore 2026-05-29 20:46:12,596 - julearn - 
DEBUG - Estimator: StandardScaler() 2026-05-29 20:46:12,596 - julearn - DEBUG - Params to tune: {} 2026-05-29 20:46:12,596 - julearn - DEBUG - Adding model linreg 2026-05-29 20:46:12,597 - julearn - DEBUG - Wrapping linreg 2026-05-29 20:46:12,597 - julearn - DEBUG - Estimator: WrapModel(apply_to=ColumnTypes, copy_X=True, fit_intercept=True, model=LinearRegression(), n_jobs=None, positive=False, tol=1e-06) 2026-05-29 20:46:12,598 - julearn - DEBUG - Looking for nested pipeline creators 2026-05-29 20:46:12,598 - julearn - DEBUG - Params to tune: {} 2026-05-29 20:46:12,598 - julearn - DEBUG - Wrapping target model linreg as target_generate 2026-05-29 20:46:12,598 - julearn - INFO - = Model Parameters = 2026-05-29 20:46:12,598 - julearn - INFO - ==================== 2026-05-29 20:46:12,598 - julearn - INFO - 2026-05-29 20:46:12,598 - julearn - DEBUG - Pipeline created 2026-05-29 20:46:12,599 - julearn - DEBUG - Pipeline has target generator 2026-05-29 20:46:12,599 - julearn - INFO - = Data Information = 2026-05-29 20:46:12,599 - julearn - INFO - Problem type: regression 2026-05-29 20:46:12,599 - julearn - INFO - Number of samples: 150 2026-05-29 20:46:12,599 - julearn - INFO - Number of features: 4 2026-05-29 20:46:12,599 - julearn - INFO - ==================== 2026-05-29 20:46:12,599 - julearn - INFO - 2026-05-29 20:46:12,600 - julearn - INFO - Target type: float64 2026-05-29 20:46:12,600 - julearn - INFO - Using outer CV scheme KFold(n_splits=2, random_state=None, shuffle=False) (incl. 
final model) 2026-05-29 20:46:12,611 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='str') 2026-05-29 20:46:12,612 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'} 2026-05-29 20:46:12,617 - julearn - DEBUG - Fitting the target generator 2026-05-29 20:46:12,618 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='str') 2026-05-29 20:46:12,618 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'} 2026-05-29 20:46:12,632 - julearn - DEBUG - Generating target 2026-05-29 20:46:12,641 - julearn - DEBUG - Picking columns: ['pca__pca0'] 2026-05-29 20:46:12,642 - julearn - DEBUG - Target generated: pca__pca0 2026-05-29 20:46:12,643 - julearn - DEBUG - Fitting model from generated target /private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py:927: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. 
Details: Traceback (most recent call last): File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 916, in _score scores = scorer(estimator, X_test, y_test, **score_params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/metrics/_scorer.py", line 485, in __call__ return estimator.score(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/pipeline.py", line 1138, in score routed_params = process_routing( ^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/utils/_metadata_requests.py", line 1643, in process_routing request_routing.validate_metadata(params=kwargs, method=_method) File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/utils/_metadata_requests.py", line 1139, in validate_metadata raise TypeError( TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object. 
warnings.warn( 2026-05-29 20:46:12,658 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='str') 2026-05-29 20:46:12,658 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'} 2026-05-29 20:46:12,664 - julearn - DEBUG - Fitting the target generator 2026-05-29 20:46:12,665 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='str') 2026-05-29 20:46:12,665 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'} 2026-05-29 20:46:12,679 - julearn - DEBUG - Generating target 2026-05-29 20:46:12,684 - julearn - DEBUG - Picking columns: ['pca__pca0'] 2026-05-29 20:46:12,685 - julearn - DEBUG - Target generated: pca__pca0 2026-05-29 20:46:12,686 - julearn - DEBUG - Fitting model from generated target /private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py:927: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. 
Details: Traceback (most recent call last): File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 916, in _score scores = scorer(estimator, X_test, y_test, **score_params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/metrics/_scorer.py", line 485, in __call__ return estimator.score(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/pipeline.py", line 1138, in score routed_params = process_routing( ^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/utils/_metadata_requests.py", line 1643, in process_routing request_routing.validate_metadata(params=kwargs, method=_method) File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/utils/_metadata_requests.py", line 1139, in validate_metadata raise TypeError( TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object. 
warnings.warn( 2026-05-29 20:46:12,701 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='str') 2026-05-29 20:46:12,701 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'} 2026-05-29 20:46:12,707 - julearn - DEBUG - Fitting the target generator 2026-05-29 20:46:12,708 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='str') 2026-05-29 20:46:12,708 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'} 2026-05-29 20:46:12,722 - julearn - DEBUG - Generating target 2026-05-29 20:46:12,728 - julearn - DEBUG - Picking columns: ['pca__pca0'] 2026-05-29 20:46:12,729 - julearn - DEBUG - Target generated: pca__pca0 2026-05-29 20:46:12,729 - julearn - DEBUG - Fitting model from generated target /private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py:927: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. 
Details: Traceback (most recent call last): File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 916, in _score scores = scorer(estimator, X_test, y_test, **score_params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/metrics/_scorer.py", line 485, in __call__ return estimator.score(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/pipeline.py", line 1138, in score routed_params = process_routing( ^^^^^^^^^^^^^^^^ File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/utils/_metadata_requests.py", line 1643, in process_routing request_routing.validate_metadata(params=kwargs, method=_method) File "/private/var/folders/09/t22x2_p106j7p24khr0jdxrw0000gn/T/tmpyvhr0tue/.venv/lib/python3.11/site-packages/sklearn/utils/_metadata_requests.py", line 1139, in validate_metadata raise TypeError( TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object. warnings.warn( 0 NaN 1 NaN Name: test_score, dtype: float64 .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 0.173 seconds) .. _sphx_glr_download_auto_examples_03_complex_models_run_generate_target.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: run_generate_target.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: run_generate_target.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: run_generate_target.zip ` .. only:: html .. 
rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_
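
The idea of generating the target inside each cross-validation fold can be sketched with plain scikit-learn. This is a minimal illustration under stated assumptions, not julearn's implementation: it loads Iris through ``sklearn.datasets.load_iris`` instead of seaborn, shuffles the folds with a fixed seed (the example above uses an unshuffled 2-fold split), and scores each fold with the default R² of ``LinearRegression.score``. The scalers and the PCA are fitted on the training fold only, so the generated target for the test fold does not leak information into training.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler

# Column order in load_iris().data:
# sepal length, sepal width, petal length, petal width.
data = load_iris().data
sepal, petal = data[:, :2], data[:, 2:]

# Assumption: shuffled folds with a fixed seed, unlike the example above.
cv = KFold(n_splits=2, shuffle=True, random_state=0)

fold_scores = []
for train_idx, test_idx in cv.split(data):
    # Fit the target generator (z-score + PCA) on the training petals only.
    petal_scaler = StandardScaler().fit(petal[train_idx])
    pca = PCA(n_components=2).fit(petal_scaler.transform(petal[train_idx]))

    # The generated target is the first principal component.
    y_train = pca.transform(petal_scaler.transform(petal[train_idx]))[:, 0]
    y_test = pca.transform(petal_scaler.transform(petal[test_idx]))[:, 0]

    # Predict the generated target from the z-scored sepal features.
    sepal_scaler = StandardScaler().fit(sepal[train_idx])
    model = LinearRegression().fit(
        sepal_scaler.transform(sepal[train_idx]), y_train
    )
    fold_scores.append(
        model.score(sepal_scaler.transform(sepal[test_idx]), y_test)
    )

print(fold_scores)
```

Each entry of ``fold_scores`` is the R² on one held-out fold, which plays the role of ``scores["test_score"]`` in the julearn call.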