
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/03_complex_models/run_generate_target.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_03_complex_models_run_generate_target.py:

Target Generation
=================

This example uses the ``iris`` dataset to test a regression model whose
target variable is generated from a subset of the features within the
cross-validation procedure.

We generate the target by applying PCA to the petal features and then
evaluate whether a regression model can predict it from the sepal features.

.. include:: ../../links.inc

.. GENERATED FROM PYTHON SOURCE LINES 13-21

.. code-block:: Python

    # Authors: Federico Raimondo
    # License: AGPL

    from seaborn import load_dataset

    from julearn import run_cross_validation
    from julearn.pipeline import PipelineCreator
    from julearn.utils import configure_logging

.. GENERATED FROM PYTHON SOURCE LINES 22-23

Set the logging level to debug to see extra information.

.. GENERATED FROM PYTHON SOURCE LINES 23-25

.. code-block:: Python

    configure_logging(level="DEBUG")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    2026-03-31 12:29:04,011 - julearn - INFO - ===== Lib Versions =====
    2026-03-31 12:29:04,012 - julearn - INFO - numpy: 2.4.4
    2026-03-31 12:29:04,012 - julearn - INFO - scipy: 1.17.1
    2026-03-31 12:29:04,012 - julearn - INFO - sklearn: 1.8.0
    2026-03-31 12:29:04,012 - julearn - INFO - pandas: 3.0.2
    2026-03-31 12:29:04,012 - julearn - INFO - julearn: 0.3.6.dev15
    2026-03-31 12:29:04,012 - julearn - INFO - ========================

.. GENERATED FROM PYTHON SOURCE LINES 26-29

.. code-block:: Python

    df_iris = load_dataset("iris")

.. GENERATED FROM PYTHON SOURCE LINES 30-32

We pass all four sepal and petal measurements as columns. The sepal columns
are the predictors, while the petal columns are used only to generate the
target, so ``y`` is set to the placeholder ``"__generated__"``.

.. GENERATED FROM PYTHON SOURCE LINES 32-43

.. code-block:: Python

    X = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
    y = "__generated__"  # to indicate to julearn that the target will be generated

    # Define our feature types
    X_types = {
        "sepal": ["sepal_length", "sepal_width"],
        "petal": ["petal_length", "petal_width"],
    }

.. GENERATED FROM PYTHON SOURCE LINES 44-47

We now use a ``PipelineCreator`` to create the pipeline that will generate the
target. This special pipeline must be configured as a "transformer" and
apply to the "petal" feature type.

.. GENERATED FROM PYTHON SOURCE LINES 47-54

.. code-block:: Python

    target_creator = PipelineCreator(problem_type="transformer", apply_to="petal")
    target_creator.add("pca", n_components=2)
    # Select only the first component
    target_creator.add("pick_columns", keep="pca__pca0")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    2026-03-31 12:29:04,014 - julearn - INFO - Adding step pca that applies to ColumnTypes
    2026-03-31 12:29:04,014 - julearn - INFO - Setting hyperparameter n_components = 2
    2026-03-31 12:29:04,014 - julearn - DEBUG - Getting estimator from string: pca
    2026-03-31 12:29:04,014 - julearn - INFO - Step added
    2026-03-31 12:29:04,015 - julearn - INFO - Adding step pick_columns that applies to ColumnTypes
    2026-03-31 12:29:04,015 - julearn - INFO - Setting hyperparameter keep = pca__pca0
    2026-03-31 12:29:04,015 - julearn - DEBUG - Getting estimator from string: pick_columns
    2026-03-31 12:29:04,015 - julearn - INFO - Step added

.. GENERATED FROM PYTHON SOURCE LINES 55-59

We now create the pipeline that will be used to predict the target. This
is a regression pipeline. The step immediately before the model must be
``generate_target``, applied to the "petal" features and using the
``target_creator`` pipeline as the transformer.

.. GENERATED FROM PYTHON SOURCE LINES 59-64

.. code-block:: Python

    creator = PipelineCreator(problem_type="regression")
    creator.add("zscore", apply_to="*")
    creator.add("generate_target", apply_to="petal", transformer=target_creator)
    creator.add("linreg", apply_to="sepal")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    2026-03-31 12:29:04,016 - julearn - INFO - Adding step zscore that applies to ColumnTypes
    2026-03-31 12:29:04,016 - julearn - DEBUG - Getting estimator from string: zscore
    2026-03-31 12:29:04,016 - julearn - INFO - Step added
    2026-03-31 12:29:04,016 - julearn - INFO - Adding step generate_target that applies to ColumnTypes
    2026-03-31 12:29:04,016 - julearn - INFO - Setting hyperparameter transformer = PipelineCreator:
      Step 0: pca
        estimator: PCA(n_components=2)
        apply to: ColumnTypes
        needed types: ColumnTypes
        tuning params: {}
      Step 1: pick_columns
        estimator: PickColumns(keep='pca__pca0')
        apply to: ColumnTypes
        needed types: ColumnTypes
        tuning params: {}

    2026-03-31 12:29:04,017 - julearn - DEBUG - Special step is generate_target
    2026-03-31 12:29:04,017 - julearn - INFO - Step added
    2026-03-31 12:29:04,017 - julearn - INFO - Adding step linreg that applies to ColumnTypes
    2026-03-31 12:29:04,017 - julearn - DEBUG - Getting estimator from string: linreg
    2026-03-31 12:29:04,018 - julearn - INFO - Step added

.. GENERATED FROM PYTHON SOURCE LINES 65-66

Finally, we evaluate the model within the cross-validation. Note that in the
run recorded below, scikit-learn fails to score each partition
(``Pipeline.score`` rejects the unrouted ``sample_weight`` argument), so the
printed test scores are ``NaN``.

.. GENERATED FROM PYTHON SOURCE LINES 66-77

.. code-block:: Python

    scores, model = run_cross_validation(
        X=X,
        y=y,
        X_types=X_types,
        data=df_iris,
        model=creator,
        return_estimator="final",
        cv=2,
    )

    print(scores["test_score"])  # type: ignore

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    2026-03-31 12:29:04,018 - julearn - INFO - ==== Input Data ====
    2026-03-31 12:29:04,018 - julearn - INFO - Using dataframe as input
    2026-03-31 12:29:04,018 - julearn - INFO - Features: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
    2026-03-31 12:29:04,018 - julearn - INFO - Target: __generated__
    2026-03-31 12:29:04,019 - julearn - INFO - Expanded features: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
    2026-03-31 12:29:04,019 - julearn - INFO - X_types:{'sepal': ['sepal_length', 'sepal_width'], 'petal': ['petal_length', 'petal_width']}
    2026-03-31 12:29:04,020 - julearn - INFO - Target will be generated
    2026-03-31 12:29:04,020 - julearn - INFO - ====================
    2026-03-31 12:29:04,020 - julearn - INFO -
    2026-03-31 12:29:04,020 - julearn - DEBUG - Generating pipeline from PipelineCreator or list of them
    2026-03-31 12:29:04,020 - julearn - DEBUG - Creating pipeline
    2026-03-31 12:29:04,020 - julearn - DEBUG - Ensuring target generator pipeline
    2026-03-31 12:29:04,020 - julearn - DEBUG - Creating pipeline
    2026-03-31 12:29:04,020 - julearn - DEBUG - Creating a pipeline with no model added
    2026-03-31 12:29:04,020 - julearn - DEBUG - Adding transformer pca
    2026-03-31 12:29:04,021 - julearn - DEBUG - Estimator: PCA(n_components=2)
    2026-03-31 12:29:04,021 - julearn - DEBUG - Params to tune: {}
    2026-03-31 12:29:04,021 - julearn - DEBUG - Adding transformer pick_columns
    2026-03-31 12:29:04,021 - julearn - DEBUG - Estimator: PickColumns(keep='pca__pca0')
    2026-03-31 12:29:04,022 - julearn - DEBUG - Params to tune: {}
    2026-03-31 12:29:04,022 - julearn - INFO - = Model Parameters =
    2026-03-31 12:29:04,022 - julearn - INFO - ====================
    2026-03-31 12:29:04,022 - julearn - INFO -
    2026-03-31 12:29:04,022 - julearn - DEBUG - Pipeline created
    2026-03-31 12:29:04,022 - julearn - DEBUG - Target generator pipeline created
    2026-03-31 12:29:04,022 - julearn - DEBUG - Adding transformer zscore
    2026-03-31 12:29:04,022 - julearn - DEBUG - Estimator: StandardScaler()
    2026-03-31 12:29:04,023 - julearn - DEBUG - Params to tune: {}
    2026-03-31 12:29:04,023 - julearn - DEBUG - Adding model linreg
    2026-03-31 12:29:04,023 - julearn - DEBUG - Wrapping linreg
    2026-03-31 12:29:04,023 - julearn - DEBUG - Estimator: WrapModel(apply_to=ColumnTypes, copy_X=True, fit_intercept=True, model=LinearRegression(), n_jobs=None, positive=False, tol=1e-06)
    2026-03-31 12:29:04,024 - julearn - DEBUG - Looking for nested pipeline creators
    2026-03-31 12:29:04,024 - julearn - DEBUG - Params to tune: {}
    2026-03-31 12:29:04,024 - julearn - DEBUG - Wrapping target model linreg as target_generate
    2026-03-31 12:29:04,024 - julearn - INFO - = Model Parameters =
    2026-03-31 12:29:04,024 - julearn - INFO - ====================
    2026-03-31 12:29:04,024 - julearn - INFO -
    2026-03-31 12:29:04,025 - julearn - DEBUG - Pipeline created
    2026-03-31 12:29:04,025 - julearn - DEBUG - Pipeline has target generator
    2026-03-31 12:29:04,025 - julearn - INFO - = Data Information =
    2026-03-31 12:29:04,025 - julearn - INFO - Problem type: regression
    2026-03-31 12:29:04,025 - julearn - INFO - Number of samples: 150
    2026-03-31 12:29:04,025 - julearn - INFO - Number of features: 4
    2026-03-31 12:29:04,025 - julearn - INFO - ====================
    2026-03-31 12:29:04,025 - julearn - INFO -
    2026-03-31 12:29:04,025 - julearn - INFO - Target type: float64
    2026-03-31 12:29:04,025 - julearn - INFO - Using outer CV scheme KFold(n_splits=2, random_state=None, shuffle=False) (incl. final model)
    2026-03-31 12:29:04,033 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='str')
    2026-03-31 12:29:04,033 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-03-31 12:29:04,037 - julearn - DEBUG - Fitting the target generator
    2026-03-31 12:29:04,038 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='str')
    2026-03-31 12:29:04,038 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-03-31 12:29:04,047 - julearn - DEBUG - Generating target
    2026-03-31 12:29:04,050 - julearn - DEBUG - Picking columns: ['pca__pca0']
    2026-03-31 12:29:04,050 - julearn - DEBUG - Target generated: pca__pca0
    2026-03-31 12:29:04,051 - julearn - DEBUG - Fitting model from generated target
    /opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py:927: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py", line 916, in _score
        scores = scorer(estimator, X_test, y_test, **score_params)
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/metrics/_scorer.py", line 485, in __call__
        return estimator.score(*args, **kwargs)
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/pipeline.py", line 1138, in score
        routed_params = process_routing(
            self, "score", sample_weight=sample_weight, **params
        )
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1643, in process_routing
        request_routing.validate_metadata(params=kwargs, method=_method)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1139, in validate_metadata
        raise TypeError(
        ...<2 lines>...
        )
    TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object.

      warnings.warn(
    2026-03-31 12:29:04,104 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='str')
    2026-03-31 12:29:04,105 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-03-31 12:29:04,109 - julearn - DEBUG - Fitting the target generator
    2026-03-31 12:29:04,109 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='str')
    2026-03-31 12:29:04,109 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-03-31 12:29:04,118 - julearn - DEBUG - Generating target
    2026-03-31 12:29:04,121 - julearn - DEBUG - Picking columns: ['pca__pca0']
    2026-03-31 12:29:04,122 - julearn - DEBUG - Target generated: pca__pca0
    2026-03-31 12:29:04,122 - julearn - DEBUG - Fitting model from generated target
    /opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py:927: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py", line 916, in _score
        scores = scorer(estimator, X_test, y_test, **score_params)
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/metrics/_scorer.py", line 485, in __call__
        return estimator.score(*args, **kwargs)
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/pipeline.py", line 1138, in score
        routed_params = process_routing(
            self, "score", sample_weight=sample_weight, **params
        )
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1643, in process_routing
        request_routing.validate_metadata(params=kwargs, method=_method)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1139, in validate_metadata
        raise TypeError(
        ...<2 lines>...
        )
    TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object.

      warnings.warn(
    2026-03-31 12:29:04,131 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='str')
    2026-03-31 12:29:04,131 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-03-31 12:29:04,134 - julearn - DEBUG - Fitting the target generator
    2026-03-31 12:29:04,135 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='str')
    2026-03-31 12:29:04,135 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-03-31 12:29:04,143 - julearn - DEBUG - Generating target
    2026-03-31 12:29:04,147 - julearn - DEBUG - Picking columns: ['pca__pca0']
    2026-03-31 12:29:04,147 - julearn - DEBUG - Target generated: pca__pca0
    2026-03-31 12:29:04,148 - julearn - DEBUG - Fitting model from generated target
    /opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py:927: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py", line 916, in _score
        scores = scorer(estimator, X_test, y_test, **score_params)
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/metrics/_scorer.py", line 485, in __call__
        return estimator.score(*args, **kwargs)
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/pipeline.py", line 1138, in score
        routed_params = process_routing(
            self, "score", sample_weight=sample_weight, **params
        )
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1643, in process_routing
        request_routing.validate_metadata(params=kwargs, method=_method)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.3/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1139, in validate_metadata
        raise TypeError(
        ...<2 lines>...
        )
    TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object.

      warnings.warn(
    0   NaN
    1   NaN
    Name: test_score, dtype: float64

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.147 seconds)

.. _sphx_glr_download_auto_examples_03_complex_models_run_generate_target.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: run_generate_target.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: run_generate_target.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: run_generate_target.zip `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
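
As a supplementary illustration of what "generating the target within the
cross-validation procedure" can mean, the following is a minimal sketch in
plain scikit-learn. It is not julearn's implementation. It assumes that the
scaler and the PCA are fitted on the training petal features of each fold
only, that the first principal component is the target, and that the model
sees only the standardized sepal features. The helper name ``fold_scores`` is
hypothetical, and the data comes from ``sklearn.datasets.load_iris`` rather
than seaborn so that no download is needed.

.. code-block:: Python

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold
    from sklearn.preprocessing import StandardScaler


    def fold_scores(n_splits=2):
        """Return one R^2 score per fold (hypothetical helper, not julearn API)."""
        # Columns: sepal length, sepal width, petal length, petal width
        data = load_iris().data
        sepal, petal = data[:, :2], data[:, 2:]
        scores = []
        for train, test in KFold(n_splits=n_splits).split(data):
            # Assumption: scaler and PCA are fitted on training petal data only
            petal_scaler = StandardScaler().fit(petal[train])
            pca = PCA(n_components=2).fit(petal_scaler.transform(petal[train]))
            # Generated target: the first principal component
            y_train = pca.transform(petal_scaler.transform(petal[train]))[:, 0]
            y_test = pca.transform(petal_scaler.transform(petal[test]))[:, 0]
            # Predict the generated target from standardized sepal features
            sepal_scaler = StandardScaler().fit(sepal[train])
            model = LinearRegression().fit(
                sepal_scaler.transform(sepal[train]), y_train
            )
            scores.append(
                model.score(sepal_scaler.transform(sepal[test]), y_test)
            )
        return scores


    if __name__ == "__main__":
        print(fold_scores())

Because ``KFold`` does not shuffle by default and the Iris rows are ordered by
species, the two folds contain different species mixes, so the per-fold
scores of this sketch should not be read as a benchmark.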