.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/03_complex_models/run_generate_target.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_03_complex_models_run_generate_target.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_03_complex_models_run_generate_target.py:


Target Generation
=================

This example uses the ``iris`` dataset to test a regression model whose
target variable is generated from some of the features within the
cross-validation procedure.

The target is generated using PCA on the petal features. We then evaluate
whether a regression model can predict this generated target from the sepal
features.

.. include:: ../../links.inc

.. GENERATED FROM PYTHON SOURCE LINES 13-21

.. code-block:: Python

    # Authors: Federico Raimondo
    # License: AGPL

    from seaborn import load_dataset

    from julearn import run_cross_validation
    from julearn.pipeline import PipelineCreator
    from julearn.utils import configure_logging

.. GENERATED FROM PYTHON SOURCE LINES 22-23

Set the logging level to debug to see extra information.

.. GENERATED FROM PYTHON SOURCE LINES 23-25

.. code-block:: Python

    configure_logging(level="DEBUG")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2026-01-16 10:54:05,223 - julearn - INFO - ===== Lib Versions =====
    2026-01-16 10:54:05,223 - julearn - INFO - numpy: 1.26.4
    2026-01-16 10:54:05,224 - julearn - INFO - scipy: 1.17.0
    2026-01-16 10:54:05,224 - julearn - INFO - sklearn: 1.7.2
    2026-01-16 10:54:05,224 - julearn - INFO - pandas: 2.3.3
    2026-01-16 10:54:05,224 - julearn - INFO - julearn: 0.3.5.dev123
    2026-01-16 10:54:05,224 - julearn - INFO - ========================

.. GENERATED FROM PYTHON SOURCE LINES 26-29

.. code-block:: Python

    df_iris = load_dataset("iris")

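
The idea of generating the target inside each cross-validation fold can be
sketched in plain scikit-learn. The following is only an illustrative
approximation and not julearn's implementation: it assumes scikit-learn's
bundled copy of the iris data (``load_iris``) instead of the seaborn
dataframe, uses a shuffled two-fold split as an arbitrary choice, and the
names ``sepal``, ``petal`` and ``fold_scores`` are hypothetical. In each fold,
the scalers and the PCA are fitted on the training samples only, so the
generated target for the test samples does not leak information into the fit.

.. code-block:: Python

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold
    from sklearn.preprocessing import StandardScaler

    # Columns: sepal length, sepal width, petal length, petal width
    data = load_iris().data
    sepal, petal = data[:, :2], data[:, 2:]

    fold_scores = []
    # Shuffling is an illustrative choice: the iris rows are sorted by species
    cv = KFold(n_splits=2, shuffle=True, random_state=0)
    for train_idx, test_idx in cv.split(data):
        # Fit the scalers on the training fold only
        sepal_scaler = StandardScaler().fit(sepal[train_idx])
        petal_scaler = StandardScaler().fit(petal[train_idx])

        # Target generator: PCA on the petal features, fitted on the training fold
        pca = PCA(n_components=2).fit(petal_scaler.transform(petal[train_idx]))

        # Keep only the first component as the target
        y_train = pca.transform(petal_scaler.transform(petal[train_idx]))[:, 0]
        y_test = pca.transform(petal_scaler.transform(petal[test_idx]))[:, 0]

        # Predict the generated target from the sepal features
        model = LinearRegression().fit(
            sepal_scaler.transform(sepal[train_idx]), y_train
        )
        # R^2 of the prediction on the held-out fold
        fold_scores.append(
            model.score(sepal_scaler.transform(sepal[test_idx]), y_test)
        )

    print(fold_scores)

The julearn pipeline built in the rest of this example expresses the same
steps declaratively, so that the target generation is repeated automatically
for every fold.
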

.. GENERATED FROM PYTHON SOURCE LINES 30-32

All four measurements are passed as features: sepal length, sepal width,
petal length and petal width. The target is not an existing column of the
dataset. It will be generated from the petal features, so ``y`` is set to the
placeholder ``"__generated__"``.

.. GENERATED FROM PYTHON SOURCE LINES 32-43

.. code-block:: Python

    X = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
    y = "__generated__"  # to indicate to julearn that the target will be generated

    # Define our feature types
    X_types = {
        "sepal": ["sepal_length", "sepal_width"],
        "petal": ["petal_length", "petal_width"],
    }

.. GENERATED FROM PYTHON SOURCE LINES 44-47

We now use a ``PipelineCreator`` to create the pipeline that will generate the
target. This special pipeline should be configured to be a "transformer" and
to apply to the "petal" feature types.

.. GENERATED FROM PYTHON SOURCE LINES 47-54

.. code-block:: Python

    target_creator = PipelineCreator(problem_type="transformer", apply_to="petal")
    target_creator.add("pca", n_components=2)
    # Select only the first component
    target_creator.add("pick_columns", keep="pca__pca0")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2026-01-16 10:54:05,226 - julearn - INFO - Adding step pca that applies to ColumnTypes
    2026-01-16 10:54:05,226 - julearn - INFO - Setting hyperparameter n_components = 2
    2026-01-16 10:54:05,226 - julearn - DEBUG - Getting estimator from string: pca
    2026-01-16 10:54:05,226 - julearn - INFO - Step added
    2026-01-16 10:54:05,226 - julearn - INFO - Adding step pick_columns that applies to ColumnTypes
    2026-01-16 10:54:05,227 - julearn - INFO - Setting hyperparameter keep = pca__pca0
    2026-01-16 10:54:05,227 - julearn - DEBUG - Getting estimator from string: pick_columns
    2026-01-16 10:54:05,227 - julearn - INFO - Step added

.. GENERATED FROM PYTHON SOURCE LINES 55-59

We now create the pipeline that will be used to predict the target. This
pipeline will be a regression pipeline. The step immediately before the model
should be ``generate_target``, applying to the "petal" features and using the
``target_creator`` pipeline as the transformer.


.. GENERATED FROM PYTHON SOURCE LINES 59-64

.. code-block:: Python

    creator = PipelineCreator(problem_type="regression")
    creator.add("zscore", apply_to="*")
    creator.add("generate_target", apply_to="petal", transformer=target_creator)
    creator.add("linreg", apply_to="sepal")

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2026-01-16 10:54:05,227 - julearn - INFO - Adding step zscore that applies to ColumnTypes
    2026-01-16 10:54:05,227 - julearn - DEBUG - Getting estimator from string: zscore
    2026-01-16 10:54:05,228 - julearn - INFO - Step added
    2026-01-16 10:54:05,228 - julearn - INFO - Adding step generate_target that applies to ColumnTypes
    2026-01-16 10:54:05,228 - julearn - INFO - Setting hyperparameter transformer = PipelineCreator:
      Step 0: pca
        estimator:     PCA(n_components=2)
        apply to:      ColumnTypes
        needed types:  ColumnTypes
        tuning params: {}
      Step 1: pick_columns
        estimator:     PickColumns(keep='pca__pca0')
        apply to:      ColumnTypes
        needed types:  ColumnTypes
        tuning params: {}

    2026-01-16 10:54:05,228 - julearn - DEBUG - Special step is generate_target
    2026-01-16 10:54:05,228 - julearn - INFO - Step added
    2026-01-16 10:54:05,229 - julearn - INFO - Adding step linreg that applies to ColumnTypes
    2026-01-16 10:54:05,229 - julearn - DEBUG - Getting estimator from string: linreg
    2026-01-16 10:54:05,229 - julearn - INFO - Step added

.. GENERATED FROM PYTHON SOURCE LINES 65-66

We finally evaluate the model within the cross validation.

.. GENERATED FROM PYTHON SOURCE LINES 66-77

.. code-block:: Python

    scores, model = run_cross_validation(
        X=X,
        y=y,
        X_types=X_types,
        data=df_iris,
        model=creator,
        return_estimator="final",
        cv=2,
    )

    print(scores["test_score"])  # type: ignore

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    2026-01-16 10:54:05,229 - julearn - INFO - ==== Input Data ====
    2026-01-16 10:54:05,229 - julearn - INFO - Using dataframe as input
    2026-01-16 10:54:05,230 - julearn - INFO - Features: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
    2026-01-16 10:54:05,230 - julearn - INFO - Target: __generated__
    2026-01-16 10:54:05,230 - julearn - INFO - Expanded features: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
    2026-01-16 10:54:05,230 - julearn - INFO - X_types:{'sepal': ['sepal_length', 'sepal_width'], 'petal': ['petal_length', 'petal_width']}
    2026-01-16 10:54:05,231 - julearn - INFO - Target will be generated
    2026-01-16 10:54:05,231 - julearn - INFO - ====================
    2026-01-16 10:54:05,231 - julearn - INFO -
    2026-01-16 10:54:05,231 - julearn - DEBUG - Generating pipeline from PipelineCreator or list of them
    2026-01-16 10:54:05,231 - julearn - DEBUG - Creating pipeline
    2026-01-16 10:54:05,231 - julearn - DEBUG - Ensuring target generator pipeline
    2026-01-16 10:54:05,231 - julearn - DEBUG - Creating pipeline
    2026-01-16 10:54:05,231 - julearn - DEBUG - Creating a pipeline with no model added
    2026-01-16 10:54:05,232 - julearn - DEBUG - Adding transformer pca
    2026-01-16 10:54:05,232 - julearn - DEBUG - Estimator: PCA(n_components=2)
    2026-01-16 10:54:05,232 - julearn - DEBUG - Params to tune: {}
    2026-01-16 10:54:05,232 - julearn - DEBUG - Adding transformer pick_columns
    2026-01-16 10:54:05,232 - julearn - DEBUG - Estimator: PickColumns(keep='pca__pca0')
    2026-01-16 10:54:05,232 - julearn - DEBUG - Params to tune: {}
    2026-01-16 10:54:05,233 - julearn - INFO - = Model Parameters =
    2026-01-16 10:54:05,233 - julearn - INFO - ====================
    2026-01-16 10:54:05,233 - julearn - INFO -
    2026-01-16 10:54:05,233 - julearn - DEBUG - Pipeline created
    2026-01-16 10:54:05,233 - julearn - DEBUG - Target generator pipeline created
    2026-01-16 10:54:05,233 - julearn - DEBUG - Adding transformer zscore
    2026-01-16 10:54:05,233 - julearn - DEBUG - Estimator: StandardScaler()
    2026-01-16 10:54:05,233 - julearn - DEBUG - Params to tune: {}
    2026-01-16 10:54:05,233 - julearn - DEBUG - Adding model linreg
    2026-01-16 10:54:05,234 - julearn - DEBUG - Wrapping linreg
    2026-01-16 10:54:05,234 - julearn - DEBUG - Estimator: WrapModel(apply_to=ColumnTypes, copy_X=True, fit_intercept=True, model=LinearRegression(), n_jobs=None, positive=False, tol=1e-06)
    2026-01-16 10:54:05,234 - julearn - DEBUG - Looking for nested pipeline creators
    2026-01-16 10:54:05,235 - julearn - DEBUG - Params to tune: {}
    2026-01-16 10:54:05,235 - julearn - DEBUG - Wrapping target model linreg as target_generate
    2026-01-16 10:54:05,235 - julearn - INFO - = Model Parameters =
    2026-01-16 10:54:05,235 - julearn - INFO - ====================
    2026-01-16 10:54:05,235 - julearn - INFO -
    2026-01-16 10:54:05,235 - julearn - DEBUG - Pipeline created
    2026-01-16 10:54:05,235 - julearn - DEBUG - Pipeline has target generator
    2026-01-16 10:54:05,235 - julearn - INFO - = Data Information =
    2026-01-16 10:54:05,235 - julearn - INFO - Problem type: regression
    2026-01-16 10:54:05,235 - julearn - INFO - Number of samples: 150
    2026-01-16 10:54:05,235 - julearn - INFO - Number of features: 4
    2026-01-16 10:54:05,236 - julearn - INFO - ====================
    2026-01-16 10:54:05,236 - julearn - INFO -
    2026-01-16 10:54:05,236 - julearn - INFO - Target type: float64
    2026-01-16 10:54:05,236 - julearn - INFO - Using outer CV scheme KFold(n_splits=2, random_state=None, shuffle=False) (incl. final model)
    2026-01-16 10:54:05,241 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='object')
    2026-01-16 10:54:05,241 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-01-16 10:54:05,244 - julearn - DEBUG - Fitting the target generator
    2026-01-16 10:54:05,245 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='object')
    2026-01-16 10:54:05,245 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-01-16 10:54:05,253 - julearn - DEBUG - Generating target
    2026-01-16 10:54:05,256 - julearn - DEBUG - Picking columns: ['pca__pca0']
    2026-01-16 10:54:05,256 - julearn - DEBUG - Target generated: pca__pca0
    2026-01-16 10:54:05,257 - julearn - DEBUG - Fitting model from generated target
    /opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py:953: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py", line 942, in _score
        scores = scorer(estimator, X_test, y_test, **score_params)
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/metrics/_scorer.py", line 492, in __call__
        return estimator.score(*args, **kwargs)
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/pipeline.py", line 1192, in score
        routed_params = process_routing(
            self, "score", sample_weight=sample_weight, **params
        )
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1625, in process_routing
        request_routing.validate_metadata(params=kwargs, method=_method)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1109, in validate_metadata
        raise TypeError(
        ...<2 lines>...
        )
    TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object.

      warnings.warn(
    2026-01-16 10:54:05,265 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='object')
    2026-01-16 10:54:05,265 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-01-16 10:54:05,268 - julearn - DEBUG - Fitting the target generator
    2026-01-16 10:54:05,269 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='object')
    2026-01-16 10:54:05,269 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-01-16 10:54:05,276 - julearn - DEBUG - Generating target
    2026-01-16 10:54:05,279 - julearn - DEBUG - Picking columns: ['pca__pca0']
    2026-01-16 10:54:05,279 - julearn - DEBUG - Target generated: pca__pca0
    2026-01-16 10:54:05,280 - julearn - DEBUG - Fitting model from generated target
    /opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py:953: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py", line 942, in _score
        scores = scorer(estimator, X_test, y_test, **score_params)
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/metrics/_scorer.py", line 492, in __call__
        return estimator.score(*args, **kwargs)
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/pipeline.py", line 1192, in score
        routed_params = process_routing(
            self, "score", sample_weight=sample_weight, **params
        )
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1625, in process_routing
        request_routing.validate_metadata(params=kwargs, method=_method)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1109, in validate_metadata
        raise TypeError(
        ...<2 lines>...
        )
    TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object.

      warnings.warn(
    2026-01-16 10:54:05,286 - julearn - DEBUG - Setting column types for Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='object')
    2026-01-16 10:54:05,286 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-01-16 10:54:05,289 - julearn - DEBUG - Fitting the target generator
    2026-01-16 10:54:05,290 - julearn - DEBUG - Setting column types for Index(['sepal_length__:type:__sepal', 'sepal_width__:type:__sepal', 'petal_length__:type:__petal', 'petal_width__:type:__petal'], dtype='object')
    2026-01-16 10:54:05,290 - julearn - DEBUG - Column mappers for {'sepal_length': 'sepal_length__:type:__sepal', 'sepal_width': 'sepal_width__:type:__sepal', 'petal_length': 'petal_length__:type:__petal', 'petal_width': 'petal_width__:type:__petal'}
    2026-01-16 10:54:05,297 - julearn - DEBUG - Generating target
    2026-01-16 10:54:05,300 - julearn - DEBUG - Picking columns: ['pca__pca0']
    2026-01-16 10:54:05,300 - julearn - DEBUG - Target generated: pca__pca0
    2026-01-16 10:54:05,301 - julearn - DEBUG - Fitting model from generated target
    /opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py:953: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/model_selection/_validation.py", line 942, in _score
        scores = scorer(estimator, X_test, y_test, **score_params)
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/metrics/_scorer.py", line 492, in __call__
        return estimator.score(*args, **kwargs)
               ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/pipeline.py", line 1192, in score
        routed_params = process_routing(
            self, "score", sample_weight=sample_weight, **params
        )
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1625, in process_routing
        request_routing.validate_metadata(params=kwargs, method=_method)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/opt/hostedtoolcache/Python/3.14.2/x64/lib/python3.14/site-packages/sklearn/utils/_metadata_requests.py", line 1109, in validate_metadata
        raise TypeError(
        ...<2 lines>...
        )
    TypeError: Pipeline.score got unexpected argument(s) {'sample_weight'}, which are not routed to any object.

      warnings.warn(
    0   NaN
    1   NaN
    Name: test_score, dtype: float64

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.087 seconds)


.. _sphx_glr_download_auto_examples_03_complex_models_run_generate_target.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: run_generate_target.ipynb <run_generate_target.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: run_generate_target.py <run_generate_target.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: run_generate_target.zip <run_generate_target.zip>`


.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_