271 feature implementation of columntransformer support #279

Open
fernandonbarros wants to merge 3 commits into paucablop:main from fernandonbarros:271-feature-implementation-of-columntransformer-support
@fernandonbarros fernandonbarros commented May 28, 2026

Summary

  • Updates the PLSRegression transformer to add compatibility with sklearn's ColumnTransformer.
  • When transform_output="pandas" is set, the PLSRegression transformer outputs its scores as a DataFrame with columns LV1 to LVn.
  • This PR also addresses issues with the test_optional_dependencies test failing.

Why is this change needed?

  • This change will allow the passthrough of metadata when creating a pipeline to process pandas dataframes.

Closes

Related issues

Type of change

  • Bug fix
  • New feature
  • Refactor
  • Performance improvement
  • Documentation update
  • Tests only
  • CI / DevOps / tooling
  • Dependency update
  • Breaking change

What changed?

  • Updated chemotools/regression/_pls_regression.py:
    • Defined get_feature_names_out method to return an array ["LV1" ... "LVn"] to serve as column names for scores dataframe.
    • Updated the transform method with a flag that controls whether y_scores are returned.
  • Updated chemotools/tests/regression/test_pls_regression.py:
    • Added test_works_with_column_transformer test to ensure that the PLSRegression transformer is compatible with the ColumnTransformer.
  • Updated chemotools/_optional.py and chemotools/tests/utils/test_optional_dependencies.py to fix issues with the test_import_optional_dependency_real_optional test.
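The two changes to the transformer can be pictured with a small toy class. This is a sketch under stated assumptions, not the code in this PR: `ToyLatentProjector`, its fixed weight matrices, and the exact `transform(X, y=None, return_y=False)` signature are hypothetical and chosen only to show the naming scheme and the opt-in flag.

```python
import numpy as np


class ToyLatentProjector:
    """Hypothetical toy class; fixed weights stand in for a fitted PLS model."""

    def __init__(self, x_weights, y_weights):
        self.x_weights = np.asarray(x_weights, dtype=float)
        self.y_weights = np.asarray(y_weights, dtype=float)

    def get_feature_names_out(self, input_features=None):
        # One name per latent variable: LV1, LV2, ..., LVn.
        n_components = self.x_weights.shape[1]
        names = [f"LV{i}" for i in range(1, n_components + 1)]
        return np.asarray(names, dtype=object)

    def transform(self, X, y=None, return_y=False):
        # By default only the X scores are returned (a single 2-D array).
        x_scores = np.asarray(X, dtype=float) @ self.x_weights
        if not return_y:
            return x_scores
        # y scores are returned only when explicitly requested.
        if y is None:
            raise ValueError("y is required when return_y=True")
        y_2d = np.asarray(y, dtype=float).reshape(len(x_scores), -1)
        return x_scores, y_2d @ self.y_weights


projector = ToyLatentProjector(
    x_weights=np.eye(3)[:, :2],   # 3 input features -> 2 latent variables
    y_weights=np.ones((1, 2)),    # 1 target -> 2 latent variables
)
X = np.arange(12.0).reshape(4, 3)
y = np.arange(4.0)

names = projector.get_feature_names_out()
scores = projector.transform(X)
x_scores, y_scores = projector.transform(X, y, return_y=True)
```

Returning a single array by default is what lets the output be wrapped as one DataFrame; callers that need both score sets ask for them explicitly.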

What did NOT change?

  • No transformers other than PLSRegression were modified.

API and compatibility impact

  • No public API changes
  • Public API added
  • Public API changed
  • Deprecation introduced
  • Breaking change introduced

Notes

  • scikit-learn API compatibility: PLSRegression should remain fully compliant with the scikit-learn API.
  • Affected modules/classes/functions:
    • PLSRegression class in chemotools/regression/_pls_regression.py was updated.
    • import_optional_dependency function in chemotools/_optional.py was updated.
  • Backward compatibility considerations:
    • Code that uses the PLSRegression transformer and expects y_scores to be returned must be updated to pass the return_y=True flag.

Validation

  • task format-check
  • task lint
  • task spelling
  • task type-check
  • task test
  • task coverage
  • task test:matrix
  • task docs:check
  • task build

Validation notes

  • test_import_optional_dependency_real_optional was originally failing; this was addressed by updating both the test and the import_optional_dependency function.
  • Failure observed when running task docs:check

Tests

  • Added new tests
  • Updated existing tests
  • No tests added because this PR only changes docs / CI / metadata

Test coverage details

  • Relevant test files:
    • chemotools/tests/utils/test_optional_dependencies.py
    • chemotools/tests/regression/test_pls_regression.py
  • Edge cases covered:
    • Use of PLSRegression in a pipeline with ColumnTransformer and transform_output="pandas"
  • Numerical / estimator behavior checked: N/A

Documentation

  • Docs updated
  • Docstrings updated
  • No docs update needed

Documentation notes

  • Updates to methods and functions are reflected in the existing docstrings.

Dependency / build / CI impact

  • No dependency changes
  • Runtime dependency change
  • Dev dependency change
  • GitHub Actions / CI changed
  • Build/release process changed

Notes

Reviewer focus

Please focus on:

  • The updated transform and the new get_feature_names_out methods for PLSRegression.
  • The new/updated tests for utils and PLSRegression.

Screenshots / logs / benchmark output

Details

Checklist

  • I linked the relevant issue(s) or explained why none exists
  • I kept this PR scoped to a single purpose
  • I added or updated tests where appropriate
  • I updated documentation where appropriate
  • I verified the change locally using the relevant tasks above
  • I considered backward compatibility and public API impact
  • I am ready for review

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@fernandonbarros

Author

It looks like the update to the PLSRegression API introduced a bug in the PLSRegressionInspector that was not caught by any of the existing tests. I added a commit to fix it.

Development

Successfully merging this pull request may close these issues.

feature: Implement ColumnTransformer support in PLSRegression method

2 participants