FIX, MAINT: Implement 'everything follows X' and namespace checks for KNN #3127

Merged
david-cortes-intel merged 3 commits into uxlfoundation:main from david-cortes-intel:knn_y
May 11, 2026

Conversation

@david-cortes-intel
Contributor

Description

This PR:

  • Implements the logic of 'everything follows X' for KNN classes.
  • Implements the logic where class predictions from classifiers follow the 'y' namespace (e.g. so that it can return strings when fitting on GPU).
  • Adds array API namespace and device checks to the KNN methods that are called after .fit(), so that they raise informative Python exceptions instead of segfaulting, as scikit-learn does.
  • Fixes some issues with internal functions that were not working with array_api_strict.
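
As a rough illustration of the namespace and device checks mentioned above, here is a minimal sketch. It is not the code from this PR: `FakeArray`, `get_namespace_and_device` and `check_same_namespace_and_device` are hypothetical names, and the exception types are an assumption. The only thing relied on is that array API objects expose `__array_namespace__()` and a `.device` attribute.

```python
import types


class FakeArray:
    """Hypothetical stand-in for an array API object (not a real library type)."""

    def __init__(self, namespace, device):
        self._namespace = namespace
        self.device = device

    def __array_namespace__(self, api_version=None):
        return self._namespace


def get_namespace_and_device(array):
    """Return (namespace, device); both are None for non-array-API inputs."""
    get_ns = getattr(array, "__array_namespace__", None)
    xp = get_ns() if get_ns is not None else None
    return xp, getattr(array, "device", None)


def check_same_namespace_and_device(fitted_X, new_X):
    """Fail early, in Python, if new_X does not match the data seen in fit."""
    xp_fit, dev_fit = get_namespace_and_device(fitted_X)
    xp_new, dev_new = get_namespace_and_device(new_X)
    if xp_fit is not xp_new:
        # Assumption: TypeError is used here only for illustration.
        raise TypeError(
            f"Input uses namespace {xp_new!r}, "
            f"but the estimator was fitted with {xp_fit!r}."
        )
    if dev_fit != dev_new:
        # Assumption: ValueError is used here only for illustration.
        raise ValueError(
            f"Input is on device {dev_new!r}, "
            f"but the estimator was fitted on {dev_fit!r}."
        )


# Usage: a mismatch is reported as a regular Python exception.
gpu_xp = types.SimpleNamespace(name="gpu_xp")
cpu_xp = types.SimpleNamespace(name="cpu_xp")
X_fit = FakeArray(gpu_xp, device="gpu:0")
X_new = FakeArray(cpu_xp, device="cpu")
try:
    check_same_namespace_and_device(X_fit, X_new)
except TypeError as exc:
    print(exc)
```

Running the check at the start of each post-fit method means a mismatch is reported before any data reaches native code.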

Includes the changes from #3117 since they are also necessary for KNN classes.


Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended the testing suite if new functionality was introduced in this PR.

@david-cortes-intel
Contributor Author

/intelci: run

@david-cortes-intel
Contributor Author

/azp run Nightly

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@david-cortes-intel
Contributor Author

The CI failures in BasicStatistics are unrelated to the changes here and should be resolved by #3128.

@david-cortes-intel
Contributor Author

/intelci: run

@david-cortes-intel
Contributor Author

/azp run Nightly

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@codecov

codecov Bot commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 28.94737% with 54 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| sklearnex/neighbors/common.py | 42.85% | 12 Missing and 4 partials ⚠️ |
| sklearnex/neighbors/knn_classification.py | 23.80% | 9 Missing and 7 partials ⚠️ |
| sklearnex/neighbors/knn_regression.py | 17.64% | 8 Missing and 6 partials ⚠️ |
| sklearnex/neighbors/knn_unsupervised.py | 20.00% | 5 Missing and 3 partials ⚠️ |

| Flag | Coverage Δ |
|---|---|
| azure | 77.50% <28.94%> (-1.55%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| sklearnex/neighbors/_lof.py | 98.01% <ø> (-1.99%) ⬇️ |
| sklearnex/neighbors/knn_unsupervised.py | 86.17% <20.00%> (-8.02%) ⬇️ |
| sklearnex/neighbors/knn_regression.py | 79.36% <17.64%> (-9.93%) ⬇️ |
| sklearnex/neighbors/common.py | 85.83% <42.85%> (-4.53%) ⬇️ |
| sklearnex/neighbors/knn_classification.py | 84.76% <23.80%> (-10.05%) ⬇️ |

... and 37 files with indirect coverage changes

@david-cortes-intel
Contributor Author

/azp run Nightly

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@david-cortes-intel
Contributor Author

/intelci: run

@david-cortes-intel marked this pull request as ready for review April 30, 2026 10:14

device = getattr(y_train, "device", None)
neigh_dist = xp.asarray(neigh_dist, device=device)
neigh_ind = xp.asarray(neigh_ind, device=device)
if not _is_numpy_namespace(xp):
Contributor

What is the logic of this if-statement? As I understand it, neigh_dist and neigh_ind were previously NumPy arrays, and we only needed to convert them if y was not a NumPy array. Is that correct? If so, do we still need the same logic after the array API update?

Contributor Author

This is how it was before, if you look at the changes. I guess the purpose is to make them work with the other arrays.

Comment thread sklearnex/neighbors/_lof.py Outdated
device = getattr(y_train, "device", None)
neigh_dist = xp.asarray(neigh_dist, device=device)
neigh_ind = xp.asarray(neigh_ind, device=device)
if not _is_numpy_namespace(xp):
Contributor

I think we also need an explanation of why this NumPy check is needed.

Contributor Author

There is a separate code path for NumPy that uses operations not supported by the array API.
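
To illustrate what such a split can look like, here is a sketch, not the code in this PR. It assumes the operation in question is gathering the targets of each neighbor: NumPy allows indexing with a 2-D integer array, which, as far as I know, earlier revisions of the array API standard did not specify, so the portable branch goes through `take()` on flattened indices instead. `is_numpy_namespace` and `gather_neighbor_targets` are hypothetical names; the former only stands in for the `_is_numpy_namespace` helper seen in the diff.

```python
import numpy as np


def is_numpy_namespace(xp):
    # Hypothetical stand-in for the _is_numpy_namespace helper in the diff.
    return xp is np


def gather_neighbor_targets(xp, y_train, neigh_ind):
    """Return the targets of each neighbor, shaped like neigh_ind."""
    if is_numpy_namespace(xp):
        # NumPy path: integer-array ("fancy") indexing with a 2-D index.
        return y_train[neigh_ind]
    # Portable path: flatten the indices, gather with take(), then restore
    # the (n_queries, n_neighbors) shape.
    flat_ind = xp.reshape(neigh_ind, (-1,))
    gathered = xp.take(y_train, flat_ind, axis=0)
    return xp.reshape(gathered, neigh_ind.shape)


y_train = np.asarray([10.0, 20.0, 30.0, 40.0])
neigh_ind = np.asarray([[0, 1], [3, 2]])
print(gather_neighbor_targets(np, y_train, neigh_ind))
```

Both branches are meant to return the same values; the NumPy branch is simply the more direct spelling.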

Comment thread sklearnex/neighbors/knn_classification.py
@avolkov-intel
Contributor

I left a few comments, mostly asking for clarification of the current logic. Also, the PR needs to be rebased to fix some CI failures.

@david-cortes-intel merged commit 895535d into uxlfoundation:main May 11, 2026
24 of 31 checks passed
2 participants