Skip to content

4117433: Update _asset_utils.py to fix Performance issue#46572

Closed
sangeetha-cpu wants to merge 2 commits intoAzure:mainfrom
sangeetha-cpu:main
Closed

4117433: Update _asset_utils.py to fix Performance issue#46572
sangeetha-cpu wants to merge 2 commits intoAzure:mainfrom
sangeetha-cpu:main

Conversation

@sangeetha-cpu
Copy link
Copy Markdown

Description

Please add an informative description that covers that changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@github-actions github-actions Bot added Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Machine Learning labels Apr 28, 2026
@github-actions
Copy link
Copy Markdown
Contributor

Thank you for your contribution @sangeetha-cpu! We will review the pull request and get back to you soon.

@sangeetha-cpu sangeetha-cpu changed the title 4117433: Update _asset_utils.py to fix Performance issue while creating model from local files 4117433: Update _asset_utils.py to fix Performance issue Apr 28, 2026
Copilot AI review requested due to automatic review settings April 30, 2026 04:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets a performance improvement in asset upload file enumeration by avoiding descending into ignored directories during folder traversal. It also includes import/type-related refactors in _asset_utils.py (currently introducing runtime import issues that need correction).

Changes:

  • Prune ignored directories during os.walk in get_upload_files_from_folder to reduce traversal cost (e.g., large .venv-like folders).
  • Reformat several imports (including typing and constants imports).
  • Update restclient model imports and adjust _get_latest type handling (currently incorrect/broken).

Comment on lines +62 to +67
from azure.ai.ml._restclient.v2022_05_01.models import (
DataVersionBaseData,
ModelVersionData,
ModelVersionResourceArmPaginatedResult,
)
from azure.ai.ml._restclient.arm_ml_service.models import PendingUploadRequestDto
from azure.ai.ml._restclient.v2023_04_01.models import PendingUploadRequestDto
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new restclient imports reference modules that don't exist in this package (azure.ai.ml._restclient.v2022_05_01.models and azure.ai.ml._restclient.v2023_04_01.models are not present under _restclient/). This will raise ImportError at runtime. Please switch these imports back to an existing restclient version (e.g., v2022_02_01_preview.models / v2023_04_01_preview.models) or keep using arm_ml_service.models, matching the operations you use in this file.

Copilot uses AI. Check for mistakes.
Comment on lines +947 to 949
if latest and isinstance(latest, ModelVersionResourceArmPaginatedResult):
# Data list return object doesn't require this since its elements are already DatasetVersionResources
latest = cast(ModelVersionData, latest)
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isinstance(latest, ModelVersionResourceArmPaginatedResult) branch is effectively dead: the ModelVersionsOperations.list() paging implementation yields elements from deserialized.value (typed as List[ModelVersionData]), so result.next() returns a ModelVersionData, not the paginated result wrapper. This check should likely be against ModelVersionData (or removed entirely) to keep the logic accurate and avoid confusion.

Suggested change
if latest and isinstance(latest, ModelVersionResourceArmPaginatedResult):
# Data list return object doesn't require this since its elements are already DatasetVersionResources
latest = cast(ModelVersionData, latest)

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +27
TYPE_CHECKING,
Any,
Dict,
Iterable,
List,
Optional,
Tuple,
Union,
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are trailing spaces in the multi-line from typing import (...) block (e.g., after TYPE_CHECKING, / Any, / Dict, etc.). These will trip linters (W291). Please remove the trailing whitespace and keep formatting consistent.

Suggested change
TYPE_CHECKING,
Any,
Dict,
Iterable,
List,
Optional,
Tuple,
Union,
TYPE_CHECKING,
Any,
Dict,
Iterable,
List,
Optional,
Tuple,
Union,

Copilot uses AI. Check for mistakes.
Comment on lines +401 to +407
for root, dirs, files in os.walk(path, followlinks=True, topdown=True):
# Skip ignored directories to avoid descending into them
# This significantly improves performance when large ignored folders (e.g., .venv) are present
dirs[:] = [
d for d in dirs
if not ignore_file.is_file_excluded(Path(root).joinpath(d).as_posix())
]
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description is still the template text and doesn’t describe the actual change or link the relevant issue (the title suggests a performance fix). Please update the PR description to summarize what changed and reference the tracking issue/bug ID.

Copilot uses AI. Check for mistakes.
@sangeetha-cpu sangeetha-cpu closed this by deleting the head repository May 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Machine Learning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants