Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 41 additions & 2 deletions src/machinelearningservices/azext_mlv2/manual/custom/datastore.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,12 +11,51 @@
from typing import Dict

from azure.ai.ml.entities import Datastore
from azure.ai.ml.entities._datastore.azure_storage import AzureBlobDatastore, AzureDataLakeGen2Datastore, AzureFileDatastore
from azure.ai.ml.entities._load_functions import load_datastore

from .raise_error import log_and_raise_error
from .utils import _dump_entity_with_warnings, get_ml_client, modify_sys_path_for_rslex_mount


_AZURE_STORAGE_DATASTORE_TYPES = (AzureBlobDatastore, AzureDataLakeGen2Datastore, AzureFileDatastore)
Comment on lines 13 to +21
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a dependency on azure.ai.ml.entities._datastore.azure_storage which is a private SDK module (leading underscore). That makes the CLI extension fragile across azure-ai-ml upgrades. Consider detecting Azure storage datastores via public surface (e.g., the datastore entity type string like azure_blob / azure_file / azure_data_lake_gen2) instead of importing private classes.

Copilot uses AI. Check for mistakes.


def _create_or_update_with_arm_scope(ml_client, datastore):
"""Create or update a datastore, backfilling subscription and resource group.

The SDK's ``_to_rest_object`` does not populate ``subscriptionId`` and
``resourceGroup`` in the request body for Azure storage datastores. When
these fields are missing the created datastore lacks ARM scope, which
breaks downstream operations such as sharing data assets to a registry.

This helper builds the REST object, injects the workspace's subscription
and resource group when the datastore entity does not carry them, and
then calls the service directly.
"""
ds_request = datastore._to_rest_object() # pylint: disable=protected-access

if isinstance(datastore, _AZURE_STORAGE_DATASTORE_TYPES):
subscription_id = ml_client._operation_scope.subscription_id # pylint: disable=protected-access
resource_group = ml_client._operation_scope._resource_group_name # pylint: disable=protected-access

props = ds_request.properties
if props is not None:
if not getattr(props, 'subscription_id', None):
props.subscription_id = subscription_id
if not getattr(props, 'resource_group', None):
props.resource_group = resource_group

datastore_resource = ml_client.datastores._operation.create_or_update( # pylint: disable=protected-access
name=datastore.name,
resource_group_name=ml_client._operation_scope._resource_group_name, # pylint: disable=protected-access
workspace_name=ml_client.datastores._workspace_name, # pylint: disable=protected-access
body=ds_request,
skip_validation=True,
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skip_validation=True is a behavior change from the previous implementation (ml_client.datastores.create_or_update(datastore) did not force this flag). If skip_validation disables service-side validation, this could allow invalid datastore definitions or produce less actionable errors. Consider preserving the prior default (omit the flag / set it explicitly to the previous default) or plumb it as an optional parameter so existing behavior is unchanged unless needed.

Suggested change
skip_validation=True,

Copilot uses AI. Check for mistakes.
)
return Datastore._from_rest_object(datastore_resource) # pylint: disable=protected-access


Comment on lines +25 to +58
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper relies on private MLClient internals (_operation_scope._resource_group_name, datastores._workspace_name, and datastores._operation). Since ml_datastore_create/update already receive resource_group_name and workspace_name, it would be more robust to pass those into this helper (or use the public ml_client.datastores.create_or_update path) to avoid breaking when SDK internals change.

Suggested change
"""Create or update a datastore, backfilling subscription and resource group.
The SDK's ``_to_rest_object`` does not populate ``subscriptionId`` and
``resourceGroup`` in the request body for Azure storage datastores. When
these fields are missing the created datastore lacks ARM scope, which
breaks downstream operations such as sharing data assets to a registry.
This helper builds the REST object, injects the workspace's subscription
and resource group when the datastore entity does not carry them, and
then calls the service directly.
"""
ds_request = datastore._to_rest_object() # pylint: disable=protected-access
if isinstance(datastore, _AZURE_STORAGE_DATASTORE_TYPES):
subscription_id = ml_client._operation_scope.subscription_id # pylint: disable=protected-access
resource_group = ml_client._operation_scope._resource_group_name # pylint: disable=protected-access
props = ds_request.properties
if props is not None:
if not getattr(props, 'subscription_id', None):
props.subscription_id = subscription_id
if not getattr(props, 'resource_group', None):
props.resource_group = resource_group
datastore_resource = ml_client.datastores._operation.create_or_update( # pylint: disable=protected-access
name=datastore.name,
resource_group_name=ml_client._operation_scope._resource_group_name, # pylint: disable=protected-access
workspace_name=ml_client.datastores._workspace_name, # pylint: disable=protected-access
body=ds_request,
skip_validation=True,
)
return Datastore._from_rest_object(datastore_resource) # pylint: disable=protected-access
"""Create or update a datastore using the public datastores client.
This helper avoids relying on private MLClient internals by delegating to
``ml_client.datastores.create_or_update``.
"""
return ml_client.datastores.create_or_update(datastore)

Copilot uses AI. Check for mistakes.
def ml_datastore_delete(cmd, resource_group_name, workspace_name, name):
ml_client, debug = get_ml_client(
cli_ctx=cmd.cli_ctx, resource_group_name=resource_group_name, workspace_name=workspace_name
Expand Down Expand Up @@ -77,7 +116,7 @@ def ml_datastore_create(cmd, resource_group_name, workspace_name, file, name=Non

try:
datastore = load_datastore(file, params_override=params_override)
return ml_client.datastores.create_or_update(datastore)._to_dict() # pylint: disable=protected-access
return _create_or_update_with_arm_scope(ml_client, datastore)._to_dict() # pylint: disable=protected-access
Comment on lines 116 to +119
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change affects the HTTP request body for az ml datastore create/update (injecting subscriptionId/resourceGroup). There are existing datastore scenario tests with recordings, but none assert the new fields. Please update/add a scenario test assertion that the created datastore response includes subscription_id and resource_group (or the expected keys in CLI output), and refresh the relevant recordings if playback matching depends on request bodies.

Copilot uses AI. Check for mistakes.
except Exception as err: # pylint: disable=broad-exception-caught
yaml_operation = bool(file)
log_and_raise_error(err, debug, yaml_operation=yaml_operation)
Expand All @@ -90,7 +129,7 @@ def ml_datastore_update(cmd, resource_group_name, workspace_name, parameters: Di

try:
datastore = Datastore._load(parameters) # pylint: disable=protected-access
return ml_client.datastores.create_or_update(datastore)._to_dict() # pylint: disable=protected-access
return _create_or_update_with_arm_scope(ml_client, datastore)._to_dict() # pylint: disable=protected-access
Comment on lines 129 to +132
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as create: this alters the request body for az ml datastore update by injecting ARM scope fields. Please ensure scenario coverage includes an update case that verifies the updated datastore has subscription_id and resource_group populated, and refresh recordings if needed.

Copilot uses AI. Check for mistakes.
except Exception as err: # pylint: disable=broad-exception-caught
log_and_raise_error(err, debug)

Expand Down
Loading