Commit d2ae7f4

mbhaskar and Copilot committed
Add GlobalSecondaryIndexDefinition for GSI container support
Implement client-side support for Global Secondary Index (GSI) containers in the Python Cosmos SDK, mirroring the Java SDK implementation.

Changes:
- Add GlobalSecondaryIndexDefinition class with serialization/deserialization
- Add global_secondary_index_definition keyword to create_container, create_container_if_not_exists, and replace_container (sync and async)
- Implement dual-write pattern: writes to both globalSecondaryIndexDefinition and materializedViewDefinition keys for backward compatibility
- Add comprehensive unit tests (20 tests)
- Update CHANGELOG

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 70bcd19 commit d2ae7f4

6 files changed

Lines changed: 403 additions & 1 deletion

sdk/cosmos/azure-cosmos/CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -58,6 +58,7 @@
 * Added `get_response_headers()` and `get_last_response_headers()` methods to the `CosmosItemPaged` and `CosmosAsyncItemPaged` objects returned by `query_items()`, allowing access to response headers from query operations. See [PR 44593](https://github.com/Azure/azure-sdk-for-python/pull/44593)
 * Added InferenceRequestTimeout property for HttpTimeout Policy to Reranking API. See [45469](https://github.com/Azure/azure-sdk-for-python/pull/45469)
 * Added `full_text_score_scope` parameter to `query_items()` for controlling BM25 statistics scope in hybrid search queries. Supports "Local" and "Global" (default) scopes. See [45686](https://github.com/Azure/azure-sdk-for-python/pull/45686)
+* Added `GlobalSecondaryIndexDefinition` class and `global_secondary_index_definition` keyword to `create_container`, `create_container_if_not_exists`, and `replace_container` methods for creating Global Secondary Index (GSI) containers.
 
 #### Bugs Fixed
 * Fixed bug where a compound session token (containing multiple partition tokens) was sent for single-partition feed range queries. See [PR 44484](https://github.com/Azure/azure-sdk-for-python/pull/44484)

sdk/cosmos/azure-cosmos/azure/cosmos/__init__.py

Lines changed: 3 additions & 1 deletion
@@ -43,6 +43,7 @@
 )
 from .partition_key import PartitionKey
 from .permission import Permission
+from ._global_secondary_index import GlobalSecondaryIndexDefinition
 
 __all__ = (
     "CosmosClient",
@@ -66,6 +67,7 @@
     "ConnectionRetryPolicy",
     "ThroughputProperties",
     "CosmosDict",
-    "CosmosList"
+    "CosmosList",
+    "GlobalSecondaryIndexDefinition"
 )
 __version__ = VERSION

sdk/cosmos/azure-cosmos/azure/cosmos/_global_secondary_index.py

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
+# The MIT License (MIT)
+# Copyright (c) 2014 Microsoft Corporation
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+"""Global Secondary Index (GSI) container definition."""
+
+from typing import Optional
+
+
+class GlobalSecondaryIndexDefinition:
+    """Definition for a Global Secondary Index (GSI) container.
+
+    A GSI container is a derived container built from a source container
+    using a SQL-like projection query. The GSI definition is immutable after creation.
+
+    .. note::
+        A maximum of 5 GSI containers can be created per source container.
+        All GSI containers must be deleted before deleting the source container.
+
+    :param str source_container_id: The ID of the source container the GSI is derived from. Required.
+    :param str definition: The SQL-like projection query that defines the GSI. Required.
+    """
+
+    def __init__(self, source_container_id: str, definition: str):
+        if not source_container_id or not source_container_id.strip():
+            raise ValueError("source_container_id cannot be None or empty.")
+        if not definition or not definition.strip():
+            raise ValueError("definition cannot be None or empty.")
+        self._source_container_id = source_container_id
+        self._definition = definition
+        self._source_container_rid: Optional[str] = None
+        self._status: Optional[str] = None
+
+    @property
+    def source_container_id(self) -> str:
+        """The ID of the source container.
+
+        :returns: The source container ID.
+        :rtype: str
+        """
+        return self._source_container_id
+
+    @property
+    def definition(self) -> str:
+        """The SQL-like projection query that defines the GSI.
+
+        :returns: The projection query.
+        :rtype: str
+        """
+        return self._definition
+
+    @property
+    def source_container_rid(self) -> Optional[str]:
+        """The server-populated resource ID (_rid) of the source container. Read-only.
+
+        :returns: The source container resource ID, or None if not yet populated.
+        :rtype: str or None
+        """
+        return self._source_container_rid
+
+    @property
+    def status(self) -> Optional[str]:
+        """The GSI build status. Read-only, server-populated.
+
+        Possible values: "Initializing", "InitialBuildAfterCreate",
+        "InitialBuildAfterRestore", "Active", "DeleteInProgress"
+
+        :returns: The GSI status, or None if not yet populated.
+        :rtype: str or None
+        """
+        return self._status
+
+    def _to_dict(self) -> dict:
+        """Serialize to wire format dict.
+
+        :returns: A dictionary representation of the GSI definition.
+        :rtype: dict
+        """
+        result: dict = {
+            "sourceCollectionId": self._source_container_id,
+            "definition": self._definition,
+        }
+        if self._source_container_rid is not None:
+            result["sourceCollectionRid"] = self._source_container_rid
+        if self._status is not None:
+            result["status"] = self._status
+        return result
+
+    @classmethod
+    def _from_dict(cls, data: Optional[dict]) -> Optional["GlobalSecondaryIndexDefinition"]:
+        """Deserialize from wire format dict.
+
+        :param dict data: The wire format dictionary.
+        :returns: A GlobalSecondaryIndexDefinition instance, or None if data is None or invalid.
+        :rtype: ~azure.cosmos.GlobalSecondaryIndexDefinition or None
+        """
+        if data is None:
+            return None
+        source_container_id = data.get("sourceCollectionId")
+        definition_query = data.get("definition")
+        if not source_container_id or not definition_query:
+            return None
+        instance = cls(source_container_id, definition_query)
+        instance._source_container_rid = data.get("sourceCollectionRid")  # pylint: disable=protected-access
+        instance._status = data.get("status")  # pylint: disable=protected-access
+        return instance
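
The wire format handled by `_to_dict` and `_from_dict` in this file can be exercised without the SDK installed. The following is a minimal stand-in sketch, not the SDK class: `GsiSketch`, `to_wire`, and `from_wire` are hypothetical names, and the container ID and projection query used with them are illustrative. Only the four wire keys and the validation rule are taken from the diff.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class GsiSketch:
    """Hypothetical stand-in for GlobalSecondaryIndexDefinition (not the SDK class)."""
    source_container_id: str
    definition: str
    source_container_rid: Optional[str] = None  # server-populated
    status: Optional[str] = None                # server-populated

    def __post_init__(self) -> None:
        # Same rule as the diff: reject None, empty, or whitespace-only values.
        if not self.source_container_id or not self.source_container_id.strip():
            raise ValueError("source_container_id cannot be None or empty.")
        if not self.definition or not self.definition.strip():
            raise ValueError("definition cannot be None or empty.")


def to_wire(gsi: GsiSketch) -> dict[str, Any]:
    """Serialize with the wire keys from the diff, omitting unset server fields."""
    wire: dict[str, Any] = {
        "sourceCollectionId": gsi.source_container_id,
        "definition": gsi.definition,
    }
    if gsi.source_container_rid is not None:
        wire["sourceCollectionRid"] = gsi.source_container_rid
    if gsi.status is not None:
        wire["status"] = gsi.status
    return wire


def from_wire(data: Optional[dict[str, Any]]) -> Optional[GsiSketch]:
    """Deserialize, returning None when the payload is missing or incomplete."""
    if data is None:
        return None
    source_id = data.get("sourceCollectionId")
    query = data.get("definition")
    if not source_id or not query:
        return None
    return GsiSketch(source_id, query, data.get("sourceCollectionRid"), data.get("status"))
```

A client-built definition serializes to only the two required keys; a server response that also carries `sourceCollectionRid` and `status` round-trips through `from_wire` and `to_wire` unchanged.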

sdk/cosmos/azure-cosmos/azure/cosmos/aio/_database.py

Lines changed: 31 additions & 0 deletions
@@ -178,6 +178,7 @@ async def create_container(
         vector_embedding_policy: Optional[dict[str, Any]] = None,
         change_feed_policy: Optional[dict[str, Any]] = None,
         full_text_policy: Optional[dict[str, Any]] = None,
+        global_secondary_index_definition: Optional[Any] = None,
         return_properties: Literal[False] = False,
         **kwargs: Any
     ) -> ContainerProxy:
@@ -259,6 +260,7 @@ async def create_container( # pylint: disable=too-many-statements
         vector_embedding_policy: Optional[dict[str, Any]] = None,
         change_feed_policy: Optional[dict[str, Any]] = None,
         full_text_policy: Optional[dict[str, Any]] = None,
+        global_secondary_index_definition: Optional[Any] = None,
         return_properties: Literal[True],
         **kwargs: Any
     ) -> tuple[ContainerProxy, CosmosDict]:
@@ -365,6 +367,9 @@ async def create_container( # pylint:disable=docstring-should-be-keyword, too-ma
         :keyword dict[str, Any] full_text_policy: **provisional** The full text policy for the container.
             Used to denote the default language to be used for all full text indexes, or to individually
             assign a language to each full text index path.
+        :keyword global_secondary_index_definition: The global secondary index definition for the container.
+            Used to create a GSI container derived from a source container via a SQL projection query.
+        :paramtype global_secondary_index_definition: ~azure.cosmos.GlobalSecondaryIndexDefinition or dict[str, Any]
         :keyword bool return_properties: Specifies whether to return either a ContainerProxy
             or a Tuple of a ContainerProxy and the container properties.
         :raises ~azure.cosmos.exceptions.CosmosHttpResponseError: The container creation failed.
@@ -405,6 +410,7 @@ async def create_container( # pylint:disable=docstring-should-be-keyword, too-ma
         computed_properties = kwargs.pop('computed_properties', None)
         change_feed_policy = kwargs.pop('change_feed_policy', None)
         full_text_policy = kwargs.pop('full_text_policy', None)
+        global_secondary_index_definition = kwargs.pop('global_secondary_index_definition', None)
         return_properties = kwargs.pop('return_properties', False)
 
         session_token = kwargs.get('session_token')
@@ -452,6 +458,12 @@ async def create_container( # pylint:disable=docstring-should-be-keyword, too-ma
             definition["changeFeedPolicy"] = change_feed_policy
         if full_text_policy is not None:
             definition["fullTextPolicy"] = full_text_policy
+        if global_secondary_index_definition is not None:
+            gsi_dict = (global_secondary_index_definition._to_dict()
+                        if hasattr(global_secondary_index_definition, '_to_dict')
+                        else global_secondary_index_definition)
+            definition["globalSecondaryIndexDefinition"] = gsi_dict
+            definition["materializedViewDefinition"] = gsi_dict
         request_options = _build_options(kwargs)
         _set_throughput_options(offer=offer_throughput, request_options=request_options)
 
@@ -479,6 +491,7 @@ async def create_container_if_not_exists(
         vector_embedding_policy: Optional[dict[str, Any]] = None,
         change_feed_policy: Optional[dict[str, Any]] = None,
         full_text_policy: Optional[dict[str, Any]] = None,
+        global_secondary_index_definition: Optional[Any] = None,
         return_properties: Literal[False] = False,
         **kwargs: Any
     ) -> ContainerProxy:
@@ -544,6 +557,7 @@ async def create_container_if_not_exists(
         vector_embedding_policy: Optional[dict[str, Any]] = None,
         change_feed_policy: Optional[dict[str, Any]] = None,
         full_text_policy: Optional[dict[str, Any]] = None,
+        global_secondary_index_definition: Optional[Any] = None,
         return_properties: Literal[True],
         **kwargs: Any
     ) -> tuple[ContainerProxy, CosmosDict]:
@@ -636,6 +650,9 @@ async def create_container_if_not_exists( # pylint:disable=docstring-should-be-k
         :keyword dict[str, Any] full_text_policy: **provisional** The full text policy for the container.
             Used to denote the default language to be used for all full text indexes, or to individually
             assign a language to each full text index path.
+        :keyword global_secondary_index_definition: The global secondary index definition for the container.
+            Used to create a GSI container derived from a source container via a SQL projection query.
+        :paramtype global_secondary_index_definition: ~azure.cosmos.GlobalSecondaryIndexDefinition or dict[str, Any]
         :keyword bool return_properties: Specifies whether to return either a ContainerProxy
             or a Tuple of a ContainerProxy and the container properties.
         :raises ~azure.cosmos.exceptions.CosmosHttpResponseError: The container creation failed.
@@ -659,6 +676,7 @@ async def create_container_if_not_exists( # pylint:disable=docstring-should-be-k
         computed_properties = kwargs.pop('computed_properties', None)
         change_feed_policy = kwargs.pop('change_feed_policy', None)
         full_text_policy = kwargs.pop('full_text_policy', None)
+        global_secondary_index_definition = kwargs.pop('global_secondary_index_definition', None)
         return_properties = kwargs.pop('return_properties', False)
 
         session_token = kwargs.get('session_token')
@@ -703,6 +721,7 @@ async def create_container_if_not_exists( # pylint:disable=docstring-should-be-k
                 vector_embedding_policy=vector_embedding_policy,
                 change_feed_policy=change_feed_policy,
                 full_text_policy=full_text_policy,
+                global_secondary_index_definition=global_secondary_index_definition,
                 return_properties=return_properties,
                 **kwargs
             )
@@ -840,6 +859,7 @@ async def replace_container(
         analytical_storage_ttl: Optional[int] = None,
         computed_properties: Optional[list[dict[str, str]]] = None,
         full_text_policy: Optional[dict[str, Any]] = None,
+        global_secondary_index_definition: Optional[Any] = None,
         return_properties: Literal[False] = False,
         vector_embedding_policy: Optional[dict[str, Any]] = None,
         **kwargs: Any
@@ -901,6 +921,7 @@ async def replace_container( # pylint:disable=docstring-missing-param
         analytical_storage_ttl: Optional[int] = None,
         computed_properties: Optional[list[dict[str, str]]] = None,
         full_text_policy: Optional[dict[str, Any]] = None,
+        global_secondary_index_definition: Optional[Any] = None,
         return_properties: Literal[True],
         vector_embedding_policy: Optional[dict[str, Any]] = None,
         **kwargs: Any
@@ -983,6 +1004,9 @@ async def replace_container( # pylint:disable=docstring-should-be-keyword
         :keyword dict[str, Any] full_text_policy: **provisional** The full text policy for the container.
             Used to denote the default language to be used for all full text indexes, or to individually
             assign a language to each full text index path.
+        :keyword global_secondary_index_definition: The global secondary index definition for the container.
+            Used to create a GSI container derived from a source container via a SQL projection query.
+        :paramtype global_secondary_index_definition: ~azure.cosmos.GlobalSecondaryIndexDefinition or dict[str, Any]
         :keyword bool return_properties: Specifies whether to return either a ContainerProxy
             or a Tuple of a ContainerProxy and the container properties.
         :returns: A `ContainerProxy` instance representing the new container or a tuple of the ContainerProxy
@@ -1013,6 +1037,7 @@ async def replace_container( # pylint:disable=docstring-should-be-keyword
         analytical_storage_ttl = kwargs.pop('analytical_storage_ttl', None)
         computed_properties = kwargs.pop('computed_properties', None)
         full_text_policy = kwargs.pop('full_text_policy', None)
+        global_secondary_index_definition = kwargs.pop('global_secondary_index_definition', None)
         return_properties = kwargs.pop('return_properties', False)
         vector_embedding_policy = kwargs.pop('vector_embedding_policy', None)
 
@@ -1055,6 +1080,12 @@ async def replace_container( # pylint:disable=docstring-should-be-keyword
             }.items()
             if value is not None
         }
+        if global_secondary_index_definition is not None:
+            gsi_dict = (global_secondary_index_definition._to_dict()
+                        if hasattr(global_secondary_index_definition, '_to_dict')
+                        else global_secondary_index_definition)
+            parameters["globalSecondaryIndexDefinition"] = gsi_dict
+            parameters["materializedViewDefinition"] = gsi_dict
 
         container_properties = await self.client_connection.ReplaceContainer(
             container_link, collection=parameters, options=request_options, **kwargs
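
The block added to both `create_container` and `replace_container` accepts either an object exposing `_to_dict` or a plain dict, then writes the same payload under two request-body keys. A minimal sketch of that dual-write pattern in isolation follows; `with_gsi_keys` is a hypothetical helper name and not part of the SDK, and the sample body values are illustrative. Only the two key names and the `hasattr` check come from the diff.

```python
from typing import Any, Optional


def with_gsi_keys(body: dict[str, Any], gsi: Optional[Any]) -> dict[str, Any]:
    """Return a copy of `body` with the GSI payload written under both wire keys.

    Hypothetical helper for illustration; the key names are taken from the diff.
    """
    result = dict(body)
    if gsi is None:
        return result
    # Duck typing, as in the diff: objects with _to_dict() are serialized,
    # anything else (expected to be a dict) is passed through unchanged.
    payload = gsi._to_dict() if hasattr(gsi, "_to_dict") else gsi
    result["globalSecondaryIndexDefinition"] = payload
    result["materializedViewDefinition"] = payload  # older key, kept for backward compatibility
    return result
```

As in the diff, both keys point at the same dict object rather than at separate copies, so a caller that mutates one entry afterwards would also change the other.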
