78 changes: 53 additions & 25 deletions docs/sdk/tutorials/importing_assets_and_metadata.md

Large diffs are not rendered by default.

Binary file added recipes/img/json_metadata.png
92 changes: 57 additions & 35 deletions recipes/importing_assets_and_metadata.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's install and import the required modules."
"First, let's install and import the required modules.\n"
]
},
{
Expand Down Expand Up @@ -273,48 +273,58 @@
"\n",
"- `imageUrl`\n",
"- `text`\n",
"- `url`"
"- `url`\n",
"\n",
"\n",
"## Setting metadata properties"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional step, you can set data types for each type of your metadata.\n",
"The default data type is `string`, but setting some of your metadata as `number` can really help apply filters on your assets later on.\n",
"As an optional step, you can define properties for each of your metadata keys.\n",
"These properties allow you to control:\n",
"\n",
"Note that we don't need to set data types for `imageUrl`, `text`, and `url`."
"- The data type (`string` or `number`)\n",
"- Whether the metadata is filterable in the project queue\n",
"- Whether the metadata is visible to labelers and reviewers\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'id': 'cllamrwgl00670j393poh2t4j',\n",
" 'metadataTypes': {'sensitiveData': 'string',\n",
" 'customConsensus': 'number',\n",
" 'uploadedFromCloud': 'string',\n",
" 'modelLabelErrorScore': 'number'}}"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"kili.update_properties_in_project(\n",
" project_id=project_id,\n",
" metadata_types={\n",
" \"customConsensus\": \"number\",\n",
" \"sensitiveData\": \"string\",\n",
" \"uploadedFromCloud\": \"string\",\n",
" \"modelLabelErrorScore\": \"number\",\n",
" metadata_properties={\n",
" \"customConsensus\": {\n",
" \"type\": \"number\",\n",
" \"filterable\": True,\n",
" \"visibleByLabeler\": True,\n",
" \"visibleByReviewer\": True,\n",
" },\n",
" \"sensitiveData\": {\n",
" \"type\": \"string\",\n",
" \"filterable\": True,\n",
" \"visibleByLabeler\": False, # Hide this from labelers\n",
" \"visibleByReviewer\": True,\n",
" },\n",
" \"uploadedFromCloud\": {\n",
" \"type\": \"string\",\n",
" \"filterable\": True,\n",
" \"visibleByLabeler\": True,\n",
" \"visibleByReviewer\": True,\n",
" },\n",
" \"modelLabelErrorScore\": {\n",
" \"type\": \"number\",\n",
" \"filterable\": True,\n",
" \"visibleByLabeler\": True,\n",
" \"visibleByReviewer\": True,\n",
" },\n",
" },\n",
")"
]
Expand All @@ -324,6 +334,17 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Note**: The `metadata_types` parameter is deprecated; use `metadata_properties` instead. `metadata_types` still works, but it is converted internally to `metadata_properties` with default visibility and filterability settings.\n",
"\n",
"If you don't specify all properties, default values will be used:\n",
"\n",
"```\n",
"filterable: true\n",
"type: 'string'\n",
"visibleByLabeler: true\n",
"visibleByReviewer: true\n",
"```\n",
"\n",
"Now we can add metadata to our assets:"
]
},
Expand Down Expand Up @@ -355,19 +376,17 @@
" \"sensitiveData\": \"yes\",\n",
" \"uploadedFromCloud\": \"no\",\n",
" \"modelLabelErrorScore\": 50,\n",
" # Add metadata that will be visible to labelers in the labeling interface:\n",
" \"imageUrl\": \"www.example.com/image_1.png\",\n",
" \"text\": \"some text for asset 1\",\n",
" \"imageUrl\": \"https://placehold.co/600x400/EEE/31343C\",\n",
" \"text\": \"Some text for asset 1\",\n",
" \"url\": \"www.example-website.com\",\n",
" },\n",
" {\n",
" \"customConsensus\": 40,\n",
" \"sensitiveData\": \"no\",\n",
" \"uploadedFromCloud\": \"yes\",\n",
" \"modelLabelErrorScore\": 30,\n",
" # Add metadata that will be visible to labelers in the labeling interface:\n",
" \"imageUrl\": \"www.example.com/image_2.png\",\n",
" \"text\": \"some text for asset 2\",\n",
" \"imageUrl\": \"https://placehold.co/600x400/EEE/31343C\",\n",
" \"text\": \"Some text for asset 2\",\n",
" \"url\": \"www.example-website.com\",\n",
" },\n",
" ],\n",
Expand All @@ -378,7 +397,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In the labeling interface, we can see that the assets have some metadata:"
"> **Note**: Alternatively, you can use the `kili.set_metadata` or `kili.add_metadata` methods.\n",
"\n",
"\n",
"In the labeling interface, we can see that the assets have some metadata (note that `sensitiveData` will be hidden from labelers based on our settings)."
]
},
{
Expand All @@ -390,7 +412,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![image.png](attachment:0bea7811-9a67-461c-b716-319de1343ac8.png)"
"![image.png](./img/json_metadata.png)"
]
},
{
Expand Down Expand Up @@ -441,7 +463,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We've successfully set up a Kili project, imported assets to it, and finally added some metadata to our assets. Well done!"
"We've successfully set up a Kili project, imported assets to it, and finally added some metadata to our assets with advanced property settings. Well done!"
]
}
],
Expand Down
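
The defaults listed in the notebook note (`filterable: true`, `type: 'string'`, `visibleByLabeler: true`, `visibleByReviewer: true`) can be illustrated with a small sketch. This is not the SDK's implementation: the diff does not show where the defaults are applied, and `DEFAULT_METADATA_PROPERTY` and `with_default_properties` are hypothetical names used only for illustration. The sketch just shows what a partially specified `metadata_properties` dictionary would look like once every property is spelled out.

```python
from typing import Any, Dict

# Default values as listed in the notebook note. The constant name is hypothetical.
DEFAULT_METADATA_PROPERTY: Dict[str, Any] = {
    "filterable": True,
    "type": "string",
    "visibleByLabeler": True,
    "visibleByReviewer": True,
}


def with_default_properties(
    partial: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Return a copy where each metadata key carries all four properties.

    Hypothetical helper: explicitly given properties win over the defaults.
    """
    return {
        key: {**DEFAULT_METADATA_PROPERTY, **props}
        for key, props in partial.items()
    }


completed = with_default_properties(
    {
        "customConsensus": {"type": "number"},
        "sensitiveData": {"visibleByLabeler": False},
    }
)
print(completed)
```

Spelling out all four properties before calling `kili.update_properties_in_project` is one way to avoid depending on where the defaults are filled in.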
18 changes: 16 additions & 2 deletions src/kili/adapters/kili_api_gateway/project/mappers.py
Expand Up @@ -29,7 +29,7 @@ def project_where_mapper(filters: ProjectFilters) -> Dict:

def project_data_mapper(data: ProjectDataKiliAPIGatewayInput) -> Dict:
"""Build the GraphQL ProjectData variable to be sent in an operation."""
return {
result = {
"archived": data.archived,
"author": data.author,
"complianceTags": data.compliance_tags,
Expand All @@ -42,7 +42,6 @@ def project_data_mapper(data: ProjectDataKiliAPIGatewayInput) -> Dict:
"inputType": data.input_type,
"instructions": data.instructions,
"jsonInterface": data.json_interface,
"metadataTypes": data.metadata_types,
"minConsensusSize": data.min_consensus_size,
"numberOfAssets": data.number_of_assets,
"rules": data.rules,
Expand All @@ -55,3 +54,18 @@ def project_data_mapper(data: ProjectDataKiliAPIGatewayInput) -> Dict:
"title": data.title,
"useHoneyPot": data.use_honeypot,
}

if data.metadata_properties is not None:
result["metadataProperties"] = data.metadata_properties
elif data.metadata_types is not None:
metadata_properties = {}
for key, type_value in data.metadata_types.items():
metadata_properties[key] = {
"filterable": True,
"type": type_value,
"visibleByLabeler": True,
"visibleByReviewer": True,
}
result["metadataProperties"] = metadata_properties

return result
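
The fallback in `project_data_mapper` can be exercised on its own. The sketch below restates the conversion shown in this hunk as standalone functions. The function names `metadata_types_to_properties` and `resolve_metadata_properties` are hypothetical and do not exist in the SDK. One difference from the mapper is that the sketch returns `None` when neither argument is given, whereas the mapper simply omits the `metadataProperties` key.

```python
from typing import Any, Dict, Optional


def metadata_types_to_properties(
    metadata_types: Dict[str, str],
) -> Dict[str, Dict[str, Any]]:
    """Convert the deprecated {key: type} mapping to the new per-key properties."""
    return {
        key: {
            "filterable": True,
            "type": type_value,
            "visibleByLabeler": True,
            "visibleByReviewer": True,
        }
        for key, type_value in metadata_types.items()
    }


def resolve_metadata_properties(
    metadata_properties: Optional[Dict[str, Dict[str, Any]]],
    metadata_types: Optional[Dict[str, str]],
) -> Optional[Dict[str, Dict[str, Any]]]:
    """Mirror the mapper's precedence: explicit properties first, then types."""
    if metadata_properties is not None:
        return metadata_properties
    if metadata_types is not None:
        return metadata_types_to_properties(metadata_types)
    return None


legacy = resolve_metadata_properties(None, {"customConsensus": "number"})
print(legacy)
```

When both arguments are passed, `metadata_properties` takes precedence and `metadata_types` is ignored, as in the `if`/`elif` above.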
1 change: 1 addition & 0 deletions src/kili/adapters/kili_api_gateway/project/types.py
Expand Up @@ -25,6 +25,7 @@ class ProjectDataKiliAPIGatewayInput:
instructions: Optional[str]
json_interface: Optional[str]
metadata_types: Optional[Dict]
metadata_properties: Optional[Dict]
min_consensus_size: Optional[int]
number_of_assets: Optional[int]
rules: Optional[str]
Expand Down
118 changes: 114 additions & 4 deletions src/kili/entrypoints/mutations/asset/__init__.py
@@ -1,5 +1,4 @@
"""Asset mutations."""

import warnings
from typing import Any, Dict, List, Literal, Optional, Union, cast

Expand All @@ -11,7 +10,7 @@
from kili.adapters.kili_api_gateway.helpers.queries import QueryOptions
from kili.core.helpers import is_empty_list_with_warning
from kili.core.utils.pagination import mutate_from_paginated_call
from kili.domain.asset import AssetFilters
from kili.domain.asset import AssetFilters, AssetId
from kili.domain.project import ProjectId
from kili.entrypoints.base import BaseOperationEntrypointMixin
from kili.entrypoints.mutations.asset.helpers import (
Expand Down Expand Up @@ -91,8 +90,8 @@ def append_many_to_dataset(

json_metadata_array: The metadata given to each asset should be stored in a json like dict with keys.

- Add metadata visible on the asset with the following keys: `imageUrl`, `text`, `url`.
Example for one asset: `json_metadata_array = [{'imageUrl': '','text': '','url': ''}]`.
- Add metadata visible on the asset
Example for one asset: `json_metadata_array = [{'imageUrl': '','text': '','url': '','key1': 'value1'}]`.
- For VIDEO projects (and not VIDEO_LEGACY), you can specify a value with key 'processingParameters' to specify the sampling rate (default: 30).
Example for one asset: `json_metadata_array = [{'processingParameters': {'framesPlayedPerSecond': 10}}]`.
- In Image projects with geoTIFF assets, you can specify the epsg, the `minZoom` and `maxZoom` values for the `processingParameters` key.
Expand Down Expand Up @@ -409,6 +408,117 @@ def generate_variables(batch: Dict) -> Dict:
formated_results = [self.format_result("data", result, None) for result in results]
return [item for batch_list in formated_results for item in batch_list]

@typechecked
def add_metadata(
self,
json_metadata: List[Dict[str, Union[str, int, float]]],
asset_ids: List[str],
project_id: str,
) -> List[Dict[Literal["id"], str]]:
"""Add metadata to assets, merging it into existing metadata (new values win on key conflicts).

Args:
json_metadata: List of metadata dictionaries to add to each asset.
Each dictionary contains key/value pairs to be added to the asset's metadata.
asset_ids: The asset IDs to modify.
project_id: The project ID.

Returns:
A list of dictionaries with the asset ids.

Examples:
>>> kili.add_metadata(
json_metadata=[
{"key1": "value1", "key2": "value2"},
{"key3": "value3"}
],
asset_ids=["ckg22d81r0jrg0885unmuswj8", "ckg22d81s0jrh0885pdxfd03n"],
project_id="cm92to3cx012u7l0w6kij9qvx"
)
"""
if is_empty_list_with_warning("add_metadata", "json_metadata", json_metadata):
return []

assets = self.kili_api_gateway.list_assets(
AssetFilters(
project_id=ProjectId(project_id), asset_id_in=cast(List[AssetId], asset_ids)
),
["id", "jsonMetadata"],
QueryOptions(disable_tqdm=True),
)

json_metadatas = []
for i, asset in enumerate(assets):
current_metadata = asset.get("jsonMetadata", {}) if asset.get("jsonMetadata") else {}
new_metadata = json_metadata[i] if i < len(json_metadata) else {}

current_metadata.update(new_metadata)

json_metadatas.append(current_metadata)

return self.update_properties_in_assets(
asset_ids=asset_ids,
json_metadatas=json_metadatas,
)

@typechecked
def set_metadata(
self,
json_metadata: List[Dict[str, Union[str, int, float]]],
asset_ids: List[str],
project_id: str,
) -> List[Dict[Literal["id"], str]]:
"""Set metadata on assets, replacing existing metadata except the `text`, `imageUrl`, `url` and `processingParameters` keys.

Args:
json_metadata: List of metadata dictionaries to set on each asset.
Each dictionary contains key/value pairs to be set as the asset's metadata.
asset_ids: The asset IDs to modify.
project_id: The project ID.

Returns:
A list of dictionaries with the asset ids.

Examples:
>>> kili.set_metadata(
json_metadata=[
{"key1": "value1", "key2": "value2"},
{"key3": "value3"}
],
asset_ids=["ckg22d81r0jrg0885unmuswj8", "ckg22d81s0jrh0885pdxfd03n"],
project_id="cm92to3cx012u7l0w6kij9qvx"
)
"""
if is_empty_list_with_warning("set_metadata", "json_metadata", json_metadata):
return []

assets = self.kili_api_gateway.list_assets(
AssetFilters(
project_id=ProjectId(project_id), asset_id_in=cast(List[AssetId], asset_ids)
),
["id", "jsonMetadata"],
QueryOptions(disable_tqdm=True),
)

json_metadatas = []
for i, asset in enumerate(assets):
current_metadata = asset.get("jsonMetadata", {}) if asset.get("jsonMetadata") else {}
new_metadata = json_metadata[i] if i < len(json_metadata) else {}

special_keys = ["text", "imageUrl", "url", "processingParameters"]
preserved_metadata = {
k: current_metadata[k] for k in special_keys if k in current_metadata
}

preserved_metadata.update(new_metadata)

json_metadatas.append(preserved_metadata)

return self.update_properties_in_assets(
asset_ids=asset_ids,
json_metadatas=json_metadatas,
)

@typechecked
def change_asset_external_ids(
self,
Expand Down
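
The difference between `add_metadata` and `set_metadata` comes down to how the new dictionary is combined with the asset's current `jsonMetadata`. The sketch below reproduces only that combination step on plain dictionaries, with no API calls. `merged_metadata` and `replaced_metadata` are hypothetical names, and the sample values are made up for illustration.

```python
from typing import Any, Dict

# Keys that set_metadata preserves, as listed in the diff above.
SPECIAL_KEYS = ("text", "imageUrl", "url", "processingParameters")


def merged_metadata(current: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """add_metadata-style merge: keep existing keys, new values win on conflicts."""
    return {**current, **new}


def replaced_metadata(current: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """set_metadata-style replace: drop existing keys except the special ones."""
    preserved = {k: current[k] for k in SPECIAL_KEYS if k in current}
    return {**preserved, **new}


current = {
    "text": "Some text for asset 1",
    "customConsensus": 10,
    "sensitiveData": "yes",
}
new = {"customConsensus": 40}

added = merged_metadata(current, new)
replaced = replaced_metadata(current, new)
print(added)
print(replaced)
```

In this example the merge keeps `sensitiveData` and updates `customConsensus`, while the replace keeps only `text` (a special key) alongside the new `customConsensus` value.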
4 changes: 2 additions & 2 deletions src/kili/entrypoints/mutations/project/queries.py
Expand Up @@ -24,7 +24,7 @@
$instructions: String
$inputType: InputType
$jsonInterface: String
$metadataTypes: JSON
$metadataProperties: JSON
$minConsensusSize: Int
$numberOfAssets: Int
$numberOfSkippedAssets: Int
Expand All @@ -50,7 +50,7 @@
instructions: $instructions
inputType: $inputType
jsonInterface: $jsonInterface
metadataTypes: $metadataTypes
metadataProperties: $metadataProperties
minConsensusSize: $minConsensusSize
numberOfAssets: $numberOfAssets
numberOfSkippedAssets: $numberOfSkippedAssets
Expand Down