Commit 967e5cd

Fix extension field mapping in ES index files to prevent reindex failures (#27080)
* Fix extension field mapping in ES index files to prevent reindex failures

  File entity reindex fails when custom properties are populated because extension is mapped as keyword but receives an object/map value. Changed extension type to flattened in file_index_mapping.json across all locales (en, jp, ru, zh).

  Also fixed JP locale inconsistencies in api_collection_index_mapping.json and database_index_mapping.json where extension was mapped as object instead of flattened.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add test to enforce extension field is flattened in all ES index mappings

  Adds extensionFieldMustBeFlattenedInAllIndices test that scans all index mapping files across all locales and verifies extension fields use the flattened type. This prevents regressions where extension is mapped as keyword or object, which causes reindex failures when custom properties contain object/map values.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Move extension field from tags.properties to root level in JP mappings

  The extension field for entity custom properties belongs at the root properties level, not under tags.properties. TagLabel does not have an extension property. Moves extension to match the EN/RU/ZH locale structure for api_collection and database index mappings.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add missing customPropertiesTyped field to non-EN locale index mappings

  PR #25627 added customPropertiesTyped nested field to all EN mappings but missed 15 files across JP/RU/ZH locales. Without this field, custom property searches fail on these entity types when using non-English locales.

  Missing from JP (7): api_collection, dashboard_data_model, database, directory, pipeline, spreadsheet, worksheet
  Missing from RU (3): directory, spreadsheet, worksheet
  Missing from ZH (5): dashboard_data_model, directory, pipeline, spreadsheet, worksheet

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Detect implicit object mappings in extension field test

  The extension type check now also catches fields declared with properties but no explicit type, which Elasticsearch treats as an implicit object mapping. Previously these would bypass the test.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix extension/fileExtension field clash in file index search

  The searchSettings.json was using the `extension` field (entity custom properties, mapped as flattened/flat_object in OpenSearch) for both multi-match search and terms aggregation. OpenSearch's flat_object type supports neither operation, causing HTTP 500 on all file search queries.

  The `extension` field is the entityExtension (custom properties) JSON object. The actual file extension string (.pdf, .xlsx, etc.) is stored in `fileExtension`.

  Fix the clash by:
  - Add explicit `fileExtension: keyword` mapping to file_index_mapping.json in all four locales (en, jp, ru, zh)
  - Replace `extension` with `fileExtension` in searchSettings.json for the file asset type: search field, terms aggregation, and filterable field descriptor

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
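
The extension type check described in the commit message can be sketched without any JSON library. This is a minimal illustration, not the project's test: it assumes the mapping has already been parsed into nested `Map<String, Object>` values, and the class name `ExtensionMappingCheck` and the helper `field` are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExtensionMappingCheck {

  // Recursively walks a "properties" map and records every field named
  // "extension" whose type is not "flattened". A field without a "type"
  // entry is reported as an implicit object mapping.
  @SuppressWarnings("unchecked")
  static void findViolations(
      Map<String, Object> properties, String currentPath, List<String> violations) {
    for (Map.Entry<String, Object> entry : properties.entrySet()) {
      if (!(entry.getValue() instanceof Map)) {
        continue; // not a field definition, nothing to inspect
      }
      String name = entry.getKey();
      Map<String, Object> fieldDef = (Map<String, Object>) entry.getValue();
      String path = currentPath.isEmpty() ? name : currentPath + "." + name;
      if (name.equals("extension")) {
        Object type = fieldDef.get("type");
        if (!"flattened".equals(type)) {
          String detail = type == null ? "missing type (implicit object)" : type.toString();
          violations.add(path + ": " + detail);
        }
      }
      Object child = fieldDef.get("properties");
      if (child instanceof Map) {
        findViolations((Map<String, Object>) child, path, violations);
      }
    }
  }

  // Hypothetical helper: builds a field definition with only a "type" entry.
  static Map<String, Object> field(String type) {
    Map<String, Object> def = new LinkedHashMap<>();
    def.put("type", type);
    return def;
  }

  public static void main(String[] args) {
    // Mimics the old JP layout: "extension" nested under tags.properties as "object".
    Map<String, Object> tagProps = new LinkedHashMap<>();
    tagProps.put("extension", field("object"));
    Map<String, Object> tags = new LinkedHashMap<>();
    tags.put("properties", tagProps);

    Map<String, Object> root = new LinkedHashMap<>();
    root.put("tags", tags);
    root.put("fileExtension", field("keyword")); // different name, never flagged

    List<String> violations = new ArrayList<>();
    findViolations(root, "", violations);
    System.out.println(violations);
  }
}
```

Matching on the exact field name `extension` is what keeps `fileExtension: keyword` out of the violation list, which is the distinction the last bullet of the commit message relies on.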
1 parent e2458fd commit 967e5cd

21 files changed

Lines changed: 697 additions & 14 deletions

openmetadata-service/src/main/resources/json/data/settings/searchSettings.json

Lines changed: 4 additions & 4 deletions
@@ -1973,7 +1973,7 @@
     "matchType": "standard"
   },
   {
-    "field": "extension",
+    "field": "fileExtension",
     "boost": 2.0,
     "matchType": "standard"
   }
@@ -1985,9 +1985,9 @@
     "field": "fileType"
   },
   {
-    "name": "extension",
+    "name": "fileExtension",
     "type": "terms",
-    "field": "extension"
+    "field": "fileExtension"
   },
   {
     "name": "directory.displayName.keyword",
@@ -3643,7 +3643,7 @@
     "description": "Exact match on the MIME type of the file."
   },
   {
-    "name": "extension",
+    "name": "fileExtension",
     "description": "Exact match on the file extension (e.g., pdf, docx, xlsx)."
   },
   {
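
The clash fixed above (search settings pointing at a field whose mapped type cannot serve the query) could in principle be caught by a small cross-check. The following is a sketch under stated assumptions, not part of this commit: `SearchFieldTypeCheck`, its method, and the flat field-name-to-type map are hypothetical, and the set of unsupported types is passed in by the caller. The example uses `flat_object` only because the commit message reports that OpenSearch rejects multi-match and terms aggregation on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SearchFieldTypeCheck {

  // Returns the referenced fields whose mapped type is in unsupportedTypes.
  // Fields absent from the mapping are skipped, since this sketch only
  // checks type compatibility, not existence.
  static List<String> fieldsWithUnsupportedType(
      Map<String, String> fieldTypes, List<String> referencedFields, Set<String> unsupportedTypes) {
    List<String> problems = new ArrayList<>();
    for (String fieldName : referencedFields) {
      String type = fieldTypes.get(fieldName);
      if (type != null && unsupportedTypes.contains(type)) {
        problems.add(fieldName + " (" + type + ")");
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    // Assumed view of the file index as OpenSearch would see it.
    Map<String, String> fieldTypes =
        Map.of("extension", "flat_object", "fileExtension", "keyword", "fileType", "keyword");
    Set<String> unsupported = Set.of("flat_object");

    // Settings before the fix referenced "extension"; after the fix, "fileExtension".
    System.out.println(
        fieldsWithUnsupportedType(fieldTypes, List.of("fileType", "extension"), unsupported));
    System.out.println(
        fieldsWithUnsupportedType(fieldTypes, List.of("fileType", "fileExtension"), unsupported));
  }
}
```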

openmetadata-service/src/test/java/org/openmetadata/service/search/IndexMappingNestedFieldConsistencyTest.java

Lines changed: 41 additions & 0 deletions
@@ -49,6 +49,25 @@ static void loadAllMappings() throws IOException {
     assertTrue(allMappings.size() > 1, "Should load more than one index mapping");
   }

+  @Test
+  void extensionFieldMustBeFlattenedInAllIndices() {
+    List<String> violations = new ArrayList<>();
+    for (Map.Entry<String, JsonNode> entry : allMappings.entrySet()) {
+      String entity = entry.getKey();
+      JsonNode properties = getTopLevelProperties(entry.getValue());
+      assertNotNull(
+          properties,
+          "Index mapping for '" + entity + "' has no properties — mapping file may be malformed.");
+      findExtensionTypeViolations(properties, "", violations, entity);
+    }
+    assertTrue(
+        violations.isEmpty(),
+        "The 'extension' field must have \"type\": \"flattened\" in all index mappings. "
+            + "Using 'keyword' or 'object' will cause reindex failures when custom properties "
+            + "(entityExtension) contain object/map values. Violations: "
+            + violations);
+  }
+
   @Test
   void ownersFieldMustBeNestedInAllIndices() {
     List<String> violations = new ArrayList<>();
@@ -72,6 +91,28 @@ void ownersFieldMustBeNestedInAllIndices() {
             + ". RBAC nested queries will fail on these indices.");
   }

+  private static void findExtensionTypeViolations(
+      JsonNode properties, String currentPath, List<String> violations, String entity) {
+    Iterator<String> fieldNames = properties.fieldNames();
+    while (fieldNames.hasNext()) {
+      String name = fieldNames.next();
+      JsonNode fieldNode = properties.get(name);
+      String path = currentPath.isEmpty() ? name : currentPath + "." + name;
+      if (name.equals("extension")) {
+        String type = fieldNode.path("type").asText("");
+        if (!"flattened".equals(type)) {
+          String detail =
+              type.isEmpty() ? "missing \"type\" (implicit object)" : "\"" + type + "\"";
+          violations.add(entity + " (" + path + "): " + detail);
+        }
+      }
+      JsonNode childProps = fieldNode.path("properties");
+      if (!childProps.isMissingNode()) {
+        findExtensionTypeViolations(childProps, path, violations, entity);
+      }
+    }
+  }
+
   private static void findViolations(
       JsonNode properties, String fieldName, String currentPath, List<String> violations) {
     Iterator<String> fieldNames = properties.fieldNames();

openmetadata-spec/src/main/resources/elasticsearch/en/file_index_mapping.json

Lines changed: 4 additions & 1 deletion
@@ -313,9 +313,12 @@
   "mimeType": {
     "type": "keyword"
   },
-  "extension": {
+  "fileExtension": {
     "type": "keyword"
   },
+  "extension": {
+    "type": "flattened"
+  },
   "path": {
     "type": "text",
     "analyzer": "om_analyzer",

openmetadata-spec/src/main/resources/elasticsearch/jp/api_collection_index_mapping.json

Lines changed: 45 additions & 3 deletions
@@ -238,14 +238,14 @@
           }
         }
       },
-      "extension": {
-        "type": "object"
-      },
       "state": {
         "type": "keyword"
       }
     }
   },
+  "extension": {
+    "type": "flattened"
+  },
   "entityType": {
     "type": "keyword",
     "fields": {
@@ -640,6 +640,48 @@
       }
     }
   },
+  "customPropertiesTyped": {
+    "type": "nested",
+    "properties": {
+      "name": {
+        "type": "keyword"
+      },
+      "propertyType": {
+        "type": "keyword"
+      },
+      "stringValue": {
+        "type": "keyword"
+      },
+      "textValue": {
+        "type": "text",
+        "analyzer": "om_analyzer"
+      },
+      "longValue": {
+        "type": "long"
+      },
+      "doubleValue": {
+        "type": "double"
+      },
+      "start": {
+        "type": "long"
+      },
+      "end": {
+        "type": "long"
+      },
+      "refId": {
+        "type": "keyword"
+      },
+      "refType": {
+        "type": "keyword"
+      },
+      "refName": {
+        "type": "keyword"
+      },
+      "refFqn": {
+        "type": "keyword"
+      }
+    }
+  },
   "fingerprint": {
     "type": "keyword"
   },

openmetadata-spec/src/main/resources/elasticsearch/jp/dashboard_data_model_index_mapping.json

Lines changed: 42 additions & 0 deletions
@@ -679,6 +679,48 @@
       }
     }
   },
+  "customPropertiesTyped": {
+    "type": "nested",
+    "properties": {
+      "name": {
+        "type": "keyword"
+      },
+      "propertyType": {
+        "type": "keyword"
+      },
+      "stringValue": {
+        "type": "keyword"
+      },
+      "textValue": {
+        "type": "text",
+        "analyzer": "om_analyzer"
+      },
+      "longValue": {
+        "type": "long"
+      },
+      "doubleValue": {
+        "type": "double"
+      },
+      "start": {
+        "type": "long"
+      },
+      "end": {
+        "type": "long"
+      },
+      "refId": {
+        "type": "keyword"
+      },
+      "refType": {
+        "type": "keyword"
+      },
+      "refName": {
+        "type": "keyword"
+      },
+      "refFqn": {
+        "type": "keyword"
+      }
+    }
+  },
   "fingerprint": {
     "type": "keyword"
   },

openmetadata-spec/src/main/resources/elasticsearch/jp/database_index_mapping.json

Lines changed: 45 additions & 3 deletions
@@ -250,14 +250,14 @@
           }
         }
       },
-      "extension": {
-        "type": "object"
-      },
       "state": {
         "type": "keyword"
       }
     }
   },
+  "extension": {
+    "type": "flattened"
+  },
   "entityType": {
     "type": "keyword",
     "fields": {
@@ -636,6 +636,48 @@
       }
     }
   },
+  "customPropertiesTyped": {
+    "type": "nested",
+    "properties": {
+      "name": {
+        "type": "keyword"
+      },
+      "propertyType": {
+        "type": "keyword"
+      },
+      "stringValue": {
+        "type": "keyword"
+      },
+      "textValue": {
+        "type": "text",
+        "analyzer": "om_analyzer"
+      },
+      "longValue": {
+        "type": "long"
+      },
+      "doubleValue": {
+        "type": "double"
+      },
+      "start": {
+        "type": "long"
+      },
+      "end": {
+        "type": "long"
+      },
+      "refId": {
+        "type": "keyword"
+      },
+      "refType": {
+        "type": "keyword"
+      },
+      "refName": {
+        "type": "keyword"
+      },
+      "refFqn": {
+        "type": "keyword"
+      }
+    }
+  },
   "fingerprint": {
     "type": "keyword"
   },

openmetadata-spec/src/main/resources/elasticsearch/jp/directory_index_mapping.json

Lines changed: 42 additions & 0 deletions
@@ -690,6 +690,48 @@
       }
     }
   },
+  "customPropertiesTyped": {
+    "type": "nested",
+    "properties": {
+      "name": {
+        "type": "keyword"
+      },
+      "propertyType": {
+        "type": "keyword"
+      },
+      "stringValue": {
+        "type": "keyword"
+      },
+      "textValue": {
+        "type": "text",
+        "analyzer": "om_analyzer"
+      },
+      "longValue": {
+        "type": "long"
+      },
+      "doubleValue": {
+        "type": "double"
+      },
+      "start": {
+        "type": "long"
+      },
+      "end": {
+        "type": "long"
+      },
+      "refId": {
+        "type": "keyword"
+      },
+      "refType": {
+        "type": "keyword"
+      },
+      "refName": {
+        "type": "keyword"
+      },
+      "refFqn": {
+        "type": "keyword"
+      }
+    }
+  },
   "fingerprint": {
     "type": "keyword"
   },

openmetadata-spec/src/main/resources/elasticsearch/jp/file_index_mapping.json

Lines changed: 4 additions & 1 deletion
@@ -167,9 +167,12 @@
   "mimeType": {
     "type": "keyword"
   },
-  "extension": {
+  "fileExtension": {
     "type": "keyword"
   },
+  "extension": {
+    "type": "flattened"
+  },
   "path": {
     "type": "keyword"
   },

openmetadata-spec/src/main/resources/elasticsearch/jp/pipeline_index_mapping.json

Lines changed: 42 additions & 0 deletions
@@ -631,6 +631,48 @@
       }
     }
   },
+  "customPropertiesTyped": {
+    "type": "nested",
+    "properties": {
+      "name": {
+        "type": "keyword"
+      },
+      "propertyType": {
+        "type": "keyword"
+      },
+      "stringValue": {
+        "type": "keyword"
+      },
+      "textValue": {
+        "type": "text",
+        "analyzer": "om_analyzer"
+      },
+      "longValue": {
+        "type": "long"
+      },
+      "doubleValue": {
+        "type": "double"
+      },
+      "start": {
+        "type": "long"
+      },
+      "end": {
+        "type": "long"
+      },
+      "refId": {
+        "type": "keyword"
+      },
+      "refType": {
+        "type": "keyword"
+      },
+      "refName": {
+        "type": "keyword"
+      },
+      "refFqn": {
+        "type": "keyword"
+      }
+    }
+  },
   "fingerprint": {
     "type": "keyword"
   },
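
The locale drift that left `customPropertiesTyped` out of 15 non-EN files is the kind of gap a parity check could surface. The sketch below is an illustration only and is not included in this commit: `LocaleFieldParityCheck` and `missingIn` are hypothetical names, and the input is assumed to be a pre-extracted map of locale to entity to top-level field names rather than the real mapping files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class LocaleFieldParityCheck {

  // For every entity whose reference-locale mapping declares `field`, reports
  // each other locale whose mapping for the same entity lacks it, as "locale/entity".
  // TreeMap copies keep the report order deterministic.
  static List<String> missingIn(
      Map<String, Map<String, Set<String>>> fieldsByLocale, String referenceLocale, String field) {
    List<String> missing = new ArrayList<>();
    Map<String, Set<String>> reference = fieldsByLocale.getOrDefault(referenceLocale, Map.of());
    for (Map.Entry<String, Map<String, Set<String>>> locale :
        new TreeMap<>(fieldsByLocale).entrySet()) {
      if (locale.getKey().equals(referenceLocale)) {
        continue;
      }
      for (Map.Entry<String, Set<String>> entity : new TreeMap<>(locale.getValue()).entrySet()) {
        Set<String> referenceFields = reference.get(entity.getKey());
        boolean expected = referenceFields != null && referenceFields.contains(field);
        if (expected && !entity.getValue().contains(field)) {
          missing.add(locale.getKey() + "/" + entity.getKey());
        }
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Toy data: two entities, with the field missing from some non-EN mappings.
    Map<String, Map<String, Set<String>>> fieldsByLocale =
        Map.of(
            "en",
                Map.of(
                    "pipeline", Set.of("extension", "customPropertiesTyped"),
                    "directory", Set.of("extension", "customPropertiesTyped")),
            "jp", Map.of("pipeline", Set.of("extension"), "directory", Set.of("extension")),
            "ru",
                Map.of(
                    "pipeline", Set.of("extension", "customPropertiesTyped"),
                    "directory", Set.of("extension")));
    System.out.println(missingIn(fieldsByLocale, "en", "customPropertiesTyped"));
  }
}
```

Using EN as the reference mirrors how the commit message frames the problem: the field was added to all EN mappings first, and the other locales were expected to follow.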
