Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -641,13 +641,59 @@ private boolean isKnownCanonicalIndex(String name) {
return false;
}

/**
* Resolve the supplied index alias into the actual Elasticsearch / OpenSearch index name to
* query. Handles four shapes:
*
* <ul>
* <li><b>Entity-specific alias</b> (e.g. {@code "table"}): looked up in
* {@code entityIndexMap} and resolved to the canonical {@code *_search_index} name.
* This is the bug fix — without resolving, ES would treat {@code "table"} as an alias
* and expand it to every index that has that alias attached, including
* {@code column_search_index} (because {@code tableColumn} declares {@code "table"} as
* a {@code parentAlias}). Resolving here bypasses ES's alias expansion entirely so a
* query for tables only hits the table index.
* <li><b>Compound alias</b> (e.g. {@code "all"}, {@code "dataAsset"}): no entry in
* {@code entityIndexMap}, no canonical index, so the alias passes through and ES
* resolves it natively across the entities that have registered the alias. This is the
* intended behavior — searching {@code dataAsset} should surface every data-asset
* entity.
* <li><b>Canonical / legacy index name</b> (e.g. {@code "table_search_index"}): not a key
* in {@code entityIndexMap}, falls through to the prefix-and-pass branch, identical to
* the legacy behavior.
* <li><b>Already cluster-prefixed token</b>: idempotent — returned unchanged so that
* internal code paths that hand back a resolved value don't double-prefix.
* </ul>
*
* Comma-separated tokens are resolved independently. Empty tokens (from {@code "table,"} or
* {@code ","}) are dropped instead of materializing as a bare cluster prefix; if every token
* is empty the original input is returned unchanged so downstream ES surfaces a normal
* "unknown index" error instead of an empty-target failure.
*/
public String getIndexOrAliasName(String name) {
if (clusterAlias == null || clusterAlias.isEmpty()) {
if (nullOrEmpty(name)) {
return name;
}
return Arrays.stream(name.split(","))
.map(index -> clusterAlias + INDEX_NAME_SEPARATOR + index.trim())
.collect(Collectors.joining(","));
String prefix =
clusterAlias == null || clusterAlias.isEmpty() ? null : clusterAlias + INDEX_NAME_SEPARATOR;
String resolved =
Arrays.stream(name.split(","))
.map(String::trim)
.filter(t -> !t.isEmpty())
.map(t -> resolveSingleAliasToken(t, prefix))
.collect(Collectors.joining(","));
return resolved.isEmpty() ? name : resolved;
}

private String resolveSingleAliasToken(String token, String clusterPrefix) {
if (clusterPrefix != null && token.startsWith(clusterPrefix)) {
return token;
}
IndexMapping mapping = entityIndexMap == null ? null : entityIndexMap.get(token);
if (mapping != null) {
return mapping.getIndexName(clusterAlias);
}
Comment thread
mohityadav766 marked this conversation as resolved.
return clusterPrefix == null ? token : clusterPrefix + token;
}

private static final Map<String, Set<String>> RBAC_CHILD_TYPES =
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -275,6 +275,68 @@ void indexNameHelpersRespectClusterAlias() {
"table_search_index", repository.getIndexNameWithoutAlias("cluster_table_search_index"));
}

/**
* Bug regression for issue #27761: passing the entity-specific alias {@code "table"} used to
* leak into ES alias expansion and surface tableColumn docs (because column_search_index is
* registered with {@code "table"} as one of its aliases). Resolving the alias to its canonical
* index name here bypasses ES's alias resolution, so the search hits exactly the table index.
*/
@Test
void getIndexOrAliasNameResolvesEntitySpecificAliasToCanonicalIndex() {
assertEquals("cluster_table_search_index", repository.getIndexOrAliasName("table"));
assertEquals("cluster_domain_search_index", repository.getIndexOrAliasName("domain"));
}

/**
* Compound aliases like {@code "all"} and {@code "dataAsset"} have no entry in
* {@code entityIndexMap} (they're meta-aliases registered against many entities at index
* creation time). The resolver passes them through with the cluster prefix so ES expands them
* natively — searching {@code dataAsset} should still surface every data-asset entity.
*/
@Test
void getIndexOrAliasNamePassesCompoundAliasesThroughForNativeESExpansion() {
assertEquals("cluster_dataAsset", repository.getIndexOrAliasName("dataAsset"));
Comment on lines +284 to +298
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Edge Case: No test coverage for non-clustered (null/empty clusterAlias) path

All new tests in SearchRepositoryBehaviorTest use clusterAlias = "cluster". The fix changes behavior for non-clustered deployments too: previously getIndexOrAliasName("table") with no cluster alias returned "table" (passthrough), now it returns "table_search_index" (resolved canonical name). This is the core of the fix and arguably the more common deployment mode, but it has zero dedicated test coverage. If IndexMapping.getIndexName(null) or the prefix == null path in resolveSingleAliasToken ever regresses, no test will catch it.

Suggested fix:

@Test
void getIndexOrAliasNameResolvesAliasWithoutClusterAlias() {
  SearchRepository noPrefixRepo =
      newRepository(
          Map.of(Entity.TABLE, TABLE_MAPPING, Entity.DOMAIN, DOMAIN_MAPPING),
          null);
  assertEquals("table_search_index", noPrefixRepo.getIndexOrAliasName("table"));
  assertEquals("dataAsset", noPrefixRepo.getIndexOrAliasName("dataAsset"));
  assertEquals("table_search_index", noPrefixRepo.getIndexOrAliasName("table_search_index"));
}

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

assertEquals("cluster_all", repository.getIndexOrAliasName("all"));
}

/**
* Defense-in-depth: a token that already carries the cluster prefix must not get prefixed
* again. Otherwise multi-tenant deployments would 404 on
* {@code cluster_cluster_table_search_index} if any internal code accidentally hands a
* resolved value back to this method.
*/
@Test
void getIndexOrAliasNameIsIdempotentForAlreadyPrefixedTokens() {
assertEquals(
"cluster_table_search_index", repository.getIndexOrAliasName("cluster_table_search_index"));
}

/**
* Mixed input: each comma-separated token is resolved independently. Entity-specific aliases
* resolve to canonical names; compound aliases pass through.
*/
@Test
void getIndexOrAliasNameResolvesEachCommaSeparatedTokenIndependently() {
assertEquals(
"cluster_table_search_index,cluster_dataAsset",
repository.getIndexOrAliasName("table,dataAsset"));
}

/**
* Stray-comma / empty-token input must not produce bare cluster prefixes such as
* {@code "cluster_"}. Empty tokens are dropped; if every token is empty the original string
* is returned unchanged so downstream ES surfaces a normal "unknown index" error instead of
* a confusing empty-target failure.
*/
@Test
void getIndexOrAliasNameDropsEmptyTokensAndPreservesAllEmptyInput() {
assertEquals("cluster_table_search_index", repository.getIndexOrAliasName("table,"));
assertEquals(
"cluster_table_search_index,cluster_domain_search_index",
repository.getIndexOrAliasName("table, ,domain"));
assertEquals(", ,", repository.getIndexOrAliasName(", ,"));
}

@Test
void indexExistsFallsBackToAliasLookup() {
when(searchClient.indexExists("cluster_table_search_index")).thenReturn(false);
Expand Down
Loading