Skip to content

Fix payload size issue#27342

Closed
mohityadav766 wants to merge 32 commits intomainfrom
fix-payload-issue
Closed

Fix payload size issue#27342
mohityadav766 wants to merge 32 commits intomainfrom
fix-payload-issue

Conversation

@mohityadav766
Copy link
Copy Markdown
Member

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@mohityadav766 mohityadav766 self-assigned this Apr 14, 2026
@mohityadav766 mohityadav766 requested a review from a team as a code owner April 14, 2026 08:19
Copilot AI review requested due to automatic review settings April 14, 2026 08:19
@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Apr 14, 2026
@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses search-index payload size limits by deduplicating repeated lineage SQL text across edges, storing the SQL once per document and referencing it via a key on each edge. It also tightens bulk payload sizing/headroom and adds safeguards for oversized single documents.

Changes:

  • Introduces sqlQueryKey on lineage edges and a doc-level lineageSqlQueries map; deduplicates SQL during index doc construction and in the lineage update script.
  • Updates Elasticsearch/OpenSearch index mappings (EN) to include sqlQueryKey and to store lineageSqlQueries as a non-indexed object (enabled: false).
  • Reduces effective bulk payload target to ~9MB, adds direct-index fallback for oversized documents, and adds unit tests for SQL deduplication.

Reviewed changes

Copilot reviewed 66 out of 67 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
openmetadata-ui/src/main/resources/ui/src/generated/api/lineage/esLineageData.ts Adds sqlQueryKey to the generated UI type for ES lineage edge payloads.
openmetadata-spec/src/main/resources/json/schema/api/lineage/esLineageData.json Extends lineage edge schema with sqlQueryKey.
openmetadata-spec/src/main/resources/elasticsearch/en/worksheet_index_mapping.json Adds sqlQueryKey to upstreamLineage mapping and adds lineageSqlQueries (disabled object).
openmetadata-spec/src/main/resources/elasticsearch/en/topic_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/table_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping to support SQL dedup in table lineage edges.
openmetadata-spec/src/main/resources/elasticsearch/en/stored_procedure_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/spreadsheet_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/search_entity_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/prompt_template_index_mapping.json Adds lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/pipeline_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/mlmodel_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/metric_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/llm_model_index_mapping.json Adds lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/file_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/directory_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/dashboard_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/dashboard_data_model_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/container_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/chart_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/api_endpoint_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/api_collection_index_mapping.json Adds sqlQueryKey and lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/ai_governance_policy_index_mapping.json Adds lineageSqlQueries mapping.
openmetadata-spec/src/main/resources/elasticsearch/en/ai_agent_index_mapping.json Adds lineageSqlQueries mapping.
openmetadata-service/src/test/java/org/openmetadata/service/search/indexes/SearchIndexTest.java Adds unit tests validating SQL deduplication behavior and large-edge-count scenario.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/WorksheetIndex.java Switches to SearchIndex.populateLineageData to write deduped lineage fields.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/TopicIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/TableIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/StoredProcedureIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/StorageServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/SpreadsheetIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/SecurityServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/SearchServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/SearchIndex.java Adds populateLineageData to populate upstreamLineage + lineageSqlQueries and invoke dedup.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/SearchEntityIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/PromptTemplateIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/PipelineServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/PipelineIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/MlModelServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/MlModelIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/MetricIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/MetadataServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/MessagingServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/McpServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/McpServerIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/LlmServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/LlmModelIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/FileIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DriveServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DomainIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DirectoryIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DatabaseServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DataProductIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DashboardServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DashboardIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/DashboardDataModelIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/ContainerIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/ChartIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/AiGovernancePolicyIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/AiApplicationIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/APIServiceIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/indexes/APIEndpointIndex.java Switches to SearchIndex.populateLineageData.
openmetadata-service/src/main/java/org/openmetadata/service/search/SearchRepository.java Lowers bulk max payload target to ~9MB headroom.
openmetadata-service/src/main/java/org/openmetadata/service/search/SearchIndexUtils.java Adds deduplicateSqlAcrossEdges utility that clears sqlQuery and sets sqlQueryKey.
openmetadata-service/src/main/java/org/openmetadata/service/search/SearchClusterMetrics.java Applies the same 10% payload headroom in conservative defaults.
openmetadata-service/src/main/java/org/openmetadata/service/search/SearchClient.java Updates lineage upsert script to deduplicate SQL into doc-level lineageSqlQueries and set sqlQueryKey.
openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/OpenSearchBulkSink.java Adjusts bulk overhead estimate, flush thresholding, and adds direct-index fallback for oversized docs.
openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/ElasticSearchBulkSink.java Same as OpenSearch sink: overhead, flush behavior, and direct-index fallback.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 14, 2026

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 63%
63.88% (59797/93603) 43.64% (31322/71768) 46.73% (9415/20144)

Copilot AI review requested due to automatic review settings April 14, 2026 09:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 67 out of 68 changed files in this pull request and generated 4 comments.

ulixius9 and others added 9 commits April 14, 2026 23:30
- Extract URL credential sanitization to generic `sanitize_url_credentials` in logger utils
- Fix misleading log prefix `GitHubCloneReader::_clone` → `_clone_repo`
- Add BigQuery INFORMATION_SCHEMA context to `split_table_name` comment
- Add unit tests for `sanitize_url_credentials`

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ed mode for RDF reindex (#26902)

* RDF, cleanup relations and remove unnecessary bindings, add distributed mode for RDF reindex

* Update generated TypeScript types

* Address comments from copilot

* Update generated TypeScript types

* fix test issues

* Fix minor UI bugs

* Add the missing filters

* Fix RDF export API error

* Add export functionality

* Fix ui-checkstyle

* Fix java checkstyle

* Fix unit tests

* Fix and increase the coverage for KnowledgeGraph.spec.ts

* Fix tests

* Remove rdf as default in playwright and local docker

* fix ui-checkstyle

* Address comments

* Potential fix for pull request finding 'CodeQL / Artifact poisoning'

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Address copilot comments

* Address copilot comments

* FIx tests

* FIx docker

* Update openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/rdf/distributed/DistributedRdfIndexCoordinator.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Address copilot review comments: license headers, JSON escaping, type safety, border-color, stop semantics

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/c026e52e-162b-4c9a-9874-43791d4aaac1

Co-authored-by: harshach <38649+harshach@users.noreply.github.com>

* Show error toast for unsupported export format in KnowledgeGraph

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/c026e52e-162b-4c9a-9874-43791d4aaac1

Co-authored-by: harshach <38649+harshach@users.noreply.github.com>

* Fix docker

* Fix docker for playwright

* Fix docker for playwright

* Fix tests

* Fix tests

* Fix docker

* Fix docker

* Fix glossary and pagination spec flakiness

* update the missing translations

* Fix docker

* Fix docker

* Fix integration test

* Fix fuseki not starting

* Fixed the run local docker script

* worked on comments

* Fix flakiness in knowledge graph tests

* Fix checkstyle

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harshach <38649+harshach@users.noreply.github.com>
…ailure (#27355)

* Fixed the component re-rendering issue which causes the Entity test failures

* fixed DataAssetsTest

* fixed loading issue
* fix flaky search index application specs

* fix lint issues

---------

Co-authored-by: Shrabanti Paul <shrabantipaul@Shrabantis-MacBook-Pro.local>
…ecurity (#27269)

* feat: implement Content Security Policy nonce handling for enhanced security

* address comment

* address comments

* fix: address PR review feedback - fix IndexResource resource leak and CSP policy formatting

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/049d4931-ba83-4a4f-b4bc-1f0f8d27f718

Co-authored-by: chirag-madlani <12962843+chirag-madlani@users.noreply.github.com>

* fix migration issue

* revert quote change for reportOnlyPolicy

* fix: address PR review - license header, shared constants, and test correctness

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/c3c86206-0ef2-480e-af0b-3aac18706365

Co-authored-by: chirag-madlani <12962843+chirag-madlani@users.noreply.github.com>

* fix: correct YAML quoting for CSP policy in openmetadata.yaml

Agent-Logs-Url: https://github.com/open-metadata/OpenMetadata/sessions/a56f2afb-53b2-4dbe-836e-7f6e12bf85dc

Co-authored-by: chirag-madlani <12962843+chirag-madlani@users.noreply.github.com>

* fix errors

* revert csp enabled tests

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
…tion verification in teams & teams drag drop spec (#27341)

* fix(playwright): replace flaky toast assertion with search-based deletion verification in teams & teams drag drop spec

* address gitar

* address comments
* Fix Parquet Reader to use boto3 client

* Fix Avro/csv to use boto3 client

* Fix Avro/csv to use boto3 client

* Address comments

* Fix S3 tests

* Fix failing test

* address gitar comments

* address gitar comments

* Address co-pilot comments
- Update mock expectations in SearchIndexExecutorControlFlowTest and
  DistributedJobParticipantTest to use DEFAULT_BULK_PAYLOAD_SIZE_BYTES
  instead of hardcoded 104857600L (old 100MB default)
- Remove "charts" from DashboardIndex excluded fields — charts are needed
  for search filters and column lineage resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 15, 2026 08:17
@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 250 out of 252 changed files in this pull request and generated 4 comments.

Comment on lines +26 to +36
try (InputStream inputStream = IndexResource.class.getResourceAsStream("/assets/index.html")) {
if (inputStream == null) {
throw new IllegalStateException("Missing required resource: /assets/index.html");
}
RAW_INDEX_HTML =
new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))
.lines()
.collect(Collectors.joining("\n"));
} catch (IOException e) {
throw new IllegalStateException("Failed to load /assets/index.html", e);
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IndexResource now fails fast if /assets/index.html is missing. In the source tree, there is no openmetadata-service/src/main/resources/assets/index.html (only a test resource was added), so production will rely on this file being provided by another module/build step. Please ensure /assets/index.html is actually present on the runtime classpath in the service artifact (or adjust the resource path to the packaged location), otherwise class initialization will throw and the server won’t start.

Copilot uses AI. Check for mistakes.
Comment on lines +287 to +293
it('logs schema import failures before showing the fallback toast', async () => {
mockImportSchema.mockRejectedValueOnce(new Error('schema missing'));

await renderAppDetails();

expect(mockShowErrorToast).toHaveBeenCalled();
});
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name says it "logs schema import failures" but the assertion only checks that showErrorToast was called (and the implementation doesn’t log the caught error). Consider renaming the test to match what it verifies, or add an assertion for the intended logging behavior.

Copilot uses AI. Check for mistakes.
Comment on lines 36 to +39
public static final long DEFAULT_MAX_CONTENT_LENGTH =
10 * 1024 * 1024L; // Conservative 10MB default
10 * 1024 * 1024L; // Conservative 10MB default (AWS OpenSearch hard limit)
// Safe bulk payload threshold: 90% of max_content_length to leave headroom for HTTP framing
public static final long DEFAULT_BULK_PAYLOAD_SIZE_BYTES = DEFAULT_MAX_CONTENT_LENGTH * 9 / 10;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR title/description indicates a payload-size fix, but the diff also includes substantial unrelated changes (CSP nonce handling, RDF distributed indexing state tables, UI/export changes, ingestion reader changes, etc.). Please update the PR description to reflect the full scope or split into focused PRs to simplify review and risk management.

Copilot uses AI. Check for mistakes.
Comment on lines +35 to 69
root: 'tw:focus-visible:outline-2 tw:focus-visible:outline-offset-2 tw:relative tw:overflow-hidden tw:rounded-xl tw:transition tw:duration-100',
},
variants: {
default: {
root: 'tw:ring-1 tw:ring-inset tw:ring-secondary tw:bg-primary',
root: 'tw:border-1 tw:border-secondary tw:bg-primary',
},
elevated: {
root: 'tw:ring-1 tw:ring-inset tw:ring-secondary tw:bg-primary tw:shadow-md',
root: 'tw:border-1 tw:border-secondary tw:bg-primary tw:shadow-md',
},
outlined: { root: 'tw:ring-2 tw:ring-inset tw:ring-primary tw:bg-primary' },
outlined: { root: 'tw:border-2 tw:border-primary tw:bg-primary' },
ghost: { root: 'tw:bg-transparent' },
},
colors: {
default: { root: '' },
brand: {
root: 'tw:bg-utility-brand-50 tw:ring-1 tw:ring-inset tw:ring-utility-brand-200',
root: 'tw:bg-utility-brand-50 tw:border-1 tw:border-utility-brand-200',
},
error: {
root: 'tw:bg-utility-error-50 tw:ring-1 tw:ring-inset tw:ring-utility-error-200',
root: 'tw:bg-utility-error-50 tw:border-1 tw:border-utility-error-200',
},
warning: {
root: 'tw:bg-utility-warning-50 tw:ring-1 tw:ring-inset tw:ring-utility-warning-200',
root: 'tw:bg-utility-warning-50 tw:border-1 tw:border-utility-warning-200',
},
success: {
root: 'tw:bg-utility-success-50 tw:ring-1 tw:ring-inset tw:ring-utility-success-200',
root: 'tw:bg-utility-success-50 tw:border-1 tw:border-utility-success-200',
},
},
interactive: { root: 'tw:cursor-pointer' },
interactiveVariants: {
default: { root: 'tw:hover:bg-primary_hover tw:hover:ring-primary' },
default: { root: 'tw:hover:bg-primary_hover tw:hover:border-primary' },
elevated: { root: 'tw:hover:bg-primary_hover tw:hover:shadow-lg' },
outlined: { root: 'tw:hover:bg-secondary tw:hover:ring-primary' },
outlined: { root: 'tw:hover:bg-secondary tw:hover:border-primary' },
ghost: {
root: 'tw:hover:bg-secondary tw:hover:ring-1 tw:hover:ring-inset tw:hover:ring-secondary',
root: 'tw:hover:bg-secondary tw:hover:border-1 tw:hover:border-inset tw:hover:border-secondary',
},
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cardStyles introduces Tailwind classes tw:border-1 and tw:hover:border-inset. border-1 / border-inset are not part of Tailwind’s default utility set (and there are no other usages in the codebase), so these classes are likely no-ops and will regress card borders/hover styling. Consider using tw:border for 1px borders and remove the unsupported border-inset (or replace with an equivalent, e.g. tw:ring-inset if the intent was an inset ring).

Copilot uses AI. Check for mistakes.
mohityadav766 and others added 2 commits April 15, 2026 14:07
Replace hardcoded 104857600L with SearchClusterMetrics.DEFAULT_BULK_PAYLOAD_SIZE_BYTES
to use the centralized 9MB default consistently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented Apr 15, 2026

Code Review ✅ Approved 5 resolved / 5 findings

Fixes payload size issues by resolving buffer flush ordering, addressing direct-index increment gaps, and deduplicating magic numbers. Refactored strip methods to eliminate redundant large memory allocations and corrected SQL key collision risks.

✅ 5 resolved
Bug: Missing totalSubmitted increment for directly-indexed oversized docs

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/ElasticSearchBulkSink.java:317-322 📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/OpenSearchBulkSink.java:368-374 📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/ElasticSearchBulkSink.java:311-323 📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/OpenSearchBulkSink.java:362-374
When a document exceeds maxPayloadSizeBytes, indexDocumentDirectly() increments totalSuccess (or totalFailed), and the caller increments processSuccess, but totalSubmitted is never incremented. For normal bulk documents, totalSubmitted is incremented in flushInternal(). This causes stats to report fewer documents submitted than actually succeeded, breaking the invariant totalSubmitted >= totalSuccess.

Bug: Buffer flush reorder can leave single oversized op stuck in buffer

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/ElasticSearchBulkSink.java:822-828 📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/OpenSearchBulkSink.java:965-971
The refactored CustomBulkProcessor.add() now flushes before adding the operation and only when !buffer.isEmpty(). If a single operation larger than maxPayloadSizeBytes arrives when the buffer is empty, it gets added without triggering a flush. The next operation will then trigger a flush of the oversized batch. While the addEntity guard mostly prevents this, time-series and column paths (and any future callers) can still hit this scenario.

Bug: Painless script sqlKey assignment can collide on non-contiguous maps

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchClient.java:264-265
The ADD_UPDATE_LINEAGE Painless script computes the new SQL key as String.valueOf(sqlMap.size() + 1). This assumes keys are a dense sequence starting at 1. If lineageSqlQueries ever has gaps (e.g., stale entries from removed edges that weren't cleaned up, or manual modifications), size() + 1 can collide with an existing key, silently overwriting a different SQL query. The REMOVE_LINEAGE_SCRIPT removes edges from upstreamLineage but does not clean up the corresponding lineageSqlQueries entries, so orphaned keys accumulate and the map size diverges from the max key.

Performance: Redundant multi-MB byte array allocations in strip methods

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchIndexUtils.java:104-105 📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchIndexUtils.java:131-132
Both stripLineageForSize and stripDocMapIfOversized call json.getBytes(StandardCharsets.UTF_8) multiple times on the same (unchanged) string. For example in stripLineageForSize, lines 104 and 105 compute .getBytes(UTF_8).length on the same json string twice, allocating a potentially multi-megabyte byte array each time only to discard it. The same pattern repeats in stripDocMapIfOversized at lines 131-132.

Store the byte length in a local variable after each serialization to avoid the redundant allocation.

Quality: Magic number 9MB duplicated across three files

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchRepository.java:1085 📄 openmetadata-service/src/main/java/org/openmetadata/service/search/SearchIndexRetryWorker.java:484 📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/searchIndex/ReindexingConfiguration.java:49
The 9 MB payload limit (9L * 1024L * 1024L) is hardcoded independently in SearchRepository.java:1085, SearchIndexRetryWorker.java:484, and ReindexingConfiguration.java:49. If the limit needs to change, it's easy to miss one of these locations, leading to inconsistent behavior between live indexing, retry, and reindexing paths.

Consider extracting a shared constant (e.g., in ReindexingConfiguration or a common constants class) and referencing it from all three call sites.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@mohityadav766
Copy link
Copy Markdown
Member Author

closing will open new one

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 251 out of 253 changed files in this pull request and generated 2 comments.

Comment on lines +39 to 43
root: 'tw:border-1 tw:border-secondary tw:bg-primary',
},
elevated: {
root: 'tw:ring-1 tw:ring-inset tw:ring-secondary tw:bg-primary tw:shadow-md',
root: 'tw:border-1 tw:border-secondary tw:bg-primary tw:shadow-md',
},
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tw:border-1 is not a standard Tailwind utility and it’s the only occurrence in the repo. This will likely be emitted as an unknown class and the card won’t render the intended 1px border. Consider using the standard tw:border (1px) utility (or a project-standard border width utility if one exists) and keep tw:border-secondary for color.

Copilot uses AI. Check for mistakes.
Comment on lines 67 to 69
ghost: {
root: 'tw:hover:bg-secondary tw:hover:ring-1 tw:hover:ring-inset tw:hover:ring-secondary',
root: 'tw:hover:bg-secondary tw:hover:border-1 tw:hover:border-inset tw:hover:border-secondary',
},
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tw:hover:border-inset is not a Tailwind utility (Tailwind has ring-inset, but not border-inset). This likely results in a no-op class and makes the hover styling inconsistent with the previous ring-based implementation. Remove this class or replace it with the correct utility for the intended effect.

Copilot uses AI. Check for mistakes.
@sonarqubecloud
Copy link
Copy Markdown

@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants