Skip to content

refactor(powerbi): scope per-workspace caches to workspace lifetime#28098

Merged
harshsoni2024 merged 3 commits into
mainfrom
improve-powerbi-memory-handling
May 14, 2026
Merged

refactor(powerbi): scope per-workspace caches to workspace lifetime#28098
harshsoni2024 merged 3 commits into
mainfrom
improve-powerbi-memory-handling

Conversation

@IceS2
Copy link
Copy Markdown
Contributor

@IceS2 IceS2 commented May 13, 2026

Summary

  • Move per-workspace state (filtered dashboards, filtered datamodels, dashboard chart mappings, dataflow exports) into a new WorkspaceState with explicit enter / exit lifecycle so caches are released between workspace iterations. The cross-workspace report registry persists so tile-pinned report-to-dashboard lineage still resolves across workspaces.
  • Extract _emit_om_target_lineage shared helper used by the four create_*_lineage methods.
  • Add per-workspace exception isolation: a failing workspace is logged and recorded in status.failed; the next workspace continues. Previously a single workspace error aborted the rest of the dashboard stage.
  • Tighten Optional[str] handling around fqn.build() (log + skip a missing FQN before get_by_name) and fix a wrong return-type annotation on create_report_dashboard_lineage (Either[CreateDashboardRequest] -> Either[AddLineageRequest]).

Test plan

  • Unit tests: cd ingestion && python -m pytest tests/unit/topology/dashboard/test_powerbi.py tests/unit/topology/dashboard/test_powerbi_workspace_state.py (59 passing locally)
  • Static checks pass: cd ingestion && make static-checks (powerbi-scoped: 0 errors / 0 warnings; full-repo errors unchanged from main on Linux runner)
  • Ruff: cd ingestion && ruff check src/metadata/ingestion/source/dashboard/powerbi/ && ruff format --check src/metadata/ingestion/source/dashboard/powerbi/
  • End-to-end ingestion against a PowerBI instance: dashboards, datamodels, charts, and lineage (dataset->dataflow, dataset->dataset, dataflow->dataflow, tile-pinned report->dashboard) produced identically to current main.
  • Verify per-workspace exception path: introduce a malformed workspace response in a test environment and confirm the run skips it and continues.

Move per-workspace state (filtered dashboards/datamodels, dashboard
charts, dataflow exports) into `WorkspaceState` with an explicit
`enter`/`exit` lifecycle so caches release between workspace
iterations. The cross-workspace report registry persists so
tile-pinned lineage can still resolve reports from other workspaces.

Also in the same refactor:
- Extract `_emit_om_target_lineage` shared helper used by the four
  `create_*_lineage` methods.
- Add per-workspace exception isolation: a failing workspace is
  logged and recorded in status, the next workspace continues.
- Tighten `Optional[str]` handling around `fqn.build()` so a missing
  FQN is logged and skipped before `get_by_name`.
- Fix wrong return-type annotation on
  `create_report_dashboard_lineage`
  (`Either[CreateDashboardRequest]` -> `Either[AddLineageRequest]`).
Copilot AI review requested due to automatic review settings May 13, 2026 19:44
@IceS2 IceS2 requested a review from a team as a code owner May 13, 2026 19:44
@github-actions github-actions Bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels May 13, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the PowerBI dashboard ingestion to scope per-workspace caches to an explicit workspace lifecycle, reducing cross-workspace state bleed and memory retention while keeping a cross-workspace report registry for tile lineage resolution. It also consolidates repeated lineage-emission logic into a shared helper and adjusts unit tests accordingly.

Changes:

  • Introduce WorkspaceState with enter/exit lifecycle to manage per-workspace caches (filtered dashboards/datamodels, dashboard→chart mapping, dataflow exports) and a cross-workspace report registry.
  • Add _emit_om_target_lineage helper to unify target-entity resolution + lineage emission across multiple create_*_lineage methods.
  • Isolate per-workspace failures so one broken workspace is recorded/logged and subsequent workspaces continue processing.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
ingestion/src/metadata/ingestion/source/dashboard/powerbi/metadata.py Wire in WorkspaceState, add workspace-level try/finally lifecycle, and refactor lineage creation via _emit_om_target_lineage.
ingestion/src/metadata/ingestion/source/dashboard/powerbi/workspace_state.py New workspace-scoped state container with explicit lifecycle and operation-shaped APIs for caches.
ingestion/tests/unit/topology/dashboard/test_powerbi.py Update tests to use WorkspaceState APIs and adjust mocks for dataset lookup / dashboard chart mapping.
ingestion/tests/unit/topology/dashboard/test_powerbi_workspace_state.py New unit tests locking the WorkspaceState lifecycle contract and cache behavior.

Comment on lines +744 to +749
tile_report_ids = [
report.id
for chart in dashboard_details.tiles or []
for report in [self.state.find_report(chart.reportId)]
if report is not None
]
Comment thread ingestion/tests/unit/topology/dashboard/test_powerbi.py
…d_lineage

The refactor that consolidated the four `create_*_lineage` methods
into `_emit_om_target_lineage` dropped the outer try/except wrapper
on `create_report_dashboard_lineage`. The helper covers per-target
failures, but this method has pre-loop work (resolving the
dashboard's FQN + entity) that needs its own guard. Restored,
yielding `Either(left=StackTraceError(...))` on resolution failures.
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 13, 2026

Code Review ✅ Approved

Refactors PowerBI workspace state management to improve memory handling and adds per-workspace exception isolation. No issues found.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

🟡 Playwright Results — all passed (14 flaky)

✅ 4062 passed · ❌ 0 failed · 🟡 14 flaky · ⏭️ 97 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 298 0 1 4
🟡 Shard 2 750 0 6 19
🟡 Shard 3 783 0 1 7
🟡 Shard 4 789 0 1 18
🟡 Shard 5 708 0 1 41
🟡 Shard 6 734 0 4 8
🟡 14 flaky test(s) (passed on retry)
  • Flow/Metric.spec.ts › Metric creation flow should work (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2, 2 retries)
  • Features/KnowledgeCenter.spec.ts › Article mentions in description should working for Knowledge Center (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Pages/Domains.spec.ts › Domain Rbac (shard 4, 1 retry)
  • Pages/EntityDataConsumer.spec.ts › Tier Add, Update and Remove (shard 5, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for apiEndpoint -> table (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify Impact Analysis service filter selection (shard 6, 1 retry)
  • Pages/UserDetails.spec.ts › Admin user can edit teams from the user profile (shard 6, 1 retry)
  • Pages/UserDetails.spec.ts › Create team with domain and verify visibility of inherited domain in user profile after team removal (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@harshsoni2024 harshsoni2024 merged commit 2d1d534 into main May 14, 2026
64 of 70 checks passed
@harshsoni2024 harshsoni2024 deleted the improve-powerbi-memory-handling branch May 14, 2026 11:08
IceS2 added a commit that referenced this pull request May 14, 2026
The base class method `register_record_chart` was added on main in
PR #27309 ("Mark Charts deleted") but never backported to 1.12.8.
The cherry-pick of #28098 incidentally pulled in the call site via
`--theirs` conflict resolution, breaking chart ingestion at runtime
with `AttributeError: 'PowerbiSource' object has no attribute
'register_record_chart'`.

Removing the call restores the cherry-pick to its semantically
correct shape for the 1.12.8 base class. No corresponding change is
needed on main, where the method exists.
harshsoni2024 pushed a commit that referenced this pull request May 18, 2026
The base class method `register_record_chart` was added on main in
PR #27309 ("Mark Charts deleted") but never backported to 1.12.8.
The cherry-pick of #28098 incidentally pulled in the call site via
`--theirs` conflict resolution, breaking chart ingestion at runtime
with `AttributeError: 'PowerbiSource' object has no attribute
'register_record_chart'`.

Removing the call restores the cherry-pick to its semantically
correct shape for the 1.12.8 base class. No corresponding change is
needed on main, where the method exists.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants