Reports and runtime dashboard fixes #645

Draft
micahjo7 wants to merge 9 commits into main from reports-and-runtime-dashboard-fixes

@micahjo7
Collaborator

@micahjo7 micahjo7 commented Apr 24, 2026

#623 introduced Playwright report links for each eval test case in the dashboard. The Playwright report exposes the screenshot and the trace as links. These links worked while the eval result data was hosted on GitHub, but now that the data is stored on GCS again, the artifacts cannot be reached through the report because of storage permission constraints. For the screenshot, we can extract the image from the grade-report generated by Playwright and surface it in the dashboard for each failed test case. The trace view was more involved because it requires dynamic assets to be loaded.

To work around this, I tried several things:

  • Opening the trace.zip file in trace.playwright.dev by specifying the hosted zip in the address
  • Packaging the Playwright trace viewer into a custom (inlined) file, hosting it on GCS, and using that file to open the trace zip
  • Adding a custom Playwright trace viewer template, creating a trace generator which takes the template and the trace zip file as inputs, and outputs an inlined trace file with the data already populated

I eventually got the last one to work, but it requires trace generation to run for every failed test case during the evaluation phase, and it creates an additional artifact for each one.
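To make the per-failure cost concrete, here is a minimal sketch of how the extra trace artifacts could be planned from a list of results. The `TestCaseResult` shape, the `planTraceJobs` name, and the `.zip` to `.html` naming are all assumptions for illustration, not the actual harness code.

```typescript
// Hypothetical shape of one eval test case result; field names are illustrative.
interface TestCaseResult {
  name: string;
  status: "passed" | "failed";
  traceZipPath?: string;
}

// One planned generation step: a trace zip in, a standalone HTML file out.
interface TraceJob {
  testName: string;
  traceZipPath: string;
  outputHtmlPath: string;
}

// Returns one job per failed test case that actually recorded a trace.
// Passing tests and failures without a trace produce no extra artifact.
function planTraceJobs(results: TestCaseResult[]): TraceJob[] {
  const jobs: TraceJob[] = [];
  for (const result of results) {
    if (result.status !== "failed" || result.traceZipPath === undefined) {
      continue;
    }
    const zip = result.traceZipPath;
    // Assumed naming: write the HTML next to the zip, swapping the extension.
    const outputHtmlPath = zip.endsWith(".zip")
      ? zip.slice(0, -4) + ".html"
      : zip + ".html";
    jobs.push({ testName: result.name, traceZipPath: zip, outputHtmlPath });
  }
  return jobs;
}
```

The number of jobs grows with the number of failures, which is the storage and runtime overhead mentioned above.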

1. Why trace.playwright.dev failed

The online trace viewer attempts to fetch your trace .zip file from the URL query parameters. Because your GCS bucket is on a secure corporate domain, browser security policies and proxy restrictions prevented that external page from reading your secure files, failing with a 403 error.
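For reference, the hosted viewer takes the trace location through a `trace` query parameter. The sketch below builds such a link; the `buildTraceViewerUrl` name and the example bucket path are hypothetical.

```typescript
// Base address of the public Playwright trace viewer.
const TRACE_VIEWER_ORIGIN = "https://trace.playwright.dev/";

// Builds a viewer link that asks the hosted page to fetch the given zip.
// The zip URL is percent-encoded so its own "?" or "&" do not leak into
// the viewer's query string.
function buildTraceViewerUrl(traceZipUrl: string): string {
  return TRACE_VIEWER_ORIGIN + "?trace=" + encodeURIComponent(traceZipUrl);
}

// Hypothetical bucket and object path, for illustration only.
console.log(
  buildTraceViewerUrl("https://storage.googleapis.com/example-bucket/run-1/trace.zip")
);
```

The fetch of the zip happens in the visitor's browser from the viewer's origin, so the link only works when that cross-origin, unauthenticated request is allowed to read the file.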

Attempted Fix: We tried adding a loose CORS policy to the bucket (allowing all origins ["*"] and GET methods with the Content-Type header), but that did not grant access to the external viewer.
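As a rough illustration of the policy described above, this sketch builds a CORS configuration in the JSON shape that GCS bucket CORS files use. The `maxAgeSeconds` value is an assumption; the PR does not state the one that was used.

```typescript
// Field names follow the GCS bucket CORS JSON format.
interface GcsCorsRule {
  origin: string[];
  method: string[];
  responseHeader: string[];
  maxAgeSeconds: number;
}

// Loose policy: any origin may issue GET requests and read Content-Type.
const looseCorsPolicy: GcsCorsRule[] = [
  {
    origin: ["*"],
    method: ["GET"],
    responseHeader: ["Content-Type"],
    maxAgeSeconds: 3600, // assumed value, not taken from the PR
  },
];

// Serialized form, suitable for writing to a CORS config file.
const corsJson = JSON.stringify(looseCorsPolicy, null, 2);
console.log(corsJson);
```

CORS only controls whether a browser lets a cross-origin page read a response. It does not authorize the request itself, so an authenticated bucket can still answer 403 under a policy like this, which would be consistent with the result reported here.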

2. Why fetching the ZIP file in your hosted viewer failed

You created a custom hosted Trace Viewer that had all assets inlined and service workers removed. But when it loaded, it tried to make a background network fetch call to pull that trace .zip file from storage.googleapis.com. Because your bucket requires authentication and standard background fetches do not automatically pass your active mTLS session cookies, Google rejected the request with a 403 Forbidden error.

Attempted Fix: We also looked into setting up a proxy through GCP at one point to route traffic, but that approach did not work either.

3. The Final Resolution

To bypass both URL parameter fallbacks and Service Worker blockers:

  • Generator Script: You added a script (generate-trace-report.ts) that converts the test's trace .zip file to Base64 and embeds it directly inside a clean viewer template.
  • Harness Workflow: You codified this command into the harness test extraction loop in harness/lib/collection.ts, so that a standalone, fully functioning HTML trace viewer is generated automatically at that folder level whenever tests fail.
  • Dashboard UI: You wired up direct UI buttons for both traces and screenshots next to failed tests in the dashboard drawer, which open that local HTML file directly.
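The embedding step can be sketched as follows. This is a minimal illustration, not the contents of generate-trace-report.ts: the `__TRACE_ZIP_BASE64__` placeholder, the `embedTrace` and `toBase64` names, and the template shape are all assumptions.

```typescript
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard Base64 encoding of raw bytes, written out so the sketch has no
// runtime-specific dependencies.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasB1 = i + 1 < bytes.length;
    const hasB2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasB1 ? bytes[i + 1] : 0;
    const b2 = hasB2 ? bytes[i + 2] : 0;
    out += BASE64_ALPHABET[b0 >> 2];
    out += BASE64_ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += hasB1 ? BASE64_ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += hasB2 ? BASE64_ALPHABET[b2 & 0x3f] : "=";
  }
  return out;
}

// Hypothetical marker that the viewer template is assumed to contain.
const TRACE_PLACEHOLDER = "__TRACE_ZIP_BASE64__";

// Returns the template with the trace zip inlined as Base64, so the
// resulting HTML needs no network fetch to load the trace.
function embedTrace(template: string, traceZip: Uint8Array): string {
  if (!template.includes(TRACE_PLACEHOLDER)) {
    throw new Error("template is missing the trace placeholder");
  }
  // A replacer function inserts the payload literally.
  return template.replace(TRACE_PLACEHOLDER, () => toBase64(traceZip));
}

// Usage with the four-byte zip signature standing in for a real trace.
const html = embedTrace(
  '<script>window.traceData = "__TRACE_ZIP_BASE64__";</script>',
  new Uint8Array([0x50, 0x4b, 0x03, 0x04])
);
console.log(html);
```

In Node, `Buffer.from(bytes).toString("base64")` produces the same encoding. Inlining trades file size (Base64 adds roughly a third) for a file that opens without any authenticated request to the bucket.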

Dashboard currently (4/24/26) shows an example of this: https://googlechrome.github.io/guidance/dashboard.html?testId=test-2026-04-24T10-31-45&source=remote&testName=task+-+animated-select-picker+-+unguided

Note that when running the dashboard locally, the Report works perfectly, showing the screenshot and trace. Remotely, those links do not work, which is why we pull this data out into the separate Screenshot and Trace buttons.

Comment thread harness/lib/collection.ts
Comment thread guides/playwright.config.ts
@paulirish
Member

sorry i didnt give you a review for this.

how do you want to handle it?

also.. is there a more simple solution that would be preferred if X was feasible? now that i have a github proxy up and running... we might have more options.

@micahjo7
Collaborator Author

micahjo7 commented May 6, 2026

I am considering holding off on this workaround in the short term. For now, being able to see the report in the remote dashboard (without the trace/screenshot) is already an improvement, and the trace/screenshot are fully functional in the local dashboard.

filed b/510409977
