Reports and runtime dashboard fixes #645
Draft
micahjo7 wants to merge 9 commits into
…acked remote dash
micahjo7
commented
Apr 24, 2026
Member
Sorry I didn't give you a review for this. How do you want to handle it? Also, is there a simpler solution that would be preferred if X was feasible? Now that I have a GitHub proxy up and running, we might have more options.
Collaborator
Author
I am considering holding off on this workaround in the short term. For now, being able to see the report in the remote dashboard (without the trace/screenshot) is already an improvement, and the trace/screenshot are fully functional in the local dashboard. Filed b/510409977.
#623 introduced Playwright report links for each test case on the evals in the dashboard. In the Playwright report, the screenshot and trace report are present as links. These worked fine when the eval result data was hosted on GitHub, but now that the data is back on GCS, those artifacts cannot be accessed through the report because of storage permission constraints. For the screenshot, we can independently pull the screenshot from the grade-report generated by Playwright and surface it in the dashboard, per failed test case. The trace view was more involved, because it requires dynamic assets to be pulled.
To work around this, I tried several things:
I eventually got the last one to work, but it requires the trace file generation to run for every failed test case during the evaluation phase, and it creates an additional artifact for each.
1. Why trace.playwright.dev failed
The online trace viewer attempts to fetch your trace .zip file from the URL query parameters. Because your GCS bucket is on a secure corporate domain, browser security policies and proxy restrictions prevented that external page from reading your secure files, failing with a 403 error.
Attempted Fix: We tried adding a loose CORS policy to the bucket (allowing all origins ["*"] and GET methods with the Content-Type header), but that did not grant access to the external viewer.
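For reference, the loose CORS policy described above can be sketched as a bucket CORS configuration. This is only an illustration of the shape that `gsutil cors set` accepts; the `maxAgeSeconds` value and the variable names are assumptions, since the PR does not show the file that was actually applied.

```typescript
// Shape of one entry in a Cloud Storage bucket CORS configuration file.
interface CorsRule {
  origin: string[];
  method: string[];
  responseHeader: string[];
  maxAgeSeconds: number;
}

// The "loose" policy from the text: any origin, GET only, Content-Type exposed.
// maxAgeSeconds is an assumed value; the PR does not state one.
const looseCors: CorsRule[] = [
  {
    origin: ["*"],
    method: ["GET"],
    responseHeader: ["Content-Type"],
    maxAgeSeconds: 3600,
  },
];

// Serialize to the JSON text that would be saved to a file and applied to the bucket.
const corsJson: string = JSON.stringify(looseCors, null, 2);
console.log(corsJson);
```

CORS only controls whether a browser lets a cross-origin page read a response. It does not authorize the request itself, which is consistent with the bucket still returning 403 after the policy was relaxed.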
2. Why fetching the ZIP file in your hosted viewer failed
You created a custom hosted Trace Viewer that had all assets inlined and service workers removed. But when it loaded, it tried to make a background network fetch call to pull that trace .zip file from storage.googleapis.com. Because your bucket requires authentication and standard background fetches do not automatically pass your active mTLS session cookies, Google rejected the request with a 403 Forbidden error.
Attempted Fix: We also looked into setting up a proxy through GCP at one point to route traffic, but that approach did not work either.
3. The Final Resolution
To bypass both URL parameter fallbacks and Service Worker blockers:
Generator Script: You added a script (generate-trace-report.ts) that converts the test's trace .zip file to Base64 and embeds it directly inside a clean viewer template.
Harness Workflow: You codified this command into the harness test extraction loop in harness/lib/collection.ts, so that a standalone, fully functioning HTML trace viewer is generated automatically at that folder level whenever a test fails.
Dashboard UI: You wired up direct UI buttons for both traces and screenshots next to failed tests in the dashboard drawer, which open that local HTML file directly.
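The embedding step in the generator script can be sketched roughly as follows. This is a minimal illustration, not the contents of generate-trace-report.ts, which the PR description does not show: the placeholder token `__TRACE_ZIP_BASE64__` and the function names `embedTrace` and `decodeTrace` are hypothetical, and the real template may inject the payload differently.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical placeholder token; the real viewer template's marker is not shown in the PR.
const TRACE_PLACEHOLDER = "__TRACE_ZIP_BASE64__";

/**
 * Embed a trace archive into a viewer HTML template as Base64 text,
 * so the page needs no network fetch to obtain the trace.
 */
function embedTrace(templateHtml: string, traceZip: Uint8Array): string {
  if (!templateHtml.includes(TRACE_PLACEHOLDER)) {
    throw new Error(`template is missing ${TRACE_PLACEHOLDER}`);
  }
  const base64 = Buffer.from(traceZip).toString("base64");
  // A replacer function inserts the payload literally, with no special-pattern handling.
  return templateHtml.replace(TRACE_PLACEHOLDER, () => base64);
}

/** Reverse step a viewer page could run to recover the original bytes. */
function decodeTrace(base64: string): Uint8Array {
  return new Uint8Array(Buffer.from(base64, "base64"));
}
```

In a harness loop, a function like this would be called once per failed test with that test's trace bytes, and the returned HTML written next to the other artifacts. Base64 inflates the payload by about a third, which adds to the per-test artifact cost noted earlier.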
The dashboard currently (4/24/26) shows an example of this: https://googlechrome.github.io/guidance/dashboard.html?testId=test-2026-04-24T10-31-45&source=remote&testName=task+-+animated-select-picker+-+unguided
Note that when running the dashboard locally, the Report works perfectly, showing the screenshot and trace. Remotely, those links do not work, which is why we pull this data out into the separate Screenshot and Trace buttons.