Ignore rows which are missing stats during daily aggregation #5287
Open
dylanjew wants to merge 1 commit into
There were a couple of old versions of CF running on the clusterfuzz-candidate2 bots. This PR ignores the stats generated by those bots. It's unfortunate that those rows are missing stats, but IMO it's better to ignore them completely than to write partial data. There were 10 of these instances running, which resulted in some fuzzers having a fuzzing session unaccounted for.
Testing: Ran this SQL against the BigQuery table and verified that it correctly filters out the rows and results in testcases_generated == testcases_executed.
Rollout: We will have to rerun the cron job for the past month after this deploys to properly ignore those rows.
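
For illustration only, the filtering idea described above can be sketched in Python. This is not the SQL from this PR or ClusterFuzz's actual aggregation code. The row layout, the `fuzzer` key, and the helper names `has_all_stats` and `aggregate_daily` are assumptions made for the example; only `testcases_generated` and `testcases_executed` come from the description.

```python
from collections import defaultdict

# Stat columns that must be present for a row to count.
# Only these two names appear in the PR description; the real table
# likely has more columns (assumption).
REQUIRED_STATS = ("testcases_generated", "testcases_executed")


def has_all_stats(row):
    """Return True if every required stat is present (not None) in the row."""
    return all(row.get(field) is not None for field in REQUIRED_STATS)


def aggregate_daily(rows):
    """Sum the required stats per fuzzer, skipping rows with missing stats."""
    totals = defaultdict(lambda: {field: 0 for field in REQUIRED_STATS})
    for row in rows:
        if not has_all_stats(row):
            # Ignore the whole row rather than writing partial data.
            continue
        for field in REQUIRED_STATS:
            totals[row["fuzzer"]][field] += row[field]
    return dict(totals)


# Hypothetical sample rows; the second row mimics a bot running an old
# version that did not report testcases_executed.
rows = [
    {"fuzzer": "fuzzer_a", "testcases_generated": 10, "testcases_executed": 10},
    {"fuzzer": "fuzzer_a", "testcases_generated": 5, "testcases_executed": None},
    {"fuzzer": "fuzzer_b", "testcases_generated": 7, "testcases_executed": 7},
]
daily = aggregate_daily(rows)
```

Dropping the incomplete row entirely means the remaining totals stay internally consistent, at the cost of leaving that fuzzing session unaccounted for.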