Add regression test suite for datasets (QA) #8289

Open
yuchen814 wants to merge 1 commit into huggingface:main from yuchen814:qa/regression-suite

Conversation

@yuchen814

What does this PR do?

Adds a data-driven manual/semi-automated regression test suite for datasets at tests/regression/regression_suite.md (128 test cases).

The suite was built by mining 984 historical issues (2020-04 to 2021-09) for recurring bug categories, cross-checking them against the currently open issues via the GitHub Search API, and mapping the result onto the documented QA focus areas. It defines a 30-minute smoke pass (24 P0 cases) and a 180-minute full regression run. Each case lists preconditions, numbered steps, and expected results, and a traceability matrix links the cases back to specific issues.
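To make the case structure concrete, here is a minimal Python sketch of how one case, the P0 smoke-pass selection, and the traceability matrix could be modelled. It is an illustration only: the suite itself is a Markdown document, and the `RegressionCase` class, its field names, the case ids, and the issue numbers below are all hypothetical placeholders, not taken from the PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RegressionCase:
    """Hypothetical in-memory form of one case from the suite."""
    case_id: str
    priority: str                 # "P0" marks a smoke-pass case
    preconditions: List[str]
    steps: List[str]              # numbered steps, in order
    expected: str
    issues: List[int] = field(default_factory=list)  # traceability links


def smoke_pass(cases: List[RegressionCase]) -> List[RegressionCase]:
    """Return only the P0 cases, preserving suite order."""
    return [case for case in cases if case.priority == "P0"]


def traceability(cases: List[RegressionCase]) -> Dict[int, List[str]]:
    """Map each issue number to the ids of the cases that cover it."""
    matrix: Dict[int, List[str]] = {}
    for case in cases:
        for issue in case.issues:
            matrix.setdefault(issue, []).append(case.case_id)
    return matrix


# Placeholder cases; ids and issue numbers are invented for the example.
cases = [
    RegressionCase("TC-001", "P0", ["datasets installed"],
                   ["Load a small CSV file"], "A Dataset is returned", [101]),
    RegressionCase("TC-002", "P1", ["warm cache"],
                   ["Reload the same file"], "Cache is reused", [101, 102]),
    RegressionCase("TC-003", "P0", ["datasets installed"],
                   ["Request a sliced split"], "Row count matches the slice", [103]),
]

smoke_ids = [case.case_id for case in smoke_pass(cases)]
matrix = traceability(cases)
```

Keeping the issue links on each case means the traceability matrix can be derived rather than maintained by hand.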

Categories driving the most bugs (historical)

  1. Data loading (48%)
  2. Caching (29%)
  3. Splits/slicing (29%)
  4. Arrow serialization (26%)
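The shares above add up to more than 100%. One reading, which is an assumption and not stated in the PR, is that a single issue can count toward several categories. Under that assumption, a tally could look like the sketch below; the `category_shares` helper and the sample labels are hypothetical and do not reproduce the PR's data.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def category_shares(issue_labels: Iterable[Set[str]]) -> Dict[str, int]:
    """Percentage of issues touching each category (multi-label tally).

    Each element of `issue_labels` is the set of categories assigned to
    one issue, so the resulting percentages may sum to more than 100.
    """
    labelled = list(issue_labels)
    total = len(labelled)
    if total == 0:
        return {}
    counts = Counter(cat for cats in labelled for cat in cats)
    return {cat: round(100 * n / total) for cat, n in counts.items()}


# Invented sample: four issues, two of them in more than one category.
sample = [
    {"data loading", "caching"},
    {"data loading"},
    {"splits/slicing", "arrow serialization"},
    {"caching"},
]
shares = category_shares(sample)
```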

Coverage gaps filled

Before submitting this PR, please make sure you have read the contribution guidelines.
