
Add extensive pycocotools comparison tests with large synthetic datasets#71

Merged

MiXaiLL76 merged 4 commits into main from copilot/add-tests-for-pycocotools-equality on Feb 11, 2026

Conversation

Contributor

Copilot AI commented Feb 4, 2026

Motivation

Existing tests validate equality with pycocotools using only 1-2 small examples (3-9 annotations). This insufficient coverage across dataset scales and task types blocks production adoption.

Modification

New test module: test_extensive_pycocotools_comparison.py

  • 12 parameterized tests validating bit-for-bit equality (tolerance 1e-10) across:
    • Object detection (bbox): 10/50/100 images, 50-1500 annotations
    • Instance segmentation (segm): Same scales with RLE masks
    • Keypoint detection: Same scales with 17 keypoints/instance
    • Edge cases: perfect predictions, low confidence, mixed object sizes
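The equality check described above can be sketched with a small stdlib-only helper that compares two COCO-style stats vectors element-wise against the stated 1e-10 tolerance. The helper name and structure here are illustrative, not the PR's actual test code:

```python
import math

def assert_stats_equal(stats_a, stats_b, tol=1e-10):
    """Compare two COCO-style stats vectors (e.g. the 12 AP/AR numbers)
    element-wise within an absolute tolerance.

    Illustrative sketch only; the real tests compare COCOeval.stats from
    pycocotools against the faster_coco_eval equivalent.
    """
    assert len(stats_a) == len(stats_b), "stats vectors differ in length"
    for idx, (a, b) in enumerate(zip(stats_a, stats_b)):
        # abs_tol alone, since the PR describes a fixed 1e-10 tolerance.
        assert math.isclose(a, b, rel_tol=0.0, abs_tol=tol), (
            f"stat {idx} differs: {a!r} vs {b!r}"
        )

# Identical vectors pass without raising.
assert_stats_equal([0.5, 0.7, 0.71], [0.5, 0.7, 0.71])
```

With this shape, each parameterized case only has to run both evaluators on the same dataset and hand the two stats vectors to the helper.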

Synthetic data generation

  • Realistic COCO-formatted datasets with proper size distributions (small/medium/large)
  • Variable image dimensions, prediction noise, false positives
  • Keypoint visibility/occlusion modeling
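A generator along the lines described above can be sketched in stdlib-only Python. The function name and field choices are illustrative assumptions, not the PR's actual helpers; it varies image sizes and box areas so the small/medium/large area ranges used by COCO evaluation are all populated:

```python
import random

def make_synthetic_coco(num_images=10, anns_per_image=5, seed=0):
    """Build a minimal COCO-format ground-truth dict with random boxes.

    Illustrative sketch: a seeded RNG keeps the dataset reproducible
    across test runs.
    """
    rng = random.Random(seed)
    images, annotations = [], []
    ann_id = 1
    for img_id in range(1, num_images + 1):
        # Variable image dimensions, as the PR description mentions.
        w, h = rng.randint(320, 640), rng.randint(240, 480)
        images.append({"id": img_id, "width": w, "height": h})
        for _ in range(anns_per_image):
            # Box sizes span small to large objects.
            bw, bh = rng.uniform(8, w / 2), rng.uniform(8, h / 2)
            x, y = rng.uniform(0, w - bw), rng.uniform(0, h - bh)
            annotations.append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": 1,
                "bbox": [x, y, bw, bh],
                "area": bw * bh,
                "iscrowd": 0,
            })
            ann_id += 1
    return {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": 1, "name": "object"}],
    }

dataset = make_synthetic_coco(num_images=3)
```

Prediction JSONs can then be derived from the ground truth by jittering the boxes and attaching scores, which is one way to model the "prediction noise" the PR mentions.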

Documentation

  • tests/README.md: Test suite organization and execution guide
  • README.md: Added "Testing & Reliability" section

BC-breaking (Optional)

None. Tests only.

Use cases (Optional)

Validates drop-in replacement guarantee at scale. All existing 91 tests pass.

Checklist

  1. ✅ Pre-commit/linting applied
  2. ✅ 12 new tests added, all pass (91 existing + 12 new)
  3. N/A - Test-only changes
  4. ✅ Documentation updated (README.md, tests/README.md)
Original prompt

This section details the original issue to resolve.

<issue_title>Add more extensive tests checking equality with pycocotools</issue_title>
<issue_description>Hi, I help maintain the rf-detr repo. You have an open PR in this repo. Looking at the tests in this repo, we see that you check equality of results with pycocotools with only one example. We would like to see a larger test suite to ensure equality across a wider range of possible inputs. Would you be willing to provide either a test script, or additions to your test suite here, comparing the two methods using a full suite of model predictions on COCO for all task types?

Obviously, we would prefer that rf-detr and rf-detr-seg are used to generate the prediction JSONs for objdet and instseg respectively, whereas we don't have a keypoints model ready to use, so a yolo-pose model could generate said predictions JSON.

We understand the value of faster evaluation and would like to use this repo, but the above test is a bit of a blocker for us. Otherwise we will do the above test when our team finds time to prioritize it.

Thanks for your PR and your work maintaining this repo!

cc: @Borda @isaacrob</issue_description>

<agent_instructions>Adding tests aligned with actual code base and in particular validation on larger batch</agent_instructions>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits February 4, 2026 01:01
Co-authored-by: Borda <6035284+Borda@users.noreply.github.com>
Co-authored-by: Borda <6035284+Borda@users.noreply.github.com>
…d maxDiff

Co-authored-by: Borda <6035284+Borda@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add extensive tests for equality with pycocotools" to "Add extensive pycocotools comparison tests with large synthetic datasets" on Feb 4, 2026
Copilot AI requested a review from Borda February 4, 2026 01:05
@Borda Borda requested review from MiXaiLL76 and Copilot February 6, 2026 11:58
@Borda Borda marked this pull request as ready for review February 6, 2026 11:58
Contributor

Copilot AI left a comment


Pull request overview

This PR adds extensive validation tests to ensure faster_coco_eval produces identical results to pycocotools across large-scale synthetic datasets. The addition addresses concerns about insufficient test coverage by creating comprehensive parameterized tests spanning object detection, instance segmentation, and keypoint detection tasks with realistic COCO-formatted data.

Changes:

  • Added 12 parameterized tests validating exact equality (tolerance 1e-10) across varying dataset scales (10/50/100 images, 50-1500 annotations)
  • Implemented synthetic COCO dataset generation with realistic size distributions, RLE masks, and keypoint visibility modeling
  • Created comprehensive documentation explaining test organization, execution, and validation criteria

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Changed files

  • tests/test_extensive_pycocotools_comparison.py — New test module with synthetic data generation and 12 parameterized tests validating equality across bbox, segmentation, and keypoint tasks
  • tests/README.md — New documentation detailing test suite organization, execution commands, and validation criteria
  • README.md — Added "Testing & Reliability" section highlighting comprehensive test coverage and pycocotools comparison


keypoints = []
num_keypoints = 17
num_visible = 0
for i in range(num_keypoints):

Copilot AI Feb 6, 2026


The variable i is declared in the loop but never used within the loop body. Consider using _ instead to indicate it's intentionally unused: for _ in range(num_keypoints):

Suggested change
for i in range(num_keypoints):
for _ in range(num_keypoints):

if iou_type == "keypoints":
# Create dummy keypoints
keypoints = []
for i in range(17):

Copilot AI Feb 6, 2026


The loop variable i is unused in the loop body. Replace with _ to indicate intentional discard: for _ in range(17):

Suggested change
for i in range(17):
for _ in range(17):

Comment thread: README.md

### Comprehensive Test Suite

- **90+ automated tests** covering all functionality

Copilot AI Feb 6, 2026


The count '90+ automated tests' may become outdated as tests are added or removed. Consider using a more maintainable phrasing like 'Comprehensive automated test suite' or implement dynamic test counting if precision is important.

Suggested change
- **90+ automated tests** covering all functionality
- **Comprehensive automated test suite** covering all functionality

Contributor

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@MiXaiLL76 MiXaiLL76 merged commit 768455a into main Feb 11, 2026
18 checks passed
@MiXaiLL76 MiXaiLL76 mentioned this pull request Feb 21, 2026
7 tasks
@Borda Borda deleted the copilot/add-tests-for-pycocotools-equality branch February 22, 2026 15:36


Development

Successfully merging this pull request may close these issues.

Add more extensive tests checking equality with pycocotools

4 participants