docs: record h2 order-control seed stability #313

Merged
DeliciousBuding merged 1 commit into main from docs/h2-order-control-seed177-results-20260525
May 24, 2026
Conversation

@DeliciousBuding
Owner

Summary

  • record the bounded 256 / 256 shared-position seed 177 stability scout for H2 output-cloud geometry
  • add the two curated JSON artifacts for seed 177 original-label and label-shuffle reviews
  • update Research docs to state that the controlled signal is seed-stable while remaining candidate-only

Decision

  • no Platform/Runtime schema, runner, UI type, or admitted bundle row
  • no same-cache feature sweep
  • no full 512 / 512 shared-position rerun selected by default

Verification

  • python -X utf8 scripts/check_markdown_links.py
  • python -X utf8 scripts/check_public_surface.py
  • python -X utf8 scripts/export_admitted_evidence_bundle.py --check
  • python -X utf8 scripts/run_pr_checks.py

Copilot AI review requested due to automatic review settings May 24, 2026 22:23
@DeliciousBuding DeliciousBuding merged commit 84dff57 into main May 24, 2026
2 of 3 checks passed
@DeliciousBuding DeliciousBuding deleted the docs/h2-order-control-seed177-results-20260525 branch May 24, 2026 22:24

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request documents the results of a seed-stability scout for the H2 output-cloud geometry candidate using seed 177. The findings confirm that the signal remains strong (AUC = 0.956192) and is not a single-seed artifact, while label-shuffle sanity tests remain at random levels. The changes include updates to AGENTS.md, ROADMAP.md, and various evidence documents to reflect these results, alongside the addition of two new JSON artifact files. Feedback was provided to improve cross-platform compatibility by using forward slashes instead of Windows-style backslashes in file paths within the new JSON artifacts.
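The label-shuffle sanity check mentioned above can be illustrated with a minimal sketch. It uses synthetic Gaussian scores, not the H2 scorer or the seed 177 response cache; the `roc_auc` helper, the group sizes, and the score distributions are all assumptions made for illustration. The idea is that a real signal gives an AUC well above 0.5 with the original labels, while the same scores evaluated against shuffled labels should fall back toward chance.

```python
import random


def roc_auc(scores, labels):
    """Rank-based AUC: probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data (assumption): 256 positives and 256 negatives
# whose scores differ in mean. The RNG seed only makes the sketch repeatable.
rng = random.Random(177)
labels = [1] * 256 + [0] * 256
scores = [rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]

# AUC with the original labels.
original_auc = roc_auc(scores, labels)

# Control: keep the scores, shuffle the labels, and recompute the AUC.
shuffled_labels = labels[:]
rng.shuffle(shuffled_labels)
shuffled_auc = roc_auc(scores, shuffled_labels)

print(f"original-label AUC: {original_auc:.3f}")
print(f"label-shuffle AUC:  {shuffled_auc:.3f}")
```

A shuffled AUC that stays far from 0.5 would suggest the score leaks information through something other than the labels, which is the failure this kind of control is meant to catch.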

"track": "black-box",
"method": "H2 output-cloud geometry scorer",
"mode": "cpu-cache-review",
"response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-seed177-20260525-r1\\response-cache.npz",
medium

The response_cache path uses Windows-style backslashes. For better cross-platform compatibility and consistency with other documentation in this repository (e.g., .gitignore and ROADMAP.md), it is recommended to use forward slashes.

Suggested change
- "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-seed177-20260525-r1\\response-cache.npz",
+ "response_cache": "workspaces/black-box/runs/h2-response-strength-256-shared-position-seed177-20260525-r1/response-cache.npz",
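One way to avoid this at the source is to normalize the path before the artifact is serialized. The sketch below is not the repository's export code; `to_posix_path` is a hypothetical helper, and the artifact dictionary reuses only the fields quoted in this review. It relies on the standard library's `pathlib.PureWindowsPath.as_posix()`, which accepts both separator styles and emits forward slashes on any platform.

```python
import json
from pathlib import PureWindowsPath


def to_posix_path(raw: str) -> str:
    """Return the path with forward slashes, whichever separator it came in with."""
    return PureWindowsPath(raw).as_posix()


# Fields copied from the reviewed artifact; the dict itself is illustrative.
artifact = {
    "track": "black-box",
    "method": "H2 output-cloud geometry scorer",
    "mode": "cpu-cache-review",
    "response_cache": (
        "workspaces\\black-box\\runs\\"
        "h2-response-strength-256-shared-position-seed177-20260525-r1\\"
        "response-cache.npz"
    ),
}

# Normalize before writing so the committed JSON is platform-neutral.
artifact["response_cache"] = to_posix_path(artifact["response_cache"])
serialized = json.dumps(artifact, indent=2)
print(serialized)
```

Using `PureWindowsPath` rather than `str.replace` keeps the conversion idempotent: a path that already uses forward slashes passes through unchanged.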

"track": "black-box",
"method": "H2 output-cloud geometry scorer",
"mode": "cpu-cache-review",
"response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-seed177-20260525-r1\\response-cache.npz",
medium

The response_cache path uses Windows-style backslashes. For better cross-platform compatibility and consistency with other documentation in this repository, it is recommended to use forward slashes.

Suggested change
- "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-seed177-20260525-r1\\response-cache.npz",
+ "response_cache": "workspaces/black-box/runs/h2-response-strength-256-shared-position-seed177-20260525-r1/response-cache.npz",
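Because the same issue appears in both new artifacts, a small check could flag it before review. The following is a sketch under stated assumptions: `find_backslash_values` is a hypothetical helper, not one of the repository's existing scripts, and the sample JSON is a shortened stand-in for a real artifact. It walks a parsed JSON document and reports the key path of every string value that contains a backslash.

```python
import json


def find_backslash_values(node, prefix=""):
    """Collect key paths of string values that contain a backslash."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else key
            hits.extend(find_backslash_values(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_backslash_values(value, f"{prefix}[{index}]"))
    elif isinstance(node, str) and "\\" in node:
        hits.append(prefix)
    return hits


# Shortened stand-in for an artifact file (assumption, not a real file).
sample = json.loads(
    r'{"mode": "cpu-cache-review", "response_cache": "workspaces\\black-box\\runs"}'
)

offending_keys = find_backslash_values(sample)
print(offending_keys)
```

A check like this would flag any backslash, including ones that are legitimate in non-path strings, so in practice it would probably need to be limited to fields known to hold paths.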

@DeliciousBuding DeliciousBuding review requested due to automatic review settings May 24, 2026 22:45