core&ui&judge: allow run multiple pretests in single request by undefined-moe · Pull Request #1098 · hydro-dev/Hydro

undefined-moe · 2025-12-16T17:37:11Z

Summary by CodeRabbit

New Features
- Pretest submissions now accept multiple inputs (array) so a single submission can run against multiple test cases.
- Judging and generator flows execute and report results per test case, providing per-case timings, memory, and messages.
Improvements
- UI and management endpoints now submit pretest input as arrays.
- Pretest submission now validates that the input array is non-empty.


Note

Enable submitting multiple pretest inputs at once, with per-case execution in judge and end-to-end schema/handler/UI updates.

Core Types
- Allow RecordPayload.input to be string | string[].
Judge
- Refactor pretest runner in packages/hydrojudge/src/judge/run.ts to handle multiple inputs via runFlow; build subtasks from ctx.input, execute per-case, run analysis once, and truncate messages.
- Update generator in packages/hydrojudge/src/judge/generate.ts to read per-case stdin from ctx.input[i-1].
- Store JudgeTask.input as string[] in packages/hydrojudge/src/task.ts.
Backend Handlers/Model
- problem.submit now accepts input: string[] and validates non-empty for pretest; pass array to record.add.
- manage.script pretest wraps input as array.
- RecordModel.add stores pretest input as string[].
UI
- Scratchpad pretest request sends input: [pretestInput] in components/scratchpad/ScratchpadToolbarContainer.jsx.
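
The type widening and array wrapping listed above can be sketched roughly as follows. This is a minimal illustration only: `PretestPayload` and `toInputArray` are hypothetical names, not code taken from the PR, and only the `string | string[]` shape of `input` comes from the summary.

```typescript
// Hypothetical stand-in for the widened payload type; only the shape of
// `input` mirrors the summary above.
interface PretestPayload {
    input: string | string[];
}

// Coerce either accepted form into the array form that the judge consumes.
function toInputArray(payload: PretestPayload): string[] {
    return Array.isArray(payload.input) ? payload.input : [payload.input];
}

// A client that still sends one string and a client that sends several
// both end up with a string[] on the receiving side.
const single = toInputArray({ input: '1 2\n' });
const many = toInputArray({ input: ['1 2\n', '3 4\n'] });
```

Accepting both forms at the type level while normalizing at one boundary would let older callers keep working during the migration.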

Written by Cursor Bugbot for commit e78b7b5.

coderabbitai · 2025-12-16T17:37:22Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This pull request changes pretest input handling from a single string to arrays across the codebase. The public type RecordPayload.input is widened to string | string[]. Backend updates include: JudgeTask and task normalization to string[]; judge entry refactored to per-case evaluation (new judgeCase) and integrated with runFlow/Context (signature change); generator invocation uses per-case input elements; handlers and model record storage updated to accept and store string[]; UI now sends pretest input as a single-element array.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: enabling multiple pretest inputs in a single request, which is the core objective reflected throughout the changeset.

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e78b7b5 and 2084bde.

📒 Files selected for processing (4)

packages/hydrojudge/src/judge/generate.ts
packages/hydrojudge/src/judge/run.ts
packages/hydrojudge/src/task.ts
packages/hydrooj/src/handler/problem.ts

🚧 Files skipped from review as they are similar to previous changes (2)

packages/hydrojudge/src/judge/generate.ts
packages/hydrooj/src/handler/problem.ts

🧰 Additional context used

🧬 Code graph analysis (1)

packages/hydrojudge/src/judge/run.ts (5)

packages/common/subtask.ts (1)

NormalizedCase (134-137)

packages/hydrojudge/src/judge/interface.ts (1)

Context (6-6)

packages/hydrojudge/src/sandbox.ts (1)

runQueued (202-216)

framework/utils/lib/common.ts (2)

parseTimeMS (164-170)

parseMemoryMB (172-178)

packages/hydrojudge/src/flow.ts (1)

runFlow (76-134)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: build
GitHub Check: Analyze (javascript)
GitHub Check: build

🔇 Additional comments (7)

packages/hydrojudge/src/task.ts (3)

33-33: Type change aligns with PR objectives.

The change from string to string[] correctly supports multiple pretest inputs as intended by this PR.

70-70: Verify the semantic choice: [''] vs [] for undefined input.

The current logic produces [''] (array with empty string) when this.request.input is undefined. This is an improvement over the previous issue flagged in past reviews where [undefined] was produced. However, there's a semantic question: should undefined input produce:

[''] = run one pretest case with empty stdin

[] = run no pretest cases

The current behavior (['']) means a pretest will always run even when no input is provided, which may be intentional for backward compatibility. Please confirm this is the desired behavior.

Based on past review comments, the previous implementation created `[

70-70: Input normalization correctly handles undefined case.

The code now produces [''] (array with empty string) when this.request.input is undefined, which is an improvement over the previously flagged [undefined] issue. This is semantically correct: an empty string represents valid stdin with no content, ensuring pretests run even when no explicit input is provided, which aligns with typical pretest behavior.

Based on past review comments, the previous concern has been adequately addressed.
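
The difference between `['']` and `[]` discussed in the two comments above can be made concrete with a small sketch. The `normalizeInput` helper is hypothetical; the actual expression in task.ts is not shown in this thread.

```typescript
// Hypothetical normalizer illustrating the semantics under discussion.
function normalizeInput(input: string | string[] | undefined): string[] {
    // Missing input becomes one case with empty stdin, so a pretest still runs.
    if (input === undefined) return [''];
    // A legacy single string is wrapped; an array passes through unchanged.
    return Array.isArray(input) ? input : [input];
}

const fromUndefined = normalizeInput(undefined); // one case, empty stdin
const fromString = normalizeInput('5\n');        // one case
const fromArray = normalizeInput(['1\n', '2\n']); // two cases
```

Note that naively wrapping the value (`[input]`) without the `undefined` check is what would yield the `[undefined]` array mentioned in the earlier review.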

packages/hydrojudge/src/judge/run.ts (4)

1-7: Imports align with the refactored architecture.

The new imports (NormalizedCase, runFlow, Context) correctly support the refactored pretest execution flow using the standardized runFlow mechanism.

51-51: Message truncation limit corrected.

The truncation limit has been fixed to 102400 characters (100 KB), which aligns with other judge modules like generate.ts and provides adequate output for debugging.

This addresses a previous review concern.
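
For reference, a truncation step with this limit might look like the sketch below. The constant name and the `truncateMessage` helper are assumptions made for illustration; only the 102400 figure comes from the comment above.

```typescript
// Limit quoted in the review comment; the constant name is hypothetical.
const MESSAGE_LIMIT = 102400;

// Keep at most `limit` characters of a case message.
function truncateMessage(message: string, limit: number = MESSAGE_LIMIT): string {
    return message.length > limit ? message.substring(0, limit) : message;
}

// A small custom limit makes the behaviour easy to see.
const clipped = truncateMessage('abcdef', 3); // 'abc'
```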

40-42: Analysis execution logic is correct.

The conditional analysis execution ensures runAnalysis is invoked only once per submission and only for relevant failure statuses (WRONG_ANSWER, RUNTIME_ERROR). The ctx.analysis flag correctly prevents redundant analysis runs across multiple test cases.
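
The run-once guard described above could be sketched like this. The `CaseStatus` enum, the `AnalysisContext` shape, and the `analysisRuns` counter are hypothetical; only the `analysis` flag and the WRONG_ANSWER / RUNTIME_ERROR trigger come from the comment.

```typescript
// Hypothetical status codes; the real constants live in Hydro's STATUS object.
enum CaseStatus { ACCEPTED, WRONG_ANSWER, RUNTIME_ERROR, TIME_LIMIT_EXCEEDED }

interface AnalysisContext {
    analysis: boolean;    // set once analysis has been triggered
    analysisRuns: number; // counter used here only to make the behaviour observable
}

function maybeRunAnalysis(ctx: AnalysisContext, caseStatus: CaseStatus): void {
    const relevant = caseStatus === CaseStatus.WRONG_ANSWER
        || caseStatus === CaseStatus.RUNTIME_ERROR;
    if (ctx.analysis || !relevant) return;
    ctx.analysis = true;   // flip the flag first so later cases skip analysis
    ctx.analysisRuns += 1; // stand-in for the real runAnalysis call
}

// Two failing cases in one submission still trigger analysis only once.
const ctx: AnalysisContext = { analysis: false, analysisRuns: 0 };
const observed = [CaseStatus.ACCEPTED, CaseStatus.WRONG_ANSWER, CaseStatus.RUNTIME_ERROR];
for (const caseStatus of observed) maybeRunAnalysis(ctx, caseStatus);
```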

72-77: runFlow integration follows correct pattern.

The refactored code appropriately delegates to runFlow, moving compilation into the compile hook and using judgeCase for per-case execution. This aligns with the standardized judge flow architecture.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea6bc0b and 0fa2b52.

📒 Files selected for processing (6)

packages/common/types.ts (1 hunks)
packages/hydrojudge/src/judge/run.ts (2 hunks)
packages/hydrojudge/src/task.ts (2 hunks)
packages/hydrooj/src/handler/problem.ts (2 hunks)
packages/hydrooj/src/model/record.ts (2 hunks)
packages/ui-default/components/scratchpad/ScratchpadToolbarContainer.jsx (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

packages/hydrooj/src/handler/problem.ts (2)

framework/framework/validator.ts (2)

Types (29-66)

Types (91-193)

framework/framework/error.ts (1)

ValidationError (54-63)

packages/hydrojudge/src/judge/run.ts (7)

packages/common/subtask.ts (1)

NormalizedCase (134-137)

packages/hydrojudge/src/judge/interface.ts (1)

Context (6-6)

packages/hydrooj/src/model/record.ts (1)

judge (82-130)

packages/hydrojudge/src/judge/objective.ts (1)

judge (7-81)

packages/hydrojudge/src/judge/hack.ts (1)

judge (8-86)

packages/hydrojudge/src/judge/generate.ts (1)

judge (13-146)

packages/hydrojudge/src/flow.ts (1)

runFlow (76-136)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: build
GitHub Check: Analyze (javascript)
GitHub Check: build

🔇 Additional comments (8)

packages/ui-default/components/scratchpad/ScratchpadToolbarContainer.jsx (1)

38-49: LGTM! Input correctly wrapped in array for new API.

The change correctly adapts the UI to send input as an array, aligning with the updated backend contract. The UI currently supports single pretest input but the backend is now ready for multiple inputs when the UI is enhanced.

packages/common/types.ts (1)

109-110: LGTM! Type widened to support both string and array inputs.

The union type string | string[] maintains backward compatibility while enabling the new multiple pretest input feature. This is consistent with the gradual migration pattern across the codebase.

packages/hydrooj/src/model/record.ts (2)

137-137: LGTM! Args type updated for array-based input.

170-173: LGTM! Pretest path correctly handles array input with empty array default.

The default of args.input || [] ensures a safe fallback when no input is provided, consistent with the new array-based input handling.

packages/hydrooj/src/handler/problem.ts (2)

486-488: LGTM! Parameter type and validation correctly updated for array input.

The parameter declaration, default value, and method signature are all correctly updated to handle array-based pretest input. The Types.ArrayOf(Types.String) decorator will properly validate and transform incoming data.

501-501: LGTM! Validation correctly checks for non-empty array.

The validation !input.length properly ensures at least one pretest input is provided when pretest is true.
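
A stripped-down version of that check, outside the handler and its decorators, might read as follows. The function name and the plain `Error` are placeholders; the handler itself presumably raises the framework's ValidationError.

```typescript
// Hypothetical standalone form of the non-empty check for pretest input.
function assertPretestInput(pretest: boolean, input: string[]): void {
    if (pretest && !input.length) {
        // Placeholder error; the real handler uses its own validation error type.
        throw new Error('pretest requires at least one input');
    }
}

// An array holding one empty string is still a valid (non-empty) pretest input.
assertPretestInput(true, ['']);
// A normal submission is not subject to the check.
assertPretestInput(false, []);
```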

packages/hydrojudge/src/judge/run.ts (2)

8-53: LGTM! Per-case judging implementation is well-structured.

The curried judgeCase function correctly:

Executes each test case with its own input

Handles time/memory limit detection with 2x allowance for debugging

Captures exit codes and signals for runtime errors

Triggers analysis on first WA/RE for debugging assistance

Returns properly structured case results

55-78: LGTM! Judge function correctly orchestrates per-case flow.

The implementation properly:

Creates a single subtask containing all input cases with uniform scoring

Maps each input string from ctx.input to a normalized case structure

Delegates compilation and execution to runFlow with the judgeCase handler

The output: '' for each case is intentional since pretest mode doesn't validate output against expected results—it just runs the code and reports the outcome.
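
The mapping from input strings to cases described above can be sketched as below. `SketchCase`, `SketchSubtask`, and `buildPretestSubtask` are simplified, hypothetical shapes; the real NormalizedCase and subtask types in Hydro carry more fields, and the 1-based ids are an assumption.

```typescript
// Simplified, hypothetical shapes for illustration only.
interface SketchCase { id: number; input: string; output: string; }
interface SketchSubtask { id: number; type: 'sum'; cases: SketchCase[]; }

// Build one subtask whose cases correspond one-to-one to the pretest inputs.
function buildPretestSubtask(inputs: string[]): SketchSubtask {
    return {
        id: 1,
        type: 'sum',
        // Pretest mode has no expected output, so `output` stays empty.
        cases: inputs.map((input, index) => ({ id: index + 1, input, output: '' })),
    };
}

const pretestSubtask = buildPretestSubtask(['1 2\n', '3 4\n']);
```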

cursor

This PR is being reviewed by Cursor Bugbot

cursor · 2025-12-16T17:45:09Z

+        subtaskId: 1,
        status,
-        time: Math.floor(time * 1000000) / 1000000,
+        score: 1,


Bug: Score always 1 regardless of pass/fail status

The judgeCase function unconditionally returns score: 1 regardless of whether the test case passed or failed. Since the subtask uses type: 'sum', all case scores are summed together, meaning failed test cases (TLE, MLE, runtime error) still contribute to the total score. Looking at other judges like default.ts and all checkers, the score is conditionally set to 0 for non-ACCEPTED statuses. The score here likely needs to be conditional on status === STATUS.STATUS_ACCEPTED.
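
One way the suggested fix could look is sketched below. The `CaseStatus` enum and `caseScore` helper are hypothetical and stand in for Hydro's STATUS constants and the score field returned by judgeCase.

```typescript
// Hypothetical status codes standing in for Hydro's STATUS constants.
enum CaseStatus { ACCEPTED, RUNTIME_ERROR, TIME_LIMIT_EXCEEDED }

// Award the full per-case score only when the case was accepted.
function caseScore(caseStatus: CaseStatus, fullScore: number = 1): number {
    return caseStatus === CaseStatus.ACCEPTED ? fullScore : 0;
}

// With a 'sum' subtask, failed cases would then stop contributing to the total.
const results = [CaseStatus.ACCEPTED, CaseStatus.TIME_LIMIT_EXCEEDED, CaseStatus.ACCEPTED];
const total = results.map((s) => caseScore(s)).reduce((a, b) => a + b, 0);
```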

core&ui&judge: allow run multiple pretests in single request

0fa2b52

coderabbitai Bot reviewed Dec 16, 2025

Comment thread packages/hydrojudge/src/task.ts Outdated

cursor Bot reviewed Dec 16, 2025

core: fix script call

e78b7b5

cursor Bot reviewed Dec 16, 2025

Comment thread packages/hydrojudge/src/judge/run.ts Outdated

Merge branch 'master' into run-multiple

2084bde

undefined-moe merged commit 114be28 into master Dec 27, 2025
9 checks passed

undefined-moe deleted the run-multiple branch December 27, 2025 22:07

undefined-moe merged 3 commits into master from run-multiple


1 participant
