【Feature】Supports rerunning specified use cases on SWE dataset #331
yejj710 wants to merge 1 commit into
Code Review
This pull request introduces the ability to filter SWE-Bench dataset instances using a list of instance IDs loaded from a text file. It updates the SWEBenchDataset class to parse this file and apply the filters, updates several example configuration files, and adds unit tests. The reviewer suggests removing the strict .txt file extension requirement to support other plain text formats, and adding validation to raise an error if the provided instance IDs file is empty. They also recommend updating the unit tests to align with these changes.
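The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: the function names `load_instance_ids` and `filter_instances` and the sample records are hypothetical, and the only assumption about the data is that each record carries an `instance_id` field.

```python
from pathlib import Path
import tempfile


def load_instance_ids(path: Path) -> set[str]:
    """Read one instance ID per line, ignoring blank lines and surrounding whitespace."""
    return {
        line.strip()
        for line in path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    }


def filter_instances(instances: list[dict], wanted: set[str]) -> list[dict]:
    """Keep only the records whose instance_id appears in the wanted set."""
    return [item for item in instances if item["instance_id"] in wanted]


# Hypothetical records; real SWE-Bench rows carry many more fields.
instances = [
    {"instance_id": "django__django-1"},
    {"instance_id": "django__django-2"},
    {"instance_id": "sympy__sympy-3"},
]

with tempfile.TemporaryDirectory() as temp_dir:
    ids_file = Path(temp_dir) / "ids.txt"
    # A blank line and padded whitespace are tolerated by the loader.
    ids_file.write_text("django__django-1\n\n  sympy__sympy-3  \n", encoding="utf-8")
    selected = filter_instances(instances, load_instance_ids(ids_file))

print([item["instance_id"] for item in selected])
# → ['django__django-1', 'sympy__sympy-3']
```

Using a set for the wanted IDs keeps each membership check constant-time and silently collapses duplicate lines in the file.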
def _load_instance_ids_file(self, instance_ids_file: str) -> set[str]:
    path = Path(instance_ids_file).expanduser()
    if not path.is_file():
        raise FileOperationError(
            SWEB_CODES.LOCAL_PATH_RESOLVE_FAILED,
            f"SWE-Bench instance ids file does not exist: {instance_ids_file!r}",
        )
    if path.suffix.lower() != ".txt":
        raise FileOperationError(
            SWEB_CODES.LOCAL_PATH_RESOLVE_FAILED,
            f"SWE-Bench instance ids file must be a .txt file: {instance_ids_file!r}",
        )

    try:
        instance_ids = {
            line.strip()
            for line in path.read_text(encoding="utf-8").splitlines()
            if line.strip()
        }
    except OSError as e:
        raise FileOperationError(
            SWEB_CODES.LOCAL_PATH_RESOLVE_FAILED,
            f"Failed to read SWE-Bench instance ids file {instance_ids_file!r}: {e}",
        )
    return instance_ids
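- Remove the unnecessary .txt suffix restriction: requiring instance_ids_file to end in .txt is unnecessary. Users may use .log, .csv, .list, or files with no extension. As long as the file is plain text with one ID per line, it should be readable. Removing this restriction improves generality and user experience.
- Add an empty-file check: if the user supplies an empty file, the current code returns an empty set, set(), so the filtered dataset ends up empty and the run finishes silently without any error message. After reading the file, if the parsed instance_ids is empty, raise a FileOperationError so the user is told explicitly that the file content is invalid.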
def _load_instance_ids_file(self, instance_ids_file: str) -> set[str]:
    path = Path(instance_ids_file).expanduser()
    if not path.is_file():
        raise FileOperationError(
            SWEB_CODES.LOCAL_PATH_RESOLVE_FAILED,
            f"SWE-Bench instance ids file does not exist: {instance_ids_file!r}",
        )
    try:
        instance_ids = {
            line.strip()
            for line in path.read_text(encoding="utf-8").splitlines()
            if line.strip()
        }
    except OSError as e:
        raise FileOperationError(
            SWEB_CODES.LOCAL_PATH_RESOLVE_FAILED,
            f"Failed to read SWE-Bench instance ids file {instance_ids_file!r}: {e}",
        )
    if not instance_ids:
        raise FileOperationError(
            SWEB_CODES.LOCAL_PATH_RESOLVE_FAILED,
            f"SWE-Bench instance ids file is empty or contains no valid ids: {instance_ids_file!r}",
        )
    return instance_ids
This constraint was added to avoid the uncertainty that an overly permissive check could introduce.
def test_load_instance_ids_file_requires_txt_suffix(self):
    with tempfile.TemporaryDirectory() as temp_dir:
        ids_file = Path(temp_dir) / "ids.csv"
        ids_file.write_text("django__django-1\n", encoding="utf-8")

        with self.assertRaises(FileOperationError):
            self.dataset._load_instance_ids_file(str(ids_file))
With the .txt suffix restriction removed, the existing test_load_instance_ids_file_requires_txt_suffix test case no longer applies. Consider replacing it with a test case that expects FileOperationError for an empty file, so that the newly added empty-file check is properly covered.
- def test_load_instance_ids_file_requires_txt_suffix(self):
-     with tempfile.TemporaryDirectory() as temp_dir:
-         ids_file = Path(temp_dir) / "ids.csv"
-         ids_file.write_text("django__django-1\n", encoding="utf-8")
-         with self.assertRaises(FileOperationError):
-             self.dataset._load_instance_ids_file(str(ids_file))
+ def test_load_instance_ids_file_empty_raises_error(self):
+     with tempfile.TemporaryDirectory() as temp_dir:
+         ids_file = Path(temp_dir) / "ids.txt"
+         ids_file.write_text(" \n\n \n", encoding="utf-8")
+         with self.assertRaises(FileOperationError):
+             self.dataset._load_instance_ids_file(str(ids_file))
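For readers who want to try the empty-file behaviour outside the project, here is a standalone sketch of the reviewer's variant (no suffix check, explicit error on an empty result). It is an approximation under stated assumptions: `InstanceIdsFileError` is a hypothetical stand-in for the project's `FileOperationError`, and the names `load_instance_ids_strict` and `try_load` do not come from the PR.

```python
from pathlib import Path
import tempfile


class InstanceIdsFileError(ValueError):
    """Hypothetical stand-in for the project's FileOperationError."""


def load_instance_ids_strict(instance_ids_file: str) -> set[str]:
    """Load one ID per line; reject missing files and files with no usable IDs."""
    path = Path(instance_ids_file).expanduser()
    if not path.is_file():
        raise InstanceIdsFileError(f"file does not exist: {instance_ids_file!r}")
    instance_ids = {
        line.strip()
        for line in path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    }
    if not instance_ids:
        # Fail loudly instead of silently producing an empty dataset later.
        raise InstanceIdsFileError(
            f"file is empty or contains no valid ids: {instance_ids_file!r}"
        )
    return instance_ids


def try_load(content: str):
    """Write content to a temporary file and return the parsed IDs, or None if rejected."""
    with tempfile.TemporaryDirectory() as temp_dir:
        ids_file = Path(temp_dir) / "ids.list"  # any suffix is accepted in this variant
        ids_file.write_text(content, encoding="utf-8")
        try:
            return load_instance_ids_strict(str(ids_file))
        except InstanceIdsFileError:
            return None


print(try_load("django__django-1\n"))
print(try_load(" \n\n \n"))
```

The first call returns a one-element set and the second returns `None`, because a file containing only whitespace yields no IDs and is rejected.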
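🔍 Motivation
Support listing the desired instances one per line in a txt file for SWE evaluation, which makes it easy to rerun the instances that did not produce a patch.
📝 Modification
Regular-expression matching is inconvenient for some specific cases; specifying the instances directly is simpler and more straightforward.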
✅ Checklist
Before PR:
After PR:
👥 Collaboration Info
🌟 Useful CI Commands
/gemini review
/gemini summary
/gemini help
/readthedocs build