- #297 · junemoon-happy opened on May 12, 2026 3
Issues
is:issue state:open
Issue creation is restricted in this repository
Search results
- #390 · [Feature request] Support evaluating agent datasets in the AISBench image · Labels: content_check_passed, enhancement · Status: Open · In AISBench/benchmark
- #387 · [Bug] aisbench hangs at "Summarizing performance results..." when running multiple processes in the background · Labels: content_check_failed · Status: Open · In AISBench/benchmark
- #380 · [Feature request] Optimize the resource cleanup logic for the SWE-Bench_Pro dataset · Labels: content_check_passed, enhancement · Status: Open · In AISBench/benchmark
- #369 · [Bug] MultiTurnGenInferencer infer_every mode does not correctly accumulate conversation history · Labels: content_check_passed, question · Status: Open · In AISBench/benchmark
- #359 · [gsm8k] Answer extraction can fail when the model writes its answer with thousands separators · Labels: content_check_failed · Status: Open · In AISBench/benchmark
- #348 · [Question] How can per-video data be obtained from vbench scoring results? · Labels: content_check_passed, question · Status: Open · In AISBench/benchmark
- #309 · [Question] aisbench raises AttributeError: 'PreTrainedConfig' object has no attribute 'max_position_embeddings' · Labels: content_check_passed, question · Status: Open · In AISBench/benchmark
- #297 · [Roadmap] AISBench 2026 Q2 Roadmap · Labels: content_check_failed · Status: Open · In AISBench/benchmark
- #294 · [Feature request] Add SLA auto-tuning settings to the AISBench stress-testing tool, like those in Evalscope · Labels: content_check_passed, enhancement · Status: Open · In AISBench/benchmark
- #284 · [RFC] [Performance evaluation] Enhance AISBench performance evaluation capabilities · Labels: content_check_failed · Status: Open · In AISBench/benchmark
- #275 · [Feature request] Allow passing a string such as 52K directly when configuring a synthetic dataset, so that it is parsed automatically · Labels: content_check_passed, enhancement · Status: Open · In AISBench/benchmark
- #262 · [Question] How can precise performance metrics be obtained for each individual request in performance test results? · Labels: content_check_passed, question · Status: Open · In AISBench/benchmark