add deepseek_r1_distill_qwen-32b_in_process llm config #355
test it | start @
Monitor this job here: http://monitoring.pegasus.kl.dfki.de/d/slurm-job-details/job-details?var-jobid=2513133&from=1770253012000 crashed with OOM, restart with
Monitor this job here: http://monitoring.pegasus.kl.dfki.de/d/slurm-job-details/job-details?var-jobid=2513141&from=1770253776000 "no reasoning parser configure...", cancel job.
Monitor this job here: http://monitoring.pegasus.kl.dfki.de/d/slurm-job-details/job-details?var-jobid=2513143&from=1770253997000
[2026-02-05 03:51:01,459][HYDRA] Contents of /netscratch/binder/projects/kibad-llm/logs/355_faktencheck_core_with_persona/predict/multiruns/2026-02-05_02-13-25/job_return_value.md
f1: [2026-02-05 03:59:39,685][HYDRA] Contents of /netscratch/binder/projects/kibad-llm/logs/355_faktencheck_core_with_persona/evaluate/multiruns/2026-02-05_03-59-37/job_return_value.md
error
max_model_len=65536
[2026-02-05 06:15:15,219][HYDRA] Contents of /netscratch/binder/projects/kibad-llm/logs/355_faktencheck_core_with_persona/predict/multiruns/2026-02-05_04-19-19/job_return_value.md
metrics: [2026-02-05 11:10:26,774][HYDRA] Contents of /netscratch/binder/projects/kibad-llm/logs/355_faktencheck_core_with_persona/evaluate/multiruns/2026-02-05_11-10-23/job_return_value.md
errors: [2026-02-05 11:12:55,302][HYDRA] Contents of /netscratch/binder/projects/kibad-llm/logs/355_faktencheck_core_with_persona/evaluate/multiruns/2026-02-05_11-12-54/job_return_value.md
5156b18 to e3f43c4
a3cdaf6 to 5ef5797
with chunking | started at | crashed
this implements #354
EDIT: We should wait for #282 #397, since the new LLM seems to work only with max_model_len=65536, which causes many more "too long" errors. This still needs to be verified, though (i.e. run again with the full max_model_len of 128k, but exclude H100-TRAILS).
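
To get a rough idea of how many more "too long" errors the smaller context window would cause, the prompt lengths could be compared against both limits. The following is only a minimal sketch: the helper `count_too_long`, the `reserved_for_output` parameter, and the token counts are hypothetical and not taken from this repository, and 128k is assumed to mean 131072 tokens.

```python
from typing import Iterable


def count_too_long(
    prompt_token_counts: Iterable[int],
    max_model_len: int,
    reserved_for_output: int = 0,
) -> int:
    """Count prompts that would not fit into the context window.

    Hypothetical helper: a prompt counts as "too long" when its token
    count exceeds max_model_len minus the tokens reserved for the output.
    """
    budget = max_model_len - reserved_for_output
    return sum(1 for n in prompt_token_counts if n > budget)


# Made-up token counts for illustration only (not measured data).
example_counts = [12_000, 70_000, 90_000, 140_000]

too_long_64k = count_too_long(example_counts, max_model_len=65_536)
# Assumption: "128k" means 131072 tokens.
too_long_128k = count_too_long(example_counts, max_model_len=131_072)

print(f"too long at 65536: {too_long_64k}, at 131072: {too_long_128k}")
```

Running this on the real per-document token counts (from whatever tokenizer the model uses) would show whether the difference between the two settings is large enough to justify waiting.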