
Commit 8818d45

Luodian and claude authored
Fix TypeError when using write_out with log_samples (#839)
* Fix TypeError when using write_out with log_samples

  When both --write_out and --log_samples flags are used together, the print_writeout function can encounter instances where inst.doc is None, causing a TypeError when trying to access doc[doc_to_target]. This fix adds a check for None documents and provides a fallback message instead of crashing.

  Fixes the issue where the following error occurs:
  TypeError: 'NoneType' object is not subscriptable
  at lmms_eval/api/task.py, line 1347, in doc_to_target

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>
  Reported-by: rixejzvdl649
  Reported-by: pspdada
  Github-Issue: #143

* Add warnings for --write_out flag usage

  The --write_out flag is intended for debugging purposes only and can significantly impact performance during evaluations. This commit adds:

  1. Runtime warning when --write_out is enabled
  2. Updated help text to clearly indicate it's for debugging only
  3. Documentation in print_writeout function about its debugging purpose
  4. Suggestion to use --log_samples for production use

  These warnings help users understand that --write_out should not be used during actual evaluation runs.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Mark --write_out flag as deprecated

  The --write_out flag has limited use and overlaps with --log_samples functionality. This commit marks it as deprecated to guide users toward the better-maintained --log_samples feature.

  Changes:
  - Added DEPRECATION WARNING when --write_out is used
  - Updated help text to indicate deprecation
  - Added deprecation notices in function docstrings
  - Specified removal target as v0.5.0
  - Clear guidance to use --log_samples instead

  The --write_out flag only prints first few documents to console and impacts performance, while --log_samples saves all outputs to files for comprehensive debugging without performance impact.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 1d30b96 commit 8818d45

3 files changed

Lines changed: 27 additions & 2 deletions


lmms_eval/__main__.py

Lines changed: 11 additions & 1 deletion
@@ -165,7 +165,9 @@ def parse_eval_args() -> argparse.Namespace:
         "-w",
         action="store_true",
         default=False,
-        help="Prints the prompt for the first few documents.",
+        help="DEPRECATED: This flag is deprecated and will be removed in a future version. "
+        "For debugging, use --log_samples to save all outputs to files. "
+        "This flag prints prompts for the first few documents to console, impacting performance.",
     )
     parser.add_argument(
         "--log_samples",
@@ -399,6 +401,14 @@ def cli_evaluate_single(args: Union[argparse.Namespace, None] = None) -> None:

     evaluation_tracker = EvaluationTracker(**evaluation_tracker_args)

+    if args.write_out:
+        eval_logger.warning(
+            "DEPRECATION WARNING: --write_out is deprecated and will be removed in v0.5.0. "
+            "For debugging and analysis, use --log_samples instead, which saves all model "
+            "outputs to files without impacting performance. The --write_out flag only prints "
+            "the first few documents to console and provides limited debugging value."
+        )
+
     if args.predict_only:
         args.log_samples = True
     if (args.log_samples or args.predict_only) and not args.output_path:
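
The two changes to lmms_eval/__main__.py (deprecation wording in the help text, plus a runtime warning when the flag is set) can be tried in isolation. What follows is a minimal sketch, not the project's parser: `build_parser`, `warn_if_deprecated`, the logger name `write_out_demo`, and the `--log_samples` help string are hypothetical and exist only for illustration. The flag names and the gist of the warning text are taken from the diff.

```python
import argparse
import logging

# Hypothetical stand-in for the project's eval_logger.
logger = logging.getLogger("write_out_demo")


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser with the two flags discussed in this commit."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--write_out",
        "-w",
        action="store_true",
        default=False,
        help="DEPRECATED: prints prompts for the first few documents to console. "
        "For debugging, use --log_samples instead.",
    )
    parser.add_argument(
        "--log_samples",
        action="store_true",
        default=False,
        help="Save model outputs to files (illustrative help text).",
    )
    return parser


def warn_if_deprecated(args: argparse.Namespace) -> bool:
    """Log a deprecation warning if --write_out was passed; return whether it fired."""
    if args.write_out:
        logger.warning(
            "DEPRECATION WARNING: --write_out is deprecated. Use --log_samples instead."
        )
        return True
    return False


if __name__ == "__main__":
    parsed = build_parser().parse_args(["-w"])
    warn_if_deprecated(parsed)
```

Keeping the check in a small function that returns a boolean makes the behavior easy to assert in a test without capturing log output.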

lmms_eval/evaluator.py

Lines changed: 5 additions & 0 deletions
@@ -459,6 +459,11 @@ def evaluate(
         )
         eval_logger.debug(f"Task: {task_output.task_name}; number of requests on this rank: {len(task._instances)}")
         if write_out:
+            eval_logger.warning(
+                "DEPRECATION WARNING: --write_out is deprecated and will be removed in v0.5.0. "
+                "Use --log_samples instead for saving model outputs and debugging. "
+                "The write_out flag only prints the first few documents and impacts performance."
+            )
             print_writeout(task)
         # aggregate Instances by LM method requested to get output.
         for instance in task.instances:
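
The warning in `evaluate` sits next to per-task code, so it may be logged more than once per run, in addition to the warning from lmms_eval/__main__.py. This is an inference from the surrounding context lines; the diff does not show the enclosing loop. If the repetition turns out to be noisy, one possible approach is a once-only helper. In the sketch below, `warn_once` and the logger name are hypothetical and not part of lmms_eval.

```python
import logging

# Hypothetical stand-in for the project's eval_logger.
logger = logging.getLogger("write_out_demo")

# Messages that have already been logged in this process.
_seen_messages = set()


def warn_once(message: str) -> bool:
    """Log `message` at WARNING level the first time only.

    Returns True if the message was logged by this call, False if it had
    already been logged earlier.
    """
    if message in _seen_messages:
        return False
    _seen_messages.add(message)
    logger.warning(message)
    return True


if __name__ == "__main__":
    for task_name in ["task_a", "task_b", "task_c"]:
        # Only the first iteration emits the warning.
        warn_once("DEPRECATION WARNING: --write_out is deprecated.")
```

The helper deduplicates within a single process only; in a multi-rank run each rank would still log the message once.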

lmms_eval/evaluator_utils.py

Lines changed: 11 additions & 1 deletion
@@ -177,12 +177,22 @@ def get_subtask_list(task_dict, task_root=None, depth=0):


 def print_writeout(task) -> None:
+    """Print first few documents for debugging purposes.
+
+    DEPRECATED: This function is deprecated and will be removed in v0.5.0.
+    Use log_samples functionality instead for better debugging capabilities.
+
+    WARNING: This function only prints the first few documents to console
+    and can significantly impact performance during evaluations.
+    """
     for inst in task.instances:
         # print the prompt for the first few documents
         if inst.doc_id < 1:
+            # Handle cases where inst.doc might be None (e.g., when using log_samples)
+            target = "N/A (document is None)" if inst.doc is None else task.doc_to_target(inst.doc)
             eval_logger.info(
                 f"Task: {task}; document {inst.doc_id}; context prompt (starting on next line):\
-\n{inst.args[0]}\n(end of prompt on previous line)\ntarget string or answer choice index (starting on next line):\n{task.doc_to_target(inst.doc)}\n(end of target on previous line)"
+\n{inst.args[0]}\n(end of prompt on previous line)\ntarget string or answer choice index (starting on next line):\n{target}\n(end of target on previous line)"
             )
             eval_logger.info(f"Request: {str(inst)}")

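
The guard added to `print_writeout` can be reproduced outside the project. The sketch below is illustrative only: `FakeTask`, `FakeInstance`, `resolve_target`, and the `"answer"` key are hypothetical stand-ins, and the real `doc_to_target` in lmms_eval/api/task.py is more involved. It shows why indexing a `None` document raises the reported `TypeError` and how the conditional expression from the diff avoids it.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeInstance:
    """Hypothetical stand-in for a request instance."""

    doc_id: int
    doc: Optional[dict]
    args: tuple


class FakeTask:
    """Hypothetical task whose doc_to_target indexes into the document."""

    def doc_to_target(self, doc: dict) -> Any:
        # Subscripting None here raises:
        # TypeError: 'NoneType' object is not subscriptable
        return doc["answer"]


def resolve_target(task: FakeTask, inst: FakeInstance) -> Any:
    """Same conditional as the diff: fall back to a message when doc is None."""
    return "N/A (document is None)" if inst.doc is None else task.doc_to_target(inst.doc)


if __name__ == "__main__":
    task = FakeTask()
    print(resolve_target(task, FakeInstance(0, {"answer": "B"}, ("prompt",))))
    print(resolve_target(task, FakeInstance(0, None, ("prompt",))))
```

The fallback string keeps the log line well formed and makes it clear that the document was missing, not that the target was empty.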
