Commit b78829c

BioGeek and claude committed
fix: recognize 'prediction' column in main() preprocessing entry point
The V2 handler clean_winnow_rescored() already checks multiple column name candidates, but main() only checked 'preds' and 'prediction_untokenised'. Winnow outputs use 'prediction', which was missed. Unified the column detection to match the V2 candidates list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b325846 commit b78829c

1 file changed

Lines changed: 4 additions & 4 deletions

src/instanexus/preprocessing.py

@@ -358,10 +358,10 @@ def main(
     if metadata_json is not None and "experiment_name" in df.columns:
         df["protease"] = df["experiment_name"].apply(lambda name: extract_protease(name, proteases))

-    if "preds" in df.columns:
-        df["cleaned_preds"] = df["preds"].apply(remove_modifications)
-    elif "prediction_untokenised" in df.columns:
-        df["cleaned_preds"] = df["prediction_untokenised"].apply(remove_modifications)
+    seq_candidates = ["preds", "prediction_untokenised", "prediction", "Peptide", "sequence"]
+    seq_col = next((c for c in seq_candidates if c in df.columns), None)
+    if seq_col is not None:
+        df["cleaned_preds"] = df[seq_col].apply(remove_modifications)
     else:
         raise ValueError("No suitable column found for peptide sequences.")
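
The first-match lookup introduced in this hunk can be tried in isolation. The sketch below is only an illustration, not code from the repository: the helper name pick_sequence_column and the sample column names spectrum_id and confidence are hypothetical, and only the candidate list is copied from the diff. It works on plain lists of names, so it runs without pandas.

```python
from typing import Iterable, Optional, Sequence

# Candidate names in priority order, copied from the diff.
SEQ_CANDIDATES = ["preds", "prediction_untokenised", "prediction", "Peptide", "sequence"]


def pick_sequence_column(
    columns: Iterable[str],
    candidates: Sequence[str] = SEQ_CANDIDATES,
) -> Optional[str]:
    """Return the first candidate found in `columns`, or None if none match.

    Hypothetical helper: it mirrors the next(..., None) idiom from the diff.
    """
    available = set(columns)
    # next() with a default avoids StopIteration when nothing matches.
    return next((c for c in candidates if c in available), None)


# Winnow-style output (sample column names are made up): only 'prediction' exists.
print(pick_sequence_column(["spectrum_id", "prediction", "confidence"]))  # prediction

# Priority follows the candidate list, not the column order in the data.
print(pick_sequence_column(["sequence", "preds"]))  # preds

# No candidate present: the caller decides how to fail (main() raises ValueError).
print(pick_sequence_column(["spectrum_id"]))  # None
```

Two properties follow from this idiom: when several candidates are present, the earliest entry in the list wins, and matching is case-sensitive, so a column named "peptide" would not match the "Peptide" candidate.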
