Feature foreach unordered validation#3062
Conversation
Implements serializable dict representation for Run and Task, making the Metaflow client more agent/LLM-friendly. Closes Netflix#1833
Greptile SummaryThis PR adds Confidence Score: 4/5Safe to merge for the foreach validation change; unrelated The core fix is correct: the method is defined once, placed appropriately in
Important Files Changed
Reviews (3): Last reviewed commit: "Merge branch 'master' into feature/forea..." | Re-trigger Greptile |
| meta_dict = self.metadata_dict | ||
| return env.get_client_info(self.path_components[0], meta_dict) | ||
|
|
||
| def to_dict(self) -> Dict[str, Any]: | ||
| """ | ||
| Returns a dictionary representation of this Task. | ||
|
|
||
| Useful for agents and external systems that need a serializable | ||
| summary of the task without fetching full artifact data. | ||
|
|
||
| Returns | ||
| ------- | ||
| Dict[str, Any] | ||
| Dictionary containing key metadata about this task. | ||
| """ | ||
| return { | ||
| "pathspec": self.pathspec, | ||
| "id": self.id, | ||
| "successful": self.successful, | ||
| "finished": self.finished, | ||
| "finished_at": self.finished_at.isoformat() if self.finished_at else None, | ||
| "created_at": self.created_at.isoformat() if self.created_at else None, | ||
| "tags": list(self.tags), | ||
| "user_tags": list(self.user_tags), | ||
| "runtime_name": self.runtime_name, | ||
| "current_attempt": self.current_attempt, | ||
| "origin_pathspec": self.origin_pathspec, | ||
| } | ||
|
|
There was a problem hiding this comment.
Unrelated changes mixed into the PR
The additions of Task.to_dict() and Run.to_dict() (and the corresponding test/unit/test_to_dict.py file) are entirely unrelated to the foreach-unordered validation fix described in the PR description and the linked issue. Mixing unrelated features in one PR makes review harder, complicates bisection if a regression is introduced, and makes the changelog harder to read.
Consider splitting these to_dict additions into a separate PR so each change can be reviewed and landed on its own merit.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
PR Type
Summary
When a set or frozenset is used to foreach Metaflow silently produces incorrect results— some tasks run on wrong inputs and others are skipped entirely. Now this PR adds early validation in FlowSpec.next() to reject unordered collections (set, frozenset) with a clear error message suggesting list().
Issue
Fixes #469
Reproduction
Runtime: local
Commands to run:
Where evidence shows up: parent console — tasks silently run on wrong inputs or are skipped with no error.
Before (error / log snippet)
``` # No error raised — tasks execute silently on wrong or duplicate inputs ```After (evidence that fix works)
``` MetaflowException: Foreach variable *self.items* in step *start* is a set, which has no guaranteed iteration order. This can cause tasks to run on wrong inputs or some inputs to be skipped entirely. Wrap it in list() first, e.g.: self.items = list(self.items) ```Root Cause
FlowSpec.next() expects the foreach input to have a fixed order, since
tasks are assigned to elements based on their position.Sets and frozensets don’t guarantee a consistent order, so the same data can be processed in a different order across runs. This can cause tasks to receive the wrong inputs or some inputs to be skipped. Since this happens silently without any error, it is hard to notice and debug. Since this failure is silent, it is particularly difficult to debug in practice.
Why This Fix Is Correct
The fix adds
_validate_foreach_type()which raisesInvalidNextExceptionif a set or frozenset is used before any tasks are created.Instead of running with wrong results, it stops early and shows a clear message telling the user to convert it to a list. Lists and tuples keep working the same, so existing valid usage is not affected.
Failure Modes Considered
Tests
5 unit tests in
test/unit/test_foreach_unordered.py:test_set_raises— set raises InvalidNextExceptiontest_frozenset_raises— frozenset raises InvalidNextExceptiontest_error_message_contains_fix_hint— error message containslist()hinttest_list_does_not_raise— list passes through cleanlytest_tuple_does_not_raise— tuple passes through cleanlyNon-Goals
AI Tool Usage
I used AI tools to identify relevant code locations and also to draft initial tests.All code changes were reviewed, understood, and tested manually.