Mutable defaults fixes#222
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR implements comprehensive mutable defaults validation for the Cog runtime to prevent Python's mutable default footgun where shared mutable objects across function calls can cause subtle bugs. The implementation adds import-time validation that hard errors on mutable defaults and suggests proper default_factory usage, with comprehensive support for all collection types in both Python 3.9+ and typing module syntax.
- Added comprehensive mutable defaults validation to the
Input()function with clear error messages - Enhanced type system to support all collection types including sets and bare types like
list→List[Any] - Implemented proper
default_factorysupport using realdataclass.Fieldobjects - Fixed compatibility issues across Python versions 3.9-3.13
Reviewed Changes
Copilot reviewed 30 out of 32 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| python/coglet/api.py | Added mutable default validation and default_factory parameter to Input() function |
| python/coglet/adt.py | Enhanced type system for collection types and added PrimitiveType.ANY |
| python/coglet/schemas.py | Fixed Secret schema encoding with proper Field object handling |
| python/coglet/inspector.py | Enhanced Field object handling in validation and runtime execution |
| python/cog/coder/set_coder.py | Added new SetCoder for set type support |
| python/cog/coder/json_coder.py | Enhanced JsonCoder for Python 3.13 compatibility |
| python/cog/coder/pydantic_coder.py | Fixed BaseModelCoder for generic type compatibility |
| python/tests/test_input_function.py | Added comprehensive mutable defaults validation tests |
| python/tests/test_input_functional.py | Updated test to use default_factory instead of mutable default |
| python/tests/schemas/*.py | Updated all schema test files to use default_factory for mutable defaults |
| python/tests/runners/*.py | Added collection type test files and updated existing ones |
| python/tests/cases/repetition.py | Updated to use default_factory for list default |
| python/tests/bad_predictors/*.py | Updated test files to use default_factory for mutable defaults |
| internal/tests/input_defaults_test.go | Added Go integration tests for mutable default validation |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
c0f913a to
811ee48
Compare
Python's mutable default footgun (shared mutable objects across function calls)
was not being caught in Cog runtime, leading to subtle bugs where predictors
with `default=[]` would share the same list instance across predictions.
Implement import-time validation that hard errors on mutable defaults and
suggests proper `default_factory` usage, with comprehensive support for all
collection types in both Python 3.9+ and typing module syntax.
- Detect mutable defaults (`[]`, `{}`, `set()`, custom objects) at import time
- Provide clear, actionable error messages:
- `default=[]` → "Use Input(default_factory=list) instead"
- `default=[1,2,3]` → "Use Input(default_factory=lambda: [1,2,3]) instead"
- Validate mutual exclusion of `default` and `default_factory` parameters
- **Lists**: `list`, `List[T]`, `list[T]` (Python 3.9+)
- **Dicts**: `dict`, `Dict[K,V]`, `dict[K,V]` (Python 3.9+)
- **Sets**: `set`, `Set[T]`, `set[T]` (Python 3.9+)
- **Bare types**: `list`→`List[Any]`, `dict`→`Dict[str,Any]`, `set`→`Set[Any]`
- Use real `dataclass.Field` objects instead of bespoke implementation
- Compatible with Python 3.9+ (`kw_only` parameter handling)
- Full support for `default_factory` execution at runtime
- Go integration tests verify end-to-end runtime behavior
- Python tests validate all collection type combinations
- Smart pytest skipping for intentionally-bad test modules
- `python/coglet/api.py`: Added mutable default validation to `Input()` function
- `python/coglet/adt.py`: Enhanced type system for all collection types, added `PrimitiveType.ANY`
- `python/coglet/schemas.py`: Fixed Secret schema encoding with proper normalization
- `python/coglet/inspector.py`: Enhanced Field object handling in validation/runtime
- `python/cog/coder/`: Added `SetCoder`, enhanced `JsonCoder` for Python 3.13 compatibility
- 10 mutable defaults validation tests (empty/populated collections, custom objects)
- 9 collection type tests (typing vs built-in vs bare syntax)
- Go integration tests for setup failure detection
- Python 3.9, 3.10, 3.11, 3.12, 3.13 compatibility validation
- Fixed `Field` constructor for Python 3.9 (`kw_only` parameter)
- Fixed `issubclass()` and `inspect.getmro()` for generic types in Python 3.13
- Enhanced `pydantic_coder` and `json_coder` for generic type compatibility
Before (silent footgun):
```python
def predict(self, items: List[str] = Input(default=[])) -> str:
items.append("new") # Modifies shared list across predictions!
return str(items)
```
After (clear error + fix):
```python
def predict(self, items: List[str] = Input(default_factory=list)) -> str:
items.append("new") # Safe: fresh list each time
return str(items)
```
Testing
- Python tests: 19/19 collection + validation tests
- Go tests: 3/3 integration tests
- mypy: All type checking passes
- Multi-version: Python 3.9, 3.10, 3.11, 3.12, 3.13 validated
Fixes mutable default footgun while maintaining full backward compatibility
for correctly-written predictors.
811ee48 to
bd12721
Compare
There was a problem hiding this comment.
Pull Request Overview
Copilot reviewed 30 out of 32 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
michaeldwan
left a comment
There was a problem hiding this comment.
Code good, but be warned, I have less than zero experience with this aspect of python 😬
Let's talk about the question I raised about failing vs best-effort correcting.
Replace error-throwing validation with automatic conversion of mutable defaults
to safe default_factory implementations. This prevents Python's mutable default
footgun while maintaining backward compatibility.
Changes:
- Convert empty collections ([], {}, set()) to built-in constructors (list, dict, set)
- Convert populated collections/objects to lambda factories using copy.deepcopy
- Update all tests to verify auto-conversion instead of error cases
- Add comprehensive E2E test verifying isolation between prediction calls
- Remove pytest skip logic from test files now that conversion works
There was a problem hiding this comment.
Pull Request Overview
Copilot reviewed 31 out of 33 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
michaeldwan
left a comment
There was a problem hiding this comment.
looks good, I like the new behavior!
Python's mutable default footgun (shared mutable objects across function calls)
was not being caught in Cog runtime, leading to subtle bugs where predictors with
default=[]would share the same list instance across predictions.Replace error-throwing validation with automatic conversion of mutable defaults
to safe
default_factoryimplementations. This prevents the footgun whilemaintaining backward compatibility and requiring no code changes.
Automatic Conversion Logic:
[]→default_factory=list,{}→default_factory=dict[1,2,3]→default_factory=lambda: copy.deepcopy([1,2,3])MyObj()→default_factory=lambda: copy.deepcopy(MyObj())Collection Type Support:
list,List[T],list[T](Python 3.9+)dict,Dict[K,V],dict[K,V](Python 3.9+)set,Set[T],set[T](Python 3.9+)Implementation:
dataclass.Fieldobjects with properdefault_factoryInput()functionTesting & Validation:
Before (silent footgun):
After (automatic fix):
Prevents mutable default footgun with zero breaking changes to existing code.
This accurately describes what we actually implemented - automatic conversion rather than validation errors.