The Core Models system (src/core/models.py) is the foundational data modeling layer of the Marcus architecture. It defines the fundamental data structures and enumerations that represent all core business entities within the Marcus ecosystem, providing type-safe, well-documented abstractions for tasks, projects, workers, assignments, and system state.
TaskStatus TaskAssignment WorkerStatus
↓ ↓ ↓
Task ──────────────────────────── ProjectState
↓ ↓
BlockerReport ProjectRisk
The system is built around six primary data classes and three enumerations:
Enumerations:
- TaskStatus - Lifecycle states (TODO, IN_PROGRESS, DONE, BLOCKED)
- Priority - Urgency levels (LOW, MEDIUM, HIGH, URGENT)
- RiskLevel - Impact severity (LOW, MEDIUM, HIGH, CRITICAL)
Core Models:
- Task - Individual work items with dependencies and metadata
- ProjectState - Aggregate project health and metrics
- WorkerStatus - Agent capabilities, workload, and performance
- TaskAssignment - Execution context for assigned work
- BlockerReport - Issue tracking and resolution
- ProjectRisk - Risk assessment and mitigation planning
The Core Models system sits at the architectural center of Marcus, serving as the common language between all subsystems:
graph TD
A[Core Models] --> B[AI Engine]
A --> C[Task Assignment]
A --> D[Kanban Integration]
A --> E[MCP Server]
A --> F[Workflow Management]
A --> G[Monitoring Systems]
A --> H[Learning Systems]
A --> I[Context Management]
In the typical Marcus scenario workflow, Core Models are invoked at every stage:
Project creation:
- ProjectState instances created to track metrics
- ProjectRisk models initialized for assessment
- Project metadata stored in model structures

Agent registration:
- WorkerStatus model populated with agent capabilities
- Skills, availability, and performance scores recorded
- Agent pool state maintained through models

Task assignment:
- Task models filtered and analyzed for assignment
- TaskAssignment created with execution context
- Dependency graphs traversed using model relationships

Progress reporting:
- Task.actual_hours and status updated
- ProjectState metrics recalculated automatically
- Progress tracking through model state transitions

Blocker reporting:
- BlockerReport instances created with AI analysis
- RiskLevel assessed and propagated to project state
- Resolution workflows tracked through model lifecycle

Task completion:
- Task.status transitioned to DONE
- ProjectState.completed_tasks incremented
- Dependency chains unblocked automatically
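The completion step can be sketched as follows. This is a minimal illustration, not the Marcus implementation: the Task and ProjectState classes are cut-down stand-ins that keep only the fields named above, and complete_task is a hypothetical helper. The source describes the counter update and unblocking as automatic; here they are written out explicitly to show the logic.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class TaskStatus(Enum):
    TODO = "todo"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    BLOCKED = "blocked"


@dataclass
class Task:
    # Simplified stand-in for the real model
    id: str
    status: TaskStatus = TaskStatus.TODO
    dependencies: List[str] = field(default_factory=list)


@dataclass
class ProjectState:
    # Only the counter needed for this sketch
    completed_tasks: int = 0


def complete_task(task_id: str, tasks: Dict[str, Task], state: ProjectState) -> List[str]:
    """Mark a task DONE, bump the project counter, and return ids it unblocks."""
    task = tasks[task_id]
    if task.status is not TaskStatus.DONE:
        task.status = TaskStatus.DONE
        state.completed_tasks += 1

    done = {t.id for t in tasks.values() if t.status is TaskStatus.DONE}
    # A dependent is unblocked once every one of its dependencies is DONE
    return [
        t.id
        for t in tasks.values()
        if t.status is TaskStatus.TODO
        and task_id in t.dependencies
        and all(dep in done for dep in t.dependencies)
    ]


tasks = {
    "design": Task("design"),
    "build": Task("build", dependencies=["design"]),
    "deploy": Task("deploy", dependencies=["design", "build"]),
}
state = ProjectState()
unblocked = complete_task("design", tasks, state)
```

Here only "build" becomes available after "design" completes, because "deploy" still waits on "build".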
The system uses Python dataclasses with type hints, which enable static type checking while maintaining runtime flexibility:
@dataclass
class Task:
    id: str
    name: str
    description: str
    status: TaskStatus
    priority: Priority
    # Optional fields with smart defaults
    actual_hours: float = 0.0
    dependencies: List[str] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)

All models include temporal tracking for audit trails and analytics:
- created_at and updated_at for lifecycle tracking
- assigned_at for assignment timing
- reported_at for issue tracking
- identified_at for risk management
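A minimal sketch of how such timestamp fields might be populated. TimestampedTask, utc_now, and touch are hypothetical names for illustration, and the use of timezone-aware UTC values is an assumption rather than something the source states.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


@dataclass
class TimestampedTask:
    # Hypothetical model; only the timestamp fields mirror the list above
    id: str
    name: str
    created_at: datetime = field(default_factory=utc_now)
    updated_at: datetime = field(default_factory=utc_now)

    def touch(self) -> None:
        """Record that the task was modified."""
        self.updated_at = utc_now()


task = TimestampedTask(id="T-1", name="Write docs")
task.touch()
```

Using default_factory means each instance gets its own creation time rather than one value computed when the class is defined.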
Models support both flat and hierarchical project structures:
- Optional project_id and project_name for multi-project support
- Backward compatibility with single-project deployments
- Context propagation through model relationships
- Immutable enum types for efficient comparison
- Default factory functions prevent mutable default arguments
- Minimal object creation overhead through dataclass optimization
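The point about default factories can be illustrated with a short sketch. The Task class below is a reduced stand-in, and make_unsafe_class is a hypothetical helper that only exists to show what the dataclasses module does with a bare list default.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    # Reduced stand-in for the real model
    id: str
    dependencies: List[str] = field(default_factory=list)


a = Task("a")
b = Task("b")
a.dependencies.append("b")  # only a's list changes; b has its own list


def make_unsafe_class() -> Optional[str]:
    """Try to declare a list default directly and return the error message, if any."""
    try:
        @dataclass
        class Unsafe:
            items: List[str] = []  # rejected by dataclasses at class creation
    except ValueError as exc:
        return str(exc)
    return None


error = make_unsafe_class()
```

The dataclasses module refuses the bare list default with a ValueError, which is why the models use field(default_factory=list).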
class TaskStatus(Enum):
    TODO = "todo"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    BLOCKED = "blocked"

String-based enums chosen over integers for:
- JSON serialization compatibility with external systems
- Human-readable database storage
- API transparency and debugging
- Internationalization support
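As a sketch of the serialization benefit, the following round-trips a status through JSON using the enum's string value. parse_status is a hypothetical helper, not part of the documented API.

```python
import json
from enum import Enum
from typing import Optional


class TaskStatus(Enum):
    TODO = "todo"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    BLOCKED = "blocked"


def parse_status(raw: str) -> Optional[TaskStatus]:
    """Convert an external string to a TaskStatus, or None if it is unknown."""
    try:
        return TaskStatus(raw)
    except ValueError:
        return None


# Serialize using the human-readable value, then restore the enum member
payload = json.dumps({"id": "T-1", "status": TaskStatus.IN_PROGRESS.value})
restored = parse_status(json.loads(payload)["status"])
unknown = parse_status("archived")
```

The stored payload contains "in_progress" rather than an opaque integer, which keeps logs and database rows readable.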
# Safe mutable defaults
dependencies: List[str] = field(default_factory=list)
# Performance-optimized defaults
actual_hours: float = 0.0
# Optional context fields
project_id: Optional[str] = None

Tasks use string-based dependency references rather than object references:
- Prevents circular import issues
- Enables lazy loading and partial graphs
- Supports distributed task storage
- Simplifies serialization/deserialization
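A minimal sketch of resolving string references on demand, assuming tasks are held in a dictionary keyed by id. The resolve_dependencies helper and the index structure are illustrative assumptions; the source does not describe how Marcus stores or looks up tasks.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Task:
    # Reduced stand-in for the real model
    id: str
    name: str
    dependencies: List[str] = field(default_factory=list)


def resolve_dependencies(task: Task, index: Dict[str, Task]) -> Tuple[List[Task], List[str]]:
    """Look up a task's dependencies, returning (loaded tasks, ids not loaded)."""
    resolved = [index[dep] for dep in task.dependencies if dep in index]
    missing = [dep for dep in task.dependencies if dep not in index]
    return resolved, missing


# A partial graph: T-2 is referenced but has not been loaded
index = {
    "T-1": Task("T-1", "Design schema"),
    "T-3": Task("T-3", "Build API", ["T-1", "T-2"]),
}
resolved, missing = resolve_dependencies(index["T-3"], index)
```

Because a reference is only an id, a task can be created and serialized before the tasks it depends on are available.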
The TaskAssignment model includes security-focused fields:
workspace_path: Optional[str] = None
forbidden_paths: List[str] = field(default_factory=list)

These enable sandbox isolation for worker agents, preventing unauthorized file system access.
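One way such fields could be checked is sketched below. The is_path_allowed function is hypothetical, as are the example paths; how Marcus actually enforces these fields is not described in the source.

```python
from pathlib import Path
from typing import List, Optional


def _is_within(path: Path, root: Path) -> bool:
    """True if path equals root or lies somewhere beneath it."""
    return path == root or root in path.parents


def is_path_allowed(
    candidate: str,
    workspace_path: Optional[str],
    forbidden_paths: List[str],
) -> bool:
    """Allow a path only if it is inside the workspace and outside every forbidden path."""
    target = Path(candidate).resolve()  # normalizes ".." segments
    for forbidden in forbidden_paths:
        if _is_within(target, Path(forbidden).resolve()):
            return False
    if workspace_path is None:
        return True  # no workspace restriction configured
    return _is_within(target, Path(workspace_path).resolve())


workspace = "/srv/agents/a1"
forbidden = ["/srv/agents/a1/.secrets"]
ok = is_path_allowed("/srv/agents/a1/src/main.py", workspace, forbidden)
secret = is_path_allowed("/srv/agents/a1/.secrets/key", workspace, forbidden)
escape = is_path_allowed("/srv/agents/a1/../a2/notes.txt", workspace, forbidden)
```

Resolving paths before comparing them matters: without it, a ".." segment could be used to step outside the workspace.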
For basic tasks, the models provide lightweight tracking:
- Minimal required fields (id, name, description, status, priority)
- Default values for complex fields
- Direct status transitions
For sophisticated workflows, models scale up naturally:
- Rich dependency networks through dependencies lists
- Multi-project context via project_id/project_name
- Performance tracking via estimated_hours/actual_hours
- Risk assessment through BlockerReport and ProjectRisk
The AI-powered task assignment system (src/core/ai_powered_task_assignment.py) uses model metadata to determine task complexity:
# Phase 1: Safety filtering uses task.labels and dependencies
safe_tasks = await self._filter_safe_tasks(available_tasks)
# Phase 2: Dependency analysis leverages model relationships
dependency_scores = await self._analyze_dependencies(safe_tasks)
# Phase 3: AI matching considers all model attributes
ai_scores = await self._get_ai_recommendations(safe_tasks, agent_info)

Models provide seamless integration with external Kanban systems:
# Planka mapping
task.status -> Planka card status
task.labels -> Planka tags
task.description -> Planka card content
# Linear mapping
task.priority -> Linear priority levels
task.dependencies -> Linear parent/child relationships
task.estimated_hours -> Linear time estimates

The system includes board-specific quality checks:
- Required field validation per provider
- Status transition rules enforcement
- Dependency cycle detection
- Data consistency verification
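Of the checks listed above, dependency cycle detection is the most algorithmic. A minimal sketch using depth-first search over string dependency ids follows; find_dependency_cycle is a hypothetical function, not the check Marcus ships.

```python
from typing import Dict, List, Optional

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_dependency_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of task ids, or None if the graph is acyclic."""
    color = {node: WHITE for node in graph}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we closed a loop
                return stack[stack.index(dep):] + [dep]
            if state == WHITE and dep in graph:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


cyclic = find_dependency_cycle({"a": ["b"], "b": ["c"], "c": ["a"]})
acyclic = find_dependency_cycle({"a": ["b"], "b": [], "c": ["a", "b"]})
```

Dependency ids that are not keys of the graph are skipped, which fits the partial graphs that string references allow.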
Models abstract away provider-specific details:
- Normalized priority levels across systems
- Standardized status workflows
- Common dependency representations
- Unified metadata handling
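Status normalization might look like the sketch below. The provider names board_a and board_b and their column names are invented for illustration and do not reflect the actual Planka or Linear vocabularies; normalize_status is likewise a hypothetical helper.

```python
from enum import Enum
from typing import Dict


class TaskStatus(Enum):
    TODO = "todo"
    IN_PROGRESS = "in_progress"
    DONE = "done"
    BLOCKED = "blocked"


# Hypothetical provider vocabularies, keyed by lowercase column name
PROVIDER_STATUS_MAP: Dict[str, Dict[str, TaskStatus]] = {
    "board_a": {
        "backlog": TaskStatus.TODO,
        "in progress": TaskStatus.IN_PROGRESS,
        "done": TaskStatus.DONE,
        "on hold": TaskStatus.BLOCKED,
    },
    "board_b": {
        "open": TaskStatus.TODO,
        "started": TaskStatus.IN_PROGRESS,
        "closed": TaskStatus.DONE,
        "blocked": TaskStatus.BLOCKED,
    },
}


def normalize_status(provider: str, raw: str) -> TaskStatus:
    """Map a provider-specific status string onto the shared TaskStatus enum."""
    key = raw.strip().lower()
    try:
        return PROVIDER_STATUS_MAP[provider][key]
    except KeyError:
        raise ValueError(f"Unknown status {raw!r} for provider {provider!r}") from None


status_a = normalize_status("board_a", " In Progress ")
status_b = normalize_status("board_b", "started")
```

Both providers end up with the same TaskStatus member, so downstream code never needs to know which board a task came from.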
1. Consistency Across System
- Single source of truth for data structures
- Consistent field naming and types
- Unified validation rules
2. Type Safety
- Static error detection via type checkers
- IDE autocomplete support
- Reduced runtime type errors
3. Documentation Integration
- Numpy-style docstrings for all models
- Field-level documentation
- Usage examples included
4. Extensibility
- Easy to add new fields without breaking changes
- Optional fields support gradual feature rollout
- Enum values can be extended safely
5. Performance
- Lightweight dataclass implementation
- Efficient enum comparisons
- Minimal memory overhead
1. Coupling Risk
- Central models create dependency bottlenecks
- Changes require careful impact analysis
- Version compatibility challenges
2. Serialization Complexity
- Datetime handling across timezones
- Enum serialization for different targets
- Nested object serialization overhead
3. Validation Limitations
- No built-in field validation
- Complex constraint checking requires external logic
- Cross-field validation not enforced
4. Database Mapping
- ORM impedance mismatch potential
- No built-in persistence layer
- Manual mapping to storage formats
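The validation limitation noted above can be partly worked around with the dataclass __post_init__ hook. The sketch below is one possible approach, not what the Core Models currently do; ValidatedTask and try_create are hypothetical, and ValueError stands in for the project's own error types to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidatedTask:
    # Hypothetical model showing field and cross-field checks
    id: str
    name: str
    estimated_hours: float = 0.0
    actual_hours: float = 0.0

    def __post_init__(self) -> None:
        # Runs automatically after the generated __init__
        if not self.name.strip():
            raise ValueError("name must be a non-empty string")
        if self.estimated_hours < 0 or self.actual_hours < 0:
            raise ValueError("hours must be non-negative")


def try_create(**kwargs) -> Optional[str]:
    """Return the validation error message, or None if construction succeeds."""
    try:
        ValidatedTask(**kwargs)
    except ValueError as exc:
        return str(exc)
    return None


ok = try_create(id="T-1", name="Write docs", estimated_hours=2.0)
blank = try_create(id="T-2", name="   ")
negative = try_create(id="T-3", name="Fix bug", actual_hours=-1.0)
```

This only validates at construction time; later attribute assignments are not re-checked unless the class is frozen or uses properties.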
- Reduced boilerplate: Automatic __init__, __repr__, __eq__
- Type safety: Native type hint support
- Immutability options: frozen=True when needed
- Performance: Optimized memory layout
- API clarity: Self-documenting values
- Debugging ease: Human-readable in logs
- JSON compatibility: Values serialize directly via .value
- Database storage: Readable column values
- Backward compatibility: Existing code continues working
- Gradual migration: Features can be adopted incrementally
- Flexibility: Different use cases have different requirements
- Default handling: Sensible defaults reduce configuration burden
- Serialization: Easy JSON conversion
- Distributed systems: Works across process boundaries
- Lazy loading: Dependencies resolved on demand
- Circular reference avoidance: No memory leaks
1. Validation Framework
- Pydantic integration for field validation
- Cross-field constraint checking
- Custom validation rules per provider
2. Persistence Layer
- SQLAlchemy model mapping
- Document database support
- Caching layer integration
3. Event Sourcing
- Model state change events
- Audit trail automation
- Replay capability for debugging
4. Schema Evolution
- Automatic migration support
- Version compatibility checking
- Backward compatibility guarantees
1. Memory Efficiency
- __slots__ for reduced memory usage
- Interning for common string values
- Lazy property evaluation
2. Serialization Speed
- Custom serializers for hot paths
- Binary serialization options
- Compression for large datasets
3. Caching Strategy
- Model instance caching
- Computed property memoization
- Query result caching
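Two of the memory ideas above, __slots__ and string interning, can be sketched briefly. SlimTask is a hypothetical class with no default values, which is what allows a manually declared __slots__ to coexist with the dataclass decorator.

```python
import sys
from dataclasses import dataclass


@dataclass
class SlimTask:
    # Declaring __slots__ removes the per-instance __dict__
    __slots__ = ("id", "label")
    id: str
    label: str


# Build the label at runtime so the two strings start as separate objects
first_label = "".join(["back", "end"])
second_label = "".join(["back", "end"])

task = SlimTask(id="T-1", label=sys.intern(first_label))
other = SlimTask(id="T-2", label=sys.intern(second_label))

try:
    task.extra = "not allowed"  # no __dict__, so unknown attributes are rejected
    rejected = False
except AttributeError:
    rejected = True
```

After interning, both tasks share a single "backend" string object, which can add up when many instances repeat the same labels.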
Currently, the Core Models system does not have direct Cato integration, as Cato appears to be a future enhancement. However, the models are designed to support AI coaching systems through:
Context Awareness:
# Models provide rich context for AI analysis
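task_context = {
    "complexity": len(task.dependencies),
    "urgency": task.priority.value,
    # Guard against a zero or missing estimate to avoid ZeroDivisionError
    "historical_performance": (
        task.actual_hours / task.estimated_hours if task.estimated_hours else None
    ),
}

Coaching Metadata: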
- performance_score in WorkerStatus for coaching recommendations
- BlockerReport patterns for learning opportunities
- ProjectRisk analysis for proactive coaching
Future Cato Integration Points:
- Agent performance coaching based on WorkerStatus metrics
- Task difficulty prediction using Task historical data
- Risk mitigation coaching through ProjectRisk patterns
Models serve as the primary interface to Marcus's AI capabilities:
Analysis Context:
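# Task complexity analysis
# priority_weight is a hypothetical helper that maps a Priority member to a
# number; Priority values themselves are not assumed to be numeric
analysis_context = AnalysisContext(
    task_count=len(tasks),
    avg_priority=avg([priority_weight(t.priority) for t in tasks]),
    dependency_depth=max_dependency_depth(tasks),
)

Assignment Context: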
# Agent matching context
assignment_context = AssignmentContext(
    agent_skills=worker.skills,
    current_workload=len(worker.current_tasks),
    performance_history=worker.performance_score,
)

Models integrate with Marcus's error framework for robust error handling:
# Model validation errors
from src.core.error_framework import ValidationError

if not task.name.strip():
    raise ValidationError(
        field_name="name",
        field_value=task.name,
        validation_rule="non_empty_string",
    )

1. Immutability Where Possible
# Good: Use frozen dataclasses for immutable data
@dataclass(frozen=True)
class TaskSnapshot:
    task_id: str
    status: TaskStatus
    timestamp: datetime

2. Defensive Field Access
# Good: Handle optional fields safely
priority_weight = task.priority.value if task.priority else "medium"

3. Type Validation
# Good: Validate enum assignments
if status_value in [s.value for s in TaskStatus]:
    task.status = TaskStatus(status_value)

1. Model Conversion
# Converting to external format
def to_kanban_card(task: Task) -> Dict[str, Any]:
    return {
        "title": task.name,
        "description": task.description,
        "status": task.status.value,
        "labels": task.labels,
    }

2. Batch Operations
# Efficient bulk operations
completed_tasks = [t for t in tasks if t.status == TaskStatus.DONE]
total_effort = sum(t.actual_hours for t in completed_tasks)

The Core Models system represents a mature, well-architected foundation that successfully balances simplicity with extensibility, type safety with flexibility, and performance with maintainability. Its position at the center of the Marcus architecture makes it a critical success factor for the entire system's reliability and evolution.