- Implement a comprehensive Undo/Redo system that reduces error recovery time by 40-60%
- Deliver Node Grouping/Container functionality, enabling management of graphs 5-10x larger
- Achieve feature parity with professional node editors, moving PyFlowGraph from "interesting prototype" to "viable tool"
- Enable effective management of graphs with 200+ nodes through abstraction layers
- Establish foundation for professional adoption by addressing critical competitive disadvantages
PyFlowGraph is a universal node-based visual scripting editor built with Python and PySide6, following a "Code as Nodes" philosophy. Positioned as a workflow automation and integration platform, it enables users to build ETL pipelines, API integrations, data transformations, webhook handlers, and business process automation through visual programming.
The competitive landscape includes direct competitors in AI-focused visual workflows, which validates market demand while highlighting PyFlowGraph's unique positioning as a developer-centric, self-hosted alternative with unlimited Python ecosystem access. Currently, PyFlowGraph lacks two fundamental features that every professional workflow automation tool provides: Undo/Redo functionality and Node Grouping capabilities. Market analysis reveals that 100% of competitors in the workflow automation space have both features, and user feedback consistently cites their absence as a deal-breaker for professional adoption. This PRD addresses these critical gaps to transform PyFlowGraph into a professional-grade workflow automation platform capable of handling complex enterprise integration scenarios while maintaining our core differentiator of full programming flexibility.
| Date | Version | Description | Author |
|---|---|---|---|
| 2025-08-16 | 1.0 | Initial PRD creation | BMad Master |
| 2025-08-17 | 1.1 | Added AI-focused competitor analysis | Sarah (PO) |
- FR1: The system shall provide multi-level undo/redo with configurable history depth (default 50, max 200)
- FR2: The system shall support standard keyboard shortcuts (Ctrl+Z, Ctrl+Y, Ctrl+Shift+Z) with customization
- FR3: The system shall display action descriptions in menus and provide an undo/redo history dialog
- FR4: The system shall support undo/redo for: node creation/deletion, connection creation/deletion, node movement/positioning, property modifications, code changes, copy/paste operations, group/ungroup operations
- FR5: The system shall validate group creation, preventing circular dependencies and invalid selections
- FR6: The system shall generate group interface pins automatically based on external connections with type inference
- FR7: The system shall handle command failures gracefully with rollback capabilities
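
As an illustration of FR6, the sketch below derives group interface pins from the connections that cross the group boundary. The connection tuple shape and the name `infer_group_interface` are assumptions made for this example, not part of the existing codebase, and type inference is left out.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical connection shape: (source_node, source_pin, target_node, target_pin).
Connection = Tuple[str, str, str, str]
Pin = Tuple[str, str]  # (node_id, pin_name)


def infer_group_interface(
    members: Set[str], connections: Iterable[Connection]
) -> Tuple[List[Pin], List[Pin]]:
    """Return (input_pins, output_pins) for a group made of `members`.

    A connection entering the group exposes its target pin as a group input;
    one leaving the group exposes its source pin as a group output.
    Connections fully inside or fully outside the group are ignored.
    """
    inputs: List[Pin] = []
    outputs: List[Pin] = []
    for src_node, src_pin, dst_node, dst_pin in connections:
        src_in = src_node in members
        dst_in = dst_node in members
        if dst_in and not src_in and (dst_node, dst_pin) not in inputs:
            inputs.append((dst_node, dst_pin))
        elif src_in and not dst_in and (src_node, src_pin) not in outputs:
            outputs.append((src_node, src_pin))
    return inputs, outputs


example_connections = [
    ("load", "rows", "clean", "rows"),    # enters the group
    ("clean", "rows", "enrich", "rows"),  # internal to the group
    ("enrich", "rows", "save", "rows"),   # leaves the group
]
group_inputs, group_outputs = infer_group_interface(
    {"clean", "enrich"}, example_connections
)
```

Here `clean.rows` becomes the group's input pin and `enrich.rows` its output pin, while the internal connection stays hidden inside the group.
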
- NFR1: Individual undo/redo operations shall complete within 100ms; bulk operations within 500ms
- NFR2: Group operations shall scale linearly: 10ms per node for creation, 5ms per node for expansion
- NFR3: Memory usage for command history shall not exceed 50MB regardless of operation count
- NFR4: Grouped graph files shall be at most 25% larger than the equivalent flat representation
- NFR5: All operations shall maintain ACID properties with automatic consistency validation
- NFR6: The system shall support graphs up to 1000 nodes with graceful degradation beyond limits
Professional desktop application feel with modern dark theme aesthetics. The interface should feel familiar to users of other node editors (Blender Shader Editor, Unreal Blueprint) while maintaining PyFlowGraph's unique "Code as Nodes" philosophy. Prioritize efficiency for power users while remaining approachable for newcomers to visual scripting.
- Node-based visual programming with drag-and-drop connections
- Context-sensitive right-click menus for rapid access to functions
- Keyboard shortcuts for all major operations (professionals expect this)
- Pan/zoom navigation for large graphs with smooth transitions
- Multi-selection with standard Ctrl+Click and drag-rectangle patterns
- Visual feedback for all state changes (hover, selection, execution)
- Main Graph Editor (primary workspace with node canvas)
- Code Editor Dialog (modal Python code editing with syntax highlighting)
- Node Properties Dialog (node configuration and metadata)
- Undo History Dialog (visual undo timeline)
- Settings/Preferences Dialog (keyboard shortcuts, appearance, behavior)
No specific accessibility requirements for this MVP iteration.
Maintain PyFlowGraph's existing dark theme aesthetic with professional color scheme. Use Font Awesome icons for consistency. Ensure visual distinction between different node types through color coding and iconography.
Windows, Linux, macOS desktop applications with mouse and keyboard as primary input methods. Minimum screen resolution 1920x1080 for comfortable large graph editing.
Single repository containing all PyFlowGraph components. Current structure with src/, tests/, docs/, examples/ will be maintained and extended for new features.
Monolithic desktop application architecture using PySide6 Qt framework. All functionality integrated into single executable with modular internal architecture based on existing patterns (node system, execution engine, UI components).
Comprehensive testing approach following existing patterns: Unit tests for core functionality, integration tests for component interaction, GUI tests for user workflows. Maintain current fast execution model (<5 seconds total) with new test coverage for undo/redo and grouping features.
- Language: Python 3.8+ maintaining current compatibility requirements
- GUI Framework: Continue with PySide6 for cross-platform desktop consistency
- Architecture Pattern: Implement Command Pattern for undo/redo functionality
- Data Persistence: Extend existing Markdown flow format for group metadata
- Performance: Leverage existing QGraphicsView framework optimizations
- Dependencies: Minimize new external dependencies - prefer built-in Qt functionality
- File Format: Backward compatibility with existing .md graph files required
- Execution: Maintain existing subprocess isolation model for node execution
- Memory Management: Use Qt's parent-child hierarchy for automatic cleanup
- Code Style: Follow established patterns in docs/architecture/coding-standards.md
Epic 1: Foundation & Undo/Redo Infrastructure. Establish the Command Pattern infrastructure and basic undo/redo functionality, delivering immediate user value through mistake recovery capabilities.
Epic 2: Advanced Undo/Redo & User Interface. Complete the undo/redo system with full operation coverage, UI integration, and professional user experience features.
Epic 3: Core Node Grouping System. Implement fundamental grouping functionality allowing users to organize and manage complex graphs through collapsible node containers.
Establish the Command Pattern infrastructure and implement core undo/redo functionality for basic graph operations, providing users immediate ability to recover from common mistakes like accidental node deletion or connection errors. This epic delivers the foundation for all future undo/redo capabilities while providing immediate user value.
As a developer, I want a robust command pattern infrastructure, so that all graph operations can be made undoable in a consistent manner.
- Command base class with execute(), undo(), and get_description() methods
- CommandHistory class managing operation stack with configurable depth
- Integration point in NodeGraph for command execution
- Unit tests covering command execution and undo behavior
- Memory management preventing command history leaks
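
The acceptance criteria above could be sketched roughly as follows. `Command`, `CommandHistory` and the three method names come from the story; the stack layout, the clamp to FR1's maximum of 200, and the toy `AppendItem` command are assumptions for illustration only.

```python
from abc import ABC, abstractmethod
from typing import List


class Command(ABC):
    """Base class named in the story; concrete commands implement all three."""

    @abstractmethod
    def execute(self) -> None: ...

    @abstractmethod
    def undo(self) -> None: ...

    @abstractmethod
    def get_description(self) -> str: ...


class CommandHistory:
    """Undo/redo stacks with a bounded depth (FR1: default 50, max 200)."""

    MAX_DEPTH = 200

    def __init__(self, max_depth: int = 50) -> None:
        self.max_depth = max(1, min(max_depth, self.MAX_DEPTH))
        self._undo_stack: List[Command] = []
        self._redo_stack: List[Command] = []

    def execute(self, command: Command) -> None:
        command.execute()
        self._undo_stack.append(command)
        # Drop the oldest entries so the history cannot grow without bound.
        overflow = len(self._undo_stack) - self.max_depth
        if overflow > 0:
            del self._undo_stack[:overflow]
        # A new action invalidates anything that was previously undone.
        self._redo_stack.clear()

    def undo(self) -> bool:
        if not self._undo_stack:
            return False
        command = self._undo_stack.pop()
        command.undo()
        self._redo_stack.append(command)
        return True

    def redo(self) -> bool:
        if not self._redo_stack:
            return False
        command = self._redo_stack.pop()
        command.execute()
        self._undo_stack.append(command)
        return True


class AppendItem(Command):
    """Toy command used only to exercise the history."""

    def __init__(self, target: List[str], item: str) -> None:
        self.target = target
        self.item = item

    def execute(self) -> None:
        self.target.append(self.item)

    def undo(self) -> None:
        self.target.pop()

    def get_description(self) -> str:
        return f"Append {self.item}"
```

Bounding the undo stack is one simple way to address the memory-leak criterion; a byte-based budget would need per-command size estimates, which this sketch does not attempt.
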
As a user, I want to undo node creation and deletion, so that I can recover from accidental node operations.
- CreateNodeCommand implementing node creation with position tracking
- DeleteNodeCommand with full node state preservation (code, properties, connections)
- Undo restores exact node state including all properties
- Multiple sequential node operations can be undone individually
- Node IDs remain consistent across undo/redo cycles
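
A minimal sketch of the state-preservation idea behind `DeleteNodeCommand` follows. The `Graph` class here is a hypothetical stand-in that stores nodes as plain dictionaries keyed by ID; the real node and connection objects will differ.

```python
import copy
from typing import Any, Dict, List, Tuple


class Graph:
    """Hypothetical stand-in for the real graph: node dicts keyed by ID."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}
        self.connections: List[Tuple[str, str]] = []  # (source_id, target_id)


class DeleteNodeCommand:
    def __init__(self, graph: Graph, node_id: str) -> None:
        self.graph = graph
        self.node_id = node_id
        self._saved_node: Dict[str, Any] = {}
        self._saved_connections: List[Tuple[str, str]] = []

    def execute(self) -> None:
        # Snapshot before removal so undo can restore code, properties and wiring.
        self._saved_node = copy.deepcopy(self.graph.nodes[self.node_id])
        self._saved_connections = [
            c for c in self.graph.connections if self.node_id in c
        ]
        del self.graph.nodes[self.node_id]
        self.graph.connections = [
            c for c in self.graph.connections if self.node_id not in c
        ]

    def undo(self) -> None:
        # Reinsert under the original ID so later commands still resolve it.
        self.graph.nodes[self.node_id] = copy.deepcopy(self._saved_node)
        self.graph.connections.extend(self._saved_connections)

    def get_description(self) -> str:
        return f"Delete node {self.node_id}"
```

Restoring the node under its original ID is what keeps other commands in the history valid across undo/redo cycles.
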
As a user, I want to undo connection creation and deletion, so that I can experiment with graph connectivity without fear of losing work.
- CreateConnectionCommand tracking source and target pins
- DeleteConnectionCommand preserving connection properties
- Undo preserves bezier curve positioning and visual properties
- Connection validation occurs during redo operations
- Orphaned connections are handled gracefully during node deletion undo
As a user, I want standard Ctrl+Z and Ctrl+Y keyboard shortcuts, so that I can quickly undo and redo operations using familiar patterns.
- Ctrl+Z triggers undo with visual feedback
- Ctrl+Y and Ctrl+Shift+Z trigger redo operations
- Shortcuts work regardless of current focus within the application
- Visual status indication when no undo/redo operations are available
- Keyboard shortcuts are configurable in settings
Complete the undo/redo system with full operation coverage, UI integration, and professional user experience features.
As a user, I want to undo node movement and property changes, so that I can experiment with graph layout and node configuration without losing my work.
- MoveNodeCommand tracks position changes with start/end coordinates
- PropertyChangeCommand handles all node property modifications
- Batch movement operations (multiple nodes) handled as single undo unit
- Property changes preserve original values for complete restoration
- Visual feedback during undo shows nodes moving back to original positions
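
The batch-movement criterion could look roughly like this. It assumes node positions live in a dictionary mapping node ID to `(x, y)`; `MoveNodesCommand` is an illustrative name for a multi-node variant of the `MoveNodeCommand` described above.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


class MoveNodesCommand:
    """Moves several nodes as one undo unit.

    `positions` is a hypothetical mapping of node ID to (x, y) that stands in
    for the real scene items.
    """

    def __init__(self, positions: Dict[str, Point], moves: Dict[str, Point]) -> None:
        self.positions = positions
        self.end = dict(moves)
        # Capture start coordinates up front so undo restores them exactly.
        self.start = {node_id: positions[node_id] for node_id in moves}

    def execute(self) -> None:
        self.positions.update(self.end)

    def undo(self) -> None:
        self.positions.update(self.start)

    def get_description(self) -> str:
        count = len(self.end)
        return f"Move {count} node{'s' if count != 1 else ''}"
```

A single Ctrl+Z then returns every dragged node to its starting coordinates, while nodes outside the move are left untouched.
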
As a user, I want to undo code changes within nodes, so that I can experiment with Python code without fear of losing working implementations.
- CodeChangeCommand tracks full code content before/after modification
- Integration with code editor dialog for automatic command creation
- Undo restores exact code state including cursor position if possible
- Code syntax validation occurs during redo operations
- Large code changes are handled efficiently without memory issues
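
One possible reading of the criteria above is sketched here: the command stores the code before and after, and validates syntax with the standard library's `ast.parse` each time it is applied, including on redo. Representing a node as a dictionary with a `"code"` key is an assumption, and whether invalid code should be rejected or merely flagged is a design decision this PRD leaves open.

```python
import ast
from typing import Dict


class CodeChangeCommand:
    def __init__(self, node: Dict[str, str], new_code: str) -> None:
        self.node = node
        self.old_code = node["code"]
        self.new_code = new_code

    def execute(self) -> None:
        # Validate first so execute/redo never installs unparsable code.
        ast.parse(self.new_code)  # raises SyntaxError on invalid Python
        self.node["code"] = self.new_code

    def undo(self) -> None:
        self.node["code"] = self.old_code

    def get_description(self) -> str:
        return "Edit node code"
```

Storing both full strings is the simplest approach; for very large code bodies a diff-based representation could reduce memory use, at the cost of extra complexity.
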
As a user, I want to undo copy/paste operations and complex multi-step actions, so that I can quickly revert bulk changes to my graph.
- CompositeCommand handles multi-operation transactions as single undo unit
- Copy/paste operations create appropriate grouped commands
- Selection-based operations (delete multiple, move multiple) group automatically
- Undo description shows meaningful operation summaries (e.g., "Delete 3 nodes")
- Composite operations can be partially undone if individual commands fail
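
A composite command might be sketched as below. This version rolls back the children that already ran when a later child fails, in line with FR7; finer-grained partial undo would need additional bookkeeping. `SetKey` and `AlwaysFails` are toy commands invented for the example.

```python
from typing import Any, Dict, List


class CompositeCommand:
    """Runs child commands as one undo unit, rolling back on failure."""

    def __init__(self, description: str, commands: List[Any]) -> None:
        self.description = description
        self.commands = list(commands)

    def execute(self) -> None:
        done: List[Any] = []
        try:
            for command in self.commands:
                command.execute()
                done.append(command)
        except Exception:
            # Undo only what actually ran, newest first, then re-raise.
            for command in reversed(done):
                command.undo()
            raise

    def undo(self) -> None:
        for command in reversed(self.commands):
            command.undo()

    def get_description(self) -> str:
        return self.description


class SetKey:
    """Toy command: set a key in a dict and remember the previous state."""

    def __init__(self, store: Dict[str, Any], key: str, value: Any) -> None:
        self.store, self.key, self.value = store, key, value
        self._had_key = False
        self._old_value: Any = None

    def execute(self) -> None:
        self._had_key = self.key in self.store
        self._old_value = self.store.get(self.key)
        self.store[self.key] = self.value

    def undo(self) -> None:
        if self._had_key:
            self.store[self.key] = self._old_value
        else:
            self.store.pop(self.key, None)


class AlwaysFails:
    """Toy command that simulates a failing operation."""

    def execute(self) -> None:
        raise RuntimeError("simulated failure")

    def undo(self) -> None:
        pass
```

Undoing children in reverse order matters whenever later commands depend on earlier ones, for example pasted connections that reference pasted nodes.
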
As a user, I want visual undo/redo controls and history viewing, so that I can see what operations are available to undo and choose specific points to revert to.
- Edit menu shows undo/redo options with operation descriptions
- Toolbar buttons for undo/redo with appropriate icons and tooltips
- Undo History dialog showing list of operations with descriptions
- Status bar feedback showing current operation result
- Disabled state handling when no operations are available
Replace the current isolated subprocess-per-node execution model with a single shared Python interpreter, enabling direct object passing and 100-1000x performance improvements for ML/data science workflows while respecting GPU memory constraints.
As a user, I want to select multiple nodes and create a group, so that I can organize related functionality into manageable containers.
- Multi-select nodes using Ctrl+Click and drag-rectangle selection
- Right-click context menu "Group Selected" option on valid selections
- Keyboard shortcut Ctrl+G for grouping selected nodes
- Group creation validation preventing invalid selections (isolated nodes, etc.)
- Automatic group naming with user override option in creation dialog
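
Group creation validation could be approached along these lines. The check shown treats a selection as circular when some outside node is both downstream and upstream of the selection, because collapsing the selection would then form a cycle with that node. The function names and the `(source_id, target_id)` connection shape are assumptions, and other rules such as rejecting isolated nodes are not covered.

```python
from typing import Dict, Iterable, List, Set, Tuple


def _reachable(start: Set[str], edges: Dict[str, List[str]]) -> Set[str]:
    """Return every node reachable from `start` by following `edges`."""
    seen: Set[str] = set()
    stack = list(start)
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def validate_group(
    members: Set[str], connections: Iterable[Tuple[str, str]]
) -> List[str]:
    """Return a list of problems; an empty list means the selection is valid."""
    problems: List[str] = []
    if not members:
        problems.append("selection is empty")
    forward: Dict[str, List[str]] = {}
    backward: Dict[str, List[str]] = {}
    for src, dst in connections:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    # An outside node that is both fed by the group and feeds the group would
    # end up in a cycle with the collapsed group.
    downstream = _reachable(members, forward) - members
    upstream = _reachable(members, backward) - members
    if downstream & upstream:
        problems.append("grouping would create a circular dependency")
    return problems
```

For the chain `a -> b -> c`, selecting `a` and `c` without `b` is rejected, while selecting `a` and `b` is accepted.
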
As a developer, I want all nodes to execute in a single persistent Python interpreter, so that objects can be passed directly without any serialization or process boundaries.
- Single Python interpreter shared across all node executions
- Persistent namespace allowing imports and variables to remain loaded
- Direct function calls replacing subprocess communication
- Shared memory space for all Python objects
- Zero startup overhead between node executions
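
The core idea of a persistent namespace can be sketched with the built-in `exec`. `SharedInterpreter` and its methods are hypothetical names; the sketch only shows that imports and definitions survive between node runs, and it provides none of the isolation that the subprocess model offers.

```python
from typing import Any, Dict


class SharedInterpreter:
    """Runs node code in one long-lived namespace inside the current process."""

    def __init__(self) -> None:
        self.namespace: Dict[str, Any] = {"__name__": "pyflowgraph_shared"}

    def run(self, code: str) -> None:
        # exec() reuses the same dict, so imports and variables persist.
        exec(compile(code, "<node>", "exec"), self.namespace)

    def call(self, function_name: str, *args: Any, **kwargs: Any) -> Any:
        # A direct in-process call: no pipes, no serialization.
        return self.namespace[function_name](*args, **kwargs)


interpreter = SharedInterpreter()
interpreter.run("import json\ndef encode(value):\n    return json.dumps(value)")
interpreter.run("def shout(text):\n    return encode(text.upper())")
```

The second node can call `encode` because both snippets were executed against the same namespace dictionary.
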
As a user, I want to pass Python objects directly between nodes without any serialization, so that I can work with large tensors and DataFrames at maximum performance.
- Direct Python object references passed between nodes (no copying)
- Support for all Python types including PyTorch tensors, NumPy arrays, Pandas DataFrames
- Memory-mapped sharing for objects already in RAM
- Reference counting system for automatic cleanup
- No type restrictions or JSON fallbacks ever
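
One way to combine by-reference passing with explicit cleanup is a small output store that counts remaining consumers. `OutputStore` and its key format are assumptions for illustration; Python's own reference counting still decides when the object is actually freed once the store drops its entry.

```python
from typing import Any, Dict


class OutputStore:
    """Holds node outputs by reference until every consumer has read them."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._remaining: Dict[str, int] = {}

    def put(self, key: str, value: Any, consumers: int) -> None:
        if consumers < 1:
            raise ValueError("an output needs at least one consumer")
        self._values[key] = value  # stores a reference, not a copy
        self._remaining[key] = consumers

    def take(self, key: str) -> Any:
        value = self._values[key]
        self._remaining[key] -= 1
        if self._remaining[key] == 0:
            # Last consumer: drop the store's reference so the object can be
            # reclaimed as soon as the consumer is done with it.
            del self._values[key]
            del self._remaining[key]
        return value

    def __contains__(self, key: str) -> bool:
        return key in self._values
```

Because `take` returns the same object that was stored, a large tensor or DataFrame is never copied or serialized on its way to the next node.
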
As a user, I want nodes to execute sequentially with intelligent resource-aware scheduling, so that GPU memory constraints are respected and execution is optimized.
- Sequential execution following data dependency graph (no parallel execution)
- VRAM-aware scheduling preventing GPU out-of-memory conditions
- Memory threshold monitoring before executing memory-intensive nodes
- Execution queue management for optimal resource utilization
- Node priority system based on resource requirements
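
Sequential execution that follows the data dependency graph amounts to a topological ordering. The sketch below uses Kahn's algorithm with only the standard library; `execution_order` is an illustrative name, and VRAM-aware scheduling and node priorities are deliberately left out.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def execution_order(
    nodes: Iterable[str], edges: Iterable[Tuple[str, str]]
) -> List[str]:
    """Return nodes in an order where every node runs after its inputs."""
    node_list = list(nodes)
    dependents: Dict[str, List[str]] = {n: [] for n in node_list}
    pending: Dict[str, int] = {n: 0 for n in node_list}
    for src, dst in edges:
        dependents[src].append(dst)
        pending[dst] += 1
    # Nodes with no unmet inputs are ready to run.
    ready = deque(n for n in node_list if pending[n] == 0)
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in dependents[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(node_list):
        raise ValueError("graph contains a cycle")
    return order
```

A resource-aware scheduler could replace the plain queue with one that picks, among the ready nodes, the one whose estimated memory requirement currently fits.
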
As a user working with ML models, I want intelligent GPU memory management, so that I can work with large models and datasets without running out of VRAM.
- Real-time VRAM usage tracking per GPU device
- Pre-execution memory requirement estimation for GPU nodes
- Automatic tensor cleanup and garbage collection between executions
- GPU memory pooling and reuse strategies for common tensor sizes
- Warning system and graceful failure for potential OOM situations
As a developer and power user, I want detailed performance profiling of node execution, so that I can identify bottlenecks and optimize my workflows.
- Nanosecond-precision timing for individual node executions
- Memory usage tracking for both RAM and VRAM consumption
- Data transfer metrics showing object sizes and access patterns
- Bottleneck identification with visual indicators in the graph
- Performance regression detection comparing execution runs
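
Per-node timing and RAM tracking can be sketched with `time.perf_counter_ns` and `tracemalloc` from the standard library. `profile_node` is a hypothetical helper; it measures Python heap allocations only, so VRAM would need framework-specific calls, and `tracemalloc` itself adds overhead to the measured time.

```python
import time
import tracemalloc
from typing import Any, Callable, Dict, Tuple


def profile_node(
    func: Callable[..., Any], *args: Any, **kwargs: Any
) -> Tuple[Any, Dict[str, int]]:
    """Run `func` and return (result, stats) with time and peak memory."""
    tracemalloc.start()
    started = time.perf_counter_ns()
    try:
        result = func(*args, **kwargs)
    finally:
        elapsed_ns = time.perf_counter_ns() - started
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, {"elapsed_ns": elapsed_ns, "peak_bytes": peak_bytes}


result, stats = profile_node(lambda n: [0] * n, 10_000)
```

Storing the `stats` dictionaries per run would give the raw data needed for the regression comparison mentioned above.
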
As a developer, I want interactive debugging capabilities within the shared execution environment, so that I can inspect and debug node logic effectively.
- Breakpoint support within node execution with interactive debugging
- Variable inspection showing object contents between nodes
- Step-through execution mode for debugging data flow
- Live data visualization on connection lines during execution
- Python debugger (pdb) integration for advanced debugging
As a user, I want a clean migration path and comprehensive testing, so that the transition to single-process execution is reliable and performant.
- One-time migration removing subprocess dependencies from existing graphs
- Performance benchmarks demonstrating 100-1000x speedup for ML workflows
- ML framework testing (PyTorch, TensorFlow, JAX compatibility)
- Large data pipeline testing (Pandas, Polars, DuckDB integration)
- Memory leak detection and long-running execution stability tests
Deliver specialized optimizations and integrations for machine learning and data science workflows, leveraging the single-process architecture for maximum performance with popular frameworks and libraries.
As a data scientist or ML engineer, I want first-class integration with popular ML frameworks, so that I can build high-performance model training and inference pipelines.
- First-class PyTorch tensor support with automatic device management
- TensorFlow/Keras compatibility with session and graph management
- JAX array handling with JIT compilation support
- Automatic gradient tape and computation graph management
- Model state persistence and checkpointing between nodes
As a data engineer, I want optimized data processing capabilities for large datasets, so that I can build efficient ETL and analysis workflows.
- Pandas DataFrame zero-copy operations and view-based processing
- Polars lazy evaluation integration with query optimization
- DuckDB query planning and execution for analytical workloads
- Streaming data support with configurable buffering for large datasets
- Batch processing with intelligent chunk size optimization
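
As a rough illustration of chunked batch processing, the helpers below derive a chunk size from a memory budget and then iterate over the data in slices. Both function names and the simple bytes-per-row heuristic are assumptions; a real implementation would likely adapt the size from measured memory pressure.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunk_size_for(row_bytes: int, budget_bytes: int) -> int:
    """Return how many rows fit in the budget, never less than one."""
    return max(1, budget_bytes // max(1, row_bytes))


def iter_chunks(rows: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

For example, rows of about 100 bytes under a 1000-byte budget give chunks of 10 rows, and the last chunk is simply shorter when the data does not divide evenly.
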
As a power user, I want intelligent resource management and monitoring, so that I can maximize hardware utilization while preventing system overload.
- CPU core affinity settings and NUMA-aware execution
- GPU device selection and multi-GPU workload distribution
- Memory pressure monitoring with automatic cleanup strategies
- Disk I/O optimization for data loading and model checkpoints
- Network I/O handling for remote data sources and model serving
As a developer and data scientist, I want comprehensive visualization of data flow and system performance, so that I can optimize workflows and debug issues effectively.
- Real-time tensor shape and data type visualization on connections
- DataFrame schema and sample data preview during execution
- GPU utilization graphs and VRAM usage monitoring
- Memory allocation timeline with garbage collection events
- Interactive execution DAG with performance hotspot highlighting
Executive Summary:
- Overall PRD completeness: 95%
- MVP scope appropriateness: Just Right
- Readiness for architecture phase: Ready
- Most critical gaps: Minor integration testing details
Category Analysis:
| Category | Status | Critical Issues |
|---|---|---|
| 1. Problem Definition & Context | PASS | None |
| 2. MVP Scope Definition | PASS | None |
| 3. User Experience Requirements | PASS | None |
| 4. Functional Requirements | PASS | None |
| 5. Non-Functional Requirements | PASS | None |
| 6. Epic & Story Structure | PASS | None |
| 7. Technical Guidance | PASS | None |
| 8. Cross-Functional Requirements | PARTIAL | Integration test details |
| 9. Clarity & Communication | PASS | None |
Key Strengths:
- Clear problem statement with market validation
- Well-defined epic structure with logical sequencing
- Comprehensive user stories with testable acceptance criteria
- Strong technical foundation building on existing architecture
- Appropriate MVP scope focusing on core competitive gaps
Minor Improvements Needed:
- Integration testing approach between undo/redo and grouping systems
- Error recovery scenarios for complex nested group operations
- Performance testing methodology for large graph scenarios
Final Decision: READY FOR ARCHITECT
"Based on the completed PyFlowGraph PRD, create detailed UI/UX specifications for the undo/redo interface and node grouping visual design. Focus on professional node editor best practices and accessibility compliance."
"Using this PyFlowGraph PRD as input, create comprehensive technical architecture documentation covering Command Pattern implementation, Node Grouping system architecture, and integration with existing PySide6 codebase. Address performance requirements and backward compatibility constraints."