This document explains the architectural decisions and design philosophy behind gomark.
gomark is built on the principle of pragmatic simplicity:
"Solve real problems efficiently without over-engineering"
- Simplicity over Complexity: Choose the simplest solution that works
- Performance over Features: Fast, reliable parsing over theoretical completeness
- Maintainability over Flexibility: Code that's easy to understand and modify
- Real Needs over Theoretical Needs: Implement what's actually used
- Direct Solutions: Avoid layers of abstraction when direct approaches work
Decision: Use single-pass tokenization followed by token-based parsing
Rationale:
- Performance: The input is scanned exactly once, so tokenization time grows linearly with input size
- Simplicity: Tokens are easy to work with and debug
- Reusability: Tokens can be reused by multiple parsers
- Memory Efficiency: Tokens reference original string data
Alternative Considered: Text-based parsing
Why Rejected: Added complexity without clear benefits for our use cases
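The tokenization decision can be illustrated with a minimal sketch. The token type names, the Token struct, and the Tokenize function are assumptions made for this example and are not taken from gomark's source; merging ordinary characters into text runs is likewise an assumption.

```go
package main

import "fmt"

// TokenType labels a token. These names are illustrative, not gomark's actual set.
type TokenType string

const (
	TextToken     TokenType = "TEXT"
	AsteriskToken TokenType = "ASTERISK"
	NewlineToken  TokenType = "NEWLINE"
)

// Token holds a substring of the input. Go substrings share the backing
// bytes of the original string, so no text is copied.
type Token struct {
	Type  TokenType
	Value string
}

// Tokenize scans src once, left to right. '*' and '\n' become
// single-character tokens; everything else is merged into text runs.
func Tokenize(src string) []Token {
	var tokens []Token
	start := 0 // start of the pending text run
	flush := func(end int) {
		if end > start {
			tokens = append(tokens, Token{TextToken, src[start:end]})
		}
	}
	for i := 0; i < len(src); i++ {
		var t TokenType
		switch src[i] {
		case '*':
			t = AsteriskToken
		case '\n':
			t = NewlineToken
		default:
			continue // ordinary byte: extend the current text run
		}
		flush(i)
		tokens = append(tokens, Token{t, src[i : i+1]})
		start = i + 1
	}
	flush(len(src))
	return tokens
}

func main() {
	for _, tok := range Tokenize("a *b*\n") {
		fmt.Printf("%s %q\n", tok.Type, tok.Value)
	}
}
```

Because the special characters are ASCII, slicing on byte offsets never splits a multi-byte UTF-8 character.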
Decision: Use minimal Node interface with direct field access
type Node interface {
	Type() NodeType
	Restore() string
}
Rationale:
- Performance: Direct field access (node.Children) is faster than method calls
- Simplicity: Easy to understand and work with
- Focused: Only implements what's actually needed
- Memory Efficient: No overhead for unused tree navigation features
Alternative Considered: Complex tree interface
Why Rejected: Analysis showed no actual usage of tree navigation in our codebase
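As a rough sketch of how the minimal interface and direct field access fit together, consider the following. The Node interface matches the one shown above; the Text and Paragraph structs, the Content field, and the TextNode constant are hypothetical stand-ins rather than gomark's real definitions.

```go
package main

import (
	"fmt"
	"strings"
)

type NodeType string

const (
	ParagraphNode NodeType = "PARAGRAPH"
	TextNode      NodeType = "TEXT" // hypothetical constant for this sketch
)

// Node is the minimal interface: a type tag plus the ability to
// restore the original markdown text.
type Node interface {
	Type() NodeType
	Restore() string
}

// Text is a leaf node; Content is a plain exported field.
type Text struct {
	Content string
}

func (*Text) Type() NodeType    { return TextNode }
func (t *Text) Restore() string { return t.Content }

// Paragraph exposes Children directly instead of through
// Parent/NextSibling/Walk style navigation methods.
type Paragraph struct {
	Children []Node
}

func (*Paragraph) Type() NodeType { return ParagraphNode }

// Restore concatenates the restored text of every child.
func (p *Paragraph) Restore() string {
	var sb strings.Builder
	for _, child := range p.Children {
		sb.WriteString(child.Restore())
	}
	return sb.String()
}

func main() {
	p := &Paragraph{Children: []Node{&Text{"Hello, "}, &Text{"world"}}}
	for _, child := range p.Children { // direct field access
		fmt.Println(child.Type())
	}
	fmt.Println(p.Restore())
}
```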
Decision: Each parser is independent and stateless
Rationale:
- Simplicity: No complex context management
- Debuggability: Easy to test individual parsers
- Performance: No context overhead
- Maintainability: Clear separation of concerns
Alternative Considered: Context-heavy parsing
Why Rejected: Added complexity without clear benefits
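One way to picture independent, stateless parsers is sketched below. The Parser interface, its Match signature, the two parser types, and the string token types are all assumptions for illustration; gomark's actual parser contract may look different.

```go
package main

import "fmt"

// Token and Node are simplified stand-ins for the real token and AST types.
type Token struct {
	Type  string
	Value string
}

type Node struct {
	Type    string
	Content string
}

// Parser is a hypothetical stateless contract: inspect the remaining tokens
// and return a node plus the number of tokens consumed, or (nil, 0).
type Parser interface {
	Match(tokens []Token) (*Node, int)
}

// BoldParser matches ASTERISK ASTERISK TEXT ASTERISK ASTERISK.
// It has no fields, so nothing carries over between calls.
type BoldParser struct{}

func (BoldParser) Match(tokens []Token) (*Node, int) {
	if len(tokens) < 5 || tokens[2].Type != "TEXT" {
		return nil, 0
	}
	for _, i := range []int{0, 1, 3, 4} {
		if tokens[i].Type != "ASTERISK" {
			return nil, 0
		}
	}
	return &Node{Type: "BOLD", Content: tokens[2].Value}, 5
}

// TextParser is the fallback: it turns any single token into literal text.
type TextParser struct{}

func (TextParser) Match(tokens []Token) (*Node, int) {
	if len(tokens) == 0 {
		return nil, 0
	}
	return &Node{Type: "TEXT", Content: tokens[0].Value}, 1
}

// ParseInline tries each parser in order at the current position.
func ParseInline(tokens []Token, parsers []Parser) []*Node {
	var nodes []*Node
	for len(tokens) > 0 {
		advanced := false
		for _, p := range parsers {
			if node, n := p.Match(tokens); node != nil && n > 0 {
				nodes = append(nodes, node)
				tokens = tokens[n:]
				advanced = true
				break
			}
		}
		if !advanced {
			tokens = tokens[1:] // no parser matched: skip one token
		}
	}
	return nodes
}

func main() {
	tokens := []Token{
		{"ASTERISK", "*"}, {"ASTERISK", "*"}, {"TEXT", "hi"},
		{"ASTERISK", "*"}, {"ASTERISK", "*"}, {"TEXT", " there"},
	}
	for _, n := range ParseInline(tokens, []Parser{BoldParser{}, TextParser{}}) {
		fmt.Printf("%s %q\n", n.Type, n.Content)
	}
}
```

Since each parser depends only on its input slice, it can be unit-tested with a hand-built token list and no surrounding setup.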
Decision: Use NodeType string constants
type NodeType string
const ParagraphNode NodeType = "PARAGRAPH"
Rationale:
- Debuggability: Easy to inspect and debug
- Simplicity: No complex type hierarchies
- Extensibility: Easy to add new types
- JSON-Friendly: Serializes naturally
Alternative Considered: Interface-based type system
Why Rejected: Unnecessary complexity for our needs
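The JSON-friendliness of string constants can be seen in a short sketch. The nodeJSON struct, its field names, the Encode helper, and the TextNode constant are hypothetical and exist only for this example; the point is that a string-based NodeType needs no custom marshaling code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type NodeType string

const (
	ParagraphNode NodeType = "PARAGRAPH"
	TextNode      NodeType = "TEXT" // hypothetical constant for this sketch
)

// nodeJSON is an illustrative serialization shape, not gomark's own.
type nodeJSON struct {
	Type     NodeType   `json:"type"`
	Content  string     `json:"content,omitempty"`
	Children []nodeJSON `json:"children,omitempty"`
}

// Encode marshals a node tree; NodeType is written as a plain JSON string.
func Encode(n nodeJSON) (string, error) {
	b, err := json.Marshal(n)
	return string(b), err
}

func main() {
	doc := nodeJSON{
		Type:     ParagraphNode,
		Children: []nodeJSON{{Type: TextNode, Content: "hi"}},
	}
	out, err := Encode(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// Printing the type while debugging shows the readable name directly.
	fmt.Println(doc.Type)
}
```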
Decision: Use configuration to enable/disable features
Rationale:
- Performance: Disabled features have zero overhead
- Flexibility: Easy to customize for different use cases
- Maintainability: Clear feature boundaries
- User-Friendly: Simple API for configuration
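A minimal sketch of configuration-driven features follows. DefaultConfig and WithAllowHTML are names that appear elsewhere in this document, but the struct layout, the EnableTables field, and the EnabledParsers function are assumptions. The sketch assumes a disabled feature is simply never registered, which is one way to keep it out of the parsing path.

```go
package main

import "fmt"

// Config holds feature switches. EnableTables is a hypothetical field.
type Config struct {
	AllowHTML    bool
	EnableTables bool
}

// DefaultConfig enables every feature.
func DefaultConfig() *Config {
	return &Config{AllowHTML: true, EnableTables: true}
}

// WithAllowHTML returns a copy of the config with HTML toggled,
// leaving the receiver untouched.
func (c *Config) WithAllowHTML(allow bool) *Config {
	next := *c
	next.AllowHTML = allow
	return &next
}

// EnabledParsers lists the parsers to register for cfg. A disabled
// feature is never added, so it is never tried during parsing.
func EnabledParsers(cfg *Config) []string {
	names := []string{"heading", "paragraph", "bold"}
	if cfg.EnableTables {
		names = append(names, "table")
	}
	if cfg.AllowHTML {
		names = append(names, "html_element")
	}
	return names
}

func main() {
	fmt.Println(EnabledParsers(DefaultConfig()))
	fmt.Println(EnabledParsers(DefaultConfig().WithAllowHTML(false)))
}
```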
Decision: Use bytes.Buffer for output accumulation
Rationale:
- Performance: Efficient string building
- Memory: Reusable buffers
- Simplicity: Standard Go pattern
- Flexibility: Easy to extend
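The buffer-based output pattern, combined with pooling, might look like the sketch below. It uses only the standard library (bytes.Buffer, sync.Pool, html.EscapeString); the RenderParagraphs function and the pool variable are hypothetical and do not reflect gomark's renderer API.

```go
package main

import (
	"bytes"
	"fmt"
	"html"
	"sync"
)

// bufPool hands out reusable buffers so repeated renders avoid
// allocating a fresh buffer each time.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// RenderParagraphs wraps each paragraph in <p> tags, escaping its text.
func RenderParagraphs(paragraphs []string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold old content
	defer bufPool.Put(buf)

	for _, p := range paragraphs {
		buf.WriteString("<p>")
		buf.WriteString(html.EscapeString(p))
		buf.WriteString("</p>\n")
	}
	// String copies the bytes, so the result stays valid after Put.
	return buf.String()
}

func main() {
	fmt.Print(RenderParagraphs([]string{"a < b", "c"}))
}
```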
Public Packages:
├── ast/ # AST definitions - users need access
├── config/ # Configuration - users need to configure
├── parser/ # Parser interfaces - users may extend
├── renderer/ # Renderer interfaces - users may extend
Internal Implementation:
└── parser/internal/ # Parser implementations - users don't need access
Rationale:
- Public APIs allow extensibility where it matters
- Internal packages keep implementation details hidden
- Clean separation of concerns
- Reuse token slices where possible
- Buffer pooling in renderers
- Direct field access instead of method calls
- Tokenization is single-pass
- No multiple traversals of input text
- Direct token-to-AST conversion
- Only implement actually-used functionality
- No complex tree operations unless needed
- Disable unused extensions for zero overhead
These are conscious decisions, not oversights:
Current: Basic HTML tags without attributes
Rationale: Complex attribute parsing adds significant complexity for minimal benefit
Current: Single-character tokenization
Rationale: Works for all supported markdown features, simpler implementation
Current: Direct field access only
Rationale: Codebase analysis found no actual usage of tree navigation
Current: Stateless parsers
Rationale: Sufficient for current feature set, much simpler
Problem: Blank lines in blockquotes weren't rendered correctly
Solution: Enhanced Blockquote.Restore() to handle LineBreak nodes properly
Result: Blank lines in blockquotes are now preserved
Problem: Everything was in internal/ packages
Solution: Moved key packages to public for extensibility
Result: Modular architecture with better extensibility
✅ Choose gomark when:
- You need fast, reliable markdown parsing
- You want simple, maintainable code
- You're building applications, not markdown libraries
- You need good performance with moderate extensibility
- You want zero-configuration setup with all features enabled
Addition: Added support for essential HTML elements: <kbd>, <br>, <img>, <small>, <mark>
Approach:
- Reused existing HTMLElementNode rather than creating separate node types
- Enhanced with Children and IsSelfClosing fields for flexibility
- Smart parsing: Different strategies for self-closing vs container elements
- Attribute handling: Proper parsing with quote support and sanitization
- Security-first: HTML-escaped attributes and content validation
Rationale:
- These elements have no markdown equivalents (can't be achieved with existing syntax)
- Essential for documentation and note-taking (especially <kbd> for shortcuts)
- CommonMark and GFM both allow these elements as raw HTML
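The single-node approach for HTML elements can be sketched as follows. Children and IsSelfClosing are the field names mentioned above; the struct name, the TagName and Attributes fields, the Render method, and the use of plain strings for children are assumptions for this example. Attribute names are assumed to be validated already, and only values and text content are escaped here.

```go
package main

import (
	"fmt"
	"html"
	"sort"
	"strings"
)

// HTMLElement is one node type covering both container elements
// (<kbd>, <mark>) and self-closing ones (<br>, <img>).
type HTMLElement struct {
	TagName       string
	Attributes    map[string]string
	Children      []string // simplified: text children instead of AST nodes
	IsSelfClosing bool
}

// Render writes the element as HTML, escaping attribute values and text.
func (e *HTMLElement) Render() string {
	var sb strings.Builder
	sb.WriteString("<" + e.TagName)

	// Sort attribute names so the output is deterministic.
	keys := make([]string, 0, len(e.Attributes))
	for k := range e.Attributes {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Fprintf(&sb, ` %s="%s"`, k, html.EscapeString(e.Attributes[k]))
	}

	if e.IsSelfClosing {
		sb.WriteString(" />")
		return sb.String()
	}
	sb.WriteString(">")
	for _, c := range e.Children {
		sb.WriteString(html.EscapeString(c))
	}
	sb.WriteString("</" + e.TagName + ">")
	return sb.String()
}

func main() {
	kbd := &HTMLElement{TagName: "kbd", Children: []string{"Ctrl"}}
	img := &HTMLElement{
		TagName:       "img",
		Attributes:    map[string]string{"src": "a.png?x=1&y=2", "alt": "logo"},
		IsSelfClosing: true,
	}
	fmt.Println(kbd.Render())
	fmt.Println(img.Render())
}
```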
Change: Simplified configuration to "zero-config by default"
Before:
// Required configuration for HTML elements
cfg := config.DefaultConfig().WithAllowHTML(true)
engine := gomark.NewEngine(gomark.WithConfig(cfg))
After:
// HTML elements work by default - no config needed!
doc, err := gomark.Parse("Press <kbd>Ctrl</kbd> to copy")
New Configuration Approach:
- gomark.Parse() → Uses DefaultConfig() (all features enabled)
- config.DefaultConfig() → Single configuration with sensible defaults
Rationale:
- gomark is primarily used in memos where users want all features
- Configuration complexity was a barrier to adoption
- Smart defaults reduce cognitive load
gomark is designed to evolve pragmatically:
- Add features only when needed: No speculative features
- Maintain simplicity: New features shouldn't complicate existing code
- Performance first: New features shouldn't hurt performance
- Backward compatibility: Changes should be additive
Only if there's demonstrated need:
- Phase 2 HTML Elements: <details>/<summary>, <a> with attributes, <div>
- AST walking API (if users request it)
- More output formats (if users request them)
- Advanced HTML attribute parsing (if current approach proves insufficient)
gomark represents a pragmatic approach to markdown parsing:
- Clean modular architecture for extensibility
- Performance-focused implementation for real-world applications
- Simple, maintainable code that developers can understand and modify
- Focused feature set that solves real problems without over-engineering
This approach delivers excellent performance and maintainability while providing enough extensibility for most real-world use cases.