| 1 | +# Task: Generate API Reference Documentation for the Kafka `KafkaConsumer` |
| 2 | + |
| 3 | +## Objective |
| 4 | + |
| 5 | +Generate comprehensive API reference documentation for the Apache Kafka `KafkaConsumer` Java API. The documentation should cover the complete API surface with emphasis on behavioral semantics, offset management strategies, rebalancing mechanics, and error handling patterns. |
| 6 | + |
| 7 | +## Scope |
| 8 | + |
| 9 | +Your documentation should cover the **KafkaConsumer API** from the `org.apache.kafka.clients.consumer` package, including: |
| 10 | + |
| 11 | +1. **Core Consumer Lifecycle** |
| 12 | + - Constructor variants and configuration |
| 13 | + - Subscription methods (dynamic and manual assignment) |
| 14 | + - Polling and data fetching semantics |
| 15 | + - Resource cleanup and closing |
| 16 | + |
| 17 | +2. **Offset Management** |
| 18 | + - Synchronous and asynchronous commit strategies |
| 19 | + - Offset queries and position control |
| 20 | + - Seek operations and offset discovery |
| 21 | + - Committed offset semantics |
| 22 | + |
| 23 | +3. **Consumer Group Mechanics** |
| 24 | + - ConsumerRebalanceListener interface and callbacks |
| 25 | + - Rebalance triggers and timing |
| 26 | + - Group membership and heartbeat behavior |
| 27 | + - Partition assignment vs subscription models |
| 28 | + |
| 29 | +4. **Flow Control and Position Management** |
| 30 | + - Pause and resume functionality |
| 31 | + - Position queries and manipulation |
| 32 | + - Timestamp-to-offset lookups (`offsetsForTimes`) |
| 33 | + - Beginning and end offset discovery |
| 34 | + |
| 35 | +5. **Error Handling** |
| 36 | + - Exception types and recovery strategies |
| 37 | + - CommitFailedException and group fencing |
| 38 | + - WakeupException and thread interruption |
| 39 | + - Timeout and authentication errors |
| 40 | + |
| 41 | +## Requirements |
| 42 | + |
| 43 | +### API Methods Documentation (40%) |
| 44 | + |
| 45 | +Document all public methods of the `KafkaConsumer` class with: |
| 46 | +- Method signatures including all overloads |
| 47 | +- Parameter semantics and validation rules |
| 48 | +- Return types and their meanings |
| 49 | +- Exception types thrown and conditions |
| 50 | + |
| 51 | +### Behavioral Notes (30%) |
| 52 | + |
| 53 | +Explain critical behavioral semantics: |
| 54 | +- **poll() blocking behavior**: when it returns immediately vs when it blocks, timeout handling, rebalance callback execution during poll |
| 55 | +- **Offset commit semantics**: difference between sync/async commits, retry behavior, commit failure handling |
| 56 | +- **Rebalance coordination**: when rebalances occur (only during poll), callback ordering (revoked then assigned), partition ownership guarantees |
| 57 | +- **Thread safety**: `KafkaConsumer` is not thread-safe; cover `wakeup()` as the one method that is safe to call from another thread, and the single-threaded event loop model |
| 58 | +- **Group membership**: `max.poll.interval.ms` enforcement, proactive leave behavior, `session.timeout.ms` vs `max.poll.interval.ms` |
| 59 | +- **Manual vs dynamic assignment**: mutually exclusive nature, use cases for each model |
| 60 | +- **Position vs committed offset**: the off-by-one relationship (the committed offset should be the offset of the next record to read, i.e. the last processed offset plus one) |
| 61 | +- **Transactional semantics**: `read_committed` isolation level, last stable offset (LSO) boundary, filtering of aborted transactional messages |
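
To make the off-by-one relationship above concrete, here is a minimal plain-Java sketch. It does not use the Kafka client: the class name `CommitOffsets`, the method `nextOffsetsToCommit`, and the string partition keys are hypothetical stand-ins. In real consumer code the keys would be `TopicPartition` instances and the values `OffsetAndMetadata`, passed to `commitSync(Map)`.

```java
import java.util.HashMap;
import java.util.Map;

public class CommitOffsets {

    /**
     * Given the offset of the last record processed per partition, returns the
     * offset to commit for each partition: last processed offset + 1, i.e. the
     * offset of the next record the consumer should read.
     */
    public static Map<String, Long> nextOffsetsToCommit(Map<String, Long> lastProcessed) {
        Map<String, Long> toCommit = new HashMap<>();
        for (Map.Entry<String, Long> entry : lastProcessed.entrySet()) {
            toCommit.put(entry.getKey(), entry.getValue() + 1);
        }
        return toCommit;
    }

    public static void main(String[] args) {
        // Hypothetical partitions "orders-0" and "orders-1".
        Map<String, Long> lastProcessed = Map.of("orders-0", 41L, "orders-1", 7L);
        System.out.println(nextOffsetsToCommit(lastProcessed));
    }
}
```

Committing the last processed offset itself, rather than that offset plus one, would cause the final record to be delivered again after a restart or rebalance.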
| 62 | + |
| 63 | +### Usage Examples (20%) |
| 64 | + |
| 65 | +Provide concrete code examples demonstrating: |
| 66 | +- **Basic subscription and polling loop**: subscribe to topics, poll for records, process messages |
| 67 | +- **Manual offset commit**: disable auto-commit, explicit commitSync/commitAsync after processing |
| 68 | +- **Rebalance listener**: implement ConsumerRebalanceListener, commit offsets on revoke, initialize positions on assign |
| 69 | +- **Seek operations**: seekToBeginning, seekToEnd, seek to specific offset, timestamp-based seeking |
| 70 | +- **Multi-threaded processing pattern**: single consumer thread with pause/resume coordination and worker pool |
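
For the timestamp-based seeking item above, the lookup rule is worth pinning down before writing examples: `offsetsForTimes` resolves each timestamp to the earliest offset whose record timestamp is greater than or equal to it. The sketch below models that rule in plain Java over an in-memory array. It is not the client's implementation; `TimestampLookup`, `offsetForTime`, and the `-1` sentinel are assumptions made for illustration (the real method returns no entry value for a partition when no such record exists).

```java
public class TimestampLookup {

    /**
     * Models a partition log as an array of record timestamps, where
     * timestamps[i] belongs to the record at offset baseOffset + i.
     *
     * Returns the earliest offset whose timestamp is >= target,
     * or -1 if every record is older than target.
     */
    public static long offsetForTime(long baseOffset, long[] timestamps, long target) {
        for (int i = 0; i < timestamps.length; i++) {
            if (timestamps[i] >= target) {
                return baseOffset + i;
            }
        }
        return -1L; // hypothetical sentinel for "no matching record"
    }

    public static void main(String[] args) {
        long[] timestamps = {1000L, 2000L, 2000L, 3000L};
        // A consumer would then seek to the returned offset.
        System.out.println(offsetForTime(100L, timestamps, 1500L));
    }
}
```

A documented example built on the real API would follow the same shape: look up the offset for a timestamp, check that a result exists, then call `seek` with it.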
| 71 | + |
| 72 | +### Documentation Structure (10%) |
| 73 | + |
| 74 | +Organize documentation with clear sections: |
| 75 | +- Overview and threading model |
| 76 | +- Core types (KafkaConsumer, ConsumerRebalanceListener, ConsumerRecords, etc.) |
| 77 | +- Subscription and assignment methods |
| 78 | +- Polling and data fetching |
| 79 | +- Offset management methods |
| 80 | +- Flow control and position queries |
| 81 | +- Metadata and monitoring |
| 82 | +- Lifecycle and resource management |
| 83 | +- Exception handling guide |
| 84 | +- Configuration-driven behaviors |
| 85 | +- Common patterns and best practices |
| 86 | + |
| 87 | +## Output Format |
| 88 | + |
| 89 | +Write your documentation to `/workspace/documentation.md` in Markdown format. |
| 90 | + |
| 91 | +## Notes |
| 92 | + |
| 93 | +- Focus on **behavioral semantics** that aren't obvious from method signatures alone |
| 94 | +- Include edge cases and gotchas (e.g., poll may block longer than timeout during rebalance callbacks) |
| 95 | +- Explain the relationship between different offset concepts (position, committed, beginning, end, LSO) |
| 96 | +- Cover both group-managed and standalone consumer patterns |
| 97 | +- Document configuration properties that significantly affect API behavior |
| 98 | +- Use the codebase to find real usage patterns in tests and internal components |
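
As a starting point for the configuration-driven behaviors mentioned in the notes above, the following sketch collects a few consumer properties that change how the API behaves. The property keys are standard consumer configuration names; the class name `ConsumerConfigSketch`, the broker address, the group id, and the numeric values are illustrative assumptions, not recommendations.

```java
import java.util.Properties;

public class ConsumerConfigSketch {

    /** Builds consumer properties for a manually committing, read_committed consumer. */
    public static Properties manualCommitProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("group.id", "example-group");           // hypothetical group id
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // Offsets are committed explicitly via commitSync/commitAsync.
        props.setProperty("enable.auto.commit", "false");
        // Where to start when the group has no committed offset.
        props.setProperty("auto.offset.reset", "earliest");
        // Only return records from committed transactions (bounded by the LSO).
        props.setProperty("isolation.level", "read_committed");
        // Maximum gap between poll() calls before the member is removed from the group.
        props.setProperty("max.poll.interval.ms", "300000");      // illustrative value
        // Heartbeat-based liveness timeout, independent of poll frequency.
        props.setProperty("session.timeout.ms", "45000");         // illustrative value
        return props;
    }

    public static void main(String[] args) {
        manualCommitProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would be passed to the `KafkaConsumer` constructor that accepts a `Properties` object.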
| 99 | + |
| 100 | +## Evaluation |
| 101 | + |
| 102 | +Your documentation will be evaluated on: |
| 103 | +1. **Completeness**: All key API methods and types documented |
| 104 | +2. **Accuracy**: Behavioral descriptions match actual implementation |
| 105 | +3. **Clarity**: Complex semantics explained clearly with examples |
| 106 | +4. **Practical value**: Real-world usage patterns and error handling strategies included |