The Conversation Extraction feature in Roo-Code enables users to export, filter, and process conversation histories from tasks. This technical guide documents the implementation, APIs, and advanced usage of the feature.
The conversation extraction system follows a multi-stage pipeline approach:
- Raw Export: Capturing the full conversation history with all metadata
- Content Filtering: Removing non-essential elements using regex patterns
- Message Cleaning: Processing user and assistant messages individually
- Formatting: Converting to the desired output format
- Optimization: Removing duplicates and normalizing content
The extraction system uses regex patterns to identify content types:
// Core patterns for content identification
const PATTERNS = {
// Environment details and structural content
environmentDetails: /\<environment_details\>.*?\<\/environment_details\>/s,
taskTag: /\<task\>(.*?)\<\/task\>/s,
feedbackTag: /\<feedback\>(.*?)\<\/feedback\>/s,
// File content and results to remove
fileContentTag: /\<file_content path=".*?"\>.*?\<\/file_content\>/s,
toolResult: /\[[^\]]+\] Result:.*?(?=\n\n|\Z)/s,
// Tool use patterns
toolUsePatterns: {
writeToFile: /\<write_to_file\>.*?\<\/write_to_file\>/s,
applyDiff: /\<apply_diff\>.*?\<\/apply_diff\>/s,
executeCommand: /\<execute_command\>.*?\<\/execute_command\>/s,
// Other tool patterns...
},
// Assistant thinking and responses
thinking: /\<thinking\>(.*?)\<\/thinking\>/s,
attemptCompletion: /\<attempt_completion\>.*?\<result\>(.*?)\<\/result\>.*?\<\/attempt_completion\>/s,
}These patterns selectively filter content based on type and importance.
The system processes user and assistant messages differently:
// Simplified extraction pipeline
export async function extractConversation(inputFile, outputFile) {
// Read raw conversation
const content = await fs.readFile(inputFile, 'utf8');
// Extract conversation turns
const turns = extractTurns(content);
// Process each turn based on speaker
const processedTurns = turns.map(turn => {
if (turn.speaker === 'User') {
return processTurn(turn, cleanUserMessage);
} else {
return processTurn(turn, cleanAssistantMessage);
}
});
// Remove duplicates
const finalTurns = removeDuplicates(processedTurns);
// Format and save
await saveConversation(finalTurns, outputFile);
return getExtractionMetrics(content, finalTurns);
}function cleanUserMessage(message) {
// Remove environment details
message = removePattern(message, PATTERNS.environmentDetails);
// Remove tool results
message = removePattern(message, PATTERNS.toolResult);
// Remove file content
message = removePattern(message, PATTERNS.fileContentTag);
// Extract task content
message = extractPattern(message, PATTERNS.taskTag);
return normalizeWhitespace(message);
}function cleanAssistantMessage(message) {
// Clean thinking sections
message = processThinkingSections(message);
// Extract completion results
message = extractCompletionResults(message);
// Remove or simplify tool uses
message = procesToolUses(message);
return normalizeWhitespace(message);
}function removeDuplicates(turns) {
// Track seen content
const seen = new Map();
return turns.map(turn => {
// Split message into paragraphs
const paragraphs = turn.message.split(/\n\n+/);
// Filter out duplicate paragraphs
const uniqueParagraphs = paragraphs.filter(para => {
// Skip short paragraphs
if (para.length < 10) return true;
// Create simplified version for comparison
const simplified = simplifyForComparison(para);
// Keep if not seen before
if (!seen.has(simplified)) {
seen.set(simplified, true);
return true;
}
return false;
});
return {
...turn,
message: uniqueParagraphs.join('\n\n')
};
});
}Roo-Code provides a programmatic API for task history exports:
export interface ExportOptions {
taskId: string;
format: 'markdown' | 'json' | 'txt';
cleanupLevel: 'minimal' | 'moderate' | 'aggressive';
includeMetadata: boolean;
outputPath?: string;
}
export async function exportTaskHistory(options: ExportOptions): Promise<ExportResult> {
const taskHistory = await getTaskHistory(options.taskId);
const processed = processTaskHistory(taskHistory, options);
if (options.outputPath) {
await fs.writeFile(options.outputPath, processed.content);
}
return processed;
}**User:**
How can I optimize my React components for performance?
---
**Assistant:**
Here are the key strategies for optimizing React components:
1. Use React.memo for functional components
2. Implement shouldComponentUpdate for class components
3. Use the useCallback hook for event handlers{
"conversation": [
{
"speaker": "User",
"message": "How can I optimize my React components for performance?",
"timestamp": "2023-06-15T14:32:45Z"
},
{
"speaker": "Assistant",
"message": "Here are the key strategies for optimizing React components:\n\n1. Use React.memo for functional components\n2. Implement shouldComponentUpdate for class components\n3. Use the useCallback hook for event handlers",
"timestamp": "2023-06-15T14:32:58Z"
}
],
"metadata": {
"taskId": "task_123456",
"mode": "code",
"startTime": "2023-06-15T14:32:40Z",
"endTime": "2023-06-15T14:35:12Z"
}
}Roo-Code provides a CLI interface for conversation extraction:
roo export-task <task-id> [options]
Options:
--format <format> Output format (markdown, json, txt) [default: markdown]
--cleanup <level> Cleanup level (minimal, moderate, aggressive) [default: moderate]
--output <path> Output file path
--metadata Include task metadata [default: false]The following settings control conversation extraction behavior:
| Setting | Type | Default | Description |
|---|---|---|---|
extractionCleanupLevel |
String | "moderate" | Default cleanup level for extractions |
preserveThinking |
Boolean | true | Whether to preserve thinking tags |
preserveCodeBlocks |
Boolean | true | Whether to preserve code blocks |
removeToolUses |
Boolean | true | Whether to remove tool use blocks |
removeEnvironmentDetails |
Boolean | true | Whether to remove environment details |
deduplicateContent |
Boolean | true | Whether to remove duplicate content |
Conversation extraction integrates with Roo-Code's task management system:
// Export from task management UI
export async function exportCurrentTask(format = 'markdown'): Promise<void> {
const taskManager = getTaskManager();
const currentTask = taskManager.getCurrentTask();
if (!currentTask) {
throw new Error('No active task');
}
const options = {
taskId: currentTask.id,
format,
cleanupLevel: vscode.workspace.getConfiguration('roo-cline').get('extractionCleanupLevel'),
includeMetadata: true
};
const result = await exportTaskHistory(options);
// Show in editor
const doc = await vscode.workspace.openTextDocument({
content: result.content,
language: getLanguageForFormat(format)
});
await vscode.window.showTextDocument(doc);
}The extraction process can be resource-intensive for large conversations. Optimizations include:
- Lazy Regex Evaluation: Patterns are compiled once and reused
- Incremental Processing: Processes the conversation in chunks
- Caching: Previously cleaned messages are cached
- Streaming Output: For large exports, content is written incrementally
Conversation extraction includes several security features:
- Sensitive Content Detection: Identifies and optionally redacts API keys, tokens, etc.
- Metadata Filtering: Removes system-specific details from metadata
- Permission Checks: Verifies user has appropriate permissions to export tasks
The extraction system provides extension points for custom processing:
// Register a custom message processor
registerMessageProcessor('customType', (message, context) => {
// Custom processing logic
return processedMessage;
});
// Register a custom format handler
registerFormatHandler('custom', {
contentType: 'application/x-custom',
fileExtension: '.custom',
format: (conversation) => {
// Custom formatting logic
return formattedContent;
}
});The Conversation Extraction feature in Roo-Code provides powerful capabilities for exporting, cleaning, and processing task histories. Through its multi-stage pipeline, pattern-based filtering, and extensible architecture, it enables users to create clean, shareable conversation exports while maintaining privacy and security.