Commit 3cab365

Merge pull request #145 from jongio/backend
feat: Add Azure Storage backend integration with comprehensive sync, sharing, and analytics capabilities
2 parents dcc09dc + 1a3bbd4 commit 3cab365

65 files changed

+13932
-327
lines changed

.github/skills/README.md

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
1+
---
2+
title: GitHub Copilot Agent Skills
3+
description: Overview of agent skills for GitHub Copilot Token Tracker extension
4+
lastUpdated: 2026-01-26
5+
---
6+
17
# GitHub Copilot Agent Skills
28

39
This directory contains Agent Skills for GitHub Copilot and other compatible AI agents. Agent Skills are used to teach agents specialized tasks and provide domain-specific knowledge.

.github/skills/copilot-log-analysis/SKILL.md

Lines changed: 40 additions & 176 deletions
@@ -1,6 +1,6 @@
11
---
22
name: copilot-log-analysis
3-
description: Analyzing GitHub Copilot session log files to extract token usage, model info, and interaction data. Use when working with session files or debugging token tracking.
3+
description: Analyzing GitHub Copilot session log files to extract token usage, model information, and interaction data. Use when working with session files, understanding the extension's log analysis methods, or debugging token tracking issues.
44
---
55

66
# Copilot Log Analysis Skill
@@ -16,8 +16,8 @@ The extension analyzes two types of log files:
1616
## Session File Discovery
1717

1818
### Key Method: `getCopilotSessionFiles()`
19-
**Location**: `src/extension.ts` (lines 905-1017)
20-
**Helper Methods**: `getVSCodeUserPaths()` (lines 860-903), `scanDirectoryForSessionFiles()` (lines 1020-1045)
19+
**Location**: `src/extension.ts` (lines 975-1073)
20+
**Helper Methods**: `getVSCodeUserPaths()` (lines 934-972), `scanDirectoryForSessionFiles()` (lines 1078-1110)
2121

2222
This method discovers session files across all VS Code variants and locations:
2323

@@ -43,99 +43,36 @@ This method discovers session files across all VS Code variants and locations:
4343
- **Remote/Server**: `~/.vscode-server/data/User`, `~/.vscode-server-insiders/data/User`
4444

4545
### Helper Method: `getVSCodeUserPaths()`
46-
**Location**: `src/extension.ts` (lines 860-903)
46+
**Location**: `src/extension.ts` (lines 934-972)
4747

4848
Returns all possible VS Code user data paths for different variants and platforms.
4949

5050
### Helper Method: `scanDirectoryForSessionFiles()`
51-
**Location**: `src/extension.ts` (lines 1020-1045)
51+
**Location**: `src/extension.ts` (lines 1078-1110)
5252

5353
Recursively scans directories for `.json` and `.jsonl` session files.
5454

5555
## Field Extraction Methods
5656

57-
### 1. Token Estimation: `estimateTokensFromSession()`
58-
**Location**: `src/extension.ts` (lines 1047-1088)
57+
### Parsing and Token Accounting: `parseSessionFileContent()`
58+
**Location**: `src/sessionParser.ts` (lines 184-347)
5959

60-
**Purpose**: Estimates total tokens used in a session by analyzing message content.
60+
**Purpose**: Parses session files and returns tokens, interactions, model usage, and editor type-safe model IDs.
6161

6262
**How it works:**
63-
1. Reads session file content
64-
2. Dispatches to format-specific handler:
65-
- `.jsonl` files → `estimateTokensFromJsonlSession()` (lines 1094-1121)
66-
- `.json` files → analyzes `requests` array
67-
68-
**For JSON files:**
69-
- **Input tokens**: Extracted from `requests[].message.parts[].text`
70-
- **Output tokens**: Extracted from `requests[].response[].value`
71-
- Uses model-specific character-to-token ratios from `tokenEstimators.json`
72-
73-
**For JSONL files:**
74-
- Processes line-by-line JSON events
75-
- **Copilot CLI format** (uses `type` field):
76-
- **User messages**: `type: 'user.message'`, field: `data.content`
77-
- **Assistant messages**: `type: 'assistant.message'`, field: `data.content`
78-
- **Tool results**: `type: 'tool.result'`, field: `data.output`
79-
- **VS Code Incremental format** (uses `kind` field):
80-
- **User requests**: `kind: 1`, field: `request.message.parts[].text`
81-
- **Assistant responses**: `kind: 2`, field: `response[].value`, `model`
82-
83-
### 2. Interaction Counting: `countInteractionsInSession()`
84-
**Location**: `src/extension.ts` (lines 615-651)
85-
86-
**Purpose**: Counts the number of user interactions in a session.
63+
1. Accepts raw file content along with callbacks for token estimation and model detection.
64+
2. Supports both `.json` (Copilot Chat) and `.jsonl` (CLI/agent) formats, including delta-based JSONL streams.
65+
3. Counts interactions (user messages), input tokens, and output tokens while grouping by model.
66+
4. Uses `estimateTokensFromText()` (lines 1139-1155 in `src/extension.ts`) for character-to-token estimation.
8767

88-
**How it works:**
89-
90-
**For JSON files:**
91-
- Counts items in `requests` array
92-
- Each request = one user interaction
93-
94-
**For JSONL files:**
95-
- **Copilot CLI format**: Counts events with `type: 'user.message'`
96-
- **VS Code Incremental format**: Counts events with `kind: 1`
97-
- Processes line-by-line, skipping malformed lines
98-
- **Note**: Sessions with 0 interactions (empty `requests: []` or no `kind: 1` entries) are filtered out in diagnostics view
99-
100-
### 3. Model Usage Extraction: `getModelUsageFromSession()`
101-
**Location**: `src/extension.ts` (lines 653-729)
102-
103-
**Purpose**: Extracts per-model token usage (input vs output).
104-
105-
**How it works:**
106-
107-
**For JSON files:**
108-
- Iterates through `requests` array
109-
- Determines model using `getModelFromRequest()` helper (lines 1123-1145)
110-
- Tracks input tokens from `message.parts[].text`
111-
- Tracks output tokens from `response[].value`
112-
113-
**For JSONL files (Copilot CLI format):**
114-
- Default model: `gpt-4o` (for CLI sessions)
115-
- Reads `event.model` if specified
116-
- Categorizes by event type:
117-
- `user.message` → input tokens
118-
- `assistant.message` → output tokens
119-
- `tool.result` → input tokens (context)
120-
121-
**For JSONL files (VS Code Incremental format):**
122-
- Reads `model` field from `kind: 2` response entries
123-
- Categorizes by kind:
124-
- `kind: 1` → input tokens (from `request.message.parts[].text`)
125-
- `kind: 2` → output tokens (from `response[].value`)
126-
127-
**Model Detection Logic**: `getModelFromRequest()`
68+
### Model Detection Logic: `getModelFromRequest()`
69+
**Location**: `src/extension.ts` (lines 1102-1134)
12870
- Primary: `request.result.metadata.modelId`
129-
- Fallback: Parse `request.result.details` string for model names
130-
- Detected patterns (defined in code lines 1129-1143):
131-
- OpenAI: GPT-3.5-Turbo, GPT-4, GPT-4.1, GPT-4o, GPT-4o-mini, GPT-5, o3-mini, o4-mini
132-
- Anthropic: Claude Sonnet 3.5, Claude Sonnet 3.7, Claude Sonnet 4
133-
- Google: Gemini 2.5 Pro, Gemini 3 Pro (Preview), Gemini 3 Pro
134-
- Default fallback: gpt-4
71+
- Fallback: parses `request.result.details` for known model patterns
72+
- Detected patterns: OpenAI (GPT-3.5-Turbo, GPT-4, GPT-4.1, GPT-4o, GPT-4o-mini, GPT-5, o3-mini, o4-mini), Claude Sonnet (3.5, 3.7, 4), Gemini (2.5 Pro, 3 Pro, 3 Pro Preview); defaults to `gpt-4`
73+
- Display name mapping in `getModelDisplayName()` (lines 1778-1811) adds variants such as GPT-5 family, Claude Haiku, Claude Opus, Gemini 3 Flash, Grok, and Raptor when present in `metadata.modelId`.
13574

136-
**Note**: The display name mapping in `getModelDisplayName()` includes additional model variants (GPT-5 family, Claude Haiku, Claude Opus, Gemini 3 Flash, Grok, Raptor) that may appear if specified via `metadata.modelId` but are not pattern-matched from `result.details`.
137-
138-
### 4. Editor Type Detection: `getEditorTypeFromPath()`
75+
### Editor Type Detection: `getEditorTypeFromPath()`
13976
**Location**: `src/extension.ts` (lines 111-143)
14077

14178
**Purpose**: Determines which VS Code variant created the session file.
@@ -151,31 +88,10 @@ Recursively scans directories for `.json` and `.jsonl` session files.
15188
- Contains `/code/` → `'VS Code'`
15289
- Default → `'Unknown'`
15390

154-
### 5. Session Title Extraction
155-
**Location**: `src/extension.ts` in `getSessionFileDetails()` method
156-
157-
**Purpose**: Extracts the session title for display in diagnostics.
158-
159-
**How it works:**
160-
161-
**For JSON files:**
162-
1. Primary: `customTitle` field from root of session object
163-
2. Fallback: `generatedTitle` from response items (e.g., thinking blocks, tool invocations)
164-
- Iterates through `requests[].response[]` looking for `generatedTitle`
165-
166-
**For JSONL files (Incremental format):**
167-
1. Primary: `customTitle` from the `kind: 0` header entry
168-
2. Fallback: `generatedTitle` from `kind: 2` response entries
169-
170-
**For JSONL files (CLI format):**
171-
- Not available (CLI sessions don't have titles)
172-
173-
**Note**: `customTitle` is user-defined (when they rename the session). `generatedTitle` is AI-generated summary text found in thinking blocks or tool results.
174-
17591
## Token Estimation Algorithm
17692

17793
### Character-to-Token Conversion: `estimateTokensFromText()`
178-
**Location**: `src/extension.ts` (lines 1147-1160)
94+
**Location**: `src/extension.ts` (lines 1139-1155)
17995

18096
**Approach**: Uses model-specific character-to-token ratios
18197
- Default ratio: 0.25 (4 characters per token)
@@ -191,31 +107,14 @@ Recursively scans directories for `.json` and `.jsonl` session files.
191107
### Cache Structure: `SessionFileCache`
192108
**Location**: `src/extension.ts` (lines 72-77)
193109

194-
Stores pre-calculated data to avoid re-processing unchanged files:
195-
```typescript
196-
{
197-
tokens: number,
198-
interactions: number,
199-
modelUsage: ModelUsage,
200-
mtime: number // file modification timestamp
201-
}
202-
```
110+
Stores pre-calculated tokens, interactions, model usage, and file mtime to avoid re-processing unchanged files.
203111

204112
### Cache Methods:
205-
- **`isCacheValid()`** (lines 165-168): Checks if cache is valid for file
206-
- **`getCachedSessionData()`** (lines 170-172): Retrieves cached data
207-
- **`setCachedSessionData()`** (lines 174-186): Stores data with size limit (1000 files max)
208-
- **`clearExpiredCache()`** (lines 188-201): Removes cache for deleted files
209-
210-
### Cached Wrapper Methods:
211-
- `estimateTokensFromSessionCached()` (lines 755-758)
212-
- `countInteractionsInSessionCached()` (lines 760-763)
213-
- `getModelUsageFromSessionCached()` (lines 765-768)
214-
215-
All use `getSessionFileDataCached()` (lines 732-753) which:
216-
1. Checks cache validity using file mtime
217-
2. Returns cached data if valid
218-
3. Otherwise reads file and caches result
113+
- `isCacheValid()` (lines 227-230): Validates cached entry by mtime
114+
- `getCachedSessionData()` (lines 232-234): Retrieves cached data
115+
- `setCachedSessionData()` (lines 236-254): Stores data with FIFO eviction after 1000 files
116+
- `clearExpiredCache()` (lines 250-264): Drops cache entries for missing files
117+
- `getSessionFileDataCached()` (lines 811-845): Reads session content, parses via `parseSessionFileContent()`, and caches results
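
The cache behaviour listed above (validation by mtime, FIFO eviction after 1000 files, dropping entries for missing files) can be sketched roughly as follows. This is a minimal illustration, not the extension's code: the `SessionCacheSketch` class, the `modelUsage` shape, and the `exists` predicate are assumptions introduced here so the example runs without touching the file system.

```typescript
// Hypothetical entry shape, based only on the fields the section names.
interface CachedSessionData {
  tokens: number;
  interactions: number;
  modelUsage: Record<string, { inputTokens: number; outputTokens: number }>;
  mtime: number; // file modification timestamp in milliseconds
}

class SessionCacheSketch {
  // A Map iterates in insertion order, so its first key is the oldest entry.
  private entries = new Map<string, CachedSessionData>();
  private maxEntries: number;

  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
  }

  // An entry is valid only if the file's mtime matches the cached one.
  isCacheValid(filePath: string, mtime: number): boolean {
    return this.entries.get(filePath)?.mtime === mtime;
  }

  getCachedSessionData(filePath: string): CachedSessionData | undefined {
    return this.entries.get(filePath);
  }

  // FIFO eviction: when adding a new key at capacity, drop the oldest key.
  setCachedSessionData(filePath: string, data: CachedSessionData): void {
    if (!this.entries.has(filePath) && this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) {
        this.entries.delete(oldest);
      }
    }
    this.entries.set(filePath, data);
  }

  // Remove entries whose files no longer exist (existence check is injected).
  clearExpiredCache(exists: (filePath: string) => boolean): void {
    for (const filePath of this.entries.keys()) {
      if (!exists(filePath)) {
        this.entries.delete(filePath);
      }
    }
  }
}
```

In real use the `exists` predicate would likely wrap a file-system check such as `fs.existsSync`; injecting it keeps the sketch testable in isolation.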
219118

220119
## Schema Documentation
221120

@@ -229,10 +128,10 @@ All use `getSessionFileDataCached()` (lines 732-753) which:
229128
4. **`SCHEMA-ANALYSIS.md`**: Quick reference guide
230129
5. **`VSCODE-VARIANTS.md`**: VS Code variant detection documentation
231130

232-
**Note**: The analysis JSON file is auto-generated and may not exist in fresh clones. It's created by running the schema analysis script documented in the README.
131+
**Note**: The analysis JSON file is auto-generated and may not exist in fresh clones. It is created by running the schema analysis script documented below.
233132

234133
### Schema Analysis
235-
See the **Executable Scripts** section above for three available scripts:
134+
See the **Executable Scripts** section for available utilities:
236135
1. `get-session-files.js` - Quick session file discovery
237136
2. `diagnose-session-files.js` - Detailed diagnostics
238137
3. `analyze-session-schema.ps1` - PowerShell schema analysis
@@ -288,45 +187,6 @@ See the **Executable Scripts** section above for three available scripts:
288187
- Tool output: `data.output` (when `type: 'tool.result'`)
289188
- Model: `model` (optional, defaults to `gpt-4o`)
290189

291-
## JSONL File Structure (VS Code Incremental)
292-
293-
**Introduced in**: VS Code Insiders ~0.25+ (April 2025)
294-
295-
This is a newer incremental format used by VS Code Insiders that logs session data progressively. Unlike the CLI format that uses `type`, this format uses `kind` to identify log entry types.
296-
297-
**Entry kinds:**
298-
299-
```jsonl
300-
{"kind": 0, "sessionId": "...", "customTitle": "Session Title", "mode": "agent", "version": 1}
301-
{"kind": 1, "requestId": "...", "request": {"message": {"parts": [{"text": "user prompt"}]}}}
302-
{"kind": 2, "requestId": "...", "response": [{"value": "assistant reply"}], "model": "claude-3.5-sonnet"}
303-
```
304-
305-
**Kind values:**
306-
- `kind: 0` - Session header (contains `sessionId`, `customTitle`, `mode`, `version`)
307-
- `kind: 1` - User request (contains `requestId`, `request.message.parts[].text`)
308-
- `kind: 2` - Assistant response (contains `requestId`, `response[].value`, `model`)
309-
310-
**Key fields:**
311-
- Session title: `customTitle` (when `kind: 0`)
312-
- User input: `request.message.parts[].text` (when `kind: 1`)
313-
- Assistant output: `response[].value` (when `kind: 2`)
314-
- Model: `model` (when `kind: 2`, e.g., `claude-3.5-sonnet`)
315-
316-
**Format detection:**
317-
```javascript
318-
// Read first line of JSONL file
319-
const firstLine = JSON.parse(lines[0]);
320-
if ('kind' in firstLine) {
321-
// VS Code Incremental format
322-
} else if ('type' in firstLine) {
323-
// Copilot CLI format
324-
}
325-
```
326-
327-
**Official source reference**:
328-
- `vscode-copilot-chat/src/vs/workbench/contrib/chat/common/chatSessionsProvider.d.ts`
329-
330190
## Pricing and Cost Calculation
331191

332192
### Pricing Data
@@ -471,7 +331,6 @@ pwsh .github/skills/copilot-log-analysis/analyze-session-schema.ps1 -OutputPath
471331
- Documents field types, occurrences, and variations
472332

473333
**Note**: This script generates the `session-file-schema-analysis.json` file referenced in the Schema Documentation section below.
474-
475334
## Usage Examples
476335

477336
### Example 1: Finding all session files
@@ -485,30 +344,35 @@ console.log(`Found ${sessionFiles.length} session files`);
485344
const filePath = '/path/to/session.json';
486345
const stats = fs.statSync(filePath);
487346
const mtime = stats.mtime.getTime();
347+
const content = await fs.promises.readFile(filePath, 'utf8');
348+
349+
const estimate = (text: string, model = 'gpt-4o') => Math.ceil(text.length * 0.25);
350+
const detectModel = (req: any) => req?.result?.metadata?.modelId ?? 'gpt-4o';
488351

489-
// Get all data (cached if unchanged)
490-
const tokens = await estimateTokensFromSessionCached(filePath, mtime);
491-
const interactions = await countInteractionsInSessionCached(filePath, mtime);
492-
const modelUsage = await getModelUsageFromSessionCached(filePath, mtime);
352+
const parsed = parseSessionFileContent(filePath, content, estimate, detectModel);
493353
const editorType = getEditorTypeFromPath(filePath);
494354

495-
console.log(`Tokens: ${tokens}`);
496-
console.log(`Interactions: ${interactions}`);
355+
console.log(`Tokens: ${parsed.tokens}`);
356+
console.log(`Interactions: ${parsed.interactions}`);
497357
console.log(`Editor: ${editorType}`);
498-
console.log(`Models:`, modelUsage);
358+
console.log(`Models:`, parsed.modelUsage);
499359
```
500360

501361
### Example 3: Processing daily statistics
502362
```typescript
503363
const now = new Date();
504364
const todayStart = new Date(now.getFullYear(), now.getMonth(), now.getDate());
505365
const sessionFiles = await getCopilotSessionFiles();
366+
const estimate = (text: string, model = 'gpt-4o') => Math.ceil(text.length * 0.25);
367+
const detectModel = (req: any) => req?.result?.metadata?.modelId ?? 'gpt-4o';
506368

507369
let todayTokens = 0;
508370
for (const file of sessionFiles) {
509371
const stats = fs.statSync(file);
510372
if (stats.mtime >= todayStart) {
511-
todayTokens += await estimateTokensFromSessionCached(file, stats.mtime.getTime());
373+
const content = await fs.promises.readFile(file, 'utf8');
374+
const parsed = parseSessionFileContent(file, content, estimate, detectModel);
375+
todayTokens += parsed.tokens;
512376
}
513377
}
514378
```
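
The `estimate` callback in the examples above applies a flat 0.25 ratio to every model. A model-aware variant, closer to the "model-specific character-to-token ratios" the skill describes, might look like the sketch below. The function name `estimateTokensSketch` and the ratio table are hypothetical; the extension's actual ratios are not shown in this diff and are not reproduced here.

```typescript
// Hypothetical tokens-per-character ratios, for illustration only.
const TOKENS_PER_CHAR_SKETCH: Record<string, number> = {
  'gpt-4o': 0.25,
  'gpt-4': 0.25,
};

// Default ratio stated in the skill: 0.25, i.e. about 4 characters per token.
const DEFAULT_TOKENS_PER_CHAR = 0.25;

function estimateTokensSketch(text: string, model = 'gpt-4o'): number {
  // Fall back to the default ratio for models missing from the table.
  const ratio = TOKENS_PER_CHAR_SKETCH[model] ?? DEFAULT_TOKENS_PER_CHAR;
  // Round up so any non-empty text counts as at least one token.
  return Math.ceil(text.length * ratio);
}
```

A function of this shape could be passed wherever the examples pass `estimate`, since it takes the same `(text, model)` arguments.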

.github/skills/refresh-json-data/README.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,9 @@
1+
---
2+
title: Refresh JSON Data Skill
3+
description: Instructions for refreshing token estimator and model pricing data
4+
lastUpdated: 2026-01-26
5+
---
6+
17
# Refresh JSON Data Skill
28

39
This is a GitHub Copilot Agent Skill that provides instructions for refreshing the token estimator and model pricing data in the Copilot Token Tracker extension.

.vscode/settings.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@
2626
"/^node esbuild\\.js 2>&1$/": {
2727
"approve": true,
2828
"matchCommandLine": true
29-
}
29+
},
30+
"git fetch": true
3031
}
3132
}
