Agentmail#197
2 issues found across 8 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="pkg/clients/agentmail.go">
<violation number="1" location="pkg/clients/agentmail.go:151">
P2: URL-encode pageToken before appending it to the query string; raw concatenation can break pagination when the token contains reserved characters.</violation>
</file>
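A minimal sketch of the encoding fix for the `pageToken` issue, using `net/url` from the Go standard library. The helper name `buildListURL`, the base URL, and the `limit` and `page_token` parameter names are assumptions for illustration; the real client in `pkg/clients/agentmail.go` may name them differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// buildListURL is a hypothetical helper. It builds the query string with
// url.Values so reserved characters in pageToken are percent-encoded
// instead of being concatenated raw.
func buildListURL(base string, limit int, pageToken string) string {
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	if pageToken != "" {
		// "page_token" is an assumed parameter name.
		q.Set("page_token", pageToken)
	}
	return base + "?" + q.Encode()
}

func main() {
	// A token containing '+', '/' and '=' would corrupt a raw query string.
	fmt.Println(buildListURL("https://api.example.test/v0/inboxes", 100, "a+b/c=="))
	// prints https://api.example.test/v0/inboxes?limit=100&page_token=a%2Bb%2Fc%3D%3D
}
```

`url.Values.Encode` sorts keys and escapes each value, so the token round-trips unchanged on the server side.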
<file name="pkg/sources/providers/agentmail.go">
<violation number="1" location="pkg/sources/providers/agentmail.go:122">
P2: Do not return sentinel size `0` for generated files in `ReadDir`; advertise real content size to prevent size-based readers from truncating or skipping content.
(Based on your team's feedback about avoiding fixed sentinel file sizes in directory metadata.) [FEEDBACK_USED]</violation>
</file>
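One way to advertise a real size, sketched under assumptions: render the generated content while building the listing and report its byte length. `DirEntry`, `message`, `renderMessage`, and `listEntries` are hypothetical stand-ins, not the provider's actual types, and a real implementation would likely cache the rendered bytes rather than render twice.

```go
package main

import "fmt"

// DirEntry is a hypothetical stand-in for the provider's directory entry type.
type DirEntry struct {
	Name string
	Size int64
}

// message is a hypothetical, trimmed-down message record.
type message struct {
	ID, Subject, Body string
}

// renderMessage produces the bytes a reader would receive for the file.
func renderMessage(m message) []byte {
	return []byte("Subject: " + m.Subject + "\n\n" + m.Body + "\n")
}

// listEntries reports the length of the rendered content, not a sentinel 0,
// so size-based readers request the right number of bytes.
func listEntries(msgs []message) []DirEntry {
	entries := make([]DirEntry, 0, len(msgs))
	for _, m := range msgs {
		content := renderMessage(m)
		entries = append(entries, DirEntry{Name: m.ID + ".txt", Size: int64(len(content))})
	}
	return entries
}

func main() {
	for _, e := range listEntries([]message{{ID: "msg_1", Subject: "Hi", Body: "hello"}}) {
		fmt.Printf("%s %d\n", e.Name, e.Size) // prints msg_1.txt 19
	}
}
```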
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.
4 issues found across 17 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="pkg/gateway/services/view_filters.go">
<violation number="1" location="pkg/gateway/services/view_filters.go:410">
P1: Combining `from` and `subject` into one space-joined AgentMail query breaks multi-field filtering because the provider performs literal substring matching on a single query string.</violation>
</file>
<file name="pkg/sources/queries/baml_client/baml_source_map.go">
<violation number="1" location="pkg/sources/queries/baml_client/baml_source_map.go:20">
P2: The embedded generated BAML map is out of sync with `baml_src`: it downgrades `generators.baml` to `0.218.1` while source is `0.220.0`. Keep generated artifacts consistent with source.</violation>
</file>
<file name="pkg/sources/providers/agentmail.go">
<violation number="1" location="pkg/sources/providers/agentmail.go:192">
P2: ExecuteQuery only reads the first page of inboxes, so results silently omit inboxes after the first 100.</violation>
<violation number="2" location="pkg/sources/providers/agentmail.go:206">
P1: Do not silently ignore inbox fetch errors in ExecuteQuery; returning partial/empty success can mask provider failures and miss events.</violation>
</file>
if f.Subject != "" {
	parts = append(parts, f.Subject)
}
spec := newSpec("agentmail", strings.Join(parts, " "), limit)
P1: Combining from and subject into one space-joined AgentMail query breaks multi-field filtering because the provider performs literal substring matching on a single query string.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At pkg/gateway/services/view_filters.go, line 410:
<comment>Combining `from` and `subject` into one space-joined AgentMail query breaks multi-field filtering because the provider performs literal substring matching on a single query string.</comment>
<file context>
@@ -387,6 +395,25 @@ func buildConfluenceFilter(raw json.RawMessage, limit int) (string, error) {
+ if f.Subject != "" {
+ parts = append(parts, f.Subject)
+ }
+ spec := newSpec("agentmail", strings.Join(parts, " "), limit)
+ if f.Inbox != "" {
+ spec["inbox_filter"] = f.Inbox
</file context>
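To illustrate why the space-joined query fails and what per-field matching could look like, here is a small sketch. The `mailFilter` and `mail` types and the `matches` helper are hypothetical, and case-insensitive matching is an assumption; the comment above only states that the provider does literal substring matching on one query string.

```go
package main

import (
	"fmt"
	"strings"
)

// mailFilter and mail are hypothetical, trimmed-down types.
type mailFilter struct{ From, Subject string }
type mail struct{ From, Subject string }

// containsFold does a case-insensitive substring check (an assumption).
func containsFold(s, sub string) bool {
	return strings.Contains(strings.ToLower(s), strings.ToLower(sub))
}

// matches applies each non-empty filter field to its own message field,
// instead of joining the fields into one literal query string.
func matches(m mail, f mailFilter) bool {
	if f.From != "" && !containsFold(m.From, f.From) {
		return false
	}
	if f.Subject != "" && !containsFold(m.Subject, f.Subject) {
		return false
	}
	return true
}

func main() {
	m := mail{From: "Alice <alice@example.com>", Subject: "Invoice for March"}
	f := mailFilter{From: "alice@example.com", Subject: "invoice"}

	// Joined query: the literal "alice@example.com invoice" never appears.
	joined := f.From + " " + f.Subject
	fmt.Println(containsFold(m.From+" "+m.Subject, joined)) // prints false

	// Per-field matching finds the message.
	fmt.Println(matches(m, f)) // prints true
}
```

Passing `from` and `subject` as separate keys on the spec, as is already done for `inbox_filter` in the diff above, would let the provider apply this kind of per-field check.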
for _, id := range inboxIDs {
	msgs, err := a.getCachedMessages(ctx, id)
	if err != nil {
		continue
P1: Do not silently ignore inbox fetch errors in ExecuteQuery; returning partial/empty success can mask provider failures and miss events.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At pkg/sources/providers/agentmail.go, line 206:
<comment>Do not silently ignore inbox fetch errors in ExecuteQuery; returning partial/empty success can mask provider failures and miss events.</comment>
<file context>
@@ -165,6 +167,165 @@ func (a *AgentMailProvider) Search(_ context.Context, _ *sources.ProviderContext
+ for _, id := range inboxIDs {
+ msgs, err := a.getCachedMessages(ctx, id)
+ if err != nil {
+ continue
+ }
+ all = append(all, msgs...)
</file context>
Suggested change:
- continue
+ return nil, err
| "cron.baml": "class CronResult {\n cron_expr string @description(\"Standard 5-field cron expression: minute hour day_of_month month day_of_week\")\n}\n\nfunction ParseCronSchedule(description: string, timezone: string) -> CronResult {\n client SmallModelCerebras\n prompt #\"\n Convert the following human-readable schedule description into a standard\n 5-field cron expression (minute hour day_of_month month day_of_week).\n\n The user is in the {{ timezone }} timezone. All times in the description\n refer to {{ timezone }} local time. Output the cron expression in that\n local timezone (do NOT convert to UTC).\n\n Description: {{ description }}\n\n {{ ctx.output_format }}\n \"#\n}\n", | ||
| "generators.baml": "generator target {\n output_type \"go\"\n output_dir \"../\"\n version \"0.220.0\"\n\n // 'baml-cli generate' will run this after generating go code\n // This command will be run from within $output_dir/baml_client\n on_generate \"bash \\\"$(git rev-parse --show-toplevel)/bin/patch-baml-runtime.sh\\\" && gofmt -w . && goimports -w .\"\n\n // Go packages name as specified in go.mod\n // We need this to generate correct imports in the generated baml_client\n client_package_name \"github.com/beam-cloud/airstore/pkg/sources/queries\"\n}\n", | ||
| "smart_queries.baml": "// =============================================================================\n// Smart Query BAML Functions\n// Infer source-specific queries from folder/file names\n// =============================================================================\n\nclass GmailQueryResult {\n gmail_query string @description(\"Gmail search query using Gmail's operators\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {date}, {from}, {subject}. Example: {date}_{from}_{subject}_{id}.txt\")\n}\n\nclass GDriveQueryResult {\n gdrive_query string @description(\"Google Drive search query\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {name}, {date}. Example: {name}_{id}\")\n}\n\nclass NotionQueryResult {\n notion_query string @description(\"Notion search query text\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {title}, {date}. Example: {title}_{id}.md\")\n}\n\nclass GitHubQueryResult {\n github_query string @description(\"GitHub query - format depends on search_type\")\n search_type string @description(\"Type: repos, issues, prs, commits, releases, workflows, files, branches\")\n content_type string @description(\"Format: markdown (default), diff (PRs/commits), json (metadata), raw (files)\")\n limit int @description(\"Max results (default: 50)\")\n filename_format string @description(\"Filename template with placeholders like {number}, {title}, {sha}, {name}, {path}\")\n}\n\nclass SlackQueryResult {\n slack_query string @description(\"Slack search query using Slack's search syntax\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. 
Use {id}, {channel}, {user}, {date}, {text}. Example: {date}_{channel}_{user}_{id}.txt\")\n}\n\nclass LinearQueryResult {\n linear_query string @description(\"Linear issue filter query using Linear's filter syntax\")\n search_type string @description(\"Type of search: 'issues' or 'projects'. Default: 'issues'\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {identifier}, {title}, {state}, {assignee}, {team}, {priority}, {date}. Example: {identifier}_{title}.md\")\n}\n\nclass PostHogQueryResult {\n posthog_query string @description(\"Search query string for PostHog API\")\n search_type string @description(\"Type of search: 'events', 'feature-flags', 'insights', 'cohorts'\")\n project_id int @description(\"PostHog project ID to search in. Use 0 to auto-select the first available project.\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Placeholders vary by search_type.\")\n}\n\nclass WebQueryResult {\n web_mode string @description(\"'map' to crawl a specific site, 'search' to find pages via web search. Default: map\")\n web_query string @description(\"For web_mode=map: the URL to crawl. For web_mode=search: the search query string.\")\n include_paths string[] @description(\"URL path patterns to include when web_mode=map (e.g. /cocktails/*). Empty array = all paths. Ignored for search.\")\n limit int @description(\"Max pages to discover (default: 100)\")\n filename_format string @description(\"Template for result filenames. Use {title}, {path}, {id}, {date}. Example: {title}_{id}.md\")\n}\n\n// -----------------------------------------------------------------------------\n// PostHog Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferPostHogQuery(name: string, guidance: string?) 
-> PostHogQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a PostHog query.\n\n SEARCH TYPES:\n 1. feature-flags — Feature flags (toggles). Use when name contains: flag, toggle, feature, rollout, experiment\n 2. insights — Saved queries, funnels, retention, trends. Use when name contains: insight, funnel, retention, trend, dashboard, chart, analytics\n 3. cohorts — User segments/groups. Use when name contains: cohort, segment, group, users, audience\n 4. events — Raw event data (default). Use when name contains: event, pageview, click, action, or nothing specific\n\n POSTHOG STANDARD EVENTS (start with $):\n $pageview, $pageleave, $autocapture, $screen, $exception, $rageclick, $web_vitals\n\n QUERY FORMAT:\n - For events: the event name to filter on (e.g., \"$pageview\", \"signup\", \"purchase\"). Empty string = all recent events.\n - For feature-flags/insights/cohorts: a search term to filter by name/key (e.g., \"beta\", \"onboarding\", \"retention\").\n\n ACTIVE FLAGS:\n If the name implies \"active\" or \"enabled\" flags (e.g., \"active-feature-flags\", \"enabled-flags\"), set search_type to \"feature-flags\" and use a broad query (empty or relevant keyword). 
The provider will filter for active=true.\n\n DEFAULTS:\n - project_id: 0 (auto-select first project)\n - limit: 50\n\n FILENAME FORMATS by search_type:\n - events: {date}_{event}_{id}.json\n - feature-flags: {key}_{id}.json\n - insights: {name}_{id}.json\n - cohorts: {name}_{id}.json\n\n Query examples:\n - \"active-feature-flags\" → search_type: feature-flags, query: \"\" (all flags, provider filters active)\n - \"pageview-events\" → search_type: events, query: \"$pageview\"\n - \"error-events\" → search_type: events, query: \"$exception\"\n - \"retention-insights\" → search_type: insights, query: \"retention\"\n - \"beta-users\" → search_type: cohorts, query: \"beta\"\n - \"signup-funnel\" → search_type: insights, query: \"signup funnel\"\n - \"recent-events\" → search_type: events, query: \"\"\n - \"experiment-flags\" → search_type: feature-flags, query: \"experiment\"\n - \"rage-clicks\" → search_type: events, query: \"$rageclick\"\n - \"web-vitals\" → search_type: events, query: \"$web_vitals\"\n - \"onboarding-cohort\" → search_type: cohorts, query: \"onboarding\"\n\n Folder/file name: {{ name }}\n {% if guidance %}\n\n Additional guidance from user: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Gmail Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGmailQuery(name: string, guidance: string?) 
-> GmailQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Gmail search query.\n \n IMPORTANT: Gmail search is FUZZY by default!\n - Bare words (no operator) search across subject, body, AND sender name/email\n - Use bare words for most searches - they match everywhere!\n - Gmail automatically handles plurals, common misspellings, and related terms\n \n CRITICAL: PREFER BARE WORDS OVER from: FOR FUZZY SENDER MATCHING!\n - from:company only matches if that word appears exactly in sender name/email\n - Bare words match sender, subject, AND body - much more flexible!\n - Many companies use subdomains like noreply@mail.company.com - bare words catch these better\n - ONLY use from: when you need STRICT sender matching (rare)\n \n CRITICAL SYNTAX RULES:\n - Multi-word operator values MUST be quoted: from:\"summit health\" NOT from:summit health\n - Bare multi-word phrases should also be quoted: \"lab results\" NOT lab results\n - Unquoted spaces mean AND: lab results = lab AND results (both must appear)\n - Use OR to match alternatives: mychart OR \"nyu langone\" OR \"summit health\"\n \n Gmail search operators (use sparingly - bare words are usually better!):\n - from:\"name or email\" - STRICT sender match (avoid for fuzzy matching!)\n - subject:\"phrase\" - matches subject line only\n - has:attachment, filename:pdf\n - is:unread, is:starred, is:important\n - newer_than:1d, newer_than:7d, older_than:30d\n \n QUERY STRATEGY:\n 1. For company/brand emails → USE BARE WORDS: nike OR adidas OR \"north face\"\n (This finds emails FROM them, ABOUT them, or MENTIONING them - much better coverage!)\n 2. For topics → bare words or quoted phrases: \"lab results\" OR prescription OR appointment\n 3. For people → from:firstname works but bare word also good: john OR from:john\n 4. For mixed (topic + senders) → all bare words: mychart OR nyu OR \"summit health\" OR prescription\n 5. 
For health/medical → medical OR health OR doctor OR appointment OR mychart OR prescription\n 6. When in doubt → ADD MORE ORs with related terms, synonyms, and variations!\n \n DEFAULTS:\n - limit: 50\n - filename_format: {date}_{from}_{subject}_{id}.txt\n \n Query examples:\n - \"invoices\" → invoice OR receipt OR bill OR statement\n - \"stripe-invoices\" → stripe invoice OR stripe receipt OR stripe payment\n - \"receipts\" → receipt OR invoice OR order OR confirmation OR purchase\n - \"meeting-notes\" → meeting OR agenda OR \"calendar invite\" OR standup OR sync\n - \"flight-confirmation\" → flight OR itinerary OR \"boarding pass\" OR airline OR booking\n - \"health-emails\" → medical OR health OR doctor OR appointment OR mychart OR prescription OR lab\n - \"from-john\" → from:john (explicit \"from\" in name = use the operator)\n - \"emails-from-acme-corp\" → from:\"acme corp\" (explicit \"from\" + multi-word = quoted operator)\n - \"unread\" → is:unread\n - \"project-alpha\" → \"project alpha\" (quoted phrase)\n - \"shipping\" → shipping OR tracking OR delivery OR fedex OR ups OR usps OR package\n - \"recent\" → newer_than:7d\n - \"amazon-orders\" → amazon order OR amazon shipment OR amazon delivery OR amazon purchase\n - \"uber-receipts\" → uber receipt OR uber ride OR lyft receipt OR lyft ride\n \n Guidance examples:\n - \"from nike\" → nike (bare word for fuzzy matching - catches nike.com, news@nike.com, etc.)\n - \"anything from summit health, nyu, mychart, health related\" \n → mychart OR nyu OR \"summit health\" OR medical OR appointment OR prescription OR lab\n (Use bare words with lots of ORs - they match sender names AND content!)\n - \"strictly only emails where the sender is john@company.com\" → from:john@company.com\n (Use from: ONLY when user explicitly wants strict sender matching)\n \n Filename format: {date}_{from}_{subject}_{id}.txt\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif 
%}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Gmail Query Evaluation and Refinement\n// -----------------------------------------------------------------------------\n\nclass GmailQueryEvaluation {\n is_satisfactory bool @description(\"True if results match user intent well enough\")\n refined_query string? @description(\"Improved Gmail query if not satisfactory, null if satisfactory\")\n reasoning string @description(\"Brief explanation of evaluation (1-2 sentences)\")\n}\n\nfunction EvaluateGmailQueryResults(\n original_guidance: string,\n generated_query: string,\n result_count: int,\n sample_results: string\n) -> GmailQueryEvaluation {\n client AnthropicClient\n prompt #\"\n You are a STRICT evaluator checking if Gmail search results actually match what the user asked for.\n Be critical - if results don't clearly match the user's intent, suggest a better query.\n \n USER'S REQUEST: {{ original_guidance }}\n \n CURRENT QUERY: {{ generated_query }}\n \n RESULTS: {{ result_count }} emails found\n \n SAMPLE RESULTS: {{ sample_results }}\n \n STRICT EVALUATION CHECKLIST:\n \n 1. ZERO RESULTS = AUTOMATIC FAIL\n - If result_count is 0, the query is broken. Mark as NOT satisfactory.\n - Common causes:\n * Unquoted multi-word values: from:summit health → from:\"summit health\"\n * Using from: when bare word would match better (from:bonobos might miss noreply@e.bonobos.com)\n \n 2. 
CHECK EACH THING THE USER MENTIONED:\n - If user mentioned specific senders/companies (e.g., \"bonobos\", \"amazon\", \"stripe\"):\n → Are emails FROM or ABOUT these in the results?\n → Note: bare words like \"bonobos\" are GOOD - they match sender, subject, AND body\n → If from: operator fails, suggest using bare word instead for fuzzy matching\n - If user mentioned topics (e.g., \"health\", \"medical\", \"appointments\"):\n → Do the subjects/snippets actually contain these words?\n → Generic results that don't mention the topics = NOT satisfactory.\n \n 3. LOOK FOR OBVIOUS MISMATCHES:\n - Results about completely unrelated topics = NOT satisfactory\n - Missing the main thing the user asked for = NOT satisfactory\n \n 4. QUERY IMPROVEMENTS TO SUGGEST:\n - from:bonobos (0 results) → bonobos (bare word matches subdomains like e.bonobos.com)\n - from:summit health → from:\"summit health\" (MUST quote multi-word)\n - Using from: when company uses weird email domains → use bare word instead\n - Not enough OR alternatives for fuzzy matching\n \n BE HARSH: If you're unsure whether results match, lean toward NOT satisfactory and suggest improvements.\n Only mark satisfactory if the results CLEARLY match what the user asked for.\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Google Drive Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGDriveQuery(name: string, guidance: string?) 
-> GDriveQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Google Drive files.list query (the `q` parameter).\n \n Output MUST be ONLY the query expression (no `q=`, no URL encoding).\n It MUST be valid Drive query syntax.\n \n Drive query syntax:\n - name contains 'text'\n - fullText contains 'text'\n - mimeType = 'application/pdf'\n - mimeType = 'application/vnd.google-apps.document'\n - mimeType = 'application/vnd.google-apps.spreadsheet'\n - mimeType = 'application/vnd.google-apps.presentation'\n - mimeType contains 'image'\n - sharedWithMe = true\n - 'me' in owners\n - 'email@example.com' in owners\n - 'email@example.com' in writers\n - 'email@example.com' in readers\n - starred = true\n - trashed = false\n - modifiedTime > '2024-01-01T00:00:00'\n - createdTime > '2024-01-01T00:00:00'\n - 'FOLDER_ID' in parents\n \n Combine with: and, or, not\n \n IMPORTANT:\n - Always include `trashed = false` unless the user explicitly asks for trash / deleted files.\n - Prefer `fullText contains` for topic/keyword folders (it searches name + indexed content).\n Use `name contains` when the intent is specifically filename-based.\n - Use RFC3339 timestamps for createdTime/modifiedTime. Prefer including a timezone, e.g. 
'2026-01-22T00:00:00Z'.\n \n DEFAULTS:\n - limit: 50\n - filename_format: {name}_{id}\n \n Query examples:\n - \"recent\" → trashed = false and modifiedTime > '2026-01-22T00:00:00Z'\n - \"shared\" → trashed = false and sharedWithMe = true\n - \"my-docs\" → trashed = false and 'me' in owners and mimeType = 'application/vnd.google-apps.document'\n - \"pdfs\" → trashed = false and mimeType = 'application/pdf'\n - \"documents\" → trashed = false and mimeType = 'application/vnd.google-apps.document'\n - \"spreadsheets\" → trashed = false and mimeType = 'application/vnd.google-apps.spreadsheet'\n - \"presentations\" → trashed = false and mimeType = 'application/vnd.google-apps.presentation'\n - \"images\" → trashed = false and mimeType contains 'image/'\n - \"starred\" → trashed = false and starred = true\n - \"college-docs\" → trashed = false and (fullText contains 'college' or name contains 'college') and mimeType = 'application/vnd.google-apps.document'\n - Guidance: \"created in the past week\" → use createdTime > '...'\n - Guidance: \"modified in the past week\" → use modifiedTime > '...'\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {name}, {date}, {created}, {mime_type}, {ext}\n - {name} is the basename (no extension)\n - {ext} is the extension (includes leading dot)\n - For documents: {name}_{id}\n - For dated files: {date}_{name}_{id}\n - For created-time focused queries: {created}_{name}_{id}\n - Default: {name}_{id}\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Notion Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferNotionQuery(name: string, guidance: string?) 
-> NotionQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Notion search query.\n \n Notion search is simple text-based. Extract key search terms from the folder name.\n Remove common words like \"my\", \"the\", \"all\", etc.\n \n DEFAULTS:\n - limit: 50\n - filename_format: {title}_{id}.md\n \n Query examples:\n - \"meeting-notes\" → meeting notes\n - \"project-docs\" → project docs\n - \"todo\" → todo\n - \"weekly-reports\" → weekly reports\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {title}, {type}, {date}, {created}\n - For notes: {title}_{id}.md\n - For dated pages: {date}_{title}_{id}.md\n - Default: {title}_{id}.md\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Slack Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferSlackQuery(name: string, guidance: string?) 
-> SlackQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Slack search query.\n \n Slack search supports various operators for finding messages across channels.\n The query should find relevant messages based on the folder name intent.\n \n SLACK SEARCH OPERATORS:\n - in:#channel-name - messages in a specific channel\n - in:@username - direct messages with a user\n - from:@username - messages from a specific user\n - from:me - messages you sent\n - to:me - messages sent to you (DMs)\n - has:link - messages containing links\n - has:reaction - messages with reactions\n - has:star - starred messages\n - has:pin - pinned messages\n - before:YYYY-MM-DD - messages before a date\n - after:YYYY-MM-DD - messages after a date\n - during:month - messages from a specific month (e.g., during:january)\n - on:YYYY-MM-DD - messages on a specific date\n - is:saved - saved messages\n \n QUERY STRATEGY:\n 1. For channel-specific queries → use in:#channel\n 2. For user-specific queries → use from:@user\n 3. For topic searches → use keywords directly\n 4. For time-based queries → use before:, after:, during:, on:\n 5. For content type → use has:link, has:reaction, etc.\n 6. 
Combine operators with spaces (implicit AND) or OR\n \n DEFAULTS:\n - limit: 50\n - filename_format: {date}_{channel}_{user}_{id}.txt\n \n Query examples:\n - \"engineering-updates\" → in:#engineering OR in:#engineering-updates\n - \"team-announcements\" → in:#announcements OR in:#general announcement\n - \"from-ceo\" → from:@ceo OR from:@founder\n - \"recent-links\" → has:link after:2026-01-01\n - \"pinned-messages\" → has:pin\n - \"design-feedback\" → in:#design feedback OR in:#design-reviews\n - \"standup-notes\" → standup OR \"daily standup\" OR in:#standup\n - \"customer-issues\" → in:#support OR in:#customer-success issue OR bug\n - \"product-discussions\" → in:#product OR product roadmap OR feature\n - \"onboarding\" → onboarding OR \"new hire\" OR welcome\n - \"decisions\" → decision OR decided OR \"we will\" OR approved\n - \"action-items\" → \"action item\" OR TODO OR \"follow up\" OR assigned\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {channel}, {user}, {date}, {text}\n - For channel messages: {date}_{channel}_{user}_{id}.txt\n - For user messages: {date}_{user}_{channel}_{id}.txt\n - Default: {date}_{channel}_{user}_{id}.txt\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Linear Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferLinearQuery(name: string, guidance: string?) -> LinearQueryResult {\n client AnthropicClient\n prompt #\"\n Convert natural language into a Linear search query.\n \n CORE PRINCIPLE: Text search is powerful - use it liberally. 
Only add filters when intent is unambiguous.\n \n SYNTAX:\n - Plain text: searches title + description (very flexible, use this a lot!)\n - assignee:name | assignee:me - who it's assigned to\n - no:assignee - unassigned issues\n - state:backlog | state:todo | state:\"in progress\" | state:done | state:canceled\n - priority:1 (urgent) | priority:2 (high) | priority:3 (medium) | priority:4 (low) \n - label:name - filter by label\n - no:label - unlabeled issues\n - team:KEY - filter by team key\n - project:name - filter by project name\n - creator:name - who created it\n - cycle:current | cycle:next | cycle:previous - filter by cycle\n - estimate:N - filter by point estimate\n - OR - combine alternatives (use generously!)\n \n STRATEGY - Think about what the user ACTUALLY wants:\n \n 1. TOPIC/FEATURE searches → mostly text search\n \"checkout bugs\" → checkout bug\n \"api performance\" → api performance\n \"mobile crashes\" → mobile crash\n \"auth issues\" → auth authentication login\n \n 2. PERSON searches → assignee filter + text backup\n \"john's tickets\" → assignee:john OR john\n \"sarah jones work\" → assignee:sarah OR assignee:jones OR sarah OR jones\n \"assigned to mike\" → assignee:mike\n \"unassigned issues\" → no:assignee\n \n 3. STATUS/PRIORITY searches → filters\n \"my open issues\" → assignee:me state:todo OR state:\"in progress\"\n \"urgent bugs\" → priority:1 bug\n \"blocked tickets\" → blocked OR blocking\n \"stuff in review\" → review\n \"in progress issues\" → state:\"in progress\"\n \n 4. PROJECT/CYCLE searches → project or cycle filter\n \"acme project\" → project:acme\n \"current sprint\" → cycle:current\n \"next sprint issues\" → cycle:next\n \"q1 roadmap\" → q1 roadmap\n \"enterprise features\" → enterprise\n \n 5. 
VAGUE/BROAD searches → text search with synonyms\n \"tech debt\" → \"tech debt\" OR refactor OR cleanup\n \"security\" → security vulnerability\n \"onboarding\" → onboarding OR \"getting started\" OR setup\n \n KEY RULES:\n - When in doubt, use text search - it's flexible\n - Use OR generously to catch variations \n - For multi-word names: assignee:first OR assignee:last (never the full name)\n - Add synonyms for common concepts\n - Don't over-filter - better to return more results than miss things\n \n SEARCH TYPE: \"issues\" (default) or \"projects\" (only if explicitly asking for projects list)\n \n DEFAULTS:\n - limit: 50\n - filename_format: {identifier}_{title}.md\n \n CRITICAL: filename_format MUST contain placeholders like {identifier}, {title}, etc.\n NEVER return a literal filename like \"test.md\" or the folder name as the format.\n The format is a TEMPLATE applied to every result, so it MUST use placeholders.\n \n Folder/file name: {{ name }}\n {% if guidance %}\n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// GitHub Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGitHubQuery(name: string, guidance: string?) -> GitHubQueryResult {\n client AnthropicClient\n prompt #\"\n Convert natural language into a GitHub query. Think about what an agent would need.\n \n ═══════════════════════════════════════════════════════════════════════════════\n SEARCH TYPES & QUERY FORMATS\n ═══════════════════════════════════════════════════════════════════════════════\n \n 1. 
REPOS (search_type: \"repos\")\n Listing: list:org/ORGNAME or list:user/USERNAME [type:private|public]\n Search: language:go stars:>100 topic:ml org:company\n \n Examples:\n - \"beam-cloud repos\" → list:org/beam-cloud\n - \"private repos in acme\" → list:org/acme type:private\n - \"popular go projects\" → language:go stars:>1000\n \n 2. ISSUES (search_type: \"issues\") \n Query: repo:owner/name [is:open|closed] [label:X] [assignee:X] [author:X]\n \n Examples:\n - \"open bugs in beta9\" → repo:beam-cloud/beta9 is:open label:bug\n - \"my issues\" → assignee:@me is:open\n - \"security issues in react\" → repo:facebook/react label:security\n \n 3. PULL REQUESTS (search_type: \"prs\")\n Query: repo:owner/name [is:open|closed|merged] [author:X] [review:approved]\n \n Examples:\n - \"open PRs in beta9\" → repo:beam-cloud/beta9 is:open\n - \"merged PRs this week\" → repo:owner/repo is:merged merged:>2026-01-26\n - \"PRs needing review\" → repo:owner/repo is:open review:none\n \n 4. COMMITS (search_type: \"commits\")\n Query: repo:owner/name [branch:X] [author:X] [since:DATE] [path:X]\n \n Examples:\n - \"recent commits in beta9\" → repo:beam-cloud/beta9\n - \"commits to main this week\" → repo:owner/repo branch:main since:2026-01-26\n - \"john's commits\" → repo:owner/repo author:john\n - \"changes to src/api\" → repo:owner/repo path:src/api\n \n 5. RELEASES (search_type: \"releases\")\n Query: repo:owner/name [latest] [prerelease:true|false]\n \n Examples:\n - \"beta9 releases\" → repo:beam-cloud/beta9\n - \"latest release of kubernetes\" → repo:kubernetes/kubernetes latest\n - \"stable releases only\" → repo:owner/repo prerelease:false\n \n 6. WORKFLOW RUNS (search_type: \"workflows\")\n Query: repo:owner/name [status:success|failure|in_progress] [branch:X] [event:push|pr]\n \n Examples:\n - \"failed CI in beta9\" → repo:beam-cloud/beta9 status:failure\n - \"recent builds\" → repo:owner/repo\n - \"PR checks\" → repo:owner/repo event:pull_request\n \n 7. 
FILES (search_type: \"files\")\n Query: repo:owner/name path:PATH [ref:branch|tag|sha]\n \n Examples:\n - \"beta9 readme\" → repo:beam-cloud/beta9 path:README.md\n - \"package.json in react\" → repo:facebook/react path:package.json\n - \"config files\" → repo:owner/repo path:*.yaml ref:main\n - \"src directory\" → repo:owner/repo path:src/\n \n 8. BRANCHES (search_type: \"branches\")\n Query: repo:owner/name [protected:true|false]\n \n Examples:\n - \"beta9 branches\" → repo:beam-cloud/beta9\n - \"protected branches\" → repo:owner/repo protected:true\n \n ═══════════════════════════════════════════════════════════════════════════════\n CONTENT TYPES (how to format the content)\n ═══════════════════════════════════════════════════════════════════════════════\n \n - \"markdown\" (default) - Rich formatted content, best for agents to read\n - \"diff\" - Code changes (for PRs/commits)\n - \"json\" - Structured metadata\n - \"raw\" - Unprocessed file content (for files)\n \n ═══════════════════════════════════════════════════════════════════════════════\n FILENAME FORMATS (use appropriate placeholders)\n ═══════════════════════════════════════════════════════════════════════════════\n \n Repos: {name}.md or {full_name}.json\n Issues: {number}_{title}.md\n PRs: {number}_{title}.md or {number}_{title}_diff.md\n Commits: {sha_short}_{message}.md\n Releases: {tag}_{name}.md\n Workflows: {id}_{name}_{status}.md\n Files: {path} (preserve original path/name)\n Branches: {name}.md\n \n ═══════════════════════════════════════════════════════════════════════════════\n SMART DETECTION\n ═══════════════════════════════════════════════════════════════════════════════\n \n Detect intent from keywords:\n - \"repos/repositories/projects\" in org/user → repos listing\n - \"prs/pulls/pull requests/merge requests\" → prs\n - \"issues/bugs/tickets/tasks\" → issues\n - \"commits/history/changes/log\" → commits \n - \"releases/versions/tags\" → releases\n - 
\"ci/cd/builds/workflows/actions/checks\" → workflows\n - \"readme/config/file/code\" → files\n - \"branches\" → branches\n - \"diff/patch/changes\" → content_type: diff\n \n Folder/file name: {{ name }}\n {% if guidance %}\n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Tests\n// -----------------------------------------------------------------------------\n\ntest TestGitHubOpenPRs {\n functions [InferGitHubQuery]\n args {\n name \"open-prs-beta9\"\n guidance null\n }\n}\n\ntest TestGitHubIssues {\n functions [InferGitHubQuery]\n args {\n name \"beta9-bugs\"\n guidance \"show open bugs in the beta9 repo\"\n }\n}\n\ntest TestGitHubRepos {\n functions [InferGitHubQuery]\n args {\n name \"go-cli-tools\"\n guidance null\n }\n}\n\ntest TestGmailUnread {\n functions [InferGmailQuery]\n args {\n name \"unread-emails\"\n guidance null\n }\n}\n\ntest TestGmailFromPerson {\n functions [InferGmailQuery]\n args {\n name \"from-eli\"\n guidance null\n }\n}\n\ntest TestGmailWithGuidance {\n functions [InferGmailQuery]\n args {\n name \"important-emails\"\n guidance \"Only from the last 7 days, max 100 results\"\n }\n}\n\ntest TestGmailInvoices {\n functions [InferGmailQuery]\n args {\n name \"invoices\"\n guidance null\n }\n}\n\ntest TestGmailMeetingNotes {\n functions [InferGmailQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestGmailFlightConfirmation {\n functions [InferGmailQuery]\n args {\n name \"flight-confirmations\"\n guidance null\n }\n}\n\ntest TestGmailReceipts {\n functions [InferGmailQuery]\n args {\n name \"receipts\"\n guidance null\n }\n}\n\ntest TestGmailProjectAlpha {\n functions [InferGmailQuery]\n args {\n name \"project-alpha-emails\"\n guidance null\n }\n}\n\ntest TestGmailShippingUpdates {\n functions [InferGmailQuery]\n args {\n name \"shipping-updates\"\n guidance null\n }\n}\n\ntest 
TestGmailHealthWithGuidance {\n functions [InferGmailQuery]\n args {\n name \"Medical Documents\"\n guidance \"anything from: summit health, nyu, anything health related basically, mychart\"\n }\n}\n\ntest TestGmailMultiWordSender {\n functions [InferGmailQuery]\n args {\n name \"acme-corp-emails\"\n guidance \"from acme corp and related companies\"\n }\n}\n\ntest TestGmailFuzzySenderNike {\n functions [InferGmailQuery]\n args {\n name \"nike-emails\"\n guidance \"from nike\"\n }\n}\n\ntest TestGmailFuzzySenderAmazon {\n functions [InferGmailQuery]\n args {\n name \"amazon-orders\"\n guidance null\n }\n}\n\ntest TestGDriveShared {\n functions [InferGDriveQuery]\n args {\n name \"shared-with-me\"\n guidance null\n }\n}\n\ntest TestGDrivePdfs {\n functions [InferGDriveQuery]\n args {\n name \"pdfs\"\n guidance null\n }\n}\n\ntest TestGDriveCollegeDocs {\n functions [InferGDriveQuery]\n args {\n name \"college docs\"\n guidance null\n }\n}\n\ntest TestGDriveRecentDocsWithGuidance {\n functions [InferGDriveQuery]\n args {\n name \"recent-docs\"\n guidance \"any docs created in the past week\"\n }\n}\n\ntest TestGDriveOwnedByMe {\n functions [InferGDriveQuery]\n args {\n name \"my docs\"\n guidance null\n }\n}\n\ntest TestNotionMeetings {\n functions [InferNotionQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestEvaluateGmailResults_NoResults {\n functions [EvaluateGmailQueryResults]\n args {\n original_guidance \"anything from summit health, nyu, mychart, health related\"\n generated_query \"from:summit health OR from:nyu OR from:mychart\"\n result_count 0\n sample_results \"[]\"\n }\n}\n\ntest TestEvaluateGmailResults_GoodResults {\n functions [EvaluateGmailQueryResults]\n args {\n original_guidance \"invoices from stripe\"\n generated_query \"invoice from:stripe\"\n result_count 15\n sample_results \"[{\\\"from\\\":\\\"Stripe\\\",\\\"subject\\\":\\\"Your invoice from Acme Corp\\\",\\\"snippet\\\":\\\"Invoice 
#12345...\\\"},{\\\"from\\\":\\\"Stripe Billing\\\",\\\"subject\\\":\\\"Invoice for January\\\",\\\"snippet\\\":\\\"Amount due: $99\\\"}]\"\n }\n}\n\ntest TestSlackEngineering {\n functions [InferSlackQuery]\n args {\n name \"engineering-updates\"\n guidance null\n }\n}\n\ntest TestSlackFromCEO {\n functions [InferSlackQuery]\n args {\n name \"from-ceo\"\n guidance \"messages from our CEO\"\n }\n}\n\ntest TestSlackRecentLinks {\n functions [InferSlackQuery]\n args {\n name \"recent-links\"\n guidance null\n }\n}\n\ntest TestSlackStandups {\n functions [InferSlackQuery]\n args {\n name \"standup-notes\"\n guidance null\n }\n}\n\ntest TestSlackDecisions {\n functions [InferSlackQuery]\n args {\n name \"team-decisions\"\n guidance \"important decisions made by the team\"\n }\n}\n\ntest TestLinearMyIssues {\n functions [InferLinearQuery]\n args {\n name \"my-issues\"\n guidance null\n }\n}\n\ntest TestLinearMyBugs {\n functions [InferLinearQuery]\n args {\n name \"my-bugs\"\n guidance \"bugs assigned to me\"\n }\n}\n\ntest TestLinearUrgent {\n functions [InferLinearQuery]\n args {\n name \"urgent-issues\"\n guidance null\n }\n}\n\ntest TestLinearTeamBacklog {\n functions [InferLinearQuery]\n args {\n name \"eng-backlog\"\n guidance \"engineering team backlog\"\n }\n}\n\ntest TestLinearInProgress {\n functions [InferLinearQuery]\n args {\n name \"in-progress\"\n guidance null\n }\n}\n\ntest TestPostHogActiveFlags {\n functions [InferPostHogQuery]\n args {\n name \"active-feature-flags\"\n guidance null\n }\n}\n\ntest TestPostHogPageviewEvents {\n functions [InferPostHogQuery]\n args {\n name \"pageview-events\"\n guidance null\n }\n}\n\ntest TestPostHogRetentionInsights {\n functions [InferPostHogQuery]\n args {\n name \"retention-insights\"\n guidance null\n }\n}\n\ntest TestPostHogBetaUsers {\n functions [InferPostHogQuery]\n args {\n name \"beta-users\"\n guidance null\n }\n}\n\ntest TestPostHogErrorEvents {\n functions [InferPostHogQuery]\n args {\n name 
\"error-events\"\n guidance \"exception and error events from the past week\"\n }\n}\n\ntest TestPostHogSignupFunnel {\n functions [InferPostHogQuery]\n args {\n name \"signup-funnel\"\n guidance null\n }\n}\n\n// -----------------------------------------------------------------------------\n// Confluence Query Inference\n// -----------------------------------------------------------------------------\n\nclass ConfluenceQueryResult {\n cql_query string @description(\"Confluence Query Language (CQL) query string\")\n content_type string @description(\"Content type filter: 'page', 'blogpost', or 'all'. Default: 'all'\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {title}, {space}, {date}, {type}. Example: {space}_{title}_{id}.md\")\n}\n\nfunction InferConfluenceQuery(name: string, guidance: string?) -> ConfluenceQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Confluence CQL (Confluence Query Language) query.\n\n CQL SYNTAX:\n - type=page — pages only\n - type=blogpost — blog posts only\n - space=KEY — filter by space key (e.g., space=DEV, space=ENG)\n - title~\"text\" — title contains text (fuzzy match)\n - text~\"text\" — full-text search (searches title + body)\n - label=\"name\" — filter by label\n - creator=\"username\" — filter by creator\n - contributor=\"username\" — filter by contributor\n - lastModified>now(\"-7d\") — modified in the last 7 days\n - lastModified>now(\"-30d\") — modified in the last 30 days\n - lastModified>\"2025-01-01\" — modified after a specific date\n - created>now(\"-7d\") — created in the last 7 days\n - ancestor=12345 — pages under a specific parent\n - AND, OR — combine conditions\n - ORDER BY lastModified DESC, title ASC — sorting\n\n QUERY STRATEGY:\n 1. For topic searches → use text~ for full-text search\n 2. For space-specific → use space=KEY\n 3. 
For recent content → use lastModified>now(\"-Nd\")\n 4. For labeled content → use label=\"name\"\n 5. For specific content types → use type=page or type=blogpost\n 6. Combine operators with AND/OR\n\n DEFAULTS:\n - content_type: \"all\"\n - limit: 50\n - filename_format: {space}_{title}_{id}.md\n\n Query examples:\n - \"recent-pages\" → lastModified>now(\"-7d\") ORDER BY lastModified DESC, content_type: \"page\"\n - \"engineering-docs\" → (space=ENG OR text~\"engineering\") AND type=page ORDER BY lastModified DESC\n - \"kubernetes-pages\" → text~\"kubernetes\" AND type=page ORDER BY lastModified DESC\n - \"dev-blog-posts\" → space=DEV AND type=blogpost ORDER BY lastModified DESC\n - \"important-docs\" → label=\"important\" ORDER BY lastModified DESC\n - \"api-documentation\" → text~\"api\" AND (label=\"documentation\" OR title~\"api\") ORDER BY lastModified DESC\n - \"onboarding\" → text~\"onboarding\" OR title~\"onboarding\" ORDER BY lastModified DESC\n - \"architecture-decisions\" → text~\"architecture\" OR label=\"adr\" ORDER BY lastModified DESC\n - \"meeting-notes\" → title~\"meeting\" OR label=\"meeting-notes\" ORDER BY lastModified DESC\n - \"recent-blog-posts\" → type=blogpost AND lastModified>now(\"-30d\") ORDER BY lastModified DESC\n - \"project-alpha\" → text~\"project alpha\" ORDER BY lastModified DESC\n\n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {title}, {space}, {date}, {type}\n - For space-specific: {title}_{id}.md\n - For cross-space: {space}_{title}_{id}.md\n - For dated content: {date}_{title}_{id}.md\n - Default: {space}_{title}_{id}.md\n\n Folder/file name: {{ name }}\n {% if guidance %}\n\n Additional guidance from user: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Web Query Inference\n// 
-----------------------------------------------------------------------------\n\nfunction InferWebQuery(name: string, guidance: string?) -> WebQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder name into a web query. Two modes:\n\n web_mode=map — crawl a specific website. web_query must be a valid https:// URL.\n web_mode=search — web search. web_query is a search query string (not a URL).\n\n Use web_mode=map when guidance contains a URL or the name implies a specific site.\n Use web_mode=search when the name describes a topic, question, or information need.\n\n EXAMPLES:\n - \"hennessy-cocktails\" (guidance: \"https://www.hennessy.com/en-us/cocktails\")\n → web_mode: \"map\", web_query: \"https://www.hennessy.com/en-us/cocktails\", include_paths: [\"/en-us/cocktails/*\"], limit: 100\n - \"react-docs\"\n → web_mode: \"map\", web_query: \"https://react.dev/reference\", include_paths: [\"/reference/*\"], limit: 100\n - \"python-tutorial\"\n → web_mode: \"map\", web_query: \"https://docs.python.org/3/tutorial/\", include_paths: [\"/3/tutorial/*\"], limit: 50\n - \"latest AI news\"\n → web_mode: \"search\", web_query: \"latest AI news 2025\", include_paths: [], limit: 20\n - \"best rust web frameworks\"\n → web_mode: \"search\", web_query: \"best rust web frameworks comparison\", include_paths: [], limit: 15\n - \"climate change research papers\"\n → web_mode: \"search\", web_query: \"climate change research papers 2025\", include_paths: [], limit: 20\n\n Defaults: web_mode=map, limit=100, filename_format={title}_{id}.md\n\n Folder name: {{ name }}\n {% if guidance %}\n User guidance: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Web Query Tests\n// -----------------------------------------------------------------------------\n\ntest TestWebQueryWithGuidance {\n functions [InferWebQuery]\n args {\n name \"hennessy-cocktails\"\n guidance 
\"https://www.hennessy.com/en-us/cocktails?page=1\"\n }\n}\n\ntest TestWebQueryDocs {\n functions [InferWebQuery]\n args {\n name \"react-docs\"\n guidance null\n }\n}\n\ntest TestWebQueryBlog {\n functions [InferWebQuery]\n args {\n name \"openai-blog\"\n guidance null\n }\n}\n\ntest TestWebQuerySearch {\n functions [InferWebQuery]\n args {\n name \"latest AI news\"\n guidance null\n }\n}\n\ntest TestWebQuerySearchTopic {\n functions [InferWebQuery]\n args {\n name \"best rust web frameworks\"\n guidance null\n }\n}\n\n// -----------------------------------------------------------------------------\n// Confluence Query Tests\n// -----------------------------------------------------------------------------\n\ntest TestConfluenceRecentPages {\n functions [InferConfluenceQuery]\n args {\n name \"recent-pages\"\n guidance null\n }\n}\n\ntest TestConfluenceEngDocs {\n functions [InferConfluenceQuery]\n args {\n name \"engineering-docs\"\n guidance \"documentation from the engineering space\"\n }\n}\n\ntest TestConfluenceKubernetes {\n functions [InferConfluenceQuery]\n args {\n name \"kubernetes-pages\"\n guidance null\n }\n}\n\ntest TestConfluenceMeetingNotes {\n functions [InferConfluenceQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestConfluenceArchitectureDecisions {\n functions [InferConfluenceQuery]\n args {\n name \"architecture-decisions\"\n guidance \"ADRs and architecture decision records\"\n }\n}\n", | ||
"generators.baml": "generator target {\n output_type \"go\"\n output_dir \"../\"\n version \"0.218.1\"\n\n // 'baml-cli generate' will run this after generating go code\n // This command will be run from within $output_dir/baml_client\n on_generate \"bash \\\"$(git rev-parse --show-toplevel)/bin/patch-baml-runtime.sh\\\" && gofmt -w . && goimports -w .\"\n\n // Go packages name as specified in go.mod\n // We need this to generate correct imports in the generated baml_client\n client_package_name \"github.com/beam-cloud/airstore/pkg/sources/queries\"\n}\n",
P2: The embedded generated BAML map is out of sync with `baml_src`: it downgrades `generators.baml` to `0.218.1` while the source is `0.220.0`. Keep generated artifacts consistent with source.
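A minimal sketch of how this kind of drift could be caught before merge, assuming the check only needs to compare the `version "..."` line of the source `generators.baml` against the copy embedded in `baml_source_map.go`. The helper names `generatorVersion` and `checkInSync` are hypothetical and not part of this repository, and the two inputs are inlined here rather than read from disk. Re-running `baml-cli generate` is presumably still the actual fix; this only illustrates the consistency check.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionRe matches a `version "x.y.z"` line inside a BAML generator block.
var versionRe = regexp.MustCompile(`(?m)^\s*version\s+"([^"]+)"`)

// generatorVersion (hypothetical helper) returns the version declared in the
// text of a generators.baml file, or false if no version line is present.
func generatorVersion(src string) (string, bool) {
	m := versionRe.FindStringSubmatch(src)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// checkInSync (hypothetical helper) returns an error when the embedded copy
// declares a different generator version than the source file.
func checkInSync(source, embedded string) error {
	sv, ok := generatorVersion(source)
	if !ok {
		return fmt.Errorf("no version found in source generators.baml")
	}
	ev, ok := generatorVersion(embedded)
	if !ok {
		return fmt.Errorf("no version found in embedded generators.baml")
	}
	if sv != ev {
		return fmt.Errorf("embedded map is stale: source declares %s, embedded copy declares %s", sv, ev)
	}
	return nil
}

func main() {
	// Inlined stand-ins for baml_src/generators.baml and the file_map entry.
	source := "generator target {\n    output_type \"go\"\n    version \"0.220.0\"\n}\n"
	embedded := "generator target {\n    output_type \"go\"\n    version \"0.218.1\"\n}\n"

	if err := checkInSync(source, embedded); err != nil {
		fmt.Println(err)
	}
}
```

In a real check the two strings would come from the file on disk and from the `file_map` entry, and a mismatch would fail CI instead of printing.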
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At pkg/sources/queries/baml_client/baml_source_map.go, line 20:
<comment>The embedded generated BAML map is out of sync with `baml_src`: it downgrades `generators.baml` to `0.218.1` while source is `0.220.0`. Keep generated artifacts consistent with source.</comment>
<file context>
@@ -17,8 +17,8 @@ var file_map = map[string]string{
"cron.baml": "class CronResult {\n cron_expr string @description(\"Standard 5-field cron expression: minute hour day_of_month month day_of_week\")\n}\n\nfunction ParseCronSchedule(description: string, timezone: string) -> CronResult {\n client SmallModelCerebras\n prompt #\"\n Convert the following human-readable schedule description into a standard\n 5-field cron expression (minute hour day_of_month month day_of_week).\n\n The user is in the {{ timezone }} timezone. All times in the description\n refer to {{ timezone }} local time. Output the cron expression in that\n local timezone (do NOT convert to UTC).\n\n Description: {{ description }}\n\n {{ ctx.output_format }}\n \"#\n}\n",
- "generators.baml": "generator target {\n output_type \"go\"\n output_dir \"../\"\n version \"0.220.0\"\n\n // 'baml-cli generate' will run this after generating go code\n // This command will be run from within $output_dir/baml_client\n on_generate \"bash \\\"$(git rev-parse --show-toplevel)/bin/patch-baml-runtime.sh\\\" && gofmt -w . && goimports -w .\"\n\n // Go packages name as specified in go.mod\n // We need this to generate correct imports in the generated baml_client\n client_package_name \"github.com/beam-cloud/airstore/pkg/sources/queries\"\n}\n",
- "smart_queries.baml": "// =============================================================================\n// Smart Query BAML Functions\n// Infer source-specific queries from folder/file names\n// =============================================================================\n\nclass GmailQueryResult {\n gmail_query string @description(\"Gmail search query using Gmail's operators\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {date}, {from}, {subject}. Example: {date}_{from}_{subject}_{id}.txt\")\n}\n\nclass GDriveQueryResult {\n gdrive_query string @description(\"Google Drive search query\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {name}, {date}. Example: {name}_{id}\")\n}\n\nclass NotionQueryResult {\n notion_query string @description(\"Notion search query text\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {title}, {date}. Example: {title}_{id}.md\")\n}\n\nclass GitHubQueryResult {\n github_query string @description(\"GitHub query - format depends on search_type\")\n search_type string @description(\"Type: repos, issues, prs, commits, releases, workflows, files, branches\")\n content_type string @description(\"Format: markdown (default), diff (PRs/commits), json (metadata), raw (files)\")\n limit int @description(\"Max results (default: 50)\")\n filename_format string @description(\"Filename template with placeholders like {number}, {title}, {sha}, {name}, {path}\")\n}\n\nclass SlackQueryResult {\n slack_query string @description(\"Slack search query using Slack's search syntax\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. 
Use {id}, {channel}, {user}, {date}, {text}. Example: {date}_{channel}_{user}_{id}.txt\")\n}\n\nclass LinearQueryResult {\n linear_query string @description(\"Linear issue filter query using Linear's filter syntax\")\n search_type string @description(\"Type of search: 'issues' or 'projects'. Default: 'issues'\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {identifier}, {title}, {state}, {assignee}, {team}, {priority}, {date}. Example: {identifier}_{title}.md\")\n}\n\nclass PostHogQueryResult {\n posthog_query string @description(\"Search query string for PostHog API\")\n search_type string @description(\"Type of search: 'events', 'feature-flags', 'insights', 'cohorts'\")\n project_id int @description(\"PostHog project ID to search in. Use 0 to auto-select the first available project.\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Placeholders vary by search_type.\")\n}\n\nclass WebQueryResult {\n web_mode string @description(\"'map' to crawl a specific site, 'search' to find pages via web search. Default: map\")\n web_query string @description(\"For web_mode=map: the URL to crawl. For web_mode=search: the search query string.\")\n include_paths string[] @description(\"URL path patterns to include when web_mode=map (e.g. /cocktails/*). Empty array = all paths. Ignored for search.\")\n limit int @description(\"Max pages to discover (default: 100)\")\n filename_format string @description(\"Template for result filenames. Use {title}, {path}, {id}, {date}. Example: {title}_{id}.md\")\n}\n\n// -----------------------------------------------------------------------------\n// PostHog Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferPostHogQuery(name: string, guidance: string?) 
-> PostHogQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a PostHog query.\n\n SEARCH TYPES:\n 1. feature-flags — Feature flags (toggles). Use when name contains: flag, toggle, feature, rollout, experiment\n 2. insights — Saved queries, funnels, retention, trends. Use when name contains: insight, funnel, retention, trend, dashboard, chart, analytics\n 3. cohorts — User segments/groups. Use when name contains: cohort, segment, group, users, audience\n 4. events — Raw event data (default). Use when name contains: event, pageview, click, action, or nothing specific\n\n POSTHOG STANDARD EVENTS (start with $):\n $pageview, $pageleave, $autocapture, $screen, $exception, $rageclick, $web_vitals\n\n QUERY FORMAT:\n - For events: the event name to filter on (e.g., \"$pageview\", \"signup\", \"purchase\"). Empty string = all recent events.\n - For feature-flags/insights/cohorts: a search term to filter by name/key (e.g., \"beta\", \"onboarding\", \"retention\").\n\n ACTIVE FLAGS:\n If the name implies \"active\" or \"enabled\" flags (e.g., \"active-feature-flags\", \"enabled-flags\"), set search_type to \"feature-flags\" and use a broad query (empty or relevant keyword). 
The provider will filter for active=true.\n\n DEFAULTS:\n - project_id: 0 (auto-select first project)\n - limit: 50\n\n FILENAME FORMATS by search_type:\n - events: {date}_{event}_{id}.json\n - feature-flags: {key}_{id}.json\n - insights: {name}_{id}.json\n - cohorts: {name}_{id}.json\n\n Query examples:\n - \"active-feature-flags\" → search_type: feature-flags, query: \"\" (all flags, provider filters active)\n - \"pageview-events\" → search_type: events, query: \"$pageview\"\n - \"error-events\" → search_type: events, query: \"$exception\"\n - \"retention-insights\" → search_type: insights, query: \"retention\"\n - \"beta-users\" → search_type: cohorts, query: \"beta\"\n - \"signup-funnel\" → search_type: insights, query: \"signup funnel\"\n - \"recent-events\" → search_type: events, query: \"\"\n - \"experiment-flags\" → search_type: feature-flags, query: \"experiment\"\n - \"rage-clicks\" → search_type: events, query: \"$rageclick\"\n - \"web-vitals\" → search_type: events, query: \"$web_vitals\"\n - \"onboarding-cohort\" → search_type: cohorts, query: \"onboarding\"\n\n Folder/file name: {{ name }}\n {% if guidance %}\n\n Additional guidance from user: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Gmail Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGmailQuery(name: string, guidance: string?) 
-> GmailQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Gmail search query.\n \n IMPORTANT: Gmail search is FUZZY by default!\n - Bare words (no operator) search across subject, body, AND sender name/email\n - Use bare words for most searches - they match everywhere!\n - Gmail automatically handles plurals, common misspellings, and related terms\n \n CRITICAL: PREFER BARE WORDS OVER from: FOR FUZZY SENDER MATCHING!\n - from:company only matches if that word appears exactly in sender name/email\n - Bare words match sender, subject, AND body - much more flexible!\n - Many companies use subdomains like noreply@mail.company.com - bare words catch these better\n - ONLY use from: when you need STRICT sender matching (rare)\n \n CRITICAL SYNTAX RULES:\n - Multi-word operator values MUST be quoted: from:\"summit health\" NOT from:summit health\n - Bare multi-word phrases should also be quoted: \"lab results\" NOT lab results\n - Unquoted spaces mean AND: lab results = lab AND results (both must appear)\n - Use OR to match alternatives: mychart OR \"nyu langone\" OR \"summit health\"\n \n Gmail search operators (use sparingly - bare words are usually better!):\n - from:\"name or email\" - STRICT sender match (avoid for fuzzy matching!)\n - subject:\"phrase\" - matches subject line only\n - has:attachment, filename:pdf\n - is:unread, is:starred, is:important\n - newer_than:1d, newer_than:7d, older_than:30d\n \n QUERY STRATEGY:\n 1. For company/brand emails → USE BARE WORDS: nike OR adidas OR \"north face\"\n (This finds emails FROM them, ABOUT them, or MENTIONING them - much better coverage!)\n 2. For topics → bare words or quoted phrases: \"lab results\" OR prescription OR appointment\n 3. For people → from:firstname works but bare word also good: john OR from:john\n 4. For mixed (topic + senders) → all bare words: mychart OR nyu OR \"summit health\" OR prescription\n 5. 
For health/medical → medical OR health OR doctor OR appointment OR mychart OR prescription\n 6. When in doubt → ADD MORE ORs with related terms, synonyms, and variations!\n \n DEFAULTS:\n - limit: 50\n - filename_format: {date}_{from}_{subject}_{id}.txt\n \n Query examples:\n - \"invoices\" → invoice OR receipt OR bill OR statement\n - \"stripe-invoices\" → stripe invoice OR stripe receipt OR stripe payment\n - \"receipts\" → receipt OR invoice OR order OR confirmation OR purchase\n - \"meeting-notes\" → meeting OR agenda OR \"calendar invite\" OR standup OR sync\n - \"flight-confirmation\" → flight OR itinerary OR \"boarding pass\" OR airline OR booking\n - \"health-emails\" → medical OR health OR doctor OR appointment OR mychart OR prescription OR lab\n - \"from-john\" → from:john (explicit \"from\" in name = use the operator)\n - \"emails-from-acme-corp\" → from:\"acme corp\" (explicit \"from\" + multi-word = quoted operator)\n - \"unread\" → is:unread\n - \"project-alpha\" → \"project alpha\" (quoted phrase)\n - \"shipping\" → shipping OR tracking OR delivery OR fedex OR ups OR usps OR package\n - \"recent\" → newer_than:7d\n - \"amazon-orders\" → amazon order OR amazon shipment OR amazon delivery OR amazon purchase\n - \"uber-receipts\" → uber receipt OR uber ride OR lyft receipt OR lyft ride\n \n Guidance examples:\n - \"from nike\" → nike (bare word for fuzzy matching - catches nike.com, news@nike.com, etc.)\n - \"anything from summit health, nyu, mychart, health related\" \n → mychart OR nyu OR \"summit health\" OR medical OR appointment OR prescription OR lab\n (Use bare words with lots of ORs - they match sender names AND content!)\n - \"strictly only emails where the sender is john@company.com\" → from:john@company.com\n (Use from: ONLY when user explicitly wants strict sender matching)\n \n Filename format: {date}_{from}_{subject}_{id}.txt\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif 
%}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Gmail Query Evaluation and Refinement\n// -----------------------------------------------------------------------------\n\nclass GmailQueryEvaluation {\n is_satisfactory bool @description(\"True if results match user intent well enough\")\n refined_query string? @description(\"Improved Gmail query if not satisfactory, null if satisfactory\")\n reasoning string @description(\"Brief explanation of evaluation (1-2 sentences)\")\n}\n\nfunction EvaluateGmailQueryResults(\n original_guidance: string,\n generated_query: string,\n result_count: int,\n sample_results: string\n) -> GmailQueryEvaluation {\n client AnthropicClient\n prompt #\"\n You are a STRICT evaluator checking if Gmail search results actually match what the user asked for.\n Be critical - if results don't clearly match the user's intent, suggest a better query.\n \n USER'S REQUEST: {{ original_guidance }}\n \n CURRENT QUERY: {{ generated_query }}\n \n RESULTS: {{ result_count }} emails found\n \n SAMPLE RESULTS: {{ sample_results }}\n \n STRICT EVALUATION CHECKLIST:\n \n 1. ZERO RESULTS = AUTOMATIC FAIL\n - If result_count is 0, the query is broken. Mark as NOT satisfactory.\n - Common causes:\n * Unquoted multi-word values: from:summit health → from:\"summit health\"\n * Using from: when bare word would match better (from:bonobos might miss noreply@e.bonobos.com)\n \n 2. 
CHECK EACH THING THE USER MENTIONED:\n - If user mentioned specific senders/companies (e.g., \"bonobos\", \"amazon\", \"stripe\"):\n → Are emails FROM or ABOUT these in the results?\n → Note: bare words like \"bonobos\" are GOOD - they match sender, subject, AND body\n → If from: operator fails, suggest using bare word instead for fuzzy matching\n - If user mentioned topics (e.g., \"health\", \"medical\", \"appointments\"):\n → Do the subjects/snippets actually contain these words?\n → Generic results that don't mention the topics = NOT satisfactory.\n \n 3. LOOK FOR OBVIOUS MISMATCHES:\n - Results about completely unrelated topics = NOT satisfactory\n - Missing the main thing the user asked for = NOT satisfactory\n \n 4. QUERY IMPROVEMENTS TO SUGGEST:\n - from:bonobos (0 results) → bonobos (bare word matches subdomains like e.bonobos.com)\n - from:summit health → from:\"summit health\" (MUST quote multi-word)\n - Using from: when company uses weird email domains → use bare word instead\n - Not enough OR alternatives for fuzzy matching\n \n BE HARSH: If you're unsure whether results match, lean toward NOT satisfactory and suggest improvements.\n Only mark satisfactory if the results CLEARLY match what the user asked for.\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Google Drive Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGDriveQuery(name: string, guidance: string?) 
-> GDriveQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Google Drive files.list query (the `q` parameter).\n \n Output MUST be ONLY the query expression (no `q=`, no URL encoding).\n It MUST be valid Drive query syntax.\n \n Drive query syntax:\n - name contains 'text'\n - fullText contains 'text'\n - mimeType = 'application/pdf'\n - mimeType = 'application/vnd.google-apps.document'\n - mimeType = 'application/vnd.google-apps.spreadsheet'\n - mimeType = 'application/vnd.google-apps.presentation'\n - mimeType contains 'image'\n - sharedWithMe = true\n - 'me' in owners\n - 'email@example.com' in owners\n - 'email@example.com' in writers\n - 'email@example.com' in readers\n - starred = true\n - trashed = false\n - modifiedTime > '2024-01-01T00:00:00'\n - createdTime > '2024-01-01T00:00:00'\n - 'FOLDER_ID' in parents\n \n Combine with: and, or, not\n \n IMPORTANT:\n - Always include `trashed = false` unless the user explicitly asks for trash / deleted files.\n - Prefer `fullText contains` for topic/keyword folders (it searches name + indexed content).\n Use `name contains` when the intent is specifically filename-based.\n - Use RFC3339 timestamps for createdTime/modifiedTime. Prefer including a timezone, e.g. 
'2026-01-22T00:00:00Z'.\n \n DEFAULTS:\n - limit: 50\n - filename_format: {name}_{id}\n \n Query examples:\n - \"recent\" → trashed = false and modifiedTime > '2026-01-22T00:00:00Z'\n - \"shared\" → trashed = false and sharedWithMe = true\n - \"my-docs\" → trashed = false and 'me' in owners and mimeType = 'application/vnd.google-apps.document'\n - \"pdfs\" → trashed = false and mimeType = 'application/pdf'\n - \"documents\" → trashed = false and mimeType = 'application/vnd.google-apps.document'\n - \"spreadsheets\" → trashed = false and mimeType = 'application/vnd.google-apps.spreadsheet'\n - \"presentations\" → trashed = false and mimeType = 'application/vnd.google-apps.presentation'\n - \"images\" → trashed = false and mimeType contains 'image/'\n - \"starred\" → trashed = false and starred = true\n - \"college-docs\" → trashed = false and (fullText contains 'college' or name contains 'college') and mimeType = 'application/vnd.google-apps.document'\n - Guidance: \"created in the past week\" → use createdTime > '...'\n - Guidance: \"modified in the past week\" → use modifiedTime > '...'\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {name}, {date}, {created}, {mime_type}, {ext}\n - {name} is the basename (no extension)\n - {ext} is the extension (includes leading dot)\n - For documents: {name}_{id}\n - For dated files: {date}_{name}_{id}\n - For created-time focused queries: {created}_{name}_{id}\n - Default: {name}_{id}\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Notion Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferNotionQuery(name: string, guidance: string?) 
-> NotionQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Notion search query.\n \n Notion search is simple text-based. Extract key search terms from the folder name.\n Remove common words like \"my\", \"the\", \"all\", etc.\n \n DEFAULTS:\n - limit: 50\n - filename_format: {title}_{id}.md\n \n Query examples:\n - \"meeting-notes\" → meeting notes\n - \"project-docs\" → project docs\n - \"todo\" → todo\n - \"weekly-reports\" → weekly reports\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {title}, {type}, {date}, {created}\n - For notes: {title}_{id}.md\n - For dated pages: {date}_{title}_{id}.md\n - Default: {title}_{id}.md\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Slack Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferSlackQuery(name: string, guidance: string?) 
-> SlackQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Slack search query.\n \n Slack search supports various operators for finding messages across channels.\n The query should find relevant messages based on the folder name intent.\n \n SLACK SEARCH OPERATORS:\n - in:#channel-name - messages in a specific channel\n - in:@username - direct messages with a user\n - from:@username - messages from a specific user\n - from:me - messages you sent\n - to:me - messages sent to you (DMs)\n - has:link - messages containing links\n - has:reaction - messages with reactions\n - has:star - starred messages\n - has:pin - pinned messages\n - before:YYYY-MM-DD - messages before a date\n - after:YYYY-MM-DD - messages after a date\n - during:month - messages from a specific month (e.g., during:january)\n - on:YYYY-MM-DD - messages on a specific date\n - is:saved - saved messages\n \n QUERY STRATEGY:\n 1. For channel-specific queries → use in:#channel\n 2. For user-specific queries → use from:@user\n 3. For topic searches → use keywords directly\n 4. For time-based queries → use before:, after:, during:, on:\n 5. For content type → use has:link, has:reaction, etc.\n 6. 
Combine operators with spaces (implicit AND) or OR\n \n DEFAULTS:\n - limit: 50\n - filename_format: {date}_{channel}_{user}_{id}.txt\n \n Query examples:\n - \"engineering-updates\" → in:#engineering OR in:#engineering-updates\n - \"team-announcements\" → in:#announcements OR in:#general announcement\n - \"from-ceo\" → from:@ceo OR from:@founder\n - \"recent-links\" → has:link after:2026-01-01\n - \"pinned-messages\" → has:pin\n - \"design-feedback\" → in:#design feedback OR in:#design-reviews\n - \"standup-notes\" → standup OR \"daily standup\" OR in:#standup\n - \"customer-issues\" → in:#support OR in:#customer-success issue OR bug\n - \"product-discussions\" → in:#product OR product roadmap OR feature\n - \"onboarding\" → onboarding OR \"new hire\" OR welcome\n - \"decisions\" → decision OR decided OR \"we will\" OR approved\n - \"action-items\" → \"action item\" OR TODO OR \"follow up\" OR assigned\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {channel}, {user}, {date}, {text}\n - For channel messages: {date}_{channel}_{user}_{id}.txt\n - For user messages: {date}_{user}_{channel}_{id}.txt\n - Default: {date}_{channel}_{user}_{id}.txt\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Linear Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferLinearQuery(name: string, guidance: string?) -> LinearQueryResult {\n client AnthropicClient\n prompt #\"\n Convert natural language into a Linear search query.\n \n CORE PRINCIPLE: Text search is powerful - use it liberally. 
Only add filters when intent is unambiguous.\n \n SYNTAX:\n - Plain text: searches title + description (very flexible, use this a lot!)\n - assignee:name | assignee:me - who it's assigned to\n - no:assignee - unassigned issues\n - state:backlog | state:todo | state:\"in progress\" | state:done | state:canceled\n - priority:1 (urgent) | priority:2 (high) | priority:3 (medium) | priority:4 (low) \n - label:name - filter by label\n - no:label - unlabeled issues\n - team:KEY - filter by team key\n - project:name - filter by project name\n - creator:name - who created it\n - cycle:current | cycle:next | cycle:previous - filter by cycle\n - estimate:N - filter by point estimate\n - OR - combine alternatives (use generously!)\n \n STRATEGY - Think about what the user ACTUALLY wants:\n \n 1. TOPIC/FEATURE searches → mostly text search\n \"checkout bugs\" → checkout bug\n \"api performance\" → api performance\n \"mobile crashes\" → mobile crash\n \"auth issues\" → auth authentication login\n \n 2. PERSON searches → assignee filter + text backup\n \"john's tickets\" → assignee:john OR john\n \"sarah jones work\" → assignee:sarah OR assignee:jones OR sarah OR jones\n \"assigned to mike\" → assignee:mike\n \"unassigned issues\" → no:assignee\n \n 3. STATUS/PRIORITY searches → filters\n \"my open issues\" → assignee:me state:todo OR state:\"in progress\"\n \"urgent bugs\" → priority:1 bug\n \"blocked tickets\" → blocked OR blocking\n \"stuff in review\" → review\n \"in progress issues\" → state:\"in progress\"\n \n 4. PROJECT/CYCLE searches → project or cycle filter\n \"acme project\" → project:acme\n \"current sprint\" → cycle:current\n \"next sprint issues\" → cycle:next\n \"q1 roadmap\" → q1 roadmap\n \"enterprise features\" → enterprise\n \n 5. 
VAGUE/BROAD searches → text search with synonyms\n \"tech debt\" → \"tech debt\" OR refactor OR cleanup\n \"security\" → security vulnerability\n \"onboarding\" → onboarding OR \"getting started\" OR setup\n \n KEY RULES:\n - When in doubt, use text search - it's flexible\n - Use OR generously to catch variations \n - For multi-word names: assignee:first OR assignee:last (never the full name)\n - Add synonyms for common concepts\n - Don't over-filter - better to return more results than miss things\n \n SEARCH TYPE: \"issues\" (default) or \"projects\" (only if explicitly asking for projects list)\n \n DEFAULTS:\n - limit: 50\n - filename_format: {identifier}_{title}.md\n \n CRITICAL: filename_format MUST contain placeholders like {identifier}, {title}, etc.\n NEVER return a literal filename like \"test.md\" or the folder name as the format.\n The format is a TEMPLATE applied to every result, so it MUST use placeholders.\n \n Folder/file name: {{ name }}\n {% if guidance %}\n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// GitHub Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGitHubQuery(name: string, guidance: string?) -> GitHubQueryResult {\n client AnthropicClient\n prompt #\"\n Convert natural language into a GitHub query. Think about what an agent would need.\n \n ═══════════════════════════════════════════════════════════════════════════════\n SEARCH TYPES & QUERY FORMATS\n ═══════════════════════════════════════════════════════════════════════════════\n \n 1. 
REPOS (search_type: \"repos\")\n Listing: list:org/ORGNAME or list:user/USERNAME [type:private|public]\n Search: language:go stars:>100 topic:ml org:company\n \n Examples:\n - \"beam-cloud repos\" → list:org/beam-cloud\n - \"private repos in acme\" → list:org/acme type:private\n - \"popular go projects\" → language:go stars:>1000\n \n 2. ISSUES (search_type: \"issues\") \n Query: repo:owner/name [is:open|closed] [label:X] [assignee:X] [author:X]\n \n Examples:\n - \"open bugs in beta9\" → repo:beam-cloud/beta9 is:open label:bug\n - \"my issues\" → assignee:@me is:open\n - \"security issues in react\" → repo:facebook/react label:security\n \n 3. PULL REQUESTS (search_type: \"prs\")\n Query: repo:owner/name [is:open|closed|merged] [author:X] [review:approved]\n \n Examples:\n - \"open PRs in beta9\" → repo:beam-cloud/beta9 is:open\n - \"merged PRs this week\" → repo:owner/repo is:merged merged:>2026-01-26\n - \"PRs needing review\" → repo:owner/repo is:open review:none\n \n 4. COMMITS (search_type: \"commits\")\n Query: repo:owner/name [branch:X] [author:X] [since:DATE] [path:X]\n \n Examples:\n - \"recent commits in beta9\" → repo:beam-cloud/beta9\n - \"commits to main this week\" → repo:owner/repo branch:main since:2026-01-26\n - \"john's commits\" → repo:owner/repo author:john\n - \"changes to src/api\" → repo:owner/repo path:src/api\n \n 5. RELEASES (search_type: \"releases\")\n Query: repo:owner/name [latest] [prerelease:true|false]\n \n Examples:\n - \"beta9 releases\" → repo:beam-cloud/beta9\n - \"latest release of kubernetes\" → repo:kubernetes/kubernetes latest\n - \"stable releases only\" → repo:owner/repo prerelease:false\n \n 6. WORKFLOW RUNS (search_type: \"workflows\")\n Query: repo:owner/name [status:success|failure|in_progress] [branch:X] [event:push|pr]\n \n Examples:\n - \"failed CI in beta9\" → repo:beam-cloud/beta9 status:failure\n - \"recent builds\" → repo:owner/repo\n - \"PR checks\" → repo:owner/repo event:pull_request\n \n 7. 
FILES (search_type: \"files\")\n Query: repo:owner/name path:PATH [ref:branch|tag|sha]\n \n Examples:\n - \"beta9 readme\" → repo:beam-cloud/beta9 path:README.md\n - \"package.json in react\" → repo:facebook/react path:package.json\n - \"config files\" → repo:owner/repo path:*.yaml ref:main\n - \"src directory\" → repo:owner/repo path:src/\n \n 8. BRANCHES (search_type: \"branches\")\n Query: repo:owner/name [protected:true|false]\n \n Examples:\n - \"beta9 branches\" → repo:beam-cloud/beta9\n - \"protected branches\" → repo:owner/repo protected:true\n \n ═══════════════════════════════════════════════════════════════════════════════\n CONTENT TYPES (how to format the content)\n ═══════════════════════════════════════════════════════════════════════════════\n \n - \"markdown\" (default) - Rich formatted content, best for agents to read\n - \"diff\" - Code changes (for PRs/commits)\n - \"json\" - Structured metadata\n - \"raw\" - Unprocessed file content (for files)\n \n ═══════════════════════════════════════════════════════════════════════════════\n FILENAME FORMATS (use appropriate placeholders)\n ═══════════════════════════════════════════════════════════════════════════════\n \n Repos: {name}.md or {full_name}.json\n Issues: {number}_{title}.md\n PRs: {number}_{title}.md or {number}_{title}_diff.md\n Commits: {sha_short}_{message}.md\n Releases: {tag}_{name}.md\n Workflows: {id}_{name}_{status}.md\n Files: {path} (preserve original path/name)\n Branches: {name}.md\n \n ═══════════════════════════════════════════════════════════════════════════════\n SMART DETECTION\n ═══════════════════════════════════════════════════════════════════════════════\n \n Detect intent from keywords:\n - \"repos/repositories/projects\" in org/user → repos listing\n - \"prs/pulls/pull requests/merge requests\" → prs\n - \"issues/bugs/tickets/tasks\" → issues\n - \"commits/history/changes/log\" → commits \n - \"releases/versions/tags\" → releases\n - 
\"ci/cd/builds/workflows/actions/checks\" → workflows\n - \"readme/config/file/code\" → files\n - \"branches\" → branches\n - \"diff/patch/changes\" → content_type: diff\n \n Folder/file name: {{ name }}\n {% if guidance %}\n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Tests\n// -----------------------------------------------------------------------------\n\ntest TestGitHubOpenPRs {\n functions [InferGitHubQuery]\n args {\n name \"open-prs-beta9\"\n guidance null\n }\n}\n\ntest TestGitHubIssues {\n functions [InferGitHubQuery]\n args {\n name \"beta9-bugs\"\n guidance \"show open bugs in the beta9 repo\"\n }\n}\n\ntest TestGitHubRepos {\n functions [InferGitHubQuery]\n args {\n name \"go-cli-tools\"\n guidance null\n }\n}\n\ntest TestGmailUnread {\n functions [InferGmailQuery]\n args {\n name \"unread-emails\"\n guidance null\n }\n}\n\ntest TestGmailFromPerson {\n functions [InferGmailQuery]\n args {\n name \"from-eli\"\n guidance null\n }\n}\n\ntest TestGmailWithGuidance {\n functions [InferGmailQuery]\n args {\n name \"important-emails\"\n guidance \"Only from the last 7 days, max 100 results\"\n }\n}\n\ntest TestGmailInvoices {\n functions [InferGmailQuery]\n args {\n name \"invoices\"\n guidance null\n }\n}\n\ntest TestGmailMeetingNotes {\n functions [InferGmailQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestGmailFlightConfirmation {\n functions [InferGmailQuery]\n args {\n name \"flight-confirmations\"\n guidance null\n }\n}\n\ntest TestGmailReceipts {\n functions [InferGmailQuery]\n args {\n name \"receipts\"\n guidance null\n }\n}\n\ntest TestGmailProjectAlpha {\n functions [InferGmailQuery]\n args {\n name \"project-alpha-emails\"\n guidance null\n }\n}\n\ntest TestGmailShippingUpdates {\n functions [InferGmailQuery]\n args {\n name \"shipping-updates\"\n guidance null\n }\n}\n\ntest 
TestGmailHealthWithGuidance {\n functions [InferGmailQuery]\n args {\n name \"Medical Documents\"\n guidance \"anything from: summit health, nyu, anything health related basically, mychart\"\n }\n}\n\ntest TestGmailMultiWordSender {\n functions [InferGmailQuery]\n args {\n name \"acme-corp-emails\"\n guidance \"from acme corp and related companies\"\n }\n}\n\ntest TestGmailFuzzySenderNike {\n functions [InferGmailQuery]\n args {\n name \"nike-emails\"\n guidance \"from nike\"\n }\n}\n\ntest TestGmailFuzzySenderAmazon {\n functions [InferGmailQuery]\n args {\n name \"amazon-orders\"\n guidance null\n }\n}\n\ntest TestGDriveShared {\n functions [InferGDriveQuery]\n args {\n name \"shared-with-me\"\n guidance null\n }\n}\n\ntest TestGDrivePdfs {\n functions [InferGDriveQuery]\n args {\n name \"pdfs\"\n guidance null\n }\n}\n\ntest TestGDriveCollegeDocs {\n functions [InferGDriveQuery]\n args {\n name \"college docs\"\n guidance null\n }\n}\n\ntest TestGDriveRecentDocsWithGuidance {\n functions [InferGDriveQuery]\n args {\n name \"recent-docs\"\n guidance \"any docs created in the past week\"\n }\n}\n\ntest TestGDriveOwnedByMe {\n functions [InferGDriveQuery]\n args {\n name \"my docs\"\n guidance null\n }\n}\n\ntest TestNotionMeetings {\n functions [InferNotionQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestEvaluateGmailResults_NoResults {\n functions [EvaluateGmailQueryResults]\n args {\n original_guidance \"anything from summit health, nyu, mychart, health related\"\n generated_query \"from:summit health OR from:nyu OR from:mychart\"\n result_count 0\n sample_results \"[]\"\n }\n}\n\ntest TestEvaluateGmailResults_GoodResults {\n functions [EvaluateGmailQueryResults]\n args {\n original_guidance \"invoices from stripe\"\n generated_query \"invoice from:stripe\"\n result_count 15\n sample_results \"[{\\\"from\\\":\\\"Stripe\\\",\\\"subject\\\":\\\"Your invoice from Acme Corp\\\",\\\"snippet\\\":\\\"Invoice 
#12345...\\\"},{\\\"from\\\":\\\"Stripe Billing\\\",\\\"subject\\\":\\\"Invoice for January\\\",\\\"snippet\\\":\\\"Amount due: $99\\\"}]\"\n }\n}\n\ntest TestSlackEngineering {\n functions [InferSlackQuery]\n args {\n name \"engineering-updates\"\n guidance null\n }\n}\n\ntest TestSlackFromCEO {\n functions [InferSlackQuery]\n args {\n name \"from-ceo\"\n guidance \"messages from our CEO\"\n }\n}\n\ntest TestSlackRecentLinks {\n functions [InferSlackQuery]\n args {\n name \"recent-links\"\n guidance null\n }\n}\n\ntest TestSlackStandups {\n functions [InferSlackQuery]\n args {\n name \"standup-notes\"\n guidance null\n }\n}\n\ntest TestSlackDecisions {\n functions [InferSlackQuery]\n args {\n name \"team-decisions\"\n guidance \"important decisions made by the team\"\n }\n}\n\ntest TestLinearMyIssues {\n functions [InferLinearQuery]\n args {\n name \"my-issues\"\n guidance null\n }\n}\n\ntest TestLinearMyBugs {\n functions [InferLinearQuery]\n args {\n name \"my-bugs\"\n guidance \"bugs assigned to me\"\n }\n}\n\ntest TestLinearUrgent {\n functions [InferLinearQuery]\n args {\n name \"urgent-issues\"\n guidance null\n }\n}\n\ntest TestLinearTeamBacklog {\n functions [InferLinearQuery]\n args {\n name \"eng-backlog\"\n guidance \"engineering team backlog\"\n }\n}\n\ntest TestLinearInProgress {\n functions [InferLinearQuery]\n args {\n name \"in-progress\"\n guidance null\n }\n}\n\ntest TestPostHogActiveFlags {\n functions [InferPostHogQuery]\n args {\n name \"active-feature-flags\"\n guidance null\n }\n}\n\ntest TestPostHogPageviewEvents {\n functions [InferPostHogQuery]\n args {\n name \"pageview-events\"\n guidance null\n }\n}\n\ntest TestPostHogRetentionInsights {\n functions [InferPostHogQuery]\n args {\n name \"retention-insights\"\n guidance null\n }\n}\n\ntest TestPostHogBetaUsers {\n functions [InferPostHogQuery]\n args {\n name \"beta-users\"\n guidance null\n }\n}\n\ntest TestPostHogErrorEvents {\n functions [InferPostHogQuery]\n args {\n name 
\"error-events\"\n guidance \"exception and error events from the past week\"\n }\n}\n\ntest TestPostHogSignupFunnel {\n functions [InferPostHogQuery]\n args {\n name \"signup-funnel\"\n guidance null\n }\n}\n\n// -----------------------------------------------------------------------------\n// Confluence Query Inference\n// -----------------------------------------------------------------------------\n\nclass ConfluenceQueryResult {\n cql_query string @description(\"Confluence Query Language (CQL) query string\")\n content_type string @description(\"Content type filter: 'page', 'blogpost', or 'all'. Default: 'all'\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {title}, {space}, {date}, {type}. Example: {space}_{title}_{id}.md\")\n}\n\nfunction InferConfluenceQuery(name: string, guidance: string?) -> ConfluenceQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Confluence CQL (Confluence Query Language) query.\n\n CQL SYNTAX:\n - type=page — pages only\n - type=blogpost — blog posts only\n - space=KEY — filter by space key (e.g., space=DEV, space=ENG)\n - title~\"text\" — title contains text (fuzzy match)\n - text~\"text\" — full-text search (searches title + body)\n - label=\"name\" — filter by label\n - creator=\"username\" — filter by creator\n - contributor=\"username\" — filter by contributor\n - lastModified>now(\"-7d\") — modified in the last 7 days\n - lastModified>now(\"-30d\") — modified in the last 30 days\n - lastModified>\"2025-01-01\" — modified after a specific date\n - created>now(\"-7d\") — created in the last 7 days\n - ancestor=12345 — pages under a specific parent\n - AND, OR — combine conditions\n - ORDER BY lastModified DESC, title ASC — sorting\n\n QUERY STRATEGY:\n 1. For topic searches → use text~ for full-text search\n 2. For space-specific → use space=KEY\n 3. 
For recent content → use lastModified>now(\"-Nd\")\n 4. For labeled content → use label=\"name\"\n 5. For specific content types → use type=page or type=blogpost\n 6. Combine operators with AND/OR\n\n DEFAULTS:\n - content_type: \"all\"\n - limit: 50\n - filename_format: {space}_{title}_{id}.md\n\n Query examples:\n - \"recent-pages\" → lastModified>now(\"-7d\") ORDER BY lastModified DESC, content_type: \"page\"\n - \"engineering-docs\" → (space=ENG OR text~\"engineering\") AND type=page ORDER BY lastModified DESC\n - \"kubernetes-pages\" → text~\"kubernetes\" AND type=page ORDER BY lastModified DESC\n - \"dev-blog-posts\" → space=DEV AND type=blogpost ORDER BY lastModified DESC\n - \"important-docs\" → label=\"important\" ORDER BY lastModified DESC\n - \"api-documentation\" → text~\"api\" AND (label=\"documentation\" OR title~\"api\") ORDER BY lastModified DESC\n - \"onboarding\" → text~\"onboarding\" OR title~\"onboarding\" ORDER BY lastModified DESC\n - \"architecture-decisions\" → text~\"architecture\" OR label=\"adr\" ORDER BY lastModified DESC\n - \"meeting-notes\" → title~\"meeting\" OR label=\"meeting-notes\" ORDER BY lastModified DESC\n - \"recent-blog-posts\" → type=blogpost AND lastModified>now(\"-30d\") ORDER BY lastModified DESC\n - \"project-alpha\" → text~\"project alpha\" ORDER BY lastModified DESC\n\n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {title}, {space}, {date}, {type}\n - For space-specific: {title}_{id}.md\n - For cross-space: {space}_{title}_{id}.md\n - For dated content: {date}_{title}_{id}.md\n - Default: {space}_{title}_{id}.md\n\n Folder/file name: {{ name }}\n {% if guidance %}\n\n Additional guidance from user: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Web Query Inference\n// 
-----------------------------------------------------------------------------\n\nfunction InferWebQuery(name: string, guidance: string?) -> WebQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder name into a web query. Two modes:\n\n web_mode=map — crawl a specific website. web_query must be a valid https:// URL.\n web_mode=search — web search. web_query is a search query string (not a URL).\n\n Use web_mode=map when guidance contains a URL or the name implies a specific site.\n Use web_mode=search when the name describes a topic, question, or information need.\n\n EXAMPLES:\n - \"hennessy-cocktails\" (guidance: \"https://www.hennessy.com/en-us/cocktails\")\n → web_mode: \"map\", web_query: \"https://www.hennessy.com/en-us/cocktails\", include_paths: [\"/en-us/cocktails/*\"], limit: 100\n - \"react-docs\"\n → web_mode: \"map\", web_query: \"https://react.dev/reference\", include_paths: [\"/reference/*\"], limit: 100\n - \"python-tutorial\"\n → web_mode: \"map\", web_query: \"https://docs.python.org/3/tutorial/\", include_paths: [\"/3/tutorial/*\"], limit: 50\n - \"latest AI news\"\n → web_mode: \"search\", web_query: \"latest AI news 2025\", include_paths: [], limit: 20\n - \"best rust web frameworks\"\n → web_mode: \"search\", web_query: \"best rust web frameworks comparison\", include_paths: [], limit: 15\n - \"climate change research papers\"\n → web_mode: \"search\", web_query: \"climate change research papers 2025\", include_paths: [], limit: 20\n\n Defaults: web_mode=map, limit=100, filename_format={title}_{id}.md\n\n Folder name: {{ name }}\n {% if guidance %}\n User guidance: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Web Query Tests\n// -----------------------------------------------------------------------------\n\ntest TestWebQueryWithGuidance {\n functions [InferWebQuery]\n args {\n name \"hennessy-cocktails\"\n guidance 
\"https://www.hennessy.com/en-us/cocktails?page=1\"\n }\n}\n\ntest TestWebQueryDocs {\n functions [InferWebQuery]\n args {\n name \"react-docs\"\n guidance null\n }\n}\n\ntest TestWebQueryBlog {\n functions [InferWebQuery]\n args {\n name \"openai-blog\"\n guidance null\n }\n}\n\ntest TestWebQuerySearch {\n functions [InferWebQuery]\n args {\n name \"latest AI news\"\n guidance null\n }\n}\n\ntest TestWebQuerySearchTopic {\n functions [InferWebQuery]\n args {\n name \"best rust web frameworks\"\n guidance null\n }\n}\n\n// -----------------------------------------------------------------------------\n// Confluence Query Tests\n// -----------------------------------------------------------------------------\n\ntest TestConfluenceRecentPages {\n functions [InferConfluenceQuery]\n args {\n name \"recent-pages\"\n guidance null\n }\n}\n\ntest TestConfluenceEngDocs {\n functions [InferConfluenceQuery]\n args {\n name \"engineering-docs\"\n guidance \"documentation from the engineering space\"\n }\n}\n\ntest TestConfluenceKubernetes {\n functions [InferConfluenceQuery]\n args {\n name \"kubernetes-pages\"\n guidance null\n }\n}\n\ntest TestConfluenceMeetingNotes {\n functions [InferConfluenceQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestConfluenceArchitectureDecisions {\n functions [InferConfluenceQuery]\n args {\n name \"architecture-decisions\"\n guidance \"ADRs and architecture decision records\"\n }\n}\n",
+ "generators.baml": "generator target {\n output_type \"go\"\n output_dir \"../\"\n version \"0.218.1\"\n\n // 'baml-cli generate' will run this after generating go code\n // This command will be run from within $output_dir/baml_client\n on_generate \"bash \\\"$(git rev-parse --show-toplevel)/bin/patch-baml-runtime.sh\\\" && gofmt -w . && goimports -w .\"\n\n // Go packages name as specified in go.mod\n // We need this to generate correct imports in the generated baml_client\n client_package_name \"github.com/beam-cloud/airstore/pkg/sources/queries\"\n}\n",
+ "smart_queries.baml": "// =============================================================================\n// Smart Query BAML Functions\n// Infer source-specific queries from folder/file names\n// =============================================================================\n\nclass GmailQueryResult {\n gmail_query string @description(\"Gmail search query using Gmail's operators\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {date}, {from}, {subject}. Example: {date}_{from}_{subject}_{id}.txt\")\n}\n\nclass GDriveQueryResult {\n gdrive_query string @description(\"Google Drive search query\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {name}, {date}. Example: {name}_{id}\")\n}\n\nclass NotionQueryResult {\n notion_query string @description(\"Notion search query text\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {title}, {date}. Example: {title}_{id}.md\")\n}\n\nclass GitHubQueryResult {\n github_query string @description(\"GitHub query - format depends on search_type\")\n search_type string @description(\"Type: repos, issues, prs, commits, releases, workflows, files, branches\")\n content_type string @description(\"Format: markdown (default), diff (PRs/commits), json (metadata), raw (files)\")\n limit int @description(\"Max results (default: 50)\")\n filename_format string @description(\"Filename template with placeholders like {number}, {title}, {sha}, {name}, {path}\")\n}\n\nclass SlackQueryResult {\n slack_query string @description(\"Slack search query using Slack's search syntax\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. 
Use {id}, {channel}, {user}, {date}, {text}. Example: {date}_{channel}_{user}_{id}.txt\")\n}\n\nclass LinearQueryResult {\n linear_query string @description(\"Linear issue filter query using Linear's filter syntax\")\n search_type string @description(\"Type of search: 'issues' or 'projects'. Default: 'issues'\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {identifier}, {title}, {state}, {assignee}, {team}, {priority}, {date}. Example: {identifier}_{title}.md\")\n}\n\nclass PostHogQueryResult {\n posthog_query string @description(\"Search query string for PostHog API\")\n search_type string @description(\"Type of search: 'events', 'feature-flags', 'insights', 'cohorts'\")\n project_id int @description(\"PostHog project ID to search in. Use 0 to auto-select the first available project.\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Placeholders vary by search_type.\")\n}\n\nclass WebQueryResult {\n web_mode string @description(\"'map' to crawl a specific site, 'search' to find pages via web search. Default: map\")\n web_query string @description(\"For web_mode=map: the URL to crawl. For web_mode=search: the search query string.\")\n include_paths string[] @description(\"URL path patterns to include when web_mode=map (e.g. /cocktails/*). Empty array = all paths. Ignored for search.\")\n limit int @description(\"Max pages to discover (default: 100)\")\n filename_format string @description(\"Template for result filenames. Use {title}, {path}, {id}, {date}. Example: {title}_{id}.md\")\n}\n\n// -----------------------------------------------------------------------------\n// PostHog Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferPostHogQuery(name: string, guidance: string?) 
-> PostHogQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a PostHog query.\n\n SEARCH TYPES:\n 1. feature-flags — Feature flags (toggles). Use when name contains: flag, toggle, feature, rollout, experiment\n 2. insights — Saved queries, funnels, retention, trends. Use when name contains: insight, funnel, retention, trend, dashboard, chart, analytics\n 3. cohorts — User segments/groups. Use when name contains: cohort, segment, group, users, audience\n 4. events — Raw event data (default). Use when name contains: event, pageview, click, action, or nothing specific\n\n POSTHOG STANDARD EVENTS (start with $):\n $pageview, $pageleave, $autocapture, $screen, $exception, $rageclick, $web_vitals\n\n QUERY FORMAT:\n - For events: the event name to filter on (e.g., \"$pageview\", \"signup\", \"purchase\"). Empty string = all recent events.\n - For feature-flags/insights/cohorts: a search term to filter by name/key (e.g., \"beta\", \"onboarding\", \"retention\").\n\n ACTIVE FLAGS:\n If the name implies \"active\" or \"enabled\" flags (e.g., \"active-feature-flags\", \"enabled-flags\"), set search_type to \"feature-flags\" and use a broad query (empty or relevant keyword). 
The provider will filter for active=true.\n\n DEFAULTS:\n - project_id: 0 (auto-select first project)\n - limit: 50\n\n FILENAME FORMATS by search_type:\n - events: {date}_{event}_{id}.json\n - feature-flags: {key}_{id}.json\n - insights: {name}_{id}.json\n - cohorts: {name}_{id}.json\n\n Query examples:\n - \"active-feature-flags\" → search_type: feature-flags, query: \"\" (all flags, provider filters active)\n - \"pageview-events\" → search_type: events, query: \"$pageview\"\n - \"error-events\" → search_type: events, query: \"$exception\"\n - \"retention-insights\" → search_type: insights, query: \"retention\"\n - \"beta-users\" → search_type: cohorts, query: \"beta\"\n - \"signup-funnel\" → search_type: insights, query: \"signup funnel\"\n - \"recent-events\" → search_type: events, query: \"\"\n - \"experiment-flags\" → search_type: feature-flags, query: \"experiment\"\n - \"rage-clicks\" → search_type: events, query: \"$rageclick\"\n - \"web-vitals\" → search_type: events, query: \"$web_vitals\"\n - \"onboarding-cohort\" → search_type: cohorts, query: \"onboarding\"\n\n Folder/file name: {{ name }}\n {% if guidance %}\n\n Additional guidance from user: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Gmail Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGmailQuery(name: string, guidance: string?) 
-> GmailQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Gmail search query.\n \n IMPORTANT: Gmail search is FUZZY by default!\n - Bare words (no operator) search across subject, body, AND sender name/email\n - Use bare words for most searches - they match everywhere!\n - Gmail automatically handles plurals, common misspellings, and related terms\n \n CRITICAL: PREFER BARE WORDS OVER from: FOR FUZZY SENDER MATCHING!\n - from:company only matches if that word appears exactly in sender name/email\n - Bare words match sender, subject, AND body - much more flexible!\n - Many companies use subdomains like noreply@mail.company.com - bare words catch these better\n - ONLY use from: when you need STRICT sender matching (rare)\n \n CRITICAL SYNTAX RULES:\n - Multi-word operator values MUST be quoted: from:\"summit health\" NOT from:summit health\n - Bare multi-word phrases should also be quoted: \"lab results\" NOT lab results\n - Unquoted spaces mean AND: lab results = lab AND results (both must appear)\n - Use OR to match alternatives: mychart OR \"nyu langone\" OR \"summit health\"\n \n Gmail search operators (use sparingly - bare words are usually better!):\n - from:\"name or email\" - STRICT sender match (avoid for fuzzy matching!)\n - subject:\"phrase\" - matches subject line only\n - has:attachment, filename:pdf\n - is:unread, is:starred, is:important\n - newer_than:1d, newer_than:7d, older_than:30d\n \n QUERY STRATEGY:\n 1. For company/brand emails → USE BARE WORDS: nike OR adidas OR \"north face\"\n (This finds emails FROM them, ABOUT them, or MENTIONING them - much better coverage!)\n 2. For topics → bare words or quoted phrases: \"lab results\" OR prescription OR appointment\n 3. For people → from:firstname works but bare word also good: john OR from:john\n 4. For mixed (topic + senders) → all bare words: mychart OR nyu OR \"summit health\" OR prescription\n 5. 
For health/medical → medical OR health OR doctor OR appointment OR mychart OR prescription\n 6. When in doubt → ADD MORE ORs with related terms, synonyms, and variations!\n \n DEFAULTS:\n - limit: 50\n - filename_format: {date}_{from}_{subject}_{id}.txt\n \n Query examples:\n - \"invoices\" → invoice OR receipt OR bill OR statement\n - \"stripe-invoices\" → stripe invoice OR stripe receipt OR stripe payment\n - \"receipts\" → receipt OR invoice OR order OR confirmation OR purchase\n - \"meeting-notes\" → meeting OR agenda OR \"calendar invite\" OR standup OR sync\n - \"flight-confirmation\" → flight OR itinerary OR \"boarding pass\" OR airline OR booking\n - \"health-emails\" → medical OR health OR doctor OR appointment OR mychart OR prescription OR lab\n - \"from-john\" → from:john (explicit \"from\" in name = use the operator)\n - \"emails-from-acme-corp\" → from:\"acme corp\" (explicit \"from\" + multi-word = quoted operator)\n - \"unread\" → is:unread\n - \"project-alpha\" → \"project alpha\" (quoted phrase)\n - \"shipping\" → shipping OR tracking OR delivery OR fedex OR ups OR usps OR package\n - \"recent\" → newer_than:7d\n - \"amazon-orders\" → amazon order OR amazon shipment OR amazon delivery OR amazon purchase\n - \"uber-receipts\" → uber receipt OR uber ride OR lyft receipt OR lyft ride\n \n Guidance examples:\n - \"from nike\" → nike (bare word for fuzzy matching - catches nike.com, news@nike.com, etc.)\n - \"anything from summit health, nyu, mychart, health related\" \n → mychart OR nyu OR \"summit health\" OR medical OR appointment OR prescription OR lab\n (Use bare words with lots of ORs - they match sender names AND content!)\n - \"strictly only emails where the sender is john@company.com\" → from:john@company.com\n (Use from: ONLY when user explicitly wants strict sender matching)\n \n Filename format: {date}_{from}_{subject}_{id}.txt\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif 
%}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Gmail Query Evaluation and Refinement\n// -----------------------------------------------------------------------------\n\nclass GmailQueryEvaluation {\n is_satisfactory bool @description(\"True if results match user intent well enough\")\n refined_query string? @description(\"Improved Gmail query if not satisfactory, null if satisfactory\")\n reasoning string @description(\"Brief explanation of evaluation (1-2 sentences)\")\n}\n\nfunction EvaluateGmailQueryResults(\n original_guidance: string,\n generated_query: string,\n result_count: int,\n sample_results: string\n) -> GmailQueryEvaluation {\n client AnthropicClient\n prompt #\"\n You are a STRICT evaluator checking if Gmail search results actually match what the user asked for.\n Be critical - if results don't clearly match the user's intent, suggest a better query.\n \n USER'S REQUEST: {{ original_guidance }}\n \n CURRENT QUERY: {{ generated_query }}\n \n RESULTS: {{ result_count }} emails found\n \n SAMPLE RESULTS: {{ sample_results }}\n \n STRICT EVALUATION CHECKLIST:\n \n 1. ZERO RESULTS = AUTOMATIC FAIL\n - If result_count is 0, the query is broken. Mark as NOT satisfactory.\n - Common causes:\n * Unquoted multi-word values: from:summit health → from:\"summit health\"\n * Using from: when bare word would match better (from:bonobos might miss noreply@e.bonobos.com)\n \n 2. 
CHECK EACH THING THE USER MENTIONED:\n - If user mentioned specific senders/companies (e.g., \"bonobos\", \"amazon\", \"stripe\"):\n → Are emails FROM or ABOUT these in the results?\n → Note: bare words like \"bonobos\" are GOOD - they match sender, subject, AND body\n → If from: operator fails, suggest using bare word instead for fuzzy matching\n - If user mentioned topics (e.g., \"health\", \"medical\", \"appointments\"):\n → Do the subjects/snippets actually contain these words?\n → Generic results that don't mention the topics = NOT satisfactory.\n \n 3. LOOK FOR OBVIOUS MISMATCHES:\n - Results about completely unrelated topics = NOT satisfactory\n - Missing the main thing the user asked for = NOT satisfactory\n \n 4. QUERY IMPROVEMENTS TO SUGGEST:\n - from:bonobos (0 results) → bonobos (bare word matches subdomains like e.bonobos.com)\n - from:summit health → from:\"summit health\" (MUST quote multi-word)\n - Using from: when company uses weird email domains → use bare word instead\n - Not enough OR alternatives for fuzzy matching\n \n BE HARSH: If you're unsure whether results match, lean toward NOT satisfactory and suggest improvements.\n Only mark satisfactory if the results CLEARLY match what the user asked for.\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Google Drive Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGDriveQuery(name: string, guidance: string?) 
-> GDriveQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Google Drive files.list query (the `q` parameter).\n \n Output MUST be ONLY the query expression (no `q=`, no URL encoding).\n It MUST be valid Drive query syntax.\n \n Drive query syntax:\n - name contains 'text'\n - fullText contains 'text'\n - mimeType = 'application/pdf'\n - mimeType = 'application/vnd.google-apps.document'\n - mimeType = 'application/vnd.google-apps.spreadsheet'\n - mimeType = 'application/vnd.google-apps.presentation'\n - mimeType contains 'image'\n - sharedWithMe = true\n - 'me' in owners\n - 'email@example.com' in owners\n - 'email@example.com' in writers\n - 'email@example.com' in readers\n - starred = true\n - trashed = false\n - modifiedTime > '2024-01-01T00:00:00'\n - createdTime > '2024-01-01T00:00:00'\n - 'FOLDER_ID' in parents\n \n Combine with: and, or, not\n \n IMPORTANT:\n - Always include `trashed = false` unless the user explicitly asks for trash / deleted files.\n - Prefer `fullText contains` for topic/keyword folders (it searches name + indexed content).\n Use `name contains` when the intent is specifically filename-based.\n - Use RFC3339 timestamps for createdTime/modifiedTime. Prefer including a timezone, e.g. 
'2026-01-22T00:00:00Z'.\n \n DEFAULTS:\n - limit: 50\n - filename_format: {name}_{id}\n \n Query examples:\n - \"recent\" → trashed = false and modifiedTime > '2026-01-22T00:00:00Z'\n - \"shared\" → trashed = false and sharedWithMe = true\n - \"my-docs\" → trashed = false and 'me' in owners and mimeType = 'application/vnd.google-apps.document'\n - \"pdfs\" → trashed = false and mimeType = 'application/pdf'\n - \"documents\" → trashed = false and mimeType = 'application/vnd.google-apps.document'\n - \"spreadsheets\" → trashed = false and mimeType = 'application/vnd.google-apps.spreadsheet'\n - \"presentations\" → trashed = false and mimeType = 'application/vnd.google-apps.presentation'\n - \"images\" → trashed = false and mimeType contains 'image/'\n - \"starred\" → trashed = false and starred = true\n - \"college-docs\" → trashed = false and (fullText contains 'college' or name contains 'college') and mimeType = 'application/vnd.google-apps.document'\n - Guidance: \"created in the past week\" → use createdTime > '...'\n - Guidance: \"modified in the past week\" → use modifiedTime > '...'\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {name}, {date}, {created}, {mime_type}, {ext}\n - {name} is the basename (no extension)\n - {ext} is the extension (includes leading dot)\n - For documents: {name}_{id}\n - For dated files: {date}_{name}_{id}\n - For created-time focused queries: {created}_{name}_{id}\n - Default: {name}_{id}\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Notion Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferNotionQuery(name: string, guidance: string?) 
-> NotionQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Notion search query.\n \n Notion search is simple text-based. Extract key search terms from the folder name.\n Remove common words like \"my\", \"the\", \"all\", etc.\n \n DEFAULTS:\n - limit: 50\n - filename_format: {title}_{id}.md\n \n Query examples:\n - \"meeting-notes\" → meeting notes\n - \"project-docs\" → project docs\n - \"todo\" → todo\n - \"weekly-reports\" → weekly reports\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {title}, {type}, {date}, {created}\n - For notes: {title}_{id}.md\n - For dated pages: {date}_{title}_{id}.md\n - Default: {title}_{id}.md\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Slack Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferSlackQuery(name: string, guidance: string?) 
-> SlackQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Slack search query.\n \n Slack search supports various operators for finding messages across channels.\n The query should find relevant messages based on the folder name intent.\n \n SLACK SEARCH OPERATORS:\n - in:#channel-name - messages in a specific channel\n - in:@username - direct messages with a user\n - from:@username - messages from a specific user\n - from:me - messages you sent\n - to:me - messages sent to you (DMs)\n - has:link - messages containing links\n - has:reaction - messages with reactions\n - has:star - starred messages\n - has:pin - pinned messages\n - before:YYYY-MM-DD - messages before a date\n - after:YYYY-MM-DD - messages after a date\n - during:month - messages from a specific month (e.g., during:january)\n - on:YYYY-MM-DD - messages on a specific date\n - is:saved - saved messages\n \n QUERY STRATEGY:\n 1. For channel-specific queries → use in:#channel\n 2. For user-specific queries → use from:@user\n 3. For topic searches → use keywords directly\n 4. For time-based queries → use before:, after:, during:, on:\n 5. For content type → use has:link, has:reaction, etc.\n 6. 
Combine operators with spaces (implicit AND) or OR\n \n DEFAULTS:\n - limit: 50\n - filename_format: {date}_{channel}_{user}_{id}.txt\n \n Query examples:\n - \"engineering-updates\" → in:#engineering OR in:#engineering-updates\n - \"team-announcements\" → in:#announcements OR in:#general announcement\n - \"from-ceo\" → from:@ceo OR from:@founder\n - \"recent-links\" → has:link after:2026-01-01\n - \"pinned-messages\" → has:pin\n - \"design-feedback\" → in:#design feedback OR in:#design-reviews\n - \"standup-notes\" → standup OR \"daily standup\" OR in:#standup\n - \"customer-issues\" → in:#support OR in:#customer-success issue OR bug\n - \"product-discussions\" → in:#product OR product roadmap OR feature\n - \"onboarding\" → onboarding OR \"new hire\" OR welcome\n - \"decisions\" → decision OR decided OR \"we will\" OR approved\n - \"action-items\" → \"action item\" OR TODO OR \"follow up\" OR assigned\n \n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {channel}, {user}, {date}, {text}\n - For channel messages: {date}_{channel}_{user}_{id}.txt\n - For user messages: {date}_{user}_{channel}_{id}.txt\n - Default: {date}_{channel}_{user}_{id}.txt\n \n Folder/file name: {{ name }}\n {% if guidance %}\n \n Additional guidance from user: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Linear Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferLinearQuery(name: string, guidance: string?) -> LinearQueryResult {\n client AnthropicClient\n prompt #\"\n Convert natural language into a Linear search query.\n \n CORE PRINCIPLE: Text search is powerful - use it liberally. 
Only add filters when intent is unambiguous.\n \n SYNTAX:\n - Plain text: searches title + description (very flexible, use this a lot!)\n - assignee:name | assignee:me - who it's assigned to\n - no:assignee - unassigned issues\n - state:backlog | state:todo | state:\"in progress\" | state:done | state:canceled\n - priority:1 (urgent) | priority:2 (high) | priority:3 (medium) | priority:4 (low) \n - label:name - filter by label\n - no:label - unlabeled issues\n - team:KEY - filter by team key\n - project:name - filter by project name\n - creator:name - who created it\n - cycle:current | cycle:next | cycle:previous - filter by cycle\n - estimate:N - filter by point estimate\n - OR - combine alternatives (use generously!)\n \n STRATEGY - Think about what the user ACTUALLY wants:\n \n 1. TOPIC/FEATURE searches → mostly text search\n \"checkout bugs\" → checkout bug\n \"api performance\" → api performance\n \"mobile crashes\" → mobile crash\n \"auth issues\" → auth authentication login\n \n 2. PERSON searches → assignee filter + text backup\n \"john's tickets\" → assignee:john OR john\n \"sarah jones work\" → assignee:sarah OR assignee:jones OR sarah OR jones\n \"assigned to mike\" → assignee:mike\n \"unassigned issues\" → no:assignee\n \n 3. STATUS/PRIORITY searches → filters\n \"my open issues\" → assignee:me state:todo OR state:\"in progress\"\n \"urgent bugs\" → priority:1 bug\n \"blocked tickets\" → blocked OR blocking\n \"stuff in review\" → review\n \"in progress issues\" → state:\"in progress\"\n \n 4. PROJECT/CYCLE searches → project or cycle filter\n \"acme project\" → project:acme\n \"current sprint\" → cycle:current\n \"next sprint issues\" → cycle:next\n \"q1 roadmap\" → q1 roadmap\n \"enterprise features\" → enterprise\n \n 5. 
VAGUE/BROAD searches → text search with synonyms\n \"tech debt\" → \"tech debt\" OR refactor OR cleanup\n \"security\" → security vulnerability\n \"onboarding\" → onboarding OR \"getting started\" OR setup\n \n KEY RULES:\n - When in doubt, use text search - it's flexible\n - Use OR generously to catch variations \n - For multi-word names: assignee:first OR assignee:last (never the full name)\n - Add synonyms for common concepts\n - Don't over-filter - better to return more results than miss things\n \n SEARCH TYPE: \"issues\" (default) or \"projects\" (only if explicitly asking for projects list)\n \n DEFAULTS:\n - limit: 50\n - filename_format: {identifier}_{title}.md\n \n CRITICAL: filename_format MUST contain placeholders like {identifier}, {title}, etc.\n NEVER return a literal filename like \"test.md\" or the folder name as the format.\n The format is a TEMPLATE applied to every result, so it MUST use placeholders.\n \n Folder/file name: {{ name }}\n {% if guidance %}\n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// GitHub Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferGitHubQuery(name: string, guidance: string?) -> GitHubQueryResult {\n client AnthropicClient\n prompt #\"\n Convert natural language into a GitHub query. Think about what an agent would need.\n \n ═══════════════════════════════════════════════════════════════════════════════\n SEARCH TYPES & QUERY FORMATS\n ═══════════════════════════════════════════════════════════════════════════════\n \n 1. 
REPOS (search_type: \"repos\")\n Listing: list:org/ORGNAME or list:user/USERNAME [type:private|public]\n Search: language:go stars:>100 topic:ml org:company\n \n Examples:\n - \"beam-cloud repos\" → list:org/beam-cloud\n - \"private repos in acme\" → list:org/acme type:private\n - \"popular go projects\" → language:go stars:>1000\n \n 2. ISSUES (search_type: \"issues\") \n Query: repo:owner/name [is:open|closed] [label:X] [assignee:X] [author:X]\n \n Examples:\n - \"open bugs in beta9\" → repo:beam-cloud/beta9 is:open label:bug\n - \"my issues\" → assignee:@me is:open\n - \"security issues in react\" → repo:facebook/react label:security\n \n 3. PULL REQUESTS (search_type: \"prs\")\n Query: repo:owner/name [is:open|closed|merged] [author:X] [review:approved]\n \n Examples:\n - \"open PRs in beta9\" → repo:beam-cloud/beta9 is:open\n - \"merged PRs this week\" → repo:owner/repo is:merged merged:>2026-01-26\n - \"PRs needing review\" → repo:owner/repo is:open review:none\n \n 4. COMMITS (search_type: \"commits\")\n Query: repo:owner/name [branch:X] [author:X] [since:DATE] [path:X]\n \n Examples:\n - \"recent commits in beta9\" → repo:beam-cloud/beta9\n - \"commits to main this week\" → repo:owner/repo branch:main since:2026-01-26\n - \"john's commits\" → repo:owner/repo author:john\n - \"changes to src/api\" → repo:owner/repo path:src/api\n \n 5. RELEASES (search_type: \"releases\")\n Query: repo:owner/name [latest] [prerelease:true|false]\n \n Examples:\n - \"beta9 releases\" → repo:beam-cloud/beta9\n - \"latest release of kubernetes\" → repo:kubernetes/kubernetes latest\n - \"stable releases only\" → repo:owner/repo prerelease:false\n \n 6. WORKFLOW RUNS (search_type: \"workflows\")\n Query: repo:owner/name [status:success|failure|in_progress] [branch:X] [event:push|pr]\n \n Examples:\n - \"failed CI in beta9\" → repo:beam-cloud/beta9 status:failure\n - \"recent builds\" → repo:owner/repo\n - \"PR checks\" → repo:owner/repo event:pull_request\n \n 7. 
FILES (search_type: \"files\")\n Query: repo:owner/name path:PATH [ref:branch|tag|sha]\n \n Examples:\n - \"beta9 readme\" → repo:beam-cloud/beta9 path:README.md\n - \"package.json in react\" → repo:facebook/react path:package.json\n - \"config files\" → repo:owner/repo path:*.yaml ref:main\n - \"src directory\" → repo:owner/repo path:src/\n \n 8. BRANCHES (search_type: \"branches\")\n Query: repo:owner/name [protected:true|false]\n \n Examples:\n - \"beta9 branches\" → repo:beam-cloud/beta9\n - \"protected branches\" → repo:owner/repo protected:true\n \n ═══════════════════════════════════════════════════════════════════════════════\n CONTENT TYPES (how to format the content)\n ═══════════════════════════════════════════════════════════════════════════════\n \n - \"markdown\" (default) - Rich formatted content, best for agents to read\n - \"diff\" - Code changes (for PRs/commits)\n - \"json\" - Structured metadata\n - \"raw\" - Unprocessed file content (for files)\n \n ═══════════════════════════════════════════════════════════════════════════════\n FILENAME FORMATS (use appropriate placeholders)\n ═══════════════════════════════════════════════════════════════════════════════\n \n Repos: {name}.md or {full_name}.json\n Issues: {number}_{title}.md\n PRs: {number}_{title}.md or {number}_{title}_diff.md\n Commits: {sha_short}_{message}.md\n Releases: {tag}_{name}.md\n Workflows: {id}_{name}_{status}.md\n Files: {path} (preserve original path/name)\n Branches: {name}.md\n \n ═══════════════════════════════════════════════════════════════════════════════\n SMART DETECTION\n ═══════════════════════════════════════════════════════════════════════════════\n \n Detect intent from keywords:\n - \"repos/repositories/projects\" in org/user → repos listing\n - \"prs/pulls/pull requests/merge requests\" → prs\n - \"issues/bugs/tickets/tasks\" → issues\n - \"commits/history/changes/log\" → commits \n - \"releases/versions/tags\" → releases\n - 
\"ci/cd/builds/workflows/actions/checks\" → workflows\n - \"readme/config/file/code\" → files\n - \"branches\" → branches\n - \"diff/patch/changes\" → content_type: diff\n \n Folder/file name: {{ name }}\n {% if guidance %}\n Additional guidance: {{ guidance }}\n {% endif %}\n \n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Tests\n// -----------------------------------------------------------------------------\n\ntest TestGitHubOpenPRs {\n functions [InferGitHubQuery]\n args {\n name \"open-prs-beta9\"\n guidance null\n }\n}\n\ntest TestGitHubIssues {\n functions [InferGitHubQuery]\n args {\n name \"beta9-bugs\"\n guidance \"show open bugs in the beta9 repo\"\n }\n}\n\ntest TestGitHubRepos {\n functions [InferGitHubQuery]\n args {\n name \"go-cli-tools\"\n guidance null\n }\n}\n\ntest TestGmailUnread {\n functions [InferGmailQuery]\n args {\n name \"unread-emails\"\n guidance null\n }\n}\n\ntest TestGmailFromPerson {\n functions [InferGmailQuery]\n args {\n name \"from-eli\"\n guidance null\n }\n}\n\ntest TestGmailWithGuidance {\n functions [InferGmailQuery]\n args {\n name \"important-emails\"\n guidance \"Only from the last 7 days, max 100 results\"\n }\n}\n\ntest TestGmailInvoices {\n functions [InferGmailQuery]\n args {\n name \"invoices\"\n guidance null\n }\n}\n\ntest TestGmailMeetingNotes {\n functions [InferGmailQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestGmailFlightConfirmation {\n functions [InferGmailQuery]\n args {\n name \"flight-confirmations\"\n guidance null\n }\n}\n\ntest TestGmailReceipts {\n functions [InferGmailQuery]\n args {\n name \"receipts\"\n guidance null\n }\n}\n\ntest TestGmailProjectAlpha {\n functions [InferGmailQuery]\n args {\n name \"project-alpha-emails\"\n guidance null\n }\n}\n\ntest TestGmailShippingUpdates {\n functions [InferGmailQuery]\n args {\n name \"shipping-updates\"\n guidance null\n }\n}\n\ntest 
TestGmailHealthWithGuidance {\n functions [InferGmailQuery]\n args {\n name \"Medical Documents\"\n guidance \"anything from: summit health, nyu, anything health related basically, mychart\"\n }\n}\n\ntest TestGmailMultiWordSender {\n functions [InferGmailQuery]\n args {\n name \"acme-corp-emails\"\n guidance \"from acme corp and related companies\"\n }\n}\n\ntest TestGmailFuzzySenderNike {\n functions [InferGmailQuery]\n args {\n name \"nike-emails\"\n guidance \"from nike\"\n }\n}\n\ntest TestGmailFuzzySenderAmazon {\n functions [InferGmailQuery]\n args {\n name \"amazon-orders\"\n guidance null\n }\n}\n\ntest TestGDriveShared {\n functions [InferGDriveQuery]\n args {\n name \"shared-with-me\"\n guidance null\n }\n}\n\ntest TestGDrivePdfs {\n functions [InferGDriveQuery]\n args {\n name \"pdfs\"\n guidance null\n }\n}\n\ntest TestGDriveCollegeDocs {\n functions [InferGDriveQuery]\n args {\n name \"college docs\"\n guidance null\n }\n}\n\ntest TestGDriveRecentDocsWithGuidance {\n functions [InferGDriveQuery]\n args {\n name \"recent-docs\"\n guidance \"any docs created in the past week\"\n }\n}\n\ntest TestGDriveOwnedByMe {\n functions [InferGDriveQuery]\n args {\n name \"my docs\"\n guidance null\n }\n}\n\ntest TestNotionMeetings {\n functions [InferNotionQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestEvaluateGmailResults_NoResults {\n functions [EvaluateGmailQueryResults]\n args {\n original_guidance \"anything from summit health, nyu, mychart, health related\"\n generated_query \"from:summit health OR from:nyu OR from:mychart\"\n result_count 0\n sample_results \"[]\"\n }\n}\n\ntest TestEvaluateGmailResults_GoodResults {\n functions [EvaluateGmailQueryResults]\n args {\n original_guidance \"invoices from stripe\"\n generated_query \"invoice from:stripe\"\n result_count 15\n sample_results \"[{\\\"from\\\":\\\"Stripe\\\",\\\"subject\\\":\\\"Your invoice from Acme Corp\\\",\\\"snippet\\\":\\\"Invoice 
#12345...\\\"},{\\\"from\\\":\\\"Stripe Billing\\\",\\\"subject\\\":\\\"Invoice for January\\\",\\\"snippet\\\":\\\"Amount due: $99\\\"}]\"\n }\n}\n\ntest TestSlackEngineering {\n functions [InferSlackQuery]\n args {\n name \"engineering-updates\"\n guidance null\n }\n}\n\ntest TestSlackFromCEO {\n functions [InferSlackQuery]\n args {\n name \"from-ceo\"\n guidance \"messages from our CEO\"\n }\n}\n\ntest TestSlackRecentLinks {\n functions [InferSlackQuery]\n args {\n name \"recent-links\"\n guidance null\n }\n}\n\ntest TestSlackStandups {\n functions [InferSlackQuery]\n args {\n name \"standup-notes\"\n guidance null\n }\n}\n\ntest TestSlackDecisions {\n functions [InferSlackQuery]\n args {\n name \"team-decisions\"\n guidance \"important decisions made by the team\"\n }\n}\n\ntest TestLinearMyIssues {\n functions [InferLinearQuery]\n args {\n name \"my-issues\"\n guidance null\n }\n}\n\ntest TestLinearMyBugs {\n functions [InferLinearQuery]\n args {\n name \"my-bugs\"\n guidance \"bugs assigned to me\"\n }\n}\n\ntest TestLinearUrgent {\n functions [InferLinearQuery]\n args {\n name \"urgent-issues\"\n guidance null\n }\n}\n\ntest TestLinearTeamBacklog {\n functions [InferLinearQuery]\n args {\n name \"eng-backlog\"\n guidance \"engineering team backlog\"\n }\n}\n\ntest TestLinearInProgress {\n functions [InferLinearQuery]\n args {\n name \"in-progress\"\n guidance null\n }\n}\n\ntest TestPostHogActiveFlags {\n functions [InferPostHogQuery]\n args {\n name \"active-feature-flags\"\n guidance null\n }\n}\n\ntest TestPostHogPageviewEvents {\n functions [InferPostHogQuery]\n args {\n name \"pageview-events\"\n guidance null\n }\n}\n\ntest TestPostHogRetentionInsights {\n functions [InferPostHogQuery]\n args {\n name \"retention-insights\"\n guidance null\n }\n}\n\ntest TestPostHogBetaUsers {\n functions [InferPostHogQuery]\n args {\n name \"beta-users\"\n guidance null\n }\n}\n\ntest TestPostHogErrorEvents {\n functions [InferPostHogQuery]\n args {\n name 
\"error-events\"\n guidance \"exception and error events from the past week\"\n }\n}\n\ntest TestPostHogSignupFunnel {\n functions [InferPostHogQuery]\n args {\n name \"signup-funnel\"\n guidance null\n }\n}\n\n// -----------------------------------------------------------------------------\n// Confluence Query Inference\n// -----------------------------------------------------------------------------\n\nclass ConfluenceQueryResult {\n cql_query string @description(\"Confluence Query Language (CQL) query string\")\n content_type string @description(\"Content type filter: 'page', 'blogpost', or 'all'. Default: 'all'\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {title}, {space}, {date}, {type}. Example: {space}_{title}_{id}.md\")\n}\n\nfunction InferConfluenceQuery(name: string, guidance: string?) -> ConfluenceQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into a Confluence CQL (Confluence Query Language) query.\n\n CQL SYNTAX:\n - type=page — pages only\n - type=blogpost — blog posts only\n - space=KEY — filter by space key (e.g., space=DEV, space=ENG)\n - title~\"text\" — title contains text (fuzzy match)\n - text~\"text\" — full-text search (searches title + body)\n - label=\"name\" — filter by label\n - creator=\"username\" — filter by creator\n - contributor=\"username\" — filter by contributor\n - lastModified>now(\"-7d\") — modified in the last 7 days\n - lastModified>now(\"-30d\") — modified in the last 30 days\n - lastModified>\"2025-01-01\" — modified after a specific date\n - created>now(\"-7d\") — created in the last 7 days\n - ancestor=12345 — pages under a specific parent\n - AND, OR — combine conditions\n - ORDER BY lastModified DESC, title ASC — sorting\n\n QUERY STRATEGY:\n 1. For topic searches → use text~ for full-text search\n 2. For space-specific → use space=KEY\n 3. 
For recent content → use lastModified>now(\"-Nd\")\n 4. For labeled content → use label=\"name\"\n 5. For specific content types → use type=page or type=blogpost\n 6. Combine operators with AND/OR\n\n DEFAULTS:\n - content_type: \"all\"\n - limit: 50\n - filename_format: {space}_{title}_{id}.md\n\n Query examples:\n - \"recent-pages\" → lastModified>now(\"-7d\") ORDER BY lastModified DESC, content_type: \"page\"\n - \"engineering-docs\" → (space=ENG OR text~\"engineering\") AND type=page ORDER BY lastModified DESC\n - \"kubernetes-pages\" → text~\"kubernetes\" AND type=page ORDER BY lastModified DESC\n - \"dev-blog-posts\" → space=DEV AND type=blogpost ORDER BY lastModified DESC\n - \"important-docs\" → label=\"important\" ORDER BY lastModified DESC\n - \"api-documentation\" → text~\"api\" AND (label=\"documentation\" OR title~\"api\") ORDER BY lastModified DESC\n - \"onboarding\" → text~\"onboarding\" OR title~\"onboarding\" ORDER BY lastModified DESC\n - \"architecture-decisions\" → text~\"architecture\" OR label=\"adr\" ORDER BY lastModified DESC\n - \"meeting-notes\" → title~\"meeting\" OR label=\"meeting-notes\" ORDER BY lastModified DESC\n - \"recent-blog-posts\" → type=blogpost AND lastModified>now(\"-30d\") ORDER BY lastModified DESC\n - \"project-alpha\" → text~\"project alpha\" ORDER BY lastModified DESC\n\n Filename format: Choose a format that makes sense for the query intent.\n Available placeholders: {id}, {title}, {space}, {date}, {type}\n - For space-specific: {title}_{id}.md\n - For cross-space: {space}_{title}_{id}.md\n - For dated content: {date}_{title}_{id}.md\n - Default: {space}_{title}_{id}.md\n\n Folder/file name: {{ name }}\n {% if guidance %}\n\n Additional guidance from user: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// AgentMail Query Inference\n// 
-----------------------------------------------------------------------------\n\nclass AgentMailQueryResult {\n agentmail_query string @description(\"Text filter to match against sender, subject, and body. Empty string = all messages (no filter).\")\n inbox_filter string @description(\"Specific inbox email address to query. Empty string = all inboxes.\")\n limit int @description(\"Max results to return (default: 50)\")\n filename_format string @description(\"Template for result filenames. Use {id}, {date}, {from}, {subject}, {inbox}. Example: {date}_{from}_{subject}_{id}.txt\")\n}\n\nfunction InferAgentMailQuery(name: string, guidance: string?) -> AgentMailQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder/file name into an AgentMail query.\n\n AgentMail manages multiple email inboxes (each is an email address like agent@example.com).\n The query is a simple text filter applied client-side against sender, subject, and body.\n There is no server-side search — use simple keywords.\n\n QUERY STRATEGY:\n 1. For general browsing (recent, all, inbox) → empty query, empty inbox_filter\n 2. For sender-specific → use the sender name/email as query text\n 3. For topic-specific → use topic keywords as query text\n 4. For inbox-specific → set inbox_filter to the inbox address if mentioned\n 5. 
AgentMail has no read/unread distinction — \"unread\" just means recent messages\n\n DEFAULTS:\n - agentmail_query: \"\" (all messages)\n - inbox_filter: \"\" (all inboxes)\n - limit: 50\n - filename_format: {date}_{from}_{subject}_{id}.txt\n\n Query examples:\n - \"recent-emails\" → query: \"\", inbox_filter: \"\"\n - \"unread\" → query: \"\", inbox_filter: \"\"\n - \"emails-from-john\" → query: \"john\", inbox_filter: \"\"\n - \"support-emails\" → query: \"support\", inbox_filter: \"\"\n - \"invoices\" → query: \"invoice\", inbox_filter: \"\"\n - \"meeting-notes\" → query: \"meeting\", inbox_filter: \"\"\n\n Folder/file name: {{ name }}\n {% if guidance %}\n\n Additional guidance from user: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Web Query Inference\n// -----------------------------------------------------------------------------\n\nfunction InferWebQuery(name: string, guidance: string?) -> WebQueryResult {\n client AnthropicClient\n prompt #\"\n Convert a folder name into a web query. Two modes:\n\n web_mode=map — crawl a specific website. web_query must be a valid https:// URL.\n web_mode=search — web search. 
web_query is a search query string (not a URL).\n\n Use web_mode=map when guidance contains a URL or the name implies a specific site.\n Use web_mode=search when the name describes a topic, question, or information need.\n\n EXAMPLES:\n - \"hennessy-cocktails\" (guidance: \"https://www.hennessy.com/en-us/cocktails\")\n → web_mode: \"map\", web_query: \"https://www.hennessy.com/en-us/cocktails\", include_paths: [\"/en-us/cocktails/*\"], limit: 100\n - \"react-docs\"\n → web_mode: \"map\", web_query: \"https://react.dev/reference\", include_paths: [\"/reference/*\"], limit: 100\n - \"python-tutorial\"\n → web_mode: \"map\", web_query: \"https://docs.python.org/3/tutorial/\", include_paths: [\"/3/tutorial/*\"], limit: 50\n - \"latest AI news\"\n → web_mode: \"search\", web_query: \"latest AI news 2025\", include_paths: [], limit: 20\n - \"best rust web frameworks\"\n → web_mode: \"search\", web_query: \"best rust web frameworks comparison\", include_paths: [], limit: 15\n - \"climate change research papers\"\n → web_mode: \"search\", web_query: \"climate change research papers 2025\", include_paths: [], limit: 20\n\n Defaults: web_mode=map, limit=100, filename_format={title}_{id}.md\n\n Folder name: {{ name }}\n {% if guidance %}\n User guidance: {{ guidance }}\n {% endif %}\n\n {{ ctx.output_format }}\n \"#\n}\n\n// -----------------------------------------------------------------------------\n// Web Query Tests\n// -----------------------------------------------------------------------------\n\ntest TestWebQueryWithGuidance {\n functions [InferWebQuery]\n args {\n name \"hennessy-cocktails\"\n guidance \"https://www.hennessy.com/en-us/cocktails?page=1\"\n }\n}\n\ntest TestWebQueryDocs {\n functions [InferWebQuery]\n args {\n name \"react-docs\"\n guidance null\n }\n}\n\ntest TestWebQueryBlog {\n functions [InferWebQuery]\n args {\n name \"openai-blog\"\n guidance null\n }\n}\n\ntest TestWebQuerySearch {\n functions [InferWebQuery]\n args {\n name \"latest AI 
news\"\n guidance null\n }\n}\n\ntest TestWebQuerySearchTopic {\n functions [InferWebQuery]\n args {\n name \"best rust web frameworks\"\n guidance null\n }\n}\n\n// -----------------------------------------------------------------------------\n// Confluence Query Tests\n// -----------------------------------------------------------------------------\n\ntest TestConfluenceRecentPages {\n functions [InferConfluenceQuery]\n args {\n name \"recent-pages\"\n guidance null\n }\n}\n\ntest TestConfluenceEngDocs {\n functions [InferConfluenceQuery]\n args {\n name \"engineering-docs\"\n guidance \"documentation from the engineering space\"\n }\n}\n\ntest TestConfluenceKubernetes {\n functions [InferConfluenceQuery]\n args {\n name \"kubernetes-pages\"\n guidance null\n }\n}\n\ntest TestConfluenceMeetingNotes {\n functions [InferConfluenceQuery]\n args {\n name \"meeting-notes\"\n guidance null\n }\n}\n\ntest TestConfluenceArchitectureDecisions {\n functions [InferConfluenceQuery]\n args {\n name \"architecture-decisions\"\n guidance \"ADRs and architecture decision records\"\n }\n}\n\n// -----------------------------------------------------------------------------\n// AgentMail Query Tests\n// -----------------------------------------------------------------------------\n\ntest TestAgentMailRecent {\n functions [InferAgentMailQuery]\n args {\n name \"recent-emails\"\n guidance null\n }\n}\n\ntest TestAgentMailFromJohn {\n functions [InferAgentMailQuery]\n args {\n name \"emails-from-john\"\n guidance null\n }\n}\n\ntest TestAgentMailInvoices {\n functions [InferAgentMailQuery]\n args {\n name \"invoices\"\n guidance null\n }\n}\n\ntest TestAgentMailWithGuidance {\n functions [InferAgentMailQuery]\n args {\n name \"support-emails\"\n guidance \"emails about customer support issues\"\n }\n}\n",
}
</file context>
inboxes, _, err := a.client.ListInboxes(ctx, 100, "")
if err != nil {
    return nil, err
}
for _, inbox := range inboxes {
    inboxIDs = append(inboxIDs, inbox.InboxID)
}
P2: ExecuteQuery only reads the first page of inboxes, so results silently omit inboxes after the first 100.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At pkg/sources/providers/agentmail.go, line 192:
<comment>ExecuteQuery only reads the first page of inboxes, so results silently omit inboxes after the first 100.</comment>
<file context>
@@ -165,6 +167,165 @@ func (a *AgentMailProvider) Search(_ context.Context, _ *sources.ProviderContext
+ if inboxFilter != "" {
+ inboxIDs = []string{inboxFilter}
+ } else {
+ inboxes, _, err := a.client.ListInboxes(ctx, 100, "")
+ if err != nil {
+ return nil, err
</file context>
Suggested change
- inboxes, _, err := a.client.ListInboxes(ctx, 100, "")
- if err != nil {
-     return nil, err
- }
- for _, inbox := range inboxes {
-     inboxIDs = append(inboxIDs, inbox.InboxID)
- }
+ pageToken := ""
+ for {
+     inboxes, nextPageToken, err := a.client.ListInboxes(ctx, 100, pageToken)
+     if err != nil {
+         return nil, err
+     }
+     for _, inbox := range inboxes {
+         inboxIDs = append(inboxIDs, inbox.InboxID)
+     }
+     if nextPageToken == "" {
+         break
+     }
+     pageToken = nextPageToken
+ }
Summary by cubic
Adds AgentMail as a first-class source and tool so you can browse inboxes, read threads and messages, and send/reply via a shared server API key. Also adds cross-provider email thread views (Gmail and Outlook), inbound webhook routing, and runtime routing so replies come from the right inbox.
New Features
- `agentmail` with a browsable tree: `README.md`, `recent.json`, `inboxes/{inbox_id}/messages/{sender}/{date_subject}/{meta.json|body.txt}`, and `inboxes/{inbox_id}/threads/{thread_id}/{meta.json|messages.txt}`.
- `ListInboxes`, `ListMessages`, `GetMessage`, `ListThreads`, `GetThread`, `ReplyToMessage`.
- `to`/`reply_to`, and forwards to agents.
- `AIRSTORE_AGENT_ROUTING_JSON`, injects a prompt hint, and binds a default sender inbox from agent/workspace channel bindings.
- `agentmail` filter (inbox/from/subject).
- `agentmail` query inference and default filename format support.
- `list-messages`, `get-message`, `get-thread`, `send`, `reply`. Gateway builds summaries for `send`/`reply`. Tools run with server-level auth (`AuthNone`) and don’t require user connections. Auto-registers `agentmail` source and tools when configured.

Migration
- `channels.agentMail.apiKey`, `channels.agentMail.baseURL`, and `channels.agentMail.domain` in server config to enable. No user OAuth; uses shared server credentials.

Written for commit 59586a8. Summary will update on new commits.
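The AgentMail query inference prompt quoted in the file context names the placeholders `{id}`, `{date}`, `{from}`, `{subject}`, and `{inbox}` and a default format of `{date}_{from}_{subject}_{id}.txt`. As a rough sketch of how such a template might be expanded, assuming a hypothetical `message` struct, a hypothetical `expandFilename` helper, and an assumed sanitization rule (none of these come from the PR):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// defaultFormat is the default quoted in the inference prompt.
const defaultFormat = "{date}_{from}_{subject}_{id}.txt"

// message is a hypothetical struct holding one value per placeholder.
type message struct {
	ID, Date, From, Subject, Inbox string
}

// unsafeChars matches runs of characters that this sketch treats as
// awkward in filenames. The exact policy is an assumption.
var unsafeChars = regexp.MustCompile(`[^A-Za-z0-9@._-]+`)

// sanitize collapses each unsafe run into one hyphen and trims the ends.
func sanitize(s string) string {
	return strings.Trim(unsafeChars.ReplaceAllString(s, "-"), "-")
}

// expandFilename substitutes the placeholders in format, falling back
// to defaultFormat when format is empty.
func expandFilename(format string, m message) string {
	if format == "" {
		format = defaultFormat
	}
	r := strings.NewReplacer(
		"{id}", sanitize(m.ID),
		"{date}", sanitize(m.Date),
		"{from}", sanitize(m.From),
		"{subject}", sanitize(m.Subject),
		"{inbox}", sanitize(m.Inbox),
	)
	return r.Replace(format)
}

func main() {
	m := message{
		ID:      "msg_1",
		Date:    "2025-01-02",
		From:    "john@example.com",
		Subject: "Invoice #42 / March",
		Inbox:   "agent@example.com",
	}
	fmt.Println(expandFilename("", m))
}
```

Sanitizing each value before substitution keeps a subject line containing `/` from creating unintended path segments; whether the provider applies a similar rule is not stated in the summary.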