Skip to content

feat(knowledge): add Live sync option to KB connectors + fix embedding billing#3955

Closed
waleedlatif1 wants to merge 13 commits intostagingfrom
feat/kb-connector-live-sync-option
Closed

feat(knowledge): add Live sync option to KB connectors + fix embedding billing#3955
waleedlatif1 wants to merge 13 commits intostagingfrom
feat/kb-connector-live-sync-option

Conversation

@waleedlatif1
Copy link
Copy Markdown
Collaborator

Re-opening of #3916 — originally merged into staging but lost when the branch was accidentally deleted and recreated from main.

TheodoreSpeaks and others added 13 commits April 3, 2026 15:24
* feat(block): add cloudwatch integration

* Fix bun lock

* Add logger, use execution timeout

* Switch metric dimensions to map style input

* Fix attribute names for dimension map

* Fix import styling

---------

Co-authored-by: Theodore Li <theo@sim.ai>
…nterprise users

Adds a "Live" (every 5 min) sync frequency option gated to Max and Enterprise plan users.
Includes client-side badge + disabled state, shared sync intervals constant, and server-side
plan validation on both POST and PATCH connector routes.
Adds billing tracking to the KB embedding pipeline, which was previously
generating OpenAI API calls with no cost recorded. Token counts are now
captured from the actual API response and recorded via recordUsage after
successful embedding insertion. BYOK workspaces are excluded from billing.
Applies to all execution paths: direct, BullMQ, and Trigger.dev.
…n modelName

- Use calculateCost() from @/providers/utils instead of inline formula, consistent
  with how LLM billing works throughout the platform
- Return modelName from GenerateEmbeddingsResult so billing uses the actual model
  (handles custom Azure deployments) instead of a hardcoded fallback string
- Fix docs-chunker.ts empty-path fallback to satisfy full GenerateEmbeddingsResult type
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Apr 4, 2026 11:43pm

Request Review

@cursor
Copy link
Copy Markdown

cursor bot commented Apr 4, 2026

PR Summary

High Risk
High risk due to new AWS CloudWatch tool endpoints that accept AWS credentials and significant changes to embedding generation/billing and connector sync gating logic, which could impact costs and access control if incorrect.

Overview
Adds a “Live” (5-minute) sync option for knowledge-base connectors, with UI gating (Max badge/disabled option) and server-side enforcement via new hasLiveSyncAccess() checks when creating/updating connectors.

Reworks embeddings generation to return { embeddings, totalTokens, isBYOK, modelName }, updates callers accordingly, and records/bills embedding usage during KB document processing (skipping billing for BYOK and handling missing pricing).

Introduces a new AWS CloudWatch tool integration (blocks + tools + API routes + selectors) covering Log Insights queries, log group/stream discovery, log event retrieval, metrics listing/statistics, and alarm listing; includes new AWS SDK dependencies and selector context fields.

Reviewed by Cursor Bugbot for commit 80e97aa. Configure here.

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 80e97aa. Configure here.

): Promise<ChunkData> {
logger.info(`[${requestId}] Generating embedding for manual chunk`)
const embeddings = await generateEmbeddings([chunkData.content], undefined, workspaceId)
const { embeddings } = await generateEmbeddings([chunkData.content], undefined, workspaceId)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Embedding billing skipped for individual chunk create/update operations

Medium Severity

The generateEmbeddings call in createChunk and updateChunk now returns { embeddings, totalTokens, isBYOK, modelName }, but only embeddings is destructured. The totalTokens, isBYOK, and modelName values are discarded, so no billing is recorded for embeddings generated during manual chunk creation or content updates. The document processing path in documents/service.ts correctly uses these values to call recordUsage, but the chunk service paths silently skip billing for the same embedding API calls.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 80e97aa. Configure here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR adds a Live sync option (every 5 minutes) for KB connectors gated behind the Max/Enterprise plan, fixes embedding token billing so document processing costs are properly recorded, and introduces a comprehensive AWS CloudWatch integration with 7 operations across logs, metrics, and alarms.

Key changes:

  • SYNC_INTERVALS extracted to a shared consts.ts and a new { label: 'Live', value: 5, requiresMax: true } entry added; the UI disables and badges this option for non-Max users
  • Server-side hasLiveSyncAccess() guard added in both POST (create) and PATCH (update) connector routes — any syncIntervalMinutes in the range (0, 60) requires a Max/Enterprise subscription
  • generateEmbeddings() now returns { embeddings, totalTokens, isBYOK, modelName } instead of number[][]; all call-sites updated; processDocumentAsync uses the token count + BYOK flag to record a knowledge-base usage entry via recordUsage
  • New CloudWatch block, tools, API routes, selector definitions, and DB migration (adds knowledge-base to the usage_log_source enum)

Confidence Score: 5/5

Safe to merge — no P0/P1 issues found; all remaining findings are style/quality suggestions.

The two core features (live-sync billing gate and embedding token billing) are correctly implemented end-to-end with both client-side gating and server-side enforcement. The CloudWatch integration follows existing patterns. All call-sites of the refactored generateEmbeddings were updated. Only minor style issues remain (import ordering in 3 files, first-batch-only BYOK capture), none of which affect runtime behaviour.

apps/sim/app/api/tools/cloudwatch/describe-log-streams/route.ts, apps/sim/app/api/tools/cloudwatch/get-log-events/route.ts, apps/sim/app/api/tools/cloudwatch/query-logs/route.ts (import ordering)

Important Files Changed

Filename Overview
apps/sim/lib/knowledge/embeddings.ts Refactored generateEmbeddings to return GenerateEmbeddingsResult (embeddings, totalTokens, isBYOK, modelName) for billing; callEmbeddingAPI updated to also return totalTokens from usage.total_tokens
apps/sim/lib/knowledge/documents/service.ts Added post-processing billing: accumulates totalEmbeddingTokens across batches, then calls recordUsage and checkAndBillOverageThreshold if !isBYOK; minor fragility in first-batch-only capture of isBYOK/modelName
apps/sim/lib/billing/core/subscription.ts New hasLiveSyncAccess() function mirrors hasInboxAccess(); checks Max plan (>=25k credits) or Enterprise; parallel Promise.all for efficiency
apps/sim/app/api/knowledge/[id]/connectors/route.ts Server-side live sync gate on POST: returns 403 if syncIntervalMinutes in (0,60) and user lacks Max/Enterprise
apps/sim/app/api/knowledge/[id]/connectors/[connectorId]/route.ts Same gate applied on PATCH with undefined check before range check; correct enforcement
apps/sim/app/workspace/[workspaceId]/knowledge/[id]/components/consts.ts Extracted SYNC_INTERVALS with new Live (5 min) entry carrying requiresMax: true; used by both add/edit modals
apps/sim/app/workspace/[workspaceId]/knowledge/[id]/components/add-connector-modal/add-connector-modal.tsx Reads subscription data to compute hasMaxAccess; disables Live button and renders MaxBadge for non-Max users
apps/sim/app/workspace/[workspaceId]/knowledge/[id]/components/edit-connector-modal/edit-connector-modal.tsx Same Live sync gating applied via hasMaxAccess prop threaded to SettingsTab
apps/sim/blocks/blocks/cloudwatch.ts Full-featured CloudWatch block with 7 operations, cascading file-selectors for log groups/streams, Wand AI support for query generation, and complete param normalisation in tools.config.params
apps/sim/app/api/tools/cloudwatch/utils.ts Shared CloudWatch Logs client factory; pollQueryResults with timeout fallback returning partial results; describeLogStreams and getLogEvents helpers
apps/sim/hooks/selectors/registry.ts cloudwatch.logGroups and cloudwatch.logStreams selectors added; logStreams correctly cascades from logGroupName context
packages/db/migrations/0185_new_gravity.sql Adds 'knowledge-base' to usage_log_source enum — required for recordUsage calls in document processing
apps/sim/app/api/tools/cloudwatch/query-logs/route.ts Log Insights query route; polls until Complete/Failed/Cancelled or timeout; import ordering issue (import after const logger)
apps/sim/app/api/tools/cloudwatch/describe-log-streams/route.ts Correctly uses checkSessionOrInternalAuth for browser selector + executor dual-path; import ordering issue
apps/sim/app/api/tools/cloudwatch/get-log-events/route.ts Uses logGroupIdentifier (accepts names and ARNs); import ordering issue

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User sets sync interval] --> B{syncIntervalMinutes in 0..60?}
    B -- No --> C[Allow — no billing check]
    B -- Yes = 5 min Live --> D[hasLiveSyncAccess]
    D -- false --> E[403: Requires Max/Enterprise]
    D -- true --> F[Create/Update connector]

    G[processDocumentAsync] --> H[generateEmbeddings per batch]
    H --> I{isBYOK?}
    I -- Yes BYOK key --> J[Skip billing — user's own key]
    I -- No platform key --> K[Accumulate totalEmbeddingTokens]
    K --> L[calculateCost]
    L --> M[recordUsage source=knowledge-base]
    M --> N[checkAndBillOverageThreshold]
Loading

Reviews (1): Last reviewed commit: "fix(knowledge): call checkAndBillOverage..." | Re-trigger Greptile


const logger = createLogger('CloudWatchDescribeLogStreams')

import { createCloudWatchLogsClient, describeLogStreams } from '@/app/api/tools/cloudwatch/utils'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Import placed after const declaration

The import statement on line 8 appears after the const logger = ... initialisation on line 6. While JavaScript hoists import declarations so this is valid at runtime, it violates standard module conventions and will likely cause ESLint import/first rule violations. The same pattern appears in get-log-events/route.ts (line 8) and query-logs/route.ts (line 9). All three should move the import to the top of the file with the other imports.

Suggested change
import { createCloudWatchLogsClient, describeLogStreams } from '@/app/api/tools/cloudwatch/utils'
import { createCloudWatchLogsClient, describeLogStreams } from '@/app/api/tools/cloudwatch/utils'
const logger = createLogger('CloudWatchDescribeLogStreams')

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!


const logger = createLogger('CloudWatchGetLogEvents')

import { createCloudWatchLogsClient, getLogEvents } from '@/app/api/tools/cloudwatch/utils'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Import placed after const declaration

Same import-ordering issue as describe-log-streams/route.ts. The import of createCloudWatchLogsClient and getLogEvents from the cloudwatch utils should be moved before const logger = ....

Suggested change
import { createCloudWatchLogsClient, getLogEvents } from '@/app/api/tools/cloudwatch/utils'
import { createCloudWatchLogsClient, getLogEvents } from '@/app/api/tools/cloudwatch/utils'
const logger = createLogger('CloudWatchGetLogEvents')

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!


const logger = createLogger('CloudWatchQueryLogs')

import { createCloudWatchLogsClient, pollQueryResults } from '@/app/api/tools/cloudwatch/utils'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Import placed after const declaration

Same import-ordering issue. Move createCloudWatchLogsClient / pollQueryResults import above the const logger line.

Suggested change
import { createCloudWatchLogsClient, pollQueryResults } from '@/app/api/tools/cloudwatch/utils'
import { createCloudWatchLogsClient, pollQueryResults } from '@/app/api/tools/cloudwatch/utils'
const logger = createLogger('CloudWatchQueryLogs')

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +519 to +523
totalEmbeddingTokens += batchTokens
if (i === 0) {
embeddingIsBYOK = isBYOK
embeddingModelName = modelName
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 BYOK/model captured only from first batch

embeddingIsBYOK and embeddingModelName are only set when i === 0. Since isBYOK and modelName derive entirely from getEmbeddingConfig() (same workspace, same invocation environment), they cannot differ across batches within the same document processing run. The code is correct today, but future changes that modify getEmbeddingConfig's behaviour per-batch would silently use stale values. Consider capturing them unconditionally (or at least adding a comment explaining why first-batch capture is sufficient).

Suggested change
totalEmbeddingTokens += batchTokens
if (i === 0) {
embeddingIsBYOK = isBYOK
embeddingModelName = modelName
}
totalEmbeddingTokens += batchTokens
// isBYOK and modelName are constant per workspace — safe to overwrite each iteration
embeddingIsBYOK = isBYOK
embeddingModelName = modelName

Comment on lines 655 to 699

const processingTime = Date.now() - startTime
logger.info(`[${documentId}] Successfully processed document in ${processingTime}ms`)

if (!embeddingIsBYOK && totalEmbeddingTokens > 0 && kb[0].userId) {
try {
const costMultiplier = getCostMultiplier()
const { total: cost } = calculateCost(
embeddingModelName,
totalEmbeddingTokens,
0,
false,
costMultiplier
)
if (cost > 0) {
await recordUsage({
userId: kb[0].userId,
workspaceId: kb[0].workspaceId ?? undefined,
entries: [
{
category: 'model',
source: 'knowledge-base',
description: embeddingModelName,
cost,
metadata: { inputTokens: totalEmbeddingTokens, outputTokens: 0 },
},
],
additionalStats: {
totalTokensUsed: sql`total_tokens_used + ${totalEmbeddingTokens}`,
},
})
await checkAndBillOverageThreshold(kb[0].userId)
} else {
logger.warn(
`[${documentId}] Embedding model "${embeddingModelName}" has no pricing entry — billing skipped`,
{ totalEmbeddingTokens, embeddingModelName }
)
}
} catch (billingError) {
logger.error(`[${documentId}] Failed to record embedding usage`, { error: billingError })
}
}
} catch (error) {
const processingTime = Date.now() - startTime
const errorMessage = error instanceof Error ? error.message : 'Unknown error'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Billing code unreachable when withTimeout throws

The billing block sits after withTimeout(...) but inside the outer try. If withTimeout rejects (e.g. the processing timeout fires after several embedding batches have already completed and totalEmbeddingTokens > 0), execution jumps straight to the outer catch and the recordUsage call is never reached — those tokens are consumed but not billed.

This is an under-billing edge case rather than a runtime error, and the current structure is intentional to keep billing concerns separate from error handling. Worth adding a comment so the next developer understands why partial completions aren't billed, e.g.:

// Note: billing only fires on full success; tokens consumed in a timed-out or
// error-aborted run are not recorded. This is an intentional under-billing
// tradeoff to keep the billing path separate from error paths.

@waleedlatif1 waleedlatif1 deleted the feat/kb-connector-live-sync-option branch April 5, 2026 01:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants