Skip to content

Rubric Gap Analysis — New rule needed: query-olap-detection — No warning against analytical queries on transactional containers #49

@jaydestro

Description

@jaydestro

#49 — Rubric Gap Analysis — New rule needed: query-olap-detection — No warning against analytical queries on transactional containers

Field Value
Type New Rule
Proposed Rule query-olap-detection
Category query-*
Severity HIGH
Source SCOPE Rubric Criteria — Query Optimization, Criterion 10 (OLAP Detection)
Labels enhancement, SCOPE, agent-kit, rule:query

Summary

No existing rule warns against running large aggregations (COUNT, SUM, AVG over unbounded data), GROUP BY across all partitions, or full-container scans for reporting purposes against transactional Cosmos DB containers. The SQL-like query syntax makes Cosmos DB appear suitable for analytical workloads, but these queries consume massive RU, trigger sustained 429 throttling that impacts transactional traffic, and can time out against the 5-second execution limit.

Rubric Gap Analysis

During rubric criteria review for query optimization, this was identified as a platform constraint with no corresponding rule. The existing monitoring-ru-consumption rule covers tracking RU costs but does not warn about the fundamental mismatch between analytical query patterns and transactional container design. No query-* rule mentions Synapse Link, analytical store, or the HTAP separation pattern.

Evidence

Incorrect Pattern — Analytical queries on transactional container

// ❌ Unbounded aggregation across all partitions
var query = "SELECT c.region, COUNT(1) as orderCount, SUM(c.total) as revenue " +
            "FROM c WHERE c.orderDate >= '2025-01-01' GROUP BY c.region";

var iterator = container.GetItemQueryIterator<dynamic>(query);
// Fans out to ALL partitions, reads ALL matching documents
// At 10M orders: potentially 50,000+ RU per execution
// Blocks transactional traffic with sustained high RU consumption
# ❌ Dashboard refreshing aggregations against transactional store
def get_dashboard_metrics(self):
    queries = [
        "SELECT VALUE COUNT(1) FROM c",                           # Full scan
        "SELECT c.status, COUNT(1) FROM c GROUP BY c.status",     # Unbounded GROUP BY
        "SELECT VALUE AVG(c.responseTime) FROM c WHERE c.type = 'request'"  # Cross-partition AVG
    ]
    # Each query scans the entire container
    # Running these every 30 seconds for a dashboard = sustained throttling
// ❌ Reporting query running against operational container
@Query("SELECT c.category, SUM(c.quantity) as totalSold, AVG(c.price) as avgPrice " +
       "FROM c WHERE c.type = 'sale' GROUP BY c.category")
List<CategoryReport> getCategorySalesReport();
// Full cross-partition scan + aggregation — hundreds of thousands of RU
// Competes with real-time order processing for the same throughput budget

Correct Pattern — Synapse Link / analytical store / materialized views

// ✅ Enable analytical store on the container
var containerProperties = new ContainerProperties
{
    Id = "orders",
    PartitionKeyPath = "/customerId",
    AnalyticalStoreTimeToLiveInSeconds = -1  // Enable analytical store
};

// ✅ Run aggregations via Synapse Link (no RU consumed on transactional store)
// In Synapse SQL or Spark:
// SELECT region, COUNT(*) as orderCount, SUM(total) as revenue
// FROM cosmos_db.orders WHERE orderDate >= '2025-01-01' GROUP BY region
// ✅ Pre-compute via Change Feed materialized views
// Real-time aggregation maintained by Change Feed processor
public class SalesAggregate
{
    public string Id { get; set; }           // "category-electronics"
    public string PartitionKey { get; set; } // "aggregates"
    public string Category { get; set; }
    public long TotalSold { get; set; }
    public decimal AveragePrice { get; set; }
    public DateTime LastUpdated { get; set; }
}

// Dashboard reads pre-computed aggregates: 1 RU per read
// Instead of recalculating from millions of source documents
// ✅ Single-partition aggregation is acceptable when scoped and bounded
var query = new QueryDefinition(
    "SELECT VALUE COUNT(1) FROM c WHERE c.customerId = @cid AND c.status = 'pending'")
    .WithParameter("@cid", customerId);

var iterator = container.GetItemQueryIterator<int>(query,
    requestOptions: new QueryRequestOptions
    {
        PartitionKey = new PartitionKey(customerId)  // Scoped to ONE partition
    });
// Single-partition, bounded data — acceptable cost

Impact

  • Throughput starvation: Analytical queries consume the entire RU budget, throttling transactional operations (point reads, writes) with 429 responses
  • Cascading failures: Sustained throttling from reporting queries propagates to all application operations sharing the container
  • Timeout failures: Large aggregations exceed the 5-second query execution timeout, producing partial or failed results
  • Cost amplification: Running these queries on a schedule (dashboard refresh) multiplies the damage

Recommended New Rule

Create query-olap-detection.md:

Do not run large aggregations, unbounded GROUP BY, or full-container scans against transactional Cosmos DB containers.

Cosmos DB's transactional store is optimized for OLTP: point reads, targeted queries within a partition, and bounded result sets. Analytical patterns — COUNT/SUM/AVG across all partitions, GROUP BY over unbounded data, or full-container scans — consume massive RU, trigger 429 throttling that impacts transactional operations, and can time out.

For analytical workloads, use:

  • Azure Synapse Link + Analytical Store: Zero-ETL, no RU impact on transactional store
  • Change Feed materialized views: Pre-compute aggregates incrementally
  • Dedicated reporting containers: Populated via Change Feed with query-optimized partition keys

Single-partition aggregations scoped to a known partition key with bounded data are acceptable — the concern is unbounded cross-partition scans.

References

Metadata

Metadata

Labels

SCOPEIssues generated by SCOPE toolagent-kitIssues requiring updates to cosmosdb-best-practices Agent Kit rulesenhancementNew feature or requestrule:queryCosmos DB query rule enhancement

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions