
Rubric Gap Analysis — monitoring-ru-consumption enhancement: Add per-query development-time RU inspection practice #56

@jaydestro

Description


Type: Rule Enhancement
Target Rule: monitoring-ru-consumption
Severity: MEDIUM
Source: SCOPE Rubric Criteria — Query Optimization, Criterion 9 (Query Metrics & RU Accountability)
Labels: enhancement, SCOPE, agent-kit, rule:monitoring

Summary

The monitoring-ru-consumption rule comprehensively covers production-time RU tracking (response RequestCharge, middleware handlers, Azure Monitor KQL queries) but does not frame RU inspection as a development-time practice. As a result, queries reach production without their RU cost ever being inspected during development. A query that costs 5 RU against 100 documents may cost 5,000 RU in production against 1 million documents if it triggers a scan. Without establishing RU expectations during development, regressions are invisible until throttling occurs in production.

Scoping Note — Careful Framing Required

The existing rule already covers the mechanism for RU tracking (RequestCharge, PopulateIndexMetrics, middleware, KQL). This enhancement should add a development workflow section, not duplicate the existing production monitoring content. The gap is: when to inspect (during development, not just in production) and what to do with the numbers (set expectations, assert in tests, detect regressions).

Rubric Gap Analysis

During rubric criteria review for query optimization, this was identified as a cross-cutting practice gap. The existing rule shows how to read RU costs but not how to act on them during the development lifecycle. Query metrics are the primary feedback loop for all other query optimization rules — without development-time inspection, developers cannot verify whether their optimizations (projections, index tuning, filter ordering) are actually effective.

Evidence

Existing Rule Coverage

The rule currently covers:

  • response.RequestCharge logging per operation
  • Per-page RU tracking for queries (while (iterator.HasMoreResults))
  • Expensive query detection (if (totalRU > 100))
  • Custom RequestHandler middleware for global RU tracking
  • Azure Monitor KQL queries for production analysis
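For context, the middleware item above can be sketched as follows. This is a minimal illustration, not the rule's exact code: it assumes the Microsoft.Azure.Cosmos .NET SDK, and the handler class name and console logging are hypothetical choices.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical RequestHandler that logs the RU charge of every operation.
// Registered once on the client, it covers point reads, writes, and query pages alike.
public class RuLoggingHandler : RequestHandler
{
    public override async Task<ResponseMessage> SendAsync(
        RequestMessage request, CancellationToken cancellationToken)
    {
        ResponseMessage response = await base.SendAsync(request, cancellationToken);

        // Headers.RequestCharge is populated on every response, including errors.
        Console.WriteLine(
            $"{request.Method} {request.RequestUri} -> {response.Headers.RequestCharge:F1} RU");
        return response;
    }
}

// Wiring it up (illustrative):
// var client = new CosmosClient(connectionString,
//     new CosmosClientOptions { CustomHandlers = { new RuLoggingHandler() } });
```

Because the handler sits in the client pipeline, it captures RU for every request without touching call sites, which is what makes it suitable for the global tracking the rule describes.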

What's missing: framing these as development-time practices with concrete workflow guidance.

Missing Development-Time Practice

// ❌ Current typical agent output: RU tracking exists but only in production
public async Task<List<Order>> GetActiveOrders(string customerId)
{
    // customerId parameter is never used — no partition key filter, so every partition is scanned
    var query = "SELECT * FROM c WHERE c.status = 'active'";
    var iterator = container.GetItemQueryIterator<Order>(query);
    var results = new List<Order>();
    while (iterator.HasMoreResults)
    {
        var page = await iterator.ReadNextAsync();
        results.AddRange(page);
        // No RU logging, no metrics — "it works" during development
        // In production with 500K documents: 2,500 RU per call, 429 throttling
    }
    return results;
}

Development-Time Inspection Practice

// ✅ Inspect RU during development — verify optimization decisions
public async Task<List<Order>> GetActiveOrders(string customerId)
{
    var query = new QueryDefinition(
        "SELECT c.id, c.orderDate, c.total FROM c WHERE c.customerId = @cid AND c.status = 'active'")
        .WithParameter("@cid", customerId);

    var options = new QueryRequestOptions
    {
        PartitionKey = new PartitionKey(customerId),
        PopulateIndexMetrics = true  // ← Enable during development
    };

    var iterator = container.GetItemQueryIterator<OrderSummary>(query, requestOptions: options);
    var results = new List<OrderSummary>();
    double totalRU = 0;

    while (iterator.HasMoreResults)
    {
        var page = await iterator.ReadNextAsync();
        results.AddRange(page);
        totalRU += page.RequestCharge;

        // Development-time inspection:
        Console.WriteLine($"  Page: {page.Count} items, {page.RequestCharge:F1} RU");
        Console.WriteLine($"  Index metrics: {page.IndexMetrics}");
    }

    Console.WriteLine($"  TOTAL: {results.Count} items, {totalRU:F1} RU");
    // Expected: < 10 RU for a single-partition query returning ~50 items
    // If actual is 500+ RU → index miss or cross-partition scan detected!
    
    return results;
}
// ✅ Assert RU expectations in integration tests
[Fact]
public async Task GetActiveOrders_ShouldCostLessThan10RU()
{
    // Arrange: seed test container with representative data
    await SeedTestData(customerId: "test-customer", orderCount: 100);

    // Act
    var (results, totalRU) = await _repository.GetActiveOrdersWithMetrics("test-customer");

    // Assert: RU cost within expected bounds
    Assert.True(totalRU < 10.0,
        $"GetActiveOrders cost {totalRU:F1} RU — expected < 10 RU. " +
        "Check: missing partition key? Missing projection? Index scan?");
}

Recommended Enhancement

Add a "Development-Time RU Inspection" section to monitoring-ru-consumption:

Inspect RU costs during development, not just in production.

Queries that perform well against 100 test documents can cost 100x more in production against millions of documents. Establishing RU expectations during development is the primary feedback loop for validating all other query optimizations.

Development workflow:

  1. Enable PopulateIndexMetrics = true during development to verify index utilization
  2. Log RequestCharge for every query during development and compare against expectations
  3. Check that retrieved document count ≈ returned document count (a large gap indicates a scan)
  4. Set RU thresholds in integration tests to catch regressions before deployment
  5. Document expected RU costs for critical query paths (e.g., "< 10 RU for single-partition lookup")
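Steps 2 and 4 above imply a query helper that returns RU alongside results, such as the hypothetical GetActiveOrdersWithMetrics called in the integration test. A minimal sketch of such a helper, assuming the Microsoft.Azure.Cosmos .NET SDK (the method name and tuple shape are illustrative):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class QueryMetrics
{
    // Runs a query and returns the results together with the accumulated RU
    // charge, so tests can assert on cost as well as correctness.
    public static async Task<(List<T> Results, double TotalRU)> QueryWithMetricsAsync<T>(
        Container container, QueryDefinition query, QueryRequestOptions options)
    {
        var results = new List<T>();
        double totalRU = 0;

        using FeedIterator<T> iterator =
            container.GetItemQueryIterator<T>(query, requestOptions: options);
        while (iterator.HasMoreResults)
        {
            FeedResponse<T> page = await iterator.ReadNextAsync();
            results.AddRange(page);
            totalRU += page.RequestCharge; // per-page charge, summed across all pages
        }
        return (results, totalRU);
    }
}
```

Repository methods built on a helper like this expose RU cost as a first-class return value, which is what lets the integration test assert `totalRU < 10.0` instead of only checking the result set.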

Query metrics are the feedback loop for all other optimization rules. Without them, you cannot verify whether projections, index tuning, or filter ordering are actually working.

Issue actions