
Rubric Gap Analysis — partition-high-cardinality enhancement: Add /id as partition key anti-pattern #50

@jaydestro


#50 — Rubric Gap Analysis — partition-high-cardinality enhancement: Add /id as partition key anti-pattern

| Field | Value |
| --- | --- |
| Type | Rule Enhancement |
| Target Rule | partition-high-cardinality |
| Severity | HIGH |
| Source | SCOPE Rubric Criteria — Partition Key Design, Criterion 1, check 3 |
| Labels | enhancement, SCOPE, agent-kit, rule:partition |

Summary

The partition-high-cardinality rule documents low-cardinality anti-patterns (status, country) but does not warn against using /id (the document ID) as the partition key. The /id anti-pattern differs fundamentally from low cardinality: cardinality is maximal (every document sits in its own logical partition), but every query that is not a point read fans out across all physical partitions. This is a common mistake: developers equate maximum cardinality with ideal distribution without recognizing that it destroys query efficiency.

Rubric Gap Analysis

This gap was identified during rubric criteria review for partition key design. Using /id as the partition key is a distinct failure mode from low-cardinality keys: it passes a naive "high cardinality" check while violating the "query pattern alignment" principle. Agents frequently default to /id when uncertain, especially those trained on relational patterns where primary key indexing is universal.

The existing rule's guidance — "Good partition keys typically: Match your most common query patterns" — is consistent with this anti-pattern but does not call it out explicitly. The /id trap requires its own incorrect/correct example block.

Evidence

Existing Rule Coverage (what's already there)

The rule currently shows two anti-patterns:

  • Status — few unique values (low cardinality)
  • Country — ~195 values, uneven distribution

Neither addresses the high-cardinality-but-wrong-pattern case.

Missing Anti-Pattern

// Anti-pattern: /id as partition key — perfect cardinality, zero query efficiency
public class Order
{
    public string Id { get; set; }          // Partition key = /id
    public string CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
    public string Status { get; set; }
}

// Point reads are as cheap as reads get (about 1 RU for a 1 KB item):
await container.ReadItemAsync<Order>(orderId, new PartitionKey(orderId));

// But EVERY query fans out across ALL partitions:
var query = "SELECT * FROM c WHERE c.customerId = @cid";
// With 10,000 physical partitions, this query hits ALL 10,000
// "Get my orders" becomes the most expensive possible query shape

// Why agents do this:
// - Relational thinking: "primary key = best index"
// - Cardinality check: /id passes trivially (one unique value per document)
// - It "works" at small scale — fan-out cost is hidden until production
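
The scattering described above can be illustrated with a small in-memory model. This is only a sketch: the FNV-1a hash, the 10 simulated partitions, the `FanOutSketch` class and the sample data are all assumptions made for illustration, and Cosmos DB's real hashing and partition placement differ. It shows where one customer's orders end up under each key choice. Note that a real query not filtered on the partition key must contact every physical partition regardless, because the engine cannot know in advance which ones hold matching documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public string Id { get; set; }
    public string CustomerId { get; set; }
}

public static class FanOutSketch
{
    // FNV-1a: a deterministic stand-in for the service's internal hash (assumption).
    private static uint Hash(string value)
    {
        unchecked
        {
            uint h = 2166136261;
            foreach (char c in value)
            {
                h ^= c;
                h *= 16777619;
            }
            return h;
        }
    }

    // Maps a partition key value onto one of `partitionCount` simulated partitions.
    public static int PartitionOf(string keyValue, int partitionCount) =>
        (int)(Hash(keyValue) % (uint)partitionCount);

    // Counts the simulated partitions holding at least one order for the customer.
    public static int PartitionsHoldingCustomer(
        IEnumerable<Order> orders, string customerId,
        Func<Order, string> partitionKey, int partitionCount) =>
        orders.Where(o => o.CustomerId == customerId)
              .Select(o => PartitionOf(partitionKey(o), partitionCount))
              .Distinct()
              .Count();

    // 100 hypothetical orders spread over 5 hypothetical customers.
    public static List<Order> SampleOrders() =>
        Enumerable.Range(0, 100)
                  .Select(i => new Order { Id = $"order-{i}", CustomerId = $"cust-{i % 5}" })
                  .ToList();

    public static void Main()
    {
        var orders = SampleOrders();
        // Keyed by /id: the customer's orders are spread over several partitions.
        int byId = PartitionsHoldingCustomer(orders, "cust-0", o => o.Id, 10);
        // Keyed by /customerId: identical key values always hash to the same partition.
        int byCustomer = PartitionsHoldingCustomer(orders, "cust-0", o => o.CustomerId, 10);
        Console.WriteLine($"/id:         cust-0 orders live in {byId} of 10 partitions");
        Console.WriteLine($"/customerId: cust-0 orders live in {byCustomer} of 10 partitions");
    }
}
```

Under /customerId the count is always 1, since every order for the customer carries the same key value. Under /id the exact count depends on the hash, which is why the model prints it rather than assuming a figure.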

Correct Guidance

// /id is only appropriate when the ONLY access pattern is point reads
// (fetch one document by exact id, no list/query operations needed)

// For most workloads, use a domain property that groups related documents:
public class Order
{
    public string Id { get; set; }
    public string CustomerId { get; set; }  // ✅ Groups orders by customer
    public DateTime OrderDate { get; set; }
}
// "Get customer's orders" → single-partition query
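// "Get order by id" → still a point read when the caller also knows the customer id:
//     ReadItemAsync<Order>(orderId, new PartitionKey(customerId))
// Without the customer id, it becomes a cross-partition query on c.id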
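
A minimal sketch of both access paths against a container partitioned on /customerId, using the .NET SDK v3 (`Microsoft.Azure.Cosmos`). The `CustomerOrderQueries` class and its method names are hypothetical. The sketch also assumes the client is configured for camelCase serialization, so that `Id` and `CustomerId` are stored as `id` and `customerId`. Setting `QueryRequestOptions.PartitionKey` is what scopes the query to a single logical partition.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public class Order
{
    public string Id { get; set; }
    public string CustomerId { get; set; }
    public DateTime OrderDate { get; set; }
}

// Hypothetical helper class; names are illustrative, not part of the SDK.
public static class CustomerOrderQueries
{
    // Builds a parameterized query plus options that pin it to one partition.
    public static (QueryDefinition Query, QueryRequestOptions Options) BuildCustomerOrdersQuery(
        string customerId)
    {
        var query = new QueryDefinition("SELECT * FROM c WHERE c.customerId = @cid")
            .WithParameter("@cid", customerId);
        var options = new QueryRequestOptions { PartitionKey = new PartitionKey(customerId) };
        return (query, options);
    }

    // "Get customer's orders": single-partition query; also returns the total RU charge.
    public static async Task<(List<Order> Orders, double RequestCharge)> GetCustomerOrdersAsync(
        Container container, string customerId)
    {
        var (query, options) = BuildCustomerOrdersQuery(customerId);
        var orders = new List<Order>();
        double charge = 0;

        using FeedIterator<Order> iterator =
            container.GetItemQueryIterator<Order>(query, requestOptions: options);
        while (iterator.HasMoreResults)
        {
            FeedResponse<Order> page = await iterator.ReadNextAsync();
            charge += page.RequestCharge;   // accumulate RU cost across pages
            orders.AddRange(page);
        }
        return (orders, charge);
    }

    // "Get order by id": a point read needs both the id and the partition key value.
    public static async Task<Order> GetOrderAsync(
        Container container, string orderId, string customerId)
    {
        ItemResponse<Order> response =
            await container.ReadItemAsync<Order>(orderId, new PartitionKey(customerId));
        return response.Resource;
    }
}
```

Comparing the returned `RequestCharge` for the same query against a container keyed on /id and one keyed on /customerId is one way to make the fan-out cost visible before production.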

Recommended Enhancement

Add a third anti-pattern block to partition-high-cardinality after the existing Country example:

// Anti-pattern: using /id as partition key
public class Order
{
    // Perfect cardinality but every query fans out to ALL partitions
    public string Id { get; set; }  // ❌ BAD — only efficient for point reads
    public string CustomerId { get; set; }
}
// "Get customer's orders" query hits every physical partition
// Use /id ONLY when the workload is exclusively point reads (no queries)

Also add to the "Good partition keys typically" list:

  • Avoid using /id (the document ID) as the partition key unless the only access pattern is point reads by exact id. While /id has perfect cardinality, it forces every non-point-read query into a cross-partition fan-out.
