[QUESTION] - Support multi-level composite external Ids and cross-object-set lookups with CSV files. #1183

@mitchspano

Feature title
Support multi-level composite external Ids and cross-object-set lookups with CSV files.

Problem statement
SFDMU's composite external Id feature does not resolve parent lookup or master/detail relationships when deploying data from CSV files to an org. It may work for org-to-org migrations, but in its current state it does not work for CSV-to-org.

When a CSV source file uses nested external Ids such as Product2.ProductCode or Pricebook2.Name, SFDMU currently cannot resolve lookups across object sets: it does not link child rows to parent records imported from earlier CSV files, and it does not match existing records on subsequent runs.

Example: A PricebookEntry CSV has columns Product2.ProductCode and Pricebook2.Name exported from a source system. SFDMU can't use Product2.ProductCode;Pricebook2.Name as a composite external Id to establish a relationship with parent records or avoid duplicates on subsequent runs.

This means CSV-based migrations are limited to external Id fields that are actually defined in the SObject's schema. Users migrating from CSV sources must either rely on such fields (which may not be feasible, especially for managed packages) or accept duplicate records on re-runs (unacceptable in most contexts).

Proposed solution
Enable SFDMU to support relationship paths in composite external Id field definitions, allowing configurations like:

  • Simple relationship: Product2.ProductCode
  • Custom relationship: Custom_Parent__r.Name
  • Composite with relationships: Product2.ProductCode;Pricebook2.Name
  • Deep nesting: Parent__r.Grandparent__r.Owner.Name
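As an illustration of the notations listed above, here is a minimal Python sketch of how such a definition could be split into its parts. The names `KeyPart` and `parse_external_id` are hypothetical and are not part of SFDMU; the sketch only assumes that `;` separates key parts and `.` separates relationship hops, as in the examples.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class KeyPart:
    """One part of a composite external Id (hypothetical helper type)."""
    relationships: Tuple[str, ...]  # relationship names traversed, outermost first
    field: str                      # the field read at the end of the path


def parse_external_id(definition: str) -> List[KeyPart]:
    """Split a definition such as 'Product2.ProductCode;Pricebook2.Name'."""
    parts = []
    for raw in definition.split(";"):
        raw = raw.strip()
        if not raw:
            continue  # tolerate a trailing or doubled semicolon
        *relationships, field = raw.split(".")
        parts.append(KeyPart(tuple(relationships), field))
    return parts


for definition in (
    "ProductCode",
    "Custom_Parent__r.Name",
    "Product2.ProductCode;Pricebook2.Name",
    "Parent__r.Grandparent__r.Owner.Name",
):
    print(definition, "->", parse_external_id(definition))
```

A plain field such as ProductCode yields an empty relationship path, so same-object composite keys and relationship-based keys can share one representation.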

The system should:

  1. Extract values from relationship columns in CSV files (e.g., Product2.ProductCode column)
  2. Cache records by these relationship values across object sets
  3. Match records on subsequent runs using composite keys that include relationship fields
  4. Set parent relationships by these cached values for lookup and master/detail relationship fields
  5. Support both custom relationship notation (Account__r.Name) and standard relationship notation (Product2.ProductCode)
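Steps 1, 2, and 4 could look roughly like the following Python sketch. It is not SFDMU code: the helpers `index_parents` and `resolve_lookups` are hypothetical, the parent Id values are invented placeholders, and the sketch assumes the parent records were already loaded by an earlier object set. The PricebookEntry rows are the ones from this issue's example CSV.

```python
import csv
import io

# Parent rows as they might look after an earlier object set has been loaded.
# The Id values are invented placeholders, not real Salesforce Ids.
PRODUCTS = [
    {"Id": "PROD-1", "ProductCode": "Widget_Pro"},
    {"Id": "PROD-2", "ProductCode": "Widget_Lite"},
]
PRICEBOOKS = [
    {"Id": "PB-1", "Name": "Standard Price Book"},
    {"Id": "PB-2", "Name": "Partner Price Book"},
]

ENTRY_CSV = """Id,Product2Id,Pricebook2Id,UnitPrice,Product2.ProductCode,Pricebook2.Name
,,,100.00,Widget_Pro,Standard Price Book
,,,150.00,Widget_Pro,Partner Price Book
,,,75.00,Widget_Lite,Standard Price Book
"""


def index_parents(records, key_field):
    """Step 2: cache parent Ids by the value of their external Id field."""
    return {rec[key_field]: rec["Id"] for rec in records if rec.get(key_field)}


def resolve_lookups(row, lookups, cache):
    """Steps 1 and 4: read each relationship column and fill in the Id field."""
    resolved = dict(row)
    for id_field, column in lookups.items():
        value = row[column]
        parent_ids = cache[column]
        if value not in parent_ids:
            raise KeyError(f"No cached parent for {column}={value!r}")
        resolved[id_field] = parent_ids[value]
    return resolved


# The cache is keyed by relationship column name so object sets can share it.
CACHE = {
    "Product2.ProductCode": index_parents(PRODUCTS, "ProductCode"),
    "Pricebook2.Name": index_parents(PRICEBOOKS, "Name"),
}
LOOKUPS = {"Product2Id": "Product2.ProductCode", "Pricebook2Id": "Pricebook2.Name"}

ENTRIES = [
    resolve_lookups(row, LOOKUPS, CACHE)
    for row in csv.DictReader(io.StringIO(ENTRY_CSV))
]

for entry in ENTRIES:
    print(entry["Product2Id"], entry["Pricebook2Id"], entry["UnitPrice"])
```

Raising on an unknown parent value is one possible design choice; a real implementation might instead report the row as missing a parent and continue.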

Expected user impact
This feature would allow CSV files alone to serve as the source of truth, both for test data in scratch orgs and for SObjects that hold setup data in production orgs. This is particularly applicable to packages or solutions that rely on SObject rows to perform setup or configuration tasks, such as CPQ or Industries.

Configuration example (if applicable)

export.json:

{
  "objectSets": [
    {
      "objects": [
        {
          "query": "SELECT Id, Name FROM Product2",
          "operation": "Upsert",
          "externalId": "ProductCode"
        },
        {
          "query": "SELECT Id, Name FROM Pricebook2",
          "operation": "Upsert",
          "externalId": "Name"
        }
      ]
    },
    {
      "objects": [
        {
          "query": "SELECT Id, Product2Id, Pricebook2Id, UnitPrice, Product2.ProductCode, Pricebook2.Name FROM PricebookEntry",
          "operation": "Upsert",
          "externalId": "Product2.ProductCode;Pricebook2.Name"
        }
      ]
    }
  ]
}

CSV Files:

Product2.csv:

Id,ProductCode
,Widget_Pro
,Widget_Lite

Pricebook2.csv:

Id,Name
,Standard Price Book
,Partner Price Book

PricebookEntry.csv:

Id,Product2Id,Pricebook2Id,UnitPrice,Product2.ProductCode,Pricebook2.Name
,,,100.00,Widget_Pro,Standard Price Book
,,,150.00,Widget_Pro,Partner Price Book
,,,75.00,Widget_Lite,Standard Price Book

Expected behavior:

  • First run: Products, Pricebooks, and Entries are created with their parent relationships (Product2Id, Pricebook2Id) populated
  • Second run: All records match existing ones, so no duplicates are created; entries match by the composite key Product2.ProductCode;Pricebook2.Name
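The two-run behavior can be illustrated with a small Python simulation. It is only a sketch: the in-memory dictionary stands in for the target org, `composite_key` and `upsert` are hypothetical helpers, and joining the key values with `;` is an assumption made for the example, not a statement about how SFDMU builds keys internally.

```python
KEY_COLUMNS = ["Product2.ProductCode", "Pricebook2.Name"]

# The three PricebookEntry rows from the example CSV.
ROWS = [
    {"UnitPrice": "100.00", "Product2.ProductCode": "Widget_Pro", "Pricebook2.Name": "Standard Price Book"},
    {"UnitPrice": "150.00", "Product2.ProductCode": "Widget_Pro", "Pricebook2.Name": "Partner Price Book"},
    {"UnitPrice": "75.00", "Product2.ProductCode": "Widget_Lite", "Pricebook2.Name": "Standard Price Book"},
]


def composite_key(row, columns):
    """Build the matching key from the relationship columns of one row."""
    return ";".join(row[column] for column in columns)


def upsert(org, rows, columns):
    """Insert rows whose key is new, update rows whose key already exists."""
    created = updated = 0
    for row in rows:
        key = composite_key(row, columns)
        if key in org:
            updated += 1
        else:
            created += 1
        org[key] = dict(row)
    return created, updated


org = {}  # stand-in for the records already present in the target org
first_run = upsert(org, ROWS, KEY_COLUMNS)
second_run = upsert(org, ROWS, KEY_COLUMNS)

print("first run (created, updated):", first_run)
print("second run (created, updated):", second_run)
print("records in org:", len(org))
```

Because the second run finds every key already present, it only updates, and the record count stays at three.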

Additional context
This feature is particularly valuable for:

  • Standard Salesforce objects - Products, Pricebooks, PricebookEntries as demonstrated above
  • Salesforce CPQ - Pricing rules and other setup data
  • Industries - Many SObjects contain setup data.

This proposal extends the existing composite key capability (using semicolons for same-object fields) to support relationship traversal for CSV data sources.
