Clarify canonical source attribution: `thoughts.source_type` vs `thoughts.metadata.source` #240

### What's on your mind?

## Summary

There appears to be schema drift around how OB1 represents the source of a thought.

Observed patterns include:

- Stock/core write paths and source filtering guidance use `thoughts.metadata.source`, queried in SQL as `thoughts.metadata->>'source'`.
- Newer dashboard, schema, integration, and stats contributions read or write a dedicated `thoughts.source_type` column.
- Some current contributions write both fields, or call stock `upsert_thought` and then issue a follow-up update to set `source_type`, because stock `upsert_thought` persists only `p_payload->'metadata'`.
- At least one prior recipe discussion surfaced confusion about whether `source` belongs in a top-level `thoughts` column or inside `thoughts.metadata`.

This makes it unclear which representation new recipes, dashboards, migrations, and user-contributed integrations should target.

_(This issue was AI-generated with OpenAI Codex)_

## Why This Matters

If one component writes only `thoughts.metadata.source` while another reads only `thoughts.source_type`, source filtering, dashboard grouping, statistics, and backfill workflows can silently miss records.

If `thoughts.source_type` is intended to be an enhanced-schema projection or denormalized index-friendly representation of `thoughts.metadata.source`, that should be documented explicitly. If it is intended to replace `thoughts.metadata.source`, then recipes and existing source-filtering guidance likely need migration guidance.

This is especially important because the current stock setup guide's `upsert_thought(p_content, p_payload)` stores only `p_payload->'metadata'`. A caller that sends sibling top-level keys such as `source_type` can succeed while silently leaving enhanced columns unset unless it performs a second update or mirrors the value into metadata.
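The failure mode described above can be sketched in a few lines. This is only a model, not the stock implementation: SQLite stands in for Postgres, the `thoughts` table is a hypothetical minimal version, and `upsert_thought_model` is a made-up Python stand-in that mimics the one behavior this issue relies on (persisting only the `metadata` key of the payload).

```python
import json
import sqlite3

# SQLite stands in for Postgres; this table is a hypothetical minimal
# `thoughts` with the enhanced `source_type` column already added.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thoughts ("
    " id INTEGER PRIMARY KEY,"
    " content TEXT NOT NULL,"
    " metadata TEXT NOT NULL DEFAULT '{}',"  # JSON text here; JSONB in Postgres
    " source_type TEXT)"
)


def upsert_thought_model(content, payload):
    """Hypothetical model of the described stock behavior: only
    payload['metadata'] is persisted; sibling top-level keys are ignored."""
    metadata = json.dumps(payload.get("metadata", {}))
    cur = conn.execute(
        "INSERT INTO thoughts (content, metadata) VALUES (?, ?)",
        (content, metadata),
    )
    return cur.lastrowid


def read_sources(thought_id):
    """Return (source_type, metadata.source) for one thought."""
    return conn.execute(
        "SELECT source_type, json_extract(metadata, '$.source')"
        " FROM thoughts WHERE id = ?",
        (thought_id,),
    ).fetchone()


# Case 1: source_type sent only as a sibling key. The call succeeds,
# but neither representation ends up stored.
first_id = upsert_thought_model(
    "Read chapter 3",
    {"source_type": "readwise", "metadata": {"topic": "reading"}},
)
dropped = read_sources(first_id)

# Case 2: the two workarounds seen in current contributions, combined:
# mirror the value into metadata, then set the column in a second update.
second_id = upsert_thought_model(
    "Saved highlight",
    {"source_type": "readwise", "metadata": {"source": "readwise"}},
)
conn.execute(
    "UPDATE thoughts SET source_type = ? WHERE id = ?",
    ("readwise", second_id),
)
mirrored = read_sources(second_id)

print(dropped)   # (None, None)
print(mirrored)  # ('readwise', 'readwise')
```

In the first case nothing fails at write time, which is why a reader that depends on either field can miss the record without any error surfacing.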

## Current Observed Patterns

- Source filtering work documents a `source` parameter for `search_thoughts`, `list_thoughts`, and `thought_stats`:
  https://github.com/NateBJones-Projects/OB1/pull/30

- Slack deduplication recipe discussion uses `thoughts.metadata` for source-specific event identity, and also surfaced confusion about top-level `source` columns versus JSONB metadata:
  https://github.com/NateBJones-Projects/OB1/pull/89

- The dashboard contribution was merged under `dashboards/open-brain-dashboard-next/`, and appears relevant to source/type filtering behavior:
  https://github.com/NateBJones-Projects/OB1/pull/111

- Enhanced schema work adds `thoughts.source_type` as an idempotent structured column on `thoughts`:
  https://github.com/NateBJones-Projects/OB1/pull/191

- Brain health monitoring views read `source_type` for source volume and enrichment gap monitoring:
  https://github.com/NateBJones-Projects/OB1/pull/194

- Consolidation worker work treats `source_type` as part of prepared payloads and notes that stock `upsert_thought` can drop sibling top-level fields like `type`, `importance`, and `source_type` on first-run inserts:
  https://github.com/NateBJones-Projects/OB1/pull/200

- Synthesis capture work mirrors provenance into metadata because stock `upsert_thought` drops top-level payload keys; its anti-loop checks read top-level `source_type` when available:
  https://github.com/NateBJones-Projects/OB1/pull/212

- Brain stats daily and heatmap work depends on `source_type` from the enhanced-thoughts schema:
  https://github.com/NateBJones-Projects/OB1/pull/221

- Wiki pipeline work includes a fallback from `source_type` to metadata-derived source fields when the column is not present:
  https://github.com/NateBJones-Projects/OB1/pull/234

- Readwise capture/import work writes both `source_type = 'readwise'` and `metadata.source = 'readwise'`, and documents that some RPCs filter on `source_type`:
  https://github.com/NateBJones-Projects/OB1/pull/237

- The open-brain-rest gateway writes through stock `upsert_thought` with `metadata.source`, then separately updates top-level `source_type` for dashboard reads:
  https://github.com/NateBJones-Projects/OB1/pull/239

## Related but Not Duplicate

- Standardizing ingestion patterns is already tracked separately:
  https://github.com/NateBJones-Projects/OB1/issues/61

  That issue is broader. This issue is specifically about canonical source attribution and read/write compatibility between `metadata.source` and `source_type`.

- A prior closed issue reported similar drift for `type`: a migration assumed a top-level `thoughts.type` column on a stock schema where type lived in `metadata->>'type'`:
  https://github.com/NateBJones-Projects/OB1/issues/182

  The fix changed the workflow-status backfill to read `metadata->>'type'` instead:
  https://github.com/NateBJones-Projects/OB1/pull/185

  That prior fix does not resolve source attribution, but it shows the same class of confusion around stock metadata fields versus enhanced top-level columns.

## Questions for Maintainers

1. What is the canonical source attribution field for OB1 going forward: `thoughts.metadata.source`, `thoughts.source_type`, or both?
2. If both are supported, is `thoughts.source_type` intended to be a denormalized/generated/indexed projection of `thoughts.metadata.source`, or an independent field?
3. Should stock `upsert_thought` populate `source_type` from `p_payload->>'source_type'`, from `p_payload->'metadata'->>'source'`, or leave enhanced columns to follow-up updates?
4. Should dashboards and source filters query a compatibility expression such as `coalesce(thoughts.source_type, thoughts.metadata->>'source')`?
5. Should recipes write only the canonical representation, or should they write both fields for compatibility?
6. Would maintainers accept pull requests that document the canonical model, normalize recipes to that model, update dashboard queries, add a compatibility view/RPC/helper, and provide migration/backfill guidance?
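To make question 4 concrete, here is a minimal sketch of how grouping by each field alone compares with grouping by the compatibility expression. It assumes a hypothetical three-row `thoughts` table and uses SQLite in place of Postgres, so `json_extract(metadata, '$.source')` plays the role of `metadata->>'source'`; the helper `counts_by` is illustrative only.

```python
import sqlite3

# SQLite stands in for Postgres; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thoughts ("
    " id INTEGER PRIMARY KEY,"
    " content TEXT,"
    " metadata TEXT NOT NULL DEFAULT '{}',"
    " source_type TEXT)"
)
conn.executemany(
    "INSERT INTO thoughts (content, metadata, source_type) VALUES (?, ?, ?)",
    [
        ("metadata only", '{"source": "slack"}', None),
        ("column only", "{}", "readwise"),
        ("both fields", '{"source": "readwise"}', "readwise"),
    ],
)


def counts_by(expr):
    """Group thoughts by the given SQL expression and count each group."""
    sql = (
        f"SELECT {expr} AS src, count(*) FROM thoughts"
        " GROUP BY src ORDER BY src"
    )
    return conn.execute(sql).fetchall()


# A reader that only knows about the enhanced column.
column_only = counts_by("source_type")

# A reader that only knows about the metadata key.
metadata_only = counts_by("json_extract(metadata, '$.source')")

# A reader using the compatibility expression from question 4.
combined = counts_by(
    "coalesce(source_type, json_extract(metadata, '$.source'))"
)

print(column_only)    # [(None, 1), ('readwise', 2)]
print(metadata_only)  # [(None, 1), ('readwise', 1), ('slack', 1)]
print(combined)       # [('readwise', 2), ('slack', 1)]
```

Each single-field reader leaves one row unattributed, while the `coalesce(...)` reader attributes all three. Note that `coalesce` prefers `source_type` when both are set, so it hides rather than resolves any rows where the two fields disagree.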

## Possible Contribution Path

1. Document the canonical source attribution model.
2. Clarify whether `upsert_thought` should populate enhanced columns from payload keys or metadata.
3. Add a read-side compatibility helper such as `effective_source`.
4. Update dashboard queries to use `effective_source` or an equivalent `coalesce(...)` expression.
5. Update recipes to write the canonical field consistently.
6. Add migration/backfill guidance for installations that already have only one representation.
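Steps 3 and 6 might look roughly like the sketch below. Everything here is an assumption for illustration: SQLite stands in for Postgres, the view name `thoughts_with_effective_source` is hypothetical, and the two-way backfill presumes maintainers want both fields populated, which is one of the open questions above. In Postgres the JSON write would use something like `metadata || jsonb_build_object('source', source_type)` instead of SQLite's `json_set`.

```python
import sqlite3

# SQLite stands in for Postgres; table, rows, and view name are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thoughts (
        id INTEGER PRIMARY KEY,
        content TEXT,
        metadata TEXT NOT NULL DEFAULT '{}',
        source_type TEXT
    );
    INSERT INTO thoughts (id, content, metadata, source_type) VALUES
        (1, 'metadata only', '{"source": "slack"}', NULL),
        (2, 'column only', '{}', 'readwise'),
        (3, 'no source at all', '{}', NULL);

    -- Read-side helper: one place that defines the effective source.
    CREATE VIEW thoughts_with_effective_source AS
    SELECT id, content, metadata, source_type,
           coalesce(source_type, json_extract(metadata, '$.source'))
               AS effective_source
    FROM thoughts;
    """
)

before = conn.execute(
    "SELECT id, effective_source FROM thoughts_with_effective_source"
    " ORDER BY id"
).fetchall()

# Backfill direction 1: fill the column from metadata where it is missing.
conn.execute(
    "UPDATE thoughts"
    " SET source_type = json_extract(metadata, '$.source')"
    " WHERE source_type IS NULL"
    " AND json_extract(metadata, '$.source') IS NOT NULL"
)

# Backfill direction 2: fill metadata from the column where it is missing.
conn.execute(
    "UPDATE thoughts"
    " SET metadata = json_set(metadata, '$.source', source_type)"
    " WHERE source_type IS NOT NULL"
    " AND json_extract(metadata, '$.source') IS NULL"
)

after = conn.execute(
    "SELECT id, source_type, json_extract(metadata, '$.source')"
    " FROM thoughts ORDER BY id"
).fetchall()

print(before)  # [(1, 'slack'), (2, 'readwise'), (3, None)]
print(after)
# [(1, 'slack', 'slack'), (2, 'readwise', 'readwise'), (3, None, None)]
```

Both updates only touch rows where the target is empty, so they can be re-run safely and never overwrite an existing value. Rows where the two fields already disagree are left alone and would need a separate policy.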

## Reference Links

### Repository References

- Source filtering recipe:
  https://github.com/NateBJones-Projects/OB1/pull/30

- Slack message deduplication recipe:
  https://github.com/NateBJones-Projects/OB1/pull/89

- Dashboard contribution merged as `dashboards/open-brain-dashboard-next/`:
  https://github.com/NateBJones-Projects/OB1/pull/111

- Enhanced thoughts columns adding `thoughts.source_type`:
  https://github.com/NateBJones-Projects/OB1/pull/191

- Brain health monitoring views:
  https://github.com/NateBJones-Projects/OB1/pull/194

- Consolidation workers:
  https://github.com/NateBJones-Projects/OB1/pull/200

- Synthesis capture:
  https://github.com/NateBJones-Projects/OB1/pull/212

- Brain stats daily and heatmap filter:
  https://github.com/NateBJones-Projects/OB1/pull/221

- Wiki pipeline source fallback:
  https://github.com/NateBJones-Projects/OB1/pull/234

- Readwise capture/import:
  https://github.com/NateBJones-Projects/OB1/pull/237

- open-brain-rest gateway:
  https://github.com/NateBJones-Projects/OB1/pull/239

- Standardize thought ingestion patterns:
  https://github.com/NateBJones-Projects/OB1/issues/61

- Prior `type` drift report and fix:
  https://github.com/NateBJones-Projects/OB1/issues/182
  https://github.com/NateBJones-Projects/OB1/pull/185

### Discord Discussion References

- Discussion of controlled vocabulary inside `thoughts.metadata`:
  https://discord.com/channels/1481783256641699840/1481789897470906429/1482622942121689169

- Discussion of migration and custom-schema overlap involving `source_type` and JSONB metadata:
  https://discord.com/channels/1481783256641699840/1481790073392599041/1482874122747777095

- Discord notification/discussion for the Slack deduplication metadata recipe:
  https://discord.com/channels/1481783256641699840/1481790377534034101/1484302480127950879

