Document user_profiles_mv and user_labels tables #88
Document two new warehouse tables in the data catalog:

- user_profiles_mv: project-scoped, first-party identify traits, placed next to identities as the identity-data counterpart to wallet_profiles_mv. Flagged as an AggregatingMergeTree with a -Merge / argMaxIfMerge querying note.
- user_labels: project-scoped labels you assign to addresses, placed next to wallet_profiles_labels (its global counterpart). Omits the internal _is_deleted column to match the wallet_profiles_labels entry.

Also list user_profiles_mv in the aggregate-tables section and add an argMaxIf row to the quick reference.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XeF9u7YLwBCXtsADz9esGH
Code Review
This pull request adds documentation for the new user_profiles_mv and user_labels tables in data/catalog.mdx, and updates the aggregate tables section. The reviewer noted that documenting the columns of the aggregate table user_profiles_mv with simple types like String and DateTime is misleading and inconsistent. They suggested updating the table format to explicitly specify the ClickHouse AggregateFunction types and query functions.
| Column | Type | Description |
|--------|------|-------------|
| `address` | String | Wallet address (primary key within your project) |
| `user_id` | String | Your own external user identifier |
| `display_name` | String | Display name |
| `email` | String | Email address |
| `farcaster`, `discord`, `twitter`, `telegram`, `instagram`, `github`, `linkedin`, `facebook`, `tiktok`, `youtube`, `reddit` | String | Social handles |
| `website` | String | Website URL |
| `ens`, `lens`, `basenames`, `linea` | String | Web3 name-service handles |
| `avatar` | String | Avatar image URL |
| `description` | String | Bio / description |
| `location` | String | Location |
| `updated_at` | DateTime | Last time any trait was updated |
Since user_profiles_mv is an aggregate table (AggregatingMergeTree), documenting the columns with simple types like String and DateTime is inconsistent with the rest of the catalog (e.g., the users and wallet_profiles_mv tables). It is also misleading to users, as querying these columns directly will return binary aggregate states instead of readable values.
Updating the table to use the Query Function format and specifying the actual ClickHouse AggregateFunction types will make it clear how to query these fields.
| Column | Type | Query Function |
|--------|------|---------------|
| `address` | String | Direct access |
| `user_id` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(user_id)` |
| `display_name` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(display_name)` |
| `email` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(email)` |
| `farcaster`, `discord`, `twitter`, `telegram`, `instagram`, `github`, `linkedin`, `facebook`, `tiktok`, `youtube`, `reddit` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(column_name)` |
| `website` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(website)` |
| `ens`, `lens`, `basenames`, `linea` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(column_name)` |
| `avatar` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(avatar)` |
| `description` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(description)` |
| `location` | AggregateFunction(argMaxIf, String, DateTime, UInt8) | `argMaxIfMerge(location)` |
| `updated_at` | SimpleAggregateFunction(max, DateTime) | `max(updated_at)` |
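To show how the Query Function column translates into an actual query, here is a minimal sketch that assembles a ClickHouse `SELECT` from the suggested table. The helper name `build_profile_query` and the column set are illustrative assumptions, not part of the catalog or of any library; the only things taken from this thread are the `argMaxIfMerge(column)` and `max(updated_at)` query functions and grouping by `address`.

```python
# Hypothetical helper (not part of the catalog or any library): builds a
# ClickHouse query string for user_profiles_mv from the suggested table.

# Columns listed above as AggregateFunction(argMaxIf, ...) states.
ARG_MAX_IF_COLUMNS = frozenset({
    "user_id", "display_name", "email",
    "farcaster", "discord", "twitter", "telegram", "instagram", "github",
    "linkedin", "facebook", "tiktok", "youtube", "reddit",
    "website", "ens", "lens", "basenames", "linea",
    "avatar", "description", "location",
})


def build_profile_query(columns):
    """Return a SELECT that finalizes the requested trait columns per address."""
    unknown = [c for c in columns if c not in ARG_MAX_IF_COLUMNS]
    if unknown:
        raise ValueError(f"not an argMaxIf column: {unknown}")

    select_parts = ["address"]  # plain String column, read directly
    select_parts += [f"argMaxIfMerge({c})" for c in columns]
    select_parts.append("max(updated_at)")  # SimpleAggregateFunction(max, ...)
    return (
        "SELECT " + ", ".join(select_parts)
        + " FROM user_profiles_mv GROUP BY address"
    )


print(build_profile_query(["email", "ens"]))
```

The sketch leaves out column aliases and any project filter; a real query would likely add both.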
Summary
Added documentation for two new project-scoped data tables: `user_profiles_mv` (first-party user profile traits from identify events) and `user_labels` (custom labels and tags assigned to wallet addresses). Updated the aggregate tables reference to include `user_profiles_mv` and document the `argMaxIfMerge` function pattern.

Changes
- `user_profiles_mv` table documentation with full column reference, use cases, and example query showing proper aggregate table usage with `-Merge` functions
- `user_labels` table documentation with column definitions and use cases for custom segmentation and scoring
- `user_profiles_mv` in the list of tables using ClickHouse AggregateFunction types
- `argMaxIfMerge` pattern to the aggregate function reference table to document the conditional max aggregation function

Implementation details
- `user_profiles_mv` is documented as an `AggregatingMergeTree` with a warning to use `-Merge` functions and `GROUP BY address`
- `user_labels` is positioned as the project-specific counterpart to the global `wallet_profiles_labels` table

https://claude.ai/code/session_01XeF9u7YLwBCXtsADz9esGH
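The reason for the `-Merge` functions and `GROUP BY address` can be illustrated with a toy model of a partial `argMaxIf` state. This is a rough Python emulation, not ClickHouse's actual state format: the `ArgMaxIfState` class, the integer timestamps, and the "non-empty value" condition are all assumptions made for the example. Each stored row holds a partial state per address, and only merging those states yields the latest qualifying value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArgMaxIfState:
    """Toy stand-in for a partial argMaxIf state (illustrative only)."""
    value: Optional[str] = None
    ts: Optional[int] = None

    def add(self, value: str, ts: int, cond: bool) -> "ArgMaxIfState":
        # Rows whose condition is false never touch the state.
        if not cond:
            return self
        if self.ts is None or ts > self.ts:
            return ArgMaxIfState(value, ts)
        return self

    def merge(self, other: "ArgMaxIfState") -> "ArgMaxIfState":
        # Combining two partial states keeps the one with the larger timestamp.
        if other.ts is None:
            return self
        if self.ts is None or other.ts > self.ts:
            return other
        return self


# Two partial states for the same address, as if written by separate inserts.
# Assumed condition: the trait value is non-empty.
part_a = ArgMaxIfState().add("old@example.com", ts=1, cond=True)
part_b = (
    ArgMaxIfState()
    .add("new@example.com", ts=2, cond=True)
    .add("", ts=3, cond=False)  # later, but filtered out by the condition
)

merged = part_a.merge(part_b)
print(merged.value)
```

Reading either partial state on its own could return a stale value, which is why the per-address merge matters. How ties on the timestamp are resolved is a choice made here for simplicity and should not be read as ClickHouse behavior.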