
Support filtering interface/union types on _typename #1169

Draft
marcdaniels-toast wants to merge 4 commits into block:main from marcdaniels-toast:mdaniels/typename-filter

Conversation

@marcdaniels-toast
Contributor

Closes #1024

Finishes the work @myronmarston started in #1027.

Adds a _typename filter field to all interface and union type filter inputs, allowing clients to filter by concrete subtype:

{ distribution_channels(filter: { _typename: { equal_to_any_of: ["OnlineStore"] } }) { ... } }

What this adds on top of #1027:

  • name_in_index: "__typename" on the filter field so FilterArgsTranslator maps _typename to __typename in the datastore query at runtime
  • Regenerated schema artifacts with the name_in_index entry in runtime metadata for each abstract FilterInput type
  • Unit test verifying the _typename to __typename translation in filters_spec.rb
  • Acceptance tests covering _typename filtering on: top-level abstract type queries (distribution_channels, retailers) where multiple concrete types share an index, embedded union/interface fields (inventor, named_inventor), and aggregations

Known limitation: _typename only works reliably for types stored in a shared index (where the indexer injects __typename). Types with a dedicated index (e.g. PhysicalStore in physical_stores) don't have __typename stored, so equal_to_any_of: ["PhysicalStore"] returns nothing and not: { equal_to_any_of: [..., "PhysicalStore"] } fails to exclude them. This is inherent to how dedicated-index types work, not a bug introduced here. As a workaround, equal_to_any_of: [null] correctly matches dedicated-index documents, mirroring how AbstractTypeFilter handles them internally.

myronmarston and others added 4 commits May 7, 2026 16:12
Enables filtering union/interface types by concrete subtype.
Uses single underscore (`_typename`) since GraphQL spec prohibits
`__` prefix on input fields. Also fixes CamelCaseConverter to
preserve leading underscores via lookbehind assertion.

This is step 1 for block#1024.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The `_typename` filter field added to abstract type filter inputs
needed `name_in_index: "__typename"` so that FilterArgsTranslator
maps it to the correct datastore field at query time.

Also regenerates schema artifacts (runtime_metadata.yaml gets the
name_in_index entry for each abstract FilterInput type) and adds a
unit test verifying the translation in filters_spec.rb.

Generated with Claude Code

Tests cover:
- Union type embedded field (inventor): filter by Company or Person
- Interface type embedded field (named_inventor): filter by subtype
- Top-level abstract type query (distribution_channels): filter by OnlineStore
- Sub-interface query (retailers): _typename filter combined with automatic
  __typename scoping

Also removes the now-stale comment that said __typename filtering was
not supported.

Generated with Claude Code

@myronmarston
Collaborator

> Known limitation: _typename only works reliably for types stored in a shared index (where the indexer injects __typename). Types with a dedicated index (e.g. PhysicalStore in physical_stores) don't have __typename stored, so equal_to_any_of: ["PhysicalStore"] returns nothing and not: { equal_to_any_of: [..., "PhysicalStore"] } fails to exclude them. This is inherent to how dedicated-index types work, not a bug introduced here. As a workaround, equal_to_any_of: [null] correctly matches dedicated-index documents, mirroring how AbstractTypeFilter handles them internally.

I think this known limitation is a problem. We either need to get _typename to work in these spots or find a way to omit _typename from the schema where it's not going to work correctly. Having it available in the GraphQL schema but not working correctly in some cases isn't satisfactory, IMO. After all, clients of the GraphQL API have no visibility into the index structure of the types, so they can't know when _typename will work and when it won't.

I think there's a solution, though, and it's one that should support more optimal queries to boot: let's leverage the _typename filter to narrow down the set of indices being queried. In a case where a concrete subtype lives in its own index, there's no __typename to filter on, but we can update query.search_index_definitions so that we hit only the index that has that type.
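
To make the index-narrowing idea concrete, here is a minimal Ruby sketch. It is not ElasticGraph's implementation: the `INDEX_BY_CONCRETE_TYPE` table and the `indices_for_typenames` helper are hypothetical names, and the type-to-index pairs are assumed from the examples in this PR (OnlineStore in distribution_channels, PhysicalStore in physical_stores).

```ruby
# Hypothetical lookup of concrete type name => index name. The real schema
# metadata in ElasticGraph is structured differently; this is for illustration.
INDEX_BY_CONCRETE_TYPE = {
  "OnlineStore" => "distribution_channels",
  "PhysicalStore" => "physical_stores"
}.freeze

# Returns the sorted, de-duplicated index names that a `_typename` filter
# needs to search. Type names with no known index contribute nothing.
def indices_for_typenames(typenames)
  typenames.filter_map { |name| INDEX_BY_CONCRETE_TYPE[name] }.uniq.sort
end

# A filter naming only PhysicalStore would search only its dedicated index.
p indices_for_typenames(["PhysicalStore"])
p indices_for_typenames(["PhysicalStore", "OnlineStore"])
```

Narrowing this way only decides which indices to hit; the shared index still needs a `__typename` filter to separate its concrete types.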

That said, there's a situation that's tricky to solve. Imagine this:

# The DistributionChannel hierarchy has two branches:
#   DistributionChannel (index: distribution_channels)
#   ├── Wholesale            (interface, inherits distribution_channels)
#   │   ├── DirectWholesaler (concrete, distribution_channels index)
#   │   └── BrokerWholesaler (concrete, distribution_channels index)
#   └── Retail               (interface, inherits distribution_channels)
#       └── Store            (interface, inherits distribution_channels)
#           ├── OnlineStore  (concrete, distribution_channels index)
#           └── PhysicalStore (concrete, physical_stores index)

And imagine a query comes in like this:

distributionChannels(filter: {
  _typename: {equalToAnyOf: ["PhysicalStore", "BrokerWholesaler"]}
}) {
  # ...
}

This query needs to hit both the physical_stores index and the distribution_channels index, and it needs to filter the distribution_channels index on __typename == "BrokerWholesaler". But there's no way that I know of for a datastore query to have a "conditional" filter that applies to only one of the indexes. I think the solution is to introduce a __typename field on the physical_stores index as a constant_keyword (ES docs, OS docs). You could then filter on __typename IN ("PhysicalStore", "BrokerWholesaler") against both indices and it should work, without requiring per-document storage space in the physical_stores index to store __typename: "PhysicalStore".
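
As a rough illustration of the constant_keyword idea, the Ruby sketch below builds the two pieces as plain hashes: a mapping fragment for the physical_stores index and a single terms filter that could be sent to both indices. The constant and method names are hypothetical, and how ElasticGraph would actually generate the mapping is not shown in this PR. The `constant_keyword` field type with a `value` parameter is the documented Elasticsearch form.

```ruby
require "json"

# Hypothetical mapping fragment for the physical_stores index. With
# constant_keyword, the value lives in the mapping rather than in each document.
PHYSICAL_STORES_TYPENAME_MAPPING = {
  "properties" => {
    "__typename" => {"type" => "constant_keyword", "value" => "PhysicalStore"}
  }
}.freeze

# Builds one terms filter that can be applied unchanged to every searched index.
def typename_terms_filter(typenames)
  {"terms" => {"__typename" => typenames}}
end

search_body = {
  "query" => {
    "bool" => {
      "filter" => [typename_terms_filter(["PhysicalStore", "BrokerWholesaler"])]
    }
  }
}

puts JSON.pretty_generate(PHYSICAL_STORES_TYPENAME_MAPPING)
puts JSON.pretty_generate(search_body)
```

Because every document in physical_stores would report the same `__typename`, the same filter body works against both indices without a per-index condition.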

Now that I've thought of using constant_keyword for this case, it makes me wonder if it might make sense to always include a __typename property in every index for every object type, using constant_keyword if it's not an abstract type. We have some spots where we conditionally use __typename when available, and it could simplify things to make it always available. We don't want to pay the storage cost of storing __typename per-document when it's not needed, but constant_keyword solves that nicely!

Also, I just realized something while writing that up--a client might try this:

distributionChannels(filter: {
  _typename: {equalToAnyOf: ["Wholesale"]}
}) {
  # ...
}

A client could try that expecting it to return all documents that are subtypes of "Wholesale". But of course it won't work because there are no docs where __typename == "Wholesale". And _typename is meant to mimic the __typename return field clients can request. I see two potential solutions here:

  • Document this--in the generated GraphQL schema, document the semantics of _typename, so clients are aware of how it works.
  • Use an enum for _typename. Instead of it being a string, we could make it an enum, and in the enum the only members would be the concrete types (e.g. DirectWholesaler, BrokerWholesaler, OnlineStore, PhysicalStore, but not DistributionChannel, Retail, Store). Then the schema itself would make it impossible to filter on an abstract supertype.

I want to hear what you favor but to share my two cents: I like the 2nd option a lot for the type safety it provides, but I think I ultimately lean towards the first option, for a few reasons:

  • It's less effort! Just a documentation update.
  • Given that it's an equalToAnyOf filter I think it's natural for clients to expect __typename == value semantics, so this isn't that likely to occur in practice.
  • It mimics the __typename return field which is a String instead of an enum.
  • Having enum values like PhysicalStore violates common GraphQL lint rules that expect enum values like PHYSICAL_STORE, such as in Apollo's linter. We could use screaming snake case, but then we'd have to map between the two forms, which feels suboptimal.
  • If we're going to go the enum route we'd want it to be as typesafe as possible and instead of having a single ConcreteTypeName enum type containing all concrete types in the schema, we'd probably want to make it per-abstract type. But that's more to manage and would kinda bloat the schema.
  • There's potential for composition issues when such a type gets included into a supergraph schema.

Thoughts @marcdaniels-toast?

@marcdaniels-toast
Contributor Author

Thanks for pushing back on the limitation. It did seem a bit awkward at the time, but I didn't know we had such a clean solution as constant_keyword. Thanks for pointing it out. I'm on board with using that for the single-type indexes. I'd like to make that change in at least one separate PR before updating this one, so that there's a PR focused on just that change.

As for filtering by abstract types, what about query-time expansion? When the filter interpreter sees a _typename filter value that names an abstract type, it could expand it to the set of concrete subtypes. So ["Wholesale"] becomes ["DirectWholesaler", "BrokerWholesaler"] before the filter hits the datastore.
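
The expansion could look roughly like the Ruby sketch below. It is a sketch only: `SUBTYPES_BY_ABSTRACT_TYPE` and `expand_typenames` are hypothetical names, and the table hard-codes the example DistributionChannel hierarchy from earlier in this thread rather than reading real schema metadata.

```ruby
# Hypothetical table of abstract type => direct subtypes, copied from the
# example hierarchy in this thread. Concrete types have no entry.
SUBTYPES_BY_ABSTRACT_TYPE = {
  "DistributionChannel" => ["Wholesale", "Retail"],
  "Wholesale" => ["DirectWholesaler", "BrokerWholesaler"],
  "Retail" => ["Store"],
  "Store" => ["OnlineStore", "PhysicalStore"]
}.freeze

# Recursively replaces each abstract type name with its concrete subtypes.
# Names without an entry in the table are passed through unchanged.
def expand_typenames(typenames)
  typenames.flat_map do |name|
    subtypes = SUBTYPES_BY_ABSTRACT_TYPE[name]
    subtypes ? expand_typenames(subtypes) : [name]
  end.uniq
end

p expand_typenames(["Wholesale"])
p expand_typenames(["Retail", "OnlineStore"])
```

The `uniq` call covers filters that name both an abstract type and one of its own subtypes, so the datastore never receives a duplicated value.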

@marcdaniels-toast marcdaniels-toast marked this pull request as draft May 9, 2026 03:20