Skip to content

Dataloader source's fetch method called N times via object authorization block #5399

Description

@sethc2

Describe the bug

So we have a GQL type using a dataloader source in its authorization method like this.

  def self.authorized?(object, context)
    is_authorized = context.dataloader.with(AuthorizationSource).load(object.id)
    is_authorized
  end

However the fetch method for the source was getting called N times with a single item, instead of what I expected of being called once with N items.

I dug in and discovered the parent type's resolver was using a GraphQL::Batch::Loader to lazy load the object used in the authorized method.

It seemed if I updated the parent resolver to lazy load using a dataloader source, then the subsequent calls to our authorization source worked as expected and the fetch method was called once with N items.

Versions

graphql version: 2.5.10
graphql-batch version: 0.6.0

Steps to reproduce

I've created a ruby script that setups a simple schema, where you can query the same 10 list of objects that are using dataloader in their authorized method, through two separate paths. In one path we load the objects via a dataloader source, and see that the AuthorizationSource is called once with 10 items, and in another we use a batch loader to load the objects, and we can see the AuthorizationSource is called 10 times with 1 item each time.

require 'graphql'
require 'graphql/batch'

puts '=== Simple DataLoader vs Batch Loader Reproduction ==='
puts

class SimpleUser
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

class SimpleProfile
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end
end

class ProfileByUserIdSource < GraphQL::Dataloader::Source
  def initialize(profile_proc)
    @profile_proc = profile_proc
  end

  def fetch(user_ids)
    @profile_proc.call(user_ids)
  end
end

class AuthorizationSource < GraphQL::Dataloader::Source
  def initialize(auth_proc)
    @auth_proc = auth_proc
  end

  def fetch(user_ids)
    puts "AuthorizationSource fetch for user_ids: #{user_ids}"
    @auth_proc.call(user_ids)
  end
end

# Batch loader for profiles
class ProfileBatchLoader < GraphQL::Batch::Loader
  def initialize(profile_proc)
    @profile_proc = profile_proc
  end

  def perform(user_ids)
    results = @profile_proc.call(user_ids)
    user_ids.each_with_index do |id, index|
      fulfill(id, results[index])
    end
  end
end

# GraphQL Types
class ProfileType < GraphQL::Schema::Object
  field :id, ID, null: false
  field :name, String, null: false

  def self.authorized?(object, context)
    auth_proc = context[:auth_proc]
    context.dataloader.with(AuthorizationSource, auth_proc).load(object.id)
  end
end

class UserType < GraphQL::Schema::Object
  field :id, ID, null: false
  field :profile_by_dataloader, ProfileType, null: false
  field :profile_by_batch_loader, ProfileType, null: false

  def profile_by_dataloader
    profile_proc = context[:profile_proc]
    dataloader.with(ProfileByUserIdSource, profile_proc).load(object.id)
  end

  def profile_by_batch_loader
    profile_proc = context[:profile_proc]
    ProfileBatchLoader.for(profile_proc).load(object.id)
  end
end

class QueryType < GraphQL::Schema::Object
  field :all_users, [UserType], null: false

  def all_users
    # Hardcoded 10 users
    (1..10).map { |i| SimpleUser.new(i.to_s) }
  end
end

# Schema
class Schema < GraphQL::Schema
  query QueryType

  # Enable both DataLoader and Batch
  use GraphQL::Dataloader
  use GraphQL::Batch
end

# Test queries
dataloader_query = <<~GRAPHQL
  query GetProfilesByDataLoader {
    allUsers {
      id
      profileByDataloader {
        # the authorization for each of these types will happen in a single fetch call
        id
        name
      }
    }
  }
GRAPHQL

batch_query = <<~GRAPHQL
  query GetProfilesByBatchLoader {
    allUsers {
      id
      profileByBatchLoader {
        # the authorization for each of these types is happening in N fetch calls (N being the number of users)
        id
        name
      }
    }
  }
GRAPHQL

# Create procs for tracking calls
$dataloader_fetch_count = 0
$batch_perform_count = 0
$auth_fetch_count = 0

profile_proc = -> (user_ids) do
  # Simulate database query
  sleep(0.001)
  user_ids.map { |id| SimpleProfile.new("profile_#{id}", "Profile #{id}") }
end

auth_proc = -> (user_ids) do
  $auth_fetch_count += 1
  # Simulate database query
  sleep(0.001)
  user_ids.map { |_id| true }
end

puts 'Testing DataLoader approach:'
puts '=' * 50
$auth_fetch_count = 0

Schema.execute(
  dataloader_query,
  context: {
    profile_proc: profile_proc,
    auth_proc: auth_proc,
  }
)

puts "Authorization fetch calls for DataLoader: #{$auth_fetch_count}"
puts

puts 'Testing Batch Loader approach:'
puts '=' * 50
$auth_fetch_count = 0

Schema.execute(
  batch_query,
  context: {
    profile_proc: profile_proc,
    auth_proc: auth_proc,
  }
)

puts "Authorization fetch calls for BatchLoader: #{$auth_fetch_count}"
puts

Expected behavior

I would expect the AuthorizationSource in my example script to get called only once in the batch loader case.

Actual behavior

When I run bundle exec ruby script_to_reproduce.rb I get the following output

Testing DataLoader approach:
==================================================
AuthorizationSource fetch for user_ids: ["profile_1", "profile_2", "profile_3", "profile_4", "profile_5", "profile_6", "profile_7", "profile_8", "profile_9", "profile_10"]
Authorization fetch calls for DataLoader: 1

Testing Batch Loader approach:
==================================================
AuthorizationSource fetch for user_ids: ["profile_1"]
AuthorizationSource fetch for user_ids: ["profile_2"]
AuthorizationSource fetch for user_ids: ["profile_3"]
AuthorizationSource fetch for user_ids: ["profile_4"]
AuthorizationSource fetch for user_ids: ["profile_5"]
AuthorizationSource fetch for user_ids: ["profile_6"]
AuthorizationSource fetch for user_ids: ["profile_7"]
AuthorizationSource fetch for user_ids: ["profile_8"]
AuthorizationSource fetch for user_ids: ["profile_9"]
AuthorizationSource fetch for user_ids: ["profile_10"]
Authorization fetch calls for BatchLoader: 10

Additional context

I work on a horizontal GraphQL team, so trying to figure out what to recommend to our product engineers to avoid these N+1s.

If this is intended behavior, or if it would just be way too much work/risk to fix quickly, I'll probably just recommend they switch the batch loaders to use dataloaders (we have a massive schema that has a large variety of patterns throughout it from different teams).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions