Skip to content

Bulk Evaluation Test#6

Draft
GoodPie wants to merge 7 commits into
mainfrom
bulk-evaluation
Draft

Bulk Evaluation Test#6
GoodPie wants to merge 7 commits into
mainfrom
bulk-evaluation

Conversation

@GoodPie
Copy link
Copy Markdown
Contributor

@GoodPie GoodPie commented Feb 6, 2026

[STILL A WIP]

Description

Motivation and Context

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@GoodPie GoodPie self-assigned this Feb 9, 2026
@GoodPie GoodPie requested a review from Copilot February 9, 2026 01:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-pass support for batching concurrent Flagr evaluations via /api/v1/evaluation/batch, including config, types, implementation, documentation, and tests.

Changes:

  • Introduces FlagrBatchEvaluator and integrates it into FlagrProvider behind a batching.enabled config flag.
  • Adds Zod request/response types for Flagr’s batch evaluation endpoint and exports the new public types.
  • Adds unit tests + MSW handlers for batch request paths and documents batching behavior/options in the README.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/types/flagr/evaluation/FlagrBatchEvaluationResponse.ts Adds Zod schema/type for batch evaluation responses
src/types/flagr/evaluation/FlagrBatchEvaluationRequest.ts Adds Zod schema/type for batch evaluation requests/entities
src/types/FlagrProviderConfig.ts Adds batching config block (enabled/maxBatchSize/scheduleFn)
src/index.ts Exposes new batch request/response types from package entrypoint
src/tests/unit/FlagrProvider.test.ts Adds provider-level tests ensuring batching path resolves and errors correctly
src/tests/unit/FlagrBatchEvaluator.test.ts Adds focused tests for coalescing, splitting, error paths, and scheduleFn
src/tests/mocks/handlers.ts Adds MSW handlers for batch endpoint success/error/timeout
src/FlagrProvider.ts Wires batch evaluator into provider resolution methods
src/FlagrBatchEvaluator.ts Implements a DataLoader-style batcher for Flagr’s batch evaluation endpoint
README.md Documents batching purpose, enabling, options, mechanics, and limitations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/FlagrProvider.ts
Comment on lines +74 to +77
private evaluateForFlag(flagKey: string, context: EvaluationContext, logger: Logger) {
return this.batchEvaluator
? this.batchEvaluator.load(flagKey, context, logger)
: evaluateFlag(flagKey, context, this.config, logger);
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlagrBatchEvaluator.load()can resolvenull(no segment match), but the non-batching path appears to return a non-null evaluation response. This makesresponsepotentially null in allresolve*Evaluationmethods, which can break existing logic that reads fields likeresponse.variantKey/response.variantAttachment. Align the contract by either (a) making the batching path return the same non-null shape as evaluateFlag(e.g., a response withvariantKeyundefined), or (b) adding an explicitnullhandling branch in each evaluation method immediately after awaitingevaluateForFlag` (and before any property access).

Suggested change
private evaluateForFlag(flagKey: string, context: EvaluationContext, logger: Logger) {
return this.batchEvaluator
? this.batchEvaluator.load(flagKey, context, logger)
: evaluateFlag(flagKey, context, this.config, logger);
private async evaluateForFlag(flagKey: string, context: EvaluationContext, logger: Logger) {
if (this.batchEvaluator) {
const response = await this.batchEvaluator.load(flagKey, context, logger);
// Normalize potential null from batch evaluation to match non-batching evaluateFlag contract
if (response === null) {
return {} as Awaited<ReturnType<typeof evaluateFlag>>;
}
return response;
}
return evaluateFlag(flagKey, context, this.config, logger);

Copilot uses AI. Check for mistakes.
Comment thread src/FlagrProvider.ts
): Promise<ResolutionDetails<boolean>> {
try {
const response = await evaluateFlag(flagKey, context, this.config, logger);
const response = await this.evaluateForFlag(flagKey, context, logger);
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlagrBatchEvaluator.load()can resolvenull(no segment match), but the non-batching path appears to return a non-null evaluation response. This makesresponsepotentially null in allresolve*Evaluationmethods, which can break existing logic that reads fields likeresponse.variantKey/response.variantAttachment. Align the contract by either (a) making the batching path return the same non-null shape as evaluateFlag(e.g., a response withvariantKeyundefined), or (b) adding an explicitnullhandling branch in each evaluation method immediately after awaitingevaluateForFlag` (and before any property access).

Copilot uses AI. Check for mistakes.
const entityContext = transformContext(context);

const sortedEntries = Object.entries(entityContext).sort(([a], [b]) => a.localeCompare(b));
const entityDedupKey = `${entityID}::${JSON.stringify(sortedEntries)}`;
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entityDedupKeyincludesentityContext, but result mapping only keys on flagKey + entityID. If two pending evaluations share the same entityIDbut have differententityContext(or differententityType), their responses are not distinguishable in resultMapand will collide/overwrite, causing incorrect promise resolution. A concrete fix is to avoid putting multiple entries with the sameentityIDinto the same HTTP request: group pending work by a dedup key that includesentityType(and context if you want to keep it distinct), then issue one batch request per group (still batching multiple flags per entity), and map results back using that group’s identity (since you know which entity the request was for). Also includeentityType` in internal keys to avoid collisions between entities that share an ID across types.

Suggested change
const entityDedupKey = `${entityID}::${JSON.stringify(sortedEntries)}`;
const entityDedupKey = `${entityType}:${entityID}::${JSON.stringify(sortedEntries)}`;

Copilot uses AI. Check for mistakes.
Comment on lines +149 to +150
const key = `${result.flagKey}::${result.evalContext.entityID}`;
resultMap.set(key, result);
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entityDedupKeyincludesentityContext, but result mapping only keys on flagKey + entityID. If two pending evaluations share the same entityIDbut have differententityContext(or differententityType), their responses are not distinguishable in resultMapand will collide/overwrite, causing incorrect promise resolution. A concrete fix is to avoid putting multiple entries with the sameentityIDinto the same HTTP request: group pending work by a dedup key that includesentityType(and context if you want to keep it distinct), then issue one batch request per group (still batching multiple flags per entity), and map results back using that group’s identity (since you know which entity the request was for). Also includeentityType` in internal keys to avoid collisions between entities that share an ID across types.

Copilot uses AI. Check for mistakes.
Comment on lines +87 to +167
// Deduplicate entities
const entityMap = new Map<string, FlagrBatchEntity>();
for (const item of batch) {
if (!entityMap.has(item.entityDedupKey)) {
entityMap.set(item.entityDedupKey, item.entity);
}
}
const entities = [...entityMap.values()];

// Collect unique flag keys
const flagKeySet = new Set<string>();
for (const item of batch) {
flagKeySet.add(item.flagKey);
}
const allFlagKeys = [...flagKeySet];

// Split flag keys into chunks if maxBatchSize is set
const chunks: string[][] = [];
if (this.maxBatchSize === Infinity) {
chunks.push(allFlagKeys);
} else {
for (let i = 0; i < allFlagKeys.length; i += this.maxBatchSize) {
chunks.push(allFlagKeys.slice(i, i + this.maxBatchSize));
}
}

try {
// Build result map from all chunks
const resultMap = new Map<string, FlagrEvaluationResponse>();

for (const flagKeys of chunks) {
const requestBody = {
entities,
flagKeys,
enableDebug: true,
};

logger.debug(`Flagr batch evaluation request: ${JSON.stringify(requestBody)}`);

const controller = new AbortController();
const timeout = this.config.timeout ?? 5000;
const timeoutId = setTimeout(() => {
controller.abort();
}, timeout);

try {
const response = await fetch(`${this.config.baseUrl}/api/v1/evaluation/batch`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(requestBody),
signal: controller.signal,
});

if (!response.ok) {
throw new GeneralError(
`Flagr batch returned ${response.status.toString()}: ${response.statusText}`
);
}

const data = FlagrBatchEvaluationResponse.parse(await response.json());

for (const result of data.evaluationResults) {
const key = `${result.flagKey}::${result.evalContext.entityID}`;
resultMap.set(key, result);
}
} finally {
clearTimeout(timeoutId);
}
}

// Resolve each pending promise
for (const item of batch) {
const key = `${item.flagKey}::${item.entity.entityID}`;
const result = resultMap.get(key);

if (result === undefined) {
item.reject(new FlagNotFoundError(`Flag '${item.flagKey}' not found`));
} else if (!result.variantKey) {
item.resolve(null);
} else {
item.resolve(result);
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entityDedupKeyincludesentityContext, but result mapping only keys on flagKey + entityID. If two pending evaluations share the same entityIDbut have differententityContext(or differententityType), their responses are not distinguishable in resultMapand will collide/overwrite, causing incorrect promise resolution. A concrete fix is to avoid putting multiple entries with the sameentityIDinto the same HTTP request: group pending work by a dedup key that includesentityType(and context if you want to keep it distinct), then issue one batch request per group (still batching multiple flags per entity), and map results back using that group’s identity (since you know which entity the request was for). Also includeentityType` in internal keys to avoid collisions between entities that share an ID across types.

Suggested change
// Deduplicate entities
const entityMap = new Map<string, FlagrBatchEntity>();
for (const item of batch) {
if (!entityMap.has(item.entityDedupKey)) {
entityMap.set(item.entityDedupKey, item.entity);
}
}
const entities = [...entityMap.values()];
// Collect unique flag keys
const flagKeySet = new Set<string>();
for (const item of batch) {
flagKeySet.add(item.flagKey);
}
const allFlagKeys = [...flagKeySet];
// Split flag keys into chunks if maxBatchSize is set
const chunks: string[][] = [];
if (this.maxBatchSize === Infinity) {
chunks.push(allFlagKeys);
} else {
for (let i = 0; i < allFlagKeys.length; i += this.maxBatchSize) {
chunks.push(allFlagKeys.slice(i, i + this.maxBatchSize));
}
}
try {
// Build result map from all chunks
const resultMap = new Map<string, FlagrEvaluationResponse>();
for (const flagKeys of chunks) {
const requestBody = {
entities,
flagKeys,
enableDebug: true,
};
logger.debug(`Flagr batch evaluation request: ${JSON.stringify(requestBody)}`);
const controller = new AbortController();
const timeout = this.config.timeout ?? 5000;
const timeoutId = setTimeout(() => {
controller.abort();
}, timeout);
try {
const response = await fetch(`${this.config.baseUrl}/api/v1/evaluation/batch`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(requestBody),
signal: controller.signal,
});
if (!response.ok) {
throw new GeneralError(
`Flagr batch returned ${response.status.toString()}: ${response.statusText}`
);
}
const data = FlagrBatchEvaluationResponse.parse(await response.json());
for (const result of data.evaluationResults) {
const key = `${result.flagKey}::${result.evalContext.entityID}`;
resultMap.set(key, result);
}
} finally {
clearTimeout(timeoutId);
}
}
// Resolve each pending promise
for (const item of batch) {
const key = `${item.flagKey}::${item.entity.entityID}`;
const result = resultMap.get(key);
if (result === undefined) {
item.reject(new FlagNotFoundError(`Flag '${item.flagKey}' not found`));
} else if (!result.variantKey) {
item.resolve(null);
} else {
item.resolve(result);
// Group pending evaluations by entityDedupKey so that each HTTP request
// only contains a single logical entity (including its context/type).
const groups = new Map<
string,
{
entity: FlagrBatchEntity;
flagKeys: Set<string>;
items: PendingEvaluation[];
}
>();
for (const item of batch) {
let group = groups.get(item.entityDedupKey);
if (!group) {
group = {
entity: item.entity,
flagKeys: new Set<string>(),
items: [],
};
groups.set(item.entityDedupKey, group);
}
group.flagKeys.add(item.flagKey);
group.items.push(item);
}
try {
// Process each entity group separately to avoid collisions between
// entities that share an ID but differ in context or type.
for (const group of groups.values()) {
const entities = [group.entity];
const allFlagKeys = [...group.flagKeys];
// Split flag keys into chunks if maxBatchSize is set
const chunks: string[][] = [];
if (this.maxBatchSize === Infinity) {
chunks.push(allFlagKeys);
} else {
for (let i = 0; i < allFlagKeys.length; i += this.maxBatchSize) {
chunks.push(allFlagKeys.slice(i, i + this.maxBatchSize));
}
}
// Build result map for this group from all its chunks
const resultMap = new Map<string, FlagrEvaluationResponse>();
for (const flagKeys of chunks) {
if (flagKeys.length === 0) {
continue;
}
const requestBody = {
entities,
flagKeys,
enableDebug: true,
};
logger.debug(`Flagr batch evaluation request: ${JSON.stringify(requestBody)}`);
const controller = new AbortController();
const timeout = this.config.timeout ?? 5000;
const timeoutId = setTimeout(() => {
controller.abort();
}, timeout);
try {
const response = await fetch(`${this.config.baseUrl}/api/v1/evaluation/batch`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(requestBody),
signal: controller.signal,
});
if (!response.ok) {
throw new GeneralError(
`Flagr batch returned ${response.status.toString()}: ${response.statusText}`
);
}
const data = FlagrBatchEvaluationResponse.parse(await response.json());
for (const result of data.evaluationResults) {
// Within a group, all results are for the same entity, so we
// can safely key only by flagKey.
const key = result.flagKey;
resultMap.set(key, result);
}
} finally {
clearTimeout(timeoutId);
}
}
// Resolve each pending promise in this group
for (const item of group.items) {
const key = item.flagKey;
const result = resultMap.get(key);
if (result === undefined) {
item.reject(new FlagNotFoundError(`Flag '${item.flagKey}' not found`));
} else if (!result.variantKey) {
item.resolve(null);
} else {
item.resolve(result);
}

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +121
const requestBody = {
entities,
flagKeys,
enableDebug: true,
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enableDebugis always set totrue, which can significantly increase response payload size (eval debug logs) and CPU/memory usage when batching many flags/entities. Consider making this configurable (e.g., via provider config), defaulting to false`, or tying it to an explicit debug mode rather than always enabling it in production paths.

Suggested change
const requestBody = {
entities,
flagKeys,
enableDebug: true,
const enableDebug = process.env.FLAGR_ENABLE_DEBUG === 'true';
const requestBody = {
entities,
flagKeys,
enableDebug,

Copilot uses AI. Check for mistakes.
.object({
enabled: z.boolean(),
maxBatchSize: z.number().min(1).optional(),
scheduleFn: z.custom<(fn: () => void) => void>().optional(),
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z.custom()does not validate at runtime by default, soscheduleFncan be any value and still pass config parsing. Prefer a runtime-validating schema (e.g., a function schema) or add a refinement to ensurescheduleFn` is actually callable, so misconfiguration fails fast with a clear validation error.

Suggested change
scheduleFn: z.custom<(fn: () => void) => void>().optional(),
scheduleFn: z
.function()
.args(z.function().args().returns(z.void()))
.returns(z.void())
.optional(),

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants