Commit 44487d8

fix: Set default embedding max batch size to 1024 (#1116)
This change lowers the default embedding request batch size based on observed production-like logs under S0 throttling, where 2048 repeatedly hit 429 and 1024 progressed successfully.

## Changes

- `EmbeddingRetryOptions.MaxEmbeddingBatchSize` default: `2048` -> `1024`
- `AIOptions:EmbeddingRetry:MaxEmbeddingBatchSize` in `EssentialCSharp.Web/appsettings.json`: `2048` -> `1024`

## Why

Recent run data showed sustained retry exhaustion at 2048 and successful completion after adaptive downshift to 1024. Setting 1024 as the default improves out-of-the-box behavior under throttled tiers while preserving configurability.
1 parent 920c021 commit 44487d8

2 files changed

Lines changed: 2 additions & 2 deletions

EssentialCSharp.Chat.Shared/Models/EmbeddingRetryOptions.cs

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ public sealed class EmbeddingRetryOptions
     /// The service may adaptively downshift below this value when throttled.
     /// </summary>
     [Range(1, 2048)]
-    public int MaxEmbeddingBatchSize { get; set; } = 2048;
+    public int MaxEmbeddingBatchSize { get; set; } = 1024;
 
     /// <summary>
     /// Minimum delay between embedding API requests in milliseconds.
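
The doc comment above says the service may adaptively downshift below `MaxEmbeddingBatchSize` when throttled, and the commit message reports a run that only completed after downshifting from 2048 to 1024. The repository's actual downshift logic is not shown in this commit, so the following is only a minimal sketch of one plausible policy (halve the batch size on each throttled attempt). The names `AdaptiveBatching`, `SendAll`, and `trySend` are hypothetical, and a real implementation would presumably also wait between attempts according to the `BaseDelayMs`, `MaxDelayMs`, and `BackoffMultiplier` settings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not the repository's implementation.
public static class AdaptiveBatching
{
    // Sends all items in batches of at most maxBatchSize.
    // trySend returns true on success and false when the request was throttled
    // (for example, an HTTP 429 response). On each throttled attempt the batch
    // size is halved (assumed policy), never dropping below 1.
    // Returns the batch size in effect when the last batch was sent.
    public static int SendAll<T>(
        List<T> items,
        int maxBatchSize,
        Func<List<T>, bool> trySend,
        int maxRetries = 5)
    {
        // Mirror the [Range(1, 2048)] constraint on the configured value.
        int batchSize = Math.Clamp(maxBatchSize, 1, 2048);
        int index = 0;
        int consecutiveFailures = 0;

        while (index < items.Count)
        {
            int count = Math.Min(batchSize, items.Count - index);
            List<T> batch = items.GetRange(index, count);

            if (trySend(batch))
            {
                index += count;
                consecutiveFailures = 0;
            }
            else
            {
                consecutiveFailures++;
                if (consecutiveFailures > maxRetries)
                {
                    throw new InvalidOperationException(
                        "Retries exhausted while throttled.");
                }
                // Downshift: retry the same position with a smaller batch.
                batchSize = Math.Max(1, batchSize / 2);
            }
        }

        return batchSize;
    }
}
```

With a simulated service that rejects any batch larger than 1024 items, starting at 2048 costs one throttled attempt before the loop settles at 1024, whereas starting at 1024 never gets throttled. That mirrors the rationale given for the new default.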

EssentialCSharp.Web/appsettings.json

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
     "MaxRetries": 5,
     "BaseDelayMs": 1000,
     "MaxDelayMs": 60000,
-    "MaxEmbeddingBatchSize": 2048,
+    "MaxEmbeddingBatchSize": 1024,
     "MinInterRequestDelayMs": 250,
     "BackoffMultiplier": 2.0,
     "MaxJitterFraction": 0.2
