Skip to content

Ai gateway#2937

Merged
predic8 merged 53 commits into
masterfrom
ai-gateway
May 27, 2026
Merged

Ai gateway#2937
predic8 merged 53 commits into
masterfrom
ai-gateway

Conversation

@predic8

@predic8 predic8 commented May 7, 2026

Copy link
Copy Markdown
Member

Summary by CodeRabbit

  • New Features

    • Added LLM Gateway interceptor supporting Claude, Google Gemini, and OpenAI providers
    • Introduced per-user API key management with token budget enforcement
    • Added Server-Sent Events (SSE) parser for streaming response handling
    • Enhanced JSON parsing and manipulation utilities
  • Documentation

    • Added comprehensive LLM gateway tutorials for Claude, Google, and OpenAI with example configurations
  • Tests

    • Added integration test suites for LLM gateway providers with multi-user scenarios
  • Chores

    • Updated license headers to Apache 2.0 across multiple files
    • Consolidated import statements

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduces an LLM Gateway interceptor with providers (OpenAI, Claude, Google), SSE parsing, request/response models, policy enforcement, token-usage stores (in-memory/JDBC), error creators, tutorials/configs, and comprehensive tests including streaming.

Changes

LLM Gateway Integration

Layer / File(s) Summary
LLM Gateway and ecosystem setup
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/*, core/.../provider/**/*, core/.../store/*, core/.../util/**/*, distribution/tutorials/ai/llm-gateway/**/*, distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/**/*, core/src/test/**/*, annot/.../ObjectBinder.java, minor MCP/jsonrpc headers/imports
Adds interceptor, contracts, SSE/event parsing, providers (OpenAI/Claude/Google), policies/system prompt, usage stores (simple/JDBC), error creators, tutorials and fixtures, and tests (incl. streaming). Minor utility and header/import tweaks included.

Sequence Diagram(s)

(skipped)

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

7.x

Suggested reviewers

  • rrayst

Poem

A rabbit taps keys in a meadow of code,
Streams whisper tokens along the road.
Claude, OpenAI, and Gemini sing,
Policies prune what the models bring.
Keys are shared, budgets align—
Gateway hums, responses fine.
Thump-thump—ship time! 🐇✨

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ai-gateway

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🧹 Nitpick comments (6)
annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java (1)

102-108: 💤 Low value

instantiate declares checked exceptions that are already swallowed by bind's catch-all

The helper correctly surfaces NoSuchMethodException as a descriptive error. Minor note: InvocationTargetException, InstantiationException, and IllegalAccessException are declared in the throws clause, but since instantiate is private and its only caller (bind) already catches Throwable, these declarations are noise. Consider removing them, which also removes the need for bind to implicitly rely on the catch-all for those paths.

🔧 Optional cleanup
-    private static <T> `@NotNull` T instantiate(Class<T> clazz) throws InvocationTargetException, InstantiationException, IllegalAccessException {
+    private static <T> `@NotNull` T instantiate(Class<T> clazz) {
         try {
             return clazz.getConstructor().newInstance();
         } catch (NoSuchMethodException e) {
             throw new ConfigurationParsingException("Class %s does not have a public no-arg constructor.".formatted(clazz.getName()));
+        } catch (InvocationTargetException | InstantiationException | IllegalAccessException e) {
+            throw new ConfigurationParsingException(e);
         }
     }

This keeps all exception handling inside the helper and makes the contract self-contained, while bind()'s catch (ConfigurationParsingException e) still applies the source-file context as before.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java`
around lines 102 - 108, The instantiate helper should not declare
InvocationTargetException, InstantiationException, or IllegalAccessException
since its caller bind already swallows Throwable; instead, remove those
exceptions from the method signature and catch InvocationTargetException,
InstantiationException, and IllegalAccessException inside instantiate,
converting them to a ConfigurationParsingException with a clear message
(including clazz.getName() and the original exception as the cause); keep the
existing NoSuchMethodException handling as-is; update nothing in bind beyond
removing the throws dependency so bind continues to catch
ConfigurationParsingException for source-file context.
core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java (1)

60-71: Open @TODO: migrate subclass connections to getConnection()

The comment on line 60 documents that subclasses should be migrated to this method but haven't been yet. Without that migration the deduplication benefit is unrealised.

Would you like me to open a tracking issue for migrating existing subclasses to use getConnection()?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java`
around lines 60 - 71, Subclasses are still opening connections directly instead
of using AbstractJdbcSupport.getConnection(), preventing centralized error
handling and deduplication; update all subclasses that currently call
datasource.getConnection() or manage SQLException to instead call
getConnection() so they benefit from the
ClassNotFoundException-to-ConfigurationException mapping provided by
getRootCause() and ConfigurationException, remove duplicate try/catch logic from
those subclasses, and ensure imports and tests reflect the consolidated behavior
(look for usages of datasource.getConnection(), direct new SQLException
handling, and any class-specific connection helpers to replace with
getConnection()).
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java (1)

16-16: 💤 Low value

Consider documenting the checkLimit return value contract

It's unclear whether the return value means "remaining tokens," "0 means deny," or something else. A brief Javadoc would help implementors (e.g., JDBCAiApiUsageStore currently returns 0 always, which may be interpreted as "limit reached").

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java`
at line 16, Add a clear Javadoc to the AiApiStore.checkLimit(AiApiUser user)
method that defines the return-value contract (e.g., that it returns the number
of remaining allowed tokens/requests, that 0 means no remaining allowance and
should be treated as "deny", and any special sentinel values like -1 mean
"unlimited"), and update the JDBCAiApiUsageStore implementation to conform to
that contract (it currently returns 0 always which can be misinterpreted);
reference the method name checkLimit and the JDBCAiApiUsageStore implementation
when making the change so callers and implementors know the exact semantics.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java (2)

91-108: 💤 Low value

Optional: replace bare RuntimeException with a typed exception.

new RuntimeException(e) in setJsonResponse/getJson swallows the failure category; a domain-specific exception (or at least an IllegalStateException) makes upstream handling/logging clearer and avoids mixing programmer errors with checked-IO failures.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
around lines 91 - 108, Replace the bare RuntimeException throws in
setJsonResponse and getJson with a more specific unchecked exception (e.g.,
IllegalStateException or a domain-specific exception) so failure semantics are
clearer; update setJsonResponse to throw IllegalStateException with a contextual
message when JsonProcessingException occurs, and update getJson to throw
IllegalStateException (or a new OpenAIApiException) for both the "Expected JSON
Object" case and the caught IOException, including the original exception as the
cause and useful context (referencing methods setJsonResponse, getJson, om,
Exchange and Flow to locate the code).

71-89: ⚡ Quick win

Recommended: avoid parse/re-serialize when no transformation is needed.

getJson (line 71) and setJsonResponse (line 87) run on every request even when both maxOutputTokens and maxInputTokens are 0 (i.e., nothing to mutate). This adds latency and forces every request through this interceptor to be a JSON object — a non-JSON request will throw RuntimeException from getJson and surface as an internal server error. Skip the body work when there is nothing to do.

♻️ Proposed shape
-        var json = getJson(exc, REQUEST);
-
-        if (maxOutputTokens != 0) {
-            json.put(MAX_OUTPUT_TOKENS, maxOutputTokens);
-        }
-
-        if (maxInputTokens != 0) {
-            var input = json.get("input");
-            if (input != null) {
-                var estimated = estimateTokens(input.asText());
-                if (estimated > maxInputTokens) {
-                    exc.setResponse(contextLengthExceeded(maxInputTokens, estimated));
-                    return RETURN;
-                }
-            }
-        }
-        setJsonResponse(exc, json);
-        return CONTINUE;
+        if (maxOutputTokens == 0 && maxInputTokens == 0) {
+            return CONTINUE;
+        }
+        var json = getJson(exc, REQUEST);
+        if (maxInputTokens != 0) {
+            var input = json.get("input");
+            if (input != null) {
+                var estimated = estimateTokens(input.asText());
+                if (estimated > maxInputTokens) {
+                    exc.setResponse(contextLengthExceeded(maxInputTokens, estimated));
+                    return RETURN;
+                }
+            }
+        }
+        if (maxOutputTokens != 0) {
+            json.put(MAX_OUTPUT_TOKENS, maxOutputTokens);
+        }
+        setJsonResponse(exc, json);
+        return CONTINUE;

Also note: input-token check happens after max_output_tokens is written into json; it's safer to validate first so a rejected request never has its body mutated.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
around lines 71 - 89, Detect when both maxOutputTokens and maxInputTokens are
zero and short-circuit the interceptor before calling getJson or setJsonResponse
so you skip parsing/serializing entirely; locate the logic in
OpenAIAPIInterceptor.java around the getJson(...) and setJsonResponse(...) calls
and return CONTINUE immediately if both limits are 0. Additionally, when
maxInputTokens is > 0 or maxOutputTokens is > 0, validate the input token count
first (use estimateTokens on the request input via json.get("input") and call
exc.setResponse(contextLengthExceeded(...))/return RETURN if exceeded) before
mutating the json with json.put(MAX_OUTPUT_TOKENS, ...), ensuring getJson is
only invoked when you will actually inspect or modify the body.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)

11-11: 💤 Low value

Minor: formatting nit on @MCElement attribute list.

Missing space after the comma in component = false, id="simple-ai-api-store" (name="simpleStore",component). Cosmetic only.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
at line 11, The `@MCElement` annotation on the SimpleAiApiStore class has a
missing space after the comma in its attribute list; update the annotation on
SimpleAiApiStore (the `@MCElement`(...) line) to read name="simpleStore",
component = false, id="simple-ai-api-store" (ensure there is a space after the
comma between attributes and keep attribute names/values unchanged) so the
formatting is consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java`:
- Around line 45-48: The current check in AiUtil that uses
ah.indexOf(BEARER_PREFIX) allows "Bearer" to appear anywhere in the header
(e.g., "XBearer ..."); change the logic to ensure the Authorization header value
starts with the Bearer scheme by using ah.trim().startsWith(BEARER_PREFIX) (or
require index == 0 after trimming) before extracting the token in the method
that currently performs the indexOf check; after confirming the prefix at the
start, strip the exact prefix and any leading whitespace to return the token.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Line 65: The stored property MEMBRANE_AI_USERTOKEN currently holds an
Optional<AiApiUser> (set via exc.setProperty(..., user)) but is later read with
exc.getProperty(MEMBRANE_AI_USERTOKEN, String.class) for per-user tracking;
change the stored representation to the actual bearer/user token string so the
typed accessor matches: when setting MEMBRANE_AI_USERTOKEN store
user.get().getToken() (or whichever String field represents the user/bearer)
instead of the Optional, and ensure any code that reads it
(exc.getProperty(MEMBRANE_AI_USERTOKEN, String.class) and
SimpleAiApiStore.store(String user, ...)) expects that String token
consistently.
- Around line 147-149: The getAiStore() accessor currently returns null; change
it to return the configured AiApiStore instance used by the interceptor (e.g.,
return the aiStore/aiApiStore field or delegate to the existing getter used by
other accessors) so reflective access and tests see the actual store; update the
OpenAIAPIInterceptor#getAiStore method to return the interceptor's stored
AiApiStore instance (matching the field name used in the class) instead of null.
- Around line 111-126: handleResponse currently calls store.store(...)
unconditionally which causes an NPE when no aiStore is configured; update the
OpenAIAPIInterceptor.handleResponse method to guard the usage-storage logic with
a null-check (if (store != null)) before calling store.store and getUsage,
mirroring the guard used in handleRequest, and ensure you still continue
processing (return CONTINUE) when store is null; reference symbols:
handleResponse, store, store.store(...), getUsage(...), MEMBRANE_AI_USERTOKEN.
- Around line 128-134: In OpenAIAPIInterceptor.getUsage, the current calls
usage.get("input_tokens").asInt() etc. will NPE if any field is missing; change
to use the safe JSON path accessor (e.g., usage.path("input_tokens").asInt(0))
for each token field so missing keys fall back to 0, and keep the existing guard
for a completely missing usage node (return new Usage(0,0,0)). Ensure you
construct the Usage with the three safe asInt(0) results.
- Around line 120-123: The current check json.get("error").isNull() causes an
NPE when the "error" field is absent; change to a null-safe path lookup using
JsonNode.path(): obtain JsonNode error = json.path("error") and then check both
!error.isMissingNode() && !error.isNull() before returning CONTINUE (update the
code in OpenAIAPIInterceptor where json.get("error") is used).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java`:
- Around line 25-35: contextLengthExceeded currently serializes ErrorBody
directly producing a flat JSON; change it to wrap the ErrorBody inside an
ErrorEnvelope (same pattern used by authenticationFailed() and
tokenLimitExceeded()) before calling createJson so the response shape becomes
{"error":{...}}; update the return in contextLengthExceeded to build the
badRequest().json(createJson(new ErrorEnvelope(new ErrorBody(...)))) response
using the existing ErrorEnvelope and ErrorBody classes.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java`:
- Around line 17-38: AiApiLimit has concurrent mutations on tokens and
nextReset; make mutations atomic by synchronizing access: mark both checkLimit()
and addTokens(long) as synchronized so reads/writes to tokens and nextReset
happen under the same intrinsic lock for AiApiLimit, ensuring the reset branch
in checkLimit() and the this.tokens += tokens update are thread-safe;
alternatively, replace long tokens with AtomicLong and guard nextReset/reset
logic with a synchronized block, but keep checkLimit() and addTokens() logically
synchronized on the instance so tokens and nextReset stay consistent.
- Around line 22-24: The constructor sets nextReset using period which defaults
to 0, causing immediate resets; change initialization so nextReset is computed
lazily in checkLimit() (or validate period>0 in init()). Specifically, in class
AiApiLimit defer setting nextReset in the constructor, and in method
checkLimit() detect an uninitialized state (e.g., period <= 0 or nextReset ==
null/equals now sentinel) and then set nextReset = now().plusSeconds(period)
(after ensuring period>0) and initialize tokens appropriately; alternatively add
an init() validation in AiApiLimit to assert period > 0 and throw/configure
before any checkLimit() runs. Ensure references: AiApiLimit constructor, field
nextReset, field period, method checkLimit, and optional init().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`:
- Around line 53-61: The methods in JDBCAiApiUsageStore — getUser(String) and
checkLimit(AiApiUser) — are stubbed and silently break
authentication/rate-limiting; replace the current bodies so they throw
UnsupportedOperationException with a clear message like "getUser not implemented
in JDBCAiApiUsageStore" / "checkLimit not implemented in JDBCAiApiUsageStore"
(or alternatively add explicit TODO comments) so callers fail fast instead of
being silently denied; ensure messages reference the method names and suggest
implementing them or using a different store, and keep the existing store()
functionality intact.
- Around line 13-22: The CREATE_TABLE_SQL in JDBCAiApiUsageStore uses
PostgreSQL's "BIGINT GENERATED ALWAYS AS IDENTITY" which breaks MySQL/MariaDB;
update createTablesIfNotExist()/JDBCAiApiUsageStore to use database-specific
DDL: detect the JDBC URL/DatabaseMetaData and emit "BIGINT AUTO_INCREMENT
PRIMARY KEY" for MySQL/MariaDB and keep the IDENTITY form for PostgreSQL, or
alternatively switch to a portable approach (e.g., remove DB identity and
generate IDs in application code) so the CREATE_TABLE_SQL constant is
chosen/constructed per database engine.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 20-33: store(...) and checkLimit(...) currently use a single
shared AiApiLimit (limit) so per-user quotas are not enforced and concurrent
updates are unsafe; change to maintain per-user limits (e.g., a
ConcurrentHashMap<String, AiApiLimit> or Map<AiApiUser, AiApiLimit> keyed by
token/username) and update store(user, usage) to add tokens to the caller's
AiApiLimit, and change checkLimit(AiApiUser) to consult that user's AiApiLimit;
also make AiApiLimit thread-safe (use AtomicLong/LongAdder for counters or
synchronize access) so addTokens/checkLimit are safe under concurrent requests,
or if you intend a global budget instead rename the field/methods to reflect a
shared limit and document it.
- Around line 16-28: The users field in SimpleAiApiStore is not initialized,
causing getUser(String) to NPE on users.stream(); fix by defensively
initializing the users field (private List<AiApiUser> users = new ArrayList<>())
or by guarding getUser(...) with a null/empty check (e.g., return
Optional.empty() if users is null) and ensure imports (java.util.ArrayList) are
added; update any related logic (store(...) and getUser(...)) to work with the
initialized/guarded users list.

---

Nitpick comments:
In
`@annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java`:
- Around line 102-108: The instantiate helper should not declare
InvocationTargetException, InstantiationException, or IllegalAccessException
since its caller bind already swallows Throwable; instead, remove those
exceptions from the method signature and catch InvocationTargetException,
InstantiationException, and IllegalAccessException inside instantiate,
converting them to a ConfigurationParsingException with a clear message
(including clazz.getName() and the original exception as the cause); keep the
existing NoSuchMethodException handling as-is; update nothing in bind beyond
removing the throws dependency so bind continues to catch
ConfigurationParsingException for source-file context.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Around line 91-108: Replace the bare RuntimeException throws in
setJsonResponse and getJson with a more specific unchecked exception (e.g.,
IllegalStateException or a domain-specific exception) so failure semantics are
clearer; update setJsonResponse to throw IllegalStateException with a contextual
message when JsonProcessingException occurs, and update getJson to throw
IllegalStateException (or a new OpenAIApiException) for both the "Expected JSON
Object" case and the caught IOException, including the original exception as the
cause and useful context (referencing methods setJsonResponse, getJson, om,
Exchange and Flow to locate the code).
- Around line 71-89: Detect when both maxOutputTokens and maxInputTokens are
zero and short-circuit the interceptor before calling getJson or setJsonResponse
so you skip parsing/serializing entirely; locate the logic in
OpenAIAPIInterceptor.java around the getJson(...) and setJsonResponse(...) calls
and return CONTINUE immediately if both limits are 0. Additionally, when
maxInputTokens is > 0 or maxOutputTokens is > 0, validate the input token count
first (use estimateTokens on the request input via json.get("input") and call
exc.setResponse(contextLengthExceeded(...))/return RETURN if exceeded) before
mutating the json with json.put(MAX_OUTPUT_TOKENS, ...), ensuring getJson is
only invoked when you will actually inspect or modify the body.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java`:
- Line 16: Add a clear Javadoc to the AiApiStore.checkLimit(AiApiUser user)
method that defines the return-value contract (e.g., that it returns the number
of remaining allowed tokens/requests, that 0 means no remaining allowance and
should be treated as "deny", and any special sentinel values like -1 mean
"unlimited"), and update the JDBCAiApiUsageStore implementation to conform to
that contract (it currently returns 0 always which can be misinterpreted);
reference the method name checkLimit and the JDBCAiApiUsageStore implementation
when making the change so callers and implementors know the exact semantics.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Line 11: The `@MCElement` annotation on the SimpleAiApiStore class has a missing
space after the comma in its attribute list; update the annotation on
SimpleAiApiStore (the `@MCElement`(...) line) to read name="simpleStore",
component = false, id="simple-ai-api-store" (ensure there is a space after the
comma between attributes and keep attribute names/values unchanged) so the
formatting is consistent.

In
`@core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java`:
- Around line 60-71: Subclasses are still opening connections directly instead
of using AbstractJdbcSupport.getConnection(), preventing centralized error
handling and deduplication; update all subclasses that currently call
datasource.getConnection() or manage SQLException to instead call
getConnection() so they benefit from the
ClassNotFoundException-to-ConfigurationException mapping provided by
getRootCause() and ConfigurationException, remove duplicate try/catch logic from
those subclasses, and ensure imports and tests reflect the consolidated behavior
(look for usages of datasource.getConnection(), direct new SQLException
handling, and any class-specific connection helpers to replace with
getConnection()).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5199bbff-c303-4ad8-a638-e79ae07a87e6

📥 Commits

Reviewing files that changed from the base of the PR and between 1af0001 and d792f5f.

📒 Files selected for processing (12)
  • annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/Usage.java
  • core/src/main/java/com/predic8/membrane/core/security/AbstractSecurityScheme.java
  • core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java

Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java Outdated
Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java Outdated
predic8 added 5 commits May 7, 2026 15:58
- Introduced thread-safe `AtomicLong` for token management.
- Synchronized reset logic in `AiApiLimit`.
- Improved error handling and null checks in OpenAI API interactions.
- Default-initialized user list in `SimpleAiApiStore`.
- Fixed getter for `AiApiStore` in interceptor.
- Removed `AiUtil` and replaced with modular `AiProvider` interface.
- Added provider implementations: `Claude`, `OpenAI`.
- Updated `OpenAIAPIInterceptor` to use configurable providers and enforce model restrictions.
- Introduced `NoAiApiLimit` for simplified limit management.
- Enhanced error handling with model validation in `OpenAiApiUtil`.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (2)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java (1)

69-69: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Property type mismatch breaks usage tracking (AiApiUser stored, String retrieved).

You store user object in MEMBRANE_AI_USERTOKEN but later read it as String.class; this will yield null/cast issues and wrong accounting keys.

Also applies to: 137-137

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
at line 69, You’re storing an AiApiUser object into the MEMBRANE_AI_USERTOKEN
property via exc.setProperty but other code expects a String; change the set to
store a String key instead (e.g., exc.setProperty(MEMBRANE_AI_USERTOKEN,
user.getId() or user.getUsername()/getToken()) so the later retrieval as
String.class works), and make the same change at the other occurrence around the
second site (the other set at line ~137) so both storage and retrieval use the
same String identifier rather than an AiApiUser instance.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)

21-34: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Quota accounting is global, not per user.

user is ignored for limit tracking, so one heavy caller can exhaust quota for everyone. This defeats per-user enforcement.

Based on learnings: verify shared-state protections are preserved when logic is split or delegated; here the shared limit state is central and should be explicitly scoped per user.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
around lines 21 - 34, The quota accounting currently uses a single shared
"limit" object so store(String user, Usage usage) and checkLimit(AiApiUser user)
ignore the user and apply/globalize tokens; change this to a per-user limit map
(e.g., Map<String, Limit> or Map<AiApiUser, Limit>) and update store to look up
the caller's Limit by token/user and call addTokens on that per-user Limit, and
update checkLimit(AiApiUser user) to query that same per-user Limit; ensure the
map uses a concurrent implementation (ConcurrentHashMap) or other
synchronization to protect shared state and initialize a Limit for a user on
first use (using get-or-create semantics) so existing methods getUser, store,
and checkLimit operate per-user rather than against the single "limit" field.
🧹 Nitpick comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java (1)

3-8: ⚡ Quick win

NoAiApiLimit enforces a fixed limit of 1000 instead of "no limit".

The class name implies unlimited access, but checkLimit() returns a hardcoded 1000. Since checkLimit() semantically represents remaining quota and is used to reject requests when the value drops to zero, this effectively caps all users at 1000 tokens. Either rename the class to reflect the actual limit (e.g., DefaultAiApiLimit) or return Long.MAX_VALUE (or similar unbounded sentinel) to match the "no limit" semantics.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java`
around lines 3 - 8, The class NoAiApiLimit currently returns a hardcoded 1000 in
checkLimit(), which contradicts its "no limit" name and caps usage; update the
implementation of NoAiApiLimit.checkLimit() to return an unbounded sentinel such
as Long.MAX_VALUE (or another agreed "no limit" value) so it truly represents
unlimited quota, or alternatively rename the class (e.g., DefaultAiApiLimit) and
keep the 1000 if the intention is a fixed default limit; reference NoAiApiLimit
and its checkLimit() method (and AiApiLimit base type) when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Around line 80-89: The current logic only calls
provider.estimateInputTokens(json) if json.get("input") is present, which skips
providers that estimate from other fields; change the block in
OpenAIAPIInterceptor so that whenever maxInputTokens != 0 you call
provider.estimateInputTokens(json) unconditionally (remove the input null
check), then compare the returned estimated value to maxInputTokens and, if
exceeded, call exc.setResponse(contextLengthExceeded(maxInputTokens, estimated))
and return RETURN; ensure you keep the same symbols: maxInputTokens,
provider.estimateInputTokens(json), exc.setResponse(...),
contextLengthExceeded(...), and RETURN.
- Around line 56-57: provider is used without validation and can NPE when
calling provider.getApiKey(...) or provider.setApiKey(...); add a non-null
validation either at startup in the OpenAIAPIInterceptor.init() method (throw a
clear configuration exception if provider is null) or add a request-time guard
in the request handling path that checks if provider == null and returns a 5xx
response with a clear error message; update references to
provider.getApiKey(...) and provider.setApiKey(...) to only be executed after
this check so no NPE can occur.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java`:
- Around line 35-42: In OpenAI.java, the Bearer token extraction currently uses
ah.indexOf(BEARER_PREFIX) which matches the prefix anywhere; change the logic in
the method that reads ah to require a strict prefix (e.g., after trimming
leading whitespace the header must startWith BEARER_PREFIX or match a
case‑insensitive regex like ^Bearer\s+(\S+)$) and then extract the token from
that exact location (using BEARER_PREFIX.length()), returning null for any
header that does not strictly start with the Bearer prefix or that yields an
empty token; reference BEARER_PREFIX and the variable ah when making this
change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 27-29: getUser in SimpleAiApiStore uses u.getToken().equals(token)
which can NPE if a configured AiApiUser has a null token; change the equality
check to Objects.equals(u.getToken(), token) inside the getUser method to safely
handle nulls and add an import for java.util.Objects if not already present so
the stream filter becomes safe.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Line 69: You’re storing an AiApiUser object into the MEMBRANE_AI_USERTOKEN
property via exc.setProperty but other code expects a String; change the set to
store a String key instead (e.g., exc.setProperty(MEMBRANE_AI_USERTOKEN,
user.getId() or user.getUsername()/getToken()) so the later retrieval as
String.class works), and make the same change at the other occurrence around the
second site (the other set at line ~137) so both storage and retrieval use the
same String identifier rather than an AiApiUser instance.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 21-34: The quota accounting currently uses a single shared "limit"
object so store(String user, Usage usage) and checkLimit(AiApiUser user) ignore
the user and apply/globalize tokens; change this to a per-user limit map (e.g.,
Map<String, Limit> or Map<AiApiUser, Limit>) and update store to look up the
caller's Limit by token/user and call addTokens on that per-user Limit, and
update checkLimit(AiApiUser user) to query that same per-user Limit; ensure the
map uses a concurrent implementation (ConcurrentHashMap) or other
synchronization to protect shared state and initialize a Limit for a user on
first use (using get-or-create semantics) so existing methods getUser, store,
and checkLimit operate per-user rather than against the single "limit" field.

---

Nitpick comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java`:
- Around line 3-8: The class NoAiApiLimit currently returns a hardcoded 1000 in
checkLimit(), which contradicts its "no limit" name and caps usage; update the
implementation of NoAiApiLimit.checkLimit() to return an unbounded sentinel such
as Long.MAX_VALUE (or another agreed "no limit" value) so it truly represents
unlimited quota, or alternatively rename the class (e.g., DefaultAiApiLimit) and
keep the 1000 if the intention is a fixed default limit; reference NoAiApiLimit
and its checkLimit() method (and AiApiLimit base type) when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2ac0a859-8e9a-4f2a-bead-13d81b96b81b

📥 Commits

Reviewing files that changed from the base of the PR and between 8fa0d95 and 356a4c5.

📒 Files selected for processing (9)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Claude.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
✅ Files skipped from review due to trivial changes (2)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java

Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java Outdated
@membrane-ci-server

Copy link
Copy Markdown

This pull request needs "/ok-to-test" from an authorized committer.

…bstraction

- Added `AiApiRequest` and `AiApiResponse` abstractions for request/response handling.
- Introduced `AbstractAiApiRequest` and `AbstractAiApiResponse` as base classes.
- Implemented providers: `Google`, `OpenAI`, and `Claude` with concrete request/response handling.
- Updated `AiProvider` to handle request/response creation.
- Refactored `OpenAIAPIInterceptor` to leverage request/response abstraction and enforce contract restrictions.
- Enhanced `JsonUtil` with helper methods for JSON body parsing and updates.
- Updated `AiApiStore` and related classes for improved usage tracking and user abstraction.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java (1)

40-47: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bearer token extraction is too permissive — matches anywhere in the header value.

indexOf(BEARER_PREFIX) accepts headers like "Basic abc Bearer xyz" and extracts "xyz" as the token. The standard requires the scheme to be at the start of the value.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java`
around lines 40 - 47, The current extraction uses ah.indexOf(BEARER_PREFIX)
which wrongly matches the scheme anywhere in the header; change the logic to
require the scheme at the start (e.g., use ah.startsWith(BEARER_PREFIX) or a
regex anchored with ^) before calling ah.substring(...) and trimming, and return
null if the header does not begin with BEARER_PREFIX; keep using BEARER_PREFIX,
ah.substring(...), and token.isEmpty() checks but only after validating the
prefix is at the start.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java`:
- Around line 15-35: The method estimateInputTokens uses an int accumulator
named tokens which can overflow before being returned as a long; change the
accumulator declaration in estimateInputTokens from int to long and ensure all
constants/updates use long arithmetic (e.g., use 1000L and ensure divisions
produce long results or cast operands) when adding system, message, block and
image token estimates so no intermediate int overflow occurs when computing the
total from json.path("system") and json.path("messages").

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiRequest.java`:
- Around line 107-118: getGenerationConfig() can NPE because json may be null
(see estimateInputTokens guard); update getGenerationConfig() (used by
setMaxOutputTokens) to first ensure json is non-null and is an ObjectNode
(create/replace it with a new ObjectNode when null or not an object) before
calling json.get("generationConfig"), then return or create the
"generationConfig" ObjectNode; reference methods: getGenerationConfig,
setMaxOutputTokens and the field json.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java`:
- Around line 15-17: The estimateInputTokens() method in OpenAiAiRequest
currently uses json.path("input").asText("") which returns an empty string for
Chat Completions (no "input" field) and for ArrayNode inputs, making token
checks no-op; update estimateInputTokens() to inspect both "input" and
"messages": if "messages" is an array (Chat Completions) iterate it and
concatenate or sum the lengths of text-like fields (e.g., each message's
"content" or all string values) to compute total chars, if "input" is an array
iterate elements and for each element use asText("") for strings or extract
textual fields from objects before summing; finally compute tokens as
ceil(totalChars / 4.0) and return that long so OpenAIAPIInterceptor's
maxInputTokens check works correctly.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`:
- Line 15: The SQL DDL string in JDBCAiApiUsageStore (created in the
createTablesIfNotExist block) contains a Java text-block literal with a `//
`@TODO` ...` comment which is being embedded into the SQL and breaks parsing;
remove the `// `@TODO` ...` from the SQL text block or replace it with a valid SQL
comment (e.g., `/* ... */` or `-- ...`), ensuring the `id BIGINT GENERATED
ALWAYS AS IDENTITY PRIMARY KEY` line ends cleanly (no stray `//` token) so the
DDL executes successfully.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 89-98: getJsonObject currently throws a bare
RuntimeException/IOException on non-JSON or non-object bodies which bubbles out
and crashes the pipeline; change getJsonObject to return Optional<ObjectNode>
(signature: getJsonObject(Message msg) -> Optional<ObjectNode>) and implement it
to return Optional.empty() for non-object JSON or when parsing fails (wrap IO
errors in UncheckedIOException only if you must rethrow); then update callers
(notably AbstractAiApiResponse constructor and any usages in
OpenAIAPIInterceptor) to check the Optional and convert an empty result into a
structured error response or a controlled exception (e.g.,
IllegalArgumentException) so the interceptor can map it to an HTTP error instead
of letting an unchecked RuntimeException propagate.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java`:
- Around line 40-47: The current extraction uses ah.indexOf(BEARER_PREFIX) which
wrongly matches the scheme anywhere in the header; change the logic to require
the scheme at the start (e.g., use ah.startsWith(BEARER_PREFIX) or a regex
anchored with ^) before calling ah.substring(...) and trimming, and return null
if the header does not begin with BEARER_PREFIX; keep using BEARER_PREFIX,
ah.substring(...), and token.isEmpty() checks but only after validating the
prefix is at the start.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b118fe28-0fcb-4a38-a1db-897ea2b30c18

📥 Commits

Reviewing files that changed from the base of the PR and between 356a4c5 and 863e5cc.

📒 Files selected for processing (20)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Claude.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Google.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
✅ Files skipped from review due to trivial changes (4)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java

Comment thread core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java Outdated
…ponents

- Updated `checkLimit` method to consider input and output tokens.
- Improved token calculation logic in AI request providers.
- Enhanced JSON parsing in `JsonUtil` with Optional for safer operations.
- Added detailed error handling in `OpenAIAPIInterceptor` for invalid requests.
- Refined token estimation logic with safety margins and JSON structure considerations.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java`:
- Around line 69-72: The setter setPeriod(int period) must validate that period
> 0 and must update the `@GuardedBy`("lock") field nextReset while holding the
lock; modify setPeriod to throw an IllegalArgumentException for non-positive
period values and move the assignment of nextReset = now().plusSeconds(period)
inside the same critical section that protects nextReset (use the class's lock
or synchronized(lock)) so both period and nextReset are updated atomically under
the lock.
- Around line 39-53: The reset-check and token updates must be made atomic: move
the nextReset null/now.isAfter(nextReset) check and the subsequent
tokens.set(0)/nextReset update inside the same synchronized(lock) block and
ensure any token-modifying methods use that same lock; specifically wrap the
block that computes and returns maxTokens - tokens.get() - tokensForNextRequest
in synchronized(lock) (the method containing nextReset) and change
addTokens(long tokens) to synchronize on lock as well (or perform its add inside
synchronized(lock)) so tokens, nextReset and tokensForNextRequest are always
updated/read under the same lock.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 22-23: Replace the INFO-level log that prints per-user identifiers
in SimpleAiApiStore (the log.info call that uses user.getName() and usage) with
a lower-verbosity, non-identifying message: change log.info(...) to
log.debug(...) (or log.trace(...) if you prefer) and remove the user.getName()
from the message so only non-sensitive data like usage.totalTokens() or a
redacted indicator is logged; keep the limit.addTokens(usage.totalTokens())
behavior unchanged.
- Around line 40-42: The setter setUsers currently allows assigning null to the
users field which later causes NPEs in getUser when it calls users.stream();
update setUsers to guard against null by assigning an empty immutable list
(e.g., Collections.emptyList() or List.of()) when the incoming users argument is
null, so that getUser can safely call users.stream(); keep the field name users
and the method getUser unchanged so callers remain consistent.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 98-103: The code in JsonUtil.java reads a JsonNode with "JsonNode
jsonNode = om.readTree(msg.getBodyAsStreamDecoded())" but dereferences jsonNode
via jsonNode.getNodeType(), which can NPE when readTree returns null for empty
input; update the logic in the method containing that line to first check "if
(jsonNode == null) return Optional.empty();" (and/or log that input was empty)
before any access, and only call getNodeType() when jsonNode is non-null; also
keep the existing branch that returns Optional.of(on) when the node is an
ObjectNode and return Optional.empty() for any other node types.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f29cb49e-f9a8-4c53-b7fc-24c1061b037c

📥 Commits

Reviewing files that changed from the base of the PR and between 863e5cc and 2933dad.

📒 Files selected for processing (11)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java

Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java Outdated
Comment thread core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java Outdated
Comment thread core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java Outdated
predic8 added 10 commits May 8, 2026 13:28
- Synchronized token management methods in `AiApiLimit` to ensure thread safety.
- Adjusted log levels for `SimpleAiApiStore` to reduce verbosity.
- Added PostgreSQL dependency to the distribution.
- Updated logging configuration to set debug level for AI interceptors.
- Introduced `SSEUtil` for parsing Server-Sent Events (SSE) from chunks.
- Enhanced `AbstractAiApiRequest` to handle JSON requests conditionally.
- Deprecated and replaced `max_output_tokens` usage in specific providers.
- Improved stream support in `OpenAiAiRequest` with response usage tracking.
- Refactored token limit logic in `OpenAIAPIInterceptor` for better flow.
- Renamed `AiApiRequest` to `LLMRequest` and `AiApiResponse` to `LLMResponse`.
- Updated providers (`Google`, `OpenAI`, `Claude`) to align with `LLMProvider` interface.
- Refactored `OpenAIAPIInterceptor` to `LLMGatewayInterceptor` and related utilities.
- Removed `SSEUtil` and replaced with `SSEParser`.
- Improved streaming and token usage handling in `AbstractLLMResponse`.
- Modularized LLM responses for `Claude` and `OpenAI` providers.
- Replaced `LLMResponse` interface and `AbstractLLMResponse` with updated abstractions.
- Added `ChatCompletionsSSEParser` for advanced SSE chunk handling.
- Introduced specific SSE event classes: `ChatCompletionEvent`, `ChatCompletionDoneEvent`, `ResponsesApiEvent`.
- Renamed and restructured classes for consistency in AI namespace.
- Improved token usage tracking and event-based streaming.
…acking logic

- Deleted `ChatCompletionsSSEParser` and related classes/methods.
- Simplified `ChatCompletionEvent` by removing token usage parsing.
- Updated tool extraction logic in `AbstractLLMRequest` to handle function-specific tools.
@predic8

predic8 commented May 18, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai plz review

@predic8 predic8 closed this May 18, 2026
@predic8 predic8 reopened this May 18, 2026
@coderabbitai

coderabbitai Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

@predic8 Sure! I'll start a full review of the PR right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

♻️ Duplicate comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java (1)

108-118: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

setMaxOutputTokens() can still NPE when json is null.

On Line 113, json.get("generationConfig") dereferences json unguarded. This is the same underlying issue previously flagged and appears to persist after the rename.

Proposed fix
     `@Override`
     public void setMaxOutputTokens(int maxOutputTokens) {
+        if (json == null || json.isNull()) {
+            return;
+        }
         getGenerationConfig().put("max_output_tokens", maxOutputTokens);
     }
 
     private ObjectNode getGenerationConfig() {
+        if (!(json instanceof ObjectNode objectJson)) {
+            json = com.fasterxml.jackson.databind.node.JsonNodeFactory.instance.objectNode();
+            objectJson = (ObjectNode) json;
+        }
-        var gc = json.get("generationConfig");
+        var gc = objectJson.get("generationConfig");
         if (gc instanceof ObjectNode objectNode) {
             return objectNode;
         }
-        return json.putObject("generationConfig");
+        return objectJson.putObject("generationConfig");
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java`
around lines 108 - 118, setMaxOutputTokens can still NPE because
getGenerationConfig calls json.get(...) without ensuring json is initialized;
update getGenerationConfig (used by setMaxOutputTokens) to first ensure the
field json is non-null (initialize it to an ObjectNode or call a helper that
creates it) before accessing json.get("generationConfig"), and then return or
create the "generationConfig" ObjectNode; reference the methods
getGenerationConfig and setMaxOutputTokens and the json field when making the
change.
🧹 Nitpick comments (1)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java (1)

64-72: ⚡ Quick win

Catch JsonProcessingException specifically instead of broad Exception.

ObjectMapper.writeValueAsString() throws JsonProcessingException on serialization failures. Catching the specific exception type makes the error handling intent clearer and follows best practices for exception handling.

♻️ Proposed fix
+import com.fasterxml.jackson.core.JsonProcessingException;
+
 public static String createJson(Object o) {
     try {
         return om.writeValueAsString(o);
-    } catch (Exception e) {
+    } catch (JsonProcessingException e) {
         return """
                 { "error": "Could not create JSON" }
                 """;
     }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java`
around lines 64 - 72, The createJson method currently catches a broad Exception;
change the catch to handle com.fasterxml.jackson.core.JsonProcessingException
specifically for om.writeValueAsString(o) to make error handling precise
(replace the generic catch(Exception e) with catch(JsonProcessingException e))
and keep the same fallback JSON return and any logging if present.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractLLMEvent.java`:
- Around line 35-40: In AbstractLLMEvent, the Optional holding the parsed JSON
(the variable named opt used after sse.data()) is dereferenced unguarded (var
json = opt.get()), which can throw NoSuchElementException; update the code in
the method that handles SSE parsing to check opt.isEmpty() and bail out (e.g.,
return or skip processing) after logging the "Unknown event format" message so
you never call opt.get() when empty, ensuring safe handling of non-object or
failed parses.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionEvent.java`:
- Around line 41-42: In ChatCompletionEvent (around the log.debug call), avoid
logging raw tool-call arguments from fn.path("arguments").asText(); instead
build a safe, redacted string (e.g., replace sensitive patterns or mask the
content and/or truncate to a max length like 100 chars with "..." appended) and
log that; update the log.debug invocation to use the redacted/truncated value so
the source (fn.path("arguments").asText()) is never written raw to logs.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 65-77: The input token limit is bypassed when no aiStore is
configured because inputTokens is only set inside the store branch; compute and
enforce input token counting regardless of store presence: in
LLMGatewayInterceptor, call aiReq.estimateInputTokens() when
exc.getRequest().isPOSTRequest() before or outside the store != null branch (or
duplicate the estimate in the else) so inputTokens holds the actual value for
the later maxInputTokens check; apply the same change for the second occurrence
referenced around lines 92-97 so the maxInputTokens check always uses the real
estimated inputTokens even when store is null.
- Around line 116-119: The callback passed to provider.getLLMResponse currently
calls store.store(...) whenever store != null, which can pass a null AiApiUser
if MEMBRANE_AI_USER isn't set on the exchange; change the callback to first read
the user from the exchange (exc.getProperty(MEMBRANE_AI_USER, AiApiUser.class)),
check that the user is not null (and only then call store.store(user,
res.getUsage())), leaving the existing store != null guard intact so persistence
only happens when both store and a valid AiApiUser are present.
- Around line 144-146: The getDisplayName method in LLMGatewayInterceptor
currently returns the stale provider-specific string "OpenAI API"; update
LLMGatewayInterceptor.getDisplayName() to return a provider-agnostic name (e.g.,
"LLM Gateway" or "AI Gateway") so the interceptor no longer advertises a single
provider, and adjust any related UI/tooltip text retrieval (methods referencing
getDisplayName) to remain consistent with the new neutral label.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java`:
- Around line 81-83: getModel() can throw an NPE because the backing json field
may be null for non-JSON requests; update AbstractLLMRequest.getModel to first
check whether json is null (or json.path("model") is missing) and return a safe
value (e.g., null or an empty string) instead of dereferencing json, or
otherwise handle non-JSON requests upstream. Locate the json field and the
method getModel() in class AbstractLLMRequest and add a null-check/guard that
avoids calling json.path(...) when json==null and documents the chosen fallback
behavior.
- Around line 65-72: The current extraction in AbstractLLMRequest uses
ah.indexOf(BEARER_PREFIX) which accepts "Bearer" anywhere; change it to parse
the Authorization header strictly as a Bearer scheme: trim ah, verify the header
starts with the bearer scheme (case-insensitive) followed by whitespace (e.g.
check ah.trim().toLowerCase().startsWith((BEARER_PREFIX + " ").toLowerCase()) or
use regionMatches(true,...)), and only then take the substring after
BEARER_PREFIX, trim it and return null on empty; update the token extraction
logic around BEARER_PREFIX and the variable ah accordingly.
- Around line 35-40: The mapping in AbstractLLMRequest that extracts tool names
from tools.valueStream() is unsafe because chained get(...).get(...).asText()
can NPE or throw for malformed entries; update the lambda in the stream to
defensively check each node before accessing "function" and "name" (e.g., verify
node.isObject() / has("function") and has("name") and that name is textual or
non-null), return the function.name when present and valid, otherwise the
top-level name when valid, and fallback to a safe default or skip the entry
instead of calling asText() blindly so malformed tool objects no longer throw in
get(...).get(...).asText().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java`:
- Around line 61-71: The lambda passed to terminal.ifPresent in
AbstractLLMResponse currently returns early on a "[DONE]" event which skips
calling postProcessor and can leave json null causing isError() to NPE; change
the handling so that when event.data() equals "[DONE]" you do not return early
but instead assign json to a safe empty/error JsonNode (e.g.,
JsonNodeFactory.instance.objectNode().put("done", true) or a descriptive error
node) and then call postProcessor.accept(AbstractLLMResponse.this); apply the
same fix to the other terminal handling block referenced (the similar code at
lines ~75-77) so postProcessor is always invoked and json is always non-null
before isError() is called.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMRequest.java`:
- Around line 21-67: The code assumes the field json is non-null in
setMaxOutputTokens and estimateInputTokens which leads to NPEs when the request
body is absent or parsing failed; update both setMaxOutputTokens and
estimateInputTokens to guard against a null json (either early-return with
sensible defaults or lazily initialize json to an empty ObjectNode) before
accessing json.path(...), and ensure any subsequent logic (e.g.,
thinking/budget_tokens adjustments and token estimation) handles the empty json
case safely; reference the json field and the methods setMaxOutputTokens and
estimateInputTokens when making the change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java`:
- Around line 48-50: The current log.debug call in ClaudeLLMResponse that prints
inputJson (log.debug("Tool {} with {}", tool, inputJson.toString())) must not
log raw payloads; change it to log a redacted or summarized form instead (e.g.,
log tool name plus either the set of top-level keys, input size, or a sanitized
version) by replacing the direct inputJson.toString() usage with a safe helper
(e.g., redactToolInput(inputJson) or summarizeToolInput(inputJson)) that
strips/omits sensitive values; update the log.debug invocation to use the helper
and keep the tool identifier (tool) intact so callers can locate the change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/MessageDelta.java`:
- Around line 25-31: The Usage object in MessageDelta is currently constructed
using only md.inputTokens and md.outputTokens, omitting billable cache tokens;
update the Usage creation in the MessageDelta parsing logic so that it includes
md.cacheCreationInputTokens and md.cacheReadInputTokens (i.e., total =
inputTokens + outputTokens + cacheCreationInputTokens + cacheReadInputTokens) so
billing/rate-limit accounting is correct for the MessageDelta instance (look for
the MessageDelta class and the md.usage = new Usage(...) line).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMRequest.java`:
- Around line 25-33: The method estimateInputTokens uses the field json without
guarding for null; add a null-check at the start of estimateInputTokens (or a
guard in any helper it calls) so that if json is null the method returns 0 (or
an appropriate default) instead of calling json.path(...) and risking an NPE;
update estimateInputTokens to check json before calling
countText(json.path("input")), estimateChatCompletitions(), and
countText(json.path("system")) so those invocations only run when json is
non-null.
- Around line 17-20: The current code in OpenAiLLMRequest uses
json.putObject("stream_options") which unconditionally replaces any existing
stream_options; change this to merge instead: check for an existing JsonObject
via json.getJsonObject("stream_options") (or equivalent), if it exists and is an
object call put("include_usage", true) on it, otherwise create a new JsonObject
with include_usage=true and assign it (only when
json.path("stream").asBoolean(false) && isChatCompletionsRequest(exchange));
keep the check/behavior around isChatCompletionsRequest(exchange) intact.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponse.java`:
- Around line 34-42: The getUsage() logic in OpenAiLLMResponse is ignoring
returned values from getInputTokens/getOutputTokens and is reading total_tokens
from the wrong JSON node; assign the results of getInputTokens(usage) and
getOutputTokens(usage) to inputTokens and outputTokens respectively, and read
totalTokens from usage.path("total_tokens") (not json.path("total_tokens")) for
both the new and older chat completions branches so usage.total_tokens is used
consistently.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/ResponsesApiEvent.java`:
- Around line 29-31: The current log statement in ResponsesApiEvent.java logs
raw function-call arguments via on.path("arguments").asText() at INFO; change
this to a non-production level (use log.debug) and avoid printing full
payloads—log only metadata such as the function name (on.path("name").asText()),
argument size or keys, or a redacted/truncated version of arguments (e.g., first
N characters or replace values with "<REDACTED>") instead of the raw text;
update the log invocation that references on.path("name") and
on.path("arguments") accordingly so sensitive data is not emitted at INFO.

In `@core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java`:
- Line 38: The current log call in SSEParser that does log.debug("Parsing SSE
chunk: {}", chunk) exposes raw streamed payloads; change it to avoid printing
chunk contents and instead log safe metadata: compute the chunk size (e.g.,
chunk.length() or bytes) and a simple event count (e.g., count occurrences of
"data:" or empty-line delimiters) and use those values in the debug message.
Update the logging in the SSEParser code path (referencing the chunk variable
and the log instance) to emit only size and event-count metadata and remove any
direct inclusion of the chunk string.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java`:
- Around line 108-118: setMaxOutputTokens can still NPE because
getGenerationConfig calls json.get(...) without ensuring json is initialized;
update getGenerationConfig (used by setMaxOutputTokens) to first ensure the
field json is non-null (initialize it to an ObjectNode or call a helper that
creates it) before accessing json.get("generationConfig"), and then return or
create the "generationConfig" ObjectNode; reference the methods
getGenerationConfig and setMaxOutputTokens and the json field when making the
change.

---

Nitpick comments:
In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java`:
- Around line 64-72: The createJson method currently catches a broad Exception;
change the catch to handle com.fasterxml.jackson.core.JsonProcessingException
specifically for om.writeValueAsString(o) to make error handling precise
(replace the generic catch(Exception e) with catch(JsonProcessingException e))
and keep the same fallback JSON return and any logging if present.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 11071819-f5de-4c68-9b96-669afb387962

📥 Commits

Reviewing files that changed from the base of the PR and between 863e5cc and 062ac5c.

📒 Files selected for processing (33)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractLLMEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionDoneEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ResponsesApiEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMMessage.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockStart.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/MessageDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ToolUse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAIProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java
  • core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
  • core/src/test/java/com/predic8/membrane/core/util/http/SSEParserTest.java
✅ Files skipped from review due to trivial changes (3)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockStart.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ToolUse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionDoneEvent.java
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java

@predic8

predic8 commented May 21, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 21, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
distribution/tutorials/ai/llm-gateway/google/membrane.cmd (1)

1-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Convert to CRLF line endings.

Windows cmd.exe parses .cmd/.bat files in 512-byte chunks and LF-only line endings can cause label resolution (:search_up, :found, :notfound) to misbehave depending on file size and chunk boundaries. Ensure this file is committed with CRLF (e.g., add a .gitattributes entry like *.cmd text eol=crlf or run unix2dos).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd` around lines 1 -
25, This file uses LF-only endings which can break label parsing in cmd scripts;
convert distribution/tutorials/ai/llm-gateway/google/membrane.cmd to CRLF line
endings and ensure the repo enforces CRLF for .cmd files (e.g., add or update
.gitattributes with a rule like "*.cmd text eol=crlf" and re-commit the fixed
file) so labels such as :search_up, :found and :notfound resolve correctly when
executed; alternatively run a tool like unix2dos on the file and commit the
CRLF-converted version.
♻️ Duplicate comments (3)
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java (1)

149-150: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard usage persistence when no authenticated user is present.

On Line 150, store.store(user, ...) is called even when user is null (e.g., request paths that never set MEMBRANE_AI_USER), which can break persistence or store invalid records.

Proposed fix
-            if (store != null) {
+            if (store != null && user != null) {
                 store.store(user, res.getUsage());
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 149 - 150, The persistence call may be invoked with a null
authenticated user: guard the store.store(user, res.getUsage()) call so it only
runs when both store != null and the user object (from MEMBRANE_AI_USER) is
non-null; update the condition around the call in LLMGatewayInterceptor (the
block containing store.store, user and res.getUsage()) to check user != null
before calling store.store to avoid persisting invalid records.
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java (1)

180-218: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same test-isolation concern as the Claude variant.

This alicesTokenBudgetIsExhaustedWhileBobIsUnaffected shares Alice's gateway-side token counter with the other Alice-tests in the same class. If JUnit 5 happens to execute another Alice-test first, Alice's running total will not start at 0 and the budget arithmetic in the Javadoc no longer holds. Please ensure the base class resets per-user usage between @Test methods (see the same comment on the Claude variant).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java`
around lines 180 - 218, The test
alicesTokenBudgetIsExhaustedWhileBobIsUnaffected relies on a fresh per-user
token counter but current test runs share Alice's gateway-side state across
methods; add a test teardown that resets per-user usage between `@Test` methods
(e.g., an `@AfterEach` in the common base test class used by
SharingApiKeysGoogleTutorialTest) so the gateway's user token counters are
reinitialized before each test, or call the existing reset method on the shared
token-store (e.g., TokenUsageStore.reset(), UsageTracker.clear(), or similar)
from that teardown to guarantee Alice's counter starts at 0 for every test.
distribution/tutorials/ai/llm-gateway/openai/membrane.cmd (1)

1-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Convert to CRLF line endings (duplicate of google/membrane.cmd).

Same issue as the Google variant: this .cmd file is committed with LF-only endings, which can cause cmd.exe's 512-byte chunked parser to mis-handle labels and goto/call targets. Switch to CRLF (e.g., via a .gitattributes rule *.cmd text eol=crlf).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd` around lines 1 -
25, This .cmd script (labels :search_up, :found, :notfound and uses call/goto)
is committed with LF-only endings which can break cmd.exe parsing; convert the
file to CRLF line endings and add a repository rule to enforce this (e.g., add a
.gitattributes entry like "*.cmd text eol=crlf") or re-save the file with CRLF
so the labels and call "%MEMBRANE_HOME%\scripts\run-membrane.cmd" %* work
correctly.
🧹 Nitpick comments (3)
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java (1)

46-47: ⚡ Quick win

Avoid fixed mock port to reduce CI/test flakiness.

Line 46 hard-codes 3100; concurrent test runs or occupied local ports can make this fail nondeterministically.

💡 Proposed fix
+import java.net.ServerSocket;
@@
-    protected static final int MOCK_LLM_PORT = 3100;
+    protected int mockLlmPort;
@@
-    protected String mockApiUrl() {
-        return "http://localhost:" + MOCK_LLM_PORT;
+    protected String mockApiUrl() {
+        return "http://localhost:" + mockLlmPort;
     }
@@
     private void startMockLlmApi() throws Exception {
+        mockLlmPort = findFreePort();
         var si = new StaticInterceptor();
@@
-        var sp = new ServiceProxy(new ServiceProxyKey(MOCK_LLM_PORT), null, 0);
+        var sp = new ServiceProxy(new ServiceProxyKey(mockLlmPort), null, 0);
@@
         mockRouter.start();
     }
+
+    private static int findFreePort() throws Exception {
+        try (ServerSocket socket = new ServerSocket(0)) {
+            return socket.getLocalPort();
+        }
+    }

Also applies to: 93-95, 111-112

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`
around lines 46 - 47, The test currently hardcodes MOCK_LLM_PORT = 3100 in
AbstractAiTutorialTest which causes CI flakiness; change this to use a
dynamically allocated port: remove the fixed final constant, add an instance
field (e.g., mockLlmPort) and initialize it at test setup by binding a
ServerSocket to port 0 (or using an OS-assigned ephemeral port) to retrieve the
actual port, then update all references that used MOCK_LLM_PORT (including
usages around the earlier noted blocks) to use the new mockLlmPort field (or a
getter like getMockLlmPort()) so each test run uses a free, non-conflicting
port.
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java (1)

112-113: 💤 Low value

Tutorial YAML uses a literal placeholder as the upstream API key.

This assertion confirms the tutorial's 20-Sharing-API-Keys.yaml ships with <<Replace with your API_KEY>> as the actual upstream apiKey value. If a user copies the tutorial to a real deployment without substituting the placeholder, the gateway will forward Bearer <<Replace with your API_KEY>> to OpenAI and get an opaque 401. Consider asserting that the tutorial YAML includes a prominent "REPLACE BEFORE USE" comment, or fail-fast in the gateway when the configured apiKey still contains <<…>> markers. Not a blocker for the tests themselves.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java`
around lines 112 - 113, The tutorial YAML ships with a literal placeholder API
key (`<<Replace with your API_KEY>>`) which can be accidentally forwarded; add a
fail-fast check where the gateway loads the upstream apiKey (the code path that
the test SharingApiKeysOpenAiTutorialTest exercises / whatever class constructs
the OpenAI upstream config) to detect placeholder markers like "<<" or ">>" or
the substring "Replace with your API_KEY" and throw a clear
IllegalStateException (or log an error and exit) explaining the developer must
replace the placeholder in 20-Sharing-API-Keys.yaml before use; update the
configuration loader/constructor that sets the apiKey so it validates and
rejects placeholder values at startup.
distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml (1)

13-19: 💤 Low value

Consider illustrating bob's key in the example curls.

Scenario #2 mentions that "Bob should still be able to send requests", but every example curl uses alice's key abc123. Adding one example with bob's key (qwertz) would make the multi-user demo concretely runnable and avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml` around
lines 13 - 19, The examples all use Alice's API key (abc123) but the scenario
text references Bob remaining able to send requests; add or replace one curl
example to show Bob's key (qwertz) being used — e.g., a curl to
/v1beta/models/gpt-5:generateContent with header "x-goog-api-key: qwertz" near
the Token Limit Exceeded / Wrong Model examples so readers can run the
multi-user demo (references: alice, bob, abc123, qwertz, endpoint
/v1beta/models/gpt-5:generateContent).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 93-99: The current effectiveMaxTokens calculation using
Math.min(aiReq.getRequestedMaxOutputTokens(), maxOutputTokens) can produce 0 or
-1 when sentinel/unlimited values are present and thus corrupt limit accounting;
change the logic in LLMGatewayInterceptor to treat -1 as "unlimited" and compute
effectiveMaxTokens like: if aiReq.getRequestedMaxOutputTokens() == -1 use
maxOutputTokens; else if maxOutputTokens == -1 use
aiReq.getRequestedMaxOutputTokens(); otherwise use Math.min(requested,
maxOutputTokens); then ensure effectiveMaxTokens is non-negative (e.g., clamp to
0) before calling store.checkLimit(user, inputTokens, effectiveMaxTokens) and
when passing the required token count into
errorCreator.tokenLimitExceeded(inputTokens + effectiveMaxTokens, remaining,
store.getRemainingResetTime()) so the required-token value is never negative or
sentinel.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleErrorCreator.java`:
- Around line 24-33: The rate-limit error message in GoogleErrorCreator builds a
string using tokenRequired, tokenRemaining, and tokenResetInSeconds and
currently formats tokenRemaining directly; change the formatting to clamp
tokenRemaining to a non-negative value (e.g., use Math.max(0, tokenRemaining) or
an equivalent local variable like visibleTokenRemaining) before calling
envelope/json so the message never shows negative "only -5 remain"; update the
formatted argument used in the envelope call where tokenRemaining is passed.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 89-91: getUsers() currently returns the mutable `@GuardedBy`("lock")
field users directly, breaking the synchronization contract; change
SimpleAiApiStore.getUsers() to avoid leaking the guarded list by making and
returning a defensive copy (or an unmodifiable view) while holding lock: inside
synchronized(lock) create a new List (e.g., new ArrayList<>(users)) or
Collections.unmodifiableList(new ArrayList<>(users)) and return that, ensuring
callers cannot mutate the original users field outside the lock.

In `@distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml`:
- Line 5: Update the comment that currently reads "# Replace <<API_KEY>> with
your OpenAI API key." to correctly reference Google/Gemini credentials for this
Google Gemini tutorial; locate the comment string "# Replace <<API_KEY>> with
your OpenAI API key." in the 10-Basic-LLM-Gateway.yaml file and replace it with
a clear phrase such as "# Replace <<API_KEY>> with your Google API key
(Gemini)". Ensure the new wording explicitly mentions Google or Gemini so it
matches the tutorial context.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml`:
- Around line 25-27: The comment in the tutorial snippet contains a grammar
typo: replace the phrase "even it was requested" with "even though it was
requested" in the comment block (the curl example comment in
20-Sharing-API-Keys.yaml that mentions "totalTokens should not exceed 200 even
it was requested in max-output.json") so the sentence reads "...totalTokens
should not exceed 200 even though it was requested in max-output.json".

---

Outside diff comments:
In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd`:
- Around line 1-25: This file uses LF-only endings which can break label parsing
in cmd scripts; convert
distribution/tutorials/ai/llm-gateway/google/membrane.cmd to CRLF line endings
and ensure the repo enforces CRLF for .cmd files (e.g., add or update
.gitattributes with a rule like "*.cmd text eol=crlf" and re-commit the fixed
file) so labels such as :search_up, :found and :notfound resolve correctly when
executed; alternatively run a tool like unix2dos on the file and commit the
CRLF-converted version.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 149-150: The persistence call may be invoked with a null
authenticated user: guard the store.store(user, res.getUsage()) call so it only
runs when both store != null and the user object (from MEMBRANE_AI_USER) is
non-null; update the condition around the call in LLMGatewayInterceptor (the
block containing store.store, user and res.getUsage()) to check user != null
before calling store.store to avoid persisting invalid records.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java`:
- Around line 180-218: The test alicesTokenBudgetIsExhaustedWhileBobIsUnaffected
relies on a fresh per-user token counter but current test runs share Alice's
gateway-side state across methods; add a test teardown that resets per-user
usage between `@Test` methods (e.g., an `@AfterEach` in the common base test class
used by SharingApiKeysGoogleTutorialTest) so the gateway's user token counters
are reinitialized before each test, or call the existing reset method on the
shared token-store (e.g., TokenUsageStore.reset(), UsageTracker.clear(), or
similar) from that teardown to guarantee Alice's counter starts at 0 for every
test.

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd`:
- Around line 1-25: This .cmd script (labels :search_up, :found, :notfound and
uses call/goto) is committed with LF-only endings which can break cmd.exe
parsing; convert the file to CRLF line endings and add a repository rule to
enforce this (e.g., add a .gitattributes entry like "*.cmd text eol=crlf") or
re-save the file with CRLF so the labels and call
"%MEMBRANE_HOME%\scripts\run-membrane.cmd" %* work correctly.

---

Nitpick comments:
In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`:
- Around line 46-47: The test currently hardcodes MOCK_LLM_PORT = 3100 in
AbstractAiTutorialTest which causes CI flakiness; change this to use a
dynamically allocated port: remove the fixed final constant, add an instance
field (e.g., mockLlmPort) and initialize it at test setup by binding a
ServerSocket to port 0 (or using an OS-assigned ephemeral port) to retrieve the
actual port, then update all references that used MOCK_LLM_PORT (including
usages around the earlier noted blocks) to use the new mockLlmPort field (or a
getter like getMockLlmPort()) so each test run uses a free, non-conflicting
port.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java`:
- Around line 112-113: The tutorial YAML ships with a literal placeholder API
key (`<<Replace with your API_KEY>>`) which can be accidentally forwarded; add a
fail-fast check where the gateway loads the upstream apiKey (the code path that
the test SharingApiKeysOpenAiTutorialTest exercises / whatever class constructs
the OpenAI upstream config) to detect placeholder markers like "<<" or ">>" or
the substring "Replace with your API_KEY" and throw a clear
IllegalStateException (or log an error and exit) explaining the developer must
replace the placeholder in 20-Sharing-API-Keys.yaml before use; update the
configuration loader/constructor that sets the apiKey so it validates and
rejects placeholder values at startup.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml`:
- Around line 13-19: The examples all use Alice's API key (abc123) but the
scenario text references Bob remaining able to send requests; add or replace one
curl example to show Bob's key (qwertz) being used — e.g., a curl to
/v1beta/models/gpt-5:generateContent with header "x-goog-api-key: qwertz" near
the Token Limit Exceeded / Wrong Model examples so readers can run the
multi-user demo (references: alice, bob, abc123, qwertz, endpoint
/v1beta/models/gpt-5:generateContent).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 80d72558-1bcf-4c21-a2dd-5a2266de02f2

📥 Commits

Reviewing files that changed from the base of the PR and between 329bb80 and 38acaa3.

📒 Files selected for processing (74)
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeErrorResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeToolSupport.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeUtils.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MCPUtil.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpPayloadSanitizer.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSchemaBuilder.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionContext.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionManager.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolDefinition.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolHandler.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolRegistry.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServer.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequest.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponse.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCUtil.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialize.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitializeResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialized.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPNotification.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPPing.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPRequest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCall.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCallResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsList.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsListResponse.java
  • core/src/test/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServerTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequestTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponseTest.java
  • core/src/test/java/com/predic8/membrane/core/mcp/MCPInitializeTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/BasicClaudeLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/SharingApiKeysTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/AbstractGoogleTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/BasicGoogleLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/AbstractOpenAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/BasicOpenAiLLMGatewayTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java
  • distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/claude/max-input.json
  • distribution/tutorials/ai/llm-gateway/claude/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/claude/membrane.sh
  • distribution/tutorials/ai/llm-gateway/claude/simple.json
  • distribution/tutorials/ai/llm-gateway/claude/wrong-model.json
  • distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/google/max-input.json
  • distribution/tutorials/ai/llm-gateway/google/max-output.json
  • distribution/tutorials/ai/llm-gateway/google/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/google/membrane.sh
  • distribution/tutorials/ai/llm-gateway/google/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/openai/max-input.json
  • distribution/tutorials/ai/llm-gateway/openai/max-output.json
  • distribution/tutorials/ai/llm-gateway/openai/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/openai/membrane.sh
  • distribution/tutorials/ai/llm-gateway/openai/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/wrong-model.json
💤 Files with no reviewable changes (5)
  • distribution/tutorials/ai/llm-gateway/openai/simple.json
  • distribution/tutorials/ai/llm-gateway/claude/membrane.cmd
  • distribution/tutorials/ai/llm-gateway/openai/wrong-model.json
  • distribution/tutorials/ai/llm-gateway/openai/max-input.json
  • distribution/tutorials/ai/llm-gateway/claude/membrane.sh
✅ Files skipped from review due to trivial changes (39)
  • distribution/tutorials/ai/llm-gateway/claude/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/wrong-model.json
  • distribution/tutorials/ai/llm-gateway/google/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/max-input.json
  • distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/google/max-output.json
  • distribution/tutorials/ai/llm-gateway/claude/simple.json
  • distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsList.java
  • distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequestTest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPRequest.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolHandler.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCUtil.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitializeResponse.java
  • core/src/test/java/com/predic8/membrane/core/mcp/MCPInitializeTest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsListResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCall.java
  • core/src/test/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServerTest.java
  • core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponseTest.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPResponse.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCallResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolRegistry.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolDefinition.java
  • core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequest.java
  • distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeToolSupport.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServer.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSchemaBuilder.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeUtils.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionManager.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpPayloadSanitizer.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionContext.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MCPUtil.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialize.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialized.java
  • core/src/main/java/com/predic8/membrane/core/mcp/MCPNotification.java

Comment thread distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml Outdated
Comment thread distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml Outdated
…for LLM Gateway

- Ensure thread-safe access to users in `SimpleAiApiStore` with `List.copyOf`.
- Introduce `visibleRemaining` to handle non-negative token values in `GoogleErrorCreator`.
- Add configuration validation in `LLMGatewayInterceptor` to enforce API key substitution.
- Enhance token limit handling to adjust output tokens dynamically in `LLMGatewayInterceptor`.
- Update Google and Claude tutorials with clearer instructions for API key usage and token limits.
@predic8

predic8 commented May 21, 2026

Copy link
Copy Markdown
Member Author

/ok-to-test

predic8 and others added 6 commits May 22, 2026 08:01
- Introduced `StreamingOpenAiLLMGatewayTutorialTest` with SSE mocking and validation.
- Added JSON fixtures (`stream.json`, `max-output-stream.json`) for testing streaming requests.
- Enhanced base test framework to support `text/event-stream` responses.
- Updated `LLMGatewayInterceptor` to handle streaming scenarios with capped tokens.
- Replaced raw API key placeholders with `TEST_API_KEY` constant in tutorial tests to ensure consistency.
- Added `TEST_API_KEY` to `AbstractAiTutorialTest` for upstream key substitution verification.
- Updated `log4j2.xml` to limit logging to `com.predic8.membrane.core.interceptor.ai`.
- Introduced PostgreSQL dependency in `pom.xml` for future enhancements.
@predic8 predic8 requested a review from rrayst May 22, 2026 14:34
predic8 added 3 commits May 26, 2026 11:59
…dd policies support

- Unified OpenAI and Chat Completions error handling under `ChatCompletionsErrorCreator`.
- Deprecated older OpenAI-specific classes in favor of `ChatCompletions` equivalents.
- Introduced detailed usage policies handling in `LLMGatewayInterceptor`.
- Updated YAML tutorials to reflect the new `policies` configuration model.
…licy-based token and model restrictions

- Added `Policies` class for defining restrictions on models, input tokens, and output tokens in the LLM Gateway.
- Replaced `maxInputTokens` and `maxOutputTokens` fields in `LLMGatewayInterceptor` with `Policies`.
- Updated YAML tutorials (OpenAI, Claude, Google) to use the new `policies` configuration.
@predic8

predic8 commented May 26, 2026

Copy link
Copy Markdown
Member Author

/ok-to-test

predic8 added 6 commits May 26, 2026 17:12
- Replaced `Policies` class implementation with `DefaultPolicies` and `NullPolicies` for enhanced flexibility.
- Added `SystemPrompt` class to support dynamic system prompt management in LLM Gateway.
- Updated `LLMGatewayInterceptor` to delegate policy enforcement and system prompt handling to respective components.
- Extended providers (OpenAI, Claude, Google Gemini) with standardized system prompt methods (`getSystemPrompt`, `setSystemPrompt`, `removeSystemPrompt`).
- Enhanced test coverage with `AbstractLLMRequestTest` for API key handling and bearer token case insensitivity.
- Added `setSystemPrompt`, `removeSystemPrompt`, and `isChatCompletion` methods for enhanced prompt management.
- Refactored `SystemPrompt.Action` to remove unused `REJECT` action.
- Updated `AbstractLLMRequestTest` to validate new `SystemPrompt` behaviors.
@predic8

predic8 commented May 27, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java (1)

50-50: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid logging raw SSE payload at INFO level.

This logs unredacted event data at INFO. Keep payload details out of non-debug logs.

Suggested fix
-            log.info("Unknown event format: {}", sse.data());
+            log.info("Unknown event format. dataLength={}", sse.data() == null ? 0 : sse.data().length());
Based on learnings: In this repository, sensitive raw content is exempt only at DEBUG level; higher levels should still be flagged.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java`
at line 50, The INFO-level log in AbstractLLMEvent (the log.info("Unknown event
format: {}", sse.data()) call) exposes raw SSE payloads; change this so raw
payloads are only logged at DEBUG: replace the INFO log with a generic INFO/
WARN message that does not include sse.data() (e.g., "Unknown event format
received") and move the detailed payload logging to log.debug(...) where you
include sse.data() or a redacted version; update any surrounding handling in the
class/method to use log.debug for payloads and ensure no raw sse.data() appears
at INFO or higher.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/AiApiUser.java (1)

94-97: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject negative token budgets in configuration.

Line 95 allows negative values, which makes quota checks produce invalid remaining-token results. Guard tokens >= 0 in the setter.

Suggested fix
 `@MCAttribute`
 public void setTokens(long tokens) {
+    if (tokens < 0) {
+        throw new IllegalArgumentException("tokens must be >= 0");
+    }
     this.tokens = tokens;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/AiApiUser.java`
around lines 94 - 97, The setter AiApiUser.setTokens currently accepts negative
values causing invalid quotas; update setTokens(long tokens) to validate tokens
>= 0 and reject negatives (e.g., throw IllegalArgumentException or clamp to 0)
and ensure the class field 'tokens' is only assigned when the value is valid;
reference AiApiUser.setTokens and the 'tokens' field and add a clear error
message when rejecting negative input.
♻️ Duplicate comments (7)
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java (1)

49-53: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Guard Optional before dereferencing parsed SSE JSON.

opt.get() is still reached when parsing fails, which can throw and break streaming flow.

Suggested fix
         var opt = JsonUtil.getJsonObject(sse.data());
         if (opt.isEmpty()) {
             log.info("Unknown event format: {}", sse.data());
+            return null;
         }
 
         var json = opt.get();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java`
around lines 49 - 53, The code in AbstractLLMEvent currently calls opt.get()
without handling the empty case; update the logic in the method where opt and
sse.data() are used (inside AbstractLLMEvent) to guard the Optional before
dereferencing: if opt.isEmpty() after parsing sse.data(), log the unknown event
as already done and then return/continue (or otherwise skip further processing)
instead of proceeding to var json = opt.get(); this prevents
NoSuchElementException and preserves streaming flow.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java (1)

84-86: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard isError() against null json.

Line 85 can throw NPE when json has not been initialized on streaming paths.

Proposed fix
     `@Override`
     public boolean isError() {
-        return json.get("error") != null && !json.get("error").isNull();
+        return json != null && json.get("error") != null && !json.get("error").isNull();
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java`
around lines 84 - 86, AbstractLLMResponse.isError() can throw an NPE when the
field json is null on streaming paths; update the isError() method in class
AbstractLLMResponse to first check whether json is null and return false (or
equivalent safe value) before accessing json.get("error"), i.e., guard the json
field (null check) and only call json.get("error").isNull() when json is
non-null to avoid the NPE.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java (1)

73-77: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce strict Bearer scheme parsing.

Line 73 and Line 77 still allow malformed headers (e.g., BearerXYZ) to be treated as valid tokens.

Proposed fix
-        if (!ah.regionMatches(true, 0, BEARER_PREFIX, 0, BEARER_PREFIX.length())) {
+        var value = ah.trim();
+        var prefix = BEARER_PREFIX + " ";
+        if (!value.regionMatches(true, 0, prefix, 0, prefix.length())) {
             return null;
         }

-        var token = ah.substring(BEARER_PREFIX.length()).trim();
+        var token = value.substring(prefix.length()).trim();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java`
around lines 73 - 77, The bearer parsing currently accepts headers like
"BearerXYZ" because it only matches the prefix and then trims the remainder;
update the check in AbstractLLMRequest to require an explicit separator after
BEARER_PREFIX (either make BEARER_PREFIX include the trailing space "Bearer " or
add an extra check that ah.length() > BEARER_PREFIX.length() &&
Character.isWhitespace(ah.charAt(BEARER_PREFIX.length()))), and only then
extract the token (using substring(BEARER_PREFIX.length()).trim() if you keep
the space-aware prefix). Ensure you reference the BEARER_PREFIX constant and the
auth header variable (ah) in your changes so malformed "BearerXYZ" headers are
rejected.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMRequest.java (1)

110-117: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add null guards before using json and URI in request helpers.

These methods can still NPE when body parsing fails or URI is missing, despite Line 83 already acknowledging null json is possible.

Proposed fix
     `@Override`
     public String getSystemPrompt() {
+        if (json == null || json.isNull()) {
+            return "";
+        }
         for (var part : json.path("systemInstruction").path("parts")) {
             if (part.path("text").isTextual()) {
                 return part.path("text").asText("");
@@
     `@Override`
     public void setSystemPrompt(String systemPrompt) {
+        if (json == null || json.isNull()) {
+            return;
+        }
         json.putObject("systemInstruction")
                 .putArray("parts")
                 .addObject()
                 .put("text", systemPrompt);
@@
     `@Override`
     public void removeSystemPrompt() {
+        if (json == null || json.isNull()) {
+            return;
+        }
         json.remove("systemInstruction");
     }
@@
     `@Override`
     public boolean isChatCompletion() {
-        return exchange.getRequest().getUri().contains("/chat/completions");
+        var uri = exchange.getRequest().getUri();
+        return uri != null && uri.contains("/chat/completions");
     }
@@
     `@Override`
     public void setMaxOutputTokens(int maxOutputTokens) {
+        if (json == null || json.isNull()) {
+            return;
+        }
         getGenerationConfig().put("maxOutputTokens", maxOutputTokens);
     }
#!/bin/bash
# Inspect null guards and dereferences in GoogleLLMRequest plus json field origin.
fd GoogleLLMRequest.java --exec sed -n '75,210p' {}
fd AbstractLLMRequest.java --exec sed -n '1,220p' {}
rg -nP --type java 'json\.path\(|json\.putObject\(|json\.remove\(|getUri\(\)\.contains'

Also applies to: 124-137, 141-143, 182-192

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMRequest.java`
around lines 110 - 117, Several helper methods (e.g., getSystemPrompt)
dereference the field json and call getUri() without null checks, which can NPE
when body parsing fails or URI is absent; update each helper (getSystemPrompt
and other methods around lines referenced) to first null-check json (and its
nodes) before calling json.path(...) or json.putObject/remove, and null-check
getUri() before invoking getUri().contains(...) — return safe defaults (empty
string or false) when json or URI is null; use the existing json origin in the
class and mirror the null-guard pattern used at line 83 to avoid duplicate
dereferences across methods.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AbstractOpenAiLLMRequest.java (1)

28-39: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard json at the top of estimateInputTokens().

json.path(...) is dereferenced without a null guard, so missing/unparseable request bodies can throw before token checks run.

Proposed fix
     `@Override`
     public long estimateInputTokens() {
+        if (json == null || json.isNull()) {
+            return 0;
+        }
 
         long chars = countText(json.path("input"));
#!/bin/bash
# Verify missing null guard and inspect AbstractLLMRequest json handling.
fd AbstractOpenAiLLMRequest.java --exec sed -n '24,48p' {}
fd AbstractLLMRequest.java --exec sed -n '1,220p' {}
rg -nP --type java 'estimateInputTokens\(|json\.path\("input"\)'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AbstractOpenAiLLMRequest.java`
around lines 28 - 39, In estimateInputTokens() of AbstractOpenAiLLMRequest,
guard the json field at the top before calling json.path(...): check that json
is non-null (and optionally json.isObject() or has expected paths) and return 0
or a safe default token count if json is missing/unparseable; then proceed to
call countText(json.path("input")), estimateChatCompletitions(),
countText(json.path("system")), countJsonSize(json.path("tools")) and
countJsonSize(json.path("functions")). Ensure the guard uses the same json field
declared in this class and does not alter downstream countText()/countJsonSize()
contracts.
core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/JDBCAiApiUsageStore.java (2)

31-40: ⚠️ Potential issue | 🟠 Major

Use database-specific auto-increment DDL.

Line 33 still uses GENERATED ALWAYS AS IDENTITY, which is PostgreSQL-specific and can fail on MySQL/MariaDB when this store initializes.

Does MySQL 8.x or MariaDB support `BIGINT GENERATED ALWAYS AS IDENTITY` in `CREATE TABLE`, or is `AUTO_INCREMENT` required?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/JDBCAiApiUsageStore.java`
around lines 31 - 40, The CREATE_TABLE_SQL constant in JDBCAiApiUsageStore uses
PostgreSQL's "BIGINT GENERATED ALWAYS AS IDENTITY" which breaks on
MySQL/MariaDB; change to generate dialect-specific DDL at runtime instead of a
static constant: detect the DB product via
Connection.getMetaData().getDatabaseProductName() (or similar) in the class
constructor or the schema-initialization method, and choose "BIGINT
AUTO_INCREMENT PRIMARY KEY" for MySQL/MariaDB and the existing "BIGINT GENERATED
ALWAYS AS IDENTITY PRIMARY KEY" (or SERIAL) for PostgreSQL, then execute the
chosen CREATE TABLE statement to create ai_api_usage. Ensure the chosen symbol
CREATE_TABLE_SQL is replaced by a runtime variable or factory method used by the
table-creation logic.

72-84: ⚠️ Potential issue | 🟠 Major

Placeholder methods still make the store non-functional for auth/quota checks.

getUser() and checkLimit() still return constant placeholders, and getRemainingResetTime() is also hardcoded. With this store configured, authentication/rate-limit behavior can be incorrect or silently bypassed.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/JDBCAiApiUsageStore.java`
around lines 72 - 84, The JDBCAiApiUsageStore class currently has placeholder
implementations for getUser, checkLimit, and getRemainingResetTime which break
auth and quota enforcement; implement getUser(String token) to look up the API
user by token in the DB (query the API key / user table) and return
Optional<AiApiUser> populated with id, quota window and limits; implement
checkLimit(AiApiUser user, long inputTokens, long outputTokens) to run inside a
DB transaction: compute tokensToAdd = inputTokens+outputTokens, load and/or
initialize the user's current usage row for the current window, enforce the
configured limit (return 0 or throw as appropriate when over limit), increment
and persist the usage count, and return the remaining allowed tokens after the
increment; implement getRemainingResetTime() to calculate and return the
milliseconds until the current quota window resets (based on the configured
window start/duration for the store/user). Ensure proper SQL
queries/transactions, concurrency-safe updates (e.g., SELECT ... FOR UPDATE or
atomic update), and that all three methods in JDBCAiApiUsageStore use the same
time-window logic and fields as the rest of the store.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java`:
- Around line 109-124: Both setters setMaxInputTokens and setMaxOutputTokens
must reject negative values: add a non-negative validation at the start of each
setter (e.g., if maxInputTokens < 0 or maxOutputTokens < 0) and throw an
IllegalArgumentException with a clear message indicating the parameter and that
it must be non-negative; update setMaxInputTokens and setMaxOutputTokens to
perform this check before assigning to the fields to prevent negative token
limits from being stored.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/LLMGatewayInterceptor.java`:
- Around line 148-151: The computeEffectiveMaxOutputTokens method underestimates
when maxOutputTokens==0 (treated as unlimited): change its logic so that if
requestedMaxOutputTokens <= 0 it still returns maxOutputTokens, but if
maxOutputTokens <= 0 (meaning unlimited) and requestedMaxOutputTokens > 0 it
returns requestedMaxOutputTokens; otherwise return
Math.min(requestedMaxOutputTokens, maxOutputTokens). Update the
computeEffectiveMaxOutputTokens function to implement this three-way decision.
- Around line 68-72: The init() method should fail fast if required
collaborators are missing: before calling provider.getErrorCreator() or
policies.init(...), check that provider and policies are not null and throw a
clear configuration exception (e.g., IllegalStateException with a descriptive
message like "LLMGatewayInterceptor: missing provider" or "missing policies") so
initialization does not NPE; ensure errorCreator is obtained only after the
null-checks and then pass it to policies.init(errorCreator).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsProvider.java`:
- Around line 58-60: ChatCompletionsProvider.getErrorCreator() currently returns
null which will cause runtime failures; implement and return a concrete
LLMErrorCreator instance instead. Create a ChatCompletionsErrorCreator (or
similar) that implements LLMErrorCreator and encapsulates provider-specific
error mapping/serialization, then modify
ChatCompletionsProvider.getErrorCreator() to return a singleton/new instance of
that class so the gateway can build provider-specific error responses.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsRequest.java`:
- Around line 117-119: The getRequestedMaxOutputTokens method currently only
reads "max_completion_tokens" and misses requests using "max_tokens"; update
getRequestedMaxOutputTokens to check both JSON fields (prefer
"max_completion_tokens" when present, then "max_tokens") and return the first
non-zero/defined value, falling back to 0 if neither exists; modify the logic in
getRequestedMaxOutputTokens (and any helpers it calls) to read
json.path("max_completion_tokens").asLong(...) then
json.path("max_tokens").asLong(...) as a fallback to ensure token-limit checks
cover both keys.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ContentBlockStart.java`:
- Around line 25-29: In ContentBlockStart.from(...), avoid directly casting
(ObjectNode) on.path("content_block"); first check that
on.path("content_block").isObject() and only then assign to cb and proceed; if
it's not an object, leave cbs.toolUse null (or handle missing appropriately) and
only call ToolUse.from(cb) when cb is an ObjectNode to prevent
ClassCastException.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/SystemPrompt.java`:
- Line 45: The current SystemPrompt class logs the full prompt text at INFO via
log.info("Removing instructions: {}", instructions); change this so INFO-level
logs do not contain raw instructions: replace the INFO call with a log.message
that only records metadata (e.g., a short identifier or length) and move the
full instructions content to a DEBUG-level log (log.debug(..., instructions)) or
remove it entirely; update the SystemPrompt logging around the variable
instructions accordingly.
- Around line 26-32: The switch in SystemPrompt.handleRequest(LLMRequest aiReq,
Exchange exc) assumes the field action is non-null and can throw an NPE; guard
against null by checking SystemPrompt.action at the start of handleRequest
(e.g., if action == null) and either assign a sensible default action or
return/throw a clear exception; update the method to handle the null case before
the switch that references enum constants like OVERWRITE so request handling
never dereferences a null action.

---

Outside diff comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java`:
- Line 50: The INFO-level log in AbstractLLMEvent (the log.info("Unknown event
format: {}", sse.data()) call) exposes raw SSE payloads; change this so raw
payloads are only logged at DEBUG: replace the INFO log with a generic INFO/
WARN message that does not include sse.data() (e.g., "Unknown event format
received") and move the detailed payload logging to log.debug(...) where you
include sse.data() or a redacted version; update any surrounding handling in the
class/method to use log.debug for payloads and ensure no raw sse.data() appears
at INFO or higher.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/AiApiUser.java`:
- Around line 94-97: The setter AiApiUser.setTokens currently accepts negative
values causing invalid quotas; update setTokens(long tokens) to validate tokens
>= 0 and reject negatives (e.g., throw IllegalArgumentException or clamp to 0)
and ensure the class field 'tokens' is only assigned when the value is valid;
reference AiApiUser.setTokens and the 'tokens' field and add a clear error
message when rejecting negative input.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java`:
- Around line 49-53: The code in AbstractLLMEvent currently calls opt.get()
without handling the empty case; update the logic in the method where opt and
sse.data() are used (inside AbstractLLMEvent) to guard the Optional before
dereferencing: if opt.isEmpty() after parsing sse.data(), log the unknown event
as already done and then return/continue (or otherwise skip further processing)
instead of proceeding to var json = opt.get(); this prevents
NoSuchElementException and preserves streaming flow.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java`:
- Around line 73-77: The bearer parsing currently accepts headers like
"BearerXYZ" because it only matches the prefix and then trims the remainder;
update the check in AbstractLLMRequest to require an explicit separator after
BEARER_PREFIX (either make BEARER_PREFIX include the trailing space "Bearer " or
add an extra check that ah.length() > BEARER_PREFIX.length() &&
Character.isWhitespace(ah.charAt(BEARER_PREFIX.length()))), and only then
extract the token (using substring(BEARER_PREFIX.length()).trim() if you keep
the space-aware prefix). Ensure you reference the BEARER_PREFIX constant and the
auth header variable (ah) in your changes so malformed "BearerXYZ" headers are
rejected.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java`:
- Around line 84-86: AbstractLLMResponse.isError() can throw an NPE when the
field json is null on streaming paths; update the isError() method in class
AbstractLLMResponse to first check whether json is null and return false (or
equivalent safe value) before accessing json.get("error"), i.e., guard the json
field (null check) and only call json.get("error").isNull() when json is
non-null to avoid the NPE.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMRequest.java`:
- Around line 110-117: Several helper methods (e.g., getSystemPrompt)
dereference the field json and call getUri() without null checks, which can NPE
when body parsing fails or URI is absent; update each helper (getSystemPrompt
and other methods around lines referenced) to first null-check json (and its
nodes) before calling json.path(...) or json.putObject/remove, and null-check
getUri() before invoking getUri().contains(...) — return safe defaults (empty
string or false) when json or URI is null; use the existing json origin in the
class and mirror the null-guard pattern used at line 83 to avoid duplicate
dereferences across methods.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AbstractOpenAiLLMRequest.java`:
- Around line 28-39: In estimateInputTokens() of AbstractOpenAiLLMRequest, guard
the json field at the top before calling json.path(...): check that json is
non-null (and optionally json.isObject() or has expected paths) and return 0 or
a safe default token count if json is missing/unparseable; then proceed to call
countText(json.path("input")), estimateChatCompletitions(),
countText(json.path("system")), countJsonSize(json.path("tools")) and
countJsonSize(json.path("functions")). Ensure the guard uses the same json field
declared in this class and does not alter downstream countText()/countJsonSize()
contracts.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/JDBCAiApiUsageStore.java`:
- Around line 31-40: The CREATE_TABLE_SQL constant in JDBCAiApiUsageStore uses
PostgreSQL's "BIGINT GENERATED ALWAYS AS IDENTITY" which breaks on
MySQL/MariaDB; change to generate dialect-specific DDL at runtime instead of a
static constant: detect the DB product via
Connection.getMetaData().getDatabaseProductName() (or similar) in the class
constructor or the schema-initialization method, and choose "BIGINT
AUTO_INCREMENT PRIMARY KEY" for MySQL/MariaDB and the existing "BIGINT GENERATED
ALWAYS AS IDENTITY PRIMARY KEY" (or SERIAL) for PostgreSQL, then execute the
chosen CREATE TABLE statement to create ai_api_usage. Ensure the chosen symbol
CREATE_TABLE_SQL is replaced by a runtime variable or factory method used by the
table-creation logic.
- Around line 72-84: The JDBCAiApiUsageStore class currently has placeholder
implementations for getUser, checkLimit, and getRemainingResetTime which break
auth and quota enforcement; implement getUser(String token) to look up the API
user by token in the DB (query the API key / user table) and return
Optional<AiApiUser> populated with id, quota window and limits; implement
checkLimit(AiApiUser user, long inputTokens, long outputTokens) to run inside a
DB transaction: compute tokensToAdd = inputTokens+outputTokens, load and/or
initialize the user's current usage row for the current window, enforce the
configured limit (return 0 or throw as appropriate when over limit), increment
and persist the usage count, and return the remaining allowed tokens after the
increment; implement getRemainingResetTime() to calculate and return the
milliseconds until the current quota window resets (based on the configured
window start/duration for the store/user). Ensure proper SQL
queries/transactions, concurrency-safe updates (e.g., SELECT ... FOR UPDATE or
atomic update), and that all three methods in JDBCAiApiUsageStore use the same
time-window logic and fields as the rest of the store.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7f23e91c-06f6-4ec3-b233-994c570eafe2

📥 Commits

Reviewing files that changed from the base of the PR and between 38acaa3 and 6935d3b.

📒 Files selected for processing (60)
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/AbstractLLMEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/ChatCompletionDoneEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/ChatCompletionEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/DefaultPolicies.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/LLMGatewayInterceptor.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/NullPolicies.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/Policies.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/ResponsesApiEvent.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/SystemPrompt.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMMessage.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/chatcompletions/ChatCompletionsResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeErrorResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ClaudeProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ContentBlockDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ContentBlockStart.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/MessageDelta.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ToolUse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleErrorCreator.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleLLMResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/google/GoogleProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/AbstractOpenAiLLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAIChatCompletionsRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAIProvider.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAiLLMResponsesRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/openai/OpenAiLLMResponsesResponse.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/AiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/AiApiUser.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/JDBCAiApiUsageStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/Usage.java
  • core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java
  • core/src/test/java/com/predic8/membrane/core/interceptor/llmgateway/provider/AbstractLLMRequestTest.java
  • core/src/test/java/com/predic8/membrane/core/util/http/SSEParserTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/SharingApiKeysTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/StreamingOpenAiLLMGatewayTutorialTest.java
  • distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
  • distribution/tutorials/ai/llm-gateway/openai/max-output-stream.json
  • distribution/tutorials/ai/llm-gateway/openai/stream.json
✅ Files skipped from review due to trivial changes (8)
  • distribution/tutorials/ai/llm-gateway/openai/stream.json
  • distribution/tutorials/ai/llm-gateway/openai/max-output-stream.json
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/claude/ToolUse.java
  • distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/store/SimpleAiApiStore.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/provider/LLMRequest.java
  • core/src/main/java/com/predic8/membrane/core/interceptor/llmgateway/NullPolicies.java
  • distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
🚧 Files skipped from review as they are similar to previous changes (9)
  • distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml
  • distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/SharingApiKeysTutorialTest.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java
  • core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java
  • distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java
  • core/src/test/java/com/predic8/membrane/core/util/http/SSEParserTest.java

@predic8 predic8 merged commit a8a9ca3 into master May 27, 2026
4 of 5 checks passed
@predic8 predic8 deleted the ai-gateway branch May 27, 2026 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants