Commit 704428b

feat(codec): add missing optional AnnotatedLlmRequest/Response fields (NVIDIA#76)
## Summary

This PR expands normalized codec extraction around `AnnotatedLlmRequest` and `AnnotatedLlmResponse` for:

- OpenAI Chat Completions (`/v1/chat/completions`)
- OpenAI Responses (`/v1/responses`)
- Anthropic Messages (`/v1/messages`)
- Hybrid payload variants observed in inference gateways/provider bridges (`vLLM`, `LiteLLM`, `SGLang` patterns)

The goal is to extract more meaningful normalized state while preserving unmodeled provider-specific fields losslessly via `extra`.

## Additive Request IR State (`AnnotatedLlmRequest`)

Added normalized optional fields:

- `store: Option<bool>`
- `previous_response_id: Option<String>`
- `truncation: Option<Json>`
- `reasoning: Option<Json>`
- `include: Option<Json>`
- `user: Option<String>`
- `metadata: Option<Json>`
- `service_tier: Option<String>`
- `parallel_tool_calls: Option<bool>`
- `max_output_tokens: Option<u64>`
- `max_tool_calls: Option<u64>`
- `top_logprobs: Option<u64>`
- `stream: Option<bool>`

Multimodal request content expansion:

- `ContentPart::ImageUrl { image_url: OpenAiImageUrl }`
- `OpenAiImageUrl { url, detail }`

## Additive Response IR State (`ApiSpecificResponse`)

OpenAI Responses variant expanded with:

- `previous_response_id`
- `store`
- `service_tier`
- `truncation`
- `reasoning`
- `input_tokens_details`
- `output_tokens_details`

Anthropic Messages variant expanded with:

- `service_tier`
- `container`
- `content_blocks`

## OpenAI Responses request-side hardening

- Added strict-first decode behavior for heterogeneous `input` arrays
- Removed silent lossy fallback behavior
- Preserves unparsed mixed input items in `extra` (`_openai_responses_unparsed_input_items`) for round-trip safety
- Handles Anthropic-style tool hint combinations when present in mixed gateway payloads

## Anthropic request-side updates

- Expanded extraction for metadata, service-tier, and tool parallelism semantics
- Added explicit `tool_choice.type == "none"` parity in decode/encode
- Preserves bridge/runtime extension fields in `extra`
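The "extract known fields, keep everything else in `extra`" behavior described above can be sketched roughly as follows. This is a minimal, std-only illustration and not the crate's codec: `SketchRequest`, `decode`, and `encode` are hypothetical names, and a `BTreeMap<String, String>` stands in for a JSON object so the example needs no external crates.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a normalized request.
/// The real `AnnotatedLlmRequest` has many more fields and uses JSON values.
#[derive(Debug, Default, PartialEq)]
struct SketchRequest {
    user: Option<String>,
    service_tier: Option<String>,
    /// Every key the decoder does not model is kept here untouched.
    extra: BTreeMap<String, String>,
}

/// Decode: lift known keys into typed optional fields, keep the rest in `extra`.
fn decode(mut payload: BTreeMap<String, String>) -> SketchRequest {
    SketchRequest {
        user: payload.remove("user"),
        service_tier: payload.remove("service_tier"),
        // Whatever is left was not modeled, so it is preserved as-is.
        extra: payload,
    }
}

/// Encode: start from `extra` and write the typed fields back,
/// so unknown keys survive a decode/encode round trip.
fn encode(request: &SketchRequest) -> BTreeMap<String, String> {
    let mut payload = request.extra.clone();
    if let Some(user) = &request.user {
        payload.insert("user".to_string(), user.clone());
    }
    if let Some(tier) = &request.service_tier {
        payload.insert("service_tier".to_string(), tier.clone());
    }
    payload
}
```

In this sketch an absent optional field stays `None` and is not written back on encode, so a payload that never set `service_tier` does not gain one after a round trip.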
## Hybrid payload coverage added

Added fixture/test coverage for mixed/provider patterns:

- vLLM-style Anthropic and OpenAI Responses hybrids
- LiteLLM hybrid patterns for Anthropic and Responses
- SGLang Responses extension payloads

## Consumer blast-radius updates

Because request IR added new fields and a new `ContentPart` variant, downstream consumers were updated:

- `crates/adaptive`
- `crates/ffi` tests
- `crates/wasm` tests
- `crates/python`

## Scope note

This PR intentionally avoids a larger architectural shift. It keeps the current `AnnotatedLlmRequest` / `AnnotatedLlmResponse` IR approach and expands extraction additively.

## Validation performed

- `uv run pre-commit run --all-files`
- `cargo test -p nemo-flow-adaptive`
- `cargo test -p nemo-flow-ffi`
- `cargo test -p nemo-flow-wasm`
- `cargo test -p nemo-flow-python`
- `cargo test -p nemo-flow codec::`
- Live OpenAI Responses smoke test against the real API
- Live OpenAI Responses mixed tool-follow-up `input` round-trip test against the real API
- Live Anthropic Messages smoke test against the real API

## Live validation notes

Live provider validation covered:

- OpenAI Responses request/response decode and encode behavior
- OpenAI Responses usage detail preservation for `input_tokens_details.cached_tokens` and `output_tokens_details.reasoning_tokens`
- OpenAI mixed `input` array round-trip behavior using a real tool-calling follow-up request
- Anthropic Messages response preservation for `type` and `stop_reason`

## Remaining limitations

- This is still not exhaustive gateway conformance testing for every provider bridge variant
- Hybrid gateway behavior is still primarily fixture-backed rather than live-provider-backed

## Summary by CodeRabbit

## Release Notes

* **New Features**
  * Added support for multimodal message content including images alongside text.
  * Introduced parallel tool execution control to optimize tool-calling behavior.
  * Expanded request and response metadata fields for improved API compatibility (store, user, metadata, service tier, reasoning controls, and token limits).
* **Bug Fixes**
  * Improved text extraction from multimodal content to correctly handle image URLs.
* **Tests**
  * Updated test fixtures to support expanded request/response schema.

[![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NeMo-Flow/pull/76)

Authors:
- https://github.com/afourniernv

Approvers:
- Will Killian (https://github.com/willkill07)

URL: NVIDIA#76
1 parent d5798c2 commit 704428b

31 files changed

Lines changed: 2078 additions & 75 deletions

crates/adaptive/src/acg/ir_builder.rs

Lines changed: 3 additions & 3 deletions
@@ -147,9 +147,9 @@ fn extract_text(content: &MessageContent) -> String {
         MessageContent::Text(text) => text.clone(),
         MessageContent::Parts(parts) => parts
             .iter()
-            .map(|part| {
-                let ContentPart::Text { text } = part;
-                text.as_str()
+            .filter_map(|part| match part {
+                ContentPart::Text { text } => Some(text.as_str()),
+                ContentPart::ImageUrl { .. } => None,
             })
             .collect::<Vec<_>>()
             .join("\n"),
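For context on why this hunk is needed: `let ContentPart::Text { text } = part;` only compiles while `ContentPart` has a single variant, because the pattern must be irrefutable. Once `ImageUrl` exists, the extraction has to match on every variant. A self-contained sketch of the new behavior, using hypothetical trimmed-down mirrors of the real types (the real `ImageUrl` variant wraps an `OpenAiImageUrl`):

```rust
/// Hypothetical, trimmed-down mirrors of the types named in the diff.
#[allow(dead_code)]
enum ContentPart {
    Text { text: String },
    ImageUrl { url: String },
}

enum MessageContent {
    Text(String),
    Parts(Vec<ContentPart>),
}

/// Collect only the text parts, one per line; image parts contribute nothing.
fn extract_text(content: &MessageContent) -> String {
    match content {
        MessageContent::Text(text) => text.clone(),
        MessageContent::Parts(parts) => parts
            .iter()
            // `filter_map` drops the `None` results, so images are skipped
            // without leaving empty lines in the joined output.
            .filter_map(|part| match part {
                ContentPart::Text { text } => Some(text.as_str()),
                ContentPart::ImageUrl { .. } => None,
            })
            .collect::<Vec<_>>()
            .join("\n"),
    }
}
```

Because the match is exhaustive, adding a further variant later would again be a compile error here rather than a silent omission.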

crates/adaptive/src/acg_profile.rs

Lines changed: 6 additions & 1 deletion
@@ -188,7 +188,12 @@ fn extract_text(content: &MessageContent) -> String {
         MessageContent::Parts(parts) => parts
             .iter()
             .map(|part| match part {
-                ContentPart::Text { text } => text.as_str(),
+                ContentPart::Text { text } => text.clone(),
+                ContentPart::ImageUrl { image_url } => format!(
+                    "[image:{}:{}]",
+                    image_url.detail.as_deref().unwrap_or("none"),
+                    sha256_hex(&image_url.url)
+                ),
             })
             .collect::<Vec<_>>()
             .join("\n"),
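Unlike the IR builder, the profile path keeps a signal for images by emitting a placeholder token. The `Text` arm switches from `text.as_str()` to `text.clone()` because `format!` yields an owned `String`, and both match arms must have the same type. A rough std-only sketch of the token shape follows; `fnv1a_hex` and `image_token` are hypothetical names, and FNV-1a is used here only as a stand-in for the crate's `sha256_hex`, whose implementation is not shown in this diff.

```rust
/// Hypothetical mirror of the `OpenAiImageUrl { url, detail }` shape
/// listed in the commit message.
struct OpenAiImageUrl {
    url: String,
    detail: Option<String>,
}

/// FNV-1a 64-bit hash, used ONLY as a std-only stand-in for `sha256_hex`.
/// It is not cryptographic; the real code hashes with SHA-256.
fn fnv1a_hex(input: &str) -> String {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for byte in input.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    format!("{:016x}", hash)
}

/// Build the `[image:<detail>:<hash>]` placeholder for one image part.
fn image_token(image_url: &OpenAiImageUrl) -> String {
    format!(
        "[image:{}:{}]",
        // A missing `detail` is rendered as the literal "none".
        image_url.detail.as_deref().unwrap_or("none"),
        fnv1a_hex(&image_url.url)
    )
}
```

Hashing the URL rather than embedding it keeps the token short and fixed-length while still distinguishing different images.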

crates/adaptive/tests/integration/acg_module_surface_tests.rs

Lines changed: 26 additions & 0 deletions
@@ -143,6 +143,19 @@ fn acg_module_surface_policy_and_ir_builder_symbols_compile_from_canonical_names
         params: None,
         tools: None,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     };

@@ -188,6 +201,19 @@ fn acg_module_surface_build_prompt_ir_inserts_tool_schema_before_first_non_syste
             },
         }]),
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     };
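The same thirteen `None` lines recur in every fixture touched by this commit because the struct literals spell out each field. One possible way to shrink that blast radius is struct update syntax. The sketch below is an illustration only: it assumes the request type implements `Default`, which this diff does not show, and `SketchRequest` and `sample_request` are hypothetical.

```rust
/// Hypothetical, trimmed-down request type; the real struct has more fields.
#[derive(Debug, Default, PartialEq)]
struct SketchRequest {
    model: String,
    store: Option<bool>,
    max_output_tokens: Option<u64>,
    stream: Option<bool>,
}

/// Test fixture that names only the fields it cares about.
fn sample_request(model: &str) -> SketchRequest {
    SketchRequest {
        model: model.to_string(),
        stream: Some(false),
        // Every field not listed above takes its `Default` value,
        // which is `None` for `Option` fields.
        ..Default::default()
    }
}
```

The trade-off is visibility: explicit literals, as used in this commit, make the compiler flag every construction site when a field is added, whereas `..Default::default()` lets new fields slip in silently.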

crates/adaptive/tests/integration/redis_tests.rs

Lines changed: 13 additions & 0 deletions
@@ -120,6 +120,19 @@ fn sample_annotated_request(model: &str) -> AnnotatedLlmRequest {
         params: None,
         tools: None,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     }
 }

crates/adaptive/tests/integration/runtime_integration_tests.rs

Lines changed: 52 additions & 0 deletions
@@ -79,6 +79,19 @@ fn sample_annotated_request(model: &str) -> AnnotatedLlmRequest {
         params: None,
         tools: None,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: Map::new(),
     }
 }
@@ -100,6 +113,19 @@ fn sample_growing_chat_requests(model: &str) -> Vec<AnnotatedLlmRequest> {
             params: None,
             tools: None,
             tool_choice: None,
+            store: None,
+            previous_response_id: None,
+            truncation: None,
+            reasoning: None,
+            include: None,
+            user: None,
+            metadata: None,
+            service_tier: None,
+            parallel_tool_calls: None,
+            max_output_tokens: None,
+            max_tool_calls: None,
+            top_logprobs: None,
+            stream: None,
             extra: Map::new(),
         },
         AnnotatedLlmRequest {
@@ -128,6 +154,19 @@ fn sample_growing_chat_requests(model: &str) -> Vec<AnnotatedLlmRequest> {
             params: None,
             tools: None,
             tool_choice: None,
+            store: None,
+            previous_response_id: None,
+            truncation: None,
+            reasoning: None,
+            include: None,
+            user: None,
+            metadata: None,
+            service_tier: None,
+            parallel_tool_calls: None,
+            max_output_tokens: None,
+            max_tool_calls: None,
+            top_logprobs: None,
+            stream: None,
             extra: Map::new(),
         },
         AnnotatedLlmRequest {
@@ -167,6 +206,19 @@ fn sample_growing_chat_requests(model: &str) -> Vec<AnnotatedLlmRequest> {
             params: None,
             tools: None,
             tool_choice: None,
+            store: None,
+            previous_response_id: None,
+            truncation: None,
+            reasoning: None,
+            include: None,
+            user: None,
+            metadata: None,
+            service_tier: None,
+            parallel_tool_calls: None,
+            max_output_tokens: None,
+            max_tool_calls: None,
+            top_logprobs: None,
+            stream: None,
             extra: Map::new(),
         },
     ]

crates/adaptive/tests/unit/acg/ir_builder_tests.rs

Lines changed: 39 additions & 0 deletions
@@ -71,6 +71,19 @@ fn build_prompt_ir_inserts_tools_before_first_non_system_message_and_preserves_a
         params: None,
         tools: Some(vec![sample_tool_definition("search")]),
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     };

@@ -110,6 +123,19 @@ fn build_prompt_ir_appends_tool_blocks_when_request_contains_only_system_message
             sample_tool_definition("lookup"),
         ]),
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     };

@@ -139,6 +165,19 @@ fn build_prompt_ir_omits_tool_schema_hashes_when_no_tools_are_present() {
         params: None,
         tools: None,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     };

crates/adaptive/tests/unit/acg_component_tests.rs

Lines changed: 13 additions & 0 deletions
@@ -97,6 +97,19 @@ fn sample_annotated_request(model: &str) -> AnnotatedLlmRequest {
         params: None,
         tools: None,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     }
 }

crates/adaptive/tests/unit/acg_learner_tests.rs

Lines changed: 13 additions & 0 deletions
@@ -34,6 +34,19 @@ fn sample_request(model: &str, system: &str, user: &str) -> AnnotatedLlmRequest
         params: None,
         tools: None,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     }
 }

crates/adaptive/tests/unit/acg_profile_tests.rs

Lines changed: 48 additions & 1 deletion
@@ -4,7 +4,8 @@
 //! Unit tests for acg profile in the NeMo Flow adaptive crate.

 use nemo_flow::codec::request::{
-    AnnotatedLlmRequest, ContentPart, FunctionDefinition, Message, MessageContent, ToolDefinition,
+    AnnotatedLlmRequest, ContentPart, FunctionDefinition, Message, MessageContent, OpenAiImageUrl,
+    ToolDefinition,
 };
 use serde_json::json;

@@ -17,6 +18,19 @@ fn request(messages: Vec<Message>, tools: Option<Vec<ToolDefinition>>) -> Annota
         params: None,
         tools,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     }
 }
@@ -110,3 +124,36 @@ fn acg_profile_helpers_cover_none_paths_and_short_hash() {
     assert_eq!(short_hash("short"), "short");
     assert_eq!(message_role_tag(&too_short.messages[0]), "user");
 }
+
+#[test]
+fn acg_profile_image_parts_contribute_stable_fingerprint_signal() {
+    let with_image_a = request(
+        vec![Message::User {
+            content: MessageContent::Parts(vec![ContentPart::ImageUrl {
+                image_url: OpenAiImageUrl {
+                    url: "https://example.com/a.png".to_string(),
+                    detail: Some("high".to_string()),
+                },
+            }]),
+            name: None,
+        }],
+        None,
+    );
+    let with_image_b = request(
+        vec![Message::User {
+            content: MessageContent::Parts(vec![ContentPart::ImageUrl {
+                image_url: OpenAiImageUrl {
+                    url: "https://example.com/b.png".to_string(),
+                    detail: Some("high".to_string()),
+                },
+            }]),
+            name: None,
+        }],
+        None,
+    );
+
+    assert_ne!(
+        learning_seed_fingerprint(&with_image_a),
+        learning_seed_fingerprint(&with_image_b)
+    );
+}
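The new test checks sensitivity: two requests that differ only in image URL must not share a fingerprint. The property can be illustrated in isolation with a rough std-only sketch. `Part`, `fingerprint`, and `image` are hypothetical; the real `learning_seed_fingerprint` is not shown in this diff and may work differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified content part; not the crate's `ContentPart`.
#[derive(Hash)]
enum Part {
    Text(String),
    Image { url: String, detail: Option<String> },
}

/// Hypothetical fingerprint: hash every part, in order, with the std hasher.
/// `DefaultHasher` output is not guaranteed stable across Rust releases,
/// so a persisted fingerprint would need a fixed algorithm instead.
fn fingerprint(parts: &[Part]) -> u64 {
    let mut hasher = DefaultHasher::new();
    parts.hash(&mut hasher);
    hasher.finish()
}

/// Convenience constructor for an image part with `detail = "high"`.
fn image(url: &str) -> Part {
    Part::Image {
        url: url.to_string(),
        detail: Some("high".to_string()),
    }
}
```

A companion check for stability, where identical inputs yield identical fingerprints, would complement the `assert_ne!` in the test above.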

crates/adaptive/tests/unit/adaptive_hints_intercept_tests.rs

Lines changed: 13 additions & 0 deletions
@@ -179,6 +179,19 @@ fn test_adaptive_hints_intercept_injects_prediction_hints_and_manual_override()
         params: None,
         tools: None,
         tool_choice: None,
+        store: None,
+        previous_response_id: None,
+        truncation: None,
+        reasoning: None,
+        include: None,
+        user: None,
+        metadata: None,
+        service_tier: None,
+        parallel_tool_calls: None,
+        max_output_tokens: None,
+        max_tool_calls: None,
+        top_logprobs: None,
+        stream: None,
         extra: serde_json::Map::new(),
     };
     let (request, returned_annotated) = req_fn(
