Fix charset decoding for server.request.body.files_content in commons-fileupload #11212

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 16 commits into master from alejandro.gonzalez/APPSEC-61875-files-content-encoding
Apr 28, 2026
Member

@jandro996 jandro996 commented Apr 27, 2026

What Does This Do

  • Introduces MultipartContentDecoder in internal-api/src/main/java/datadog/trace/api/http/ — a shared utility that decodes multipart file content bytes using the charset declared in each part's Content-Type header
  • FileItemContentReader.readContent() now delegates to MultipartContentDecoder.decodeBytes(buf, total, fileItem.getContentType()) instead of hardcoding ISO-8859-1
  • MultipartContentDecoder.extractCharset() is case-insensitive, handles RFC 2045 quoted values (charset="UTF-8"), and guards against parameter name substring false positives (xcharset=… no longer matches as charset=)
  • CodingErrorAction.REPLACE is used so that truncation at a multibyte character boundary (from the MAX_CONTENT_BYTES cap) produces U+FFFD for the incomplete sequence rather than falling back to the JVM default charset for the entire string
  • CODEOWNERS updated to assign internal-api/src/main/java/datadog/trace/api/http/ to @DataDog/asm-java
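
The parsing rules in the bullets above can be sketched roughly as follows. This is a hypothetical illustration, not the merged code: the class name `CharsetParamSketch` and the method body are assumptions, `String.regionMatches(true, ...)` stands in for whatever case-insensitive scan the real class uses, and the set of delimiters that end an unquoted value is assumed.

```java
import java.nio.charset.Charset;

/** Hypothetical sketch of charset extraction from a part's Content-Type. */
final class CharsetParamSketch {
  private static final String KEY = "charset=";

  /** Returns the declared charset, or null when absent, empty, or unsupported. */
  static Charset extractCharset(String contentType) {
    if (contentType == null) {
      return null;
    }
    int len = contentType.length();
    for (int i = 0; i + KEY.length() <= len; i++) {
      // Accept a match only at position 0 or right after ';' or ' ',
      // so that "xcharset=" is not mistaken for "charset=".
      char prev = i == 0 ? ';' : contentType.charAt(i - 1);
      boolean atBoundary = prev == ';' || prev == ' ';
      if (!atBoundary || !contentType.regionMatches(true, i, KEY, 0, KEY.length())) {
        continue;
      }
      int start = i + KEY.length();
      int end;
      if (start < len && contentType.charAt(start) == '"') {
        // Quoted value (RFC 2045): take everything up to the closing quote.
        start++;
        end = contentType.indexOf('"', start);
        if (end < 0) {
          end = len;
        }
      } else {
        // Unquoted value: assumed to end at ';', ',' or ' '.
        end = start;
        while (end < len && ";, ".indexOf(contentType.charAt(end)) < 0) {
          end++;
        }
      }
      if (end == start) {
        return null;
      }
      try {
        return Charset.forName(contentType.substring(start, end));
      } catch (IllegalArgumentException e) {
        // Illegal or unsupported charset name.
        return null;
      }
    }
    return null;
  }
}
```

With this sketch, `text/plain; CHARSET="utf-8"` resolves to UTF-8, while `text/plain; xcharset=UTF-8` yields `null` because the only candidate match is preceded by `x`.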

Motivation

FileItemContentReader was decoding all uploaded file content with hardcoded ISO-8859-1. Files uploaded as UTF-8 or another charset arrived at the WAF with garbled non-ASCII characters, preventing detection on any content outside the ASCII range.
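
To make the failure mode concrete, here is a small standalone sketch (the class name `MojibakeSketch` is hypothetical and unrelated to the tracer code) that decodes the UTF-8 bytes of "café" once with ISO-8859-1 and once with UTF-8:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of decoding the same bytes with two different charsets. */
final class MojibakeSketch {

  static String decode(byte[] content, Charset charset) {
    return new String(content, charset);
  }

  public static void main(String[] args) {
    // "café" as a client would upload it in UTF-8: 63 61 66 C3 A9
    byte[] uploaded = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

    // Hardcoded ISO-8859-1 turns the two-byte sequence C3 A9 into two characters.
    String garbled = decode(uploaded, StandardCharsets.ISO_8859_1);
    // Honouring the declared charset recovers the original text.
    String correct = decode(uploaded, StandardCharsets.UTF_8);

    System.out.println(garbled.length()); // 5
    System.out.println(correct.length()); // 4
  }
}
```

The garbled string contains U+00C3 and U+00A9 in place of U+00E9, so a rule that matches on the original non-ASCII text would not see it.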

Additional Notes

Charset fallback diverges from StoredByteBody: when no charset is declared in the part's Content-Type, MultipartContentDecoder falls back to Charset.defaultCharset() (the JVM default) rather than hardcoding UTF-8 → ISO-8859-1 as StoredByteBody does. This was deliberately chosen after discussion with Manu — the JVM default is a safer assumption for a multi-framework utility than an opinionated constant.

catch (CharacterCodingException e) is unreachable at runtime: CharsetDecoder.decode(ByteBuffer) declares throws CharacterCodingException as a checked exception so the compiler requires the catch, but CodingErrorAction.REPLACE ensures the exception is never actually thrown. The catch block rethrows as IllegalStateException to make the intent clear.
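
A minimal sketch of that pattern, assuming a decode helper that takes the buffer, the number of valid bytes, and an already-resolved charset. The class `ReplaceDecodeSketch` and its signature are hypothetical and may differ from `MultipartContentDecoder.decodeBytes`:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

/** Hypothetical sketch of decoding with REPLACE and a compiler-mandated catch. */
final class ReplaceDecodeSketch {

  static String decode(byte[] buf, int length, Charset charset) {
    try {
      return charset
          .newDecoder()
          .onMalformedInput(CodingErrorAction.REPLACE)
          .onUnmappableCharacter(CodingErrorAction.REPLACE)
          .decode(ByteBuffer.wrap(buf, 0, length))
          .toString();
    } catch (CharacterCodingException e) {
      // decode(ByteBuffer) declares this checked exception, so it must be handled,
      // but with REPLACE for both error kinds it is not expected to be thrown.
      throw new IllegalStateException("unexpected with CodingErrorAction.REPLACE", e);
    }
  }
}
```

For example, decoding only the first 4 of the 5 UTF-8 bytes of "café" with this helper gives "caf" followed by U+FFFD, rather than an exception.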

MultipartContentDecoder is designed for reuse: the same class is used in PR #11198 (Tomcat and Netty integrations) so the charset parsing logic lives in one place.

Jira ticket: APPSEC-61875

@jandro996 jandro996 added the labels type: bug (Bug report and fix), tag: no release notes (Changes to exclude from release notes), and comp: asm waf (Application Security Management (WAF)) on Apr 27, 2026
…Test

Charset fallback scenarios are covered by MultipartContentDecoderTest.
One integration test is kept to verify that getContentType() is passed through.
pr-commenter Bot commented Apr 27, 2026

Benchmarks

Startup

Parameters

| | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | alejandro.gonzalez/APPSEC-61875-files-content-encoding |
| git_commit_date | 1777321489 | 1777371692 |
| git_commit_sha | 75fe2b3 | 33f9eb1 |
| release_version | 1.62.0-SNAPSHOT~75fe2b3c55 | 1.62.0-SNAPSHOT~33f9eb1764 |

See matching parameters

| | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1777373487 | 1777373487 |
| ci_job_id | 1637001070 | 1637001070 |
| ci_pipeline_id | 110102513 | 110102513 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-jsb4egb5 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-jsb4egb5 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 63 metrics, 8 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.064 s) : 0, 1064236
Total [baseline] (8.825 s) : 0, 8825491
Agent [candidate] (1.064 s) : 0, 1064393
Total [candidate] (8.837 s) : 0, 8837398
section iast
Agent [baseline] (1.241 s) : 0, 1241349
Total [baseline] (9.528 s) : 0, 9527842
Agent [candidate] (1.248 s) : 0, 1247917
Total [candidate] (9.545 s) : 0, 9545178
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.064 s | - |
| Agent | iast | 1.241 s | 177.113 ms (16.6%) |
| Total | tracing | 8.825 s | - |
| Total | iast | 9.528 s | 702.351 ms (8.0%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.064 s | - |
| Agent | iast | 1.248 s | 183.524 ms (17.2%) |
| Total | tracing | 8.837 s | - |
| Total | iast | 9.545 s | 707.78 ms (8.0%) |

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.232 ms) : 0, 1232
crashtracking [candidate] (1.218 ms) : 0, 1218
BytebuddyAgent [baseline] (636.228 ms) : 0, 636228
BytebuddyAgent [candidate] (635.815 ms) : 0, 635815
AgentMeter [baseline] (29.472 ms) : 0, 29472
AgentMeter [candidate] (29.47 ms) : 0, 29470
GlobalTracer [baseline] (249.048 ms) : 0, 249048
GlobalTracer [candidate] (249.158 ms) : 0, 249158
AppSec [baseline] (32.761 ms) : 0, 32761
AppSec [candidate] (32.626 ms) : 0, 32626
Debugger [baseline] (60.495 ms) : 0, 60495
Debugger [candidate] (59.872 ms) : 0, 59872
Remote Config [baseline] (598.483 µs) : 0, 598
Remote Config [candidate] (598.748 µs) : 0, 599
Telemetry [baseline] (8.366 ms) : 0, 8366
Telemetry [candidate] (8.37 ms) : 0, 8370
Flare Poller [baseline] (9.929 ms) : 0, 9929
Flare Poller [candidate] (11.26 ms) : 0, 11260
section iast
crashtracking [baseline] (1.225 ms) : 0, 1225
crashtracking [candidate] (1.221 ms) : 0, 1221
BytebuddyAgent [baseline] (821.637 ms) : 0, 821637
BytebuddyAgent [candidate] (827.744 ms) : 0, 827744
AgentMeter [baseline] (11.275 ms) : 0, 11275
AgentMeter [candidate] (11.271 ms) : 0, 11271
GlobalTracer [baseline] (237.53 ms) : 0, 237530
GlobalTracer [candidate] (237.381 ms) : 0, 237381
AppSec [baseline] (30.496 ms) : 0, 30496
AppSec [candidate] (30.574 ms) : 0, 30574
Debugger [baseline] (62.45 ms) : 0, 62450
Debugger [candidate] (62.557 ms) : 0, 62557
Remote Config [baseline] (534.095 µs) : 0, 534
Remote Config [candidate] (527.302 µs) : 0, 527
Telemetry [baseline] (7.935 ms) : 0, 7935
Telemetry [candidate] (7.926 ms) : 0, 7926
Flare Poller [baseline] (3.345 ms) : 0, 3345
Flare Poller [candidate] (3.344 ms) : 0, 3344
IAST [baseline] (28.955 ms) : 0, 28955
IAST [candidate] (29.225 ms) : 0, 29225
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.071 s) : 0, 1070587
Total [baseline] (11.074 s) : 0, 11073615
Agent [candidate] (1.064 s) : 0, 1064355
Total [candidate] (11.111 s) : 0, 11111029
section appsec
Agent [baseline] (1.279 s) : 0, 1279411
Total [baseline] (11.19 s) : 0, 11189722
Agent [candidate] (1.269 s) : 0, 1268854
Total [candidate] (11.152 s) : 0, 11152139
section iast
Agent [baseline] (1.247 s) : 0, 1247351
Total [baseline] (11.263 s) : 0, 11263068
Agent [candidate] (1.243 s) : 0, 1243260
Total [candidate] (11.211 s) : 0, 11210656
section profiling
Agent [baseline] (1.19 s) : 0, 1190027
Total [baseline] (10.992 s) : 0, 10991747
Agent [candidate] (1.204 s) : 0, 1203760
Total [candidate] (11.022 s) : 0, 11021941
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.071 s | - |
| Agent | appsec | 1.279 s | 208.824 ms (19.5%) |
| Agent | iast | 1.247 s | 176.764 ms (16.5%) |
| Agent | profiling | 1.19 s | 119.44 ms (11.2%) |
| Total | tracing | 11.074 s | - |
| Total | appsec | 11.19 s | 116.107 ms (1.0%) |
| Total | iast | 11.263 s | 189.453 ms (1.7%) |
| Total | profiling | 10.992 s | -81.868 ms (-0.7%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.064 s | - |
| Agent | appsec | 1.269 s | 204.5 ms (19.2%) |
| Agent | iast | 1.243 s | 178.905 ms (16.8%) |
| Agent | profiling | 1.204 s | 139.406 ms (13.1%) |
| Total | tracing | 11.111 s | - |
| Total | appsec | 11.152 s | 41.11 ms (0.4%) |
| Total | iast | 11.211 s | 99.627 ms (0.9%) |
| Total | profiling | 11.022 s | -89.088 ms (-0.8%) |

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.228 ms) : 0, 1228
crashtracking [candidate] (1.21 ms) : 0, 1210
BytebuddyAgent [baseline] (638.614 ms) : 0, 638614
BytebuddyAgent [candidate] (635.624 ms) : 0, 635624
AgentMeter [baseline] (29.714 ms) : 0, 29714
AgentMeter [candidate] (29.607 ms) : 0, 29607
GlobalTracer [baseline] (250.617 ms) : 0, 250617
GlobalTracer [candidate] (249.936 ms) : 0, 249936
AppSec [baseline] (33.089 ms) : 0, 33089
AppSec [candidate] (32.889 ms) : 0, 32889
Debugger [baseline] (60.753 ms) : 0, 60753
Debugger [candidate] (60.927 ms) : 0, 60927
Remote Config [baseline] (601.425 µs) : 0, 601
Remote Config [candidate] (609.0 µs) : 0, 609
Telemetry [baseline] (10.749 ms) : 0, 10749
Telemetry [candidate] (9.183 ms) : 0, 9183
Flare Poller [baseline] (9.127 ms) : 0, 9127
Flare Poller [candidate] (8.438 ms) : 0, 8438
section appsec
crashtracking [baseline] (1.24 ms) : 0, 1240
crashtracking [candidate] (1.223 ms) : 0, 1223
BytebuddyAgent [baseline] (686.073 ms) : 0, 686073
BytebuddyAgent [candidate] (674.611 ms) : 0, 674611
AgentMeter [baseline] (12.438 ms) : 0, 12438
AgentMeter [candidate] (12.237 ms) : 0, 12237
GlobalTracer [baseline] (251.247 ms) : 0, 251247
GlobalTracer [candidate] (250.792 ms) : 0, 250792
AppSec [baseline] (185.726 ms) : 0, 185726
AppSec [candidate] (186.52 ms) : 0, 186520
Debugger [baseline] (65.468 ms) : 0, 65468
Debugger [candidate] (65.283 ms) : 0, 65283
Remote Config [baseline] (589.489 µs) : 0, 589
Remote Config [candidate] (562.502 µs) : 0, 563
Telemetry [baseline] (7.988 ms) : 0, 7988
Telemetry [candidate] (7.899 ms) : 0, 7899
Flare Poller [baseline] (6.789 ms) : 0, 6789
Flare Poller [candidate] (8.145 ms) : 0, 8145
IAST [baseline] (24.874 ms) : 0, 24874
IAST [candidate] (24.951 ms) : 0, 24951
section iast
crashtracking [baseline] (1.219 ms) : 0, 1219
crashtracking [candidate] (1.246 ms) : 0, 1246
BytebuddyAgent [baseline] (825.872 ms) : 0, 825872
BytebuddyAgent [candidate] (822.708 ms) : 0, 822708
AgentMeter [baseline] (11.444 ms) : 0, 11444
AgentMeter [candidate] (11.291 ms) : 0, 11291
GlobalTracer [baseline] (237.775 ms) : 0, 237775
GlobalTracer [candidate] (237.555 ms) : 0, 237555
AppSec [baseline] (32.167 ms) : 0, 32167
AppSec [candidate] (30.661 ms) : 0, 30661
Debugger [baseline] (64.118 ms) : 0, 64118
Debugger [candidate] (63.798 ms) : 0, 63798
Remote Config [baseline] (532.535 µs) : 0, 533
Remote Config [candidate] (522.196 µs) : 0, 522
Telemetry [baseline] (7.961 ms) : 0, 7961
Telemetry [candidate] (7.909 ms) : 0, 7909
Flare Poller [baseline] (3.455 ms) : 0, 3455
Flare Poller [candidate] (3.424 ms) : 0, 3424
IAST [baseline] (26.625 ms) : 0, 26625
IAST [candidate] (28.15 ms) : 0, 28150
section profiling
crashtracking [baseline] (1.179 ms) : 0, 1179
crashtracking [candidate] (1.208 ms) : 0, 1208
BytebuddyAgent [baseline] (694.486 ms) : 0, 694486
BytebuddyAgent [candidate] (704.139 ms) : 0, 704139
AgentMeter [baseline] (8.96 ms) : 0, 8960
AgentMeter [candidate] (9.095 ms) : 0, 9095
GlobalTracer [baseline] (208.714 ms) : 0, 208714
GlobalTracer [candidate] (210.324 ms) : 0, 210324
AppSec [baseline] (32.757 ms) : 0, 32757
AppSec [candidate] (33.182 ms) : 0, 33182
Debugger [baseline] (66.122 ms) : 0, 66122
Debugger [candidate] (66.31 ms) : 0, 66310
Remote Config [baseline] (594.393 µs) : 0, 594
Remote Config [candidate] (583.26 µs) : 0, 583
Telemetry [baseline] (8.109 ms) : 0, 8109
Telemetry [candidate] (8.177 ms) : 0, 8177
Flare Poller [baseline] (3.533 ms) : 0, 3533
Flare Poller [candidate] (3.671 ms) : 0, 3671
ProfilingAgent [baseline] (93.988 ms) : 0, 93988
ProfilingAgent [candidate] (94.885 ms) : 0, 94885
Profiling [baseline] (94.544 ms) : 0, 94544
Profiling [candidate] (95.44 ms) : 0, 95440

Load

Parameters

| | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | alejandro.gonzalez/APPSEC-61875-files-content-encoding |
| git_commit_date | 1777321489 | 1777371692 |
| git_commit_sha | 75fe2b3 | 33f9eb1 |
| release_version | 1.62.0-SNAPSHOT~75fe2b3c55 | 1.62.0-SNAPSHOT~33f9eb1764 |

See matching parameters

| | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1777374072 | 1777374072 |
| ci_job_id | 1637001072 | 1637001072 |
| ci_pipeline_id | 110102513 | 110102513 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-vifwlgy6 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-vifwlgy6 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 21 metrics, 15 unstable metrics.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.55 ms) : 19353, 19746
.   : milestone, 19550,
appsec (18.735 ms) : 18546, 18924
.   : milestone, 18735,
code_origins (17.975 ms) : 17798, 18152
.   : milestone, 17975,
iast (17.978 ms) : 17798, 18157
.   : milestone, 17978,
profiling (18.788 ms) : 18601, 18975
.   : milestone, 18788,
tracing (17.933 ms) : 17754, 18111
.   : milestone, 17933,
section candidate
no_agent (19.372 ms) : 19176, 19569
.   : milestone, 19372,
appsec (18.68 ms) : 18492, 18868
.   : milestone, 18680,
code_origins (17.776 ms) : 17599, 17952
.   : milestone, 17776,
iast (17.887 ms) : 17709, 18064
.   : milestone, 17887,
profiling (18.438 ms) : 18255, 18622
.   : milestone, 18438,
tracing (17.694 ms) : 17519, 17869
.   : milestone, 17694,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 19.55 ms [19.353 ms, 19.746 ms] | - |
| appsec | 18.735 ms [18.546 ms, 18.924 ms] | -814.743 µs (-4.2%) |
| code_origins | 17.975 ms [17.798 ms, 18.152 ms] | -1.574 ms (-8.1%) |
| iast | 17.978 ms [17.798 ms, 18.157 ms] | -1.572 ms (-8.0%) |
| profiling | 18.788 ms [18.601 ms, 18.975 ms] | -761.541 µs (-3.9%) |
| tracing | 17.933 ms [17.754 ms, 18.111 ms] | -1.617 ms (-8.3%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 19.372 ms [19.176 ms, 19.569 ms] | - |
| appsec | 18.68 ms [18.492 ms, 18.868 ms] | -692.197 µs (-3.6%) |
| code_origins | 17.776 ms [17.599 ms, 17.952 ms] | -1.597 ms (-8.2%) |
| iast | 17.887 ms [17.709 ms, 18.064 ms] | -1.486 ms (-7.7%) |
| profiling | 18.438 ms [18.255 ms, 18.622 ms] | -933.897 µs (-4.8%) |
| tracing | 17.694 ms [17.519 ms, 17.869 ms] | -1.678 ms (-8.7%) |

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.27 ms) : 1258, 1282
.   : milestone, 1270,
iast (3.318 ms) : 3272, 3364
.   : milestone, 3318,
iast_FULL (6.068 ms) : 6006, 6130
.   : milestone, 6068,
iast_GLOBAL (3.689 ms) : 3627, 3750
.   : milestone, 3689,
profiling (2.267 ms) : 2246, 2289
.   : milestone, 2267,
tracing (1.953 ms) : 1936, 1969
.   : milestone, 1953,
section candidate
no_agent (1.237 ms) : 1225, 1249
.   : milestone, 1237,
iast (3.374 ms) : 3328, 3420
.   : milestone, 3374,
iast_FULL (5.983 ms) : 5922, 6045
.   : milestone, 5983,
iast_GLOBAL (3.686 ms) : 3625, 3747
.   : milestone, 3686,
profiling (2.152 ms) : 2133, 2172
.   : milestone, 2152,
tracing (1.932 ms) : 1916, 1948
.   : milestone, 1932,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.27 ms [1.258 ms, 1.282 ms] | - |
| iast | 3.318 ms [3.272 ms, 3.364 ms] | 2.048 ms (161.3%) |
| iast_FULL | 6.068 ms [6.006 ms, 6.13 ms] | 4.799 ms (377.9%) |
| iast_GLOBAL | 3.689 ms [3.627 ms, 3.75 ms] | 2.419 ms (190.5%) |
| profiling | 2.267 ms [2.246 ms, 2.289 ms] | 997.626 µs (78.6%) |
| tracing | 1.953 ms [1.936 ms, 1.969 ms] | 682.809 µs (53.8%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.237 ms [1.225 ms, 1.249 ms] | - |
| iast | 3.374 ms [3.328 ms, 3.42 ms] | 2.137 ms (172.8%) |
| iast_FULL | 5.983 ms [5.922 ms, 6.045 ms] | 4.747 ms (383.8%) |
| iast_GLOBAL | 3.686 ms [3.625 ms, 3.747 ms] | 2.449 ms (198.0%) |
| profiling | 2.152 ms [2.133 ms, 2.172 ms] | 915.667 µs (74.0%) |
| tracing | 1.932 ms [1.916 ms, 1.948 ms] | 695.355 µs (56.2%) |

Dacapo

Parameters

| | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | alejandro.gonzalez/APPSEC-61875-files-content-encoding |
| git_commit_date | 1777321489 | 1777371692 |
| git_commit_sha | 75fe2b3 | 33f9eb1 |
| release_version | 1.62.0-SNAPSHOT~75fe2b3c55 | 1.62.0-SNAPSHOT~33f9eb1764 |

See matching parameters

| | Baseline | Candidate |
|---|---|---|
| application | biojava | biojava |
| ci_job_date | 1777373699 | 1777373699 |
| ci_job_id | 1637001074 | 1637001074 |
| ci_pipeline_id | 110102513 | 110102513 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-0n5staxo 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-0n5staxo 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.932 s) : 14932000, 14932000
.   : milestone, 14932000,
appsec (14.726 s) : 14726000, 14726000
.   : milestone, 14726000,
iast (18.183 s) : 18183000, 18183000
.   : milestone, 18183000,
iast_GLOBAL (18.028 s) : 18028000, 18028000
.   : milestone, 18028000,
profiling (15.033 s) : 15033000, 15033000
.   : milestone, 15033000,
tracing (14.76 s) : 14760000, 14760000
.   : milestone, 14760000,
section candidate
no_agent (15.457 s) : 15457000, 15457000
.   : milestone, 15457000,
appsec (14.523 s) : 14523000, 14523000
.   : milestone, 14523000,
iast (19.089 s) : 19089000, 19089000
.   : milestone, 19089000,
iast_GLOBAL (17.891 s) : 17891000, 17891000
.   : milestone, 17891000,
profiling (15.069 s) : 15069000, 15069000
.   : milestone, 15069000,
tracing (14.763 s) : 14763000, 14763000
.   : milestone, 14763000,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 14.932 s [14.932 s, 14.932 s] | - |
| appsec | 14.726 s [14.726 s, 14.726 s] | -206.0 ms (-1.4%) |
| iast | 18.183 s [18.183 s, 18.183 s] | 3.251 s (21.8%) |
| iast_GLOBAL | 18.028 s [18.028 s, 18.028 s] | 3.096 s (20.7%) |
| profiling | 15.033 s [15.033 s, 15.033 s] | 101.0 ms (0.7%) |
| tracing | 14.76 s [14.76 s, 14.76 s] | -172.0 ms (-1.2%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.457 s [15.457 s, 15.457 s] | - |
| appsec | 14.523 s [14.523 s, 14.523 s] | -934.0 ms (-6.0%) |
| iast | 19.089 s [19.089 s, 19.089 s] | 3.632 s (23.5%) |
| iast_GLOBAL | 17.891 s [17.891 s, 17.891 s] | 2.434 s (15.7%) |
| profiling | 15.069 s [15.069 s, 15.069 s] | -388.0 ms (-2.5%) |
| tracing | 14.763 s [14.763 s, 14.763 s] | -694.0 ms (-4.5%) |

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~33f9eb1764, baseline=1.62.0-SNAPSHOT~75fe2b3c55
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.486 ms) : 1475, 1498
.   : milestone, 1486,
appsec (3.822 ms) : 3601, 4043
.   : milestone, 3822,
iast (2.284 ms) : 2214, 2355
.   : milestone, 2284,
iast_GLOBAL (2.314 ms) : 2243, 2384
.   : milestone, 2314,
profiling (2.111 ms) : 2056, 2167
.   : milestone, 2111,
tracing (2.074 ms) : 2020, 2128
.   : milestone, 2074,
section candidate
no_agent (1.482 ms) : 1471, 1494
.   : milestone, 1482,
appsec (3.784 ms) : 3567, 4001
.   : milestone, 3784,
iast (2.28 ms) : 2210, 2350
.   : milestone, 2280,
iast_GLOBAL (2.319 ms) : 2248, 2390
.   : milestone, 2319,
profiling (2.526 ms) : 2360, 2693
.   : milestone, 2526,
tracing (2.072 ms) : 2018, 2126
.   : milestone, 2072,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.486 ms [1.475 ms, 1.498 ms] | - |
| appsec | 3.822 ms [3.601 ms, 4.043 ms] | 2.336 ms (157.2%) |
| iast | 2.284 ms [2.214 ms, 2.355 ms] | 798.302 µs (53.7%) |
| iast_GLOBAL | 2.314 ms [2.243 ms, 2.384 ms] | 827.636 µs (55.7%) |
| profiling | 2.111 ms [2.056 ms, 2.167 ms] | 625.32 µs (42.1%) |
| tracing | 2.074 ms [2.02 ms, 2.128 ms] | 587.754 µs (39.5%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.482 ms [1.471 ms, 1.494 ms] | - |
| appsec | 3.784 ms [3.567 ms, 4.001 ms] | 2.302 ms (155.3%) |
| iast | 2.28 ms [2.21 ms, 2.35 ms] | 798.013 µs (53.8%) |
| iast_GLOBAL | 2.319 ms [2.248 ms, 2.39 ms] | 836.737 µs (56.5%) |
| profiling | 2.526 ms [2.36 ms, 2.693 ms] | 1.044 ms (70.4%) |
| tracing | 2.072 ms [2.018 ms, 2.126 ms] | 589.838 µs (39.8%) |

…entDecoder

Replaces hardcoded UTF-8 (no-charset default) and ISO-8859-1 (fallback)
with Charset.defaultCharset() in both cases, per reviewer feedback.
Member

@manuel-alvarez-alvarez manuel-alvarez-alvarez left a comment

LGTM,

@jandro996 jandro996 marked this pull request as ready for review April 28, 2026 08:48
@jandro996 jandro996 requested review from a team as code owners April 28, 2026 08:48

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e74778b436

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal-api/src/main/java/datadog/trace/api/http/MultipartContentDecoder.java Outdated
…partContentDecoder

RFC 2045 allows quoted parameter values (charset="UTF-8"). Without stripping
the quotes Charset.forName rejects the name and decodeBytes falls back to the
JVM default instead of the declared charset.
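
As a small illustration of why the quotes matter (a sketch with hypothetical helper names, not the PR's code): `Charset.forName` treats `"` as an illegal character in a charset name, so the quoted form fails the lookup until the quotes are stripped.

```java
import java.nio.charset.Charset;

/** Hypothetical demo: quoted charset names must be unquoted before lookup. */
final class QuotedCharsetSketch {

  /** Returns the charset, or null if the name is illegal or unsupported. */
  static Charset lookup(String name) {
    try {
      return Charset.forName(name);
    } catch (IllegalArgumentException e) {
      return null;
    }
  }

  /** Removes one pair of surrounding double quotes, if present. */
  static String stripQuotes(String value) {
    int n = value.length();
    if (n >= 2 && value.charAt(0) == '"' && value.charAt(n - 1) == '"') {
      return value.substring(1, n - 1);
    }
    return value;
  }

  public static void main(String[] args) {
    String raw = "\"UTF-8\"";
    System.out.println(lookup(raw));              // null
    System.out.println(lookup(stripQuotes(raw))); // UTF-8
  }
}
```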
…ecoder

String#split is forbidden (uses regex internally). Replace with an explicit
char scan to find the first ; , or space after charset=.
@jandro996
Member Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 228e1c467a


…oder.extractCharset

Replace toLowerCase(Locale.ROOT).indexOf with an inline ASCII case-insensitive
scan to avoid allocating a full lowercase copy of the Content-Type string.
Also use the already-computed end variable as the loop bound.

All files in this package (StoredByteBody, StoredBodySupplier,
MultipartContentDecoder, etc.) are AppSec HTTP body inspection
infrastructure.
@jandro996 jandro996 requested a review from a team as a code owner April 28, 2026 09:57
@jandro996 jandro996 requested review from dougqh and removed request for a team April 28, 2026 09:57
When FileItemContentReader truncates at MAX_CONTENT_BYTES a cut in the
middle of a multibyte character no longer triggers the fallback path.
REPLACE substitutes the incomplete sequence with U+FFFD using the
declared charset; REPORT was throwing and silently switching to the
JVM default charset for the whole string.
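
The difference between the two error actions can be sketched as follows. This is a hypothetical reconstruction based only on the commit note above: the class and method names are made up, and the fallback charset is passed explicitly (ISO-8859-1 in the example) instead of using the JVM default, so that the outcome is deterministic.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

/** Hypothetical comparison of REPORT-with-fallback and REPLACE decoding. */
final class ReportVsReplaceSketch {

  static String decode(byte[] bytes, Charset charset, CodingErrorAction action)
      throws CharacterCodingException {
    return charset
        .newDecoder()
        .onMalformedInput(action)
        .onUnmappableCharacter(action)
        .decode(ByteBuffer.wrap(bytes))
        .toString();
  }

  /** Earlier shape as described: REPORT throws, then the whole input is redecoded. */
  static String reportThenFallback(byte[] bytes, Charset declared, Charset fallback) {
    try {
      return decode(bytes, declared, CodingErrorAction.REPORT);
    } catch (CharacterCodingException e) {
      return new String(bytes, fallback);
    }
  }

  /** Newer shape: only the incomplete sequence becomes U+FFFD. */
  static String replace(byte[] bytes, Charset declared) {
    try {
      return decode(bytes, declared, CodingErrorAction.REPLACE);
    } catch (CharacterCodingException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Take the UTF-8 bytes of "été" cut after four bytes (`C3 A9 74 C3`). With an ISO-8859-1 fallback, `reportThenFallback` garbles the leading, perfectly valid "é" as well, whereas `replace` keeps "ét" and substitutes only the dangling byte with U+FFFD.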
@jandro996
Member Author

@codex review

With CodingErrorAction.REPLACE the decoder never throws
CharacterCodingException, making the catch branch unreachable.
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22536b19ea


Comment thread internal-api/src/main/java/datadog/trace/api/http/MultipartContentDecoder.java Outdated
Substring search could match 'xcharset=' as 'charset=', allowing
a client-controlled decoy parameter to override the real charset.
Now requires the match to be at position 0 or preceded by ';' or ' '.

CharsetDecoder.decode(ByteBuffer) declares throws CharacterCodingException
even though CodingErrorAction.REPLACE makes it unreachable; the compiler
still requires the exception to be caught or declared.
@jandro996
Member Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. What shall we delve into next?


@jandro996 jandro996 enabled auto-merge April 28, 2026 10:27
@jandro996 jandro996 changed the title from "Fix per-part charset decoding in FileItemContentReader for files_content" to "Fix charset decoding for server.request.body.files_content in commons-fileupload" Apr 28, 2026
@jandro996 jandro996 added this pull request to the merge queue Apr 28, 2026
Contributor

dd-octo-sts Bot commented Apr 28, 2026

/merge

gh-worker-devflow-routing-ef8351 Bot commented Apr 28, 2026

View all feedbacks in Devflow UI.

2026-04-28 11:17:52 UTC ℹ️ Start processing command /merge


2026-04-28 11:17:57 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 2h (p90).


2026-04-28 12:33:19 UTC ℹ️ MergeQueue: This merge request was merged

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 28, 2026
jandro996 added a commit that referenced this pull request Apr 28, 2026
Introduces MultipartContentDecoder (internal-api) to decode multipart file bytes
using the charset declared in each part's Content-Type header, with JVM-default
fallback and REPLACE on malformed input. Mirrors the approach in PR #11212 for
commons-fileupload so all three integrations share the same decoding logic.
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 9d73760 into master Apr 28, 2026
762 of 767 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the alejandro.gonzalez/APPSEC-61875-files-content-encoding branch April 28, 2026 12:33
@github-actions github-actions Bot added this to the 1.62.0 milestone Apr 28, 2026
    try {
      return Charset.forName(name);
    } catch (IllegalArgumentException e) {
      return null;
Contributor

Does this risk a NullPointerException elsewhere?
Should we fallback to a default Charset instead?

Member Author

Right now it's only used in the decodeBytes method, where the null case is handled:
if (charset == null) charset = Charset.defaultCharset();

It probably makes more sense to return the default where you mention. I'm just not sure whether, in the future, we'll need to know exactly when the charset could not be resolved.

I could change it in my next PR related with this topic if you want :)
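
For reference, the two shapes discussed in this thread could look roughly like this. Both methods and the class name are hypothetical; the first mirrors the `return null` in the snippet under review, the second is the reviewer's suggested alternative.

```java
import java.nio.charset.Charset;

/** Hypothetical sketch of the two lookup contracts discussed in the review. */
final class CharsetLookupSketch {

  /** Null signals "could not resolve"; callers must apply their own fallback. */
  static Charset forNameOrNull(String name) {
    try {
      return Charset.forName(name);
    } catch (IllegalArgumentException e) {
      return null;
    }
  }

  /** Never returns null; the fact that resolution failed is no longer visible. */
  static Charset forNameOrDefault(String name) {
    Charset charset = forNameOrNull(name);
    return charset != null ? charset : Charset.defaultCharset();
  }
}
```

The second form removes the NullPointerException risk for future callers, at the cost of hiding whether the declared charset was actually honoured, which is the trade-off raised in the reply above.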
