Commit 4d95df5
Drop Kinesis records with invalid UTF-8 bytes
Validate UTF-8 encoding of Kinesis record bytes before passing to the
codec. Records containing malformed UTF-8 (e.g. unpaired surrogates
like 0xDBC8) are dropped with a warning log instead of crashing the
pipeline with a Jackson JsonParseException.
Signed-off-by: Souvik Bose <souvbose@amazon.com>
1 parent 1ac64aa commit 4d95df5
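For readers without the diff at hand, the validation described in the commit message can be sketched with the JDK's strict `CharsetDecoder`. This is only an illustration of the technique, not the code in this commit: the class name `Utf8Check` and the method `isValidUtf8` are hypothetical, and the actual change may validate bytes differently.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the commit.
final class Utf8Check {
    private Utf8Check() {
    }

    // Returns true only if the whole array decodes as well-formed UTF-8.
    static boolean isValidUtf8(final byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (final CharacterCodingException e) {
            // Malformed input: stray continuation bytes, truncated sequences,
            // or UTF-8-encoded surrogate code points.
            return false;
        }
    }
}
```

A caller would check each record's bytes with such a helper and, when it returns false, log a warning and skip the record rather than hand it to the codec.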
2 files changed
Lines changed: 31 additions & 1 deletion
File tree
- data-prepper-plugins/kinesis-source/src
- main/java/org/opensearch/dataprepper/plugins/kinesis/source/converter
- test/java/org/opensearch/dataprepper/plugins/kinesis/source/converter
Lines changed: 10 additions & 1 deletion
Changed regions (diff bodies not captured): new lines 21-23 added; new line 33 added; original line 68 replaced by new lines 72-77.
Lines changed: 21 additions & 0 deletions
Changed regions (diff bodies not captured): new line 42 added; new lines 112-131 added.