Commit f08f376
SEE LOG. Merge branch 'sv/integration-target--combinable-DFA-capture-resolution' into sv/integrate-combinable-DFA-capture-resolution
This was a really gnarly merge and took a while to get through.
Both the branch implementing captures and the branch implementing
`fsm_union_repeated_pattern_group` made interface changes and added
significant behavior to `ast_analysis.c`, in ways that were at times
tricky to reconcile. `ast_compile.c` was also restructured, but some
of the changes were necessary to inform capture and unioning
functionality.
As of now _almost_ all the tests are passing, but there are a couple
of specific things to note:
- This brings in an interface change to `fsm_endid_set`: it now returns
an enum rather than just an int. Because C implicitly converts enum
values to integers, code using `if (!fsm_endid_set(...)) { ... }` will
still compile without a warning even though the meaning of the return
code has changed. I prefer having this be an enum, but it IS an
interface change, and I'm not opposed to changing it back in a later
commit.
- `build/tests/capture/res_test_case_list:FAIL`
I'm going to fix this in a later commit. I'm not 100% sure yet, but
I think it's related to conflicting changes in the parser code, which
I'm waiting to regenerate until after this merge commit.
- `build/tests/endids/res10_minimise_partial_overlap:FAIL`
This has to do with `AST_ANALYSIS_ERROR_UNSUPPORTED_PCRE` or
`AST_ANALYSIS_ERROR_UNSUPPORTED_CAPTURE` not being handled yet by code
specific to the native dialect. I'm going to handle that later, but
will have to check whether native should behave like PCRE in those
particular cases or not. Either the error handling code needs to be
updated, or the code raising the UNSUPPORTED error needs to check
which regex dialect is in effect.
- `fuzz/target.c`
I have confirmed that the merged fuzzer harness code builds, but
haven't yet spent time re-fuzzing anything. In my experience libFuzzer
has bit-rotted over the last couple of clang/LLVM releases and now
tends to crash nondeterministically in combination with some of the
clang sanitizers, so we may want to retarget this to use AFL++.
That's well outside the scope of this PR, though.
- `src/lx/parser.act`
I updated this but haven't re-generated the parsers yet, and I
updated the generated code directly with a one-line change to
reflect the `fsm_endid_set` interface change. As mentioned above,
I'll re-generate that code in a separate commit.
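To make the `fsm_endid_set` note above concrete, here is a minimal
sketch of how a caller written against the old int return can silently
invert its meaning. Everything in it is a hypothetical stand-in: the
`demo_` names and the enum values are assumptions for illustration and
are not libfsm's actual declarations. The sketch assumes the new enum
uses 0 for a success case, which is the situation where `!result`
flips.

```c
/* Hypothetical stand-ins for illustration only; these are NOT
 * libfsm's real declarations, and the enum values are assumed. */
enum demo_endid_set_res {
	DEMO_ENDID_SET_ADDED = 0,           /* success: id was added */
	DEMO_ENDID_SET_ALREADY_PRESENT = 1, /* success: nothing to do */
	DEMO_ENDID_SET_ERROR_ALLOC_FAIL = 2 /* failure */
};

/* Old-style interface: nonzero on success, 0 on failure. */
static int
demo_endid_set_old(int simulate_failure)
{
	return simulate_failure ? 0 : 1;
}

/* New-style interface: returns an enum; 0 now means success. */
static enum demo_endid_set_res
demo_endid_set_new(int simulate_failure)
{
	return simulate_failure
	    ? DEMO_ENDID_SET_ERROR_ALLOC_FAIL
	    : DEMO_ENDID_SET_ADDED;
}

/* Caller written for the old interface. Returns 1 if it believes
 * the call failed. */
static int
legacy_check_old(int simulate_failure)
{
	if (!demo_endid_set_old(simulate_failure)) {
		return 1;
	}
	return 0;
}

/* The same check against the new interface: the enum converts to an
 * integer implicitly, so success (0) is now treated as failure and
 * the real failure (2) is treated as success. */
static int
legacy_check_new(int simulate_failure)
{
	if (!demo_endid_set_new(simulate_failure)) {
		return 1;
	}
	return 0;
}

/* Updated caller: compare against the enum values explicitly. */
static int
fixed_check_new(int simulate_failure)
{
	enum demo_endid_set_res res = demo_endid_set_new(simulate_failure);
	return res == DEMO_ENDID_SET_ERROR_ALLOC_FAIL;
}
```

Comparing against named enum values, as in `fixed_check_new`, keeps
callers correct regardless of which numeric values the enum ends up
with.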
Conflicts:
- fuzz/target.c
- include/adt/stateset.h
- include/re/re.h
- src/adt/stateset.c
- src/fsm/main.c
- src/libfsm/Makefile
- src/libfsm/capture.c
- src/libfsm/clone.c
- src/libfsm/closure.c
- src/libfsm/consolidate.c
- src/libfsm/determinise.c
- src/libfsm/determinise_internal.h
- src/libfsm/endids.c
- src/libfsm/epsilons.c
- src/libfsm/exec.c
- src/libfsm/internal.h
- src/libfsm/merge.c
- src/libfsm/minimise.c
- src/libfsm/state.c
- src/libre/ast.h
- src/libre/ast_analysis.c
- src/libre/ast_analysis.h
- src/libre/ast_compile.c
- src/libre/ast_rewrite.c
- src/libre/re.c
- src/lx/parser.act
- src/re/main.c
- tests/capture/captest.c
- tests/capture/captest.h
- tests/capture/capture3.c
- tests/capture/capture4.c
- tests/capture/capture5.c
- tests/capture/capture_concat1.c
- tests/capture/capture_concat2.c
- tests/capture/capture_union1.c
- tests/minimise/minimise_test_case_list.c

462 files changed
Lines changed: 31818 additions & 9468 deletions