Commit b5e93fc
authored
perf: optimize PikeVM by matching multiple literal bytes in tight loop when possible (VirusTotal#678)
This PR introduces an optimization to PikeVM matching for contiguous runs of literal bytes, speeding up regex matching for patterns containing literal sequences.
Introduced Instr::Bytes(LiteralBytesIter) to group contiguous literal bytes and match them in a single fast loop, bypassing the VM thread scheduling, state transitions, and epsilon-closure overhead for every byte. Built a lazy iterator LiteralBytesIter that decodes bytecode literals on-the-fly, resolving escaped OPCODE_PREFIX (0xAA) sequences and stopping at unescaped control opcodes.
The optimization is enabled only when there is exactly one active thread in the VM (self.threads.len() == 1). This prevents desynchronizing other threads that might be matching at different offsets or branches if a run consumes multiple bytes.1 parent 7330d40 commit b5e93fc
4 files changed
Lines changed: 285 additions & 13 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1723 | 1723 | | |
1724 | 1724 | | |
1725 | 1725 | | |
| 1726 | + | |
| 1727 | + | |
| 1728 | + | |
| 1729 | + | |
1726 | 1730 | | |
1727 | 1731 | | |
1728 | 1732 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
171 | 171 | | |
172 | 172 | | |
173 | 173 | | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
174 | 177 | | |
175 | 178 | | |
176 | 179 | | |
| |||
320 | 323 | | |
321 | 324 | | |
322 | 325 | | |
323 | | - | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
324 | 351 | | |
325 | 352 | | |
326 | 353 | | |
| |||
425 | 452 | | |
426 | 453 | | |
427 | 454 | | |
428 | | - | |
429 | | - | |
| 455 | + | |
| 456 | + | |
| 457 | + | |
| 458 | + | |
| 459 | + | |
| 460 | + | |
| 461 | + | |
| 462 | + | |
430 | 463 | | |
431 | | - | |
432 | 464 | | |
433 | 465 | | |
434 | 466 | | |
| |||
473 | 505 | | |
474 | 506 | | |
475 | 507 | | |
476 | | - | |
| 508 | + | |
477 | 509 | | |
478 | 510 | | |
479 | 511 | | |
| |||
595 | 627 | | |
596 | 628 | | |
597 | 629 | | |
| 630 | + | |
| 631 | + | |
| 632 | + | |
| 633 | + | |
| 634 | + | |
| 635 | + | |
| 636 | + | |
| 637 | + | |
| 638 | + | |
| 639 | + | |
| 640 | + | |
| 641 | + | |
| 642 | + | |
| 643 | + | |
| 644 | + | |
| 645 | + | |
| 646 | + | |
| 647 | + | |
| 648 | + | |
| 649 | + | |
| 650 | + | |
| 651 | + | |
| 652 | + | |
| 653 | + | |
| 654 | + | |
| 655 | + | |
| 656 | + | |
| 657 | + | |
| 658 | + | |
| 659 | + | |
| 660 | + | |
| 661 | + | |
| 662 | + | |
| 663 | + | |
| 664 | + | |
| 665 | + | |
| 666 | + | |
| 667 | + | |
| 668 | + | |
| 669 | + | |
| 670 | + | |
| 671 | + | |
| 672 | + | |
| 673 | + | |
| 674 | + | |
| 675 | + | |
| 676 | + | |
| 677 | + | |
| 678 | + | |
| 679 | + | |
| 680 | + | |
| 681 | + | |
| 682 | + | |
| 683 | + | |
| 684 | + | |
| 685 | + | |
| 686 | + | |
| 687 | + | |
| 688 | + | |
| 689 | + | |
| 690 | + | |
| 691 | + | |
| 692 | + | |
| 693 | + | |
| 694 | + | |
| 695 | + | |
| 696 | + | |
| 697 | + | |
| 698 | + | |
| 699 | + | |
| 700 | + | |
| 701 | + | |
| 702 | + | |
| 703 | + | |
| 704 | + | |
| 705 | + | |
| 706 | + | |
| 707 | + | |
| 708 | + | |
| 709 | + | |
| 710 | + | |
| 711 | + | |
| 712 | + | |
| 713 | + | |
| 714 | + | |
| 715 | + | |
| 716 | + | |
| 717 | + | |
| 718 | + | |
| 719 | + | |
| 720 | + | |
| 721 | + | |
| 722 | + | |
| 723 | + | |
| 724 | + | |
| 725 | + | |
| 726 | + | |
| 727 | + | |
| 728 | + | |
| 729 | + | |
| 730 | + | |
| 731 | + | |
| 732 | + | |
| 733 | + | |
| 734 | + | |
| 735 | + | |
| 736 | + | |
| 737 | + | |
| 738 | + | |
| 739 | + | |
| 740 | + | |
| 741 | + | |
| 742 | + | |
| 743 | + | |
| 744 | + | |
| 745 | + | |
| 746 | + | |
| 747 | + | |
| 748 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
162 | 162 | | |
163 | 163 | | |
164 | 164 | | |
165 | | - | |
166 | 165 | | |
167 | 166 | | |
168 | 167 | | |
| |||
181 | 180 | | |
182 | 181 | | |
183 | 182 | | |
184 | | - | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
185 | 196 | | |
186 | 197 | | |
187 | | - | |
188 | | - | |
189 | | - | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
190 | 202 | | |
191 | 203 | | |
192 | 204 | | |
193 | 205 | | |
194 | 206 | | |
195 | 207 | | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
196 | 264 | | |
197 | 265 | | |
198 | 266 | | |
| |||
226 | 294 | | |
227 | 295 | | |
228 | 296 | | |
229 | | - | |
| 297 | + | |
230 | 298 | | |
231 | 299 | | |
232 | 300 | | |
| |||
328 | 396 | | |
329 | 397 | | |
330 | 398 | | |
331 | | - | |
332 | | - | |
| 399 | + | |
| 400 | + | |
| 401 | + | |
| 402 | + | |
333 | 403 | | |
334 | 404 | | |
335 | 405 | | |
| 406 | + | |
336 | 407 | | |
337 | 408 | | |
338 | 409 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1053 | 1053 | | |
1054 | 1054 | | |
1055 | 1055 | | |
| 1056 | + | |
| 1057 | + | |
| 1058 | + | |
| 1059 | + | |
| 1060 | + | |
| 1061 | + | |
| 1062 | + | |
| 1063 | + | |
| 1064 | + | |
| 1065 | + | |
| 1066 | + | |
| 1067 | + | |
| 1068 | + | |
| 1069 | + | |
| 1070 | + | |
| 1071 | + | |
| 1072 | + | |
| 1073 | + | |
| 1074 | + | |
| 1075 | + | |
| 1076 | + | |
| 1077 | + | |
| 1078 | + | |
| 1079 | + | |
| 1080 | + | |
| 1081 | + | |
| 1082 | + | |
| 1083 | + | |
| 1084 | + | |
| 1085 | + | |
| 1086 | + | |
| 1087 | + | |
| 1088 | + | |
| 1089 | + | |
| 1090 | + | |
| 1091 | + | |
| 1092 | + | |
| 1093 | + | |
| 1094 | + | |
| 1095 | + | |
| 1096 | + | |
| 1097 | + | |
| 1098 | + | |
| 1099 | + | |
| 1100 | + | |
| 1101 | + | |
0 commit comments