Merged
97 commits
98dfb2d
Missing documentation for -x
katef Oct 23, 2024
280ca7c
Too much documentation for -u.
katef Oct 23, 2024
6dc7305
Merge pull request #502 from katef/kate/missing-manpage-flag
katef Oct 23, 2024
7ebc1eb
Fix memory leak.
silentbicycle Nov 7, 2024
c9a240a
Add a regression test for the memory leak.
silentbicycle Nov 8, 2024
3cf9ee7
Merge pull request #504 from katef/sv/fix-memory-leak-from-determinis…
katef Nov 12, 2024
3572258
Update CI to use Ubuntu 22.04 for now so we do not get bleeding edge …
deg4uss3r Feb 11, 2025
db6dcdd
Merge pull request #505 from deg4uss3r/rth/fix-ci
katef Mar 21, 2025
c89c15d
retest: Remove `isatty` check and extra logging output.
silentbicycle Jun 17, 2025
f6fc836
Merge pull request #506 from katef/sv/remove-isatty-check
katef Jun 19, 2025
825303f
Remove dependence on internal.h in print/ir.h and lx/print/c.c.
silentbicycle Jul 3, 2025
cb42d58
Makefile: Check '*res*' not 'res*' for tests.
silentbicycle Jul 8, 2025
7824e6a
Fix lx by rewriting much of its -l c codegen.
silentbicycle Jul 3, 2025
26e2b13
test/lxpos: Update test data (EOF position info).
silentbicycle Jul 22, 2025
862a68c
Re-enable lxpos tests.
silentbicycle Jul 22, 2025
affca78
Use $LX_BIN instead of $LX in lxpos makefile.
silentbicycle Jul 23, 2025
a8f0c59
lx: Make -l dump's output call lx.free() when using dynamic buffer.
silentbicycle Jul 23, 2025
552aa01
lx: Use prefix.tok, not "TOK_".
silentbicycle Jul 29, 2025
beebd1b
lx: return TOK_ERROR if reaching the end of a zone function.
silentbicycle Jul 29, 2025
ae53e94
lx: Handle unexpected EOF in pattern pairs. Add tests, update out6.dump.
silentbicycle Jul 29, 2025
ea9c90b
lx: Rewrite logic to make the four cases explicit, fix dead code.
silentbicycle Jul 30, 2025
17d415d
lx: Only gen fixedpop / dynpop & calls to them when buffer mode is set.
silentbicycle Jul 31, 2025
7ed18b9
lx: Suppress warning for possibly unused function.
silentbicycle Jul 31, 2025
1e55db8
lx: Ensure prefix.api & prefix.lx are used in the generated code.
silentbicycle Jul 31, 2025
4a5ca84
Replace FSM_ADVANCE_HOOK macro with optional hooks->advance callback.
silentbicycle Aug 4, 2025
f25e8b7
The advance hook should also be called for FSM_IO_STR.
silentbicycle Aug 4, 2025
051aaf0
Move setting `has_consumed_input` flag into lx's advance hook.
silentbicycle Aug 4, 2025
08fd72c
lx: Avoid useless call to pop and some other 'unused' warnings.
silentbicycle Aug 18, 2025
c897e9d
Merge pull request #509 from katef/sv/fix-lx-token-identification
katef Aug 20, 2025
c853157
lx: Distinguish between unexpected EOF and EOF in ignored zones.
silentbicycle Aug 26, 2025
9be4aec
Generated code. Re-generate lexers and parsers with lx bug fixed.
silentbicycle Aug 26, 2025
051c362
Typo.
katef Aug 28, 2025
87f6df2
Stray const.
katef Aug 28, 2025
400c0a5
Merge pull request #511 from katef/kate/stray-const
katef Aug 28, 2025
6c66234
Merge pull request #510 from katef/sv/fix-lx-handling-for-EOF-broken-…
katef Aug 29, 2025
664b32c
Update to Unicode 17.0
data-man Sep 10, 2025
4155d56
experimental: Add eager outputs, similar to endids but eagerly matched.
silentbicycle Aug 27, 2024
74907ca
Ensure .has_eager_outputs is zeroed on new states. (msan)
silentbicycle Oct 10, 2024
f2ddf1d
eager_output interface cleanup: Replace _any with _count and _get.
silentbicycle Oct 10, 2024
981128f
minimise_test_oracle.c: mismatched eager outputs also prevent merging.
silentbicycle Oct 10, 2024
fa63fcd
fuzz/target.c: re_is_anchor interface changes.
silentbicycle Oct 12, 2024
496198d
fuzzer: Add seed argument for fsm_generate_matches (interface change).
silentbicycle Oct 10, 2024
e2b9130
Fix memory leak in fsm_eager_output_compact, found while fuzzing.
silentbicycle Jan 31, 2025
36d0187
Fix fsm_union_repeated_pattern_group's anchoring linkage.
silentbicycle Feb 4, 2025
af27a87
fsm_union_repeated_pattern_group: Interface changes.
silentbicycle Feb 4, 2025
d44a671
fsm_union_repeated_pattern_group: fix linkage for mixed start anchoring.
silentbicycle Feb 5, 2025
36129d0
Add tests, fix anchoring bugs in fsm_union_repeated_pattern_group.
silentbicycle Feb 12, 2025
7b8b169
Interface change: Add 'const'.
silentbicycle Feb 12, 2025
6b07bd2
union: Fix trivial memory leak.
silentbicycle Feb 12, 2025
7546f81
union.c: Add comments for assertions.
silentbicycle Feb 12, 2025
bb9f620
Switch to collecting an anchored_start state set, not just one state.
silentbicycle Feb 14, 2025
a54095a
Rename test file -- make tests looks for build files matching "*res*".
silentbicycle Sep 19, 2025
eca233d
union.c: Updates to fsm_union_repeated_pattern_group and its internals.
silentbicycle Sep 9, 2025
e712575
Eager outputs: Comment on use, rename some functions, clean up.
silentbicycle Sep 15, 2025
eeea923
fuzz/target.c: Updates for interface changes.
silentbicycle Sep 19, 2025
d6db021
Misc. cleanup before integration.
silentbicycle Sep 19, 2025
3293b7b
Restore FORCE_ENDIDS behavior for tests/eager_output/eager_output7.c.
silentbicycle Sep 19, 2025
255e426
fsm.h: Fix missing word in comment.
silentbicycle Sep 22, 2025
610f2a3
Note why fsm_union_repeated_pattern_group depends on re_comp.
silentbicycle Sep 22, 2025
68c612d
Merge pull request #512 from data-man/ucd17
katef Sep 23, 2025
6f4288e
Merge branch 'main' into sv/eager-outputs-and-union-repeated-pattern-…
silentbicycle Oct 16, 2025
a32aff3
Stray comment.
katef Oct 28, 2025
fcde3b7
Optionally save linkage_info during NFA construction in re_comp.
silentbicycle Oct 16, 2025
9971fe1
Copy some fields from linkage_info, remove analysis.
silentbicycle Nov 5, 2025
50a5fe7
Restore freeing of state sets copied from linkage_info.
silentbicycle Nov 5, 2025
c65d68c
Update stale comment.
silentbicycle Nov 20, 2025
504b1d0
Merge pull request #513 from katef/sv/eager-outputs-and-union-repeate…
katef Nov 20, 2025
8debf24
Add documentation guide on how to use Libfsm effectively
gsusanto-fastly Nov 25, 2025
7306570
Update docs based on reviews
gsusanto-fastly Nov 27, 2025
20f262c
Revision #2
gsusanto-fastly Nov 27, 2025
227f4ac
Naming.
katef Nov 27, 2025
c4d4ffe
Markup.
katef Nov 27, 2025
bf867f2
Blurb on calling the generated code.
katef Nov 27, 2025
4351273
Blurb on bounded repetition.
katef Nov 27, 2025
d8ab92e
Markup.
katef Nov 27, 2025
27802dc
Merge pull request #515 from katef/gsusanto-docs
katef Nov 27, 2025
e6683da
First cut at re_interpolate_groups()
katef Jan 16, 2026
2f0c772
Add start,end error reporting
katef Jan 26, 2026
1a6f007
Clarification.
katef Jan 26, 2026
cf2bb0a
Fill out placeholders for writing out output.
katef Jan 26, 2026
3234a7c
Convincing myself string offsets are convenient
katef Jan 26, 2026
433c8b8
Allow a NULL output string.
katef Jan 26, 2026
4ae257e
Clarification.
katef Jan 26, 2026
0d487a6
Defensively terminate the output buffer on error.
katef Jan 27, 2026
8247320
Update to actions/cache@v5
katef Jan 27, 2026
6d1bd9b
fail-on-cache-miss: for grabbing arbitrary builds.
katef Jan 27, 2026
c1203e3
Explicitly allow build cache miss for makefile tests.
katef Jan 27, 2026
3f69a7c
Explicitly fail-on-cache-miss for other things too.
katef Jan 27, 2026
661105e
cache/restore where possible.
katef Jan 27, 2026
74ab9a7
Merge branch 'kate/actions-fluffery' into kate/interpolate_groups
katef Jan 27, 2026
d817464
Merge pull request #517 from katef/kate/actions-fluffery
katef Jan 27, 2026
a1526ab
Merge pull request #516 from katef/kate/interpolate_groups
katef Jan 27, 2026
c798ec1
Merge mishap, accidentally @v4
katef Jan 27, 2026
051be1a
Merge branch 'main' into upstream-sync
silentbicycle Feb 9, 2026
a6e9c2a
Makefile: grep for 'FAIL' should use -I to ignore binary files.
silentbicycle Feb 9, 2026
97cdb4e
Merge pull request #519 from katef/sv/grep-test-ignore-binary-files
katef Feb 9, 2026
5c0ed62
Merge remote-tracking branch 'origin/main' into sv/upstream-sync-eage…
silentbicycle Feb 9, 2026
92 changes: 55 additions & 37 deletions .github/workflows/ci.yml
@@ -19,7 +19,7 @@ jobs:

steps:
- name: Cache checkout
-uses: actions/cache@v4
+uses: actions/cache@v5
id: cache-checkout
with:
path: ${{ env.wc }}
@@ -52,7 +52,7 @@ jobs:

steps:
- name: Cache PCRE suite
-uses: actions/cache@v4
+uses: actions/cache@v5
id: cache-pcre
with:
path: pcre-suite/${{ env.pcre2 }}
@@ -70,19 +70,20 @@ jobs:
chmod -R ug-w pcre-suite

- name: Cache converted PCRE tests
-uses: actions/cache@v4
+uses: actions/cache@v5
id: cache-cvtpcre
with:
path: ${{ env.cvtpcre }}
key: cvtpcre-bmake-${{ matrix.os }}-gcc-DEBUG-AUSAN-${{ github.sha }}-${{ env.pcre2 }}

-- name: Fetch build
+- name: Restore build
if: steps.cache-cvtpcre.outputs.cache-hit != 'true'
-uses: actions/cache@v4
+uses: actions/cache/restore@v5
id: cache-build
with:
path: ${{ env.build }}
key: build-bmake-${{ matrix.os }}-gcc-DEBUG-AUSAN-${{ github.sha }} # arbitrary build, just for cvtpcre
+fail-on-cache-miss: true

- name: Convert PCRE suite
if: steps.cache-cvtpcre.outputs.cache-hit != 'true'
@@ -157,15 +158,16 @@ jobs:
cc: gcc # -fsanitize=fuzzer is clang-only

steps:
-- name: Fetch checkout
-uses: actions/cache@v4
+- name: Restore checkout
+uses: actions/cache/restore@v5
id: cache-checkout
with:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}
+fail-on-cache-miss: true

- name: Cache build
-uses: actions/cache@v4
+uses: actions/cache@v5
id: cache-build
with:
path: ${{ env.build }}
@@ -235,20 +237,26 @@ jobs:
make: pmake # not packaged

steps:
-- name: Fetch checkout
-uses: actions/cache@v4
+- name: Restore checkout
+uses: actions/cache/restore@v5
id: cache-checkout
with:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}
+fail-on-cache-miss: true

-# An arbitary build.
-- name: Fetch build
-uses: actions/cache@v4
+# Failing to fetch this is not fatal, we're testing Makefiles here.
+# Some combinations of our options (pmake, EXPENSIVE_CHECKS, whatever)
+# won't exist in cache because we didn't build those. That's okay for
+# the purposes of this step, building those is harmless.
+- name: Restore build
+uses: actions/cache/restore@v5
id: cache-build
with:
path: ${{ env.build }}
key: build-${{ matrix.make }}-${{ matrix.os }}-${{ matrix.cc }}-${{ matrix.debug }}-${{ matrix.san }}-${{ github.sha }}
+fail-on-cache-miss: false

# We don't need to build the entire repo to know that the makefiles work,
# I'm just deleting a couple of .o files and rebuilding those instead.
@@ -324,12 +332,13 @@ jobs:
san: MSAN # not supported

steps:
-- name: Fetch checkout
-uses: actions/cache@v4
+- name: Restore checkout
+uses: actions/cache/restore@v5
id: cache-checkout
with:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}
+fail-on-cache-miss: true

- name: Dependencies (Ubuntu)
if: matrix.os == 'ubuntu-22.04'
@@ -346,12 +355,13 @@ jobs:
brew install bmake pcre
${{ matrix.cc }} --version

-- name: Fetch build
-uses: actions/cache@v4
+- name: Restore build
+uses: actions/cache/restore@v5
id: cache-build
with:
path: ${{ env.build }}
key: build-${{ matrix.make }}-${{ matrix.os }}-${{ matrix.cc }}-${{ matrix.debug }}-${{ matrix.san }}-${{ github.sha }}
+fail-on-cache-miss: true

- name: Get number of CPU cores
uses: SimenB/github-actions-cpu-cores@v2
@@ -383,12 +393,13 @@ jobs:
cc: gcc # it's clang anyway

steps:
-- name: Fetch checkout
-uses: actions/cache@v4
+- name: Restore checkout
+uses: actions/cache/restore@v5
id: cache-checkout
with:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}
+fail-on-cache-miss: true

- name: Dependencies (Ubuntu)
if: matrix.os == 'ubuntu-22.04'
@@ -405,12 +416,13 @@ jobs:
brew install bmake
${{ matrix.cc }} --version

-- name: Fetch build
-uses: actions/cache@v4
+- name: Restore build
+uses: actions/cache/restore@v5
id: cache-build
with:
path: ${{ env.build }}
key: build-${{ matrix.make }}-${{ matrix.os }}-${{ matrix.cc }}-${{ matrix.debug }}-${{ matrix.san }}-${{ github.sha }}
+fail-on-cache-miss: true

# note we do the fuzzing unconditionally; each run adds to the corpus.
#
@@ -421,7 +433,7 @@ jobs:
# still run fuzzing, just from empty, and do not save their seeds.
- name: Restore seeds (mode ${{ matrix.mode }})
if: github.repository == 'katef/libfsm'
-uses: actions/cache/restore@v4
+uses: actions/cache/restore@v5
id: cache-seeds
with:
path: ${{ env.seeds }}-${{ matrix.mode }}
@@ -458,7 +470,7 @@ jobs:
# the same seeds for a given bug.
# The explicit cache/restore and cache/save actions are just for that.
- name: Save seeds (mode ${{ matrix.mode }}-${{ matrix.debug }})
-uses: actions/cache/save@v4
+uses: actions/cache/save@v5
if: always()
with:
path: ${{ env.seeds }}-${{ matrix.mode }}
@@ -515,15 +527,16 @@ jobs:
sudo apt-get install golang
go version

-- name: Fetch build
-uses: actions/cache@v4
+- name: Restore build
+uses: actions/cache/restore@v5
id: cache-build
with:
path: ${{ env.build }}
key: build-${{ matrix.make }}-${{ matrix.os }}-${{ matrix.cc }}-${{ matrix.debug }}-${{ matrix.san }}-${{ github.sha }}
+fail-on-cache-miss: true

- name: Fetch converted PCRE tests
-uses: actions/cache@v4
+uses: actions/cache@v5
id: cache-cvtpcre
with:
path: ${{ env.cvtpcre }}
@@ -542,7 +555,7 @@ jobs:

steps:
- name: Cache docs
-uses: actions/cache@v4
+uses: actions/cache@v5
id: cache-docs
with:
path: ${{ env.build }}
@@ -555,13 +568,14 @@ jobs:
sudo apt-get update
sudo apt-get install bmake libxml2-utils xsltproc docbook-xml docbook-xsl

-- name: Fetch checkout
+- name: Restore checkout
if: steps.cache-docs.outputs.cache-hit != 'true'
-uses: actions/cache@v4
+uses: actions/cache/restore@v5
id: cache-checkout
with:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}
+fail-on-cache-miss: true

- name: Get number of CPU cores
if: steps.cache-docs.outputs.cache-hit != 'true'
@@ -597,7 +611,7 @@ jobs:

steps:
- name: Cache prefix
-uses: actions/cache@v4
+uses: actions/cache@v5
id: cache-prefix
with:
path: ${{ env.prefix }}
@@ -609,29 +623,32 @@ jobs:
uname -a
sudo apt-get install bmake

-- name: Fetch checkout
+- name: Restore checkout
if: steps.cache-prefix.outputs.cache-hit != 'true'
-uses: actions/cache@v4
+uses: actions/cache/restore@v5
id: cache-checkout
with:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}
+fail-on-cache-miss: true

-- name: Fetch build
+- name: Restore build
if: steps.cache-prefix.outputs.cache-hit != 'true'
-uses: actions/cache@v4
+uses: actions/cache/restore@v5
id: cache-build
with:
path: ${{ env.build }}
key: build-${{ env.make }}-${{ env.os }}-${{ env.cc }}-${{ env.debug }}-${{ env.san }}-${{ github.sha }}
+fail-on-cache-miss: true

-- name: Fetch docs
+- name: Restore docs
if: steps.cache-prefix.outputs.cache-hit != 'true'
-uses: actions/cache@v4
+uses: actions/cache/restore@v5
id: cache-docs
with:
path: ${{ env.build }}
key: docs-${{ github.sha }}
+fail-on-cache-miss: true

- name: Get number of CPU cores
if: steps.cache-prefix.outputs.cache-hit != 'true'
@@ -670,12 +687,13 @@ jobs:
sudo gem install --no-document fpm
fpm -v

-- name: Fetch prefix
-uses: actions/cache@v4
+- name: Restore prefix
+uses: actions/cache/restore@v5
id: cache-prefix
with:
path: ${{ env.prefix }}
key: prefix-${{ env.make }}-${{ env.os }}-${{ env.cc }}-${{ env.debug }}-${{ env.san }}-${{ github.sha }}
+fail-on-cache-miss: true

- name: Find version
# TODO: would get a tag or branch name here
7 changes: 4 additions & 3 deletions Makefile
@@ -125,8 +125,7 @@ SUBDIR += tests/fsm
SUBDIR += tests/glob
SUBDIR += tests/like
SUBDIR += tests/literal
-# FIXME: commenting this out for now due to Makefile error
-#SUBDIR += tests/lxpos
+SUBDIR += tests/lxpos
SUBDIR += tests/minimise
SUBDIR += tests/native
SUBDIR += tests/pcre
@@ -137,6 +136,7 @@ SUBDIR += tests/pcre-repeat
SUBDIR += tests/pred
SUBDIR += tests/re_literal
SUBDIR += tests/re_strings
+SUBDIR += tests/regressions
SUBDIR += tests/reverse
SUBDIR += tests/trim
SUBDIR += tests/union
@@ -147,6 +147,7 @@ SUBDIR += tests/sql
SUBDIR += tests/queue
SUBDIR += tests/aho_corasick
SUBDIR += tests/retest
+SUBDIR += tests/re_interpolate_groups
SUBDIR += tests
.if make(theft) || make(${BUILD}/theft/theft)
SUBDIR += theft
@@ -190,6 +191,6 @@ STAGE_BUILD := ${STAGE_BUILD:Nbin/cvtpcre}

.if make(test)
.END::
-grep FAIL ${BUILD}/tests/*/res*; [ $$? -ne 0 ]
+grep -I FAIL ${BUILD}/tests/*/*res*; [ $$? -ne 0 ]
.endif
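
A note on the `.END` check in this hunk: `grep` exits 0 when it finds a match and non-zero otherwise, so the trailing `[ $$? -ne 0 ]` makes the recipe succeed only when no result file contains `FAIL` (`$$` is make's escape for the shell's `$?`). With GNU and BSD grep, `-I` treats binary files as non-matching, and the `*res*` glob also picks up result files whose names do not start with `res`. The following is a minimal shell sketch of that idiom, not the project's code: the `check_results` function, the scratch directory, and the file names are hypothetical stand-ins for `${BUILD}/tests`.

```shell
#!/bin/sh
# Hypothetical helper mirroring the Makefile recipe:
# succeed only if no result file under $1/tests/*/ contains "FAIL".
check_results() {
    # -I: skip binary files (GNU/BSD grep); *res* matches "res1" and "eager_res2" alike.
    grep -I FAIL "$1"/tests/*/*res*
    # grep exits 0 on a match, so invert: a match means the check fails.
    [ $? -ne 0 ]
}

# Scratch directory standing in for ${BUILD}; names are illustrative only.
build=$(mktemp -d)
mkdir -p "$build/tests/union" "$build/tests/eager_output"
echo "PASS test1" > "$build/tests/union/res1"
echo "PASS test2" > "$build/tests/eager_output/eager_res2"

if check_results "$build"; then
    echo "no failures"
fi

# Introduce a failure in a file the old 'res*' glob would have missed.
echo "FAIL test2" > "$build/tests/eager_output/eager_res2"
if ! check_results "$build"; then
    echo "failures reported"
fi

rm -rf "$build"
```

The old `res*` pattern would not have matched a file named like `eager_res2`, so a `FAIL` recorded there would have gone unnoticed by the check.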

3 changes: 3 additions & 0 deletions README.md
@@ -4,12 +4,15 @@
; re -cb -pl dot '[Ll]ibf+(sm)*' '[Ll]ibre' | dot
![libfsm.svg](doc/tutorial/libfsm.svg)

+libfsm is not a drop-in replacement for other regex engines, and it only supports patterns that can be compiled to deterministic FSMs. In return, supported patterns run in linear time.
+
Getting started:

* See the [tutorial introduction](doc/tutorial/re.md) for a quick overview
of the re(1) command line interface.
* [Compilation phases](doc/tutorial/phases.md) for typical applications
which compile regular expressions to code.
+* [Advice on using libfsm](doc/advice.md) for suggestions around compilation time, unsupported features, common usage patterns, and examples.

You get:
