feat(SymbolicDynamics): basic setup of Zd, full shift, cylinders, pat… by Sfgangloff · Pull Request #28546 · leanprover-community/mathlib4

Sfgangloff · 2025-08-17T13:23:43Z

This PR adds a group-generic foundation for symbolic dynamics over an arbitrary group G, together with convenient specializations for ℤ and ℤ^d.

Summary of additions:

Full shift and shift action
- abbrev FullShift (A G) := G → A (inherits product topology from the Π-type).
- Right shift shift g x with convention (shift g x) h = x (h * g).
Cylinders and topology
- cylinder U x : Set (G → A) for finite U : Finset G.
- Cylinders are open under [DiscreteTopology A]; with a finite alphabet they are also closed.
- Equality with dependent products:
  cylinder U x = Set.pi (↑U) (fun i => ({x i} : Set A)), enabling use of the Set.pi API.
Patterns, occurrences, and subshifts
- Pattern A G with finite support : Finset G and data : support → A.
- Pattern.occursIn p x g (occurrence at translate g) and the expected shift law.
- forbids F and Subshift A G (closed, shift-invariant subsets).
- FixedSupport A G U with an equivalence to (U → A) to obtain finiteness.
Language on finite shapes and counting
- languageOn X U, languageCardOn X U, and patternCountOn Y U.
Entropy along a shape sequence
- limsupAtTop (as an sInf of eventual upper bounds).
- entropyAlong X F hF := limsup (log (patternCountOn X (F n) + 1) / |F n|)
  for any nonempty finite shapes F : ℕ → Finset G (the + 1 avoids log 0).
Specializations
- IntShapes: segments [-n,n] on Multiplicative ℤ, with entropy_Z.
- ZdShapes: boxes [-n,n]^d on ℤ^d (as functions Fin d → ℤ), with entropy_Zd.

Mathematical remarks:

The API is shape-parametric: entropy is defined along user-provided finite shapes.
On amenable groups, using a Følner sequence yields a canonical value (Ornstein–Weiss).
This PR does not assume amenability; the family of shapes is an explicit input.

Motivation:

Provide a clean, reusable base for symbolic dynamics on groups in mathlib.

Future work:

Add a Følner predicate and prove shape-independence / limit existence on amenable groups.
Expand the ℤ/ℤ^d toolkit (alternative shapes, mixing, factors).
Develop 1D theory and, longer-term, multidimensional SFT results (e.g. along the lines of Hochman–Meyerovitch).

github-actions · 2025-08-17T13:24:54Z

PR summary c63c4e5c93

Import changes for modified files

No significant changes to the import graph

Import changes for all files

Files	Import difference
`Mathlib.Dynamics.SymbolicDynamics.Basic` (new file)	677

Declarations diff

+ LanguageOn
+ MulSubshift
+ MulSubshift.languageOn
+ MulSubshift.ofForbidden
+ Pattern
+ Pattern.mulOccursInAt
+ Pattern.mulShift
+ Subshift
+ continuous_mulShift
+ cylinder
+ cylinder_eq_set_pi
+ finite_setOf_pattern_support_eq
+ fromConfig
+ isClosed_cylinder
+ isClosed_mulForbidden
+ isClosed_mulOccursInAt
+ isOpen_cylinder
+ isOpen_mulOccursInAt
+ mapsTo_mulShift_mulForbidden
+ mem_cylinder
+ mulForbidden
+ mulFullShift
+ mulOccursInAt_eq_cylinder
+ mulOccursInAt_mulShift
+ mulShift
+ mulShift_apply
+ mulShift_apply_mul_left_of_mem
+ mulShift_mul
+ mulShift_one

You can run this locally as follows

## from your `mathlib4` directory:
git clone https://github.com/leanprover-community/mathlib-ci.git ../mathlib-ci

## summary with just the declaration names:
../mathlib-ci/scripts/pr_summary/declarations_diff.sh <optional_commit>

## more verbose report:
../mathlib-ci/scripts/pr_summary/declarations_diff.sh long <optional_commit>

The doc-module for scripts/pr_summary/declarations_diff.sh in the mathlib-ci repository contains some details about this script.

No changes to technical debt.

This script lives in the mathlib-ci repository. To run it locally, from your mathlib4 directory:

git clone https://github.com/leanprover-community/mathlib-ci.git ../mathlib-ci
../mathlib-ci/scripts/reporting/technical-debt-metrics.sh pr_summary

The relative value is the weighted sum of the differences with weight given by the inverse of the current value of the statistic.
The absolute value is the relative value divided by the total sum of the inverses of the current values (i.e. the weighted average of the differences).

sgouezel · 2025-08-17T13:38:44Z

It's probably not a good idea to do this over Fin d → ℤ, as this will not apply to subshifts over ℤ (since Fin 1 → ℤ and ℤ are not the same thing). A solution here would be to abstract away, i.e., do symbolic dynamics over a group G (where you can instantiate G to be either Fin d → ℤ or ℤ). I don't expect the general theory to be more complicated than the one you develop (possibly simpler, in fact). You might need at some point to assume that G is finitely generated, but don't do it before you need it!

Sfgangloff · 2025-08-17T19:28:25Z

It's probably not a good idea to do this over Fin d → ℤ, as this will not apply to subshifts over ℤ (since Fin 1 → ℤ and ℤ are not the same thing). A solution here would be to abstract away, i.e., do symbolic dynamics over a group G (where you can instantiate G to be either Fin d → ℤ or ℤ). I don't expect the general theory to be more complicated than the one you develop (possibly simpler, in fact). You might need at some point to assume that G is finitely generated, but don't do it before you need it!

Thank you for the advice ! I did not want to do it in full generality at first, but indeed d=1 is an important case. I'll try revamp this for groups in general.

sgouezel

Thanks for your PR! I have mostly made form comments in my review, to show you how your code should be formatted for mathlib. Could you try to use there rules also for the rest of the PR?

Also, reviewing long PRs like this one is quite complicated. Could you maybe split it, postponing the entropy part to another PR, to make sure that we can get your PR in quicker? Thanks!

Sfgangloff · 2025-09-23T06:57:57Z

Thanks for your PR! I have mostly made form comments in my review, to show you how your code should be formatted for mathlib. Could you try to use there rules also for the rest of the PR?

Also, reviewing long PRs like this one is quite complicated. Could you maybe split it, postponing the entropy part to another PR, to make sure that we can get your PR in quicker? Thanks!

Thanks a lot for your review! I’ll go through your formatting comments carefully and apply the rules consistently across the PR. I’ll also split off the entropy part into a separate PR, so that this one is easier to review.

Sfgangloff · 2025-09-23T09:24:32Z

Thanks for your PR! I have mostly made form comments in my review, to show you how your code should be formatted for mathlib. Could you try to use there rules also for the rest of the PR?
Also, reviewing long PRs like this one is quite complicated. Could you maybe split it, postponing the entropy part to another PR, to make sure that we can get your PR in quicker? Thanks!

Thanks a lot for your review! I’ll go through your formatting comments carefully and apply the rules consistently across the PR. I’ll also split off the entropy part into a separate PR, so that this one is easier to review.

@sgouezel I have implemented all requested changes. Ready for re-review—thanks!

sgouezel · 2025-09-23T12:21:22Z

Thanks!
A useful practice is to resolve the conversations (using the "Resolve conversation" button) for which you think you have properly answered, and only leave open those for which further discussion is needed. This helps a lot to spot the points where further action is needed.

sgouezel · 2025-09-23T12:28:43Z

I have a design question here. Everything is defined for the full shift, but most shifts of interest are not the full shift but rather subshifts (e.g. subshifts of finite type, low complexity subshifts, and so on). This means that you will need to redefine all your notions for subshifts, which is not desirable from the point of view of code duplication.

I wonder if it would be better to have a design talking only of subshifts, where the full shift would be the subshift on univ. I mean, something like introducing a class IsMulSubshift s (for s a subset of G -> A) saying that that s is shift-invariant. And then define mulShift [IsMulSubshift s] g x for x : s, as an element of s. And develop all the API around cylinders and so on in this context.

Do you think it would work, or do you see issues with this design which would warrant developing things specifically for the full shift?

Sfgangloff · 2025-09-23T13:10:08Z

I have a design question here. Everything is defined for the full shift, but most shifts of interest are not the full shift but rather subshifts (e.g. subshifts of finite type, low complexity subshifts, and so on). This means that you will need to redefine all your notions for subshifts, which is not desirable from the point of view of code duplication.

I wonder if it would be better to have a design talking only of subshifts, where the full shift would be the subshift on univ. I mean, something like introducing a class IsMulSubshift s (for s a subset of G -> A) saying that that s is shift-invariant. And then define mulShift [IsMulSubshift s] g x for x : s, as an element of s. And develop all the API around cylinders and so on in this context.

Do you think it would work, or do you see issues with this design which would warrant developing things specifically for the full shift?

Thanks for the suggestion! I agree most objects of interest are subshifts. My plan is to keep the definitions/lemmas in the ambient full shift and use them on a subshift simply by restriction (viewing a subshift Y as a subtype of G → A). In practice this means most lemmas apply directly without re-proving them. When it is helpful for ergonomics, I can add tiny wrappers (not new theory): e.g. the shift restricted to Y, and “cylinders/occurrence inside Y” defined as the set of points of Y whose underlying function lies in the ambient set. That’s just making the restriction explicit, not duplicating proofs. If you prefer, I can include those small wrappers in this PR; otherwise I’ll keep using the ambient results by restriction.

sgouezel · 2025-09-23T16:32:02Z

My question is mainly for the map itself: if you have a subset s, the shift does not in general restrict to the subset, you need invariance. Also, for cylinders, viewing them inside the subtype means some more work (you need to take the preimage of the cylinder under the inclusion map). I am worried that this will make working with subshifts more awkward than it should be. If you have already worked out more of the theory and if you have an example to show me that everything works smoothly, that's great -- could you give me a pointer?

sgouezel · 2025-09-25T08:12:28Z

There are some linting errors, but certainly not related to your PR. Could you merge master to fix them?

Sfgangloff · 2025-09-25T16:04:25Z

There are some linting errors, but certainly not related to your PR. Could you merge master to fix them?

Done

Sfgangloff · 2025-09-25T16:19:31Z

My question is mainly for the map itself: if you have a subset s, the shift does not in general restrict to the subset, you need invariance. Also, for cylinders, viewing them inside the subtype means some more work (you need to take the preimage of the cylinder under the inclusion map). I am worried that this will make working with subshifts more awkward than it should be. If you have already worked out more of the theory and if you have an example to show me that everything works smoothly, that's great -- could you give me a pointer?

Thanks for clarifying! I see your point: if one defines cylinders and the shift only “inside a subshift,” then one gets nice ergonomics locally, but there is overhead whenever combining or comparing different subshifts. For example, suppose I want to prove that
\mathcal L_U(X \cup Y) = \mathcal L_U(X) \cup \mathcal L_U(Y).
Even if languages were defined relative to a subshift, in such a statement I would still need to move both sides back to the ambient pattern type in G \to A to take the union. The same issue arises with intersections, factors, products, etc. In that sense, having the ambient definitions as the “ground truth” is what makes these set-theoretic identities essentially tautological.

In the current design, all definitions (shift, cylinder, occurrence, language, …) live in the ambient full shift, and a subshift is just a closed, invariant subset. Then for a subshift Y I can view its language as

languageOn Y.carrier U

and results like the union lemma become immediate. When needed for ergonomics, I can certainly add thin wrappers (e.g. “language of a subshift on U”), but they would just unfold to the ambient versions.

So I think your concern is exactly the tradeoff: restricting everything to subshifts makes the local API slightly nicer but the global manipulations harder. Keeping things ambient keeps the global operations trivial, and I expect most interesting theory (SFTs, complexity, entropy, …) will want to combine subshifts.

I don’t yet have a long worked-out example, but even at this basic stage you can already see the pattern with unions. If you’d like, I can include some of these wrapper lemmas in the PR so that the ergonomics for subshifts are visible right away.

sgouezel · 2025-09-26T06:39:07Z

I see your point. I think it's just a difference of background: from my point of view, I work a lot inside a fixed subshift (to study entropy, transfer operators, thermodynamic formalism, and so on), but I essentially never work in the space of subshifts (i.e., I'll never take the union of two subshifts).

Since you are the one doing the PR, we should go with your point of view. Could you please explain this design motivation in the file-level docstring? Also, it might be a good idea to namespace everything, to make sure that if we want to define cylinders with the inner point of view we don't get a clash.

Sfgangloff · 2025-09-26T08:14:35Z

I see your point. I think it's just a difference of background: from my point of view, I work a lot inside a fixed subshift (to study entropy, transfer operators, thermodynamic formalism, and so on), but I essentially never work in the space of subshifts (i.e., I'll never take the union of two subshifts).

Since you are the one doing the PR, we should go with your point of view. Could you please explain this design motivation in the file-level docstring? Also, it might be a good idea to namespace everything, to make sure that if we want to define cylinders with the inner point of view we don't get a clash.

Thanks for the feedback! I have updated the file accordingly:

I expanded the module-level docstring to explain the design choice;
I put all ambient definitions under a dedicated namespace (SymbolicDynamics.Ambient), so that if later we also want to develop the “inner” subshift-relative API, it can live in a parallel namespace (SymbolicDynamics.Subshift) without clashes.

sgouezel · 2025-09-29T08:15:28Z

Once you're happy, can you remove the awaiting-author tag? Otherwise, the PR does not appear in the queue, so I'm very likely to forget it!

Sfgangloff · 2025-09-29T13:55:56Z

Once you're happy, can you remove the awaiting-author tag? Otherwise, the PR does not appear in the queue, so I'm very likely to forget it!

@sgouezel I don’t seem to have permission to remove the awaiting-author tag (maybe because I’m marked as a new contributor ?). The PR is ready for another review though.

sgouezel · 2025-09-29T14:32:19Z

You need to post a message with -awaiting-author.

sgouezel

I'm really sorry for the delay, all my apologies. It's a little bit complicated to review a PR like this one, mostly made of definitions, because one can not really see why the definitions are relevant, and if some could be skipped or not -- for instance, you define dominoes but you don't use them yet. I'm sure the applications come later down the road, of course.

I'll try to get the PR inside Mathlib in the near future with your help. But for future development I think it will be more efficient if you build a project downstream of Mathlib, and prove there deep theorems that matter to you. Because proving theorems is what helps get good definitions through refactors imposed by the proofs, and also once you have this you can show off and explain why your work should get into Mathlib.

In any case, I hope that the discussions we had together helped to show you what kind of design choices are important when doing formalization and what style conventions we have, and I hope all these can be put to good use in your project. If you encounter other design questions, don't hesitate to bug me with them on Zulip!

Sfgangloff · 2026-04-01T07:22:03Z

I'm really sorry for the delay, all my apologies. It's a little bit complicated to review a PR like this one, mostly made of definitions, because one can not really see why the definitions are relevant, and if some could be skipped or not -- for instance, you define dominoes but you don't use them yet. I'm sure the applications come later down the road, of course.

I'll try to get the PR inside Mathlib in the near future with your help. But for future development I think it will be more efficient if you build a project downstream of Mathlib, and prove there deep theorems that matter to you. Because proving theorems is what helps get good definitions through refactors imposed by the proofs, and also once you have this you can show off and explain why your work should get into Mathlib.

In any case, I hope that the discussions we had together helped to show you what kind of design choices are important when doing formalization and what style conventions we have, and I hope all these can be put to good use in your project. If you encounter other design questions, don't hesitate to bug me with them on Zulip!

Thanks a lot for the feedback, and no worries at all for the delay!
I understand your point about definitions being hard to evaluate without accompanying theorems. I’ll start developing things downstream, focusing on proofs, and then progressively extract the parts that make sense to contribute to mathlib. Thanks again for the guidance, and I’ll reach out on Zulip if I run into design questions!

Sfgangloff · 2026-04-10T05:13:21Z

-awaiting-author

sgouezel · 2026-04-26T19:41:23Z

Only minor comments left, thanks!

Please don't forget to close the discussions in which you considered you've given a satisfactory answer: it makes it possible to focus on the remaining issues.

Co-authored-by: Sebastien Gouezel <sebastien.gouezel@univ-rennes1.fr>

Sfgangloff · 2026-04-28T02:14:54Z

-awaiting-author

sgouezel · 2026-04-28T12:46:16Z

+This is the set of all finite patterns obtained by restricting some configuration
+`x ∈ X` to `U`. -/
+@[to_additive languageOn]
+def mulLanguageOn (X : Set (G → A)) (U : Finset G) : Set (Pattern A G) :=


Maybe drop the mul in the definition, then.

…of isClosed_mulForbidden proof

Sfgangloff · 2026-04-29T04:05:34Z

-awaiting-author

github-actions Bot added new-contributor This PR was made by a contributor with at most 5 merged PRs. Welcome to the community! t-dynamics Dynamical Systems labels Aug 17, 2025

kckennylau reviewed Aug 17, 2025

View reviewed changes