Skip to content

Separate TODO and singleton cleanup paths during incremental invalidation#736

Draft
st0012 wants to merge 2 commits intomainfrom
defer-cleanup-emptied-declarations
Draft

Separate TODO and singleton cleanup paths during incremental invalidation#736
st0012 wants to merge 2 commits intomainfrom
defer-cleanup-emptied-declarations

Conversation

@st0012
Copy link
Copy Markdown
Member

@st0012 st0012 commented Apr 13, 2026

Bugs fixed

A. Indexing hangs on class << X where X is never defined. Minimal repro:

class << Math
  def log2(x); x; end
end

With no module Math anywhere in the codebase, the resolver created a TODO for Math → cleanup removed it (empty) → re-resolution re-created it → infinite loop.

B. Singleton classes wrongly removed when one of their defining files is deleted but other signals still hold them alive. Example: deleting a bare class Foo; end reopener would cascade-remove Foo::<Foo> even though another file defined def self.bar on it. Similarly for singletons kept alive by callers (Foo.new).

C. Orphan TODOs left behind after ancestor delete+readd. During the delete phase, create_todo_for_parent synthesizes placeholders for temporarily unresolvable parents. After the ancestor's definer file is re-added, the real declarations resolve correctly but the stale TODOs lingered.

D. Compact-notation TODO parents not cleaned up. Deleting the only file for class A::B; end removed A::B but left TODO A in the graph.

Solution

Two structural changes:

  1. Move TODO cleanup out of the resolve loop. It now runs exactly once, after resolution has finished — so re-creating a TODO doesn't bounce back into another cleanup (fixes A).
  2. Introduce is_removable as the single signal for "no longer in use," so singleton classes kept alive by members or inbound references aren't wrongly swept up (fixes B).

Plus a targeted worklist fix:

  • On owner-emptied-by-child-removal, cascade the owner inline through the worklist rather than leaving it stranded (fixes D).

How TODOs get removed

There are two mechanisms:

  • Cascaded removal during invalidation. When a child's invalidation empties its TODO owner, the owner is cascaded inline from invalidate_declaration. Example: with class A::B; end as the only file, deleting it removes A::B and then cascades to the TODO A (D).
  • Post-resolution cleanup. Some TODOs stay empty forever. In class << Math (A), the singleton attaches via set_singleton_class_id, not add_member, so TODO Math never gains a member and no cascade ever fires. Every TODO create_todo_for_parent makes is recorded in tracked_todos; after resolution, cleanup_stale_todos drains the set and removes entries that are still empty via invalidate_graph.

We can't decide the second case at TODO creation time — a later file could still add module Math and promote it. So tracked_todos is a watch list (everything goes in), and the decision is deferred to cleanup_stale_todos.

Singleton liveness

The new is_removable predicate on Declaration is the single signal for "no longer in use":

  • Singleton classes: kept if they have members, definitions, or inbound references.
  • Everything else: kept if they have any definition.

This protects Foo::<Foo> from being removed when the class << self file is deleted while Foo.new still references it elsewhere, or while @x/def self.bar still live under the singleton (fixes B).

Tests

16 new tests in rust/rubydex/src/model/graph.rs. Each targets one of the bugs above or an adjacent invariant, notably:

  • resolve_terminates_with_singleton_class_on_undefined_constant — the original infinite-loop repro (A)
  • singleton_class_survives_when_reopener_is_deleted / ..._when_singleton_definition_deleted_but_caller_remains / empty_singleton_removed_after_its_members_are_deleted — the retain-vs-collect matrix (B)
  • no_orphan_todo_after_ancestor_definer_readded / compact_class_todo_chain_restored_after_ancestor_delete_readd — ancestor delete+readd (C)
  • todo_parent_removed_after_compact_child_deleted / orphan_singleton_cleaned_up_when_owner_deleted / ancestor_singletons_cleaned_up_after_class_chain_deleted — cascade coverage (D)
  • multi_round_deletion_matches_fresh_index — incremental-vs-fresh parity across edit sequences

@st0012 st0012 self-assigned this Apr 13, 2026
@st0012 st0012 added the bugfix A change that fixes an existing bug label Apr 13, 2026
@st0012 st0012 marked this pull request as ready for review April 13, 2026 22:21
@st0012 st0012 requested a review from a team as a code owner April 13, 2026 22:21
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch from 4d03161 to 0d94164 Compare April 14, 2026 14:57
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch from 0d94164 to 4810117 Compare April 14, 2026 15:26
@st0012 st0012 changed the title Defer cleanup of declarations emptied during invalidation Defer cleanup and remove synthetic declarations during incremental resolution Apr 14, 2026
Comment thread rust/rubydex/src/model/graph.rs Outdated
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch 2 times, most recently from ba4d491 to def42e4 Compare April 14, 2026 16:14
@st0012 st0012 changed the title Defer cleanup and remove synthetic declarations during incremental resolution Defer cleanup of emptied and synthetic declarations during incremental resolution Apr 14, 2026
@st0012
Copy link
Copy Markdown
Member Author

st0012 commented Apr 14, 2026

@vinistock I decided to merge a follow up PR to this one, which is a much bigger change. Can you give it another look? Thanks

@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch 4 times, most recently from 7f57671 to 99c8b04 Compare April 15, 2026 10:28
@st0012
Copy link
Copy Markdown
Member Author

st0012 commented Apr 15, 2026

Changing this back to draft because I think it hits infinite loops in some cases.

@st0012 st0012 marked this pull request as draft April 15, 2026 14:37
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch 8 times, most recently from a77765a to bdf6d42 Compare April 15, 2026 21:43
Comment thread rust/rubydex/src/model/graph.rs
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch 2 times, most recently from 2e25ad5 to 93b64c1 Compare April 16, 2026 16:55
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch 9 times, most recently from e2201cb to 79ccbcf Compare April 17, 2026 12:02
@st0012 st0012 marked this pull request as ready for review April 17, 2026 12:11
@st0012 st0012 changed the title Defer cleanup of emptied and synthetic declarations during incremental resolution Separate TODO and singleton cleanup paths during incremental invalidation Apr 17, 2026
@st0012 st0012 requested a review from vinistock April 17, 2026 13:51
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch from 79ccbcf to f67a532 Compare April 17, 2026 14:46
Comment thread rust/rubydex/src/model/declaration.rs Outdated
/// members (e.g. `@x`, `def self.bar`) and by inbound constant references
/// (e.g. `Foo.new`).
#[must_use]
pub fn has_no_backing_definitions(&self) -> bool {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method probably needs to be renamed. It doesn't check for definitions exclusively.

Maybe this can be generalized as has_no_dependents? Or something along those lines.

Comment thread rust/rubydex/src/model/declaration.rs Outdated

/// Clears `singleton_class_id` only if it matches `expected`.
/// Prevents accidentally clearing a pointer to a re-created singleton.
pub fn clear_singleton_class_id(&mut self, expected: DeclarationId) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is a bit weird. What do you think about having this method only set self.singleton_class_id = None and perform the check outside?

Comment thread rust/rubydex/src/model/declaration.rs Outdated

/// Returns true if this namespace has no definitions and no members.
#[must_use]
pub fn is_empty(&self) -> bool {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is_empty here might not be descriptive enough. For example, for a singleton class, you could have no members, no definitions and still have references. E.g.:

class Foo; end
Foo.methods

That would classify as "empty".

/// could still add `module Math` and promote it. `cleanup_stale_todos`
/// sweeps this list after resolution and removes entries that are
/// still empty.
tracked_todos: IdentityHashSet<DeclarationId>,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're starting the take on a crazy amount of complexity to account for todo declarations. I keep asking myself if this is worth it.

The only use case we have is Tapioca wanting to include RBIs even for things that indeed don't exist in the codebase. For example:

# some Minitest extension gem
class Minitest::MyExtension
end

If Minitest exists, we will correctly provide the right declarations. If not, then it's because the workspace doesn't use Minitest (which would make it disappear from RBIs, but not result in any type checking failure since it is not used).

I wonder if there's another way to account for the Tapioca use case of wanting to include a one to one representation of the codebase ignoring undefined constants without having to deal with all of this todo complexity.

Comment on lines +97 to +106
/// 2. Post-resolution cleanup — some TODOs stay empty forever.
/// Example: `class << Math; def log2(x); x; end; end` with no
/// `module Math` anywhere. The resolver creates a TODO `Math` as
/// the singleton's parent, but the singleton attaches via
/// `set_singleton_class_id`, not `add_member`. So TODO `Math`
/// never gains a member; nothing is ever removed, so (1) never
/// fires. We can't decide this at creation time — a later file
/// could still add `module Math` and promote it. `cleanup_stale_todos`
/// sweeps this list after resolution and removes entries that are
/// still empty.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this, do we need to create a todo in this case? I think we can limit todo creation to the module and class keywords, but excluding singleton class blocks (after all, the target is a reference).

@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch from f67a532 to 360c9a6 Compare April 17, 2026 18:37
@st0012 st0012 marked this pull request as draft May 1, 2026 13:02
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch from 360c9a6 to d55bfaa Compare May 1, 2026 13:02
st0012 added 2 commits May 3, 2026 22:48
After incremental invalidation, declarations can lose all their
definitions. Previously these were removed immediately, destroying
accumulated state (singleton links, member lists) the resolver needs.

This defers TODO cleanup to after resolution via `cleanup_stale_todos()`.
TODOs created speculatively by `create_todo_for_parent` are marked via
`defer_todo_cleanup` and evaluated post-resolution — promoted TODOs and
those with members are kept, empty stubs are removed.

Singleton classes use references and members as "still in use" signals
in `has_no_backing_definitions`, preventing premature removal when
callers (Foo.new) or class-level state (@x = 1) still exist.

Other empty declarations (from file deletion or member removal) cascade
immediately via the invalidation worklist.
@st0012 st0012 force-pushed the defer-cleanup-emptied-declarations branch from d55bfaa to dc37cbf Compare May 3, 2026 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix A change that fixes an existing bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants