Skip to content

Commit b962256

Browse files
authored
fix(esm_library): extract TLA shared modules to break circular dependency (#13606)
* fix(esm_library): extract TLA shared modules to break circular dependency When using ESM library output with top-level await (TLA), a circular dependency occurs: the main chunk awaits the async chunk (because it has TLA), but the async chunk imports shared modules from the main chunk. This adds `extract_tla_shared_modules()` which scans async chunks' dependency graphs and extracts modules that reside in ancestor chunks into separate chunks to break the cycle. Also emits a warning when TLA is used with ESM library output, as the bundled execution order may differ from the original source. * fix: address review feedback for TLA shared module extraction - Reconnect entry modules to new chunk with entrypoint metadata after split, preventing entrypoints from losing their startup module - Group extracted modules by their source chunk set to avoid duplicating a module into multiple new chunks when it appears in multiple ancestors - Filter inactive connections via is_target_active() during BFS to avoid over-approximating the dependency graph - Fix doc comment: clarify that early exit skips chunk-graph scanning, not all computation * fix: only emit TLA warning when circular dep is actually detected - Fix cargo fmt: merge std imports, fix line break - Change extract_tla_shared_modules to return true only when modules were actually extracted, not just when TLA exists - Only emit the warning diagnostic when a circular dependency is found and shared modules are extracted, avoiding false warnings in tests that legitimately use TLA * fix: remove redundant has_tla check and use is_only_initial - Remove redundant has_tla early exit since async_chunks.is_empty() already covers the no-TLA case - Use is_only_initial instead of can_be_initial to also catch chunks that belong to both initial and non-initial groups * perf: single-pass BFS with shared visited set for TLA extraction Replace per-async-chunk independent BFS with a single BFS pass sharing one visited set across all async chunks. Pre-compute a reverse map (chunk → which async chunks consider it an ancestor) to enable O(1) ancestor lookups during traversal. Complexity: O(N + E + A × G) down from O(A × (N + E + G)) where N=modules, E=edges, A=async chunks, G=chunk groups. * test: enable TLA split-sync-modules tests that were previously skipped Remove test.filter.js from split-sync-modules-{1,2,3} tests that were skipped with "cycle caused by available modules". These tests exercise the exact circular dependency scenario that extract_tla_shared_modules now fixes. Add warnings.js to expect the TLA circular dependency warning that is emitted when shared modules are extracted. * fix: fix clippy collapsible_if and restore skipped TLA tests - Fix clippy collapsible_if lint by merging nested if-let + if - Restore test.filter.js for split-sync-modules tests: the chunk extraction alone is not sufficient to break the cycle in ESM rendering — additional render-side changes are needed - Add side_effects_state_artifact param for is_target_active API change * test: enable TLA split-sync-modules tests Remove test.filter.js so the tests actually execute, and add warnings.js to expect the TLA circular dependency warning. * fix: detect async chunks by TLA-sourced dynamic imports Previously we detected "async chunks" as non-initial chunks containing a module marked async. This missed the actual cycle scenario: the async chunk's modules themselves are sync; what matters is that the importing module has TLA and uses await-import to load the chunk. New detection: scan all modules with has_top_level_await, follow their DynamicImport dependencies, and mark the non-initial target chunks as potentially problematic. This correctly covers the split-sync-modules test cases where the imported chunk re-exports from a lib that sits in the parent chunk due to available-modules dedup. * fix: correctly handle entry modules in TLA chunk extraction Previously the extraction reconnected ALL moved modules as entry modules to every initial entrypoint group of the new chunk. This was wrong — only modules that were ORIGINALLY entry modules in their source chunks should be reconnected, and only to their original entrypoint group. Wrongly promoting shared library modules (like lib.js) to entrypoint status corrupted the chunk graph and caused runtime deadlocks. Key changes: - Only reconnect modules as entry modules if they WERE entry modules in some source chunk (tracked per-source-chunk via get_chunk_entry_modules_with_chunk_group_iterable lookup before disconnecting) - Use the ORIGINAL entrypoint group when reconnecting, not "all initial groups of the new chunk" - Skip DynamicImport dependencies during BFS — they produce promises and don't force the target chunk to be evaluated synchronously, so they can't cause the TLA deadlock - Set chunk_reason for debuggability This mirrors the pattern used by RemoveDuplicateModulesPlugin. * test: add snapshot files for TLA split-sync-modules tests Run with UPDATE_SNAPSHOT=true locally confirms all 3 tests pass: lib.js is extracted into a separate chunk, main.mjs and async.mjs both import from it — no circular dependency. All 198 ESM output tests pass, no regressions. * fix: per-async-chunk BFS to correctly handle multi-origin shared modules Previously the BFS used a shared visited set with single-origin tracking per module. This was buggy when a module was reachable from multiple async chunks whose ancestors differ: - Module M lives in chunks [P1, P2] - P1 is ancestor of async chunk A1, P2 is ancestor of A2 - A1's BFS reaches M first, records extract from P1 - A2's BFS skips M via visited, never records extract from P2 - Result: cycle between A2 and P2 remains Fix: run an independent BFS per async chunk with its own visited set. Each module is analyzed in the context of the async chunk actually traversing it, so extractions for different ancestor→async edges don't collide. Complexity goes from O(N+E+A·G) back to O(A·(N+E+G)), but correctness trumps the marginal speedup — A is usually small. Also add a note about the Phase 1 over-inclusion of function-scoped import()s; accepted as safe because extraction only fires when real cross-chunk static cycles exist.
1 parent d79172e commit b962256

11 files changed

Lines changed: 376 additions & 5 deletions

File tree

crates/rspack_plugin_esm_library/src/optimize_chunks.rs

Lines changed: 247 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
use std::sync::Arc;
1+
use std::{collections::VecDeque, sync::Arc};
22

33
use atomic_refcell::AtomicRefCell;
44
use rayon::prelude::*;
@@ -15,6 +15,252 @@ use rspack_util::{
1515

1616
use crate::EsmLibraryPlugin;
1717

18+
/// Scan TLA-awaited async chunks for static dependencies that reside in ancestor
19+
/// chunks. When a module with top-level await dynamically imports a chunk, and
20+
/// that chunk has static imports back to ancestor chunks, the ancestor must
21+
/// already have fully executed before the child can resolve its imports — but
22+
/// the ancestor is paused at its top-level await, creating a deadlock.
23+
///
24+
/// This function extracts such shared modules into separate chunks to break
25+
/// the cycle. It returns `true` if modules were actually extracted.
26+
pub(crate) fn extract_tla_shared_modules(compilation: &mut Compilation) -> bool {
27+
let module_graph = compilation.get_module_graph();
28+
let chunk_graph = &compilation.build_chunk_graph_artifact.chunk_graph;
29+
let chunk_group_by_ukey = &compilation.build_chunk_graph_artifact.chunk_group_by_ukey;
30+
31+
// Phase 1: Detect at-risk async chunks.
32+
// Source: modules with `has_top_level_await`.
33+
// We iterate `module.get_blocks()` (blocks attached directly to the module).
34+
// This is conservative: a function-scoped `import()` also shows up here when
35+
// the parser attaches its block to the module. We accept that over-inclusion
36+
// — extraction only actually happens in later phases when there's a real
37+
// cross-chunk static dependency back to an ancestor chunk, so function-only
38+
// imports without cycles incur only a cheap BFS and no chunk mutation.
39+
// Precise filtering would need AST-level "is the import() directly awaited
40+
// at module top level" info which is not exposed on the block.
41+
let mut async_chunks_set: FxHashSet<ChunkUkey> = FxHashSet::default();
42+
for (module_id, module) in module_graph.modules() {
43+
if !module.build_meta().has_top_level_await {
44+
continue;
45+
}
46+
for block_id in module.get_blocks() {
47+
let Some(block) = module_graph.block_by_id(block_id) else {
48+
continue;
49+
};
50+
for dep_id in block.get_dependencies() {
51+
let dep = module_graph.dependency_by_id(dep_id);
52+
if dep.dependency_type() != &DependencyType::DynamicImport {
53+
continue;
54+
}
55+
let Some(target) = module_graph.module_identifier_by_dependency_id(dep_id) else {
56+
continue;
57+
};
58+
if target == module_id {
59+
continue;
60+
}
61+
for &chunk_ukey in chunk_graph.get_module_chunks(*target) {
62+
let chunk = compilation
63+
.build_chunk_graph_artifact
64+
.chunk_by_ukey
65+
.expect_get(&chunk_ukey);
66+
if !chunk.is_only_initial(chunk_group_by_ukey) {
67+
async_chunks_set.insert(chunk_ukey);
68+
}
69+
}
70+
}
71+
}
72+
}
73+
74+
if async_chunks_set.is_empty() {
75+
return false;
76+
}
77+
78+
// Phase 2 + 3: For each at-risk async chunk, compute its ancestor chunks and
79+
// run a BFS through static outgoing edges. Each async chunk uses its OWN
80+
// `visited` set so that a module reachable from multiple async chunks is
81+
// analyzed once per origin — this matters when a shared module lives in
82+
// DIFFERENT ancestor chunks for different async chunks (e.g. multi-entry
83+
// builds where an async chunk hangs off each entry).
84+
let async_chunks: Vec<ChunkUkey> = async_chunks_set.iter().copied().collect();
85+
let mut modules_to_extract: IdentifierMap<FxHashSet<ChunkUkey>> = IdentifierMap::default();
86+
87+
for &async_chunk_ukey in &async_chunks {
88+
// Collect ancestor chunks via chunk group parent traversal
89+
let chunk = compilation
90+
.build_chunk_graph_artifact
91+
.chunk_by_ukey
92+
.expect_get(&async_chunk_ukey);
93+
let mut ancestor_groups = FxHashSet::default();
94+
for group_ukey in chunk.groups() {
95+
let group = chunk_group_by_ukey.expect_get(group_ukey);
96+
ancestor_groups.extend(group.ancestors(chunk_group_by_ukey));
97+
}
98+
let mut ancestor_chunks: FxHashSet<ChunkUkey> = FxHashSet::default();
99+
for g in &ancestor_groups {
100+
ancestor_chunks.extend(chunk_group_by_ukey.expect_get(g).chunks.iter().copied());
101+
}
102+
103+
if ancestor_chunks.is_empty() {
104+
continue;
105+
}
106+
107+
// Per-async-chunk BFS with a local `visited` set. Starts from all modules
108+
// that live in this async chunk.
109+
let mut visited = IdentifierSet::default();
110+
let mut queue: VecDeque<ModuleIdentifier> = chunk_graph
111+
.get_chunk_modules_identifier(&async_chunk_ukey)
112+
.iter()
113+
.copied()
114+
.collect();
115+
116+
while let Some(module_id) = queue.pop_front() {
117+
if !visited.insert(module_id) {
118+
continue;
119+
}
120+
121+
for conn in module_graph.get_outgoing_connections(&module_id) {
122+
// Dynamic imports produce promises; the target is not required to have
123+
// finished evaluating in the current frame, so no sync-load deadlock.
124+
let dep = module_graph.dependency_by_id(&conn.dependency_id);
125+
if dep.dependency_type() == &DependencyType::DynamicImport {
126+
continue;
127+
}
128+
if !conn.is_target_active(
129+
module_graph,
130+
None,
131+
&compilation.module_graph_cache_artifact,
132+
&compilation
133+
.build_module_graph_artifact
134+
.side_effects_state_artifact,
135+
&compilation.exports_info_artifact,
136+
) {
137+
continue;
138+
}
139+
let Some(target) = module_graph.module_identifier_by_dependency_id(&conn.dependency_id)
140+
else {
141+
continue;
142+
};
143+
let target_chunks = chunk_graph.get_module_chunks(*target);
144+
145+
// If the target lives in a chunk that is an ancestor of THIS async
146+
// chunk, extract it from that specific ancestor chunk. Also continue
147+
// BFS into the target — its own static deps may also live in ancestor
148+
// chunks (e.g. async → sharedA(ancestor) → sharedB(ancestor)).
149+
let mut in_ancestor = false;
150+
for &target_chunk in target_chunks {
151+
if ancestor_chunks.contains(&target_chunk) {
152+
modules_to_extract
153+
.entry(*target)
154+
.or_default()
155+
.insert(target_chunk);
156+
in_ancestor = true;
157+
}
158+
}
159+
160+
// Continue BFS if target is inside this async chunk OR in an ancestor
161+
// chunk (to transitively find all ancestor-resident dependencies).
162+
if in_ancestor || target_chunks.contains(&async_chunk_ukey) {
163+
queue.push_back(*target);
164+
}
165+
}
166+
}
167+
}
168+
169+
if modules_to_extract.is_empty() {
170+
return false;
171+
}
172+
173+
// Phase 4: Group modules by their exact source-chunk set (modules that live
174+
// in the same set of chunks go into one new chunk — mirrors
175+
// `RemoveDuplicateModulesPlugin`). Then create new chunks and relocate.
176+
let mut chunk_group_map: FxHashMap<Vec<ChunkUkey>, Vec<ModuleIdentifier>> = FxHashMap::default();
177+
for (module_id, source_chunks) in &modules_to_extract {
178+
let mut sorted: Vec<ChunkUkey> = source_chunks.iter().copied().collect();
179+
sorted.sort();
180+
chunk_group_map.entry(sorted).or_default().push(*module_id);
181+
}
182+
183+
for (source_chunks, modules) in chunk_group_map {
184+
let new_chunk_ukey =
185+
Compilation::add_chunk(&mut compilation.build_chunk_graph_artifact.chunk_by_ukey);
186+
if let Some(mut mutations) = compilation.incremental.mutations_write() {
187+
mutations.add(Mutation::ChunkAdd {
188+
chunk: new_chunk_ukey,
189+
});
190+
}
191+
{
192+
let new_chunk = compilation
193+
.build_chunk_graph_artifact
194+
.chunk_by_ukey
195+
.expect_get_mut(&new_chunk_ukey);
196+
*new_chunk.chunk_reason_mut() = Some("extracted to break TLA circular dependency".into());
197+
}
198+
compilation
199+
.build_chunk_graph_artifact
200+
.chunk_graph
201+
.add_chunk(new_chunk_ukey);
202+
203+
// For each module: collect which source chunks had it registered as an
204+
// entry module, then disconnect from all source chunks and connect to the
205+
// new chunk. Only modules that WERE entry modules get reconnected as entry.
206+
for &module_id in &modules {
207+
let mut entry_reconnections: Vec<ChunkGroupUkey> = Vec::new();
208+
for source_chunk_ukey in &source_chunks {
209+
if let Some(&entrypoint) = compilation
210+
.build_chunk_graph_artifact
211+
.chunk_graph
212+
.get_chunk_entry_modules_with_chunk_group_iterable(source_chunk_ukey)
213+
.get(&module_id)
214+
{
215+
entry_reconnections.push(entrypoint);
216+
compilation
217+
.build_chunk_graph_artifact
218+
.chunk_graph
219+
.disconnect_chunk_and_entry_module(source_chunk_ukey, module_id);
220+
}
221+
compilation
222+
.build_chunk_graph_artifact
223+
.chunk_graph
224+
.disconnect_chunk_and_module(source_chunk_ukey, module_id);
225+
}
226+
compilation
227+
.build_chunk_graph_artifact
228+
.chunk_graph
229+
.connect_chunk_and_module(new_chunk_ukey, module_id);
230+
for entrypoint in entry_reconnections {
231+
compilation
232+
.build_chunk_graph_artifact
233+
.chunk_graph
234+
.connect_chunk_and_entry_module(new_chunk_ukey, module_id, entrypoint);
235+
}
236+
}
237+
238+
// Establish chunk group relationships: new_chunk joins each source chunk's
239+
// groups (making it a sibling — it will be loaded alongside the source).
240+
for source_chunk_ukey in &source_chunks {
241+
let [Some(source_chunk), Some(new_chunk)] = compilation
242+
.build_chunk_graph_artifact
243+
.chunk_by_ukey
244+
.get_many_mut([source_chunk_ukey, &new_chunk_ukey])
245+
else {
246+
unreachable!("both chunks should exist")
247+
};
248+
source_chunk.split(
249+
new_chunk,
250+
&mut compilation.build_chunk_graph_artifact.chunk_group_by_ukey,
251+
);
252+
if let Some(mut mutations) = compilation.incremental.mutations_write() {
253+
mutations.add(Mutation::ChunkSplit {
254+
from: *source_chunk_ukey,
255+
to: new_chunk_ukey,
256+
});
257+
}
258+
}
259+
}
260+
261+
true
262+
}
263+
18264
/// Ensure that all entry chunks only export the exports used by other chunks,
19265
/// this requires no other chunks depend on the entry chunk to get exports
20266
///

crates/rspack_plugin_esm_library/src/plugin.rs

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -43,7 +43,7 @@ use crate::{
4343
esm_lib_parser_plugin::EsmLibParserPlugin,
4444
optimize_chunks::{
4545
analyze_dyn_import_targets, assign_dyn_import_chunk_short_names, ensure_entry_exports,
46-
optimize_runtime_chunks,
46+
extract_tla_shared_modules, optimize_runtime_chunks,
4747
},
4848
preserve_modules::preserve_modules,
4949
runtime::EsmRegisterModuleRuntimeModule,
@@ -697,6 +697,18 @@ async fn optimize_chunks(&self, compilation: &mut Compilation) -> Result<Option<
697697
crate::split_chunks::split(cache_groups, compilation).await?;
698698
}
699699

700+
let extracted_tla_shared = extract_tla_shared_modules(compilation);
701+
if extracted_tla_shared {
702+
compilation.push_diagnostic(rspack_error::Diagnostic::warn(
703+
"EsmLibraryPlugin".into(),
704+
"Top-level await with shared modules caused a circular dependency between async and \
705+
parent chunks. The shared modules have been extracted into separate chunks to break \
706+
the cycle. After bundling, the execution order of top-level await may differ from the \
707+
original source, which could lead to incorrect runtime behavior."
708+
.into(),
709+
));
710+
}
711+
700712
ensure_entry_exports(compilation);
701713
let concate_modules_map = self.concatenated_modules_map.read().await;
702714
let concatenated_modules = concate_modules_map
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
```mjs title=async.mjs
2+
3+
// ./async.js
4+
5+
6+
7+
export { value } from "./lib_js.mjs";
8+
9+
```
10+
11+
```mjs title=lib_js.mjs
12+
// ./lib.js
13+
const value = () => 42
14+
export { value };
15+
16+
```
17+
18+
```mjs title=main.mjs
19+
import { value } from "./lib_js.mjs";
20+
21+
// ./index.js
22+
23+
24+
const {value: valueAsync} = await import("./async.mjs");
25+
26+
it('should have correct value', () => {
27+
expect(value).toBe(valueAsync);
28+
expect(value()).toBe(42);
29+
})
30+
export {};
31+
32+
```

tests/rspack-test/esmOutputCases/top-level-await/split-sync-modules-1/test.filter.js

Lines changed: 0 additions & 1 deletion
This file was deleted.
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
module.exports = [/Top-level await with shared modules caused a circular dependency/];
Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
```mjs title=async.mjs
2+
3+
// ./async.js
4+
5+
6+
7+
export { value } from "./lib_js.mjs";
8+
9+
```
10+
11+
```mjs title=lib_js.mjs
12+
// ./lib.js
13+
const value = () => 42
14+
export { value };
15+
16+
```
17+
18+
```mjs title=main.mjs
19+
import { value } from "./lib_js.mjs";
20+
21+
// ./index2.js
22+
23+
24+
const {value: valueAsync} = await import("./async.mjs");
25+
26+
it('should have correct value', () => {
27+
expect(value).toBe(valueAsync);
28+
expect(value()).toBe(42);
29+
})
30+
// ./index.js
31+
32+
export {};
33+
34+
```

tests/rspack-test/esmOutputCases/top-level-await/split-sync-modules-2/test.filter.js

Lines changed: 0 additions & 1 deletion
This file was deleted.
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
module.exports = [/Top-level await with shared modules caused a circular dependency/];

0 commit comments

Comments
 (0)