Commit b658e36

robhogan authored and meta-codesync[bot] committed
Derive source-map tuples from Babel's decoded map (#1741)
Summary: Pull Request resolved: #1741

The transform worker built its source-map tuples via `result.rawMappings.map(toSegmentTuple)`. Accessing `result.rawMappings` forces `@babel/generator` to run a second decode (`allMappings`) that allocates a flat array of ~4-5 objects per segment, even though Babel *already* computed an equivalent decoded map (`result.decodedMap`, the `@jridgewell/gen-mapping` decoded format) eagerly during generation and Metro was discarding it.

This swaps the source to `result.decodedMap` via a new `tuplesFromBabelDecodedMap` (decoded generated and source lines are 0-based, so each is incremented by 1; name indices are resolved against `decodedMap.names`). Output is byte-identical to `result.rawMappings.map(toSegmentTuple)`, and it eliminates the redundant `allMappings` decode for *every* build (not just compact source maps). This is a standalone, unconditional improvement, so it sits first in the stack ahead of the compact-source-map work, which builds on it.

- `metro-source-map`: add `BabelDecodedMap` type + `tuplesFromBabelDecodedMap`.
- `metro-transform-worker`: source tuples from `result.decodedMap`.
- `babel_v7.x.x` libdef: add `decodedMap` to `GeneratorResult`.

Microbenchmark (real `@babel/generator` 7.29.1, 133 modules / ~30.6K segments, `--expose-gc`, median of 11):

- `generate()` alone: 20.2 ms
- `generate()` + access `decodedMap`: 19.2 ms (~0 delta; it is a sunk, eager cost)
- `generate()` + access `rawMappings`: 28.8 ms (+8.6 ms) with ~40% more heap (19.5 vs 13.9 MB)

So consuming `decodedMap` drops the `rawMappings`/`allMappings` decode entirely. (`decodedMap` is eager in 7.29.1; even if a future Babel makes it lazy, it allocates arrays of numbers where `rawMappings` allocates nested objects, so it should cost no more than `rawMappings`.)

## E2E benchmark: cold WildeBundle (this diff vs baseline = parent)

Interleaved, paired A/B: each of 12 rounds runs one cold build per cell, {baseline, this diff} x {child-process workers, worker threads}, so slow machine drift is shared within each round and cancels in the per-round delta. Fresh Metro per build, transform cache wiped (cold), `maxWorkers=16`, default path (no compact source maps).

- "Transform CPU" = total user+sys CPU across the whole worker process tree.
- "tree RSS" = whole-tree resident set (captures workers in both modes).
- "graph heap" = main-isolate heapUsed post-build (the retained module graph).
- The baseline and this-diff columns are medians; Δ is the paired mean with a 95% CI (Student-t, 11 df); "n.s." = CI includes 0.

Child-process workers (Metro default; 12 paired rounds):

| metric | baseline | this diff | Δ (95% CI) |
|---|---|---|---|
| transform CPU (s) | 625 | 612 | **-16.6 (-2.6%) [-24.7, -8.5]** |
| build wall (s) | 65.9 | 65.6 | -0.5 (-0.7%) n.s. |
| transient tree RSS (GB) | 15.8 | 16.0 | +0.06, n.s. |
| post-build tree RSS (GB) | 15.1 | 15.1 | +0.08, n.s. |
| graph heap, main isolate (GB) | 1.59 | 1.59 | ~0, n.s. |

Worker threads (`unstable_workerThreads`; 12 paired rounds):

| metric | baseline | this diff | Δ (95% CI) |
|---|---|---|---|
| transform CPU (s) | 664 | 653 | -18.6 (-2.8%) [-37.5, +0.3] |
| build wall (s) | 59.8 | 59.5 | -1.2 (-1.9%) n.s. |
| transient RSS (GB) | 13.2 | 12.7 | -0.46 (-3.5%) [-0.81, -0.11] |
| post-build RSS (GB) | 12.3 | 11.9 | -0.45 (-3.7%) [-0.80, -0.10] |
| graph heap, main isolate (GB) | 1.60 | 1.60 | ~0, n.s. |

Takeaways:

- **Transform CPU drops ~2.6-2.8%, equally in both worker modes.** The point estimates (-16.6 s child-process, -18.6 s threads) agree to within 2 s and their CIs overlap almost entirely, so there is no real asymmetry. This is exactly what the mechanism predicts: the optimization runs *inside* the worker (consume `decodedMap` instead of forcing the `rawMappings`/`allMappings` decode), so the saving is identical whether the worker is a child process or a thread. (An earlier small-n pass suggested a child-process-only win; that was sampling noise. Threads-mode CPU is just noisier, SD 30 s vs 13 s, which only widens its CI without moving the point estimate.)
- Build wall time is ~1-2% lower in both modes but within noise: the CPU saving is spread across 16 workers, so it moves the critical path little.
- Main-isolate post-build heap (the retained graph of stored tuples) is unchanged in every config: no memory regression, byte-identical output.
- Transient/post tree RSS shows a ~0.5 GB (~3.5%) reduction that is resolvable only in the lower-variance threads configuration; the noisier child-process configuration (RSS ~16 GB, CI half-width ~0.3 GB) cannot corroborate it, so treat it as suggestive, not established.

Harness: `memory-investigation/run-worker-bench-ab.sh` (interleaved A/B) + `worker-bench-measure.js` + `worker-bench-stats.js` (paired CIs), in the base diff of this stack. Worker-threads mode under `js1 run` is GK-gated (`metro_worker_threads`); benched via a local `FORCE_WORKER_THREADS` override (not committed).

Reviewed By: huntie, GijsWeterings
Differential Revision: D108506323
fbshipit-source-id: 52c05932382b48aeed2b05ca9110d5908ea6ffeb
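The Δ columns in the benchmark tables are described as a paired mean with a 95% Student-t confidence interval over 12 rounds. As a rough illustration of that computation (not the actual `worker-bench-stats.js`, whose contents are not shown here), a minimal sketch might look like the following. The function name `pairedMeanCI` and the hard-coded critical value are assumptions made for this example.

```javascript
// Two-sided 95% critical value of Student's t with 11 degrees of freedom.
// Only valid for exactly 12 paired rounds; hard-coded for this sketch.
const T_CRIT_95_DF11 = 2.201;

// Hypothetical helper: returns the mean per-round delta (candidate - baseline)
// and its confidence interval. `significant` is false when the CI includes 0,
// which is what the tables above mark as "n.s.".
function pairedMeanCI(baseline, candidate, tCrit = T_CRIT_95_DF11) {
  if (baseline.length !== candidate.length || baseline.length < 2) {
    throw new Error('Need two equally sized samples with at least 2 rounds');
  }
  const n = baseline.length;
  // Pairing: subtract within each round so shared machine drift cancels.
  const deltas = baseline.map((b, i) => candidate[i] - b);
  const mean = deltas.reduce((sum, d) => sum + d, 0) / n;
  // Sample variance of the deltas (n - 1 in the denominator).
  const variance =
    deltas.reduce((sum, d) => sum + (d - mean) ** 2, 0) / (n - 1);
  const halfWidth = tCrit * Math.sqrt(variance / n);
  const lo = mean - halfWidth;
  const hi = mean + halfWidth;
  return {mean, lo, hi, significant: lo > 0 || hi < 0};
}
```

Because the interval is built from per-round differences rather than from two independent samples, slow drift that affects both cells of a round equally does not widen it.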
1 parent 77d6054 commit b658e36

4 files changed

Lines changed: 162 additions & 2 deletions

packages/metro-source-map/src/source-map.js

Lines changed: 64 additions & 0 deletions
@@ -35,6 +35,24 @@ export type MetroSourceMapSegmentTuple =
   | SourceMapping
   | GeneratedCodeMapping;

+// A single segment of a standard "decoded" source map (as produced by
+// `@babel/generator`'s `result.decodedMap` / `@jridgewell/gen-mapping`),
+// grouped by generated line. All fields are 0-based, including the source line
+// (unlike Metro's `MetroSourceMapSegmentTuple`, whose source line is 1-based):
+//   [generatedColumn]
+//   [generatedColumn, sourceIndex, sourceLine, sourceColumn]
+//   [generatedColumn, sourceIndex, sourceLine, sourceColumn, nameIndex]
+type BabelDecodedMapSegment =
+  | [number]
+  | [number, number, number, number]
+  | [number, number, number, number, number];
+
+export type BabelDecodedMap = {
+  readonly mappings: ReadonlyArray<ReadonlyArray<BabelDecodedMapSegment>>,
+  readonly names: ReadonlyArray<string>,
+  ...
+};
+
 export type HermesFunctionOffsets = {[number]: ReadonlyArray<number>, ...};

 export type FBSourcesArray = ReadonlyArray<?FBSourceMetadata>;
@@ -279,6 +297,51 @@ function toSegmentTuple(
   return [line, column, original.line, original.column, name];
 }

+/**
+ * Converts a Babel/gen-mapping "decoded" source map (`result.decodedMap` from
+ * `@babel/generator`) into raw mapping tuples, byte-identical to
+ * `result.rawMappings.map(toSegmentTuple)`.
+ *
+ * Preferred over `result.rawMappings` because `decodedMap` is computed eagerly
+ * during generation, whereas accessing `rawMappings` triggers a second decode
+ * (`allMappings`) that allocates ~4-5 objects per segment. No terminating
+ * mapping is appended (callers that need one use `countLinesAndTerminateMap`).
+ */
+function tuplesFromBabelDecodedMap(
+  decodedMap: BabelDecodedMap,
+): Array<MetroSourceMapSegmentTuple> {
+  const {mappings, names} = decodedMap;
+  const tuples: Array<MetroSourceMapSegmentTuple> = [];
+  for (let line = 0, n = mappings.length; line < n; ++line) {
+    // Decoded mappings are grouped by generated line (0-based); tuples use
+    // 1-based generated lines.
+    const generatedLine = line + 1;
+    const segments = mappings[line];
+    for (let i = 0, m = segments.length; i < m; ++i) {
+      const segment = segments[i];
+      switch (segment.length) {
+        case 1:
+          tuples.push([generatedLine, segment[0]]);
+          break;
+        case 4:
+          // Decoded source lines are 0-based; tuples use 1-based source lines.
+          tuples.push([generatedLine, segment[0], segment[2] + 1, segment[3]]);
+          break;
+        case 5:
+          tuples.push([
+            generatedLine,
+            segment[0],
+            segment[2] + 1,
+            segment[3],
+            names[segment[4]],
+          ]);
+          break;
+      }
+    }
+  }
+  return tuples;
+}
+
 function addMappingsForFile(
   generator: Generator,
   mappings: Array<MetroSourceMapSegmentTuple>,
@@ -349,6 +412,7 @@ export {
   normalizeSourcePath,
   toBabelSegments,
   toSegmentTuple,
+  tuplesFromBabelDecodedMap,
 };

 /**
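
To make the index shifts in `tuplesFromBabelDecodedMap` concrete, here is a small, untyped sketch of the same conversion applied to a hand-written decoded map. The function name `tuplesFromDecodedMap` and the sample data are made up for illustration; the real implementation is the Flow function in the diff above.

```javascript
// Simplified, untyped stand-in for the function added in this commit.
function tuplesFromDecodedMap(decodedMap) {
  const {mappings, names} = decodedMap;
  const tuples = [];
  mappings.forEach((segments, lineIndex) => {
    // Decoded generated lines are 0-based; tuples are 1-based.
    const generatedLine = lineIndex + 1;
    for (const segment of segments) {
      if (segment.length === 1) {
        // Generated-only segment: no source position.
        tuples.push([generatedLine, segment[0]]);
      } else if (segment.length === 4) {
        // segment[1] (sourceIndex) is dropped; source line becomes 1-based.
        tuples.push([generatedLine, segment[0], segment[2] + 1, segment[3]]);
      } else if (segment.length === 5) {
        // Same as above, with the name index resolved to a string.
        tuples.push([
          generatedLine,
          segment[0],
          segment[2] + 1,
          segment[3],
          names[segment[4]],
        ]);
      }
    }
  });
  return tuples;
}

// Hypothetical decoded map: three generated lines, the second one empty.
const exampleDecodedMap = {
  names: ['foo'],
  mappings: [
    [[0, 0, 0, 0], [9, 0, 0, 9, 0]],
    [],
    [[2], [4, 0, 1, 2]],
  ],
};

console.log(JSON.stringify(tuplesFromDecodedMap(exampleDecodedMap)));
// prints [[1,0,1,0],[1,9,1,9,"foo"],[3,2],[3,4,2,2]]
```

Note that the empty generated line contributes no tuples but still advances the generated line number, which is why the last two tuples start with 3.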

packages/metro-source-map/types/source-map.d.ts

Lines changed: 23 additions & 1 deletion
@@ -6,7 +6,7 @@
  *
  * @noformat
  * @oncall react_native
- * @generated SignedSource<<7303fe7149cb12d764c6106cdf4f49ee>>
+ * @generated SignedSource<<c2fb54d8a5eb6212af899a87f3fa4852>>
  *
  * This file was translated from Flow by scripts/generateTypeScriptDefinitions.js
  * Original file: packages/metro-source-map/src/source-map.js
@@ -35,6 +35,14 @@ export type MetroSourceMapSegmentTuple =
   | SourceMappingWithName
   | SourceMapping
   | GeneratedCodeMapping;
+type BabelDecodedMapSegment =
+  | [number]
+  | [number, number, number, number]
+  | [number, number, number, number, number];
+export type BabelDecodedMap = {
+  readonly mappings: ReadonlyArray<ReadonlyArray<BabelDecodedMapSegment>>;
+  readonly names: ReadonlyArray<string>;
+};
 export type HermesFunctionOffsets = {
   [$$Key$$: number]: ReadonlyArray<number>;
 };
@@ -125,6 +133,19 @@ declare function toBabelSegments(
 declare function toSegmentTuple(
   mapping: BabelSourceMapSegment,
 ): MetroSourceMapSegmentTuple;
+/**
+ * Converts a Babel/gen-mapping "decoded" source map (`result.decodedMap` from
+ * `@babel/generator`) into raw mapping tuples, byte-identical to
+ * `result.rawMappings.map(toSegmentTuple)`.
+ *
+ * Preferred over `result.rawMappings` because `decodedMap` is computed eagerly
+ * during generation, whereas accessing `rawMappings` triggers a second decode
+ * (`allMappings`) that allocates ~4-5 objects per segment. No terminating
+ * mapping is appended (callers that need one use `countLinesAndTerminateMap`).
+ */
+declare function tuplesFromBabelDecodedMap(
+  decodedMap: BabelDecodedMap,
+): Array<MetroSourceMapSegmentTuple>;
 export {
   BundleBuilder,
   composeSourceMaps,
@@ -137,6 +158,7 @@ export {
   normalizeSourcePath,
   toBabelSegments,
   toSegmentTuple,
+  tuplesFromBabelDecodedMap,
 };
 /**
  * Backwards-compatibility with CommonJS consumers using interopRequireDefault.
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+/**
+ * Copyright (c) Meta Platforms, Inc. and affiliates.
+ *
+ * This source code is licensed under the MIT license found in the
+ * LICENSE file in the root directory of this source tree.
+ *
+ * @flow strict-local
+ * @format
+ * @oncall react_native
+ */
+
+'use strict';
+
+import generate from '@babel/generator';
+import * as babylon from '@babel/parser';
+import {toSegmentTuple, tuplesFromBabelDecodedMap} from 'metro-source-map';
+
+// The transform worker derives source-map tuples from Babel's eagerly-computed
+// `result.decodedMap` instead of triggering the more expensive `rawMappings`
+// (`allMappings`) decode. This must be byte-identical to the previous
+// `result.rawMappings.map(toSegmentTuple)`.
+const SAMPLES = [
+  `function foo(aaa, bbb) {
+const ccc = aaa + bbb;
+return ccc * 2;
+}
+class Bar extends Foo {
+method(xxx) {
+return this.value + xxx;
+}
+}
+export default function entry(items) {
+const obj = {a: 1, b: 2, c: [1, 2, 3]};
+return items.map(x => x.value).filter(Boolean);
+}
+`,
+  `const x = require('foo');\nmodule.exports = (a, b) => { let s = 0; for (let i = 0; i < a.length; i++) { s += a[i] * b; } return s; };\n`,
+  `// header\nconst y = 1;\n\n\nfunction z() { return y; }\n`,
+  `const w = 42; const v = w + 1; export {w, v};`,
+  `1 + 1;\n`,
+];
+
+describe('tuplesFromBabelDecodedMap', () => {
+  test.each(SAMPLES.map((code, i) => [i, code]))(
+    'is byte-identical to rawMappings.map(toSegmentTuple) [sample %i]',
+    (_i, code) => {
+      const ast = babylon.parse(code, {sourceType: 'unambiguous'});
+      const result = generate(
+        ast,
+        {sourceMaps: true, sourceFileName: 'file.js'},
+        code,
+      );
+      const fromRaw = (result.rawMappings ?? []).map(toSegmentTuple);
+      const fromDecoded = tuplesFromBabelDecodedMap(
+        nullthrowsLocal(result.decodedMap),
+      );
+      expect(fromDecoded).toEqual(fromRaw);
+      expect(fromDecoded.length).toBeGreaterThan(0);
+    },
+  );
+});
+
+function nullthrowsLocal<T>(x: ?T): T {
+  if (x == null) {
+    throw new Error('Expected decodedMap to be present');
+  }
+  return x;
+}
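
For contrast with the new path, the following dependency-free sketch models the old one: expanding each decoded segment into a `rawMappings`-style object and then flattening it back into a tuple. The entry shape (`generated`, `original`, `source`, `name`) is an assumption inferred from the `toSegmentTuple` return statement visible in the diff, and `expandToRawMappings` and `toSegmentTupleSketch` are hypothetical names; neither is Babel's or Metro's actual code.

```javascript
// Models the extra decode: one wrapper object plus nested position objects
// per segment, which is the allocation the commit avoids.
function expandToRawMappings(decodedMap, sourceFileName) {
  const entries = [];
  decodedMap.mappings.forEach((segments, lineIndex) => {
    for (const segment of segments) {
      const entry = {
        // Assumed to be 1-based, since tuples carry the line through as-is.
        generated: {line: lineIndex + 1, column: segment[0]},
        source: undefined,
        original: undefined,
        name: undefined,
      };
      if (segment.length >= 4) {
        entry.source = sourceFileName;
        entry.original = {line: segment[2] + 1, column: segment[3]};
      }
      if (segment.length === 5) {
        entry.name = decodedMap.names[segment[4]];
      }
      entries.push(entry);
    }
  });
  return entries;
}

// Simplified stand-in for Metro's `toSegmentTuple`.
function toSegmentTupleSketch(mapping) {
  const {line, column} = mapping.generated;
  const {original, name} = mapping;
  if (original == null) {
    return [line, column];
  }
  if (typeof name !== 'string') {
    return [line, column, original.line, original.column];
  }
  return [line, column, original.line, original.column, name];
}

// Hypothetical input: same decoded format as `result.decodedMap`.
const sampleDecodedMap = {
  names: ['foo'],
  mappings: [[[0, 0, 0, 0], [9, 0, 0, 9, 0]], [], [[2], [4, 0, 1, 2]]],
};
const viaRawMappings = expandToRawMappings(sampleDecodedMap, 'file.js').map(
  toSegmentTupleSketch,
);
console.log(JSON.stringify(viaRawMappings));
```

Under these assumptions the intermediate objects carry no information that the decoded arrays lack, which is consistent with the test's expectation that both routes yield identical tuples.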

packages/metro-transform-worker/src/index.js

Lines changed: 7 additions & 1 deletion
@@ -46,6 +46,7 @@ import {
   functionMapBabelPlugin,
   toBabelSegments,
   toSegmentTuple,
+  tuplesFromBabelDecodedMap,
 } from 'metro-source-map';
 import metroTransformPlugins from 'metro-transform-plugins';
 import collectDependencies from 'metro/private/ModuleGraph/worker/collectDependencies';
@@ -471,7 +472,12 @@ async function transformJS(
     file.code,
   );

-  let map = result.rawMappings ? result.rawMappings.map(toSegmentTuple) : [];
+  // Derive tuples from Babel's eagerly-computed decoded map rather than
+  // `result.rawMappings`, which would trigger a second, more expensive decode
+  // (`allMappings`). Byte-identical to `result.rawMappings.map(toSegmentTuple)`.
+  let map = result.decodedMap
+    ? tuplesFromBabelDecodedMap(result.decodedMap)
+    : [];
   let code = result.code;

   if (minify) {
