Commit 710d0ee
Optimize Sheet#rows_generator hot path
Benchmarked on a generated 20,000-row x 40-col xlsx (mixed shared-string,
numeric, and empty cells). Median full-parse time dropped ~1.68x
(1.85s -> 1.11s) and allocations dropped ~3x (28.6M -> ~9M), with
byte-identical output verified on both default- and prefixed-namespace
files across rows, simple_rows, and rows_with_meta_data.
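
The commit does not say how the timings and allocation counts were collected. As a rough sketch of one way to get comparable numbers, the helper below reports the median wall-clock time over several runs and the objects allocated in the last run, using only `Process.clock_gettime` and `GC.stat`. The `measure` and `median` helpers and the placeholder workload are hypothetical; in practice the block would run the full parse of the test workbook.

```ruby
# Median of a list of numbers (mean of the two middle values for even sizes).
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Run the given block `runs` times. Returns the median wall-clock seconds
# and the number of objects allocated during the final run.
def measure(runs: 5)
  times = []
  allocated = 0
  runs.times do
    GC.start # start each run from a similar heap state
    objects_before = GC.stat(:total_allocated_objects)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    times << Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    allocated = GC.stat(:total_allocated_objects) - objects_before
  end
  { median_seconds: median(times), allocated_objects: allocated }
end

# Placeholder workload (assumption): stands in for parsing every row.
result = measure(runs: 5) { 10_000.times.map { |i| "cell#{i}" } }
puts format('median %.4fs, %d objects allocated',
            result[:median_seconds], result[:allocated_objects])
```

Running the same block against the code before and after a change gives a before/after pair like the ones quoted above; the median keeps a single slow run from skewing the comparison.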
Changes, in order of impact:
- Resolve the namespace prefix once via a `namespace_resolved` flag
instead of re-checking `node.namespaces` on every node. Worksheets use
a default namespace, so the old `prefix.empty?` guard never latched and
allocated a namespaces hash for every node in the stream. This is the
bulk of the wall-clock win.
- Hoist `node.name`/`node.node_type` into locals (each was read up to 4x
per node in the if/elsif chain) and reorder branches so the hottest
nodes (<v>/<c>) are tested first.
- Use `node.attribute_hash` instead of `node.attributes` for cell and row
nodes. `Reader#attributes` is `attribute_hash.merge(namespaces)`, so it
built and merged a namespaces hash on every call; we only need the
element's own attributes. This is an allocation/GC win (drops Hash#merge
and the per-node namespaces hash) with negligible wall-clock change.
Safe because xlsx row/cell elements never declare their own namespaces
(the namespace hash is empty for them), so the result is identical --
confirmed byte-for-byte including the self-closing-row + meta-data path.
- In fill_in_empty_cells, drop the redundant `.to_a` on the column range
and use `delete_suffix` instead of `gsub` to strip the row number.
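
To illustrate the first item, here is a minimal sketch of why a guard keyed on `prefix.empty?` never latches for a default namespace, while an explicit flag does. It is not the code from this commit: `FakeNode`, `prefix_from`, `build_nodes`, and both `scan_*` methods are hypothetical stand-ins, and the fake node only counts how often `namespaces` is requested rather than parsing real XML.

```ruby
SPREADSHEETML_URI = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'

# Hypothetical stand-in for a streaming reader node. It counts namespaces
# lookups and returns a fresh hash each time to mimic a per-call allocation.
FakeNode = Struct.new(:ns, :counter) do
  def namespaces
    counter[:calls] += 1
    ns.dup
  end
end

# '' for a default namespace ('xmlns'), 'x:' for a prefixed one ('xmlns:x').
def prefix_from(namespaces)
  key, = namespaces.find { |_name, uri| uri == SPREADSHEETML_URI }
  key && key.start_with?('xmlns:') ? "#{key.delete_prefix('xmlns:')}:" : ''
end

# Guard keyed on the prefix: only stops looking once a non-empty prefix exists.
def scan_with_prefix_guard(nodes)
  prefix = ''
  nodes.each { |node| prefix = prefix_from(node.namespaces) if prefix.empty? }
  prefix
end

# Guard keyed on a flag: stops looking after the first successful resolution,
# even when the resolved prefix is the empty string.
def scan_with_flag(nodes)
  prefix = ''
  namespace_resolved = false
  nodes.each do |node|
    next if namespace_resolved

    namespaces = node.namespaces
    next if namespaces.empty?

    prefix = prefix_from(namespaces)
    namespace_resolved = true
  end
  prefix
end

# One root node that declares the namespace, followed by plain child nodes.
def build_nodes(root_namespaces, counter, count = 1_000)
  [FakeNode.new(root_namespaces, counter)] +
    Array.new(count - 1) { FakeNode.new({}, counter) }
end

default_ns = { 'xmlns' => SPREADSHEETML_URI }
guard_counter = { calls: 0 }
flag_counter = { calls: 0 }
scan_with_prefix_guard(build_nodes(default_ns, guard_counter))
scan_with_flag(build_nodes(default_ns, flag_counter))
puts "prefix guard: #{guard_counter[:calls]} namespaces lookups" # 1000
puts "flag:         #{flag_counter[:calls]} namespaces lookups"  # 1
```

With a prefixed namespace such as `{ 'xmlns:x' => SPREADSHEETML_URI }`, both variants resolve `'x:'` after a single lookup, which is why the cost only shows up on default-namespace worksheets.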
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

1 parent 39bffae commit 710d0ee
1 file changed
Lines changed: 34 additions & 25 deletions
| Hunk | Original file lines | New file lines | Additions | Deletions |
|---|---|---|---|---|
| 1 | 101-130 | 101-148 | 32 | 14 |
| 2 | 138-152 | 156-161 | 0 | 9 |
| 3 | 172-179 | 181-188 | 2 | 2 |