Observed
After upgrading from 13.6.0.5 (Persistent) to 13.7.x (Hasql), users reported that the epoch row written at the first epoch boundary contains wildly inflated values for out_sum and fees. Examples:
- Epoch 620: out_sum ≈ 397.7 billion ADA (expected ~38 billion), fees ≈ 1M ADA (expected ~36k ADA).
- Epoch 624: both out_sum and fees ~3× the expected value.
13.6.0.5 and external explorers (e.g. AdaStat) report the correct values. The discrepancy appears at every epoch boundary on the upgraded instance, with varying multipliers, and only on epochs after the upgrade (i.e. cold-cache boundaries).
Root cause (suspected)
The queryCalcEpochEntry fallback path (used when the in-memory epoch cache is empty - cold start, rollback, or --disable-cache) has two bugs introduced by the Persistent → Hasql migration:
- cardano-db/src/Cardano/Db/Statement/EpochAndProtocol.hs (queryCalcEpochEntryStmt):
  - The SQL aggregates SUM(tx.out_sum) and SUM(tx.fee), where the columns are lovelace (= numeric(20,0)), so the SUM result is numeric, but the decoder reads them as HsqlD.int8 (PostgreSQL bigint).
  - The Word128 result is then constructed as Word128 0 (fromIntegral outSum), which only populates the low 64 bits (high 64 bits forced to 0). If the int8 read wraps negative due to type coercion, fromIntegral produces a near-max Word64 that becomes a huge Word128.
- cardano-db/src/Cardano/Db/Types.hs (word128Decoder):
  - word128Decoder = fromInteger . fromIntegral . coefficient <$> HsqlD.numeric
  - Data.Scientific.coefficient returns only the mantissa, dropping base10Exponent. If PostgreSQL ever returns the numeric in normalized form (e.g. Scientific 38 16), the decoder reads 38 instead of 380000000000000000. This affects every read of epoch.out_sum (e.g. queryLatestEpoch, used to re-prime the cache after restart).
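Both suspected failure modes can be reproduced in isolation with nothing but base. The sketch below is not the db-sync code: W128 and Sci are hypothetical stand-ins for the two-field Word128 shape and for a Scientific value (mantissa plus base-10 exponent), and every function name is invented for illustration.

```haskell
import Data.Int (Int64)
import Data.Word (Word64)

-- Hypothetical stand-in for a 128-bit word: high and low 64-bit halves.
data W128 = W128 Word64 Word64 deriving (Eq, Show)

w128ToInteger :: W128 -> Integer
w128ToInteger (W128 hi lo) = toInteger hi * 2 ^ (64 :: Int) + toInteger lo

-- Mirrors "Word128 0 (fromIntegral outSum)": only the low half is filled,
-- and a negative Int64 wraps around when converted to Word64.
fromInt64LowOnly :: Int64 -> W128
fromInt64LowOnly n = W128 0 (fromIntegral n)

-- Hypothetical stand-in for a Scientific value: coefficient * 10 ^ exponent.
data Sci = Sci Integer Int

-- Mirrors a decoder that keeps only the coefficient and ignores the exponent.
mantissaOnly :: Sci -> Integer
mantissaOnly (Sci c _) = c

-- Applies the exponent; returns Nothing if the value is not a whole number.
fullValue :: Sci -> Maybe Integer
fullValue (Sci c e)
  | e >= 0 = Just (c * 10 ^ e)
  | r == 0 = Just q
  | otherwise = Nothing
  where
    (q, r) = c `quotRem` (10 ^ negate e)

main :: IO ()
main = do
  print (w128ToInteger (fromInt64LowOnly (-1)))  -- 18446744073709551615
  print (mantissaOnly (Sci 38 16))               -- 38
  print (fullValue (Sci 38 16))                  -- Just 380000000000000000
```

If the analysis above is right, a fix would presumably decode the aggregate as numeric and apply the exponent before converting to Word128, rather than going through int8 or the bare coefficient.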
These cold-cache reads then become the seed to which subsequent per-block diffs are added by calculateNewEpoch, so the corruption persists and grows on every block until the next epoch boundary.
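Why a bad seed is never corrected can be seen in a deliberately simplified model, assuming the running total is just the seed plus the per-block diffs. The name runningTotal and all numbers are made up for illustration; this is not how calculateNewEpoch is actually written.

```haskell
-- Hypothetical model: epoch total = seed read at cold start + per-block diffs.
runningTotal :: Integer -> [Integer] -> Integer
runningTotal seed diffs = seed + sum diffs

main :: IO ()
main = do
  let diffs = [5, 7, 11]   -- made-up per-block contributions
      goodSeed = 100       -- what the cold-cache read should have returned
      badSeed = 1000       -- what a mis-decoded read might return
  print (runningTotal goodSeed diffs)  -- 123
  print (runningTotal badSeed diffs)   -- 1023
```

In this model the diffs themselves are correct, yet the written total stays wrong because nothing re-derives the seed until the next boundary.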
Affected versions
13.7.0.0 through 13.7.0.4 (the Hasql release line).