VersionDB Migration
VersionDB is a solution for the size issue of the IAVL database (application.db). At this stage, it is only recommended for archive and non-validator nodes (validator nodes are recommended to prune anyway).
VersionDB stores multiple versions of on-chain state key-value pairs directly, without using a merklized tree structure like the IAVL tree. Both DB size and query performance are significantly better than the IAVL tree. The major feature it lacks compared to the IAVL tree is root hash and merkle proof generation, so the IAVL tree is still required for those tasks.
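To make the idea of storing versioned key-value pairs directly more concrete, here is a minimal in-memory sketch in Python. It is an illustration only: the class name `VersionedKV` and its methods are hypothetical, and the real implementation relies on RocksDB's user-defined timestamp rather than on anything shown here. The sketch keeps, for each key, the list of versions at which it was written, and answers a historical query by finding the latest write at or below the requested version.

```python
import bisect
from typing import Dict, List, Optional


class VersionedKV:
    """Toy multi-version key-value store (illustration only)."""

    def __init__(self) -> None:
        # key -> ascending write versions, and the value written at each one.
        # A value of None marks a deletion at that version.
        self._versions: Dict[bytes, List[int]] = {}
        self._values: Dict[bytes, List[Optional[bytes]]] = {}

    def put(self, key: bytes, version: int, value: Optional[bytes]) -> None:
        versions = self._versions.setdefault(key, [])
        values = self._values.setdefault(key, [])
        if versions and version <= versions[-1]:
            raise ValueError("versions must be written in increasing order")
        versions.append(version)
        values.append(value)

    def get(self, key: bytes, version: int) -> Optional[bytes]:
        """Return the value of `key` as of `version`, or None if absent or deleted."""
        versions = self._versions.get(key, [])
        # Index of the last write whose version is <= the requested version.
        i = bisect.bisect_right(versions, version) - 1
        return self._values[key][i] if i >= 0 else None
```

Because no tree nodes or hashes are stored, each write costs one entry per changed key, which is the intuition behind the smaller footprint; it is also why such a store cannot produce root hashes or merkle proofs on its own.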
The gRPC query service does not currently need to support proof generation, so VersionDB alone is sufficient to back it. A --grpc-only flag is already available for starting a standalone gRPC query service.
There can be different implementations of the VersionDB idea. The current implementation is based on RocksDB v7's experimental user-defined timestamp and stores data in a standalone RocksDB instance. It does not support other DB backends yet, but the other databases in the node continue to support multiple backends as before.
After migrating to an archived VersionDB, you can prune the IAVL tree to reclaim disk space if you do not need to generate merkle proofs on historical versions. Better support for this will be provided in the next phase of our storage optimization plan.
The limitations of the setup with VersionDB and a pruned IAVL tree are:
- eth_getProof is not supported for historical versions that have been pruned from the IAVL tree.
- Non-gRPC /abci_query is not supported for historical versions that have been pruned from the IAVL tree.
All other APIs function the same as on a regular archive node.
To enable VersionDB, add versiondb to the list of store.streamers in app.toml:
[store]
streamers = ["versiondb"]
On startup, the node creates a StreamingService that subscribes to the latest state changes in real time and saves them to VersionDB. The DB instance is placed at $NODE_HOME/data/versiondb; the DB path is not currently customizable. The node will also switch the gRPC query service's backing store from the IAVL tree to VersionDB. You should migrate the legacy state in advance to make the transition smooth; otherwise, gRPC queries will not see the legacy versions.
If VersionDB is non-empty and its latest version does not match the IAVL DB's last committed version, startup will fail with the error message "versiondb lastest version %d doesn't match iavl latest version %d". This check prevents accidentally creating gaps in VersionDB. When this error occurs, either update VersionDB to the latest version in the IAVL tree manually, or restore the IAVL DB to the same version as VersionDB (see Catch Up With IAVL Tree).
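The startup check described above can be sketched as follows. This is a hedged Python illustration of the rule, not the node's actual code: the function name and the convention of passing `None` for an empty VersionDB are assumptions made for the example. The message string is copied verbatim from the error quoted above, including its spelling.

```python
from typing import Optional


def check_versiondb_consistency(versiondb_latest: Optional[int], iavl_latest: int) -> None:
    """Raise if a non-empty VersionDB is out of step with the IAVL DB.

    `versiondb_latest` is None when VersionDB is empty (a convention assumed
    for this sketch); an empty VersionDB always passes the check.
    """
    if versiondb_latest is None:
        return
    if versiondb_latest != iavl_latest:
        # Message text reproduced as quoted in the documentation.
        raise RuntimeError(
            "versiondb lastest version %d doesn't match iavl latest version %d"
            % (versiondb_latest, iavl_latest)
        )
```

An empty VersionDB passes because there is nothing to leave a gap in; any other mismatch, in either direction, is rejected.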
For a state sync node, after the local snapshot exists, you need to manually restore the initial VersionDB:
$ cronosd changeset restore-versiondb <height> <format>
Since our chain is now quite large, significant effort has gone into ensuring that the transition process completes in a practical amount of time. The migration parallelizes tasks as much as possible and uses significant RAM, but flags are provided to control concurrency and RAM usage so it can run on different machine specs.
The legacy state migration is done in three main steps:
- Extract state change sets from the existing archived IAVL tree.
- Build VersionDB from the change set files.
- Build a clean application.db from the change set files.
$ cronosd changeset dump data --home /chain/.cronosd --iavl-version <version>
The dump command extracts change sets from the IAVL tree and writes the change sets of each store to a separate directory. By default, it uses the list of stores registered in the current version of App; this can be customized with the --stores parameter.
Important: The --iavl-version flag defaults to 1. If your chain has blocks committed under IAVL version 0 (i.e. older blocks predating the v1 upgrade), you must explicitly set --iavl-version 0 when dumping those blocks; otherwise they will be skipped and the dump will be incomplete. Always set --iavl-version explicitly to match the IAVL version that produced the blocks you want to extract.
Change set files are segmented into block chunks and compressed with zlib level 6 by default. The default chunk size is 1M blocks. The resulting data directory looks like:
data/acc/block-0.zz
data/acc/block-1000000.zz
data/acc/block-2000000.zz
...
data/authz/block-0.zz
data/authz/block-1000000.zz
data/authz/block-2000000.zz
...
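When scripting around this layout (for example, to check that a download is complete), it helps to order chunks numerically rather than lexically. The Python sketch below groups paths by store and sorts them by the number in the file name. It assumes, based only on the listing above, that this number is the chunk's starting block; `group_chunks` is a hypothetical helper, not part of cronosd.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Tuple

# Matches file names such as "block-1000000.zz".
CHUNK_NAME = re.compile(r"^block-(\d+)\.zz$")


def group_chunks(paths: Iterable[str]) -> Dict[str, List[Tuple[int, str]]]:
    """Map store name -> [(start_block, path), ...] sorted by start block."""
    stores = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        match = CHUNK_NAME.match(path.name)
        if match is None:
            continue  # ignore anything that is not a chunk file
        # The parent directory name is the store name (e.g. "acc").
        stores[path.parent.name].append((int(match.group(1)), raw))
    return {store: sorted(chunks) for store, chunks in stores.items()}
```

A plain string sort would place block-10000000.zz before block-2000000.zz, which is why the start block is parsed as an integer.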
Extraction is the slowest step; a test run on a testnet archive node took around 11 hours on an 8-core machine with SSD storage. Fortunately, change set files can be verified quickly (a few minutes), so they can be shared on a CDN in a trustless manner. Most users should download them from a CDN and verify them locally, which is much faster than extracting them from scratch.
For the RocksDB backend, the dump command opens the DB in read-only mode and can run against a live node's DB. The goleveldb backend does not yet support this.
$ cronosd changeset verify data
35b85a775ff51cbcc48537247eb786f98fc6a178531d48560126e00f545251be
{"version":"189","storeInfos":[{"name":"acc","commitId":{"version":"189" ...
The verify command replays all change sets, rebuilds the target IAVL tree, and outputs the app hash and commit info of the target version (defaults to the latest version in the change sets). You can then manually check the app hash against the block headers.
verify takes several minutes and several gigabytes of RAM to run. If RAM usage is a concern, it can run incrementally: export a snapshot at an intermediate version, then verify the remaining versions starting from that snapshot:
$ cronosd changeset verify data --save-snapshot snapshot --target-version 3000000
$ cronosd changeset verify data --load-snapshot snapshot
The change set file format is documented here.
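The incremental verification pattern (replay up to an intermediate version, save a snapshot, then resume from it) can be pictured with the following Python sketch. It is a simplification under stated assumptions: state is a plain dictionary instead of an IAVL tree, no hashes are computed, change sets are assumed to arrive in ascending version order, and the `replay` function and its types are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# One change set: (version, [(key, value), ...]); a value of None deletes the key.
ChangeSet = Tuple[int, Iterable[Tuple[bytes, Optional[bytes]]]]
# A snapshot: (version it was taken at, full state at that version).
Snapshot = Tuple[int, Dict[bytes, bytes]]


def replay(change_sets: Iterable[ChangeSet],
           snapshot: Optional[Snapshot] = None,
           target_version: Optional[int] = None) -> Snapshot:
    """Apply change sets on top of an optional snapshot, stopping at target_version."""
    version, state = (snapshot[0], dict(snapshot[1])) if snapshot else (0, {})
    for cs_version, pairs in change_sets:
        if cs_version <= version:
            continue  # already covered by the snapshot
        if target_version is not None and cs_version > target_version:
            break
        for key, value in pairs:
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
        version = cs_version
    return version, state
```

Replaying everything in one pass and replaying in two stages through a snapshot end in the same state; the two-stage form only bounds how much work, and memory, a single run needs.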
To maximize initial data ingestion speed into RocksDB, we use the SST file writer feature to write out SST files first and then ingest them into the final DB. SST files for each store can be written in parallel. We also developed an external sorting algorithm to sort the data before writing the SST files, so the SST files do not overlap and can be ingested directly into the bottom-most level of the final DB.
$ cronosd changeset build-versiondb-sst ./data ./sst
$ cronosd changeset ingest-versiondb-sst /home/.cronosd/data/versiondb sst/*.sst --move-files --maximum-version 189
You can control peak RAM usage with --concurrency and --sorter-chunk-size.
With default parameters, this finished in around 12 minutes in our test on a testnet archive node (8 cores, peak RSS 2 GB).
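The general shape of the external sorting step can be sketched in Python as below. This is not the project's algorithm, only a textbook version of the technique: sort bounded-size chunks independently, merge them, and cut the merged stream into consecutive files. The names, the `(key, version, value)` entry layout, and the ascending version order are assumptions for illustration; runs are kept in memory here, whereas a real implementation would spill them to disk.

```python
import heapq
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Entry = Tuple[bytes, int, bytes]  # (key, version, value)


def external_sort(entries: Iterable[Entry], chunk_size: int) -> Iterator[Entry]:
    """Sort a large stream by sorting fixed-size chunks and k-way merging them."""
    it = iter(entries)
    runs: List[List[Entry]] = []
    while True:
        chunk = sorted(islice(it, chunk_size))
        if not chunk:
            break
        runs.append(chunk)  # a real implementation would write each run to disk
    return heapq.merge(*runs)


def split_into_files(sorted_entries: Iterable[Entry], max_entries: int) -> List[List[Entry]]:
    """Cut a globally sorted stream into consecutive, non-overlapping batches."""
    it = iter(sorted_entries)
    files: List[List[Entry]] = []
    while True:
        batch = list(islice(it, max_entries))
        if not batch:
            return files
        files.append(batch)
```

Because the merged stream is globally sorted, the last entry of each batch sorts before the first entry of the next, which is the property that lets files be ingested without overlapping one another. The chunk size bounds how much data is sorted in memory at once, which is the kind of trade-off a flag such as --sorter-chunk-size exposes.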
When migrating an existing archive node to VersionDB, it is recommended to rebuild application.db from scratch to reclaim disk space more quickly. We provide a command to restore a single version of the IAVL tree from a memiavl snapshot:
$ # create memiavl snapshot
$ cronosd changeset verify data --save-snapshot snapshot
$ # restore application.db
$ cronosd changeset restore-app-db snapshot application.db
Then replace the entire application.db in the node with the newly generated one.
This takes only a few minutes to run on our testnet archive node. It currently supports generating only a RocksDB application.db, so set app-db-backend="rocksdb" in app.toml.
Catch Up With IAVL Tree
If a non-empty VersionDB lags behind the current application.db, the node will refuse to start. In this case you can either sync VersionDB to catch up with application.db, or restore application.db to the version matching VersionDB. To catch up, follow the same procedure as migrating from genesis, but pass the relevant block range to the change set dump command.