Commit 48b7fa5

Merge pull request #215 from threefoldtech/master-docs-update
docs: update all project documentation
2 parents 214cb52 + d60af4a commit 48b7fa5

12 files changed

Lines changed: 481 additions & 111 deletions

README.md

Lines changed: 23 additions & 15 deletions
@@ -4,7 +4,7 @@ A high-performance indexing layer with a queryable GraphQL API over Ledger Chain
44

55
## What this is
66

7-
This project provides a Subsquid-based indexer that consumes raw blockchain events from Ledger Chain, transforms them into a structured schema, and exposes them through a GraphQL endpoint. It replaces direct chain queries with a fast, developer-friendly API suitable for front-end applications and data analytics.
7+
This project provides a [Subsquid](https://docs.subsquid.io)-based indexer that consumes raw blockchain events from Ledger Chain, transforms them into a structured schema, and exposes them through a GraphQL endpoint. It replaces direct chain queries with a fast, developer-friendly API suitable for front-end applications and data analytics.
88

99
## What this repository contains
1010

@@ -36,7 +36,7 @@ This repository is owned and maintained by TF-Tech NV, a Belgian company respons
3636

3737
## Prerequisites
3838

39-
- Node v16.x
39+
- Node v20+
4040
- Docker
4141
- Docker Compose
4242

@@ -46,19 +46,27 @@ See [docs](./docs/readme.md) for detailed running instructions.
4646

4747
## Project layout
4848

49-
- `indexer` — Docker Compose setup for the indexer
50-
- `db` — Processor database migration files
51-
- `scripts` — Scripts for generating initial state and development scripts
52-
- `src` — Source code
53-
- `mappings` — Mapper functions for indexer data
54-
- `model` — Generated models from the `schema.graphql` file
55-
- `types` — Type files that require manual edits when the schema or chain types change
56-
- `processor.ts` — Processor entrypoint
57-
- `typegen` — Declaration file generation (used for development)
58-
- `tfchainVersions.jsonl` — Generated Ledger Chain runtime versions and their data
59-
- `typegen.json` — Typegen config
60-
- `typesBundle.json` — Typegen bundle config
61-
- `schema.graphql` — The GraphQL schema file; changes to this file result in changes to the models in `src/models`
49+
- `indexer/` — Docker Compose setup for the indexer (archive)
50+
- `db/` — Processor database migration files
51+
- `scripts/` — Utility scripts (see [scripts/readme.md](./scripts/readme.md))
52+
- `src/` — Processor source code
53+
- `mappings/` — Event handler functions that map chain events to database entities
54+
- `model/` — Generated TypeORM models from `schema.graphql`
55+
- `types/` — Auto-generated type definitions (do not edit manually — run `make typegen`)
56+
- `processor.ts` — Processor entrypoint: event subscription and dispatch
57+
- `typegen/` — Type generation infrastructure
58+
- `tfchainVersions.jsonl` — Append-only log of runtime metadata from all Ledger Chain networks
59+
- `typegen.json` — Typegen config: which events to generate types for
60+
- `typesBundle.json` — Frozen pre-V14 type mappings (do not edit for new runtime versions)
61+
- `docs/` — Documentation
62+
- [typeChanges.md](./docs/typeChanges.md) — How to handle type changes on chain (adding new runtime versions, resync guidance)
63+
- [development.md](./docs/development.md) — Local development setup
64+
- [production.md](./docs/production.md) — Production deployment
65+
- [release_process.md](./docs/release_process.md) — Release workflow
66+
- `schema.graphql` — GraphQL schema — changes here regenerate `src/model/` via `npm run codegen`
67+
- `Makefile` — Common tasks: `typegen`, `typegen-add`, `typegen-seed`, `version-bump`
68+
- `processor-chart/` — Helm chart for processor + query node deployment
69+
- `indexer/chart/` — Helm chart for indexer stack deployment
6270

6371
## License
6472

docs/advanced-development.md

Lines changed: 195 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,195 @@
1+
# Advanced Development Guide
2+
3+
This document covers the internal architecture of the tfchain_graphql indexer/processor stack. For the day-to-day workflow (adding new runtime versions, resyncing), see [typeChanges.md](./typeChanges.md).
4+
5+
## 1. Pre-V14 Metadata and the typesBundle
6+
7+
Substrate runtime metadata comes in different versions. TFChain has two eras:
8+
9+
- **V12 (pre-V14)**: metadata is NOT self-describing for custom types. The `typesBundle.json` file provides type definitions that tell the decoder how to interpret SCALE-encoded data.
10+
- **V14+**: metadata is self-describing. All type information is embedded in the metadata itself. The typesBundle is not used.
11+
12+
The boundary differs per network:
13+
14+
| Network | Pre-V14 specs | First V14 spec |
15+
|---------|--------------|----------------|
16+
| Devnet | 49-67 | 101 |
17+
| QAnet | 61-67 | 104 |
18+
| Testnet | 9-70 | 113 |
19+
| Mainnet | 31-70 | 113 |
20+
21+
The typesBundle uses `minmax` ranges to define which type definitions apply at which spec versions. For example, `[50, None]` means "from spec 50 onwards until overridden by a later entry." When a type changes (e.g., a new field is added), a new `minmax` entry is added with the updated definition.
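As a rough illustration of how `minmax` ranges resolve, the sketch below merges every entry whose range covers a given spec version, with later entries overriding earlier ones. The `VersionedTypes` shape, the `resolveTypes` helper, and the field lists are assumptions made for this example, not the real typesBundle contents or the decoder's actual code; `null` stands in for the open bound written as `None` above.

```typescript
// Hypothetical, simplified shape of one typesBundle entry.
type TypeDefs = Record<string, unknown>;

interface VersionedTypes {
  // [minSpec, maxSpec]; null means the bound is open.
  minmax: [number | null, number | null];
  types: TypeDefs;
}

// Merge every entry whose range covers specVersion, in file order,
// so a later entry overrides an earlier one for the same type name.
function resolveTypes(entries: VersionedTypes[], specVersion: number): TypeDefs {
  const resolved: TypeDefs = {};
  for (const { minmax: [min, max], types } of entries) {
    const aboveMin = min === null || specVersion >= min;
    const belowMax = max === null || specVersion <= max;
    if (aboveMin && belowMax) Object.assign(resolved, types);
  }
  return resolved;
}

// Illustrative field lists only, not the real Farm definition.
const entries: VersionedTypes[] = [
  { minmax: [50, null], types: { Farm: { id: "u32" } } },
  { minmax: [63, null], types: { Farm: { id: "u32", dedicatedFarm: "bool" } } },
];

console.log(resolveTypes(entries, 55)); // older Farm definition
console.log(resolveTypes(entries, 63)); // newer Farm definition with the added field
```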
22+
23+
**Event hashes** for pre-V14 specs are computed from the typesBundle type definitions combined with the metadata. This is important: the same Rust struct can produce different event hashes depending on how the typesBundle defines it (field names, field order).
24+
25+
## 2. The Indexer-Processor Contract
26+
27+
The data flow from chain to GraphQL API is:
28+
29+
```
30+
Chain (SCALE-encoded events)
31+
-> Indexer (substrate-ingest) decodes SCALE using typesBundle -> stores decoded JSON in CockroachDB
32+
-> Gateway serves stored JSON
33+
-> Processor's decodeEvent() reads args[fieldName] from the JSON
34+
-> Processor stores entities in PostgreSQL
35+
-> Query node serves GraphQL API from PostgreSQL
36+
```
37+
38+
The processor does NOT decode raw SCALE bytes. It reads pre-decoded JSON from the gateway. The `decodeEvent` method in `@subsquid/substrate-processor` iterates over the event's field definitions and reads `args[fieldName]` from the stored JSON object.
39+
40+
This means **field names must match** between:
41+
- The typesBundle that the indexer used to decode and store the JSON
42+
- The type definitions the processor expects (derived from the current typesBundle + metadata)
43+
44+
If the typesBundle is updated but the indexer is not resynced, the stored JSON has field names from the old typesBundle while the processor expects names from the new one. This mismatch causes assertion failures during decoding.
45+
46+
**Rule: always resync the indexer after changing the typesBundle.** Alternatively, add workaround patches in the mapping handlers to fix the field names before decoding (see section 4).
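The contract can be illustrated with a simplified stand-in for the processor's read path. This is a sketch under stated assumptions, not the actual `@subsquid/substrate-processor` code: `readArgs`, `FieldDefs`, and the sample objects are hypothetical, and the field names are borrowed from the `dedicatedFarm:` case described in section 4.

```typescript
// Simplified stand-in for the read path: look up each expected field by name
// in the stored JSON and check its JavaScript type.
type FieldKind = "boolean" | "string" | "number";
type FieldDefs = Record<string, FieldKind>;

function readArgs(fields: FieldDefs, args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const name of Object.keys(fields)) {
    const value = args[name];
    if (typeof value !== fields[name]) {
      throw new Error(`field "${name}": expected ${fields[name]}, got ${typeof value}`);
    }
    out[name] = value;
  }
  return out;
}

// Field names the processor expects (derived from the current typesBundle).
const expected: FieldDefs = { id: "number", dedicatedFarm: "boolean" };
// JSON stored by an indexer that was built with the old typesBundle.
const stored = { id: 1, "dedicatedFarm:": false };

try {
  readArgs(expected, stored);
} catch (e) {
  console.log((e as Error).message);
  // prints: field "dedicatedFarm": expected boolean, got undefined
}
```

The lookup is purely by key, so a one-character difference in a field name is enough to turn a valid stored value into `undefined`.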
47+
48+
## 3. Cross-Network Metadata Differences
49+
50+
A given spec version number typically represents the same runtime binary across all networks. This was verified by comparing metadata hex from the firesquid indexers on all 4 networks:
51+
52+
```graphql
53+
# Query on each network's firesquid
54+
{ metadata(where: {specVersion_eq: 49}) { hex } }
55+
```
56+
57+
All shared pre-V14 specs have identical metadata across all networks that deployed them.
58+
59+
**Exceptions**: specs 125 and 134 have different WASM on devnet than on the other networks. Devnet received release candidate builds that were later revised before deployment to qa/test/main. These are V14 specs, so the typesBundle is not involved. All tracked event hashes were verified to be identical despite the metadata differences (the changes are in non-event types).
60+
61+
The JSONL merge script (`scripts/merge-versions.js`) deduplicates by specVersion, keeping the first entry encountered. Since devnet is seeded first, devnet's metadata wins for conflicts.
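The "first entry wins" behavior can be sketched as follows. This is not the code of `scripts/merge-versions.js`; the `VersionEntry` shape (including the `network` field, added only to show which entry survives) and the `mergeVersions` helper are assumptions for illustration.

```typescript
// Hypothetical entry shape; real JSONL entries carry the runtime metadata.
interface VersionEntry {
  specVersion: number;
  network: string; // illustrative only, shows which source an entry came from
}

// Deduplicate by specVersion, keeping the first entry encountered.
// Sources are processed in argument order, so the first source wins conflicts.
function mergeVersions(...sources: VersionEntry[][]): VersionEntry[] {
  const seen = new Map<number, VersionEntry>();
  for (const source of sources) {
    for (const entry of source) {
      if (!seen.has(entry.specVersion)) seen.set(entry.specVersion, entry);
    }
  }
  return Array.from(seen.values());
}

const devnet: VersionEntry[] = [
  { specVersion: 49, network: "devnet" },
  { specVersion: 125, network: "devnet" },
];
const mainnet: VersionEntry[] = [
  { specVersion: 113, network: "mainnet" },
  { specVersion: 125, network: "mainnet" },
];

// Devnet is passed first, so its spec 125 entry is kept and mainnet's is dropped.
console.log(mergeVersions(devnet, mainnet));
```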
62+
63+
## 4. The `dedicatedFarm:` Colon Bug
64+
65+
The typesBundle historically had a typo: `"dedicatedFarm:"` (trailing colon) in the Farm struct definition at `[63, None]`.
66+
67+
- **Introduced**: commit `478ee70`
68+
- **Fixed**: commit `980dd11`
69+
- **Grid deployment updated**: commit `119b5dc` (June 2024)
70+
- **Indexers resynced**: never
71+
72+
Because the indexers were never resynced, all network indexer snapshots contain decoded JSON with `"dedicatedFarm:"` (with colon) as the field key for pre-V14 Farm events. The current typesBundle (without colon) produces a different event hash, and the processor expects `"dedicatedFarm"` (no colon) as the field name.
73+
74+
When the processor reads `args["dedicatedFarm"]`, it gets `undefined` because the stored key is `"dedicatedFarm:"`. The SCALE JSON codec then asserts `typeof undefined == "boolean"` and crashes.
75+
76+
**Workaround** (in the `isV63` branches of `farmStored` and `farmUpdated`):
77+
78+
```typescript
79+
(item.event.args as any).dedicatedFarm = false
80+
```
81+
82+
This adds the expected field name to the args object before decoding. The value `false` is safe because `farmStored` hardcodes `dedicatedFarm = false` anyway.
83+
84+
**Proper fix**: resync all indexers with the corrected typesBundle so the stored JSON has correct field names. After resync, the workaround can be removed.
85+
86+
**Note**: the mainnet `grid_deployment` repo still has the old typesBundle with this colon bug. Devnet, qanet, and testnet have the corrected version.
87+
88+
## 5. How Typegen Assigns Version Labels
89+
90+
Typegen reads the JSONL file in order (sorted by specVersion). For each tracked event, it:
91+
92+
1. Computes the event hash at each specVersion by decoding the metadata (using the typesBundle for pre-V14)
93+
2. Compares the hash to the previous entry's hash
94+
3. If the hash changed, generates a new `isVxx`/`asVxx` accessor named after the specVersion
95+
4. If the hash is the same as the previous entry, skips it (consecutive hash dedup)
96+
97+
This means:
98+
- The JSONL order determines which specVersion gets the "canonical" label for each hash
99+
- The same hash can produce accessors at multiple specs if it appears, changes, then reverts (e.g., Twin hash oscillates at specs 125-127 due to devnet metadata differences)
100+
- Typegen **cannot** handle two JSONL entries with the same specVersion (duplicate TypeScript method names would cause compilation errors)
101+
102+
The `isVxx` runtime check is `getEventHash(eventName) === 'hash'`. It compares the event's hash under the current block's runtime metadata, not the spec version, so `isV9` matches blocks at any spec version where the event hash is unchanged.
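The consecutive-hash dedup described in this section can be sketched as below. It is a simplified model, not typegen's implementation: `SpecHash`, `versionLabels`, the spec numbers, and the hash strings are placeholders chosen for the example.

```typescript
// One row per JSONL entry: the hash a tracked event has at that spec version.
interface SpecHash {
  specVersion: number;
  hash: string; // placeholder values, not real event hashes
}

// Return the spec versions at which a new isVxx/asVxx accessor would be emitted:
// a label is produced only when the hash differs from the previous entry's hash.
function versionLabels(history: SpecHash[]): number[] {
  const sorted = history.slice().sort((a, b) => a.specVersion - b.specVersion);
  const labels: number[] = [];
  let previous: string | undefined;
  for (const { specVersion, hash } of sorted) {
    if (hash !== previous) labels.push(specVersion);
    previous = hash;
  }
  return labels;
}

// The hash appears, changes, then reverts.
const history: SpecHash[] = [
  { specVersion: 10, hash: "a" },
  { specVersion: 11, hash: "a" }, // same as previous entry: skipped
  { specVersion: 12, hash: "b" }, // changed: new label
  { specVersion: 13, hash: "a" }, // reverted: new label again
];

console.log(versionLabels(history)); // [ 10, 12, 13 ]
```

In this model the accessors for specs 10 and 13 carry the same hash, which is why a revert produces two labels that match the same blocks at runtime.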
103+
104+
## 6. Network Deployment History
105+
106+
| Network | Genesis spec | Genesis block | Pre-V14 range | Notes |
107+
|---------|-------------|---------------|---------------|-------|
108+
| Testnet | 9 | 0 | 9-70 | Oldest continuous chain. Has all historical specs. |
109+
| Mainnet | 31 | 0 | 31-70 | Started later than testnet. |
110+
| QAnet | 61 | 0 | 61-67 | Reset. Started from spec 61. |
111+
| Devnet | 49 | 0 | 49-67 | Reset multiple times. Current chain starts at spec 49. Has RC specs 63-67 that are devnet-only. |
112+
113+
**Spec reuse after resets**: devnet was reset multiple times. Spec numbers 1-48 existed on the old devnet but are gone. The current devnet starts at spec 49. Git history may show commits with the same spec number from different eras. The deployed version is always the commit that bumps `spec_version` in `substrate-node/runtime/src/lib.rs`.
114+
115+
**Git commit pattern**: developers add features in commits while the runtime still has the old spec version, then a separate commit bumps the spec. The bump commit is what gets built and deployed. When tracing types at a spec version, look at the commit that set `spec_version: XX`, not earlier commits that may show intermediate states.
116+
117+
**Verifying deployed runtime**: compare metadata hex from firesquid indexers across networks. If the hex matches, the same WASM was deployed. If it differs, the networks have different code at that spec (typically devnet RC vs production release).
118+
119+
## 7. Debugging Event Decode Failures
120+
121+
### Symptoms
122+
123+
- `AssertionError: typeof value == "boolean"` (or "string", "number")
124+
- `AssertionError: The expression evaluated to a falsy value` in codec-json.js
125+
- Processor crash loop at a specific block
126+
127+
### Step 1: Identify the spec version
128+
129+
Enable `SQD_DEBUG=sqd:processor:mapping` on the processor. Avoid `sqd:processor:*`, which floods the logs with serialization errors from the node-fetch URLSearchParams bug. Then look for the specId in the debug output, or check which block the processor is stuck on:
130+
131+
```bash
132+
docker logs processor-container 2>&1 | grep "last processed block"
133+
```
134+
135+
Then query the indexer explorer for the spec at that block:
136+
```graphql
137+
{ blocks(where: {height_eq: XXXXX}) { spec { specVersion } } }
138+
```
139+
140+
### Step 2: Determine pre-V14 or V14
141+
142+
- Devnet: spec < 101 is pre-V14
143+
- Testnet/Mainnet: spec < 113 is pre-V14
144+
- QAnet: spec < 104 is pre-V14
145+
146+
For V14 failures, the issue is in the auto-generated types. For pre-V14, the typesBundle is involved.
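The per-network boundaries above can be captured in a small lookup. The sketch below uses the "First V14 spec" values from the table in section 1; the `Network` type, `FIRST_V14_SPEC`, and `isPreV14` are names invented for this example and do not exist in the repository.

```typescript
type Network = "devnet" | "qanet" | "testnet" | "mainnet";

// First V14 spec per network, taken from the table in section 1.
const FIRST_V14_SPEC: Record<Network, number> = {
  devnet: 101,
  qanet: 104,
  testnet: 113,
  mainnet: 113,
};

// True when the typesBundle is involved in decoding events at this spec.
function isPreV14(network: Network, specVersion: number): boolean {
  return specVersion < FIRST_V14_SPEC[network];
}

console.log(isPreV14("devnet", 67)); // true
console.log(isPreV14("devnet", 101)); // false
console.log(isPreV14("mainnet", 70)); // true
```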
147+
148+
### Step 3: For pre-V14 failures
149+
150+
1. Check the typesBundle `minmax` range covering this spec
151+
2. Verify the type definition matches the Rust source at that spec
152+
3. **Check what the indexer stored**: query the firesquid gateway for the event:
153+
```graphql
154+
{ events(where: {name_eq: "TfgridModule.FarmStored", block: {height_eq: XXXXX}}) { args } }
155+
```
156+
4. Compare the stored field names with what the processor expects
157+
5. If field names don't match, the indexer was built with a different typesBundle
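Steps 4 and 5 amount to a key comparison, which might be scripted along these lines. `diffFieldNames` is a hypothetical helper, and the sample `args` object mimics the `dedicatedFarm:` case from section 4 rather than real gateway output.

```typescript
interface FieldNameDiff {
  missing: string[]; // expected by the processor, absent from the stored JSON
  unexpected: string[]; // present in the stored JSON, unknown to the processor
}

// Compare the keys the indexer stored with the field names the processor expects.
function diffFieldNames(storedArgs: Record<string, unknown>, expectedFields: string[]): FieldNameDiff {
  const storedKeys = Object.keys(storedArgs);
  return {
    missing: expectedFields.filter((name) => storedKeys.indexOf(name) === -1),
    unexpected: storedKeys.filter((key) => expectedFields.indexOf(key) === -1),
  };
}

// Sample `args` as it might come back from the gateway query in step 3.
const storedArgs = { id: 1, "dedicatedFarm:": false };

console.log(diffFieldNames(storedArgs, ["id", "dedicatedFarm"]));
// { missing: [ 'dedicatedFarm' ], unexpected: [ 'dedicatedFarm:' ] }
```

A non-empty `missing` list paired with a near-identical name in `unexpected` is a strong hint that the indexer was built with a different typesBundle.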
158+
159+
### Step 4: Compare metadata across networks
160+
161+
Query each network's firesquid:
162+
163+
```graphql
164+
{ metadata(where: {specVersion_eq: XX}) { hex } }
165+
```
166+
167+
Compare the hex values directly. Same hex = same WASM. Different hex = different code at the same spec (typically devnet RC).
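Once the hex strings have been collected, the comparison can be automated by grouping networks that returned the same value. This is a minimal sketch; `groupByHex` is a hypothetical helper and the hex strings are short placeholders, not real metadata.

```typescript
// Group network names by the metadata hex they returned for one spec version.
// Networks in the same group deployed the same WASM at that spec.
function groupByHex(hexByNetwork: Record<string, string>): string[][] {
  const groups = new Map<string, string[]>();
  for (const network of Object.keys(hexByNetwork)) {
    const hex = hexByNetwork[network];
    const group = groups.get(hex) || [];
    group.push(network);
    groups.set(hex, group);
  }
  return Array.from(groups.values());
}

// Placeholder values: devnet differs, as it would for a release candidate build.
const hexByNetwork = {
  devnet: "0xaa",
  qanet: "0xbb",
  testnet: "0xbb",
  mainnet: "0xbb",
};

console.log(groupByHex(hexByNetwork));
// [ [ 'devnet' ], [ 'qanet', 'testnet', 'mainnet' ] ]
```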
168+
169+
### Step 5: Compare event hashes
170+
171+
Save metadata to temporary JSONL files and run typegen on each to see what event hashes they produce:
172+
173+
```bash
174+
# Create single-entry JSONL from indexer metadata
175+
# Run typegen with the tracked events
176+
# Compare the getEventHash lines in the output
177+
```
178+
179+
### Step 6: Check production code
180+
181+
If production works but your branch doesn't:
182+
183+
```bash
184+
diff <(git show origin/production-branch:src/types/events.ts | grep getEventHash | sort) \
185+
<(grep getEventHash src/types/events.ts | sort)
186+
```
187+
188+
Look for removed hash branches or missing workaround patches in the mapping handlers.
189+
190+
### Common pitfalls
191+
192+
- **Mixed-version DB contamination**: running two different processor versions against the same PostgreSQL database creates mixed entity ID formats. Always do a full DB reset when switching versions.
193+
- **CockroachDB snapshot extraction**: never use `--strip-components` when extracting indexer snapshots. The tar archive has SST files at root level.
194+
- **Startup race**: on first start, the processor may fail with "relation does not exist" if migrations haven't completed. The Docker restart policy recovers from this automatically.
195+
- **Shared Docker network DNS**: when both compose stacks share a network, all service names must be unique across both stacks. The indexer uses `cockroachdb` (not `db`) to avoid collision with the processor's PostgreSQL `db` service.
