Commit 8b76042

fix(docs): use scalar docusaurus_tag "default" for api-nr records (#23097)
## Summary

Fourth in the series fixing search after #22861. After #23058 merged, the production index still has **0 records under `aztec-nr-api/mainnet/...`**. Confirmed by querying the live Typesense collection directly (`filter_by:url:=https://docs.aztec.network/aztec-nr-api/mainnet/*` returns `found: 0`) and by inspecting the most recent scraper run logs.

## Root cause

The schema override added by #23058 doesn't take effect. Every api-nr document import is rejected by Typesense with HTTP 400 ``'Field `docusaurus_tag` must be a string.'``, even though `custom_settings.field_definitions` lists an explicit `{ "name": "docusaurus_tag", "type": "string*" }` ahead of the wildcard `.*_tag: string`. Per Typesense docs an explicit field should win over a regex pattern field, but in practice the wildcard's `string` type appears to be what's enforced. The CI guard from #23042 (`MIN_HITS=5000`) didn't trip because the ~12k non-api docs alone kept the total above the threshold.

## Fix

The previous PR (#23058) over-engineered the solution. Reading the docusaurus theme:

```ts
// docusaurus-theme-common/src/utils/searchUtils.ts
export const DEFAULT_SEARCH_TAG = 'default';
```

```ts
// docusaurus-theme-common/src/index.ts
const tags = [DEFAULT_SEARCH_TAG, ...docsTags];
return {locale: i18n.currentLocale, tags};
```

…the theme unconditionally prepends `'default'` to the `docusaurus_tag` filter on every dropdown query, in every plugin context. So api-nr records only need the single scalar value `"default"` to satisfy the filter from anywhere on the docs site. No array, no schema surgery, no version-specific tag derivation.

Three changes:

### 1. `docs/typesense.config.json`

Drop the `custom_settings.field_definitions` override entirely (the scraper's default schema with `.*_tag: string` accepts scalar string values cleanly), and collapse the api-nr `extra_attributes.docusaurus_tag` to scalar `"default"`.

### 2. `.github/workflows/docs-typesense.yml`: remove jq mutation

The jq block that derived versioned tags is no longer needed. The scraper now reads `docs/typesense.config.json` verbatim.

### 3. `.github/workflows/docs-typesense.yml`: log api-nr visibility post-index

After the scraper completes its alias swap, curl the live `aztec-docs` alias for `docusaurus_tag:=[default]&&language:=en` and log the count. No existing docusaurus page carries the `"default"` tag (each is stamped with its plugin-context tag, e.g. `docs-developer-v4.2.0`, from the `<meta name="docsearch:docusaurus_tag">` tag), so this count is effectively the count of indexed api-nr records, and the filter mirrors what the theme actually sends. Informational only; not gated by a threshold.

## Behavior change

api-nr records will now appear in the search dropdown from every plugin context (developer, network, root, participate) and every doc version (mainnet, testnet, nightly), because they're stamped with the always-prepended `"default"` tag rather than version-specific tags. Today we only generate `aztec-nr-api/mainnet/`, so a user browsing testnet developer docs would see mainnet aztec-nr API links in their dropdown. Probably desirable (an aztec-nr API symbol is the same regardless of which doc version you're reading), but a behavior change vs the (non-functional) #23058 attempt.

## Caveat

api-nr visibility now depends on the docusaurus theme's `DEFAULT_SEARCH_TAG = 'default'` invariant. If a future caller ever issues a search query that doesn't include `'default'` in the tag list (e.g. a custom search page bypassing `useContextualSearchFilters`), api-nr records would silently disappear from that surface.

## Test plan

- [ ] Manually dispatch `Docs Scraper` workflow via `workflow_dispatch` on this branch.
- [ ] Confirm the run logs `Indexed N records (threshold: 5000)` with N >> 5000.
- [ ] Confirm the run logs `api-nr records visible under docusaurus_tag:=[default]: M` with M well above zero (#23049 indexed 14,773 api-nr records before the schema rejection started silently dropping them, so we expect a similar count).
- [ ] Confirm no ``'Field `docusaurus_tag` must be a string.'`` 400s in the scraper output.
- [ ] After merge, search docs.aztec.network from the homepage, /developers/, /network/, and /participate/ for an Aztec.nr identifier (e.g. `ContractClassId`, `balance_set`, `compute_log_tag`, `address_note`) and confirm API reference pages appear in the dropdown in all four contexts.
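The reasoning in the Fix and Caveat sections can be sketched in a few lines. This is a simplified model, not the theme's or Typesense's actual code: `IndexedRecord`, `contextualTags`, `isVisible`, and both example URLs are hypothetical names introduced for illustration, and `isVisible` only approximates a `docusaurus_tag:=[...]` filter as "the record's tag equals any listed value".

```ts
// Constant quoted in the commit message from docusaurus-theme-common.
const DEFAULT_SEARCH_TAG = "default";

// Hypothetical record shape: only the fields this sketch needs.
interface IndexedRecord {
  url: string;
  docusaurus_tag: string; // scalar, as the scraper's default schema expects
}

// Mirrors the quoted theme snippet: 'default' is always prepended.
function contextualTags(docsTags: string[]): string[] {
  return [DEFAULT_SEARCH_TAG, ...docsTags];
}

// Simplified stand-in for a `docusaurus_tag:=[a,b,...]` filter:
// a record passes when its tag equals any value in the list.
function isVisible(record: IndexedRecord, tags: string[]): boolean {
  return tags.some((tag) => tag === record.docusaurus_tag);
}

// Hypothetical records; the URLs are placeholders, the tag values come from the commit.
const apiNrRecord: IndexedRecord = {
  url: "https://docs.aztec.network/aztec-nr-api/mainnet/example",
  docusaurus_tag: "default",
};
const developerPage: IndexedRecord = {
  url: "https://docs.aztec.network/developers/example",
  docusaurus_tag: "docs-developer-v4.2.0",
};

const developerContext = contextualTags(["docs-developer-v4.2.0"]);
const participateContext = contextualTags(["docs-participate-current"]);

// api-nr records match in every context because 'default' is always in the list.
const apiFromDeveloper = isVisible(apiNrRecord, developerContext);
const apiFromParticipate = isVisible(apiNrRecord, participateContext);
// Ordinary pages stay scoped to their own plugin context.
const devPageFromParticipate = isVisible(developerPage, participateContext);
// The caveat: a caller that omits 'default' no longer sees api-nr records.
const apiWithoutDefault = isVisible(apiNrRecord, ["docs-developer-v4.2.0"]);
```

Under this model the first two results are `true` and the last two are `false`, which matches both the intended behavior change and the caveat about callers that bypass the default tag.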
2 parents 999138f + eb23b52 commit 8b76042

2 files changed

Lines changed: 24 additions & 106 deletions

.github/workflows/docs-typesense.yml

Lines changed: 23 additions & 31 deletions
@@ -34,51 +34,43 @@ jobs:
           # regression (which happened with #22861 dropping the index from
           # ~12k to 48 records) into a loud CI failure.
           MIN_HITS: "5000"
+          TYPESENSE_API_KEY: ${{ secrets.TYPESENSE_API_KEY }}
+          TYPESENSE_HOST: ${{ secrets.TYPESENSE_HOST }}
         run: |
           set -euo pipefail
 
-          # Derive the version-specific docusaurus_tag values from the docs version
-          # configs and append them to the api-nr start_url's docusaurus_tag array.
-          # Each plugin instance produces a tag of the form `docs-${pluginId}-${versionName}`.
-          # The unversioned tags (participate, root, default) are already in the static
-          # config; this step adds entries for `developer` and `network` which bump on
-          # release. Without this, the api-nr records would lose contextual search
-          # visibility every time mainnet/testnet versions change.
-          extra_tags=$(jq -nc \
-            --slurpfile dev docs/developer_version_config.json \
-            --slurpfile net docs/network_version_config.json \
-            '[
-              ("docs-developer-" + ($dev[0].mainnet // "")),
-              ("docs-developer-" + ($dev[0].testnet // "")),
-              ("docs-network-" + ($net[0].mainnet // "")),
-              ("docs-network-" + ($net[0].testnet // ""))
-            ] | map(select(. != "docs-developer-" and . != "docs-network-")) | unique')
-          echo "Derived docusaurus_tag values: $extra_tags"
-
-          config_json=$(jq -c --argjson extra "$extra_tags" '
-            .start_urls |= map(
-              if .selectors_key == "api-nr"
-              then .extra_attributes.docusaurus_tag = ((.extra_attributes.docusaurus_tag // []) + $extra | unique)
-              else .
-              end
-            )
-          ' docs/typesense.config.json)
-
           docker run \
-            -e "TYPESENSE_API_KEY=${{ secrets.TYPESENSE_API_KEY }}" \
-            -e "TYPESENSE_HOST=${{ secrets.TYPESENSE_HOST }}" \
+            -e "TYPESENSE_API_KEY=$TYPESENSE_API_KEY" \
+            -e "TYPESENSE_HOST=$TYPESENSE_HOST" \
             -e "TYPESENSE_PORT=443" \
             -e "TYPESENSE_PROTOCOL=https" \
-            -e "CONFIG=$config_json" \
+            -e "CONFIG=$(cat docs/typesense.config.json)" \
             typesense/docsearch-scraper:0.11.0 2>&1 | tee scraper.log
 
           nb_hits=$(grep -oE 'Nb hits: *[0-9]+' scraper.log | tail -1 | grep -oE '[0-9]+' || true)
           if [ -z "$nb_hits" ]; then
-            echo "::error::Could not parse 'Nb hits' from scraper output assuming index is broken."
+            echo "::error::Could not parse 'Nb hits' from scraper output, assuming index is broken."
             exit 1
           fi
           echo "Indexed $nb_hits records (threshold: $MIN_HITS)"
           if [ "$nb_hits" -lt "$MIN_HITS" ]; then
             echo "::error::Indexed only $nb_hits records (expected at least $MIN_HITS). Search index is likely broken."
             exit 1
           fi
+
+          # Log how many api-nr records are visible in the live index. The
+          # docusaurus theme always prepends `default` to its contextual
+          # docusaurus_tag filter, and no docusaurus page is stamped with
+          # `default` (each carries its plugin-context tag instead), so this
+          # facet count is effectively the count of indexed api-nr records.
+          # Informational only: the count varies with aztec-nr content size.
+          api_hits=$(curl -fsS \
+            "https://$TYPESENSE_HOST/collections/aztec-docs/documents/search" \
+            -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" \
+            -G \
+            --data-urlencode "q=*" \
+            --data-urlencode "query_by=hierarchy.lvl0" \
+            --data-urlencode "filter_by=docusaurus_tag:=[default]&&language:=en" \
+            --data-urlencode "per_page=1" \
+            | jq -r '.found')
+          echo "api-nr records visible under docusaurus_tag:=[default]: $api_hits"
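For readers who want to reproduce the visibility check outside CI, the request assembled by the `curl` call can be sketched as below. The parameter values are copied from the diff; `toQueryString`, `visibilityUrl`, and `foundCount` are hypothetical helper names, and the host passed in is an assumption. Note that `foundCount` is stricter than `jq -r '.found'`, which prints `null` for a missing field instead of failing.

```ts
// Parameters copied from the curl call in the workflow diff.
const searchParams: Array<[string, string]> = [
  ["q", "*"],
  ["query_by", "hierarchy.lvl0"],
  ["filter_by", "docusaurus_tag:=[default]&&language:=en"],
  ["per_page", "1"],
];

// Percent-encode each pair, similar in spirit to curl's --data-urlencode.
function toQueryString(params: Array<[string, string]>): string {
  return params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
}

// Hypothetical helper: assemble the search URL for a given Typesense host.
function visibilityUrl(host: string): string {
  return `https://${host}/collections/aztec-docs/documents/search?${toQueryString(searchParams)}`;
}

// Pull the match count out of a response body, failing loudly when the
// field is missing or not a number.
function foundCount(responseBody: string): number {
  const parsed: { found?: unknown } = JSON.parse(responseBody);
  if (typeof parsed.found !== "number") {
    throw new Error("response has no numeric 'found' field");
  }
  return parsed.found;
}
```

Encoding matters here because the filter contains `&&`; left unescaped, it would be split into separate query parameters before reaching Typesense.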

docs/typesense.config.json

Lines changed: 1 addition & 75 deletions
@@ -7,11 +7,7 @@
       "page_rank": 2,
       "extra_attributes": {
         "language": "en",
-        "docusaurus_tag": [
-          "docs-participate-current",
-          "docs-root-current",
-          "default"
-        ]
+        "docusaurus_tag": "default"
       }
     },
     {
@@ -63,76 +59,6 @@
       "url",
       "url_without_anchor",
       "type"
-    ],
-    "field_definitions": [
-      { "name": "anchor", "type": "string", "optional": true },
-      { "name": "content", "type": "string", "optional": true },
-      { "name": "url", "type": "string", "facet": true },
-      {
-        "name": "url_without_anchor",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "version",
-        "type": "string[]",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl0",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl1",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl2",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl3",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl4",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl5",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl6",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      { "name": "type", "type": "string", "facet": true, "optional": true },
-      {
-        "name": "docusaurus_tag",
-        "type": "string*",
-        "facet": true,
-        "optional": true
-      },
-      { "name": ".*_tag", "type": "string", "facet": true, "optional": true },
-      { "name": "language", "type": "string", "facet": true, "optional": true },
-      { "name": "tags", "type": "string[]", "facet": true, "optional": true },
-      { "name": "item_priority", "type": "int64" }
     ]
   },
   "conversation_id": ["833762294"],

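Because the regression came from a non-scalar `docusaurus_tag`, a small pre-flight check on the config could catch a recurrence before the scraper runs. The sketch below is not part of this commit: `StartUrl` and `nonScalarTagUrls` are hypothetical names, the example URL is a placeholder, and only object-style `start_urls` entries are modelled.

```ts
// Hypothetical, trimmed-down shape of a `start_urls` entry in
// docs/typesense.config.json; only the fields used here are modelled.
interface StartUrl {
  url: string;
  selectors_key?: string;
  extra_attributes?: { [key: string]: unknown };
}

// Returns the URLs of entries whose docusaurus_tag is present but is not a
// scalar string, the shape the commit message associates with the HTTP 400s.
function nonScalarTagUrls(startUrls: StartUrl[]): string[] {
  const offenders: string[] = [];
  for (const entry of startUrls) {
    const attrs = entry.extra_attributes;
    if (attrs === undefined || !("docusaurus_tag" in attrs)) {
      continue; // no tag set: nothing to validate
    }
    if (typeof attrs["docusaurus_tag"] !== "string") {
      offenders.push(entry.url);
    }
  }
  return offenders;
}

// Placeholder URL; the tag values are the ones shown in the diff.
const beforeThisCommit: StartUrl[] = [
  {
    url: "https://example.invalid/aztec-nr-api/",
    selectors_key: "api-nr",
    extra_attributes: {
      language: "en",
      docusaurus_tag: ["docs-participate-current", "docs-root-current", "default"],
    },
  },
];
const afterThisCommit: StartUrl[] = [
  {
    url: "https://example.invalid/aztec-nr-api/",
    selectors_key: "api-nr",
    extra_attributes: { language: "en", docusaurus_tag: "default" },
  },
];

const offendersBefore = nonScalarTagUrls(beforeThisCommit);
const offendersAfter = nonScalarTagUrls(afterThisCommit);
```

With these inputs the array-valued entry is flagged and the scalar one is not. Whether such a guard belongs in CI is a design choice; the commit itself relies on the informational count logged after indexing.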