Commit 38f180b

fix(docs): stamp api-nr records with scalar docusaurus_tag "default"
#23058 tried to make aztec-nr-api records visible in the docusaurus search dropdown by stamping `docusaurus_tag` as an array of plugin context tags, paired with a `field_definitions` schema override that declared `docusaurus_tag` as `string*` (string-or-array).

In practice the override doesn't take effect: every api-nr document import is rejected by Typesense with `'Field 'docusaurus_tag' must be a string.'` The CI guard added in #23042 (MIN_HITS=5000) didn't trip because the ~12k non-api docs still passed.

The fix is much smaller. The docusaurus theme's contextual filter unconditionally prepends the constant `default` (DEFAULT_SEARCH_TAG in docusaurus-theme-common) to every dropdown query's `docusaurus_tag` list. So a single scalar value of `"default"` on api-nr records satisfies the filter from every plugin context, and no schema override is needed: the scraper's default `.*_tag: string` accepts the scalar cleanly.

Changes:

- `docs/typesense.config.json`: drop `field_definitions`; collapse api-nr `extra_attributes.docusaurus_tag` to scalar `"default"`.
- `.github/workflows/docs-typesense.yml`: drop the jq mutation that derived versioned tags (no longer needed). Add a post-index curl smoke check that searches the live alias for `docusaurus_tag:=[default]&&language:=en` and fails the run if fewer than MIN_API_HITS=1000 records are visible. No existing docusaurus page carries the `"default"` tag (each one is stamped with its plugin-context tag from the docsearch meta), so this count is effectively the count of indexed api-nr records.
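To see why one scalar tag is enough, here is a simplified model of the filter semantics described above. It is a sketch, not Typesense or docusaurus code: `matches_tag_filter` is a made-up helper and the versioned tag `docs-developer-v1.2.3` is a hypothetical value. It assumes a `field:=[a,b,...]` filter on a scalar string field matches when the record's value equals any listed value.

```shell
#!/usr/bin/env bash
# Simplified model (assumption, not library code) of `docusaurus_tag:=[a,b,...]`
# applied to a record whose docusaurus_tag is a single string.
# Usage: matches_tag_filter RECORD_TAG FILTER_TAG...
matches_tag_filter() {
  local record_tag=$1
  shift
  local candidate
  for candidate in "$@"; do
    if [ "$record_tag" = "$candidate" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical dropdown filters from two plugin contexts. `default` is in both
# lists because the theme always prepends it.
if matches_tag_filter "default" "default" "docs-developer-v1.2.3"; then
  echo "api-nr record visible from the developer docs context"
fi
if matches_tag_filter "default" "default" "docs-root-current"; then
  echo "api-nr record visible from the root docs context"
fi
```

A record stamped only with a plugin-context tag, by contrast, matches only the queries issued from that context, which is why the array of per-context tags was attempted in the first place.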
1 parent 999138f commit 38f180b

2 files changed

Lines changed: 33 additions & 106 deletions

.github/workflows/docs-typesense.yml

Lines changed: 32 additions & 31 deletions
@@ -34,51 +34,52 @@ jobs:
           # regression (which happened with #22861 dropping the index from
           # ~12k to 48 records) into a loud CI failure.
           MIN_HITS: "5000"
+          # Fail the run if fewer than this many api-nr records are visible in
+          # the live index after reindex. #23058 indexed ~12k non-api records
+          # and silently dropped every aztec-nr-api record (rejected by the
+          # collection schema) without tripping MIN_HITS, so this guard catches
+          # the api-nr-specific regression directly.
+          MIN_API_HITS: "1000"
+          TYPESENSE_API_KEY: ${{ secrets.TYPESENSE_API_KEY }}
+          TYPESENSE_HOST: ${{ secrets.TYPESENSE_HOST }}
         run: |
           set -euo pipefail
 
-          # Derive the version-specific docusaurus_tag values from the docs version
-          # configs and append them to the api-nr start_url's docusaurus_tag array.
-          # Each plugin instance produces a tag of the form `docs-${pluginId}-${versionName}`.
-          # The unversioned tags (participate, root, default) are already in the static
-          # config; this step adds entries for `developer` and `network` which bump on
-          # release. Without this, the api-nr records would lose contextual search
-          # visibility every time mainnet/testnet versions change.
-          extra_tags=$(jq -nc \
-            --slurpfile dev docs/developer_version_config.json \
-            --slurpfile net docs/network_version_config.json \
-            '[
-              ("docs-developer-" + ($dev[0].mainnet // "")),
-              ("docs-developer-" + ($dev[0].testnet // "")),
-              ("docs-network-" + ($net[0].mainnet // "")),
-              ("docs-network-" + ($net[0].testnet // ""))
-            ] | map(select(. != "docs-developer-" and . != "docs-network-")) | unique')
-          echo "Derived docusaurus_tag values: $extra_tags"
-
-          config_json=$(jq -c --argjson extra "$extra_tags" '
-            .start_urls |= map(
-              if .selectors_key == "api-nr"
-              then .extra_attributes.docusaurus_tag = ((.extra_attributes.docusaurus_tag // []) + $extra | unique)
-              else .
-              end
-            )
-          ' docs/typesense.config.json)
-
           docker run \
-            -e "TYPESENSE_API_KEY=${{ secrets.TYPESENSE_API_KEY }}" \
-            -e "TYPESENSE_HOST=${{ secrets.TYPESENSE_HOST }}" \
+            -e "TYPESENSE_API_KEY=$TYPESENSE_API_KEY" \
+            -e "TYPESENSE_HOST=$TYPESENSE_HOST" \
             -e "TYPESENSE_PORT=443" \
             -e "TYPESENSE_PROTOCOL=https" \
-            -e "CONFIG=$config_json" \
+            -e "CONFIG=$(cat docs/typesense.config.json)" \
             typesense/docsearch-scraper:0.11.0 2>&1 | tee scraper.log
 
           nb_hits=$(grep -oE 'Nb hits: *[0-9]+' scraper.log | tail -1 | grep -oE '[0-9]+' || true)
           if [ -z "$nb_hits" ]; then
-            echo "::error::Could not parse 'Nb hits' from scraper output assuming index is broken."
+            echo "::error::Could not parse 'Nb hits' from scraper output, assuming index is broken."
             exit 1
           fi
           echo "Indexed $nb_hits records (threshold: $MIN_HITS)"
           if [ "$nb_hits" -lt "$MIN_HITS" ]; then
             echo "::error::Indexed only $nb_hits records (expected at least $MIN_HITS). Search index is likely broken."
             exit 1
           fi
+
+          # Verify api-nr records are searchable in the live index. The
+          # docusaurus theme always prepends `default` to its contextual
+          # docusaurus_tag filter, and no docusaurus page is stamped with
+          # `default` (each carries its plugin-context tag instead), so this
+          # facet count is effectively the count of indexed api-nr records.
+          api_hits=$(curl -fsS \
+            "https://$TYPESENSE_HOST/collections/aztec-docs/documents/search" \
+            -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" \
+            -G \
+            --data-urlencode "q=*" \
+            --data-urlencode "query_by=hierarchy.lvl0" \
+            --data-urlencode "filter_by=docusaurus_tag:=[default]&&language:=en" \
+            --data-urlencode "per_page=1" \
+            | jq -r '.found')
+          echo "api-nr records visible under docusaurus_tag:=[default]: $api_hits (threshold: $MIN_API_HITS)"
+          if [ "$api_hits" -lt "$MIN_API_HITS" ]; then
+            echo "::error::Only $api_hits api-nr records visible (expected at least $MIN_API_HITS). Aztec.nr API search is likely broken."
+            exit 1
+          fi
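
One case the smoke check above leaves open: if the search response is a 200 without a `found` field, `jq -r '.found'` prints `null`, and `[ "null" -lt "$MIN_API_HITS" ]` exits with an error status that `if` treats as false, so the guard would pass without counting anything. A minimal sketch of a stricter comparison follows; `require_min_count` is a hypothetical helper and is not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): compare a count against a threshold,
# treating anything that is not a non-negative integer as a failure.
# Usage: require_min_count LABEL VALUE THRESHOLD
require_min_count() {
  local label=$1 value=$2 threshold=$3
  case "$value" in
    '' | *[!0-9]*)
      # Empty, "null", or any string containing a non-digit lands here.
      echo "::error::$label count is not a number (got '$value')."
      return 1
      ;;
  esac
  if [ "$value" -lt "$threshold" ]; then
    echo "::error::Only $value $label records visible (expected at least $threshold)."
    return 1
  fi
  echo "$label records visible: $value (threshold: $threshold)"
}

# In the workflow this could replace the final `if` block, for example:
#   require_min_count "api-nr" "$api_hits" "$MIN_API_HITS" || exit 1
```

The same helper would also cover the `nb_hits` comparison, since the existing `-z` test only catches the empty case.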

docs/typesense.config.json

Lines changed: 1 addition & 75 deletions
@@ -7,11 +7,7 @@
       "page_rank": 2,
       "extra_attributes": {
         "language": "en",
-        "docusaurus_tag": [
-          "docs-participate-current",
-          "docs-root-current",
-          "default"
-        ]
+        "docusaurus_tag": "default"
       }
     },
     {
@@ -63,76 +59,6 @@
       "url",
       "url_without_anchor",
       "type"
-    ],
-    "field_definitions": [
-      { "name": "anchor", "type": "string", "optional": true },
-      { "name": "content", "type": "string", "optional": true },
-      { "name": "url", "type": "string", "facet": true },
-      {
-        "name": "url_without_anchor",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "version",
-        "type": "string[]",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl0",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl1",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl2",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl3",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl4",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl5",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      {
-        "name": "hierarchy.lvl6",
-        "type": "string",
-        "facet": true,
-        "optional": true
-      },
-      { "name": "type", "type": "string", "facet": true, "optional": true },
-      {
-        "name": "docusaurus_tag",
-        "type": "string*",
-        "facet": true,
-        "optional": true
-      },
-      { "name": ".*_tag", "type": "string", "facet": true, "optional": true },
-      { "name": "language", "type": "string", "facet": true, "optional": true },
-      { "name": "tags", "type": "string[]", "facet": true, "optional": true },
-      { "name": "item_priority", "type": "int64" }
     ]
   },
   "conversation_id": ["833762294"],

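Since the failure in #23058 came from an array-valued `docusaurus_tag` being rejected at import time, a pre-flight check on the config could catch a reintroduced array before the scraper runs. The following is a sketch only: `config_has_scalar_tags` is a hypothetical helper, and it assumes `jq` is available (as it is in the workflow) and that tags live under `.start_urls[].extra_attributes.docusaurus_tag`, as in the removed jq mutation.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not in the commit).
# Succeeds when no object entry in .start_urls sets docusaurus_tag to an array.
# Plain-string start_urls entries are skipped via `objects`.
# Usage: config_has_scalar_tags CONFIG_PATH
config_has_scalar_tags() {
  jq -e 'all(.start_urls[] | objects;
             (.extra_attributes.docusaurus_tag | type) != "array")' \
    "$1" >/dev/null
}

# Possible use before `docker run` in the workflow:
#   config_has_scalar_tags docs/typesense.config.json \
#     || { echo "::error::docusaurus_tag must be a scalar string."; exit 1; }
```

This complements the post-index smoke check rather than replacing it: the smoke check confirms what is actually searchable, while this only validates the input.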