@@ -15,6 +15,9 @@ When writing code or answering questions about data engineering tools,
use this skill to fetch current, version-specific documentation instead
of relying on training data.

+ ## Requirements
+ **Tools used:** docs_lookup, glob, read
+
## When to Use

Activate this skill when the user:
@@ -35,108 +38,83 @@ Activate this skill when the user:
- Asks "how do I" questions about any data engineering library or platform
- Needs SQL syntax, API references, method signatures, or configuration options

- ## Documentation Sources
-
- This skill uses **two methods** depending on the type of documentation:
-
- 1. **Context7 CLI** (`ctx7`) — For Python libraries and SDKs (dbt-core, Airflow,
-    PySpark, Snowpark, etc.). These have indexed documentation in Context7.
- 2. **Web Fetch** (`webfetch`) — For database platform SQL documentation (Snowflake SQL,
-    BigQuery SQL, Databricks SQL, DuckDB, PostgreSQL, ClickHouse). These platforms
-    maintain official docs sites that can be fetched directly.
-
- Check `references/library-ids.md` for the full mapping of which method to use.
-
- ## Method 1: Context7 CLI (for Python libraries/SDKs)
+ ## How to Fetch Documentation
5142
- ### Step 1: Identify the Library
-
- Check the `references/library-ids.md` file for pre-mapped Context7 library IDs.
- If you find a match, skip to Step 3.
-
- If the library isn't in the reference file, resolve it:
-
- ```bash
- npx -y ctx7@latest library <library-name> "<user's question>"
- ```
+ ### Step 1: Identify the Tool

- Pick the result with the closest name match and highest score.
- Note the Library ID (format: `/org/project` or `/org/project/version`).
+ Determine which data engineering tool or platform the user is asking about.
+ Check `references/library-ids.md` for the full list of supported tools.
6547
- ### Step 2: Check for Project Version
+ ### Step 2: Check for Project Version (optional)

Look for version info in the user's project:

- `requirements.txt` or `pyproject.toml` — Python package versions
- `dbt_project.yml` — dbt version (`require-dbt-version`)
- `packages.yml` — dbt package versions
- - `setup.py` or `setup.cfg` — Python package versions

- If a specific version is found, prefer version-specific library IDs
- (format: `/org/project/vX.Y.Z`) when available from the resolution step.
+ ### Step 3: Use the `docs_lookup` Tool

- ### Step 3: Query Documentation
+ Call the `docs_lookup` tool with the tool name and a specific query:

- ```bash
- npx -y ctx7@latest docs <libraryId> "<specific question>"
+ ```
+ docs_lookup(tool="dbt-core", query="how to create incremental models with merge strategy")
+ docs_lookup(tool="snowflake", query="MERGE statement syntax and examples")
+ docs_lookup(tool="duckdb", query="window functions syntax")
+ docs_lookup(tool="postgresql", query="JSONB operators and functions")
+ docs_lookup(tool="clickhouse", query="MergeTree engine settings")
```

- Write **specific, detailed queries** for better results:
- - Good: `"How to create incremental models with merge strategy in dbt"`
- - Bad: `"incremental"`
-
- ## Method 2: Web Fetch (for database platform SQL docs)
-
- For Snowflake, BigQuery, Databricks, DuckDB, PostgreSQL, and ClickHouse
- platform documentation (SQL syntax, functions, DDL, configuration), use
- the `webfetch` tool to fetch specific documentation pages.
-
- ### Step 1: Find the Right URL
-
- Check `references/library-ids.md` for the **Platform Documentation URLs**
- section. Each platform has a base URL and common page paths listed.
-
- ### Step 2: Fetch the Documentation
+ The tool automatically selects the best method:
+ - **Context7 (ctx7)** for Python libraries/SDKs — indexed, searchable docs
+ - **Web fetch** for database platforms — fetches from official documentation sites

- Use the `webfetch` tool with the specific documentation URL and a prompt
- describing what information to extract:
+ For platform docs with a **specific page URL** (see `references/library-ids.md`),
+ pass it via the `url` parameter for better results:

```
- webfetch(url="https://docs.snowflake.com/en/sql-reference/sql/merge",
-          prompt="Extract the full MERGE syntax, parameters, and examples")
+ docs_lookup(tool="snowflake", query="MERGE syntax", url="https://docs.snowflake.com/en/sql-reference/sql/merge")
+ docs_lookup(tool="postgresql", query="JSON functions", url="https://www.postgresql.org/docs/current/functions-json.html")
```

- ### Step 3: Use the Documentation
+ ### Step 4: Use the Documentation

- Answer using the fetched documentation, not training data
- Include relevant code examples from the docs
- - Cite the documentation URL for reference
+ - Cite the library version or documentation URL when relevant
- If docs mention deprecations or breaking changes, highlight them

+ ## Supported Tools
+
+ **Libraries/SDKs (via Context7):** dbt-core, airflow, pyspark, snowflake-connector-python,
+ snowpark-python, google-cloud-bigquery, databricks-sdk, duckdb, psycopg2, psycopg,
+ clickhouse-connect, confluent-kafka, sqlalchemy, polars, pandas, great-expectations,
+ dbt-utils, dbt-expectations, dbt-snowflake, dbt-bigquery, dbt-databricks, dbt-postgres,
+ dbt-redshift, dbt-spark, dbt-duckdb, dbt-clickhouse, elementary
+
+ **Platforms (via web fetch):** snowflake, databricks, duckdb, postgresql, clickhouse, bigquery
+
## Guidelines

- - Maximum 3 CLI/webfetch calls per user question to avoid rate limits
- - Context7 works without authentication; set `CONTEXT7_API_KEY` for higher limits
- - If a call fails (network error, rate limit), fall back to training data
-   and note that the docs could not be fetched
+ - Maximum 3 `docs_lookup` calls per user question to avoid rate limits
+ - If a call fails, the tool logs the failure automatically for improvement tracking
+ - On failure, fall back to training data and note that docs could not be fetched
- For dbt: always check `dbt_project.yml` for version and `packages.yml` for packages
- For Python tools: check `requirements.txt` or `pyproject.toml` for pinned versions
- When multiple libraries are relevant (e.g., dbt-core + dbt-snowflake), fetch docs
  for the most specific one first
- - For SQL platform docs, prefer the most specific page URL (e.g., the MERGE
-   statement page, not the general SQL reference index)
+ - For SQL platform docs, pass a specific page URL via the `url` parameter for best results
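
The call budget and fallback guideline can be sketched as follows. `lookup_with_fallback` is a hypothetical wrapper written for illustration; `docs_lookup` is passed in as a stand-in callable, since the real tool is invoked directly rather than through Python:

```python
MAX_CALLS = 3  # per-question budget from the guideline above

def lookup_with_fallback(queries, docs_lookup):
    """Try up to MAX_CALLS lookups; collect results, note any failures."""
    results, failures = [], []
    for tool, query in queries[:MAX_CALLS]:  # enforce the 3-call budget
        try:
            results.append(docs_lookup(tool=tool, query=query))
        except Exception as exc:             # network error, rate limit, etc.
            failures.append(f"{tool}: {exc}")
    if not results:
        # Nothing fetched: answer from training data and say the docs were unavailable.
        return "Docs could not be fetched (" + "; ".join(failures) + "); falling back to training data."
    return results
```
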

## Usage

- `/data-docs How do I create an incremental model in dbt?`
- `/data-docs What Airflow operators are available for BigQuery?`
- `/data-docs How to use window functions in PySpark?`
- - `/data-docs Snowpark DataFrame API for joins`
- `/data-docs Snowflake MERGE statement syntax`
- `/data-docs DuckDB window functions`
- `/data-docs PostgreSQL JSONB operators`
- `/data-docs ClickHouse MergeTree engine settings`

- Use the bash tool to run `ctx7` CLI commands for libraries, and the `webfetch`
- tool for platform SQL documentation. Reference `library-ids.md` for the full
- mapping of tools, IDs, and URLs.
+ Use the `docs_lookup` tool for all documentation lookups. It handles method selection,
+ telemetry, and failure logging automatically. Reference `library-ids.md` for the full
+ mapping of tools, IDs, and documentation URLs.