| 1 | +--- |
| 2 | +name: data-docs |
| 3 | +description: >- |
| 4 | + Fetch up-to-date, version-aware documentation for data engineering tools. |
| 5 | + Use this skill when writing code that uses dbt, Airflow, Spark, Snowflake, |
| 6 | + BigQuery, Databricks, Kafka, SQLAlchemy, Polars, or Great Expectations. |
| 7 | + Activates for API lookups, configuration questions, code generation, or |
| 8 | + debugging involving these data tools. |
| 9 | +--- |
| 10 | + |
| 11 | +# Data Engineering Documentation Lookup |
| 12 | + |
| 13 | +When writing code or answering questions about data engineering tools, |
| 14 | +use this skill to fetch current, version-specific documentation instead |
| 15 | +of relying on training data. |
| 16 | + |
| 17 | +## When to Use |
| 18 | + |
| 19 | +Activate this skill when the user: |
| 20 | + |
| 21 | +- Writes or modifies dbt models, macros, or configurations |
| 22 | +- Develops Airflow DAGs, operators, or hooks |
| 23 | +- Works with PySpark transformations or Spark SQL |
| 24 | +- Uses Snowflake SQL, Snowpark, or the Snowflake Python connector |
| 25 | +- Uses BigQuery SQL or the Python client library |
| 26 | +- Works with Databricks SDK or notebook code |
| 27 | +- Writes Kafka producer/consumer code |
| 28 | +- Uses SQLAlchemy ORM or Core queries |
| 29 | +- Works with Polars DataFrame operations |
| 30 | +- Sets up Great Expectations data validation |
| 31 | +- Asks "how do I" questions about any data engineering library |
| 32 | +- Needs API references, method signatures, or configuration options |
| 33 | + |
| 34 | +## How to Fetch Documentation |
| 35 | + |
| 36 | +### Step 1: Identify the Library |
| 37 | + |
| 38 | +Check the `references/library-ids.md` file for pre-mapped Context7 library IDs. |
| 39 | +If you find a match, skip to Step 3. |
| 40 | + |
| 41 | +If the library isn't in the reference file, resolve it: |
| 42 | + |
| 43 | +```bash |
| 44 | +npx -y ctx7@latest library <library-name> "<user's question>" |
| 45 | +``` |
| 46 | + |
| 47 | +Pick the result with the closest name match and highest score. |
| 48 | +Note the Library ID (format: `/org/project` or `/org/project/version`). |
| 49 | + |
| 50 | +### Step 2: Check for Project Version |
| 51 | + |
| 52 | +Look for version info in the user's project: |
| 53 | + |
| 54 | +- `requirements.txt` or `pyproject.toml` — Python package versions |
| 55 | +- `dbt_project.yml` — dbt version (`require-dbt-version`) |
| 56 | +- `packages.yml` — dbt package versions |
| 57 | +- `setup.py` or `setup.cfg` — Python package versions |
| 58 | + |
| 59 | +If a specific version is found, prefer version-specific library IDs |
| 60 | +(format: `/org/project/vX.Y.Z`) when available from the resolution step. |
| 61 | + |
| 62 | +### Step 3: Query Documentation |
| 63 | + |
| 64 | +```bash |
| 65 | +npx -y ctx7@latest docs <libraryId> "<specific question>" |
| 66 | +``` |
| 67 | + |
| 68 | +Write **specific, detailed queries** for better results: |
| 69 | +- Good: `"How to create incremental models with merge strategy in dbt"` |
| 70 | +- Bad: `"incremental"` |
| 71 | + |
| 72 | +### Step 4: Use the Documentation |
| 73 | + |
| 74 | +- Answer using the fetched documentation, not training data |
| 75 | +- Include relevant code examples from the docs |
| 76 | +- Cite the library version when relevant |
| 77 | +- If docs mention deprecations or breaking changes, highlight them |
| 78 | + |
| 79 | +## Guidelines |
| 80 | + |
| 81 | +- Make at most 3 CLI calls per user question to avoid rate limits |
| 82 | +- Works without authentication; set `CONTEXT7_API_KEY` env var for higher rate limits |
| 83 | +- If a CLI call fails (network error, rate limit), fall back to training data |
| 84 | +  and tell the user that the docs could not be fetched |
| 85 | +- For dbt: always check `dbt_project.yml` for version and `packages.yml` for packages |
| 86 | +- For Python tools: check `requirements.txt` or `pyproject.toml` for pinned versions |
| 87 | +- When multiple libraries are relevant (e.g., dbt-core + dbt-snowflake), fetch docs |
| 88 | + for the most specific one first |
| 89 | + |
| 90 | +## Usage |
| 91 | + |
| 92 | +- `/data-docs How do I create an incremental model in dbt?` |
| 93 | +- `/data-docs What Airflow operators are available for BigQuery?` |
| 94 | +- `/data-docs How to use window functions in PySpark?` |
| 95 | +- `/data-docs Snowpark DataFrame API for joins` |
| 96 | + |
| 97 | +Run `ctx7` CLI commands with the bash tool. Consult `references/library-ids.md` for |
| 98 | +pre-mapped library IDs to skip the resolution step. |
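
The version check, ID selection, command construction, and call budget described in the file above can be sketched roughly as follows. This is a minimal Python illustration and not part of the skill itself. The helper names (`pinned_version`, `prefer_versioned_id`, `ctx7_command`, `CallBudget`) are hypothetical, the list of candidate IDs is assumed to have been extracted from the resolution output by some other means, and the only CLI shapes used are the two `ctx7` invocations shown in the file. The sketch builds command lines but does not run them.

```python
import re
from typing import List, Optional

MAX_CLI_CALLS = 3  # budget stated in the Guidelines section


def pinned_version(requirements_text: str, package: str) -> Optional[str]:
    """Return the exact '==' pin for `package`, or None if absent or unpinned.

    Hypothetical helper: handles only simple `name==version` lines with
    optional extras; it does not normalize '-' vs '_' in package names.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"(?:\[[^\]]*\])?\s*==\s*([^\s;#]+)",
        re.IGNORECASE,
    )
    for line in requirements_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def prefer_versioned_id(candidate_ids: List[str], version: Optional[str]) -> Optional[str]:
    """Prefer an ID whose last segment equals the version (with or without 'v')."""
    if version:
        for candidate in candidate_ids:
            last_segment = candidate.rsplit("/", 1)[-1]
            if last_segment.lstrip("v") == version:
                return candidate
    # No versioned match: fall back to the first candidate, if any.
    return candidate_ids[0] if candidate_ids else None


def ctx7_command(subcommand: str, target: str, question: str) -> List[str]:
    """Build the argv for one of the two invocations shown in the skill file."""
    if subcommand not in ("library", "docs"):
        raise ValueError(f"unsupported subcommand: {subcommand}")
    return ["npx", "-y", "ctx7@latest", subcommand, target, question]


class CallBudget:
    """Counts CLI calls so a single user question stays within the limit."""

    def __init__(self, limit: int = MAX_CLI_CALLS) -> None:
        self.limit = limit
        self.used = 0

    def try_spend(self) -> bool:
        """Record one call and return True, or return False if the budget is spent."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


if __name__ == "__main__":
    # '/org/project' is the placeholder ID format from the skill file, not a real ID.
    version = pinned_version("dbt-core==1.7.4\n", "dbt-core")
    library_id = prefer_versioned_id(["/org/project", "/org/project/v1.7.4"], version)
    budget = CallBudget()
    if library_id and budget.try_spend():
        print(ctx7_command("docs", library_id, "How to create incremental models in dbt"))
```

Passing the argv list to a subprocess runner instead of printing it would be the natural next step. When `try_spend()` returns `False`, the fallback described in the Guidelines applies: answer from training data and say that the docs could not be fetched.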