Commit 41dd842

Merge branch 'master' into docs/federated-passthrough-queries
2 parents 51084ed + f41404c commit 41dd842

19 files changed

Lines changed: 345 additions & 101 deletions

.github/workflows/docs-lint.yaml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+name: Docs Lint
+
+on:
+  pull_request:
+    paths:
+      - 'docs/**'
+      - '**.md'
+      - '.markdownlint-cli2.jsonc'
+      - '.mise.toml'
+      - 'Makefile'
+      - '.github/workflows/docs-lint.yaml'
+
+permissions:
+  contents: read
+
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: jdx/mise-action@1648a7812b9aeae629881980618f079932869151 # v4.0.1
+      - run: make docs/lint
+
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          fetch-depth: 0 # Fetch all history for sphinx-multiversion
+      - uses: jdx/mise-action@1648a7812b9aeae629881980618f079932869151 # v4.0.1
+      - uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0
+        with:
+          enable-cache: true
+      - run: |
+          uv sync --group dev
+          make docs/build

.github/workflows/docs.yaml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ jobs:
           enable-cache: true
       - run: |
           uv sync --group dev
-          make docs
+          make docs/build
       - name: Upload artifact
         uses: actions/upload-pages-artifact@7b1f4a764d45c48632c6b24a0339c27f5614fb0b # v4.0.0
         with:

.github/workflows/test.yaml

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,9 @@ name: Test
 
 on:
   pull_request:
+    paths-ignore:
+      - 'docs/**'
+      - '**.md'
   schedule:
     - cron: '0 0 * * 0'
 
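Read together with the `paths` filter in `docs-lint.yaml`, this `paths-ignore` block means a docs-only pull request runs Docs Lint and skips Test. The routing can be sketched roughly in Python. This assumes `fnmatch` as a stand-in for GitHub's glob matcher, which treats `*` and `**` differently around `/`, and the function names are hypothetical:

```python
from fnmatch import fnmatch

# Patterns copied from the two workflow files in this commit.
DOCS_LINT_PATHS = [
    "docs/**", "**.md", ".markdownlint-cli2.jsonc",
    ".mise.toml", "Makefile", ".github/workflows/docs-lint.yaml",
]
TEST_PATHS_IGNORE = ["docs/**", "**.md"]


def matches_any(path, patterns):
    # fnmatch is only an approximation of GitHub's filter syntax.
    return any(fnmatch(path, pattern) for pattern in patterns)


def runs_docs_lint(changed_files):
    # `paths`: run when at least one changed file matches a pattern.
    return any(matches_any(f, DOCS_LINT_PATHS) for f in changed_files)


def runs_test(changed_files):
    # `paths-ignore`: skip only when every changed file matches a pattern.
    return not all(matches_any(f, TEST_PATHS_IGNORE) for f in changed_files)
```

A pull request that touches both `README.md` and a source file still runs Test, because `paths-ignore` skips the workflow only when all changed files are ignored. The filter sits under `pull_request`, so the weekly `schedule` trigger is unaffected.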

.markdownlint-cli2.jsonc

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+{
+  // https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md
+  "config": {
+    "default": true,
+    // Docs have long URLs and code lines that would be awkward to wrap
+    "MD013": false,
+    // Match existing style: dash bullets, 2-space nested indent
+    "MD004": { "style": "dash" },
+    "MD007": { "indent": 2 },
+    // Allow inline HTML used by README (centered images) and MyST admonitions
+    "MD033": {
+      "allowed_elements": ["details", "summary", "br", "kbd", "sub", "sup", "div", "img", "p", "a"]
+    },
+    // Cursor docs intentionally repeat subsection names ("Basic usage") under different cursors
+    "MD024": { "siblings_only": true },
+    // `$ command` style is intentional in docs/testing.md and README shell snippets
+    "MD014": false,
+    // MyST `(label)=` ref targets must come before the first heading, so the
+    // first line of many docs/*.md files is not an H1
+    "MD041": false
+  },
+  "globs": [
+    "docs/**/*.md",
+    "*.md"
+  ],
+  "ignores": [
+    "docs/_build/**",
+    "node_modules/**",
+    ".venv/**",
+    ".tox/**",
+    ".pytest_cache/**",
+    ".serena/**"
+  ]
+}
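With `"default": true`, every markdownlint rule is enabled unless the config overrides it, so the effective policy is the short list of rules that are switched off or tuned. A small Python sketch of reading that list from an abridged copy of the file (`MD033` and the `ignores` array are omitted for brevity). The comment stripper is a simplification that handles only full-line `//` comments, which is the only comment style this file uses; it is not how `markdownlint-cli2` itself parses JSONC:

```python
import json

# Abridged copy of the config added in this commit.
CONFIG_TEXT = """
{
  // https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md
  "config": {
    "default": true,
    // Docs have long URLs and code lines that would be awkward to wrap
    "MD013": false,
    "MD004": { "style": "dash" },
    "MD007": { "indent": 2 },
    "MD024": { "siblings_only": true },
    "MD014": false,
    "MD041": false
  },
  "globs": ["docs/**/*.md", "*.md"]
}
"""


def load_jsonc(text):
    # Drop lines that consist only of a // comment, then parse as JSON.
    kept = [line for line in text.splitlines() if not line.lstrip().startswith("//")]
    return json.loads("\n".join(kept))


def split_rules(config):
    # Separate rules that are switched off from rules given custom options.
    rules = {name: value for name, value in config.items() if name != "default"}
    disabled = sorted(name for name, value in rules.items() if value is False)
    customized = sorted(name for name, value in rules.items() if isinstance(value, dict))
    return disabled, customized


disabled, customized = split_rules(load_jsonc(CONFIG_TEXT)["config"])
```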

.mise.toml

Lines changed: 1 addition & 0 deletions
@@ -1,2 +1,3 @@
 [tools]
 python = "3.12"
+"npm:markdownlint-cli2" = "0.18.1"

CLAUDE.md

Lines changed: 32 additions & 2 deletions
@@ -1,39 +1,47 @@
 # PyAthena Development Guide for AI Assistants
 
 ## Project Overview
+
 PyAthena is a Python DB API 2.0 (PEP 249) compliant client for Amazon Athena. See `pyproject.toml` for Python version support and dependencies.
 
 ## Rules and Constraints
 
 ### Git Workflow
+
 - **NEVER** commit directly to `master` — always create a feature branch and PR
 - Create PRs as drafts: `gh pr create --draft`
 
 ### Import Rules
+
 - **NEVER** use runtime imports (inside functions, methods, or conditional blocks)
 - All imports must be at the top of the file, after the license header
 - Exception: the existing codebase uses runtime imports for optional dependencies (`pyarrow`, `pandas`, etc.) in source code. For new code, use `TYPE_CHECKING` instead when possible
 
 ### Code Quality — Always Run Before Committing
+
 ```bash
 make format # Auto-fix formatting and imports
 make lint # Lint + format check + mypy
 ```
 
 ### Testing
+
 ```bash
 # ALWAYS run `make lint` first — tests will fail if lint doesn't pass
-make test # Unit tests (runs chk first)
-make test-sqla # SQLAlchemy dialect tests
+make test/pyathena # Unit tests (runs lint first)
+make test/sqla # SQLAlchemy dialect tests
+make test/sqla-async # SQLAlchemy async dialect tests
 ```
 
 Tests require AWS environment variables. Use a `.env` file (gitignored):
+
 ```bash
 AWS_DEFAULT_REGION=<region>
 AWS_ATHENA_S3_STAGING_DIR=s3://<bucket>/<path>/
 AWS_ATHENA_WORKGROUP=<workgroup>
 AWS_ATHENA_SPARK_WORKGROUP=<spark-workgroup>
 ```
+
 ```bash
 export $(cat .env | xargs) && uv run pytest tests/pyathena/test_file.py -v
 ```
@@ -43,36 +51,58 @@ export $(cat .env | xargs) && uv run pytest tests/pyathena/test_file.py -v
 - New features require tests; changes to SQLAlchemy dialects must pass `make test-sqla`
 
 #### Test Conventions
+
 - **Class-based tests** for integration tests that use fixtures (cursors, engines): `class TestCursor:` with methods like `def test_fetchone(self, cursor):`
 - **Standalone functions** for unit tests of pure logic (converters, parsers, utils): `def test_to_struct_json_formats(input_value, expected):`
 - Test file naming mirrors source: `pyathena/parser.py` → `tests/pyathena/test_parser.py`
 - **Fixtures**: Cursor/engine fixtures are defined in `conftest.py` and injected by name (e.g., `cursor`, `engine`, `async_cursor`). Use `indirect=True` parametrization to pass connection options:
+
 ```python
 @pytest.mark.parametrize("engine", [{"driver": "rest"}], indirect=True)
 def test_query(self, engine):
     engine, conn = engine
 ```
+
 - **Parametrize** with `@pytest.mark.parametrize(("input", "expected"), [...])` for data-driven tests
 - **Integration tests** (need AWS) use cursor/engine fixtures with real Athena queries; **unit tests** (no AWS) call functions directly with test data
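The file-naming rule in the conventions above is mechanical enough to sketch as a helper. `expected_test_path` is a hypothetical name, not part of the repository, and applying the same mapping to nested packages is an assumption extrapolated from the single `parser.py` example:

```python
from pathlib import PurePosixPath


def expected_test_path(source_path):
    """Map a source module to the test file that should mirror it."""
    source = PurePosixPath(source_path)
    # Keep the package directories and prefix the module name with `test_`.
    return str(PurePosixPath("tests") / source.parent / f"test_{source.name}")
```

For the example given in the guide, `expected_test_path("pyathena/parser.py")` returns `tests/pyathena/test_parser.py`.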
 
+### Markdown Lint
+
+`docs/**/*.md` and project-root `*.md` files are linted with [markdownlint-cli2](https://github.com/DavidAnson/markdownlint-cli2). The config lives at `.markdownlint-cli2.jsonc`. CI runs lint + Sphinx build on PRs that touch docs (`.github/workflows/docs-lint.yaml`).
+
+`markdownlint-cli2` is pinned in `.mise.toml`, so [`mise`](https://mise.jdx.dev/) installs the exact version used in CI. Run locally:
+
+```bash
+mise install # one-time: installs markdownlint-cli2
+make docs/lint # check
+make docs/format # auto-fix what's possible
+make docs/build # build the Sphinx site under docs/_build/html
+```
+
 ## Architecture — Key Design Decisions
 
 These are non-obvious conventions that can't be discovered by reading code alone.
 
 ### PEP 249 Compliance
+
 All cursor types must implement: `execute()`, `fetchone()`, `fetchmany()`, `fetchall()`, `close()`. New cursor features must follow the DB API 2.0 specification.
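The five required methods can be checked structurally. The sketch below is illustrative only: `missing_cursor_methods` and `DemoCursor` are hypothetical names, and `DemoCursor` is an in-memory stand-in, not one of PyAthena's cursor classes:

```python
REQUIRED_CURSOR_METHODS = ("execute", "fetchone", "fetchmany", "fetchall", "close")


def missing_cursor_methods(cursor_cls):
    """Return the required DB API 2.0 methods that the class lacks."""
    return [
        name for name in REQUIRED_CURSOR_METHODS
        if not callable(getattr(cursor_cls, name, None))
    ]


class DemoCursor:
    """Hypothetical in-memory cursor used only to exercise the check."""

    def __init__(self):
        self._rows = []

    def execute(self, operation, parameters=None):
        # Pretend every query returns the same two rows.
        self._rows = [(1,), (2,)]
        return self

    def fetchone(self):
        # PEP 249: return None when no more rows are available.
        return self._rows.pop(0) if self._rows else None

    def fetchmany(self, size=1):
        rows, self._rows = self._rows[:size], self._rows[size:]
        return rows

    def fetchall(self):
        rows, self._rows = self._rows, []
        return rows

    def close(self):
        self._rows = []
```

A check like this only confirms that the methods exist. Whether their behavior follows the specification still needs the integration tests described earlier in the guide.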
 
 ### Cursor Module Pattern
+
 Each cursor type lives in its own subpackage (`pandas/`, `arrow/`, `polars/`, `s3fs/`, `spark/`) with a consistent structure: `cursor.py`, `async_cursor.py`, `converter.py`, `result_set.py`. When adding features, consider impact on all cursor types.
 
 ### Filesystem (fsspec) Compatibility
+
 `pyathena/filesystem/s3.py` implements fsspec's `AbstractFileSystem`. When modifying:
+
 - Match `s3fs` library behavior where possible (users migrate from it)
 - Use `delimiter="/"` in S3 API calls to minimize requests
 - Handle edge cases: empty paths, trailing slashes, bucket-only paths
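The three edge cases named in the last bullet can be made concrete with a small path splitter. This is a sketch under assumptions, not the code in `pyathena/filesystem/s3.py`: `split_s3_path` is a hypothetical helper, and normalizing trailing slashes away is one possible design choice rather than confirmed library behavior:

```python
def split_s3_path(path):
    """Split an S3 path into (bucket, key), tolerating common edge cases."""
    # Drop an optional scheme and any surrounding slashes.
    if path.startswith("s3://"):
        path = path[len("s3://"):]
    path = path.strip("/")
    if not path:
        return "", ""  # empty path
    bucket, _, key = path.partition("/")
    return bucket, key  # key is "" for a bucket-only path
```

Tests for changes in this area would exercise at least these inputs: `""`, `"bucket"`, `"s3://bucket/"`, and `"s3://bucket/a/b/"`.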
 
 ### Version Management
+
 Versions are derived from git tags via `hatch-vcs` — never edit `pyathena/_version.py` manually.
 
 ### Google-style Docstrings
+
 Use Google-style docstrings for public methods. See existing code for examples.

Makefile

Lines changed: 16 additions & 8 deletions
@@ -13,29 +13,37 @@ lint:
 	uvx ruff@$(RUFF_VERSION) format --check .
 	uv run mypy .
 
-.PHONY: test
-test: lint
+.PHONY: test/pyathena
+test/pyathena: lint
 	uv run pytest -n 8 --cov pyathena --cov-report html --cov-report term tests/pyathena/
 
-.PHONY: test-sqla
-test-sqla:
+.PHONY: test/sqla
+test/sqla:
 	uv run pytest -n 8 --cov pyathena --cov-report html --cov-report term tests/sqlalchemy/
 
-.PHONY: test-sqla-async
-test-sqla-async:
+.PHONY: test/sqla-async
+test/sqla-async:
 	uv run pytest -n 8 --cov pyathena --cov-report html --cov-report term tests/sqlalchemy/ --dburi async
 
 .PHONY: tox
 tox:
 	uvx tox@$(TOX_VERSION) -c pyproject.toml run
 
-.PHONY: docs
-docs:
+.PHONY: docs/build
+docs/build:
 	uv run sphinx-multiversion docs docs/_build/html
 	echo '<meta http-equiv="refresh" content="0; url=./master/index.html">' > docs/_build/html/index.html
 	echo 'pyathena.dev' > docs/_build/html/CNAME
 	touch docs/_build/html/.nojekyll
 
+.PHONY: docs/lint
+docs/lint:
+	mise exec -- markdownlint-cli2
+
+.PHONY: docs/format
+docs/format:
+	mise exec -- markdownlint-cli2 --fix
+
 .PHONY: tool
 tool:
 	uv tool install ruff@$(RUFF_VERSION)
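This hunk renames four targets: `test`, `test-sqla`, `test-sqla-async`, and `docs`. Old names can linger in prose: an unchanged context line in the `CLAUDE.md` diff above still says `make test-sqla`. A hedged Python sketch of a helper that rewrites such invocations, where the function name and the regex approach are illustrative and not part of the repository:

```python
import re

# Old target -> new target, as renamed in this commit's Makefile.
RENAMED_TARGETS = {
    "test": "test/pyathena",
    "test-sqla": "test/sqla",
    "test-sqla-async": "test/sqla-async",
    "docs": "docs/build",
}

# Longest names first; the lookahead stops `test` from matching inside
# `test-sqla` or an already-migrated `test/pyathena`.
_ALTERNATIVES = "|".join(
    re.escape(name) for name in sorted(RENAMED_TARGETS, key=len, reverse=True)
)
_PATTERN = re.compile(r"\bmake (" + _ALTERNATIVES + r")(?![\w/-])")


def migrate_make_commands(text):
    """Rewrite `make <old-target>` invocations to the new target names."""
    return _PATTERN.sub(lambda m: "make " + RENAMED_TARGETS[m.group(1)], text)
```

Running the helper twice is safe, since the new names contain a `/` that the lookahead rejects.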

README.md

Lines changed: 5 additions & 5 deletions
@@ -21,7 +21,7 @@ PyAthena is a Python [DB API 2.0 (PEP 249)](https://www.python.org/dev/peps/pep-
 
 ## Requirements
 
-* Python
+- Python
 
   - CPython 3.10, 3.11, 3.12, 3.13, 3.14
 
@@ -77,10 +77,10 @@ Many of the implementations in this library are based on [PyHive](https://github
 
 ## Links
 
-- Documentation: https://pyathena.dev/
-- PyPI Releases: https://pypi.org/project/PyAthena/
-- Source Code: https://github.com/pyathena-dev/PyAthena/
-- Issue Tracker: https://github.com/pyathena-dev/PyAthena/issues
+- Documentation: <https://pyathena.dev/>
+- PyPI Releases: <https://pypi.org/project/PyAthena/>
+- Source Code: <https://github.com/pyathena-dev/PyAthena/>
+- Issue Tracker: <https://github.com/pyathena-dev/PyAthena/issues>
 
 ## Logo
 

docs/cursor.md

Lines changed: 0 additions & 1 deletion
@@ -293,7 +293,6 @@ cursor = connect(s3_staging_dir="s3://YOUR_S3_BUCKET/path/to/",
                  region_name="us-west-2").cursor(cursor=AsyncDictCursor, dict_type=OrderedDict)
 ```
 
-
 ## AioCursor
 
 See {ref}`aio-cursor`.

docs/introduction.md

Lines changed: 17 additions & 14 deletions
@@ -6,7 +6,7 @@
 
 ## Requirements
 
-* Python
+- Python
 
   - CPython 3.10, 3.11, 3.12, 3.13, 3.14
 
@@ -35,23 +35,26 @@ Extra packages:
 PyAthena provides comprehensive support for Amazon Athena's data types and features:
 
 **Core Features:**
-- **DB API 2.0 Compliance**: Full PEP 249 compatibility for database operations
-- **SQLAlchemy Integration**: Native dialect support with table reflection and ORM capabilities
-- **Multiple Cursor Types**: Standard, Pandas, Arrow, Polars, S3FS and Spark cursor implementations
-- **Async Support**: Asynchronous query execution for non-blocking operations
+
+- **DB API 2.0 Compliance**: Full PEP 249 compatibility for database operations
+- **SQLAlchemy Integration**: Native dialect support with table reflection and ORM capabilities
+- **Multiple Cursor Types**: Standard, Pandas, Arrow, Polars, S3FS and Spark cursor implementations
+- **Async Support**: Asynchronous query execution for non-blocking operations
 
 **Data Type Support:**
-- **STRUCT/ROW Types**: {ref}`Complete support <sqlalchemy>` for complex nested data structures
-- **ARRAY Types**: {ref}`Complete support <sqlalchemy>` for ordered collections with automatic Python list conversion
-- **MAP Types**: {ref}`Complete support <sqlalchemy>` for key-value dictionary-like data structures
-- **JSON Integration**: Seamless JSON data parsing and conversion
-- **Performance Optimized**: Smart format detection for efficient data processing
+
+- **STRUCT/ROW Types**: {ref}`Complete support <sqlalchemy>` for complex nested data structures
+- **ARRAY Types**: {ref}`Complete support <sqlalchemy>` for ordered collections with automatic Python list conversion
+- **MAP Types**: {ref}`Complete support <sqlalchemy>` for key-value dictionary-like data structures
+- **JSON Integration**: Seamless JSON data parsing and conversion
+- **Performance Optimized**: Smart format detection for efficient data processing
 
 **Additional Features:**
-- **Connection Management**: Efficient connection pooling and configuration
-- **Result Caching**: Athena query result reuse capabilities
-- **Error Handling**: Comprehensive exception handling and recovery
-- **S3 Integration**: Direct S3 data access and staging support
+
+- **Connection Management**: Efficient connection pooling and configuration
+- **Result Caching**: Athena query result reuse capabilities
+- **Error Handling**: Comprehensive exception handling and recovery
+- **S3 Integration**: Direct S3 data access and staging support
 
 (license)=
 
