PyAthena is a Python DB API 2.0 (PEP 249) compliant client for Amazon Athena. See pyproject.toml for Python version support and dependencies.
- NEVER commit directly to `master`; always create a feature branch and PR
- Create PRs as drafts: `gh pr create --draft`
- NEVER use runtime imports (inside functions, methods, or conditional blocks)
- All imports must be at the top of the file, after the license header
- Exception: the existing codebase uses runtime imports for optional dependencies (`pyarrow`, `pandas`, etc.) in source code. For new code, use `TYPE_CHECKING` instead when possible
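
The `TYPE_CHECKING` pattern mentioned above can be sketched as follows. This is a minimal illustration, not code from the repository: `row_count` is a hypothetical helper, and the quoted annotation is what keeps `pandas` optional at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by type checkers such as mypy; skipped at runtime,
    # so pandas does not need to be installed to import this module.
    import pandas as pd


def row_count(df: "pd.DataFrame") -> int:
    """Hypothetical helper: return the number of rows in a DataFrame."""
    # The quoted annotation is never evaluated at runtime.
    return len(df)
```

The import still sits at the top of the file, so the "no runtime imports" rule holds while the optional dependency stays optional.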
make format # Auto-fix formatting and imports
make lint # Lint + format check + mypy
# ALWAYS run `make lint` first; tests will fail if lint doesn't pass
make test # Unit tests (runs chk first)
make test-sqla # SQLAlchemy dialect tests

Tests require AWS environment variables. Use a .env file (gitignored):
AWS_DEFAULT_REGION=<region>
AWS_ATHENA_S3_STAGING_DIR=s3://<bucket>/<path>/
AWS_ATHENA_WORKGROUP=<workgroup>
AWS_ATHENA_SPARK_WORKGROUP=<spark-workgroup>

export $(cat .env | xargs) && uv run pytest tests/pyathena/test_file.py -v

- Tests mirror source structure under `tests/pyathena/`
- Use pytest fixtures from `conftest.py`
- New features require tests; changes to SQLAlchemy dialects must pass `make test-sqla`
- Class-based tests for integration tests that use fixtures (cursors, engines): `class TestCursor:` with methods like `def test_fetchone(self, cursor):`
- Standalone functions for unit tests of pure logic (converters, parsers, utils): `def test_to_struct_json_formats(input_value, expected):`
- Test file naming mirrors source: `pyathena/parser.py` → `tests/pyathena/test_parser.py`
- Fixtures: Cursor/engine fixtures are defined in `conftest.py` and injected by name (e.g., `cursor`, `engine`, `async_cursor`). Use `indirect=True` parametrization to pass connection options:

      @pytest.mark.parametrize("engine", [{"driver": "rest"}], indirect=True)
      def test_query(self, engine):
          engine, conn = engine

- Parametrize with `@pytest.mark.parametrize(("input", "expected"), [...])` for data-driven tests
- Integration tests (need AWS) use cursor/engine fixtures with real Athena queries; unit tests (no AWS) call functions directly with test data
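
The two test styles above can be sketched side by side. Everything here is a stand-in so the example runs without AWS: `FakeEngine`, the local `engine` fixture, and `double` are hypothetical, and the real fixtures in `conftest.py` differ (for instance, the real `engine` fixture provides an engine and a connection).

```python
import pytest


class FakeEngine:
    """Hypothetical stand-in for a real engine; needs no AWS access."""

    def __init__(self, driver="rest"):
        self.driver = driver


@pytest.fixture
def engine(request):
    # With indirect=True, each parametrize value arrives here as request.param.
    options = getattr(request, "param", {})
    return FakeEngine(**options)


class TestEngine:
    # Class-based style: the test depends on a fixture.
    @pytest.mark.parametrize("engine", [{"driver": "rest"}], indirect=True)
    def test_driver(self, engine):
        assert engine.driver == "rest"


def double(value):
    """Hypothetical pure function under test."""
    return value * 2


# Standalone style: pure logic, no fixtures, data-driven cases.
@pytest.mark.parametrize(("input_value", "expected"), [(1, 2), ("ab", "abab")])
def test_double(input_value, expected):
    assert double(input_value) == expected
```

With `indirect=True`, the dictionary is handed to the fixture rather than to the test, which is how connection options reach the fixture before the test body runs.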
These are non-obvious conventions that can't be discovered by reading code alone.
All cursor types must implement: execute(), fetchone(), fetchmany(), fetchall(), close(). New cursor features must follow the DB API 2.0 specification.
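As a rough illustration of the required surface, here is a toy cursor that serves preloaded rows. It is a sketch under stated assumptions, not PyAthena's implementation: `InMemoryCursor` and its constructor are hypothetical, and returning `self` from `execute()` is a choice of this sketch, since PEP 249 leaves that return value undefined.

```python
from typing import Any, List, Optional, Sequence, Tuple

Row = Tuple[Any, ...]


class ProgrammingError(Exception):
    """Local stand-in for the PEP 249 exception of the same name."""


class InMemoryCursor:
    """Toy cursor that serves preloaded rows instead of querying Athena."""

    arraysize = 1  # PEP 249 default batch size for fetchmany()

    def __init__(self, rows: Sequence[Row]) -> None:
        self._source = list(rows)
        self._rows: List[Row] = []
        self._pos = 0
        self._closed = False

    def _check_open(self) -> None:
        if self._closed:
            raise ProgrammingError("cursor is closed")

    def execute(self, operation: str, parameters: Optional[Any] = None) -> "InMemoryCursor":
        # A real cursor would submit `operation`; this sketch ignores it
        # and simply resets the result set.
        self._check_open()
        self._rows = list(self._source)
        self._pos = 0
        return self

    def fetchone(self) -> Optional[Row]:
        self._check_open()
        if self._pos >= len(self._rows):
            return None  # result set exhausted
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def fetchmany(self, size: Optional[int] = None) -> List[Row]:
        self._check_open()
        count = self.arraysize if size is None else size
        batch = self._rows[self._pos:self._pos + count]
        self._pos += len(batch)
        return batch

    def fetchall(self) -> List[Row]:
        self._check_open()
        batch = self._rows[self._pos:]
        self._pos = len(self._rows)
        return batch

    def close(self) -> None:
        self._closed = True
```

All three fetch methods share one position counter, so mixing `fetchone()`, `fetchmany()`, and `fetchall()` never returns a row twice.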
Each cursor type lives in its own subpackage (pandas/, arrow/, polars/, s3fs/, spark/) with a consistent structure: cursor.py, async_cursor.py, converter.py, result_set.py. When adding features, consider impact on all cursor types.
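One way to keep the layout consistent is a small check over the list of files. The sketch below takes the stated structure at face value; `missing_modules` is a hypothetical helper, not something the repository provides, and it works on path strings so it can run without a checkout.

```python
from typing import Dict, Iterable, List

CURSOR_SUBPACKAGES = ("pandas", "arrow", "polars", "s3fs", "spark")
EXPECTED_MODULES = ("cursor.py", "async_cursor.py", "converter.py", "result_set.py")


def missing_modules(existing_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each cursor subpackage to the expected modules it lacks."""
    present = set(existing_paths)
    missing: Dict[str, List[str]] = {}
    for package in CURSOR_SUBPACKAGES:
        absent = [
            module
            for module in EXPECTED_MODULES
            if f"pyathena/{package}/{module}" not in present
        ]
        if absent:
            missing[package] = absent
    return missing
```

An empty result means every subpackage carries all four modules; any entry points at the cursor type that a new feature may have skipped.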
pyathena/filesystem/s3.py implements fsspec's AbstractFileSystem. When modifying:
- Match `s3fs` library behavior where possible (users migrate from it)
- Use `delimiter="/"` in S3 API calls to minimize requests
- Handle edge cases: empty paths, trailing slashes, bucket-only paths
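
The edge cases listed above can be made concrete with a small path splitter. This is a hypothetical sketch, not the code in `pyathena/filesystem/s3.py`: `split_s3_path` is an illustrative name, and stripping trailing slashes from the key is an assumption about the desired behavior.

```python
from typing import Tuple


def split_s3_path(path: str) -> Tuple[str, str]:
    """Split an S3 path into (bucket, key), tolerating the common edge cases."""
    for prefix in ("s3://", "s3a://"):
        if path.startswith(prefix):
            path = path[len(prefix):]
            break
    # Leading and trailing slashes carry no information here.
    path = path.strip("/")
    if not path:
        return "", ""  # empty path
    bucket, _, key = path.partition("/")
    return bucket, key  # key is "" for bucket-only paths
```

For example, `split_s3_path("s3://bucket/a/b/")` and `split_s3_path("bucket/a/b")` both yield `("bucket", "a/b")`, so callers do not need to special-case the trailing slash.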
Versions are derived from git tags via hatch-vcs — never edit pyathena/_version.py manually.
Use Google-style docstrings for public methods. See existing code for examples.
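For reference, a Google-style docstring has `Args`, `Returns`, and `Raises` sections laid out as in the sketch below. The `chunked` function is a hypothetical example chosen only to show the format; the existing code remains the authority on project-specific details.

```python
from typing import Any, List


def chunked(items: List[Any], size: int) -> List[List[Any]]:
    """Split a list into consecutive chunks.

    Args:
        items: The list to split.
        size: Maximum number of elements per chunk. Must be positive.

    Returns:
        A list of chunks in the original order. The last chunk may be
        shorter than ``size``.

    Raises:
        ValueError: If ``size`` is not positive.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]
```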