fix(types): handle Arrow-style type names from Parquet/managed tables; update docs #12

Merged
eddietejeda merged 2 commits into main from feat/managed-databases-v2
May 24, 2026

@eddietejeda eddietejeda commented May 24, 2026

Contributor

Summary

  • types.py: add _ARROW_TYPE_MAP to handle Arrow-style type names (Date32, Float64, Utf8, LargeBinary, etc.) returned by the information schema for Parquet/managed-table columns. Previously these fell through to PostgresType.from_string() and were silently mapped to String.
  • tests/test_hotdata_types.py: parametrized test covering all Arrow-style names and case-insensitivity.
  • README.md: bump hotdata requirement to ≥0.2.3; document Arrow-style type support in the feature list and the Connect → Types section.
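
The lookup described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the contents of types.py: the name `ARROW_TYPE_MAP`, the helper `lookup_arrow_type`, and the plain-string values (standing in for ibis datatypes so the snippet runs without ibis installed) are all assumptions, and only a handful of the Arrow-style names mentioned in this PR are included.

```python
from typing import Optional

# Hypothetical subset of an Arrow-name map. Keys are lower-cased Arrow-style
# names; values are plain strings standing in for ibis datatypes (assumption).
ARROW_TYPE_MAP = {
    "date32": "Date",
    "float64": "Float64",
    "utf8": "String",
    "largebinary": "Binary",
    "time32": "Time",
    "time64": "Time",
}


def lookup_arrow_type(name: str) -> Optional[str]:
    """Return the mapped type for an Arrow-style name, or None if unknown.

    The name is normalised to lower case first, so "Date32", "DATE32" and
    "date32" all resolve to the same entry.
    """
    return ARROW_TYPE_MAP.get(name.strip().lower())


# A None result signals "not an Arrow-style name"; the caller can then fall
# back to whatever parser it used before (here, the Postgres type parser).
for candidate in ("Date32", "UTF8", "varchar"):
    print(candidate, "->", lookup_arrow_type(candidate))
```

Returning `None` for unknown names, rather than a default type, keeps the fallback decision with the caller, which is one way to avoid the silent String mapping described above.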

Test plan

  • uv run pytest tests/test_hotdata_types.py — new test_dtype_from_hotdata_arrow_type_names cases pass
  • uv run pytest — full suite passes
  • README version badge / requirement line reads hotdata ≥0.2.3

🤖 Generated with Claude Code

Comment thread src/ibis_hotdata/backend.py Outdated
Comment thread src/ibis_hotdata/backend.py
Comment thread src/ibis_hotdata/backend.py Outdated
Comment thread src/ibis_hotdata/backend.py
Comment thread README.md Outdated
claude[bot] previously approved these changes May 24, 2026
eddietejeda and others added 2 commits May 24, 2026 11:39
…; update docs

- types.py: add _ARROW_TYPE_MAP for Arrow-style names (Date32, Float64, Utf8, etc.)
  returned by the information_schema for Parquet/managed table columns
- tests: add parametrized test_dtype_from_hotdata_arrow_type_names covering all
  Arrow-style names and case-insensitivity
- README: update hotdata requirement to >=0.2.3; document Arrow-style type support
  in both the feature list and the Connect → Types section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- backend.py: add _find_managed_connection helper that returns None for not-found
  vs raising IbisError; use it in create_database so real 5xx API failures are no
  longer swallowed by the broad `except IbisError: pass`
- backend.py: always overwrite _database_id in _table_location (drop the `or`) so
  both cached fields stay in sync when multiple managed databases are used
- backend.py: add explicit parens to api_conn ternary in get_schema for clarity
- backend.py: document the database_id parameter in do_connect docstring
- README.md: rewrite as user-facing docs — quick start first, plain language,
  no private method calls in examples, support table replaces spec-style prose

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
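
The "return None for not-found, let real failures raise" pattern from the first bullet of this commit can be sketched as below. This is a hedged illustration only: `ApiError`, `find_managed_connection`, `get_or_create`, and the callable-based interface are hypothetical names chosen for the example and are not the actual backend.py code or the hotdata client API.

```python
from typing import Callable, Iterable, Optional


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"{status}: {message}")
        self.status = status


def find_managed_connection(
    list_connections: Callable[[], Iterable[dict]], name: str
) -> Optional[dict]:
    """Return the connection called `name`, or None when it does not exist.

    Any exception raised by `list_connections` (for example a 5xx response)
    propagates to the caller instead of being treated as "not found".
    """
    for conn in list_connections():
        if conn.get("name") == name:
            return conn
    return None


def get_or_create(
    list_connections: Callable[[], Iterable[dict]],
    create_connection: Callable[[str], dict],
    name: str,
) -> dict:
    """Reuse an existing connection, creating one only when none is found."""
    existing = find_managed_connection(list_connections, name)
    if existing is not None:
        return existing
    return create_connection(name)
```

Compared with wrapping the lookup in a broad `except ...: pass`, the caller here only takes the "create" path on a genuine not-found result, so a server-side failure surfaces as an exception.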
@eddietejeda eddietejeda force-pushed the feat/managed-databases-v2 branch from 6c66835 to 33785bb Compare May 24, 2026 18:39
Comment thread src/ibis_hotdata/types.py
# time
"time32": dt.Time,
"time64": dt.Time,
}

nit: the map handles unsigned Arrow ints but not signed ones (Int8, Int16, Int32, Int64). Because the Postgres dialect treats int8 as an alias for BIGINT and int4 as INTEGER, an Arrow-style Int8 (signed 8-bit) column will silently fall through to PostgresType.from_string("Int8") and resolve to dt.Int64 — an 8× widening rather than a USERDEFINED fallback. Parquet schemas routinely produce Int8/Int16 for small signed ints, so if Hotdata can ever emit those names, consider adding them explicitly:

"int8":  dt.Int8,
"int16": dt.Int16,
"int32": dt.Int32,
"int64": dt.Int64,

(not blocking — only matters if Hotdata actually returns these names; the unsigned counterparts being present suggests it may.)
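
The naming clash behind this nit can be shown with a small self-contained sketch. Everything here is illustrative: the two dictionaries and the `resolve` function are hypothetical stand-ins (plain strings instead of ibis datatypes), not the project's code. The point is that the digit in a Postgres alias counts bytes, while the digit in an Arrow name counts bits.

```python
# Postgres integer aliases: the suffix is a size in BYTES (int8 == BIGINT).
POSTGRES_INT_ALIASES = {"int2": "Int16", "int4": "Int32", "int8": "Int64"}

# Arrow signed integer names: the suffix is a size in BITS.
ARROW_SIGNED_INTS = {
    "int8": "Int8",
    "int16": "Int16",
    "int32": "Int32",
    "int64": "Int64",
}


def resolve(name: str, arrow_map: dict) -> str:
    """Check the Arrow-style map first, then fall back to Postgres aliases."""
    key = name.lower()
    if key in arrow_map:
        return arrow_map[key]
    return POSTGRES_INT_ALIASES.get(key, "Unknown")


# Without signed entries, an Arrow "Int8" falls through and is widened.
print(resolve("Int8", {}))                 # Int64
# With explicit signed entries, the Arrow meaning is used.
print(resolve("Int8", ARROW_SIGNED_INTS))  # Int8
```

One caveat in this sketch: because the lookup is case-insensitive, a genuine Postgres `int8` would also match the Arrow entry and be narrowed to 8 bits. A real implementation would need some way to tell which naming scheme a given column uses before adding these keys.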

@eddietejeda eddietejeda merged commit 81fe4a6 into main May 24, 2026
6 checks passed