Commit 9421820

Address review gaps: docs, tests, and docstring for DIGEST-MD5 auth

- Document that hive.kerberos-service-name applies to both KERBEROS and DIGEST-MD5
- Add precedence note for hive.metastore.authentication vs legacy boolean
- Add test for empty-string auth mechanism raising HiveAuthError
- Add integration test for KERBEROS via hive.metastore.authentication config
- Expand HiveAuthError docstring to cover token file errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 parent da1e562 commit 9421820

3 files changed: +23 -2 lines changed

mkdocs/docs/configuration.md

Lines changed: 3 additions & 1 deletion
@@ -683,12 +683,14 @@ catalog:
 |------------------------------| ------- | ------------------------------------ |
 | hive.hive2-compatible | true | Using Hive 2.x compatibility mode |
 | hive.kerberos-authentication | true | Using authentication via Kerberos |
-| hive.kerberos-service-name | hive | Kerberos service name (default hive) |
+| hive.kerberos-service-name | hive | SASL service name used for both KERBEROS and DIGEST-MD5 auth (default hive) |
 | ugi | t-1234:secret | Hadoop UGI for Hive client. |
 | hive.metastore.authentication | DIGEST-MD5 | Auth mechanism: `NONE` (default), `KERBEROS`, or `DIGEST-MD5` |
 
 When using DIGEST-MD5 authentication, PyIceberg reads a Hive delegation token from the file pointed to by the `$HADOOP_TOKEN_FILE_LOCATION` environment variable. This is the standard mechanism used in secure Hadoop environments where delegation tokens are distributed to jobs. Install PyIceberg with `pip install "pyiceberg[hive]"` to get the required `pure-sasl` dependency.
 
+Note: `hive.metastore.authentication` takes precedence over the legacy `hive.kerberos-authentication` boolean. New deployments should prefer `hive.metastore.authentication`.
+
 When using Hive 2.x, make sure to set the compatibility flag:
 
 ```yaml

pyiceberg/exceptions.py

Lines changed: 1 addition & 1 deletion
@@ -133,4 +133,4 @@ class ValidationException(Exception):
 
 
 class HiveAuthError(Exception):
-    """Raised when Hive Metastore authentication fails."""
+    """Raised when Hive Metastore authentication fails or the delegation token file is missing or malformed."""

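The expanded docstring says `HiveAuthError` also covers a missing or malformed delegation token file. A minimal sketch of that error mapping follows. It assumes a hypothetical helper `read_token_bytes` (not PyIceberg's actual `read_hive_delegation_token`) and a local stand-in for the exception class. It only checks that the file named by `$HADOOP_TOKEN_FILE_LOCATION` is readable and non-empty, and does not parse the Hadoop token format.

```python
import os
from typing import Mapping, Optional


class HiveAuthError(Exception):
    """Local stand-in for pyiceberg.exceptions.HiveAuthError."""


def read_token_bytes(env: Optional[Mapping[str, str]] = None) -> bytes:
    """Hypothetical helper: return the raw token file contents or raise HiveAuthError."""
    env = os.environ if env is None else env
    path = env.get("HADOOP_TOKEN_FILE_LOCATION")
    if not path:
        # The environment variable is unset or empty.
        raise HiveAuthError("HADOOP_TOKEN_FILE_LOCATION is not set")
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError as exc:
        # Missing file, permission problem, and similar I/O failures.
        raise HiveAuthError(f"Cannot read delegation token file {path!r}: {exc}") from exc
    if not data:
        # An empty file cannot contain a token; treated here as malformed.
        raise HiveAuthError(f"Delegation token file {path!r} is empty")
    return data
```

Wrapping `OSError` in a single exception type lets callers handle every authentication setup failure with one `except HiveAuthError` clause.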
tests/catalog/test_hive.py

Lines changed: 19 additions & 0 deletions
@@ -1466,6 +1466,25 @@ def test_auth_mechanism_unknown_raises() -> None:
         _HiveClient(uri="thrift://localhost:9083", auth_mechanism="PLAIN")
 
 
+def test_auth_mechanism_empty_string_raises() -> None:
+    """Empty string auth mechanism should raise HiveAuthError."""
+    from pyiceberg.exceptions import HiveAuthError
+
+    with pytest.raises(HiveAuthError, match="Unknown auth mechanism.*''"):
+        _HiveClient(uri="thrift://localhost:9083", auth_mechanism="")
+
+
+def test_create_hive_client_passes_kerberos_via_config(monkeypatch: pytest.MonkeyPatch) -> None:
+    """_create_hive_client passes hive.metastore.authentication=KERBEROS to _HiveClient."""
+    monkeypatch.setattr(TTransport.TSaslClientTransport, "__init__", lambda *a, **kw: None)
+    properties = {
+        "uri": "thrift://localhost:9083",
+        HIVE_METASTORE_AUTH: "KERBEROS",
+    }
+    client = HiveCatalog._create_hive_client(properties)
+    assert client._auth_mechanism == "KERBEROS"
+
+
 def test_auth_mechanism_case_insensitive(monkeypatch: pytest.MonkeyPatch) -> None:
     """Auth mechanism should be case-insensitive."""
     monkeypatch.setattr("pyiceberg.catalog.hive.read_hive_delegation_token", _fake_read_token)
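
For reference, the precedence rule documented in configuration.md (`hive.metastore.authentication` wins over the legacy `hive.kerberos-authentication` boolean) can be sketched as below. The function `resolve_auth_mechanism` and the constant name `HIVE_KERBEROS_AUTH` are hypothetical, and the parsing of the legacy boolean as the string `"true"` is an assumption. This is not the code in `HiveCatalog._create_hive_client`.

```python
from typing import Mapping

# Property keys as documented in configuration.md.
HIVE_METASTORE_AUTH = "hive.metastore.authentication"
HIVE_KERBEROS_AUTH = "hive.kerberos-authentication"  # hypothetical constant name


def resolve_auth_mechanism(properties: Mapping[str, str]) -> str:
    """Hypothetical helper: pick the auth mechanism from catalog properties."""
    explicit = properties.get(HIVE_METASTORE_AUTH)
    if explicit is not None:
        # The explicit setting always wins; upper-casing makes it case-insensitive.
        # Validation (for example rejecting "") is left to the client.
        return explicit.upper()
    # Fall back to the legacy boolean (assumed to be the string "true").
    if properties.get(HIVE_KERBEROS_AUTH, "false").strip().lower() == "true":
        return "KERBEROS"
    # Documented default.
    return "NONE"
```

Under these assumptions, a catalog that sets both `hive.kerberos-authentication: true` and `hive.metastore.authentication: DIGEST-MD5` resolves to `DIGEST-MD5`.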
