Commit 1c03746

Yuri Nikonchuk and claude authored and committed
Default hdfs.replication to 3 (match PyArrow default)
Keep PyArrow's original default behavior. Users can override per-table or per-catalog via hdfs.replication property. Set to 0 to defer to server-side hdfs-site.xml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ceb6c27 commit 1c03746

1 file changed

Lines changed: 1 addition & 1 deletion

pyiceberg/io/pyarrow.py

@@ -579,7 +579,7 @@ def _initialize_hdfs_fs(self, scheme: str, netloc: Optional[str]) -> FileSystem:
         from pyarrow.fs import HadoopFileSystem
 
         hdfs_kwargs: Dict[str, Any] = {}
-        replication = self.properties.get(HDFS_REPLICATION, "0")
+        replication = self.properties.get(HDFS_REPLICATION, "3")
         hdfs_kwargs["replication"] = int(replication)
         if netloc:
             return HadoopFileSystem.from_uri(f"{scheme}://{netloc}/?replication={replication}")
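
The property lookup changed by this hunk can be exercised in isolation, without PyArrow or an HDFS cluster. Below is a minimal sketch. It assumes the constant HDFS_REPLICATION holds the string "hdfs.replication" (inferred from the commit message, not shown in the diff). The helper names resolve_hdfs_replication and build_hdfs_uri, the constant DEFAULT_HDFS_REPLICATION, and the host namenode:8020 are hypothetical and not part of pyiceberg.

```python
from typing import Dict

# Assumption: the property key behind HDFS_REPLICATION, per the commit message.
HDFS_REPLICATION = "hdfs.replication"
# Hypothetical constant; the diff inlines the string "3" as the fallback.
DEFAULT_HDFS_REPLICATION = "3"


def resolve_hdfs_replication(properties: Dict[str, str]) -> int:
    """Return the replication factor, falling back to the new default of 3."""
    return int(properties.get(HDFS_REPLICATION, DEFAULT_HDFS_REPLICATION))


def build_hdfs_uri(scheme: str, netloc: str, properties: Dict[str, str]) -> str:
    """Mirror the URI string that the hunk passes to HadoopFileSystem.from_uri."""
    replication = resolve_hdfs_replication(properties)
    return f"{scheme}://{netloc}/?replication={replication}"


if __name__ == "__main__":
    # No property set: the default applies.
    print(build_hdfs_uri("hdfs", "namenode:8020", {}))
    # Explicit override, e.g. 0 as described in the commit message.
    print(build_hdfs_uri("hdfs", "namenode:8020", {HDFS_REPLICATION: "0"}))
```

Because the value is stored as a string and converted with int(), a non-numeric property value would raise ValueError at filesystem initialization rather than being silently ignored.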
