There's an inconsistency in how FileSystems.get_filesystem() handles missing optional dependencies between GCS and S3.
Current Behavior
S3 (without aws extra):
>>> from apache_beam.io import filesystems
>>> filesystems.FileSystems.get_filesystem("s3://blah")
<apache_beam.io.aws.s3filesystem.S3FileSystem at 0x11a0af750>
Returns the filesystem object; validation happens later when the filesystem is actually used.
GCS (without gcp extra):
>>> from apache_beam.io import filesystems
>>> filesystems.FileSystems.get_filesystem("gcs://blah")
ValueError: Unable to get filesystem from specified path, please use the correct path or ensure the required dependency is installed, e.g., pip install apache-beam[gcp]. Path specified: gcs://blah
Raises immediately: without the gcp extra, GCSFileSystem is never registered as a FileSystem subclass, so the lookup by scheme finds no match.
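To illustrate the difference, here is a minimal sketch of a scheme lookup over registered subclasses. It is not Beam's actual code: the classes are simplified stand-ins, and `_hypothetical_gcs_client_not_installed` is a made-up module name that stands for whatever the gcp extra would provide.

```python
class FileSystem:
    """Simplified stand-in for a filesystem base class."""

    @classmethod
    def scheme(cls):
        raise NotImplementedError


class S3FileSystem(FileSystem):
    # Defined unconditionally, so it is always visible to the lookup.
    @classmethod
    def scheme(cls):
        return "s3"


try:
    # Made-up module name: simulates an optional dependency that is missing.
    import _hypothetical_gcs_client_not_installed  # noqa: F401

    class GCSFileSystem(FileSystem):
        @classmethod
        def scheme(cls):
            return "gs"
except ImportError:
    # The class body above never runs, so GCSFileSystem never appears
    # in FileSystem.__subclasses__().
    pass


def get_filesystem(path):
    """Return an instance of the first subclass whose scheme matches."""
    scheme = path.split("://", 1)[0]
    for subclass in FileSystem.__subclasses__():
        if subclass.scheme() == scheme:
            return subclass()
    raise ValueError(f"Unable to get filesystem from specified path: {path}")
```

In this sketch the S3 lookup succeeds because the class always exists, while the GCS lookup fails with ValueError because the class was never defined.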
Proposed Behavior
Both should behave consistently. GCSFileSystem should be returned from get_filesystem() like S3FileSystem, allowing callers to validate dependencies when the filesystem is actually used rather than at lookup time.
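One way to get this behavior is to defer the dependency check to first use. The sketch below is only an assumption about how that could look, not a proposed patch: `LazyFileSystem`, `_get_client`, and `resolve` are hypothetical names, and `resolve` is a placeholder for real I/O.

```python
import importlib


class LazyFileSystem:
    """Hypothetical filesystem that checks its dependency on first use."""

    def __init__(self, required_module, install_hint):
        # Construction never imports the optional dependency.
        self._required_module = required_module
        self._install_hint = install_hint
        self._client = None

    def _get_client(self):
        # Import lazily and cache the module; fail with an actionable message.
        if self._client is None:
            try:
                self._client = importlib.import_module(self._required_module)
            except ImportError as exc:
                raise ImportError(
                    f"Missing optional dependency {self._required_module!r}; "
                    f"try: {self._install_hint}"
                ) from exc
        return self._client

    def resolve(self, path):
        # Placeholder for a real operation such as open() or exists().
        client = self._get_client()
        return f"{client.__name__}:{path}"
```

With this shape, constructing the object always succeeds, and a missing dependency surfaces as an ImportError carrying the install hint (for example `pip install apache-beam[gcp]`) at the first call that needs it.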
Why This Matters
- The same call behaves differently depending on the filesystem, which is confusing
- Code that handles multiple filesystem types has to special-case GCS, because the lookup itself raises instead of returning an object
- Dependency validation at usage time (not lookup time) allows for better error handling and lazy loading patterns
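As a rough illustration of the second point, multi-filesystem code currently needs something like the helper below around every lookup. `lookup_all` is a hypothetical name, and the lookup function is passed in as a parameter so that the sketch does not depend on Beam being installed.

```python
def lookup_all(paths, get_filesystem):
    """Look up a filesystem for each path, collecting lookup failures.

    Returns a pair (found, failed): found maps path -> filesystem object,
    failed maps path -> the ValueError raised during lookup.
    """
    found, failed = {}, {}
    for path in paths:
        try:
            found[path] = get_filesystem(path)
        except ValueError as exc:
            # Today this branch covers both unknown schemes and a
            # missing optional dependency, which callers cannot tell apart.
            failed[path] = exc
    return found, failed
```

With Beam this could be called as `lookup_all(paths, filesystems.FileSystems.get_filesystem)`. Under the proposed behavior, the `except` branch would only be reached for paths whose scheme is genuinely unknown.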
Environment
- Apache Beam version: 2.70.0
- Python version: 3.11
Generated by Claude Code, confirmed by @hjtran