SupabaseBucketDownloader #3084

@davidsbatista

Description

Implement SupabaseBucketDownloader, a component analogous to the existing S3Downloader that fetches files from Supabase Storage buckets and returns them as ByteStream objects for use in indexing pipelines.

Detailed design

from haystack.utils import Secret
from haystack_integrations.components.fetchers.supabase import SupabaseBucketDownloader

# Credentials are read from the environment rather than hard-coded
downloader = SupabaseBucketDownloader(
    supabase_url="https://<project>.supabase.co",
    supabase_key=Secret.from_env_var("SUPABASE_SERVICE_KEY"),
    bucket_name="my-documents",
)

Implementation notes

  • Primary dependency: supabase-py
  • Auth via supabase_url and supabase_key (using Secret for secure handling)
  • Output: list of ByteStream objects compatible with Haystack file converters (e.g. PyPDFToDocument, HTMLToDocument)
  • Take S3Downloader as the reference implementation for interface design
  • Support listing and filtering files within a bucket (e.g. by prefix or file extension)
  • Integration lives in integrations/supabase/
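
The prefix and file-extension filtering mentioned above could be sketched roughly as follows. This is only an illustration: `filter_object_names` and its parameters are hypothetical names, not part of the proposed component or of supabase-py, and the sketch assumes the object names have already been listed from the bucket as plain strings.

```python
from typing import Iterable, List, Optional, Sequence


def filter_object_names(
    names: Iterable[str],
    prefix: Optional[str] = None,
    extensions: Optional[Sequence[str]] = None,
) -> List[str]:
    """Hypothetical helper: keep names that match a prefix and/or extensions.

    Extensions are compared case-insensitively and may be given with or
    without a leading dot ("pdf" and ".pdf" are treated the same).
    """
    normalized = None
    if extensions is not None:
        # Normalize every extension to the form ".ext" in lowercase
        normalized = tuple("." + ext.lower().lstrip(".") for ext in extensions)

    selected = []
    for name in names:
        # Skip objects outside the requested prefix (folder-like path)
        if prefix is not None and not name.startswith(prefix):
            continue
        # Skip objects whose extension is not in the allowed set
        if normalized is not None and not name.lower().endswith(normalized):
            continue
        selected.append(name)
    return selected


# Example object names, assumed to come from a bucket listing
names = [
    "reports/2024/q1.pdf",
    "reports/2024/q1.html",
    "images/logo.png",
    "reports/notes.PDF",
]
pdf_reports = filter_object_names(names, prefix="reports/", extensions=["pdf"])
print(pdf_reports)
```

Keeping the filter as a pure function over names, separate from the download step, would let it be unit-tested without network access or credentials.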

Checklist

  • The code is documented with docstrings and was merged into a feature branch
