feat(supabase): add SupabaseBucketDownloader component #3250

Open
SyedShahmeerAli12 wants to merge 4 commits into deepset-ai:main from SyedShahmeerAli12:feat/supabase-bucket-downloader

@SyedShahmeerAli12 SyedShahmeerAli12 commented Apr 28, 2026

Related Issues

Proposed Changes

Adds SupabaseBucketDownloader, a new Haystack component that downloads files from a Supabase Storage bucket and returns them as ByteStream objects for use in indexing pipelines.

Component: haystack_integrations.components.downloaders.supabase.SupabaseBucketDownloader

Key features:

  • Downloads files from a Supabase Storage bucket by file path
  • Returns ByteStream objects with meta["file_path"] and meta["bucket_name"] set
  • Optional file_extensions filter (case-insensitive)
  • Failed downloads are logged and skipped: run() does not raise and returns the remaining streams
  • MIME type auto-detected from file extension
  • Full serialization support (to_dict / from_dict)
  • Authenticates via SUPABASE_SERVICE_KEY env var (service role key for private buckets)
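
The behaviour described in the list above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the PR's implementation: `FakeStream`, `normalize_extensions`, `download_all`, and the `fetch` callable are hypothetical stand-ins for `ByteStream`, the extension handling, the component's `run()` method, and the Supabase Storage download call.

```python
import logging
import mimetypes
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Callable, Dict, List, Optional, Set

logger = logging.getLogger(__name__)


@dataclass
class FakeStream:
    """Hypothetical stand-in for Haystack's ByteStream."""
    data: bytes
    mime_type: Optional[str] = None
    meta: Dict[str, str] = field(default_factory=dict)


def normalize_extensions(extensions: Optional[List[str]]) -> Optional[Set[str]]:
    """Lower-case each extension and add a leading dot if it is missing."""
    if extensions is None:
        return None
    normalized = set()
    for ext in extensions:
        ext = ext.lower()
        normalized.add(ext if ext.startswith(".") else f".{ext}")
    return normalized


def download_all(
    sources: List[str],
    bucket_name: str,
    fetch: Callable[[str], bytes],
    file_extensions: Optional[List[str]] = None,
) -> List[FakeStream]:
    """Download each path, skipping filtered-out and failing files."""
    allowed = normalize_extensions(file_extensions)
    streams = []
    for path in sources:
        # Case-insensitive extension filter.
        if allowed is not None and PurePosixPath(path).suffix.lower() not in allowed:
            continue
        try:
            data = fetch(path)
        except Exception as exc:
            # Log and continue so one bad file does not abort the batch.
            logger.warning("Failed to download '%s' from '%s': %s", path, bucket_name, exc)
            continue
        mime_type, _ = mimetypes.guess_type(path)
        meta = {"file_path": path, "bucket_name": bucket_name}
        streams.append(FakeStream(data=data, mime_type=mime_type, meta=meta))
    return streams


def fake_fetch(path: str) -> bytes:
    """Hypothetical download function: one path fails, the rest succeed."""
    if path == "missing.pdf":
        raise FileNotFoundError(path)
    return b"content of " + path.encode()


streams = download_all(
    ["reports/report.PDF", "data/notes.txt", "missing.pdf"],
    "my-documents",
    fake_fetch,
    file_extensions=["pdf"],
)
```

In this sketch, `data/notes.txt` is dropped by the extension filter and `missing.pdf` is dropped because the download raises, so only the PDF report comes back.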

Usage:

from haystack_integrations.components.downloaders.supabase import SupabaseBucketDownloader
from haystack.utils import Secret

downloader = SupabaseBucketDownloader(
    supabase_url="https://<project-ref>.supabase.co",
    supabase_key=Secret.from_env_var("SUPABASE_SERVICE_KEY"),
    bucket_name="my-documents",
)
result = downloader.run(sources=["reports/report.pdf", "data/notes.txt"])
streams = result["streams"]

Dependency added: supabase>=2.0.0

How did you test it?

9 unit tests covering:

  • Default and custom init parameters
  • File extension normalization (case-insensitive)
  • Serialization round-trip (to_dict / from_dict)
  • run() returns correct ByteStream objects with metadata
  • File extension filtering skips non-matching files
  • Empty sources returns empty list
  • Failed downloads are skipped (logs warning, continues)
  • MIME type detection from file extension
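
As an illustration of the "failed downloads are skipped" case, the following sketch shows one way such a test could be written with `unittest.mock`. It assumes a hypothetical `download_available` helper and a mocked client with a `download` method; the PR's actual tests and client calls may look different.

```python
import logging
import unittest
from unittest.mock import MagicMock

logger = logging.getLogger("downloader_sketch")


def download_available(client, bucket_name, sources):
    """Hypothetical stand-in for run(): skip paths whose download raises."""
    results = []
    for path in sources:
        try:
            results.append((path, client.download(path)))
        except Exception as exc:
            logger.warning("Skipping '%s' in bucket '%s': %s", path, bucket_name, exc)
    return results


class TestFailedDownloadsAreSkipped(unittest.TestCase):
    def test_failure_is_logged_and_skipped(self):
        client = MagicMock()
        # First call succeeds, second raises, third succeeds.
        client.download.side_effect = [b"a", RuntimeError("boom"), b"c"]

        with self.assertLogs("downloader_sketch", level="WARNING") as captured:
            results = download_available(
                client, "my-documents", ["a.txt", "b.txt", "c.txt"]
            )

        # The failing path is absent; the others are returned in order.
        self.assertEqual(results, [("a.txt", b"a"), ("c.txt", b"c")])
        self.assertEqual(len(captured.records), 1)
        self.assertIn("b.txt", captured.output[0])

    def test_empty_sources_returns_empty_list(self):
        client = MagicMock()
        self.assertEqual(download_available(client, "my-documents", []), [])
        client.download.assert_not_called()
```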

Integration test skipped unless SUPABASE_URL and SUPABASE_SERVICE_KEY are set.
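
A minimal sketch of that gating, using only the standard library. The helper `missing_env_vars` and the test class are hypothetical; the PR's own test suite may use a different mechanism (for example a pytest marker) to the same effect.

```python
import os
import unittest

REQUIRED_ENV_VARS = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


@unittest.skipIf(
    missing_env_vars(),
    "SUPABASE_URL and SUPABASE_SERVICE_KEY must be set for integration tests",
)
class TestBucketDownloaderIntegration(unittest.TestCase):
    def test_credentials_present(self):
        # Placeholder body: a real test would build the downloader and call run().
        self.assertEqual(missing_env_vars(), [])
```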

Notes for the reviewer

This is the first of the remaining sub-tasks from #2862 (after SupabasePgvectorDocumentStore, SupabasePgvectorEmbeddingRetriever, and SupabasePgvectorKeywordRetriever were merged in #3164).

The supabase>=2.0.0 dependency is added to pyproject.toml. The existing integration only depended on pgvector-haystack.

Checklist

@SyedShahmeerAli12 SyedShahmeerAli12 requested a review from a team as a code owner April 28, 2026 11:26
@SyedShahmeerAli12 SyedShahmeerAli12 requested review from davidsbatista and removed request for a team April 28, 2026 11:26
@github-actions github-actions Bot added the integration:supabase and type:documentation (Improvements or additions to documentation) labels Apr 28, 2026

socket-security Bot commented Apr 28, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | supabase@2.29.0 | 100 | 100 | 100 | 100 | 100 |

@github-actions

Coverage report (supabase)

| File | Statements | Missing | Coverage | Coverage (new stmts) | Lines missing |
| --- | --- | --- | --- | --- | --- |
| integrations/supabase/src/haystack_integrations/components/downloaders/supabase/supabase_bucket_downloader.py | | | | | 78-79 |
| Project Total | | | | | |

This report was generated by python-coverage-comment-action

SyedShahmeerAli12 commented Apr 30, 2026

@davidsbatista or could anyone please review this PR when you get a chance?

This adds SupabaseBucketDownloader, the next sub-task of #2862 after the pgvector components were merged in #3164. Once this is reviewed and merged, we can continue with the remaining Supabase features in that issue.

@davidsbatista

@SyedShahmeerAli12, please, there's no need to tag us and ask for a review; the PR was opened just 2 days ago.
