Skip to content

feat: add S3-compatible storage backend#4716

Open
Crash-- wants to merge 20 commits into
masterfrom
feat/s3-vfs-backend
Open

feat: add S3-compatible storage backend#4716
Crash-- wants to merge 20 commits into
masterfrom
feat/s3-vfs-backend

Conversation

@Crash--
Copy link
Copy Markdown
Contributor

@Crash-- Crash-- commented Mar 30, 2026

Summary

  • Add S3-compatible object storage as a third VFS backend (alongside afero and Swift)
  • Uses minio-go/v7 — no AWS SDK dependency. Targets OVH S3, MinIO, Scaleway, and any S3-compatible provider
  • Bucket strategy: one shared bucket per organization (<prefix>-<orgId>), key prefix per instance (<DBPrefix>/)
  • Memory-efficient: single PUT when file size is known (streams like Swift ~32KB), multipart only for unknown size
  • All 17 VFS integration tests pass for the S3 backend

What's included

Area Status
Main VFS (upload, download, delete, versions, fsck) Done
Avatar storage Done
Thumbnail storage Done
App installation (Copier + FileServer) Done
Dynamic assets Done
Preview/icon cache Done
Export archiver Done
Documentation (docs/s3.md) Done
Integration tests (testcontainers + MinIO) Done
Security review fixes (MD5 verification, path traversal, error sanitization) Done

Configuration

fs:
  url: s3://s3.rbx.io.cloud.ovh.net?access_key=ACCESS&secret_key=SECRET&region=rbx&bucket_prefix=cozy&use_ssl=true

All buckets (apps, assets, previews, exports) are created automatically at startup.

Bucket layout

Bucket Content
<prefix>-<orgId> Main VFS data (files, versions, thumbs, avatars)
<prefix>-apps-web Web application assets
<prefix>-apps-konnectors Konnector assets
<prefix>-assets Dynamic assets
<prefix>-previews PDF preview and icon cache
<prefix>-exports Instance export archives

See docs/s3.md for full documentation including a local MinIO setup tutorial.

Test plan

  • Unit tests: 15/15 naming tests pass
  • Integration tests: 17/17 VFS tests pass (afero, swift, s3)
  • Manual API test: upload, download, trash, permanent delete via /files API
  • Manual browser test: Drive UI works, file upload + thumbnail generation confirmed
  • Security review: all critical/high/medium findings fixed
  • Test with OVH S3 in staging

🤖 Generated with Claude Code

@Crash-- Crash-- requested a review from a team as a code owner March 30, 2026 07:07
@Crash--
Copy link
Copy Markdown
Contributor Author

Crash-- commented Mar 30, 2026

I tried it manually with a local minio:

I've:

  • app installation working
  • files upload on Drive working
  • files download on Drive working
  • thumbnails working
  • Avatar Storage.
  • fsck command
  • instance deletion

We need to test the :

  • move
    But maybe we'll test it on public instance, I don't have the move wizard installed & else.

@Crash--
Copy link
Copy Markdown
Contributor Author

Crash-- commented Mar 30, 2026

Enregistrement.de.l.ecran.2026-03-30.a.18.51.51.mov

@Crash-- Crash-- force-pushed the feat/s3-vfs-backend branch from 81d2b27 to 1dacd7c Compare March 31, 2026 03:21
Comment thread model/vfs/vfss3/s3.go Outdated
Comment thread model/vfs/vfss3/s3.go Outdated
@@ -0,0 +1,53 @@
package vfss3
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and a little bit confusing to havein s3.go inly provate delete functions

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, maybe I understood, there is duplication with appfs, the same functions.
Does it make sense then to move them to pkg/s3 or pkg/s3util package with

  func EnsureBucket(ctx, client, bucket, region) error
  func DeleteObjects(ctx, client, bucket, names) error
  func WrapNotFound(err) error

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — extracted the shared helpers into pkg/s3util with DeleteObjects, DeletePrefixObjects, EnsureBucket, IsNotFound, and WrapNotFound. Both vfss3 and appfs now use this shared package. Also added integration tests for the new package. See commits 643cd8c and c14f0f1.

Comment thread model/vfs/vfss3/impl.go
domain string
prefix string // DBPrefix — used as key prefix in the bucket
contextName string
ctx context.Context
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we have context in all other vfs, but it's an execution context, not data. Let's do not spread anti-pattern and if not change interface, but at least just pass the context.Background as is

Copy link
Copy Markdown
Contributor Author

@Crash-- Crash-- Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the concern — storing a context in a struct is generally considered an anti-pattern in Go. However, the VFS interface (model/vfs/vfs.go) does not pass a context.Context in its method signatures, so all backends that need one store it as a field. The Swift implementation (vfsswift/impl_v3.go) uses the exact same pattern: ctx: context.Background() set at creation time. Changing this would require updating the VFS interface, which feels out of scope for this PR - and I'm not confortable with that - . Happy to discuss if you think it's worth a broader refactor though.

Comment thread model/vfs/vfss3/impl.go Outdated
Comment thread model/vfs/vfs_test.go
@@ -17,6 +17,7 @@ import (

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we don't test any error cases during upload. But we run all the PuObject in the background goroutine, so it would be gread to add test when miniio is down during upload to check that all errors are propagated to client

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Crash-- Crash-- force-pushed the feat/s3-vfs-backend branch 2 times, most recently from f1d60b9 to e82461e Compare April 3, 2026 05:22
@Crash--
Copy link
Copy Markdown
Contributor Author

Crash-- commented Apr 3, 2026

I need to retry every thing since the refacto / changes

Crash-- and others added 15 commits May 10, 2026 18:17
Add SchemeS3 constant and S3 connection singleton following the same
pattern as the existing Swift connection (swift.go). Uses minio-go/v7
for S3-compatible storage (OVH, MinIO, Scaleway, etc.).

Configuration via fs.url query params: access_key, secret_key, region,
bucket_prefix, use_ssl.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New package model/vfs/vfss3/ implements vfs.VFS for S3-compatible
object storage, mirroring the Swift V3 implementation.

Key design decisions:
- Bucket per orgId: <bucket_prefix>-<orgId> (fallback "default")
- Key prefix per instance: <DBPrefix>/
- CreateFile uses io.Pipe + PutObject goroutine for streaming
- Local MD5 computation for multipart upload fallback
- Single PUT when ByteSize known (memory-efficient like Swift)
- PartSize=5MiB, NumThreads=1 for multipart (bounded memory)

Also adds GetOrgID() accessor on Instance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add case config.SchemeS3 to every storage dispatch switch:
- MakeVFS, AvatarFS, ThumbsFS (instance.go)
- Copier, AppsFileServer, KonnectorsFileServer (apps.go)
- SystemArchiver (archiver.go)
- SystemCache (cache.go)
- InitDynamicAssetFS (fs.go)
- NewCapabilities (capabilities.go)

New S3 implementations:
- pkg/appfs/s3.go: S3 Copier and FileServer for app installation
- pkg/assets/dynamic/impl_s3.go: S3 dynamic assets storage
- S3 preview cache and archiver in their respective files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add MinIO testcontainer fixture (tests/testutils/minio_utils.go)
- Add makeS3FS helper and wire into the VFS test table
- All 17 VFS tests pass for the S3 backend

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the S3 backend architecture: bucket strategy, object key
structure, memory consumption, encryption, differences from Swift,
configuration, and testing.

Also add S3 URL example to cozy.example.yaml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Revert accidental commenting of GetLegalNoticeUrl in settings
- Remove unused errFailFast variable in fsck.go
- Remove unused maxNbFilesToDelete constant in s3.go
- Fix s3Cache to return os.ErrNotExist for NoSuchKey errors
- Fix indentation in s3Copier.Exist
- Fix BucketName to avoid trailing hyphens

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical:
- Verify MD5 integrity when caller provides expected hash (was skipped)
- Limit app file reads to 50 MiB to prevent OOM from corrupted objects

High:
- Sanitize S3 errors to avoid leaking bucket names and key paths
- Prevent path traversal via ".." in app filenames and file serving
- Fix fsck DocName to use stripped object name instead of full S3 key

Medium:
- Replace panic with error return in s3Copier.Copy
- Sanitize bucket_prefix config parameter

Low:
- Add bounds check for ETag substring to prevent panic
- Remove unused encoding/hex import

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The package is already named `multierror`, so the explicit alias is a no-op.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 5 GiB maxFileSize constant was carried over from Swift, where it is
a real server-side single-object PUT limit. S3 with multipart upload
(already configured via minio-go PartSize) handles large files
transparently, so the constant was misleading and not consistently
enforced.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verify that when the S3 backend becomes unreachable during a background
PutObject goroutine, the error is properly propagated to the caller
through Close(). Uses a short HTTP dial timeout to avoid hanging on
TCP retries when the minio container is stopped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move duplicated S3 utility functions (DeleteObjects, DeletePrefixObjects,
EnsureBucket, IsNotFound, WrapNotFound) from vfss3 and appfs into a
shared pkg/s3util package. Includes integration tests with minio.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The model/vfs package now exercises three backends (afero, swift, s3)
plus a second MinIO container for S3UploadErrorPropagation, pushing the
package past the previous 5-minute per-package limit on GH runners.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pipe

CreateFile starts a background goroutine that calls PutObject reading
from an io.Pipe. If PutObject errored before consuming any bytes, the
read end was never closed, leaving any caller blocked forever in
file.Write. Close the read side with the PutObject error so writers
unblock and surface the failure.

Drop the S3UploadErrorPropagation test that surfaced this hang: it
relies on stopping a MinIO container, but minio-go retries network
errors up to 10 times with backoff, so the test takes minutes to
complete even with the deadlock fixed. The pipe-close fix is what makes
the underlying behavior correct; reintroducing a fast unit-level test
for it can be done in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a workflow that builds the production Dockerfile and pushes the
resulting image to ghcr.io/<repo>:s3-test on every push to the
feat/s3-vfs-backend branch, plus a workflow_dispatch trigger that
accepts a custom tag.

Also publishes a ${tag}-<short-sha> tag for traceability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Crash-- and others added 4 commits May 12, 2026 11:09
scripts/build.sh runs `git describe` / `git rev-parse` to derive the
build version string. With recent git versions the COPY'd working tree
trips the "detected dubious ownership" safety check inside the
container (the host user that owned the source no longer owns the
files), causing the build step to exit 128 before producing the
binary. Whitelisting /app as safe restores the previous behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
scripts/build.sh derives the version from git describe / git rev-parse
on the COPY'd working tree, which fails inside the buildx container
(exit 128 with no captured output). Inline `go build` with a build-arg
version string sidesteps the whole bash + git chain.

The workflow passes VERSION_STRING=<tag>-<sha> so the running binary
reports a recognizable version.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CheckAvailableDiskSpace returns maxsize = fs.MaxFileSize() when no
disk quota is configured. s3VFS.MaxFileSize() returned 0 (intended as
"no per-file limit"), but that meant CreateFile's redundant
`if newsize > maxsize` post-check fired for every non-empty upload —
making any file fail with ErrFileTooBig → HTTP 413 → "storage full"
in the UI, even on an empty volume.

Two changes:
 - Return -1 from MaxFileSize() to match aferoVFS's convention for
   "no limit". This makes the streaming-time check in
   s3FileCreation.Write (`f.maxsize >= 0 && f.w > f.maxsize`) skip
   correctly when no quota is in effect.
 - Drop the redundant post-CheckAvailableDiskSpace `newsize > maxsize`
   check from CreateFile and CopyFileFromOtherFS — the function
   already returns the right error when relevant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
s3Copier wrote a zero-byte "installation complete" marker at the same
key the version's files live under (slug/version), with file contents
stored at slug/version/<name>. S3 browsers render that as a folder and
a file with the same name sitting side by side, which is misleading
when admins inspect the bucket.

Move the marker to <appObj>.cozy-installed so it sits on its own
distinct key while keeping the file layout unchanged.

Backward compat note: apps installed under the previous marker key
will be re-copied on first access (Exist returns false for the new
key → Start triggers a fresh install). This is a no-op cost for fresh
deployments; for existing s3-test installations the app payload gets
re-uploaded once.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants