Skip to content

41526 Storage [Backend] Create a file service#53

Open
EBirkenfeld wants to merge 89 commits into
masterfrom
41526__create_file_service
Open

41526 Storage [Backend] Create a file service#53
EBirkenfeld wants to merge 89 commits into
masterfrom
41526__create_file_service

Conversation

@EBirkenfeld
Copy link
Copy Markdown
Collaborator

@EBirkenfeld EBirkenfeld commented Oct 24, 2025

Note

Add a dedicated file storage microservice to replace Google Cloud Storage uploads

  • Introduces a new FastAPI microservice in storage/ that handles file uploads and downloads using an S3-compatible backend (SeaweedFS or GCS), with its own Postgres database, Alembic migrations, JWT/Redis-based authentication, and object-level permission checks.
  • Adds a Django Attachment model in backend/src/storage/ with django-guardian object permissions, replacing FileAttachment-based attachment tracking across tasks, workflows, templates, and comments.
  • Replaces signed-URL Google Cloud Storage upload flows throughout the backend and frontend with calls to the new file service; files are now referenced as markdown links containing a file_id rather than numeric attachment IDs.
  • Adds Nginx config blocks to proxy $FILE_DOMAIN to the new service, and extends docker-compose files to include file-service, file-postgres, and SeaweedFS stack services.
  • Adds management commands for data migration: backfilling file_id on existing FileAttachment rows from GCS URLs, migrating them to the new Attachment model, syncing records to the file DB, and replacing legacy storage links with file service URLs.
  • Removes google-cloud-storage dependency and StoragePermission class; adds django-guardian.
  • Risk: WorkflowEvent serializer no longer includes an attachments field; /workflows/attachments and /workflows/public/attachments endpoints are removed; attachment IDs are now strings throughout the frontend.

Macroscope summarized 3dfa00f.


Note

High Risk
Large cross-cutting change to file storage, auth, API contracts (comments/attachments), and data migration paths affecting security-sensitive uploads and existing clients.

Overview
This PR replaces in-app Google Cloud Storage uploads with a dedicated file microservice wired into local Docker (backend API, file-service, file Postgres, SeaweedFS) and removes legacy STORAGE / GCS env from Celery workers.

Uploads and avatars/logos now go through FileServiceClient (upload_file_with_attachment) in admin, contacts, integrations, Microsoft Graph photos, and related paths; sync_account_file_fields keeps storage.Attachment rows aligned when account logos or user/group photos change. django-guardian and a custom GroupObjectPermission app support object-level access on the new attachment model.

Workflow attachments shift from integer FileAttachment IDs to markdown links to the file service; comments require text only (no attachments list). Event/task serializers drop embedded attachments; refresh_attachments and Guardian cleanup run on comments and task updates. Public/embed template auth caches template data in Redis and invalidates on template save.

Migration tooling adds management commands (fill_file_attachment_file_id, migrate to Attachment, sync file DB, replace GCS URLs, run_file_migration) plus analytics/tests updated for storage.Attachment. Breaking: API consumers using comment attachments or file fields as ID lists, or expecting attachments on workflow events, must move to file-service URLs in markdown.

Reviewed by Cursor Bugbot for commit 3dfa00f. Bugbot is set up for automated code reviews on this repo. Configure here.

…ion API

- Add FileAttachment.access_type field with account/restricted options
- Add FileAttachment.file_id field for unique file identification
- Create FileAttachmentPermission model for user-specific file access
- Implement AttachmentService.check_user_permission method
- Add AttachmentsViewSet with check-permission endpoint
- Add Redis caching for public template authentication
- Add response_forbidden method to BaseResponseMixin
- Include comprehensive test coverage for permission checking

This enables fine-grained file access control with two access types:
- account: accessible by all users in the same account
- restricted: accessible only by users with explicit permissions
…integration

- Add FastAPI-based file upload and download endpoints with streaming support
- Implement Clean Architecture with domain entities, use cases, and repositories
- Add authentication middleware with JWT token validation and Redis caching
- Integrate Google Cloud Storage S3-compatible API for file storage
- Add comprehensive error handling with custom exceptions and HTTP status codes
- Implement file access permissions validation through external HTTP service
- Add database models and Alembic migrations for file metadata storage
- Include Docker containerization with docker-compose for local development
- Add comprehensive test suite with unit, integration, and e2e tests
- Configure pre-commit hooks with ruff, mypy, and pytest for code quality
…e_access_rights' into 41526__сreate_file_service

# Conflicts:
#	backend/src/processes/enums.py
#	backend/src/processes/models/__init__.py
#	backend/src/processes/models/workflows/attachment.py
#	backend/src/processes/services/attachments.py
#	backend/src/processes/tests/test_services/test_attachments.py
@EBirkenfeld EBirkenfeld self-assigned this Oct 24, 2025
@EBirkenfeld EBirkenfeld added the Backend API changes request label Oct 24, 2025
Comment thread storage/src/application/use_cases/file_upload.py
Comment thread storage/src/shared_kernel/exceptions/domain_exceptions.py
Comment thread storage/src/infra/repositories/storage_service.py Outdated
Comment thread storage/src/shared_kernel/middleware/auth_middleware.py
Comment thread storage/src/infra/repositories/storage_service.py
….toml to dedicated config files

- Move mypy configuration from pyproject.toml to mypy.ini for better separation of concerns
- Simplify ruff.toml configuration by removing extensive rule selections and using "ALL" selector
- Update ruff target version from py37 to py311 to match project Python version
- Remove redundant ruff configuration from pyproject.toml to avoid duplication
- Apply code formatting fixes across entire codebase
- Standardize import statements and code style according to new linting rules
- Update test files to comply with new formatting standards
Comment thread storage/src/shared_kernel/database/migrations/env.py
Comment thread storage/src/shared_kernel/exceptions/validation_exceptions.py Outdated
Comment thread storage/src/shared_kernel/config.py
Comment thread storage/src/shared_kernel/exceptions/base_exceptions.py
Comment thread storage/src/shared_kernel/exceptions/error_codes.py
Comment thread storage/src/shared_kernel/uow/unit_of_work.py Outdated
Comment thread backend/src/processes/services/attachments.py Outdated
Comment thread storage/src/shared_kernel/exceptions/exception_handler.py
Comment thread storage/src/shared_kernel/exceptions/validation_exceptions.py Outdated
Comment thread storage/src/infra/repositories/file_record_repository.py
…ore rule in ruff configuration

- Update docstrings across various modules to ensure consistency and clarity.
- Remove unused "D" rule from ruff.toml configuration.
- Enhance readability and maintainability of the codebase.
Comment thread storage/src/presentation/api/files.py Outdated
Comment thread storage/src/presentation/api/files.py
Comment thread storage/src/infra/http_client.py
…ling for consistency

- Adjust import paths in test files to ensure they reference the correct locations.
- Replace instances of FileNotFoundError with DomainFileNotFoundError for better clarity in exception handling.
- Streamline fixture definitions and improve code readability across various test modules.
… configuration

- Update docstrings across various test files for consistency and clarity.
- Add new linting rules in ruff.toml for improved code quality.
- Enhance readability and maintainability of the codebase by refining fixture definitions and mock implementations.
…line permission handling

- Refactor the AuthenticationMiddleware to enhance error handling and response formatting.
- Update permission classes to use direct Request type hints instead of string annotations.
- Consolidate permission checks into FastAPI dependency wrappers for better clarity and usability.
- Remove unused exception classes and error messages to clean up the codebase.
- Adjust test cases to reflect changes in authentication and permission handling.
Comment thread backend/src/processes/views/attachments.py Outdated
Comment thread storage/src/shared_kernel/middleware/auth_middleware.py
Comment thread storage/src/presentation/api/files.py
…exception tests

- Remove unused infrastructure error codes from error_codes.py to streamline the codebase.
- Update the AuthenticationMiddleware constructor to use direct FastAPI type hints for clarity.
- Add new tests for validation exceptions, including file size and storage errors, to improve coverage and ensure accurate error handling.
Comment thread storage/src/shared_kernel/exceptions/permission_exceptions.py
Comment thread storage/src/shared_kernel/exceptions/external_service_exceptions.py
Comment thread storage/src/shared_kernel/auth/dependencies.py
Comment thread storage/src/infra/http_client.py
Comment thread storage/src/shared_kernel/auth/public_token.py
Comment thread storage/src/shared_kernel/exceptions/base_exceptions.py
Comment thread storage/tests/fixtures/e2e.py
Comment thread storage/src/shared_kernel/auth/guest_token.py
Comment thread storage/tests/fixtures/e2e.py
Comment thread backend/src/processes/models/workflows/attachment.py Outdated
Comment thread storage/src/infra/http_client.py Outdated
Comment thread storage/src/infra/mappers/file_record_mapper.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment thread backend/src/storage/utils.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment thread backend/src/authentication/services/microsoft.py
Comment thread backend/src/processes/services/tasks/field.py Outdated
Comment thread backend/src/processes/services/tasks/task.py
Comment thread backend/src/storage/utils.py
… redundant FileServiceConnectionException imports
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment thread backend/src/processes/models/templates/template.py
Comment thread backend/src/accounts/admin.py
Comment thread backend/src/accounts/services/account.py Outdated
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment thread backend/src/processes/services/tasks/field.py Outdated
…essary calls and add tests for logo update scenarios
Comment thread backend/src/processes/services/tasks/field.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

Comment thread backend/src/processes/services/condition_check/resolvers/file.py
Comment thread backend/src/processes/models/templates/template.py
Comment thread backend/src/accounts/admin.py
Comment thread backend/src/processes/services/events.py
# Conflicts:
#	backend/docker-compose.yml
#	backend/src/accounts/services/user.py
#	backend/src/accounts/tests/views/test_user/test_update.py
#	backend/src/accounts/views/user.py
#	backend/src/processes/enums.py
#	backend/src/processes/models/workflows/attachment.py
#	backend/src/processes/tests/fixtures.py
#	backend/src/processes/tests/test_services/test_attachments.py
#	backend/src/processes/tests/test_services/test_workflows/test_workflow_version_service.py
#	docker-compose.src.yml
#	docker-compose.yml
#	frontend/docker-compose.yml
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment thread backend/docker-compose.yml Outdated
Comment thread backend/src/accounts/services/user.py Outdated
Comment thread backend/src/storage/services/attachments.py Outdated
Comment thread backend/src/storage/services/file_sync.py Outdated
Comment thread backend/src/processes/tests/test_services/test_comment_service.py
…dedicated Attachment model and migration commands for existing attachments.
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment thread backend/docker-compose.yml Outdated
Comment thread backend/src/accounts/services/user.py Outdated
Comment thread backend/src/storage/tests/test_services/test_attachments.py Outdated
Comment thread storage/src/shared_kernel/uow/unit_of_work.py Outdated
Comment thread backend/src/storage/tests/test_services/test_attachments_permissions.py Outdated
Comment thread backend/src/storage/utils.py
Comment thread backend/src/storage/queries.py
Comment thread backend/src/storage/services/file_sync.py Outdated
Comment thread backend/src/storage/utils.py
Comment thread backend/src/storage/tests/test_services/test_file_sync.py Outdated
…ices, alongside new tests and a Docker Compose configuration.
Comment thread storage/src/shared_kernel/uow/unit_of_work.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment thread backend/src/authentication/services/microsoft.py
Comment thread backend/src/accounts/admin.py
Comment thread backend/src/processes/models/templates/template.py
Comment on lines +73 to +78
AuthUser(
auth_type=UserType.AUTHENTICATED,
user_id=1,
account_id=2,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unit/test_auth_middleware.py:73

The AuthUser object created on lines 73-77 is never assigned to a variable or used — it is constructed and immediately discarded. This appears to be leftover code that serves no purpose in the test.

-        AuthUser(
-            auth_type=UserType.AUTHENTICATED,
-            user_id=1,
-            account_id=2,
-        )
-
         # Mock PneumaticToken.data
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file storage/tests/unit/test_auth_middleware.py around lines 73-78:

The `AuthUser` object created on lines 73-77 is never assigned to a variable or used — it is constructed and immediately discarded. This appears to be leftover code that serves no purpose in the test.

Evidence trail:
storage/tests/unit/test_auth_middleware.py lines 60-95 (viewed at REVIEWED_COMMIT): Lines 73-77 show `AuthUser(...)` call with no left-hand side assignment. The test at lines 67-92 shows the `AuthUser` object is not referenced anywhere else in the test - not in the mock setup, not in assertions, and not returned by any mock.

Comment on lines +26 to +31
def test_assign_task_permissions__performer__ok(self):
# arrange
user = create_test_admin()
workflow = create_test_workflow(user=user, tasks_count=1)
task = workflow.tasks.first()
service = AttachmentService(user=user)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Low test_services/test_attachments_permissions.py:26

test_assign_task_permissions__performer__ok does not explicitly add the user as a task performer, so it actually tests owner/member permissions instead of performer permissions. The test passes even when performer permission assignment is broken because create_test_workflow(user=user) makes the user an implicit workflow owner/member. Consider creating a separate user and explicitly adding them via task.taskperformer_set.create(user=performer) to properly isolate the performer permission path.

-    def test_assign_task_permissions__performer__ok(self):
-        # arrange
-        user = create_test_admin()
-        workflow = create_test_workflow(user=user, tasks_count=1)
-        task = workflow.tasks.first()
-        service = AttachmentService(user=user)
+    def test_assign_task_permissions__performer__ok(self):
+        # arrange
+        owner = create_test_admin()
+        performer = create_test_user(account=owner.account)
+        workflow = create_test_workflow(user=owner, tasks_count=1)
+        task = workflow.tasks.first()
+        task.taskperformer_set.create(user=performer)
+        service = AttachmentService(user=owner)
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file backend/src/storage/tests/test_services/test_attachments_permissions.py around lines 26-31:

`test_assign_task_permissions__performer__ok` does not explicitly add the user as a task performer, so it actually tests owner/member permissions instead of performer permissions. The test passes even when performer permission assignment is broken because `create_test_workflow(user=user)` makes the user an implicit workflow owner/member. Consider creating a separate user and explicitly adding them via `task.taskperformer_set.create(user=performer)` to properly isolate the performer permission path.

Evidence trail:
backend/src/storage/tests/test_services/test_attachments_permissions.py lines 22-47 (test method), backend/src/processes/tests/fixtures.py lines 393-455 (create_test_workflow adds user to owners/members), backend/src/processes/tests/fixtures.py line 350 (add_raw_performer adds user as performer), backend/src/storage/services/attachments.py lines 82-148 (_assign_task_permissions grants permissions from multiple sources: performers, owners, members), lines 232-254 show properly isolated test (test_assign_task_permissions__guest_performer__ok creates separate performer)

total = attachments.count()
self.stdout.write(f'Found {total} records ready to be synced.')

try:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Low commands/sync_files_to_file_service.py:96

If a non-psycopg2.Error exception is raised during the sync loop (e.g., from attr.account.get_owner()), the database connection opened on line 98 is never closed, causing a connection leak. Consider using a context manager or try/finally to ensure conn.close() is always called.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file backend/src/processes/management/commands/sync_files_to_file_service.py around line 96:

If a non-`psycopg2.Error` exception is raised during the sync loop (e.g., from `attr.account.get_owner()`), the database connection opened on line 98 is never closed, causing a connection leak. Consider using a context manager or `try/finally` to ensure `conn.close()` is always called.

Evidence trail:
backend/src/processes/management/commands/sync_files_to_file_service.py - full file viewed at REVIEWED_COMMIT. Connection opened at line 98 (`conn = self.get_file_db_connection()`), inner try/except only catches psycopg2.Error (lines 139-165), potential exception source at line 135 (`attr.account.get_owner()`), cleanup at lines 172-175 (`conn.close()`) is not protected by try/finally.

Comment on lines +37 to +44
command = UploadFileCommand(
file_content=b'test file content',
filename='test_file.txt',
content_type='text/plain',
size=18,
user_id=1,
account_id=1,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Low integration/test_use_cases_integration.py:37

UploadFileCommand is constructed with size=18 but the actual file_content b'test file content' is only 17 bytes, so the test exercises the system with mismatched metadata. If production code validates that size matches the content length, this test would incorrectly fail; if it doesn't, the test may mask bugs in size-dependent logic. Consider setting size=17 to match the actual content length.

         command = UploadFileCommand(
             file_content=b'test file content',
             filename='test_file.txt',
             content_type='text/plain',
-            size=18,
+            size=17,
             user_id=1,
             account_id=1,
         )
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file storage/tests/integration/test_use_cases_integration.py around lines 37-44:

`UploadFileCommand` is constructed with `size=18` but the actual `file_content` `b'test file content'` is only 17 bytes, so the test exercises the system with mismatched metadata. If production code validates that `size` matches the content length, this test would incorrectly fail; if it doesn't, the test may mask bugs in size-dependent logic. Consider setting `size=17` to match the actual content length.

Evidence trail:
storage/tests/integration/test_use_cases_integration.py lines 37-46 (REVIEWED_COMMIT): Shows `UploadFileCommand` with `file_content=b'test file content'` and `size=18`.

Byte count verification: `b'test file content'` = 17 bytes (t,e,s,t, ,f,i,l,e, ,c,o,n,t,e,n,t)

storage/src/presentation/api/files.py lines 66-67 (REVIEWED_COMMIT): Production code calculates `file_size = len(file_content)` ensuring consistency, which the test bypasses.

storage/src/application/use_cases/file_upload.py lines 48-56 (REVIEWED_COMMIT): Shows `command.size` is directly used to create `FileRecord` without validation against content length.

Comment on lines +278 to +279
except (ValueError, TypeError, IntegrityError):
return [], bool(existent_file_ids)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium storage/utils.py:278

When the transaction in refresh_attachments_for_event rolls back, the function returns has_attachments = bool(existent_file_ids). Since existent_file_ids only contains IDs that appear in both the new file_ids and existing attachments, it's empty when the text references entirely new files. The rollback restores the original attachments, so returning False is wrong — the event still has attachments. This causes _refresh_workflow_event_attachments to incorrectly set event.with_attachments = False.

Consider querying the actual attachment count after rollback, or restructuring to avoid the stale has_attachments value.

-    except (ValueError, TypeError, IntegrityError):
-        return [], bool(existent_file_ids)
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file backend/src/storage/utils.py around lines 278-279:

When the transaction in `refresh_attachments_for_event` rolls back, the function returns `has_attachments = bool(existent_file_ids)`. Since `existent_file_ids` only contains IDs that appear in both the new `file_ids` and existing attachments, it's empty when the text references entirely new files. The rollback restores the original attachments, so returning `False` is wrong — the event still has attachments. This causes `_refresh_workflow_event_attachments` to incorrectly set `event.with_attachments = False`.

Consider querying the actual attachment count after rollback, or restructuring to avoid the stale `has_attachments` value.

Evidence trail:
backend/src/storage/utils.py lines 247-253 (existent_file_ids computation), lines 257-278 (transaction and exception handler returning bool(existent_file_ids)), lines 493-508 (_refresh_workflow_event_attachments using has_attachments to update event.with_attachments). Test at backend/src/storage/tests/test_workflow_event_attachments.py lines 148-181 confirms different scenario (event with no pre-existing attachments).

Comment on lines +72 to +91
source_type = SourceType.ACCOUNT

task_id = None
if attr.event_id and attr.event.task_id:
task_id = attr.event.task_id
elif attr.output_id and attr.output.task_id:
task_id = attr.output.task_id

template_id = None
if attr.workflow_id:
template_id = attr.workflow.template_id
elif (
attr.event_id and
attr.event.task_id and
attr.event.task.workflow_id
):
template_id = attr.event.task.workflow.template_id
elif attr.output_id and attr.output.workflow_id:
template_id = attr.output.workflow.template_id

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High commands/migrate_file_attachment_to_attachment.py:72

source_type is hardcoded to SourceType.ACCOUNT regardless of which relationships are populated. When workflow_id, task_id, or template_id are set, the migrated attachment still reports SourceType.ACCOUNT, causing any logic that branches on source_type to misroute. Consider deriving source_type from the actual relationship: use SourceType.TASK if task_id is set, SourceType.WORKFLOW if workflow_id is set, SourceType.TEMPLATE if template_id is set, and fall back to SourceType.ACCOUNT only when none apply.

-                source_type = SourceType.ACCOUNT
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file backend/src/processes/management/commands/migrate_file_attachment_to_attachment.py around lines 72-91:

`source_type` is hardcoded to `SourceType.ACCOUNT` regardless of which relationships are populated. When `workflow_id`, `task_id`, or `template_id` are set, the migrated attachment still reports `SourceType.ACCOUNT`, causing any logic that branches on `source_type` to misroute. Consider deriving `source_type` from the actual relationship: use `SourceType.TASK` if `task_id` is set, `SourceType.WORKFLOW` if `workflow_id` is set, `SourceType.TEMPLATE` if `template_id` is set, and fall back to `SourceType.ACCOUNT` only when none apply.

Evidence trail:
1. backend/src/processes/management/commands/migrate_file_attachment_to_attachment.py line 72: `source_type = SourceType.ACCOUNT` hardcoded
2. Same file lines 74-89: `task_id` and `template_id` are computed from relationships but never used to set `source_type`
3. backend/src/storage/enums.py lines 4-18: `SourceType` enum with ACCOUNT, WORKFLOW, TASK, TEMPLATE values and docstring "Defines which entity the file is attached to"
4. backend/src/storage/services/attachments.py lines 75-79: branching logic `if self.instance.source_type == SourceType.TASK: ... elif ... TEMPLATE ... elif ... WORKFLOW`
5. backend/src/storage/models.py line 22: index on `[source_type, account]` confirming it's used for queries

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

def save(self, *args, **kwargs):
super().save(*args, **kwargs)
from src.authentication.services.public_auth import PublicAuthService
PublicAuthService.invalidate_template_public_tokens(self)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Template save cache invalidation breaks on transaction rollback

Medium Severity

The Template.save() override calls invalidate_template_public_tokens after super().save(), performing Redis cache deletions that are non-transactional. If save() is called within a transaction.atomic() block that later rolls back, the database reverts but the cache entry remains deleted. This temporarily breaks file service access for public/embed tokens, since the file service relies on this cache for authentication. The comment in invalidate_template_public_tokens even states "Storage service relies on this cache for public token auth."

Additional Locations (1)
Fix in Cursor Fix in Web

'error': str(ex),
},
level=SentryLogLevel.ERROR,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unhandled AttributeError when form user is None

Low Severity

In IntegrationCreateForm.save(), self.user defaults to None (via kwargs.pop('user', None)), but self.user.account is accessed inside the try block without a None check. If self.user is None, this raises an AttributeError, which is not a subclass of FileServiceException and won't be caught by the except handler, causing an unhandled crash. While the admin's get_form wrapper currently always injects request.user, the __init__ explicitly accepts None as a default, making this a latent bug.

Fix in Cursor Fix in Web

except psycopg2.Error as e:
logger.error("Error syncing %s: %s", file_id, e)
conn.rollback()
errors += 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batch rollback silently loses successfully synced records

Low Severity

In the sync loop, conn.rollback() on a single insert error rolls back ALL uncommitted records since the last batch commit, not just the failed one. Records that were successfully inserted but not yet committed (up to batch_size - 1 records) are silently lost. The synced counter still includes these rolled-back records, so the final report overestimates the number of actually synced records.

Fix in Cursor Fix in Web

user=self.user,
old_values=[old_photo],
new_values=[self.instance.photo],
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync_account_file_fields always called even when photo unchanged

Low Severity

sync_account_file_fields is called unconditionally on every partial_update in UserGroupService, even when the photo hasn't changed. In contrast, AccountService.partial_update guards the equivalent call with a change check (if (old_logo_lg, old_logo_sm) != ...). This inconsistency means unnecessary work extracting file IDs and querying attachments on every group update regardless of whether photo-related fields were modified.

Fix in Cursor Fix in Web

pneumojoseph
pneumojoseph previously approved these changes Apr 6, 2026
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 102 total unresolved issues (including 100 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 3dfa00f. Configure here.

account=user.account,
user=user,
old_values=old_values,
new_values=[],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Microsoft photo sync omits new URL

Medium Severity

After updating user.photo, apply_photo_to_user calls sync_account_file_fields with an empty new_values list. That helper only creates account-scoped Attachment rows from new_values, so a photo URL applied without a matching upload in the same flow may never get an Attachment record while old IDs can still be removed.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 3dfa00f. Configure here.

] = fs_domain
url_mapping[
f"https://storage.cloud.google.com/{account.bucket_name}"
] = fs_domain
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bucket prefix breaks migrated links

High Severity

The migration maps whole GCS bucket base URLs to the file-service base URL. Substring replacement then rewrites full object URLs into {file-service}/{object-path} instead of {file-service}/{file_id}, so many migrated links won’t match the file service’s download routes.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 3dfa00f. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Backend API changes request Frontend Web client changes request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants