Commit 355da03

feat: implement submission file workflow with comprehensive testing
- Create Submission file model
- Add custom storage to use ORA bucket
- Develop SubmissionFileProcessor to create files in DB
- Add TestSubmissionFileProcessor with extensive test coverage
- Verify complete file processing workflow
- Validation tests for file addition in create_external_grader_detail
- Document architectural decisions with ADR
- Add test queue folder in .gitignore
1 parent 1096759 commit 355da03

8 files changed

Lines changed: 589 additions & 6 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -63,3 +63,4 @@ test.txt.tmp

 # VSCode
 .vscode
+test_queue/*
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+2. File Handling Implementation for Submission System
+#####################################################
+
+Status
+******
+
+**Provisional** *2025-02-14*
+
+Implemented by https://github.com/openedx/edx-submissions/pull/286
+
+Context
+*******
+
+As part of the XQueue migration effort detailed in ADR 0001, we need to implement a file handling system within edx-submissions. Currently, XQueue manages file submissions through a tightly coupled approach.
+
+### Current Limitations
+
+1. **Inadequate File Management**: XQueue's approach relies on JSON strings in character fields, with size constraints and manual URL manipulation for file handling.
+
+2. **Process Inefficiencies**: The current system uses synchronous HTTP for file retrieval, lacks proper validation, and tightly couples submission processing with file handling.
+
+3. **Integration Challenges**: External graders depend on specific URL formats with HTTP-based retrieval, embedding file information directly in submission payloads.
+
+Decision
+********
+
+We will implement a dedicated file management system for the assessment submission process, focusing on workflow and educational needs:
+
+1. **Centralized Storage**: Create a unified repository for student-submitted files, ensuring they are properly associated with their assessments and accessible throughout the grading process.
+
+2. **Streamlined Workflow**: Design a clear process where files are automatically processed during submission creation, securely stored, and efficiently delivered to grading systems.
+
+3. **Consistent Experience**: Maintain compatibility with existing systems to ensure a smooth transition, allowing instructors and external graders to access files without changes to their established workflows.
+
+Consequences
+************
+
+Positive:
+---------
+
+1. **Architecture**: Clean separation of concerns, improved maintainability, better error handling, optimized database access
+
+2. **Integration**: Seamless xqueue-watcher compatibility, support for existing workflows, minimal client code changes
+
+3. **Operations**: Robust file validation, improved tracking, better error visibility, simplified lifecycle management
+
+Negative:
+---------
+
+1. **Technical**: Additional database structures
+
+2. **Migration**: Temporary system complexity, additional testing needs
+
+3. **Performance**: File processing overhead
+
+References
+**********
+
+Current System Documentation:
+* XQueue Repository: https://github.com/openedx/xqueue
+* XQueue Watcher Repository: https://github.com/openedx/xqueue-watcher
+
+Related Repositories:
+* edx-submissions: https://github.com/openedx/edx-submissions
+* edx-platform: https://github.com/openedx/edx-platform
+* XQueue Repository: https://github.com/openedx/xqueue
+
+Related Documentation:
+* ADR 0001: Creation of ExternalGraderDetail Model for XQueue Migration
+
+Future Event Integration:
+* Open edX Events Framework: https://github.com/openedx/openedx-events
+* Event Bus Documentation: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Event+Bus
+
+Related Architecture Documents:
+* Open edX Architecture Guidelines: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Architecture+Guidelines
+

submissions/api.py

Lines changed: 67 additions & 3 deletions
@@ -15,6 +15,7 @@
 # SubmissionError imported so that code importing this api has access
 from submissions.errors import (  # pylint: disable=unused-import
     ExternalGraderQueueEmptyError,
+    FileProcessingError,
     SubmissionError,
     SubmissionInternalError,
     SubmissionNotFoundError,
@@ -28,6 +29,7 @@
     ScoreSummary,
     StudentItem,
     Submission,
+    SubmissionFile,
     score_reset,
     score_set
 )
@@ -49,7 +51,6 @@
 TOP_SUBMISSIONS_CACHE_TIMEOUT = 300


-# pylint: disable=unused-argument
 def create_external_grader_detail(student_item_dict,
                                   answer,
                                   queue_name: str,
@@ -93,9 +94,13 @@ def create_external_grader_detail(student_item_dict,
                 submission_uuid=submission_uuid,
                 queue_name=queue_name,
                 grader_file_name=grader_file_name,
-                points_possible=points_possible,
+                points_possible=points_possible)
+
+            files_dict = external_grader_additional_data.get('files')
+            if files_dict:
+                file_processor = SubmissionFileProcessor(instance)
+                file_processor.process_files(files_dict)

-            )
             return instance

     except DatabaseError as error:
@@ -106,6 +111,65 @@ def create_external_grader_detail(student_item_dict,
         raise SubmissionInternalError(error_message) from error


+class SubmissionFileProcessor:
+    """
+    Process file operations for submissions
+    """
+
+    def __init__(self, external_grader):
+        self.external_grader = external_grader
+
+    def process_files(self, files_dict):
+        """
+        Process uploaded files from an Open edX environment and store them as SubmissionFile objects.
+
+        This method specifically handles native OpenedX FileObjForWebobFiles objects.
+
+        The method performs the following operations:
+        1. Stores each file directly as a SubmissionFile object in the database
+        2. Returns URLs in xqueue-compatible format
+
+        Args:
+            files_dict (dict): Dictionary mapping filenames to file objects.
+                Format: {filename: file_object, ...}
+
+        Returns:
+            dict: Dictionary mapping original filenames to xqueue URLs.
+                Format: {filename: "/queue_name/uuid", ...}
+
+        Example:
+            >>> external_grader = ExternalGraderDetail.create_from_uuid(
+                submission_uuid=submission_uuid,
+                queue_name=queue_name,
+                grader_file_name=grader_file_name,
+                points_possible=points_possible)
+            >>> file_processor = SubmissionFileProcessor(external_grader)
+            >>> files = {'assignment.py': file_obj}
+            >>> urls = file_processor.process_files(files)
+            >>> print(urls)
+            {'assignment.py': '/my_queue/550e8400-e29b-41d4-a716-446655440000'}
+        """
+        files_urls = {}
+        for filename, file_obj in files_dict.items():
+            submission_file = SubmissionFile.objects.create(
+                external_grader=self.external_grader,
+                file=file_obj.file,
+                original_filename=filename
+            )
+            files_urls[filename] = submission_file.xqueue_url
+
+        return files_urls
+
+    def get_files_for_grader(self):
+        """
+        Returns files in format expected by xqueue-watcher
+        """
+        return {
+            file.original_filename: file.file.url
+            for file in self.external_grader.files.all()
+        }
+
+
 def create_submission(
     student_item_dict,
     answer,

submissions/errors.py

Lines changed: 11 additions & 0 deletions
@@ -40,6 +40,17 @@ class ExternalGraderQueueEmptyError(SubmissionError):
     """


+class FileProcessingError(SubmissionError):
+    """
+    Exception raised when there's an error reading or processing a file.
+
+    This exception is raised when file operations fail, such as:
+    - I/O errors when reading file content
+    - OS errors during file operations
+    - Unicode decoding errors when processing file content
+    """
+
+
 class SubmissionRequestError(SubmissionError):
     """
     This error is raised when there was a request-specific error
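
The hunks shown here import `FileProcessingError` into `api.py` but do not raise it, so how it is meant to be used is not visible in this commit. One plausible pattern, given the docstring, is to wrap low-level read and decode failures. The sketch below assumes that pattern; `read_text` is a hypothetical helper, and the two exception classes are local stand-ins so the example runs on its own.

```python
import io


class SubmissionError(Exception):
    """Local stand-in for submissions.errors.SubmissionError."""


class FileProcessingError(SubmissionError):
    """Local stand-in mirroring the class added in the diff."""


def read_text(file_obj, encoding="utf-8"):
    """Hypothetical helper: read and decode a file, normalising low-level failures."""
    try:
        return file_obj.read().decode(encoding)
    except (OSError, UnicodeDecodeError) as error:
        # Chain the original exception so the root cause stays visible.
        raise FileProcessingError(f"Could not process file: {error}") from error


ok = read_text(io.BytesIO(b"print('hi')"))

try:
    read_text(io.BytesIO(b"\xff\xfe\xfa"))  # not valid UTF-8
except FileProcessingError as error:
    caught = error
```

Because the class derives from `SubmissionError`, callers that already catch `SubmissionError` would also catch it.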
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Generated by Django 4.2.19 on 2025-05-15 14:14
+
+from django.db import migrations, models
+import django.db.models.deletion
+import django.utils.timezone
+import submissions.models
+import uuid
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('submissions', '0004_externalgraderdetail'),
+    ]
+
+    operations = [
+        migrations.CreateModel(
+            name='SubmissionFile',
+            fields=[
+                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
+                ('uuid', models.UUIDField(default=uuid.uuid4, editable=False)),
+                ('file', models.FileField(max_length=512, storage=submissions.models.get_storage,
+                                          upload_to=submissions.models.submission_file_path)),
+                ('original_filename', models.CharField(max_length=255)),
+                ('created_at', models.DateTimeField(default=django.utils.timezone.now)),
+                ('external_grader', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL,
+                                                      related_name='files', to='submissions.externalgraderdetail')),
+            ],
+            options={
+                'indexes': [models.Index(fields=['external_grader', 'uuid'], name='submissions_externa_ff8089_idx')],
+            },
+        ),
+    ]
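
The migration records `storage=submissions.models.get_storage`, a reference to a module-level callable rather than a storage instance. In `models.py` that callable delegates to a `functools.cache`-wrapped helper. The sketch below reproduces only that shape: `settings` and `default_storage` are hypothetical stand-ins for the Django objects, and strings stand in for storage backends.

```python
import functools
from types import SimpleNamespace

# Hypothetical stand-ins for django.conf.settings and Django's default_storage.
settings = SimpleNamespace(EDX_SUBMISSIONS={"MEDIA": "configured-storage"})
default_storage = "default-storage"


@functools.cache
def _get_storage_cached():
    """Resolve the storage once; later calls reuse the cached result."""
    config = getattr(settings, "EDX_SUBMISSIONS", {})
    return config.get("MEDIA") or default_storage


def get_storage():
    """Plain module-level function that can be referenced by dotted path."""
    return _get_storage_cached()


first = get_storage()
settings.EDX_SUBMISSIONS = {}          # settings change after the first call
second = get_storage()                 # still the cached value
_get_storage_cached.cache_clear()      # explicit reset
third = get_storage()                  # now falls back to the default
```

One consequence of the cache is that a settings change made after the first call has no effect until `cache_clear()` is called, which can matter in tests that override settings.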

submissions/models.py

Lines changed: 99 additions & 0 deletions
@@ -9,12 +9,15 @@
     ./manage.py makemigrations submissions
 """

+import functools
 import logging
+import os
 from datetime import timedelta
 from uuid import uuid4

 from django.conf import settings
 from django.contrib import auth
+from django.core.files.storage import default_storage
 from django.db import DatabaseError, models, transaction
 from django.db.models.signals import post_save, pre_save
 from django.dispatch import Signal, receiver
@@ -693,3 +696,99 @@ def update_status(self, new_status):
     def create_from_uuid(cls, submission_uuid, **kwargs):
         submission = Submission.objects.get(uuid=submission_uuid)
         return cls.objects.create(submission=submission, **kwargs)
+
+
+def submission_file_path(instance, _):
+    """
+    Generate file path for submission files.
+    Format: queue_name/uuid
+    The filename is replaced with the UUID to ensure uniqueness without preserving extension.
+    """
+    return os.path.join(
+        instance.external_grader.queue_name,
+        f"{instance.uuid}"
+    )
+
+
+@functools.cache
+def _get_storage_cached():
+    """
+    Cached implementation to get the configured storage backend.
+    This function is for internal use only.
+    """
+    edx_submissions_config = getattr(settings, 'EDX_SUBMISSIONS', {})
+    storage_config = edx_submissions_config.get('MEDIA')
+
+    if storage_config:
+        return storage_config
+
+    return default_storage
+
+
+def get_storage():
+    """
+    Get the configured storage backend or fallback to default storage.
+    Private helper with caching to avoid Django migration serialization errors.
+
+    This function checks for a storage configuration in the Django settings.
+    It first looks for 'MEDIA' in the 'EDX_SUBMISSIONS' configuration dictionary.
+
+    Returns:
+        Storage instance: Returns the configured storage if found in EDX_SUBMISSIONS['MEDIA'],
+        otherwise returns Django's default_storage.
+
+    Example:
+        # In settings
+        from storages.backends.s3boto3 import S3Boto3Storage
+        EDX_SUBMISSIONS = {
+            'MEDIA': S3Boto3Storage(bucket_name='my-bucket')
+        }
+
+        # Then get_storage() will return the S3Boto3Storage instance
+    """
+    return _get_storage_cached()  # For performance while keeping this function serializable for migrations
+
+
+class SubmissionFile(models.Model):
+    """
+    Model to handle files associated with submissions
+    """
+    uuid = models.UUIDField(default=uuid4, editable=False)  # legacy S3 key
+    external_grader = models.ForeignKey(
+        'submissions.ExternalGraderDetail',
+        on_delete=models.SET_NULL,
+        related_name='files',
+        null=True,
+    )
+    file = models.FileField(
+        upload_to=submission_file_path,
+        max_length=512,
+        storage=get_storage
+    )
+    original_filename = models.CharField(max_length=255)  # This is necessary to send file name to xqueue-watcher
+    created_at = models.DateTimeField(default=now)
+
+    class Meta:
+        indexes = [
+            models.Index(fields=['external_grader', 'uuid']),
+        ]
+
+    @property
+    def xqueue_url(self):
+        """
+        Returns a URL in the XQueue-compatible format: /queue_name/uuid
+
+        This format is used for file references in both the legacy XQueue system
+        and the new integrated standard. It maintains backward compatibility
+        while supporting the migration from the external XQueue API to the
+        integrated Open edX solution.
+
+        The URL follows the pattern: /{queue_name}/{submission_uuid}
+        where:
+        - queue_name: identifies the external grader queue
+        - uuid: uniquely identifies this submission (legacy S3 key)
+
+        Returns:
+            str: Formatted URL path following XQueue conventions
+        """
+        return f"/{self.external_grader.queue_name}/{self.uuid}"
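
The storage key produced by `submission_file_path` and the value of `xqueue_url` are built from the same two fields. The sketch below shows how they line up, assuming a record with a fixed UUID. The `SimpleNamespace` objects are hypothetical stand-ins for model instances, and `xqueue_url` is written as a plain function here rather than a property.

```python
import os
import uuid
from types import SimpleNamespace


def submission_file_path(instance, _filename):
    # Mirrors the diff: the uploaded filename is ignored, the UUID names the file.
    return os.path.join(instance.external_grader.queue_name, f"{instance.uuid}")


def xqueue_url(instance):
    # Mirrors the property in the diff: /queue_name/uuid
    return f"/{instance.external_grader.queue_name}/{instance.uuid}"


# Hypothetical record; the UUID value is the one used in the docstring example.
record = SimpleNamespace(
    external_grader=SimpleNamespace(queue_name="my_queue"),
    uuid=uuid.UUID("550e8400-e29b-41d4-a716-446655440000"),
)

path = submission_file_path(record, "assignment.py")
url = xqueue_url(record)
```

Neither value contains the uploaded name or its extension, which is consistent with the model keeping `original_filename` as a separate column. Note that `os.path.join` uses the host OS separator, so `path` matches `url` without its leading slash only where that separator is `/`.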
