feat: support added to export content libraries to git #38026
marslanabdulrauf wants to merge 9 commits into openedx:master
Thanks for the pull request, @marslanabdulrauf!
Once you've gone through the following steps, feel free to tag the maintainers in a comment and let them know that your changes are ready for engineering review.
🔘 Get product approval
If you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide context
To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much relevant information to the PR description as you can.
🔘 Get a green build
If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.
Details
Where can I find more information?
If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:
When can I expect my changes to be merged?
Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly.
💡 As a result it may take up to several weeks or months to complete a review and merge your PR.
63668a5 to 24001ee
I was surprised this didn't exist already in libraries v2? What's the export strategy for libraries v2 from Studio?
asadali145 left a comment
This works, but I have a few questions on the code.
24001ee to 9b02669
asadali145 left a comment
Looks better. I have a couple of suggestions, and we also need to add tests.
    from cms.djangoapps.contentstore.storage import course_import_export_storage

    from . import api
    from .api import create_library_v2_zip
We can remove this import and call the function as api.create_library_v2_zip() instead.
    def export_to_git(course_id, repo, user='', rdir=None):
        """Export a course to git."""
    def export_library_v2_to_zip(library_key, root_dir, library_dir, user=None):
How about we move this to api.backup.py as well, like create_library_v2_zip?
Yes, it makes sense 👍
b36e709 to 30baaab
4862309 to a4af66f
asadali145 left a comment
Can we add tests for the new and updated utils?
        return root_dir, file_path


    def export_library_v2_to_zip(library_key, root_dir, library_dir, user=None):
I think the name is a bit misleading: this does not export to a zip, it extracts the zip created by create_library_v2_zip.
    # Get user object for backup API
    user_obj = get_user_model().objects.filter(username=user).first()
    temp_dir, zip_path = create_library_v2_zip(library_key, user_obj)

    try:
        # Target directory for extraction
        target_dir = os.path.join(root_dir, library_dir)

        # Create target directory if it doesn't exist
        os.makedirs(target_dir, exist_ok=True)

        # Extract zip contents (will overwrite existing files)
        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            zip_ref.extractall(target_dir)

        log.info('Extracted library v2 backup to %s', target_dir)

    finally:
        # Cleanup temporary files
        if temp_dir.exists():
Some of the code comments here are not needed, since the code is simple and readable enough without them. There are also some unnecessary blank lines.
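For reference, the create, extract, and clean-up flow under review can be sketched with only the standard library. This is a minimal illustration and not the PR's code: make_backup_zip is a hypothetical stand-in for create_library_v2_zip, and the archive contents are made up.

```python
import shutil
import zipfile
from pathlib import Path
from tempfile import mkdtemp


def make_backup_zip():
    """Hypothetical stand-in for create_library_v2_zip: returns (temp_dir, zip_path)."""
    temp_dir = Path(mkdtemp())
    zip_path = temp_dir / "backup.zip"
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("library.toml", "title = 'Demo'\n")
    return temp_dir, zip_path


def extract_backup_to_dir(root_dir, library_dir):
    """Extract the backup zip into root_dir/library_dir, then remove the temp dir."""
    temp_dir, zip_path = make_backup_zip()
    try:
        target_dir = Path(root_dir) / library_dir
        target_dir.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(zip_path, "r") as zf:
            zf.extractall(target_dir)  # overwrites files that already exist
        return target_dir
    finally:
        # Runs whether or not extraction succeeded.
        shutil.rmtree(temp_dir, ignore_errors=True)
```

Using shutil.rmtree with ignore_errors=True in the finally block is one way to drop the explicit existence check; whether that fits the real function depends on how the temporary directory is created there.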
    is_library_v2 = isinstance(content_key, LibraryLocatorV2)
    if is_library_v2:
        # V2 libraries use backup API with zip extraction
        export_xml_func = export_library_v2_to_zip
        content_type_label = "library"
    elif isinstance(content_key, LibraryLocator):
        export_xml_func = export_library_to_xml
        content_type_label = "library"
    else:
        export_xml_func = export_course_to_xml
        content_type_label = "course"
nit: we could use default dict here.
a4af66f to bad0d90
asadali145 left a comment
Works well for me. As discussed, we will add tests after an approval from core contributors.
        return root_dir, file_path


    def export_library_v2_to_dir(library_key, root_dir, library_dir, user=None):
nit:
    - def export_library_v2_to_dir(library_key, root_dir, library_dir, user=None):
    + def extract_library_v2_zip_to_dir(library_key, root_dir, library_dir, user=None):
d1aa766 to 9c6d943
@marslanabdulrauf can you rebase this PR?
@mphilbrick211 can you recommend someone from the Libraries v2 team who might do a second review on this PR?
a7f35b6 to 6e3b9bd
@marslanabdulrauf this one needs a rebase too.
27398c6 to 86a8815
86a8815 to 6bfad43
@pdpinch: I can review this.
Reviewing this today. There's a fairly major PR in flight to account for openedx-core refactoring, but I think this PR is unaffected. Before I begin the review, can you please let me know if there were AI tools used to assist in the creation of this PR? If so, that's fine, as long as it adheres to our policy. I just want to know what was used. Thank you.
ormsbee left a comment
Some minor requests for changes, and some optional nits.
@pdpinch, @marslanabdulrauf, @asadali145: At some point after the release, I'd like to have a discussion as to what we would need to do in order to gracefully move the git export functionality out to a plugin.
Thanks folks.
    def export_to_git(course_id, repo, user='', rdir=None):
        """Export a course to git."""
    def export_to_git(content_key, repo, user='', rdir=None):
    - def export_to_git(content_key, repo, user='', rdir=None):
    + def export_to_git(context_key, repo, user='', rdir=None):
Our terminology for a key that can be a Library or Course is a LearningContextKey, so the var should be named context_key. But I'm a bit confused how this is working, since my understanding is that you'd have to modify the git_export.py command parsing to go from CourseKey.from_string() to LearningContextKey.from_string(), and that doesn't seem to be a part of this PR? For unfortunate historical reasons, LibraryLocator is a kind of CourseKey, but the new v2 library keys aren't.
        Export a course or library to git.

        Args:
            content_key: CourseKey or LibraryLocator for the content to export
But also LibraryLocatorV2 now, right?
    is_library_v2 = isinstance(content_key, LibraryLocatorV2)
    if is_library_v2:
        # V2 libraries use backup API with zip extraction
        content_export_func = extract_library_v2_zip_to_dir
    elif isinstance(content_key, LibraryLocator):
        content_export_func = export_library_to_xml
    else:
        content_export_func = export_course_to_xml
        content_type_label = "course"
Nit (optional): You do this setup work up here, but don't seem to use it until 65 lines later. You could do a simpler check to make sure it's a recognized key type up here (to error out early if necessary), and defer the rest of the logic until line 161. Then you don't need the indirection with the content_export_func.
So the first part could be like:
    if not isinstance(context_key, (LibraryLocatorV2, LibraryLocator, CourseLocator)):
        raise TypeError(
            f"{context_key!r} for git export must be LibraryLocatorV2, LibraryLocator, "
            f"or CourseLocator, not {type(context_key)}"
        )
And then your logic later on could look something like:
    # We must check in this order, because LibraryLocator is a subclass of CourseLocator,
    # so isinstance(context_key, CourseLocator) on a LibraryLocator would return True.
    if isinstance(context_key, (LibraryLocatorV2, LibraryLocator)):
        content_type_label = "library"
    else:
        content_type_label = "course"
    if isinstance(context_key, LibraryLocatorV2):
        # V2 Library export
        export_library_to_xml(context_key, root_dir, content_dir, user)
    else:
        # V1 Libraries and Courses both use the same modulestore-based export
        export_course_to_xml(modulestore(), contentstore(), context_key, root_dir, content_dir)
That way, each block stands on its own better, and you don't have to scan up or down the function to understand what's going on. Also, the content_type_label gets written once, rather than written with a default and then overwritten afterwards, making it just a little more obvious.
(Disclaimer: This is off the cuff and I haven't tried actually running this code, so there might be something obviously wrong with it. Please just take it as a broad suggestion.)
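The ordering concern can be checked in isolation. The sketch below uses empty stand-in classes that only mirror the subclass relationship described in the comment (LibraryLocator deriving from CourseLocator); they are not the real opaque_keys classes, and content_type_label here is a hypothetical helper function rather than code from the PR.

```python
class CourseLocator:
    """Stand-in for the real CourseLocator; not the opaque_keys class."""


class LibraryLocator(CourseLocator):
    """Stand-in: v1 library keys subclass the course key type."""


class LibraryLocatorV2:
    """Stand-in: v2 library keys are unrelated to CourseLocator."""


def content_type_label(context_key):
    """Return "library" or "course", rejecting unrecognized key types early."""
    if not isinstance(context_key, (LibraryLocatorV2, LibraryLocator, CourseLocator)):
        raise TypeError(
            f"{context_key!r} for git export must be LibraryLocatorV2, LibraryLocator, "
            f"or CourseLocator, not {type(context_key)}"
        )
    # Library checks come first: a LibraryLocator is also a CourseLocator.
    if isinstance(context_key, (LibraryLocatorV2, LibraryLocator)):
        return "library"
    return "course"
```

Swapping the two checks would label every v1 library as a course, which is the failure mode the ordering comment warns about.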
It just occurred to me that we explicitly made the locators involved here implement is_course to make this simpler to deal with. So the "library" vs. "course" logic can be:
    content_type_label = "course" if context_key.is_course else "library"

    root_dir = Path(mkdtemp())
    sanitized_lib_key = str(library_key).replace(":", "-")
    sanitized_lib_key = slugify(sanitized_lib_key, allow_unicode=True)
    timestamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
    filename = f'{sanitized_lib_key}-{timestamp}.zip'
    file_path = os.path.join(root_dir, filename)
    origin_server = getattr(settings, 'CMS_BASE', None)
Please try to refactor this so that it shares the same code path that the celery task for backup uses, instead of being a copy that might drift as conventions change.
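One possible shape for such sharing is a single helper that both callers use to build the backup file name. The sketch below is an assumption about structure only: build_backup_filename is a hypothetical name, the example key is made up, and a small regular expression stands in for Django's slugify so the snippet runs without Django.

```python
import re
from datetime import datetime


def build_backup_filename(library_key, now=None):
    """Hypothetical shared helper: '<sanitized key>-<timestamp>.zip'."""
    now = now or datetime.now()
    # Simplified stand-in for slugify: collapse anything that is not a word
    # character or hyphen into a single hyphen, then lowercase.
    sanitized = re.sub(r"[^\w-]+", "-", str(library_key)).strip("-").lower()
    timestamp = now.strftime("%Y-%m-%d-%H%M%S")
    return f"{sanitized}-{timestamp}.zip"
```

Accepting an optional now argument keeps the helper deterministic in tests; the celery task and the git export path could then both call it instead of each formatting the name themselves.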
        Exception: If backup creation or extraction fails
    """
    # Get user object for backup API
    user_obj = get_user_model().objects.filter(username=user).first()
If someone doesn't want to attach user information, that's one thing. But if someone is explicitly passing a user and that user is not found, it should be an error, and not simply silently fall back to None.
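The requested behaviour can be sketched without Django. In the snippet below, resolve_user and UserNotFoundError are hypothetical names, and a plain dict stands in for the user table; in Django itself, calling objects.get(username=...) instead of filter(...).first() raises the model's DoesNotExist exception when no row matches, which gives a similar effect.

```python
class UserNotFoundError(LookupError):
    """Hypothetical error for an explicitly requested but unknown username."""


def resolve_user(username, users_by_name):
    """Return None when no username is given; raise when a given username is unknown."""
    if not username:
        # Caller chose not to attach user information.
        return None
    try:
        return users_by_name[username]
    except KeyError:
        raise UserNotFoundError(f"No user with username {username!r}") from None
```

This keeps the "no user supplied" case silent while turning a mistyped or stale username into an immediate failure.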
        return root_dir, file_path


    def extract_library_v2_zip_to_dir(library_key, root_dir, library_dir, user=None):
    - def extract_library_v2_zip_to_dir(library_key, root_dir, library_dir, user=None):
    + def extract_library_v2_zip_to_dir(library_key, root_dir, library_dir, username=None):
Otherwise, this could be mistaken for a User obj, when it's being used as a username here.

Ticket
https://github.com/mitodl/hq/issues/10083
Description
This pull request enhances the export_to_git workflow to support exporting both course and library content, including v2 libraries, to a git repository. It introduces a new export mechanism for v2 libraries using a zip-based backup API, refactors the main export function for broader content support, and improves logging and commit messages for clarity.
Support for v2 library export and generalization of export logic:
- export_library_v2_to_zip function to export v2 libraries using the backup API, creating a zip backup and extracting it to the target directory.
- export_to_git accepts a general content_key (course or library), automatically detects the content type, and selects the appropriate export function for courses, v1 libraries, or v2 libraries.
Improvements to commit messages and logging:
Dependency and import updates:
- shutil, zipfile, datetime, Path, mkdtemp, slugify, and LibraryLocatorV2.
Supporting information
Link to other information about the change, such as Jira issues, GitHub issues, or Discourse discussions.
Be sure to check they are publicly readable, or if not, repeat the information here.
Testing instructions
Call the export_to_git function with different content_keys to check that they are exported properly to GitHub.
I have been testing it with ol-openedx-git-auto-export. This Library export feature is not released yet.