Hardening#169
Open
peterstadler wants to merge 27 commits into
Open
Conversation
only allow to load XML files from local eXist db
with hardened `eutil:getDoc#1`
with hardened `eutil:getDoc#1`
with hardened `eutil:getDoc#1`
and replace with hardened `eutil:getDoc#1`
There was a problem hiding this comment.
Pull request overview
This PR hardens XML document loading by restricting eutil:getDoc#1 to internal eXist-db URIs and updating multiple XQuery modules/endpoints to use this wrapper instead of calling doc() directly.
Changes:
- Added URI validation + a dedicated error QName to
eutil:getDoc#1and introducedeutil:isInternalDbUri. - Replaced many direct
doc(...)usages witheutil:getDoc(...)acrossdata/xqmmodules anddata/xqlendpoints. - Added missing
eutilmodule imports in severaldata/xqlscripts.
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| data/xqm/eutil.xqm | Adds internal-URI enforcement in eutil:getDoc and introduces URI validation helper. |
| data/xqm/edition.xqm | Switches edition parsing/helpers to use eutil:getDoc. |
| data/xqm/work.xqm | Uses eutil:getDoc when loading work documents. |
| data/xqm/annotation.xqm | Uses eutil:getDoc in annotationsToJSON (module still contains other doc() calls). |
| data/xql/getZone.xql | Uses hardened document loader for MEI zone lookups. |
| data/xql/getXml.xql | Uses hardened loader for returning document fragments by ID. |
| data/xql/getWorkID.xql | Uses hardened loader for work ID resolution. |
| data/xql/getText.xql | Uses hardened loader for text doc + XSLT loading during transforms. |
| data/xql/getSummary.xql | Uses hardened loader for related work/image document loads. |
| data/xql/getReducedDocument.xql | Loads source doc via hardened loader (still contains a direct doc() for PI-based XSLT). |
| data/xql/getPreferences.xql | Loads default/project preferences via hardened loader. |
| data/xql/getParts.xql | Loads MEI via hardened loader. |
| data/xql/getPages.xql | Loads MEI via hardened loader (comment updated accordingly). |
| data/xql/getOverlays.xql | Adds eutil import and uses hardened loader for MEI. |
| data/xql/getOverlayOnPage.xql | Adds eutil import and uses hardened loader for MEI. |
| data/xql/getNavigatorConfig.xql | Loads edition doc via hardened loader. |
| data/xql/getMusicInMdiv.xql | Adds eutil import and uses hardened loader for MEI. |
| data/xql/getMovementsFirstPage.xql | Adds eutil import and uses hardened loader for MEI. |
| data/xql/getMovements.xql | Adds eutil import and uses hardened loader for MEI. |
| data/xql/getMeasuresOnPage.xql | Adds eutil import and uses hardened loader for MEI. |
| data/xql/getMeasurePage.xql | Adds eutil import (currently with an invalid module path). |
| data/xql/getMeasure.xql | Adds eutil import (currently with an invalid module path). |
| data/xql/getLanguageFile.xql | Loads locale/project language files via hardened loader (import path currently invalid). |
| data/xql/getInternalIdType.xql | Loads document via hardened loader (import path currently invalid). |
| data/xql/getHelp.xql | Loads help XML + XSLTs via hardened loader (import path currently invalid). |
| data/xql/getConcordances.xql | Loads MEI via hardened loader. |
| data/xql/getAnnotationText.xql | Loads annotation target doc via hardened loader. |
| data/xql/getAnnotationsOnPage.xql | Loads source MEI via hardened loader. |
| data/xql/getAnnotations.xql | Loads MEI once via hardened loader and reuses it for totals. |
| data/xql/getAnnotationPreviews.xql | Loads docs via hardened loader when resolving referenced elements. |
| data/xql/getAnnotationOpenAllUris.xql | Loads docs via hardened loader when resolving participants/ranges. |
| data/xql/getAnnotationMeta.xql | Loads annotation doc via hardened loader. |
| data/xql/getAnnotationInfos.xql | Loads MEI via hardened loader. |
| data/xql/getAnnotation.xql | Uses hardened loader for multiple dereferences + fixes typos in docs/comments. |
Comments suppressed due to low confidence (1)
data/xql/getReducedDocument.xql:45
- This file still loads an XSLT referenced via
xml-stylesheetprocessing instruction using a directdoc($xslInstruction)later in the query. That bypasses the hardenedeutil:getDocpolicy and can reintroduce remote/file URI loading via crafted input documents; switch toeutil:getDoc($xslInstruction)(or otherwise validate/allowlist the PI href).
let $doc := eutil:getDoc($uri)
let $xsl := '../xslt/reduceToSelection.xsl'
let $doc :=
transform:transform($doc, $xsl,
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
with hardened `eutil:getDoc#1`
to prevent resolution attempts
with hardened `eutil:getDoc#1`
to treat double ad triple slashes alike
for validating user input
and `$eutil:langDoc#1` to prevent unvalidated user input
…id noisy logs Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…id noisy logs Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Member
Author
|
Hi @fmacca , you may take over now, if you have some time to look into this. |
…ode and not as path
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description, Context and related Issue
Allowing to load arbitrary XML files (from anywhere) is a major security issue in the XML world and should be avoided.
This PR updates the function
eutil:getDoc#1to only load files from the local database and replaces all direct calls to thedoc-function with the hardened wrapper functioneutil:getDoc#1.How Has This Been Tested?
I tested this by starting two Docker containers:
edirom/edirom-online-backend:developimageI then loaded the same clarinet quintet data xar into both containers and ran the following script which simply curls some endpoints and diffs the changes:
The output from looks good – there are only some expected differences in the
@srcattributes:Types of changes
Overview