Adds a GH workflow to validate XML syntax and well-formedness #5

Open
funkyfuture wants to merge 2 commits into simondschweitzer:master from delb-xml:master

@funkyfuture

ciao Simon,

i'd like to use this text corpus within the integration test suite for the delb library. unfortunately there are quite a lot of errors in the corpus.

thus i've added a GitHub workflow to validate all documents.

due to the amount of errors i don't see that it'd be reasonable to fix the issues by hand.

@funkyfuture

Author

this is the outcome of an example run: https://github.com/delb-xml/test-corpus-aed-tei/actions/runs/19209861146/job/54910465966

the fourth step includes the emitted error messages. but because find ignores the exit status of commands run via -exec ... \;, there's no non-zero exit code, so the action/workflow doesn't fail. @JKatzwinkel could we talk in the coming days about possible solutions? i don't wanna over-engineer.
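One possible way to get a failing exit code is to let a small script do the walking and report a status itself. The following is only a sketch under assumptions, not the approach taken in this PR: it swaps xmllint for Python's built-in expat-based parser (which checks well-formedness but does not replicate `--pedantic`), and the names `find_malformed` and `main` are made up for illustration. The default directory `files` is taken from the workflow step under review.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed(root):
    """Return (path, message) pairs for each *.xml file under root that fails to parse."""
    problems = []
    for path in sorted(Path(root).rglob("*.xml")):
        try:
            ET.parse(path)
        except ET.ParseError as exc:
            problems.append((str(path), str(exc)))
    return problems


def main(root="files"):
    """Print every problem to stderr and return a process exit status."""
    problems = find_malformed(root)
    for path, message in problems:
        print(f"{path}: {message}", file=sys.stderr)
    # A non-zero status is what makes a workflow step (and thus the check) fail.
    return 1 if problems else 0
```

In a workflow step this would be invoked with something like `sys.exit(main())` at the end of the script, so that a single malformed document fails the job while all errors are still listed.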

Comment thread .github/workflows/qa.yml Outdated
steps:
- run: sudo apt install libxml2-utils
- uses: actions/checkout@v5
- run: find files -type f -name '*.xml' -exec xmllint --noout --pedantic {} \;

Contributor

since when has this --pedantic flag been a thing and why is it not in the man page??

Author

my installation even has --strict-namespace. i use Arch btw.

@funkyfuture

Author

a'ight, i got this action thing tucked in.

here's an example run on a fork.

future pull requests would display error annotations in the review tab, see screenshots here
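How the action used here produces its annotations is not shown in this thread. One documented mechanism for such annotations is GitHub's workflow commands: a line of the form `::error file=...,line=...::message` written to the job log. As a hedged sketch, the function below converts a diagnostic line into that form. The regular expression assumes diagnostics shaped like `<path>:<line>: <kind> : <message>`, and the helper names are hypothetical.

```python
import re

# Assumed shape of a diagnostic line: "<path>:<line>: <kind> : <message>"
DIAGNOSTIC = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+): (?P<kind>[\w ]+?) : (?P<msg>.*)$"
)


def escape_data(value):
    """Escape characters that would break the message part of a workflow command."""
    return value.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def escape_property(value):
    """Escape a property value; ':' and ',' delimit properties, so they are encoded too."""
    return escape_data(value).replace(":", "%3A").replace(",", "%2C")


def to_annotation(line):
    """Return an ::error workflow command for a diagnostic line, or None if it doesn't match."""
    m = DIAGNOSTIC.match(line)
    if m is None:
        return None
    return (
        f"::error file={escape_property(m['file'])},line={m['line']},"
        f"title={escape_property(m['kind'])}::{escape_data(m['msg'])}"
    )
```

Printing the returned string to standard output inside a workflow step is enough for GitHub to attach the message to the given file and line.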

@funkyfuture marked this pull request as ready for review on November 25, 2025, 22:07

@JKatzwinkel

Contributor

how is the action on the container registry lol i didn't even know you could do that!?

@simondschweitzer

Owner

Thanks for your work and sorry for my late reply! I don't know what this means:
[image]
So, first of all, I will click the button "Approve workflows to run" :-)

@simondschweitzer

Owner

1 failing and 1 successful check:
[image]
Is this a problem?

@simondschweitzer

Owner

Ah, o.k., this failing check is because of the errors in the corpus, right?

@funkyfuture

Author

> how is the action on the container registry lol i didn't even know you could do that!?

this is how. and yeah, that's the usual thing w/ the GH docs: there's a lot of info in them, but it's cluttered and hard to find.

> Ah, o.k., this failing check is because of the errors in the corpus, right?

yes, it is. this summary gives the best overview for an assessment.

at least the unencoded ampersands seem suited to a simple search & replace, don't they?
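A search & replace for unencoded ampersands has to leave existing entity and character references alone. As a rough sketch (not necessarily the replacement that was applied to the corpus), a negative lookahead can do this; the pattern name and function name are made up for illustration. Note the stated limitation: the pattern does not know about CDATA sections or comments, where a literal `&` is legal, so it would rewrite those as well.

```python
import re

# "&" that does NOT start a named reference (&amp;), a decimal character
# reference (&#38;) or a hexadecimal character reference (&#x26;).
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text):
    """Replace every bare '&' with '&amp;' and leave existing references untouched."""
    return BARE_AMPERSAND.sub("&amp;", text)
```

Because already-escaped references are skipped, running the function a second time over its own output changes nothing, which makes it safe to apply to a partially fixed corpus.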

@funkyfuture

Author

i did that search & replace; interestingly it affects exactly 500 files.

also, note that the editor (VS Code) removed some CR characters at line endings, leaving only the LFs.
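If the line-ending changes are unwanted noise in the diff, a scripted replacement can leave them untouched. The sketch below is an illustration under assumptions, not what was done in this PR: the helper name `rewrite_in_place` is hypothetical, and UTF-8 is assumed as the file encoding. Opening a file with `newline=""` turns off Python's newline translation for both reading and writing, so CRLF and LF endings come out exactly as they went in.

```python
def rewrite_in_place(path, transform, encoding="utf-8"):
    """Apply transform to a file's text without altering its line endings.

    Returns True if the file was changed. UTF-8 is an assumption.
    """
    # newline="" disables newline translation, so "\r\n" stays "\r\n".
    with open(path, "r", encoding=encoding, newline="") as fh:
        original = fh.read()
    updated = transform(original)
    if updated == original:
        return False  # Avoid touching files that need no change.
    with open(path, "w", encoding=encoding, newline="") as fh:
        fh.write(updated)
    return True
```

Any text transformation can be passed as `transform`, for example a function that escapes bare ampersands; counting the `True` results gives the number of affected files.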
