Skip to content

Displaying the case definition text with the actual format.#13937

Open
KarnaiahPesula wants to merge 3 commits into
developmentfrom
bugfix-13935-case-classification-info-format-issue
Open

Displaying the case definition text with the actual format.#13937
KarnaiahPesula wants to merge 3 commits into
developmentfrom
bugfix-13935-case-classification-info-format-issue

Conversation

@KarnaiahPesula
Copy link
Copy Markdown
Contributor

@KarnaiahPesula KarnaiahPesula commented May 20, 2026

Fixes #

Summary by CodeRabbit

  • Bug Fixes
    • Preserves additional simple rich-text (e.g., underline/font/style) while safely escaping disallowed content.
    • Handles empty input and common HTML entities to avoid malformed output.
  • New Features
    • Plain http/https text is auto-linked as styled external links that open in a new tab with secure attributes.
  • Improvements
    • HTML sanitization is more flexible and configurable for allowed tags/attributes.

Review Change Stack

@KarnaiahPesula KarnaiahPesula linked an issue May 20, 2026 that may be closed by this pull request
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Walkthrough

CaseDataForm replaces URL-only linkification with a combined regex detecting HTML tag fragments or http(s) URLs, unescapes HTML entities, sanitizes with a controlled Jsoup Safelist via HtmlHelper, preserves safe tag fragments, and converts plain-text URLs into secure external links.

Changes

Text Sanitization and URL Linkification

Layer / File(s) Summary
HtmlHelper: relaxed clean overload and settings
sormas-api/src/main/java/de/symeda/sormas/api/utils/HtmlHelper.java
Adds org.jsoup.nodes.Document import and a new cleanHtmlRelaxed(String htmlText, Safelist whitelist) overload that disables pretty-printing; expands the existing cleanHtmlRelaxed(String) to an explicit relaxed Safelist allowing additional tags and style attributes.
UI imports and combined pattern
sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java
Adds Jsoup Safelist and HtmlHelper imports and introduces RICH_TEXT_OR_URL_PATTERN that matches either HTML tag fragments or http(s) URLs.
sanitizeAndLinkify implementation and unescape helper
sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java
Rewrites sanitizeAndLinkify to return "" for null/empty input, unescape common entities, use HtmlHelper.cleanHtmlRelaxed with a configured Safelist, pass through matched safe tag fragments, and wrap matched plain URLs into escaped, styled external <a> elements with target="_blank" and rel="noopener noreferrer"; adds unescapeHtml helper and removes old URL-only logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • raulbob
  • obinna-h-n

Poem

🐰 I nibble on entities, unescape with delight,
Tags I let pass when their edges are right,
Plain URLs I wrap in a safe little nest,
target="_blank" and rel do the rest.
Hop, click, and display — the form looks its best!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description is incomplete; it contains only the template placeholder without specifying the related issue number or any substantive explanation of the changes. Replace the template placeholder with the actual issue number (e.g., 'Fixes #13935') and add details about what the fix addresses and why the changes were necessary.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: displaying case definition text with preserved formatting through HTML sanitization and linkification.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bugfix-13935-case-classification-info-format-issue

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java`:
- Around line 1663-1668: The code in CaseDataForm that appends matcher.group(1)
(htmlTag) directly into result (the htmlTag/result.append path) allows XSS;
change it to allowlist safe tags only (e.g., div, span, p, br, b, i, ul, li) and
only append the tag if htmlTag's name and attributes match that allowlist,
otherwise escape or strip the tag before appending; locate the logic around
matcher.group(1)/htmlTag/result.append and enforce the allowlist check (or
replace manual checks with a call to an HTML sanitizer such as OWASP Java HTML
Sanitizer) so no untrusted tags/attributes (like script or on* handlers) are
passed through.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7c9f8774-acf7-4cb2-a819-7090d1d785cc

📥 Commits

Reviewing files that changed from the base of the PR and between 5d78b2f and 7cc9ac2.

📒 Files selected for processing (1)
  • sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java

Comment thread sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java`:
- Around line 1674-1684: The current sanitization in CaseDataForm (the htmlTag
handling that checks lowerTag.contains("onclick"/"onerror"/"onload") and then
calls escapeHtml(htmlTag) or result.append(htmlTag)) is incomplete and allows
many XSS bypasses; replace this custom blocklist with a robust sanitizer: call
HtmlHelper.cleanHtmlRelaxed(htmlTag, HtmlHelper.EVENTACTION_WHITELIST) (or the
existing jsoup Safelist-based helper) for each fragment instead of manual string
checks, ensuring all attributes matching /^on\w+/ and dangerous URL schemes like
data: and vbscript: are removed; update the code path that currently uses
escapeHtml(htmlTag) / result.append(htmlTag) to use the cleaned output and
remove the manual lowerTag contains checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5832b922-b002-4a46-98b9-36c052717003

📥 Commits

Reviewing files that changed from the base of the PR and between 7cc9ac2 and 088860f.

📒 Files selected for processing (1)
  • sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java

Comment thread sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java (1)

1682-1690: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

URLs with ampersands will be double-encoded and break.

After Jsoup sanitization, & in text becomes &amp;. When a URL like https://example.com?a=1&b=2 is matched from sanitized text, it's already https://example.com?a=1&amp;b=2. Applying escapeHtml() again produces https://example.com?a=1&amp;amp;b=2, breaking the link.

🐛 Proposed fix to handle pre-encoded URLs
         } else if (url != null) {
             // It's a plain-text URL. Wrap it in your custom blue link styling.
-            String escapedUrl = escapeHtml(url);
+            // URL from sanitized text may already have HTML entities - decode first, then encode
+            String decodedUrl = url.replace("&amp;", "&").replace("&lt;", "<").replace("&gt;", ">").replace("&quot;", "\"").replace("&`#39`;", "'");
+            String escapedUrl = escapeHtml(decodedUrl);
             result.append("<a href=\"")
                 .append(escapedUrl)
                 .append("\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"color: `#197de1`; text-decoration: underline;\">")
                 .append(escapedUrl)
                 .append("</a>");
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java` around
lines 1682 - 1690, The matched URL from sanitized text may already contain HTML
entities (e.g., "&amp;") and calling escapeHtml(url) double-encodes ampersands;
in CaseDataForm (the block that builds the "<a href=...>" using escapeHtml(url))
first unescape any existing HTML entities (e.g., via
StringEscapeUtils.unescapeHtml4 or equivalent) to get the raw URL, then escape
for HTML once for the href attribute and link text; ensure you use the
unescaped-then-escaped value for both the href and the displayed text so
ampersands are not double-encoded and the link remains valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@sormas-api/src/main/java/de/symeda/sormas/api/utils/HtmlHelper.java`:
- Around line 66-69: The new overload cleanHtmlRelaxed(String htmlText, Safelist
whitelist) lacks a null check for htmlText and will NPE in Jsoup.clean(); update
the method to mirror the other helpers (e.g., cleanHtml, cleanI18nString) by
returning null immediately if htmlText is null before creating
Document.OutputSettings or calling Jsoup.clean, keeping the existing use of
Safelist and OutputSettings.

---

Outside diff comments:
In `@sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java`:
- Around line 1682-1690: The matched URL from sanitized text may already contain
HTML entities (e.g., "&amp;") and calling escapeHtml(url) double-encodes
ampersands; in CaseDataForm (the block that builds the "<a href=...>" using
escapeHtml(url)) first unescape any existing HTML entities (e.g., via
StringEscapeUtils.unescapeHtml4 or equivalent) to get the raw URL, then escape
for HTML once for the href attribute and link text; ensure you use the
unescaped-then-escaped value for both the href and the displayed text so
ampersands are not double-encoded and the link remains valid.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dfa85a35-ac63-4b50-89e2-3a817ee9d343

📥 Commits

Reviewing files that changed from the base of the PR and between 088860f and d24227b.

📒 Files selected for processing (2)
  • sormas-api/src/main/java/de/symeda/sormas/api/utils/HtmlHelper.java
  • sormas-ui/src/main/java/de/symeda/sormas/ui/caze/CaseDataForm.java

Comment on lines +66 to +69
public static String cleanHtmlRelaxed(String htmlText, Safelist whitelist) {
Document.OutputSettings outputSettings = new Document.OutputSettings().prettyPrint(false);
return Jsoup.clean(htmlText, "", whitelist, outputSettings);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing null check for htmlText parameter.

The new overload doesn't handle null input, unlike all other methods in this class (e.g., cleanHtml, cleanI18nString, the other cleanHtmlRelaxed). Passing null will cause a NullPointerException from Jsoup.clean().

🛡️ Proposed fix to add null handling
 public static String cleanHtmlRelaxed(String htmlText, Safelist whitelist) {
+    if (htmlText == null) {
+        return "";
+    }
     Document.OutputSettings outputSettings = new Document.OutputSettings().prettyPrint(false);
     return Jsoup.clean(htmlText, "", whitelist, outputSettings);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public static String cleanHtmlRelaxed(String htmlText, Safelist whitelist) {
Document.OutputSettings outputSettings = new Document.OutputSettings().prettyPrint(false);
return Jsoup.clean(htmlText, "", whitelist, outputSettings);
}
public static String cleanHtmlRelaxed(String htmlText, Safelist whitelist) {
if (htmlText == null) {
return "";
}
Document.OutputSettings outputSettings = new Document.OutputSettings().prettyPrint(false);
return Jsoup.clean(htmlText, "", whitelist, outputSettings);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sormas-api/src/main/java/de/symeda/sormas/api/utils/HtmlHelper.java` around
lines 66 - 69, The new overload cleanHtmlRelaxed(String htmlText, Safelist
whitelist) lacks a null check for htmlText and will NPE in Jsoup.clean(); update
the method to mirror the other helpers (e.g., cleanHtml, cleanI18nString) by
returning null immediately if htmlText is null before creating
Document.OutputSettings or calling Jsoup.clean, keeping the existing use of
Safelist and OutputSettings.

@KarnaiahPesula KarnaiahPesula requested a review from raulbob May 26, 2026 06:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Case Classification info popup not displaying correctly

2 participants