Conversation
| public static String sanitize(String description) { | ||
| if (description == null) { | ||
| return null; | ||
| } | ||
| return MARKDOWN_POLICY.sanitize(description); |
There was a problem hiding this comment.
🚨 Security: DescriptionSanitizer not applied on PATCH/PUT updates
The new DescriptionSanitizer is only called during entity creation (EntityRepository.java:1192) and task resolution (EntityRepository.java:8253). However, PATCH and PUT operations that update an entity's description bypass sanitization entirely — EntityUpdater.updateDescription() (line ~6624) records the change without calling DescriptionSanitizer.sanitize(). An attacker can create a clean entity, then update its description via PATCH/PUT with a malicious XSS payload, defeating the stored-XSS protection.
The fix should also sanitize descriptions in the update path. One approach is to sanitize in EntityUpdater.updateDescription() before recording the change, or to add sanitization in prepareInternal() so it covers both create and update flows.
Suggested fix:
// In EntityUpdater.updateDescription() (~line 6616), sanitize before recording:
private void updateDescription() {
String origDesc = DescriptionSanitizer.sanitize(original.getDescription());
String updatedDesc = DescriptionSanitizer.sanitize(updated.getDescription());
// ... use sanitized values for comparison and recording
}
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java
Show resolved
Hide resolved
openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java
Show resolved
Hide resolved
| || ex instanceof IllegalArgumentException | ||
| || ex instanceof BadRequestException) { | ||
| return getResponse(Response.status(Response.Status.BAD_REQUEST).build(), ex); | ||
| return getResponse(BAD_REQUEST, "Invalid request parameter"); |
There was a problem hiding this comment.
⚠️ Edge Case: Generic error message loses diagnostic info for BadRequest
The change at line 53 replaces the original exception message with a hardcoded "Invalid request parameter" for all ProcessingException, IllegalArgumentException, and BadRequestException cases. While this correctly prevents information leakage, it makes debugging significantly harder — a client won't know which parameter was invalid or why. Consider returning a safe subset of the exception message (e.g., from BadRequestException which already contains user-facing messages) while stripping internal details only for ProcessingException and IllegalArgumentException.
Suggested fix:
// Differentiate between exception types:
if (ex instanceof BadRequestException) {
return getResponse(BAD_REQUEST, ex.getMessage());
} else {
return getResponse(BAD_REQUEST, "Invalid request parameter");
}
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
✅ TypeScript Types Auto-UpdatedThe generated TypeScript types have been automatically updated based on JSON schema changes in this PR. |
There was a problem hiding this comment.
Pull request overview
This PR applies a set of hardening changes driven by penetration testing, focusing on reducing impersonation vectors in feed APIs, tightening error responses, adding additional security headers, and introducing server-side HTML sanitization for descriptions.
Changes:
- Remove caller-controlled
fromrequirement from feed thread/post creation requests and derive author from the authenticated principal. - Add server-side description sanitization (OWASP HTML Sanitizer) and reduce error-detail leakage in exception mappers.
- Add configurable COEP/COOP/CORP security headers and update example TLS protocol/cipher configuration.
Reviewed changes
Copilot reviewed 16 out of 18 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| openmetadata-ui/src/main/resources/ui/package.json | Bumps dompurify dependency version. |
| openmetadata-spec/src/main/resources/json/schema/api/feed/createThread.json | Removes from from required fields for thread creation request. |
| openmetadata-spec/src/main/resources/json/schema/api/feed/createPost.json | Removes from from required fields for post creation request. |
| openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java | Adds a new OWASP-based sanitizer for stored description content. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/feeds/PostMapper.java | Sets post author from authenticated user rather than request payload. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/feeds/FeedMapper.java | Sets thread createdBy from authenticated user rather than request payload. |
| openmetadata-service/src/main/java/org/openmetadata/service/OpenMetadataApplication.java | Adjusts JSON processing exception mapper detail behavior. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java | Applies description sanitization in selected write paths. |
| openmetadata-service/src/main/java/org/openmetadata/service/exception/JsonMappingExceptionMapper.java | Replaces detailed mapping error message with a generic one. |
| openmetadata-service/src/main/java/org/openmetadata/service/exception/CatalogGenericExceptionMapper.java | Replaces certain detailed error messages with generic ones. |
| openmetadata-service/src/main/java/org/openmetadata/service/config/web/CrossOriginResourcePolicyHeaderFactory.java | Adds CORP header factory. |
| openmetadata-service/src/main/java/org/openmetadata/service/config/web/CrossOriginOpenerPolicyHeaderFactory.java | Adds COOP header factory. |
| openmetadata-service/src/main/java/org/openmetadata/service/config/web/CrossOriginEmbedderPolicyHeaderFactory.java | Adds COEP header factory. |
| openmetadata-service/src/main/java/org/openmetadata/service/config/OMWebConfiguration.java | Wires new header factories into web config. |
| openmetadata-service/src/main/java/org/openmetadata/service/config/OMWebBundle.java | Applies new headers when enabled via config. |
| conf/openmetadata.yaml | Adds config knobs for new headers and updates commented TLS examples. |
| "crypto-random-string-with-promisify-polyfill": "^5.0.0", | ||
| "diff": "^8.0.3", | ||
| "dompurify": "^3.2.7", | ||
| "dompurify": "^3.3.2", |
There was a problem hiding this comment.
package.json bumps dompurify to ^3.3.2, but yarn.lock does not contain an entry for dompurify@^3.3.2 (it still has dompurify@^3.2.7). With yarn install --frozen-lockfile, this will fail CI. Please regenerate and commit the lockfile update (or keep the semver range aligned with the existing lock entry).
| "dompurify": "^3.3.2", | |
| "dompurify": "^3.2.7", |
| "from": { | ||
| "description": "Name of the User posting the message", | ||
| "type": "string" | ||
| } | ||
| }, | ||
| "required": ["message", "from"], | ||
| "required": ["message"], | ||
| "additionalProperties": false |
There was a problem hiding this comment.
The request schema no longer requires from, but the from property is still present and documented as the posting user. Since the backend now derives author from the authenticated principal, keeping this field in the request contract is misleading and invites clients to think impersonation is supported. Consider removing from from the schema (or explicitly marking it ignored/deprecated).
| }, | ||
| "default": null | ||
| } | ||
| }, | ||
| "required": ["message", "from", "about"], | ||
| "required": ["message", "about"], | ||
| "additionalProperties": false |
There was a problem hiding this comment.
The request schema no longer requires from, but the from property is still present and documented as the posting user. Since the backend now derives createdBy from the authenticated principal, this field is misleading and suggests caller-controlled authorship. Consider removing from from the schema (or explicitly marking it ignored/deprecated).
| entity.setId(UUID.randomUUID()); | ||
| entity.setName(request.getName()); | ||
| entity.setDisplayName(request.getDisplayName()); | ||
| entity.setDescription(request.getDescription()); | ||
| entity.setDescription( | ||
| org.openmetadata.service.util.DescriptionSanitizer.sanitize(request.getDescription())); | ||
| entity.setOwners(owners); |
There was a problem hiding this comment.
Description sanitization is only applied in copy(...) (create/PUT-like flows) and the description task workflow. Description changes coming from PATCH update paths still flow through updateDescription() without sanitization, leaving a stored-XSS bypass. Apply DescriptionSanitizer for all description writes (e.g., sanitize updated.getDescription() before recording/persisting).
| .allowElements("a") | ||
| .allowAttributes("href") | ||
| .matching( | ||
| (elementName, attributeName, value) -> { | ||
| if (value.startsWith("http://") | ||
| || value.startsWith("https://") | ||
| || value.startsWith("mailto:") | ||
| || value.startsWith("#")) { | ||
| return value; | ||
| } | ||
| return null; | ||
| }) | ||
| .onElements("a") |
There was a problem hiding this comment.
The href allowlist blocks all relative URLs (e.g., /path, ./doc, ../doc). Since descriptions are Markdown, relative links are a standard use case and will be stripped by this policy. Consider allowing safe relative URLs (while still blocking javascript: and other dangerous schemes).
| .allowAttributes("target") | ||
| .onElements("a") | ||
| .allowAttributes("rel") | ||
| .onElements("a") |
There was a problem hiding this comment.
target is allowed on <a> without enforcing a safe rel (e.g., noopener noreferrer). That enables reverse-tabnabbing if users include target="_blank" links. Consider either disallowing target or enforcing a safe rel value when target is present (similar to the existing email HtmlSanitizer which enforces link rels).
| .onElements("a") | |
| .onElements("a") | |
| .requireRelsOnLinks("noopener", "noreferrer") |
| /** | ||
| * Sanitizes a markdown/HTML description string by removing dangerous elements (script, iframe, | ||
| * event handlers like onerror/onclick) while preserving safe markdown-generated HTML. | ||
| * | ||
| * @param description the raw description from user input | ||
| * @return sanitized description safe for storage and rendering, or null if input is null | ||
| */ | ||
| public static String sanitize(String description) { | ||
| if (description == null) { | ||
| return null; | ||
| } | ||
| return MARKDOWN_POLICY.sanitize(description); | ||
| } |
There was a problem hiding this comment.
This new sanitizer is security-critical but has no unit tests. There are existing sanitizer tests (e.g., NotificationChannelUtilitiesTest for HtmlSanitizer). Please add tests that cover allowed elements/attributes, blocked script/event handlers, and URL protocol/relative-link behavior to prevent regressions.
| entity.setId(UUID.randomUUID()); | ||
| entity.setName(request.getName()); | ||
| entity.setDisplayName(request.getDisplayName()); | ||
| entity.setDescription(request.getDescription()); | ||
| entity.setDescription( | ||
| org.openmetadata.service.util.DescriptionSanitizer.sanitize(request.getDescription())); |
There was a problem hiding this comment.
DescriptionSanitizer is applied in EntityRepository.copy(...), but most entity create flows use EntityMapper.copy(...) (default method) which still sets description from the request without sanitization. This likely leaves the primary stored-XSS vector unmitigated for newly created/updated entities; consider moving the sanitization into the shared EntityMapper.copy(...) (or a single common persistence path) so all entity descriptions are consistently sanitized.
| // Images (src must be http/https/data URI) | ||
| .allowElements("img") | ||
| .allowAttributes("src") | ||
| .matching( | ||
| (elementName, attributeName, value) -> { | ||
| if (value.startsWith("http://") | ||
| || value.startsWith("https://") | ||
| || value.startsWith("data:image/")) { | ||
| return value; | ||
| } |
There was a problem hiding this comment.
DescriptionSanitizer currently allows img sources starting with data:image/, which includes SVG (data:image/svg+xml) and can be risky in some renderers/browsers. Consider restricting data URIs to a safe allowlist of raster types (png/jpeg/gif/webp) or dropping data URIs entirely, and prefer using allowUrlProtocols(...)/built-in helpers for URL validation to avoid missing edge cases.
| } else if (ex instanceof ProcessingException | ||
| || ex instanceof IllegalArgumentException | ||
| || ex instanceof BadRequestException) { | ||
| return getResponse(Response.status(Response.Status.BAD_REQUEST).build(), ex); | ||
| return getResponse(BAD_REQUEST, "Invalid request parameter"); | ||
| } else if (ex instanceof UnableToExecuteStatementException) { |
There was a problem hiding this comment.
Switching ProcessingException/IllegalArgumentException/BadRequestException handling to getResponse(BAD_REQUEST, ...) changes the response headers: getResponse(Status, String) always adds WWW-Authenticate: om-auth, which is misleading on 400s and may affect clients. Consider using an overload that doesn't set WWW-Authenticate for non-401 statuses, or set it conditionally only for UNAUTHORIZED.
| # validatePeers: true | ||
| # supportedProtocols: SSLv3 | ||
| # supportedCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 | ||
| # supportedProtocols: TLSv1.2,TLSv1.3 | ||
| # supportedCipherSuites: TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 |
There was a problem hiding this comment.
The commented supportedProtocols / supportedCipherSuites examples are now comma-separated strings. If a user uncomments them, YAML/Jackson will treat them as a single string value rather than a list/array, leading to misconfiguration. Prefer YAML list syntax (e.g., [TLSv1.2, TLSv1.3] and [TLS_AES_256_GCM_SHA384, ...]) to match Dropwizard/Jetty config expectations.
| "cronstrue": "^2.53.0", | ||
| "crypto-random-string-with-promisify-polyfill": "^5.0.0", | ||
| "diff": "^8.0.3", | ||
| "dompurify": "^3.2.7", | ||
| "dompurify": "^3.3.2", | ||
| "elkjs": "^0.9.3", |
There was a problem hiding this comment.
dompurify was bumped in package.json, but yarn.lock doesn’t appear to have been regenerated to reflect the new ^3.3.2 descriptor (it still contains an entry for dompurify@^3.2.7 resolving to 3.3.3). Please update the lockfile so installs/CI are deterministic and match the intended version/resolution.
| cross-origin-embedder-policy: | ||
| enabled: ${WEB_CONF_CROSS_ORIGIN_EMBEDDER_POLICY_ENABLED:-false} | ||
| option: ${WEB_CONF_CROSS_ORIGIN_EMBEDDER_POLICY_OPTION:-"require-corp"} | ||
| cross-origin-resource-policy: | ||
| enabled: ${WEB_CONF_CROSS_ORIGIN_RESOURCE_POLICY_ENABLED:-false} | ||
| option: ${WEB_CONF_CROSS_ORIGIN_RESOURCE_POLICY_OPTION:-"same-origin"} | ||
| cross-origin-opener-policy: | ||
| enabled: ${WEB_CONF_CROSS_ORIGIN_OPENER_POLICY_ENABLED:-false} | ||
| option: ${WEB_CONF_CROSS_ORIGIN_OPENER_POLICY_OPTION:-"same-origin"} |
There was a problem hiding this comment.
The PR title mentions only supported cipher suites, but this change set also adds XSS sanitization, exception message hardening, feed author spoofing prevention, and new Cross-Origin headers. Consider updating the PR title/description to reflect the broader security scope so it’s easier to track in release notes and future audits.
| if (formatFor === 'server') { | ||
| modifiedHtmlString = | ||
| modifiedHtmlString = getSanitizeContent( | ||
| blockEditorExtensionsClassBase.serializeContentForBackend( | ||
| modifiedHtmlString | ||
| ); | ||
| ) | ||
| ); | ||
|
|
||
| // Replace markers with actual entity links | ||
| entityLinkMap.forEach((entityLink, marker) => { | ||
| modifiedHtmlString = modifiedHtmlString.replace(marker, entityLink); | ||
| }); | ||
| } else { | ||
| modifiedHtmlString = getSanitizeContent(modifiedHtmlString); | ||
| } |
There was a problem hiding this comment.
formatContent(..., 'server') sanitizes the HTML before restoring the mention/hashtag entity-link markers. Because the marker replacement re-inserts a [#text](href) payload after sanitization, a malicious href taken from the original <a> tag could bypass DOMPurify and end up stored/rendered. Validate/normalize the extracted href (e.g., allowlist http/https + same-origin) before embedding it into the entity-link string, or perform sanitization after replacement in a way that still sanitizes the embedded URL.
| // Images (src must be http/https/data URI) | ||
| .allowElements("img") | ||
| .allowAttributes("src") | ||
| .matching( | ||
| (elementName, attributeName, value) -> { | ||
| if (value.startsWith("http://") | ||
| || value.startsWith("https://") | ||
| || value.startsWith("data:image/")) { | ||
| return value; | ||
| } | ||
| return null; | ||
| }) | ||
| .onElements("img") | ||
| .allowAttributes("alt", "title", "width", "height") |
There was a problem hiding this comment.
DescriptionSanitizer allows img src values starting with data:image/, which includes data:image/svg+xml. SVG-in-<img> has had XSS bypasses historically; consider disallowing SVG data URIs (or all data: URIs) and only allowing http/https image URLs or a limited set of safe data:image/{png,jpeg,gif,webp} types.
| "type": "string" | ||
| } | ||
| }, | ||
| "required": ["message", "from"], | ||
| "required": ["message"], | ||
| "additionalProperties": false |
There was a problem hiding this comment.
With from no longer required for CreatePost, any tests or client validations expecting a 4xx when from is omitted should be updated to match the new behavior (server uses the authenticated principal for from).
🔴 Playwright Results — 9 failure(s), 26 flaky✅ 2980 passed · ❌ 9 failed · 🟡 26 flaky · ⏭️ 140 skipped
Genuine Failures (failed on all attempts)❌
|
Code Review 🚫 Blocked 0 resolved / 2 findingsCipher suite security fix blocked by two critical issues: DescriptionSanitizer is missing from PATCH/PUT update paths, and generic error messages obscure diagnostic information for BadRequest errors. 🚨 Security: DescriptionSanitizer not applied on PATCH/PUT updates📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java:95-99 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:1192 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:6624 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java:62-66 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java:74 The new The fix should also sanitize descriptions in the update path. One approach is to sanitize in Suggested fix
|
| Compact |
|
Was this helpful? React with 👍 / 👎 | Gitar
| if (formatFor === 'server') { | ||
| anchorTags.forEach((tag, index) => { | ||
| const href = tag.getAttribute('href'); | ||
| const text = tag.textContent; | ||
| const fqn = tag.getAttribute('data-fqn'); | ||
| const entityType = tag.getAttribute('data-entityType'); | ||
|
|
||
| const entityLink = `<#E${ENTITY_LINK_SEPARATOR}${entityType}${ENTITY_LINK_SEPARATOR}${fqn}|[${text}](${href})>`; | ||
| const marker = `${ENTITY_LINK_MARKER_PREFIX}${index}__`; | ||
|
|
||
| entityLinkMap.set(marker, entityLink); | ||
| tag.textContent = marker; | ||
| }); | ||
| } else { | ||
| anchorTags.forEach((tag) => { | ||
| const label = tag.getAttribute('data-label'); | ||
| const type = tag.getAttribute('data-type'); | ||
| const prefix = type === 'mention' ? '@' : '#'; | ||
|
|
||
| tag.textContent = `${prefix}${label}`; | ||
| }); | ||
| } | ||
|
|
||
| let modifiedHtmlString = doc.body.innerHTML; | ||
|
|
||
| // Apply additional transformations based on format | ||
| if (formatFor === 'server') { | ||
| modifiedHtmlString = | ||
| modifiedHtmlString = getSanitizeContent( | ||
| blockEditorExtensionsClassBase.serializeContentForBackend( | ||
| modifiedHtmlString | ||
| ); | ||
| ) | ||
| ); | ||
|
|
||
| // Replace markers with actual entity links | ||
| entityLinkMap.forEach((entityLink, marker) => { | ||
| modifiedHtmlString = modifiedHtmlString.replace(marker, entityLink); | ||
| }); |
There was a problem hiding this comment.
In formatContent(..., 'server'), the HTML is sanitized before restoring entityLinkMap markers back to <#E...|[text](href)>. Since href is taken directly from the parsed <a> tag (line 117) and interpolated into entityLink (line 122), any unsafe URL scheme/value inserted into href will bypass DOMPurify in this flow. Consider restoring the entity links before sanitization and/or validating/normalizing href to an allowlist of protocols before building the entityLink string.
| entity.setDisplayName(request.getDisplayName()); | ||
| entity.setDescription(request.getDescription()); | ||
| entity.setDescription( | ||
| org.openmetadata.service.util.DescriptionSanitizer.sanitize(request.getDescription())); | ||
| entity.setOwners(owners); |
There was a problem hiding this comment.
Description sanitization is applied when copying entities (create) and in DescriptionTaskWorkflow, but regular PUT/PATCH updates still ultimately persist whatever updated.getDescription() is set to (see updateDescription()), without enforcing sanitization at the persistence layer. This leaves a path for unsanitized description HTML to be stored. Consider sanitizing descriptions in the centralized update flow (e.g., in updateDescription() or wherever request payloads are applied) so all description writes are consistently sanitized.
| // Cross-Origin-Embedder-Policy | ||
| if (webConfig.getCrossOriginEmbedderPolicyHeaderFactory() != null) { | ||
| headers.putAll(webConfig.getCrossOriginEmbedderPolicyHeaderFactory().build()); | ||
| } | ||
|
|
||
| // Cross-Origin-Resource-Policy | ||
| if (webConfig.getCrossOriginResourcePolicyHeaderFactory() != null) { | ||
| headers.putAll(webConfig.getCrossOriginResourcePolicyHeaderFactory().build()); | ||
| } | ||
|
|
||
| // Cross-Origin-Opener-Policy | ||
| if (webConfig.getCrossOriginOpenerPolicyHeaderFactory() != null) { | ||
| headers.putAll(webConfig.getCrossOriginOpenerPolicyHeaderFactory().build()); | ||
| } |
There was a problem hiding this comment.
OMWebBundle now applies COEP/CORP/COOP headers via the new factories, but error responses/pages appear to set security headers via a separate path (OMErrorPageHandler.setSecurityHeaders(...)) which currently doesn't include these new headers. If these headers are enabled, clients may see inconsistent behavior between normal and error responses. Consider updating the error-page header injection path to include the same cross-origin headers when configured.
| public static final String CROSS_ORIGIN_OPENER_POLICY_HEADER = "Cross-Origin-Opener-Policy"; | ||
|
|
||
| @JsonProperty("option") | ||
| private String option = "same-origin"; | ||
|
|
||
| @Override | ||
| protected Map<String, String> buildHeaders() { | ||
| return Collections.singletonMap(CROSS_ORIGIN_OPENER_POLICY_HEADER, option); |
There was a problem hiding this comment.
These Cross-Origin-* header factories model the header value as a free-form String (option). In this codebase, other header factories (e.g., FrameOptionsHeaderFactory) use enums to constrain/validate supported options. Consider introducing enums for COEP/CORP/COOP options to prevent invalid configuration values and make the supported set explicit.
|
|



Describe your changes:
Fixes for Pentesting
I worked on ... because ...
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>Summary by Gitar
yarn.lockfile to lock package versions for reproducible buildsThis will update automatically on new commits.