Commit 6239acc
deps: bump org.jsoup:jsoup from 1.17.2 to 1.22.2 (#53)
Bumps [org.jsoup:jsoup](https://github.com/jhy/jsoup) from 1.17.2 to
1.22.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/jhy/jsoup/releases">org.jsoup:jsoup's
releases</a>.</em></p>
<blockquote>
<h2>jsoup Java HTML Parser release 1.22.2</h2>
<p><strong>jsoup 1.22.2</strong> is out now, with fixes and refinements
across the library. It makes editing the DOM during traversal more
predictable, refreshes the default HTML tag definitions with newer
elements and better text boundaries, and improves reliability in parsing
and HTTP transport. The release also fixes a number of edge cases in
cleaning, stream parsing, XML doctype handling, and Android
packaging.</p>
<p><strong>jsoup</strong> is a Java library for working with real-world
HTML and XML. It provides a very convenient API for extracting and
manipulating data, using the best of HTML5 DOM methods and CSS
selectors.</p>
<p><a href="https://jsoup.org/download"><strong>Download</strong></a>
jsoup now.</p>
<h2>Improvements</h2>
<ul>
<li>Expanded and clarified <code>NodeTraversor</code> support for
in-place DOM rewrites during <code>NodeVisitor.head()</code>.
Current-node edits such as <code>remove</code>, <code>replace</code>,
and <code>unwrap</code> now recover more predictably, while traversal
stays within the original root subtree. This makes single-pass tree
cleanup and normalization visitors easier to write, for example when
unwrapping presentational elements or replacing text nodes as you walk
the DOM. <a
href="https://redirect.github.com/jhy/jsoup/issues/2472">#2472</a></li>
<li>Documentation: clarified that a configured <code>Cleaner</code> may
be reused across concurrent threads, and that shared
<code>Safelist</code> instances should not be mutated while in use. <a
href="https://redirect.github.com/jhy/jsoup/issues/2473">#2473</a></li>
<li>Updated the default HTML <code>TagSet</code> for current HTML
elements: added <code>dialog</code>, <code>search</code>,
<code>picture</code>, and <code>slot</code>; made <code>ins</code>,
<code>del</code>, <code>button</code>, <code>audio</code>,
<code>video</code>, and <code>canvas</code> inline by default
(<code>Tag#isInline()</code>, aligned to phrasing content in the spec);
and added readable <code>Element.text()</code> boundaries for controls
and embedded objects via the new <code>Tag.TextBoundary</code> option.
This improves pretty-printing and keeps normalized text from running
adjacent words together. <a
href="https://redirect.github.com/jhy/jsoup/pull/2493">#2493</a></li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>Android (R8/ProGuard): added a rule to ignore the optional
<code>re2j</code> dependency when not present. <a
href="https://redirect.github.com/jhy/jsoup/issues/2459">#2459</a></li>
<li>Fixed a <code>NodeTraversor</code> regression in 1.21.2 where
removing or replacing the current node during <code>head()</code> could
revisit the replacement node and loop indefinitely. The traversal docs
now also clarify which inserted nodes are visited in the current pass.
<a
href="https://redirect.github.com/jhy/jsoup/issues/2472">#2472</a></li>
<li>Parsing during charset sniffing no longer fails if an advisory
<code>available()</code> call throws <code>IOException</code>, as seen
on JDK 8 <code>HttpURLConnection</code>. <a
href="https://redirect.github.com/jhy/jsoup/issues/2474">#2474</a></li>
<li><code>Cleaner</code> no longer makes relative URL attributes in the
input document absolute when cleaning or validating a
<code>Document</code>. URL normalization now applies only to the cleaned
output, and <code>Safelist.isSafeAttribute()</code> is side effect free.
<a
href="https://redirect.github.com/jhy/jsoup/issues/2475">#2475</a></li>
<li><code>Cleaner</code> no longer duplicates enforced attributes when
the input <code>Document</code> preserves attribute case. A case-variant
source attribute is now replaced by the enforced attribute in the
cleaned output. <a
href="https://redirect.github.com/jhy/jsoup/issues/2476">#2476</a></li>
<li>If a per-request SOCKS proxy is configured, jsoup now avoids using
the JDK <code>HttpClient</code>, because the JDK would silently ignore
that proxy and attempt to connect directly. Those requests now fall back
to the legacy <code>HttpURLConnection</code> transport instead, which
does support SOCKS. <a
href="https://redirect.github.com/jhy/jsoup/issues/2468">#2468</a></li>
<li><code>Connection.Response.streamParser()</code> and
<code>DataUtil.streamParser(Path, ...)</code> could fail on small inputs
without a declared charset, if the initial 5 KB charset sniff fully
consumed the input and closed it before the stream parse began. <a
href="https://redirect.github.com/jhy/jsoup/issues/2483">#2483</a></li>
<li>In XML mode, doctypes with an internal subset, such as
<code>&lt;!DOCTYPE root [&lt;!ENTITY name
"value"&gt;]&gt;</code>, now round-trip correctly. The subset
is preserved as raw text only; entities are not expanded and external
DTDs are not loaded. <a
href="https://redirect.github.com/jhy/jsoup/issues/2486">#2486</a></li>
</ul>
<h2>Build Changes</h2>
<ul>
<li>Migrated the integration test server from Jetty to Netty, which
actively maintains support for our minimum JDK target (8). <a
href="https://redirect.github.com/jhy/jsoup/pull/2491">#2491</a></li>
</ul>
<hr />
<p>My sincere thanks to everyone who contributed to this release!
If you have any suggestions for the next release, I would love to hear
them; please get in touch via <a
href="https://github.com/jhy/jsoup/discussions">jsoup discussions</a>,
or with me <a href="https://jhedley.com/">directly</a>.</p>
<p>You can also follow me (@<a
href="mailto:jhy@tilde.zone">jhy@tilde.zone</a>) on Mastodon / Fediverse to receive
occasional notes about jsoup releases.</p>
<h2>jsoup Java HTML Parser release 1.22.1</h2>
<p><strong>jsoup 1.22.1</strong> is out now, adding support for the
<code>re2j</code> regular expression engine for regex-based CSS
selectors, a configurable maximum parser depth, and numerous bug fixes
and improvements.</p>
<p><strong>jsoup</strong> is a Java library for working with real-world
HTML and XML. It provides a very convenient API for extracting and
manipulating data, using the best of HTML5 DOM methods and CSS
selectors.</p>
<p><a href="https://jsoup.org/download"><strong>Download</strong></a>
jsoup now.</p>
<h3>Improvements</h3>
<ul>
<li>Added support for using the <code>re2j</code> regular expression
engine for regex-based CSS selectors (e.g. <code>[attr~=regex]</code>,
<code>:matches(regex)</code>), which ensures linear-time performance for
regex evaluation. This allows safer handling of arbitrary user-supplied
query regexes. To enable, add the <code>com.google.re2j</code>
dependency to your classpath, e.g.:</li>
</ul>
<pre lang="xml"><code>&lt;dependency&gt;
  &lt;groupId&gt;com.google.re2j&lt;/groupId&gt;
  &lt;artifactId&gt;re2j&lt;/artifactId&gt;
  &lt;version&gt;1.8&lt;/version&gt;
&lt;/dependency&gt;
</code></pre>
<p>(If you already have that dependency in your classpath, but you want
to keep using the Java regex engine, you can disable re2j via
<code>System.setProperty("jsoup.useRe2j",
"false")</code>.) You can confirm that the re2j engine has
been enabled correctly by calling <code>Regex.usingRe2j()</code>. <a
href="https://redirect.github.com/jhy/jsoup/pull/2407">#2407</a></p>
</blockquote>
<p>... (truncated)</p>
</details>
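
The SOCKS fallback described in the 1.22.2 notes applies to requests that carry their own proxy. As a rough, JDK-only sketch of what such a per-request proxy object looks like: the class name `SocksProxySketch`, the helper `socksProxy`, and the `127.0.0.1:1080` endpoint are illustrative assumptions, not part of this repository or of jsoup.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

public class SocksProxySketch {
    /** Builds a SOCKS proxy descriptor without resolving the host name up front. */
    static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        // Hypothetical local SOCKS endpoint; substitute the real proxy host and port.
        Proxy proxy = socksProxy("127.0.0.1", 1080);
        // With jsoup on the classpath, this object would be handed to the request
        // as its per-request proxy; per the notes above, 1.22.2 then routes the
        // request over HttpURLConnection rather than the JDK HttpClient.
        System.out.println(proxy.type());
    }
}
```

Code that sets no per-request proxy should not be affected by this particular change.
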
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/jhy/jsoup/blob/master/CHANGES.md">org.jsoup:jsoup's
changelog</a>.</em></p>
<blockquote>
<h2>1.22.2 (2026-Apr-20)</h2>
<h3>Improvements</h3>
<ul>
<li>Expanded and clarified <code>NodeTraversor</code> support for
in-place DOM rewrites during <code>NodeVisitor.head()</code>.
Current-node edits such as <code>remove</code>, <code>replace</code>,
and <code>unwrap</code> now recover more predictably, while traversal
stays within the original root subtree. This makes single-pass tree
cleanup and normalization visitors easier to write, for example when
unwrapping presentational elements or replacing text nodes as you walk
the DOM. <a
href="https://redirect.github.com/jhy/jsoup/issues/2472">#2472</a></li>
<li>Documentation: clarified that a configured <code>Cleaner</code> may
be reused across concurrent threads, and that shared
<code>Safelist</code> instances should not be mutated while in use. <a
href="https://redirect.github.com/jhy/jsoup/issues/2473">#2473</a></li>
<li>Updated the default HTML <code>TagSet</code> for current HTML
elements: added <code>dialog</code>, <code>search</code>,
<code>picture</code>, and <code>slot</code>; made <code>ins</code>,
<code>del</code>, <code>button</code>, <code>audio</code>,
<code>video</code>, and <code>canvas</code> inline by default
(<code>Tag#isInline()</code>, aligned to phrasing content in the spec);
and added readable <code>Element.text()</code> boundaries for controls
and embedded objects via the new <code>Tag.TextBoundary</code> option.
This improves pretty-printing and keeps normalized text from running
adjacent words together. <a
href="https://redirect.github.com/jhy/jsoup/pull/2493">#2493</a></li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Android (R8/ProGuard): added a rule to ignore the optional
<code>re2j</code> dependency when not present. <a
href="https://redirect.github.com/jhy/jsoup/issues/2459">#2459</a></li>
<li>Fixed a <code>NodeTraversor</code> regression in 1.21.2 where
removing or replacing the current node during <code>head()</code> could
revisit the replacement node and loop indefinitely. The traversal docs
now also clarify which inserted nodes are visited in the current pass.
<a
href="https://redirect.github.com/jhy/jsoup/issues/2472">#2472</a></li>
<li>Parsing during charset sniffing no longer fails if an advisory
<code>available()</code> call throws <code>IOException</code>, as seen
on JDK 8 <code>HttpURLConnection</code>. <a
href="https://redirect.github.com/jhy/jsoup/issues/2474">#2474</a></li>
<li><code>Cleaner</code> no longer makes relative URL attributes in the
input document absolute when cleaning or validating a
<code>Document</code>. URL normalization now applies only to the cleaned
output, and <code>Safelist.isSafeAttribute()</code> is side effect free.
<a
href="https://redirect.github.com/jhy/jsoup/issues/2475">#2475</a></li>
<li><code>Cleaner</code> no longer duplicates enforced attributes when
the input <code>Document</code> preserves attribute case. A case-variant
source attribute is now replaced by the enforced attribute in the
cleaned output. <a
href="https://redirect.github.com/jhy/jsoup/issues/2476">#2476</a></li>
<li>If a per-request SOCKS proxy is configured, jsoup now avoids using
the JDK <code>HttpClient</code>, because the JDK would silently ignore
that proxy and attempt to connect directly. Those requests now fall back
to the legacy <code>HttpURLConnection</code> transport instead, which
does support SOCKS. <a
href="https://redirect.github.com/jhy/jsoup/issues/2468">#2468</a></li>
<li><code>Connection.Response.streamParser()</code> and
<code>DataUtil.streamParser(Path, ...)</code> could fail on small inputs
without a declared charset, if the initial 5 KB charset sniff fully
consumed the input and closed it before the stream parse began. <a
href="https://redirect.github.com/jhy/jsoup/issues/2483">#2483</a></li>
<li>In XML mode, doctypes with an internal subset, such as
<code>&lt;!DOCTYPE root [&lt;!ENTITY name
"value"&gt;]&gt;</code>, now round-trip correctly. The subset
is preserved as raw text only; entities are not expanded and external
DTDs are not loaded. <a
href="https://redirect.github.com/jhy/jsoup/issues/2486">#2486</a></li>
</ul>
<h3>Build Changes</h3>
<ul>
<li>Migrated the integration test server from Jetty to Netty, which
actively maintains support for our minimum JDK target (8). <a
href="https://redirect.github.com/jhy/jsoup/pull/2491">#2491</a></li>
</ul>
<h2>1.22.1 (2026-Jan-01)</h2>
<h3>Improvements</h3>
<ul>
<li>Added support for using the <code>re2j</code> regular expression
engine for regex-based CSS selectors (e.g. <code>[attr~=regex]</code>,
<code>:matches(regex)</code>), which ensures linear-time performance for
regex evaluation. This allows safer handling of arbitrary user-supplied
query regexes. To enable, add the <code>com.google.re2j</code>
dependency to your classpath, e.g.:</li>
</ul>
<pre lang="xml"><code>&lt;dependency&gt;
  &lt;groupId&gt;com.google.re2j&lt;/groupId&gt;
  &lt;artifactId&gt;re2j&lt;/artifactId&gt;
  &lt;version&gt;1.8&lt;/version&gt;
&lt;/dependency&gt;
</code></pre>
<p>(If you already have that dependency in your classpath, but you want
to keep using the Java regex engine, you can disable re2j via
<code>System.setProperty("jsoup.useRe2j",
"false")</code>.) You can confirm that the re2j engine has
been enabled correctly by calling
<code>org.jsoup.helper.Regex.usingRe2j()</code>. <a
href="https://redirect.github.com/jhy/jsoup/pull/2407">#2407</a></p>
<ul>
<li>Added an instance method <code>Parser#unescape(String,
boolean)</code> that unescapes HTML entities using the parser's
configuration (e.g. to support error tracking), complementing the
existing static utility <code>Parser.unescapeEntities(String,
boolean)</code>. <a
href="https://redirect.github.com/jhy/jsoup/pull/2396">#2396</a></li>
<li>Added a configurable maximum parser depth (to limit the number of
open elements on stack) to both HTML and XML parsers. The HTML parser
now defaults to a depth of 512 to match browser behavior, and protect
against unbounded stack growth, while the XML parser keeps unlimited
depth by default, but can opt into a limit via
<code>org.jsoup.parser.Parser#setMaxDepth</code>. <a
href="https://redirect.github.com/jhy/jsoup/issues/2421">#2421</a></li>
<li>Build: added CI coverage for JDK 25 <a
href="https://redirect.github.com/jhy/jsoup/pull/2403">#2403</a></li>
<li>Build: added a CI fuzzer for contextual fragment parsing (in
addition to existing full body HTML and XML fuzzers). <a
href="https://redirect.github.com/google/oss-fuzz/pull/14041">google/oss-fuzz#14041</a></li>
</ul>
<h3>Changes</h3>
<ul>
<li>Set a removal schedule of jsoup 1.24.1 for previously deprecated
APIs.</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Previously cached child <code>Elements</code> of an
<code>Element</code> were not correctly invalidated in
<code>Node#replaceWith(Node)</code>, which could lead to incorrect
results when subsequently calling <code>Element#children()</code>. <a
href="https://redirect.github.com/jhy/jsoup/issues/2391">#2391</a></li>
<li>Attribute selector values are now compared literally without
trimming. Previously, jsoup trimmed whitespace from selector values and
from element attribute values, which could cause mismatches with browser
behavior (e.g. <code>[attr=" foo "]</code>). Now matches align
with the CSS specification and browser engines. <a
href="https://redirect.github.com/jhy/jsoup/issues/2380">#2380</a></li>
<li>When using the JDK HttpClient, any system default proxy
(<code>ProxySelector.getDefault()</code>) was ignored. Now, the system
proxy is used if a per-request proxy is not set. <a
href="https://redirect.github.com/jhy/jsoup/issues/2388">#2388</a>, <a
href="https://redirect.github.com/jhy/jsoup/pull/2390">#2390</a></li>
<li>A <code>ValidationException</code> could be thrown in the adoption
agency algorithm with particularly broken input. Now logged as a parse
error. <a
href="https://redirect.github.com/jhy/jsoup/issues/2393">#2393</a></li>
<li>Null characters in the HTML body were not consistently removed; and
in foreign content were not correctly replaced. <a
href="https://redirect.github.com/jhy/jsoup/issues/2395">#2395</a></li>
<li>An <code>IndexOutOfBoundsException</code> could be thrown when
parsing a body fragment with crafted input. Now logged as a parse error.
<a href="https://redirect.github.com/jhy/jsoup/issues/2397">#2397</a>,
<a
href="https://redirect.github.com/jhy/jsoup/issues/2406">#2406</a></li>
<li>When using StructuralEvaluators (e.g., a <code>parent child</code>
selector) across many retained threads, their memoized results could
also be retained, increasing memory use. These results are now cleared
immediately after use, reducing overall memory consumption. <a
href="https://redirect.github.com/jhy/jsoup/issues/2411">#2411</a></li>
<li>Cloning a <code>Parser</code> now preserves any custom
<code>TagSet</code> applied to the parser. <a
href="https://redirect.github.com/jhy/jsoup/issues/2422">#2422</a>, <a
href="https://redirect.github.com/jhy/jsoup/pull/2423">#2423</a></li>
</ul>
</blockquote>
<p>... (truncated)</p>
</details>
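
If `re2j` is already on this project's classpath and the upgrade should not change regex behavior, the opt-out quoted in the 1.22.1 notes can be applied at startup. A minimal sketch, assuming the property needs to be set before any jsoup selector code runs (the notes do not say when it is read); the class name `Re2jOptOut` and the helper `disableRe2j` are made up for illustration.

```java
public class Re2jOptOut {
    // Property name and value are taken from the jsoup 1.22.1 release notes.
    static final String PROPERTY = "jsoup.useRe2j";

    /** Requests the java.util.regex engine even when re2j is on the classpath. */
    static String disableRe2j() {
        System.setProperty(PROPERTY, "false");
        return System.getProperty(PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: call this early in startup, before jsoup evaluates any
        // regex-based selector such as [attr~=regex] or :matches(regex).
        System.out.println(PROPERTY + "=" + disableRe2j());
    }
}
```

According to the notes, `org.jsoup.helper.Regex.usingRe2j()` reports which engine is actually in use, so it can serve as a follow-up check once jsoup is loaded.
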
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/jhy/jsoup/commit/ac28afe6e5bf96d39fd17c3e0a797a7585e1958c"><code>ac28afe</code></a>
[maven-release-plugin] prepare release jsoup-1.22.2</li>
<li><a
href="https://github.com/jhy/jsoup/commit/52f2cd3ea2004b9be0e0a09021bac7ce2daf8ae4"><code>52f2cd3</code></a>
Improve entity example in changelog</li>
<li><a
href="https://github.com/jhy/jsoup/commit/cf6ffe08616f8633ee6113b91f9d6a07acef38c6"><code>cf6ffe0</code></a>
Add Tag#TextBoundary option; bring TagSet to spec (<a
href="https://redirect.github.com/jhy/jsoup/issues/2493">#2493</a>)</li>
<li><a
href="https://github.com/jhy/jsoup/commit/2be739c1c659a1592c402a5441f8be6f7881280c"><code>2be739c</code></a>
Bump github/codeql-action from 4 to 4.35.1 (<a
href="https://redirect.github.com/jhy/jsoup/issues/2492">#2492</a>)</li>
<li><a
href="https://github.com/jhy/jsoup/commit/45de7cbc215eb3f1189d23eaf57acf6f7b1a5edf"><code>45de7cb</code></a>
Migrate integration test server from Jetty to Netty (<a
href="https://redirect.github.com/jhy/jsoup/issues/2491">#2491</a>)</li>
<li><a
href="https://github.com/jhy/jsoup/commit/1df14edbfc327a1ef309142ef5e8ed68324de320"><code>1df14ed</code></a>
Preserve XML doctype internal subset</li>
<li><a
href="https://github.com/jhy/jsoup/commit/06fa52d15a22003b67dfdb3f8220cc025d493a43"><code>06fa52d</code></a>
Adding Contribution Guide</li>
<li><a
href="https://github.com/jhy/jsoup/commit/d4a8941820c037327538c30a8723ec715b67b6f6"><code>d4a8941</code></a>
Simplify the test; doesn't need the buffer</li>
<li><a
href="https://github.com/jhy/jsoup/commit/823709f519995492d9a092fe315af389616e58f8"><code>823709f</code></a>
Don't reuse a fully read sniffed doc for StreamParser</li>
<li><a
href="https://github.com/jhy/jsoup/commit/e1b0df5fec53710214cd700de38d82e1ca92bd79"><code>e1b0df5</code></a>
NodeFilter javadoc tweak</li>
<li>Additional commits viewable in <a
href="https://github.com/jhy/jsoup/compare/jsoup-1.17.2...jsoup-1.22.2">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

1 parent 0f40090 commit 6239acc
1 file changed
Lines changed: 1 addition & 1 deletion