
Commit 71bef5a

Unwrap N-level chained proxy URLs to the innermost upstream.
Iteratively peel embedded root/http(s) hops in parseChainedFileUrl so multi_origin, allow_origin, and the HTTP ext handler see the final endpoint.

Assisted-by: Cursor:Composer-2.5 CursorAI
1 parent 335a7b4 commit 71bef5a

6 files changed

Lines changed: 79 additions & 12 deletions

src/XrdApps/XrdClJournalCachePlugin/README.md

Lines changed: 11 additions & 3 deletions
@@ -481,14 +481,22 @@ Clients can name a dynamic upstream inside a proxy URL:
 root://proxy.example:1095//root://origin.cern.ch:1094//store/file.dat
 ```
 
-PSS forwarding (`pss.origin =root,http,https`) accepts path-embedded upstreams such as `/root://origin.cern.ch:1094//store/file.dat` as well.
+Longer chains with **N proxies** are supported; JournalCache unwraps through every embedded hop to the innermost upstream:
+
+```
+root://proxy1//root://proxy2//root://proxy3//root://origin.cern.ch:1094//store/file.dat
+```
+
+Use the usual XRootD `//` separator before each path segment (including each embedded URL).
+
+PSS forwarding (`pss.origin =root,http,https`) accepts path-embedded upstreams such as `/root://origin.cern.ch:1094//store/file.dat` as well. Each PSS hop forwards one layer; with `multi_origin = 1`, JournalCache on a proxy unwraps the full chain in one step for open, allowlist checks, and cache keys.
 
 **Plugin options** (client config for the xrootd/PSS process):
 
 | Key | Meaning |
 |-----|---------|
-| `multi_origin = 1` | Unwrap chained URLs to the inner upstream for open + journal cache key |
-| `allow_origin = <regex>` | Allowed upstream patterns (comma-separated or repeated key); matched against full URL, location, or host |
+| `multi_origin = 1` | Unwrap chained URLs to the innermost upstream for open + journal cache key |
+| `allow_origin = <regex>` | Allowed upstream patterns (comma-separated or repeated key); matched against the fully unwrapped URL, location, or host |
 
 Environment overrides: `XRD_JOURNALCACHE_MULTI_ORIGIN`, `XRD_JOURNALCACHE_ALLOW_ORIGIN`.
 

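The `allow_origin` row in the README states that patterns are matched against the fully unwrapped URL. A minimal sketch of what that implies, assuming a hypothetical `originAllowed` helper built on `std::regex` (it is not the plugin's `OriginAllowlist`, and it checks only the full URL, not the location or host forms). The pattern is the one used by the allowlist test in this commit's test file.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; the plugin's OriginAllowlist
// may behave differently. Returns true if any pattern matches the URL.
bool originAllowed(const std::vector<std::string> &patterns,
                   const std::string &url) {
  for (const std::string &pattern : patterns) {
    const std::regex re(pattern);  // ECMAScript syntax by default
    if (std::regex_search(url, re)) {
      return true;
    }
  }
  return false;
}

// Pattern taken from the allowlist test; it is anchored at the URL start,
// so only the first scheme://host in the string is considered.
const std::vector<std::string> kPatterns = {
    R"(^root://([a-z0-9.-]+\.)?cern\.ch(:1094)?/)"};
```

Under this sketch, `originAllowed(kPatterns, "root://origin.cern.ch:1094//store/file.dat")` is accepted, while a partially unwrapped form such as `root://proxy2:1096//root://origin.cern.ch:1094//store/file.dat` is rejected because the anchored pattern sees `proxy2` as the host.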
src/XrdApps/XrdClJournalCachePlugin/html/index.html

Lines changed: 5 additions & 3 deletions
@@ -530,11 +530,13 @@ <h3>HTTP forwarding proxy (dynamic origins)</h3>
 <h3>Chained multi-origin URLs (<code>root://</code> and friends)</h3>
 <p>Clients can name a dynamic upstream inside a proxy URL:</p>
 <pre><code>root://proxy.example:1095//root://origin.cern.ch:1094//store/file.dat</code></pre>
-<p>PSS forwarding (<code>pss.origin =root,http,https</code>) also accepts path-embedded upstreams such as <code>/root://origin.cern.ch:1094//store/file.dat</code>.</p>
+<p>Longer chains with <strong>N proxies</strong> are supported; JournalCache unwraps through every embedded hop to the innermost upstream:</p>
+<pre><code>root://proxy1//root://proxy2//root://proxy3//root://origin.cern.ch:1094//store/file.dat</code></pre>
+<p>Use the usual XRootD <code>//</code> separator before each path segment. PSS forwarding (<code>pss.origin =root,http,https</code>) also accepts path-embedded upstreams such as <code>/root://origin.cern.ch:1094//store/file.dat</code>. Each PSS hop forwards one layer; with <code>multi_origin = 1</code>, JournalCache unwraps the full chain in one step for open, allowlist checks, and cache keys.</p>
 <table>
 <tr><th>Key</th><th>Meaning</th></tr>
-<tr><td><code>multi_origin = 1</code></td><td>Unwrap chained URLs to the inner upstream for open + journal cache key</td></tr>
-<tr><td><code>allow_origin = &lt;regex&gt;</code></td><td>Allowed upstream patterns (comma-separated or repeated)</td></tr>
+<tr><td><code>multi_origin = 1</code></td><td>Unwrap chained URLs to the innermost upstream for open + journal cache key</td></tr>
+<tr><td><code>allow_origin = &lt;regex&gt;</code></td><td>Allowed upstream patterns; matched against the fully unwrapped URL</td></tr>
 </table>
 <p>Environment overrides: <code>XRD_JOURNALCACHE_MULTI_ORIGIN</code>, <code>XRD_JOURNALCACHE_ALLOW_ORIGIN</code>. The HTTP ext handler accepts the same <code>allow_origin</code> lines and rejects disallowed upstreams with <strong>403</strong>.</p>
 <pre><code>multi_origin = 1

src/XrdApps/XrdClJournalCachePlugin/html/index.md

Lines changed: 11 additions & 3 deletions
@@ -315,12 +315,20 @@ Clients can name a dynamic upstream inside a proxy URL:
 root://proxy.example:1095//root://origin.cern.ch:1094//store/file.dat
 ```
 
-PSS forwarding (`pss.origin =root,http,https`) also accepts path-embedded upstreams such as `/root://origin.cern.ch:1094//store/file.dat`.
+Longer chains with **N proxies** are supported; JournalCache unwraps through every embedded hop to the innermost upstream:
+
+```
+root://proxy1//root://proxy2//root://proxy3//root://origin.cern.ch:1094//store/file.dat
+```
+
+Use the usual XRootD `//` separator before each path segment (including each embedded URL).
+
+PSS forwarding (`pss.origin =root,http,https`) also accepts path-embedded upstreams such as `/root://origin.cern.ch:1094//store/file.dat`. Each PSS hop forwards one layer; with `multi_origin = 1`, JournalCache on a proxy unwraps the full chain in one step for open, allowlist checks, and cache keys.
 
 | Key | Meaning |
 |-----|---------|
-| `multi_origin = 1` | Unwrap chained URLs to the inner upstream for open + journal cache key |
-| `allow_origin = <regex>` | Allowed upstream patterns (comma-separated or repeated); matched against full URL, location, or host |
+| `multi_origin = 1` | Unwrap chained URLs to the innermost upstream for open + journal cache key |
+| `allow_origin = <regex>` | Allowed upstream patterns (comma-separated or repeated); matched against the fully unwrapped URL, location, or host |
 
 Environment overrides: `XRD_JOURNALCACHE_MULTI_ORIGIN`, `XRD_JOURNALCACHE_ALLOW_ORIGIN`.
 

src/XrdApps/XrdClJournalCachePlugin/http/ForwardingUrl.cc

Lines changed: 22 additions & 2 deletions
@@ -69,6 +69,26 @@ EmbeddedFileUrl parseEmbeddedFromRest(std::string rest) {
   return canonicalizeFileUrl(rest);
 }
 
+EmbeddedFileUrl unwrapFullyChained(const EmbeddedFileUrl &first) {
+  EmbeddedFileUrl result = first;
+  if (!result.valid) {
+    return result;
+  }
+
+  while (true) {
+    XrdCl::URL current(result.fileUrl);
+    if (!current.IsValid()) {
+      break;
+    }
+    const EmbeddedFileUrl inner = parseEmbeddedFileUrl(current.GetPath());
+    if (!inner.valid) {
+      break;
+    }
+    result = inner;
+  }
+  return result;
+}
+
 } // namespace
 
 EmbeddedFileUrl parseEmbeddedFileUrl(const std::string &path) {
@@ -80,7 +100,7 @@ EmbeddedFileUrl parseChainedFileUrl(const std::string &url) {
   if (!url.empty() && url.front() == '/') {
     EmbeddedFileUrl embedded = parseEmbeddedFileUrl(url);
     if (embedded.valid) {
-      return embedded;
+      return unwrapFullyChained(embedded);
     }
   }
 
@@ -91,7 +111,7 @@ EmbeddedFileUrl parseChainedFileUrl(const std::string &url) {
 
   EmbeddedFileUrl embedded = parseEmbeddedFileUrl(outer.GetPath());
   if (embedded.valid) {
-    return embedded;
+    return unwrapFullyChained(embedded);
   }
 
   return canonicalizeFileUrl(url);

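The loop in `unwrapFullyChained` can be tried without XRootD headers. A minimal standalone sketch, assuming hypothetical stand-ins `pathOf` and `embeddedUrlOrEmpty` for `XrdCl::URL::GetPath` and `parseEmbeddedFileUrl`. They use plain string handling, skip canonicalization, and recognize only the `root://`, `http://`, and `https://` schemes named in the commit message, so they approximate the real helpers rather than reproduce them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for XrdCl::URL::GetPath(): everything from the first
// '/' after the authority, or an empty string if there is none.
std::string pathOf(const std::string &url) {
  const std::size_t schemeEnd = url.find("://");
  if (schemeEnd == std::string::npos) {
    return {};
  }
  const std::size_t pathStart = url.find('/', schemeEnd + 3);
  return pathStart == std::string::npos ? std::string() : url.substr(pathStart);
}

// Hypothetical stand-in for parseEmbeddedFileUrl(): returns the embedded URL
// when the path looks like "/<scheme>://...", otherwise an empty string.
std::string embeddedUrlOrEmpty(const std::string &path) {
  static const std::string kSchemes[] = {"root://", "http://", "https://"};
  const std::size_t pos = path.find_first_not_of('/');
  if (pos == std::string::npos || pos == 0) {
    return {};  // empty, all slashes, or no leading slash
  }
  for (const std::string &scheme : kSchemes) {
    if (path.compare(pos, scheme.size(), scheme) == 0) {
      return path.substr(pos);
    }
  }
  return {};
}

// Peel one embedded hop per iteration until the path no longer starts with
// an embedded URL; what remains is the innermost upstream.
std::string unwrapToInnermost(std::string url) {
  for (;;) {
    std::string inner = embeddedUrlOrEmpty(pathOf(url));
    if (inner.empty()) {
      return url;  // no further embedded hop
    }
    assert(inner.size() < url.size());  // each hop shortens the string
    url = std::move(inner);
  }
}
```

Because every hop is a proper suffix of the previous URL, the loop terminates after at most as many iterations as there are embedded schemes. With the sketch above, `unwrapToInnermost("root://proxy1:1095//root://proxy2:1096//root://origin.cern.ch:1094//store/file.dat")` yields `root://origin.cern.ch:1094//store/file.dat`, and a URL with no embedded hop is returned unchanged.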
src/XrdApps/XrdClJournalCachePlugin/http/ForwardingUrl.hh

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ struct EmbeddedFileUrl {
 //! Parse `/https://host/path`, `/root://host//path`, etc.
 EmbeddedFileUrl parseEmbeddedFileUrl(const std::string &path);
 
-//! Parse `root://proxy//root://origin//path` and path-embedded upstream URLs.
+//! Parse chained URLs and unwrap through N embedded proxies to the innermost upstream.
 EmbeddedFileUrl parseChainedFileUrl(const std::string &url);
 
 std::string resolveJournalDirWithSettings(const std::string &cacheRoot,

tests/XrdCl/XrdClJournalCache.cc

Lines changed: 29 additions & 0 deletions
@@ -728,6 +728,35 @@ TEST(ForwardingUrlTest, ParseChainedRootUrl) {
   EXPECT_EQ(chained.fileUrl.find("proxy.cern.ch"), std::string::npos);
 }
 
+TEST(ForwardingUrlTest, ParseChainedTripleRootUrl) {
+  const auto chained = JournalCache::parseChainedFileUrl(
+      "root://proxy1:1095//root://proxy2:1096//root://origin.cern.ch:1094//"
+      "store/file.dat");
+  ASSERT_TRUE(chained.valid);
+  EXPECT_NE(chained.fileUrl.find("origin.cern.ch"), std::string::npos);
+  EXPECT_NE(chained.fileUrl.find("/store/file.dat"), std::string::npos);
+  EXPECT_EQ(chained.fileUrl.find("proxy1"), std::string::npos);
+  EXPECT_EQ(chained.fileUrl.find("proxy2"), std::string::npos);
+}
+
+TEST(ForwardingUrlTest, ParseChainedTripleEmbeddedPath) {
+  const auto chained = JournalCache::parseChainedFileUrl(
+      "/root://proxy2:1096//root://origin.cern.ch:1094//store/file.dat");
+  ASSERT_TRUE(chained.valid);
+  EXPECT_NE(chained.fileUrl.find("origin.cern.ch"), std::string::npos);
+  EXPECT_EQ(chained.fileUrl.find("proxy2"), std::string::npos);
+}
+
+TEST(ForwardingUrlTest, ParseChainedMixedProtocols) {
+  const auto chained = JournalCache::parseChainedFileUrl(
+      "root://proxy:1095//root://relay:1096//https://cdn.example.org/store/"
+      "file.dat");
+  ASSERT_TRUE(chained.valid);
+  EXPECT_NE(chained.fileUrl.find("cdn.example.org"), std::string::npos);
+  EXPECT_EQ(chained.fileUrl.find("proxy"), std::string::npos);
+  EXPECT_EQ(chained.fileUrl.find("relay"), std::string::npos);
+}
+
 TEST(OriginAllowlistTest, AllowsMatchingHostOrUrl) {
   JournalCache::OriginAllowlist allowlist;
   allowlist.addPattern(R"(^root://([a-z0-9.-]+\.)?cern\.ch(:1094)?/)");
