Commit 79fb097

Add HTTP forwarding proxy mode for dynamic upstream origins.
Parse embedded http(s) URLs from PSS forwarding paths, align journal lookup with the file plugin, and document example configs for browser-facing proxies. Assisted-by: Cursor:Composer-2.5 CursorAI
1 parent 395019b commit 79fb097

14 files changed

Lines changed: 407 additions & 74 deletions

src/XrdApps/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -79,6 +79,7 @@ if(BUILD_HTTP AND NOT XRDCL_ONLY)
 
 add_library(${XrdClJournalCacheHttpExt} MODULE
   XrdClJournalCachePlugin/http/JournalCacheHttpExt.cc
+  XrdClJournalCachePlugin/http/ForwardingUrl.cc
   XrdClJournalCachePlugin/file/CacheHeaders.cc
   XrdClJournalCachePlugin/file/Digest.cc
 )

src/XrdApps/XrdClJournalCachePlugin/README.md

Lines changed: 60 additions & 0 deletions
@@ -413,6 +413,66 @@ Per-file operational CGIs (append to the HTTP URL or XRootD open URL):
 /store/file.root?xrd.journalcache.clean=1
 ```
 
+## 7.7 HTTP forwarding proxy (dynamic origins)
+
+For a browser-facing **forwarding proxy** where each request names its own HTTP(S) upstream, use PSS forwarding mode together with `forwarding = 1` in the HTTP ext config.
+
+**Request form** (destination embedded in the path):
+
+```
+GET /https://cdn.example.org/store/file.dat HTTP/1.1
+Host: cache-proxy.example
+```
+
+**Server config** (see `http/journalcache-forwarding.cf`):
+
+```ini
+pss.origin =http,https
+all.export /
+ofs.osslib libXrdPss.so
+
+# Required: restrict allowed upstream hosts
+pss.permit /* .example.org
+
+http.exthandler journalcache libXrdClJournalCacheHttpExt-5.so \
+    /etc/xrootd/journalcache-http-forwarding.ext.conf
+```
+
+**HTTP ext config** (`http/journalcache-http-forwarding.ext.conf`):
+
+```ini
+forwarding = 1
+cache = /var/tmp/journalcache/
+flat = 0
+prefix = /
+```
+
+**Client plugins** for the xrootd process (`http/journalcache-forwarding-client.conf`):
+
+```ini
+url = *
+lib = libXrdClJournalCachePlugin-5.so
+enable = true
+cache = /var/tmp/journalcache/
+
+url = http*
+lib = libXrdClHttp.so
+enable = true
+
+url = https*
+lib = libXrdClHttp.so
+enable = true
+```
+
+In forwarding mode the ext handler:
+
+- Parses the embedded `http://` / `https://` URL from each path
+- Resolves journal files using the same cache-key rules as the file plugin
+- Issues HTTP **HEAD** to that upstream for cache metadata (when libcurl is available)
+- Skips local `root://` xattr lookups (`server` / `http_origin` are ignored)
+
+PSS turns `/https://host/path` into a client open of `https://host/path`; **XrdClHttp** performs the HTTP fetch; JournalCache journals the bytes.
+
 # 8 JournalCache in a Proxy server
 
 To run a proxy server with JournalCache you create a usual proxy configuration file:
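
Reviewer sketch (not part of the commit): the README's first bullet, parsing the embedded `http://` / `https://` URL from each path, can be illustrated without any XRootD dependency. The function names `hasPrefix` and `extractEmbeddedUrl` below are hypothetical. The committed `parseEmbeddedFileUrl` additionally validates and rebuilds the URL through `XrdCl::URL`, so its canonical output may differ from this simplified version (for example, by carrying an explicit port).

```cpp
#include <string>

// Hypothetical helper: true when `value` begins with `prefix`.
bool hasPrefix(const std::string &value, const std::string &prefix) {
  return value.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical sketch: returns the upstream URL embedded in a forwarding
// path such as "/https://host/path?x=1", or an empty string when the path
// does not name an http(s) upstream.
std::string extractEmbeddedUrl(const std::string &path) {
  // Drop the query string (find() returns npos when there is none,
  // in which case substr() keeps the whole path).
  std::string rest = path.substr(0, path.find('?'));
  // Drop the single leading slash that precedes the embedded URL.
  if (!rest.empty() && rest.front() == '/') {
    rest.erase(0, 1);
  }
  // Accept only http(s) schemes followed by at least one more character.
  const bool https = hasPrefix(rest, "https://") && rest.size() > 8;
  const bool http = hasPrefix(rest, "http://") && rest.size() > 7;
  return (https || http) ? rest : std::string();
}
```

With the request shown in the README, `extractEmbeddedUrl("/https://cdn.example.org/store/file.dat")` yields `https://cdn.example.org/store/file.dat`, while an ordinary export path such as `/store/file.root` yields an empty string.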

src/XrdApps/XrdClJournalCachePlugin/html/index.html

Lines changed: 13 additions & 0 deletions
@@ -513,6 +513,19 @@ <h3>HTTP ext handler config</h3>
 <p>Origin files can publish cache policy by setting xattrs, for example:</p>
 <pre><code>xrdfs setxattr /store/data/file.root http.cache-control 'public, s-maxage=3600'
 xrdfs setxattr /store/data/file.root http.etag '"abc123"'</code></pre>
+
+<h3>HTTP forwarding proxy (dynamic origins)</h3>
+<p>For browser clients that embed the upstream URL in the path (<code>/https://host/path</code>), enable PSS forwarding and set <code>forwarding = 1</code> in the HTTP ext config. Example files live under <code>http/</code>: <code>journalcache-forwarding.cf</code>, <code>journalcache-http-forwarding.ext.conf</code>, <code>journalcache-forwarding-client.conf</code>.</p>
+<pre><code># xrootd server
+pss.origin =http,https
+pss.permit /* .example.org
+http.exthandler journalcache libXrdClJournalCacheHttpExt-5.so \
+    /etc/xrootd/journalcache-http-forwarding.ext.conf
+
+# ext handler
+forwarding = 1
+cache = /var/tmp/journalcache/</code></pre>
+<p>Load <strong>XrdClHttp</strong> alongside JournalCache in the xrootd process client plugins. PSS fetches via HTTP(S); the ext handler parses each embedded URL for journal lookup, 304, and response headers.</p>
 </section>
 
 <section id="proxy">

src/XrdApps/XrdClJournalCachePlugin/html/index.md

Lines changed: 18 additions & 0 deletions
@@ -289,6 +289,24 @@ xrdfs setxattr /store/data/file.root http.etag '"abc123"'
 xrdfs setxattr /store/data/file.root http.last-modified 'Wed, 21 Oct 2015 07:28:00 GMT'
 ```
 
+### HTTP forwarding proxy (dynamic origins)
+
+For browser clients that embed the upstream URL in the path (`/https://host/path`), enable PSS forwarding and set `forwarding = 1` in the HTTP ext config. Example files: `http/journalcache-forwarding.cf`, `http/journalcache-http-forwarding.ext.conf`, `http/journalcache-forwarding-client.conf`.
+
+```ini
+# xrootd server
+pss.origin =http,https
+pss.permit /* .example.org
+http.exthandler journalcache libXrdClJournalCacheHttpExt-5.so \
+    /etc/xrootd/journalcache-http-forwarding.ext.conf
+
+# ext handler
+forwarding = 1
+cache = /var/tmp/journalcache/
+```
+
+Load **XrdClHttp** alongside JournalCache in the xrootd process client plugins. PSS fetches via HTTP(S); the ext handler parses each embedded URL for journal lookup, 304, and response headers.
+
 ---
 
 ## Proxy and HTTP front-end setup
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
+#include "http/ForwardingUrl.hh"
+#include "file/Digest.hh"
+
+#include "XrdCl/XrdClURL.hh"
+
+namespace JournalCache {
+namespace {
+
+bool startsWith(const std::string &value, const std::string &prefix) {
+  return value.size() >= prefix.size() &&
+         value.compare(0, prefix.size(), prefix) == 0;
+}
+
+std::string stripQuery(const std::string &path) {
+  const auto pos = path.find('?');
+  if (pos == std::string::npos) {
+    return path;
+  }
+  return path.substr(0, pos);
+}
+
+} // namespace
+
+namespace {
+
+std::string normalizeRemotePath(const std::string &path) {
+  if (path.empty()) {
+    return "/";
+  }
+  if (path[0] == '/') {
+    return path;
+  }
+  return "/" + path;
+}
+
+} // namespace
+
+EmbeddedFileUrl parseEmbeddedFileUrl(const std::string &path) {
+  EmbeddedFileUrl result;
+  std::string rest = stripQuery(path);
+  if (!rest.empty() && rest.front() == '/') {
+    rest.erase(0, 1);
+  }
+
+  if (!startsWith(rest, "https://") && !startsWith(rest, "http://")) {
+    return result;
+  }
+
+  XrdCl::URL parsed(rest);
+  if (!parsed.IsValid()) {
+    return result;
+  }
+
+  XrdCl::URL clean;
+  clean.SetProtocol(parsed.GetProtocol());
+  clean.SetHostName(parsed.GetHostName());
+  clean.SetPort(parsed.GetPort());
+  clean.SetPath(parsed.GetPath());
+  result.fileUrl = clean.GetURL();
+  result.valid = !result.fileUrl.empty();
+  return result;
+}
+
+std::string resolveJournalDirWithSettings(const std::string &cacheRoot,
+                                          const std::string &serverUrl,
+                                          const std::string &remotePath,
+                                          bool flatHierarchy,
+                                          const std::string &basePath) {
+  const std::string normPath = normalizeRemotePath(remotePath);
+  if (flatHierarchy) {
+    return cacheRoot + computeSHA256(serverUrl + normPath);
+  }
+
+  if (!basePath.empty()) {
+    const size_t pos = normPath.find(basePath);
+    if (pos != std::string::npos) {
+      return cacheRoot + normPath.substr(pos);
+    }
+    XrdCl::URL url(serverUrl);
+    const size_t urlPos = url.GetPath().find(basePath);
+    if (urlPos != std::string::npos) {
+      return cacheRoot + normPath;
+    }
+  }
+
+  XrdCl::URL url(serverUrl);
+  const std::string host =
+      url.GetHostName() + ":" + std::to_string(url.GetPort());
+  return cacheRoot + host + normPath;
+}
+
+std::string resolveJournalPathFromCacheKey(const std::string &cacheRoot,
+                                           const std::string &cacheKeyUrl,
+                                           bool flatHierarchy,
+                                           const std::string &basePath) {
+  if (cacheRoot.empty() || cacheKeyUrl.empty()) {
+    return {};
+  }
+
+  std::string journalDir;
+  if (flatHierarchy) {
+    journalDir = cacheRoot + computeSHA256(cacheKeyUrl);
+  } else {
+    XrdCl::URL url(cacheKeyUrl);
+    journalDir = resolveJournalDirWithSettings(cacheRoot, cacheKeyUrl,
+                                               url.GetPath(), flatHierarchy,
+                                               basePath);
+  }
+  return journalDir + "/journal";
+}
+
+} // namespace JournalCache
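
Reviewer sketch (not part of the commit): the default branch of `resolveJournalDirWithSettings` (non-flat layout, no matching base path) followed by the `/journal` suffix from `resolveJournalPathFromCacheKey` produces a path of the form `<cacheRoot><host>:<port><path>/journal`. The dependency-free function below, with the hypothetical name `sketchJournalPath`, reproduces only that branch. It takes host and port as explicit arguments rather than deriving them from `XrdCl::URL`, so which port the real code sees for a given URL is an assumption left to the caller.

```cpp
#include <string>

// Hypothetical illustration of the default (non-flat, no base path) layout:
// <cacheRoot><host>:<port><normalized path>/journal.
std::string sketchJournalPath(const std::string &cacheRoot,
                              const std::string &host, int port,
                              const std::string &remotePath) {
  // Same normalization as normalizeRemotePath() in the commit:
  // an empty path becomes "/", and a relative path gains a leading slash.
  std::string normPath = remotePath.empty() ? std::string("/") : remotePath;
  if (normPath.front() != '/') {
    normPath.insert(normPath.begin(), '/');
  }
  return cacheRoot + host + ":" + std::to_string(port) + normPath + "/journal";
}
```

For example, `sketchJournalPath("/var/tmp/journalcache/", "cdn.example.org", 443, "/store/file.dat")` returns `/var/tmp/journalcache/cdn.example.org:443/store/file.dat/journal`. The flat layout (SHA-256 of the cache key) and the base-path branches are deliberately left out of this sketch.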
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+#pragma once
+
+#include <string>
+
+namespace JournalCache {
+
+//! Parsed upstream URL embedded in a PSS forwarding-proxy path.
+struct EmbeddedFileUrl {
+  std::string fileUrl;
+  bool valid = false;
+};
+
+//! Parse `/https://host/path` or `/http://host/path` into a canonical file URL.
+EmbeddedFileUrl parseEmbeddedFileUrl(const std::string &path);
+
+std::string resolveJournalDirWithSettings(const std::string &cacheRoot,
+                                          const std::string &serverUrl,
+                                          const std::string &remotePath,
+                                          bool flatHierarchy,
+                                          const std::string &basePath);
+
+//! Resolve on-disk journal path using the same rules as the file plugin.
+std::string resolveJournalPathFromCacheKey(const std::string &cacheRoot,
+                                           const std::string &cacheKeyUrl,
+                                           bool flatHierarchy,
+                                           const std::string &basePath);
+
+} // namespace JournalCache
