PFC-Log v3.3 — Setup Guide
===========================
Thank you for trying PFC-Log!
Your trial license is valid for 30 days from today.
REQUIREMENTS
------------
- Linux x86_64 (Intel/AMD) — ARM64 not supported (no AWS Graviton / Apple Silicon)
- Docker Engine 20.10 or newer
- RAM: 512 MB free minimum (1 GB recommended)
- Disk: ~200 MB for image + input file size + ~10% for compressed output
(Example: 10 GB log file → needs ~11 GB free disk space)
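The disk rule above can be turned into a quick estimate. This is a minimal sketch that assumes sizes in MB and uses the guide's approximate figures (200 MB image, 10% for output); the function name is illustrative and not part of PFC-Log.

```shell
# Estimate the free disk space (in MB) needed to compress one log file.
# Assumption: ~200 MB image + input size + ~10% of the input for the output.
required_mb() (
  input_mb=$1
  echo $(( 200 + input_mb + input_mb / 10 ))
)

# Example: a 10 GB (10240 MB) log file
# required_mb 10240    # prints 11464, i.e. roughly 11 GB
#
# Free space (in MB) on the current filesystem, for comparison:
# df -Pk . | awk 'NR==2 { print int($4 / 1024) }'
```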
SETUP — 3 STEPS
---------------
1. Pull the Docker image:
docker pull impossibleforge/pfc-log:v3.3
2. Activate your license key:
Copy the attached license.key file into the same directory
as your log files. No editing needed — it's ready to use.
3. Compress & Decompress:
# Compress:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
compress /data/access.log /data/access.pfc
# Decompress:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
decompress /data/access.pfc /data/restored.log
# With preset (fast / default / max):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
compress /data/access.log /data/access.pfc --level default
NOTE: The license.key file must be in the same directory as your log files
(that directory is mounted as /data inside the container).
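Since the license file and the logs must sit in the same mounted directory, a small pre-flight check can catch a missing file before the container starts. The helper below is a hypothetical sketch; only the file name license.key comes from this guide.

```shell
# Check that a directory holds both license.key and the input log
# before it is mounted as /data. Prints what is missing and returns
# non-zero if anything is absent.
preflight() (
  dir=$1; log=$2; status=0
  [ -f "$dir/license.key" ] || { echo "missing: $dir/license.key"; status=1; }
  [ -f "$dir/$log" ]        || { echo "missing: $dir/$log"; status=1; }
  exit "$status"
)

# Usage:
# preflight "$(pwd)" access.log && docker run --rm -v "$(pwd)":/data \
#   impossibleforge/pfc-log:v3.3 compress /data/access.log /data/access.pfc
```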
PRESETS
-------
fast     -- 8 MiB blocks, maximum speed
default  -- 32 MiB blocks, best balance (recommended)
max      -- 128 MiB blocks, best compression ratio
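To get a feel for how the preset affects archive layout, the block sizes above give a rough block count for a given input. This sketch assumes the input is simply cut into fixed-size blocks; the real tool may differ (for example, through an extra dictionary block), so treat the result as an estimate only.

```shell
# Approximate number of data blocks for a given input size and preset.
# Assumption: input is cut into fixed-size blocks of 8, 32 or 128 MiB.
estimate_blocks() (
  size_bytes=$1
  case $2 in
    fast)    block_mib=8 ;;
    default) block_mib=32 ;;
    max)     block_mib=128 ;;
    *) echo "unknown preset: $2" >&2; exit 1 ;;
  esac
  block_bytes=$(( block_mib * 1024 * 1024 ))
  # Ceiling division: a partial final block still counts as one block.
  echo $(( (size_bytes + block_bytes - 1) / block_bytes ))
)

# estimate_blocks 1073741824 default   # 1 GiB at 32 MiB per block: prints 32
```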
BENCHMARK RESULTS (1 GB Apache/Nginx logs)
------------------------------------------
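Ratio = compressed size as a percentage of the original (lower is better).
PFC-Log default:  6.73%  @ 14.5 MB/s compress, 32 MB/s decompress
gzip -6:         14.29%  (about 2.1x the size of PFC-Log output)
zstd -3:         14.32%  (same ratio as gzip, just faster)
zstd -19:         8.43%  @ 0.45 MB/s compress (about 26 days/TB)
bzip2 -9:         7.39%  @ 6.4 MB/s
PFC-Log has the best ratio in this benchmark and is faster than its two
closest competitors on ratio (bzip2 -9 and zstd -19).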
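The comparisons in this table can be re-derived from the listed ratios and speeds. The helpers below are a small sketch using awk for the floating-point arithmetic; the function names are illustrative, and 1 TB is taken as 10^6 MB.

```shell
# Percentage by which output A is smaller than output B
# (both given as compressed size in % of the original).
pct_smaller() (
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (1 - a / b) * 100 }'
)

# Days needed to compress 1 TB (10^6 MB) at a given speed in MB/s.
days_per_tb() (
  LC_ALL=C awk -v s="$1" 'BEGIN { printf "%.1f\n", 1000000 / s / 86400 }'
)

# pct_smaller 6.73 14.29   # PFC-Log vs gzip -6: prints 52.9
# days_per_tb 0.45         # zstd -19: prints 25.7
# days_per_tb 14.5         # PFC-Log default: prints 0.8
```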
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
RANDOM ACCESS FEATURES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
You can inspect, query, and partially decompress .pfc files without
decompressing the entire file.
── INFO: Inspect a compressed file ──────────────────────────
Show basic file info (size, ratio, blocks, timestamp range):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
info /data/access.pfc
Show detailed block structure (offsets, sizes, timestamps per block):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
info /data/access.pfc --blocks
Example output (--blocks):
Block   Byte Offset   Comp Size    Orig Size
0                30   7,057,783   33,554,432
1         7,057,813   6,552,803   33,554,432
Hint: use these offsets with HTTP Range requests (S3, Azure, etc.)
Note: Timestamp ranges per block are stored in the .pfc.idx file.
Use the query command (without --out) to see which blocks cover a time range.
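As a sketch of how the offsets from info --blocks map onto an HTTP Range request: the range end is inclusive, so it is offset + compressed size - 1. The helper name and the curl line are illustrative assumptions; the bytes fetched this way are still compressed PFC data, not plain log text.

```shell
# Build an HTTP Range header for one block from the `info --blocks` table.
# The range end is inclusive, hence offset + size - 1.
range_header() (
  offset=$1; comp_size=$2
  echo "Range: bytes=${offset}-$(( offset + comp_size - 1 ))"
)

# Block 1 from the example output (thousands separators removed):
# range_header 7057813 6552803   # prints: Range: bytes=7057813-13610615
#
# Hypothetical use with curl (PRESIGNED_URL is a placeholder):
# curl -s -H "$(range_header 7057813 6552803)" "$PRESIGNED_URL" -o block1.bin
```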
── SEEK-BLOCK: Decompress one block only ────────────────────
Decompress a single block without touching the rest of the file.
Useful for spot-checking recent logs or sampling specific time windows.
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
seek-block 1 /data/access.pfc
# To a file:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
seek-block 1 /data/access.pfc /data/block1.log
# To stdout (pipe-friendly):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
seek-block 1 /data/access.pfc - | grep -a "ERROR"
# Count matching lines:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
seek-block 1 /data/access.pfc - | grep -ac "POST"
Note: Use grep -a (not plain grep) when piping seek-block output to
stdout. Log data may contain NUL bytes or byte sequences that are invalid
in the current locale, which makes plain grep treat the stream as binary
and suppress matching lines. grep -a forces text mode.
Note: Blocks are 0-indexed. Block 0 is always the preprocessor
dictionary and is very small — user log data starts at block 1.
Note on block boundaries: Because log preprocessing tokens can span
block edges, individual seek-block output may differ by a few bytes
from the exact corresponding slice of the full decompression.
This is expected behavior — for byte-exact results, use decompress.
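For sampling several blocks in a row, the stdout form of seek-block can be wrapped in a loop. The function below is a sketch built only from the seek-block syntax shown above; its name and arguments are illustrative. Because of the block-boundary caveat, a line cut at a block edge may be missed or counted twice, so treat the counts as approximate.

```shell
# Count lines matching a pattern in each block of a range, one block at a time.
# Uses the seek-block form shown above ("-" = write to stdout).
count_in_blocks() (
  pattern=$1; first=$2; last=$3; file=$4
  i=$first
  while [ "$i" -le "$last" ]; do
    # grep -c exits non-zero when the count is 0, so ignore its status.
    n=$(docker run --rm -v "$(pwd)":/data impossibleforge/pfc-log:v3.3 \
          seek-block "$i" "/data/$file" - | grep -ac -- "$pattern") || true
    echo "block $i: $n"
    i=$(( i + 1 ))
  done
)

# Usage:
# count_in_blocks "ERROR" 1 3 access.pfc
```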
── QUERY: Time-range search ─────────────────────────────────
Search for log lines within a specific time window.
Only the relevant blocks are decompressed — no full-file scan needed.
# Show which blocks match (no decompression):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
query /data/access.pfc --from "2025-03-01T00:00:00" --to "2025-03-01T23:59:59"
# Decompress matching blocks to file:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
query /data/access.pfc \
--from "2025-03-01T00:00:00" --to "2025-03-01T23:59:59" \
--out /data/march1.log
# Decompress matching blocks to stdout:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
query /data/access.pfc \
--from "2025-03-01T00:00:00" --to "2025-03-01T23:59:59" \
--out -
Timestamps use ISO-8601 format (UTC). Block lookup relies on the
.pfc.idx index file sitting alongside the .pfc file (it is
auto-generated during compression, in the same directory).
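Because --from and --to expect UTC timestamps, it can help to generate them rather than type them. This sketch assumes GNU date (standard on Linux), which accepts -d @EPOCH; the helper name is illustrative.

```shell
# Format a Unix epoch as the ISO-8601 UTC timestamp used by --from/--to.
# Assumption: GNU date, which accepts -d @EPOCH.
to_utc_iso() (
  date -u -d "@$1" +%Y-%m-%dT%H:%M:%S
)

# to_utc_iso 1740787200   # prints 2025-03-01T00:00:00
#
# Window covering the last hour (illustrative):
# now=$(date +%s)
# docker run --rm -v "$(pwd)":/data impossibleforge/pfc-log:v3.3 \
#   query /data/access.pfc \
#   --from "$(to_utc_iso $(( now - 3600 )))" --to "$(to_utc_iso "$now")"
```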
── TIMESTAMP INDEX ──────────────────────────────────────────
During compression, PFC-Log automatically creates a .pfc.idx file
alongside the .pfc file. This index stores the timestamp range
for each block, enabling fast time-range queries.
access.pfc ← compressed data
access.pfc.idx ← timestamp index (auto-generated, ~1 KB)
Keep both files together — whether on local disk or in cloud storage.
If the .idx file is missing, query will fall back to full-file decompression.
To compress without generating an index:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
compress /data/access.log /data/access.pfc --no-index
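Since the index should travel with the archive, a quick scan can flag archives whose index has gone missing. The helper below is a hypothetical sketch; archives compressed with --no-index will legitimately show up in its output.

```shell
# List .pfc archives in a directory that have no matching .pfc.idx file.
missing_idx() (
  for f in "$1"/*.pfc; do
    [ -e "$f" ] || continue          # directory contains no .pfc files
    [ -f "$f.idx" ] || echo "$f"
  done
)

# Usage:
# missing_idx "$(pwd)"
```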
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
S3 / CLOUD STORAGE — DIRECT BLOCK ACCESS (NEW IN v3.3)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PFC-Log v3.3 supports direct cloud access via pre-signed URLs —
no SDK, no credentials inside the container, no dependencies to install.
Works with any storage that supports HTTP Range requests:
AWS S3 / S3 Glacier Instant Retrieval
Azure Blob Storage (SAS URLs)
Cloudflare R2, Hetzner Object Storage, Wasabi, MinIO, Backblaze B2
Why pre-signed URLs?
The URL carries its own authentication — PFC just makes standard
HTTP requests. No AWS credentials inside the container, no SDK updates,
no vendor lock-in. One mechanism works everywhere.
Egress savings:
10 TB of raw logs, 1-hour query → PFC downloads ~24 MB instead of the full archive.
That's a 99.9%+ egress reduction compared with downloading the full archive.
At $0.09/GB: full download of the gzip archive (~1.43 TB) = $129 per query. PFC = $0.002.
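The cost figures can be reproduced with simple arithmetic. The sketch below assumes that "10 TB" refers to raw log volume, so the gzip -6 archive (14.29% in the benchmark above) is about 1429 GB; that reading is an inference from the $129 figure, and the helper name is illustrative.

```shell
# Egress cost in USD for downloading a number of GB at a per-GB price.
egress_cost() (
  LC_ALL=C awk -v gb="$1" -v price="$2" 'BEGIN { printf "%.3f\n", gb * price }'
)

# Assumption: 10 TB of raw logs stored as ~1429 GB of gzip output.
# egress_cost 1429 0.09    # full gzip download: prints 128.610
# egress_cost 0.024 0.09   # ~24 MB fetched by PFC: prints 0.002
```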
STEP 1 — Compress locally (as normal):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
compress /data/access.log /data/access.pfc
This produces two files:
access.pfc ← compressed archive
access.pfc.idx ← timestamp index (~1 KB, auto-generated)
STEP 2 — Upload BOTH files to your cloud storage:
# AWS S3:
aws s3 cp access.pfc s3://my-bucket/logs/
aws s3 cp access.pfc.idx s3://my-bucket/logs/
# Azure:
az storage blob upload -f access.pfc -c logs -n access.pfc
az storage blob upload -f access.pfc.idx -c logs -n access.pfc.idx
Always upload BOTH files. The .pfc.idx enables fast block lookup.
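To make it harder to upload an archive without its index, the two aws s3 cp calls can be wrapped in one helper. This is a sketch for the AWS case only; the function name is illustrative.

```shell
# Upload an archive together with its index; refuse if the index is missing.
# The destination is an s3:// prefix ending in "/", as in the commands above.
upload_pair() (
  archive=$1; dest=$2
  [ -f "$archive.idx" ] || { echo "missing index: $archive.idx" >&2; exit 1; }
  aws s3 cp "$archive" "$dest" && aws s3 cp "$archive.idx" "$dest"
)

# Usage:
# upload_pair access.pfc s3://my-bucket/logs/
```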
STEP 3 — Generate pre-signed URLs:
# AWS (valid for 1 hour — adjust --expires-in as needed):
aws s3 presign s3://my-bucket/logs/access.pfc --expires-in 3600
aws s3 presign s3://my-bucket/logs/access.pfc.idx --expires-in 3600
# Azure (SAS URL; repeat with --name access.pfc.idx for the index):
az storage blob generate-sas --full-uri \
--account-name ACCOUNT --container-name logs \
--name access.pfc --permissions r --expiry 2025-12-31T00:00:00Z
# Hetzner / Cloudflare R2 / Wasabi / MinIO:
Use your provider's presign command or dashboard.
S3-compatible providers generally support pre-signed URLs.
# rclone (for backends that support link generation, such as S3):
rclone link remote:my-bucket/logs/access.pfc
STEP 4 — Query directly from cloud (no full download):
# Time-range query (requires --idx-url for .pfc.idx):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
s3-fetch "https://my-bucket.s3.amazonaws.com/logs/access.pfc?X-Amz-..." \
--idx-url "https://my-bucket.s3.amazonaws.com/logs/access.pfc.idx?X-Amz-..." \
--from "2025-03-01T00:00:00" --to "2025-03-01T23:59:59" \
--out /data/march1.log
# Show file metadata from cloud (no download of data blocks):
docker run --rm \
impossibleforge/pfc-log:v3.3 \
s3-info "https://my-bucket.s3.amazonaws.com/logs/access.pfc?X-Amz-..."
# Show detailed block table:
docker run --rm \
impossibleforge/pfc-log:v3.3 \
s3-info "https://my-bucket.s3.amazonaws.com/logs/access.pfc?X-Amz-..." --blocks
# Fetch specific blocks by index (no .pfc.idx needed):
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
s3-fetch "https://my-bucket.s3.amazonaws.com/logs/access.pfc?X-Amz-..." \
--blocks 2 3 --out /data/blocks23.log
# Fetch a block range:
docker run --rm \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
s3-fetch "https://my-bucket.s3.amazonaws.com/logs/access.pfc?X-Amz-..." \
--block-range 1-4 --out /data/blocks1to4.log
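Steps 3 and 4 can be chained so that the short-lived URLs never need to be copied by hand. The sketch below covers the AWS case and uses only the aws s3 presign and s3-fetch options shown in this guide; the function name and its arguments are illustrative.

```shell
# Presign the archive and its index, then run a time-range query against them.
# Arguments: s3:// URI of the .pfc file, --from, --to, output file name.
cloud_query() (
  uri=$1; from=$2; to=$3; out=$4
  url=$(aws s3 presign "$uri" --expires-in 3600) || exit 1
  idx_url=$(aws s3 presign "$uri.idx" --expires-in 3600) || exit 1
  docker run --rm -v "$(pwd)":/data impossibleforge/pfc-log:v3.3 \
    s3-fetch "$url" --idx-url "$idx_url" \
    --from "$from" --to "$to" --out "/data/$out"
)

# Usage:
# cloud_query s3://my-bucket/logs/access.pfc \
#   "2025-03-01T00:00:00" "2025-03-01T23:59:59" march1.log
```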
S3 GLACIER NOTE:
S3 Glacier Instant Retrieval supports HTTP Range requests natively —
PFC s3-fetch works directly.
For S3 Glacier Flexible Retrieval / Deep Archive, restore the object first
(this happens on the AWS side), then generate a pre-signed URL for the restored object.
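For reference, a restore can be requested with the AWS CLI as sketched below. The bucket and key are placeholders, and the "Standard" tier and the default 1-day restore window are arbitrary choices for illustration. If the .pfc.idx file is archived as well, it needs the same treatment.

```shell
# Request a temporary restore of an archived object.
# Assumptions: placeholder bucket/key; "Standard" retrieval tier.
glacier_restore() (
  bucket=$1; key=$2; days=${3:-1}
  aws s3api restore-object --bucket "$bucket" --key "$key" \
    --restore-request "{\"Days\":$days,\"GlacierJobParameters\":{\"Tier\":\"Standard\"}}"
)

# Usage:
# glacier_restore my-bucket logs/access.pfc 2
#
# Check progress (the Restore field reports ongoing-request="false" when ready):
# aws s3api head-object --bucket my-bucket --key logs/access.pfc --query Restore
```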
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
FULL COMMAND REFERENCE
----------------------
docker run --rm impossibleforge/pfc-log:v3.3 --help
docker run --rm impossibleforge/pfc-log:v3.3 compress --help
docker run --rm impossibleforge/pfc-log:v3.3 decompress --help
docker run --rm impossibleforge/pfc-log:v3.3 info --help
docker run --rm impossibleforge/pfc-log:v3.3 seek-block --help
docker run --rm impossibleforge/pfc-log:v3.3 query --help
docker run --rm impossibleforge/pfc-log:v3.3 s3-info --help
docker run --rm impossibleforge/pfc-log:v3.3 s3-fetch --help
PRIVACY & DATA SECURITY
-----------------------
PFC-Log makes NO network connections for compress/decompress/query
operations, ever. s3-info and s3-fetch connect only to the pre-signed
URLs you explicitly provide.
- No telemetry, no usage tracking, no data collection
- No license server — trial validity is checked against local system clock only
- Your log files never leave your server
- Fully GDPR-compliant by design (all processing is local)
To verify this, run compress or decompress with networking disabled;
the commands behave identically:
docker run --rm --network none \
-v $(pwd):/data \
impossibleforge/pfc-log:v3.3 \
compress /data/access.log /data/access.pfc
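The offline check can be extended into a full round trip that also confirms the restored file is byte-identical to the input. This is a sketch using only the compress and decompress commands from this guide plus cmp; the function name and the .restored suffix are illustrative.

```shell
# Compress and decompress a log with networking disabled, then compare bytes.
# Run from the directory that holds the log file and license.key.
roundtrip_check() (
  log=$1
  run() {
    docker run --rm --network none -v "$(pwd)":/data \
      impossibleforge/pfc-log:v3.3 "$@"
  }
  run compress "/data/$log" "/data/$log.pfc" &&
  run decompress "/data/$log.pfc" "/data/$log.restored" &&
  cmp -s "$log" "$log.restored" && echo "round trip OK"
)

# Usage:
# roundtrip_check access.log
```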
SUPPORT & CONTACT
-----------------
Email: impossibleforge@gmail.com
Questions or feedback — just reply to this email.
--
ImpossibleForge | PFC-Log
impossibleforge@gmail.com