Commit 4dd1a0d

Merge pull request #107 from datalogics-kam/pdfcloud-5204-initial-perl-samples
PDFCLOUD-5204 Initial Perl samples
2 parents 3f90080 + db600ef commit 4dd1a0d

11 files changed

Lines changed: 764 additions & 0 deletions

Perl/.env.example

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
PDFREST_API_KEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
# Optional: Override API base URL (default https://api.pdfrest.com)
# PDFREST_URL=https://eu-api.pdfrest.com/
# For more information visit https://pdfrest.com/pricing#how-do-eu-gdpr-api-calls-work
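
As a rough illustration of how a sample might consume the optional override above, the sketch below picks the base URL from an environment-like hash and strips trailing slashes, so that both `https://eu-api.pdfrest.com/` and `https://eu-api.pdfrest.com` behave the same. The helper name `resolve_base_url` is hypothetical and not part of this commit; only the default URL and the trailing-slash handling are taken from the changed files.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the commit): choose the API base URL from
# an environment-like hash, fall back to the documented default, and drop any
# trailing slashes so "$base/upload" never contains "//".
sub resolve_base_url {
    my ($env) = @_;
    my $base = $env->{PDFREST_URL} // 'https://api.pdfrest.com';
    $base =~ s{/+$}{};
    return $base;
}

# Default region, then the EU override as written in .env.example.
print resolve_base_url({}), "\n";
print resolve_base_url({ PDFREST_URL => 'https://eu-api.pdfrest.com/' }), "\n";
```

In a real script the argument would presumably be `\%ENV` after the `.env` file has been loaded.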

Perl/.gitignore

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# Created by https://www.toptal.com/developers/gitignore/api/perl
# Edit at https://www.toptal.com/developers/gitignore?templates=perl

### Perl ###
!Build/
.last_cover_stats
/META.yml
/META.json
/MYMETA.*
*.o
*.pm.tdy
*.bs

# Devel::Cover
cover_db/

# Devel::NYTProf
nytprof.out

# Dist::Zilla
/.build/

# Module::Build
_build/
Build
Build.bat

# Module::Install
inc/

# ExtUtils::MakeMaker
/blib/
/_eumm/
/*.gz
/Makefile
/Makefile.old
/MANIFEST.bak
/pm_to_blib
/*.zip

# Carton
local/

# End of https://www.toptal.com/developers/gitignore/api/perl

Perl/AGENTS.md

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
# Repository Guidelines

## Project Structure & Module Organization
- `Endpoint Examples/JSON Payload/`: Two-step samples (upload then operate), e.g., `markdown.pl`.
- `Endpoint Examples/Multipart Payload/`: Single multipart request samples, e.g., `rasterized-pdf.pl`.
- `Complex Flow Examples/`: Multi-step workflows that chain endpoints.
- `.env.example` → copy to `.env` and set `PDFREST_API_KEY`; optional `PDFREST_URL`.
- `README.md`: Setup and usage details for this language folder.

## Build, Test, and Development Commands
- Install deps from `cpanfile` (recommended):
  - `cpanm --installdeps .`
- macOS quick setup: `make install-macos` or run `scripts/setup-macos.sh`
- Run JSON sample:
  - `perl "Endpoint Examples/JSON Payload/markdown.pl" /path/to/input.pdf`
- Run Multipart sample:
  - `perl "Endpoint Examples/Multipart Payload/rasterized-pdf.pl" /path/to/input.pdf`
- Capture output to file: append `> response.json`


## Coding Style & Naming Conventions
- Indentation: 4 spaces; enable `use strict; use warnings; use utf8;` at top.
- Naming: `snake_case` for files/variables; script names mirror endpoints (e.g., `markdown.pl`).
- HTTP: prefer `LWP::UserAgent` + `HTTP::Request::Common`.
- Base URL: `$ENV{PDFREST_URL} // 'https://api.pdfrest.com'`.
- I/O: print API responses to STDOUT; send diagnostics to STDERR; exit non‑2xx with non‑zero status.

### Environment Loading
- Use `Dotenv` to read `.env` into `%ENV` (do not hand‑roll parsing).
- Add dependency in `cpanfile`: `requires 'Dotenv';`.
- Load with a guarded call so missing files are fine:
  - JSON/Multipart examples: `my $env_path = "$Bin/../../.env"; -e $env_path and Dotenv->load($env_path);`
  - Complex Flow examples: `my $env_path = "$Bin/../.env"; -e $env_path and Dotenv->load($env_path);`
- Do not override pre‑existing environment variables; rely on library defaults (no explicit override).

## Testing Guidelines
- No formal test suite required for samples. Validate by running against small, known inputs.
- Success: non‑zero exit on failures; JSON body printed to STDOUT on success.
- If adding tests, use `Test::More` under `t/` and run with `prove -lr t`.

## Commit & Pull Request Guidelines
- Commits: imperative, scoped, and reference endpoint/path (e.g., "Add multipart markdown sample").
- PRs: include what/why, run commands used for verification, expected response snippet, and linked issues.
- Avoid unrelated changes or large binaries.

## Security & Configuration Tips
- Do not commit secrets. Provide `.env.example`; load `PDFREST_API_KEY` from environment.
- Optional region override: `PDFREST_URL=https://eu-api.pdfrest.com/` for EU/GDPR routing.
- Never print API keys; rely on concise error messages. Respect proxies via `HTTPS_PROXY` when needed.

## Troubleshooting
- HTTPS support missing: install `LWP::Protocol::https` and `Mozilla::CA` (included in `cpanfile`). Re-run `cpanm --installdeps .`.
- SSL toolchain: some systems require OpenSSL dev libs (e.g., macOS: `brew install openssl`, Debian/Ubuntu: `apt-get install libssl-dev`) before `cpanm` can build `Net::SSLeay`/`IO::Socket::SSL`.

---

## Audience And Tone (Internal)

These Perl samples are customer‑facing and intended to help potential customers evaluate pdfRest quickly. Keep all code, comments, and documentation clear, minimal, and task‑focused. Avoid internal jargon and keep meta‑process notes out of `README.md` and the samples.

Key points:
- Clarity: explain what the sample does in 1–2 bullets.
- Guidance: show how to set up `.env` and how to run the script.
- Region: mention optional `PDFREST_URL` with the EU endpoint for GDPR and proximity.
- Safety: never log secrets; print only response bodies and minimal diagnostics to `STDERR`.
- Errors: exit non‑zero on non‑2xx responses with a concise message.

## Sample Header Convention (Internal)

Add this standardized header comment at the top of every Perl sample. This header is customer‑visible; the convention itself is tracked here for us (don’t place this template in `README.md`).

Template:

```
#!
# What this sample does:
# - <One–two bullets describing purpose and request style>
#
# Setup (.env):
# - Copy .env.example to .env
# - Set PDFREST_API_KEY=your_api_key_here
# - Optional: set PDFREST_URL to override the API region. For EU/GDPR compliance and proximity, use:
#   PDFREST_URL=https://eu-api.pdfrest.com
#   For more information visit https://pdfrest.com/pricing#how-do-eu-gdpr-api-calls-work
#
# Usage:
#   perl "<relative path to this file>" /path/to/input.pdf
#
# Output:
# - Prints the API JSON response to stdout. Non-2xx responses exit with a concise message.
# - Tip: pipe output to a file: perl ... > response.json
```

Notes:
- Match the endpoint name and request style (JSON two‑step vs multipart single request) in the bullets.
- Keep the header concise; list optional parameters near where they are used in code.

## README Scope (Internal)

Keep `README.md` focused on user setup, running samples, and high‑level background. Avoid internal conventions or meta‑process content that could confuse customers. Place internal notes and templates in `AGENTS.md` (this file).
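
The I/O convention in the guidelines (response body to STDOUT, diagnostics to STDERR, non-zero exit on non-2xx) can be sketched roughly as follows. The helper names `exit_code_for` and `report` are hypothetical and do not appear in the commit, and the JSON body is a made-up placeholder rather than a real API response.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: map an HTTP status code to a process exit code,
# 0 for any 2xx status and 1 for everything else.
sub exit_code_for {
    my ($status) = @_;
    return ($status >= 200 && $status < 300) ? 0 : 1;
}

# Hypothetical helper: print the body to STDOUT, a short diagnostic to STDERR
# on failure, and return the exit code the caller should use.
sub report {
    my ($status, $body) = @_;
    print STDOUT $body;
    my $code = exit_code_for($status);
    print STDERR "\nRequest failed with status $status\n" if $code;
    return $code;
}

# Placeholder body for illustration only.
my $code = report(200, '{"outputId":"example"}' . "\n");
# A real sample would finish with: exit $code;
```

Keeping the body on STDOUT even for failures means `> response.json` still captures whatever the API returned, while the status message stays visible on the terminal.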

Perl/Complex Flow Examples/merge-different-file-types.pl

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use FindBin qw($Bin);
use File::Basename qw(basename);
use JSON::PP qw(decode_json);
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Request::Common qw(POST);
use URI::Escape qw(uri_escape);
use Dotenv;

#!
# What this sample does:
# - Merges multiple inputs (PDFs and non-PDFs) into a single PDF.
# - Non-PDFs are converted to PDF; PDFs are uploaded. Collected IDs are merged via /merged-pdf.
#
# Setup (.env):
# - Copy .env.example to .env (Perl folder root)
# - Set PDFREST_API_KEY=your_api_key_here
# - Optional: set PDFREST_URL to override the API region. For EU/GDPR compliance and proximity, use:
#   PDFREST_URL=https://eu-api.pdfrest.com
#   For more information visit https://pdfrest.com/pricing#how-do-eu-gdpr-api-calls-work
#
# Usage:
#   perl "Complex Flow Examples/merge-different-file-types.pl" /path/to/file1 /path/to/file2 [/path/to/file3 ...]
#
# Output:
# - Prints the API JSON response to stdout. Non-2xx responses exit with a concise message.
# - Tip: pipe output to a file: perl ... > response.json

binmode STDOUT, ':raw';
binmode STDERR, ':encoding(UTF-8)';

# Load .env from the Perl folder root (one level up from this script)
my $env_path = "$Bin/../.env";
-e $env_path and Dotenv->load($env_path);

my $api_key = $ENV{PDFREST_API_KEY} // '';
if (!$api_key || $api_key =~ /^\s*$/) {
    print STDERR "Missing PDFREST_API_KEY in .env or environment\n";
    exit 1;
}

my $api_base = $ENV{PDFREST_URL} // $ENV{PDFREST_API} // 'https://api.pdfrest.com';
$api_base =~ s{/+$}{};

my @paths = @ARGV;
if (@paths < 2) {
    print STDERR "Usage: perl merge-different-file-types.pl /path/to/file1 /path/to/file2 [/path/to/file3 ...]\n";
    exit 1;
}
for my $p (@paths) { if (!-f $p) { print STDERR "Not a file: $p\n"; exit 1; } }

sub content_type_for {
    my ($path) = @_;
    my ($ext) = $path =~ /(\.[^.]+)$/;
    $ext = lc($ext // '');
    return 'application/pdf' if $ext eq '.pdf';
    return 'image/png' if $ext eq '.png';
    return 'image/jpeg' if $ext eq '.jpg' || $ext eq '.jpeg';
    return 'image/gif' if $ext eq '.gif';
    return 'image/tiff' if $ext eq '.tif' || $ext eq '.tiff';
    return 'image/bmp' if $ext eq '.bmp';
    return 'image/webp' if $ext eq '.webp';
    return 'application/msword' if $ext eq '.doc';
    return 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' if $ext eq '.docx';
    return 'application/vnd.ms-powerpoint' if $ext eq '.ppt';
    return 'application/vnd.openxmlformats-officedocument.presentationml.presentation' if $ext eq '.pptx';
    return 'application/vnd.ms-excel' if $ext eq '.xls';
    return 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' if $ext eq '.xlsx';
    return 'text/plain' if $ext eq '.txt';
    return 'application/rtf' if $ext eq '.rtf';
    return 'text/html' if $ext eq '.html' || $ext eq '.htm';
    return 'application/octet-stream';
}

my $ua = LWP::UserAgent->new( timeout => 120 );

eval {
    my @ids;
    for my $i (0..$#paths) {
        my $p = $paths[$i];
        my $ext = lc(($p =~ /(\.[^.]+)$/)[0] // '');
        if ($ext eq '.pdf') {
            # Upload and capture id
            open my $fh, '<:raw', $p or do { print STDERR "Unable to read $p: $!\n"; exit 1; };
            my $bytes; { local $/; $bytes = <$fh>; }
            close $fh;
            my $req = HTTP::Request->new('POST', "$api_base/upload");
            $req->header('api-key' => $api_key);
            $req->header('content-filename' => basename($p));
            $req->header('Content-Type' => 'application/octet-stream');
            $req->content($bytes);
            my $resp = $ua->request($req);
            print STDERR $resp->decoded_content // '';
            if (!$resp->is_success) { print STDERR "\nUpload failed (input #" . ($i+1) . ") status " . $resp->code . "\n"; exit 1; }
            my $json = decode_json($resp->decoded_content // '{}');
            my $id = $json->{files} && ref $json->{files} eq 'ARRAY' ? $json->{files}[0]{id} : undef;
            if (!$id) { print STDERR "Unexpected upload response format for input #" . ($i+1) . "\n"; exit 1; }
            push @ids, $id;
            print STDERR "Uploaded PDF (#" . ($i+1) . "); id=$id\n";
        } else {
            # Convert to PDF via /pdf and capture outputId
            my $ct = content_type_for($p);
            my $req = POST("$api_base/pdf",
                'Content_Type' => 'form-data',
                'Content' => [ file => [$p, basename($p), 'Content-Type' => $ct] ]
            );
            $req->header('api-key' => $api_key);
            my $resp = $ua->request($req);
            print STDERR $resp->decoded_content // '';
            if (!$resp->is_success) { print STDERR "\nConversion failed (input #" . ($i+1) . ") status " . $resp->code . "\n"; exit 1; }
            my $json = decode_json($resp->decoded_content // '{}');
            my $id = $json->{outputId};
            if (!$id) { print STDERR "Unexpected conversion response format for input #" . ($i+1) . "\n"; exit 1; }
            push @ids, $id;
            print STDERR "Converted non-PDF (#" . ($i+1) . "); outputId=$id\n";
        }
    }

    # Build x-www-form-urlencoded with repeated arrays
    my @parts;
    for my $id (@ids) {
        push @parts, 'id[]=' . uri_escape($id);
        push @parts, 'pages[]=' . uri_escape('1-last');
        push @parts, 'type[]=id';
    }
    my $body = join('&', @parts);

    my $merge_req = HTTP::Request->new('POST', "$api_base/merged-pdf");
    $merge_req->header('api-key' => $api_key);
    $merge_req->header('Content-Type' => 'application/x-www-form-urlencoded');
    $merge_req->content($body);
    my $merge_resp = $ua->request($merge_req);
    print STDOUT $merge_resp->decoded_content // '';
    if (!$merge_resp->is_success) { print STDERR "\nMerge failed with status " . $merge_resp->code . "\n"; exit 1; }
    1;
} or do {
    my $err = $@ || 'Unknown error';
    $err =~ s/\s+$//;
    print STDERR "Error: $err\n";
    exit 1;
};

__END__
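
To make the data flow of the script easier to follow without calling the API, the sketch below isolates two steps: pulling an id out of the two response shapes the script reads (`files[0].id` after `/upload`, `outputId` after `/pdf`) and assembling the repeated `id[]`/`pages[]`/`type[]` form body sent to `/merged-pdf`. The helper names `extract_id`, `pct_encode`, and `merge_body` are hypothetical, the ids are invented placeholders, and `pct_encode` is a minimal stand-in for `URI::Escape::uri_escape` so that the sketch needs only core modules.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: return the resource id from either response shape.
#   /upload -> {"files":[{"id":"..."}]}
#   /pdf    -> {"outputId":"..."}
# Assumes the JSON text decodes to an object (hash reference).
sub extract_id {
    my ($json_text) = @_;
    my $json = decode_json($json_text);
    if (ref($json->{files}) eq 'ARRAY' && @{ $json->{files} }) {
        return $json->{files}[0]{id};
    }
    return $json->{outputId};
}

# Minimal percent-encoder standing in for uri_escape; assumes ASCII input.
sub pct_encode {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $s;
}

# Hypothetical helper: one id[]/pages[]/type[] triple per input, in order.
sub merge_body {
    my (@ids) = @_;
    my @parts;
    for my $id (@ids) {
        push @parts,
            'id[]=' . pct_encode($id),
            'pages[]=' . pct_encode('1-last'),
            'type[]=id';
    }
    return join '&', @parts;
}

# Placeholder responses for illustration only.
my @ids = map { extract_id($_) } (
    '{"files":[{"id":"example-upload-id"}]}',
    '{"outputId":"example-output-id"}',
);
print merge_body(@ids), "\n";
```

The order of the triples follows the order of the command-line arguments, which is presumably what determines page order in the merged PDF.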
