<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <meta name="description" content="ExtractPDF4J - JVM-native open-source library to extract tables from PDFs using Stream, Lattice, and OCR parsing modes." />
  <meta property="og:title" content="ExtractPDF4J" />
  <meta property="og:description" content="Extract tables from PDFs (text &amp; scanned) directly in Java. Stream + Lattice + OCR parsing." />
  <meta property="og:image" content="assets/og-image.png" />
  <meta property="og:type" content="website" />
  <meta name="twitter:card" content="summary_large_image" />
  <title>ExtractPDF4J</title>
  <link rel="stylesheet" href="assets/style.css" />
  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600&amp;display=swap" rel="stylesheet" />
</head>
<body>
  <header class="hero">
    <h1>ExtractPDF4J</h1>
    <p class="tagline">Extract tables from PDFs — text-based or scanned — with JVM-native power 🚀</p>
    <div class="buttons">
      <a class="btn" href="https://github.com/ExtractPDF4J/ExtractPDF4J/" target="_blank" rel="noopener noreferrer">⭐ GitHub</a>
      <a class="btn secondary" href="docs/index.html">📖 Documentation</a>
    </div>
  </header>

  <section>
    <h2>✨ Why ExtractPDF4J?</h2>
    <p>ExtractPDF4J brings the best of Camelot and Tabula into the Java ecosystem, making PDF table extraction simple, robust, and production-ready.</p>
    <ul>
      <li>Stream &amp; Lattice parsing (like Camelot, but JVM-native)</li>
      <li>OCR support for scanned PDFs</li>
      <li>Multi-page &amp; complex table layouts</li>
      <li>Apache 2.0 licensed, open-source</li>
    </ul>
  </section>

  <section>
    <h2>🚀 Quick Start</h2>
    <p>Add the dependency to your Maven <code>pom.xml</code>:</p>
    <pre>
&lt;dependency&gt;
  &lt;groupId&gt;io.github.mehulimukherjee&lt;/groupId&gt;
  &lt;artifactId&gt;extractpdf4j&lt;/artifactId&gt;
  &lt;version&gt;0.1.0&lt;/version&gt;
&lt;/dependency&gt;
    </pre>
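    <p>For Gradle builds, the same coordinates would typically be declared as shown below. This is a sketch that assumes the artifact resolves from a repository already configured in your build (for example Maven Central), which this page does not state.</p>
    <pre>
implementation 'io.github.mehulimukherjee:extractpdf4j:0.1.0'
    </pre>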
  </section>

  <footer>
    <p>Made with ❤️ by Mehuli Mukherjee | <a href="https://github.com/ExtractPDF4J/ExtractPDF4J/">GitHub</a> | Apache 2.0</p>
  </footer>
</body>
</html>