 <head>
   <meta charset="UTF-8" />
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+  <meta name="description" content="ExtractPDF4J - JVM-native open-source library to extract tables from PDFs using Stream, Lattice, and OCR parsing modes." />
+  <meta property="og:title" content="ExtractPDF4J" />
+  <meta property="og:description" content="Extract tables from PDFs (text & scanned) directly in Java. Stream + Lattice + OCR parsing." />
+  <meta property="og:image" content="assets/og-image.png" />
+  <meta property="og:type" content="website" />
+  <meta name="twitter:card" content="summary_large_image" />
   <title>ExtractPDF4J</title>
   <link rel="stylesheet" href="assets/style.css" />
+  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600&display=swap" rel="stylesheet">
 </head>
 <body>
-  <header>
+  <header class="hero">
     <h1>ExtractPDF4J</h1>
-    <p>A powerful Java library for extracting tables from PDFs (Stream + Lattice + OCR)</p>
-    <a class="btn" href="https://github.com/yourusername/ExtractPDF4J" target="_blank">⭐ View on GitHub</a>
-    <a class="btn secondary" href="docs/index.html">📖 Documentation</a>
+    <p class="tagline">Extract tables from PDFs, text-based or scanned, with JVM-native power 🚀</p>
+    <div class="buttons">
+      <a class="btn" href="https://github.com/ExtractPDF4J/ExtractPDF4J/" target="_blank">⭐ GitHub</a>
+      <a class="btn secondary" href="docs/index.html">📖 Documentation</a>
+    </div>
   </header>

   <section>
-    <h2>✨ Features</h2>
+    <h2>✨ Why ExtractPDF4J?</h2>
+    <p>ExtractPDF4J brings the best of Camelot and Tabula into the Java ecosystem, making PDF table extraction simple, robust, and production-ready.</p>
     <ul>
-      <li>Stream and Lattice parsing (like Camelot, but in Java!)</li>
-      <li>OCR-based extraction for scanned PDFs</li>
-      <li>Multi-page & complex table handling</li>
-      <li>Open-source under Apache 2.0</li>
+      <li>Stream & Lattice parsing (like Camelot, but JVM-native)</li>
+      <li>OCR support for scanned PDFs</li>
+      <li>Multi-page & complex table layouts</li>
+      <li>Apache 2.0 licensed, open-source</li>
     </ul>
   </section>

   <section>
     <h2>🚀 Quick Start</h2>
     <pre>
 <dependency>
-  <groupId>com.extractpdf4j</groupId>
+  <groupId>io.github.mehulimukherjee</groupId>
   <artifactId>extractpdf4j</artifactId>
-  <version>1.0.0</version>
+  <version>0.1.0</version>
 </dependency>
     </pre>
   </section>

   <footer>
-    <p>Made with ❤️ by Mehuli Mukherjee | <a href="docs/index.html">Read Docs</a></p>
+    <p>Made with ❤️ by Mehuli Mukherjee | <a href="https://github.com/ExtractPDF4J/ExtractPDF4J/">GitHub</a> | Apache 2.0</p>
   </footer>
 </body>
-</html>
+</html>