This guide documents the legacy PDF web interface inside the broader pyPDFLibrarianSort Python tools workspace.
It provides a browser-based interface for organizing PDFs with drag-and-drop upload, real-time categorization, and library browsing.
Option 1: Batch File (Windows)
START_WEB_INTERFACE.bat
Option 2: Python Command
python web_interface.py
Option 3: Direct Python
from web_interface import app
app.run(host='0.0.0.0', port=5000)
Then open your browser and go to:
http://localhost:5000
- Drag PDF files directly into the browser
- Or click to browse and select files
- Upload multiple PDFs at once
- Real-time upload progress
- Automatic category suggestions
- Smart filename analysis
- Content-based classification
- Confidence scoring (High/Medium/Low)
- See all suggestions before organizing
- Edit categories manually
- Edit suggested filenames
- Approve or reject individual files
- Batch approve/reject all
- Browse organized PDFs
- Hierarchical folder view
- File size information
- Category statistics
- Total PDFs organized
- Category breakdown
- Last run date
- Visual charts
- Choose between Gemini, Anthropic, or DeepSeek
- Configure API keys
- Set ebooks folder path
First time using the web interface:
- Click ⚙️ Settings button
- Enter your Ebooks Folder path (e.g., F:\ebooks)
- Select your AI Provider (Gemini, Anthropic, or DeepSeek)
- Enter your API Key
- Click Save Settings
- Drag & drop PDFs into the upload area
- OR click Choose Files to browse
- See uploaded files listed with sizes
- Click 🤖 Analyze & Categorize
The AI will analyze each PDF and suggest:
- Category: Where to organize it
- Rename: Better filename (if current is gibberish)
- Confidence: How confident the AI is
For each PDF, you can:
- ✅ Approve: Include in organization
- ❌ Reject: Skip this file
- ✏️ Edit: Change category or filename
- Use Approve All or Reject All for batch actions
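The review step can be pictured as a per-file status that starts as pending and is flipped by the approve, reject, and edit actions. The sketch below is illustrative only: the `Suggestion` class, its field names, and both helper functions are assumptions made for this example, not the interface's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Suggestion:
    """One AI suggestion for an uploaded PDF (hypothetical structure)."""
    filename: str
    category: str
    confidence: str               # "high", "medium", or "low"
    rename: Optional[str] = None  # set only when a new name is proposed
    status: str = "pending"       # "pending", "approved", or "rejected"


def set_all(suggestions: List[Suggestion], status: str) -> None:
    """Batch action: mark every suggestion as approved or rejected."""
    if status not in ("approved", "rejected"):
        raise ValueError(f"unknown status: {status}")
    for suggestion in suggestions:
        suggestion.status = status


def approved(suggestions: List[Suggestion]) -> List[Suggestion]:
    """Return only the suggestions that should be organized."""
    return [s for s in suggestions if s.status == "approved"]


items = [
    Suggestion("Document.pdf", "Science/Biology", "high", rename="Study of Rabbits"),
    Suggestion("Python Guide.pdf", "Programming/Python", "high"),
]
set_all(items, "approved")                        # "Approve All"
items[0].status = "rejected"                      # then reject one file
items[1].category = "Programming/Python/Guides"   # manual category edit
```

Keeping the status on each item means a batch action and a later individual change do not conflict: the last action on a file wins.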
- Review all suggestions
- Click 📦 Organize Approved Files
- Confirm the action
- Watch as PDFs are moved and renamed!
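Moving and renaming an approved file comes down to creating the category folder and moving the PDF into it. The following is a minimal sketch under stated assumptions: `organize_pdf` and its parameters are hypothetical names, and refusing to overwrite an existing file is a design choice of this example, not documented behavior of the web interface.

```python
import shutil
from pathlib import Path
from typing import Optional


def organize_pdf(src: Path, ebooks_folder: Path, category: str,
                 new_name: Optional[str] = None) -> Path:
    """Move src into ebooks_folder/<category>, optionally renaming it."""
    # "Science/Biology" becomes <ebooks_folder>/Science/Biology
    target_dir = ebooks_folder.joinpath(*category.split("/"))
    target_dir.mkdir(parents=True, exist_ok=True)

    # Keep the original name unless a replacement was approved.
    stem = new_name if new_name else src.stem
    target = target_dir / f"{stem}.pdf"
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")

    shutil.move(str(src), str(target))
    return target
```

For example, `organize_pdf(Path("Document.pdf"), Path("F:/ebooks"), "Science/Biology", "Study of Rabbits")` would place the file at `F:/ebooks/Science/Biology/Study of Rabbits.pdf`.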
Click 📁 Browse Library to:
- See your organized PDFs
- Navigate folder structure
- View statistics
┌─────────────────────────────────────────┐
│ 📚 PDF Organizer │
│ AI-Powered Library Management │
│ [Settings] [Browse] [Statistics] │
├─────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────┐ │
│ │ 📄 │ │
│ │ Drag & Drop PDFs Here │ │
│ │ or click to browse │ │
│ │ [Choose Files] │ │
│ └───────────────────────────────┘ │
│ │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ 📋 Categorization Results │
│ [✓ Approve All] [✗ Reject All] │
│ [📦 Organize Approved Files] │
├─────────────────────────────────────────┤
│ ┌─────────────────────────────┐ │
│ │ Document.pdf 🔍 Gibberish │ │
│ │ Category: Science/Biology │ │
│ │ Rename: Study of Rabbits │ │
│ │ Confidence: HIGH │ │
│ │ [✓ Approve] [✗] │ │
│ └─────────────────────────────┘ │
│ │
│ ✅ APPROVED │
│ ┌─────────────────────────────┐ │
│ │ Python Guide.pdf │ │
│ │ Category: Programming/Python │ │
│ │ Confidence: HIGH │ │
│ └─────────────────────────────┘ │
└─────────────────────────────────────────┘
- Green border: Approved files
- Red opacity: Rejected files
- Blue badge: Gibberish filename detected
- Green badge: High confidence
- Yellow badge: Medium confidence
- Red badge: Low confidence
- Upload progress spinner
- Analysis loading indicator
- Organization progress overlay
- Toast notifications for all actions
{
  "ebooks_folder": "F:/ebooks", // Where PDFs are organized
  "provider": "gemini", // AI provider
  "api_key": "your-api-key-here", // API key
  "batch_delay": 10 // Not used in web interface
}
- Gemini: https://aistudio.google.com/app/apikey
- Anthropic: https://console.anthropic.com/
- DeepSeek: https://platform.deepseek.com/
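A settings file like the one above can be checked before the server starts. This is a hedged sketch: `validate_settings` is a hypothetical helper, and the lowercase provider names `anthropic` and `deepseek` are assumed by analogy with `gemini`. The `//` annotations in the example are explanatory only; Python's `json.loads` rejects comments, so the input here is comment-free.

```python
import json

# "gemini" appears in the example config; the other two spellings are assumed.
KNOWN_PROVIDERS = {"gemini", "anthropic", "deepseek"}


def validate_settings(raw: str) -> dict:
    """Parse a JSON settings string and report every missing or invalid field."""
    settings = json.loads(raw)
    problems = []
    if not settings.get("ebooks_folder"):
        problems.append("ebooks_folder is missing")
    if settings.get("provider") not in KNOWN_PROVIDERS:
        problems.append("provider must be one of: " + ", ".join(sorted(KNOWN_PROVIDERS)))
    if not settings.get("api_key"):
        problems.append("api_key is missing")
    if problems:
        raise ValueError("; ".join(problems))
    return settings


settings = validate_settings(
    '{"ebooks_folder": "F:/ebooks", "provider": "gemini", "api_key": "your-api-key-here"}'
)
```

Collecting all problems before raising lets a user fix the whole file in one pass instead of one error at a time.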
Run on a different port:
python web_interface.py --port 8080
Or modify the code:
app.run(host='0.0.0.0', port=8080)
If running on your local network:
- Find your computer's IP address
- Access from another device: http://YOUR_IP:5000
- Make sure firewall allows port 5000
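A `--port` option such as the one used earlier is commonly parsed with `argparse`. The sketch below shows one way that could look; it is not taken from `web_interface.py`. The `--host` flag is an assumption added for this example, with a local-only default so that network access is an explicit choice.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options for the server (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="PDF web interface (sketch)")
    # --host is hypothetical; 127.0.0.1 keeps the server local by default.
    parser.add_argument("--host", default="127.0.0.1",
                        help="interface to bind; 0.0.0.0 exposes it to the network")
    parser.add_argument("--port", type=int, default=5000,
                        help="TCP port to listen on")
    return parser.parse_args(argv)


args = parse_args(["--port", "8080"])
# The parsed values would then be passed on, e.g.:
# app.run(host=args.host, port=args.port)
```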
For production use, use a proper WSGI server:
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 web_interface:app
The web interface provides a RESTful API:
Main page (HTML)
Get or update settings
Upload PDF files
- Body: FormData with files
- Returns: List of uploaded files with IDs
Analyze PDFs and get categorization
- Body: { files: [...] }
- Returns: Categorization results
Move approved files to ebooks folder
- Body: { files: [...] }
- Returns: Success/failure status
Browse organized library
- Returns: File tree and statistics
Get organization statistics
- Returns: Total organized, categories, last run
Get available categories
- Returns: List of existing categories
Error: Address already in use
Solution: Change the port:
app.run(port=5001) # Use different port
- Check firewall settings
- Try http://127.0.0.1:5000 instead of localhost
- Make sure server is running
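Both the "address already in use" error and a page that will not load can be narrowed down by testing whether anything is listening on the port. The helper below is a small standard-library sketch; `is_port_in_use` is a hypothetical name and is not part of the web interface.

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    state = "in use" if is_port_in_use(5000) else "free"
    print(f"Port 5000 is {state}")
```

If the port is reported as in use before the server is started, another program holds it and a different port is needed. If it is reported as free while the server should be running, the server has likely not started.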
- Check file size (max 100MB per file)
- Ensure files are PDFs
- Check browser console for errors
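The first two upload checks can be reproduced locally before trying again. This sketch assumes the 100MB limit means 100 × 1024 × 1024 bytes and uses the `%PDF-` header that PDF files normally begin with; `check_upload` is a hypothetical helper, and the server's own validation may differ.

```python
from pathlib import Path
from typing import List

# Assumed interpretation of the 100MB per-file limit.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_upload(path: Path) -> List[str]:
    """Return a list of reasons the file might be rejected (empty if none)."""
    problems = []
    size = path.stat().st_size
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes, above the {MAX_UPLOAD_BYTES}-byte limit")
    with path.open("rb") as handle:
        # A PDF normally starts with the bytes "%PDF-" followed by a version.
        if handle.read(5) != b"%PDF-":
            problems.append("file does not start with the %PDF- signature")
    return problems
```

A renamed Word document or a truncated download fails the signature check even when its extension is `.pdf`.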
- Verify API key is correct
- Check ebooks folder exists
- Ensure API provider is selected
- Make sure files are approved (green border)
- Check ebooks folder permissions
- Verify category paths are valid
- Upload in batches of 20-50 files for best results
- Large PDFs (>10MB) may take longer to process
- Analysis is batched: multiple PDFs are sent in one API call
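Splitting a large backlog into upload batches of the recommended size is a simple slicing loop. The `batches` function below is a hypothetical helper for preparing files on disk; it does not reflect how the interface groups PDFs internally.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 20) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


pdfs = [f"file_{i}.pdf" for i in range(45)]
sizes = [len(batch) for batch in batches(pdfs, size=20)]
# 45 files in batches of 20 give chunks of 20, 20, and 5.
```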
- Daily Use: Leave web interface open, drag PDFs as they arrive
- Bulk Organization: Upload many PDFs, review all at once
- Careful Review: Always check suggestions before organizing
- Edit categories to match your structure
- Use existing categories when possible
- Create subcategories with / (e.g., Science/Biology/Zoology)
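Because a category doubles as a folder path, it helps to normalize edited categories and reject ones that could escape the ebooks folder. This is a minimal sketch: `normalize_category` is a hypothetical helper, and it only handles `/` separators, not Windows backslashes.

```python
from pathlib import PurePosixPath


def normalize_category(category: str) -> str:
    """Trim each path segment and reject empty, '.' or '..' segments."""
    parts = [part.strip() for part in category.split("/")]
    if any(part in ("", ".", "..") for part in parts):
        raise ValueError(f"invalid category: {category!r}")
    return str(PurePosixPath(*parts))


clean = normalize_category(" Science / Biology /Zoology")
```

Rejecting `..` matters most when the interface is reachable from other machines, since an edited category would otherwise be able to point outside the library folder.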
- AI suggests better names for gibberish files
- Edit names before organizing
- Keep names descriptive but concise
By default, the web interface is accessible only from your computer:
app.run(host='127.0.0.1') # Local only
If you enable network access (host='0.0.0.0'):
- Anyone on your network can access it
- API keys are stored in session (not secure for production)
- Use HTTPS in production
- Add authentication for sensitive use
- API keys stored in Flask session
- Not persisted to disk
- Lost when browser session ends
- Re-enter after restarting browser
Potential improvements:
- User authentication
- Multiple user accounts
- Persistent settings storage
- OCR for scanned PDFs
- PDF preview thumbnails
- Advanced search
- Category templates
- Undo functionality
- Dark mode
- Mobile-responsive design
- Batch operations history
Having issues? Check:
- Console Output: Look for error messages in terminal
- Browser Console: Check for JavaScript errors (F12)
- Network Tab: Inspect API calls
- Log Files: Check Flask logs
The web interface makes PDF organization beautiful and intuitive. Drag, drop, review, organize!
Happy organizing! 📚✨