Seglats/autometadata

Audiobookshelf Metadata Enricher

Enriches Audiobookshelf library metadata from Audible. Each book goes through a multi-pass search: fuzzy scoring, embedding similarity, LLM matching, and an interactive fallback.

Files

main.py          Engine / orchestrator
abs.py           Audiobookshelf API client + processed tracker
audible.py       Audible scraper (search, ASIN lookup, detail pages)
utils.py         Title cleaning, multi-signal scoring
websearch.py     SearXNG fallback search
embedding.py     Sentence embedding similarity (CUDA/CPU)
openai_llm.py    LLM matching via OpenAI-compatible API
test.sh          Test runner
testdb.json      Test database

Install

With uv (recommended)

# Create venv and install core deps
uv venv
source .venv/bin/activate
uv pip install requests beautifulsoup4 python-dotenv

# Optional: embedding model (CPU)
uv pip install sentence-transformers torch

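# Optional: embedding model (CUDA)
# Install the CUDA build of torch first so sentence-transformers reuses it
# (cu121 targets CUDA 12.1; pick the index that matches your setup)
uv pip install torch --index-url https://download.pytorch.org/whl/cu121
uv pip install sentence-transformers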

With pip

python -m venv .venv
source .venv/bin/activate
pip install requests beautifulsoup4 python-dotenv

# Optional: embedding
pip install sentence-transformers torch

Setup

cp .env.example .env
# Edit .env with your ABS URL and token
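
For orientation, here is a minimal sketch of how such settings might be read and validated at startup. The variable names `ABS_URL` and `ABS_TOKEN` are assumptions for illustration; the real names are in `.env.example`. In the actual tool, `load_dotenv()` from python-dotenv would typically populate `os.environ` from `.env` before a check like this runs.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names; the real ones are listed in .env.example.
REQUIRED = ("ABS_URL", "ABS_TOKEN")


def read_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the required settings, or raise if any is missing or blank."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, "").strip() for name in REQUIRED}
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    # Normalise the base URL so later path joins do not double the slash.
    settings["ABS_URL"] = settings["ABS_URL"].rstrip("/")
    return settings
```

Failing early with the names of the missing variables tends to be easier to debug than an authentication error on the first API call.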

Usage

# Process all new books
python main.py

# Preview without writing
python main.py --dry-run

# Reprocess everything
python main.py --force

# Filter
python main.py --author "Tolkien"
python main.py --series "Dune"
python main.py --title "The Hobbit"

# Interactive mode (prompt on failures)
python main.py -i

# Combine
python main.py -i --force --author "Flanagan"

# Test mode (uses testdb.json, no ABS needed)
python main.py --testing --force

# Test suites
./test.sh              # All
./test.sh pipeline     # Full run
./test.sh filter       # Filter flags
./test.sh tracker      # Skip/force
./test.sh quick        # Single book
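
The flags above could be declared with `argparse` roughly as follows. This is a sketch, not the actual contents of `main.py`: the long form `--interactive` and the help strings are assumptions, since only `-i` appears in the usage examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line flags shown in the usage examples."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--dry-run", action="store_true", help="preview without writing")
    parser.add_argument("--force", action="store_true", help="reprocess already-processed books")
    parser.add_argument("--author", help="only process books matching this author")
    parser.add_argument("--series", help="only process books matching this series")
    parser.add_argument("--title", help="only process books matching this title")
    # The long form --interactive is an assumption; only -i is documented.
    parser.add_argument("-i", "--interactive", action="store_true", help="prompt on failures")
    parser.add_argument("--testing", action="store_true", help="use testdb.json instead of ABS")
    return parser


# Equivalent to: python main.py -i --force --author "Flanagan"
args = build_parser().parse_args(["-i", "--force", "--author", "Flanagan"])
print(args.interactive, args.force, args.author, args.dry_run)
```

Because every flag is independent, filters and mode switches combine freely, as in the `Combine` example.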

Pipeline

For each book:

  1. Fuzzy scoring: search Audible (full title, cleaned title, US+UK regions), then score each result by title (10), author (5), narrator (7), series (3), detail richness (2)
  2. Embedding: cosine similarity via sentence-transformers (if configured)
  3. LLM: ask an LLM to pick the best match (if configured)
  4. Interactive: prompt the user to pick a result, search again, or enter an ASIN (if the -i flag is set)
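
The multi-signal scoring in step 1 can be sketched as below. The weights come from the list above, but how `utils.py` applies them is not documented here. This sketch assumes each weight scales a 0 to 1 string similarity from the standard library's `difflib`. The `DETAIL_FIELDS` names and the candidate dictionary layout are hypothetical.

```python
from difflib import SequenceMatcher

# Weights as listed in step 1; the way they are combined is an assumption.
WEIGHTS = {"title": 10, "author": 5, "narrator": 7, "series": 3}
DETAIL_WEIGHT = 2
# Hypothetical fields used to measure "detail richness" of a candidate.
DETAIL_FIELDS = ("description", "cover", "release_date", "publisher")


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 0 when either side is empty."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def score(book: dict, candidate: dict) -> float:
    """Weighted sum of per-field similarities plus a detail-richness bonus."""
    total = sum(
        weight * similarity(book.get(field, ""), candidate.get(field, ""))
        for field, weight in WEIGHTS.items()
    )
    filled = sum(1 for field in DETAIL_FIELDS if candidate.get(field))
    return total + DETAIL_WEIGHT * filled / len(DETAIL_FIELDS)


def best_match(book: dict, candidates: list):
    """Return the highest-scoring candidate, or None if there are none."""
    return max(candidates, key=lambda c: score(book, c), default=None)


book = {"title": "The Hobbit", "author": "J. R. R. Tolkien"}
candidates = [
    {"title": "The Hobbit", "author": "J. R. R. Tolkien", "asin": "A"},  # placeholder ASINs
    {"title": "The Silmarillion", "author": "J. R. R. Tolkien", "asin": "B"},
]
print(best_match(book, candidates)["asin"])
```

Under these assumptions the maximum score is 27 (10 + 5 + 7 + 3 + 2), which gives a natural scale for a confidence threshold before falling through to the next layer.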

Each layer is optional. Disable a layer by leaving its env vars empty; the interactive layer runs only when -i is passed.
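
One way to picture the optional layers is as an ordered list of matchers, where a disabled layer is simply `None`. The sketch below assumes the pipeline stops at the first layer that returns a match. The function names and the `(name, matcher)` layout are illustrative, not taken from `main.py`.

```python
from typing import Callable, Optional, Sequence, Tuple

# A matcher takes (book, candidates) and returns the chosen candidate or None.
Matcher = Callable[[dict, Sequence[dict]], Optional[dict]]


def run_pipeline(
    book: dict,
    candidates: Sequence[dict],
    layers: Sequence[Tuple[str, Optional[Matcher]]],
) -> Tuple[Optional[str], Optional[dict]]:
    """Try each enabled layer in order; return (layer_name, match) for the first hit."""
    for name, matcher in layers:
        if matcher is None:  # layer disabled, e.g. its env vars are empty
            continue
        match = matcher(book, candidates)
        if match is not None:
            return name, match
    return None, None


def fuzzy(book, candidates):
    return None  # stub: pretend no candidate scored high enough


def llm(book, candidates):
    return candidates[0] if candidates else None  # stub: pick the first candidate


layers = [("fuzzy", fuzzy), ("embedding", None), ("llm", llm), ("interactive", None)]
print(run_pipeline({"title": "Dune"}, [{"asin": "X"}], layers))
# prints: ('llm', {'asin': 'X'})
```

Here the embedding and interactive layers are disabled, the fuzzy stub finds nothing, and the result comes from the LLM stub.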
