oidlabs-com · dilithjay · May 25, 2026
diff --git a/.github/workflows/deploy_docs.yml b/.github/workflows/deploy_docs.yml
@@ -40,6 +40,7 @@ jobs:
 
   deploy:
     needs: build-docs
+    if: github.ref == 'refs/heads/main'
     runs-on: ubuntu-latest
     permissions:
       pages: write

diff --git a/docs/api.rst b/docs/api.rst
diff --git a/docs/benchmark.rst b/docs/benchmark.rst
@@ -22,7 +22,7 @@ The similarity metric is calculated using the following steps (see `calculate_si
 3. Whitespace and Punctuation Normalization
    Extra whitespace and punctuation are removed from both the parsed and ground truth texts. Therefore, the comparison is purely based on the sequence of characters/words, ignoring any formatting differences.
 
-3. Sequence Matching
+4. Sequence Matching
    Python's ``SequenceMatcher`` compares the extracted text sequences, calculating a similarity ratio between 0 and 1 that reflects content preservation and accuracy.
 
 Running the Benchmarks
@@ -60,11 +60,15 @@ Customizing Benchmarks
 
 You can modify the ``test_attributes`` list in the ``main()`` function to test different configurations:
 
-* ``parser_type``: Switch between LLM and static parsing
+* ``parser_type``: Switch between LLM and static parsing (``LLM_PARSE``, ``STATIC_PARSE``, ``AUTO``)
 * ``model``: Test different LLM models
-* ``framework``: Test different static parsing frameworks
+* ``framework``: Test different static parsing frameworks (``pdfplumber``, ``pdfminer``, ``paddleocr``)
 * ``pages_per_split``: Adjust document chunking
-* ``max_threads``: Control parallel processing
+
+.. note::
+
+   The benchmark harness currently hard-codes ``max_processes=1`` when calling :py:func:`lexoid.api.parse`, so sweeping ``max_threads`` in ``tests/benchmark.py`` does not change
+   the parallelism of ``parse()``. To benchmark parallelism, edit ``tests/benchmark.py`` to forward the sweep value to ``max_processes``.
 
 Benchmark Results
 -----------------

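Review note on ``docs/benchmark.rst``: steps 3 and 4 of the similarity metric can be illustrated with a short standalone sketch. It is not the project's ``calculate_similarity`` implementation. The helper names ``normalize`` and ``similarity`` are hypothetical, and the exact normalization rules (strip ASCII punctuation, collapse runs of whitespace, no case folding) are assumptions. Only ``difflib.SequenceMatcher`` is taken from the documented text.

```python
import re
import string
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Remove ASCII punctuation and collapse whitespace (assumed rules)."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", no_punct).strip()


def similarity(parsed: str, ground_truth: str) -> float:
    """Return a ratio between 0 and 1 for the two normalized texts."""
    matcher = SequenceMatcher(None, normalize(parsed), normalize(ground_truth))
    return matcher.ratio()


if __name__ == "__main__":
    # Formatting differences vanish after normalization.
    print(similarity("Hello,   world!", "Hello world"))
```

Because punctuation and extra whitespace are stripped before matching, two texts that differ only in formatting compare as identical, which matches the intent described in step 3.
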
diff --git a/docs/cli.rst b/docs/cli.rst
@@ -0,0 +1,99 @@
+Command-Line Interface
+======================
+
+Lexoid ships with a ``lexoid`` command (installed as a console script) for
+parsing documents without writing Python code. You can also invoke it via
+the module form ``python -m lexoid``.
+
+.. code-block:: bash
+
+    lexoid --help
+    python -m lexoid --help
+
+Commands
+--------
+
+The CLI exposes three sub-commands:
+
+* ``lexoid parse`` — Convert a document into markdown (or JSON with metadata).
+* ``lexoid schema`` — Extract structured data conforming to a JSON schema.
+* ``lexoid latex`` — Convert a document into LaTeX.
+
+Common options
+^^^^^^^^^^^^^^
+
+Available across all sub-commands:
+
+* ``--input, -i`` (required): Path to an input file (PDF, image, HTML, DOCX, XLSX, PPTX, CSV, TXT, audio) or a URL (``http://``, ``https://``).
+* ``--output, -o``: Path to an output file. If omitted, the result is written to stdout; status messages go to stderr, so stdout stays clean for piping.
+* ``--verbose, -v``: Enable detailed logging.
+
+``lexoid parse``
+^^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+
+    lexoid parse --input document.pdf
+    lexoid parse --input document.pdf --output output.md
+    lexoid parse --input document.pdf --format json --output result.json
+    lexoid parse --input document.pdf --parser-type STATIC_PARSE
+    lexoid parse --input document.pdf --model gpt-4o
+
+Options:
+
+* ``--parser-type, -p``: ``AUTO`` (default), ``LLM_PARSE``, or ``STATIC_PARSE``.
+* ``--model, -m``: LLM model name. Default: ``gemini-2.5-flash``.
+* ``--pages-per-split``: Pages per chunk. Default: ``4``.
+* ``--max-processes``: Parallel processes. Default: ``4``.
+* ``--framework``: Static parsing framework — ``pdfplumber`` or ``paddleocr``.
+* ``--format``: ``markdown`` (default; raw markdown text) or ``json`` (full result with segments, metadata, and token usage).
+* ``--api``: API provider override. One of ``openai``, ``gemini``, ``anthropic``, ``mistral``, ``together``, ``huggingface``, ``openrouter``, ``fireworks``, ``ollama``. If omitted, inferred from the model name.
+
+``lexoid schema``
+^^^^^^^^^^^^^^^^^
+
+Extract structured data using a JSON schema. The schema can be passed as a
+file path or as an inline JSON string.
+
+.. code-block:: bash
+
+    # Inline schema
+    lexoid schema \
+      --input document.pdf \
+      --schema '{"type": "object", "properties": {"title": {"type": "string"}}}' \
+      --output result.json
+
+    # Schema from file
+    lexoid schema --input document.pdf --schema schema.json --output result.json
+
+    # Specify model and API explicitly
+    lexoid schema --input document.pdf --schema schema.json --api openai --model gpt-4o
+
+Options:
+
+* ``--schema, -s`` (required): JSON schema — file path or inline JSON.
+* ``--model, -m``: LLM model. Default: ``gpt-4o-mini``.
+* ``--api``: API provider (auto-detected from model name if omitted).
+* ``--example-schema``: Example data (JSON string or file path) illustrating a filled schema.
+* ``--fill-single-schema``: Produce a single schema instance for the whole document instead of one per page.
+
+``lexoid latex``
+^^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+
+    lexoid latex --input document.pdf
+    lexoid latex --input document.pdf --output output.tex
+    lexoid latex --input document.pdf --model gpt-4o
+
+Options:
+
+* ``--model, -m``: LLM model. Default: ``gpt-4o-mini``.
+* ``--api``: API provider (auto-detected from model name if omitted).
+
+API keys
+--------
+
+LLM commands require the relevant environment variable to be set
+(see :doc:`installation`). The CLI checks for the required key based on
+the resolved provider and raises a clear error if it is missing.
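
Review note on ``docs/cli.rst``: for readers who drive the CLI from a script, the documented flags can be assembled as an argument list, which avoids shell quoting problems with inline JSON schemas. This is a minimal sketch that assumes ``lexoid`` is on ``PATH``. The helper names ``build_parse_command``, ``build_schema_command``, and ``run`` are hypothetical and not part of Lexoid. Only flags listed in the page above are used.

```python
import json
import subprocess
from typing import List, Optional


def build_parse_command(
    input_path: str,
    output_path: Optional[str] = None,
    fmt: str = "markdown",
    parser_type: str = "AUTO",
) -> List[str]:
    """Assemble argv for `lexoid parse` from documented options."""
    cmd = ["lexoid", "parse", "--input", input_path,
           "--parser-type", parser_type, "--format", fmt]
    if output_path is not None:
        cmd += ["--output", output_path]
    return cmd


def build_schema_command(
    input_path: str, schema: dict, output_path: Optional[str] = None
) -> List[str]:
    """Assemble argv for `lexoid schema`, passing the schema as inline JSON."""
    cmd = ["lexoid", "schema", "--input", input_path,
           "--schema", json.dumps(schema)]
    if output_path is not None:
        cmd += ["--output", output_path]
    return cmd


def run(cmd: List[str]) -> str:
    """Run the command and return stdout; stderr is captured separately."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Print the commands instead of running them, so no install is required.
    print(build_parse_command("document.pdf", fmt="json"))
    print(build_schema_command("document.pdf", {"type": "object"}))
```

Passing a list to ``subprocess.run`` bypasses the shell, so the JSON produced by ``json.dumps`` reaches ``--schema`` unmodified.
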
diff --git a/docs/index.rst b/docs/index.rst
@@ -1,44 +1,55 @@
 Welcome to Lexoid's Documentation
 =================================
 
-Lexoid is an efficient document parsing library that supports both LLM-based and non-LLM-based (static) PDF document parsing.
+Lexoid is an efficient document parsing library that supports both LLM-based and non-LLM-based (static) parsing of PDFs, images, web pages, office documents, and audio files.
 
 .. toctree::
    :maxdepth: 2
    :caption: Contents:
 
    installation
    api
+   cli
    contributing
    benchmark
 
 Key Features
 ------------
 
 * Multiple parsing strategies (LLM-based and static parsing)
-* Automatic parsing strategy selection
-* Support for multiple LLM providers (OpenAI, Google, Meta/Llama, Together AI)
+* Automatic parsing strategy selection (``AUTO`` mode) with optional ML-based LLM auto-selection
+* Routing priorities: ``speed``, ``accuracy``, and ``cost``
+* Support for many LLM providers (OpenAI, Google Gemini, Anthropic, Mistral, Hugging Face, Together AI, OpenRouter, Fireworks)
+* Local LLM inference via Ollama, SmolDocling/granite-docling, and PaddleOCR-VL (no API key required)
+* Schema-constrained extraction (``parse_with_schema``) accepting ``dict``, ``dataclass``, or Pydantic ``BaseModel``
+* LaTeX conversion (``parse_to_latex``)
+* Audio transcription to markdown (via Gemini)
+* Multi-format input: PDF, images (PNG/JPG/TIFF/BMP/GIF), HTML, DOCX, XLSX, PPTX, CSV, TXT, audio, and URLs
+* Recursive URL parsing
 * Table detection and markdown conversion
 * Hyperlink detection and preservation
-* Recursive URL parsing
-* Multi-format support
-* Parallel processing support
-* Permissive license
-* Reference highlighting and bounding box extraction
+* Reference highlighting and bounding box extraction (``return_bboxes``)
+* Parallel processing via multiprocessing
+* Command-line interface (``lexoid`` / ``python -m lexoid``)
+* Permissive Apache 2.0 license
 
 Supported API Providers
 -----------------------
 
-* Google
+* Google (Gemini)
 * OpenAI
+* Anthropic (Claude)
+* Mistral (OCR models)
 * Hugging Face
 * Together AI
 * OpenRouter
 * Fireworks
+* Ollama (local inference)
+* Local models (SmolDocling/granite-docling, PaddleOCR-VL)
 
 Indices and tables
 ==================
 
 * :ref:`genindex`
 * :ref:`modindex`
-* :ref:`search`
+* :ref:`search`
diff --git a/docs/installation.rst b/docs/installation.rst
@@ -8,27 +8,81 @@ Installing with pip
 
     pip install lexoid
 
+This installs both the Python library and the ``lexoid`` command-line entry
+point. See :doc:`cli` for CLI usage.
+
 Environment Setup
 -----------------
 
-To use LLM-based parsing, define the following environment variables or create a ``.env`` file with the following definitions:
+To use LLM-based parsing, define the environment variables for the providers
+you intend to use (in a shell, ``.env`` file, or your container environment):
 
 .. code-block:: bash
 
-    GOOGLE_API_KEY=your_google_api_key
-    OPENAI_API_KEY=your_openai_api_key
+    GOOGLE_API_KEY=your_google_api_key            # Gemini
+    OPENAI_API_KEY=your_openai_api_key            # OpenAI / GPT
+    ANTHROPIC_API_KEY=your_anthropic_api_key      # Claude
+    MISTRAL_API_KEY=your_mistral_api_key          # Mistral OCR
     HUGGINGFACEHUB_API_TOKEN=your_huggingface_token
     TOGETHER_API_KEY=your_together_api_key
+    OPENROUTER_API_KEY=your_openrouter_api_key
+    FIREWORKS_API_KEY=your_fireworks_api_key
+
+Only the providers you actually use require keys. Local backends (Ollama,
+SmolDocling/granite-docling, PaddleOCR-VL) do not require an API key.
+
+Additional environment variables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ``DEFAULT_LLM`` — overrides the default LLM model. Default: ``gemini-2.5-flash``.
+* ``DEFAULT_LOCAL_LM`` — overrides the default local model used by ``parse_with_local_model``. Default: ``ds4sd/SmolDocling-256M-preview``.
+* ``DEFAULT_STATIC_FRAMEWORK`` — overrides the default static-parsing framework. Default: ``pdfplumber``.
+* ``DEFAULT_MAX_IMAGE_DIMENSION`` — maximum pixel dimension for resizing page/image inputs. Default: ``1000``.
+* ``OLLAMA_BASE_URL`` — base URL of the Ollama server. Default: ``http://localhost:11434``.
+* ``OLLAMA_TIMEOUT`` — request timeout (seconds) for Ollama. Default: ``120``.
 
 Optional Dependencies
 ---------------------
 
-To use ``Playwright`` for retrieving web content (instead of the ``requests`` library):
+Playwright (for web content retrieval)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To use Playwright for retrieving web content (instead of the bare ``requests``
+library), install its browser dependencies after ``pip install lexoid``:
 
 .. code-block:: bash
 
     playwright install --with-deps --only-shell chromium
 
+LibreOffice (for DOCX to PDF on Linux)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+On Linux, ``.doc``/``.docx`` to PDF conversion uses LibreOffice's
+``lowriter`` binary (because ``docx2pdf`` is unsupported on Linux). Install
+it from your distribution's package manager, e.g.:
+
+.. code-block:: bash
+
+    sudo apt-get install libreoffice
+
+On macOS/Windows, ``docx2pdf`` is used automatically (requires Microsoft Word
+or a compatible installation).
+
+Ollama (for local LLM parsing)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Install `Ollama <https://ollama.com>`_, pull a vision-capable model, and
+keep the server running:
+
+.. code-block:: bash
+
+    ollama pull gemma4
+    ollama serve
+
+Then call ``parse(..., api_provider="ollama", model="gemma4:latest", max_processes=1)``.
+Lexoid forces ``max_processes=1`` for Ollama-backed parsing to avoid local
+multiprocess contention.
+
 Building from Source
 --------------------
 
@@ -57,4 +111,4 @@ To activate virtual environment:
 
 .. code-block:: bash
 
-    source .venv/bin/activate
+    source .venv/bin/activate
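
Review note on ``docs/installation.rst``: the environment setup can be checked before the first parse with a small preflight script. This is a sketch, not Lexoid code. The mapping from provider names (as used by the CLI ``--api`` option) to key variables is inferred from the two lists in these docs and should be treated as an assumption. The helper names ``missing_key`` and ``resolve_setting`` are hypothetical. Default values are copied from the "Additional environment variables" list.

```python
import os
from typing import Mapping, Optional

# Assumed pairing of provider names with the documented key variables.
PROVIDER_ENV_VARS = {
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "huggingface": "HUGGINGFACEHUB_API_TOKEN",
    "together": "TOGETHER_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "fireworks": "FIREWORKS_API_KEY",
}

# Local backends are documented as needing no API key.
LOCAL_PROVIDERS = {"ollama"}

# Documented defaults for a few optional settings.
DEFAULTS = {
    "DEFAULT_LLM": "gemini-2.5-flash",
    "DEFAULT_STATIC_FRAMEWORK": "pdfplumber",
    "OLLAMA_BASE_URL": "http://localhost:11434",
    "OLLAMA_TIMEOUT": "120",
}


def missing_key(provider: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the unset key variable for a provider, or None if nothing is missing."""
    if provider in LOCAL_PROVIDERS:
        return None
    var = PROVIDER_ENV_VARS[provider]  # KeyError for providers not listed here
    return None if env.get(var) else var


def resolve_setting(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the environment override for a setting, else its documented default."""
    return env.get(name, DEFAULTS[name])


if __name__ == "__main__":
    for provider in sorted(PROVIDER_ENV_VARS):
        var = missing_key(provider)
        print(provider, "ok" if var is None else f"missing {var}")
    print("Ollama URL:", resolve_setting("OLLAMA_BASE_URL"))
```

Only the providers you plan to use need to report ``ok``; the remaining keys can stay unset.
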