Update docs and add skills.md files #184

Open
dilithjay wants to merge 8 commits into main from dj/update-docs

Conversation

@dilithjay
Contributor

  • Updated docs to cover providers, kwargs, return keys, CLI, and entry points that were missing or incorrect.
  • Added skills/lexoid-cli and skills/lexoid-python SKILL.md files for the lexoid CLI and the Python API, respectively.

Copilot AI left a comment

Pull request overview

This PR expands Lexoid’s documentation to better describe provider setup, CLI usage, and Python API return shapes, and adds SKILL.md guides for the CLI and Python integration.

Changes:

  • Add SKILL.md guides for lexoid CLI usage and the Python API.
  • Update Sphinx docs to include a new CLI page and refresh API/installation/provider details.
  • Adjust docs deployment workflow to only deploy from main.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| skills/lexoid-python/SKILL.md | New Python skill guide with setup, API entry points, and recipes. |
| skills/lexoid-cli/SKILL.md | New CLI skill guide with commands, flags, and common usage patterns. |
| docs/installation.rst | Clarifies pip install behavior, provider env vars, and optional dependencies (Playwright/LibreOffice/Ollama). |
| docs/index.rst | Updates overview text, feature list, providers list, and adds cli to the toctree. |
| docs/cli.rst | New CLI reference page documenting subcommands and common flags. |
| docs/benchmark.rst | Fixes numbering and expands benchmark configuration descriptions. |
| docs/api.rst | Expands API reference (parser types, kwargs, return keys, examples) and adds missing entry points. |
| .github/workflows/deploy_docs.yml | Ensures the deploy job only runs on refs/heads/main. |
Comments suppressed due to low confidence (3)

docs/api.rst:85

  • pdf_path is currently added whenever as_pdf=True (even when save_dir isn’t provided, it will point to the temporary converted PDF). The docs say it’s only set when save_dir is specified, which doesn’t match lexoid.api.parse behavior.
   * ``pdf_path``: Path to the intermediate PDF generated when ``as_pdf=True`` and ``save_dir`` is specified.

skills/lexoid-python/SKILL.md:67

  • token_cost is described as present whenever api_cost_mapping is supplied, but parse() only sets it if the mapping contains an entry for the selected model. Consider noting that token_cost may be absent (or zero) when the model isn’t in the provided mapping.
    "token_usage": {"input": int, "output": int, "total": int, "llm_page_count": int},
    "token_cost": {...},         # only when api_cost_mapping is supplied
    "parsers_used": [str, ...],  # which parser ran per chunk

skills/lexoid-python/SKILL.md:69

  • This says pdf_path is only present when as_pdf=True and save_dir is set, but the implementation adds pdf_path whenever as_pdf=True (it may point to a temp file if save_dir isn’t provided). Please update this return-shape note to match actual behavior.
    "pdf_path": str,             # only when as_pdf=True and save_dir is set
}
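
The two return-shape notes above suggest that callers should treat token_cost and pdf_path as optional keys. A minimal sketch of reading them defensively follows; summarize_parse_result and the example dict are hypothetical names used only for illustration, and the dict is shaped after the keys quoted above rather than taken from a real parse() call.

```python
from typing import Any, Dict


def summarize_parse_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Read optional keys from a parse()-style result without assuming they exist."""
    token_usage = result.get("token_usage") or {}
    return {
        "total_tokens": token_usage.get("total"),
        # May be absent when the selected model has no entry in api_cost_mapping.
        "token_cost": result.get("token_cost"),
        # May point to a temporary file when as_pdf=True and save_dir is not provided.
        "pdf_path": result.get("pdf_path"),
        "parsers_used": result.get("parsers_used", []),
    }


# Hypothetical result dict; not output from an actual Lexoid run.
example = {
    "token_usage": {"input": 10, "output": 5, "total": 15, "llm_page_count": 1},
    "parsers_used": ["STATIC_PARSE"],
}
summary = summarize_parse_result(example)
print(summary)
```

Using .get() keeps the caller working whether or not the cost mapping covers the model, which is the case the review comment calls out.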

Comment thread docs/api.rst Outdated
Comment thread docs/api.rst Outdated
Comment thread skills/lexoid-python/SKILL.md Outdated
Comment thread docs/benchmark.rst Outdated
pramitchoudhary added the enhancement (New feature or request) label on May 25, 2026
pramitchoudhary linked an issue on May 25, 2026 that may be closed by this pull request
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
dilithjay linked an issue on May 25, 2026 that may be closed by this pull request

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Comment thread docs/api.rst Outdated
Comment thread docs/api.rst Outdated
Comment thread skills/lexoid-python/SKILL.md Outdated
Comment thread skills/lexoid-python/SKILL.md Outdated
Comment thread skills/lexoid-cli/SKILL.md Outdated
pramitchoudhary self-requested a review on May 28, 2026 01:55
@pramitchoudhary
Contributor

For example, query:

Sandbox/runtime path: <simple virtual .venv path>

For the given document, it defaults to:

from lexoid.api import parse

result = parse(
    str(pdf_path),
    parser_type="STATIC_PARSE",  # or OCR
    framework="pdfplumber",
    retry_on_fail=False,
)

Should we consider setting the default config to the following instead?

parse("doc.pdf", parser_type="AUTO", router_priority="accuracy", retry_on_fail=True)
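
One way to picture the suggestion is a small layer that starts from the proposed settings and lets a caller override them. This is only a sketch: PROPOSED_DEFAULTS and build_parse_kwargs are hypothetical names, and the values mirror the call proposed in this comment, not defaults confirmed in Lexoid itself.

```python
from typing import Any, Dict

# Values taken from the proposed call above; not confirmed as Lexoid's defaults.
PROPOSED_DEFAULTS: Dict[str, Any] = {
    "parser_type": "AUTO",
    "router_priority": "accuracy",
    "retry_on_fail": True,
}


def build_parse_kwargs(**overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: start from the proposed defaults, then apply overrides."""
    return {**PROPOSED_DEFAULTS, **overrides}


# A caller who still wants the static path overrides only what differs.
kwargs = build_parse_kwargs(parser_type="STATIC_PARSE", framework="pdfplumber")
print(kwargs)
```

The resulting dict could then be passed along as parse("doc.pdf", **kwargs), so the skill guide would only need to document the overrides.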

Comment thread skills/lexoid-python/SKILL.md

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

  • Add examples to use Lexoid as part of skills description
  • Verify that documentation is up to date

3 participants