docs(readme): reposition project around HF reference workflows #91
Alberto-Codes wants to merge 3 commits into main from
Conversation
Clarify that turboquant-vllm is primarily the HuggingFace/reference path for DynamicCache compression, verification, and architecture research, while the out-of-tree vLLM plugin remains an optional bridge.

- reframe README and docs index around HF/reference usage
- de-emphasize plugin-first messaging in vLLM and container docs
- highlight verify CLI and upstream native vLLM split

Refs #87
CI lint failed in uv-secure because the lockfile still resolved cryptography 46.0.6, which now reports GHSA-p423-j2cm-9vmq.

- upgrade cryptography in uv.lock to 46.0.7
- re-run uv-secure and lint-related local checks
- keep the docs change set otherwise unchanged

Refs #87
Pull request overview
Repositions repository documentation to emphasize the HuggingFace DynamicCache reference workflow (plus verification/research) as the primary project direction, while keeping the vLLM plugin path documented as an optional bridge.
Changes:
- Reframes the README to lead with HuggingFace/reference + verification workflows, and moves the vLLM plugin to an “optional bridge” section.
- Updates usage docs (vLLM/plugin, HuggingFace, container) to de-emphasize plugin-first messaging and clarify the upstream native vLLM split.
- Updates `uv.lock` to bump `cryptography` from 46.0.6 to 46.0.7.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Updates locked dependency version for cryptography. |
| README.md | Repositions project messaging around HF/reference workflows and verification; keeps plugin usage as optional. |
| docs/site/usage/vllm.md | Clarifies plugin as an optional out-of-tree bridge and points to upstream native vLLM direction. |
| docs/site/usage/huggingface.md | Labels HuggingFace DynamicCache as the primary workflow. |
| docs/site/usage/container.md | Reframes container instructions as being for the optional plugin bridge. |
| docs/index.md | Updates the repo docs index to highlight verification + HF/reference path and add a “choose the right path” guide. |
Or with [uv](https://docs.astral.sh/uv/):

```bash
TQ4_K_BITS=4 TQ4_V_BITS=3 vllm serve meta-llama/Llama-3.1-8B-Instruct --attention-backend CUSTOM
uv add turboquant-vllm
uv add turboquant-vllm --extra vllm
```
The uv install snippet currently shows two uv add commands back-to-back (base install and --extra vllm), which reads as if both should be run. This is redundant and confusing; they should be presented as alternatives (either the base install or the install with the vllm extra) so users don't add the dependency twice or wonder which command is correct.
Fixed. README now labels the two uv commands as alternatives and explains which workflow each command is for, so users do not read them as sequential steps.
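A labeled snippet along these lines would make the two commands read as alternatives rather than sequential steps (the exact wording is assumed here, not quoted from the merged README):

```bash
# Option A: base install, for the HuggingFace/reference workflow only (assumed label)
uv add turboquant-vllm

# Option B: install with the vllm extra, for the optional out-of-tree plugin bridge (assumed label)
uv add turboquant-vllm --extra vllm
```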
## Choose the Right Path

- **Use `turboquant-vllm`** when you want HuggingFace cache compression, model validation, multimodal experiments, or architecture/policy research
- **Use upstream native vLLM TurboQuant** when you want the in-tree serving path in vLLM
- **Use the plugin path here** only when you specifically need the out-of-tree bridge (`--attention-backend CUSTOM`)
This PR is described as repositioning the docs landing/index, but the rendered MkDocs site uses docs/site as docs_dir and its landing page is docs/site/index.md (see mkdocs.yml). Updates here in docs/index.md won’t affect the site landing page, so the public docs may still present the old plugin-first messaging unless the MkDocs index is updated too or this file is explicitly linked from the site.
Fixed. I updated docs/site/index.md, which is the MkDocs landing page configured by mkdocs.yml, so the public docs now lead with the HuggingFace/reference positioning instead of the old plugin-first messaging.
This repositions turboquant-vllm as the HuggingFace/reference path for DynamicCache compression, verification, and architecture research instead of presenting the out-of-tree vLLM plugin as the main long-term product direction.
Test: uv run python -m compileall src

Closes #87
PR Review
Checklist
- Tests pass (`uv run pytest`)
- Lint passes (`uv run ruff check .`)
- No `!` in title and no `BREAKING CHANGE:` in body

Review Focus
Related