docs: last bits of the revamp project (#1713)

miyoungc · tgasser-nv · web-flow · commit 3b00a0c0cfe5 · 2026-03-11T12:17:24.000-05:00
Signed-off-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com>
Co-authored-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com>
diff --git a/docs/conf.py b/docs/conf.py
@@ -91,8 +91,8 @@
     # Top-level pages
     "architecture": "about/how-it-works.html",
     "architecture/readme": "reference/colang-architecture-guide.html",
-    "faqs": "resources/faqs.html",
-    "glossary": "resources/glossary.html",
+    "faqs": "index.html",
+    "glossary": "index.html",
     "release-notes": "about/release-notes.html",
     "security/guidelines": "resources/security/guidelines.html",
     # Getting started
diff --git a/docs/configure-rails/caching/index.md b/docs/configure-rails/caching/index.md
@@ -14,6 +14,11 @@ content:
 
 # Caching Instructions and Prompts
 
+The NVIDIA NeMo Guardrails library provides two caching strategies to reduce inference latency.
+The in-memory model cache stores LLM responses and returns them for repeated prompts without calling the LLM again.
+KV cache reuse is a NIM-level optimization that avoids computation of the system prompt on each NemoGuard NIM call.
+You can enable either or both strategies independently.
+
 ::::{grid} 1 2 2 2
 :gutter: 3
 
diff --git a/docs/configure-rails/custom-initialization/index.md b/docs/configure-rails/custom-initialization/index.md
@@ -22,7 +22,7 @@ content:
 
 # Configuring Custom Initialization
 
-The `config.py` file contains initialization code that runs **once at startup**, before the `LLMRails` instance is fully initialized. Use it to register custom providers and set up shared resources.
+The `config.py` file contains initialization code that runs once at startup, before the `LLMRails` instance is fully initialized. Use it to register custom providers and set up shared resources.
 
 ## When to Use config.py vs actions.py
 
@@ -53,7 +53,7 @@ Define the init() function to initialize resources and register action parameter
 :link: custom-llm-providers
 :link-type: doc
 
-Register custom text completion (BaseLLM) and chat models (BaseChatModel) for use with NeMo Guardrails.
+Register custom text completion (BaseLLM) and chat models (BaseChatModel) for use with the NVIDIA NeMo Guardrails library.
 +++
 {bdg-secondary}`How To`
 :::
@@ -62,7 +62,7 @@ Register custom text completion (BaseLLM) and chat models (BaseChatModel) for us
 :link: custom-embedding-providers
 :link-type: doc
 
-Register custom embedding providers for vector similarity search in NeMo Guardrails.
+Register custom embedding providers for vector similarity search in the NVIDIA NeMo Guardrails library.
 +++
 {bdg-secondary}`How To`
 :::
diff --git a/docs/index.md b/docs/index.md
@@ -267,7 +267,5 @@ Troubleshooting <troubleshooting>
 :name: Resources
 :hidden:
 
-FAQs <resources/faqs.md>
-Glossary <resources/glossary.md>
 Security <resources/security/guidelines.md>
 ```
diff --git a/docs/resources/faqs.md b/docs/resources/faqs.md
diff --git a/docs/resources/glossary.md b/docs/resources/glossary.md
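
Note on the caching intro added in `docs/configure-rails/caching/index.md`: the in-memory model cache it describes (store an LLM response, return it for a repeated prompt without calling the LLM again) can be illustrated with a minimal sketch. This is not the library's implementation or API. `InMemoryPromptCache`, `fake_llm`, and the exact-match keying on the prompt string are all assumptions made for illustration.

```python
from typing import Callable, Dict


class InMemoryPromptCache:
    """Hypothetical cache keyed by the exact prompt string (illustration only)."""

    def __init__(self, llm_call: Callable[[str], str]) -> None:
        self._llm_call = llm_call          # function that actually calls the model
        self._store: Dict[str, str] = {}   # prompt -> cached response
        self.hits = 0
        self.misses = 0

    def generate(self, prompt: str) -> str:
        # Repeated prompt: return the stored response, skip the model call.
        if prompt in self._store:
            self.hits += 1
            return self._store[prompt]
        # New prompt: call the model once and remember the result.
        self.misses += 1
        response = self._llm_call(prompt)
        self._store[prompt] = response
        return response


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption for this sketch)."""
    return f"echo: {prompt}"


cache = InMemoryPromptCache(fake_llm)
first = cache.generate("Is this message safe?")
second = cache.generate("Is this message safe?")  # served from the cache
```

In this sketch the second call never reaches `fake_llm`, which is the latency saving the paragraph refers to. KV cache reuse happens inside the NIM and is not modeled here.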