Commit 6190010

docs: add architecture diagram for intrinsics (#998)

Signed-off-by: Jake LoRocco <jake.lorocco@ibm.com>

1 parent: 8a835c5

5 files changed: 42 additions & 84 deletions

docs/docs/advanced/intrinsics.md

Lines changed: 1 addition & 0 deletions

@@ -19,6 +19,7 @@ reliable than prompting a general-purpose model for these specialized micro-task
 > - **OpenAIBackend** — uses a Granite Switch model served via vLLM with
 >   `load_embedded_adapters=True`. Only intrinsics embedded in the model are
 >   available — check the model's `adapter_index.json` for the list.
+>   See `docs/docs/examples/granite-switch/README.md`
 >
 > Intrinsics do not work with Ollama or other remote backends.

(new binary image file, 127 KB; not rendered in the diff view)

docs/examples/granite-switch/README.md

Lines changed: 6 additions & 6 deletions

@@ -26,11 +26,7 @@ python -m vllm.entrypoints.openai.api_server \
 
 ## Available adapters
 
-Not all intrinsics are embedded in every Granite Switch model. Check the model's
-`adapter_index.json` for the list of available adapters. The current model
-includes: `answerability`, `citations`, `context_relevance`, `guardian-core`,
-`hallucination_detection`, `query_clarification`, `query_rewrite`, and
-`requirement-check`.
+Not all intrinsics are embedded in every Granite Switch model. Check the model's `adapter_index.json` file for a definitive list. For Granite Switch models pre-built by IBM, we include a list of models in the Mellea `model_id`.
 
 ## Files

@@ -52,7 +48,11 @@ Shows how to manually load embedded adapters using
 you only need a subset of adapters or want more control over adapter
 registration.
 
+## Architecture
+![Granite Libraries Software Stack Architecture in Mellea](../../docs/images/granite-libraries-mellea-architecture.png)
+
 ## Related
 
 - [`../intrinsics/`](../intrinsics/) — the same intrinsics using `LocalHFBackend`
-- [Intrinsics documentation](../../docs/docs/advanced/intrinsics.md)
+- [Intrinsics Documentation](../../docs/docs/advanced/intrinsics.md)
+- [Official Granite Switch Documentation](https://github.com/generative-computing/granite-switch)
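The README changed above tells readers to inspect the model's `adapter_index.json` for the embedded adapters. As a rough illustration of that step — the JSON schema below is an assumption, not the documented Granite Switch format — a few lines of plain Python can list the adapter names:

```python
import json

# Hypothetical adapter_index.json contents; the real schema may differ,
# so check the model repository for the actual layout.
SAMPLE_INDEX = """
{
  "adapters": [
    {"name": "answerability"},
    {"name": "citations"},
    {"name": "requirement-check"}
  ]
}
"""

def list_adapters(index_text: str) -> list[str]:
    """Return the adapter names declared in an adapter index document."""
    index = json.loads(index_text)
    return [entry["name"] for entry in index.get("adapters", [])]

print(list_adapters(SAMPLE_INDEX))
```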

docs/examples/intrinsics/README.md

Lines changed: 34 additions & 77 deletions

@@ -2,54 +2,6 @@
 
 This directory contains examples for using Mellea's intrinsic functions - specialized model capabilities accessed through adapters.
 
-## Files
-
-### intrinsics.py
-Core example showing how to directly use intrinsics with adapters.
-
-**Key Features:**
-- Creating and adding adapters to backends
-- Using `Intrinsic` component for specialized tasks
-- Working with Granite Common adapters (aLoRA-based)
-- Understanding adapter output formats
-
-### answerability.py
-Checks if a question can be answered given the context.
-
-### citations.py
-Validates and extracts citations from generated text.
-
-### context_relevance.py
-Assesses if retrieved context is relevant to a query.
-
-### hallucination_detection.py
-Detects when model outputs contain hallucinated information.
-
-### query_rewrite.py
-Rewrites queries for better retrieval or understanding.
-
-### uncertainty.py
-Estimates the model's certainty about answering a question.
-
-### requirement_check.py
-Detect if text adheres to provided requirements.
-
-### policy_guardrails.py
-Checks if a scenario is compliant/non-compliant/ambiguous with respect to a given policy,
-
-### guardian_core.py
-Uses the guardian-core LoRA adapter for safety risk detection, including prompt-level harm, response-level social bias, RAG groundedness, and custom criteria.
-
-### factuality_detection.py
-Detects if the the model's output is factually incorrect relative to context.
-
-### factuality_correction.py
-Corrects a factually incorrect response relative to context.
-
-### context_attribution.py
-Identifies sentences in conversation history and documents that most influenced the response.
-
 ## Concepts Demonstrated
 
 - **Intrinsic Functions**: Specialized model capabilities beyond text generation
@@ -61,25 +13,20 @@ Identifies sentences in conversation history and documents that most influenced
 ## Basic Usage
 
 ```python
-from mellea.backends.huggingface import LocalHFBackend
-from mellea.backends.adapters.adapter import IntrinsicAdapter
-from mellea.stdlib.components import Intrinsic
-import mellea.stdlib.functional as mfuncs
+from mellea import model_ids, start_backend
+from mellea.stdlib import functional as mfuncs
+from mellea.stdlib.components.intrinsic import core
 
-# Create backend and adapter
-backend = LocalHFBackend(model_id="ibm-granite/granite-4.0-micro")
-adapter = IntrinsicAdapter("requirement_check",
-                           base_model_name=backend.base_model_name)
-backend.add_adapter(adapter)
-
-# Use intrinsic
-out, new_ctx = mfuncs.act(
-    Intrinsic(
-        "requirement_check",
-        intrinsic_kwargs={"requirement": "The assistant is helpful."}),
-    ctx,
-    backend
+ctx, backend = start_backend(
+    "hf", model_id=model_ids.IBM_GRANITE_4_1_3B, context_type="chat"
 )
+
+response, ctx = mfuncs.chat("What is 2 + 2?", ctx, backend)
+print(f"Response: {response.content}")
+
+# NOTE: There are additional functions for other intrinsics as well.
+result = core.check_certainty(ctx, backend)
+print(f"Certainty score: {result}")
 ```
 
 OpenAIBackends also support a type of embedded adapter for Granite Switch models:
@@ -89,8 +36,15 @@ backend = OpenAIBackend(
     load_embedded_adapters=True,  # Auto-loads adapters from huggingface repo.
     ...
 )
+```
+
+The underlying intrinsics can also be utilized directly when generating:
+```python
+from mellea.stdlib.components import Intrinsic
+import mellea.stdlib.functional as mfuncs
+
+...
 
-# Assumes the model has the `requirement_check` adapter embedded.
 out, new_ctx = mfuncs.act(
     Intrinsic(
         "requirement_check",
@@ -103,29 +57,32 @@ out, new_ctx = mfuncs.act(
 For complete runnable examples using the OpenAI backend with Granite Switch,
 see [`../granite-switch/`](../granite-switch/).
 
-> **Note:** Not all intrinsics are embedded in every Granite Switch model. The
-> current model includes: `answerability`, `citations`, `context_relevance`,
-> `guardian-core`, `hallucination_detection`, `query_clarification`,
-> `query_rewrite`, and `requirement-check`. Check the model's
-> `adapter_index.json` for the full list.
+> **Note:** Not all intrinsics are embedded in every Granite Switch model. Check the
+> model's `adapter_index.json` file for a definitive list. For Granite Switch models
+> pre-built by IBM, we include a list of models in the Mellea `model_id`.
 
 ## Available Intrinsics
 
-- **requirement_check**: Validate requirements (used by ALoraRequirement)
 - **answerability**: Determine if question is answerable
 - **citations**: Extract and validate citations
+- **context-attribution**: Identify context sentences that most influenced response
 - **context_relevance**: Assess context-query relevance
+- **factuality_correction**: Correct factually incorrect responses
+- **factuality_detection**: Detect factually incorrect responses
+- **guardian-core**: Safety risk detection (harm, bias, groundedness, custom criteria)
 - **hallucination_detection**: Detect hallucinated content
+- **policy_guardrails**: Determine if scenario complies with policy
+- **query_clarification**: Generate a clarification request if needed, otherwise "CLEAR"
 - **query_rewrite**: Improve query formulation
+- **requirement_check**: Validate requirements (used by ALoraRequirement)
 - **uncertainty**: Estimate certainty about answering a question
-- **policy_guardrails**: Determine if scenario complies with policy
-- **guardian-core**: Safety risk detection (harm, bias, groundedness, custom criteria)
-- **factuality_detection**: Detect factually incorrect responses
-- **factuality_correction**: Correct factually incorrect responses
-- **context-attribution**: Identify context sentences that most influenced response
+
+## Architecture
+![Granite Libraries Software Stack Architecture in Mellea](../../docs/images/granite-libraries-mellea-architecture.png)
 
 ## Related Documentation
 
 - See `mellea/stdlib/components/intrinsic/` for intrinsic implementations
 - See `mellea/backends/adapters/` for adapter system
 - See `docs/dev/intrinsics_and_adapters.md` for architecture details
+- See `docs/docs/examples/granite-switch/README.md` for more about granite-switch
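The README changed above describes a pattern of a backend holding named adapters, with each intrinsic resolved by name at call time. The shape of that pattern can be sketched in plain Python (an illustrative toy, not Mellea's actual API; every name below is made up):

```python
from typing import Callable, Dict

# A handler stands in for an adapter's specialized capability.
Handler = Callable[[dict], str]

class ToyBackend:
    """Toy stand-in for a backend with a named-adapter table."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Handler] = {}

    def add_adapter(self, name: str, handler: Handler) -> None:
        self._adapters[name] = handler

    def run_intrinsic(self, name: str, kwargs: dict) -> str:
        # Resolve the intrinsic by name, as the docs describe; unknown
        # names fail loudly rather than falling back to plain generation.
        if name not in self._adapters:
            raise KeyError(f"no adapter registered for intrinsic {name!r}")
        return self._adapters[name](kwargs)

backend = ToyBackend()
backend.add_adapter(
    "requirement_check",
    lambda kw: f"checked requirement: {kw['requirement']}",
)
print(backend.run_intrinsic("requirement_check",
                            {"requirement": "The assistant is helpful."}))
```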

docs/examples/intrinsics/context_relevance.py

Lines changed: 1 addition & 1 deletion

@@ -14,7 +14,7 @@
 ctx, backend = start_backend(
     "hf", model_id=model_ids.IBM_GRANITE_4_MICRO_3B, context_type="chat"
 )
-# NOTE: this example uses Granite 4.0 micro because there is no context_relevance intrinsic for Graniet 4.1
+# NOTE: this example uses Granite 4.0 micro because there is no context_relevance intrinsic for Granite 4.1
 
 question = "Who is the CEO of Microsoft?"
 document = (
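The `context_relevance` intrinsic in the example above scores whether a retrieved document is relevant to a query. As a model-free illustration of the same idea (a crude keyword-overlap baseline, not the Granite intrinsic or its scoring), one might compute:

```python
import re

def words(text: str) -> set:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(query: str, document: str) -> float:
    """Fraction of query words longer than 3 chars that appear in the document.

    A toy stand-in for a learned relevance score: 0.0 = no overlap, 1.0 = all.
    """
    q = {w for w in words(query) if len(w) > 3}
    if not q:
        return 0.0
    d = words(document)
    return sum(1 for w in q if w in d) / len(q)

print(overlap_score("Who is the CEO of Microsoft?",
                    "Satya Nadella is the CEO of Microsoft."))
```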
