parameterlab
diff --git a/‎.github/scripts/extract_changelog.py‎
Lines changed: 1 addition & 0 deletions b/‎.github/scripts/extract_changelog.py‎
Lines changed: 1 addition & 0 deletions
diff --git a/‎AGENTS.md‎
Lines changed: 6 additions & 2 deletions b/‎AGENTS.md‎
Lines changed: 6 additions & 2 deletions
diff --git a/‎CHANGELOG.md‎
Lines changed: 6 additions & 0 deletions b/‎CHANGELOG.md‎
Lines changed: 6 additions & 0 deletions
diff --git a/‎CONTRIBUTING.md‎
Lines changed: 5 additions & 4 deletions b/‎CONTRIBUTING.md‎
Lines changed: 5 additions & 4 deletions
diff --git a/‎README.md‎
Lines changed: 1 addition & 1 deletion b/‎README.md‎
Lines changed: 1 addition & 1 deletion
diff --git a/‎docs/guides/config-gathering.md‎
Lines changed: 2 additions & 2 deletions b/‎docs/guides/config-gathering.md‎
Lines changed: 2 additions & 2 deletions
diff --git a/‎docs/guides/message-tracing.md‎
Lines changed: 24 additions & 21 deletions b/‎docs/guides/message-tracing.md‎
Lines changed: 24 additions & 21 deletions
diff --git a/‎docs/index.md‎
Lines changed: 1 addition & 1 deletion b/‎docs/index.md‎
Lines changed: 1 addition & 1 deletion
diff --git a/‎maseval/core/agent.py‎
Lines changed: 7 additions & 36 deletions b/‎maseval/core/agent.py‎
Lines changed: 7 additions & 36 deletions
diff --git a/‎maseval/core/benchmark.py‎
Lines changed: 7 additions & 7 deletions b/‎maseval/core/benchmark.py‎
Lines changed: 7 additions & 7 deletions
@@ -19,6 +19,7 @@ def extract_section(version: str, changelog_path: Path) -> str:
     if not match:
         print(f"No changelog entry found for version {version}", file=sys.stderr)
         sys.exit(1)
+    assert match is not None
     return match.group(0).strip()
 
 
 
@@ -102,12 +102,12 @@ uv remove <package-name>
 
 **Framework Adapter Pattern:**
 
-When implementing wrappers for external frameworks, **always use the framework's native message storage as the source of truth**:
+When implementing adapters for external frameworks, **always use the framework's native message storage as the source of truth**:
 
 **Pattern 1: Persistent State (smolagents)**
 
 ```python
-class MyFrameworkWrapper(AgentAdapter):
+class MyFrameworkAdapter(AgentAdapter):
     def get_messages(self) -> MessageHistory:
         """Dynamically fetch from framework's internal storage."""
         # Get from framework (e.g., agent.memory, agent.messages)
@@ -236,3 +236,7 @@ For lists and dictionaries, use `Dict[...,...]`, `List[...]`, `Sequence[...]` et
 - DO NOT publicly distribute code or data
 - DO NOT publish without explicit permission
 - DO NOT share copyrighted third-party benchmark data
+
+## Changelog
+
+When the task is completed, add your changes to the Changelog.
@@ -9,12 +9,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+- The `logs` property inside `SmolAgentAdapter` and `LanggraphAgentAdapter` are now properly filled. (PR: #3)
+
 ### Changed
 
 ### Fixed
 
+- Consistent naming of agent `adapter` over `wrapper` (PR: #3)
+
 ### Removed
 
+- Removed `set_message_history`, `append_message_history` and `clear_message_history` for `AgentAdapter` and subclasses. (PR: #3)
+
 ## [0.1.2] - 2025-11-18
 
 ### Added
 
@@ -23,7 +23,7 @@ The `maseval` package is designed with a strict separation between its core logi
 
 1.  **`maseval/core`**: This is the heart of the library. It contains the essential logic and **must not** have any optional dependencies. It should be fully functional with a minimal installation.
 
-2.  **`maseval/interface`**: This contains adapters and wrappers for other multi-agent frameworks (like `crewai`, `langgraph`, etc.). All dependencies for these integrations are optional.
+2.  **`maseval/interface`**: This contains adapters for other multi-agent frameworks (like `crewai`, `langgraph`, etc.). All dependencies for these integrations are optional.
 
 > [!WARNING]
 > Code in `maseval/core` **must never** import from `maseval/interface`. This separation is critical to keep the core package lightweight and dependency-free. Breaking this rule will cause the library to fail.
@@ -197,11 +197,11 @@ The pipeline automatically performs the following tasks:
 
 ### 6. Implementing Framework Adapters
 
-When creating wrappers for external agent frameworks (in `maseval/interface/agents/`), follow these best practices to ensure consistency and reliability:
+When creating adapters for external agent frameworks (in `maseval/interface/agents/`), follow these best practices to ensure consistency and reliability:
 
 #### Message History Pattern
 
-**Always use the framework's native message storage as the source of truth.** Do not cache converted messages in the wrapper, as this can lead to inconsistencies if the framework's internal state changes.
+**Always use the framework's native message storage as the source of truth.** Do not cache converted messages in the adapter, as this can lead to inconsistencies if the framework's internal state changes.
 
 **Correct Pattern** (SmolAgents example):
 
@@ -256,13 +256,14 @@ When adding support for a new framework:
 - [ ] Add conditional import in `maseval/interface/agents/__init__.py`
 - [ ] Write integration tests in `tests/test_interface/`
 - [ ] Update documentation with usage examples
+- [ ] Provide a `logs` property inside the `AgentAdapter`.
 
 #### Framework-Specific Patterns
 
 **Pattern 1: Persistent State (smolagents)**
 
 ```python
-class MyFrameworkWrapper(AgentAdapter):
+class MyFrameworkAdapter(AgentAdapter):
     def get_messages(self) -> MessageHistory:
         """Dynamically fetch from framework's internal storage."""
         # Get from framework (e.g., agent.memory, agent.messages)
 
@@ -27,7 +27,7 @@ Analogous to pytest for testing or MLflow for ML experimentation, MASEval focuse
 
 - **Task-Specific Configurations:** Each benchmark task is a self-contained evaluation unit with its own instructions, environment state, success criteria, and custom evaluation logic. One task might measure success by environment state changes, another by programmatic output validation.
 
-- **Framework Agnostic by Design:** MASEval is intentionally unopinionated about agent frameworks, model providers, and system architectures. Simple, standardized interfaces and wrappers enable any agent system to be evaluated without modification to the core library.
+- **Framework Agnostic by Design:** MASEval is intentionally unopinionated about agent frameworks, model providers, and system architectures. Simple, standardized interfaces and adapters enable any agent system to be evaluated without modification to the core library.
 
 - **Lifecycle Hooks via Callbacks:** Inject custom logic at any point in the evaluation lifecycle (e.g., on_run_start, on_task_start, on_agent_step_end) through a callback system. This enables extensibility without modifying core evaluation logic.
 
 
@@ -124,8 +124,8 @@ class MyBenchmark(Benchmark):
     def setup_agents(self, agent_data, environment, task, user):
         model = MyModelAdapter(...)
         agent = MyAgent(model=model)
-        wrapper = AgentAdapter(agent, "agent")
-        return [wrapper], {"agent": wrapper}
+        adapter = AgentAdapter(agent, "agent")
+        return [adapter], {"agent": adapter}
     # ... other methods
 
 # Run benchmark
 
@@ -16,7 +16,7 @@ MASEval provides message tracing to capture agent conversations during benchmark
 
 ## Core Concepts
 
-**`MessageHistory`**: OpenAI-compatible message storage that all agent wrappers use internally.
+**`MessageHistory`**: OpenAI-compatible message storage that all agent adapters use internally.
 
 **`AgentAdapter.get_messages()`**: Standard method to retrieve conversation history from any wrapped agent.
 
@@ -26,17 +26,17 @@ MASEval provides message tracing to capture agent conversations during benchmark
 
 ### Accessing Message History
 
-Every agent wrapper exposes message history through `get_messages()`:
+Every agent adapter exposes message history through `get_messages()`:
 
 ```python
-from maseval.interface.agents import SmolAgentsWrapper
+from maseval.interface.agents import SmolAgentAdapter
 
 # Create and run your agent
-wrapper = SmolAgentsWrapper(agent, name="researcher")
-result = wrapper.run("What's the capital of France?")
+agent_adapter = SmolAgentAdapter(agent, name="researcher")
+result = agent_adapter.run("What's the capital of France?")
 
 # Get the conversation
-messages = wrapper.get_messages()
+messages = agent_adapter.get_messages()
 
 # Inspect messages
 for msg in messages:
@@ -45,18 +45,21 @@ for msg in messages:
         print(f"  Tools called: {[tc['function']['name'] for tc in msg['tool_calls']]}")
 ```
 
-### Clearing History Between Tasks
+### Fresh Conversations for Multiple Tasks
 
-In benchmarks, you typically want to clear history before each new task:
+In benchmarks, you typically want a fresh agent instance for each task:
 
 ```python
 # In your benchmark loop
 for task in benchmark.tasks:
-    wrapper.clear_message_history()  # Reset for new task
-    result = wrapper.run(task.query)
+    # Create a new adapter instance for each task
+    agent_adapter = YourAgentAdapter(agent_instance=agent, name="task_agent")
+    result = agent_adapter.run(task.query)
     evaluate(result, task.ground_truth)
 ```
 
+This ensures each task starts with a clean slate and avoids conversation history contamination.
+
 ## Using the Tracing Callback
 
 For multi-agent systems or when you need to collect conversations from many runs, use `MessageTracingAgentCallback`:
@@ -68,12 +71,12 @@ from maseval.core.callbacks import MessageTracingAgentCallback
 tracer = MessageTracingAgentCallback()
 
 # Attach to your agent(s)
-wrapper = SmolAgentsWrapper(agent, name="assistant", callbacks=[tracer])
+agent_adapter = SmolAgentAdapter(agent, name="assistant", callbacks=[tracer])
 
 # Run tasks
-wrapper.run("Task 1")
-wrapper.run("Task 2")
-wrapper.run("Task 3")
+agent_adapter.run("Task 1")
+agent_adapter.run("Task 2")
+agent_adapter.run("Task 3")
 
 # Get all conversations
 conversations = tracer.get_all_conversations()
@@ -93,8 +96,8 @@ Share one tracer across multiple agents to collect all conversations:
 tracer = MessageTracingAgentCallback()
 
 # Attach to multiple agents
-agent1 = SmolAgentsWrapper(agent1, name="researcher", callbacks=[tracer])
-agent2 = SmolAgentsWrapper(agent2, name="writer", callbacks=[tracer])
+agent1 = SmolAgentAdapter(agent1, name="researcher", callbacks=[tracer])
+agent2 = SmolAgentAdapter(agent2, name="writer", callbacks=[tracer])
 
 # Run both agents
 agent1.run("Research topic X")
@@ -119,7 +122,7 @@ tracer = MessageTracingAgentCallback()
 
 for batch in task_batches:
     for task in batch:
-        wrapper.run(task.query)
+        agent_adapter.run(task.query)
 
     # Process this batch
     conversations = tracer.get_all_conversations()
@@ -190,9 +193,9 @@ Messages use OpenAI's chat completion format:
 }
 ```
 
-## Custom Agent Wrappers
+## Custom Agent Adapters
 
-If you're implementing a custom wrapper, the framework handles message storage automatically via `get_messages()`. Just ensure your `_run_agent()` method returns a `MessageHistory`:
+If you're implementing a custom adapter, the framework handles message storage automatically via `get_messages()`. Just ensure your `_run_agent()` method returns a `MessageHistory`:
 
 ```python
 from maseval import AgentAdapter, MessageHistory
@@ -211,13 +214,13 @@ class MyAgentAdapter(AgentAdapter):
         return history
 ```
 
-See the [Agent Wrapper guide](../reference/agent.md) for details on implementing custom wrappers.
+See the [AgentAdapter guide](../reference/agent.md) for details on implementing custom adapters.
 
 ## Tips
 
 **For debugging**: Use `verbose=True` to see traces in real-time.
 
-**For benchmarks**: Clear history between tasks with `wrapper.clear_message_history()`.
+**For benchmarks**: Create a new adapter instance for each task to ensure clean conversation history.
 
 **For multi-agent systems**: Use a shared tracer and `get_conversations_by_agent()` to analyze each agent separately.
 
 
@@ -24,7 +24,7 @@ More details in the [Quickstart](getting-started/quickstart.md)
 
 - **Task-Specific Configurations:** Each benchmark task is a self-contained evaluation unit with its own instructions, environment state, success criteria, and custom evaluation logic. One task might measure success by environment state changes, another by programmatic output validation.
 
-- **Framework Agnostic by Design:** MASEval is intentionally unopinionated about agent frameworks, model providers, and system architectures. Simple, standardized interfaces and wrappers enable any agent system to be evaluated without modification to the core library.
+- **Framework Agnostic by Design:** MASEval is intentionally unopinionated about agent frameworks, model providers, and system architectures. Simple, standardized interfaces and adapters enable any agent system to be evaluated without modification to the core library.
 
 - **Lifecycle Hooks via Callbacks:** Inject custom logic at any point in the evaluation lifecycle (e.g., `on_run_start`, `on_task_start`, `on_agent_step_end`) through a callback system. This enables extensibility without modifying core evaluation logic.
 
 
@@ -1,16 +1,16 @@
 from abc import ABC, abstractmethod
-from typing import List, Any, Optional, Union, Dict
+from typing import List, Any, Optional, Dict
 
 from .callback import AgentCallback
-from .history import MessageHistory, RoleType
+from .history import MessageHistory
 from .tracing import TraceableMixin
 from .config import ConfigurableMixin
 
 
 class AgentAdapter(ABC, TraceableMixin, ConfigurableMixin):
     """Wraps an agent from any framework to provide a standard interface.
 
-    This wrapper provides:
+    This Adapter provides:
     - Unified execution interface via `run()`
     - Callback hooks for monitoring
     - Message history management via getter/setter
@@ -101,35 +101,6 @@ def get_messages(self) -> MessageHistory:
         """
         return self.messages if self.messages is not None else MessageHistory()
 
-    def set_message_history(self, history: MessageHistory) -> None:
-        """Set the message history.
-
-        This is typically called by _run_agent() implementations after executing
-        the agent, but can also be used to inject or modify history.
-
-        Args:
-            history: The MessageHistory to set
-        """
-        self.messages = history
-
-    def clear_message_history(self) -> None:
-        """Clear the message history."""
-        self.messages = None
-
-    def append_to_message_history(self, role: Union[RoleType, str], content: Union[str, List[Any]], **kwargs) -> None:
-        """Append a message to the history.
-
-        If no history exists, creates a new one.
-
-        Args:
-            role: The message role ("user", "assistant", "system", "tool")
-            content: The message content (string or list of content parts)
-            **kwargs: Additional fields (name, metadata, timestamp, etc.)
-        """
-        if self.messages is None:
-            self.messages = MessageHistory()
-        self.messages.add_message(role, content, **kwargs)  # type: ignore
-
     def gather_traces(self) -> dict[str, Any]:
         """Gather execution traces from this agent.
 
@@ -148,7 +119,7 @@ def gather_traces(self) -> dict[str, Any]:
 
         How to use:
             This method is automatically called by Benchmark during trace collection.
-            Framework-specific wrappers can extend this to include additional data:
+            Framework-specific adapters can extend this to include additional data:
 
             ```python
             def gather_traces(self) -> dict[str, Any]:
@@ -181,12 +152,12 @@ def gather_config(self) -> dict[str, Any]:
             - gathered_at: ISO timestamp
             - name: Agent name
             - agent_type: Underlying agent framework class name
-            - wrapper_type: The specific wrapper class (e.g., SmolAgentAdapter)
+            - adapter_type: The specific adapter class (e.g., SmolAgentAdapter)
             - callbacks: List of callback class names attached to this agent
 
         How to use:
             This method is automatically called by Benchmark during config collection.
-            Framework-specific wrappers can extend this to include additional data:
+            Framework-specific adapters can extend this to include additional data:
 
             ```python
             def gather_config(self) -> dict[str, Any]:
@@ -200,7 +171,7 @@ def gather_config(self) -> dict[str, Any]:
             **super().gather_config(),
             "name": self.name,
             "agent_type": type(self.agent).__name__,
-            "wrapper_type": type(self).__name__,
+            "adapter_type": type(self).__name__,
             "callbacks": [type(cb).__name__ for cb in self.callbacks],
         }
 
 
@@ -64,8 +64,8 @@ def setup_environment(self, agent_data, task):
 
                 def setup_agents(self, agent_data, environment, task, user):
                     agent = MyAgent(model=agent_data["model"])
-                    wrapper = AgentAdapter(agent, "agent")
-                    return [wrapper], {"agent": wrapper}
+                    agent_adapter = AgentAdapter(agent, "agent")
+                    return [agent_adapter], {"agent": agent_adapter}
 
                 def run_agents(self, agents, task, environment):
                     return agents[0].run(task.query)
@@ -258,10 +258,10 @@ def setup_agents(self, agent_data, environment, task, user):
 
                 # Create agent (auto-registered when returned)
                 agent = MyAgent(model=model)
-                wrapper = AgentAdapter(agent, "agent1")
+                agent_adapter = AgentAdapter(agent, "agent1")
 
                 # Environment and user are also auto-registered
-                return [wrapper], {"agent1": wrapper}
+                return [agent_adapter], {"agent1": agent_adapter}
             ```
 
             Traces and configs are automatically collected before evaluation via
@@ -673,12 +673,12 @@ def setup_agents(self, agent_data, environment, task, user):
                     model=model,
                     managed_agents=[w.agent for w in workers.values()]
                 )
-                orchestrator_wrapper = AgentAdapter(orchestrator, "orchestrator")
+                orchestrator_adapter = AgentAdapter(orchestrator, "orchestrator")
 
                 # Return orchestrator to run, but all agents for monitoring
                 # All agents auto-registered for tracing
-                all_agents = {"orchestrator": orchestrator_wrapper, **workers}
-                return [orchestrator_wrapper], all_agents
+                all_agents = {"orchestrator": orchestrator_adapter, **workers}
+                return [orchestrator_adapter], all_agents
             ```
         """
         pass