v0.6.0 - 2026-05-19
- feat: adding typing to melleatools by @akihikokuroda in #959
- feat: add tool calling support to m serve by @markstur in #850
- feat: cli OpenAI-compatible API `response_format` support by @markstur in #884
- feat: allow async functions as tools by @ajbozarth in #1041
- feat: add as_generic_chat_history function to convert any Context to … by @akihikokuroda in #1007
- feat: add MCP tool integration by @ajbozarth in #1042
- feat(telemetry): close five OTel GenAI semantic convention emission gaps (#1035) by @planetf1 in #1036
- feat: add bash tool by @akihikokuroda in #1056
- refactor: get instructions from upstream guardian adapters by @psschwei in #1037
- feat(stdlib): add stream_with_chunking() with per-chunk validation (#901) by @planetf1 in #942
- feat(stdlib): add streaming event types, events() iterator, and OTEL bridge (#902) by @planetf1 in #1095
- fix: tool call arguments by @akihikokuroda in #896
- fix: Remove tests and examples for deprecated models by @frreiss in #1012
- fix: remove unused keys from Message template args by @ajbozarth in #1010
- fix: silence false-positive context warning in react framework by @ajbozarth in #1009
- fix: populate `context_view` in `COMPONENT_PRE_EXECUTE` payload by @araujof in #941
- fix: skip tool hooks for framework-internal tools by @araujof in #939
- fix: drop unused docs dependency group by @ajbozarth in #1044
- fix: log unexpected errors server-side, return generic message to client (#991) by @SAY-5 in #1057
- fix: make FunctionParameters and LogitBias OpenAI-compatible by @markstur in #1039
- fix(backends): capture vLLM reasoning field in mot._thinking by @planetf1 in #1063
- fix(test): loosen fragile equality assertions in astream incremental tests by @planetf1 in #1069
- fix: update docs and comment references to python 3.11 and remove rust mention by @jakelorocco in #1089
- docs(openai): document empty response from thinking-mode models by @planetf1 in #1062
- fix: revert feat: add bash tool by @akihikokuroda in #1098
- docs: add canonical url headers by @AngeloDanducci in #961
- docs: add release and testing notes by @jakelorocco in #1031
- ci: more flexible pr template handling by @psschwei in #1082
- ci: better pr type checker experience by @psschwei in #1083
- chore: Add Ollama model name mappings for Granite 4.1 to intrinsics adapter resolution by @kndtran in #1085
- ci: migrate hold and pr-label actions to common org versions by @psschwei in #1081
Full Changelog: https://github.com/generative-computing/mellea/compare/v0.5.0...v0.6.0
v0.5.0 - 2026-05-05
- feat(telemetry): latency histograms for LLM request duration and TTFB (#463) by @ajbozarth in #782
- feat: rename generative slots -> generative stubs by @jakelorocco in #801
- feat: (m-decompose) Module Prompt V3 by @csbobby in #770
- feat: simplify plugin tests; fix plugin resetting by @jakelorocco in #819
- feat: add examples and tooling tests to run_tests_with_ollama_and_vllm by @jakelorocco in #821
- feat: add return types to invoke_hook by @jakelorocco in #707
- feat: separate out remaining dependencies and improve tests by @jakelorocco in #789
- feat: add error counter metrics categorized by semantic type (#465) by @ajbozarth in #856
- refactor: improve fancylogger implementation by @AngeloDanducci in #792
- refactor: add otel tracing filter to logging by @AngeloDanducci in #859
- feat: streaming support in m serve OpenAI API server by @markstur in #823
- feat: first pass at carrying contextvars through async flows by @AngeloDanducci in #878
- refactor: add print statements to show code flow in mify example by @code4days in #870
- feat: add pricing registry and cost metrics (#464) by @ajbozarth in #882
- feat: add operational counters for sampling, requirements, and tools (#467) by @ajbozarth in #883
- feat: add --skip-resource-checks flag to bypass hardware capability g… by @ajbozarth in #889
- refactor!: partition ModelOutputThunk execution metadata into Generat… by @ajbozarth in #908
- feat: add additional logging handlers by @AngeloDanducci in #907
- feat(core): add PartialValidationResult with tri-state semantics by @planetf1 in #924
- feat(stdlib): add ChunkingStrategy ABC and built-in chunkers by @planetf1 in #923
- feat: add prompt cache token support to cost telemetry by @ajbozarth in #936
- feat: add stream_validate() hook to Requirement (#900) by @planetf1 in #925
- feat(examples): add extra_requirements param to IVR qiskit validation by @ajbozarth in #955
- feat: add embedded adapters (granite switch) to openai backend by @jakelorocco in #881
- refactor(telemetry): replace builtin_pricing.json with litellm pricing API by @ajbozarth in #956
- feat: simplify intrinsics (code and examples) by @jakelorocco in #946
- feat: granite4.1 by @avinash2692 in #964
- feat: allow `name` field in intrinsics io.yaml by @ink-pad in #980
- feat: handle message docs correctly by @jakelorocco in #975
- feat: update granite library examples to use Granite 4.1 3B adapters. by @nrfulton in #981
- fix: restore example collection during directory traversal (#794) by @planetf1 in #795
- fix: redirect /how-to/safety-guardrails to existing security page (#788) by @planetf1 in #803
- fix(cli): handle sync/async serve functions in m serve by @markstur in #784
- fix: evict Ollama models between test modules to prevent memory starvation by @planetf1 in #804
- fix: sofai graph coloring example — broken model and incorrect problem #806 by @planetf1 in #807
- fix: flush MPS cache in alora test GPU cleanup (#790) by @planetf1 in #800
- fix(test): widen hallucination detection tolerance (#809) by @planetf1 in #810
- fix: reload module for telemetry testing so all tests can run by @jakelorocco in #805
- fix: handle stale .vllm-venv in test runner by @planetf1 in #829
- fix: remove all mentions to RITS by @guicho271828 in #868
- fix: granite33 response_end span uses sentence length not full respon… by @planetf1 in #845
- fix: run zizmor checker for github actions to ensure security by @jakelorocco in #854
- fix: render Click \b verbatim blocks in CLI reference docs (#866) by @planetf1 in #867
- fix: fixes invalid workflow file by @markstur in #877
- fix: granite33 citation spans wrong for duplicate sentences (#851) by @planetf1 in #872
- fix: fixing test bugs with xfail by @avinash2692 in #886
- fix: handle nested JSON in parse_judge_output via raw_decode by @sjoerdvink99 in #875
- fix: disable OCR in RichDocument CI test to avoid modelscope.cn download by @ajbozarth in #888
- fix: update hallucination_detection fixture for upstream NA enum addition by @ajbozarth in #918
- fix: remove wall time checks from tracing_backend tests by @jakelorocco in #927
- fix: add missing nav and fix cli ref by @AngeloDanducci in #922
- fix: add vllm pytest marker back by @jakelorocco in #933
- fix: raise ValueError on duplicate subtask tags in reorder_subtasks by @sjoerdvink99 in #874
- fix: replace asyncio.sleep FAF guards with deterministic awaits by @ajbozarth in #919
- fix: removing ollama hardcoding in examples, guardian, and test by @avinash2692 in #912
- fix: pin uncertainty and context-attribution revisions and update uncertai… by @AngeloDanducci in #970
- fix: swap python decompose example model by @AngeloDanducci in #968
- fix: model options with intrinsics by @jakelorocco in #972
- fix: add guardian intrinsic document by @subhajitchaudhury in #966
- fix: key in json object returned by policy_guardrails intrinsic by @monindersingh in #979
- fix: default intrinsic adapter types by @jakelorocco in #994
- fix: issues introduced by intrinsic changes by @jakelorocco in #986
- fix: update model ids and documentation links for switch by @jakelorocco in #997
- fix: move test_huggingface.py to granite4.1; and small rag intrinsic … by @jakelorocco in #1008
- fix: prevent major releases by @jakelorocco in #1016
- docs: add redirects for former pages by @psschwei in #846
- docs: add CLI reference page and remove CLI from API docs (#704) by @planetf1 in #852
- docs: add AI attribution policy by @ajbozarth in #848
- docs: consolidate how-to section by @psschwei in #893
- docs: add generation_error hook to plugins page, remove stale plan doc by @ajbozarth in #887
- docs: fix 'convienance' -> 'convenience' (5 occurrences) by @MukundaKatta in #894
- docs: move glossary to reference section by @psschwei in #892
- docs: document two session creation patterns by @akihikokuroda in #906
- docs: add backend selection lookup table by @akihikokuroda in #905
- docs: restructure sidebar — split Observability from Evaluation, move LLM-as-a-Judge to How-To by @ajbozarth in #895
- docs: add metadata to code block by @akihikokuroda in #917
- docs: test based eval documentation by @seirasto in #916
- docs: fix link to CONTRIBUTING guide by @seirasto in #960
- docs: add expected output blocks and update quickstart examples by @AngeloDanducci in #957
- docs: add architecture diagram for intrinsics by @jakelorocco in #998
- chore: update governance by @psschwei in #799
- test: add unit tests for stdlib/requirements (#814) by @planetf1 in #820
- test: add tool_arg_validator edge case test, fix typo (#826) by @planetf1 in #831
- test: add unit tests for helpers (#815) by @planetf1 in #847
- test: add unit tests for granite formatters (#812) by @planetf1 in #818
- test: unit tests for backend pure logic (cache, catalog, bedrock) by @planetf1 in #832
- chore: add info for working with intrinsics to AGENTS.md by @psschwei in #768
- test: add unit and integration tests for stdlib components (#817) by @planetf1 in #830
- test: unit tests for CLI decompose and eval pure-logic helpers (#861) by @planetf1 in #863
- test: pure-logic unit tests for stdlib, core, backends, telemetry (#860) by @planetf1 in #862
- ci: add actionlint to validate workflow files on PRs by @planetf1 in #880
- chore: Update expected test outputs to reflect upstream config changes by @frreiss in #897
- chore: removing some comments by @avinash2692 in #978
- test: add tests for new intrinsic field name by @jakelorocco in #988
- release: bump minor version by @jakelorocco in #977
- ci: add action for holding PRs (preventing merge) by @psschwei in #1014
- @sjoerdvink99 made their first contribution in #875
- @MukundaKatta made their first contribution in #894
- @seirasto made their first contribution in #916
- @subhajitchaudhury made their first contribution in #966
- @monindersingh made their first contribution in #979
Full Changelog: https://github.com/generative-computing/mellea/compare/v0.4.2...v0.5.0
v0.4.2 - 2026-04-08
- feat: add tests for mellea optional dependencies by @jakelorocco in #724
- feat: further vram optimizations by @avinash2692 in #765
- feat: (m decomp) M Decompose Readme and Docstring Updates by @csbobby in #767
- feat: add top level async streaming by @jakelorocco in #655
- feat(serve): improve OpenAI API compatibility with usage, finish_reas… by @markstur in #771
- feat: removing vllm backend by @avinash2692 in #781
- fix: modifications to granite formatter tests by @jakelorocco in #703
- fix: exclude tooling from mypy check by @planetf1 in #748
- fix: setting ollama host in conftest by @avinash2692 in #751
- fix: Add qualitative and slow markers so the example is skipped by @markstur in #764
- fix(tools): correct args validation in langchain tool wrapper by @markstur in #761
- fix: remove references to old pytest markers by @jakelorocco in #776
- fix: add error handling to OpenAI-compatible serve endpoint by @markstur in #774
- fix: assertion for test_find_context_attributions and range for hallucination detection by @jakelorocco in #779
- fix: add xfail to citation test; functionality is tested elsewhere by @jakelorocco in #787
- docs: remove discord link in main readme by @AngeloDanducci in #720
- docs: note virtual environment requirement for pre-commit hooks by @ajbozarth in #745
- docs: condense README to elevator pitch (#478) by @planetf1 in #688
- docs: update qiskit_code_validation example defaults by @ajbozarth in #743
- docs: remove pre-IVR validation and update readme with v2 benchmark results by @ajbozarth in #769
- docs: add multi-turn strategy option to Qiskit code validation example by @vabarbosa in #717
- chore: use github tooling to build release notes by @psschwei in #710
- docs: add release.md by @psschwei in #723
- fix: proper permissions on pr labeling job by @psschwei in #741
- ci: memory management in tests by @avinash2692 in #721
- chore: enforce commit formatting on PR titles by @psschwei in #750
- chore: Update HF repo names by @frreiss in #753
- ci: drop mergify, add release entry to pr-labels action by @psschwei in #752
- ci: fix to make pr label job required check by @psschwei in #756
- test: agent skills infrastructure and marker taxonomy audit (#727, #728) by @planetf1 in #742
- chore: add governance doc by @psschwei in #786
- chore: updating governance doc to use maintainers by @psschwei in #791
- @markstur made their first contribution in #764
Full Changelog: https://github.com/generative-computing/mellea/compare/v0.4.1...v0.4.2
v0.4.1 - 2026-03-23
- Move ruff hooks locally; add output for ci/cd autofixes; update (#709) (f0e778e)
- m-decomp: Upgraded pipeline and added README, examples, and fixed module issues (#676) (cf63d92)
- Add missing dependencies (#715) (4bb16c8)
- Add special handling for mellea global event loop when forked (#624) (a620440)
- Update github action versions to Node24 compatible (#713) (4c0bb1b)
- Increase test timeout and remove unnecessary hook debugging (#706) (871a4bf)
v0.4.0 - 2026-03-18
- Guardianlib intrinsics (#8) (#678) (224d14f)
- Add `find_context_attributions()` intrinsic function (#679) (7eaf9b7)
- Add codeowners for the granite-common part of mellea intrinsics (#669) (a4ec484)
- UQ & requirement_check as `core` Intrinsic (#551) (3e47d15)
- Add OTLP logging export (#635) (c4cb59f)
- telemetry: Add configurable metrics exporters (OTLP and Prometheus) (#610) (5ec3c7a)
- Hook system and plugin support for Mellea (#582) (cbd63bd)
- Add token usage metrics with OpenTelemetry integration (#563) (0e71558)
- Move functionality of granite-common to mellea (#571) (6901c93)
- Add OpenTelemetry metrics support (#553) (78c5aab)
- Always populate mot.usage in HuggingFace backend (#694) (#697) (4d3fc1b)
- Add opencv-python-headless to docling extras (#682) (#685) (80000af)
- Skip pytest collection of qiskit validation_helpers module (#683) (#686) (ab56c85)
- Remove answer_relevance* intrinsics; fix other intrinsics issues (#690) (1734900)
- Use tuple instead of generator for DropDuplicates dictionary key (#652) (f7ad489)
- Document.parts() returns [] instead of raising NotImplementedError (#637) (3888476)
- Add missing type annotations to public API functions (#619) (97b2ceb)
- Update MultiTurnStrategy to include validation failure reasons in repair messages (#633) (ebdd092)
- Restore VSCode test discovery and make GPU isolation opt-in (#605) (21746b1)
- Hf metrics tests run out of memory (#623) (5411760)
- Guarding optional imports for hooks (#627) (9588284)
- Python decompose model change and pipeline fix (#569) (15d8fff)
- Explicit PYTHONPATH for isolated test subprocesses (#593) (#594) (7bfd18d)
- Use device_map for HF model loading (#581) (#587) (8a385d5)
- Ensure enough tokens for structured output in vLLM test (#591) (#595) (ac6a4cf)
- Prevent example collection crash for readme_generator (#596) (0e56243)
- Include fixes issue in pr template (#602) (a3f3f71)
- Do not post_process before finally in ModelOutputThunk.astream (#580) (af25037)
- Correct type annotations and improve CI cache invalidation (#579) (dfc8942)
- Issues with tests (alora example, rag intrinsics, mistral tool use, vllm auto-skip) (#570) (4cc75c8)
- Refactor telemetry docs into dedicated tracing, metrics, and logging pages (#662) (56e7ff9)
- Add missing example categories to examples catalogue (#645) (#672) (a86fe40)
- Fix MelleaPlugin/MelleaBasePayload missing from API coverage (#… (#670) (17d48d7)
- Removed outdated tutorial.md (#555) (a0e2a46)
- Pre-release verification (resync with latest docs, fix discrepancies) (#665) (e1f34cd)
- Fix RST double-backtick notation breaking API cross-reference links (#658) (98c0e22)
- Add plugins page to nav, apply standards, trim design doc (#663) (3c0cfa4)
- Fix missing docstring sections in plugins and telemetry (#654) (#664) (8a84987)
- Improve docstrings for API reference (#612) (#614) (f7294d0)
- Add Qiskit code validation IVR example (#576) (ea8d21e)
- Implement publishing pipeline (#617) (#646) (0c5d9c9)
- Complete developer documentation rewrite (#480) (#601) (ed01c87)
- Docs/api pipeline improvements (#611) (3d6755d)
v0.3.2 - 2026-02-26
- Issues found in comprehensive tests: cache capacity, watsonx (#560) (ff00e89)
- Nonhybrid granite model id (#546) (dc94364)
- Huggingface memory leak (#544) (2f74853)
- Self._tokenizer is unset (#549) (5ac4b2f)
- Avoid instantiating an additional tokenizer (#548) (05f0a91)
- Allow mypy to install type stubs (#487) (2bb34d6)
- mellea decomp: Solve ConstraintExtractor parsing fails and improve robustness (#445) (ca3a7f2)
v0.3.1 - 2026-02-11
- Migrate from Granite 3 to Granite 4 hybrid models (#357) (8f9e18c)
- Add MelleaTool.from_smolagents() for smolagents integration (#430) (0471006)
- Add tool calling argument validation (#364) (840a02d)
- Instrument telemetry (#355) (b2e5a52)
- Add query clarification RAG intrinsic support (#391) (d38698a)
- Add mellea react agent (#402) (7884b8d)
- Optimize example test discovery and execution speed (#372) (e9aefaf)
- New MelleaTool class and adoption across mellea (#380) (ffb8b6c)
- Add code coverage tracking with pytest-cov (#353) (b45a4b6)
- Add pytest markers for test categorization (#322) (#326) (0d8d020)
- Lint/format issues (#536) (781bb6b)
- Tools in examples (#535) (a49bdf8)
- Quick fix to get the role / content from specifically parsed messages (#533) (2f54cc8)
- Migrate from IBM alora to PEFT 0.18.1 native aLoRA (#422) (c6a3e64)
- Flag more tests that require ollama (#420) (b06851f)
- Guarantee proper ordering of decompose subtask dependencies (#407) (f0b1346)
- Astream output (#358) (9cafe05)
- Update ci for merge-queue (#417) (5cf8eee)
- Some examples needed update (#408) (3d5ab56)
- Formatting model_ids for better readability (#386) (318a962)
- Update agents.md to strongly encourage using uv (#388) (8b2e2cf)
- Restrict transformers version to 4.x (#379) (67f8bc0)
- Friendly error messages for optional backend dependencies (#343) (4f7091f)
- Add missing await keywords in async tests (#346) (a7442a6)
- Use repr for helpful debug display in Message/ToolMessage (#339) (f15fadb)
- Add skip to timeout test for python < 3.11 (#333) (2cc3352)
- Don't overwrite user-configured logging levels (#298) (119ea86)
- Bedrock example. (#410) (3204b3a)
- Add decompose to tutorial with example (#366) (ef0a964)
- Create contributing doc (#369) (1cacbf9)
- Add security policy (#363) (afbda1d)
- Add code of conduct (#365) (94b21d9)
- Add discord badge to readme (#362) (168ccca)
v0.3.0 - 2026-01-21
- SOFAI Sampling Strategy (#311) (cbf3913)
- Reorg of codebase (#310) (cbc456b)
- Add typed components; add typing to model output thunks and sampling results (#300) (2eb689d)
- Tool calling code sample in tutorial (#313) (a42a487)
- Adds granite-common[transformers] to Mellea's huggingface dependency group. (#330) (87a8166)
- Rename file from test_* (#332) (6512b32)
- Readd init file for mellea/stdlib (#328) (cb156a7)
- ImageBlocks are CBlocks (#323) (8a4c910)
- Additional tests optimization when running on github actions. (#293) (c5398e4)
- Import times by not exporting RichDocument at module level (#321) (565b27f)
- Add logging for start_session details (#299) (6e68f57)
- Typos in READMEs and documentation (#303) (9f6a086)
- Add explicit exports to `__init__.py` (#317) (6e7b09b)
- Mify protocol issues (#304) (7013b04)
- Linting error (#302) (c6e3b08)
- Add double quotes around brackets used in pip install (#301) (2d017f1)
- Add AGENTS.md to guide AI coding assistants (#320) (a89256a)
- Improve contributor instructions in README. (#314) (2be67c8)
v0.2.4 - 2026-01-08
- Fix gc in instructions and add exception to generate walk (#295) (5fc7df0)
- Marks span tests as qualitative & removes chat error message. (#294) (5ce6360)
v0.2.3 - 2026-01-07
- Allow forcing a release through test failures (#292) (14b55a3)
- Lazy Spans and KV Blocks (#249) (b9e4a33)
- Switch to new RAG intrinsics repo (#289) (94c35ad)
- OpenAI `base_url` default and reasoning effort model option. (#271) (9733df8)
- Unpin granite_commons version from 0.3.5 (#287) (0b402bd)
v0.2.2 - 2025-12-18
- Add langchain / message interop example (#257) (9b1f299)
- Add better error messages for incorrect genslot args (#248) (9d875d6)
- Uv-lock package changes (#261) (cb0623f)
- Lock granite-common version to avoid arg changes (#260) (03716c1)
- Docstrings to have code blocks (#256) (94a7b40)
v0.2.1 - 2025-12-10
- Test-based Evaluation with LLM-as-a-judge (#225) (0f1f0f8)
- Add a `code_interpreter` tool (#232) (b03c964)
- Add simple lock to hf generation to prevent using incorrect weights (#237) (6b2a527)
- Collection of small fixes (#238) (2120112)
- Fix unused litellm import (#246) (633bfd7)
- Minor updates to answer relevance (#245) (bde9b4d)
- Pre-commit file selection (#243) (e70d307)
v0.2.0 - 2025-11-19
- Change backend functions to use async; add generate_from_raw (16b8aea)
- Updates for intrinsics support (#227) (52953a5)
- Add requirements and preconditions to gen slots (#226) (f73d8e2)
- MelleaSession.register for functional interface and MelleaSession.powerup for dynamic mixin (register all methods in a class) (#224) (662cfcc)
- Add secure Python code execution with llm-sandbox support (#217) (9d12458)
- Adds think budget-forcing (#107) (a2e29e6)
- Making generate_from_raw public (#219) (7eae224)
- Conda/Mamba-based installation script (#138) (6aea9dc)
- Adds a vllm backend (#122) (21908e5)
- Add the ability to run examples with pytest (#198) (e30afe6)
- Ollama generate_from_raw uses existing event loop (#204) (36a069f)
- Vllm format issues (abbde23)
- Some minor fixes (#223) (7fa0891)
- Watsonx self._project_id not getting set (#220) (10f6ffa)
- Decomp subtask regex (#218) (5ac34be)
v0.1.3 - 2025-10-22
- Decompose cli tool enhancements & new prompt_modules (#170) (b8fc8e1)
- Add async functions (#169) (689e1a9)
- Add Granite Guardian 3.3 8B with updated examples function call validation and repair with reason. (#167) (517e9c5)
- Majority voting sampling strategy (#142) (36eaca4)
- Fix vllm install script (#185) (abcf622)
- Watsonx and litellm parameter filtering (#187) (793844c)
- Pin trl to version 0.19.1 to avoid deprecation (#202) (9948907)
- Rename format argument in internal methods for better mypiability (#172) (7a6f780)
- Async overhaul; create global event loop; add client cache (#186) (1e236dd)
- Update readme and other places with granite model and tweaks (#184) (519a35a)
v0.1.2 - 2025-10-03
- Default sampling strats to None for query, transform, chat (#179) (c8d4601)
- Docstrings (#177) (6126bd9)
- Always call sample when a strategy is provided (#176) (8fece40)
v0.1.1 - 2025-10-01
v0.1.0 - 2025-10-01
- Add fix to watsonx and note to litellm (#173) (307dbe1)
- New context, new sampling,. (#166) (4ae6d7c)
- Add async and streaming support (#137) (4ee56a9)
- Best-of-N Sampling with Process Reward Models (#118) (b18e03d)
v0.0.6 - 2025-09-18
v0.0.5 - 2025-09-17
- Enable VLMs (#126) (629cd9b)
- LiteLLM backend (#60) (61d7f0e)
- New logo by Ja Young Lee (#120) (c8837c6)
- Adding pillow as dependency (#147) (160c6ef)
- Huggingface backend does not properly pad inputs (#145) (a079c77)
- Return to old logo (#132) (f08d2ec)
- Alora version and image printing in messages (#130) (2b3ff55)
- Remove ModelOption.THINKING from automatic mapping because it's explicitly handled in line #417 (which was causing parameter conflicts) (#124) (b5c2a39)