beehive-lab · mikepapadim · Dec 11, 2025
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,32 @@
+# Changelog
+
+All notable changes to GPULlama3.java will be documented in this file.
+
+## [0.3.0] - 2025-12-11
+
+### Model Support
+
+- [refactor] Generalize the design of the `tornadovm` package to support multiple new models and types for GPU execution ([#62](https://github.com/beehive-lab/GPULlama3.java/pull/62))
+- Refactor/cleanup model loaders ([#58](https://github.com/beehive-lab/GPULlama3.java/pull/58))
+- Add support for Q8_0 models ([#59](https://github.com/beehive-lab/GPULlama3.java/pull/59))
+
+### Bug Fixes
+
+- [fix] Normalization compute step for non-NVIDIA hardware ([#84](https://github.com/beehive-lab/GPULlama3.java/pull/84))
+
+### Other Changes
+
+- Update README to enhance TornadoVM performance section and clarify GP… ([#85](https://github.com/beehive-lab/GPULlama3.java/pull/85))
+- Simplify installation by replacing TornadoVM submodule with pre-built SDK ([#82](https://github.com/beehive-lab/GPULlama3.java/pull/82))
+- [FP16] Improve performance by fusing dequantization with compute in kernels: 20-30% inference speedup ([#78](https://github.com/beehive-lab/GPULlama3.java/pull/78))
+- [cicd] Prevent workflows from running on forks ([#83](https://github.com/beehive-lab/GPULlama3.java/pull/83))
+- [CI][packaging] Automate the process of deploying a new release with GitHub Actions ([#81](https://github.com/beehive-lab/GPULlama3.java/pull/81))
+- [Opt] Manipulation of Q8_0 tensors with Tornado `ByteArray`s ([#79](https://github.com/beehive-lab/GPULlama3.java/pull/79))
+- Optimization in Q8_0 loading ([#74](https://github.com/beehive-lab/GPULlama3.java/pull/74))
+- [opt] GGUF Load Optimization for tensors in TornadoVM layout ([#71](https://github.com/beehive-lab/GPULlama3.java/pull/71))
+- Add `SchedulerType` support to all TornadoVM layer planners and layer… ([#66](https://github.com/beehive-lab/GPULlama3.java/pull/66))
+- Weight Abstractions ([#65](https://github.com/beehive-lab/GPULlama3.java/pull/65))
+- Bug fixes in sizes and names of GridScheduler ([#64](https://github.com/beehive-lab/GPULlama3.java/pull/64))
+- Add Maven wrapper support ([#56](https://github.com/beehive-lab/GPULlama3.java/pull/56))
+- Add changes used in Devoxx Demo ([#54](https://github.com/beehive-lab/GPULlama3.java/pull/54))
+
diff --git a/CITATION.cff b/CITATION.cff
@@ -15,6 +15,6 @@ authors:
   given-names: "Christos"
 title: "GPULlama3.java"
 license: MIT License
-version: 0.1.0-beta
-date-released: "2025-05-30"
+version: 0.3.0
+date-released: 2025-12-11
 url: "https://github.com/beehive-lab/GPULlama3.java"
diff --git a/README.md b/README.md
@@ -165,7 +165,7 @@ You can add **GPULlama3.java** directly to your Maven project by including the f
 <dependency>
     <groupId>io.github.beehive-lab</groupId>
     <artifactId>gpu-llama3</artifactId>
-    <version>0.2.2</version>
+    <version>0.3.0</version>
 </dependency>
 ```
 

diff --git a/pom.xml b/pom.xml
@@ -7,7 +7,7 @@
         <!-- Use your verified namespace -->
         <groupId>io.github.beehive-lab</groupId>
         <artifactId>gpu-llama3</artifactId>
-        <version>0.2.2</version> <!-- release version (no -SNAPSHOT) -->
+        <version>0.3.0</version> <!-- release version (no -SNAPSHOT) -->
 
         <name>GPU Llama3</name>
         <description>GPU-accelerated LLaMA3 inference using TornadoVM</description>