fix tldr lines

harpomaxx · harpomaxx · commit 7cad61558d18 · 2026-04-06T02:57:46.000Z
diff --git a/docs/immune/finetuning_evaluation.md b/docs/immune/finetuning_evaluation.md
@@ -1,8 +1,7 @@
 ### Fine-Tuning Evaluation Methodology
 
-**Keywords:** LLM-as-Judge, SFT Evaluation, Win Rate, Comparative Ranking
 
-**TL;DR:** Fine-tuned models are evaluated using the same LLM-as-judge framework used for baseline comparison, extended to include the finetuned model as a fifth competitor. The methodology — blind comparative ranking, three metrics, two breakdown dimensions — is identical across tasks.
+**Summary:** Fine-tuned models are evaluated using the same LLM-as-judge framework used for baseline comparison, extended to include the finetuned model as a fifth competitor. The methodology — blind comparative ranking, three metrics, two breakdown dimensions — is identical across tasks.
 
 ---
 
diff --git a/docs/immune/finetuning_frameworks_rpi_5.md b/docs/immune/finetuning_frameworks_rpi_5.md
@@ -1,8 +1,7 @@
 ### Fine-Tuning Frameworks for GGUF Deployment on Raspberry Pi 5
 
-**Keywords:** Local Inference, GGUF Format, Raspberry Pi 5
 
-**TL;DR:** Among the evaluated frameworks, Unsloth stands out as the best fit due to its integrated GGUF export capabilities, minimal workflow complexity, and hardware-optimized quantization support, aligning perfectly with the IMMUNE project's goals and the Raspberry Pi 5’s limitations.
+**Summary:** Among the evaluated frameworks, Unsloth stands out as the best fit due to its integrated GGUF export capabilities, minimal workflow complexity, and hardware-optimized quantization support, aligning perfectly with the IMMUNE project's goals and the Raspberry Pi 5’s limitations.
 
 
 ### Index
diff --git a/docs/immune/finetuning_procedure.md b/docs/immune/finetuning_procedure.md
@@ -1,8 +1,7 @@
 ### Fine-Tuning Approach for Slips Immune
 
-**Keywords:** SFT, LoRA, Unsloth, GGUF, Raspberry Pi 5, Qwen2.5
 
-**TL;DR:** Task-specific fine-tuning of compact models (1.5B parameters) using LoRA + Unsloth, exported to GGUF for CPU inference on the Raspberry Pi 5. The same training pipeline applies across tasks; only the dataset and system prompt are task-specific.
+**Summary:** Task-specific fine-tuning of compact models (1.5B parameters) using LoRA + Unsloth, exported to GGUF for CPU inference on the Raspberry Pi 5. The same training pipeline applies across tasks; only the dataset and system prompt are task-specific.
 
 ---
 
diff --git a/docs/immune/finetuning_quantization.md b/docs/immune/finetuning_quantization.md
@@ -1,8 +1,7 @@
 ### Quantization and Deployment for Finetuned Models
 
-**Keywords:** GGUF, Quantization, Ollama, imatrix, Raspberry Pi 5, Deployment
 
-**TL;DR:** Finetuned models are converted to GGUF and published to Ollama in three quantization variants (q4_k_m, q5_k_m, q8_0). Quality degrades gracefully: ~19% loss at q8_0, ~25% at q5_k_m, ~33% at q4_k_m. q5_k_m offers the best quality/size trade-off for CPU/RPi deployment; 16-bit is recommended when a GPU is available.
+**Summary:** Finetuned models are converted to GGUF and published to Ollama in three quantization variants (q4_k_m, q5_k_m, q8_0). Quality degrades gracefully: ~19% loss at q8_0, ~25% at q5_k_m, ~33% at q4_k_m. q5_k_m offers the best quality/size trade-off for CPU/RPi deployment; 16-bit is recommended when a GPU is available.
 
 > **Evaluation basis:** performance numbers in this document were measured on the [finetuned summarization model](finetuning_results.md) (47 held-out incidents, judge: gpt-oss-120b). The conversion and publication methodology applies to any finetuned model in this pipeline.
 
diff --git a/docs/immune/finetuning_results.md b/docs/immune/finetuning_results.md
@@ -1,8 +1,7 @@
 ### Summarization Fine-Tuned Model: Evaluation Results
 
-**Keywords:** Qwen2.5-1.5B, Incident Summarization, SFT, LLM-as-Judge, Win Rate
 
-**TL;DR:** The Qwen2.5-1.5B model fine-tuned for Slips incident summarization ranks 1st overall with a 7.73 avg score and 74.5% win rate — well above GPT-4o-mini — across simple and medium incidents. The primary weakness is a hard failure on very large incidents (>4000 events) caused by input truncation.
+**Summary:** The Qwen2.5-1.5B model fine-tuned for Slips incident summarization ranks 1st overall with a 7.73 avg score and 74.5% win rate — well above GPT-4o-mini — across simple and medium incidents. The primary weakness is a hard failure on very large incidents (>4000 events) caused by input truncation.
 
 **Model:** [stratosphere/qwen2.5-1.5b-slips-immune](https://huggingface.co/stratosphere/qwen2.5-1.5b-slips-immune)  
 **Judge:** gpt-oss-120b | **Incidents evaluated:** 47 (44 scored, 3 missing) | **Date:** 2026-04-05
diff --git a/docs/immune/finetuning_summarization_procedure.md b/docs/immune/finetuning_summarization_procedure.md
@@ -1,8 +1,7 @@
 ### Summarization Fine-Tuning: Dataset and Training Procedure
 
-**Keywords:** Incident Summarization, SFT, LoRA, Dataset Filtering, Qwen2.5-1.5B
 
-**TL;DR:** The summarization model is trained on a quality-filtered subset of the Slips summarization dataset, using the highest-scoring model response per incident as the training target. The same general LoRA+Unsloth pipeline applies; this document covers the summarization-specific dataset preparation and system prompt.
+**Summary:** The summarization model is trained on a quality-filtered subset of the Slips summarization dataset, using the highest-scoring model response per incident as the training target. The same general LoRA+Unsloth pipeline applies; this document covers the summarization-specific dataset preparation and system prompt.
 
 ---