docs: professional README

aiKunalBisht · aiKunalBisht · commit 9c3a6d876384 · 2026-05-14T10:27:34.000+05:30
diff --git a/README.md b/README.md
@@ -13,193 +13,149 @@ short_description: Speech & Meeting Intelligence — English · Hindi · Japanes
 
 <div align="center">
 
-# 🎙️ TranscriptAI
+# TranscriptAI
 
-### Speech & Meeting Intelligence Platform
+**Meeting intelligence that understands not just what was said — but what was meant.**
 
-<p>
-  <a href="https://huggingface.co/spaces/KunalTheBeast/TranscriptAI">
-    <img src="https://img.shields.io/badge/🚀%20Live%20Demo-HuggingFace-D96080?style=for-the-badge" alt="Live Demo"/>
-  </a>
-  &nbsp;
-  <a href="https://github.com/aiKunalBisht/Transcript-ai">
-    <img src="https://img.shields.io/badge/GitHub-Source%20Code-3C2416?style=for-the-badge&logo=github" alt="GitHub"/>
-  </a>
-</p>
+[![Live Demo](https://img.shields.io/badge/🤗%20Live%20Demo-Hugging%20Face-FF9D00?style=for-the-badge)](https://huggingface.co/spaces/KunalTheBeast/TranscriptAI)
+[![GitHub](https://img.shields.io/badge/GitHub-Source-181717?style=for-the-badge&logo=github)](https://github.com/aiKunalBisht/Transcript-ai)
+![Eval Score](https://img.shields.io/badge/Eval%20Score-93%25-brightgreen?style=for-the-badge)
+![License MIT](https://img.shields.io/badge/License-MIT-blue?style=for-the-badge)
 
-<p>
-  <img src="https://img.shields.io/badge/Python-3.10%2B-C45C74?style=flat-square&logo=python&logoColor=white"/>
-  <img src="https://img.shields.io/badge/Streamlit-UI-D96080?style=flat-square&logo=streamlit&logoColor=white"/>
-  <img src="https://img.shields.io/badge/FastAPI-REST%20API-486858?style=flat-square&logo=fastapi&logoColor=white"/>
-  <img src="https://img.shields.io/badge/Groq-Free%20Tier-B87830?style=flat-square"/>
-  <img src="https://img.shields.io/badge/Eval%20Score-93%25%20EXCELLENT-486858?style=flat-square"/>
-  <img src="https://img.shields.io/badge/License-MIT-A8897C?style=flat-square"/>
-</p>
-
-<br/>
-
-**Turn any meeting or speech into structured intelligence.**
-Summaries · Action Items · Tone Analysis · Communication Risk Signals
-
-*Works in English, Hindi, and Japanese — output always in English.*
+Trilingual · English · Hindi · Japanese
 
 </div>
 
 ---
 
-## What It Does
+## The Problem
 
-Paste or upload any transcript and get structured intelligence in seconds.
+Most meeting tools extract *what* was said. They miss everything underneath.
 
-```
-Input:  "Rahul: Vikram, client report Monday tak ready honi chahiye."
-        "Vikram: Dekhte hain. Thoda mushkil hai."
-
-Output: ✅ Action Item  → Prepare client report | Owner: Vikram | Deadline: Monday
-        🔴 Hindi Signal → "dekhte hain"       — Classic Indian soft no (80% confidence)
-        🔴 Hindi Signal → "thoda mushkil hai" — Indirect refusal (85% confidence)
-        ⚠️  Risk Level  → HIGH — Commitment unlikely to be followed through
-        🟣 Tone         → Hesitant / Uncertain (Intensity: 2/5)
-```
+Every language and culture has indirect communication patterns — polite rejections, soft commitments, face-saving agreements — that a generic summarizer will log as action items that never get done.
+
+TranscriptAI is built to catch exactly those signals.
 
 ---
 
-## The Core Idea
+## Live Demo
+
+**[→ Try it on Hugging Face](https://huggingface.co/spaces/KunalTheBeast/TranscriptAI)**
 
-Most meeting tools extract *what* was said. TranscriptAI extracts *how* people communicate.
+No setup. No API key. Paste any transcript and get structured intelligence in seconds.
 
-Each language has its own indirect communication patterns. A generic summarizer misses all of them:
+**Example — what a generic tool misses:**
 
-| What was said | Generic AI | TranscriptAI |
+| What was said | Generic AI output | TranscriptAI output |
 |---|---|---|
-| *"Dekhte hain"* | "We will see" — neutral | 🔴 Classic Indian soft no — unlikely to happen |
-| *"検討いたします"* | "We will consider it" — action item | ⚠️ 72% rejection confidence — follow up in writing |
-| *"We'll circle back"* | Meeting note | 🌀 Corporate hedging — no concrete next step |
-| *"Haan haan bilkul"* | "Yes absolutely" — agreement | 🟠 Hierarchical yes — agreeing to please, may not follow through |
+| Indirect verbal agreement | ✅ Action item logged | ⚠️ Soft commitment — low follow-through probability |
+| Japanese polite consideration phrase | ✅ Action item logged | 🔴 72% rejection confidence — request written confirmation |
+| Corporate hedge — "we'll circle back" | 📝 Meeting note | 🌀 No concrete next step — escalation recommended |
+| Enthusiastic but hierarchical yes | ✅ Agreement confirmed | 🟠 Agreeing to please, not necessarily to act |
 
 ---
 
-## Language Intelligence Layers
+## Output
 
-Three separate NLP engines, auto-detected from the transcript:
+For every transcript, TranscriptAI produces:
 
-### 🇮🇳 Hindi / Hinglish
-- Indirect no — `dekhte hain`, `thoda mushkil hai`, `koshish karenge`
-- Hierarchical yes — `haan haan bilkul`, `jo aap kahenge`
-- Face-saving exits — `upar se baat karta hoon`
-- Jugaad framing — `kuch na kuch ho jayega`
-- Respect deflection — `aap jo theek samjhe`
-- Detects both **Roman script and Devanagari**
+- **Summary** — concise narrative paragraph plus key bullet points scaled to meeting length
+- **Action items** — extracted with owner, deadline, and commitment strength rating
+- **Communication risk signals** — indirect rejections, hedging language, power imbalance markers
+- **Speaker tone profile** — 6-level colour-coded scale with intensity score per speaker
+- **Meeting health score** — 0 to 100 composite across sentiment, action clarity, risk, and AI confidence
+- **Session trends** — risk drift, hallucination rate, and workload patterns across meetings
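
The meeting health score above can be sketched as a weighted sum. This is a minimal illustration, assuming the weights and score bands listed in the previous README revision (Sentiment 30, Action Clarity 25, Communication Risk 25, AI Confidence 20) still apply. The function name `health_score` and the convention that each signal arrives normalized to 0..1 (with risk already inverted, so 1.0 means low risk) are assumptions made for the example, not the project's actual code.

```python
# Hypothetical sketch: weights and bands come from the earlier README text;
# the 0..1 input convention is an assumption for illustration.
WEIGHTS = {
    "sentiment": 30,
    "action_clarity": 25,
    "communication_risk": 25,  # assumed inverted: 1.0 = low risk
    "ai_confidence": 20,
}
BANDS = [
    (80, "Productive Meeting"),
    (60, "Mostly Aligned"),
    (40, "Needs Follow-up"),
    (0, "High Risk"),
]


def health_score(signals):
    """Combine four signals in [0, 1] into a 0-100 score and a label."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(signals[name], 0.0), 1.0)  # clamp out-of-range inputs
        total += weight * value
    score = round(total)
    label = next(text for floor, text in BANDS if score >= floor)
    return score, label


print(health_score({
    "sentiment": 1.0,
    "action_clarity": 0.5,
    "communication_risk": 0.5,
    "ai_confidence": 1.0,
}))  # (75, 'Mostly Aligned')
```

Keeping the weights in one table makes it easy to see that they sum to 100, so the result needs no further rescaling.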
 
-### 🇬🇧 English
-- Commitment strength meter — "I will" vs "I'll try" vs "we'll see"
-- Escalation signals — "going to have to escalate", "reconsider the contract"
-- Power imbalance — "this is unacceptable", "you need to understand"
-- Corporate hedging — "circle back", "take under advisement", "touch base"
-- Passive aggression — "fine", "whatever works for you"
-- 40+ patterns across 4 categories
+---
 
-### 🇯🇵 Japanese
-- 16 nemawashi soft rejection patterns with confidence scores
-- Keigo formality detection via MeCab morphological analysis
-- Deterministic JA↔EN code-switch counting
-- Cross-script speaker normalization — 田中 and Tanaka are the same speaker
+## Language Engines
 
----
+Three independent NLP modules, auto-detected from transcript content.
 
-## Features
+### English
+Commitment strength grading distinguishes "I will deliver" from "I will try" from "we will see." Detects escalation signals, power imbalance language, passive aggression, and corporate hedging. Over 40 patterns across 4 categories.
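
A rough sketch of how commitment grading could work with ordered regular expressions. The three grades, the pattern table, and the name `grade_commitment` are hypothetical; the real engine reportedly uses 40+ patterns, which are not reproduced here.

```python
import re

# Hypothetical pattern table. Order matters: hedged phrasing is checked
# before the bare "I will", otherwise "I will try" would grade as strong.
COMMITMENT_PATTERNS = [
    ("weak", re.compile(r"\bwe(?:'ll| will) see\b", re.IGNORECASE)),
    ("medium", re.compile(r"\b(?:i|we)(?:'ll| will) try\b", re.IGNORECASE)),
    ("strong", re.compile(r"\b(?:i|we)(?:'ll| will)\b", re.IGNORECASE)),
]


def grade_commitment(utterance):
    """Return the first matching grade, or "none" if nothing matches."""
    for grade, pattern in COMMITMENT_PATTERNS:
        if pattern.search(utterance):
            return grade
    return "none"


for line in ("I will deliver the report by Monday.",
             "I will try to get it done.",
             "We will see what we can do."):
    print(grade_commitment(line), "<-", line)
```

Checking the most hedged patterns first is the design choice that matters: a bare "I will" also matches inside "I will try", so the stricter patterns must win.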
 
-**Summary Tab**
-- Full narrative paragraph — what was discussed, decided, and the outcome
-- 3–8 key bullet points scaled to transcript length
-- Previous session panel — meeting continuity tracking
+### Hindi
+Identifies indirect refusals, hierarchical agreement (saying yes to please rather than commit), face-saving exits, and vague reassurances. Handles both Roman script and Devanagari. Over 30 patterns.
 
-**Meeting Health Score** — 0–100 from 4 signals
+### Japanese
+16 nemawashi soft-rejection patterns with per-pattern confidence scores. Keigo formality detection via MeCab morphological analysis. Cross-script speaker normalization — the same person written in kanji and in romanization resolves to a single speaker identity.
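
Cross-script normalization can be pictured as a lookup that maps both spellings of a surname to one key. The sketch below is illustrative only: the three-entry table stands in for the project's surname list, and `canonical_speaker` and the honorific list are hypothetical names chosen for the example.

```python
# Hypothetical lookup table standing in for the project's surname entries.
SURNAME_ROMAJI = {"田中": "tanaka", "佐藤": "sato", "鈴木": "suzuki"}
# Honorific suffixes to strip before lookup (assumed, not exhaustive).
HONORIFICS = ("さん", "様", "-san", " san")


def canonical_speaker(label):
    """Map a speaker label in kanji or romanization to one lowercase key."""
    name = label.strip()
    lowered = name.lower()
    for suffix in HONORIFICS:
        if lowered.endswith(suffix):
            name = name[: len(name) - len(suffix)]
            break
    name = name.strip()
    # Kanji surnames resolve through the table; anything else is lowercased.
    return SURNAME_ROMAJI.get(name, name.lower())


print(canonical_speaker("田中さん") == canonical_speaker("Tanaka"))  # True
```

With both forms reduced to the same key, per-speaker statistics such as tone or action-item ownership are not split across two identities.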
+
+---
+
+## Architecture
 
 ```
-Sentiment (30) + Action Clarity (25) + Communication Risk (25) + AI Confidence (20)
-```
+transcription/
+  pii_masker.py           Local anonymization — runs before any LLM call
+  speaker_normalizer.py   Cross-script speaker identity resolution
+  audio_processor.py      Whisper transcription pipeline
 
-| Score | Label |
-|---|---|
-| 80–100 | 🟢 Productive Meeting |
-| 60–79 | 🟡 Mostly Aligned |
-| 40–59 | 🟠 Needs Follow-up |
-| 0–39 | 🔴 High Risk |
+analysis/
+  analyzer.py             LLM orchestration — Groq → Ollama → Mock fallback
+  english_analyzer.py     English NLP engine
+  hindi_analyzer.py       Hindi NLP engine
+  soft_rejection.py       Japanese nemawashi detector
+  hallucination_guard.py  Rule-based output verification
+  japanese_tokenizer.py   MeCab morphological analysis
 
-**Speaker Tone Intelligence** — 6-level color-coded scale with intensity bars 1–5
+utils/
+  evaluator.py            ROUGE-L + F1 + semantic similarity scoring
+  cache.py                MD5 result caching — 24h TTL
+  logger.py               JSONL observability and trend analysis
 
-```
-🔴 Aggressive → 🟠 Assertive → 🟡 Neutral → 🟢 Cooperative → 🔵 Deferential → 🟣 Hesitant
+app.py                    Streamlit UI — 7 tabs, health score, trend dashboard
+api.py                    FastAPI REST endpoints
 ```
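
The previous README described the hallucination guard as rule-based token overlap. A minimal sketch of that idea follows, assuming a simple word-level overlap ratio. The function names and the 0.6 threshold are hypothetical, and `\w+` splitting would not segment Japanese text, which would need a tokenizer such as MeCab.

```python
import re


def tokens(text):
    """Lowercased word tokens; adequate for space-delimited languages only."""
    return set(re.findall(r"\w+", text.lower()))


def is_supported(claim, transcript, threshold=0.6):
    """Treat a claim as grounded if enough of its tokens occur in the transcript."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return True  # nothing to verify
    overlap = len(claim_tokens & tokens(transcript)) / len(claim_tokens)
    return overlap >= threshold


transcript = "Rahul: The client report must be ready by Monday. Vikram: We will see."
print(is_supported("Client report ready by Monday", transcript))    # True
print(is_supported("Budget approved for new servers", transcript))  # False
```

Because the check is deterministic, the LLM never has to grade its own output.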
 
-**Production Features**
-- APPI-compliant PII masking — names, phones, emails anonymized before LLM; restored after
-- Hallucination guard — 100% rule-based token overlap, LLM never validates itself
-- Groq → Ollama → Mock fallback with explicit UX feedback per provider
-- Meeting trends dashboard — soft rejection trends, hallucination drift, workload
-- FastAPI REST endpoint for CRM integration
-- MD5 result caching (24h TTL) + JSONL observability logging
+**Processing pipeline — order is strict:**
+
+```
+1. PII Mask       local, before LLM          (privacy compliance)
+2. LLM Analysis   Groq / Ollama / Mock
+3. PII Restore    local, before normalization
+4. Normalize      cross-script speaker deduplication
+5. Tone Classify  per-speaker 6-level scoring
+6. NLP Layer      language-specific signal detection
+7. Cache + Log    MD5 cache write, JSONL append
+```
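
Steps 1 to 3 of the pipeline can be sketched as a mask, call, restore round trip. This is an illustration under stated assumptions: the regular expressions, the placeholder format, and the `analyze` wrapper are hypothetical, and name masking is omitted because it needs a name list or NER model rather than a regex.

```python
import re

# Deliberately simple patterns for illustration; real-world PII needs more care.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def mask_pii(text):
    """Replace emails and phone numbers with numbered placeholders."""
    mapping = {}

    def substitute(kind):
        def _sub(match):
            token = f"[{kind}_{len(mapping) + 1}]"
            mapping[token] = match.group(0)
            return token
        return _sub

    text = EMAIL.sub(substitute("EMAIL"), text)
    text = PHONE.sub(substitute("PHONE"), text)
    return text, mapping


def restore_pii(text, mapping):
    """Put the original values back in place of their placeholders."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def analyze(transcript, llm):
    masked, mapping = mask_pii(transcript)  # step 1: local masking
    raw = llm(masked)                       # step 2: LLM sees placeholders only
    return restore_pii(raw, mapping)        # step 3: local restore


masked, _ = mask_pii("Call Priya at +91 98765 43210 or priya@example.com.")
print(masked)
```

Restoring before normalization matters because the speaker normalizer has to see real names, not placeholders.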
 
 ---
 
-## Evaluation — 93% Overall Score
+## Evaluation
 
-Custom evaluation system with **cultural corrections** — standard NLP metrics have Western bias.
-Japanese professional neutral speech is NOT incorrect. Soft sentiment scoring applied.
+Standard NLP metrics carry Western assumptions. Formal neutral speech in Japanese and indirect communication in South Asian business contexts both score poorly on metrics calibrated for direct English. This project uses a custom evaluation framework whose cultural corrections were refined with each version.
 
-| Version | Score | Key change |
-|---|---|---|
-| v1 | 30% | Baseline — exact matching only |
-| v2 | 55% | Fuzzy names, rule-based code-switch, semantic similarity |
-| v3 | 75% | Cultural ground truth, JA tokenization, soft sentiment |
-| v4 | 83% | Hallucination guard, nemawashi filter, speaker sort |
-| **v5** | **93% EXCELLENT** | Sentiment rules, tone intelligence, optimal bullet matching |
-
-| Metric | Score |
-|---|---|
-| Action Items F1 | 1.0 — EXCELLENT |
-| Sentiment (soft/cultural) | 1.0 — EXCELLENT |
-| Hallucination Risk | LOW |
-| **Overall** | **93% — EXCELLENT** |
+| Version | Score | Primary Change |
+|---------|-------|----------------|
+| v1 | 30% | Baseline — exact string matching |
+| v2 | 55% | Fuzzy matching, semantic similarity |
+| v3 | 75% | Cultural ground truth, Japanese tokenization |
+| v4 | 83% | Hallucination guard, soft rejection filter |
+| v5 | **93%** | Tone intelligence, optimal bullet assignment |
+
+| Metric | Result |
+|--------|--------|
+| Action Item F1 | 1.0 — Excellent |
+| Sentiment (cultural) | 1.0 — Excellent |
+| Hallucination Risk | Low |
+| Overall | **93%** |
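
For reference, the Action Item F1 metric is the harmonic mean of precision and recall over extracted items. The sketch below uses exact matching after whitespace and case normalization, which corresponds to the v1-style baseline rather than the fuzzy matching added later. `action_item_f1` is a hypothetical name, not the evaluator's actual API.

```python
def action_item_f1(predicted, expected):
    """F1 over action items, matched exactly after normalization."""
    def normalize(items):
        # Collapse whitespace and ignore case so trivial differences match.
        return {" ".join(item.lower().split()) for item in items}

    pred, gold = normalize(predicted), normalize(expected)
    if not pred and not gold:
        return 1.0  # nothing expected, nothing predicted
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


print(action_item_f1(["Prepare client report"], ["prepare  client report"]))  # 1.0
```

One spurious extra item with one correct match gives precision 0.5 and recall 1.0, so F1 drops to about 0.67.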
 
 ---
 
-## Architecture
+## Production Features
 
-```
-transcription/
-  pii_masker.py          APPI anonymization — before LLM (v3: handles all bracket variants)
-  speaker_normalizer.py  Cross-script identity resolution
-  audio_processor.py     Whisper transcription
+**Privacy**
+PII anonymization runs locally before any transcript reaches an LLM. Names, phone numbers, and email addresses are masked on input and restored on output, so the LLM provider only ever sees placeholders for those fields.
 
-analysis/
-  analyzer.py            Groq → Ollama → Mock · trilingual detection · tone schema
-  english_analyzer.py    English NLP — 40+ patterns (hedging, power, escalation)
-  hindi_analyzer.py      Hindi NLP — 30+ patterns (Roman + Devanagari)
-  soft_rejection.py      Japanese 16-pattern nemawashi detector
-  hallucination_guard.py 100% rule-based claim verification
-  japanese_tokenizer.py  MeCab morphological analysis
+**Reliability**
+Three-tier LLM fallback — Groq (1–2s, free tier) → Ollama (local, zero cost) → Mock (always available). MD5 result caching with 24-hour TTL means repeat queries return in under one second.
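
A minimal in-memory sketch of MD5-keyed caching with a 24-hour TTL. The `ResultCache` class, its key layout, and the injectable clock are assumptions for illustration; the project's `cache.py` may persist results to disk and key them differently.

```python
import hashlib
import json
import time

TTL_SECONDS = 24 * 60 * 60  # 24-hour time to live


class ResultCache:
    """In-memory cache keyed by an MD5 digest of the request."""

    def __init__(self, ttl=TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}

    @staticmethod
    def key(transcript, language):
        payload = json.dumps({"t": transcript, "l": language}, sort_keys=True)
        return hashlib.md5(payload.encode("utf-8")).hexdigest()

    def put(self, transcript, language, result):
        self._store[self.key(transcript, language)] = (self._clock(), result)

    def get(self, transcript, language):
        entry = self._store.get(self.key(transcript, language))
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at > self._ttl:
            return None  # expired: caller should recompute
        return result


cache = ResultCache()
cache.put("Alex: Friday?", "en", {"risk_level": "HIGH"})
print(cache.get("Alex: Friday?", "en"))
```

MD5 is used here only as a fast content fingerprint, not for security.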
 
-utils/
-  evaluator.py           ROUGE + semantic + F1 + optimal assignment matching
-  logger.py              JSONL logging + trend analysis engine
-  cache.py               MD5 result caching
+**Observability**
+Every analysis is written to a local JSONL log. A built-in trends dashboard tracks soft rejection rates, hallucination drift, and workload distribution across sessions.
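
JSONL logging keeps one JSON object per line, which makes trend queries a simple scan. The sketch below is illustrative: the record fields (`session`, `soft_rejections`) and both function names are hypothetical, and an in-memory stream stands in for the log file.

```python
import io
import json


def append_record(stream, record):
    """Write one analysis record as a single JSON line."""
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")


def soft_rejection_rate(stream):
    """Fraction of logged sessions with at least one soft rejection."""
    total = flagged = 0
    for line in stream:
        if not line.strip():
            continue  # tolerate blank lines
        total += 1
        if json.loads(line).get("soft_rejections", 0) > 0:
            flagged += 1
    return flagged / total if total else 0.0


log = io.StringIO()
for session, count in enumerate([0, 2, 0, 1], start=1):
    append_record(log, {"session": session, "soft_rejections": count})
log.seek(0)
print(soft_rejection_rate(log))  # 0.5
```

Setting `ensure_ascii=False` keeps Hindi and Japanese text readable in the log instead of escaping it.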
 
-app.py                   Streamlit UI — translucent navbar, 7 tabs, health score
-api.py                   FastAPI REST endpoints
-```
-
-**Processing order (sequence is critical):**
-```
-1. PII mask       — local, before LLM         (APPI compliance)
-2. LLM analysis   — Groq / Ollama / Mock
-3. PII restore    — local, before normalization (so normalizer sees real names)
-4. Normalize      — cross-script speaker dedup
-5. Tone classify  — 6-level scale per speaker
-6. NLP layer      — language-specific routing
-7. Cache + log    — local JSONL
-```
+**Integration**
+FastAPI REST endpoint at `/analyze` for direct integration with CRM systems, Slack bots, or downstream pipelines.
 
 ---
 
@@ -209,21 +165,25 @@ api.py                   FastAPI REST endpoints
 git clone https://github.com/aiKunalBisht/Transcript-ai.git
 cd Transcript-ai
 pip install -r requirements.txt
+```
 
-# Recommended — Groq (1-2 second analysis, free tier)
-export GROQ_API_KEY=your_key_here    # free at console.groq.com
+**Cloud — Groq (recommended, free tier)**
+```bash
+export GROQ_API_KEY=your_key_here    # console.groq.com
 python -m streamlit run app.py
+```
 
-# Fully local — zero data leaves your machine
+**Local — fully offline, zero data leaves your machine**
+```bash
 ollama pull qwen3:8b
 python -m streamlit run app.py
 ```
 
-Optional:
+**Optional dependencies**
 ```bash
-pip install fugashi unidic-lite      # MeCab Japanese tokenizer
-pip install scikit-learn             # TF-IDF semantic similarity
-pip install sentence-transformers    # Neural semantic understanding (~500MB)
+pip install fugashi unidic-lite        # MeCab Japanese tokenizer
+pip install scikit-learn               # TF-IDF semantic similarity
+pip install sentence-transformers      # Neural semantic scoring
 ```
 
 ---
@@ -232,50 +192,48 @@ pip install sentence-transformers    # Neural semantic understanding (~500MB)
 
 ```bash
 python api.py
-# Interactive docs: http://localhost:8000/docs
+# Interactive docs at http://localhost:8000/docs
 ```
 
 ```python
 import requests
 
-r = requests.post("http://localhost:8000/analyze", json={
-    "transcript": "Rahul: Friday tak deliver ho sakta hai? Priya: Dekhte hain.",
-    "language": "hi",
+response = requests.post("http://localhost:8000/analyze", json={
+    "transcript": "Alex: Can we get this delivered by Friday?\nJordan: We will see what we can do.",
+    "language": "en",
     "mask_pii": True
 })
-print(r.json()["result"]["soft_rejections"]["risk_level"])   # HIGH
-print(r.json()["result"]["soft_rejections"]["risk_summary"]) # Commitment unlikely...
+
+response.raise_for_status()  # surface HTTP errors before parsing the body
+result = response.json()["result"]
+print(result["soft_rejections"]["risk_level"])    # HIGH
+print(result["soft_rejections"]["risk_summary"])  # Commitment unlikely to be followed through
 ```
 
 ---
 
 ## Known Limitations
 
-| Limitation | Path Forward |
-|---|---|
-| Speaker diarization ~70% accuracy | pyannote.audio |
-| Audio unavailable on HF Spaces | Groq Whisper API (next) |
-| 3 synthetic test cases | External validation on real transcripts |
-| Confidence scores are heuristic | Labeled dataset + calibration |
-| No feedback loop | User correction collection + fine-tuning |
+| Limitation | Planned Improvement |
+|------------|---------------------|
+| Speaker diarization ~70% accuracy | pyannote.audio integration |
+| Audio upload unavailable on HF Spaces | Groq Whisper API — next release |
+| Confidence scores are heuristic | Labeled dataset and calibration |
+| Demo uses synthetic test cases | Real-world transcript validation ongoing |
 
 ---
 
-## Numbers
+## Project Scale
 
-```
-19 Python files  ·  6,000+ lines  ·  90+ functions
-40+ English patterns  ·  30+ Hindi patterns  ·  16 Japanese soft rejection patterns
-500+ Japanese surnames  ·  Eval score: 93% EXCELLENT
-Formats: TXT · VTT · JSON · MP4 · MP3 · WAV · M4A
-```
+- 19 Python files · 6,000+ lines · 90+ functions
+- 86+ linguistic patterns across 3 languages · 500+ Japanese surname entries
+- Supported formats: TXT · VTT · JSON · MP4 · MP3 · WAV · M4A
 
 ---
 
 <div align="center">
 
-Built by **[Kunal Bisht](https://github.com/aiKunalBisht)** · Pithoragarh, Uttarakhand, India
+Built by [Kunal Bisht](https://github.com/aiKunalBisht) — Pithoragarh, India
 
-[LinkedIn](https://linkedin.com/in/kunalhere) &nbsp;·&nbsp; [Hugging Face](https://huggingface.co/KunalTheBeast)
+[Hugging Face](https://huggingface.co/KunalTheBeast) · [LinkedIn](https://linkedin.com/in/kunalhere) · [GitHub](https://github.com/aiKunalBisht)
 
 </div>