|
1 | | -# DeepRaaga 🎵 |
| 1 | +<div align="center"> |
| 2 | + |
| 3 | +# 🎵 DeepRaaga |
| 4 | + |
2 | 5 | **An AI Framework for Learning and Generating Carnatic Ragas** |
3 | 6 |
|
4 | | -DeepRaaga is an open-source framework dedicated to modeling the intricate structural beauty of Carnatic music using Artificial Intelligence. By harmonizing traditional heritage with modern machine learning paradigms, we strive to build a computational bridge to India's rich musical legacy. |
| 7 | + |
5 | 8 |
|
6 | | -## 🌟 Vision: A National Knowledge Repository |
| 9 | +[](https://www.python.org/downloads/) |
| 10 | +[](https://reactjs.org/) |
| 11 | +[](https://vitejs.dev/) |
| 12 | +[](https://www.tensorflow.org/) |
| 13 | +[](https://github.com/sgmoorthy/DeepRaaga/actions) |
| 14 | +[](https://opensource.org/licenses/MIT) |
7 | 15 |
|
8 | | -Inspired by the visionary dialogue between PM Narendra Modi and veteran composer Ramesh Vinayakam on creating a "National Knowledge Repository" for Indian music, DeepRaaga stands as a foundational step toward that goal. |
| 16 | +*DeepRaaga is an open-source platform dedicated to modeling the intricate structural beauty of Carnatic music using Artificial Intelligence. By harmonizing traditional heritage with modern machine learning paradigms, we strive to build a computational bridge to India's rich musical legacy.* |
9 | 17 |
|
10 | | -Carnatic music cannot be reduced to simple discrete notes; it is defined by the continuous, microtonal inflections (**Gamakas**), characteristic melodic pathways (**Sancharas**), and the strict grammatical constraints of ascending (**Arohana**) and descending (**Avarohana**) scales. Our mission is to encode this profound acoustic heritage into robust AI models, moving beyond Western-centric MIR (Music Information Retrieval) to create an open platform that respects, preserves, and innovates upon the grammar of Indian Classical Music. |
| 18 | +</div> |
11 | 19 |
|
12 | | -## 1. Research Motivation |
| 20 | +--- |
13 | 21 |
|
14 | | -Carnatic music encodes melodic knowledge through ragas, which specify permitted swaras, characteristic phrases, and ornamentation patterns rather than fixed scores. Prior work in Music Information Retrieval (MIR) has focused mainly on Western genres, with comparatively fewer large-scale, open implementations for raga-centric modeling. [web:18][web:20] |
| 22 | +## 🌟 Vision: A National Knowledge Repository |
15 | 23 |
|
16 | | -DeepRaaga aims to provide: |
17 | | -- A reproducible pipeline from raw Carnatic MIDI/MusicXML to model-ready sequences. |
18 | | -- Baseline sequence models for raga classification and raga-conditioned generation. |
19 | | -- An extensible codebase that can support future work on swara-level modeling, tonic invariance, and improvisation analysis. [web:7][web:19] |
| 24 | +Inspired by the visionary dialogue between PM Narendra Modi and veteran composer Ramesh Vinayakam on creating a "National Knowledge Repository" for Indian music, DeepRaaga stands as a foundational step toward that goal. |
20 | 25 |
|
21 | | -## 2. System Overview |
| 26 | +Carnatic music cannot be reduced to simple discrete notes; it is defined by the continuous, microtonal inflections (**Gamakas**), characteristic melodic pathways (**Sancharas**), and the strict grammatical constraints of ascending (**Arohana**) and descending (**Avarohana**) scales. Our mission is to encode this profound acoustic heritage into robust AI models, moving beyond Western-centric MIR (Music Information Retrieval) to create an open platform that respects, preserves, and innovates upon the grammar of Indian Classical Music. |
22 | 27 |
|
23 | | -The project has two main subsystems: |
| 28 | +--- |
24 | 29 |
|
25 | | -- **Backend / Research pipeline (Python, TensorFlow, Magenta)** |
26 | | - - Data ingestion and conversion of Carnatic compositions (MIDI, MusicXML) into NoteSequence and TFRecord formats. |
27 | | - - Training of sequence models (RNN/LSTM or Transformer-style) on raga-labeled phrases. |
28 | | - - Scripts for evaluation (classification accuracy, phrase-level metrics) and sequence generation. [web:62][web:65] |
| 30 | +## 🏗️ System Architecture |
29 | 31 |
|
30 | | -- **Frontend / Interaction layer (React + Vite)** |
31 | | - - `index.html` bootstraps a React SPA via `src/main.jsx`, mounting `App.jsx` at the `#root` div. [web:69][web:73] |
32 | | - - UI components under `src/components/` manage raga selection, model invocation (backend API hook), and audio playback of generated MIDI. [web:66] |
| 32 | +The project is decoupled into two primary subsystems to ensure scalability and rapid inference: |
33 | 33 |
|
34 | | - |
| 34 | +### 1. Neural Backend (Python) |
| 35 | +- **Environment Requirement:** Python 3.10+ |
| 36 | +- **Data Ingestion:** Conversion of Carnatic compositions (MIDI, MusicXML) into `NoteSequence` and `TFRecord` formats that store pitch-bend and timing data. |
| 37 | +- **Deep Learning Core:** Training of sequence models (RNN/LSTM or Transformer-style) heavily conditioned on specific Raga constraints. |
| 38 | +- **Microservices:** A FastAPI/Flask REST layer to expose the `.generate()` methods via endpoints. |
35 | 39 |
|
36 | | -Repository layout (simplified): |
| 40 | +### 2. Interaction Layer (React + Vite) |
| 41 | +- **Technology Stack:** React 18, Vite 5, Material UI 3 |
| 42 | +- **Usage:** A modern, glassmorphic Single Page Application (SPA) offering visual generation tools, Raga playback via WebMIDI/Tone.js, and an integrated technical blog. |
37 | 43 |
|
38 | | -``` |
39 | | -DeepRaaga/ |
40 | | -├── base/ # Base classes, shared utilities |
41 | | -├── data/ # raw/ and processed/ Carnatic music data |
42 | | -├── docs/ # Concept notes and documentation |
43 | | -├── images/ # Diagrams, figures |
44 | | -├── model/ # Model definitions and training scripts |
45 | | -├── results/ # Generated outputs and logs |
46 | | -├── src/ # React front-end (App.jsx, main.jsx, components/) |
47 | | -├── test/ # Test scripts |
48 | | -├── index.html # Front-end entry (Vite/React) |
49 | | -├── requirements.txt |
50 | | -└── package.json |
51 | | -``` |
| 44 | +--- |
52 | 45 |
|
53 | | -## 3. Installation |
| 46 | +## 🚀 Installation & Local Development |
54 | 47 |
|
55 | | -### 3.1. Backend (Python) |
| 48 | +### 🐍 Backend setup (Python 3.10+) |
56 | 49 |
|
57 | | -``` |
| 50 | +```bash |
58 | 51 | git clone https://github.com/sgmoorthy/DeepRaaga.git |
59 | 52 | cd DeepRaaga |
60 | 53 |
|
| 54 | +# Initialize virtual environment |
61 | 55 | python -m venv .venv |
62 | | -source .venv/bin/activate # Windows: .\.venv\Scripts\activate |
63 | | -
|
64 | | -pip install -r requirements.txt |
65 | | -``` |
66 | 56 |
|
67 | | -`requirements.txt` typically includes TensorFlow, Magenta, librosa, pretty-midi, midi2audio, and related audio/MIR tooling. [web:59][web:68] |
| 57 | +# Activate environment |
| 58 | +source .venv/bin/activate # Unix/macOS |
| 59 | +# Windows: .\.venv\Scripts\activate |
68 | 60 |
|
69 | | -### 3.2. Frontend (React) |
70 | | - |
71 | | -``` |
72 | | -npm install # or: pnpm install / yarn install |
73 | | -npm run dev # local development |
74 | | -npm run build # production build |
75 | | -``` |
76 | | - |
77 | | -`index.html` loads `src/main.jsx` as a module entry point and mounts the React app at `#root`. [web:69][web:73] |
78 | | - |
79 | | -``` |
80 | | -<!DOCTYPE html> |
81 | | -<html lang="en"> |
82 | | - <head> |
83 | | - <meta charset="UTF-8" /> |
84 | | - <meta name="viewport" content="width=device-width, initial-scale=1.0" /> |
85 | | - <title>DeepRaaga - AI Carnatic Music Generator</title> |
86 | | - <link |
87 | | - rel="stylesheet" |
88 | | - href="https://fonts.googleapis.com/css2?family=Roboto:wght@300;400;500;600;700&display=swap" |
89 | | - /> |
90 | | - </head> |
91 | | - <body> |
92 | | - <div id="root"></div> |
93 | | - <script type="module" src="/src/main.jsx"></script> |
94 | | - </body> |
95 | | -</html> |
| 61 | +# Install ML Dependencies |
| 62 | +pip install -r requirements.txt |
96 | 63 | ``` |
| 64 | +*(Dependencies include `TensorFlow`, `Magenta`, `librosa`, `pretty-midi`, and `FastAPI`)* |
97 | 65 |
|
98 | | -## 4. Data Pipeline |
99 | | - |
100 | | -### 4.1. Source Data |
101 | | - |
102 | | -- Raga-labeled Carnatic compositions as MIDI or MusicXML files, organized by raga under `data/raw/`. |
103 | | -- A practical starting point is to cover a subset of Melakarta ragas (parent ragas) and later extend to Janya ragas. [web:16][web:18] |
| 66 | +### ⚛️ Frontend Setup (React) |
104 | 67 |
|
105 | | -### 4.2. Preprocessing |
| 68 | +```bash |
| 69 | +# Install exact dependencies |
| 70 | +npm install |
106 | 71 |
|
107 | | -Run: |
| 72 | +# Spin up high-speed local dev server |
| 73 | +npm run dev |
108 | 74 |
|
109 | | -``` |
110 | | -python data/convert_data.py |
| 75 | +# Compile for production |
| 76 | +npm run build |
111 | 77 | ``` |
112 | 78 |
|
113 | | -This stage typically performs: |
114 | | -- Conversion of MIDI/MusicXML into Magenta `NoteSequence` protos. [web:62][web:65] |
115 | | -- Quantization to a fixed temporal grid while preserving raga-relevant pitch information. |
116 | | -- Creation of TFRecord datasets with (sequence, raga_id) pairs. |
117 | | -- Optional data augmentation (transposition within raga-compatible ranges, time-stretch within musically valid bounds). [web:25] |
| 79 | +--- |
118 | 80 |
|
119 | | -## 5. Models |
| 81 | +## ⚙️ CI/CD & Automated Deployment |
120 | 82 |
|
121 | | -### 5.1. Baseline Sequence Model |
| 83 | +DeepRaaga utilizes robust **GitHub Actions** pipelines to automate testing and deployments natively. |
122 | 84 |
|
123 | | -The initial baseline can use a recurrent neural network for modeling sequences of notes or pitch classes: [web:22][web:27] |
| 85 | +**The Workflow (`.github/workflows/deploy.yml`):** |
| 86 | +1. **Trigger:** Activated automatically upon merging code into the `master` branch. |
| 87 | +2. **Environment:** Spins up an `ubuntu-latest` container running Node.js. |
| 88 | +3. **Build:** Installs dependencies with `npm ci` and compiles the React application with `npx vite build --base=/DeepRaaga/`. |
| 89 | +4. **Deploy:** Artifacts are packaged and pushed directly to the `gh-pages` branch, updating the live production URL. |
124 | 90 |
|
125 | | -- **Input**: Tokenized swara/pitch sequence plus raga conditioning. |
126 | | -- **Architecture**: Embedding → 2–3 LSTM layers → dense output over note vocabulary. |
127 | | -- **Loss**: Cross-entropy over next-token prediction; optionally auxiliary raga classification loss. |
128 | | -- **Training script** (example): |
| 91 | +*(Note: Contributors do not need to push a build manually. Open a PR against `master`; once it is merged, the CI/CD pipeline builds and deploys the site.)* |
129 | 92 |
|
130 | | -``` |
131 | | -python model/basic_model.py |
132 | | -``` |
133 | | - |
134 | | -### 5.2. Raga Classification |
135 | | - |
136 | | -For raga recognition from phrases, DeepRaaga can employ: [web:7][web:11][web:44] |
| 93 | +--- |
137 | 94 |
|
138 | | -- A CNN or CNN+LSTM over pitch-class distributions and contour features. |
139 | | -- Evaluation metrics: accuracy, macro-F1 over ragas, confusion matrices to inspect confusions between musically close ragas. |
| 95 | +## 📖 Documentation & The Keyword Blog |
140 | 96 |
|
141 | | -### 5.3. Raga-conditioned Generation |
| 97 | +DeepRaaga ships with a fully integrated, beautifully stylized publishing platform structurally inspired by Google's *"The Keyword"* blog. |
142 | 98 |
|
143 | | -Generation is performed by autoregressively sampling from the trained sequence model while conditioning on: [web:14][web:21] |
| 99 | +The frontend incorporates a dynamically mapped Client-Side Router that leverages raw `.md` (Markdown) data without requiring a dedicated backend database. |
144 | 100 |
|
145 | | -- Selected raga ID (conditioning vector or embedding). |
146 | | -- Optional constraints such as allowed pitch sets and typical phrase lengths. |
| 101 | +### Writing a New Blog Post |
| 102 | +1. **Create your Markdown file:** Add your technical writing to `docs/blog/your-new-post.md`. |
| 103 | +2. **Handle Media Natively:** Use standard Markdown image syntax (`![alt text](path/to/image.png)`). For YouTube videos, standard Markdown links (`[Video](https://www.youtube.com/watch?v=VIDEO_ID)`) are intercepted by our custom React Markdown engine and converted into secure, auto-responsive `<iframe>` embeds. |
| 104 | +3. **Register your Post:** Open `src/components/BlogViewer.jsx` and add your metadata to the `blogs` manifest array: |
| 105 | +```javascript |
| 106 | +import newPost from '../../docs/blog/your-new-post.md?raw'; |
147 | 107 |
|
148 | | -An example generation script interface: |
149 | | - |
150 | | -``` |
151 | | -python generate.py --raga="Bhairavi" --duration=300 |
| 108 | +{ |
| 109 | + slug: 'your-new-post', |
| 110 | + title: 'Your Premium Title Here', |
| 111 | + category: 'Engineering', |
| 112 | + date: 'April 15, 2026', |
| 113 | + image: '/DeepRaaga/blog-images/feature.png', |
| 114 | + description: 'A brief description that appears on the feed card.', |
| 115 | + content: newPost |
| 116 | +} |
152 | 117 | ``` |
| 118 | +Vite will automatically hot-reload your new post into the responsive grid feed! |
153 | 119 |
|
154 | | -to generate approximately 5 minutes of raga-specific melodic material. |
155 | | - |
156 | | -## 6. Frontend Usage |
| 120 | +--- |
157 | 121 |
|
158 | | -Once the backend model server is running (for example, via a REST API wrapping the Python model), the React app provides: [web:66][web:73] |
159 | | - |
160 | | -- A dropdown or list of supported ragas. |
161 | | -- Controls for generation parameters (duration, temperature, starting phrase). |
162 | | -- Playback of generated MIDI via WebAudio or a WebMIDI-compatible synth. |
163 | | - |
164 | | -Typical dev workflow: |
| 122 | +## 🧠 ML Data Pipeline & Models |
165 | 123 |
|
| 124 | +### Preprocessing |
| 125 | +Data must be grouped by raga under `data/raw/` in MIDI format. To quantize and prepare the data for TensorFlow: |
| 126 | +```bash |
| 127 | +python data/convert_data.py |
166 | 128 | ``` |
167 | | -# In one terminal: start backend model server (example) |
168 | | -python model/serve_model.py # e.g., FastAPI/Flask app |
169 | 129 |
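The directory convention described for preprocessing can be turned into labelled examples with a short indexing pass. The sketch below is an assumption about how such a step might look, not the contents of `data/convert_data.py`; `index_by_raga`, `MIDI_SUFFIXES`, and the integer ID scheme are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Assumed file extensions for MIDI input; adjust to match the actual dataset.
MIDI_SUFFIXES = {".mid", ".midi"}


def index_by_raga(raw_dir):
    """Return (raga_ids, examples) for a layout like <raw_dir>/<raga_name>/<file>.mid.

    raga_ids maps each sub-directory name to a stable integer ID (alphabetical order);
    examples is a list of (path, raga_id) pairs for every MIDI file found.
    Directories without MIDI files still receive an ID.
    """
    raw_dir = Path(raw_dir)
    ragas = sorted(p.name for p in raw_dir.iterdir() if p.is_dir())
    raga_ids = {name: idx for idx, name in enumerate(ragas)}

    examples = []
    for name in ragas:
        for path in sorted((raw_dir / name).iterdir()):
            if path.is_file() and path.suffix.lower() in MIDI_SUFFIXES:
                examples.append((path, raga_ids[name]))
    return raga_ids, examples
```

Sorting both the directory names and the files keeps the ID assignment reproducible across runs, which matters when the same labels are later used for training and evaluation.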
|
170 | | -# In another terminal: start React dev server |
171 | | -npm run dev |
| 130 | +### Raga-Conditioned Autoregressive Generation |
| 131 | +Generation is performed by autoregressively sampling from the trained sequence model while conditioning on a specific Raga Latent Vector: |
| 132 | +```bash |
| 133 | +python generate.py --raga="Bhairavi" --duration=300 --temperature=0.8 |
172 | 134 | ``` |
173 | 135 |
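The sampling loop behind a command like `generate.py --temperature=0.8` can be sketched in plain Python. This is a minimal illustration, not the project's `generate.py`: `toy_logits`, `sample_next`, `generate_phrase`, and `EXAMPLE_PITCH_SET` are hypothetical names, the toy scoring function stands in for a trained model, and the pitch set is an arbitrary example rather than the definition of any raga.

```python
import math
import random

# Hypothetical pitch classes (semitones above the tonic), for illustration only.
EXAMPLE_PITCH_SET = {0, 2, 3, 5, 7, 8, 10}


def toy_logits(history):
    """Stand-in for a trained model: favour pitch classes near the last note."""
    last = history[-1] if history else 0
    return [-abs(pc - last) / 2.0 for pc in range(12)]


def sample_next(logits, allowed, temperature, rng):
    """Temperature-scaled softmax sampling restricted to the allowed pitch classes."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    candidates = sorted(allowed)
    scaled = [logits[pc] / temperature for pc in candidates]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate_phrase(num_notes, allowed=EXAMPLE_PITCH_SET, temperature=0.8, seed=0):
    """Autoregressive loop: each new note is conditioned on the notes so far."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_notes):
        logits = toy_logits(history)
        history.append(sample_next(logits, allowed, temperature, rng))
    return history
```

Calling `generate_phrase(16)` returns sixteen pitch classes drawn only from the allowed set. Lower temperatures concentrate probability on the highest-scoring notes, while higher values flatten the distribution and produce more varied phrases.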
|
174 | | -## 7. Experimental Protocol |
175 | | - |
176 | | -To support research-grade reporting: |
177 | | - |
178 | | -1. **Train/validation/test split** |
179 | | - - Split compositions per raga so that test ragas contain unseen phrases. |
180 | | - - Ensure no phrase-level leakage between splits. [web:18][web:19] |
| 136 | +--- |
181 | 137 |
|
182 | | -2. **Metrics** |
183 | | - - Raga classification: accuracy, macro-F1, confusion matrix. |
184 | | - - Generation: human evaluation from Carnatic musicians (raga adherence, musicality), objective pitch-set compliance. [web:7][web:21] |
| 138 | +## 🗺️ Project Roadmap |
185 | 139 |
|
186 | | -3. **Baselines** |
187 | | - - N-gram or Markov-based pitch sequence models. |
188 | | - - Unconditioned LSTM trained across all ragas. [web:18][web:29] |
| 140 | +Building the ultimate AI framework for Carnatic music is an ongoing journey: |
189 | 141 |
|
190 | | -## 8. Research Paper Alignment |
| 142 | +- [ ] **Phase 1: Melakarta Integration:** Map all 72 parent scales explicitly. |
| 143 | +- [ ] **Phase 2: Rhythmic Tala-Awareness:** Introduce chronological constraints so models respect the 8-beat structure of cycles like *Adi Tala*. |
| 144 | +- [ ] **Phase 3: Transformer Infrastructure:** Upgrade baseline `RagaLSTM` nodes to causal Transformers (e.g., MusicLM variants) for improved *Alapana* continuity. |
| 145 | +- [ ] **Phase 4: Open API Sandbox:** Offer a real-time web REST API for digital musicians. |
191 | 146 |
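One simple way to express the 8-beat constraint mentioned in Phase 2 is a check that a phrase's total duration fills whole cycles. This is a sketch under stated assumptions: `fits_tala` and `ADI_TALA_BEATS` are hypothetical names, durations are given in beats, and real tala modelling would also need to account for the cycle's internal subdivisions.

```python
ADI_TALA_BEATS = 8  # beats per cycle, as stated in the roadmap item


def fits_tala(durations_in_beats, beats_per_cycle=ADI_TALA_BEATS, tolerance=1e-9):
    """Return True if the note durations add up to a whole number of cycles."""
    total = sum(durations_in_beats)
    remainder = total % beats_per_cycle
    # Accept totals that land just below or just above a cycle boundary.
    return min(remainder, beats_per_cycle - remainder) <= tolerance
```

A generator could use a check like this to reject or pad candidate phrases before they are returned, for example accepting `[1, 1, 2, 4]` (one full cycle) and rejecting `[1, 1, 2]`.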
|
192 | | -The accompanying research paper built on DeepRaaga typically includes: [web:7][web:11][web:44] |
| 147 | +--- |
193 | 148 |
|
194 | | -- **Problem definition**: learning and generating Carnatic ragas. |
195 | | -- **Methodology**: detailed data pipeline, model architectures, and training setup. |
196 | | -- **Results**: quantitative classification performance and qualitative generation study. |
197 | | -- **Ablations**: impact of raga conditioning, tonic normalization, and phrase segmentation. |
| 149 | +## 🤝 Contributing |
198 | 150 |
|
199 | | -This repository is intended to be directly citable as the implementation artifact for that paper. |
| 151 | +We rely on **Musicologists**, **ML Engineers**, and **React Developers** to build this repository. See our detailed breakdown of roles in [How to Contribute to DeepRaaga](https://github.com/sgmoorthy/DeepRaaga/tree/master/docs/blog/06-how-to-contribute.md). |
200 | 152 |
|
201 | | -## 9. Enhanced Technical Roadmap |
| 153 | +1. **Fork** the repo! |
| 154 | +2. Create your feature branch: `git checkout -b feature/carnatic-magic` |
| 155 | +3. Push to the branch: `git push origin feature/carnatic-magic` |
| 156 | +4. Submit a **Pull Request**. |
202 | 157 |
|
203 | | -Building the ultimate AI framework for Carnatic music is an ongoing journey. Here are our high-priority technical goals: |
| 158 | +--- |
204 | 159 |
|
205 | | -### Phase 1: Robust Data & Parsing |
206 | | -- [ ] **Melakarta Mapping System:** Explicitly map all 72 Melakarta (parent) ragas and build a relational database linking Janya (derivative) ragas to their parents. |
207 | | -- [ ] **Advanced Gamaka Encoding:** Enhance NoteSequence parsers to extract continuous pitch bends from `.wav` files using SPICE/CREPE and map them to symbolic swara tokens. |
208 | | - |
209 | | -### Phase 2: Advancing the Architecture |
210 | | -- [ ] **Transformer-based Sanchara Modeling:** Shift from basic LSTMs to causal Transformers to capture longer context in complex Alapanas. |
211 | | -- [ ] **Tala-Awareness & Rhythmic Segmentation:** Introduce explicit tokenization for **Tala** (rhythmic cycles) to ensure generated phrases adhere to constraints like *Adi Tala* (8 beats). |
212 | | -- [ ] **Tonic Invariance Modeling:** Train models that are completely independent of the singer's *Shruti* (root pitch), using relative interval embeddings. |
213 | | - |
214 | | -### Phase 3: The National Knowledge Repository |
215 | | -- [ ] **Crowdsourced Annotation UI:** A web-based visualizer for musicians to easily tag, correct, and curate the generated phrases and raw data. |
216 | | -- [ ] **Public API / Edge Models:** Lightweight ONNX models that can run directly in the browser via WebAssembly for real-time phrase accompaniment. |
217 | | - |
218 | | -## 10. How to Reuse in Your Research |
219 | | - |
220 | | -- Fork this repo and add your own Carnatic or Hindustani dataset. |
221 | | -- Swap in alternative architectures (Transformers, conformers, diffusion-based symbolic generators). [web:22][web:29] |
222 | | -- Use the provided pipeline as a template for MIR studies on raga recognition, recommendation, or improvisation analysis. [web:18][web:40] |
223 | | - |
224 | | -## 11. Citation |
225 | | - |
226 | | -If you use DeepRaaga in academic work, please cite the associated paper (placeholder): |
| 160 | +## 📜 Academic Citation |
227 | 161 |
|
| 162 | +If you use DeepRaaga in academic work, please cite the associated paper: |
| 163 | +```bibtex |
| 164 | +@inproceedings{swaminathan2025deepraaga, |
| 165 | + author = {Gurumurthy Swaminathan}, |
| 166 | + title = {DeepRaaga: Learning and Generating Carnatic Ragas with Deep Neural Sequence Models}, |
| 167 | + year = {2026}, |
| 168 | + booktitle = {Proceedings of the Audio Engineering Society} |
| 169 | +} |
228 | 170 | ``` |
229 | | -Gurumurthy Swaminathan, (2025). DeepRaaga: Learning and Generating Carnatic Ragas with Deep Neural Sequence Models. |
230 | | -Proceedings of the [Conference Name], pp. XX–YY. |
231 | | -``` |
232 | | - |
233 | | -Also acknowledge this repository: |
234 | | - |
235 | | -``` |
236 | | -DeepRaaga: an effort to teach Indian carnatic music to AI. |
237 | | -GitHub repository, https://github.com/sgmoorthy/DeepRaaga |
238 | | -``` |
239 | | -[web:38] |
240 | | - |
241 | | -## 12. License |
242 | 171 |
|
243 | | -This project is licensed under the MIT License – see the [LICENSE](LICENSE) file for details. [web:74][web:96] |
| 172 | +--- |
| 173 | +<div align="center"> |
| 174 | + <strong>DeepRaaga</strong> is an open-source platform created and managed by <strong>Gurumurthy Swaminathan</strong>.<br><br> |
| 175 | + Released under the <a href="LICENSE">MIT License</a>. |
| 176 | +</div> |