Commit 2e0b6c8

feat: Add audio fingerprinting with similarity detection
- Implement comprehensive fingerprinting system with spectral hashing, landmarks, and perceptual hashing
- Add FingerprintMatcher for comparing audio similarity (0-100% scores)
- Add FingerprintDatabase with inverted index for fast similarity search
- Integrate fingerprint comparison into MCP server's compare_audio tool
- Include example demonstrating fingerprint generation and similarity detection
- Support match types: Identical, VerySimilar, Similar, PartiallySimilar, Different
1 parent 29f693e commit 2e0b6c8

6 files changed

Lines changed: 1334 additions & 4 deletions
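
The commit message mentions a `FingerprintDatabase` backed by an inverted index, but the `fingerprint.rs` source is not shown in this view. As a rough sketch of the general technique only (the `InvertedIndex` type, its methods, and the use of `u64` for spectral hashes are all assumptions, not the library's API), an inverted index maps each hash to the tracks that contain it, so a query only touches tracks sharing at least one hash:

```rust
use std::collections::HashMap;

/// Hypothetical inverted index: maps each hash to the IDs of the tracks
/// that contain it. Not the actual `FingerprintDatabase` implementation.
#[derive(Default)]
pub struct InvertedIndex {
    postings: HashMap<u64, Vec<String>>,
}

impl InvertedIndex {
    /// Register every hash of a track under that track's ID.
    pub fn insert(&mut self, id: &str, hashes: &[u64]) {
        for &hash in hashes {
            let ids = self.postings.entry(hash).or_default();
            // A hash repeated within one track is stored only once per track.
            if !ids.iter().any(|existing| existing == id) {
                ids.push(id.to_string());
            }
        }
    }

    /// Return (id, hit count) pairs, most hits first, ties broken by ID.
    /// Each occurrence of a hash in `query` counts as one hit.
    pub fn candidates(&self, query: &[u64]) -> Vec<(String, usize)> {
        let mut counts: HashMap<&str, usize> = HashMap::new();
        for hash in query {
            if let Some(ids) = self.postings.get(hash) {
                for id in ids {
                    *counts.entry(id.as_str()).or_insert(0) += 1;
                }
            }
        }
        let mut ranked: Vec<(String, usize)> = counts
            .into_iter()
            .map(|(id, hits)| (id.to_string(), hits))
            .collect();
        ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        ranked
    }
}
```

In a design like this, the ranked candidates would typically be passed to a more expensive pairwise comparison, so the full matcher runs on a short list rather than on every stored fingerprint.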

README.md

Lines changed: 34 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -12,6 +12,7 @@ High-fidelity audio analysis bridge for development workflows. Analyze audio fil
1212
- **Musical Analysis**: Key detection with confidence, chord progression, harmonic complexity
1313
- **Quality Assessment**: SNR, THD, clipping detection, noise floor, and reliability scoring
1414
- **Segment Analysis**: Temporal structure detection, pattern recognition, coherence analysis
15+
- **Audio Fingerprinting**: Similarity detection, duplicate finding, content identification
1516
- **Visualization**: Waveforms, spectrograms, power curves (base64 encoded)
1617
- **MCP Integration**: Direct integration with AI assistants via Model Context Protocol
1718
- **Content-based Caching**: Fast re-analysis with BLAKE3 hashing
@@ -77,6 +78,8 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
7778
println!("Loudness: {:.1} LUFS", result.perceptual.loudness_lufs);
7879
println!("True peak: {:.1} dBFS", result.perceptual.true_peak_dbfs);
7980
println!("Content type: {:?}", result.classification.primary_type);
81+
println!("Audio quality score: {:.1}%", result.quality.overall_score * 100.0);
82+
println!("Fingerprint hash: {:016x}", result.fingerprint.perceptual_hash);
8083

8184
Ok(())
8285
}
@@ -101,11 +104,13 @@ Parameters:
101104
- `max_data_points`: Limit array sizes for pagination (default: 1000)
102105
- `cursor`: Continue from previous response's next_cursor
103106

104-
#### `compare_audio` - Compare two audio files
107+
#### `compare_audio` - Compare two audio files with fingerprint similarity
105108
Parameters:
106109
- `file_a`, `file_b` (required): Paths to audio files
107110
- `metrics`: Optional comparison metrics to calculate
108111

112+
Returns a fingerprint similarity score (0.0-1.0) and a match type (Identical, VerySimilar, Similar, PartiallySimilar, or Different).
113+
109114
#### `get_job_status` - Check analysis job status
110115
Parameters:
111116
- `job_id` (required): Job ID from previous analysis
@@ -122,7 +127,8 @@ src/
122127
│ ├── classification.rs # Speech/music/silence detection
123128
│ ├── musical.rs # Key detection, chord progression, harmonic analysis
124129
│ ├── quality.rs # Audio quality assessment and issue detection
125-
│ └── segments.rs # Segment-based temporal structure analysis
130+
│ ├── segments.rs # Segment-based temporal structure analysis
131+
│ └── fingerprint.rs # Audio fingerprinting and similarity detection
126132
├── visualization/ # Waveform and spectrogram generation
127133
├── cache/ # Content-based caching system
128134
├── mcp/ # MCP server implementation
@@ -164,17 +170,41 @@ cargo run --example generate_samples
164170
# Basic analysis
165171
cargo run --example basic_analysis
166172

167-
# Envelope visualization (creates PNG)
168-
cargo run --example envelope_visualization
173+
# Spectral analysis (FFT/STFT)
174+
cargo run --example spectral_analysis
169175

170176
# Onset detection
171177
cargo run --example onset_detection
172178

179+
# Perceptual metrics (LUFS, dynamic range)
180+
cargo run --example perceptual_analysis
181+
182+
# Content classification (speech/music/silence)
183+
cargo run --example content_classification
184+
185+
# Musical analysis (key detection, chords)
186+
cargo run --example musical_analysis
187+
188+
# Audio quality assessment
189+
cargo run --example quality_assessment
190+
191+
# Segment-based temporal analysis
192+
cargo run --example segment_analysis
193+
194+
# Audio fingerprinting and similarity detection
195+
cargo run --example fingerprint_similarity
196+
173197
# Compare two audio files
174198
cargo run --example compare_files
175199

200+
# Cached analysis demonstration
201+
cargo run --example cached_analysis
202+
176203
# Batch processing
177204
cargo run --example batch_processing
205+
206+
# Envelope visualization (creates PNG)
207+
cargo run --example envelope_visualization
178208
```
179209

180210
See [examples/README.md](examples/README.md) for more details.
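
The README snippet prints `result.fingerprint.perceptual_hash` with `{:016x}`, which suggests a 64-bit value. This diff does not show how `FingerprintMatcher` scores perceptual hashes. One common way to compare two such hashes, shown here purely as an assumption and under a hypothetical function name, is the fraction of bits on which they agree (one minus the normalized Hamming distance):

```rust
/// Similarity in [0.0, 1.0] between two 64-bit hashes: the fraction of
/// bit positions on which they agree. Hypothetical helper, not part of
/// the ferrous_waves API.
pub fn hash_similarity(a: u64, b: u64) -> f32 {
    // XOR leaves a 1 in every position where the two hashes differ.
    let differing_bits = (a ^ b).count_ones();
    1.0 - differing_bits as f32 / 64.0
}
```

Identical hashes score 1.0 and bitwise-complementary hashes score 0.0. Unrelated hashes tend to land near 0.5, so a raw bit-agreement score usually needs rescaling before it is read as a percentage.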

examples/README.md

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,7 @@ This creates a `samples/` directory with test WAV files including sine waves, ch
2323
- **musical_analysis.rs** - Key detection, chroma analysis, and harmonic complexity
2424
- **quality_assessment.rs** - Audio quality scoring, issue detection, and recommendations
2525
- **segment_analysis.rs** - Temporal structure, pattern detection, and coherence analysis
26+
- **fingerprint_similarity.rs** - Audio fingerprinting, similarity detection, and duplicate finding
2627
- **cached_analysis.rs** - Using the cache system for faster repeated analysis
2728
- **batch_processing.rs** - Process multiple files in parallel
2829
- **envelope_visualization.rs** - Generate waveform visualization with peak and RMS envelopes
@@ -42,6 +43,7 @@ cargo run --example content_classification
4243
cargo run --example musical_analysis
4344
cargo run --example quality_assessment
4445
cargo run --example segment_analysis
46+
cargo run --example fingerprint_similarity
4547
cargo run --example cached_analysis
4648
cargo run --example batch_processing
4749
cargo run --example envelope_visualization

examples/fingerprint_similarity.rs

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
1+
//! Example demonstrating audio fingerprinting and similarity detection
2+
3+
use ferrous_waves::analysis::fingerprint::{
4+
FingerprintDatabase, FingerprintGenerator, FingerprintMatcher,
5+
};
6+
use ferrous_waves::{AnalysisEngine, AudioFile};
7+
use std::error::Error;
8+
9+
#[tokio::main]
10+
async fn main() -> Result<(), Box<dyn Error>> {
11+
println!("Audio Fingerprinting and Similarity Detection Example");
12+
println!("====================================================\n");
13+
14+
// Analyze files and generate fingerprints
15+
let files = vec![
16+
("samples/drums.wav", "Drum pattern"),
17+
("samples/music.wav", "Musical content"),
18+
("samples/test.wav", "Test signal"),
19+
];
20+
21+
let engine = AnalysisEngine::new();
22+
let generator = FingerprintGenerator::new(44100.0); // assumes 44.1 kHz samples; the engine uses audio.buffer.sample_rate
23+
let mut database = FingerprintDatabase::new();
24+
25+
println!("Generating fingerprints...");
26+
println!("-------------------------");
27+
28+
for (file_path, description) in &files {
29+
println!("\nAnalyzing: {} ({})", file_path, description);
30+
31+
let audio = AudioFile::load(file_path)?;
32+
let result = engine.analyze(&audio).await?;
33+
34+
// Display fingerprint info
35+
let fingerprint = &result.fingerprint;
36+
37+
println!(" Fingerprint Details:");
38+
println!(" Perceptual hash: {:016x}", fingerprint.perceptual_hash);
39+
println!(" Spectral hashes: {}", fingerprint.spectral_hashes.len());
40+
println!(" Landmarks: {}", fingerprint.landmarks.len());
41+
println!(
42+
" Sub-fingerprints: {}",
43+
fingerprint.sub_fingerprints.len()
44+
);
45+
println!(
46+
" Compact size: {} bytes",
47+
fingerprint.fingerprint.len() * 8
48+
);
49+
50+
// Show dominant frequencies
51+
if !fingerprint.metadata.dominant_frequencies.is_empty() {
52+
println!(" Dominant frequencies:");
53+
for freq in fingerprint.metadata.dominant_frequencies.iter().take(3) {
54+
println!(" {:.1} Hz", freq);
55+
}
56+
}
57+
58+
// Show landmark types
59+
let spectral_peaks = fingerprint
60+
.landmarks
61+
.iter()
62+
.filter(|l| {
63+
matches!(
64+
l.landmark_type,
65+
ferrous_waves::analysis::fingerprint::LandmarkType::SpectralPeak
66+
)
67+
})
68+
.count();
69+
let onsets = fingerprint
70+
.landmarks
71+
.iter()
72+
.filter(|l| {
73+
matches!(
74+
l.landmark_type,
75+
ferrous_waves::analysis::fingerprint::LandmarkType::OnsetEvent
76+
)
77+
})
78+
.count();
79+
80+
println!(" Landmark breakdown:");
81+
println!(" Spectral peaks: {}", spectral_peaks);
82+
println!(" Onset events: {}", onsets);
83+
84+
// Add to database
85+
database.insert(file_path.to_string(), fingerprint.clone());
86+
}
87+
88+
// Compare fingerprints
89+
println!("\n\nSimilarity Comparison:");
90+
println!("======================");
91+
92+
let matcher = FingerprintMatcher::new();
93+
94+
for i in 0..files.len() {
95+
for j in i + 1..files.len() {
96+
let (file_a, desc_a) = files[i];
97+
let (file_b, desc_b) = files[j];
98+
99+
let audio_a = AudioFile::load(file_a)?;
100+
let audio_b = AudioFile::load(file_b)?;
101+
102+
let fp_a = generator.generate(&audio_a.buffer.to_mono())?;
103+
let fp_b = generator.generate(&audio_b.buffer.to_mono())?;
104+
105+
let match_result = matcher.compare(&fp_a, &fp_b);
106+
107+
println!("\n{} vs {}", desc_a, desc_b);
108+
println!(
109+
" Overall similarity: {:.1}%",
110+
match_result.similarity * 100.0
111+
);
112+
println!(" Match type: {:?}", match_result.match_type);
113+
println!(" Confidence: {:.1}%", match_result.confidence * 100.0);
114+
115+
println!(" Detailed scores:");
116+
println!(" Spectral: {:.1}%", match_result.scores.spectral * 100.0);
117+
println!(" Temporal: {:.1}%", match_result.scores.temporal * 100.0);
118+
println!(" Energy: {:.1}%", match_result.scores.energy * 100.0);
119+
println!(" Landmark: {:.1}%", match_result.scores.landmark * 100.0);
120+
println!(
121+
" Perceptual: {:.1}%",
122+
match_result.scores.perceptual * 100.0
123+
);
124+
125+
if !match_result.matched_segments.is_empty() {
126+
println!(
127+
" Matched segments: {}",
128+
match_result.matched_segments.len()
129+
);
130+
for (idx, segment) in match_result.matched_segments.iter().enumerate().take(3) {
131+
println!(
132+
" {}. [{:.1}s] ↔ [{:.1}s] (quality: {:.1}%)",
133+
idx + 1,
134+
segment.time_a,
135+
segment.time_b,
136+
segment.quality * 100.0
137+
);
138+
}
139+
}
140+
141+
if let Some(offset) = match_result.time_offset {
142+
println!(" Time offset detected: {:.2}s", offset);
143+
}
144+
}
145+
}
146+
147+
// Database search demonstration
148+
println!("\n\nDatabase Search:");
149+
println!("================");
150+
151+
// Search with the first file
152+
if let Some((query_file, query_desc)) = files.first() {
153+
let audio = AudioFile::load(query_file)?;
154+
let query_fp = generator.generate(&audio.buffer.to_mono())?;
155+
156+
println!("Searching for: {} in database", query_desc);
157+
158+
let results = database.search(&query_fp, 0.3);
159+
160+
println!("Found {} matches:", results.len());
161+
for (id, match_result) in results {
162+
println!(
163+
" - {} (similarity: {:.1}%, type: {:?})",
164+
id,
165+
match_result.similarity * 100.0,
166+
match_result.match_type
167+
);
168+
}
169+
}
170+
171+
// Self-similarity test
172+
println!("\n\nSelf-Similarity Test:");
173+
println!("=====================");
174+
175+
let test_audio = AudioFile::load("samples/test.wav")?;
176+
let fp1 = generator.generate(&test_audio.buffer.to_mono())?;
177+
let fp2 = generator.generate(&test_audio.buffer.to_mono())?;
178+
179+
let self_match = matcher.compare(&fp1, &fp2);
180+
181+
println!("Same audio compared to itself:");
182+
println!(" Similarity: {:.1}%", self_match.similarity * 100.0);
183+
println!(" Match type: {:?}", self_match.match_type);
184+
println!(" Expected: >99% similarity for identical audio");
185+
186+
// Reference information
187+
println!("\n\nFingerprinting Reference:");
188+
println!("========================");
189+
println!("Match Types:");
190+
println!(" Identical: >95% similarity");
191+
println!(" Very Similar: 85-95% similarity");
192+
println!(" Similar: 70-85% similarity");
193+
println!(" Partially Similar: 50-70% similarity");
194+
println!(" Different: <50% similarity");
195+
println!("\nUse Cases:");
196+
println!(" - Duplicate detection in music libraries");
197+
println!(" - Copyright and content identification");
198+
println!(" - Version tracking (remixes, covers)");
199+
println!(" - Audio synchronization and alignment");
200+
println!(" - Partial matching for samples and loops");
201+
println!("\nFingerprint Components:");
202+
println!(" - Spectral hashes: Frequency pattern encoding");
203+
println!(" - Landmarks: Significant acoustic events");
204+
println!(" - Perceptual hash: Overall audio signature");
205+
println!(" - Sub-fingerprints: Partial matching support");
206+
207+
Ok(())
208+
}
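
The reference text printed at the end of the example lists similarity ranges for each match type. A minimal sketch of that mapping follows. The `MatchType` enum here is a local stand-in whose variant names come from the commit message, `classify` is a hypothetical function, and the treatment of exact boundary values (for example whether 0.85 counts as VerySimilar) is an assumption, because the printed ranges do not specify it:

```rust
/// Local stand-in for the match types named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MatchType {
    Identical,
    VerySimilar,
    Similar,
    PartiallySimilar,
    Different,
}

/// Map a similarity score in [0.0, 1.0] to a match type using the ranges
/// printed by the example. Boundary handling is an assumption.
pub fn classify(similarity: f32) -> MatchType {
    match similarity {
        s if s > 0.95 => MatchType::Identical,
        s if s >= 0.85 => MatchType::VerySimilar,
        s if s >= 0.70 => MatchType::Similar,
        s if s >= 0.50 => MatchType::PartiallySimilar,
        // Anything below 0.50, and NaN, falls through to Different.
        _ => MatchType::Different,
    }
}
```

Whether the library's matcher uses exactly these thresholds is not visible in this diff, so the sketch reflects only the printed reference text.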

src/analysis/engine.rs

Lines changed: 35 additions & 0 deletions
@@ -1,4 +1,5 @@
11
use crate::analysis::classification::{ContentClassification, ContentClassifier};
2+
use crate::analysis::fingerprint::{AudioFingerprint, FingerprintGenerator};
23
use crate::analysis::musical::{MusicalAnalysis, MusicalAnalyzer};
34
use crate::analysis::perceptual::{calculate_perceptual_metrics, PerceptualMetrics};
45
use crate::analysis::quality::{QualityAnalyzer, QualityAssessment};
@@ -23,6 +24,7 @@ pub struct AnalysisResult {
2324
pub musical: MusicalAnalysis,
2425
pub quality: QualityAssessment,
2526
pub segments: SegmentAnalysis,
27+
pub fingerprint: AudioFingerprint,
2628
pub visuals: VisualsData,
2729
pub insights: Vec<String>,
2830
pub recommendations: Vec<String>,
@@ -366,6 +368,10 @@ impl AnalysisEngine {
366368
let segment_analyzer = SegmentAnalyzer::new(audio.buffer.sample_rate as f32);
367369
let segments = segment_analyzer.analyze(&mono)?;
368370

371+
// Generate audio fingerprint
372+
let fingerprint_generator = FingerprintGenerator::new(audio.buffer.sample_rate as f32);
373+
let fingerprint = fingerprint_generator.generate(&mono)?;
374+
369375
// Add musical insights
370376
insights.push(format!(
371377
"Key: {} (confidence: {:.0}%)",
@@ -435,6 +441,17 @@ impl AnalysisEngine {
435441
insights.push("Low segment coherence - abrupt changes detected".to_string());
436442
}
437443

444+
// Add fingerprint insights
445+
insights.push(format!(
446+
"Audio fingerprint generated with {} spectral hashes",
447+
fingerprint.spectral_hashes.len()
448+
));
449+
450+
insights.push(format!(
451+
"{} acoustic landmarks detected",
452+
fingerprint.landmarks.len()
453+
));
454+
438455
// Add perceptual insights
439456
if perceptual.loudness_lufs < -23.0 {
440457
insights.push(format!(
@@ -506,6 +523,7 @@ impl AnalysisEngine {
506523
musical,
507524
quality,
508525
segments,
526+
fingerprint,
509527
insights,
510528
recommendations,
511529
};
@@ -540,6 +558,19 @@ impl AnalysisEngine {
540558
_ => None,
541559
};
542560

561+
// Compare fingerprints
562+
let (fingerprint_similarity, fingerprint_match_type) = match (&analysis_a, &analysis_b) {
563+
(Some(a), Some(b)) => {
564+
let matcher = crate::analysis::fingerprint::FingerprintMatcher::new();
565+
let match_result = matcher.compare(&a.fingerprint, &b.fingerprint);
566+
(
567+
Some(match_result.similarity),
568+
Some(format!("{:?}", match_result.match_type)),
569+
)
570+
}
571+
_ => (None, None),
572+
};
573+
543574
let duration_difference = audio_a.buffer.duration_seconds - audio_b.buffer.duration_seconds;
544575
let sample_rate_match = audio_a.buffer.sample_rate == audio_b.buffer.sample_rate;
545576

@@ -563,6 +594,8 @@ impl AnalysisEngine {
563594
sample_rate_match,
564595
tempo_difference,
565596
spectral_similarity: None,
597+
fingerprint_similarity,
598+
fingerprint_match_type,
566599
},
567600
}
568601
}
@@ -590,6 +623,8 @@ pub struct ComparisonMetrics {
590623
pub sample_rate_match: bool,
591624
pub tempo_difference: Option<f32>,
592625
pub spectral_similarity: Option<f32>,
626+
pub fingerprint_similarity: Option<f32>,
627+
pub fingerprint_match_type: Option<String>,
593628
}
594629

595630
impl Default for AnalysisEngine {
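
Both new `ComparisonMetrics` fields are `Option`s, and the comparison code above sets them to `None` unless both analyses are available. A caller therefore has to handle the missing case. The sketch below uses a hypothetical `FingerprintComparison` struct that mirrors only those two fields, plus a hypothetical `describe` helper, to show one way to do that:

```rust
/// Hypothetical mirror of the two fields this commit adds to
/// `ComparisonMetrics`; the real struct has more fields.
pub struct FingerprintComparison {
    pub fingerprint_similarity: Option<f32>,
    pub fingerprint_match_type: Option<String>,
}

/// Render a one-line summary, falling back when either field is missing.
pub fn describe(metrics: &FingerprintComparison) -> String {
    match (
        metrics.fingerprint_similarity,
        metrics.fingerprint_match_type.as_deref(),
    ) {
        (Some(similarity), Some(match_type)) => {
            format!("{:.1}% similar ({})", similarity * 100.0, match_type)
        }
        _ => "fingerprint comparison unavailable".to_string(),
    }
}
```

Because the match type is stored as a `String` produced with `{:?}`, consumers that need to branch on it would have to compare strings. Exposing the enum directly would be an alternative design, but that is outside what this commit shows.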
