This system solves location search inconsistency by grouping nearby properties into geo-buckets and normalizing location names.
Problem: "Sangotedo" returns 0 results, but "Sangotedo" (different spelling) returns 47 results.
Solution: Grid-based buckets + fuzzy name matching = consistent results.
Grid-Based Bucketing:
Split the map into fixed-size grid cells (0.01° × 0.01° ≈ 1.1km squares)
Each cell = One Bucket
Example:
┌─────────┬─────────┬─────────┐
│ Lekki │ VI │ Ikoyi │
│ Cell 1 │ Cell 2 │ Cell 3 │
├─────────┼─────────┼─────────┤
│Sangotedo│ Ajah │ Badore │
│ Cell 4 │ Cell 5 │ Cell 6 │ ← All Sangotedo properties in Cell 4
└─────────┴─────────┴─────────┘
Grid Size: 0.01 degrees (~1.1 km)
Coordinate System: WGS84 (standard GPS)
Calculate Grid Cell:
grid_x = floor(longitude / 0.01)
grid_y = floor(latitude / 0.01)
Example - Sangotedo Coordinates:
(6.4698, 3.6285) → Grid (362, 646)
(6.4720, 3.6301) → Grid (362, 646) ← Same cell!
(6.4705, 3.6290) → Grid (362, 646) ← Same cell!✅ Fast: O(1) lookup - just calculate grid coordinates
✅ Simple: Easy to implement and debug
✅ Predictable: Fixed boundaries, consistent behavior
❌ Trade-off: Properties on cell boundaries might be in different buckets
- Mitigation: Fuzzy name matching catches these cases
┌─────────────────────────────────────────────────────────────┐
│ geo_buckets │
├─────────────────────────────────────────────────────────────┤
│ id SERIAL PRIMARY KEY │
│ grid_x INTEGER (indexed) │
│ grid_y INTEGER (indexed) │
│ canonical_name VARCHAR(255) │
│ aliases JSONB (e.g., ["sangotedo", "sangotedo ajah"])│
│ created_at TIMESTAMP │
│ │
│ UNIQUE(grid_x, grid_y) ← Only one bucket per grid cell │
└─────────────────────────────────────────────────────────────┘
│
│ 1:N
▼
┌─────────────────────────────────────────────────────────────┐
│ properties │
├─────────────────────────────────────────────────────────────┤
│ id SERIAL PRIMARY KEY │
│ title VARCHAR(255) │
│ location_name VARCHAR(255) │
│ latitude FLOAT │
│ longitude FLOAT │
│ geo_bucket_id INTEGER → references geo_buckets(id) │
│ price DECIMAL(12,2) │
│ bedrooms INTEGER │
│ bathrooms INTEGER │
│ created_at TIMESTAMP │
└─────────────────────────────────────────────────────────────┘-- Fast bucket lookup
CREATE INDEX idx_buckets_grid ON geo_buckets(grid_x, grid_y);
-- Fast property → bucket join
CREATE INDEX idx_properties_bucket ON properties(geo_bucket_id);One Bucket ←→ Many Properties
Example:
Bucket "sangotedo" (grid 362, 646)
├─ Property 1: "Sangotedo"
├─ Property 2: "Sangotedo, Ajah"
└─ Property 3: "sangotedo lagos"
Remove differences in spelling, case, punctuation:
Input → Normalized
────────────────────────────────────
"Sangotedo" → "sangotedo"
"Sangotedo, Ajah" → "sangotedo ajah"
"SANGOTEDO LAGOS!!" → "sangotedo lagos"
Steps:
1. Convert to lowercase
2. Remove punctuation (, . ! ?)
3. Collapse whitespaceProperties at nearby coordinates automatically share a bucket:
Coordinate Check:
─────────────────
Sangotedo (6.4698, 3.6285) → Grid (362, 646)
Sangotedo, Ajah (6.4720, 3.6301) → Grid (362, 646) ✓ Same!
sangotedo lagos (6.4705, 3.6290) → Grid (362, 646) ✓ Same!
Result: All 3 properties in ONE bucket
Handle typos and variations:
Search Query: "sangotedo"
─────────────────────────────────────────────
Match Type Example Distance
─────────────────────────────────────────────
Exact "sangotedo" 0 ✓
Substring "sangotedo ajah" Contains ✓
Small typo "sangotdo" (missing e) 1 ✓
Small typo "sangotedo" (extra o) 1 ✓
Too different "sngtd" 6 ✗
Threshold: Distance ≤ 3 characters
User searches "sangotedo"
↓
1. Normalize: "sangotedo"
↓
2. Check ALL buckets for matches:
- Bucket "sangotedo": Distance 0 ✓ MATCH
- Bucket "lekki": Distance 7 ✗ No match
- Bucket "vi": Distance 8 ✗ No match
↓
3. Get properties from matched bucket(s)
↓
4. Return ALL properties (3 found!)
┌─────────┐
│ USER │ POST /api/properties/
└────┬────┘ {title, location_name, lat, lng, price, ...}
│
▼
┌──────────────────────────────────────────────┐
│ API: Receive property data │
└────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────┐
│ Calculate Grid Cell │
│ grid_x = floor(lng / 0.01) = 362 │
│ grid_y = floor(lat / 0.01) = 646 │
└────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────┐
│ Check: Does bucket (362, 646) exist? │
│ ├─ YES: Get existing bucket │
│ │ Add "sangotedo" to aliases │
│ └─ NO: Create new bucket │
│ canonical_name = "sangotedo" │
└────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────┐
│ Save Property with geo_bucket_id │
└────┬─────────────────────────────────────────┘
│
▼
┌─────────┐
│ RETURN │ 201 Created {id, bucket_id, ...}
└─────────┘
┌─────────┐
│ USER │ GET /api/properties/search?location=sangotedo
└────┬────┘
│
▼
┌──────────────────────────────────────────────┐
│ Normalize Query │
│ "sangotedo" → "sangotedo" │
└────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────┐
│ Find Matching Buckets │
│ │
│ For each bucket in database: │
│ Check canonical_name: "sangotedo" │
│ ├─ Distance = 0 ✓ MATCH │
│ │ │
│ Check aliases: ["sangotedo", ...] │
│ ├─ "sangotedo ajah" contains query ✓ │
│ │ │
│ Matched: [Bucket #1] │
└────┬─────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────┐
│ Query Database │
│ SELECT * FROM properties │
│ WHERE geo_bucket_id IN (1) │
│ │
│ Results: 3 properties found │
└────┬─────────────────────────────────────────┘
│
▼
┌─────────┐
│ RETURN │ 200 OK [{property1}, {property2}, {property3}]
└─────────┘
User Input System Processing Output
────────── ───────────────── ──────
"sangotedo" → Normalize
↓
→ Find Buckets
↓
Bucket #1 (sangotedo)
- grid: (362, 646)
- aliases: [sangotedo,
sangotedo ajah,
sangotedo lagos]
↓
→ Get Properties → 3 Properties
↓ - Property 1
3 properties in bucket - Property 2
- Property 3
Operation Time Why
─────────────────────────────────────────────
Create Property ~5ms Simple grid calculation + INSERT
Search (exact) ~10ms Index lookup on bucket + JOIN
Search (fuzzy) ~30ms Scan buckets (fast with few buckets)
Bucket Stats ~50ms Aggregation query
Current:
- 500 buckets (Lagos area)
- 500,000 properties
- Database: ~1GB
- Performance: ✓ Good
Future (1M+ properties):
- Add trigram index for faster fuzzy search
- Cache frequent searches (Redis)
- Consider adjacent bucket search for boundaries
| Approach | Pros | Cons | Decision |
|---|---|---|---|
| Grid | Simple, O(1) lookup, predictable | Fixed boundaries | ✓ Chosen |
| H3 Hexagonal | Better coverage, no edges | Complex, overkill | ✗ |
| Radius clustering | Natural circles | O(n) search, slow | ✗ |
| Geohash | Hierarchical | Less intuitive | ✗ |
- Grid size 0.01°: Good balance for Lagos neighborhoods
- Levenshtein distance ≤ 3: Catches most typos without false positives
- In-memory bucket scan: Fast enough for <1000 buckets
- Substring matching: Handles "Sangotedo" in "Sangotedo, Ajah"
- Properties exactly on grid boundaries might be in different buckets
- Mitigation: Fuzzy name matching finds them anyway
- One fixed grid size for all areas
- Future: Could adjust per region (urban vs rural)
- Simple typo handling (Levenshtein)
- Future: Could add phonetic matching (Soundex)
Database: Empty
Input: {location_name: "Sangotedo", lat: 6.4698, lng: 3.6285}
Calculate: grid_x=362, grid_y=646
Create: Bucket #1 {canonical_name: "sangotedo", aliases: ["sangotedo"]}
Create: Property #1 {geo_bucket_id: 1}
Database:
Bucket #1: "sangotedo" (1 property)
Input: {location_name: "Sangotedo, Ajah", lat: 6.4720, lng: 3.6301}
Calculate: grid_x=362, grid_y=646 ← Same grid!
Find: Bucket #1 already exists
Update: Bucket #1 aliases += "sangotedo ajah"
Create: Property #2 {geo_bucket_id: 1}
Database:
Bucket #1: "sangotedo"
aliases: ["sangotedo", "sangotedo ajah"]
(2 properties)
Input: {location_name: "sangotedo lagos", lat: 6.4705, lng: 3.6290}
Calculate: grid_x=362, grid_y=646 ← Same grid!
Find: Bucket #1 already exists
Update: Bucket #1 aliases += "sangotedo lagos"
Create: Property #3 {geo_bucket_id: 1}
Database:
Bucket #1: "sangotedo"
aliases: ["sangotedo", "sangotedo ajah", "sangotedo lagos"]
(3 properties)
Query: "sangotedo"
Normalize: "sangotedo"
Match: Bucket #1 (exact match on canonical_name)
Result: Return all 3 properties ✓
Database Query:
SELECT * FROM properties WHERE geo_bucket_id = 1
→ Returns: Property #1, #2, #3
Problem Solved:
- ✓ "Sangotedo" = "Sangotedo, Ajah" = "sangotedo lagos"
- ✓ Case-insensitive
- ✓ Typo-tolerant
- ✓ Fast (<50ms)
How:
- Grid cells group nearby properties
- Name normalization removes formatting differences
- Fuzzy matching handles typos
- Aliases track all name variations
Result: Consistent, reliable property search regardless of spelling or formatting.