ImadElMaftouhi/ObjectLens

ObjectLens

ObjectLens is a multimedia mining and indexing system for object-based image retrieval using deep learning and similarity search. It supports both 2D image search (using ImageNet/YOLO) and 3D model retrieval (using a pottery dataset), combining computer vision, feature extraction, and vector similarity search.

🎯 Overview

ObjectLens allows users to:

  • Upload an image and detect objects within it using YOLO
  • Search for similar objects across large datasets using FAISS-powered vector similarity
  • Filter results by class for more precise retrieval
  • Visualize 2D and 3D results through an interactive web interface
  • Index and retrieve 3D models based on geometric and visual features

🏗️ Architecture

ObjectLens/
├── backend/          # FastAPI server with ML pipelines
├── frontend/         # React + Vite web interface
├── scripts/          # Dataset preparation and indexing scripts
├── data/             # Datasets (ImageNet, Pottery)
├── db/               # MongoDB configuration
└── docker-compose.yml

Technology Stack

Backend:

  • FastAPI for REST API
  • YOLOv8 (Ultralytics) for object detection
  • FAISS for fast similarity search
  • MongoDB for metadata storage
  • OpenCV, scikit-image, and custom descriptors for feature extraction

Frontend:

  • React 19 with Vite
  • Three.js for 3D model visualization
  • Tailwind CSS for styling
  • React Router for navigation

Databases:

  • MongoDB for storing object metadata and features
  • FAISS indices for vector similarity search

🚀 Quick Start

Prerequisites

  • Python 3.8+ with pip (create a local virtual environment first: python -m venv .venv)
  • Node.js 16+ with npm
  • Docker & Docker Compose (optional, for containerized deployment)
  • Git for cloning the repository

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd ObjectLens
  2. Set up environment variables:

    # Copy and configure .env file
    cp .env.example .env
    # Edit .env with your configuration
  3. Run the setup pipeline:

    # This downloads datasets, builds catalogs, and sets up indices
    bash run_pipeline.sh

    The pipeline runs the following steps:

    • Downloads ImageNet Winter21 dataset
    • Verifies and builds YOLO dataset
    • Precomputes image features
    • Downloads pottery 3D models
    • Builds and splits pottery catalog
    • Sets up MongoDB and FAISS indices

Running the Application

Option 1: Docker Compose (Recommended)

# Start MongoDB
docker-compose up -d mongo

# Start backend (from project root)
cd backend
pip install -r requirements.txt
cd ..; uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

# Start frontend (in a new terminal)
cd frontend
npm install
npm run dev

Access the application at http://localhost:5173

Option 2: Manual Setup

Backend:

cd backend
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
cd ..; uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

Frontend:

cd frontend
npm install
npm run dev

MongoDB:

# Using Docker
docker run -d -p 27017:27017 --name objectlens-mongo mongo:latest

# Or install MongoDB locally

📊 Datasets

2D Dataset: ImageNet Winter21

  • Purpose: Object detection and 2D image retrieval
  • Classes: Multiple object categories
  • Format: JPEG images with YOLO annotations
  • Location: data/imagenet_4_yolo/
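
As an illustration of the annotation format, here is a minimal sketch that converts one YOLO label line into a pixel-space bounding box. It assumes the standard YOLO text layout (class id followed by normalized center x, center y, width, and height); the names PixelBox and parse_yolo_label are hypothetical and are not taken from the ObjectLens scripts.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    class_id: int
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def parse_yolo_label(line: str, img_w: int, img_h: int) -> PixelBox:
    """Convert one 'class cx cy w h' label line (normalized) to a pixel box."""
    fields = line.split()
    class_id = int(fields[0])
    cx, cy, w, h = (float(v) for v in fields[1:5])
    # Scale the normalized values to pixels.
    cx, w = cx * img_w, w * img_w
    cy, h = cy * img_h, h * img_h
    # Convert center/size to corners and clamp to the image bounds.
    x_min = max(0, round(cx - w / 2))
    y_min = max(0, round(cy - h / 2))
    x_max = min(img_w, round(cx + w / 2))
    y_max = min(img_h, round(cy + h / 2))
    return PixelBox(class_id, x_min, y_min, x_max, y_max)


box = parse_yolo_label("3 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
print(box)  # PixelBox(class_id=3, x_min=240, y_min=120, x_max=400, y_max=360)
```

The same corner coordinates are what a crop step would need before feature extraction.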

3D Dataset: Pottery Models

  • Purpose: 3D model retrieval and visualization
  • Classes: Amphora, Hydria, Krater, Kylix, and more
  • Format: OBJ files with textures
  • Location: data/raw/3DModels/

🔍 How It Works

2D Image Search Pipeline

  1. Upload & Detection: User uploads an image → YOLO detects objects → user selects an object
  2. Feature Extraction: System crops the object → extracts deep features (weighted, L2-normalized vector)
  3. Similarity Search: FAISS searches the index → returns top-k nearest neighbors
  4. Result Retrieval: System looks up metadata in MongoDB → returns images with highlighted matching objects
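
Steps 2 and 3 can be sketched in plain Python. This is not the ObjectLens implementation: a brute-force scan stands in for the FAISS index, the descriptor blocks ("color", "shape") and their weights are hypothetical, and the in-memory list replaces the MongoDB metadata lookup. It only shows how a weighted, L2-normalized vector makes the dot product behave as cosine similarity, and how a class filter narrows the candidates.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical block weights; the real values are not given in this README.
WEIGHTS = {"color": 0.7, "shape": 0.3}


def l2_normalize(vec: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def weighted_descriptor(blocks: Dict[str, Sequence[float]]) -> List[float]:
    """Normalize each block, scale it by its weight, concatenate, renormalize."""
    combined: List[float] = []
    for name in sorted(blocks):
        combined.extend(WEIGHTS[name] * x for x in l2_normalize(blocks[name]))
    return l2_normalize(combined)


def top_k(query: Sequence[float],
          index: List[Tuple[str, str, List[float]]],
          k: int,
          query_class: Optional[str] = None) -> List[Tuple[str, float]]:
    """Brute-force top-k by dot product over (item_id, class_name, unit_vector)."""
    scored = []
    for item_id, cls, vec in index:
        if query_class is not None and cls != query_class:
            continue  # same-class filtering
        scored.append((sum(q * v for q, v in zip(query, vec)), item_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(item_id, score) for score, item_id in scored[:k]]


index = [
    ("img_001", "cat", weighted_descriptor({"color": [1, 0], "shape": [0, 1]})),
    ("img_002", "dog", weighted_descriptor({"color": [1, 0], "shape": [1, 0]})),
    ("img_003", "cat", weighted_descriptor({"color": [0, 1], "shape": [1, 0]})),
]
query = weighted_descriptor({"color": [1, 0], "shape": [0, 2]})

results = top_k(query, index, k=2)
cats = top_k(query, index, k=2, query_class="cat")
print([item for item, _ in results])  # ['img_001', 'img_002']
print([item for item, _ in cats])     # ['img_001', 'img_003']
```

With FAISS, the scan in top_k would be replaced by an index query over the same normalized vectors.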

3D Model Retrieval Pipeline

  1. Feature Extraction: Extract geometric descriptors (shape, curvature, distribution)
  2. Indexing: Build FAISS index from 3D feature vectors
  3. Query: User searches by uploading a 3D model or selecting from catalog
  4. Visualization: Results displayed with Three.js 3D viewer
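
To make step 1 concrete, here is a simplified sketch of one possible "distribution" descriptor: a histogram of pairwise vertex distances, in the spirit of the D2 shape distribution. Whether ObjectLens uses this descriptor is an assumption. The usual D2 formulation samples random surface points, whereas this version uses every vertex pair to stay deterministic, so it only suits small meshes. The function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def parse_obj_vertices(obj_text: str) -> List[Vertex]:
    """Read 'v x y z' records from OBJ text; other record types are ignored."""
    vertices = []
    for line in obj_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "v":
            vertices.append((float(parts[1]), float(parts[2]), float(parts[3])))
    return vertices


def d2_descriptor(vertices: List[Vertex], bins: int = 8) -> List[float]:
    """Histogram of pairwise distances, each divided by the largest distance."""
    dists = [math.dist(a, b) for a, b in combinations(vertices, 2)]
    if not dists or max(dists) == 0:
        return [0.0] * bins
    longest = max(dists)
    hist = [0] * bins
    for d in dists:
        # The largest distance would land in bin index `bins`, so clamp it.
        hist[min(int(d / longest * bins), bins - 1)] += 1
    return [count / len(dists) for count in hist]


OBJ_TEXT = """\
# unit square
v 0 0 0
v 1 0 0
v 1 1 0
v 0 1 0
f 1 2 3 4
"""

vertices = parse_obj_vertices(OBJ_TEXT)
descriptor = d2_descriptor(vertices, bins=4)
print(descriptor)  # four sides fall in bin 2, two diagonals in bin 3
```

Dividing by the largest distance makes the histogram independent of model scale, which is useful when models come in different units.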

🛠️ API Endpoints

Health Check

GET /health
GET /health/dataset

Object Detection

POST /api/detect
# Upload image, returns detected objects with bounding boxes

Search

POST /api/search/topk?top_k=10&metric=cosine&same_class_only=false
# Parameters:
# - top_k: Number of results to return
# - metric: 'cosine' or 'euclidean'
# - same_class_only: Filter by object class
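
A note on the metric parameter: for L2-normalized vectors, squared Euclidean distance and cosine similarity are tied by ||a - b||² = 2 - 2·cos(a, b), so both metrics produce the same ranking. The sketch below checks this numerically. It assumes both metrics operate on the normalized feature vectors, which this README does not state explicitly.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


query = l2_normalize([3.0, 4.0])
candidates = {
    "a": l2_normalize([4.0, 3.0]),
    "b": l2_normalize([0.0, 1.0]),
    "c": l2_normalize([1.0, 0.0]),
}

# On unit vectors the dot product is the cosine similarity.
cos = dot(query, candidates["a"])
dist = math.dist(query, candidates["a"])
print(math.isclose(dist ** 2, 2 - 2 * cos))  # True

# Descending cosine and ascending Euclidean distance give the same order.
by_cosine = sorted(candidates, key=lambda n: dot(query, candidates[n]), reverse=True)
by_distance = sorted(candidates, key=lambda n: math.dist(query, candidates[n]))
print(by_cosine, by_distance)  # ['a', 'b', 'c'] ['a', 'b', 'c']
```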

Sample Data

GET /api/samples/random?count=10
# Returns random sample images from dataset

See backend/test_commands.md for detailed API testing examples.

📁 Project Structure

Backend (/backend)

backend/
├── main.py              # FastAPI application entry point
├── routers/             # API route handlers
│   ├── detect.py        # Object detection endpoints
│   ├── search.py        # Similarity search endpoints
│   └── samples.py       # Sample data endpoints
├── services/            # Business logic
│   ├── yolo_service.py  # YOLO detection service
│   ├── feature_extraction.py  # Feature extraction
│   ├── faiss_service.py # FAISS index management
│   └── compute_similarity.py  # Similarity computation
├── core/                # Configuration and utilities
├── db/                  # Database models and connections
└── schemas.py           # Pydantic models

Frontend (/frontend)

frontend/
├── src/
│   ├── main.jsx         # Application entry point
│   ├── App.jsx          # Root component with routing
│   ├── Home.jsx         # Main search interface
│   ├── Pr2d.jsx         # 2D preview page
│   ├── Pr3d.jsx         # 3D preview page
│   ├── api.js           # API client
│   └── components/
│       └── ModelViewer.jsx  # 3D model viewer
└── public/              # Static assets

See frontend/README.md for detailed frontend documentation.

Scripts (/scripts)

scripts/
├── dataset/             # Dataset download and preparation
│   ├── imagenet_*.py    # ImageNet pipeline scripts
│   └── pottery_*.py     # Pottery dataset scripts
├── preprocessing/       # Feature precomputation
└── indexing/            # Index building and evaluation

🧪 Testing

Backend Tests

# Test feature extraction pipeline
python backend/test_feature_pipeline.py

# Test 3D retrieval
python backend/test_3D_retrieval.py

# Test system flow
python backend/test_system_flow.py

API Testing

# Using curl
curl -X POST "http://localhost:8000/api/search/topk?top_k=10" \
  -F "file=@path/to/image.jpg"

# Using Python
python backend/test_search_endpoint.py
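
For reference, the curl call above can be reproduced with the Python standard library alone. This is a hypothetical client sketch, not the contents of backend/test_search_endpoint.py: it builds the multipart request by hand using the "file" field name from the curl example and the query parameters listed under Search. The function name and the placeholder image bytes are assumptions.

```python
import mimetypes
import uuid
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url: str, image_bytes: bytes, filename: str,
                         top_k: int = 10, metric: str = "cosine",
                         same_class_only: bool = False) -> Request:
    """Build (but do not send) a multipart POST for /api/search/topk."""
    if metric not in ("cosine", "euclidean"):
        raise ValueError("metric must be 'cosine' or 'euclidean'")
    query = urlencode({
        "top_k": top_k,
        "metric": metric,
        "same_class_only": str(same_class_only).lower(),
    })
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return Request(
        f"{base_url}/api/search/topk?{query}",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder bytes; in practice read them from an image file.
req = build_search_request("http://localhost:8000", b"\xff\xd8\xff", "query.jpg", top_k=5)
print(req.get_method(), req.full_url)
```

With the backend running, the request can be sent with urllib.request.urlopen(req). The response schema is not documented here, so inspect the raw JSON before relying on specific fields.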

🎨 Features

Current Features

  • ✅ Object detection with YOLOv8
  • ✅ FAISS-powered similarity search
  • ✅ Class-based filtering
  • ✅ 2D image retrieval
  • ✅ 3D model retrieval
  • ✅ Interactive web interface
  • ✅ MongoDB metadata storage
  • ✅ Docker support

Planned Features

  • 🔄 Batch upload and processing
  • 🔄 Advanced 3D feature descriptors
  • 🔄 Real-time collaborative search
  • 🔄 Export and annotation tools

🔧 Configuration

Key configuration files:

  • .env - Environment variables (database URLs, API keys, dataset paths)
  • backend/core/config.py - Backend settings
  • frontend/src/api.js - API base URL configuration
  • docker-compose.yml - Container orchestration
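
As a rough sketch of how such settings might be read from the environment, the snippet below loads a few values with defaults. The variable names (MONGO_URL, DATA_DIR, API_PORT) and the Settings class are hypothetical; check .env.example and backend/core/config.py for the actual keys. The defaults mirror the ports and the data/ directory mentioned elsewhere in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    mongo_url: str
    data_dir: str
    api_port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ), with fallbacks."""
    env = os.environ if env is None else env
    return Settings(
        # Hypothetical variable names; the real ones live in .env.example.
        mongo_url=env.get("MONGO_URL", "mongodb://localhost:27017"),
        data_dir=env.get("DATA_DIR", "data"),
        api_port=int(env.get("API_PORT", "8000")),
    )


settings = load_settings({"MONGO_URL": "mongodb://mongo:27017", "API_PORT": "9000"})
print(settings)
```

Passing the mapping explicitly keeps the loader easy to test without touching the real environment.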

📚 Documentation

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is part of the M232 Multimedia Mining & Indexing course at IASD.

🙏 Acknowledgments

  • ImageNet for the image dataset
  • Ultralytics for YOLOv8
  • FAISS by Meta AI for similarity search
  • Three.js for 3D visualization

📧 Contact

For questions or issues, please open an issue on the repository.


Built for MST.IASD.232 - multimedia mining and indexing course

About

ObjectLens: Intelligent 2D & 3D object detection and retrieval system combining YOLOv8 real-time detection with multi-modal feature extraction (form, texture, color) and FAISS-powered similarity search. Built with React/Vite frontend and FastAPI backend for scalable multimedia mining and 3D-aware image indexing.
