A high-fidelity autonomous UAV simulation framework for disaster response, integrating multi-sensor fusion and real-time AI perception.
- Overview
- System Architecture
- Key Features
- Tech Stack
- Results
- Getting Started
- Documentation
- Repository Structure
- License
This project presents a fully simulated Autonomous Disaster Response Drone that integrates LiDAR-Optical Flow Fusion (LOFF) odometry with YOLOv8-based semantic perception to enable robust navigation in GPS-denied environments such as earthquake zones, flood areas, and disaster sites.
The platform was built using ROS 2 Humble, Gazebo Garden, and ArduPilot SITL, enabling rigorous hardware-free testing with realistic physics.
The system is composed of five tightly coupled subsystems:
- Sensor Simulation: LiDAR and RGB camera modeled in Gazebo
- Optical Flow Estimation: Dense motion vectors from sequential image frames
- LiDAR-SLAM Fusion: Point cloud alignment with Factor Graph Optimization (FGO)
- Semantic Perception: YOLOv8 real-time object detection
- Autonomous Decision-Making: Confidence-based landing and dynamic avoidance
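To give a feel for why fusing the LiDAR and optical-flow estimates helps, here is a minimal sketch of inverse-variance weighted fusion of two displacement estimates. This is a deliberately simplified stand-in, not the Factor Graph Optimization used in the project; the `OdomIncrement` type, the `fuse_increments` function, and the scalar variance model are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class OdomIncrement:
    """Hypothetical planar displacement estimate (metres) with a scalar variance (m^2)."""
    dx: float
    dy: float
    variance: float


def fuse_increments(lidar: OdomIncrement, flow: OdomIncrement) -> OdomIncrement:
    """Combine two independent estimates by inverse-variance weighting.

    The estimate with the smaller variance gets the larger weight, and the
    fused variance is never larger than either input variance.
    """
    if lidar.variance <= 0 or flow.variance <= 0:
        raise ValueError("variances must be positive")
    w_lidar = 1.0 / lidar.variance
    w_flow = 1.0 / flow.variance
    total = w_lidar + w_flow
    return OdomIncrement(
        dx=(w_lidar * lidar.dx + w_flow * flow.dx) / total,
        dy=(w_lidar * lidar.dy + w_flow * flow.dy) / total,
        variance=1.0 / total,
    )
```

A full factor graph additionally links poses over time and re-optimizes the whole trajectory; the sketch only shows the single-step intuition that a more certain sensor dominates the fused result.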
| Feature | Description |
|---|---|
| GPS-Denied Navigation | Stable pose estimation via LiDAR-SLAM + Optical Flow FGO |
| Human Detection | YOLOv8 detects victims with a 70% confidence threshold |
| Autonomous Landing | Triggered upon confirmed human detection |
| Obstacle Avoidance | LiDAR laser rays maintain a 3 m safety distance |
| Realistic Simulation | ROS 2 + Gazebo with maze, hills, and runway worlds |
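The landing and avoidance rules in the table can be sketched as two small predicates. This is an illustrative sketch rather than the project's C++ implementation: the function names, the `"person"` class label, and the choice of an inclusive threshold comparison are assumptions; only the 70% threshold and the 3 m safety distance come from the table.

```python
import math
from typing import Iterable, Sequence

LANDING_CONFIDENCE = 0.70   # confidence threshold from the feature table
SAFETY_DISTANCE_M = 3.0     # safety distance from the feature table


def should_land(detections: Iterable[tuple[str, float]]) -> bool:
    """Return True if any detection is a person at or above the threshold.

    Each detection is a (label, confidence) pair; the "person" label is an
    assumed class name.
    """
    return any(
        label == "person" and confidence >= LANDING_CONFIDENCE
        for label, confidence in detections
    )


def obstacle_too_close(ranges_m: Sequence[float]) -> bool:
    """Return True if any valid LiDAR range is inside the safety distance.

    Non-finite and non-positive readings are treated as "no return" and ignored.
    """
    valid = [r for r in ranges_m if math.isfinite(r) and r > 0.0]
    return bool(valid) and min(valid) < SAFETY_DISTANCE_M
```

With these definitions, a person detected at 73% confidence triggers landing while one at 48% does not, matching the two confidence values reported in the YOLO confidence analysis.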
- Simulation: ROS 2 Humble · Gazebo Garden · ArduPilot SITL · MAVProxy
- Perception: YOLOv8 (Darknet ROS) · OpenCV · PyTorch
- Localization: Google Cartographer SLAM · Micro-XRCE-DDS
- Languages: C++17 · Python 3.10
- Build Tools: Colcon · CMake · Rosdep
| Metric | Result |
|---|---|
| Localization Drift Reduction | 38% vs. single-sensor |
| Object Detection mAP@0.5 | 91.3% |
| Inference Speed | 60 FPS real-time |
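For readers who want to reproduce these kinds of figures on their own runs, the arithmetic behind two of the metrics can be sketched as follows. The helper names and the drift values in the usage note are hypothetical; they are not measurements from this project.

```python
def drift_reduction_percent(baseline_drift_m: float, fused_drift_m: float) -> float:
    """Relative drift reduction of the fused estimate versus a single-sensor baseline."""
    if baseline_drift_m <= 0:
        raise ValueError("baseline drift must be positive")
    return 100.0 * (baseline_drift_m - fused_drift_m) / baseline_drift_m


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps
```

As a purely illustrative case, a baseline drift of 1.00 m that falls to 0.62 m after fusion corresponds to a 38% reduction, and running at 60 FPS leaves roughly 16.7 ms per frame for the whole perception pipeline.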
| Gazebo Environment | LiDAR-Avoidance |
|---|---|
| Virtual World in Gazebo | LiDAR Obstacle Avoidance |

| Simulation Setup | YOLOv8 Detection |
|---|---|
| Integrated ROS + Gazebo Setup | YOLO Confidence Analysis (73% vs 48%) |
# Ubuntu 22.04 recommended
sudo apt install ros-humble-desktop python3-colcon-common-extensions
# 1. Clone the repository
git clone https://github.com/your-username/Autonomous-Drone-Simulator.git
cd Autonomous-Drone-Simulator
# 2. Build the ROS 2 workspace
mkdir -p ~/ardu_ws/src && cd ~/ardu_ws/src
colcon build --packages-up-to ardupilot_gz_bringup
# 3. Launch the simulation
source install/setup.bash
ros2 launch ardupilot_gz_bringup iris_maze.launch.py lidar_dim:=2

For the full step-by-step setup, see the Installation and Execution Guide.
| Document | Description |
|---|---|
| Introduction | Project background, objectives, and literature review |
| System Overview | Architecture, software stack, and design methodology |
| Execution Guide | Full installation and step-by-step execution walkthrough |
| Results & Conclusion | Performance metrics, figures, and future work |
| References | Academic and technical citations |
Capstone-Thesis-Drone-Simulator/
│
├── README.md                          # Project overview (this file)
├── LICENSE                            # MIT License
│
├── docs/                              # Technical Documentation
│   ├── Introduction.md                # Project background & objectives
│   ├── System_Overview.md             # Architecture & methodology
│   ├── Execution_Guide.md             # Installation & execution guide
│   ├── Results_and_Conclusion.md      # Results, analysis & conclusion
│   └── References.md                  # Academic citations
│
├── assets/
│   └── figures/                       # All diagrams and screenshots
│       ├── Picture1.jpg               # System architecture diagram
│       ├── Figure 3.1.1.jpg           # SITL Simulation
│       ├── Figure 3.1.2.jpg           # Gazebo + ROS
│       ├── Figure 3.1.3.jpg           # MAVProxy interface
│       ├── Figure 3.2.png             # Gazebo + YOLO
│       ├── Figure 4.1.png             # Virtual World
│       ├── Figure 4.2.svg             # Simulation Setup
│       ├── Figure 4.3.png             # LiDAR Drone
│       └── Figure 4.4.svg             # YOLO Output
│
└── src/                               # Source Code
    ├── SLAM_LIDAR_Model.cpp           # LiDAR-SLAM navigation & avoidance
    └── Yolo_Object_Detector.cpp       # YOLOv8 ROS integration
Contributions are welcome! If you have ideas to improve the simulator, add new Gazebo worlds, enhance the perception model, or extend the navigation algorithms, feel free to open an issue or submit a pull request.
# Fork the repo, then:
git checkout -b feature/your-feature-name
git commit -m "Add: your feature description"
git push origin feature/your-feature-name
# Then open a Pull Request on GitHub

Please keep your code well-commented and consistent with the existing style.
This project builds upon the outstanding work of several open-source communities and researchers:
| Project | How It Was Used |
|---|---|
| ROS (Robot Operating System) | Core middleware: publishers, subscribers, node lifecycle |
| ArduPilot | Flight controller running in SITL mode for realistic drone dynamics |
| Gazebo | Physics-based 3D simulation worlds (maze, hills, runway) |
| Darknet (pjreddie) | The underlying neural network engine used for YOLO inference |
| darknet_ros | ROS wrapper publishing bounding boxes and detection events |
| MAVROS | MAVLink bridge for sending velocity commands to ArduPilot (/mavros/setpoint_velocity/cmd_vel) |
| OpenCV + cv_bridge | Camera image decoding and conversion between ROS and OpenCV formats |
| Boost C++ Libraries | Thread management and shared-memory mutex for concurrent detection |
| Intelligent Quads (IQ_GNC) | High-level GNC helper functions (takeoff, land, set_destination, check_waypoint_reached) |
| Zhang & Singh (LOAM) | SLAM research that informed the localization and mapping approach |
| Zhang et al. (LOFF) | LiDAR and Optical Flow Fusion Odometry, the core localization algorithm |
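The mutex-guarded sharing of detection results mentioned for Boost can be illustrated with a short Python analogue. This is a sketch of the general pattern only, not a port of the project's C++ code: the `Detection` and `SharedDetections` names are hypothetical, and `threading.Lock` stands in for the Boost mutex.

```python
import threading
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Detection:
    """Hypothetical immutable detection record."""
    label: str
    confidence: float


class SharedDetections:
    """Holds the latest detections so one thread can write while another reads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: tuple[Detection, ...] = ()

    def update(self, detections: Iterable[Detection]) -> None:
        # Build the immutable snapshot outside the lock to keep the critical section short.
        snapshot = tuple(detections)
        with self._lock:
            self._latest = snapshot

    def snapshot(self) -> tuple[Detection, ...]:
        # Readers always receive a complete, immutable tuple, never a partial update.
        with self._lock:
            return self._latest
```

In this arrangement a detector callback would call `update()` on each new frame while the navigation loop calls `snapshot()`; because the stored value is an immutable tuple, the reader can use it after the lock is released.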
This project is accompanied by a research paper published on Zenodo:
Enhanced Multi-Modal UAV Perception using Large Language Models for Autonomous Disaster Reconnaissance
https://zenodo.org/records/20442636
If you use this work, please cite: Bhavya Keerthi K. (2026). Enhanced Multi-Modal UAV Perception using Large Language Models for Autonomous Disaster Reconnaissance. Zenodo. https://doi.org/10.5281/zenodo.20442636
This project is licensed under the MIT License; see the LICENSE file for details.


