This project explores the use of Reinforcement Learning (RL) algorithms for autonomous wall-following in robotics using the Webots simulation environment.
The main objective is to evaluate different RL algorithms, reward structures, and action spaces to determine the most effective approach for reliable and efficient wall-following behavior.
The project includes implementations and experiments using:
- Q-Learning
- PPO (Proximal Policy Optimization)
- TD3 (Twin Delayed Deep Deterministic Policy Gradient)
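Of the three algorithms, Q-Learning is the simplest to show in a few lines. The sketch below is a generic tabular update, not necessarily the one used in this project. The state labels (`"too_close"`, `"on_track"`), the number of actions, and the `alpha` and `gamma` values are all hypothetical.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state,
                      n_actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Value of the greedy action in the next state (off-policy target).
    best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# Hypothetical discretized states; unseen entries default to 0.0.
q = defaultdict(float)
new_value = q_learning_update(q, state="too_close", action=1, reward=1.0,
                              next_state="on_track", n_actions=3)
```

Because the table starts at zero, this first update simply moves the entry a fraction `alpha` of the way towards the observed reward.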
The experiments were conducted using an e-puck robot equipped with:
- Lidar sensors for object detection
- Collision sensors for obstacle avoidance
To run the project, download and install:
- Webots
Official website:
https://cyberbotics.com/
We strongly recommend using:
- PyCharm
Open the project folder in PyCharm after installation.
This project was developed using:
Python 3.12
Install the following dependencies:
numpy==1.26.2
gymnasium==0.29.1
stable-baselines3==2.3.2
tensorflow==2.16.1
tensorboard==2.16.2
torch==2.3.1
You can install them using:
pip install numpy gymnasium stable-baselines3 tensorflow tensorboard torch
Open the project folder in PyCharm.
Make sure Webots is properly connected to the Python environment used by the project.
Install all required Python packages listed above.
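After installing, it can help to confirm that the interpreter selected in PyCharm actually sees the pinned versions. The following is a minimal sketch using only the standard library. The `PINNED` dictionary lists a few of the pins above as an example and should be extended with the rest. The helper name `check_versions` is hypothetical and not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# Example subset of the pinned dependencies; add the remaining ones as needed.
PINNED = {
    "stable-baselines3": "2.3.2",
    "tensorboard": "2.16.2",
    "torch": "2.3.1",
}


def check_versions(pinned):
    """Return {package: (wanted, installed_or_None, matches)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in check_versions(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: wanted {wanted}, found {installed} [{status}]")
```

Run it with the same interpreter that Webots uses for the controller, since a mismatch there is a common reason for import errors.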
To test the trained model on the provided environments, simply run:
testing.py
If you want to test the model on your own maps:
The e-puck robot node must be named:
robot
Inside the Webots scene tree:
- locate the DEF field of the e-puck node
- set its value to:
DEF robot
This is required for the controller to correctly identify and interact with the robot.
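To illustrate why the DEF name matters, here is a sketch of how a Webots controller can look up a node by its DEF name through the Supervisor API (`getFromDef`). It assumes the controller runs as a supervisor, and the project's own controller may be organized differently. The helper `find_robot_node` is hypothetical.

```python
def find_robot_node(supervisor, def_name="robot"):
    """Return the scene-tree node with the given DEF name, or fail loudly."""
    node = supervisor.getFromDef(def_name)  # returns None when no such DEF exists
    if node is None:
        raise RuntimeError(
            f"No node with DEF '{def_name}' found; check the scene tree."
        )
    return node


def main():
    # The 'controller' module is only importable inside a Webots controller process.
    from controller import Supervisor

    supervisor = Supervisor()
    robot_node = find_robot_node(supervisor)
    print("Found node of type:", robot_node.getTypeName())
```

If the DEF name on a custom map differs from `robot`, the lookup returns nothing, so failing early with a clear message is easier to debug than a later attribute error.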
The project evaluates both:
- Discrete action spaces
- Continuous action spaces
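The difference between the two action spaces can be sketched as two ways of turning a policy output into wheel speeds. The example below is illustrative only. The three discrete actions, the scaling factors, and the 6.28 rad/s speed limit are assumptions and not values taken from the project.

```python
MAX_WHEEL_SPEED = 6.28  # rad/s; assumed e-puck motor limit in Webots

# Hypothetical discrete action set: index -> (left, right) fraction of max speed.
DISCRETE_ACTIONS = {
    0: (1.0, 1.0),  # drive forward
    1: (0.2, 1.0),  # turn left
    2: (1.0, 0.2),  # turn right
}


def discrete_to_wheel_speeds(action):
    """Map a discrete action index to (left, right) wheel speeds."""
    left, right = DISCRETE_ACTIONS[action]
    return left * MAX_WHEEL_SPEED, right * MAX_WHEEL_SPEED


def continuous_to_wheel_speeds(action):
    """Map a continuous action in [-1, 1]^2 to (left, right) wheel speeds."""
    # Clip first so out-of-range policy outputs never exceed the motor limit.
    left, right = (max(-1.0, min(1.0, float(a))) for a in action)
    return left * MAX_WHEEL_SPEED, right * MAX_WHEEL_SPEED
```

A discrete policy can only pick from a handful of fixed speed pairs, while a continuous policy can output any intermediate pair.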
Different reward strategies were tested to analyze their impact on:
- navigation smoothness
- collision avoidance
- wall-following accuracy
- learning efficiency
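As a rough illustration of what such a reward strategy can look like, the sketch below combines a collision penalty with a term that rewards moving forward while staying near a target distance from the wall. It is not one of the reward functions evaluated in the project. The function name, the target distance, and the penalty value are hypothetical.

```python
def wall_following_reward(wall_distance, collided, forward_speed,
                          target_distance=0.1, collision_penalty=-10.0):
    """Hypothetical per-step reward for wall-following.

    wall_distance: measured distance to the wall, in metres
    collided: True if a collision was detected this step
    forward_speed: forward speed normalized to [0, 1]
    """
    if collided:
        return collision_penalty
    # 1.0 when exactly at the target distance, falling linearly to 0.0.
    distance_error = abs(wall_distance - target_distance)
    tracking_term = max(0.0, 1.0 - distance_error / target_distance)
    # Scale by forward speed so standing still next to the wall earns nothing.
    return tracking_term * max(0.0, forward_speed)
```

A smoothness term, for example a penalty on the change in wheel speeds between steps, could be added in the same style.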
Experimental results showed that:
- PPO achieved the best overall performance
- well-designed reward functions significantly improve learning quality
- continuous action spaces generally produce smoother robot behavior
- the combination of algorithm, reward structure, and action space strongly affects performance
Wall-following is a fundamental robotics task that supports more advanced problems such as:
- autonomous navigation
- mapping
- localization
This project investigates how reinforcement learning can be used to solve the wall-following problem efficiently and reliably.
The developed system demonstrates that reinforcement learning techniques — particularly PPO with optimized rewards — can successfully guide autonomous robots in simulated environments with smooth and stable behavior.
- Ana Batista
- Gonçalo Monteiro
- Mafalda Aires
- FEUP — Faculty of Engineering, University of Porto
- FCUP — Faculty of Sciences, University of Porto