Real-time object detection using a compact ESP32-CAM and the CircuitDigest Cloud API: no model training, no dataset creation, just plug and detect!
- About the Project
- How It Works
- Components Required
- Circuit Diagram
- Step-by-Step Setup
- Code Overview
- Output
- Troubleshooting
- Advantages & Limitations
- FAQ
- Relevant Links
Object detection is a computer vision technique that identifies and locates real-world objects (such as cell phones, remotes, laptops, cars, cups, and books) by drawing a bounding box around each one and reporting a confidence score that indicates how certain the model is about the detection.
This project implements a low-cost, cloud-powered object detection system using an ESP32-CAM module. Unlike traditional systems that require high-end hardware and complex ML model training, this project leverages the CircuitDigest Cloud API to handle all the heavy processing in the cloud.
- No dataset creation, labelling, or model training required
- Supports 75+ object classes (people, cars, animals, electronics, etc.)
- Compact and portable: just an ESP32-CAM, a push button, and a laptop
- Results displayed on the Serial Monitor in real time
- Adjustable confidence threshold and object class selection
[ Push Button Pressed ]
↓
[ ESP32-CAM Captures Image ]
↓
[ Image Sent to CircuitDigest Cloud via HTTPS API ]
↓
[ Cloud Processes & Runs Object Detection ]
↓
[ JSON Response: Object Names + Count + Confidence ]
↓
[ Results Printed on Serial Monitor ]
- The user presses the push button connected to GPIO13.
- The ESP32-CAM captures a JPEG image.
- The image is sent to the CircuitDigest Cloud via an HTTPS POST request.
- The cloud processes the image and returns detected object names, counts, and confidence values.
- Results are printed on the Arduino Serial Monitor.
Tip: Ensure adequate lighting when capturing images. Better lighting generally gives higher detection accuracy.
| S.No | Component | Purpose |
|---|---|---|
| 1 | ESP32-CAM (AI Thinker) | Microcontroller + Camera Module |
| 2 | Push Button | Triggers image capture |
| 3 | Breadboard | Simplifies circuit connections |
| 4 | USB-to-Serial (FTDI) Adapter (if no onboard USB) | Programming the ESP32-CAM |
| 5 | USB Cable | Powers the system via laptop |
Important: If you are using the standard ESP32-CAM (without an onboard USB port), you will need a USB-to-Serial (FTDI) adapter for programming.
Connect: FTDI TX → ESP32-CAM RX (U0R), FTDI RX → ESP32-CAM TX (U0T), GND → GND
Hold GPIO0 LOW (connected to GND) while resetting or powering up the board to enter flash mode, then release it after the upload so the sketch can run.
The push button is connected to GPIO13 of the ESP32-CAM to trigger photo capture.
ESP32-CAM
GPIO13 ──── Push Button ──── GND
5V/3.3V ─── VCC
GND ──────── GND
Refer to the circuit diagram image in the /assets folder for a visual reference.
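A mechanical push button wired between GPIO13 and GND can bounce, so a single press may read as several LOW/HIGH transitions. The block below is a minimal, hardware-independent sketch of one way to debounce it. The `Debouncer` class, its `update` method, and the 50 ms window are illustrative assumptions and are not part of the original sketch.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original sketch): reports a press only
// after the raw reading has stayed unchanged for `stableMs` milliseconds.
class Debouncer {
public:
    explicit Debouncer(uint32_t stableMs = 50) : stableMs_(stableMs) {}

    // rawPressed: true when the pin reads LOW (button to GND with a pull-up).
    // nowMs: current time in milliseconds, e.g. the value of millis().
    // Returns true exactly once per debounced press.
    bool update(bool rawPressed, uint32_t nowMs) {
        if (rawPressed != lastRaw_) {   // reading changed: restart the timer
            lastRaw_ = rawPressed;
            lastChangeMs_ = nowMs;
        }
        if (nowMs - lastChangeMs_ >= stableMs_ && rawPressed != stable_) {
            stable_ = rawPressed;       // accept the new stable state
            return stable_;             // fire on press, stay silent on release
        }
        return false;
    }

private:
    uint32_t stableMs_;
    uint32_t lastChangeMs_ = 0;
    bool lastRaw_ = false;
    bool stable_ = false;
};
```

On the ESP32-CAM, this could be called from `loop()` as `debouncer.update(digitalRead(TRIGGER_BTN) == LOW, millis())`, capturing an image only when it returns true.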
- Go to CircuitDigest Cloud
- Sign up or log in to your account
- Scroll down and click on the Object Detection feature
- In the Try API section, set the desired confidence level (minimum probability threshold)
- A higher confidence value = more accurate but may miss some objects
- A lower value = detects more objects but with less certainty
- Choose from 75+ available classes (e.g., person, car, dog, laptop)
- You can select all classes or only specific ones
- Upload any test image containing objects from your selected classes
- Click Run Test to see detection results instantly
- Scroll to the microcontroller selection section
- Select ESP32-CAM
- Copy the generated code
- Open Arduino IDE
- Paste the copied code
- Update the following values in the code:
const char* ssid = "YourWiFiSSID";
const char* password = "YourWiFiPassword";
const char* apiKey = "YourAPIKey";
const char* classes = "[]";      // Leave empty for all classes
const char* confidence = "0.2";  // Adjust as needed
- Select AI Thinker ESP32-CAM as the board in Arduino IDE
- Upload the sketch
- Connect the push button and open Serial Monitor at 115200 baud
- Point the camera at objects
- Press the push button
- View results in the Serial Monitor
#include <Arduino.h>
#include <WiFi.h>
#include <WiFiClientSecure.h>
#include "esp_camera.h"
const char* ssid = "YourSSID";
const char* password = "YourPassword";
const char* serverName = "www.circuitdigest.cloud";
const char* serverPath = "/api/v1/object-detection/detect";
const int serverPort = 443;
const char* apiKey = "YourAPIKey";
const char* classes = "[]"; // Empty = detect all classes
const char* confidence = "0.2";
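#define TRIGGER_BTN 13

void initCamera() {
  camera_config_t config = {};  // Zero-initialize so unset fields are not garbage
  // Pin assignments for the AI Thinker board are omitted in this excerpt
  config.pixel_format = PIXFORMAT_JPEG;
  config.frame_size = FRAMESIZE_VGA;
  if (esp_camera_init(&config) != ESP_OK) {
    Serial.println("Camera initialization failed");
  }
}
Adjust frame_size and quality settings based on your memory and accuracy needs.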
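String sendImageToAPI(camera_fb_t* fb) {
  WiFiClientSecure client;
  client.setInsecure();  // Assumption: skips certificate validation (demo only)
  if (!client.connect(serverName, serverPort)) return "Connection failed";
  client.println(String("POST ") + serverPath + " HTTP/1.1");
  // Request headers and body framing are omitted in this excerpt; use the generated sketch
  client.write(fb->buf, fb->len);
  return client.readString();
}
This function performs an HTTPS POST request, uploading the JPEG frame buffer to the CircuitDigest Cloud.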
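To make the shape of such an upload more concrete, here is a minimal sketch of how the header section of a raw HTTP/1.1 POST might be assembled before the image bytes are written to the socket. `buildPostHead` is a hypothetical helper, and the header set shown is a generic assumption. The actual CircuitDigest endpoint may expect additional headers (for example for the API key) or a multipart body, so the code generated on the CircuitDigest Cloud page remains the reference.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds the header section of a raw HTTP/1.1 POST.
// The body bytes (e.g. the JPEG buffer) are sent right after this string.
std::string buildPostHead(const std::string& host,
                          const std::string& path,
                          const std::string& contentType,
                          std::size_t bodyLen) {
    std::string head;
    head += "POST " + path + " HTTP/1.1\r\n";
    head += "Host: " + host + "\r\n";
    head += "Content-Type: " + contentType + "\r\n";
    // Content-Length tells the server how many body bytes follow the headers
    head += "Content-Length: " + std::to_string(bodyLen) + "\r\n";
    head += "Connection: close\r\n";
    head += "\r\n";  // an empty line ends the header section
    return head;
}
```

The key design point is that the header section must end with an empty line and that `Content-Length` must match the number of bytes written afterwards, otherwise the server cannot tell where the image ends.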
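void setup() {
  Serial.begin(115200);
  pinMode(TRIGGER_BTN, INPUT_PULLUP);  // Button pulls the pin LOW when pressed
  initCamera();
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) delay(500);  // Wait for the connection
}

void loop() {
  if (digitalRead(TRIGGER_BTN) == LOW) {
    camera_fb_t* fb = esp_camera_fb_get();
    if (!fb) {
      Serial.println("Camera capture failed");
      return;
    }
    String result = sendImageToAPI(fb);
    esp_camera_fb_return(fb);  // Release the frame buffer for the next capture
    Serial.println(result);
    delay(500);  // Crude guard against repeated triggers from one press
  }
}
When the push button is pressed, the Serial Monitor displays results like: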
Detected Objects: 3
1. laptop - Confidence: 91%
2. cell phone - Confidence: 87%
3. mouse - Confidence: 78%
You can also view detection logs and API usage (daily/monthly) directly on the CircuitDigest Cloud dashboard.
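The result lines shown above follow a simple pattern: a count, then one numbered line per object with its confidence as a percentage. As a rough sketch of how such a report could be produced from already-parsed detections, consider the block below. The `Detection` struct and `formatReport` function are illustrative assumptions, and the JSON parsing step is left out because the exact response schema is not shown here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for one parsed detection (confidence in the 0..1 range).
struct Detection {
    std::string label;
    double confidence;
};

// Builds a report in the same style as the Serial Monitor output above.
std::string formatReport(const std::vector<Detection>& dets) {
    std::string out = "Detected Objects: " + std::to_string(dets.size()) + "\n";
    for (std::size_t i = 0; i < dets.size(); ++i) {
        // Round to the nearest whole percent
        int pct = static_cast<int>(dets[i].confidence * 100.0 + 0.5);
        out += std::to_string(i + 1) + ". " + dets[i].label +
               " - Confidence: " + std::to_string(pct) + "%\n";
    }
    return out;
}
```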
| Issue | Cause | Solution |
|---|---|---|
| Camera capture failed | Insufficient PSRAM / memory | Reduce frame size, lower JPEG quality, enable PSRAM in board settings |
| ESP32-CAM keeps restarting | Insufficient USB power | Use an external 5V power supply; check connections |
| Blurry or unclear images | Focus or lighting issues | Manually adjust the lens; improve lighting; tune brightness/contrast |
| Camera initialization failed | Wrong pin config or board selection | Select AI Thinker ESP32-CAM in Arduino IDE; verify GPIO pin mapping |
| No detection / wrong results | Poor image quality or wrong confidence level | Improve lighting, adjust confidence threshold, reposition the camera |
| # | Advantage |
|---|---|
| 1 | Near-real-time object detection, with results within a few seconds |
| 2 | Low-cost system using ESP32-CAM with built-in camera and Wi-Fi |
| 3 | Supports detection of multiple object types (people, cars, animals, etc.) |
| 4 | Easy to change object classes and confidence settings |
| 5 | Small size and portable design |
| # | Limitation |
|---|---|
| 1 | Cannot work without cloud API access |
| 2 | Requires an active internet connection |
| 3 | Blurry or low-quality images may produce incorrect results |
| 4 | Subject to daily/monthly API usage limits |
| 5 | Captures single images, not continuous live video |
Q1. What happens if the API limit is exceeded?
The server returns a "limit exceeded" response, and no detections will occur until the limit resets or a new API key is used.
Q2. Can we detect specific objects only?
Yes! Modify the classes parameter in the API request to target only selected object types.
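As an illustration, the sketch stores `classes` as a string, with `"[]"` meaning all classes. The helper below shows one way that string might be built from a list of names. `buildClassesParam` is a hypothetical function, and the JSON-array-of-quoted-names format is an assumption inferred from the `"[]"` default. Check the code generated on the CircuitDigest Cloud page for the exact format.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turns {"person", "car"} into ["person","car"].
// An empty list yields "[]", the value the sketch uses for all classes.
// Assumes class names contain no characters that need JSON escaping.
std::string buildClassesParam(const std::vector<std::string>& names) {
    std::string out = "[";
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i > 0) out += ",";
        out += "\"" + names[i] + "\"";
    }
    out += "]";
    return out;
}
```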
Q3. How can detection accuracy be improved?
- Ensure good lighting conditions
- Adjust camera brightness and contrast settings
- Use an appropriate confidence threshold
- Position the camera properly and close to the subject
Q4. What is the role of the confidence parameter?
It sets the minimum probability threshold. Higher values = more precise but may miss objects. Lower values = detects more but with less certainty.
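The effect of the threshold can be illustrated with a small sketch. In this project the filtering happens in the cloud, so the code below only models the behaviour locally. `Detection` and `filterByConfidence` are hypothetical names, and the sample values used with it are made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical container for one detection (confidence in the 0..1 range).
struct Detection {
    std::string label;
    double confidence;
};

// Keeps only detections whose confidence is at or above the threshold.
std::vector<Detection> filterByConfidence(const std::vector<Detection>& all,
                                          double minConfidence) {
    std::vector<Detection> kept;
    for (const Detection& d : all) {
        if (d.confidence >= minConfidence) {
            kept.push_back(d);
        }
    }
    return kept;
}
```

With candidate detections at 0.91, 0.35, and 0.15, a threshold of 0.2 keeps the first two, while a threshold of 0.9 keeps only the first.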
Q5. Why use cloud processing instead of on-device?
The ESP32-CAM has limited processing power. Complex object detection algorithms require significant computation, which is efficiently handled by cloud APIs.
Q6. Can the system work offline?
No. The system relies entirely on cloud-based processing. Internet connectivity is required.
- License Plate Recognition Using ESP32-CAM
- DIY Smart WiFi Video Doorbell Using ESP32
- ESP32-CAM Face Mask Detection
- GitHub Repository
- CircuitDigest Cloud
Made with ❤️ using ESP32-CAM & CircuitDigest Cloud