Help Needed: Training Bottleneck for CropBalanceAI Model #1340
Replies: 1 comment
Hi Raunak,
You’ve already diagnosed the core issue correctly: CPU training on a MacBook Air will be a major bottleneck for a dataset of that size. A few practical suggestions to unblock you:
1. Move to GPU (strongly recommended)
Basic steps:
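A minimal sketch of a first step on Colab or Kaggle, assuming the training script uses PyTorch as in the MPS snippet in this reply. `pick_device` is a hypothetical helper, not something taken from disease_trainer.py; it prefers a CUDA GPU, then falls back to MPS, then to the CPU, so the same script could run unchanged in the notebook and on the Mac.

```python
import torch


def pick_device():
    """Return the fastest available device: CUDA, then MPS, then CPU."""
    if torch.cuda.is_available():
        # Colab/Kaggle GPU runtimes expose an NVIDIA GPU through CUDA.
        return torch.device("cuda")
    # torch.backends.mps is missing on older PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
print(f"Training will run on: {device}")
```

If this prints `cpu` inside a notebook, the runtime was probably started without a GPU accelerator enabled in its settings.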
2. Using MPS on Mac (if staying local)
    import torch

    # Use the Mac's GPU through MPS when available, otherwise fall back to the CPU.
    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
    model.to(device)
Also ensure:
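A minimal sketch of the device setup above in use, assuming a hypothetical tiny model and random tensors in place of the real disease_model.py and PlantVillage batches. The point it illustrates is that each batch has to be moved to the same device as the model before the forward pass; otherwise PyTorch raises a device-mismatch error.

```python
import torch
from torch import nn

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Hypothetical stand-in for the real classifier (10 output classes chosen arbitrarily).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

# Random tensors standing in for one batch of images and labels.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

# Move the batch to the model's device, then run one training step.
images, labels = images.to(device), labels.to(device)
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()

print(f"device={device}, loss={loss.item():.4f}")
```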
3. Speed up DataLoader
Try:
    from torch.utils.data import DataLoader

    DataLoader(
        dataset,
        batch_size=32,
        shuffle=True,
        num_workers=4,    # experiment (2–8 depending on CPU)
        pin_memory=True,
    )
Additional improvements:
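To check whether data loading is really the slow part, one option is to time the loader on its own for a few `num_workers` values. A minimal sketch, assuming a synthetic in-memory dataset stands in for PlantVillage; `make_fake_dataset` and `time_loader` are hypothetical helpers, and the image size and class count are placeholders rather than the project's values.

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset


def make_fake_dataset(n=256, image_size=64, num_classes=10):
    """Build a small random dataset (placeholder for the real image folder)."""
    images = torch.randn(n, 3, image_size, image_size)
    labels = torch.randint(0, num_classes, (n,))
    return TensorDataset(images, labels)


def time_loader(dataset, num_workers, batch_size=32):
    """Iterate once over the dataset; return (batch count, elapsed seconds)."""
    loader = DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        # Pinned memory only speeds up host-to-GPU copies on CUDA.
        pin_memory=torch.cuda.is_available(),
        # Keeping workers alive between epochs requires at least one worker.
        persistent_workers=num_workers > 0,
    )
    start = time.perf_counter()
    n_batches = 0
    for _images, _labels in loader:
        n_batches += 1
    return n_batches, time.perf_counter() - start
```

Calling `time_loader(make_fake_dataset(), n)` for `n` in `(0, 2, 4)` gives a rough comparison. On macOS, worker processes are started with `spawn`, so calls with `num_workers > 0` should sit under an `if __name__ == "__main__":` guard in a script. With the real dataset the numbers will also reflect disk reads and image decoding, which random tensors do not capture.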
4. Training tweaks
Bottom line
If you share your …
Hope this helps!
Hi Daniel,
I’m currently working on the disease classification model for the CropBalanceAI project using the PlantVillage dataset (~54k images). I’ve successfully set up the training pipeline, but I’ve hit a significant hardware bottleneck.
The Current Situation:
Hardware: MacBook Air (running locally).
Performance: It’s currently taking approximately 2.5 hours per epoch (averaging ~32s/it).
Issue: The training is running on the CPU, and with 10–20 epochs planned, it's not feasible to complete the training locally.
What I need help with:
Cloud Migration: Can you guide me on the best way to move this specific structure (disease_trainer.py and the data modules) to a GPU-accelerated environment like Google Colab or Kaggle Kernels?
MPS Optimization: If I continue locally, how can I properly implement Apple’s Metal Performance Shaders (MPS) in my disease_model.py to utilize the Mac's GPU?
Data Loading: I suspect my DataLoader might be slow due to the high image count. Should I be using a different approach for local disk I/O?
I’ve attached a screenshot of my terminal showing the current training logs and project structure. Looking forward to your advice on how to speed this up!
Best regards,
Raunak Pratap Singh