Learn Retrieval-Augmented Generation, vector search, embeddings, AI agents, function calling, evaluation, monitoring, hybrid search, reranking, and more - all in a free, open-source, hands-on course by DataTalks.Club.
Star this repo to stay updated with new modules and cohort announcements.
| Resource | Link |
|---|---|
| Course materials | GitHub repository |
| Video lectures | YouTube playlist |
| Cohort schedule & deadlines | courses.datatalks.club |
| Slack community | #course-llm-zoomcamp |
| Announcements | Telegram |
| 2025 cohort projects | courses.datatalks.club/llm-zoomcamp-2025/projects |
LLM Zoomcamp teaches you how to build practical, production-ready LLM applications step by step.
This course is for people who learn by doing. After completing it, you'll have a working codebase and the hands-on experience to build your own LLM-powered applications.
- Software Engineers: Add LLMs, RAG, and modern search capabilities to real products
- Data Engineers: Understand how vector search, hybrid search, and retrieval pipelines fit into production systems
- ML Practitioners: Get a structured way to evaluate and monitor LLM-based applications
- Python: You can write code confidently
- Command Line: Comfortable with the terminal
- Docker: Basic familiarity
- ML / LLMs: Not required
- Hardware: Any laptop or PC. No GPU needed
- Expenses: ~$1-5 in API credits
Note: If you can write a Python function and have heard of ChatGPT, you have enough to get started.
There are two ways to follow the course: live and self-paced.
| | Live Cohort | Self-Paced |
|---|---|---|
| Start | June 8, 2026, 17:00 CET | Anytime |
| Lectures | Pre-recorded | Pre-recorded |
| Homework | Graded | Available but not scored |
| Leaderboard | Yes | No |
| Peer Review | Yes | No |
| Certificate | Yes | No |
| Cost | Free | Free |
| Register | Sign up here | Just start learning! |
Important: "Live cohort" does not mean live classes. All lectures are pre-recorded. "Live" means working alongside others, having deadlines, getting your homework and project scored, reviewing your peers' projects, and getting a certificate at the end.
Self-paced steps:
- Follow the materials on GitHub
- Ask questions and share progress in Slack
- Do the homework (self-checked) and build a project for your portfolio
- 1. Introduction to LLMs & RAG. Build a basic RAG pipeline with text search
- 2. Vector Search. Index and retrieve documents using semantic embeddings
- 3. Agents. Add autonomous tool use and function calling to RAG
- Workshop - Data Ingestion. Ingest data with dlt from external sources into your RAG system
- 4. Evaluation. Measure retrieval and answer quality with offline and online eval
- 5. Monitoring. Monitor user feedback and system health with live dashboards
- 6. Best Practices. LangChain, hybrid search. Combine vector + keyword search; rerank results for higher precision
- 7. End-to-End Project. A complete project example: a fitness assistant built with LLMs
- Capstone Project. Ship a complete end-to-end project of your choice from scratch
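As a rough illustration of the "combine vector + keyword search" idea from module 6, the sketch below merges two ranked result lists with reciprocal rank fusion (RRF). RRF is one common way to fuse rankings and is an assumption here, not necessarily the method the course teaches. The document IDs and the `k=60` constant are likewise illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever still appear further down. A separate reranking model could then reorder the top few fused results.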
Recommended approach:
- Watch the video for each module
- Complete the homework to reinforce the concepts
- Build your capstone project applying everything end-to-end
The capstone is your chance to apply everything end-to-end. You'll build a complete, working RAG application that you own from start to finish.
What you'll build:
- A searchable knowledge base. Choose a dataset, ingest, clean, and store it for retrieval
- A retrieval pipeline. Implement the full RAG flow: retrieve context, assemble prompts, call an LLM, return grounded answers
- An evaluation process. Measure how well your system retrieves and answers using search metrics or LLM-as-a-Judge
- A user-facing interface. A simple UI or API (Streamlit, FastAPI, or similar) so others can try your app
- Monitoring & feedback loops. Track queries, feedback, and performance over time
- Fitness & nutrition assistant
- Study companion for textbooks or course notes
- Medical FAQ assistant
- Codebase Q&A bot
- News summarization and retrieval tool
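The retrieval pipeline listed under "What you'll build" (retrieve context, assemble a prompt, call an LLM, return a grounded answer) can be sketched in a few functions. This is a minimal sketch under stated assumptions: the word-overlap retriever, the prompt wording, the sample documents, and `fake_llm` are all hypothetical stand-ins, and a real project would swap in a proper search index and an actual LLM API call.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Toy keyword search: rank documents by words shared with the query."""
    query_words = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_words & tokenize(doc["text"]))
        if overlap > 0:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, context_docs):
    """Assemble the retrieved documents and the question into one prompt."""
    context = "\n".join(f"- {doc['text']}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def fake_llm(prompt):
    """Hypothetical stand-in for a real LLM API call."""
    return f"(model answer based on a {len(prompt)}-character prompt)"


def rag(query, documents, llm=fake_llm):
    """Retrieve context, build the prompt, and ask the model."""
    context_docs = retrieve(query, documents)
    prompt = build_prompt(query, context_docs)
    return llm(prompt)


documents = [
    {"id": 1, "text": "Homework deadlines are fixed for the live cohort."},
    {"id": 2, "text": "Certificates require a final project and three peer reviews."},
    {"id": 3, "text": "No GPU is needed to follow the course."},
]

print(rag("Is a GPU needed?", documents))
```

Passing the model in as the `llm` argument keeps the pipeline testable without API credits: the same `rag` function works with a stub during development and with a real client later.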
Note: See the full capstone project guidelines and browse all 2025 and 2024 cohort submissions for inspiration.
To earn your certificate:
- Complete the final project. Build a real-world RAG application demonstrating all course concepts
- Peer review 3 projects. Evaluate and provide written feedback on three fellow students' submissions
- Meet the deadlines. Submit your project and reviews within the cohort schedule
Certificates are issued after all peer reviews are completed. Self-paced learners are not eligible for certification but can build portfolio projects freely.
| Instructor | Role | About |
|---|---|---|
| Alexey Grigorev | Founder, DataTalks.Club | Founder of DataTalks.Club and creator of multiple open-source ML courses reaching tens of thousands of learners worldwide. Former principal data scientist with deep expertise in ML systems and engineering. |
| Timur Kamaliev | Senior Data Scientist | AI Engineer specializing in building production LLM systems, RAG pipelines, and agentic applications. Hands-on practitioner with real-world experience shipping GenAI products. |
A huge thanks to our sponsors for making this course possible!
Tip: Interested in supporting the DataTalks.Club community? Reach out to alexey@datatalks.club.
"This course gave me hands-on experience in building LLM-powered applications, including prompt engineering, retrieval-augmented generation (RAG), pipeline orchestration, and vector search optimization."
- Alexander Daniel Rios, LLM Zoomcamp Graduate
"Not gonna lie - this course took longer than planned. By the end, I was running on fumes, forcing myself to push through the final modules. But I made it. What I loved: hands-on experience building real AI systems (not just theory!), deep dives into RAG, vector databases, evaluation, and monitoring, and the wealth of production-ready practices that matter in enterprise environments."
- Vasiliy Chernykh, LLM Zoomcamp Graduate
Read more testimonials from past graduates
Join the #course-llm-zoomcamp channel on DataTalks.Club Slack for discussions, troubleshooting, and networking with fellow learners and the course team.
To keep discussions useful for everyone:
- Follow our posting guidelines when asking questions
- Review the community guidelines
We actively encourage sharing your progress online throughout the course. Post what you're building on LinkedIn, Twitter/X, or a blog. It helps you get noticed and connect with others in the field. It also earns you bonus points toward your homework and project scores.
Full FAQ: datatalks.club/faq/llm-zoomcamp.html
Q: Is this course really free?
A: Yes. All videos, materials, and homework are free. You may spend $1-5 in OpenAI API credits if you run the code yourself.
Q: Do I need a GPU?
A: No. All exercises are designed to run on a standard laptop using cloud APIs.
Q: What does "live cohort" mean? Are there live classes?
A: No mandatory live classes. "Live" means homework deadlines, automatic scoring, a leaderboard, peer review, and certificate eligibility are all enabled. All lectures are pre-recorded.
Q: Can I join after the cohort has started?
A: Yes. You can join after the start date, but deadlines remain fixed. Some homework forms may already be closed.
Q: Can I join mid-cohort or follow the course self-paced?
A: Yes. All materials stay available after each cohort ends. Self-paced learners are always welcome, though certificates require a live cohort.
Q: Will I get a certificate?
A: Yes. Complete the final project and peer review 3 students' projects during the live cohort to earn your certificate. Self-paced mode does not include certification.
Q: Do I need to complete every homework to get a certificate?
A: No. You only need to complete the final project and peer reviews to get it.
Q: What if I get stuck?
A: Discuss your problem in #course-llm-zoomcamp on Slack. The community and instructors are active there. Also check the FAQ page for detailed answers.
Q: How much time should I expect to spend?
A: Expect roughly 5-10 hours per week, depending on your background and how deep you go into the materials.
Found a bug in the course materials? Know how to improve an explanation or fix broken code? Contributions are welcome and appreciated.
- Fork the repository
- Make your fix or improvement
- Open a pull request with a clear description
Every contribution helps future learners. Thank you!
DataTalks.Club is a global online community of data enthusiasts: a place to learn, share knowledge, ask questions, and support each other through free courses, events, and an active Slack community.
Website • Slack • Newsletter • Events • Google Calendar • YouTube • GitHub • LinkedIn • Twitter
Note: Most activity happens on Slack. Join us there for updates, discussions, and community events. Learn more at DataTalksClub Community Navigation.

