Data Engineering Zoomcamp
@dezoomcamp
20.05.2026 07:15
The Data Engineering Zoomcamp 2026 has officially wrapped up!

You had 3 chances to submit your final project and review 3 peers.

Congratulations to the 581 participants who successfully completed the course and earned their certificates!

If you're one of the graduates, you can find your well-deserved certificate in your course enrollment profile. We encourage you to celebrate this milestone by sharing it on LinkedIn!

Our thanks go out to everyone who joined us on this journey. We understand that not everyone was able to finish the course this time, and that’s perfectly fine. You’ll have another opportunity to complete it in a new cohort next year.

Thank you all for contributing to the learning experience of this course and the DataTalks.Club community overall. We appreciate your hard work!
Data Engineering Zoomcamp
Forwarded from channel
11.05.2026 07:23
I'm starting a workshop series to update the content of LLM Zoomcamp.

Each workshop will update the content of one course module:

May 11: Build Your First RAG Application
May 18: From RAG to AI Agents: Function Calling and Tool Use
May 21: Vector Databases
May 27: RAG and Agents Evaluation
June 3: Monitoring LLM Applications
Data Engineering Zoomcamp
@dezoomcamp
11.05.2026 07:23
Now that Data Engineering Zoomcamp is almost over, I'm focusing on updating LLM Zoomcamp - the course about using GenAI, RAG and Agents.

We will run multiple workshops before the course start date, and the recordings will be used as the course videos when it starts.

The first one is today. Join me if you're interested in GenAI!
Data Engineering Zoomcamp
@dezoomcamp
02.05.2026 15:05
Judging by the questions in Slack, I realize that our documentation may not be complete or straightforward

I reorganized it a bit and here it is

https://datatalks.club/docs/courses/zoomcamp-logistics/

I hope you will find the answers to your questions (especially around projects and peer reviews)
DataTalks.Club Documentation
Zoomcamp Logistics
How DataTalks.Club zoomcamps work - schedule, joining, modules, projects, peer review, certification.
Data Engineering Zoomcamp
@dezoomcamp
30.04.2026 19:30
Some of you forgot to submit the project, and some of you forgot to do peer reviews. I see from Telegram and Slack that there are quite a few of you in this situation.

For those who already have a project but haven't submitted it yet, we have opened a third attempt.

It has a very strict deadline, so you won't have time to develop a new project. You can only submit the existing one.

https://courses.datatalks.club/de-zoomcamp-2026/project/project3

Good luck!
Data Engineering Zoomcamp
@dezoomcamp
30.04.2026 11:58
The certificates are ready! You'll find them in your enrollment profile for the course

Don't forget to share them on LinkedIn!

Thanks everyone for being a part of this course!
Data Engineering Zoomcamp
@dezoomcamp
11.04.2026 07:18
We have just scored project attempt 1

Congratulations to the 269 course participants who passed it! We will release the certificates for you later along with the second batch

If you haven't passed it, you can improve it and submit one more time

Note that if you did pass the project, submitting it again (even with improvements) is considered self-plagiarism. Please don't do it. But you can submit another project if you want

Also if you haven't made a submission for attempt 1, attempt 2 is still open:

https://courses.datatalks.club/de-zoomcamp-2026/project/project2

Have fun building!
Data Engineering Zoomcamp
@dezoomcamp
30.03.2026 18:12
A quick update: the Bruin project competition is now open to everyone!

This means your submission does not have to be the same as your Zoomcamp final project.

Even if you are not participating in the course, or if you are using other tech for your Zoomcamp final project, you can still build a separate project with Bruin, submit it to the competition, and compete for prizes.

Prizes:

🔸 Mac Mini for an outstanding project
🔸 1 year Claude Pro for the top 3 projects
🔸 1 month Claude Pro for participants

Deadline: Monday, June 1st, 12:00 UTC

More details here: getbruin.com/competition
Bruin
Data Engineering Project Competition | Bruin
Build data pipelines with Bruin and compete for prizes.
Data Engineering Zoomcamp
@dezoomcamp
16.03.2026 09:33
Build your project using Bruin for ingestion, transformation, orchestration, and analysis, share it with the community, and compete for prizes.

Prizes:

🔸 Mac Mini for an outstanding project
🔸 1 year Claude Pro for the top 3 projects
🔸 1 month Claude Pro for participants

To participate:

🔸 Build your Zoomcamp project using Bruin
🔸 Publish it on GitHub with a README
🔸 Share it in #projects on Slack

Winners will be determined by community votes on Slack.

Learn more here: https://getbruin.com/zoomcamp-project/
Data Engineering Zoomcamp
@dezoomcamp
16.03.2026 09:33
It’s time to apply everything from the course.

Build your final project with a complete end-to-end data pipeline.

It takes you through the full workflow:

🔸 Choose a dataset you’re interested in
🔸 Build a pipeline to ingest data into a data lake
🔸 Move the data to a data warehouse
🔸 Transform the data to prepare it for analysis
🔸 Build a dashboard to visualize the results

🎥 Watch the Projects how-to video for the full walkthrough and start building.

🏆 Tip: Use Bruin in your project to participate in the competition and win prizes. Details below.
Data Engineering Zoomcamp
@dezoomcamp
14.03.2026 05:36
Data Engineering Zoomcamp
@dezoomcamp
09.03.2026 12:11
We're starting module 7 on stream processing.

Reminder: the previous homework deadline is in less than 24 hours.

The materials were created by Zach Wilson, who ran a Flink stream for the course last year. Alexey recorded an updated Apache Flink workshop to reflect support for Flink 2.x and modern Python versions (3.12, 3.9, 3.8).

It covers:

• Streaming fundamentals in Data Engineering Zoomcamp
• Kafka/Redpanda, Python producers and consumers
• Writing streaming events to PostgreSQL
• Apache Flink setup with Docker
• Flink jobs for stream processing
• Windowing, watermarks, and late events
• Real-time aggregations with Flink
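
To give a feel for the windowing, watermark, and late-event topics listed above, here is a minimal sketch in plain Python. It is not PyFlink code and not taken from the workshop: the `Event` class, the `tumbling_window_counts` function, the 300-second window, and the 60-second allowed lateness are all hypothetical choices made for illustration. The watermark here is simply the largest event time seen so far minus the allowed lateness, and an event is treated as late when its window has already closed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    key: str         # hypothetical grouping key, e.g. a pickup zone
    event_time: int  # event timestamp in seconds


def tumbling_window_counts(events, window_size=300, allowed_lateness=60):
    """Count events per (window_start, key), processing them in arrival order.

    Returns (counts, late_events). An event is late when the end of its
    window is at or before the current watermark.
    """
    open_windows = defaultdict(int)  # (window_start, key) -> running count
    closed_windows = {}              # finalized counts
    late_events = []
    max_event_time = None

    for ev in events:
        # Advance the watermark: largest event time seen minus allowed lateness.
        if max_event_time is None or ev.event_time > max_event_time:
            max_event_time = ev.event_time
        watermark = max_event_time - allowed_lateness

        window_start = ev.event_time - ev.event_time % window_size
        if window_start + window_size <= watermark:
            # The window for this event has already been closed.
            late_events.append(ev)
            continue

        open_windows[(window_start, ev.key)] += 1

        # Close every window whose end the watermark has passed.
        for wk in [k for k in open_windows if k[0] + window_size <= watermark]:
            closed_windows[wk] = open_windows.pop(wk)

    # End of stream: flush whatever is still open.
    closed_windows.update(open_windows)
    return closed_windows, late_events


events = [
    Event("A", 10),
    Event("A", 290),
    Event("B", 310),
    Event("A", 280),  # out of order, but its window is still open
    Event("A", 400),  # pushes the watermark past the first window
    Event("A", 100),  # arrives after its window closed, so it is late
]
counts, late = tumbling_window_counts(events)
print(counts)
print(late)
```

A real Flink job would declare the watermark strategy and the window in the job definition and let the engine manage state; the sketch only shows the bookkeeping that idea implies.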

Homework deadline: 17 March, 12 AM CET

Learn here: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/07-streaming
GitHub
data-engineering-zoomcamp/07-streaming at main · DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼 - DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp
@dezoomcamp
04.03.2026 10:59
We're starting our stream about streaming!

This is going to be a part of module 7 about streaming - a reworked version of last year's workshop with the latest versions of PyFlink

Stream: https://www.youtube.com/watch?v=YDUgFeHQzJU
Workshop: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/07-streaming/workshop

Watch now or catch the recording later!
YouTube
PyFlink Stream Processing Tutorial: Build a Real-Time Pipeline with Kafka, Redpanda and Python
In this workshop, Alexey Grigorev breaks down the complexities of real-time data engineering, moving from basic Python-based Kafka consumers to enterprise-grade Apache Flink pipelines. This workshop, part of the Data Engineering Zoomcamp, provides a hands-on look at how to handle high-velocity data, out-of-order events, and stateful aggregations.

You’ll learn about:
- Kafka vs. Redpanda: Understand why Redpanda is a faster, simpler alternative to Kafka for developers.
- Producer & Consumer: Learn to turn Python data into streamable bytes and read them back.
- Database Integration: How to move data from a live stream into a permanent PostgreSQL database.
- Flink Basics: Learn how Job Managers (the brain) and Task Managers (the muscle) run streaming jobs.
- Handling Late Data: Use Watermarks to manage events that arrive late or out of order.
- Time Windows: Group data into time blocks (like "every 5 minutes") to calculate real-time totals.

Links:
- Course: https://github.com/DataTalksClub/data-engineering-zoomcamp
- Workshop: https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/07-streaming/workshop/README.md
- DTC Courses: https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html

TIMECODES:
00:00 Workshop Introduction and Course Context
01:34 Finding the Workshop Materials and Course Links
03:01 Streaming Stack Overview: Kafka, Redpanda, and Flink
04:56 Credits, Workshop Changes, and Environment Setup Plan
06:32 Creating a Repository and Launching Codespaces
10:18 Python Project Setup with UV
13:09 Starting Redpanda and Defining Producer and Consumer Roles
15:37 Loading Taxi Data and Modeling Ride Events
23:48 Building a Kafka Producer and Sending JSON Events
27:16 Creating a Consumer and Cleaning Up Serialization Logic
40:02 Streaming Many Events and Writing Them to Postgres
48:41 Why Flink for Stream Processing
52:46 Flink Architecture and Docker Setup
01:03:24 First Flink Job: Kafka to Postgres Pass-Through
01:11:30 From Pass-Through to Real-Time Aggregations
01:14:50 Generating Real-Time and Late Events
01:18:58 Window Aggregations and Watermarks in Flink
01:28:57 Wrap-Up, Watermark Discussion, and Closing

Connect with DataTalks.Club:
- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

Connect with Alexey
- Twitter - https://twitter.com/Al_Grigor
- Linkedin - https://www.linkedin.com/in/agrigorev/

Check our free online courses:
- ML Engineering course - http://mlzoomcamp.com
- Data Engineering course - https://github.com/DataTalksClub/data-engineering-zoomcamp
- MLOps course - https://github.com/DataTalksClub/mlops-zoomcamp
- LLM course - https://github.com/DataTalksClub/llm-zoomcamp
- Open-source LLM course: https://github.com/DataTalksClub/open-source-llm-zoomcamp
- AI Dev Tools course: https://github.com/DataTalksClub/ai-dev-tools-zoomcamp

👉🏼 Read about all our courses in one place - https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html

👋🏼 Support/inquiries
If you want to support our community, use this link - https://github.com/sponsors/alexeygrigorev
If you’re a company, reach us at alexey@datatalks.club

#apacheflink #kafka #redpanda #python #dataengineering #streaming #postgresql #docker #pyflink #realtimeanalytics #bigdata #softwareengineering #codespaces #datapipelines #streamprocessing #eventdriven #coding #techworkshop #learninginpublic #datatalksclub
Data Engineering Zoomcamp
@dezoomcamp
02.03.2026 11:57
We uploaded the video about creating projects to YouTube, as some of you reported having problems with watching it on Loom

Here it is: https://www.youtube.com/watch?v=BL0E8xO8OnE

We'll also update it in the repo

And by the way, don't forget about our docs: https://datatalks.club/docs/courses/data-engineering-zoomcamp/

There's a lot of useful information there
YouTube
DTC Zoomcamp projects
In this video, Alexey Grigorev explains the critical final phase of the DataTalks.Club Zoomcamps: the Capstone Project. While homework builds specific skills, the project is the core requirement for graduation and certification across all courses, including MLOps, Machine Learning, and Data Engineering Zoomcamps.

TIMECODES:
00:00 Introduction: Overview of the Zoomcamp Project Phase
03:12 Submission Requirements: GitHub Links and Commit IDs
06:18 Project Criteria: Applying Course Modules to Custom Datasets
09:20 Managing Your Identity on Certificates and Leaderboards
12:30 Peer Review Process: Evaluation and Reproduction
15:41 Academic Integrity: Policies on Plagiarism and Self-Plagiarism
Data Engineering Zoomcamp
@dezoomcamp
02.03.2026 08:39
This week, we're starting Module 6: Batch Processing.

Reminder: the previous homework deadline is in less than 24 hours.

In this module, you'll learn how batch processing works with Spark and PySpark.

You'll cover:

• Batch processing fundamentals and Spark basics
• Installing and running Spark locally or in Colab
• Working with Spark SQL and DataFrames
• Handling schemas and processing NYC taxi data
• How Spark clusters, joins, and groupBy work internally
• Running Spark in the cloud with Dataproc and BigQuery

Homework deadline: 10 March, 12 AM CET

We also recently had a workshop with dlt on AI-assisted data ingestion. Watch the recording and check out the code if you missed it. Practice what you learned in the homework assignment, and submit it here.
GitHub
data-engineering-zoomcamp/06-batch at main · DataTalksClub/data-engineering-zoomcamp