Build a Reactive Data Streaming App with Python and Apache Kafka | Coding In Motion
- Published 8 Jun 2024
- cnfl.io/coding-in-motion-epis... | How do you get live notifications from a service that doesn’t support live notifications? Can you watch an online store for price drops? Or track comments on someone else’s YouTube video? Or at work, can you react to changes in another department’s system, when that system doesn’t have a notification API? How do you turn the question-and-answer model of the web into a live-streaming system?
In this episode of Coding in Motion, we’re going to build a solution that brings some data to life. Join Kris Jenkins in another step-by-step build as he demonstrates how to turn a static data source (YouTube’s REST API) into a reactive system that:
► Uses Python to fetch and process data from a static web API
► Streams that data live, from Python into a Kafka topic
► Processes the incoming source data with ksqlDB, watching for important changes
► Then streams out live, custom notifications via Telegram
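The paging step (07:28 in the video) hinges on a Python generator that follows next-page tokens until the API runs out of pages. Here is a minimal sketch of that idea; `fetch_page` is a stub standing in for the real YouTube Data API call, and the field names (`items`, `nextPageToken`) mirror that API, but everything else is illustrative:

```python
def paged(fetch_page):
    """Yield every item across all pages, following next-page tokens."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("items", [])
        token = page.get("nextPageToken")
        if token is None:
            return

# Stub pages standing in for real YouTube Data API responses.
PAGES = {
    None: {"items": ["video-1", "video-2"], "nextPageToken": "p2"},
    "p2": {"items": ["video-3"]},
}

print(list(paged(PAGES.get)))  # ['video-1', 'video-2', 'video-3']
```

The nice property, as the video shows, is that callers just iterate over one flat sequence of items and never think about pages at all.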
LEARN MORE
► GitHub source code: github.com/confluentinc/codin...
► Build a Data Streaming App with TypeScript/JavaScript and Apache Kafka | Coding in Motion: www.confluent.io/coding-in-mo...
► Coding in Motion Playlist: • Coding In Motion | Bui...
► Learn more with Kafka tutorials, resources, and guides: cnfl.io/confluent-developer-c...
► Use CODING200 to get $200 of free Confluent Cloud usage: cnfl.io/try-cloud-coding-in-m...
► Promo code details: cnfl.io/promo-code-details-co...
► Register for more: cnfl.io/register-coding-in-mo...
TIMESTAMPS
00:00 Intro
00:27 What Are We Building?
01:24 Setting Up A Basic Python Program
02:57 Planning Our Approach
04:00 Fetching Data From Google ("So let's do that.")
07:28 Handling Paging With Python Generators
17:39 Fetching Specific Video Data
22:10 Setting Up A Kafka Cluster
24:26 Defining A Persistent Data Stream
26:03 Setting Up The Python Kafka Library
31:27 Serializing and Storing Our Data
35:02 Detecting Stream Changes With ksqlDB
39:59 Creating A Telegram Alert Bot
43:42 Setting Up An HTTP Sink Connector
46:58 Defining And Triggering The Alerts
50:59 Retrospective
53:02 Outro
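The serializing-and-storing step above (26:03–35:02) boils down to turning each polled snapshot into a keyed Kafka record, with the video id as the key so every update for the same video lands in the same partition. A minimal sketch of that shape, assuming illustrative field names and topic (not the video’s exact code):

```python
import json

def to_kafka_record(stats: dict) -> tuple[bytes, bytes]:
    """Serialize one snapshot of video stats, keyed by video id so all
    updates for the same video land in the same partition."""
    key = stats["video_id"].encode("utf-8")
    value = json.dumps(stats).encode("utf-8")
    return key, value

key, value = to_kafka_record({"video_id": "abc123", "likes": 42, "comments": 7})
# With confluent-kafka, this pair would then be sent along the lines of:
#   producer.produce("youtube_videos", key=key, value=value)
```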
ABOUT CONFLUENT
Confluent is pioneering a fundamentally new category of data infrastructure focused on data in motion. Confluent’s cloud-native offering is the foundational platform for data in motion - designed to be the intelligent connective tissue enabling real-time data, from multiple sources, to constantly stream across the organization. With Confluent, organizations can meet the new business imperative of delivering rich, digital front-end customer experiences and transitioning to sophisticated, real-time, software-driven backend operations. To learn more, please visit www.confluent.io.
#streamprocessing #python #apachekafka #kafka #confluent - Science & Technology
Rare to come across a session on YouTube where the instructor is so clean in their delivery and there is a 1-1 match between every instruction and the tutee's experience (at least as of this comment). This was a treat!
First one of these I’ve watched. Beautifully explained, with all the required detail, really bringing it to life. Great work!
Great video, great energy, very didactic. I really enjoyed every minute of the video. Also, the way you talk transmits really nice vibes.
I liked this video very much. Just the relevant parts, leaving out all the fluff but with a lot of humour!
Finally, I have found the best video about Kafka. Thank you for this video!
Fantastic video -> event-driven architecture, data pipeline, notifications, Kafka, Telegram, code, best practices, and humour all in a single package🙂
This was brilliant, perfect and fun. What a clear instructor and perfect way to explain and introduce Kafka. And that Python generator solution was great.
Thank you so much for this video. Made getting acquainted with Kafka as a beginner a pleasure.
love that you do everything in the terminal, no need for an IDE. Respect 😆
Fabulous video. Very well taught, coded and explained.
Thanks for the session Kris, very interesting and touching many topics
a lot of functionality with so little code!
Looking forward to seeing the next refactoring video.
Congrats!
Incredible session! Woow!
Great video! I really appreciated the smooth and intuitive coding process. If you could, please consider refactoring for further improvements. In the meantime, I'll continue exploring Kafka Streams and the various applications that can be developed using it. Cheers! :)
Really enjoyed the video, Thanks Kris
This is very well explained! I really enjoyed it.
I would LOVE to see a vim tutorial from Kris!!
This is a wonderful tutorial!
Great content as always! Looking forward to the follow-up refactoring video!
Really touching video ;). Thanks!
This is the first video that I have watched on your channel, and I just loved it, especially because I have started learning Kafka lately. I would love to know how we could deploy such kinds of applications.
That is a great tutorial, thank you!!
Great video! Thanks 👍
Thanks for the video!
thank you!
It was a very interesting video. I did not know that I could build such a useful application with Kafka and ksqlDB. I will definitely wait for the refactoring and deployment video.
Awesome work
One of the best tutorials ever! Thanks for making this.
Would love to see the higher-order function refactoring, and to see whether any more neat features have been added to Confluent Cloud since this video was made. Retrospectives in action ;)
This was good. I think I am going to use what's demoed here.
SQL optimization video would be CLUTCH.
Awesome tutorial! I’m definitely interested in a follow-up that refactors and deploys the watcher script in the cloud (I’m guessing as some kind of cron job).
Question - you defined your schema by creating a stream in ksql and pulling down the resulting schema from Schema Registry using the schema registry client in Python. Would it be better to define the schema in Python and source control it there as the source of truth for the schema? That way, the developer has control over schema changes as the requirements evolve. What do you think?
I’m also wondering if you could use an HTTP source connector to get the YouTube data into Kafka more easily. Then you wouldn’t even have to deploy the Python script. I’m not sure if the HTTP source connector can handle paging like the Python generator solution does, though.
Now, imagine a world where YouTube’s API (and the rest of the APIs in the world) supported push (e.g. over Server-Sent Events) rather than pull. Your producer script could just subscribe to changes and publish to Kafka, like Change Data Capture in the database world. I’m looking forward to more APIs “thinking in streams” rather than just synchronous request/response.
Thanks again for an awesome video!
Confluent's HTTP source connector supports pagination, see docs.confluent.io/cloud/current/connectors/cc-http-source.html#cc-http-source-example-use-cases-cursor-pagination
Thanks a lot for the video!
I'm completely new to Kafka and I ran into an issue - it looks like the Confluent UI has changed and I can't find the Schema Registry URL. Could someone help, please?
Great video! Wondering how much this would roughly cost in a year, i.e. how many discounted synths you'd need to buy before breaking even.
Thanks a lot, Kris! Can we get the code snippets, please?
Is it possible to do the project with the basic Confluent tier, without paying?
Is it safe to show your Google API ID in the clear?
@krisjenkins8110 nice one
Good video, thank you!
At the same time, I feel bad about your approach: it's too "Confluent-oriented" and can vendor-lock newbie programmers into those proprietary technologies.
For those checking if the production video is out, here is one of Confluent's videos I found: ua-cam.com/video/S204ya8eObE/v-deo.html
Quick question - I just want to know why the title of this video is "Build a REACTIVE data streaming app". I don't see any reactive code in the script you wrote. Is the whole data pipeline reactive? In my understanding, a reactive system should be driven by events (for example, the likes and comments on the YouTube videos you scraped). That means every time a like or comment occurs on a video you're watching, it should be sent to the Kafka cluster immediately, rather than waiting up to ten minutes for a poll to notice that the number of likes and comments has changed. Can you give me a detailed explanation of why this demo is reactive?
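For what it's worth, the polling-then-diffing approach the video takes can still feed an event-driven pipeline: the producer polls, but everything downstream of the change-detection step only ever sees discrete change events. A minimal sketch of that diffing idea in plain Python (illustrative only - in the video this role is played by ksqlDB, not Python):

```python
def change_events(snapshots):
    """Turn a sequence of polled snapshots into discrete change events:
    one event per field whose value differs from the previous snapshot."""
    last = None
    for snap in snapshots:
        if last is not None:
            for field, new in snap.items():
                old = last.get(field)
                if new != old:
                    yield {"field": field, "old": old, "new": new}
        last = snap

polls = [
    {"likes": 10, "comments": 3},
    {"likes": 10, "comments": 3},   # no change -> no event
    {"likes": 12, "comments": 3},   # likes changed -> one event
]
print(list(change_events(polls)))
# [{'field': 'likes', 'old': 10, 'new': 12}]
```

So the pipeline is reactive from the Kafka topic onward; the polling producer is an adapter bolted onto an API that only supports request/response.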
Wow, the subscribe button blew me away
No way I could make what you do, sir. I would like to see, one day, a tutorial where nothing is hidden and everything is shown clearly. Well, I have time to die three times before that happens...
{
"name": "ImportError",
"message": "cannot import name 'config' from 'config' (c:\\Users\\flosr\\Engineering\\Data Engineering\\YouTube API Project\\config.py)",
"stack": "---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[7], line 3
      1 import logging
      2 import sys, requests
----> 3 from config import config
ImportError: cannot import name 'config' from 'config' (c:\\Users\\flosr\\Engineering\\Data Engineering\\YouTube API Project\\config.py)"
}
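That ImportError means config.py exists but doesn't define a top-level name called `config`. A minimal config.py that satisfies `from config import config` could be as simple as the following (the keys are illustrative, matching what the video's script reads, and the value is a placeholder):

```python
# config.py -- define a `config` dict at module top level so that
# `from config import config` can find it.
config = {
    "google_api_key": "YOUR_API_KEY_HERE",  # placeholder, not a real key
}
```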