DataEngineerOne
United States
Joined Apr 10, 2020
DataEngineerOne is all about helping people understand how to do data engineering.
Handling Zip Files in Kedro Using the de1 python package!
In this video I demonstrate how to install and use the brand new `de1` package and the ZipFileDataSet in order to handle zip files in a pipeline.
Check out the package here: github.com/dataengineerone/de1-python
Views: 874
Videos
Let's Look at Kedro 0.17.0!
738 views · 3 years ago
Kedro was released last month with the newest minor version to date: 0.17.0! In this video we explore some of the newest features.
How to Lazily Evaluate Chunks of a Big Pandas DataFrame
1.8K views · 3 years ago
When we have a case where we wish to do a group by, usually we rely on a big beefy machine to do the processing. Using the technique described in this video, one can instead save on memory and save on resources by skillfully chunking the data one wishes to process, and cleaning up after oneself when done.
How To Make Project Docs in Kedro With Pydocs and `kedro build-docs`
908 views · 3 years ago
In this video we show a simple tutorial for creating documentation for your kedro projects. We'll be exploring how we can add Pydocs to our functions and how, using kedro's built in `build-docs` feature, we automatically can pull all of that information into a beautiful HTML documentation library.
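The docstrings that `kedro build-docs` pulls into the generated HTML are ordinary Python docstrings on your node functions. A minimal, hypothetical example (the function and field names are illustrative, not from the video):

```python
def clean_orders(orders):
    """Drop orders that are missing a customer ID.

    Args:
        orders: An iterable of order dicts, each possibly holding a
            "customer_id" key.

    Returns:
        A list of only the orders whose "customer_id" is present.
    """
    return [o for o in orders if o.get("customer_id") is not None]
```

Documentation generators render the docstring alongside the function signature, so the Args/Returns sections above become part of the project docs for free.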
Creating Custom Kedro Starters for Your Boilerplate Code
663 views · 3 years ago
In this video I show you how to create a custom starter so that you can share the joy of not having to write boilerplate code.
What is Kedro? Why is it useful? A Non-Technical Intro to Kedro
6K views · 4 years ago
In this video I explain what kedro is and why it is useful for non-technical people!
Deployable REST Enabled Data Pipelines with Flask, Docker, Kedro
2.4K views · 4 years ago
Turning your pipeline into a REST API has never been easier, thanks to Flask and Docker. In this video, I show you how!
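The Flask side of this setup can be sketched in a few lines: expose one POST endpoint that takes JSON parameters and triggers a pipeline run. This is an illustrative skeleton only; `run_pipeline` is a hypothetical stand-in for however you invoke your pipeline (e.g. via Kedro's session API), not the video's actual code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_pipeline(params):
    # Placeholder for a real pipeline run; here we just echo the
    # parameters back so the endpoint is self-contained.
    return {"status": "success", "params": params}

@app.route("/run", methods=["POST"])
def run():
    # Trigger the pipeline with JSON parameters from the request body
    result = run_pipeline(request.get_json(silent=True) or {})
    return jsonify(result)
```

Wrapping this app in a Docker image then gives you a deployable REST-enabled pipeline, which is the pattern the video walks through.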
How To Create Dataset Save and Load Hooks
699 views · 4 years ago
Dataset Save and Load hooks don't actually exist in Kedro! With this tutorial, I show you how to utilize Dataset Transformers in conjunction with existing hooks to create functions that can act as effective dataset hooks. TIMESTAMPS: 0:00 Introduction 0:22 There are no Dataset Hooks! 0:49 Why do we want them? 1:08 What are Transformers? 1:38 Writing a simple Transformer 3:02 Add the Transformer...
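The transformer pattern the video leans on can be illustrated without Kedro at all: an object that wraps a dataset's load and save callables, running arbitrary code before and after each I/O operation. The class below is a plain-Python sketch of that idea (the method signatures loosely mirror Kedro's old transformer interface but are not copied from it):

```python
class LoggingTransformer:
    """Wrap a dataset's load/save calls so extra code runs around each
    I/O operation, acting like a before/after dataset hook."""

    def load(self, name, load_fn):
        print(f"before load: {name}")
        data = load_fn()          # delegate to the real loader
        print(f"after load: {name}")
        return data

    def save(self, name, save_fn, data):
        print(f"before save: {name}")
        save_fn(data)             # delegate to the real saver
        print(f"after save: {name}")
```

Anything you would want in a "dataset hook" (timing, logging, validation) goes in the before/after sections around the delegated call.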
Using Streamlit To Make GUIs for your Kedro Parameters!
2.5K views · 4 years ago
In this video, we show you how to use streamlit to control your kedro parameters. Streamlit is a webapp generating library that allows you to modify parameters and view data on the fly. It's a fantastic system, and combining it with kedro makes it even better! Today's episode is brought to you by kedro.community, the newest open community, bringing data pipeliners from around the world, togethe...
Creating Shared Catalogs for your Kedro Projects on GitHub
608 views · 4 years ago
As kedro becomes more and more popular, the need to share your data catalog will become ever more likely. Thanks to kedro's new hooks, it makes it super easy to share catalog entries between project teams. In this episode, I show you how this can be accomplished. The code for today's video can be found here: gist.github.com/tamsanh/2075e293a089e76baa24cf29e3c566f1 TIMESTAMPS: 0:00 Introduction ...
Let's Look at Kedro 0.16.5 Release Notes
303 views · 4 years ago
The Newest Kedro was Released! 0.16.5 Congrats to the Kedro team for another awesome release. Snarky Canadian's "pyproject.toml" Explanation: snarky.ca/what-the-heck-is-pyproject-toml/ TIMESTAMPS: 0:00 Intro 0:40 Pipeline to Hooks Transition 1:44 Initial thoughts on Hooks 3:04 Standardizing pyproject.toml 4:00 Disabling Plugins with the .kedro.yml Configuration 4:31 Not Totally Backwards Compat...
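For context on the `pyproject.toml` standardization mentioned in the timestamps, a minimal file looks roughly like this. The `[build-system]` table is the PEP 518 part; the `[tool.kedro]` keys shown are illustrative of where Kedro later moved its project settings, not taken from the 0.16.5 release itself.

```toml
# Build metadata standardized by PEP 518
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

# Tool-specific settings live under [tool.*]; Kedro eventually moved
# its project configuration here (key names are illustrative)
[tool.kedro]
package_name = "my_project"
project_name = "My Project"
```

The point of the standard is that every tool reads its configuration from one well-known file instead of inventing its own (like the `.kedro.yml` discussed later in the video).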
The Complete Beginner's Guide to Kedro - How to use Kedro 0.16.4
19K views · 4 years ago
This video brought to you by kedro.community/! The newest and nicest place to talk and learn with other data pipeliners. This is the first video a kedro newbie should watch, if they wish to understand how to use kedro! In this complete guide, you'll learn about Pipelines, Nodes, DataSets, Catalogs, Parameters, and how you can leverage all of these in your kedro project. We'll be walking through...
How To Customize Your Kedro CLI Options
642 views · 4 years ago
In this video we cover how to add a custom CLI command for kedro! We add a "cool-run" command which will run multiple pipelines for us with a single run. You can use this method to create all sorts of different configurations for your pipelines. TIMESTAMPS: 0:00 Intro 0:28 We will be editing the kedro_cli.py file's click 1:46 Explaining how the normal 'run' command works 2:20 Overview of our new...
How to Create and Reuse Pipelines with "Package and Pull" CLI
520 views · 4 years ago
The Pipeline CLI command has great options to enable pipeline reuse. In this episode, we take a closer look on what pipeline reuse looks like, and the caveats of reuse. TIMESTAMPS: 0:00 Intro 0:39 Looking at the Kedro Pipeline Command 1:11 `kedro pipeline create` for Creating Pipelines 2:40 Prepare the Pipeline for Packaging 3:17 Quick Look at Other Pipeline Options 3:29 `kedro pipeline describ...
How to Get/Write Data from/to a SQL Database
2.2K views · 4 years ago
Data Engineering is a tough job, and it can be made tougher by complex, difficult to understand data pipelines. In this series, we will be covering Kedro and how to use it to make data pipelines easier to read, write, and maintain. In this video we cover: Accessing SQL Data: * Use SQLTableDataSet to load and save entire DataFrames * Use if_exists parameter to manage table behavior. * Use SQLQue...
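Under the hood, table-style SQL datasets behave like pandas' `to_sql`/`read_sql` pair, including the `if_exists` behavior mentioned above. A self-contained sketch using SQLite (the function names are illustrative, not SQLTableDataSet itself):

```python
import sqlite3
import pandas as pd

def save_table(df, table, con, if_exists="replace"):
    # Mirrors a table dataset's save: `if_exists` controls whether an
    # existing table is replaced, appended to, or raises an error.
    df.to_sql(table, con, if_exists=if_exists, index=False)

def load_table(table, con):
    # Mirrors a table dataset's load: read the whole table back.
    return pd.read_sql(f"SELECT * FROM {table}", con)
```

With `if_exists="append"` repeated saves accumulate rows, while the default `"replace"` gives you idempotent re-runs, which is usually what a pipeline wants.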
Finishing our SG API Pipeline with Chronocoding - How I Write Pipes Part V
397 views · 4 years ago
Parallelize Pipeline Processing With Sub Node Parallelization - How I Write Pipes Part IV
1.1K views · 4 years ago
Using Component Pipelines to Optimize Data Science Iteration - How I Write Pipes Part III
581 views · 4 years ago
Adding our New Nodes to Our Pipeline - How I Write Data Pipelines - Part II
846 views · 4 years ago
How I Write Data Pipelines - Part I
1.7K views · 4 years ago
How to Combine Multiple CSV Files into a Single DataFrame
3.7K views · 4 years ago
How to Setup PySpark for your Kedro Pipeline
2.3K views · 4 years ago
Let's Take a Look at the Kedro 0.16.3 Release Notes!
382 views · 4 years ago
How To Import Pipelines in Other Python Scripts
740 views · 4 years ago
How To Use a Parameter Range to Generate Pipelines Automatically
1.6K views · 4 years ago
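The core trick here is generating one pipeline (or one named variant plus its parameters) per value in a range, instead of hand-writing each one. A plain-Python sketch of that idea, with hypothetical names:

```python
def make_variants(base_name, param_name, values):
    """For each value, produce a (pipeline_name, params) pair.

    Sketches the pattern of generating pipeline variants from a
    parameter range rather than writing them out by hand.
    """
    return [
        (f"{base_name}_{param_name}_{v}", {param_name: v})
        for v in values
    ]
```

Each generated `(name, params)` pair can then be registered as its own pipeline, so adding a new parameter value means extending the range rather than copying code.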
How to Begin Profiling Your Data with Pandas Profiling
757 views · 4 years ago
Two Tricks to Optimize your Kedro Jupyter Flow
851 views · 4 years ago
Advanced Configuration with TemplatedConfigLoader
740 views · 4 years ago
How to Contribute New Code back to Kedro
227 views · 4 years ago
If you are facing the error "No such command 'run'": still in the venv, cd <kedro directory>, then pip install -r requirements.txt. After that the run command will work.
An example of how to get data from parallel workers and join that data would have been great.
kedro-wings last commit Jul 28, 2020, so basically dead
Would be great to have synchronization in the opposite direction, and automatic notebook creation with all of the dependencies.
this channel awesome! Thank you
That helped me a lot! Thank You!
How can I pass parameters from the Kedro CLI into catalog.yml? Any reference would be helpful.
awesome!
Can I have the codebase for this video?
Is there any newer method to use structured streaming with Kedro? Do you have any other suggestions?
What benefit does it give in comparison to Prefect?
Thanks for the video!
How can we develop an engine which automatically determines which rules apply to a given dataset?
To be able to make a full series on kedro while constantly smiling, your dealer must be lit! Hook me up please! (pun unintended)
How do we call an async/await function within a Kedro node?
Excellent tutorial! Now I can get all the benefits of using parameters, modular pipelines, and viz. This has made a positive change in my workflow. In my work I complement it with "kedro catalog create --pipeline name_pipeline" and everything flows very smoothly. Thanks a lot for sharing!
Great video, and a great example of a custom command, as I often need to run multiple pipelines. Quick question: how do you find the kedro_cli.py file? I've created a new kedro project but there's no file there like you've shown.
Great approach! What to do when your latest partition also needs historical partitions to calculate variables such as "total sales last 6 months" but prior partitions had their variables already computed?
Kedro seems to be full of compatibility issues. Better not to use such a tool.
compatibility issues with what?
Thanks for the video. kedro 0.18.2 has no 'load_context' function in the module 'kedro.framework.context.context'; what shall we do?
Good high-level explanation. I think things are broken with the new versions, though; an updated tutorial is needed.
Is it possible for Kedro to run every node/function in a different environment (conda, pip, Docker) within one pipeline?
I need some help with setting up and running kedro on aws EMR
can anyone help?
Will you please help me code deployable REST-enabled data pipelines with Django and Kedro?
Traceback (most recent call last):
  File "C:\Users\sduque\.conda\envs\kedro-tutorial\lib\site-packages\kedro\framework\cli\cli.py", line 682, in load_entry_points
    entry_point_commands.append(entry_point.load())
  File "C:\Users\sduque\.conda\envs\kedro-tutorial\lib\site-packages\pkg_resources\__init__.py", line 2458, in load
    return self.resolve()
  File "C:\Users\sduque\.conda\envs\kedro-tutorial\lib\site-packages\pkg_resources\__init__.py", line 2464, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "C:\Users\sduque\.conda\envs\kedro-tutorial\lib\site-packages\kedro_viz\launchers\cli.py", line 8, in <module>
    from kedro.framework.cli.project import PARAMS_ARG_HELP
ImportError: cannot import name 'PARAMS_ARG_HELP' from 'kedro.framework.cli.project' (C:\Users\sduque\.conda\envs\kedro-tutorial\lib\site-packages\kedro\framework\cli\project.py)
Error: Loading global commands from kedro-viz = kedro_viz.launchers.cli:commands
Help!
Great stuff
How to combine CSVs without pandas?
Can we build a Redis + Celery backed queue runner?
Very useful! Just getting into Kedro. Excited to try this tomorrow :)
Thank you for the tutorial! Could you please share the documentation steps? I also have a question: how do I link the environment with kedro over an SSH connection in PyCharm?
Can you tell us the versions of kedro, Python, and kedro-great used in this video, please?
Great how happy and smiling you are all the time! Thanks for the demo! 😀 Is it possible to save the raw data into two new tables, one which gets only the success rows and one which gets the failure rows? (So that the next ETL steps can run on a "validation_passed_table".)
The Kedro documentation on their site is utter shit. Heck their "hello world" tutorial doesn't even work. They don't give examples on how to implement features....just really hard to get into it
You're the goat lmk if you need a job
Hi, Awesome video! If we are not going to use credentials.yml file in production, how are we going to access the required credentials?
Hey, great videos, really enjoying them! I had a question. I tried using transcoding for the Byte64DataSet as well, but I get the following error:
kedro.pipeline.pipeline.OutputNotUniqueError: Output(s) ['iris_scatter_plot'] are returned by more than one nodes. Node outputs must be unique.
I simply changed my catalog file to look like this:
iris_scatter_plot@base64:
  type: iris.io.base64_data_set.Base64DataSet
  filepath: data/iris64.txt
What exactly is the issue here?
Hi, great video! It's really a great series of tutorial videos on kedro; I rewatch a bunch of them whenever needed. As I understand it, you refactored one node by wrapping it with a function that provides it with the datasets in the given dict, and thus a new node is created. But say I have a pipeline with multiple nodes that does ETL work on a txt file, and now I want to pass a bunch of txt files in a folder to the pipeline. Is the only available solution to wrap all my nodes (or pure Python functions) into one? Because then in kedro-viz I would only see one node, i.e. the one encompassing all the others.
Crystal clear, thanks for that kind of video.
Kedro is unable to save a PySpark model; it says TypeError: cannot pickle '_thread.RLock' object. Is there any method to save a PySpark model?
Hi, I'm working with Kedro but when I tried to visualize with kedro viz it is empty. In the PyCharm terminal, there is a warning saying that the catalog and parameters files are empty (which is not the case, both have yml files) Any idea why kedro viz can not see the yml files? Thanks!
Thank you so much for the tip by changing "script" to "module", was stuck at setting the run profile for a while and it works like a charm!
Best business description I have ever seen from data engineer!😍
Thank you for sharing! This is very useful. My team benefits from your work a lot, by the way.
You have created great stuff, thanks for all your efforts! In the entire video series I missed a few advanced concepts, like Kedro plugins. Why don't you make a video on creating Kedro plugins?
This was a big help, thank you so much! And I love your energy!
Really great video! Thanks for taking the time to craft a very easy-to-follow demo, Tam!
Please don't say "KLI". It's called a "C" (letter C pronounced "see") "L" (letter L) "I" (pronounced "eye"). Thanks!
Kedro is interesting! But while creating a new project I don't see the option asking for the example pipeline. Why is that so? Could you please help me?
Really enjoying your Kedro tutorials! Is the kedro-wings usage different in Kedro version 0.17.5? There is no `run.py` file.
What if you don’t want to save the table to SQL, but instead want to save the metadata of the dataset in SQL and store the dataframe itself on S3 (if it is a large dataframe), similar to how one might save an image for a profile? This is important when you have multiple (time-series) datasets you want to save.