End to End Machine Learning Pipeline Creation Using DVC
- Published Jun 7, 2024
- End to End Machine Learning Pipeline Creation Using #dvc | Live Demo #dataVersion #mlops #mlpipeline
You can clone the Git repo to get started: github.com/TripathiAshutosh/d...
Make sure to check out the tag "base_files": just run git checkout base_files and you will see only the src folder and the params.yaml file. You need to start from there.
#dataVersion #dvc #modelversion #machinelearningpipeline #experimenttrackingusingdvc #datascienceforbeginners #dvctutorial #dvcpipelinetutorial #machinelearningpipelinebuilding #mlops
chapters:
Introduction 00:00
git clone and virtual env setup 01:46
python source file explanation 02:28
pipeline stages 25:00
pipeline run dvc repro 33:50
Visualization Metrics and plots 35:49
Experiment Tracking 40:18
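The pipeline flow covered in the chapters above can be sketched as a dvc.yaml fragment. This is a hypothetical two-stage example: the stage names, scripts, and file paths are illustrative, not taken from the video.

```yaml
stages:
  train:
    cmd: python src/train.py        # hypothetical stage script
    deps:
      - src/train.py
      - data/train.csv
    params:
      - train.n_estimators          # read from params.yaml
    outs:
      - models/model.pkl
  evaluate:
    cmd: python src/evaluate.py
    deps:
      - src/evaluate.py
      - models/model.pkl
    metrics:
      - metrics.json:
          cache: false              # keep metrics in Git, not the DVC cache
```

With a file like this in place, `dvc repro` re-runs only the stages whose dependencies changed, and `dvc metrics show` / `dvc plots show` read the declared outputs.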
Connect with me:
LinkedIn: / ashutoshtripathiai
Instagram: / ashutoshtripathi_ai
Twitter: / ashutosh_ai
Website: ashutoshtripathi.com
If you want to message me directly, then connect with me on LinkedIn and send a DM.
complete DVC playlist: ua-cam.com/play/PLwFaZuSL_mfr5MM2QAxkvXK29oGEFFxGu.html
Awesome content sir !!! 🎉 Thank you
You are welcome
First of all, this is an absolute gem of a video on DVC, which I found after a lot of struggle. Thank you very much!!
I have one doubt: if we push different data versions to the storage, won't there be many versions if we don't replace them each time? Especially when the data is very large, we could run out of storage. And if we do replace it every time, how can we get the older data versions back? We have a use case where we need to run the model daily for personalization and load the predictions into the DB. Our data is already huge, with 100 million rows for just one month. How does data versioning help in these cases?
Good
Thank you Mummy 🙏
As always, the video has crisp and precise content. Your videos are among the best in the MLOps area. Earlier, I followed your MLflow videos, which were also very helpful.
One question though: can we use DVC and MLflow together? For example, if one wants to do data versioning using DVC and the rest of the pipeline activities (like experiment tracking, model registry, etc.) in MLflow, how would one do it?
Yes, you can do that. I have already created separate videos for both: data versioning using DVC and experiment tracking using MLflow. Just follow those, and if you have any further queries, let me know.
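One common pattern for combining the two is to let DVC version the data and drive the pipeline, while the stage script logs runs to MLflow. Below is a minimal sketch of such a stage script; the parameter names and the accuracy value are hypothetical placeholders, and the MLflow calls are left commented so the sketch runs without MLflow installed.

```python
# Hypothetical DVC stage script that also logs to MLflow.
import json

params = {"n_estimators": 100, "max_depth": 5}  # normally read from params.yaml
accuracy = 0.87  # placeholder for a real evaluation score

# With MLflow installed (`pip install mlflow`), the same run can be tracked:
# import mlflow
# with mlflow.start_run():
#     mlflow.log_params(params)
#     mlflow.log_metric("accuracy", accuracy)

# DVC picks this file up via the `metrics` field in dvc.yaml
with open("metrics.json", "w") as f:
    json.dump({"accuracy": accuracy}, f)
```

The data files themselves stay under `dvc add` / `dvc push`, so DVC handles versioning while MLflow keeps the experiment history and model registry.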
How can we add MLflow code here for experiment tracking?
How can DVC be used with Azure ML Studio? Can anyone share any idea or link regarding it?
Hi Ashutosh, can you please create a video on how to automate ML models using Airflow?
Yes. Planning to upload it in the next two weeks. Please stay tuned.
Sir, after pushing the data/model to the remote, in my case the split data and the model are still present in the data dir.
It will be present in your DVC remote storage directory.
Hi sir, how to do this in a Jupyter notebook and in Databricks?
Thank you
You can create .py files in a Jupyter notebook and then execute them; it will work the same way. There should not be any difference. Or you can even run the code in chunks in the notebook. Let me know if you face any specific issue running it in a notebook.
I have not made a Databricks video yet. Will make and upload it soon.
Sir, how many videos are left in the DVC playlist?
One video. Are you looking for any specific topic in DVC?
@@AshutoshTripathi_AI I was just asking, sir. I am following all the videos. Sir, which tool will you start after DVC?
I am thinking about Dask. Do you have something in mind?
@@AshutoshTripathi_AI OK, then please make one on Dask.