DagsHub
Open Source Auto-labeling with custom MLflow Models and Label Studio
Learn how to create an auto-labeling workflow using the MLflow model registry and Label Studio, integrated by DagsHub. In this tutorial, I demonstrate how to set up an auto-labeling workflow using our open-source tool, Label Studio configurable model, to easily connect a machine learning backend to Label Studio.
The code I use in the video: dagshub.com/idonov/ls-autolable-with-mlflow-model-demo/src/master/README.ipynb
Check out the documentation and related videos for more details:
More videos:
- MLflow Crash Course: ua-cam.com/video/N6wLAkmCHmg/v-deo.htmlsi=i-E-kZzzek8rJpVy
- Label Studio Basics: ua-cam.com/video/3zXAS5ymfdQ/v-deo.htmlsi=fHMXkvVEVESRoJUM
Documentation:
- How to upload a model to MLflow model registry: dagshub.com/docs/integration_guide/mlflow_tracking/#how-to-register-mlflow-model-in-dagshub-model-registry
- Auto labeling with Label Studio: dagshub.com/docs/use_cases/auto_labeling
Views: 46

Videos

Curating and Validating Machine Learning Datasets
132 views · 1 month ago
Learn how to curate and validate unstructured datasets in this step-by-step tutorial. In this video, you'll explore how to upload and connect data sources, filter and query datasets, and save subsets for sharing and future use. Dean also demonstrates how to compare dataset versions, edit metadata, and send data for annotation using integrated tools like Label Studio. By the end, you'll have a c...
🌲 Machine Learning in Agriculture: Scaling AI for Crop Management with Dror Haor
153 views · 1 month ago
In this episode, Dean speaks with Dror Haor, CTO at SeeTree, about the challenges of deploying AI in agriculture at scale. They explore how SeeTree integrates AI and sensor fusion to manage vast amounts of remote sensing data, helping farmers improve crop yields with high accuracy at low costs. Dror shares insights on handling data drift, customizing models for different regions, and balancing ...
📊 Data-Driven Decisions: ML in E-Commerce Forecasting with Federico Bacci
143 views · 2 months ago
In this episode, Dean speaks with Federico Bacci, a data scientist and ML engineer at Bol, the largest e-commerce company in the Netherlands and Belgium. Federico shares valuable insights into the intricacies of deploying machine learning models in production, particularly for forecasting problems. He discusses the challenges of model explainability, the importance of feature engineering over m...
🚗 Driving Innovation: Machine Learning in Auto Claims Processing
148 views · 3 months ago
In this episode, Dean speaks with Michał Oleszak, an ML engineering manager at Solera. Michał shares insights into how his team is using machine learning to transform the automotive claims process, from recognizing vehicle damages in images to estimating repair costs. The conversation covers the challenges of deploying ML pipelines in production, managing data quality for computer vision tasks,...
🚑 ML in the Emergency Room with Ljubomir Buturovic
127 views · 4 months ago
In this episode, I chat with Ljubomir Buturovic, VP of ML and Informatics at Inflammatix. We discuss using ML to diagnose infections and blood tests in the emergency room. We dive into the challenges of building diagnostic (classification) and prognostic (predictive) models, with takeaways related to building datasets for production use cases. Join our Discord community: discord.gg/tEYvqxwhah Ti...
🌊 AI-Native with Idan Gazit - The future of AI products and interfaces + Getting AI to production
143 views · 5 months ago
In this episode, Idan Gazit, Senior Director of Research at GitHub Next, discusses his role in exploring strategic technologies and incubating long bet projects. He explains how the GitHub Next team chooses research projects and the process of exploration and theme selection. Idan also shares insights into the ML focus at GitHub Next and the challenges of evaluating the impact of AI products. H...
🍪 Machine Learning in the cookie-less era with Uri Goren
106 views · 6 months ago
In this episode, I chatted with Uri Goren, founder and CEO of Argmax, about Machine Learning and the future of digital advertising in a world moving away from cookies due to privacy laws like GDPR and CCPA. We chat about challenges in maintaining personalized ads while respecting user privacy, and new methods like probabilistic models and contextual features to cover some of the gap left by rem...
🛰️ Modern & Realistic MLOps with Han-chung Lee
352 views · 7 months ago
In this episode, I speak with Han-Chung Lee, a machine learning engineer with a lot of interesting takes on ML and AI. We dive into the buzz around natural language processing and the big waves in generative AI. We chat about how newcomers are racing through NLP’s history, mixing old school and new tech, and the shift towards smarter databases. Han-Chung breaks it down with his straightforwar...
Deploying ML for free on AWS - A DagsHub Community Webinar
497 views · 7 months ago
Explore the world of video classification with our hands-on webinar. Learn to deploy machine learning models using AWS's free tools, create automated CI/CD pipelines, and master MLflow's model registry. It's going to be a practical, engaging session, perfect for ML enthusiasts! 🧑‍🏫 Link to the slides: docs.google.com/presentation/d/15z3MH4GF4EIYzc0NyCeDll20T6-ubshcwX5tKhbvXyw/edit#slide=id.g2b1...
🩻 AI in Medical Devices & Medicine with Mila Orlovsky
227 views · 8 months ago
In this episode, I had the pleasure of speaking with Mila Orlovsky, a pioneer in medical AI. We delve into practical applications, overcoming data challenges, and the intricacies of developing AI tools that meet regulatory standards. Mila discusses her experiences with predictive analytics in patient care, offering tips on navigating the complexities of AI implementation in medical environments...
⏪ Making LLMs Backwards Compatible with Jason Liu
1.2K views · 9 months ago
In this episode, I had the pleasure of speaking with Jason Liu, an applied AI consultant and the creator of Instructor - an open-source tool for extracting structured data from LLM outputs. We chat about LLM applications, their challenges, and how to overcome them. We also dive into Instructor, making LLMs interact with existing systems and a bunch of other cool things. Join our Discord communi...
🔴 Live MLOps Podcast - Building, Deploying and Monitoring Large Language Models with Jinen Setpal
384 views · 1 year ago
In this live episode, I'm speaking with Jinen Setpal, ML Engineer at DagsHub about actually building, deploying, and monitoring large language model applications. We discuss DPT, a chatbot project that is live in production on the DagsHub Discord server and helps answer support questions and the process and challenges involved in building it. We dive into evaluation methods, ways to reduce hall...
Live MLOps Podcast Episode!
82 views · 1 year ago
Join now to take part in our first live MLOps Podcast episode. I'll be chatting with Jinen Setpal, ML Engineer at DagsHub about his work building LLM applications and getting LLMs into production. Sign up for the event at the link here: www.linkedin.com/events/7098968036782596096/comments/
⛹️‍♂️ Large Scale Video ML at WSC Sports with Yuval Gabay
417 views · 1 year ago
DagsHub Data Engine: Product Overview
497 views · 1 year ago
Data Engine Demo: Monocular Depth Estimation
1.2K views · 1 year ago
DagsHub Learning: Automate Your Labeling Process
860 views · 1 year ago
DagsHub Learning: Model Registry and Deployment on AWS services with MLflow
723 views · 1 year ago
DagsHub Learning: Version and Stream Data with DVC and DagsHub
560 views · 1 year ago
🤖 GPTs & Large Language Models in production with Hamel Husain
389 views · 1 year ago
DagsHub Learning: Experiment Tracking for Machine Learning with MLflow
733 views · 1 year ago
🫣 Is Data Science a dying job? with Almog Baku
244 views · 1 year ago
DagsHub Learning: Model Registry and Deployment with MLflow
1.2K views · 1 year ago
🏃‍♀️Moving Fast and Breaking Data with Shreya Shankar
480 views · 1 year ago
DagsHub Learning: Experiment Tracking for Machine Learning with MLflow
813 views · 1 year ago
🚴‍♀️ Quick & Dirty Machine Learning with Noa Weiss
538 views · 1 year ago
Generative AI: Using ChatGPT and Stable Diffusion to Create Comic Strips
2.8K views · 1 year ago
Automate the labeling process with Label Studio and DagsHub
6K views · 1 year ago
DagsHub integration with Label Studio - Demo
421 views · 1 year ago

COMMENTS

  • @davidlennear3112 · 1 day ago

    I really enjoyed the information. I would like to try that with a floor plan of a building. Is that possible?

  • @AnnaHyatt · 3 days ago

    Wow, really cool!

  • @FedericoBacci-q8j · 2 months ago

    Thanks Dean for the opportunity to chat! I'm curious to hear from everyone: What are some situations where you think traditional machine learning might outperform LLMs? Looking forward to your thoughts!

    • @DagsHub · 2 months ago

      Thanks for coming on the podcast, it was a pleasure having you on :)

  • @sumanpathak-r6p · 3 months ago

    Can anyone share the notebook? Please.

    • @DagsHub · 3 months ago

      The notebook appears in the description. Check it out :)

  • @samiularafatimon · 4 months ago

    It's been a long time since we worked together.

  • @xspydazx · 5 months ago

    I think you guys really mean chat history: we could convince the bot to speak a special way, but when the system restarts it has no memory ... like UltraHal and the RAG system. We don't really need it, as we need to update the model with the latest information ... hence a single epoch with the chat history JSON, then cleaning it up and only keeping profile data local. Netflix is a data tool, so it has transactional data, historical data and some profile data ... hence this data needs to be saved; it can be saved with time and date stamps for easy recall! The prompt is vital for saving information, as this is also how it will be recalled ... so you need to be specific with the chat history update, tell it to remember this for later use, and fine-tune it in! The RAG is a training device, so the transactions that occur are important to update the model with, as they show an example of completing a task ... with mistakes, so it knows the chain of thought for these tasks and the problems to solve within; hence, when requested, this will occur internally. The model can do nearly anything as long as you give it training data and enough examples framed in the correct way. Even by saving brainstorming sessions and discussions, the model learns how to reason out a task. Hence we are building data up with your RAGs, as when the model learns it will no longer need the RAG and we will only be updating the model! As usual.

  • @ararar99 · 6 months ago

    With the Git and DVC repo being shared, and the MLflow experiments going to a shared host, all the team needs is access to that host. Please correct me if I'm wrong. I'm new to MLOps and don't see how DagsHub is necessary here. Thank you.

    • @DagsHub · 6 months ago

      Hey @arynh1, thanks for the question. Sure you can go through the process of self hosting everything, building access controls, a shared UI for DVC (since that won't be shown in the central host), etc. Or you can use DagsHub. We actually have a blog about it. dagshub.com/blog/how-to-build-a-full-mlops-solution-for-computer-vision-using-oss-2/ It's not impossible to build it yourself, but probably not the best use of your team's time. Hope that's a helpful explanation.

  • @curdyco · 7 months ago

    The default endpoint is "localhost:5000/invocations". What if the model is registered on a remote server like DagsHub? What will the endpoint be then?
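
For readers with the same question, here is a minimal sketch of how a request to an MLflow scoring server can be put together. It assumes the MLflow 2.x JSON request format (`dataframe_split`); the helper name `build_invocation_request` and the `image` column are hypothetical and not taken from the video. Only the base URL changes between a local server and a remote one, and the actual remote host depends on where the model is deployed, which this thread does not specify.

```python
import json
from urllib import request


def build_invocation_request(base_url: str, rows: list[dict]) -> request.Request:
    """Build (but do not send) a POST request for an MLflow scoring server.

    Assumption: the server accepts the MLflow 2.x "dataframe_split" JSON format.
    """
    columns = list(rows[0].keys())
    payload = {
        "dataframe_split": {
            "columns": columns,
            "data": [[row[c] for c in columns] for row in rows],
        }
    }
    return request.Request(
        url=base_url.rstrip("/") + "/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Local default mentioned in the comment; the "image" column is a placeholder.
req = build_invocation_request("http://localhost:5000", [{"image": "<base64 string>"}])
print(req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) only works once a model server is actually listening at that address.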

  • @curdyco · 7 months ago

    You are using mlflow models build-docker to create the container, but where is the REST API code? How can I customise the interface for the user?

    • @DagsHub · 7 months ago

      Can you clarify what you mean? I'm not sure I understand the question

    • @curdyco · 7 months ago

      @DagsHub You said we can build a Docker container using the command I mentioned above, but that container will not have app.py; only the model is inside it. What about a web interface, with Flask or something? What about the Dockerfile?

  • @curdyco · 7 months ago

    Where did your terminal come from? Is it on your laptop, or is it in DagsHub? Do I pull the repo to my PC and then run the terminal?

  • @curdyco · 7 months ago

    In the conda env, why did you only use mlflow, pillow and tf? Is that because register_pyfunc was using them? But register_pyfunc was also using io and base64.

    • @DagsHub · 7 months ago

      io and base64 are part of the standard library that ships with Python, so you don't need to list them.

    • @curdyco · 7 months ago

      @DagsHub Okay, and in the conda env we add only the libraries that we are using in register_pyfunc?

  • @curdyco · 7 months ago

    Why are you using the dvc command? Is there a detailed tutorial for that?

  • @jees__antony · 7 months ago

    👍👍👍

  • @MSalman1 · 7 months ago

    GPT: Generative Pre-trained Transformer

  • @staticalmo · 7 months ago

    Can you please re-explain "No free infrastructure as code tools"? It sounds like a trap.

    • @DagsHub · 7 months ago

      IaC tools are tools like Terraform, CloudFormation, or CDK. Most of them have prerequisites that make them impossible to use without paying the vendor behind them. Does that answer the question?

    • @staticalmo · 7 months ago

      @DagsHub No, does AWS make Lambda a trap because of S3?

    • @DagsHub · 7 months ago

      @staticalmo That's one way to look at it. It's more like, they structure it so that you can't easily use Lambda for free even though they have a free tier, because you need to use S3.

    • @PavloFesenko · 7 months ago

      @staticalmo When the AWS Serverless Application Model (SAM) tool deploys Lambda infrastructure code, it first creates a CloudFormation changeset and always stores it in an S3 bucket. So although the SAM tool is free to use, this last deployment step isn't free and you will need to pay a small fee for S3. Other AWS deployment tools like the Cloud Development Kit (CDK) use the same approach, so there is no way to avoid paying for S3 as far as I know. Terraform also creates a similar state file, but it stores it either locally or in Terraform Cloud (with a nice free tier). The latter is especially useful when collaborating in a team. Of course, you can deploy everything manually using the AWS interface, and that's OK for a quick prototype, but if you want to run a CI/CD pipeline, then infrastructure as code is the only option.

  • @sabaokangan · 8 months ago

    Thank you so much for sharing this with us on YouTube

    • @DagsHub · 7 months ago

      Thank you for listening :)

  • @ncroc · 8 months ago

    It would be good if there was a comparison with Huggingface.

    • @PavloFesenko · 7 months ago

      HuggingFace is amazing and I totally forgot to mention it as an alternative. If you push your model to HuggingFace and make it public, then you can use their Inference API with an Intel Xeon CPU for free. If you want to keep it private or use GPUs, then you will need to use their paid Inference Endpoints. In both cases HuggingFace offers a really nice Python client. For AWS Lambda you can get a maximum of 6 CPU cores, which is comparable to a lower-end Intel Xeon.

  • @kaplandjk · 8 months ago

    Jason got on my radar about a week ago. Brilliant stuff.

    • @DagsHub · 8 months ago

      Thanks for the kind words :)

    • @xspydazx · 5 months ago

      Yes, Instructor allows for the structure to call a function (just rip it out of the response first). In the validation process you can call the function and return the response. It's a great process, but after a few months of this slow process we can upload the logs to be used as training data to train the model to perform this. As he said, we are not reinventing the wheel, as even agents are 50 years old and we have been using them for the same length of time!

  • @not_a_human_being · 9 months ago

    Why on earth would Label Studio be triggering model re-training?

    • @DagsHub · 9 months ago

      The typical use case is for non-technical stakeholders to automatically trigger training when they are done. It's probably not a common use case, but we've actually seen customers require this.

    • @not_a_human_being · 9 months ago

      @DagsHub Thank you for your response, I believe you have a great product!

    • @DagsHub · 9 months ago

      @not_a_human_being Thank you for the kind words! You just made our day :)

  • @GiftyPokuaa-c9n · 11 months ago

    Thank you for making this video. Is the directory the name of your project on DagsHub, or could it be any folder on the desktop?

    • @DagsHub · 11 months ago

      It can be any folder on your computer, but we recommend using the same name so it's easier to understand the relationship between your DagsHub project and the local path it's stored in.

  • @irfankhan-kc9gw · 1 year ago

    How do I import annotated images into Label Studio? Please make a video.

    • @DagsHub · 1 year ago

      Here's a blog about that topic: dagshub.com/blog/convert-annotations-to-dagshub/ Does it answer your questions?

  • @haydenbianca7532 · 1 year ago

    Promo SM

  • @joelbhaskarnadar7391 · 1 year ago

    Superb Work

  • @joelbhaskarnadar7391 · 1 year ago

    👌🏿

  • @EmilioGagliardi · 1 year ago

    Really great content, but almost impossible to follow along with that audio. Please consider redoing it with quality audio. Cheers

  • @adilgun2775 · 1 year ago

    Already dead

  • @sonny12681 · 1 year ago

    Can Wombo Dream create comic strips too?

  • @bibinkunjumon5998 · 1 year ago

    She did a damn great job. It really takes a lot of hands-on experience to have this kind of confidence.

  • @j_r28 · 1 year ago

    Democratizing AI. Keep up the good work, and thank you for bringing this. Can't thank you enough :)

  • @ganbayards · 1 year ago

    There is no access to the notebook link.

    • @DagsHub · 1 year ago

      @Granbayar, we tested the notebook link and it works from incognito mode, can you try again?

  • @markcuello5 · 1 year ago

    HELP

    • @DagsHub · 1 year ago

      Hey Mark, how can we help you?

  • @NickWindham · 1 year ago

    Love it. Julia and these guys deserve so much more positive attention.

  • @ChrisLovejoy · 1 year ago

    Really great video, thanks so much

    • @ChrisLovejoy · 1 year ago

      can't believe this only has 800 views :P Laszlo dropping nuggets of gold

  • @R-Kannada-DevOps · 1 year ago

    A practical session on this would be more helpful.

    • @deanp.2166 · 1 year ago

      There is also this hands-on DVC session: ua-cam.com/video/8I0jMEs470o/v-deo.html

  • @thantzinoo938 · 1 year ago

    Thanks, amazing podcast, enjoyed it a lot!

  • @imflash217 · 1 year ago

    How was the data labeled? Was each of the 12 images annotated with its respective prompt text? Or is it fully unsupervised, taking in only images and ChatGPT-generated text?

  • @sahilkadu9679 · 1 year ago

    I am facing an error while installing dvc... It's showing permission denied ☹

    • @DagsHub · 1 year ago

      Hey Sahil, thanks for the heads up, please visit our discord community for further assistance: discord.gg/pk22NradY4

  • @alexportugal3986 · 1 year ago

    Dude, the video is nice and all, but the audio kills it completely. A mic is like 20 bucks and it would be a world changer, not only in videos but also for when you are in calls.

  • @temesgengeta3629 · 1 year ago

    How can we label ruby-like text? We tried to label ruby text. However, the text area would not accept ruby base and ruby text.

  • @avinashmangipudi4203 · 1 year ago

    Very interesting conversation. Thanks

  • @pratyushpattnaik2960 · 1 year ago

    That's really cool!!

  • @kingabzpro · 1 year ago

    But why does PayPal have so many false positives and false negatives if they started that early in ML?

    • @DagsHub · 1 year ago

      I think the answer is that they are solving a really hard problem, so you’re actually seeing how hard it is that even their really good algorithm is very far from perfect

  • @TheAIEpiphany · 1 year ago

    Great conversation guys!

  • @mertbozkir · 1 year ago

    2 Fantastic people in one video! 😍

  • @jordantheman25 · 1 year ago

    Interesting podcast!

  • @matthewthompson6159 · 1 year ago

    👉Work tirelessly to be as lazy as possible.

  • @houssemabdelkefi1405 · 1 year ago

    thanks

  • @temesgengeta3629 · 1 year ago

    Interesting! Thank you for everything. How can we annotate manuscripts that contain ruby text (ruby, ruby base) using Label Studio and DagsHub? We are finding it a challenge to write transcriptions of ruby text.

    • @DagsHub · 1 year ago

      If you push the files you need to annotate to your DagsHub repo, you can select them when opening a Label Studio instance and they will be synced and ready for labeling. Feel free to join our discord if you have further questions: discord.gg/28K23sd8R4

  • @aespar1 · 1 year ago

    Excellent show! Thank you for sharing

  • @LoganKilpatrickYT · 1 year ago

    Thanks again for having me! This was so much fun to do.