Turn Your AI Model into a Real Product (Amazon SageMaker, API Gateway, AWS Lambda, Next.js, Python)

  • Published Oct 3, 2024

COMMENTS • 17

  • @ricand5498
    @ricand5498 4 months ago +1

    Amazing content!! You deserve more subscribers. This is exactly the kind of tutorial I've been looking for, but NO ONE EXPLAINS THIS. There are hundreds of videos on deploying open-source LLMs locally, but almost no in-depth, high-quality info on deploying to remote servers on AWS, especially for the masses. You earned a new subscriber with this one, keep making great tutorials!

  • @RatherBeCancelledThanHandled
    @RatherBeCancelledThanHandled 3 months ago

    Thanks for sharing ❤️👍

  • @mmzzzmeemee
    @mmzzzmeemee 1 year ago +1

    Underrated, this is fire content!

    • @BrianHHough
      @BrianHHough 1 year ago

      So glad you liked this video!! Thanks so much for the kind words and support! 🤩🔥💯🙏

  • @gilbertyoungjr4898
    @gilbertyoungjr4898 1 year ago +1

    Let's goo. Keep pushing content. Great stuff brother.

    • @BrianHHough
      @BrianHHough 1 year ago

      Thanks so much, bro!! 🤩🔥 This was a really fun video to put together and I learned a ton in the process! 💡

  • @faizamalik8298
    @faizamalik8298 5 months ago

    That's brilliant... I was looking for this!

  • @elad3958
    @elad3958 1 year ago

    Outstanding.

  • @tal7atal7a66
    @tal7atal7a66 1 year ago +1

    Thanks, very interesting bro ❤

    • @BrianHHough
      @BrianHHough 1 year ago

      Thanks so much for checking out this build! Really glad you enjoyed the AI/ML content! 🙏🔥

  • @lewdogpop
    @lewdogpop 1 year ago +1

    Let’s gooo

  • @finnsteur5639
    @finnsteur5639 1 year ago +1

    I'm trying to create 100,000 reliable tutorials for hundreds of complex software packages like Photoshop, Blender, DaVinci Resolve, etc. Llama and GPT don't give reliable answers, unfortunately. Do you think fine-tuning Llama 7B would be enough (compared to 70B)? Do you know how much time/data that would take?
    I also heard about embeddings but couldn't get them to work on a large dataset. Would that be a better option? We have at least 40,000 pages of documentation and I don't know what the better approach is.

    • @BrianHHough
      @BrianHHough 1 year ago +1

      Really interesting use case (lots to share on this below...👀)! LLaMA 13B (which I used in my tutorials) is pretty solid. Jumping to 70B might be overkill in terms of time and resources, especially if you're initially testing feasibility. I'd say test the waters with something smaller like 7B, or 13B like I used, and then decide. There's an inherent trade-off between model size and quality: LLaMA 70B will generally outperform LLaMA 7B due to its larger parameter count, but the improvements can be marginal beyond a certain point, and the cost in computation and time may be disproportionately higher for the 70B model. That's where 13B could be the happy medium for testing and use; once you've settled on the evaluations you want to run, maybe spin up a 70B build briefly and see if the performance is any different. Just keep costs in mind, of course!
      Related to embeddings - I've seen the debates too! RAG is awesome, but there are some quirks, especially when handling broad queries. Augmenting LLMs with RAG is particularly effective for specific tasks, but it comes with inherent challenges like the ones you described. It's all about how you chunk and index your data: make your tutorial content bite-sized to get the most out of RAG. RAG handles localized info retrieval well, but struggles with broader queries that require scanning the entire dataset, especially one as large as you're describing, in the 100,000s.
      Overall, I'd say start small, test it out, then scale. And with your 40k pages of docs, you've got a goldmine to work with! 💎 Please let me know how you get along with this! Curious to hear how it goes and what you build! 🛠
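The "chunk and index your data" advice in the reply above can be made concrete. Here is a minimal sketch of overlapping character chunking for a RAG pipeline; the function name, chunk size, and overlap are illustrative assumptions, not values from the video:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split a document into overlapping character chunks for embedding.

    The overlap keeps context that straddles a chunk boundary retrievable:
    a sentence cut in half by one chunk is still whole in its neighbor.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# A 1,200-character page with the defaults yields three overlapping chunks.
page = "x" * 1200
pieces = chunk_text(page)
```

In practice you would chunk on semantic boundaries (paragraphs, headings) rather than raw character counts, but the overlap idea carries over either way.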

  • @eltafhussain
    @eltafhussain 1 year ago

    Can you explain how many concurrent requests a g5.12xlarge instance can handle when using the LLaMA 2 7B or 13B model? What would be a solution for such scenarios?

    • @eltafhussain
      @eltafhussain 1 year ago

      I have used a smaller instance and am getting issues with multiple requests, as the instance memory is insufficient to handle them.
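One common mitigation for the memory/concurrency issue described above is SageMaker endpoint autoscaling, so additional instances come online under load instead of one instance absorbing every request. A sketch using boto3's Application Auto Scaling API; the endpoint name, variant name, capacities, and target value are illustrative assumptions, not from the video:

```python
def variant_resource_id(endpoint_name: str, variant_name: str = "AllTraffic") -> str:
    """Build the Application Auto Scaling resource ID for an endpoint variant."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"

def enable_autoscaling(endpoint_name: str, min_instances: int = 1, max_instances: int = 4) -> None:
    """Register a SageMaker variant so it scales out when invocations rise."""
    import boto3  # imported lazily so the pure helper above has no dependencies
    client = boto3.client("application-autoscaling")
    client.register_scalable_target(
        ServiceNamespace="sagemaker",
        ResourceId=variant_resource_id(endpoint_name),
        ScalableDimension="sagemaker:variant:DesiredInstanceCount",
        MinCapacity=min_instances,
        MaxCapacity=max_instances,
    )
    client.put_scaling_policy(
        PolicyName="invocations-target-tracking",
        ServiceNamespace="sagemaker",
        ResourceId=variant_resource_id(endpoint_name),
        ScalableDimension="sagemaker:variant:DesiredInstanceCount",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            # Scale so each instance averages ~5 invocations per minute;
            # tune this number against your own latency measurements.
            "TargetValue": 5.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    )
```

Scaling out helps with throughput, but it will not rescue a single request that already exceeds one instance's GPU memory; for that, a larger instance type or a quantized model is the lever.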

  • @rsalazar9784
    @rsalazar9784 1 month ago

    Amazon SageMaker is good for a few users, but when the number of users is 10,000 or 100,000 it is no longer useful.
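Whether that holds depends heavily on how the endpoint is configured. For bursty, high-volume workloads, one pattern is SageMaker Asynchronous Inference, which queues requests via S3 instead of holding connections open against the instance. A minimal sketch; the endpoint name and payload shape are hypothetical assumptions, not from the video:

```python
import json

def build_async_payload(prompt: str, max_new_tokens: int = 256) -> str:
    """Serialize a request body in the shape many Hugging Face LLM containers expect."""
    return json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}})

def invoke_async(endpoint_name: str, input_s3_uri: str) -> str:
    """Queue a request against an async SageMaker endpoint.

    Returns the S3 URI where the result will eventually land; the call
    itself does not block while inference runs, so thousands of requests
    can be queued without tying up connections.
    """
    import boto3  # lazy import: the payload helper above stays testable without AWS
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=input_s3_uri,  # request body must already be uploaded to S3
        ContentType="application/json",
    )
    return response["OutputLocation"]
```

Combined with autoscaling (including scale-to-zero on the async queue), this trades latency for throughput, which is often the right trade at the user counts mentioned above.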