Deploy Llama 2 on AWS SageMaker using DLC (Deep Learning Containers)

  • Published 21 Aug 2024
  • In this tutorial video, I'll show you how to effortlessly deploy the Llama 2 large language model on AWS SageMaker using Deep Learning Containers (DLC). We'll walk through each step, from accessing pre-built DLC images to configuring SageMaker for Llama 2 deployment, designed to make the process smooth and understandable whether you're new to Generative AI or experienced in the field. (A minimal deployment sketch follows the links below.)
    AWS SageMaker DLC: github.com/aws...
    AI Anytime GitHub: github.com/AIA...
    #ai #llm #python
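
    A minimal sketch of the kind of deployment the video walks through, using the SageMaker Python SDK's Hugging Face LLM (TGI) DLC helpers. The model ID, instance type, and environment values are illustrative assumptions rather than exact values from the video; the non-gated NousResearch/Llama-2-7b-chat-hf mirror is used here, as suggested in the comments below, to avoid the gated-repo issue.

        # Sketch: deploy Llama 2 7B Chat on a SageMaker real-time endpoint via the Hugging Face LLM DLC.
        # Assumes the sagemaker SDK is installed and the code runs with a SageMaker execution role available.
        import sagemaker
        from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

        role = sagemaker.get_execution_role()                     # execution role with SageMaker + S3 access
        image_uri = get_huggingface_llm_image_uri("huggingface")  # pre-built Hugging Face LLM DLC image

        env = {
            "HF_MODEL_ID": "NousResearch/Llama-2-7b-chat-hf",  # non-gated Llama 2 7B Chat mirror
            "SM_NUM_GPUS": "1",                                # number of GPUs to shard across
        }

        model = HuggingFaceModel(role=role, image_uri=image_uri, env=env)

        predictor = model.deploy(
            initial_instance_count=1,
            instance_type="ml.g5.2xlarge",                   # assumption: a single-GPU instance
            container_startup_health_check_timeout=600,      # allow time to download model weights
        )

        # Quick smoke test against the deployed endpoint.
        print(predictor.predict({"inputs": "What is Amazon SageMaker?"}))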

COMMENTS • 42

  • @ashleymavericks
    @ashleymavericks 1 year ago +1

    Waiting for GGML quantised model deployments. Btw, thanks for your videos.

  • @49_jaypandya40
    @49_jaypandya40 4 months ago

    the content is amazing

  • @yashsrivastava4878
    @yashsrivastava4878 5 months ago +1

    Thank you. Can you please make a video on how to fine-tune Mistral 7B on AWS SageMaker with S3 and boto3 (in the form of async jobs)?

  • @shumon29
    @shumon29 11 months ago +2

    I am not able to find the gists. The attached repository has only a LICENSE and README file. Could you please share the repo or gist links?

  • @dchuguashvili
    @dchuguashvili 11 months ago +2

    What is the advantage, if any, of using this approach instead of deploying the Llama 2 model directly from SageMaker JumpStart?

    • @Digitalsmb
      @Digitalsmb 11 months ago

      Would love to know the answer to this too.

  • @user-iu4id3eh1x
    @user-iu4id3eh1x 1 year ago

    So simple.... Thank you

  • @danielmz99
    @danielmz99 1 year ago +3

    Hi, thanks for your videos. Would it be possible to get a video on GGML models being deployed on SageMaker? It is unclear what requirements they have. The fact that they are CPU optimized will help adoption, as many small businesses can't really afford the $40/day hosting cost of a g5.2x LLM plus running costs if all they need is a private LLM. Local deployment might not be an option either, since getting a decent outcome needs a 13B+ model, and even as GGML that requires significant dedicated hardware. I see private cloud GGML deployments as the perfect compromise between cheap running costs and decent functionality for a very large number of use cases. I think it would be a great video. Thanks for your efforts.

    • @AIAnytime
      @AIAnytime 1 year ago +3

      On GGML deployment: soon... Please stay tuned.

    • @ashleymavericks
      @ashleymavericks 1 year ago

      I totally resonate with your viewpoint; I'm exploring similar possibilities for a low-cost setup.

    • @ashleymavericks
      @ashleymavericks 1 year ago +1

      @AIAnytime It would be great if you could deploy a GGML model on AWS compute instances with a REST API compatible with the OpenAI specification (you could leverage the LocalAI project).

  • @sohailhosseini2266
    @sohailhosseini2266 11 months ago

    Thanks for sharing!

    • @AIAnytime
      @AIAnytime 11 months ago

      Thanks for watching!

  • @avijit_barua
    @avijit_barua 1 year ago

    very helpful video!

  • @kaarthikandu
    @kaarthikandu 1 year ago

    Can we use spot instances when deploying the models? Have you tried?

    • @AIAnytime
      @AIAnytime 11 months ago

      You can, but spot instances can be interrupted.

  • @amangrover9343
    @amangrover9343 11 months ago

    I am getting the error "RuntimeError: weight model.layers.0.self_attn.rotary_emb.inv_freq does not exist" while using the Phind/Phind-CodeLlama-34B-v2 model.

  • @mohammadkashif6072
    @mohammadkashif6072 1 year ago +1

    What IAM roles to assign for the first time in AWS SageMaker?

    • @AIAnytime
      @AIAnytime 1 year ago

      SageMaker full access
      S3 full access
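
      A hedged sketch of setting up such a role with boto3, following the answer above. The role name and trust policy are illustrative assumptions, and in practice narrower permissions than the two full-access managed policies are preferable.

          # Sketch: create a SageMaker execution role and attach the two managed policies mentioned above.
          import json
          import boto3

          iam = boto3.client("iam")

          # Trust policy letting the SageMaker service assume this role.
          trust_policy = {
              "Version": "2012-10-17",
              "Statement": [{
                  "Effect": "Allow",
                  "Principal": {"Service": "sagemaker.amazonaws.com"},
                  "Action": "sts:AssumeRole",
              }],
          }

          role = iam.create_role(
              RoleName="sagemaker-llama2-execution-role",   # illustrative role name
              AssumeRolePolicyDocument=json.dumps(trust_policy),
          )

          for policy_arn in (
              "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
              "arn:aws:iam::aws:policy/AmazonS3FullAccess",
          ):
              iam.attach_role_policy(
                  RoleName="sagemaker-llama2-execution-role",
                  PolicyArn=policy_arn,
              )

          print(role["Role"]["Arn"])  # use this ARN as the SageMaker execution role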

  • @sravantipris3544
    @sravantipris3544 3 months ago

    Is a GPU required, or can it run on CPU only?

  • @rohitleo9712
    @rohitleo9712 3 months ago

    Hi, can we do this for summarization purposes?

  • @Ankur-be7dz
    @Ankur-be7dz 11 months ago

    While we use the Hugging Face token and secret key, does Hugging Face charge us money? Or is it free?

    • @AIAnytime
      @AIAnytime 11 months ago +1

      No, they don't charge; it's free. They do have an API rate limit, but for you it won't be a problem. Feel free to use it.
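
      To show where that (free) token actually goes in this workflow: if you deploy a gated repo such as meta-llama/Llama-2-7b-chat-hf instead of a non-gated mirror, the token is typically passed to the container as an environment variable. A small sketch, assuming the Hugging Face LLM DLC setup from the description at the top of the page; the token value is a placeholder.

          # Sketch: container environment for a gated Hugging Face model.
          env = {
              "HF_MODEL_ID": "meta-llama/Llama-2-7b-chat-hf",  # gated repo: requires accepted license + token
              "HUGGING_FACE_HUB_TOKEN": "hf_xxx",              # free read token from your Hugging Face account
              "SM_NUM_GPUS": "1",
          }
          # Pass this env dict to HuggingFaceModel(..., env=env) as in the deployment sketch above.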

  • @efexzium
    @efexzium 9 months ago

    How can we deactivate this endpoint?
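
    A common way to shut down and clean up a SageMaker real-time endpoint, sketched here as one possible answer; the endpoint name is a placeholder, and deleting the endpoint stops the billing for the underlying instance.

        # Sketch: tear down a SageMaker endpoint and its associated resources with boto3.
        import boto3

        sm = boto3.client("sagemaker")
        endpoint_name = "llama2-endpoint"  # placeholder: your actual endpoint name

        sm.delete_endpoint(EndpointName=endpoint_name)                # stops the instance and billing
        sm.delete_endpoint_config(EndpointConfigName=endpoint_name)   # SDK deployments usually reuse the endpoint name
        # sm.delete_model(ModelName=...)  # optionally remove the model object as well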

  • @karamjittech
    @karamjittech 1 year ago

    Awesome video. But how can we fine-tune and use a RAG approach?

    • @AIAnytime
      @AIAnytime 1 year ago +4

      Coming soon... Will use the same deployed LLM for a RAG-based application.

  • @user-ie9hr5sl8h
    @user-ie9hr5sl8h 1 year ago

    Can you show how to do it on AWS EC2 instances?

  • @PrasadPrasad-hi7pl
    @PrasadPrasad-hi7pl 1 year ago

    Could you please make a tutorial on deploying a chatbot for PDF files using SageMaker? Thank you in advance.

    • @AIAnytime
      @AIAnytime 1 year ago +2

      Yes, I will use the same deployed model for this use case. This will be my next two videos: next will be the Lambda function and API Gateway, and then the chatbot for your knowledge base.

  • @VenkatesanVenkat-fd4hg
    @VenkatesanVenkat-fd4hg 1 year ago

    Highly appreciated, thanks for your videos. I have got an error:
    "AWS SageMaker Endpoint Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. See the CloudWatch logs." This happens even though I have run the same code as in the Hugging Face deploy video for Llama 2 7B, but Falcon 7B runs fine. Any help...

    • @AIAnytime
      @AIAnytime 1 year ago +1

      Thank you! The issue is the gated model... Can you use this model: NousResearch/Llama-2-7b-chat-hf? It's the same but not gated... This should deploy fine.

    • @VenkatesanVenkat-fd4hg
      @VenkatesanVenkat-fd4hg 1 year ago

      @@AIAnytime Thanks for your kind response. I have successfully deployed 7B just today, but 13B needs an AWS quota increase... (I found the related error.) Can I try a quantized version of 13B without the AWS quota problem? Kindly reply...

    • @mydsworld3130
      @mydsworld3130 1 year ago

      @@VenkatesanVenkat-fd4hg It's throwing the same error you wrote about before (for the 7B model). I am not able to figure it out. Can you please share how you figured it out?

    • @VenkatesanVenkat-fd4hg
      @VenkatesanVenkat-fd4hg 1 year ago

      @@mydsworld3130 Check the CloudWatch logs...
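
      For the "check the CloudWatch logs" step: SageMaker writes each endpoint's container logs to a log group named /aws/sagemaker/Endpoints/<endpoint-name>. A small boto3 sketch for reading them; the endpoint name is a placeholder.

          # Sketch: print the endpoint container logs that the health-check error points to.
          import boto3

          logs = boto3.client("logs")
          group = "/aws/sagemaker/Endpoints/llama2-endpoint"  # placeholder endpoint name

          for stream in logs.describe_log_streams(logGroupName=group)["logStreams"]:
              events = logs.get_log_events(logGroupName=group, logStreamName=stream["logStreamName"])
              for event in events["events"]:
                  print(event["message"])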

  • @jayasuriyap8748
    @jayasuriyap8748 7 months ago

    Kindly make a video on how to deploy in Azure.