Deploy LLMs using Serverless vLLM on RunPod in 5 Minutes

  • Published Sep 30, 2024

COMMENTS • 14

  • @udaykiran2053 2 months ago +3

    Since you're using a Llama model, why is the OpenAI package installed to test it in the Colab notebook? Can you explain?

    • @nishalk781 1 month ago

      I think he's using the openai client library for its features: the module supports streaming, which makes things easier if you need to receive the text as chunks instead of the entire response at once.
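
      A minimal sketch of that pattern, assuming RunPod's OpenAI-compatible route and placeholder values for the API key, endpoint ID, and model name:

      from openai import OpenAI

      # RunPod's serverless vLLM worker exposes an OpenAI-compatible API,
      # so the openai package works as a drop-in client. The endpoint ID
      # and model name below are placeholders.
      client = OpenAI(
          api_key="YOUR_RUNPOD_API_KEY",  # a RunPod key, not an OpenAI key
          base_url="https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/openai/v1",
      )

      # stream=True yields the reply in chunks rather than one blob,
      # which is the feature the reply above refers to.
      stream = client.chat.completions.create(
          model="meta-llama/Meta-Llama-3-8B-Instruct",
          messages=[{"role": "user", "content": "Explain vLLM in one sentence."}],
          stream=True,
      )
      for chunk in stream:
          delta = chunk.choices[0].delta.content
          if delta:
              print(delta, end="", flush=True)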

  • @SohamBasu-b1x 27 days ago

    Can we set automated pause and resume on RunPod endpoints? Like, I want it to run for 3 hours per day in the morning. Can I set that up?

  • @Bluedrake42 1 month ago

    Finally a tutorial that isn't awful. Thank you for existing.

  • @matthewchung74 2 months ago +1

    Serverless on RunPod with a bigger model, like Llama 70B on multiple GPUs, would be awesome!
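
    For reference, a hedged sketch of how vLLM shards a large model across GPUs with tensor parallelism; the model ID and GPU count are illustrative (70B weights in fp16 are roughly 140 GB, so several large GPUs are needed):

    from vllm import LLM, SamplingParams

    # tensor_parallel_size shards the model weights across GPUs;
    # 4 is illustrative, sized for e.g. 4x 80 GB A100/H100 cards.
    llm = LLM(
        model="meta-llama/Meta-Llama-3-70B-Instruct",  # assumed model ID
        tensor_parallel_size=4,
    )

    params = SamplingParams(temperature=0.7, max_tokens=128)
    outputs = llm.generate(["What makes tensor parallelism work?"], params)
    print(outputs[0].outputs[0].text)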

  • @jamesalxl3636 1 month ago

    I'm trying to run a 70B uncensored model. Will that be possible with this method?

  • @premierleaguehighlights9061 2 months ago

    Can I use DeepFaceLab on RunPod?

  • @frag_it 2 months ago

    Bro, do one for Azure Kubernetes with vLLM.

    • @AIAnytime 2 months ago

      Coming soon

    • @frag_it 2 months ago

      @AIAnytime Make sure you do an in-depth guide; it would be awesome to learn how to apply Llama 3.1 405B on it. You could even make it a longer playlist. People would go crazy over it.

  • @shekharkumar1902 2 months ago

    Sounds like a web promotion. Please create a video with an agentic use-case example using free-of-cost LLMs on a local computer.