How to Self-Host an LLM | Fly GPUs + Ollama

  • Published 26 Nov 2024

COMMENTS • 16

  • @MrManafon
    @MrManafon 3 months ago

    These videos are really cool. I'm not a beginner, far from it, but it is soooo nice to get this information in such a distilled manner, and from a person that clearly knows what they are talking about. So natural!

    • @jjbetspro
      @jjbetspro 3 months ago +1

      And she has a great personality.😂

  • @יהוידעשטינמץ
    @יהוידעשטינמץ 3 months ago

    I see a lot of explainer videos and yours are the best! Just great content delivery and tone, perfection all around!

  • @sidiocity
    @sidiocity 2 days ago

    Can we use any Llama-based model? At the destination, can we use the LLM we have downloaded? I mean a custom LLM based on Llama?

  • @miro016
    @miro016 2 months ago

    Can you provide the WireGuard instructions you mentioned? Btw, perfect tutorial :)

  • @thedavymac
    @thedavymac 3 months ago

    That looks like a nice way to run an LLM for my personal use, but I’d like to also try out one of the bigger LLM models.
    Is that doable at all?
    Or will I need to stick to models that fit within the 40gb GPU memory of the a100 for instance?

    • @flydotio
      @flydotio 3 months ago +1

      How big are you talking about? Generally, the amount of VRAM you need is parameter count times parameter size in bits, divided by 8 to get bytes, plus 20%. Check this short for more info: ua-cam.com/users/shortstCE-awsKmmg
      In general though:
      * 13b or lower: any GPU works, no caveats
      * 30b or lower: any GPU works, but you need at least Q8 or FP8 quantization
      * 70b or lower: use the a100-80gb or the L40s
      * greater than 80b: it depends, if you're lucky it'll work on one GPU, if not then you'll need to use multiple GPUs
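The rule of thumb in the reply above (parameters × bits per parameter ÷ 8, plus 20% overhead) can be sketched as a quick calculation. The 20% overhead and the GPU fit comments are the commenter's heuristic, not exact figures:

```python
def vram_gb(params_billion: float, bits_per_param: int, overhead: float = 0.20) -> float:
    """Rough VRAM estimate in GB: weights (params * bits / 8 bytes) plus overhead."""
    weight_bytes = params_billion * 1e9 * bits_per_param / 8
    return weight_bytes * (1 + overhead) / 1e9

# 13b at FP16 (16 bits/param): fits the a100-40gb with room to spare
print(round(vram_gb(13, 16)))   # ~31 GB
# 70b at Q8 (8 bits/param): just over one a100-80gb
print(round(vram_gb(70, 8)))    # ~84 GB
# 70b at FP16: well past any single GPU listed above
print(round(vram_gb(70, 16)))   # ~168 GB
```

This matches the tiers in the reply: quantization halves (FP16 → Q8) or quarters (FP16 → Q4) the bits per parameter, which is why a 30b model needs Q8 or FP8 to fit on a 40 GB card.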

  • @elias8294
    @elias8294 3 months ago

    Cool vid, thanks!

  • @oskrm
    @oskrm 3 months ago +1

    ollama run llama3 why is fly cool?

  • @mikeeomega
    @mikeeomega 3 months ago

    Great video 👌👌

    • @flydotio
      @flydotio 3 months ago

      Thank you 👍

  • @hassenalhandy4720
    @hassenalhandy4720 22 days ago

    I don't understand — how is this self-hosting? Isn't this cloud hosting?

  • @TheloniousBird
    @TheloniousBird 3 months ago

    Hey, I tried setting this up but I have this error:
    2024-08-24T00:27:36.386 runner[***] ord [info] Machine started in 3.517s
    2024-08-24T00:27:37.133 app[***] ord [info] INFO Main child exited normally with code: 1
    2024-08-24T00:27:37.152 app[***] ord [info] INFO Starting clean up.
    2024-08-24T00:27:37.266 app[***] ord [info] INFO Umounting /dev/vdc from /root/.ollama
    2024-08-24T00:27:37.268 app[***] ord [info] WARN could not unmount /rootfs: EINVAL: Invalid argument
    2024-08-24T00:27:37.269 app[***] ord [info] [ 3.718685] reboot: Power down
    any ideas on what would cause this?

    • @TheloniousBird
      @TheloniousBird 3 months ago

      I got it — I had to play around with the memory sizes

    • @dareljohnson5770
      @dareljohnson5770 2 months ago

      @@TheloniousBird What memory size? Explain?

    • @TheloniousBird
      @TheloniousBird 2 months ago

      @@dareljohnson5770 in the fly.toml, under vm -> memory, I had to set it to 16 where it was originally set to 8
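For reference, the fix described above corresponds to the `[[vm]]` section of `fly.toml`. A minimal sketch — the GPU size value is an assumption and should match whatever Fly GPU kind the app actually uses:

```toml
# fly.toml — raise VM memory so Ollama can load the model.
# Too little RAM can make the main process exit with code 1 right after boot,
# as in the logs above.
[[vm]]
  size = "a100-40gb"   # example GPU preset; use your actual machine size
  memory = "16gb"      # was "8gb"
```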