Custom LLM Fully Local AI Chat - Made Stupidly Simple with NVIDIA ChatRTX

COMMENTS • 104

  • @gamefromscratch
    @gamefromscratch  4 months ago

    Links
    gamefromscratch.com/nvidia-chatrtx-easy-local-custom-llm-ai-chat/
    -----------------------------------------------------------------------------------------------------------
    *Support* : www.patreon.com/gamefromscratch
    *GameDev News* : gamefromscratch.com
    *GameDev Tutorials* : devga.me
    *Discord* : discord.com/invite/R7tUVbD
    *Twitter* : twitter.com/gamefromscratch
    -----------------------------------------------------------------------------------------------------------

  • @Theraot
    @Theraot 4 months ago +28

    I have used LM Studio; it will run on lower-end hardware, but expect very bad performance. You can try models that have been quantized, which will perform better but will be less precise (they can degenerate into random text). And I do not remember it having an easy way to reference files.
    About RAG, be aware that you want a model that has been trained on the general subject. That is: if a model is specialized in poetry, it probably won't do well with code even if you give it all the textbooks. Why? Because it is trying to rhyme; that is the pattern it learned.
    On the other hand, with the convenience of ChatRTX, you should be able to give it your project files - those you would not dare upload to an online AI - and have it give you results based on them, specific to what you are doing. And let that be another reason to put comments and choose good variable names: the better the context you can give the AI, the better.
    Finally, do not forget: garbage in, garbage out.
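The precision trade-off described in that comment can be illustrated with a toy round trip: shrinking floating-point weights to 8-bit integers saves memory but discards information. This is only a sketch of the idea, not how any particular runtime actually quantizes weights (real schemes are block-wise and more sophisticated):

```python
# Toy symmetric int8 quantization: scale weights into [-127, 127],
# round to integers, then map back. The round trip is 4x smaller than
# float32 but loses precision -- the same trade-off quantized LLM
# weights make to fit on smaller GPUs.

def quantize(weights):
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.0123, -0.9871, 0.5004, 0.4999]
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)
worst = max(abs(w - r) for w, r in zip(weights, restored))
print(worst)  # small, but nonzero: information was discarded
```

Note how 0.5004 and 0.4999 can land on nearby integer buckets; pile up enough of that error across billions of weights and you get the degraded output the comment warns about.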

  • @samwood3691
    @samwood3691 4 months ago

    The quote from HAL made me have to check this video. A local LLM is a cool idea. Regarding NVidia, they basically bailed on Linux, which really sucks, but hopefully that will not stop this from being made available on Linux soon.

  • @quantumgamer6208
    @quantumgamer6208 4 months ago

    Does it work with PyCharm code, like Python and Lua, for game and game engine development?

  • @josemartins-game
    @josemartins-game 4 months ago

    Turn off the internet. What does it respond?

    • @flrn84791
      @flrn84791 22 days ago +1

      The same. This doesn't use the internet...

  • @vi6ddarkking
    @vi6ddarkking 4 months ago +2

    Well, they've been stupidly simple for a couple of years now with WebUIs like Oobabooga.
    So this isn't exactly anything impressive.

  • @D3bugMod3
    @D3bugMod3 4 months ago +4

    Yo,
    I will definitely spend some time playing with this. I was using Chat to help me write a story & lore bible, but Chat can only remember so much before you have to start a new conversation.
    Not to mention Chat's constant need to equivocate over nuanced or political ideas; I spent so much time getting it to see holes in its logic. No doubt this system will still have issues, but at least I won't have to keep starting over.
    Thanks as always.

  • @TheRealAfroRick
    @TheRealAfroRick 4 months ago +4

    RAG does not go to the web. With retrieval-augmented generation, the local data you provide is embedded (converted to numerical form) and stored in a vector database of some sort. Then, when you make a request to the chatbot, your query is also embedded using the same method as the data you previously embedded. A semantic search is performed (generally with cosine distance between the vectors) and the relevant data is sent to the LLM in its context window so it can base its response on the content in your data. This is done specifically to prevent hallucinations by the LLM, since it has never seen the data in your documents.
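That retrieval flow can be sketched end to end. A real system uses a learned embedding model and a vector database; the bag-of-words vectors here are just a stand-in so the cosine-similarity search stays visible:

```python
# Minimal RAG sketch: embed documents, embed the query the same way,
# rank by cosine similarity, and paste the best match into the prompt.
import math
from collections import Counter

def embed(text, vocab):
    # Bag-of-words stand-in for a real embedding model.
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

docs = [
    "Godot uses the GDScript language for scripting",
    "Unreal Engine uses C++ and Blueprints",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})

query = "what scripting language does Godot use"
ranked = sorted(docs, key=lambda d: cosine(embed(d, vocab), embed(query, vocab)),
                reverse=True)

# The retrieved piece lands in the context window, not in the weights.
prompt = f"Answer using this context:\n{ranked[0]}\n\nQuestion: {query}"
print(prompt)
```

The key point from the comment holds even in this toy: the model never "learns" your files; the best-matching text is simply prepended to the question at answer time.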

    • @Matlockization
      @Matlockization 4 months ago

      'Vector database' sounds like the cloud, or 3rd-party data-mining operatives who are only too happy to pay for the privilege. People also have to understand that one of the AIs here is linked to Meta, which is owned by Mark Zuckerberg, who is famous for sharing people's data.

  • @rob679
    @rob679 4 months ago +6

    I use the Linux version of Ollama through WSL with OpenWebUI as the frontend; it already has RAG functionality, and everything is installed with basically 3 commands. Llama 3 8B works great, and I can hook it into VS Code through the Continue extension and have a personal local Copilot.

    • @13thxenos
      @13thxenos 4 months ago

      Came to comment something similar.
      Now if OpenWebUI adds the functionality to fine-tune models further...

    • @rewindcat7927
      @rewindcat7927 4 months ago

      Is there a good resource for a smooth-brain to get started on this track? Thanks!!! 🙏

    • @TrolleyTrampInc
      @TrolleyTrampInc 4 months ago

      @@rewindcat7927 NetworkChuck has recently done a video explaining everything. Set up Ollama and then simply install the Continue extension in VS Code.

    • @jimmiealencar7636
      @jimmiealencar7636 4 months ago

      Would it run well with a 6GB RTX?

    • @rob679
      @rob679 4 months ago

      @@jimmiealencar7636 Yes, but unless you run natively under Windows, you also need 16GB of system RAM. Llama 3 8B uses about 3.5GB of VRAM on my 3050. And if everything fails, you can always run it on CPU only, but it will be slow.

  • @refractionpcsx2
    @refractionpcsx2 4 months ago +1

    Can confirm this does *not* install on a 2000-series RTX card. Tried it on my 2080 Ti and the installer says nope.

  • @OriginRow
    @OriginRow 4 months ago +4

    How can I fetch the Unreal Engine docs as a PDF?
    🤔

    • @UltimatePerfection
      @UltimatePerfection 4 months ago

      Getleft or another website downloader, and then an HTML-to-PDF converter.

    • @OriginRow
      @OriginRow 4 months ago

      @@UltimatePerfection
      Recently they moved the docs to forums LMAO 🥵
      It's not working.

  • @francompagnie
    @francompagnie 22 days ago

    At this point it is still a demo. It does not remember the context (your previous questions) or learn anything in the long run. It's just a kind of smart explorer... It can speak your language. It can go over the internet, e.g. to make a summary of a website page (more or less accurate & buggy). It says it can access YouTube videos (to provide information about them), but it can't find the video you are asking for, only some other random video (?), so it's useless for now.

  • @mascot4950
    @mascot4950 4 months ago +2

    My experience is that these small models fall apart really quickly, especially when it comes to generalized questions. For programming, they seem to do a bit better, but the difference is still quite noticeable if you ask small and large models the exact same question.
    The first "oh, hey, this actually feels pretty close to at least ChatGPT 3.5" for me was Llama 3 70B, clocking in at 42GB in size. I can only fit about half of that on my GPU, and with the rest running on CPU it's pretty slow. Like 2 tokens per second.
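The 42GB figure checks out with simple arithmetic: model size is roughly parameter count times bits per weight. The 4.8 bits per weight used below is an assumed mid-range quantization level, not a value from the video:

```python
# Back-of-envelope model file size: parameters x bits per weight / 8 bits per byte.
# With params counted in billions, the result is already in GB.

def model_size_gb(billions_of_params, bits_per_weight):
    return billions_of_params * bits_per_weight / 8

print(model_size_gb(70, 4.8))  # ~42 GB, matching the file the comment mentions
print(model_size_gb(70, 16))   # ~140 GB unquantized at fp16
print(model_size_gb(8, 4.8))   # ~4.8 GB -- why an 8B model fits a small card
```

The same arithmetic explains the "half on GPU, half on CPU" slowdown: whatever doesn't fit in VRAM has to stream through much slower system memory every token.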

  • @youMEtubeUK
    @youMEtubeUK 4 months ago +1

    I have this, and while it runs nicely with my 4090, I still use online tools for PDFs and general research. Understandable if you want to keep files private, but with a Chrome extension I can use all the main AI platforms across multiple devices. Also found Gemini Pro 1.5 better for large 700-page PDFs.

  • @micmacha
    @micmacha 4 months ago +8

    As a man who has countless useful EPUBs and PDFs, this looks very useful. I especially like that it will give you a list of its sources; not exactly a full citation yet, but very usable. However, I'm not terribly keen on it being Windows-only, and it's asking for a hefty graphics card and a lot of disk space for something I can do by hand. I think this is good news, and it shows that Nvidia is, if haphazardly, listening to the real concerns with LLMs.
    Otherwise it's becoming an extremely tired subject.

    • @MurphyArtPrints
      @MurphyArtPrints 4 months ago

      What's your primary source for said PDFs and files? I need to start building a collection with the way things are going.

    • @micmacha
      @micmacha 4 months ago +1

      @@MurphyArtPrints Oh, I've scanned a number of them, and many others are from independent epub sellers like Humble Bundle and a few (legal) torrents. I'm with you on proprietary ebook viewers; it may be more durable and portable than paper but you never know when someone's going to pull the plug.

    • @flrn84791
      @flrn84791 22 days ago

      @@micmacha Not sure how scanned PDFs are gonna perform... I don't know if this does OCR; I'd be surprised if it did. Also, what did you mean by "something I can do by hand"?

  • @aa-xn5hc
    @aa-xn5hc 4 months ago

    That is a bad app. For example, it cannot take the previous chat into account when answering a follow-up question.

  • @0AThijs
    @0AThijs 4 months ago +1

    No API, no custom model loading, just a simple UI, no updater... (I have already downloaded it three times to update it, each time ~30GB. Yes, 30GB FOR MISTRAL!)

  • @a.aspden
    @a.aspden 4 months ago +1

    You mentioned Copilot. Does this work as well as Copilot if you give it your code folder to train on?

  • @tmanook
    @tmanook 4 months ago +1

    Interesting usage for AI. Seems like it could be handy. For me, I really want an easier time localizing my game. I still need to figure out the optimal way to do that.

  • @JARFAST
    @JARFAST 4 months ago

    Does it support the Arabic language?

  • @phizc
    @phizc 4 months ago +5

    Ollama is also an interesting option. It supports Linux, Windows, and Mac. AMD support is in preview on Linux and Windows. It sets up a server that can be accessed via an API or a simple CLI chat interface.

  • @24vencedores11
    @24vencedores11 4 months ago

    Nice! But you're too fast, man!

  • @SimeonRadivoev
    @SimeonRadivoev 4 months ago +4

    It's not training on documentation/a dataset; it's document retrieval. It literally just takes pieces of your documents and inserts them into the prompt.
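That "pieces of your documents" step is usually literal chunking: the source text is split into overlapping windows, and the best-matching windows are pasted into the prompt verbatim. A minimal sketch of the splitting (sizes here are tiny for illustration; `overlap` must be smaller than `size`):

```python
# Split text into overlapping word windows. Retrieval then returns a
# window of context rather than a whole file, and the winning windows
# are inserted into the prompt as-is -- no training involved.

def chunk(text, size, overlap):
    words = text.split()
    step = size - overlap  # assumes overlap < size
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

pieces = chunk("one two three four five six seven eight nine ten",
               size=4, overlap=2)
print(pieces)
```

The overlap exists so a sentence falling on a chunk boundary still appears whole in at least one window.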

  • @Matlockization
    @Matlockization 4 months ago

    That RAG or 'sanity checker' means it's possible your data is being distributed to 3rd parties for analysis.

    • @flrn84791
      @flrn84791 22 days ago

      No. You can run this in airplane mode. Also, if you're worried about what gets sent where, just open Wireshark and check it yourself.

    • @Matlockization
      @Matlockization 22 days ago

      @@flrn84791 What do I look for in Wireshark to see whether my data is going elsewhere?

  • @gokudomatic
    @gokudomatic 4 months ago +1

    I have a feeling this thing needs NVidia hardware with RTX. My GTX 1060 won't run it.

    • @FromagioCristiano
      @FromagioCristiano 4 months ago +2

      At 1:20 there are the system requirements: GeForce RTX 30/40 series, RTX Ampere (the ones like the RTX A2000 and RTX A4000), and the Ada-generation GPUs (but those are not for us mere peasants).

    • @hipflipped
      @hipflipped 4 months ago +4

      the 1060 is ancient.

    • @gokudomatic
      @gokudomatic 4 months ago +1

      @@hipflipped yes, it is. What's your point?

  • @大支爺
    @大支爺 4 months ago

    ChatRTX's installer has lots of bugs, never fixed. My PC has Win11 24H2, 192GB of DDR5, and a 4090 installed.

  • @nightrain472
    @nightrain472 4 months ago +1

    I use GPT4All for local LLMs.

  • @judasthepious1499
    @judasthepious1499 4 months ago

    AI: Hello user, what are you doing?
    Please upgrade your Nvidia graphics card... or you can't continue using our AI service.

  • @맛집전문가
    @맛집전문가 4 months ago +2

    Train it on Unreal Engine 5.

    • @gamefromscratch
      @gamefromscratch  4 months ago +4

      You can, if you can get a text or PDF version of the documentation, or enough Unreal Engine books as PDFs. Really it's a matter of dumping as much documentation into your training data folder as you can source.

    • @jefreestyles
      @jefreestyles 4 months ago +1

      Can one add multiple file/folder locations? Or is it really just one folder that has to be the root? Can it use symbolic links or folder/file shortcuts?

  • @ionthedev
    @ionthedev 4 months ago

    Why do they hate Linux so much?

  • @kurtisharen
    @kurtisharen 4 months ago

    How does it handle cross-referencing? What happens if you ask the math question, then ask how to calculate the same thing in Godot? It would need to know and understand the first question and how it applies to the second question instead of just looking up a direct answer in the documentation you give it.

  • @sergiofigueiredo1987
    @sergiofigueiredo1987 4 months ago

    AnythingLLM is probably one of the best RAG chat-with-your-documents programs. It's open source, the developers are dedicated, and it packs a TON of configuration options.

  • @AscendantStoic
    @AscendantStoic 4 months ago

    LM Studio is great ... I use it quite often ... there is also Ollama, but as far as I know it doesn't have a UI; still, it's easy to use.

  • @jefreestyles
    @jefreestyles 4 months ago

    Thanks for showing this! It seems one other downside is that you shouldn't have too many editors open or in use while using it. I wonder what would break first when local compute is maxed out.

  • @dariusz.9119
    @dariusz.9119 4 months ago +7

    One thing to add: ChatRTX requires Windows 11. 70% of the market is on Windows 10, so it's only for a limited number of users.

    • @MonsterJuiced
      @MonsterJuiced 4 months ago +3

      Thanks for that, lmao. I'm on Win10 because 11 breaks my dev software and kills my performance. Shame this is Win11 only.

    • @sean7221
      @sean7221 4 months ago +1

      LMDE 6 is the future, Windows can go to hell

    • @habag1112
      @habag1112 4 months ago +6

      It runs fine on Win10 for me (using an RTX 3070).

    • @varughstan
      @varughstan 4 months ago +3

      I am running this on Windows 10. Working fine.

    • @TheSleepJunkie
      @TheSleepJunkie 4 months ago

      Get real. I haven't seen a single Windows 10 PC on the market. Not even the cheap ones competing with Chromebooks.

  • @AnnCatsanndra
    @AnnCatsanndra 4 months ago

    Easy to install and use, easy to train on my own data? Man, this thing is gonna be killer for brainstorming and worldbuilding!

  • @RoughEdgeBarb
    @RoughEdgeBarb 4 months ago +5

    This might be the first use case of LLMs I'm interested in. Local is necessary to address the huge environmental cost of generative AI, and the ability to parse your own documentation is interesting.

  • @Stealthy_Sloth
    @Stealthy_Sloth 4 months ago

    Llama 3 with Pinokio works great for this as well.

  • @scribblingjoe
    @scribblingjoe 4 months ago +3

    This actually sounds pretty cool.

  • @JaxonFXPryer
    @JaxonFXPryer 4 months ago

    Dang it... I have so much text in markdown format that is useless as training data for this 😭

    • @flrn84791
      @flrn84791 22 days ago

      Mmh, just change the extension to .txt? :D Also, that's probably something Nvidia will change at some point, adding different text formats: HTML, markdown, code, etc.
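For a whole folder of markdown, that workaround can be scripted. A small sketch that copies every `.md` file to a `.txt` twin so a plain-text-only loader picks the content up; point it at whatever your dataset folder is:

```python
# Copy every .md file in a folder tree to a .txt twin so tools that
# only ingest .txt will see the content. Originals are left in place.
import shutil
from pathlib import Path

def mirror_markdown_as_txt(folder):
    converted = []
    for md in Path(folder).rglob("*.md"):
        txt = md.with_suffix(".txt")
        shutil.copyfile(md, txt)
        converted.append(txt)
    return converted
```

Copying rather than renaming keeps the originals usable in your markdown editor while the `.txt` copies feed the chat tool.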

  • @nangld
    @nangld 4 months ago

    Too slow, given it is a 7B running on a GPU.

  • @djumeau
    @djumeau 4 months ago

    Does it read image-based PDFs? Or do you have to convert the PDFs into a readable format?

    • @rob679
      @rob679 4 months ago

      Most likely it doesn't. It doesn't state it anywhere, and some people commented on the NV page that it doesn't see the files.

  • @thesteammachine1282
    @thesteammachine1282 4 months ago +6

    Win 11 only? Lol, no..

  • @kyryllvlasiuk
    @kyryllvlasiuk 4 months ago

    I've got 2060 with 6 GB :(

  • @bdeva029
    @bdeva029 4 months ago

    Nice video. This is good content

  • @shotelco
    @shotelco 3 months ago

    Excellent! Thanks.

  • @Saviliana
    @Saviliana 4 months ago

    So Kobold but Nvidia?

  • @etherealregions
    @etherealregions 4 months ago

    This is very interesting 🤔

  • @MrHannatas
    @MrHannatas 4 months ago

    Need this with agents

  • @pm1234
    @pm1234 4 months ago +14

    They're late to the party: no Llama 3, only Windows, only a basic chat interface. Open-source RAG tools are already here.

    • @r6scrubs126
      @r6scrubs126 4 months ago +4

      Did you even watch the first 30 seconds? It's an easier alternative to all the open-source build-it-yourself ones. I think that's great.

    • @pm1234
      @pm1234 4 months ago

      @@r6scrubs126 It would have been great (and still late) if it had all the things I mentioned in my comment. I watched it, THEN commented. The menu shows Llama 2 13B (@2:05), no Llama 3; it's only for window$ (@1:17), and the chat UI is basic (not even sure it does markdown tables). RAGs are getting common now. If you're happy because you don't know the open-source tools, no problemo!

    • @gabrielesilinic
      @gabrielesilinic 4 months ago

      Llama 3 is not even open source by definition; Mistral is doing a better job.

    • @claxvii177th6
      @claxvii177th6 4 months ago +1

      Llama 3 isn't open source??

    • @claxvii177th6
      @claxvii177th6 4 months ago

      Seriously, I was using it for an entrepreneurial application.

  • @JasonBrunner-SM
    @JasonBrunner-SM 4 months ago +1

    Any HELPFUL comments from those who are already experts on this topic about the better LLMs to use with this from the standpoint of game dev? Since Mike admitted this is not his area of expertise.

  • @FusionDeveloper
    @FusionDeveloper 4 months ago

    Neat idea, but the unnecessarily high system requirements make it prohibitive for most people.
    I can run Ollama with Llama 3 on lower system requirements and make my own GUI.

  • @impheris
    @impheris 4 months ago +2

    I like some things about AI, but this is getting pretty boring now.

    • @ronilevarez901
      @ronilevarez901 4 months ago

      That's like telling a new parent that watching their child breathe must be boring.
      This is a new type of life developing in front of your eyes. This is history. I do find history boring, but seeing it happen every day is on a different level.

  • @PurpleKnightmare
    @PurpleKnightmare 4 months ago

    OMG, this is way cooler than I thought.

  • @strangeboltz
    @strangeboltz 4 months ago

    Awesome video! Thank you for sharing.