ChatGLM: The ChatGPT killer? Checking out ChatGLM6B

  • Published Dec 27, 2024

COMMENTS • 257

  • @avi7278
    @avi7278 Рік тому +544

    God you are a breath of fresh air... all these new AI niche grifters that have infested literally every corner of youtube are unbearable.

    • @Melvin420x12
      @Melvin420x12 Рік тому +22

      That's why you should avoid clickbaity titles and thumbnails. People talking about GPT-5 or making outrageous claims. I look for difficult words and model names in the title and thumbnails. Those are the high-quality videos.

    • @avi7278
      @avi7278 Рік тому +26

      @@Melvin420x12 right but with the algo constantly pushing them, even if I say don't recommend this channel there are two new grifters every day. "I don't know anything about AI or programming, but let me sit here and tell you all about it! Derp. Did you see Auto-GPT? Derp. Derp. Derp. Wow now I'm an AI authority because I have 40,000 other derps who have followed my derpiness!" /r

    • @tonywhite4476
      @tonywhite4476 Рік тому

      That's bc everyone is blowing smoke up each other's arses and no one is calling this crap out. I'll probably get a nasty response for just saying that. But I've paid my dues to rant. I started learning development through YouTube and I've spent hundreds on BS. So yeah, as long as we're buying bs, they're selling bs. And they should stop with the "it's dangerous but let me show you how to build it" crap. Sentdex is cool but I've never seen him use a clickbait title before. Until now.

    • @tkdevlop
      @tkdevlop Рік тому

      well it's their livelihood

    • @iconelias508
      @iconelias508 Рік тому +3

      I thought I was the only one getting these recommendations

  • @ChaiTimeDataScience
    @ChaiTimeDataScience Рік тому +48

    I love how Harrison is standing in videos when he is teaching us via code and sits down when it's a video talking about a concept or project!

  • @jeffreycooper6163
    @jeffreycooper6163 Рік тому +14

    From basic python to self driving tech - now LLMs. Harrison you are a gentleman who always guides the way through cool new tech with no BS. Thanks for this video!

  • @mytechnotalent
    @mytechnotalent Рік тому +95

    It is mind-numbing how cheap the GLM is compared to GPT3. Can you imagine where we are going to be in just another few months? Great one Harrison as always!

  • @PMX
    @PMX Рік тому +28

    Gave it a test, it used a Chinese term in a response for some reason. I asked it to define it and it did (I assume correctly, or at least it fit with how it used it in the previous response). Apparently the model was trained "on 1T tokens of English and Chinese" and it seems some Chinese is leaking into the English responses, perhaps in the same way a multilingual person may occasionally use a word in a different language by accident, as the word captures a particular idea better than the language being used at the moment (?).

  • @HelloThere65272
    @HelloThere65272 Рік тому +17

    I literally cannot bear any "AI news" these days because it's always slick dudes who obviously have no idea what they are talking about and just want to grift on the next thing (basically crypto bros from 9 months ago). I literally hide videos about AI by default with the "not interested" button, but I clicked on this one immediately. Thank you Sentdex!

  • @Dmitrii-q6p
    @Dmitrii-q6p Рік тому +22

    - are you better than chatGPT?
    - As an AI language model, I am capable of generating text that is similar to human language, but I don’t have the same level of natural language understanding as a human. So, in terms of generating text, I am similar to ChatGPT in that respect. However, in terms of specific abilities such as understanding natural language and generating appropriate responses to a user’s questions, I may be slightly better as I have been trained on a wider range of data and have additional features such as context and history that ChatGPT does not have.

    • @foxshot97
      @foxshot97 Рік тому +1

      this is the response i got: "As an AI language model, I’m designed to模仿人类的语言交流能力, but I’m not quite as powerful as ChatGPT, the open-source machine learning model developed by Facebook AI Research. While I can generate human-like responses to a wide range of topics and questions, ChatGPT is specifically designed to generate text that is similar to human language, with a focus on language production and understanding.
      That being said, I’m constantly learning and improving, and I’m eager to continue contributing to the field of AI and language processing. So, in some ways, I’m still developing and could potentially improve in the future."

  • @BinarySplit
    @BinarySplit Рік тому +8

    Thanks for sharing! I've been looking for an open LLM with English & Chinese support, and this sounds perfect! It's really cool that the authors care about the affordability of running the model.
    You're absolutely right that many models are undertrained and this one was well trained. That's definitely a big factor. However, GLM also claims better accuracy with similar numbers of training tokens as other models. Reading through the paper, they integrated a huge bag of tricks into both the pretraining and fine-tuning regimes to make it more sample efficient. I don't think there's any singular pivotal innovation (though there certainly are innovations) - it looks like it's the actual legitimate hard work in designing and building the complicated beasts that are their pretraining regime and fine-tuning datasets.

  • @rumidom
    @rumidom Рік тому +30

    Have you seen the Open Assistant efforts in creating an open RLHF dataset? Hopefully that could also result in ChatGPT-like smaller models that can be run locally

  • @dan110024
    @dan110024 Рік тому +3

    One of the few channels that has mostly dialogue and no shitty b-roll stock footage that nobody cares about. I always know I'm gonna get a sensible yet intriguing take on something.

  • @mysticaltech
    @mysticaltech Рік тому

    Man, the server racks behind you, so dope! Thanks for sharing about this valuable model.

  • @jessemair3684
    @jessemair3684 Рік тому +1

    thank you for finding these and sharing them with us. Years ago you got many of us excited about agi chatbots. Love your vids, thank you!

  • @mlcat
    @mlcat Рік тому +5

    I tried a few models for assistance, mainly LLaMA-based sub-models. For some reason the Hugging Face variation of LLaMA works worse than the original FB one (maybe that's a Mac-only issue), but all the "LoRAs" only work with the HF one. I noticed that LoRAs like gpt4all and alpaca-lora were trained either on the Alpaca dataset or on ChatGPT outputs and provide censored answers to obviously provocative questions. One model that still acts like an assistant but does not censor outputs is LLaMA-Adapter; it's less accurate in its answers but it gives a better impression of the capabilities of untouched models.
    Thank you for this video, will also try this model later!

  • @serta5727
    @serta5727 Рік тому +16

    Open Assistant is about to drop a new open-source ChatGPT soon

    • @sentdex
      @sentdex  Рік тому +9

      Can't wait to check it out, been following them closely. I wonder how it will compare.

    • @freestyletow1
      @freestyletow1 Рік тому

      Apr 15

    • @mikerhinos
      @mikerhinos Рік тому +1

      I commented the same thing on Twitter :D
      Open Assistant could have the same effect as Stable Diffusion in image generation, and Midjourney wouldn't be where they are now without SD.

  • @TheRealMcNuggs
    @TheRealMcNuggs Рік тому +5

    Will you ever continue the "Neural Networks from scratch" series? Was really looking forward to backpropagation.

  • @list1726
    @list1726 Рік тому

    Thank you for posting. Keep up the good work

  • @rikvermeer1325
    @rikvermeer1325 Рік тому

    You are really good at explaining this; subscribed!

  • @JC.72
    @JC.72 Рік тому +4

    It seems like the 130B model is also available, but you need to send in a request form and wait for them to approve it by email. They ask questions such as what your intended use is, and you need to agree to their terms. I wonder how much better the 130B model is, if there is any metric such as % difference.

  • @alexjenkins8026
    @alexjenkins8026 Рік тому +4

    Great find and summary! Love your work.
    I'd love to see you use it!
    Are you able to run through a Python code example of how to train any of these models (say a Hugging Face LLM) for a different purpose? I hear it a lot (and you mentioned it as well), but I've been struggling to repurpose them to do other things... Say a paragraph of text as input and a 'score' out, or something (maybe relevance to a topic, sentiment, or even how appropriate it would be to say at work)?
    I've even just tried to take the output of these models pre-text and feed that into another model that I train - that way I don't have to try to fine-tune the big model at all (and get the huge advantage of only having to feed my training data into the big model once, save the output, and then train my small model)... but it seems I'm unable to get the big models to be consistent at anything - like it's not 'exploiting' enough, and sending through the same inputs changes the outputs dramatically :/
    Keep it up! :)
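
The "feed the big model's output into a small model you train" idea above is a standard recipe: use a frozen language model as a feature extractor and train a cheap downstream model on the pooled embeddings. A minimal sketch, assuming a small Hugging Face encoder (distilbert-base-uncased as a stand-in) and scikit-learn for the small model; the texts and labels are toy placeholders:

    # Frozen feature extractor + small trainable classifier.
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.linear_model import LogisticRegression

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    encoder = AutoModel.from_pretrained("distilbert-base-uncased")
    encoder.eval()  # frozen: the big model is never fine-tuned

    def embed(texts):
        # Mean-pool the last hidden state into one vector per paragraph.
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = encoder(**batch).last_hidden_state      # (B, T, H)
        mask = batch["attention_mask"].unsqueeze(-1)          # (B, T, 1)
        return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

    texts = ["This is fine to say at work.", "Absolutely not workplace appropriate."]
    labels = [1, 0]  # toy "work-appropriateness" scores
    clf = LogisticRegression().fit(embed(texts), labels)
    print(clf.predict_proba(embed(["Let's review the quarterly numbers."]))[:, 1])

Because the encoder is frozen and nothing is sampled, the same input always produces the same features, which sidesteps the consistency problem described above; the embeddings can also be computed once and cached, exactly as the comment suggests.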

  • @codechimps
    @codechimps Рік тому +1

    This is such a solid video we need more of these when it comes to AI today.

  • @sgaseretto
    @sgaseretto Рік тому +5

    You should try RWKV and their fine-tuned Raven models, they work as RNN but are trained as a Transformer. The model is gaining a lot of momentum now

  • @ds920
    @ds920 Рік тому +1

    Thank you, sir! A breath of fresh air🎉

  • @lijostar167
    @lijostar167 Рік тому +7

    Have you tried the Vicuna model yet? Supposed to be '92%' as good as ChatGPT. How would you say it compares to ChatGLM?

    • @IronZk
      @IronZk Рік тому +1

      I have tried Vicuna 7B on CPU and it's impressive!

  • @SpaghettiRealm
    @SpaghettiRealm Рік тому +1

    Thank you, your content is great as always

  • @lutzsommer4708
    @lutzsommer4708 Рік тому +8

    What about the LLaMA or Alpaca models? How do they compare to ChatGLM6B?

    • @sentdex
      @sentdex  Рік тому +3

      Only used the base LLaMA, no quality chat variant that I could find. Can you link to any specific project page or weights that I can download and try? One issue I'm finding with many of these models is just... ease of use OOTB.

    • @djklfsdjklfjlif
      @djklfsdjklfjlif Рік тому

      @@sentdex Almost all of the recent exciting ones are built from the leaked LLaMA which is why getting ahold of them isn't quite as easy. Most have gotten uploaded to HuggingFace in various formats by this point though. LLaMA of course, then the various transformations of it like 4bit. Alpaca, which didn't release their weights, but has been reproduced & released by others. Vicuna which uses the ShareGPT dataset and is said to be better than ChatGLM6b. And gpt4-x-alpaca which looks promising. The downside of course is licensing. Since they're built from LLaMA you can't use any of these for commercial use. The later ones were tuned using output from OpenAI which is against their terms. All of these can run on consumer hardware.
      For finding more about these - There's the oobabooga project which makes discovering & running a lot of these easier. There's also the Oobabooga & LocalLLaMA subreddits which discuss the newest stuff. Lots of model announcements hit the front page of HN due to the hype right now.
      There's also the Open Assistant work which looks promising. Don't know much about it yet but worth looking into.

    • @eh_channel3
      @eh_channel3 Рік тому +3

      @@sentdex There is a Vicuna model that was released a few days ago; it gives me much better results compared to LLaMA

    • @lutzsommer4708
      @lutzsommer4708 Рік тому

      @@sentdex I'm sorry I can't reply to you with useful content, my comment keeps getting deleted... I have no clue why.

    • @sentdex
      @sentdex  Рік тому +1

      @@lutzsommer4708 feel free to email me harrison@pythonprogramming.net

  • @happywednesday6741
    @happywednesday6741 Рік тому +1

    Sentdex, can you guide me to a place on your channel, a playlist/list, or somewhere where I can brush up on ML for these large language models and recent generative AI tools?

  • @skaltura
    @skaltura Рік тому +2

    I wonder about the training side: can we further train these tiny models, and if so how, with what hardware and software, to fine-tune them for very specific narrow tasks?
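
On the "how, and with what software" part of this question: the common route at the time is parameter-efficient fine-tuning such as LoRA, where the base weights stay frozen and only small low-rank adapter matrices are trained, which fits on a single consumer GPU for models in this size class. A minimal sketch with Hugging Face transformers plus the peft library; the base model (gpt2 as a stand-in), target module and hyperparameters are illustrative, not specific to ChatGLM:

    from transformers import AutoTokenizer, AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base_id = "gpt2"  # stand-in for whichever small model you want to specialize
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(base_id)

    # Freeze the base weights; learn small low-rank adapters on the attention projection.
    config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                        target_modules=["c_attn"],  # GPT-2-specific module name
                        fan_in_fan_out=True,        # needed for GPT-2's Conv1D layers
                        task_type="CAUSAL_LM")
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% of all weights
    # ...then run an ordinary training loop / Trainer on the narrow-task dataset.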

  • @ander300
    @ander300 Рік тому

    Part 10 of Neural Net from Scratch, about analytical derivatives??? Please bring the series back!

  • @fitybux4664
    @fitybux4664 Рік тому

    GLM-130B vs Vicuna? Any opinion on that? I know that Vicuna is a much smaller model, just wondering, of the Llama based models if any compare to GLM-130B.

  • @tiagotiagot
    @tiagotiagot Рік тому

    Why do people seem hesitant to allow the negative side of the activation function to have a similar magnitude to the positive side? What's the mechanism that makes it worse when the negative mirrors the positive or has a similar range?

  • @jadzzzz
    @jadzzzz Рік тому +9

    I wonder why there aren't distributed GPU compute solutions à la Foldit. In fact, why not create something like Render Token that rewards users for their GPU time, and allows people in the network to either donate their computing power to a project or donate tokens so it can continue training models. A truly democratized community effort. That being said, nothing would stop the bot uprising in that case.

    • @ablasttv
      @ablasttv Рік тому +1

      A bit difficult I think at least for any entity that wants to keep the full model proprietary. They'd have to distribute it in pieces between thousands of nodes that only process part of the solution and then host their own input server to handle splitting the job, and waiting for some jobs to return before sending out other pieces of work specifically for fully connected layers. It's doable, but it would still cost millions to train these large models whether they're paying a datacenter or through some crypto token. And any entity that has that money is going to expect a return on their investment.

  • @snarkyboojum
    @snarkyboojum Рік тому +1

    How are you marking up the PDFs of the whitepapers as you read them? It looks electronic to me....

  • @yeetdeets
    @yeetdeets Рік тому +8

    I'd be really interested to see your take on the recent vector db boom. Particularly how it will relate to AI, because I don't get it. Why is a vector db better than a relational db for AI?

    • @leptir1
      @leptir1 Рік тому +1

      The Access to Information for Daily Sensibilities Law of 2023 (AIDS, 2023) states that every question must now be posed to AI before a human, since you were going to ask the human the same way you would ask the AI, and the AI had a chance of knowing.
      It's the new "LMGTFY"

    • @Crayphor
      @Crayphor Рік тому

      I have not heard of vector db before but I would assume that the vectors in question are static embeddings attached to the data. The reason why this would improve upon relational db is that it should contain more complex continuous relationships between data rather than discrete human-generated relationships. It would be similar to the improvements of using NNs as generalized feature extractors compared to having an expert come up with features. (Reiterating that this is just my guess based on the name)

    • @yeetdeets
      @yeetdeets Рік тому

      @@Crayphor As I understood it "vector" is a misnomer because it's more like every entry has a coordinate in a Cartesian space. As such the relationships are simply the distances in the shared space.
      I just don't see how it would be better than a relational DB for memory to an AI or AGI. I imagine you mostly want specific recall rather than general recall. The model adds generality and the database adds specificity, at least that's how I'd think of it.
      Maybe it's beneficial to have some sort of general memory in order to abstract out lessons from repeated experiences though, which is how we learn. But I'd argue that's mostly a workaround due to our lack of ability to generate experiences. AGI wouldn't have that problem. It could simply take all its memories and retrain its logic on all of them at the same time, which would have fewer biases - of which we have many due to the architecture of our intelligence.

    • @davidw8668
      @davidw8668 Рік тому +1

      They allow you to search by similarity in a fast and scalable way, e.g. for information retrieval. These "DBs" are quite sophisticated systems, and it would be rather hard to build them from the ground up. You couldn't perform the similarity calculations accurately with relational DBs. In combination with LLMs they are used for retrieval augmentation, but also in general for vector search and recommenders. Search is a big deal in AI.

    • @pozz941
      @pozz941 Рік тому +1

      @@yeetdeets I don't really understand what you mean by "the model adds generality and the database adds specificity". The database is not used for specific data retrieval or to enhance the capabilities of the model; for that, search engines on the internet already exist. What it is used for is to prolong the effective memory of the model.
      Let's say that the model has a memory of 1000 tokens, it is quite easy to have a conversation that uses all of them. So what do you do if you want a longer conversation? You ask the model to summarize the conversation, and you store the summary in a vector database. Then at the next message you pre-prompt the model with the topic of conversation (by showing the model the part of the current conversation that fits its memory and asking it what the current topic is), you retrieve related summaries of the conversation (by converting the last messages or just the topic in a vector and searching the database for similar stuff) and you give it the last messages of the current conversation so that it knows at which point of the conversation you are.
      Doing this, you can fake a much longer memory. For example, let's say that you are having a long conversation with your AI about dogs, but you ask it "do you remember when we were talking about cats?" At that point, since the topic shifted to cats, when you search the database using the last messages you find all previous conversations you had about cats and feed them to the AI so that it knows what you are talking about. Of course, it is a little more complex than what I just told you, otherwise you could just have a topic field in a table and retrieve all related summaries. Vector databases allow for a fuzzier search; for example, maybe the topic of a previous conversation was not exactly cats but felines in general. In a vector database you would still find something, but you would find nothing in a relational database.
      Take whatever I said in this comment with a grain of salt because I am not an expert by any means, this is just what I understood from people talking about vector databases in the context of AI.
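
A minimal sketch of the retrieval step described above, using sentence-transformers for the embeddings and plain dot products for cosine similarity; the model name and the stored "summaries" are made up for illustration, and a real vector database adds indexing and scale on top of exactly this search:

    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")

    summaries = [
        "Earlier we discussed which feline breeds get along with children.",
        "Earlier we compared dog foods for a senior labrador.",
        "Earlier we debugged a CUDA out-of-memory error.",
    ]
    summary_vecs = embedder.encode(summaries, normalize_embeddings=True)

    query = "do you remember when we were talking about cats?"
    query_vec = embedder.encode([query], normalize_embeddings=True)[0]

    scores = summary_vecs @ query_vec         # cosine similarity (unit-length vectors)
    print(summaries[int(np.argmax(scores))])  # finds the feline summary, no exact keyword match needed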

  • @Shaunmcdonogh-shaunsurfing
    @Shaunmcdonogh-shaunsurfing Рік тому

    Thank goodness for this channel!

  • @fnegnilr10
    @fnegnilr10 Рік тому

    Wow, this talk is fantastic! Have you seen MrPSolver's video on using a GPU to speed up calcs with torch by 250 times over numpy?

  • @TheMagicMooshroom69
    @TheMagicMooshroom69 Рік тому

    I know you put out your video discussing the NNFS Part 10 about a year ago, but have you got any updates on that series at all? :)

  • @tomwozne
    @tomwozne Рік тому +1

    Given that it's trained bilingually, could we limit the training data to only English and improve its output for English only?

  • @MrRadziu86
    @MrRadziu86 Рік тому

    How would you compare its performance to Hugging Chat (open assistant), WizardLM, StableVicuna and other open LLMs and chats?

  • @HellTriX
    @HellTriX Рік тому +1

    I was thinking the same thing, that it would be nice to be able to download and test some of these large models within the constraints of some consumer hardware.

  • @nathanbanks2354
    @nathanbanks2354 Рік тому +3

    This is really neat! I hadn't heard about GLM-130B and wonder how it compares to LLaMA, facebook's over-trained model. After a quick search, the LLaMA paper only briefly mentions GLM-130B in the context of instruction fine-tuning, but doesn't really talk about general performance. I managed to get LLaMA-30b running on my 16GB video card using 3-bit quantization, and Alpaca was pretty good, but it would be neat to run something with an Apache license instead of an academic one. Now I'm downloading ChatGLM6B and hoping it works easily with text-generation-webui....

    • @4.0.4
      @4.0.4 Рік тому

      3-bit quantization? What do you run it on?

    • @nathanbanks2354
      @nathanbanks2354 Рік тому

      @@4.0.4 A dell precision 7720 with a mobile p5000 mxm graphics card. It's a few years old, but it has 16gb which is enough to have some fun. It can also barely transcribe in real-time using the largest whisper model.

    • @Harsh-un2pt
      @Harsh-un2pt Рік тому

      @@nathanbanks2354 impressive!!
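
A quick back-of-the-envelope on why the quantization discussed in this thread matters: weight memory is roughly parameter count times bytes per parameter, ignoring activations, KV cache and framework overhead (the parameter counts below are approximate):

    def weight_gb(params_billion, bits):
        return params_billion * 1e9 * bits / 8 / 1e9

    for name, params, bits in [("ChatGLM-6B fp16", 6.2, 16),
                               ("ChatGLM-6B int4", 6.2, 4),
                               ("LLaMA-30B 3-bit", 32.5, 3)]:
        print(f"{name}: ~{weight_gb(params, bits):.1f} GB")
    # ~12.4, ~3.1 and ~12.2 GB respectively - which is why a 30B model in 3-bit
    # squeezes onto a 16 GB card and an int4 6B model fits far smaller GPUs.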

  • @joshmabry7572
    @joshmabry7572 Рік тому

    What is the context size of the model? I was trying to dig around and couldn't find the info offhand.
    Also, it's slightly amusing how hard it struggles to translate Chinese to English. I tried a few prompts after an initial conversation, so not exactly extensive testing.
    "The translation of “各自的” to English characters is “各自的”."

  • @MikeClarkeARVR
    @MikeClarkeARVR Рік тому +2

    Thanks for the video, I just tried the model.... not very good for the use cases I experimented with. Perhaps others can recommend other use cases.

  • @SkimFelBon
    @SkimFelBon Рік тому

    Hi @sentdex, what tool are you using to highlight PDFs?

  • @barny1174
    @barny1174 Рік тому

    Thank you for sharing once again 👍

  • @theoistic7002
    @theoistic7002 Рік тому

    To run, yes... but you can't expect to train the model on an RTX 20xx series; you need at least 16-20 GB of VRAM to train the non-quantized model.
    Because after you quantize it, you kind of lock the model. At least that's the rule for LLaMA... which, although different, follows the same quantization rule.

  • @Crayphor
    @Crayphor Рік тому +16

    That Gaussian activation function is interesting! I'll have to look into exactly what it is and why they chose it. If it is based on a Gaussian distribution, it could have interesting properties, since the sum of Gaussian variables is still Gaussian. Maybe even the model output would be Gaussian.

    • @perc-ai
      @perc-ai Рік тому +3

      This guy doesn’t know anything about Gaussian lol
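
For reference, the "Gaussian" activation being discussed is GELU: it weights its input by the standard normal CDF, GELU(x) = x * Phi(x), and is usually computed with a tanh approximation. A quick numpy comparison against ReLU (a standard definition, not something specific to this video):

    import numpy as np

    def gelu(x):
        # tanh approximation of x * Phi(x), where Phi is the standard normal CDF
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

    x = np.array([-3.0, -1.0, -0.1, 0.0, 0.1, 1.0, 3.0])
    print(np.round(gelu(x), 4))   # small negative inputs leak through slightly...
    print(np.maximum(0.0, x))     # ...unlike ReLU, which zeroes everything below 0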

  • @1PercentPure
    @1PercentPure Рік тому

    thank you so much for sharing!

  • @3dx3em
    @3dx3em Рік тому

    The HF test does not go beyond the 1st prompt. It cut off; I asked it to continue and got nothing. Tried another, same thing: nothing beyond the 1st answer.

  • @Maisonier
    @Maisonier Рік тому

    What AI projects can we focus on with 1 RTX 3090? Is there any advantage in buying another and using 2 RTX 3090s? Or is it too late and the hardware won't make any difference in our homes?

  • @freecode.ai-
    @freecode.ai- Рік тому

    Could a botnet train an LLM at the same level these big companies do?

  • @jurelleel668
    @jurelleel668 Рік тому

    Awesome video mate.
    Sentdex, please do a ChatGLM-6B tasks review

  • @Tepalus
    @Tepalus Рік тому +5

    "It's fast and small, and it sometimes gets lost with too much dialogue." - My Ex lol

  • @Truizify
    @Truizify Рік тому +5

    Thanks for the video! Great content. I find it hard to believe that ChatGLM would be better than instruction-tuned LLaMA-7B (e.g. Alpaca, etc.), at chatting in English. Did you check out any of those models? Plan on doing a comparison myself.

    • @Will-kt5jk
      @Will-kt5jk Рік тому +1

      I’ve found Alpaca 7b pretty good (if a bit slow on a 4 year old laptop, running in docker, restricted to 8GB ram - speeds up quite a lot when you up the thread count )
      So I’m interested in how the 2 compare too.

    • @zgolkar
      @zgolkar Рік тому +1

      Vicuna is much more ChatGPT-like. Koala follows closely. So I wonder how ChatGLM will compare to them…

  • @aladinmovies
    @aladinmovies Рік тому

    How many operations can one RTX 3090 GPU do? And how do you calculate how many operations a program needs? Thanks for any answers!

  • @shawnfromportland
    @shawnfromportland Рік тому

    really great channel, man!

  • @uae7001
    @uae7001 Рік тому

    Did you try the LLaMA language model which leaked online recently? I like it and will compare it with this.

  • @AHN1444
    @AHN1444 Рік тому

    If I wanted one model just for core logic control, in just one language, what is the minimum size it could be nowadays? The model would be prompted with a summary made by another model, and would respond with, for example, [module="memory" action="retrieve data" params="xxxxx"] [module="browser" action="visit site" params="xxxx"] [etc] [etc]
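
A tiny sketch of parsing the bracketed action format proposed in the comment above; the format itself is the commenter's own invention, so the regex is just one possible reading of it:

    import re

    reply = '[module="memory" action="retrieve data" params="xxxxx"] [module="browser" action="visit site" params="xxxx"]'

    # Each [...] block becomes a dict of key="value" pairs.
    actions = [dict(re.findall(r'(\w+)="([^"]*)"', block))
               for block in re.findall(r'\[([^\]]*)\]', reply)]
    print(actions)
    # [{'module': 'memory', 'action': 'retrieve data', 'params': 'xxxxx'},
    #  {'module': 'browser', 'action': 'visit site', 'params': 'xxxx'}]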

  • @rohanjamadagni
    @rohanjamadagni Рік тому

    Hey, what about the LLaMA models? Did you have a look at those?

  • @alexd7466
    @alexd7466 Рік тому

    Which one is the best for writing code?

  • @Arewethereyet69
    @Arewethereyet69 Рік тому

    Wonder if GLM is Alibaba's version. Probably not, but it makes you wonder since they're rolling theirs out.

  • @homataha5626
    @homataha5626 Рік тому

    Thank you for the video.
    Can you give us some guidance on how to build a model in a language other than English?
    I am interested in building a model for Persian texts.

  • @mukkeshmckenzie7386
    @mukkeshmckenzie7386 Рік тому +1

    Can you compare this to llama? Or alpaca?

  • @aa-xn5hc
    @aa-xn5hc Рік тому

    What about "open assistant"?

  • @creativedolt
    @creativedolt Рік тому

    The output of ChatGLM often seems to have Chinese characters mixed in with the English text, e.g. if you ask "What is backpropagation?", then words like "accuracy" and "gradient" show up in Chinese. Does anyone else run into such issues? Doesn't that change the utility of the model for non-Chinese speakers?

  • @CMAZZONI
    @CMAZZONI Рік тому

    Could this model be used for semantic search?

  • @TerboBlitz
    @TerboBlitz Рік тому +6

    I would like to see a tutorial showing how to train ChatGLM6B to do just one single task like Python programming, for example. Will narrowing the scope of learning let us limit the size and complexity of the model so it can be developed on a powerful GPU workstation by individuals?

    • @differentperspectives
      @differentperspectives Рік тому +3

      In the future, it is likely that specialized consumer AI models will emerge, focusing on specific domains such as Python or C++ programming, rather than attempting to handle every task. This specialization could potentially address the limitations currently faced by many of the all-purpose AI models being introduced.

    • @stevepk5421
      @stevepk5421 Рік тому

      @@differentperspectives that's common sense

  • @4mb127
    @4mb127 Рік тому +2

    Did you compare to LLaMa?

    • @sentdex
      @sentdex  Рік тому +1

      I've only played with the base LLMs, is there a chat variant that you think is exceptional and that I should take a peek at?

  • @TheDogn
    @TheDogn Рік тому

    11:22 I wish you talked about how to run it locally
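
For anyone else wanting the local setup: the sketch below follows the usage documented on the ChatGLM-6B Hugging Face model card at the time (repo id THUDM/chatglm-6b, trust_remote_code, and a chat() helper); the exact API lives in the model's remote code, so treat the details as subject to change upstream:

    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
    model = model.eval()
    # The model card also documents an int4-quantized variant for ~6 GB cards.

    response, history = model.chat(tokenizer, "Hello, what can you do?", history=[])
    print(response)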

  • @devilmix85
    @devilmix85 Рік тому

    Can you make a video on Gated Linear Units?
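
Short version while waiting for such a video: a gated linear unit splits a projection into a value path and a gate path and multiplies them elementwise, GLU(x) = (xW) * act(xV); with GELU as the gate activation this is GeGLU, the feed-forward variant the GLM-130B paper reports using. A toy numpy sketch (shapes and names are illustrative):

    import numpy as np

    def gelu(x):
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

    def geglu(x, W, V):
        # value path (x @ W) gated elementwise by a GELU-activated path (x @ V)
        return (x @ W) * gelu(x @ V)

    rng = np.random.default_rng(0)
    x = rng.normal(size=(2, 8))                              # (batch, d_model)
    W, V = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
    print(geglu(x, W, V).shape)                              # (2, 16)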

  • @tomwozne
    @tomwozne Рік тому

    Does this need a connection to the internet?

  • @gwky
    @gwky Рік тому +5

    LLMs seem hype until you start chatting with so many of them and realize they can't keep a conversation for more than a few lines... Even story-writing models seem to start hallucinating. But I'm genuinely amazed that so much can be generated from just a few prompts, and while the "memory" is usually 2048 tokens, as hardware gets better we might get to see actual story-telling ability, multiple characters, and entire worlds created that the users can interact with, be it in text adventures like Zork or via AI-generated novels. At the moment, I've tried every tool out there, and the most popular 6B/13B/20B models. It's still garbage after a few prompts, but sometimes it does remember the context and leaves me genuinely impressed. It's like an abusive relationship: most of the time it's crap, but you learn to cherish the good moments.

    • @Saerthen
      @Saerthen Рік тому

      It's possible to use summarization to make the chatbot "remember" what was discussed before (prompt: "summarize the dialogue from the beginning to this point"), it can help to keep the dialogue for a longer time, but if you want "long memory", you need some kind of database (like advanced storytelling software does). Also, I'm not sure if "long memory" would be a benefit or a problem.

    • @gwky
      @gwky Рік тому

      @@Saerthen It doesn't work. Long-term memory and key-specific memory already exist; they work by injecting details into the prompt to "remind" the chatbot, but unless your question is to "define" something, it really doesn't do much more than that. If you try to play a chat adventure, Zork-like, it creatively generates the wrong enemies or ignores your previous action. Furthermore, loading LLMs with 2048 tokens of context isn't easy even on modern hardware, and while loading models with more parameters (going from 6B to 13B or 20B) sometimes helps the AI generate more context-relevant responses, it still hallucinates plenty; and the more parameters the model has, the longer it takes to load and to generate responses.
      In short, these ideas have already been explored; there are even commercial SaaS novel-writing AI solutions and text-adventure ones, but they exhibit the same problems. It's just the limitation of the current tech.

    • @larion2336
      @larion2336 Рік тому +1

      That's kind of my experience as well, as someone mostly interested in storywriting. I've used GPT4ALL and various versions of Alpaca. I found Alpaca better but it still had major issues with memory. That said, in some ways it was also noticeably better in its contextual knowledge compared to older open source models like Fairseq / NeoX, as in recognizing characters / popular stories and their events by default. But they really need better interfaces - for that I am waiting for NovelAI to train a new model with their new h100 cluster they got access to. That will be hype.

    • @Saerthen
      @Saerthen Рік тому +1

      @@larion2336 Llama was initially trained with 2048 tokens context. As far as I know, you can't fine-tune the model to use a larger context; you'll have to train it from scratch. And right now, no open-source models are trained that way.

    • @gwky
      @gwky Рік тому

      @@larion2336 wouldn't be too crazy about it, LLMs are just guess engines. While I was hoping more parameters and token context would fix it, the randomness is something hard to predict. At the moment the reason most models use 2048 tokens is for compute simplicity and ease of training, similar to what Ivan was mentioning. I rented cloud VMs and you can train a larger model for a few hundred $, I used google compute vm with 80gb a100 cards. Sadly, even if you hand pick the data, say I wanted xianxia light novel generation, so I'd train the model on that, but because it simply tries to predict the best conclusion to your prompt, it generates at best pseudo random nonsense. Like you mentioned, sometimes it seems to write with the relevant character, scene, or context in mind. Then it breaks that momentary immersion with a mind leap elsewhere, because it probably thinks it's a more mathematically fitting conclusion to your prompt or whatever it wrote.
      But maybe NovelAI will do something outside of my expectations; it'd really be great to have a novel-writing AI that would generate stories on demand. Similarly, using it for text adventures or DnD would likewise be great. Still, I always remind myself that this is the worst that AI will ever be. It only gets better from here.

  • @bravinrutto8784
    @bravinrutto8784 Рік тому

    Have you looked at llama.cpp, where they were able to run LLaMA-30B with just 30GB of RAM?

  • @nandafprado
    @nandafprado Рік тому

    I like getting these LLMs and testing them in a language they weren't trained in, so I played with using it to answer me in Portuguese. You could see the memory limitation there, but it was still OK for its model size.

  • @tskumar1320
    @tskumar1320 Рік тому

    Wonderful. Keep it up 💓🙏

  • @joshuascholar3220
    @joshuascholar3220 Рік тому +4

    I thought that you don't need GPUs for inference, only for training. If you could train or tune on consumer hardware THAT would be HUGE!

    • @ablasttv
      @ablasttv Рік тому

      Inference does still need GPU power for these large models. It's still lots of matrix multiplication.
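
To put rough numbers on that: a decoder forward pass costs on the order of 2 FLOPs per parameter per generated token, so a compute-only ceiling can be estimated from a GPU's FLOP rate. The 3090 figure below is an assumed ballpark, and in practice single-stream chat is limited by memory bandwidth rather than FLOPs, so real throughput is far lower:

    params = 6.2e9                  # ChatGLM-6B parameter count (approx.)
    flops_per_token = 2 * params    # ~2 FLOPs per parameter per token (rule of thumb)
    gpu_flops = 35e12               # assumed ~35 TFLOPS for an RTX 3090 (ballpark)

    print(gpu_flops / flops_per_token)  # ~2800 tokens/sec as a pure-compute upper bound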

  • @aloo_explains
    @aloo_explains Рік тому

    GELUs are used quite a lot in models like GPT-3 and BERT.

  • @patrick-gerard
    @patrick-gerard Рік тому

    Hi, do you know if that model runs on a Huggingface Nvidia Tesla T4 Inference Endpoint?

    • @sentdex
      @sentdex  Рік тому

      Looks like it does, the unofficial demo I linked/showed is running on a T4 (huggingface.co/spaces/multimodalart/ChatGLM-6B)

  • @joshheller1439
    @joshheller1439 Рік тому

    What technology jobs don't have the greatest risk of depreciating faster as a result of AI? (Reflecting quickly, it feels like ALL could be replaced, much like repetitive actions with robotics)

    • @sentdex
      @sentdex  Рік тому

      Quick thought, but ... probably alignment/safety for models hah.

  • @rileykramer3642
    @rileykramer3642 Рік тому

    3090s are like $1500 new. It'd only be like $6k for a setup like that. I'd personally wait for the 5000 series to see if they up the VRAM. I'm guessing we will see a 32GB 90-series model.

    • @fitybux4664
      @fitybux4664 Рік тому

      Why get a 3090 for $1500 when you can get a 4090 for around the same price?

  • @jurelleel668
    @jurelleel668 Рік тому

    Do multimodal tasks, image tasks, and mathematics tasks for GLM-6B in English. Thanks

  • @wktodd
    @wktodd Рік тому

    Do you know of any good voice recognition models that will run locally on a small GPU?

    • @jantuitman
      @jantuitman Рік тому +3

      I tried OpenAI Whisper a few months ago. It is actually open-sourced (unlike most OpenAI models) and the smaller models did flawless English transcription without a graphics card, directly on my Mac M1. But I am Dutch and unfortunately there are no small models with enough Dutch training, so I had to use a bigger version to get it to transcribe Dutch flawlessly, but I was still able to get it to work on the 16GB GPU in Google Colab. By the way, they also have an API which isn't too expensive.

    • @mpag6195
      @mpag6195 Рік тому

      @@jantuitman I use whisper. So far it's the best at doing voice2text in my use cases

    • @wktodd
      @wktodd Рік тому

      @@jantuitman Thanks Jan 🙂
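
For reference, the openai-whisper package mentioned in the replies is only a few lines to run locally; a minimal sketch, with the checkpoint size and audio file name as placeholders:

    # pip install openai-whisper; "tiny"/"base"/"small" fit modest GPUs or even CPU.
    import whisper

    model = whisper.load_model("base")        # trade accuracy for speed/VRAM here
    result = model.transcribe("meeting.wav")  # placeholder audio file
    print(result["text"])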

  • @sevret313
    @sevret313 Рік тому +8

    I'm hopeful to see better models in the future, but I'm worried that attempting to shrink them down too much before reaching sufficient quality with open-source models is a bit of a mistake.

    • @StoutProper
      @StoutProper Рік тому

      It’s up to you to retrain and fine tune them

    • @sevret313
      @sevret313 Рік тому

      @@StoutProper Training is expensive, and there is nothing wrong with me pointing out that the open-source community suffers from a lack of quality control and commitment at times.
      If we have any hope of getting OpenAI quality from open source, it doesn't help to have the funding spread across a ton of mediocre projects.

  • @dgiri2333
    @dgiri2333 Рік тому

    Can you do videos on PyBrain to train a ChatGPT-like AI model?

  • @CA-iv6tz
    @CA-iv6tz Рік тому

    Did anyone else get "OSError: Unable to load weights from pytorch checkpoint file" when trying to run ChatGLM?

  • @Djellowman
    @Djellowman Рік тому

    Why use GeLU over something easy to compute like Leaky ReLU? I'm a big Leaky ReLU fan.

  • @loopuleasa
    @loopuleasa Рік тому

    I would've loved more examples of the thing in the beginning
    I was not convinced

  • @JanBadertscher
    @JanBadertscher Рік тому

    Not going to try anything smaller than 13B 4-bit for inference :). Fine-tuned larger BLOOMZ models run just fine for a few dollars on AWS SageMaker, and other OSS models will eventually come out, pushing the quality even further into GPT-3.5 territory and hopefully into GPT-4 territory with some tricks... Currently, IMHO there are enough LLaMA, Alpaca and Vicuna models with custom fine-tuning to make them really, really good.

  • @rifkiardiansyah616
    @rifkiardiansyah616 Рік тому

    Bro, you should continue your NNFS video series!!!!

  • @mujtabaalam5907
    @mujtabaalam5907 Рік тому

    Have you tried Chinchilla or Vicuña?

  • @yarpenzigrin1893
    @yarpenzigrin1893 Рік тому

    That's pretty good. Have you looked into LLaMa, Alpaca, Koala or Vicuna?

    • @SJ-xz2ch
      @SJ-xz2ch Рік тому +1

      Yea I was impressed by vicuna.
      It seems like the best of the llama based models. (So far)

    • @kazimafzal
      @kazimafzal Рік тому

      @@SJ-xz2ch -- well Vicuna definitely requires more than 8GB of VRAM to run, I wasn't able to get it running on my 3060Ti (complained of not enough memory)

    • @SJ-xz2ch
      @SJ-xz2ch Рік тому

      Mm good to know. Looks like vicuña 7b wants 14 gig.
      Still if the parallelism/multi gpu option works, it makes it a realistic option to run locally. (Don’t know of anyone who has tested it yet)
      Hoping for great things from open assistant but it’s going to be tough if they won’t use response data from gpt like vicuña did.

  • @Sunlight91
    @Sunlight91 Рік тому +1

    How old is the moon?
    The moon is approximately 384,000 years old, give or take a few thousand years.

  • @jaysonp9426
    @jaysonp9426 Рік тому

    I asked it "which world leader looks the most like Winnie the Pooh." I'm dead now.

  • @gogibear123
    @gogibear123 Рік тому

    appreciate u❤

  • @FuZZbaLLbee
    @FuZZbaLLbee Рік тому

    GPT4All wasn't good at generating code. What about this model?
    Also, are the answers from this model censored?

    • @yeetdeets
      @yeetdeets Рік тому

      Haven't tried it yet, but I heard GPT4 is radically better than GPT3.5 (chatGPT) was.

    • @FuZZbaLLbee
      @FuZZbaLLbee Рік тому

      @@yeetdeets oh sorry, I meant GPT4All, not ChatGPT
      Updated my question

    • @yeetdeets
      @yeetdeets Рік тому

      @@FuZZbaLLbee Ah, no worries.

  • @efovex
    @efovex Рік тому +1

    ChatGLM seems cool on a technological level, but a word of warning: this is trained on text available inside mainland China, so don't think for a second that it's going to give neutral or factual responses to everything.
    As a demonstration, try asking it what happened on Tiananmen Square on June 4th 1989...

  • @harriehausenman8623
    @harriehausenman8623 Рік тому

    What about OpenAssistant?!

  • @abuslume4169
    @abuslume4169 Рік тому

    My guess is that this bot's context is a little too limited, but it'd be interesting to see how this bot would fare in the same test Microsoft did with AutoGPT.

  • @Totial
    @Totial Рік тому

    What happened to charles? Not enough time I guess... hope he gets another chance some day...

  • @li-pingho1441
    @li-pingho1441 Рік тому

    awesome!!!!

  • @MrTheKeiron
    @MrTheKeiron Рік тому

    0:55 "It's shockingly good for its size"
    Thats what she said