I love small and awesome models

  • Published 28 Jan 2025

COMMENTS •

  • @BORCHLEO
    @BORCHLEO 4 months ago +37

    You are underrated, Matt! They didn't sponsor you because they just wanted the people who are spewing hype! You go into such detail! Your content should be #1 for any Ollama tutorial.

  • @KonstantinBykov-o3k
    @KonstantinBykov-o3k 4 months ago +6

    I've tested the 3B-parameter model and am very impressed with its speed and accuracy. I use it for improving descriptions in a catalog. And it looks like we don't have to pay OpenAI anymore. Thank you for your video!

  • @utvikler-no
    @utvikler-no 4 months ago +1

    Thanks

    • @technovangelist
      @technovangelist 4 months ago +1

      What??? You are too kind... a member AND a tip. Thanks so much.

    • @utvikler-no
      @utvikler-no 4 months ago

      @technovangelist I just love the simple yet comprehensive way you explain the subjects. Keep up the good work❤

  • @trsd8640
    @trsd8640 4 months ago +3

    The first really helpful video about llama 3.2! Thanks Matt!

  • @solyarisoftware
    @solyarisoftware 4 months ago +17

    Hi Matt, I upvoted as usual. Two notes:
    Ollama HW resources calculations (proposal for a new Ollama video): In this video, you thankfully show how easy it is to set the context length in the model file, bypassing Ollama's default. How does the context length influence the RAM usage of the host? In general, it would be great to dedicate a video to hardware resource calculations based on model size, quantization, context size, and possibly other macro parameters. It would also be helpful to discuss how CPU, and especially GPU, can improve latency times (especially in a multi-user environment).
    You mention "your" function call method. I know you've already done a video on this topic, but since it's very useful in practice, maybe you could create a new video with code examples (Python is welcome).
    Other viewers: If you agree, please upvote my comment. Community thoughts are welcome!
    Thanks again,
    Giorgio
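
On the question of how context length influences RAM usage: a rough back-of-the-envelope sketch, not an official Ollama calculation. It estimates only the key/value cache, which grows linearly with context length, and ignores model weights and runtime overhead. The layer count, KV-head count, and head size below are assumptions chosen to resemble a 3B-class model with grouped-query attention; the helper name `kv_cache_bytes` is hypothetical.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Approximate KV-cache size: one key and one value vector per layer,
    per KV head, per token position (bytes_per_value=2 assumes 16-bit values)."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Assumed, illustrative model shape (not read from any real model file).
ASSUMED_3B_SHAPE = dict(n_layers=28, n_kv_heads=8, head_dim=128)

for ctx in (2048, 8192, 131072):
    mib = kv_cache_bytes(context_len=ctx, **ASSUMED_3B_SHAPE) / 2**20
    print(f"context {ctx:>6}: ~{mib:,.0f} MiB of KV cache")
```

Under these assumptions the cache alone grows from a few hundred MiB at a 2k context to several GiB at very long contexts, which is one reason a conservative default context makes sense on small GPUs.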

    • @arthurhjorth1490
      @arthurhjorth1490 4 months ago +1

      Agreed! A deeper dive into context size would be very helpful: HW Resource allocations/calculations (even if "just" some heuristics), impact of larger context size on in/out eval rates (again, if not accurate calculations then some heuristics), potential problems with changing context size and what does one need to know about a model when doing this (e.g. what happens if you set a context size larger than what the model allows?).
      Edit to add: I'd also be interested in a potential context size video touching on how context size affects/interacts with parallel requests, and how to set up multi-user environments that share the model's context size.
      Thank you for an, as usual, excellent tutorial!
      Cheers, Arthur

    • @solyarisoftware
      @solyarisoftware 4 months ago

      @arthurhjorth1490 Yes. Just one more note, about evaluation criteria. Matt and other people use a short list of "trusted" questions to evaluate a model. That's perfectly fine, but it could be useful to have some way to automate the evaluation of a model, maybe just by scripting the list of questions and having an automatic way to evaluate the responses (by another LLM?). Just food for thought / a possible video topic in the future?
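
The scripted evaluation idea could look something like the following minimal sketch. It assumes a local Ollama server on its default port and uses the `/api/generate` endpoint without streaming; the function names, the question format, and the PASS/FAIL judging prompt are all hypothetical choices for illustration, and an LLM judge can itself be wrong.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address


def ollama_generate(model, prompt):
    """Send one non-streaming prompt to a local Ollama server and return its text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def evaluate(questions, candidate, judge, generate=ollama_generate):
    """Ask `candidate` every question, then let `judge` grade each answer."""
    results = []
    for question in questions:
        answer = generate(candidate, question["prompt"])
        verdict = generate(
            judge,
            f"Question: {question['prompt']}\n"
            f"Expected: {question['expected']}\n"
            f"Answer: {answer}\n"
            "Reply with only PASS or FAIL.",
        )
        results.append({
            "prompt": question["prompt"],
            "answer": answer,
            "passed": verdict.strip().upper().startswith("PASS"),
        })
    return results
```

Passing `generate` as a parameter keeps the loop testable without a running server, and running the same list several times would show how often a model gets each question right rather than whether it got lucky once.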

  • @AKBARESFAHANI
    @AKBARESFAHANI 4 months ago +2

    I love your content and learn every time I watch, thank you

  • @TheHummChannel
    @TheHummChannel 4 months ago +1

    This channel really deserves way more exposure! Love the content and the host! Keep up the good work, thanks.

  • @FOGSUser
    @FOGSUser 4 months ago +2

    Loving the Companion Plugin for Obsidian with llama3.2:1b. Completing my thoughts in the persona of William Shakespeare by changing the User prompt was a lot of fun.
    Fair Companion App, thy wondrous might,
    Doth streamline notes and tasks with ease and light.
    In Obsidian vault, thou dost thy work impart,
    With real-time updates, a digital heart.

  • @КравчукІгор-т2э
    @КравчукІгор-т2э 3 months ago

    Thanks Matt! Everything is interesting and clear as always!

  • @GundamExia88
    @GundamExia88 4 months ago +1

    Ha! That's how I felt when people asked which number is bigger, 8.8 or 8.21! It depends on the context! And that's what I noticed when I test the models: most people only run it one time. The models do not always give the right answer the first time; sometimes it comes the second time, etc. Great video.
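
To make the "it depends on the context" point concrete, here is a small sketch comparing the same two strings as decimal numbers and as version numbers. The function names are hypothetical.

```python
def larger_as_decimal(a, b):
    """Compare the strings as ordinary decimal numbers."""
    return max(a, b, key=float)


def larger_as_version(a, b):
    """Compare the strings as dotted version numbers, part by part."""
    return max(a, b, key=lambda v: tuple(int(part) for part in v.split(".")))


print(larger_as_decimal("8.8", "8.21"))  # 8.8, because 8.80 > 8.21
print(larger_as_version("8.8", "8.21"))  # 8.21, because minor version 21 > 8
```

Both answers are defensible, so a prompt that does not say which reading is meant is ambiguous for a model as well.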

  • @vbitz4800
    @vbitz4800 3 months ago

    Software engineer for many, many years here... I recently stumbled upon Ollama and your videos. I have done several integrations of AI into clients' apps using Azure AI services (mostly speech and some OpenAI), but MS Azure services keep you away from the nuts and bolts via abstractions. This is fine for productivity but not great for understanding. Your videos and GitHub repo have crystallized many concepts for me and opened my eyes and brain to a better and deeper understanding. Thank you, sincerely, for what you are doing here. It is priceless. BTW, I adore Ollama and the smaller models... however, my wife would not be pleased with my recent PC build cost (two 4090s lol). But hey, this is addictive! Isn't it amazing to see the realization of what we all dreamed of when we started in the programming/IT realm!

    • @technovangelist
      @technovangelist 3 months ago

      Yup. Compared to what I was doing in my Intro to AI programming class I took in 1989 at FSU this is science fiction.

  • @jackonell1451
    @jackonell1451 4 months ago +6

    Wondering what the success rate of the tool call would be if wrapped in a framework like Yacana?
    Because only using Ollama's function calling by itself is IMO not representative of the potential of any model, as local LLMs need some level of guidance to really achieve anything.

    • @technovangelist
      @technovangelist 4 months ago +1

      The older approach works fantastically with every model

    • @jackonell1451
      @jackonell1451 4 months ago

      @technovangelist I completely agree with you that Ollama's function calling is excellent. Forcing LLMs to output JSON the way it does is very impressive! However, we use PHI models in production, and only using Ollama didn't yield the expected results... The JSON was valid, but the arguments generated were not. That's why we had to switch to an agent framework, because developing our own overlay would have been too complex. With the agent framework we now have enhanced tool calling and a way to do multi-turn chat so multiple PHIs can brainstorm together. Also, we needed control loops at each step to ensure we get a computable output in the final phase.

    • @jackonell1451
      @jackonell1451 4 months ago

      We had a success rate of around 30% when calling tools with correct values using only the "/generate" endpoint. Now, we're at about 70%. Additionally, the team has improved at prompt engineering, which has been a huge factor!
      Also, thanks for making videos, they're always a great help. Keep it up! ^^
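
The "valid JSON, invalid arguments" problem and the control loop described in this thread can be sketched roughly as follows. This is not how Yacana or any particular agent framework works; the tool table, the JSON shape (`tool` plus `arguments`), and the retry wording are all assumptions made up for illustration. The `generate` callable stands in for whatever sends a prompt to the model and returns its raw text.

```python
import json

# Hypothetical tool table: tool name -> expected argument names and types.
TOOLS = {"get_weather": {"city": str, "days": int}}


def validate_call(raw):
    """Return (tool_name, arguments) if `raw` is a well-formed call, else None."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict):
        return None
    name, args = call.get("tool"), call.get("arguments")
    if not isinstance(name, str) or name not in TOOLS or not isinstance(args, dict):
        return None
    spec = TOOLS[name]
    if set(args) != set(spec):
        return None  # missing or unexpected argument names
    if not all(isinstance(args[key], kind) for key, kind in spec.items()):
        return None  # right names, wrong value types
    return name, args


def call_with_retries(generate, prompt, max_attempts=3):
    """Control loop: re-prompt until the model emits a call that validates."""
    for _ in range(max_attempts):
        result = validate_call(generate(prompt))
        if result is not None:
            return result
        prompt += "\nYour last reply was not a valid tool call. Reply with JSON only."
    return None
```

Checking argument names and types separately from JSON validity is the step that catches output which parses fine but cannot actually be executed.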

  • @ClearSight2022
    @ClearSight2022 4 months ago

    Hi Matt very clear presentation yet again. I also really enjoy your polished style, so I subscribed even though I do so rarely. Cheers !

  • @TheDiverJim
    @TheDiverJim 4 months ago

    Love the breath holding tangent!

  • @Joooooooooooosh
    @Joooooooooooosh 4 months ago

    Dude ollama is great. Thank you.

  • @mduthwala439
    @mduthwala439 4 months ago

    Well explained especially the 1B

  • @viertekco
    @viertekco 3 months ago

    Your delivery is great. Had no idea you were a founder... that's awesome.🎉 Meta, where's the ❤!!

  • @alvinnorin8820
    @alvinnorin8820 3 months ago

    8:12 I'd set the temperature to zero, in which case everyone using the model will get the same answer every time for the same inputs. Setting it to zero makes it default to always responding with whatever answer is most likely the accurate one. It's a predictive language model after all.
    The higher the temperature, the more varied responses. Setting it to zero disables the randomness parameter and thus removes variety completely. Very useful when benchmarking models against each other.

    • @technovangelist
      @technovangelist 3 months ago

      Setting temp to zero will not get you the same answer every time. You would have to set temp and seed and you will reduce the variations but it may not be the best answer.
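
For anyone who wants to try pinning both values, a minimal sketch of a request body for Ollama's `/api/generate` endpoint with `temperature` and `seed` set under `options`. The helper name and the default values are hypothetical, and as noted above this narrows the variation rather than guaranteeing identical or better answers.

```python
import json


def build_request(model, prompt, temperature=0.0, seed=42):
    """Build a non-streaming request body with sampling options pinned."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature, "seed": seed},
    }


print(json.dumps(build_request("llama3.2:1b", "Why is the sky blue?"), indent=2))
```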

    • @alvinnorin8820
      @alvinnorin8820 3 months ago

      @technovangelist Right, I assume the seed could also be random. It might be different across different models. I ran llama3.1:70b with temperature at zero, and that seemed to get me the same story from the same Minecraft chat logs along with its system prompt. LLMs have different architectures though, and it doesn't seem like all support all parameters. Taking away randomness is highly advantageous when optimizing system prompts though, being able to compare static responses.

    • @technovangelist
      @technovangelist 3 months ago

      i don't know if that’s true. testing a system prompt on a limited version of the model helps if you only use the model in that limited way every time.

  • @g.s.3389
    @g.s.3389 4 months ago +4

    just a question: what is the best model for supporting me in python programming that I can use with ollama?

    • @yahoolane
      @yahoolane 4 months ago

      What is your use case? llama3 is a good default.

  • @jayd8935
    @jayd8935 4 months ago

    If you have a cat, it stole the water bottle! Thanks for the review too. I will be interested to try this on my usual M1, might remove the need to run models on another more powerful machine.

  • @Cingku
    @Cingku 4 months ago

    Could you explain what the generation completion hotkey does in the Companion plugin for Obsidian? When I use the Companion, it automatically generates text, completes it, and streams the response. So, in what situation would I need to use this hotkey? I'd appreciate it if you could clarify this because I was confused by this.

  • @danielarista1352
    @danielarista1352 4 months ago

    Matt, I'm a non-CS/SWE guy trying to hack away at a tool that uses LLMs to add some NL to the UI of an app I'm building. It's b/c of you I chose Ollama over other options. Thanks brother.

  • @KevinKreger
    @KevinKreger 4 months ago

    Good one. I saw someone training the 1B model on their iPhone😮

  • @nosuchthing8
    @nosuchthing8 3 months ago

    How much vram is required for the 3b token method?

  • @BirdManPhil
    @BirdManPhil 4 months ago +3

    I've been using llama 3.1 8b on my 4050 laptop very comfortably for AI-assisted tasks in Obsidian, and I can't wait to see if these smaller 3b models are a better fit. You get a sub from me. I'm all aboard the self-hosted train, next stop AI station, let's gooo

  • @JeromeBoivin-tx7fm
    @JeromeBoivin-tx7fm 4 months ago +7

    Thank you Matt for your videos. I was not aware of the hardcoded context window in Ollama; it may explain why I was so confused by models claiming to have a large one. Why is that? I'd expect Ollama to adapt to the capabilities of the model it's running! Do I really need to manually create a custom model file each time just to benefit from the native model context size? Have you already posted a video answering these questions? Thank you so much and keep up the good work! Cheers from France!

    • @manuelbradovent3562
      @manuelbradovent3562 4 months ago

      @JeromeBoivin-tx7fm Also interested in the context question, and in whether the prompt, end token, etc. were also added in the model file.

    • @technovangelist
      @technovangelist 4 months ago +4

      Context takes a lot of memory. And it's hard to put rails around it so it doesn't fully crash the machine. I've had the machine reboot when it takes too much. And lots of folks have tiny GPUs, so we got lots of support requests. So it went to a blanket 2k unless you specify the size. But since it's so easy for most devs to create that file, and since Ollama is intended as a dev tool first, it seemed like a good decision.
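
A minimal sketch of "that file", generated from Python for consistency with the other examples in this thread. It assumes the standard Modelfile `FROM` and `PARAMETER num_ctx` instructions; the helper name, the 8192 value, and the `llama3.2-8k` model name are arbitrary illustrations, and a larger context will use more memory.

```python
def render_modelfile(base_model, num_ctx):
    """Return Modelfile text that derives a model with a larger context window."""
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"


print(render_modelfile("llama3.2:3b", 8192))
# Save the output as "Modelfile", then build the derived model with:
#   ollama create llama3.2-8k -f Modelfile
```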

  • @jazzejeff1
    @jazzejeff1 4 months ago +2

    Your channel's so nice I wish I could sub twice. Keep up the great work.

  • @researchandbuild1751
    @researchandbuild1751 3 months ago

    Can llama 3.2 1b do function calling? I don't know if it's smart enough to format output

    • @technovangelist
      @technovangelist 3 months ago

      Depends. Using the newer more official format, I don't think it does a good job. Using the older approach I have documented in a few videos on this channel I expect it would be great at it.

    • @researchandbuild1751
      @researchandbuild1751 3 months ago

      @technovangelist I will try to look into the original method. I still don't think many people explain how it all works in plain English. I will look at your earlier videos about functions.

  • @AndresSolar-y3g
    @AndresSolar-y3g 4 months ago

    ...worth a thumb up...

  • @PriNovaFX
    @PriNovaFX 4 months ago

    What if you set the temperature to 0? Does the tool function test succeed more often?

  • @yacahumax1431
    @yacahumax1431 4 months ago

    ollama makes it so easy

  • @Aarifshah-A
    @Aarifshah-A 4 months ago +3

    Lol the ending 😂😂😂

  • @alexandrep4913
    @alexandrep4913 4 months ago

    There is an awesome video on YouTube talking about the specific model and how censored it is. I wouldn't be surprised if people find the older model to be more capable.

  • @utsavgoswami5263
    @utsavgoswami5263 4 months ago

    well, matt you are our fav choice for all things AI!

  • @toadlguy
    @toadlguy 4 months ago

    These smaller models are great for research, particularly as they are fairly easy to modify in code. In actual use, they are somewhat over-censored, but I suspect it will be just a matter of time before a fine-tuned uncensored version is created.

  • @dakkon77blackblade20
    @dakkon77blackblade20 3 months ago

    I would really like to know if these models are any good for entity extraction like for graphRAG or chunk generators for traditional RAG... That would be a great topic!

  • @johang1293
    @johang1293 4 months ago +1

    Good stuff

  • @enermaxstephens1051
    @enermaxstephens1051 3 months ago

    Can we simply download the GGLM file and use it with GPT4all?

  • @tecnopadre
    @tecnopadre 4 months ago +1

    First, thank you as always, Matt. Question: I've been testing the 3B model since launch on my laptop, which has an NPU, with WebUI on a server and Ollama on the laptop. The thing is, Ollama is not taking advantage of the NPU; the 3B model uses the small GPU and the CPU. RAG (first with the WebUI interface and then with Flowise) gives me good results. I'm trying to find out how to activate the NPU so Ollama can use it. That would be great. I think LM Studio does it? Also, when testing with large files (>15 MB), embedding with Ollama on my computer, again with WebUI and Flowise, fails. The computer can't handle it. It would be great to see you do this with files that are closer to real company files. I think the 3B model is great. The last test I did was using it with Groq and, of course, that is where I can really test it 🙂

  • @wardehaj
    @wardehaj 4 months ago

    Thanks for this great video explaining how to use these small LLMs! I will be waiting for your video about the vision model. Maybe compare llama2 vision with pixtral?

  • @stasoline
    @stasoline 4 months ago

    Cool video!

  • @harrykekgmail
    @harrykekgmail 4 months ago

    interesting video. thank you

  • @aiamfree
    @aiamfree 4 months ago

    Why am I getting Error: error loading model for all the 3.2 downloads?

    • @technovangelist
      @technovangelist 4 months ago

      Have you updated ollama?

    • @aiamfree
      @aiamfree 4 months ago

      @technovangelist yes that fixed it, thanks… it’s sooo damn fast!!

  • @akongas
    @akongas 3 months ago

    That's great. Hey, do you know if we will ever get AI running locally on our Android and iOS devices?

  • @AlexanderYudin
    @AlexanderYudin 4 months ago

    Which hardware setup do you have?

  • @userou-ig1ze
    @userou-ig1ze 4 months ago +1

    Thanks for the great content. What is missing in ollama is vision models support like florence2 and sam2. If it had a nice api for that, that could be used with curl or so... dreams.
    Raspberry pi with vision models must be so incredibly overpowered, I prefer not thinking about it too much

    • @technovangelist
      @technovangelist 4 months ago

      Raspberry Pi overpowered???? Way underpowered is more accurate, especially considering the cost of them. Physical size is the big benefit these days. But Florence2 looks like an older model that didn't get much love. Some of the other vision models on Ollama got a lot more coverage. And I hadn't heard of sam2 either. Neither architecture is supported, so it would require a lot of work to get them working.

    • @userou-ig1ze
      @userou-ig1ze 4 months ago

      @technovangelist thanks for taking the time to reply, appreciated. Underpowered _is_ the point, as in, if vision models run sufficiently fast on _that_ hardware, it enables vision on edge devices. Florence2 was released months ago, and the combination of selecting pixels by typing, and segmenting and tracking over time with sam2, is an incredibly powerful concept. I needn't ask any layman to become creative; the usefulness of text-driven vision perception seems insane.

  • @kshabana_YT
    @kshabana_YT 4 months ago +1

    I tried to run Llama3.2 1b on a Samsung S20 Plus. Error: no suitable llama servers found. And I am running ollama serve.

    • @Psychopatz
      @Psychopatz 4 months ago

      Just use Layla Lite, then import the model. Yep, it's a hassle to get llama.cpp to work.

    • @kshabana_YT
      @kshabana_YT 4 months ago

      I don't know what you are talking about

  • @modoulaminceesay9211
    @modoulaminceesay9211 4 months ago

    All things local AI and I just subscribed that’s what I need

  • @TaFeiYen
    @TaFeiYen 3 months ago

    First time seeing your vid. Interesting take. I know you have demonstrated some use cases of the models. But for most people, there are way too many models to pick from. Do you have a guide on that, to narrow down which model to use? I know it will always be biased, but I would like to hear your take.

    • @technovangelist
      @technovangelist 3 months ago +1

      This was the first, maybe second, time I looked at one model. I would like to do more of them.

  • @merefield2585
    @merefield2585 4 months ago

    Hey Matt, thanks for a great video - do you keep the code featured in your videos in public repos?

  • @ts757arse
    @ts757arse 4 months ago

    I've just tried this with fabric. It's as good at summarising youtube videos as microsoft 8x7B. Still gets confused occasionally with certain tasks.
    I've got a large-ish DB of technical documents for an obscure bit of research kit and I'm going to try swapping out my current LLM and see if this performs better. It'll probably be about 10x faster.

  • @dna100
    @dna100 4 months ago

    Lovin' the channel. 👍👍It'll be great once Ollama supports vision

    • @technovangelist
      @technovangelist 4 months ago +3

      Ollama does support vision today. Llama3.2 vision should be available very soon.

  • @chrisBruner
    @chrisBruner 4 months ago

    Good video

  • @UnwalledGarden
    @UnwalledGarden 4 months ago

    Awww yeah!

  • @autumblak
    @autumblak 3 months ago

    Hey matt, I have an intel based MacBook, and I want ollama to utilize my gpu, but I don't know how to go about it. I have searched all round but to no avail. Could you offer some pointers, or resources to where I can succeed?

    • @technovangelist
      @technovangelist 3 months ago

      Unfortunately there are no options. Well except buying an apple silicon MacBook or switching to a pc.

  • @zhouyangbo4498
    @zhouyangbo4498 4 months ago

    ollama run llama3.2:1b
    Error: llama runner process has terminated: signal: abort trap error:done_getting_tensors: wrong number of tensors; expected 147, got 146
    any idea about this error?

    • @technovangelist
      @technovangelist 4 months ago +1

      You need to update ollama. You should always update whenever there is a new version.

    • @zhouyangbo4498
      @zhouyangbo4498 4 months ago

      OK, I will try it. Maybe it is a GFW issue, thanks.

  • @aiamfree
    @aiamfree 4 months ago

    when is ollama getting the vision models anyone know?

    • @technovangelist
      @technovangelist 4 months ago +2

      The team is working on it.

    • @aiamfree
      @aiamfree 4 months ago

      @technovangelist awesome, thanks Team!

  • @ivanalberquilla9953
    @ivanalberquilla9953 4 months ago

    Thank you for the video. What is the tool you use for writing?

  • @megairrational
    @megairrational 4 months ago

    Great content. Could you briefly describe the machine you use for this task? You mentioned 3 seconds…

    • @technovangelist
      @technovangelist 4 months ago +2

      I usually do and forgot this time. M1 Max MacBook Pro with 64gb. A machine you can get for about 1500 usd today.

    • @megairrational
      @megairrational 4 months ago +1

      @technovangelist thank you! 64GB? Impressive. Please keep it up! You are a great communicator

  • @BulletProof-n7n
    @BulletProof-n7n 4 months ago

    Respect and appreciation for what you do. Any chance of using an LLM on the new Samsung Galaxy S23?

  • @thestype
    @thestype 4 months ago

    I asked it to create a component in JavaScript, a task at which llama3.1 8B and mistral-nemo succeeded well. But llama3.2 3B failed miserably, mixing up different libraries unintelligently. It's just fast, but a random word generator is fast too.

    • @technovangelist
      @technovangelist 4 months ago

      But a random word generator wouldn't be anywhere near as good as llama3.2 3b.

  • @martijnveenman
    @martijnveenman 4 months ago +1

    Amazing video, thank you. Is companion the only ai plugin you use in Obsidian? Looking forward to seeing more practical AI obsidian applications.

  • @agi_lab
    @agi_lab 4 months ago

    I'd ask you to test LLMs on some more complex tools (even something as simple as a file-create tool fails on the 3B model). I assume that if I give it a proper function description, it might not fail. Need to experiment.

  • @antoniomonteiro3698
    @antoniomonteiro3698 4 months ago

    llama 3.2 1B Q4:
    To count the number of 'F's and 'R's, I'll go through the word "strawberry" carefully.
    Here's the breakdown:
    * 'F': 1
    * 'R': 2
    So, there are 1 'F' and 2 'R's in the word "strawberry".
    or:
    please create a java program to count the number of 'F's and 'R's in the word 'strawberry'
    public class Main {
        public static void main(String[] args) {
            // Define the word
            String word = "strawberry";
            // Initialize counters for 'F's and 'R's
            int countF = 0;
            int countR = 0;
            // Loop through each character in the word
            for (char c : word.toLowerCase().toCharArray()) {
                // Check if the character is 'F' or 'R'
                if (c == 'f' || c == 'r') {
                    // Increment the counter
                    if (c == 'f') {
                        countF++;
                    } else {
                        countR++;
                    }
                }
            }
            // Print the results
            System.out.println("Number of 'F's: " + countF);
            System.out.println("Number of 'R's: " + countR);
        }
    }
    output:
    Number of 'F's: 0
    Number of 'R's: 3
    sorry, they left me home alone...

  • @TLabsLLC-AI-Development
    @TLabsLLC-AI-Development 4 months ago

    Meta Matt!

  • @shuntera
    @shuntera 4 months ago

    Matthew Berman’s review of the vision models shows them to be censored to the point of uselessness. Hopefully someone will bring out uncensored versions.

    • @technovangelist
      @technovangelist 4 months ago

      Reviews of the vision stuff seem to be mixed so far. It could be that some of the folks who have it have a bad implementation. I look forward to finding out.

  • @MK-kk1zb
    @MK-kk1zb 9 days ago

    I have a question for you and would be more than happy if you could answer.
    I want to run a local AI (Ollama + docker + OpenwebUI) for our company and for its server, I wanted to build a machine with the below specs:
    AMD Ryzen Threadripper PRO 7965WX (24 cores/48 threads)
    ASUS Pro WS WRX80E-SAGE SE WIFI
    two MSI GeForce RTX 4090 SUPRIM LIQUID X 24G Hybrid Cooling 24GB
    8x 16GB DDR5 ECC
    Corsair AX1600i 1600 Watt 80 Plus Titanium ATX Fully Modular Power Supply
    Noctua NH-U14S TR4-SP3
    2T SSD storage
    If I purchase and run this machine, can, e.g., 1000 people join and use Ollama at the same time? Or will there be a delay, or is there some limitation, etc.?
    Thank you in advance.

    • @technovangelist
      @technovangelist 9 days ago

      asking questions at the same time? probably not. a bit too low end. If they are all asking at roughly the same time, maybe a network of 100 of those machines with some sort of proxy? assuming you are only using 8b models. it would be a lot easier and cheaper to use openai's enterprise solution on azure.

    • @MK-kk1zb
      @MK-kk1zb 9 days ago

      @technovangelist thank you for the response, but I think a more realistic number of concurrent users would be around 50, and we don't mind a bit of slowness... but no matter what, the solution should be something local and offline... so do you think this setup can handle 50 users at a time with an 8B model?

  • @BruceWayne15325
    @BruceWayne15325 3 months ago

    I love small LLMs. I don't think people realize the power they have to simplify their lives. I love to use Obsidian for note taking. Using a local LLM, I can have it easily summarize my notes, giving me an at-a-glance view of each daily note. Have a long meeting? Transcribe it and summarize it, then stuff it in your notes. When we get agents then they will become vastly more useful.
    People put too much emphasis on the ability of models to do analytical tasks with great accuracy. They don't understand that the power of AI is the ability to have the AI write a program on-the-fly to do that kind of analysis, and then give you the result. AI will never be 100% accurate. It's like a human. We make mistakes too. We are spoiled with computers and the ability to have 100% reliable answers, but an AI can give that to you, just through the extra step of creating a program to accomplish the task.

  • @kshabana_YT
    @kshabana_YT 4 months ago +2

    Why did you quit Ollama 😢😢😢

    • @technovangelist
      @technovangelist 4 months ago +1

      Are you asking about quitting the app? Or why I left the company? That second thing is not something for this comment thread.

    • @emmanuelgoldstein3682
      @emmanuelgoldstein3682 4 months ago +2

      @technovangelist Due to your hesitance on commenting, we'll just assume they were having Diddy parties until you clear it up

    • @starlord7526
      @starlord7526 4 months ago

      @emmanuelgoldstein3682 did you just say diddy party brah? jajajajaja

    • @kshabana_YT
      @kshabana_YT 4 months ago

      Company

  • @arkemiffo
    @arkemiffo 4 months ago

    Just tried the 3.2:3b. I said hello and got a reply blazingly fast, so I asked if it was on meth or something. Got the standard "I'm just a model, I can't human", so I said I was just surprised to see such fast answers on a local model. And this is where things got confused.
    Apparently, Llama3.2:3b thinks it's working off a cloud-service. It refused the notion that I'm running this locally.
    Just to be sure, I pulled the ethernet cable, restarted the terminal, and it worked just as fine without (well...duh).
    I just find it fascinating that the model itself almost recoils at the notion of being local.

  • @junaidmundichipparakkal1700
    @junaidmundichipparakkal1700 4 months ago

    Can someone point to ollama and RAG locally from Matt?

  • @changeagent228
    @changeagent228 4 months ago +1

    First test I did was "what number is larger 9.9 or 9.11?" and it insisted 9.11 was bigger. When is 2.3 out?

  • @omercelebi2012
    @omercelebi2012 4 months ago

    Man you forgot your cup!

  • @BeauKpad
    @BeauKpad 3 months ago

    My favorite hallucination : I work for the Jill Stein for President campaign so I use info about her to test AIs. She's a public figure, but a minor one, and I know more about her than most sources.
    When playing around with WebSim, I asked it to make me a Jill Stein fan site.
    The results were shockingly accurate, history, platform, etc... except the photos. The photos were all of Jill Biden.
    For all the normies, Jill Biden is Joe Biden's wife. Jill Stein was running against Joe Biden.
    I pointed this out to the AI, and it replaced them with photos of Kyrsten Sinema. If you don't know who that is, I really don't have the bandwidth to explain, but it is simultaneously shockingly wrong and quite funny. Like, I know what it is getting at.

  • @protovici1476
    @protovici1476 4 months ago

    The vision portion isn't too great.

  • @iamarto
    @iamarto 4 months ago +2

    Whoever took the sponsorship from Meta, I don't think they asked for it. But in case you haven't noticed, they have more subscribers than you.

    • @technovangelist
      @technovangelist 4 months ago +2

      Some have 1/3 the number of subs compared to me. So that’s not it.

  • @ChristophBackhaus
    @ChristophBackhaus 4 months ago +2

    I want you to count the number of r's in Strawberry.
    To do so I want you to go Letter by letter and every time you find one r I want you to count up
    Gets it right every time...

    • @danielarista1352
      @danielarista1352 4 months ago

      >>> I want you to count the number of r's in aferradfafa;;rr;r;adfafaf'''.. To do so I want you to go letter by letter and every time you find one r I want you to count up.
      Let's start counting:
      a - no R (0)
      f - no R (0)
      e - no R (0)
      r - found an R! (1)
      r - found another R! (2)
      a - no R (2)
      d - no R (2)
      f - no R (2)
      a - no R (2)
      Continuing...
      ;; is a semicolon, not a letter to count
      rr - found two Rs! (4)
      ; - no R (4)
      r - found another R! (5)
      ; - no R (5)
      a - no R (5)
      d - no R (5)
      f - no R (5)
      a - no R (5)
      f - no R (5)
      a - no R (5)
      f - no R (5)
      So, there are 5 Rs in the given text.
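
As a quick cross-check of the total, a few lines of Python count the letter deterministically. The helper name is hypothetical, and the test string is copied from the prompt above with its trailing punctuation assumed to be part of the text; the punctuation does not affect the count of r's.

```python
def count_letter(text, letter):
    """Count case-insensitive occurrences of a single letter in text."""
    return text.lower().count(letter.lower())


print(count_letter("aferradfafa;;rr;r;adfafaf'''..", "r"))
print(count_letter("Strawberry", "r"))
```

The final total in the model's walkthrough agrees with this count, even though its letter-by-letter listing does not follow the string exactly.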

  • @PeterHagen
    @PeterHagen 4 months ago +2

    Llama 3.1 & 3.2 are unfortunately very poor in Dutch language usage

  • @SlykeThePhoxenix
    @SlykeThePhoxenix 4 months ago

    There's 4 killers in the room. Since when does dying make you not a killer?

  • @sskohli79
    @sskohli79 4 months ago

    Hey Matt, nice video. But I don’t think it’s as impressive as you put it. I am sure the llama3.1’s performance was comparable

    • @technovangelist
      @technovangelist 4 months ago +3

      It wasn’t available in a 1 and 3 b model.

  • @marcobassini3576
    @marcobassini3576 1 month ago

    Asking twice and getting two different answers is not good! It would be preferable to get the (more or less) wrong answer, but consistently every time.
    How can you rely upon a tool that cannot decide what to answer? You NEVER know if it is right or wrong (unless you already know the answer, which negates the usefulness of the assistant).

  • @Jason-ju7df
    @Jason-ju7df 4 months ago

    Microsoft GRIN MoE: A Gradient-Informed Mixture of Experts MoE Model 6.6b
    Ranks better

    • @technovangelist
      @technovangelist 4 months ago +1

      In benchmarks? Or in real tests. One is useful the other has zero real value.

  • @xevil21
    @xevil21 4 months ago

    It's amazing how such a small model is smarter than you?

  • @amoledzeppelin
    @amoledzeppelin 21 days ago

    Hmm, Llama-3.2-1B-Instruct.Q8_0 answered the strawberry question correctly.