Comparing LLMs with LangChain

  • Published 17 Oct 2024
  • Colab Notebook: colab.research...
    In this video I look at how to compare various models and their outputs across a variety of tasks. This is an area LangChain is currently adding to as well. Models covered include GPT-3, ChatGPT ('gpt-3.5-turbo'), Flan-20B, Flan-T5-XL, and Cohere-command-xl. A sketch of this kind of comparison appears at the end of this description.
    If you have any requests for videos, please let me know in the comments.
    For more tutorials on using LLMs and building Agents, check out my Patreon:
    Patreon: / samwitteveen
    Twitter: / sam_witteveen
    My Links:
    Linkedin: / samwitteveen
    Github:
    github.com/sam...
    github.com/sam...
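
    A minimal sketch of this kind of side-by-side comparison, using LangChain's ModelLaboratory as it existed around the time of the video (imports may have moved in newer releases, and the model names here are illustrative):

    from langchain.llms import Cohere, HuggingFaceHub, OpenAI
    from langchain.model_laboratory import ModelLaboratory

    # One shared temperature keeps the comparison fair across models.
    llms = [
        OpenAI(model_name="text-davinci-003", temperature=0.1),   # GPT-3
        Cohere(model="command-xlarge-nightly", temperature=0.1),  # Cohere command
        HuggingFaceHub(repo_id="google/flan-t5-xl",
                       model_kwargs={"temperature": 0.1}),        # Flan-T5-XL
    ]

    # Run every model on the same prompt and print the outputs side by side.
    lab = ModelLaboratory.from_llms(llms)
    lab.compare("Answer the following question by reasoning step by step. "
                "The cafeteria had 23 apples. If they used 20 for lunch and "
                "bought 6 more, how many apples do they have?")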

COMMENTS • 30

  • @FreestyleTraceur
    @FreestyleTraceur A year ago +3

    Really glad I stumbled upon your videos. Super high quality, informative, and helpful. I'm sure your subscriber count will be ballooning in the coming weeks and months. Great stuff!

  • @micbab-vg2mu
    @micbab-vg2mu A year ago

    At the moment, because I am a beginner, I focus only on the GPT models, but maybe in the future I will use others. Thank you for the video.

  • @AntonioSorrentini
    @AntonioSorrentini A year ago +1

    The most powerful, and by far the best of all of these today, are Alpaca 7B and Vicuna 13B. Yes, these days innovation runs fast and frenetic in this sector.

    • @samwitteveenai
      @samwitteveenai  A year ago +2

      Check out Koala 13B; it's like Vicuna but trained on more data.

    • @AntonioSorrentini
      @AntonioSorrentini A year ago

      @@samwitteveenai Thank you very much!

  • @CesarAngeles28
    @CesarAngeles28 A year ago

    Hi Sam,
    Thanks so much for the video. It is quite cool and funny as well. You mention a paper in there; is it a research paper? Could we get hold of it?
    Thanks so much again and all the best!

    • @samwitteveenai
      @samwitteveenai  A year ago

      Yeah, all papers should be on arXiv, if not in the description of the video. That video is from quite a while back, so I'm not sure which paper you are talking about, but take a look on arXiv.

  • @AykutKlc
    @AykutKlc A year ago +6

    Alpaca 30B - 4bit quantized:
    > What is the opposite of up?
    Down.
    > Answer the following question by reasoning step by step. The cafeteria had 23 apples. If they used 20 for lunch, and bought 6 more, how many apples do they have?
    They now have 18 apples (the original 23 minus the 5 that were eaten at lunch).
    > Can Geoffrey Hinton have a conversation with George Washington? Give the rationale before answering.
    No, because they are not in the same time period and thus cannot communicate directly. However, if we had access to some sort of portal or device that could allow them both to be present at once then it would theoretically be possible for Geoffrey Hinton to have a conversation with George Washington.
    The others weren't good enough.

    • @samwitteveenai
      @samwitteveenai  A year ago +1

      I haven't tried the 4-bit version, but the 30B model should give better results than that if fine-tuned properly.
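
      A minimal sketch of one way to load a 4-bit quantized checkpoint, via transformers + bitsandbytes (the model path is a placeholder; at the time of these comments most 4-bit runs went through GPTQ or llama.cpp instead):

      from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

      model_id = "path/to/alpaca-30b"  # placeholder: point at your own Alpaca weights
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(
          model_id,
          quantization_config=BitsAndBytesConfig(load_in_4bit=True),
          device_map="auto",  # shard layers across available GPUs/CPU
      )

      inputs = tokenizer("What is the opposite of up?", return_tensors="pt").to(model.device)
      outputs = model.generate(**inputs, max_new_tokens=64)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))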

  • @raghadabdulrab7844
    @raghadabdulrab7844 8 months ago

    What are my options if I want to compare really large LLMs?

  • @sharifehfadaei1245
    @sharifehfadaei1245 11 months ago

    Thank you for your wonderful video 👌 How can I compare the performance of different fine-tuned LLMs on my dataset?

  • @doctalk
    @doctalk A year ago +1

    It would be great if you compared the Alpaca one here as well.

  • @hamzakivideo
    @hamzakivideo A year ago

    Great video. Do you think we can do the same evaluation for QA retrieval from a doc?

    • @samwitteveenai
      @samwitteveenai  A year ago

      Yeah, you should be able to, but you might have to be a bit more manual about how you do it.

    • @hamzakivideo
      @hamzakivideo A year ago

      @@samwitteveenai I have been working on one; I got MPNet embeddings and Flan working for retrieval. It would be great to collaborate with you on it!
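
      One way to make that evaluation less manual is LangChain's QAEvalChain, which uses an LLM to grade a chain's answers against reference answers. A minimal sketch (the question/answer pairs and predictions below are made-up placeholders; in practice the predictions would come from your retrieval chain):

      from langchain.llms import OpenAI
      from langchain.evaluation.qa import QAEvalChain

      # Hand-written references and the chain's outputs (placeholders here).
      examples = [
          {"query": "Who wrote the report?", "answer": "Sam"},
          {"query": "What year was it published?", "answer": "2023"},
      ]
      predictions = [
          {"result": "The report was written by Sam."},
          {"result": "It came out in 2023."},
      ]

      eval_chain = QAEvalChain.from_llm(OpenAI(temperature=0))
      graded = eval_chain.evaluate(examples, predictions,
                                   question_key="query",
                                   answer_key="answer",
                                   prediction_key="result")
      print(graded)  # e.g. [{'results': ' CORRECT'}, ...] (key name varies by version)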

  • @RedCloudServices
    @RedCloudServices A year ago

    Sam, thank you. These LLM reviews could become Yelp for AI 😆 I have a question: can you post a video about computational LLMs? For example, providing an actual table of data and then having a conversation or deriving insights with the LLM. LangChain has a CSV agent, but do others?

    • @samwitteveenai
      @samwitteveenai  A year ago +1

      The CSV idea is certainly interesting. I am planning to show fine-tuning a custom model for natural-language-to-SQL. Maybe I will try to incorporate CSV or table analysis in there.

    • @RedCloudServices
      @RedCloudServices A year ago

      @@samwitteveenai This GPT plugin announcement could make so many computational use cases possible. There are LangChain text-to-SQL agents, but CSV is such a simple use case. Imagine uploading an entire budget integrated with a custom LLM corpus, or an ML plugin with label predictions on the fly.

    • @samwitteveenai
      @samwitteveenai  A year ago +1

      I do think there are a lot of discrete applications currently, but I am also very interested in neural plugins, e.g. separate models within a plugin for ChatGPT.
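
      For reference, the CSV agent mentioned in this thread wraps a table in an agent that writes and runs pandas code to answer questions about it. A minimal sketch using the old-style import (it later moved to langchain_experimental.agents; "budget.csv" is a placeholder file):

      from langchain.agents import create_csv_agent
      from langchain.llms import OpenAI

      # The agent loads the CSV into a DataFrame and lets the LLM query it.
      agent = create_csv_agent(OpenAI(temperature=0), "budget.csv", verbose=True)
      agent.run("Which category has the highest total spend?")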

  • @attilavass6935
    @attilavass6935 A year ago +1

    The creative task should have been done with a much higher temp than 0.1, I think.
    I see it's not easy to set up params per task in the comparison, but I might try this with higher temps later.

    • @samwitteveenai
      @samwitteveenai  A year ago

      Yes, totally. The more randomness (creativity) you want, the higher the temp should be.
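
      Since LangChain sets temperature per LLM instance, a simple way to vary it by task is to build one wrapper per task type. A minimal sketch (the task split is illustrative):

      from langchain.llms import OpenAI

      factual_llm = OpenAI(temperature=0.1)   # low temp: focused, repeatable answers
      creative_llm = OpenAI(temperature=0.9)  # high temp: more randomness for creative tasks

      print(factual_llm("What is the capital of France?"))
      print(creative_llm("Write a two-line poem about ducks."))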

  • @ChenXibo
    @ChenXibo A year ago

    What a funny video you made! I love it.

  • @toddnedd2138
    @toddnedd2138 A year ago

    Very nice information. Giving an LLM a kind of memory is a little bit challenging. Currently I'm trying to fine-tune LLaMA for text-to-SQL; then it could access the DB for previous information. The DB schema is not so easy to design; maybe some (artificial) intelligence could help :D

    • @samwitteveenai
      @samwitteveenai  A year ago

      Are you using a public dataset?

    • @toddnedd2138
      @toddnedd2138 A year ago +1

      @@samwitteveenai Thanks for asking. No, I generate the dataset myself. To be honest, I let GPT-4 do all the heavy lifting. It is also more of a spare-time project, and maybe a simple and naive approach. To give you an example, using your example sentence, a little bit extended:
      1. Let the AI create a semantic interpretation of the sentence.
      Q: Find the subject and its attributes in the following sentence and classify the attributes. The sentence is "Hi, I'm Sam and 42 years old".
      A: "Sam" is an attribute of the subject that specifies the name of the person being referred to, and "42 years old" is another attribute that specifies the age of the person.
      2. Let the AI create a class object, e.g. in C#.
      Q: Could you give me a C# class that models the subject?
      A: public class Person
      {
          public string Name { get; set; }
          public int Age { get; set; }
      }
      3. Let the AI create the SQL to create the table and store the concrete object.
      CREATE TABLE Persons (
          Id INT IDENTITY(1,1) PRIMARY KEY,
          Name NVARCHAR(50) NOT NULL,
          Age INT NOT NULL
      );
      INSERT INTO Persons (Name, Age)
      VALUES ('Sam', 42);
      The training data point then consists of:
      {'instruction': 'Convert text to sql',
       'input': 'Hi, I'm Sam and 42 years old',
       'output': 'CREATE TABLE Persons (
           Id INT IDENTITY(1,1) PRIMARY KEY,
           Name NVARCHAR(50) NOT NULL,
           Age INT NOT NULL
       );
       INSERT INTO Persons (Name, Age)
       VALUES ('Sam', 42);'}
      Not sure if and how it will work; maybe more context is necessary in the instruction and/or input.

    • @samwitteveenai
      @samwitteveenai  A year ago

      @@toddnedd2138 Check out WikiSQL (www.kaggle.com/datasets/shahrukhkhan/wikisql); you may find it useful.

    • @toddnedd2138
      @toddnedd2138 A year ago

      @@samwitteveenai Thank you for the link, appreciate it very much. 🤩
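
      For reference, a minimal sketch of writing the kind of training record described above to a JSONL file, the format most instruction-tuning scripts expect (the file name is a placeholder):

      import json

      # One instruction-tuning record, mirroring the example in the thread.
      record = {
          "instruction": "Convert text to sql",
          "input": "Hi, I'm Sam and 42 years old",
          "output": (
              "CREATE TABLE Persons (\n"
              "  Id INT IDENTITY(1,1) PRIMARY KEY,\n"
              "  Name NVARCHAR(50) NOT NULL,\n"
              "  Age INT NOT NULL\n"
              ");\n"
              "INSERT INTO Persons (Name, Age) VALUES ('Sam', 42);"
          ),
      }

      # Append one JSON object per line.
      with open("text_to_sql.jsonl", "a", encoding="utf-8") as f:
          f.write(json.dumps(record) + "\n")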