Stanford CS229 | Machine Learning | Building Large Language Models (LLMs)

  • Published Nov 1, 2024

COMMENTS • 174

  • @CG-hj1cu
    @CG-hj1cu 1 month ago +339

    I'm a student for life....approaching 40.....never had the privilege of attending a university like Stanford. To get access to these quality lectures is amazing. Thank you

    • @Fracasse-0x13
      @Fracasse-0x13 1 month ago +1

      This is a quality lecture?

    • @kevinlanahan2194
      @kevinlanahan2194 1 month ago +16

      @@Fracasse-0x13 for people who don't have access to education, yes, it is a quality lecture.

    • @darrondavis5848
      @darrondavis5848 28 days ago +3

      i am living my dreams

    • @shaohongchen1063
      @shaohongchen1063 28 days ago +8

      @@Fracasse-0x13 why is this not a quality lecture?

    • @MyLordaizen
      @MyLordaizen 26 days ago +1

      They're all the same.
      Everything is on the web.
      You don't need certification to tell the world you know it.
      Build the best.

  • @nothing12392
    @nothing12392 2 months ago +322

    It is one thing to be a great research institution, but to be a great research institution that is full of talented and kind lecturers is extremely impressive. I've been impressed by every single Stanford course and lecture I have participated in through SCPD and YouTube, and this lecturer is no exception.

    • @stanfordonline
      @stanfordonline  1 month ago +14

      Thank you for sharing your positive experiences with our courses and lectures!

  • @bp3016
    @bp3016 1 month ago +287

    If my teachers in school looked this good, I wouldn't miss a single class. He's handsome af.

  • @yanndubois3914
    @yanndubois3914 1 month ago +301

    Slides: drive.google.com/file/d/1B46VFrqFAPAEj3kaCrBAtQqeh2_Ztawl/view?usp=sharing

    • @Imperfectly_perfect_007
      @Imperfectly_perfect_007 1 month ago +4

      Thank you sir... I heartily appreciate it 😊... the lecture was awesome 🤌

    • @tastyjourney215
      @tastyjourney215 1 month ago +3

      Thank you so much, I really appreciate it

    • @helloadventureworld
      @helloadventureworld 1 month ago +6

      The lecture was perfect. Is there a playlist for the whole CS229 class from the same semester as this video? All I have found was from before 2022, which left me wondering.

    • @yanndubois3914
      @yanndubois3914 1 month ago +5

      @@helloadventureworld no, the rest of CS229 has not been released and I don't know if it will. This is only the guest lecture.

    • @helloadventureworld
      @helloadventureworld 1 month ago +3

      @@yanndubois3914 Thanks for the response and information you have shared :)

  • @SudipBishwakarma
    @SudipBishwakarma 2 months ago +32

    This is really a great lecture, super dense but still digestible. It's not even been 2 years since ChatGPT was released to the public, and seeing the rapid pace of research around LLMs and how quickly they keep getting better is really interesting. Thank you so much, now I have some papers to read to further my understanding.

  • @EduardoLima
    @EduardoLima 15 days ago +14

    We live in a tremendous moment in time. Free access to the best lectures on the most relevant topic from the best university

  • @ReflectionOcean
    @ReflectionOcean 1 month ago +43

    Insights By "YouSum Live"
    00:00:05 Building large language models (LLMs)
    00:00:59 Overview of LLM components
    00:01:21 Importance of data in LLM training
    00:02:59 Pre-training models on internet data
    00:04:48 Language models predict word sequences
    00:06:02 Auto-regressive models generate text
    00:10:48 Tokenization is crucial for LLMs
    00:19:12 Evaluation using perplexity
    00:22:07 Challenges in evaluating LLMs
    00:29:00 Data collection is a significant challenge
    00:41:08 Scaling laws improve model performance
    01:00:01 Post-training aligns models with user intent
    01:02:26 Supervised fine-tuning enhances model responses
    01:10:00 Reinforcement learning from human feedback
    01:19:01 DPO simplifies reinforcement learning process
    01:28:01 Evaluation of post-training models
    01:37:20 System optimization for LLM training
    01:39:05 Low precision improves GPU efficiency
    01:41:38 Operator fusion enhances computational speed
    01:44:23 Future considerations for LLM development
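
Several entries in the timestamp list above (tokenization, perplexity) refer to concrete algorithms. As a rough illustration of the byte-pair-encoding idea behind the tokenization segment, here is a minimal sketch; the toy corpus and merge count are illustrative assumptions, not taken from the lecture.

```python
from collections import Counter

def most_frequent_pair(tokens):
    # Count adjacent token pairs in the sequence and return the most common one.
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def bpe_merges(text, num_merges):
    # Start from individual characters; repeatedly merge the most frequent
    # adjacent pair into a new token (the core BPE training loop).
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        merged = pair[0] + pair[1]
        merges.append(merged)
        out, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
                out.append(merged)
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        tokens = out
    return tokens, merges

# Toy corpus: frequent subword "low" gets merged into a single token.
tokens, merges = bpe_merges("low low low lower lowest", 3)
print(merges)
```

Real tokenizers (e.g. the GPT-4 tokenizer mentioned at 18:40) train tens of thousands of merges over bytes rather than characters, but the loop is the same shape.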

  • @SerhiiFedorov-v1l
    @SerhiiFedorov-v1l 9 days ago +2

    Thank you for the video! I am glad that we live in this time and can witness the development of AI technologies.

  • @Grace_lr
    @Grace_lr 26 days ago +47

    Suddenly I am interested in LLMs

  • @anshdeshraj
    @anshdeshraj 28 days ago +17

    finally someone said Machine Learning instead of slapping AI on everything!

    • @duartesilva7907
      @duartesilva7907 10 days ago

      I feel that whenever someone talks about AI a lot it means that they know nothing about it

  • @dr.mikeybee
    @dr.mikeybee 2 months ago +22

    This is very well done. It's super easy to understand. I think your students should learn a lot. It's a great skill to be able to present complex material in a simple fashion. It means you really understand both the material and your audience.

  • @devanshmishra-ez1tn
    @devanshmishra-ez1tn 13 days ago +6

    00:10 Building Large Language Models overview
    02:21 Focus on data evaluation and systems in industry over architecture
    06:25 Auto regressive language models predict the next word in a sentence.
    08:26 Tokenizing text is crucial for language models
    12:38 Training a large language model involves using a large corpus of text.
    14:49 Tokenization process considerations
    18:40 Tokenization improvement in GPT 4 for code understanding
    20:31 Perplexity measures model hesitation between tokens
    24:18 Comparing outputs and model prompting
    26:15 Evaluation of language models can yield different results
    30:15 Challenges in training large language models
    32:06 Challenges in building large language models
    35:57 Collecting real-world data is crucial for large language models
    37:53 Challenges in building large language models
    41:38 Scaling laws predict performance improvement with more data and larger models
    43:33 Relationship between data, parameters, and compute
    47:21 Importance of scaling laws in model performance
    49:12 Quality of data matters more than architecture and losses in scaling laws
    52:54 Inference for large language models is very expensive
    54:54 Training large language models is costly
    59:12 Post training aligns language models for AI assistant use
    1:01:05 Supervised fine-tuning for large language models
    1:04:50 Leveraging large language models for data generation and synthesis
    1:06:49 Balancing data generation and human input for effective learning
    1:10:23 Limitations of human abilities in generating large language models
    1:12:12 Training language models to maximize human preference instead of cloning human behaviors.
    1:16:06 Training reward model using softmax logits for human preferences.
    1:18:02 Modeling optimization and challenges in large language models (LLMs)
    1:21:49 Reinforcement learning models and potential benefits
    1:23:44 Challenges with using humans for data annotation
    1:27:21 LLMs are cost-effective and have better agreement with humans than humans themselves
    1:29:12 Perplexity is not calibrated for large language models
    1:33:00 Variance in performance of GPT-4 based on prompt specificity
    1:34:51 Pre-training data plays a vital role in model initialization
    1:38:32 Utilize GPUs efficiently with matrix multiplication
    1:40:21 Utilizing 16 bits for faster training in deep learning
    1:44:08 Building Large Language Models from scratch
    Crafted by Merlin AI.
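
The entry above on the relationship between data, parameters, and compute corresponds to a widely used rule of thumb in the scaling-laws literature: training cost is roughly 6 FLOPs per parameter per token. A minimal sketch, where the parameter and token counts are illustrative numbers of my own, not figures from the lecture:

```python
def train_flops(n_params, n_tokens):
    # Standard rule of thumb: ~6 FLOPs per parameter per training token
    # (~2 for the forward pass, ~4 for the backward pass).
    return 6 * n_params * n_tokens

# Illustrative only: a 7e9-parameter model trained on 1e12 tokens.
flops = train_flops(7e9, 1e12)
print(f"{flops:.1e}")  # → 4.2e+22
```

Plugging in a model's size and dataset size this way is how the "training is costly" estimates around 54:54 are usually back-of-enveloped.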

  • @mukammedalimbet2351
    @mukammedalimbet2351 6 days ago +1

    Great! Thanks for sharing! One thing I would suggest is to transcribe or add subtitles for the questions asked by the students. That way we could better understand the answers given by the lecturer.

  • @Qxxliu
    @Qxxliu 27 days ago +2

    One good point from their discussion of the difference between PPO and DPO: a reward model can reduce the dependency on labeled preference data.
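
To make the distinction concrete: DPO skips the separately trained reward model and the RL rollout, and instead applies a direct loss to each (chosen, rejected) preference pair, measured against a frozen reference model. A minimal sketch of the standard DPO objective for a single pair; the log-probabilities and beta below are made-up illustrative values, not numbers from the lecture.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # DPO margin: how much more the policy prefers the chosen response
    # over the rejected one, relative to the frozen reference model.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Loss is -log sigmoid(margin): small when the policy already
    # prefers the human-chosen response, large otherwise.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Illustrative log-probabilities: the policy slightly prefers the chosen answer.
print(dpo_loss(-10.0, -12.0, -11.0, -11.5, beta=0.1))
```

Increasing the chosen response's log-probability (or decreasing the rejected one's) widens the margin and lowers the loss, which is exactly the preference-maximization behavior discussed around 1:12:12.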

  • @BMoRideNGrind
    @BMoRideNGrind 25 days ago +3

    Really incredible delivery of complicated information. ❤

  • @minhatvo82
    @minhatvo82 1 month ago +9

    fantastic, wonderful, significant, magnificent, outstanding, class of titans, world-class🎉

  • @thunderbirdk
    @thunderbirdk 4 days ago +1

    Wow! Such a wonderful presentation! Thanks so much!

  • @majidmehmood3780
    @majidmehmood3780 9 days ago +1

    People should first learn about basic language models like bigrams and unigrams. These were the first language models, and Stanford really has good lectures on them.

  • @Nightsd01
    @Nightsd01 10 days ago +1

    What an awesome video. Data quality is a real issue, and even more interestingly, LLMs learn a lot like humans. Introduce the simpler concepts first (training data prompts) and then introduce more complex subjects, and the LLMs learn more, just like humans.

  • @KelvinMeeks
    @KelvinMeeks 10 days ago +1

    Great talk. Loved the level of detail, the insights, the pacing.

  • @for-ever-22
    @for-ever-22 2 months ago +8

    This is an amazing high-level breakdown of LLMs. Every aspect of an LLM was mentioned. Thank you for this amazing video. I'll come back here often.

  • @sonudixit-h3w
    @sonudixit-h3w 2 months ago +4

    Thanks a lot for sharing this. I would like to point out a correction at time 20:28: consider the case prob(true_token)

    • @yanndubois3914
      @yanndubois3914 2 months ago

      Yes that's correct, it's the baseline performance of a very bad language model.
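
The truncated correction above concerns the perplexity baseline: a model that assigns uniform probability 1/|V| to every true token has perplexity exactly |V|, which is the "very bad language model" baseline the reply refers to. A minimal sketch; the vocabulary size and sequence length are illustrative assumptions.

```python
import math

def perplexity(token_probs):
    # Perplexity = exp of the average negative log-probability the model
    # assigned to each true token; lower means less "hesitation".
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

vocab_size = 50_000
# A model that guesses uniformly assigns 1/|V| to every true token,
# so its perplexity equals the vocabulary size.
uniform = [1 / vocab_size] * 10
print(round(perplexity(uniform)))  # → 50000
```

A perfect model (probability 1 on every true token) would score perplexity 1, so real models land between 1 and |V|.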

  • @PratikBhavsar1
    @PratikBhavsar1 2 months ago +10

    Very informative, updated and crisp~ keep them coming..don't stop now!

  • @namazbekbekzhan
    @namazbekbekzhan 20 days ago +3

    00:10 Overview of building large language models
    02:21 Focus on data evaluation and systems in practice
    06:25 Autoregressive language models predict the next word
    08:26 Text tokenization and vocabulary size are crucial for language models
    12:38 Tokenization and training tokenizers
    14:49 Optimizing the tokenization process and token-merging decisions
    18:40 GPT-4 improved tokenization for better code understanding
    20:31 Perplexity measures the model's hesitation between words
    24:18 Evaluating open-ended questions is challenging
    26:15 Different ways of evaluating large language models
    30:15 Steps for preprocessing web data for large language models
    32:06 Challenges in handling duplicates and filtering low-quality documents at scale
    35:57 Collecting real-world data is crucial for practical large language models
    37:53 Challenges in pre-training large language models
    41:38 Scaling laws predict performance improvements with more data and larger models
    43:33 Compute is determined by data and parameters
    47:21 Understanding the importance of scaling laws when building large language models
    49:12 Good data is crucial for better scaling
    52:54 Inference for large language models is expensive
    54:54 Training large language models requires high compute costs
    59:12 Large language models (LLMs) require alignment fine-tuning to become AI assistants
    1:01:05 Building large language models (LLMs) involves fine-tuning pre-trained models on desired data
    1:04:50 Pre-trained language models are optimized for specific user types during fine-tuning
    1:06:49 Balancing synthetic data generation with human input is crucial for effective learning
    1:10:23 Challenges in creating content beyond human abilities
    1:12:12 Generating ideal answers using preference maximization
    1:16:06 Training a reward model using logits for continuous preferences
    1:18:02 Training large language models with PPO and challenges in reinforcement learning
    1:21:49 Discussion of reinforcement learning methods and their advantages in using reward models
    1:23:44 Challenges of using humans as data annotators
    1:27:21 LLMs are more cost-effective and offer better agreement than humans
    1:29:12 Problems with perplexity and calibration in language models
    1:33:00 Variability in GPT-4 performance depending on prompts
    1:34:51 The importance of pre-training in large language models
    1:38:32 Using GPUs for matrix multiplication can be 10x faster, but communication and memory play a key role
    1:40:21 Reduced precision for faster matrix multiplication
    1:44:08 Building large language models (LLMs)
    Crafted by Merlin AI.

  • @sucim
    @sucim 2 months ago +12

    Fabulous lecture! Goes into all important concepts and also highlights the interesting details that are commonly glossed over, thanks for recording!

  • @mohammedosman4902
    @mohammedosman4902 2 months ago +15

    great lecture, wish the speaker had more time to go over the full presentation

  • @carvalhoribeiro
    @carvalhoribeiro 1 month ago +4

    Great presentation and very helpful. Thanks for sharing this

  • @nomi6761
    @nomi6761 2 months ago +6

    How do people know that "adding more data" is not just increasing likelihood of training on something from the benchmarks, while "adding more parameters" is not just increasing the recall abilities (parametric memory capacity) of the model to retrieve benchmark stuff during evaluation? Really curious about that point.

  • @maximshaposhnikov7970
    @maximshaposhnikov7970 1 month ago +2

    What an amazing lecture, now want a part 2 about the topics that haven’t been touched upon 🤩

  • @boeingpameesha9550
    @boeingpameesha9550 1 month ago +6

    My sincere thanks for sharing it.

  • @doomed5206
    @doomed5206 8 days ago +3

    suddenly I'm interested in LLMs 😗😗😗

  • @Pl15604
    @Pl15604 2 months ago +3

    The training algorithm is actually the key... It is because of RLHF that we have GPT-4

  • @danieleneh3193
    @danieleneh3193 10 days ago +1

    This is a gold mine

  • @AlphaVisionPro
    @AlphaVisionPro 25 days ago +6

    You can build my ❤️

  • @cui_1152
    @cui_1152 6 days ago

    Please give this dude 15 more minutes, for tiling, Flash Attention, and data and model parallelism!!

  • @squidwardswift
    @squidwardswift 17 days ago +4

    Dayum he’s fine

  • @SyedShayanAliShah
    @SyedShayanAliShah 8 days ago +1

    The reason Stanford graduates rule the world

  • @luxbran532
    @luxbran532 1 month ago +3

    Great lecture

  • @beansforbrain
    @beansforbrain 10 days ago +1

    Looking forward to doing a postdoc at SU

  • @imalive404
    @imalive404 1 month ago +3

    @5:55 there is an approximation. It lies in the axioms, the axiom being that probabilities should sum to 1. Second, the approximation is that the distribution only comes out of the given corpora. The given corpora are an approximation of the total population, which we all know has its own biases.
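
The point about conditionals summing to 1 and the corpus standing in for the "total population" can be made concrete with a toy autoregressive (bigram) model; the three-sentence corpus below is an illustrative assumption, not the lecture's example.

```python
from collections import Counter, defaultdict

def bigram_model(corpus):
    # Count adjacent word pairs, then normalize so each conditional
    # distribution p(next | current) sums to 1 -- the axiom in question.
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for cur, nxt in zip(words, words[1:]):
            counts[cur][nxt] += 1
    model = {}
    for cur, nxts in counts.items():
        total = sum(nxts.values())
        model[cur] = {nxt: c / total for nxt, c in nxts.items()}
    return model

def sequence_prob(model, words):
    # Autoregressive factorization: p(w1..wn) is approximated as the
    # product of the estimated conditionals p(w_i | w_{i-1}).
    p = 1.0
    for cur, nxt in zip(words, words[1:]):
        p *= model.get(cur, {}).get(nxt, 0.0)
    return p

# The corpus is the approximation of the "total population" of text.
model = bigram_model(["the cat sat", "the cat ran", "the dog sat"])
print(model["cat"])
print(sequence_prob(model, ["the", "cat", "sat"]))  # (2/3) * (1/2)
```

Any sequence with a pair never seen in the corpus gets probability 0 here, which is exactly the corpus bias the comment describes; real LLMs mitigate (but do not remove) it with vastly larger corpora and smoothing from the neural parameterization.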

  • @xiaoxiandong7382
    @xiaoxiandong7382 27 days ago +1

    would love to see the other recordings of cs25!

  • @esamyakIndore
    @esamyakIndore 14 days ago +1

    Please share more Machine Learning lectures

  • @SuperLano98
    @SuperLano98 1 month ago +2

    When will the other lectures be uploaded? This was so good!

  • @zeep14dabs
    @zeep14dabs 29 days ago +1

    This is amazing, can you guys make a playlist for beginners? Thank you!

  • @keshmesh123
    @keshmesh123 26 days ago +3

    thank you! great lecture.

  • @meer.sohrab
    @meer.sohrab 2 months ago +3

    The best one, we want more

  • @giuseppefrau1097
    @giuseppefrau1097 2 days ago

    Thanks for this great lecture. Is the lecture on transformers also available somewhere?

    • @stanfordonline
      @stanfordonline  2 days ago

      You might be interested in the lectures in this playlist: ua-cam.com/play/PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM.html&si=KmCNuzfcc_E0cxDg

  • @sahejagarwal801
    @sahejagarwal801 1 month ago +3

    Most amazing video ever

  • @hamzadata
    @hamzadata 2 months ago +4

    man this is amazing!

  • @nataliatenoriomaia1635
    @nataliatenoriomaia1635 8 days ago +1

    Can we please have access to the previous lecture about Transformers?

  • @alexmoonrock
    @alexmoonrock 7 days ago

    This interests me but I have no coding experience. Any tips on where to start? Surely Stanford lectures? Coding 101, I guess. Anything helps :)

  • @F3lp1s
    @F3lp1s 2 months ago +3

    So Amazing!

  • @enzoluispenagallegos5440
    @enzoluispenagallegos5440 2 months ago +4

    Thank you for this

  • @web3global
    @web3global 1 month ago +1

    Thank you! 🚀

  • @Neilblaze
    @Neilblaze 2 months ago +3

    Great content, thanks!

  • @kartikeychhipa3813
    @kartikeychhipa3813 2 months ago +4

    Just Amazing!

  • @njabulonzimande2893
    @njabulonzimande2893 23 days ago +1

    LLM - chatbots
    Architecture (Neural networks)
    Training algorithm
    Data
    Evaluation
    System

  • @Zoronoa01
    @Zoronoa01 9 days ago +1

    Where can we find the rest of the videos for CS229 summer 2024?

  • @GoWithAndy-cp8tz
    @GoWithAndy-cp8tz 5 days ago +1

    Interesting and a good chest btw. Clark Kent?

  • @rodrigoherurbi3992
    @rodrigoherurbi3992 29 days ago +2

    This genius saying "2K return tickets from JFK to LDN are not significant" (in terms of environmental impact) and that "next models will be +10X FLOPS" just makes me conclude that these guys are not only throwing money at the problem (i.e. gen AI) but don't have a thoughtful solution on how to train AI considering the environment and economic aspects of it.

  • @mudassiria
    @mudassiria 1 day ago

    The lecture is good, but the thing I dislike is the frequent switching between the slide screen and the lecturer camera. The video should keep the slide screen on the whole time, with a mini-player of the lecturer camera in the bottom corner. The switching irritated me throughout the lecture, making my focus fluctuate constantly.

  • @sanjayg1728
    @sanjayg1728 9 days ago

    Could you please share the link to the lecture on Transformers that you were referring to in the video?

  • @watchitpunk5616
    @watchitpunk5616 28 days ago +2

    Steve Rogers talking about AI ❤

  • @TwoMonkeys-im4rm
    @TwoMonkeys-im4rm 2 months ago +4

    A 2024 lecture

  • @Deepneuralmess
    @Deepneuralmess 2 months ago +3

    🇰🇪 well Represented.

  • @shoaibyehya3600
    @shoaibyehya3600 2 months ago +6

    Impressive

  • @losdewill
    @losdewill 19 days ago +2

    Yann, if you ever get to read this, you are a truly handsome man. I

  • @SettimiTommaso
    @SettimiTommaso 2 months ago +5

    Yes!

  • @balajinadar1503
    @balajinadar1503 28 days ago +2

    Ignore this comment
    Day 1 19:05
    Day 2 28:38

  • @weskerrongkaima1173
    @weskerrongkaima1173 6 days ago

    the biggest novelty of chatgpt is the UI lol

  • @cherryfan9987
    @cherryfan9987 2 months ago +5

    Thank u

  • @chrisj2841
    @chrisj2841 1 month ago +3

    Anyone here who took the class in which this lecture was held (CS229 summer 2024)?

  • @astridkjellberg
    @astridkjellberg 6 days ago +1

    suddenly, i'm a software engineer.

  • @Moroi12
    @Moroi12 28 days ago +6

    Not fair, was here to learn, got distracted by charm

  • @not_amanullah
    @not_amanullah 18 days ago +1

    thanks ❤️🤍

  • @mohammadhosseinzolfagharna8106
    @mohammadhosseinzolfagharna8106 10 days ago +2

    It feels like learning LLMs from Clark Kent (Superman) 😂😅

  • @Sohammhatre10
    @Sohammhatre10 2 months ago +3

    Does anyone have the PDF or PPT for this lecture? If so, please reply to this comment. Thanks!

  • @T3NS0R
    @T3NS0R 1 month ago +4

    I have a doubt about scalable data for SFT: wouldn't the model be biased, since it's using its own knowledge to generate the dataset and is then further trained on the same?

  • @ganodiya001
    @ganodiya001 10 days ago +1

    Anybody know of any resources for learning LLMs?

  • @quackplay9243
    @quackplay9243 19 days ago +2

    "She likely prefers Stanford"

  • @LaibaKhan-q4t
    @LaibaKhan-q4t 1 month ago +6

    He's hot.

  • @chasingthatfeelin
    @chasingthatfeelin 14 days ago +2

    I'm majoring in Finance, but he is so hot, so I'm here

  • @lyeln
    @lyeln 2 days ago

    I appreciate the sharing, but I find this view of LLMs too simplistic and approximate (when not outright missing some pieces) for those in the field, and probably too complicated/misleading for those who aren't. Also, I don't see due attention to mechanistic interpretability, emergent properties, and the model-reasoning debate, with appropriate citations of recent papers.

  • @laptoproyale4681
    @laptoproyale4681 2 months ago +22

    I can literally watch this whole lecture because he's so hot

  • @rajdeepraj624
    @rajdeepraj624 29 days ago +2

    Upload next video

  • @vimalk8923
    @vimalk8923 2 months ago +1

    are there slides?

  • @robbyrayrab
    @robbyrayrab 25 days ago +1

    aye where can I find cs 336???

    • @inditalian7279
      @inditalian7279 22 days ago

      How do you know which Stanford course to look for? Which website do you usually use?

  • @azharalibhutto1209
    @azharalibhutto1209 3 days ago

  • @sheikhobama3759
    @sheikhobama3759 2 months ago +9

    Knock Knock!

    • @Ef554rgcc
      @Ef554rgcc 2 months ago +2

      Banana

    • @JoseMonteverde
      @JoseMonteverde 2 months ago +1

      Who’s there!

    • @Ef554rgcc
      @Ef554rgcc 2 months ago +1

      @@JoseMonteverde Banana

    • @HB-kl5ik
      @HB-kl5ik 2 months ago +1

      We're no strangers to love, you know the rules and so do I

    • @Ef554rgcc
      @Ef554rgcc 2 months ago +1

      @@HB-kl5ik banana

  • @StormySkye22
    @StormySkye22 28 days ago +4

    sir I didn't do my homework... please punish me

  • @a4assasin
    @a4assasin 12 days ago +1

    🇧🇩❤

  • @MKIIL
    @MKIIL 2 months ago +1

    Good morning, I guess?

  • @oliviazhao1244
    @oliviazhao1244 10 days ago +2

    Which window did God close for him?

  • @m4v3r1ck-eq9we
    @m4v3r1ck-eq9we 2 days ago

    tldr?

  • @kyle880413
    @kyle880413 10 days ago +2

    he is cute

  • @sss40719
    @sss40719 19 days ago +1

    why he kinda...

  • @sageisrage
    @sageisrage 23 days ago +2

    he kinda......