This is why Deep Learning is really weird.

Share
Embed
  • Published 29 Apr 2024
  • In this comprehensive exploration of the field of deep learning with Professor Simon Prince, who has just authored an entire textbook on deep learning, we investigate the technical underpinnings that contribute to the field's unexpected success and confront the enduring conundrums that still perplex AI researchers.
    Understanding Deep Learning - Prof. SIMON PRINCE [STAFF FAVOURITE]
    Watch behind the scenes, get early access and join private Discord by supporting us on Patreon:
    / mlst
    / discord
    / mlstreettalk
    Key points discussed include the surprising efficiency of deep learning models, where high-dimensional loss functions are optimized in ways that defy traditional statistical expectations. Professor Prince provides an exposition on the choice of activation functions, architecture design considerations, and overparameterization. We scrutinize the generalization capabilities of neural networks, addressing the seeming paradox of well-performing overparameterized models. Professor Prince challenges popular misconceptions, shedding light on the manifold hypothesis and the role of data geometry in informing the training process. He also speaks about how layers within neural networks collaborate, recursively reconfiguring instance representations in ways that contribute to both the stability of learning and the emergence of hierarchical feature representations. In addition to the primary discussion of technical elements and learning dynamics, the conversation briefly turns to the ethical implications of AI advancements.
    Pod version (with no music or sound effects): podcasters.spotify.com/pod/sh...
    Follow Prof. Prince:
    / simonprinceai
    / simon-prince-615bb9165
    Get the book now!
    mitpress.mit.edu/978026204864...
    udlbook.github.io/udlbook/
    Panel: Dr. Tim Scarfe -
    / ecsquizor
    / ecsquendor
    TOC:
    [00:00:00] Introduction
    [00:11:03] General Book Discussion
    [00:15:30] The Neural Metaphor
    [00:17:56] Back to Book Discussion
    [00:18:33] Emergence and the Mind
    [00:29:10] Computation in Transformers
    [00:31:12] Studio Interview with Prof. Simon Prince
    [00:31:46] Why Deep Neural Networks Work: Spline Theory
    [00:40:29] Overparameterization in Deep Learning
    [00:43:42] Inductive Priors and the Manifold Hypothesis
    [00:49:31] Universal Function Approximation and Deep Networks
    [00:59:25] Training vs Inference: Model Bias
    [01:03:43] Model Generalization Challenges
    [01:11:47] Purple Segment: Unknown Topic
    [01:12:45] Visualizations in Deep Learning
    [01:18:03] Deep Learning Theories Overview
    [01:24:29] Tricks in Neural Networks
    [01:30:37] Critiques of ChatGPT
    [01:42:45] Ethical Considerations in AI
    References:
    #61: Prof. YANN LECUN: Interpolation, Extrapolation and Linearisation (w/ Dr. Randall Balestriero)
    • #61: Prof. YANN LECUN:...
    Scaling down Deep Learning [Sam Greydanus]
    arxiv.org/abs/2011.14439
    "Broken Code" a book about Facebook's internal engineering and algorithmic governance [Jeff Horwitz]
    www.penguinrandomhouse.com/bo...
    Literature on neural tangent kernels as a lens into the training dynamics of neural networks.
    en.wikipedia.org/wiki/Neural_...
    Zhang, C. et al. "Understanding deep learning requires rethinking generalization." ICLR, 2017.
    arxiv.org/abs/1611.03530
    Computer Vision: Models, Learning, and Inference, by Simon J.D. Prince
    www.amazon.co.uk/Computer-Vis...
    Deep Learning Book, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
    www.deeplearningbook.org/
    Predicting the Future of AI with AI: High-quality link prediction in an exponentially growing knowledge network
    arxiv.org/abs/2210.00881
    Computer Vision: Algorithms and Applications, 2nd ed. [Szeliski]
    szeliski.org/Book/
    A Spline Theory of Deep Networks [Randall Balestriero]
    proceedings.mlr.press/v80/bal...
    Deep Neural Networks as Gaussian Processes [Jaehoon Lee]
    arxiv.org/abs/1711.00165
    Do Transformer Modifications Transfer Across Implementations and Applications [Narang]
    arxiv.org/abs/2102.11972
    ConvNets Match Vision Transformers at Scale [Smith]
    arxiv.org/abs/2310.16764
    Dr Travis LaCroix (Wrote Ethics chapter with Simon)
    travislacroix.github.io/
  • Science & Technology

COMMENTS • 377

  • @MachineLearningStreetTalk
    @MachineLearningStreetTalk  3 months ago +49

    What did you like about this video? What can we improve?!

    • @jd.8019
      @jd.8019 3 months ago +13

      Firstly, I wanted to say--I can't believe I just realized I wasn't subscribed to your channel; this mistake has been rectified!
      Secondly, I could easily write an essay (or more accurately, a love letter) about this channel: there are very few insightful AI channels on YouTube, a few mediocre ones, and the rest. This channel, without a doubt, is in a league of its own. As an engineer, when I see a new MLST video posted, it's like sitting down for a mouthwatering gourmet meal after being force-fed nothing but junk food for weeks on end.
      Finally, with that said, allow me to attempt a justification of my admiration:
      1) Guests: You always have amazing guests! Your interview style never fails to engender engaging, thoughtful, and most of all - fun conversations! In this video in particular, Simon seems like he’s having a blast speaking about something he’s passionate about, and that enthusiasm genuinely put a smile on my face.
      2) Editing: The videos are always well put together and the production value is always phenomenal! I mean, wow… Compared to the other AI channels on YouTube, Machine Learning Street Talk makes them look like amateurs.
      3) Knowledge: Most other channels seem content to merely discuss the latest ML hype as it happens in real-time; this is fine, and most aren’t objectively wrong, however, it's mostly surface level discussion and smacks of novice insight. They are, for lack of a better description, an animated news feed. With the exception of Yannic, MLST is the only other mainstream channel I’m aware of with a solid academic pedigree, and it's palpable. I’ve been completely starving for this kind of more in-depth/rigorous discussion. I can only speak for myself, but I imagine there are many who come from a STEM/technical background who feel the same way, so thank you on our behalf.
      Keep up the great work!

    • @MachineLearningStreetTalk
      @MachineLearningStreetTalk  3 months ago +4

      @@jd.8019 Thank you sir!!

    • @agenticmark
      @agenticmark 3 months ago

      snag the first Ilya interview since the openai debacle :D

    • @therainman7777
      @therainman7777 2 months ago +2

      I love the overall aesthetic and the production value. I think many technical channels don’t realize that even for highly technical subject matter, these things are incredibly important for building a large audience that keeps coming back. I also like the way on some episodes you seamlessly edit and link together clips from other interviews; my only criticism is that sometimes when you do that the narrative thread that ties them together is not always made sufficiently clear. If you could include some quick, unobtrusive means of showing the viewer why those clips are being edited in and what the thread is, I think that would significantly up the utility and educational value people get out of it. Kind of like a mind map, but within your videos.

    • @alib5503
      @alib5503 2 months ago

      Music?

  • @samgreydanus6148
    @samgreydanus6148 2 months ago +261

    I'm the author of the MNIST-1D dataset (discussed at 1h15). Thanks for the positive words! You do an excellent job of explaining what the dataset is and why it's useful.
    Running exercises in Colab while working through the textbook is an amazing feature.

    • @therainman7777
      @therainman7777 2 months ago +4

      Nice, I love the dataset for testing and learning purposes, so thank you so much for creating and releasing it 🙏

    • @rubyciide5542
      @rubyciide5542 2 months ago +6

      How to learn ml from scratch and create stuff like u do?

    • @datafiasco4126
      @datafiasco4126 17 days ago

      Thank you for that. I always start my training runs with it.

  • @Friemelkubus
    @Friemelkubus 4 months ago +73

    Currently going through it and it's one of the best textbooks I've read. Period. Not just DL books, books.
    I love it.

    • @dennisdelgado4276
      @dennisdelgado4276 6 days ago +3

      Sorry you’re going through it man. Hope things get better for you

  • @oncedidactic
    @oncedidactic 4 months ago +17

    Brilliant, clear, direct conversation. Thank you!!

  • @aprohith1
    @aprohith1 4 months ago +11

    What a gripping conversation. Thank you!

  • @chazzman4553
    @chazzman4553 3 months ago +23

    The best channel I've seen for AI. Cutting edge, no amateur overhyped BS. Down to earth.

  • @amesoeurs
    @amesoeurs 4 months ago +77

    i've read most of this book already and it's fantastic. it feels like a spiritual sequel to goodfellow's original DL book.

    • @sauravsingh9177
      @sauravsingh9177 4 months ago +8

      Can I read this book, even without reading Goodfellow's book?
      I am currently reading Jeremy's DL with fastai and PyTorch book.

    • @amesoeurs
      @amesoeurs 4 months ago +12

      @@sauravsingh9177 yes although you're expected to have basic familiarity with stats, lin alg, calculus etc.

    • @RogueElement.
      @RogueElement. 2 months ago

      Hello... I'm a med student and not so proficient in math. Could you please list the math prerequisites needed to understand DL? I'LL BE RIGHT ON IT!🙏😊 @@amesoeurs

  • @dariopotenza3962
    @dariopotenza3962 3 months ago +20

    Simon taught the first semester of my second-year "Machine Learning" module at university! Really nice man; we used this book as the module notes. He was very missed when he left in the second semester, and the rest of the module was never able to live up to his teaching.

  • @makhalid1999
    @makhalid1999 4 months ago +6

    Love the studio, would love to see more face-to-face podcasts here

  • @rajdeepbosemondal7648
    @rajdeepbosemondal7648 3 months ago +6

    Cool thoughts on digging into the nitty-gritty of deep learning frameworks. The connection between language models and our brains, especially in Transformers, really makes you think. Checking out how things stay consistent inside and finding ways to boost brainpower raises some interesting questions. Looking forward to diving deeper into these fancy concepts!

  • @beagle989
    @beagle989 4 months ago +5

    great conversation, appreciate the skepticism

  • @timhaldane7588
    @timhaldane7588 2 months ago +2

    I really appreciate being used as an example near the end of the discussion. Version 2.0 is coming along slowly, but I am confident I'll get there.

  • @mattsigl1426
    @mattsigl1426 4 months ago +5

    It’s interesting that in Integrated Information Theory consciousness literally is a super-high dimensional polytope (with every dimension corresponding to a whole system state in an integrated network) in an abstract space called Qualia space.

  • @chodnejabko3553
    @chodnejabko3553 17 days ago +2

    The overparametrization conundrum may be related to the fact that we look at what NNs are in the wrong way. To me an NN is not a "processor" type of object; it's a novel type of memory object, a memory which stores and retrieves data by overlaying them on top of one another and also recording the hidden relations that exist in the data set. This is what gets stored in the "in between" places even if the input resolution is low: the logic of coexistence of different images (arrays), which is something not visible on the surface.
    I'm a philologist by training, and in 20th-century literature there was this big buzz around the concept of the "palimpsest". Originally palimpsests were texts written on reused parchment from which previous texts had been scraped off with a razor. Despite the scraping, the old text still remained under the new one, which led to two texts occupying the same space on the page. In literature this became a conceptual fashion of merging two different narratives into one, usually with a very surreal effect. One of the authors that comes to mind is William S. Burroughs.
    In the same way that merged narratives evoke a novel situation due to novel logical interactions between the inputs, the empty space in an overparametrized NN gets filled with the logic of the world from which the input data comes, and this logic exists between the data points even when the resolution is low.
    Maybe an NN is a Platonic space. Many images of trees somehow hold in them the "logic of the tree", which is something deeper and non-obvious to the eye, since in their form alone converge the principles of molecular biology, atmospheric fluid dynamics, ecosystemic interactions, up to the very astronomical effects of sun, moon, earth rotation, etc.
    All of it contributes to this form in one way or another, so the form reflects those contributions and therefore holds partial logic of those interactions within it.
    Information is a relation between an object and its context (in linguistics we say: its dictionary). A dataset not only introduces objects; as a whole it also becomes a context (dictionary) through which each object is read.
    In that sense, maybe upscaling input data sets prior to learning is detrimental to the "truth" of those relations. I would be inclined to assume we'd be better off letting the NN fill in those spaces based on the logic of the dataset, unless we want the logic of the transformations to somehow influence the output data (say, we are specifically designing an upscaling engine).

  • @dorian.jimenez
    @dorian.jimenez 4 months ago +5

    Thanks Tim, great video I learned a lot.

  • @AliMoeeny
    @AliMoeeny 3 months ago +4

    Tim, this is incredibly insightful. Thank you

  • @stevengill1736
    @stevengill1736 4 months ago +5

    Sounds like there are as many questions as answers at this point - looks like a great book with plenty of instructive graphics - look forward to reading it....cheers & happy Gnu year!

  • @DailyFrankPeter
    @DailyFrankPeter 2 months ago +3

    The sombre Chopin tones in the background emphasize how deep the learning truly is but leave me with little hope of ever fully understanding it... :D

  • @jasonabc
    @jasonabc 4 months ago +20

    Best source and community for ml on the internet by far. Love the work you guys do mlst

  • @truehighs7845
    @truehighs7845 3 months ago +11

    The way I see it, deep learning is statistical power modulated by randomness to emulate reasoned speech, but it is really the top 3-5-7 most plausible continuations selected randomly, at every word. So in theory, whatever the AI says, it should not be able to say it twice unless you tweak its parameters; and with the randomness (temperature) taken away, it will always repeat the same thing. It's good at emulating speech that gives the appearance of intelligent articulation, but it is really the syntax and vocabulary (data) placed in a statistically congruent manner that creates that illusion.
    It's like a super sales guy: he talks very well, but there is no substance behind his apparent passion.
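    A minimal sketch of the sampling behaviour described above, assuming a generic top-k plus temperature scheme over a toy next-token distribution (the logits, names and values are illustrative, not from the video): with the temperature at zero the output is always the same; with temperature and k greater than one it can vary between runs.

```python
# Illustrative sketch: temperature + top-k sampling over toy next-token logits.
import numpy as np

rng = np.random.default_rng(0)

def sample_next(logits, temperature=1.0, k=5):
    """Pick one token index using top-k + temperature sampling."""
    logits = np.asarray(logits, dtype=float)
    if temperature == 0.0:                      # greedy: no randomness at all
        return int(np.argmax(logits))
    top = np.argsort(logits)[-k:]               # keep only the k most likely tokens
    scaled = logits[top] / temperature          # temperature reshapes the distribution
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(rng.choice(top, p=probs))

toy_logits = [2.0, 1.5, 1.4, 0.2, -1.0]         # hypothetical scores for 5 candidate words
print([sample_next(toy_logits, temperature=1.0, k=3) for _ in range(10)])  # varies between calls
print([sample_next(toy_logits, temperature=0.0)      for _ in range(10)])  # always index 0
```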

    • @franzwollang
      @franzwollang 3 months ago +11

      It's still an open question though how different this is from what humans do. What if human brains operate on a similar principle, of learning patterns, activating a subset of them based on contextual input, and then selecting from them via some noisy sampling?
      That's what really bakes people's noodles.
      The biggest difference, in my mind, is that most ML objective functions are explicitly and statically evaluated and human ones are implicit in the effect of dynamical chemical processes on learning rates or whatever hyperparameters govern organic learning. Reinforcement learning approaches hint at what more human-like ML systems could look like.

    • @premium2681
      @premium2681 3 months ago +4

      A lot of what I say has no substance or passion.

    • @AnthonyBecker9
      @AnthonyBecker9 1 month ago +1

      Turns out learning to predict the next token requires understanding a lot of things

    • @arnaudjean1159
      @arnaudjean1159 1 month ago +1

      Yes, these models need a world model: not only statistics but reasoning with self-learning that improves relevance.
      For the moment it's just a well-educated salesman, but in a very, very narrow way.

    • @truehighs7845
      @truehighs7845 1 month ago +2

      @@franzwollang Yes, and your point is compelling. If anything, the very fact that humans are learning while they infer or confer already makes them a different beast.
      As we can cogitate and plan, even while not talking, we "defrag" our knowledge. And if we do confer with someone else, we have even faster learning curves. While I am writing this, I am thinking about expressing my knowledge of AI, and it is a composite knowledge, made of essential theoretical knowledge, psycho-plasticity while using AI, and inference matured from black-box prompting various AIs over the last years.
      So yes, compared to us its learning is uni-dimensional and purely linguistic, while we have a convergence of learning mechanisms working together, all the time.
      Currently there is so much to take on in AI that I have a dozen half-baked projects open, but ideally your very inference should automatically fine-tune your bot. Openpipe is trying to do such a thing, but ideally it should be ported to the unsloth engine, as openpipe uses OpenAI and it's going to cost you a boatload of money to get anywhere near good results between inference, dataset generation, and loads of fine-tuning sessions.

  • @exhibitD79
    @exhibitD79 3 months ago

    Fantastic - Thank you so much for this content. Loved it.

  • @almor2445
    @almor2445 2 months ago

    Great chat, loved it. Just wanted to query the final thought experiment a little. He said the hypothetical AGI would be created in a random superpower by a random company. That's not what employees of DeepMind or OpenAI are doing. They are both switching on a different dial that is specifically being activated in the USA and in their companies respectively. That means different results for most people.

  • @amesoeurs
    @amesoeurs 4 months ago +9

    great episode. tim, you should try to get chris bishop on the show too. he finally released the companion book to PRML this month.

  • @Tesla_Sentiment_Tracker
    @Tesla_Sentiment_Tracker 3 months ago

    Loved the discussion! Thank you

  • @joshismyhandle
    @joshismyhandle 4 months ago +1

    Great stuff, thanks

  • @debmukherjee4818
    @debmukherjee4818 4 months ago +3

    Thanks!

  • @Pianoblook
    @Pianoblook 4 months ago +10

    Thank you for another excellent conversation! I really loved the discussion of the practical, grounded ethical concerns - I hope y'all consider having more ethicists on the show!

    • @MichaelBeale
      @MichaelBeale 3 months ago

      Have you considered the ethical implications of them doing that, @Pianoblook??

    • @Pianoblook
      @Pianoblook 3 months ago

      ​@@MichaelBeale yes, hence why I recommended it

  • @Earth2Ross
    @Earth2Ross 2 months ago +1

    So glad I found this channel, I have some catching up to do!!

  • @ProBloggerWorld
    @ProBloggerWorld 3 days ago

    2:07 So glad you mentioned Schmidhuber. 😅

  • @ethanlazuk
    @ethanlazuk 7 days ago

    SEO learning AI and ML here. Thoroughly enjoyed the video -- especially the bits on ethics -- and appreciate the channel. I just caught this discussion but will share the vid and continue exploring the channel. I think it's critical, whether or not people use the technology, to understand AI's implications at large and on a deeper level. Cheers!

  • @MrMootheMighty
    @MrMootheMighty 2 months ago +1

    Really appreciate this conversation.

  • @arturturk5926
    @arturturk5926 2 months ago +2

    The most amazing thing about this video to me is that Simon's hair matches that microphone perfectly, nice work lads...

    • @Daniel-Six
      @Daniel-Six 1 month ago +1

      😂 It's called a dead-cat wind filter in the video trade. Good one!

  • @dm204375
    @dm204375 2 months ago

    One of the best videos I've seen regarding the topic. I hold most of the Professor's views on the matter, so it's refreshing to see that not everyone in the ML community drank the AI Kool-Aid. Though I am a lot more pessimistic, in that I think we can't slow down or put the genie back in the bottle, and the effort is wasted. So enjoy the cool new tech while you can enjoy anything...

  • @user-gz2po7dx3k
    @user-gz2po7dx3k 2 months ago

    Awesome presentations, thank you for great content!

  • @JuergenAschenbrenner
    @JuergenAschenbrenner 2 months ago

    Here you have a guy on the hook. I love how you throw these common buzzwords, like emergent agency and set phenomena, at him and let him sort them out, which he does in a way that gives me the feeling of actually understanding. Really nice stuff.

  • @AdrianMark
    @AdrianMark 2 months ago

    Thank you Professor Prince. Your book is invaluable.

  • @lioncaptive
    @lioncaptive 2 months ago

    Simon's contribution adds to MLST's ambitious book club.

  • @richardpogoson
    @richardpogoson 2 months ago

    The author was my lecturer last year in my first semester! The dude is brilliant!

  • @snarkyboojum
    @snarkyboojum 6 days ago

    Beautifully produced! Love these types of videos from you. The amount of work that goes into creating one of these videos is mind boggling. Serious kudos. What a service you’re doing for the current and future generations of technologists.

  • @kyrgyzsanjar
    @kyrgyzsanjar 2 months ago

    Alright I'm sold. Ordering the book!

  • @FunwithBlender
    @FunwithBlender 3 months ago +3

    I see it as alchemy, in the sense of sitting on the line where science meets magic or the unknown... interesting times.

  • @quinnlintott406
    @quinnlintott406 2 months ago

    This is done very well. Bravo sir!

  • @sproccoli
    @sproccoli 3 months ago +1

    > me, who knows nearly nothing about the theory of all of this stuff, but has implemented an image classification network, hearing him talk about "trying to push softmax functions to infinity"
    I get that reference.
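    For readers who haven't met the phrase, a small sketch of one thing "pushing a softmax to infinity" can mean in practice (an illustration only, not the argument from the interview): scaling the logits drives the softmax output toward a hard one-hot maximum.

```python
# Illustrative sketch: as the scale of the logits grows, softmax -> argmax.
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())        # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([1.0, 2.0, 0.5])
for scale in (1, 10, 100):
    print(scale, softmax(scale * logits).round(4))
# The probabilities collapse onto the largest logit: approximately [0, 1, 0].
```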

  • @tomripley7148
    @tomripley7148 4 months ago

    good new year

  • @u2b83
    @u2b83 4 months ago

    1:13:45 Three-dimensional orange (volume): for a regular three-dimensional orange, which we can approximate as a sphere, the volume is calculated using the formula (4/3)·π·r³.
    Four-dimensional orange (hypervolume): in four dimensions, an object analogous to a sphere is called a "hypersphere." The formula for the hypervolume of a 4D hypersphere is (1/2)·π²·r⁴.
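    For context, both cases are instances of the standard n-ball volume formula (a textbook result, not something stated in the video):

```latex
% Volume of an n-dimensional ball of radius r; n = 3 and n = 4 recover the cases above.
V_n(r) = \frac{\pi^{n/2}}{\Gamma\!\left(\tfrac{n}{2}+1\right)}\, r^n,
\qquad
V_3(r) = \frac{\pi^{3/2}}{\tfrac{3}{4}\sqrt{\pi}}\, r^3 = \frac{4}{3}\pi r^3,
\qquad
V_4(r) = \frac{\pi^{2}}{\Gamma(3)}\, r^4 = \frac{\pi^{2}}{2}\, r^4.
```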

  • @philipamadasun
    @philipamadasun 3 months ago

    Haven't finished the video yet, but does this have an e-book as well?

  • @ehfik
    @ehfik 1 month ago

    thank you for these great interviews

  • @pedrojesusrangelgil5064
    @pedrojesusrangelgil5064 3 months ago +1

    Hey, great book recommendation! Any with a similar approach and style on machine learning? Thanks!

  • @u2b83
    @u2b83 4 months ago

    Wow!

  • @u2b83
    @u2b83 4 months ago

    1:13:26 Figure 18.3a explains [to me] why we call it diffusion. I guess the hypothesis goes: as long as you take a small enough step size, you'll stay within the "[conditional] distribution" q(z_t | x*) when iterating on the diffusion kernel, i.e. the dashed cyan border representing the time-evolving distribution bounds. Anyone here think this diffusion process looks kinda like the stock market? Where we have piece-wise linear dumdums jump out with their limit orders to steer q(z_t | x*) at every iteration lol
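    A minimal sketch of a Gaussian forward-diffusion kernel of the kind such figures illustrate, using common DDPM-style notation (the variable names and noise schedule are assumptions, not the book's exact definitions): with a small step size the trajectory drifts smoothly away from the data point toward noise.

```python
# Illustrative sketch: forward diffusion z_t = sqrt(alpha_t)*z_{t-1} + sqrt(1-alpha_t)*noise.
import numpy as np

rng = np.random.default_rng(0)

def diffuse(x, alphas):
    """Run the forward process and return the whole trajectory of z_t."""
    z = np.asarray(x, dtype=float)
    trajectory = [z]
    for a in alphas:                              # each step adds a little Gaussian noise
        z = np.sqrt(a) * z + np.sqrt(1.0 - a) * rng.normal(size=z.shape)
        trajectory.append(z)
    return trajectory

x_star = np.array([2.0, -1.0])                    # a toy data point
path = diffuse(x_star, alphas=[0.99] * 500)       # many small steps
print(path[0], path[-1])                          # the data point is mostly washed out into noise
```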

    • @Daniel-Six
      @Daniel-Six 1 month ago +1

      Yeah... It seems reasonable to surmise that a kind of fractal-like connection pervades these phenomena.
      I got the same vibe when I watched Veritasium's discussion of the option pricing equation for some reason.

  • @FranAbenza
    @FranAbenza 4 months ago +3

    The suspense is killing me, what's the punchline?
    jokes aside, what an amazing chapter this was! I am getting this book

    • @clifdavis2
      @clifdavis2 2 months ago

      The punchline is that humans are complicated catflaps that confuse their simulations of themselves with themselves.

  • @federicoaschieri
    @federicoaschieri 3 months ago +8

    Brilliant interview. Refreshing to watch professional content and not the garbage AI channels that the algorithm suggests first. This whole AI hype is built around OpenAI marketing to finance its cash-burning company, and the recommendation algorithm fuels it. I cannot agree more on the distinction between science and engineering. When I obtained my PhD in mathematical logic I was excited about AI, because I thought it could unravel the mystery of intelligence. But we are not learning anything about it, and it is so depressing. We so badly need a profound theory of neural networks, instead of watching these engineers trying random stuff until something works.

  • @charlesalexanderable
    @charlesalexanderable 4 months ago +3

    Cool it with the sound effects

  • @hamzadata
    @hamzadata 1 month ago

    What an opportunity to listen to this episode when I just started reading the book recently :)

  • @spqrspqr3663
    @spqrspqr3663 1 month ago +3

    The video is a textbook example of the best traditions of British science, i.e. being objective, no-nonsense and intellectually honest. At one point the tech was summarized as modelling probability distributions in a multidimensional space plus universal function approximation, and hence having nothing to do with "thinking", with which I fully agree (as a professional software engineer). What was however shocking to see towards the end of the video was that the professor (despite the spot-on tech summary I just mentioned) then went into completely unfounded sci-fi statements about doctors, lawyers and engineers (and even greeting card designers :-)) losing their jobs to the tune of 800 million (or was it 80 million, whatever). I can't comprehend how the professor managed to reconcile these two things in his head (the non-AGI nature of the current tech, AGI still a pipe dream, no real thinking/intelligence in the current tech, just deriving and modelling probabilities from the data to fit patterns in it) and the claim that "knowledge workers" will be replaced by the ton.

    • @SimonPrince-lr9dk
      @SimonPrince-lr9dk 1 month ago

      Good point and thank you for your kind words. I guess I reconcile these things because I think the technology we have already (even if there was no more significant development) might be enough to cause massive job losses. It could make many individuals much more productive and that would mean most companies would need fewer people. For example, I used Grammarly to proof my book. That was 2 months of proofreading work for someone just gone... Happy to be proved wrong about this though!

    • @Joorin4711
      @Joorin4711 3 hours ago

      It's not science fiction that Dall-E can take a sketch, generate a rendered image in a specific style and then offer up 10 variations which in turn can be sold (as greeting cards in this example). If using Dall-E is more cost effective than employing artists doing the same thing fewer artists will be able to make money creating greeting cards. This has nothing to do with AGI and everything with economics.
      The same thing can be said about lawyers who, typically, sift through data, apply their knowledge about laws and precedents and generate missives that are used to argue in favour of their clients. If any step in that process can be replaced by, say, GPT4 and make it more cost effective, fewer lawyers will be able to make money offering that service. No AGI, just economics.
      So, I see no problem with his stance on AGI, or lack thereof, and him accepting a prediction of workers losing their employment when non-AGI technology is being used in more and more sectors.

  • @-BarathKumarS
    @-BarathKumarS 4 months ago

    Agree with the other comment, it's more of a follow-up to the Goodfellow book.
    Definitely needs a prerequisite.

    • @andrice42
      @andrice42 4 months ago

      What's a good pre-req alternative?

    • @-BarathKumarS
      @-BarathKumarS 4 months ago

      @@andrice42 Goodfellow is the bible, nothing else comes close.

  • @morgan9hough
    @morgan9hough 4 months ago

    Right on. It’s an equation

  • @dengyun846
    @dengyun846 2 months ago +1

    "reflecting it like a crazy house of mirrors into the rest of the space"...Darren Aronofsky pioneered some of the thoughts in this direction 26 years ago. Time for me to watch it again :)

  • @muhokutan4772
    @muhokutan4772 4 months ago

    Where is the patreon link?

    • @MichaelBeale
      @MichaelBeale 3 months ago

      If you had to, what would you guess that their patreon url is?
      Yep! You nailed it.

  • @Shaunmcdonogh-shaunsurfing
    @Shaunmcdonogh-shaunsurfing 22 days ago

    How have I only just stumbled on this channel? Good to be home.

  • @adamhaney9447
    @adamhaney9447 2 months ago

    Thank you so much for this.

  • @user-th1xk2dz3j
    @user-th1xk2dz3j 2 months ago

    Do I need any previous knowledge in order to be ready to read this book?

    • @drsjdprince
      @drsjdprince 2 months ago

      You need a little linear algebra and basic calculus. I teach the first half to 2nd year undergraduate cs students and they can follow it.

  • @amdenis
    @amdenis 3 months ago

    I have been doing deep learning research and development for only 9 years, but if you really wish to understand those fundamental aspects of deep learning that you believe nobody understands (how and why deep learning works), I would love to help you with obtaining that understanding. I can explain it from an analytical, geometric, or other perspective, as you find most useful.

    • @raul36
      @raul36 3 months ago +1

      @@Eet_Mia Obviously not. A minimum knowledge of advanced math is required to deeply understand deep learning.

    • @olaf0554
      @olaf0554 3 months ago

      @@jonathanjonathansen In any case it would be algebra, calculus and statistics, genius. Precisely what you just said absolutely invalidates your answer, because you don't have the slightest idea. Literally everything is algebra. If you don't understand that, you don't have the slightest idea about mathematics. 😂😂

    • @DJWESG1
      @DJWESG1 3 months ago

      Sociological??

  • @michaelwangCH
    @michaelwangCH 3 months ago

    Hi, Tim. Please explain why randomizing the image pixels does not have a negative impact on DNN training results, only slowing down the learning speed.
    How does a DNN detect objects in an image without considering the neighborhood of pixels? Theoretically, object detection should not work, but it does. Why? Thanks, Tim.

    • @SimonPrince-lr9dk
      @SimonPrince-lr9dk 3 months ago +1

      The resulting model won't generalize (at least not well) -- this is just to say that you can still train it successfully even when the data has been trifled with. It's a statement about training, not about performance.

    • @michaelwangCH
      @michaelwangCH 3 months ago

      The order of the pixel values matters: if we flatten the pixel matrix into a large vector as the input layer of a DNN, the DNN learns the order of the input data as well. What matters is the consistency of how we flatten the pixel matrix into one large vector; how you do it is your choice, but it has to be consistent across the training and test datasets. PyTorch and Keras take care of that consistency when we train DNNs for our projects, so these questions do not appear in the daily work of ML engineers. ML scientists have an important role to play here: to fill this gap, ask uncomfortable questions, and dive deeply into a topic that ML engineers and CS people take as given, without asking about the reason and the why.
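      A small sketch of the fixed-permutation setup being discussed, in the spirit of Zhang et al. (2017): one random permutation is drawn once and applied identically to training and test images. A fully connected network can still be trained on such data because it has no built-in notion of pixel neighborhoods; the array shapes and names below are illustrative.

```python
# Illustrative sketch: apply one fixed random pixel permutation consistently to all images.
import numpy as np

rng = np.random.default_rng(0)
perm = rng.permutation(28 * 28)              # one fixed permutation, chosen once

def permute_pixels(images):
    """Flatten each HxW image and reorder its pixels with the same fixed permutation."""
    flat = images.reshape(len(images), -1)
    return flat[:, perm]

train = rng.random((8, 28, 28))              # stand-ins for real train/test images
test = rng.random((2, 28, 28))
train_p, test_p = permute_pixels(train), permute_pixels(test)   # same perm for both splits
print(train_p.shape, test_p.shape)
```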

    • @michaelwangCH
      @michaelwangCH 3 months ago +2

      @@SimonPrince-lr9dk Your book is a good summarization of the fundamentals from the ML classes of MSc and PhD programmes in CS. Thank you for the time and effort you put into your book to clarify many questions I had during my university years - it simplified many concepts in ML - and for making the book free to the public. Thank you.

  • @platotle2106
    @platotle2106 2 months ago

    Fantastic interview. One gripe. Towards the end, when Tim was asked about flipping a switch to have AI clones created of himself, the guest tried to push as much as possible while being kind, but it's disappointing that Tim would not engage. The point of hypotheticals is to provide less ambiguous scenarios for understanding your values. As long as they're not something offensive, you absolutely must engage with them, or else there is no dialogue and no point to a discussion. You also should try your best to steelman them and address the main point of the question instead of nitpicking about some contingent aspect of the hypothetical. So instead of bringing up the identity problem with cloning yourself as an issue and leaving it there, you can modify the hypothetical such that the AI isn't a clone of Tim, but rather only a clone of Tim's abilities, i.e. it can do anything Tim does, but not necessarily in the same fashion. That way, you remove a non-essential concern with the hypothetical and engage with its obvious intended point.

  • @paratracker
    @paratracker 2 months ago

    What a great guest! I'm disappointed that you think ChatGPT doesn't do anything. So, the Rabbit doesn't do anything (useful) either? Agency warrants more discussion. Assumptions will change when LLMs are imbued with curiosity, exploring observed ambiguities, autonomously experimenting with tweaked/alternate architectures, and a motivation to discover higher level abstractions, and explain them.

  • @space-time-somdeep
    @space-time-somdeep 2 months ago

    You guys are great.. please try to connect with Indian and Chinese professors if possible.. it will enrich us all❤

  • @1potdish271
    @1potdish271 4 months ago

    Will this book be available on O'Reilly?

  • @robincheungmbarealtorbroke8461
    @robincheungmbarealtorbroke8461 3 months ago

    And it's a good thing there's no extant models for representing a complex system yet because it is a Reductionist application ignorant of the dynamics of systems with higher order complexity.
    We need to first ask neural networks to reengineer themselves to produce a "complex systems native" model (this is the underlying reason for the seemingly idiosyncratic paradoxical observation: quantitatively, the difference between a Reductionist outcome and one after prediction using a second generation neural network-re-conceptualization of a neural network in

    • @therainman7777
      @therainman7777 2 months ago +2

      I don’t think you understand most of the words that you’re using. Your comment really didn’t make sense. For example, what’s a “second generation neural network”? I’m sorry to tell you that that’s not a thing. You did use a TON of big words though, so congrats 😂

    • @robincheungmbarealtorbroke8461
      @robincheungmbarealtorbroke8461 2 months ago

      @@therainman7777 Again, what are you trying to accomplish by pointing out that you think I don't know what I'm talking about because you can't understand what I wrote?
      Why don't you first try to explain back to me what you understood, and I'll explain it based on what you write, in a way you most certainly will understand--and I'll bet you $100 you'd agree, too.

  • @bofloa
    @bofloa 3 months ago

    I did an experiment by not using multiple weights per input to a node, for the purpose of reducing computation cost, and to my surprise it generalised. I created a network of (2,1): 2 inputs, 1 output.
    Normally you would require 2 weights, but in my experiment I used just one weight and then trained the node to find the pattern, and it did with a single weight. So I ask: how is this possible?
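    A guess at a minimal version of the experiment described above: a single weight shared across both inputs, y = w·(x1 + x2) + b, trained by gradient descent. If the target pattern happens to depend only on the sum of the inputs, one shared weight is enough, which would explain the result. Everything below (the toy target, learning rate, names) is an assumption, not the commenter's actual code.

```python
# Illustrative sketch: one weight shared across two inputs, trained with gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = 3.0 * (X[:, 0] + X[:, 1]) + 1.0          # toy target that depends only on x1 + x2

w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    s = X.sum(axis=1)                        # the single weight sees only the sum of inputs
    err = w * s + b - y
    w -= lr * (err * s).mean()               # gradient of the mean squared error (up to a factor)
    b -= lr * err.mean()

print(round(float(w), 3), round(float(b), 3))   # approaches w = 3, b = 1
```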

  • @bernardofitzpatrick5403
    @bernardofitzpatrick5403 2 months ago +1

    Perhaps “Open Mind” will be the ultimate AGI. 😂.
    Awesome discussion btw.

  • @123string4
    @123string4 27 days ago

    41:58 He says he doesn't know why the smooth interpolation works, but isn't it just dimensionality reduction? If you take the SVD of high-dimensional data, you can throw out small singular values and dramatically reduce the dimensionality without introducing a lot of error.
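    A small sketch of the truncated-SVD intuition in the comment above (toy data, illustrative only): keeping the top-k singular values gives a rank-k approximation whose error is governed by the discarded singular values.

```python
# Illustrative sketch: truncated SVD as dimensionality reduction on nearly low-rank data.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20)) @ rng.normal(size=(20, 500))   # data with low intrinsic rank
A += 0.01 * rng.normal(size=A.shape)                          # plus a little noise

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 20
A_k = (U[:, :k] * s[:k]) @ Vt[:k]                             # best rank-k reconstruction

rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(rel_err)   # tiny: almost all the energy lives in the first 20 singular values
```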

  • @datafiasco4126
    @datafiasco4126 17 days ago

    Extremely important information about the MNIST-1D dataset. I always start with it. Unfortunately it is also the most underrated. People don't understand the power of simplicity.

  • @u2b83
    @u2b83 4 months ago

    I need to get my library to order this book, asap!

    • @u2b83
      @u2b83 4 months ago +1

      I'm such a nerd that I slowed the vid to 0.25x to try to read some of the page flips lol

    • @MachineLearningStreetTalk
      @MachineLearningStreetTalk  4 months ago +4

      You can read it online for free! udlbook.github.io/udlbook/

    • @SimonPrince-lr9dk
      @SimonPrince-lr9dk 3 months ago

      @@u2b83 You can read it for free, but don't let that stop you getting your library to order it, lol.

    • @u2b83
      @u2b83 3 months ago +1

      @@MachineLearningStreetTalk wow! thanks! :))

  • @petroflorence7962
    @petroflorence7962 3 months ago

    I understand this is a more advanced way of computing for more difficult computational outcomes.

  • @chodnejabko3553
    @chodnejabko3553 17 days ago

    My favorite analogy to describe the divergence of ML and brain science is the case of airplanes. Flying was for a long time reserved for certain products of evolution, and at the beginning engineers imagined themselves copying the forms and behaviors of birds to understand them.
    But in the end, engineering and an abstract understanding of the physics behind flying is what gave us technology that applies those principles, usually in ways very strange to nature itself. Only in recent years, thanks to modeling tools that came out of engineering, have people begun to model bird-like or insect-like machines and to understand how nature does it.
    I like to believe ML will develop what I call "scientific principles of thinking", an overview of complex information-processing techniques that some decades down the line will inform us on how we should look at our own brains to understand them.
    But just like with artificial bird wings, artificial human-like brains will be nothing more than the work of hobbyists and proof-of-concept academic research, since the engineered ML solutions will be that much more powerful and tailored to specific tasks.
    Not just neuro-mimicry, but the very idea of "general intelligence" - however great a narcissistic dream it is - is also a rather useless engineering solution. What engineering problem does it solve anyway? Pretending to be human? So this would mean the current world somehow does not want humans to exist. Which maybe is the real sinister problem we must address before we "provide solutions" to it.

  • @bennguyen1313
    @bennguyen1313 2 months ago

    I wonder what the key difference is that causes some very smart people to think AGI could lead to the end of humanity and that research should therefore be paused, while others, especially those who do AI for a living, think that is pure science fiction.
    Would love to see a roundtable discussion with Simon Prince, Connor Leahy, Jürgen Schmidhuber, Geoffrey Hinton, George Hotz, Elon, Ken Ford, John Carmack, and Eliezer Yudkowsky.

    • @Daniel-Six
      @Daniel-Six 1 month ago

      Everyone but Yudkowsky. I cannot sit through another minute watching him hold off a dookie. That perpetual grimace...

  • @sayanbhattacharya3233
    @sayanbhattacharya3233 2 days ago

    Thanks man, this is so beautiful ❤

  • @gergerger5888
    @gergerger5888 3 months ago

    2:01:55 I also observe a certain (well-intentioned) paternalism in Simon's opinion. If there are people who can have rewarding occupations and build self-taught personalities, it is precisely because there is a large part of the population doing alienating jobs in factories, supermarkets and offices. No one wants to spend the best years of their life loading boxes in a forklift or washing dishes in a restaurant, it is circumstances that push those people in that direction. I hope the day comes when human beings are freed from those chains.
    On the other hand, I agree that the transition to automation can be hard, but I think we made a mistake in delegating this transition to governments. States are by nature perverse, what we need is a committed and strong civil society, with the capacity to save, that is willing to temporarily welcome those who are left out. In this sense, I believe that we can learn much more from indigenous societies, and from how anthropologists tell us that they build their social support networks, than from the social engineers who are to come.

    • @SimonPrince-lr9dk
      @SimonPrince-lr9dk 3 months ago +1

      This is a fair criticism of my viewpoint. But it's also paternalistic to say that human beings need to be "freed from these chains". Personally, I actually would rather be loading boxes / washing dishes than be unemployed.

  • @danielrodio9
    @danielrodio9 2 months ago

    I love the idea that the moment LLMs start generalizing, no one understands why
    (even though Hinton or Sutskever might claim that they do).

  • @ian-haggerty
    @ian-haggerty 1 month ago

    Great to bring a friendly face to the book. Can't wait to devour it!

  • @GulliverImpreso
    @GulliverImpreso 3 days ago

    Can I read this even if I have no idea about AI engineering or coding? I don't even know calculus, and it seems there is a lot of calculus inside the book, so I think I might not understand it when I start reading it. By the way, I'm still interested in reading this book because I plan to take an AI engineering course.

  • @vfwh
    @vfwh 3 months ago

    He’s asking you if you would flip the switch. His question is perfect, and the fact that you avoid answering it should make you think, no?

  • @piotr780
    @piotr780 2 months ago +1

    Practical deep learning:
    1. pull the model
    2. tune, tune
    3. apply
    There is nothing interesting here - only trivial and ultimately hard tasks - and maybe 1% of companies build their own models.

  • @TheLummen.
    @TheLummen. 1 month ago

    Thank you for that !

  • @gergerger5888
    @gergerger5888 3 months ago

    54:26 If I understood correctly, what the host is trying to say here is this: for a fixed number of parameters, deep neural networks have more linear regions than shallow neural networks. HOWEVER, the function modelled by the DNN has lots of inner symmetries; it is not a "free function". So why does it work so well?
    That is to say: when you add more layers to a neural network you can fit more complex functions with fewer parameters, but the shape of these functions is weird and, to some extent, "fractal". It is hard to understand why this kind of highly constrained function fits real-world relations so well. UDL explores this topic, although there is no answer yet.
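    A rough sketch of counting linear regions empirically, prompted by the comment above (my own construction, not from UDL): hash the pattern of active ReLU units along a 1-D slice of input space. The theoretical maximum grows quickly with depth, but a randomly initialized network typically realizes far fewer regions, one concrete sense in which these functions are heavily constrained.

```python
# Illustrative sketch: count distinct ReLU activation patterns of a random net on a 1-D segment.
import numpy as np

rng = np.random.default_rng(0)

def count_regions(widths, n_points=200_000):
    """Count distinct activation patterns (a proxy for linear regions) on x in [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n_points).reshape(-1, 1)
    h, patterns, in_dim = x, [], 1
    for w in widths:
        W = rng.normal(size=(in_dim, w))
        b = rng.normal(size=w)
        pre = h @ W + b
        patterns.append(pre > 0)                 # which units fire at each input point
        h, in_dim = np.maximum(pre, 0.0), w
    code = np.concatenate(patterns, axis=1)      # joint activation pattern per point
    return len({row.tobytes() for row in code})

print("shallow 1x12 units:", count_regions([12]))       # at most 13 regions in 1-D
print("deep    3x4  units:", count_regions([4, 4, 4]))  # same neuron count, different geometry
```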

  • @trvst5938
    @trvst5938 2 months ago

    Lex Fridman's interviews on AI are 👍👍

  • @Daniel-Six
    @Daniel-Six 1 month ago

    I am more and more convinced that the inscrutably large matrices ubiquitous in machine learning are a black-box interface to a mechanism situated elsewhere and unseen in our computational domain--which potentially employs an entirely different kind of logic to generate the suspicious efficiencies we observe now.
    As Dr. Waku put it on his channel: we are actually making API calls to a more elaborate machine.

  • @CosasCotidianas
    @CosasCotidianas 2 months ago

    I have the feeling that this book is going to be a must in universities.

  • @robincheungmbarealtorbroke8461
    @robincheungmbarealtorbroke8461 3 months ago

    I think an ai Alignment approach that doesn't also include prioritization of neural networks being put through deep learning models and to shift neural networks, which are only human inspired, to integrate the second generation improvements of allowing a more holistic, less structurally-reductionist to replace the neural network by the Neutral lneural network's use in shifting from a syllogistically reductionist architecture of neural networks to its more holistically re-envisioning version of what was created by our reductionist culture

    • @therainman7777
      @therainman7777 2 months ago +1

      Dude, you left two comments on this same video using a whole bunch of “big words” but actually not saying anything coherent. This is absolute word salad, please spend some time learning the material first before you try to start sounding off with big opinions. It is painfully clear you don’t know what you’re talking about.

    • @robincheungmbarealtorbroke8461
      @robincheungmbarealtorbroke8461 2 months ago

      @@therainman7777 if you say so--then what is your goal in pointing it out?
      At least I put forth my line of reasoning (the point of it is that in business--and I may not know what I'm talking about because I didn't make the dean's list in my mba--but pretty close :p) that the misalignment of fiduciary duty with "social purpose," which was the original quote I was referring to--is not so much idiosyncratic as it is backwards, across the board.
      If you survey what society has done, across the board, from a polymathic point of view (that is, not only breadth, which would be "Jack of all trades but master of none," polymathic implies both breadth and depth) it's kind of a chronometer that I'd venture a guess would coincide with the magnetic pole reversal when pretty much everything is BACKWARDS.
      Anyway, that is the concept I was outlining, and the jump to my observation that science is not only still stuck applying Reductionist approaches to a whole whack of complex systems that are completely inappropriate to look at from a reductionist view is the root of that problem and the example in specific is to highlight something that almost everyone seems to take for granted as true, when it cannot be.

  • @terjeoseberg990
    @terjeoseberg990 3 months ago

    I believe that it makes perfect sense that gradient descent causes a complex function to find a solution that works.

  • @FRANKWHITE1996
    @FRANKWHITE1996 3 months ago

    subscribed

  • @abdulwasey7985
    @abdulwasey7985 1 month ago +1

    the most human discussion on AI ever

  • @zhandanning8503
    @zhandanning8503 1 month ago

    I wonder, though: machine learning is very similar to humans. The example about computer vision and humans being able to recognize things only holds if the thing was within the human's training data, right? So in this regard a model can do what humans can do. I think an interesting question is: can neural networks do more than humans? I suppose that would be when AGI is accepted to have come along. Was a very fascinating video, thank you.

  • @gulllars4620
    @gulllars4620 2 months ago

    At around 1:35:00 they are talking about the limits of LLMs and how scaling them can't possibly lead to superintelligence because they are interpolating a data manifold. I'd conjecture this is the wrong way of reasoning about the prospect. LLMs will be components of systems, not a black box that is the entire system. A single forward pass through an LLM is more akin to system 1 thinking, but you can build control loops with multiple contexts and levels of reasoning to make stronger cognitive systems where the LLM is the world model and reasoning engine, if put in the right framework. Also, LLMs can be extended with RAG, or access to external systems with APIs. This can also give the model external memory into which it can store and from which it can load learned information, skills and insights. I think the best example of this is the Minecraft bot Voyager, or possibly ChatDev. Those kinds of systems being driven by SOTA LLMs is how I believe we'll fairly soon hit AGI and then ASI.

    • @MachineLearningStreetTalk
      @MachineLearningStreetTalk  2 months ago

      Vanilla transformers are finite state automata, i.e. fixed compute. Yes, you can call them recursively to overcome their obvious computational/expressivity limitations, but then you have the "search problem" which has been plaguing traditional AI for decades. The problem in AI is how to navigate exponential search spaces and select good models dynamically and efficiently. We are no closer to this, pretty much zero progress - don't believe the hype.

    • @gulllars4620
      @gulllars4620 2 months ago

      @@MachineLearningStreetTalk Fair point, and I agree on the general characterization as placing a search problem on top. Though AlphaZero did beat humans at something that looks like a search problem we couldn't imagine machines beating us at, due to the size of the search space (Go). Chain-of-thought, tree-of-thought, and similar prompting strategies or frameworks show promise, and by investing a lot more in samples to search over, results can improve a lot and address issues of hallucinations or inconsistencies, especially if you pick based on heuristics rather than voting.
      It may currently be hyped, but it's showing more capabilities than the base model in isolation, and the better the prompting and framework, the more capability or accuracy it seems possible to extract. I think this is a compounding scaling factor beyond just scaling the model itself.
      And native agents and agent frameworks are ongoing research, being pushed hard currently and scaling in funding and people working on them.

    • @gulllars4620
      @gulllars4620 2 months ago

      @@MachineLearningStreetTalk BTW, quick part 2. My background is in embedded systems, so I'm familiar with finite state automata, or finite state machines, from there. I'm currently a senior engineer working on data intelligence but off on paternity leave and keeping mentally active by staying up to date on AI stuff. The best YouTube source I've seen on agent frameworks and cognitive frameworks is Dave Shapiro, who seems to be a techno-optimist and recently self-described accelerationist. I think he may make a good guest on your podcast, as he's been active in the AI area and saw the potential around GPT-2 time frames.
      Thanks again for the reply here; love your channel and the engagement with the community.

    • @MachineLearningStreetTalk
      @MachineLearningStreetTalk  2 months ago +2

      @@gulllars4620 AlphaZero just uses a bunch of naive heuristics for a single task; it's not intelligent at all. The search approach it uses (MCTS) ignores high-entropy trajectories, and the value network for estimating board state is obviously brittle and has since been defeated with simple adversarial attacks. Intelligence is the process which finds novel and robust models in efficient time (which are then shared and embedded socially) - this is obviously not neural networks, as no new models are being created "at runtime", although a future system might use an NN as a selection heuristic for such a search. I can tell from your response that you fundamentally misunderstand some important concepts, the most important of which is that intelligence is the system which produced our collective knowledge; LLMs are just second-order artefacts removed from that process. I would counsel diversifying your information diet on AI content; there are a load of hype merchants out there. 2024 might be the year where we start to understand the difference between knowledge and intelligence - the former will get you far, but the latter is the vanguard.

    • @gulllars4620
      @gulllars4620 2 months ago

      @@MachineLearningStreetTalk Thanks, I will look for more sources. I know about the Monte Carlo tree search component, and it's a bit surprising to me that such a brute-force approach (combined, of course, with heuristics for evaluating which paths to continue down) managed to solve the problem better than virtually all humans, though yes, it's brittle to adversarial attacks. AlphaFold and GNoME are similarly impressive, though all of these are not general systems like LLMs; the concepts could be used with an LLM as a component.
      I guess I should separate and specify more clearly between my thoughts on AGI and ASI.
      I think it's only a matter of months on the 10^1 scale until AI-driven systems can theoretically replace median humans at currently economically valuable tasks. We'll probably still have bastions where we can hold the line longer, even as AI systems surpass any human in skill at various tasks.
      I think the main constraints on digital and embodied AI actually replacing/displacing humans, first economically and later in all cognitive labor, are more likely a question of compute resources, capital, other resources (energy and industrial inputs), and the latency of scaling and ramping manufacturing.

  • @Eggs-n-Jakey
    @Eggs-n-Jakey 2 months ago

    Before I watch, are you just explaining why I have a stroke when trying to internalize these concepts? or are you guiding me to build the model in my mind? I need some type of mental model to understand things fully.

  • @FunwithBlender
    @FunwithBlender 3 months ago

    really cool

  • @DrkRevan
    @DrkRevan 2 months ago

    You are AWESOME!!!!!

  • @ataarono
    @ataarono 3 months ago

    Why just look for the minimum of the value function if you could also look for the maximum?

  • @newbiadk
    @newbiadk 2 months ago

    great episode