Liquid Neural Networks

  • Published 16 May 2024
  • Ramin Hasani, MIT - intro by Daniela Rus, MIT
    Abstract: In this talk, we will discuss the nuts and bolts of the novel continuous-time neural network models: Liquid Time-Constant (LTC) Networks. Instead of declaring a learning system's dynamics by implicit nonlinearities, LTCs construct networks of linear first-order dynamical systems modulated via nonlinear interlinked gates. LTCs represent dynamical systems with varying (i.e., liquid) time-constants, with outputs being computed by numerical differential equation solvers. These neural networks exhibit stable and bounded behavior, yield superior expressivity within the family of neural ordinary differential equations, and give rise to improved performance on time-series prediction tasks compared to advanced recurrent network models. (A sketch of the governing equation follows this description.)
    Speaker Biographies:
    Dr. Daniela Rus is the Andrew (1956) and Erna Viterbi Professor of Electrical Engineering and Computer Science and Director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. Rus's research interests are in robotics, mobile computing, and data science. Rus is a Class of 2002 MacArthur Fellow, a fellow of ACM, AAAI and IEEE, and a member of the National Academy of Engineering and the American Academy of Arts and Sciences. She earned her PhD in Computer Science from Cornell University. Prior to joining MIT, Rus was a professor in the Computer Science Department at Dartmouth College.
    Dr. Ramin Hasani is a postdoctoral associate and a machine learning scientist at MIT CSAIL. His primary research focus is on the development of interpretable deep learning and decision-making algorithms for robots. Ramin received his Ph.D. with honors in Computer Science at TU Wien, Austria. His dissertation on liquid neural networks was co-advised by Prof. Radu Grosu (TU Wien) and Prof. Daniela Rus (MIT). Ramin is a frequent TEDx speaker. He completed an M.Sc. in Electronic Engineering at Politecnico di Milano, Italy (2015), and received his B.Sc. in Electrical Engineering - Electronics at Ferdowsi University of Mashhad, Iran (2012).
  • Science & technology
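    For reference, a minimal sketch of the governing equation behind the abstract above, following the notation commonly used in the published LTC papers (a reader's summary, not the talk's own slides):

        dx(t)/dt = -[ 1/\tau + f(x(t), I(t); \theta) ] \odot x(t) + f(x(t), I(t); \theta) \odot A

    Here x(t) is the hidden state, I(t) the input, \tau a fixed base time constant, A a bias vector, and f a bounded neural network. Because f also multiplies the state in the decay term, the effective time constant \tau_sys = \tau / (1 + \tau f(x(t), I(t); \theta)) varies with the input, which is what "liquid" refers to.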

COMMENTS • 167

  • @FilippoMazza
    @FilippoMazza 2 роки тому +34

    Fantastic work. The relative simplicity of the model proves that this methodology is truly a step towards artificial brains. Expressivity, better causality and the many neuron inspired improvements are inspiring.

  • @agritech802
    @agritech802 10 місяців тому +17

    This is truly a game changer in AI, well done folks 👍

  • @marcc16
    @marcc16 9 місяців тому +23

    0:00: 🤖 The talk introduces the concept of liquid neural networks, which aim to bring insights from natural brains back to artificial intelligence.
    - 0:00: The speaker, Daniela Rus, is the director of CSAIL and has a curiosity to understand intelligence.
    - 2:33: The talk aims to build machine learned models that are more compact, sustainable, and explainable than deep neural networks.
    - 3:26: Ramin Hasani, a postdoc in Daniela Rus' group, presents the concept of liquid neural networks and their potential benefits.
    - 5:11: Natural brains interact with their environments to capture causality and go out of distribution, which is an area that can benefit artificial intelligence.
    - 5:34: Natural brains are more robust, flexible, and efficient compared to deep neural networks.
    - 6:03: A demonstration of a typical statistical end-to-end machine learning system is given.
    6:44: 🧠 This research explores the attention and decision-making capabilities of neural networks and compares them to biological systems.
    - 6:44: The CNN learned to attend to the sides of the road when making driving decisions.
    - 7:28: Adding noise to the image affected the reliability of the attention map.
    - 7:59: The researchers propose a framework that combines neuroscience and machine learning to understand and improve neural networks.
    - 8:23: The research explores neural circuits and neural mechanisms to understand the building blocks of intelligence.
    - 9:32: The models developed in the research are more expressive and capable of handling memory compared to deep learning models.
    - 10:09: The systems developed in the research can capture the true causal structure of data and are robust to perturbations.
    11:53: 🧠 The speaker discusses the incorporation of principles from neuroscience into machine learning models, specifically focusing on continuous time neural networks.
    - 11:53: Neural dynamics are described by differential equations and can incorporate complexity, nonlinearity, memory, and sparsity.
    - 14:19: Continuous time neural networks offer advantages such as a larger space of possible functions and the ability to model sequential behavior.
    - 16:00: Numerical ODE solvers can be used to implement continuous time neural networks.
    - 16:36: The choice of ODE solver and loss function can define the complexity and accuracy of the network.
    17:07: ✨ Neural ODEs combine the power of differential equations and neural networks to model biological processes.
    - 17:07: Neural ODEs use differential equations to model the dynamics of a system and neural networks to model the interactions between different components.
    - 17:35: The adjoint method is used to compute the gradients of the loss with respect to the state of the system and the parameters of the system.
    - 18:35: Backpropagating directly through the ODE solver has high memory complexity but is more accurate than the adjoint method.
    - 19:17: Neural ODEs can be inspired by the dynamics of biological systems, such as the leaky integrator model and conductance-based synapse model.
    - 20:43: Neural ODEs can be reduced to an abstract form with sigmoid activation functions.
    - 21:33: The behavior of the neural ODE depends on the inputs of the system and the coupling between the state and the time constant of the differential equation.
    22:26: ⚙️ Liquid time constant networks (LTCs) are a type of neural network that uses differential equations to control interactions between neurons, resulting in stable behavior and increased expressivity.
    - 22:26: LTCs have the same structure as traditional neural networks but use differential equations to control interactions between neurons (a minimal code sketch of this update follows the recap).
    - 24:25: LTCs have stable behavior and their time constant can be bounded.
    - 25:26: The synaptic parameters in LTCs determine the impact on neuron activity.
    - 25:50: LTCs are a universal approximator and can approximate any given dynamics.
    - 26:23: Trajectory length measure can be used to measure the expressivity of LTCs.
    - 27:58: LTCs consistently produce longer and more complex trajectories compared to other neural network representations.
    28:46: 📊 The speaker presents an empirical analysis of different types of networks and their trajectory lengths, and evaluates their expressivity and performance in representation learning tasks.
    - 28:46: The trajectory length of LTC networks remains higher regardless of changes in network width or initialization.
    - 29:04: Theoretical evaluation reveals a lower bound on the expressivity of these networks based on weight scale, bias scale, width, depth, and number of discretization steps.
    - 30:38: In representation learning tasks, LTCs outperform other networks, except for tasks with longer term dependencies where LSTMs perform better.
    - 31:13: LTCs show better performance and robustness in real-world examples, such as autonomous driving, with significantly reduced parameters.
    - 33:09: LTC-based networks impose an inductive bias on convolutional networks, allowing them to learn a causal structure and exhibit better attention and robustness to perturbations.
    34:22: ⚙️ Different neural network models have varying abilities to learn representations and perform in a causal manner.
    - 34:22: The CNN consistently focuses on the outside of the road, which is undesirable.
    - 34:31: LSTM provides a good representation but is sensitive to lighting conditions.
    - 34:39: CTRNN or neural ODEs struggle to gain a nice representation in this task.
    - 36:07: Physical models described by ODEs can predict future evolution, account for interventions, and provide insights.
    - 38:36: Dynamic causal models use ODEs to create a graphical model with feedback.
    - 39:55: Liquid neural networks can have a unique solution under certain conditions and can compute coefficients for causal behavior.
    40:18: 🧠 Neural networks with ODE solvers can learn complex causal structures and perform tasks in closed loop environments.
    - 40:18: Dynamic causal models with parameters B and C control collaboration and external inputs in the system.
    - 41:12: Experiments with drone agents showed that the neural networks learned to focus on important targets.
    - 41:58: Attention and causal structure were captured in both single and multi-agent environments.
    - 43:05: The success rate of the networks in closed loop tasks demonstrated their understanding of the causal structure.
    - 43:46: Complexity of the networks is tied to the complexity of the ODE solver, leading to longer training and test times.
    - 44:53: The ODE-based networks may face vanishing gradient problems, which can be mitigated with gating mechanisms.
    45:41: 💡 Model-free inference and liquid networks have the potential to enhance decision-making and intelligence.
    - 45:41: Model-free inference captures temporal aspects of tasks and performs credit assignment better.
    - 45:53: Liquid networks with causal structure enable generative modeling and further inference.
    - 46:32: Compositionality and differentiability make these networks adaptable and interpretable.
    - 46:40: Adding CNN heads or perception modules can handle visual or video data.
    - 48:09: Working with objective functions and physics-informed learning processes can enhance learning.
    - 49:02: Certain structures in liquid networks can improve decision-making for complex tasks.
    Recap by Tammy AI
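
    To make the 22:26 part of the recap concrete, here is a minimal NumPy sketch (not the authors' code) of one LTC state update, assuming the fused semi-implicit solver step described in the LTC paper; all names and sizes are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    n_units, n_inputs = 8, 3
    W = rng.normal(size=(n_units, n_units))   # recurrent weights (illustrative)
    U = rng.normal(size=(n_units, n_inputs))  # input weights (illustrative)
    b = np.zeros(n_units)                     # biases
    A = np.ones(n_units)                      # bias vector the state is driven toward
    tau, dt = 1.0, 0.1                        # base time constant and solver step size

    def f(x, I):
        # bounded nonlinearity standing in for the learned synaptic network
        return 1.0 / (1.0 + np.exp(-(W @ x + U @ I + b)))

    def ltc_step(x, I):
        fx = f(x, I)
        # fused semi-implicit update: the same fx both damps the state and drives it
        # toward A, so the effective time constant depends on the input ("liquid")
        return (x + dt * fx * A) / (1.0 + dt * (1.0 / tau + fx))

    x = np.zeros(n_units)
    for I in rng.normal(size=(5, n_inputs)):  # roll the cell over a short input sequence
        x = ltc_step(x, I)
    print(x.shape)                            # (8,)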

  • @adityamwagh
    @adityamwagh 10 місяців тому +1

    It just amazes me how the final few layers are so crucial to the objective of the neural network!

  • @araneascience9607
    @araneascience9607 2 роки тому +1

    Great work, I hope they publish the paper on this model soon.

  • @martinsz441
    @martinsz441 2 роки тому +68

    Sounds like an important and necessary evolution of ML. Let's see how much this can be generalized and scaled, but it sounds fascinating.

    • @David-rb9lh
      @David-rb9lh 2 роки тому +5

      I will try to use it.
      I think that a lot of studies and reports will yield interesting results.

    • @maloxi1472
      @maloxi1472 Рік тому +1

      "Necessary" for which specific applications ? Surely not "necessary" across the board.
      I'd like to see you elaborate

    • @andrewferguson6901
      @andrewferguson6901 Рік тому +3

      ​@@maloxi1472necessary for not spending 50 million dollars for a 2 month training computation?

    • @maloxi1472
      @maloxi1472 Рік тому +2

      @@andrewferguson6901 You wrongly assume that the product of that training is necessary to begin with.

  • @rickharold7884
    @rickharold7884 2 роки тому +5

    Always interesting. Thx

  • @AA-gl1dr
    @AA-gl1dr 2 роки тому +2

    amazing video. thank you for uploading.

  • @KeviPegoraro
    @KeviPegoraro 2 роки тому +8

    Very good. The idea behind it is simple; the problem lies in putting all of it to work together, and that is the good stuff.

  • @hyperinfinity
    @hyperinfinity 2 роки тому +121

    Most underrated talk. This is an actual game changer for ML.

    • @emmanuelameyaw9735
      @emmanuelameyaw9735 2 роки тому +37

      :) most overrated comment on this video.

    • @arturtomasz575
      @arturtomasz575 2 роки тому +4

      It might be! Let's see how it performs on specific tasks against state-of-the-art solutions, not against toy models of a specific architecture. Can't wait to try it myself, especially vs transformers or residual CNNs :)

    • @guopengli6705
      @guopengli6705 2 роки тому +20

      I think that it is way too early to say this. A few mathematicians tried to improve DNNs' interpretability in similar ways. This comment seems perhaps over-optimistic from a viewpoint of theory. We do need to test its performance on more CV tasks.

    • @KirkGravatt
      @KirkGravatt 2 роки тому +1

      yeah. this got me to chime back in. holy shit.

    • @moormanjean5636
      @moormanjean5636 Рік тому +2

      @@guopengli6705 nah, you weren't paying attention; this revolutionizes causal learning in my opinion while improving on the state of the art

  • @isaacgutierrez5283
    @isaacgutierrez5283 2 роки тому +61

    I prefer my neural networks solid, thank you very much

  • @saeedrehman5085
    @saeedrehman5085 2 роки тому +4

    Amazing!!

  • @alwadud9243
    @alwadud9243 2 роки тому +16

    Thanks Ramin and team. That was the most interesting and well delivered presentation on neural nets that I have ever seen, certainly a lot new to learn in there. Most impressed by the return to learning from nature and the brain and how that significantly augmented 'standard' RNNs etc. Well, there's a new standard now, and it's liquid.

    • @mrf664
      @mrf664 10 місяців тому

      @alwadud9243 can you explain how this works?

    • @mrf664
      @mrf664 10 місяців тому

      I think it is interesting too but I fail to grasp any Intuition.
      The only other way I see is to spend hours with papers and equations, but I cannot afford the time for that at present so I was curious if you were able to glean more insight than me :) 😊 thanks!

  • @krishnaaditya2086
    @krishnaaditya2086 2 роки тому +3

    Awesome Thanks!

  • @Niamato_inc
    @Niamato_inc 10 місяців тому +3

    I am moved beyond description.
    What an amazing privilege to be alive in this day and age.
    The future will be great for mankind.

  • @wangnuny93
    @wangnuny93 2 роки тому +8

    Man, I don't work in the ML field, but this sure is fascinating!!!!

  • @Axl_K
    @Axl_K 2 роки тому +6

    Fascinating... loved every minute.

  • @raminkhoshbin9562
    @raminkhoshbin9562 2 роки тому +15

    I got so happy finding out the person who wrote this exciting paper, is also a Ramin :)

  • @zephyr1181
    @zephyr1181 10 місяців тому +2

    I would need a simpler version of the 22:49 diagram to understand this.
    Ramin says here that standard NN neurons have a recursive connection to themselves. I don't know a ton about ANNs, but I overhear from my coworkers, and I never heard of that recursive connection. Is that for RNNs?
    Is there a "Reaching 99% on MNIST"-simple explanation, or does this liquidity only work on time-series data?

  • @lorenzoa.ricciardi4264
    @lorenzoa.ricciardi4264 2 роки тому +50

    The "discovery" that fixed time steps for ODEs work better in this case has been very well known in the optimal control literature for at least a couple of decades.
    Basically if your ODE solver has adaptive time steps, the exact mathematical operations performed for a given integration time interval dT can vary because a different number of internal steps is performed. This can have really bad consequences on the gradients of the final time states.
    There's plenty of theoretical and practical discussion in Betts' book Practical Methods for Optimal Control, chapter 3.9 Dynamic Systems Differentiation.
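
    A toy illustration of that point (illustrative code, not from the talk or the book): with a fixed-step integrator, the same sequence of arithmetic operations runs for every parameter value, so the map from parameters to final state stays smooth; an adaptive solver instead picks its step count from a local error estimate, so nearby parameter values can execute different numbers of internal steps.

    import numpy as np

    def f(x, a):
        # simple linear vector field standing in for a learned network
        return -a * x

    def integrate_fixed(x0, a, T=1.0, n_steps=40):
        # explicit Euler with a fixed step count: the computation has the same
        # shape for every value of a, which keeps gradients well behaved
        dt = T / n_steps
        x = x0
        for _ in range(n_steps):
            x = x + dt * f(x, a)
        return x

    print(integrate_fixed(np.array([1.0]), a=2.0))  # ~0.129; exact solution exp(-2) = 0.135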

    • @abinaslimbu3057
      @abinaslimbu3057 Рік тому

      Lord siva
      Gass state State (liquid) Gass light

    • @abinaslimbu3057
      @abinaslimbu3057 Рік тому

      Humoid into human

    • @abinaslimbu3057
      @abinaslimbu3057 Рік тому

      State Gass powered

    • @DigitalTiger101
      @DigitalTiger101 10 місяців тому +13

      @@abinaslimbu3057 Schizo moment

    • @iamyouu
      @iamyouu 10 місяців тому

      @@abinaslimbu3057 why are you doing this? I know no one who's actually hindu would comment such stupid sht.

  • @MLDawn
    @MLDawn 2 роки тому +2

    Could you please share the link to the original paper? Thanks

  • @johnniefujita
    @johnniefujita Рік тому +4

    does anyone know about any sample code for a model like this?

  • @jiananwang2681
    @jiananwang2681 6 місяців тому

    Hi, thanks for the great video! Can you share the idea of how to visualize the activated neurons as in 6:06 in this video? It's really cool and I'm curious about it!

  • @maxlee3838
    @maxlee3838 8 днів тому

    This guy is a genius.

  • @scaramir45
    @scaramir45 2 роки тому +13

    i hope that one day i'll be able to fully understand what he's talking about... but it sounds amazing and i want to play around with it!

  • @sapienspace8814
    @sapienspace8814 8 місяців тому

    Great talk, does anyone know if Tesla's latest "rewrite" in "Full Self Driving (FSD) beta 12" is using something like what is in this talk, or something else?

  • @vdwaynev
    @vdwaynev 2 роки тому +1

    How do these compare to neural ODEs?

  • @shashidharkudari5613
    @shashidharkudari5613 2 роки тому +1

    Amazing talk

  • @peceed
    @peceed 10 місяців тому +1

    Causality is extremely important in building a Bayesian model of the world. It allows us to identify correlations between events that are useful for creating a priori statistics for reasoning, because we avoid double-counting. A single piece of evidence and its logical consequences is not seen as many independent confirmations of a hypothesis.

  • @Tbone913
    @Tbone913 10 місяців тому +1

    But why do the other methods have smaller error bands? There is further improvement that can be done here

  • @phquanta
    @phquanta 2 роки тому +9

    I'm curious: wouldn't a numerical ODE solver kill all gradients, with error scaling exponentially as depth grows?

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 роки тому +1

      You can differentiate through an ODE solver, either manually or automatically. Even numerical gradients may work if you're careful with your implementation

    • @phquanta
      @phquanta 2 роки тому

      @@lorenzoa.ricciardi4264 You mean like an AdaGrad-type thing? Given that gradients can be computed exactly, i.e. a closed-form solution to the ODE exists, I would assume there would be no such problem. On the other hand, if there is no closed-form solution to the ODE, one is probably limited by the depth of the neural net, even with approaches like LSTM/GRU, "higher-order" ODE solvers, etc.

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 роки тому

      @@phquanta I'm talking about automatic differentiation; there are several packages to do it in many languages. And you can compute those automatic gradients without knowing the closed-form solution of the ODE. Of course, if you have the closed-form solution you can compute gradients manually, but that's not my point.

    • @phquanta
      @phquanta 2 роки тому

      @@lorenzoa.ricciardi4264 All NNs have backprop and the chain rule, which basically unravels all derivatives exactly since nonlinearities are easily differentiable. In liquid NNs, along with all the other problems (vanishing/exploding gradients), you are adding a source of inherent numerical error on top of the existing ones, and even Runge-Kutta won't help. What I'm saying is, you are limited by the depth of the liquid NN. As a concept, it might be cool, but I would assume it is not easily scalable.

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 роки тому

      @@phquanta I'm not a particular expert in NNs. From what I see in the presentation there are only a few layers in this approach, not dozens. The "depth" mostly seems to come from the continuous process described by the ODEs of the synapses.
      You shouldn't have particular problems when differentiating through ODEs if you know how to do it properly. One part of the problem may be related to the duration of time you're integrating your ODE for (not mentioned in the talk) and the nonlinearity of the underlying dynamics. When you deal with very nonlinear optimal control problems, a naive approach like single shooting (which is probably related to the simplest kind of backpropagation) will leave you with horrible sensitivity problems. That's why multiple shooting was invented. Otherwise you can use collocation methods, which work even better.
      As for numerical errors: with automatic differentiation your errors are down to machine epsilon by construction. They are as good as analytical ones, but most often they are way faster to execute, and one does not have to do the tedious job of computing them manually. If you combine a multiple shooting approach with automatic differentiation, you don't have numerical error explosion (or better, you can control it very well). That's why we can compute very complicated optimal trajectories for space probes in a really precise way, even though the integration time spans years or even decades and the dynamics are extremely nonlinear.
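
      A small PyTorch sketch of the automatic-differentiation point (illustrative only; the setup is made up): reverse-mode autodiff through an unrolled fixed-step solver yields gradients of the final state with respect to the parameters, with no closed-form solution of the ODE required.

      import torch

      theta = torch.randn(4, 4, requires_grad=True)  # parameters of the vector field

      def f(x, theta):
          # nonlinear vector field standing in for a learned network
          return torch.tanh(x @ theta)

      x = torch.ones(4)
      dt, n_steps = 0.05, 20
      for _ in range(n_steps):
          x = x + dt * f(x, theta)   # explicit Euler, unrolled into the autograd graph

      loss = x.pow(2).sum()
      loss.backward()                # exact reverse-mode gradient through the solver
      print(theta.grad.shape)        # torch.Size([4, 4])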

  • @dweb
    @dweb 2 роки тому +1

    Wow!

  • @wadahadlan
    @wadahadlan 2 роки тому +3

    this was a great talk, this could change everything

  • @user-uc2qy1ff2z
    @user-uc2qy1ff2z 7 днів тому

    That's an amazing concept. We should implement it out of spite.
    Too often we feel our brains turn to mush. AI should suffer that way too.

  • @petevenuti7355
    @petevenuti7355 9 місяців тому

    I've had unarticulated thoughts resembling this concept for many decades, but I never learned the math, so I could never express my ideas. I still need someone to explain the math at a high-school level!

  • @Ali-wf9ef
    @Ali-wf9ef 2 роки тому +2

    The video showed up in my feed randomly and I clicked on it just because the lecturer was Iranian. But the content was so interesting that I watched it to the end. Sounds like a real evolutionary breakthrough in ML and DL. Especially with the computational power of computing systems growing every day, training/inferencing such complex network models becomes more feasible. Great job

  • @ian4692
    @ian4692 2 роки тому

    Where to get the slides?

  • @ibraheemmoosa
    @ibraheemmoosa 2 роки тому +8

    The attention map at 7:00 looks fine to me. If you do not want to wander off the road, you should attend to the boundary of the road. And even after you add noise at 7:30, the attention still picks up the boundary, which is pretty good.

    • @vegnagunL
      @vegnagunL 2 роки тому

      Yes, it still is a consistent pattern for the driving task.

    • @AsifShahriyarSushmit
      @AsifShahriyarSushmit 2 роки тому +3

      This sounds kinda like the mesa-optimizer thing Robert Miles keeps talking about. ua-cam.com/video/bJLcIBixGj8/v-deo.html
      A network can learn the same task in several ways with a totally different inner objective, which may or may not align with a biological agent doing the same task.

    • @ChrisJohnsonHome
      @ChrisJohnsonHome 2 роки тому +1

      Because the LTC Network is uncovering the causal structure, it performs much better in noise (33:26), heavy rain/occlusions (42:54) and crashes less in the simulation.
      Since it pays attention to the causes, I wonder if it's also giving itself more time to steer correctly?

    • @moormanjean5636
      @moormanjean5636 Рік тому

      @@ChrisJohnsonHome I would guess that yes, the time constants of the network would learn to modulate in the face of uncertainty

    • @moormanjean5636
      @moormanjean5636 Рік тому +2

      Looking at the boundary of the road is not how humans drive. We assume that we know where the nearby boundary is already and so look at the horizon to update our mental maps. It is reasonable that neural networks should look to do the same, and it is evidence of LTC's causal behavior.

  • @imanshahmari4423
    @imanshahmari4423 2 роки тому

    where can i find the paper ?

  • @mishmohd
    @mishmohd 9 місяців тому

    Any association with Liquid Snake ?

  •  Рік тому +3

    Great work!
    Does this mean all the training done for autonomous driving with the traditional NN goes to the toilet?

    • @matthiaswiedemann3819
      @matthiaswiedemann3819 9 місяців тому

      For sure 😂

    • @User_1795
      @User_1795 4 місяці тому

      No, these still use convolutional layers for image processing.

  • @AndyBarbosa96
    @AndyBarbosa96 10 місяців тому +2

    What is the difference between these LNNs and coupled ODEs? Aren't we conflating these terms? If you drive a car with only 19 neurons, then what you have is an asynchronous network of coupled ODEs and not a neural network; the term is misleading.

  • @samowarow
    @samowarow 2 роки тому +47

    Feels like ML folks keep rediscovering things all over

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 роки тому +10

      Yep. Basically, if you have a good grounding in optimal control theory you can see that this approach is fusing state observation and control policy. There's literally no mention of it in the whole talk. I'll give the benefit of the doubt as to the reason for this, but unfortunately I *very* often see, as you say, that ML people rediscover stuff and rebrand it as an ML invention (like backprop, which is literally just a discretized version of a standard technique in calculus of variations/optimal control).

    • @moormanjean5636
      @moormanjean5636 Рік тому +1

      @@lorenzoa.ricciardi4264 this is definitely not just "rediscovering" stuff. I'm not sure how you managed to watch the whole talk yet miss all the parts where this technique outperformed previous techniques by leaps and bounds. You sound like a salty calculus teacher, but I'll give you the benefit of the doubt as to the reason for this lol.

  • @gameme-yb1jz
    @gameme-yb1jz 3 місяці тому

    this should be the next deep learning revolution.

  • @andreylebedenko1260
    @andreylebedenko1260 2 роки тому +6

    Sounds interesting, but... The fundamental difference between biological and NN processing at the current state is time. While biological systems process input asynchronously, computers try to do the whole path in one tick. I believe this must be addressed first, leading to a completely new concept of NN, where input neurons will first generate a set of signals (with some variation as the physical source signals change), those signals will then be accumulated by the next layer of the NN, processed in the same fashion, and passed further. This way, signals that repeat over multiple sampling ticks of the first layer will be treated with a higher trust (importance) level by the next layer.

    • @terjeoseberg990
      @terjeoseberg990 2 роки тому +1

      Our brains learn as we use them. Artificial neural networks are “trained” by using gradient descent to optimize an extremely complex function for a dataset during a training phase; then, as they're used to predict answers, they learn nothing.
      We need continuous reinforcement learning.

    • @andreylebedenko1260
      @andreylebedenko1260 2 роки тому

      @@terjeoseberg990 What about recurrent neural networks? Besides, the human brain also first learns how to see, grab, hold, walk, speak, etc. -- i.e. builds models -- and then it uses these models, improving them, but never reinventing them.

    • @terjeoseberg990
      @terjeoseberg990 2 роки тому

      @@andreylebedenko1260, Recurrent neural networks are also trained using gradient descent. I don't believe our brains have a gradient descent mechanism. I have no clue how our brains learn. Gradient descent is pretty simple to understand. What our brains do is a complete mystery.

    • @hi-gf5yl
      @hi-gf5yl Рік тому

      @@terjeoseberg990 ua-cam.com/video/Q18ahll-mRE/v-deo.html

    • @moormanjean5636
      @moormanjean5636 Рік тому

      @@terjeoseberg990 actually, look up backpropagation in the brain; it is plausible that we are doing something similar to backprop at the end of the day

  • @aminabbasloo
    @aminabbasloo 2 роки тому +3

    I am wondering how it does for RL scenarios!

    • @enricoshippole2409
      @enricoshippole2409 2 роки тому

      As am I. I plan on testing out some concepts using their LTC keras package. Will see how it goes

    • @moormanjean5636
      @moormanjean5636 Рік тому +1

      I have used it and it works well, just use a slightly larger learning rate than LSTM.

  • @matthiaswiedemann3819
    @matthiaswiedemann3819 9 місяців тому +1

    To me it seems similar to variational inference ...

  • @Eye_of_state
    @Eye_of_state 2 роки тому +1

    Must share technology that saves lives.

  • @Alexander_Sannikov
    @Alexander_Sannikov 2 роки тому +47

    - let's make another attempt at implementing a biology-inspired neural network
    - proceeds to implement backprop

    • @fernbear3950
      @fernbear3950 2 роки тому

      Direct MLE over direct data is still the best (ATM, AFAIK) in class for implicitly performing regression over a distribution density w.r.t. the internal states/activations/features of a network.
      Generally the rule of thumb is to limit "big steps" from the main trunk of development so that impact can be measured, etc. It also helps to vastly (i.e. by orders of magnitude) increase the chance that something will succeed.
      Otherwise the chances of failure are much higher (and rarely get published, I would suspect from personal experience). I'm sure there is some nice interconnected minimally-required jump of feature subsets from this kind of research to a more Hebbian-kind-of-based approach, but then again there's nothing dictating we do it all at once (which can be exponentially expensive).
      Hopefully brain stuff comes in handy, but ATM the field is going towards massively nearly linear models instead of the opposite, since the prior affords better results (generally) for MLE-over-MCE.

    • @Gunth0r
      @Gunth0r 9 місяців тому

      @@fernbear3950 but linear models suck when there's big regime changes in the data

    • @fernbear3950
      @fernbear3950 9 місяців тому

      @@Gunth0r I'm not sure what you mean by 'regime' changes here. I wasn't talking about anything linear at all here. MLE over a linear model would be, uh, interesting to say the least, lol.

  • @StephenRoseDuo
    @StephenRoseDuo 5 місяців тому

    Can someone point to a simple LTC network implementation please?

  • @lufiporndre7800
    @lufiporndre7800 3 місяці тому

    Does anyone have code for an autonomous car system? I would like to practice with it. If anyone knows, please share.

  • @9assahrasoum3asahboou87
    @9assahrasoum3asahboou87 2 роки тому

    fathi fes medos aziza 1 said Thank you so much

  • @Dr.Z.Moravcik-inventor-of-AGI
    @Dr.Z.Moravcik-inventor-of-AGI 2 роки тому +6

    There are so many smart people at MIT that America must already be a superintelligent nation. Please continue your work and this world will become a wonderful place to live in.

    • @edthoreum7625
      @edthoreum7625 Рік тому

      By now the entire human race should be at an incredible level of intelligence, even traveling out of our solar system with fusion-powered space shuttles!

    • @AndyBarbosa96
      @AndyBarbosa96 10 місяців тому

      Yeah, America is so "intelligent", flying high on borrowed talent ....

    • @quonxinquonyi8570
      @quonxinquonyi8570 9 місяців тому

      @@AndyBarbosa96 that intelligence drops to a significant degree in the second generation of these first-generation geniuses... simple fact... therefore that borrowing approach is the single most important, numero uno policy of American technological might... as Hillary Clinton rightly said some years ago, "the power of America resides outside of America"

  • @paulcurry8383
    @paulcurry8383 2 роки тому +16

    How does the attention map showing that the LNN was looking at the vanishing point mean it’s forming “better” representations?
    Shouldn’t “better” representations only be understood as having better performance? If it’s more explainable that’s cool but there’s ways to train CNNs that make them more explainable while hurting performance.

    • @jellyboy00
      @jellyboy00 2 роки тому +6

      Can't agree more. It would be more persuasive if Liquid Neural Networks were immune to some problem that previous architectures generally struggle with, such as adversarial examples.
      The fact that Liquid Neural Networks can't learn long-term dependencies as well as LSTMs is sort of disappointing, as LSTMs already underperform compared with attention-only models.
      Not to mention that spiking neural networks are something that I myself (not an expert though) would say are designed according to biological brain mechanisms.

    • @rainmaker5199
      @rainmaker5199 2 роки тому +2

      Isn't the point of looking at the attention map to understand how the network is understanding the current issue? When they showed the attention maps for all the models, we could see that the LSTM was mostly paying attention to the road 5-10 feet ahead, making it sensitive to immediate changes in lighting conditions. The LNN was paying attention to the vanishing point to understand the way the road evolves (at least it seemed like that's what they were getting at), and therefore was not sensitive to immediate changes in light level. It doesn't mean it's forming 'better' representations, just that being able to distinguish what each representation is using as key information allows us to make more robust models that are less sensitive to common pitfalls one might fall into.

    • @jellyboy00
      @jellyboy00 2 роки тому

      @@rainmaker5199 for me that is more of an interpretability issue. And for general autonomous driving, I think there is no definite answer about where the model should look; otherwise it becomes a soft handcrafted constraint or curriculum learning. It is still reasonable for the AI to look at the sides of the road, as they also tell something about the curvature of the road. And in general autonomous driving, there might be an obstacle or pedestrian popping up anywhere, so the claim that attending to the vanishing point of the road is better sounds less persuasive. Generally speaking, one does not even know what part of the input should be attended to in the first place.

    • @rainmaker5199
      @rainmaker5199 2 роки тому +1

      @@jellyboy00 I think you misunderstood me, I'm not claiming that the model attending to the vanishing point is a better self driving model for all circumstances, just that it's better at understanding that the road shape can be determined ahead of time rather than in the current step of points. This allows us to have the possibility of distributing responsibility between multiple models focused on more specific tasks. So basically, the fact that its able to tell the road shape earlier and with less continuous information alongside the fact that we know more specifically what task is being accomplished (rather than a mostly black box) is the valuable contribution here.

  • @manasasb536
    @manasasb536 2 роки тому +3

    Can't wait to do a project on LNNs and add it to my resume to stand out from the crowd.

  • @LarlemMagic
    @LarlemMagic 2 роки тому +3

    Get this video to the FSD team.

  • @michaelflynn6952
    @michaelflynn6952 2 роки тому +10

    Why does no one in this video seem to have any plan for what they want to communicate and in what order? So hard to follow

    • @AM-ng8wc
      @AM-ng8wc 2 роки тому +2

      They have engineering syndrome

  • @sitrakaforler8696
    @sitrakaforler8696 2 роки тому +2

    Dam...nice.

    • @alwadud9243
      @alwadud9243 2 роки тому +1

      Yeah, I loved the part where he said '... and this is nice!'

  • @jos6982
    @jos6982 Рік тому

    good

  • @jonathanperreault4503
    @jonathanperreault4503 2 роки тому

    At the end of the video he says these technologies are open-sourced, but there are no links in the video description. Can we gather the relevant code sources and GitHub repos?

    • @Hukkinen
      @Hukkinen 2 роки тому +1

      Links are in the slides at the end

  • @Tbone913
    @Tbone913 10 місяців тому +1

    Can this be extended to a liquid transformer model?

    • @AndyBarbosa96
      @AndyBarbosa96 10 місяців тому +1

      No, this is not an ANN. This is coupled ODEs for control. The term is misleading.

    • @Tbone913
      @Tbone913 10 місяців тому

      @@AndyBarbosa96 ok thanks

  • @zzmhs4
    @zzmhs4 2 роки тому

    I'm not an expert, but this sounds to me like an implementation of different neurotransmitters, isn't it?

    • @moormanjean5636
      @moormanjean5636 Рік тому

      I don't see how it would be

    • @zzmhs4
      @zzmhs4 Рік тому

      @@moormanjean5636 I've watched the video again to answer you, and I still think the same.

    • @moormanjean5636
      @moormanjean5636 Рік тому +1

      @@zzmhs4 Let me try to explain my POV. Different neurotransmitters in the brain serve specific and multifaceted roles, some of which are similar but usually not. I think of distinct neurotransmitters as essentially being subcircuits that are coupled together on diverse timescales and in various combinations. Evolution allowed these to emerge naturally, but in my opinion, you would need something like a neuroevolutionary algorithm to actually implement an analogue of neurotransmitters in neural networks. What LTCs propose is something fundamentally different, and I think it has more to do with a model of the neurons/synapses that is more biologically accurate than with an attempted or indirect implementation of different neurotransmitters.

    • @zzmhs4
      @zzmhs4 Рік тому

      @@moormanjean5636 Ok, I see, thanks for answering my first question

  • @imolafodor4667
    @imolafodor4667 9 місяців тому

    is it really reasonable to "just" model a CNN for autonomous driving? it would be better to compare liquid nets with policies trained in an RL system (where at least some underlying goal was followed), no?

  • @amanda.collaud
    @amanda.collaud 2 роки тому +5

    What about backpropagation? Too bad he didn't finish his train of thought; this is more of an interview than a source of knowledge or a lesson.

    • @moormanjean5636
      @moormanjean5636 Рік тому

      He explained two different ways of calculating gradients for LTCs, each with their own pros and cons

  • @danielgordon9444
    @danielgordon9444 2 роки тому +4

    ...it runs on water, man.

  • @kayaba_atributtion2156
    @kayaba_atributtion2156 2 роки тому +1

    USA: I WILL TAKE YOUR ENTIRE MODEL

  • @d4rkn3s7
    @d4rkn3s7 2 роки тому +13

    Ok, after half of the talk, I stopped and read the entire paper, which kind of left me disappointed. LNNs are promoted as a huge step forward, but where are the numbers to back this up? I couldn't find them in the paper, and I strongly doubt that this is the "game-changer" as some suggest.

    • @JordanMetroidManiac
      @JordanMetroidManiac 2 роки тому +2

      Ramin makes a really good point about why LNNs could be better in a lot of situations, though. Time scale is continuous, allowing for the model to approximate any function with significantly fewer parameters. But I can imagine that the implementation might be so ridiculous that it will never replace DNNs.

    • @ChrisJohnsonHome
      @ChrisJohnsonHome 2 роки тому +2

      He goes over performance numbers starting at around 29:44

    • @moormanjean5636
      @moormanjean5636 Рік тому

      They have a number of advantages, performance and efficiency being two of them. The problem I think a lot of people have is they expect ground-breaking results to be extremely obvious in terms of performance gains, as if the state-of-the-art wasn't extremely proficient to begin with. There are plenty of opportunities to scale the performance of LNNs, but it is their other theoretical properties that are what make them a game changer in my opinion.

    • @moormanjean5636
      @moormanjean5636 Рік тому +1

      For example, their stability, their time-continuous nature, their causal nature, these are very important yet subtle properties of effective models. Not to mention you only need a handful to pull off what otherwise would take millions of parameters... how is that not a gamechanger??

    • @Gunth0r
      @Gunth0r 9 місяців тому

      I couldn't even find the paper, where is it?

  • @abinaslimbu3057
    @abinaslimbu3057 Рік тому

    John Venn 3 diagram class 10

  • @VerifyTheTruth
    @VerifyTheTruth 2 роки тому +2

    Does The Brain Distribute Calculation Loads To Systemic Subsets, Initiating Feedback Loops From Other Cellular Systems With Different Calculative Specialties And Capacities, Based Upon The Types Of Contextual Information It Recieves From Extraneous Environmental Sources, Which It Then Uses To Construct Or Render The Most Relevant Context To Appropriate Consciousness Access To A Meaningful Response Field Trajectory?

    • @Niohimself
      @Niohimself 2 роки тому

      Pineapple

    • @VerifyTheTruth
      @VerifyTheTruth 2 роки тому

      @@Niohimself Pomegranate.

    • @k.k.9378
      @k.k.9378 2 роки тому

      @@Niohimself PineApple

    • @Gunth0r
      @Gunth0r 9 місяців тому

      Markov Blanket Stores.

  • @sahilpocker
    @sahilpocker 2 роки тому

    😮

  • @yes-vy6bn
    @yes-vy6bn 2 роки тому +2

    @tesla 👀

  • @ingenium7135
    @ingenium7135 2 роки тому +1

    Soo when AGI ?

    • @shadowkiller0071
      @shadowkiller0071 2 роки тому +4

      Gimme 5 minutes.

    • @egor.okhterov
      @egor.okhterov 2 роки тому +2

      Why won't they break the mental barrier of having to stick to backprop and gradient descent...

    • @afbeavers
      @afbeavers 9 місяців тому

      @@egor.okhterov Exactly. That would seem to be the roadblock.

  • @user-rs4sg2tz6k
    @user-rs4sg2tz6k 10 місяців тому +2

    1 year later?

  • @ToddFarrell
    @ToddFarrell 2 роки тому +4

    To be fair though, he isn't at Stanford, so he hasn't sold out completely yet. Let's give him a chance :)

  • @Goldenhordemilo
    @Goldenhordemilo 2 роки тому +1

    μ Muon Spec

  • @vikrantvijit1436
    @vikrantvijit1436 2 роки тому +2

    Path breaking new ground forming revolutionary research work that will change the face of futures liberating Force focused On digital humanities SPINNED Technologies INNOVATIONS Spectrums.

    • @pouya685
      @pouya685 2 роки тому +3

      My head hurts after reading this sentence

  • @MS-od7je
    @MS-od7je 2 роки тому

    Why is the brain a Mandelbrot set?

  • @stc2828
    @stc2828 2 роки тому +1

    Very sad to see AI development fall into another hole. The last 10 years were fun while it lasted. See you guys 30 years later!

  • @egor.okhterov
    @egor.okhterov 2 роки тому +6

    He failed at 2 things:
    1. He decided to solve differential equations.
    2. He didn’t get rid of back propagation.
    Probably he is required to do some good math in order to publish papers and be paid a salary. As long as we have such an incentive from a scientific community, we would be stuck with suboptimal narrow AI based on statistics and back propagation.

    • @adrianhenle
      @adrianhenle 2 роки тому +1

      The alternative being what, exactly?

    • @egor.okhterov
      @egor.okhterov 2 роки тому +2

      @@adrianhenle emulation of cortical columns, the way Numenta does it. For example, there is a video "Alternatives to Backpropagation in Neural Networks" if you're interested: ua-cam.com/video/oXyQU0aScq0/v-deo.html

    • @zeb1820
      @zeb1820 2 роки тому +4

      The differential equations were an example of how the continuous process of synaptic logic from neuroscience was used to enhance a standard RNN. He showed how he merged the two concepts mathematically to improve the expressivity of the model. I believe this was more for our educational benefit than to develop or test what he had already achieved.
      I do get your point about back propagation, but that was not an aim of this exercise. No doubt when that is solved it may also, at some stage, be useful to merge that with the neuroscience-enhanced NN described here.

  • @DarkRedman31
    @DarkRedman31 2 роки тому

    Not clear at all.

  • @tismanasou
    @tismanasou 10 місяців тому +1

    If liquid neural networks were a serious thing, they would have gained a lot more attention in proper ML/AI conferences, not just TEDx and the shit you are presenting here.

    • @DanielSanchez-jl2vf
      @DanielSanchez-jl2vf 10 місяців тому +1

      I don't know man, the transformer took 5 years for people to take it seriously; why wouldn't this?

  • @niamcd6604
    @niamcd6604 10 місяців тому +1

    PLEASE.... Do you mind bothering to pronounce other languages correctly!!?
    (And before people jump up .. I speak multiple languages myself).

  • @tonyamyos
    @tonyamyos 2 роки тому +2

    Sorry, but you make so many assumptions at almost every level. You are biased, and your interpretation of the functionality and eventual use of this 'computational' model has nothing to do with how true intelligence arises. Start again. And this time leave your biases where they belong... in your professors' heads.

  • @ToddFarrell
    @ToddFarrell 2 роки тому +2

    Really it is just an interview because he wants to get a job at Google and make lots of money to serve ads :)