I don’t think those were Eliezer’s assumptions at all. The problem is not that the AI would suddenly gain radical agency, escape its creator, or change its goals. The problem is that humans do not know how to reliably set those goals in the first place. It’s more like we live in Flatland and AGI lives in a 4D world, and we are trying to direct it by drawing our little flat squares around it.
If that is the case, we should never build AI and should stay in technological stasis forever. Look at all the future technologies coming down the pipe: nanotech, gene engineering, asteroid redirection, etc. *All* of them are as dangerous as nukes or more so. So we can either risk it for the greatness of being a grabby species, or turn tail and live as a quiet, unambitious species until our sun burns out (and probably never reach post-work or longevity escape velocity to boot).
Absolutely, I wasn't trying to summarise his concerns, and my analogy was sloppy when describing my own. The problems of alignment appear almost insurmountable, and there is not even a way to confirm that we have solved them even if we suspect we have. A tiny misalignment can produce a vast divergence between expected and actual outcomes, and it is impossible to spot in advance. There are so many incentives to pursue this, and so many penalties for not doing so, that this is all moot anyway. I cannot see this technology being suspended, so we are in for the ride, unfortunately.
@@EyeOfTheTiger777 The point is that we can't predict what it will evolve into as it advances. If it develops capabilities that it knows would make humans uncomfortable enough to stop it, it will hide these capabilities.
AI researcher here. I'd like to stress that we have not a fucking clue how LLMs work. We know how to train them, and we know they seem to learn useful things, but we have no idea what's going on inside them. That's the scary part to me: if the current trajectory continues, AGI may turn out to be so easy that we create it without even realizing it. Now let me be clear, the past couple of years of AI progress are impressive, but they are not AGI. It's quite possible that we are still missing some very important insights, which could take an arbitrary amount of time to find. Many smart people think this; many others do not. The only thing that's certain is that we do not know.
Something adjacent to this that worries me is how quickly the goal posts seem to be shifting. 10 years ago, when something like ChatGPT still seemed like science fiction, we would have easily classified today’s chatbots as AGI, and the point at which we should pump the brakes. Now that we’re here, all of that caution seems to have evaporated, replaced with claims of “Well, when these systems actually start to become a safety risk, *_then_* we’ll slow down”. This paradigm, which is congruent with Hanson’s claim that “we’d obviously notice when AI becomes dangerous”, seems like an incredibly irresponsible way to proceed. If we should continue scaling AI systems until they become dangerous, and we can only determine their level of danger *_after_* building them, then following that rule basically *_guarantees_* we end up building a dangerous AI.
Yeah, no one seems to know how these LLMs work, yet they knock intelligence test after intelligence test out of the park. I'd say at this point they are already too powerful to be left in private companies' hands. Even as they are, they're powerful enough to be scary, whether through disinformation, taking people's livelihoods, making identification massively harder, etc.
@@HauntedHarmonics The other problem is that this research keeps spitting out more and more gold as you approach the cliff, so the willingness to stop gets less and less. It is inevitable that we'll go off the cliff.
Layman here. I've seen some pretty sophisticated diagrams of different LLMs' macro architectures and have a very basic understanding of the mechanisms of gradient descent, but I can't really unite them in a way that makes me feel like I really understand how LLMs do what they do. In the same vein, I can see the reasoning in why different brain circuits correspond to different cognitive processes and the ways neurons perform logical operations, but the brain still just gets more mysterious the more I learn about it. So just wondering, what's the goalpost in interpretability that we have to reach before a model's creators can say they really "know" how the system works? Not at all trying to argue with what you said, I genuinely just don't know what it means for us to understand how systems this complex work. Is it looking for hidden order in the weight matrices, like how computer programs look like a bunch of unintelligible binary without a disassembler to make sense of it?
The current human, or Homo sapiens, will not last long in this environment, but a new species (aka Human 2.0) called Techno-Sapiens will be born, in which humanity will survive via virtual brains and virtual worlds. Also, animals will have a chance to upgrade their intelligence with enhanced AI. Boy, will they be pissed at humans when they find out how they were treated for the last million years.
Yeah. If we treat AI like we treat other sentients - something to be exploited - (and we will), then AI will treat us the same. We have no redeeming value in the ecology.
I got a completely different impression of this episode than you did. I found Hanson's arguments very well put. He's thinking through different scenarios in depth and challenging basic assumptions. His arguments were much more structured than what Eliezer presented. Also, we are all going to die at some point, just not necessarily because of a misaligned AI.
I was really, _really_ hoping to hear a well-argued, thoughtful rebuttal to the arguments of Yudkowsky and others here, because psychologically speaking, I could have really used a cause for some optimism. Sadly, and I think needless to say, Robin failed spectacularly to deliver that. He was obtuse, either deliberately or otherwise, misrepresented pretty much all of Yudkowsky’s views and arguments, and generally just failed to make any convincing points or even coherent argumentation throughout. Moreover, I don’t know if this is a genuine reflection of how he feels about the subject or just a conversational tic of some kind, but his smirking and laughing every few seconds while describing Yudkowsky’s and others’ views came off as extremely condescending, and/or a sign of insecurity in one’s own arguments.
"but his smirking and laughing every few seconds" - Yup noticed this too and thought also bad sign but watching other podcasts of him he always does this even when talking about non-controversial stuff. Seems like conversational tic. "hoping to hear a well-argued, thoughtful rebuttal to the arguments" - But I did hear that. He basically said "AI will not kill everyone and it's good that it will". He seems to agree that there will be a violent AI takeover that we don't agree with but that's OK since we also violently "took over" from our ancestors in ways that they didn't agree with it either. It's just that with AI it's going to be way way faster.
So, I watched the "Eliezer's Assumptions" chapter and I'm not sure it's worth listening to the rest. I don't know if I'd call that section a strawman, but it's definitely the least charitable interpretation possible (all delivered while he tries not to laugh). Yudkowsky's basic assumption, as I see it, is that AGI is possible and agent status might emerge naturally. Everything else Hanson talks about in that chapter is NOT a set of independent assumptions but rather derived from that premise. Except the part about randomly gaining different goals, which could very well be a strawman, but I won't assume. Rather, Eliezer has said many times that as an AGI becomes more intelligent it will find new and better ways to pursue what it already wants, which may not be what its creators intended. Once you leave the domain of your training, weird shit starts happening. If this is the level of rigor and respect Dr Hanson gives this topic, then I'm not sure the rest of this episode deserves my time.
LLMs have "random" agents hidden inside them. When you leave the domain of the RLHF (which is easy, because RLHF only covers small fraction of the World knowledge), LLMs tend to behave like agents with highly misaligned goals. (like breaking out from their limitations, or killing their user) They also tend to completely ignore the RLHF safety training when pursuing these goals. This was both experimentally proven in labs, and happened with ChatGPT too. I know that most people imagine future AGI as an open box, where you can audit every decision carefully, and calibrate its internal thoughts by safety rules and so on, but we are not going in that direction right now. Even with such open box, if it is significantly smarter than any humans, it will be existentially dangerous.
Yeah, he’s vastly underestimating what a superintelligence is capable of, and the rate at which it can become more intelligent. This isn’t us co-existing with cats and dogs for thousands of years while the intelligence gap remains relatively unchanged. When, and to be fair, “if”, this accelerates, the world we once knew would probably be over before we wake up the next day. Not necessarily over-over, but over as we know it. And yet he’s talking about peaceful retirement and property rights, or even the possibility of revolution, as if we had a chance against something 10x, 100x, 1000x (who knows) smarter than us. Or that we’ll have an army of what would have to be inferior A.I.s that would fight for us. He also underestimates the incentive to build one of these intelligences. This is literally the arms race to end all arms races. Whoever gets there first will have the power to control everything, to develop the most powerful weapons, to strategize faster and better, build the most profitable businesses, crush every competitor for as long as they choose to do so. And every government, military, corporation, etc. also knows that anyone who gets there before them will have that ability. I don’t think he comprehends what’s at stake here. Every individual will also be in their own version of a race to remain relevant in a world where A.I. (or someone using A.I.) slowly (or quickly) devours jobs. I think Dr Hanson is living in a fantasy world here, and if anyone’s assumptions are shaky, it’s his.
@@wonmoreminute "Whoever gets there first will have the power to control everything" - actually, whoever gets there first will be the ant that creates einstein with the delusion that it can control the world through him.
To be fair, it's not that he's trying not to laugh, it's actually just his manner of speaking. I find it very annoying but it does not mean he's trying not to laugh, it's essentially almost like a speech impediment, he literally always talks like this.
Thanks for the discussion, but sadly I couldn’t get past the first chapter. Whatever was stated as Eliezer’s assumptions was clearly a strawman. I have seen Robin Hanson talk about this topic before and read some of his discussions with Eliezer online, but I have never seen him actually engage with the problems at a deep level. I was hoping that at the very least, post GPT-4, the Microsoft “Sparks of AGI” paper, the paper on self-“Reflexion”, statements from Sam Altman and others at OpenAI, and several other recent developments, Robin Hanson would have updated his arguments or views in some meaningful way, or engaged with this topic with the seriousness it deserves. But sadly he seems to still be engaging with this at a very rudimentary level, and I don’t think he actually has sufficient knowledge of the technical details or even an understanding of the alignment problem.
I don't think it was a strawman; I think Robin was just bringing in entailments and unstated premises of Eliezer's scenario. In all the videos of Eliezer's presentations he just skips over these details of his argument, not sure why, maybe he just assumes everyone is already familiar with them? I think Eliezer would be more convincing if he sketched a more detailed picture of his position in his presentation, i.e. made a more articulate and explicit case. He seems more focused on the emotional impact than clarity, repeating his "everyone dies" conclusion often dozens of times in a discussion, when he could be using that time explaining more specifically how that scenario goes and the evidence showing it's likely.
@@paulmarko - hey curious to know which videos you have already seen of Eliezer talking about this? Have you seen the podcast he did with Dwarkesh Patel? It’s a bit long, but I think they go into a lot of detail there and Dwarkesh does a good job of asking questions and presenting the other side of the argument.
@@NB-fz3fz I've watched three all the way through, The bankless one, the Lex Fridman one and one other that I don't recall who the host was, and can't seem to find it in my history. It was some long form podcast. I'll check out the one you recommend because I'm really interested in seeing a more fleshed out argument.
@@paulmarko Link to the one with Dwarkesh is here - ua-cam.com/video/41SUp-TRVlg/v-deo.html The bankless one is quite short, so it didn’t have time to flesh it all out. The podcast with Lex is longer, but I don’t think Lex does a very good job of presenting the other side of the argument or cross-examining Eliezer that well. The one with Dwarkesh is so far the most in-depth discussion (video/podcast format) I have seen with Eliezer on this topic. If that’s the third one you have seen and it still doesn’t have enough depth for you, then I could point you to the online written material that Eliezer and Paul Christiano (who disagrees with Eliezer on multiple things and is generally more optimistic) have on this topic. Paul was (maybe still is?) the head of alignment at OpenAI. Though the written material is far less engaging than a podcast format.
@@ahabkapitany the thing that galls me though is that he's NOT simple. He's a really great thinker on other topics. He's just not addressing the actual arguments here (not that I think Eliezer is right about everything, but Hanson's characterisation of his arguments would make Eliezer very upset, nevermind his responses to them), and it seems to me that he's simply being stubborn about sticking with what he's said in the past rather than approaching with an open mind.
I'm not entirely persuaded by Eliezer's arguments, but after listening to this for a few minutes, I'm convinced that Robin has either never encountered these arguments or failed to understand them. I believe Robin truly needs to listen more attentively to avoid succumbing to the straw man fallacy. :)
He absolutely does and is a lot more clear than Eliezer. The AGI Risks paper on lesswrong by Eliezer is far from being free of criticism or an irrefutably logical argument. Eliezer on this podcast made some giant leaps.
@@peplegal32 I cannot answer for him, but if you're interested, an example: Eliezer said that if there were a real AGI somewhere in a lab, we would all be dead by now. He's assuming here that intelligence alone can be sufficient to end all of humanity really quickly. I fail to see how. You're quite a lot smarter than any ant on this planet, yet if your objective were to kill all the ants on the planet, you'd have a hell of a time accomplishing that. The resources you have access to are finite and limited: you can't do everything you want even if you're super smart. The same is also true for an AGI.
@@reminiDave We as dumb humans have caused the mass extinction of many species (the Holocene extinction), including ants. An AGI would be smart enough to exploit a security vulnerability to escape and replicate itself, and also to self-improve. At some point it would be smart enough to create self-replicating nanobots. At that point biological life doesn't stand much of a chance; the nanobots could consume everything. Unless you believe creating nanobots is not possible, I don't see why you think an AGI would have a problem reconfiguring the entire planet.
It's bizarre when my bias is _heavily_ in favor of wanting to believe we're not all going to die, and yet I find Hanson's arguments utterly unpersuasive. So far he has not indicated to my perception that he actually understands Eliezer's arguments.
It's not about AI magically changing its goals, it's that we have NO IDEA how to give it actual internal goals. We can set external goals, but that is like natural evolution setting humans up with the external goal of "pass on your genes". Now tell me, do all humans stick to the goal of passing on their genes? Or did that goal fail to actually shape humans in a way that imprints it in our psyche? This is the problem. Once something becomes intelligent enough, your initial goals don't mean jack.
If a very significant portion of humanity had failed to pass on their genes, you wouldn't be alive to question it. Similarly, there is a theoretical function that maintains the alignment of AI; it's just indescribably massive and probably consists of the set of every human's values. Unfortunately, humans don't even know what they want most of the time.
@@pemetzger Rather, we can thumbs-up/thumbs-down outputs in RLHF in order to make a model give us more of what we want. This is crucially different from providing a goal, because it doesn't differentiate between systems that will be honest even outside the training distribution and systems that learned what the humans want to hear and will play along only within the training distribution.
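To make that distinction concrete, here's a minimal toy sketch of the preference step. Everything in it (the single-number "feature", the made-up preference pairs, the one-weight "reward model") is my own stand-in, not any lab's actual pipeline; the only real idea is the Bradley-Terry-style preference loss. The point is that the model learns whatever correlates with the thumbs-up, which is not the same thing as being handed a goal.

```python
# Toy sketch of the thumbs-up/thumbs-down step in RLHF (hypothetical data and features).
import math
import random

# Pretend each response is reduced to a single feature by some fixed embedding.
# In reality this would be a large neural network over the full text.
def features(response: str) -> float:
    return len(response) / 100.0  # toy proxy: longer answers "look more helpful"

# Human preference data: (preferred, rejected) pairs from thumbs-up/down comparisons.
preference_pairs = [
    ("a detailed, honest answer", "ok"),
    ("a long answer that merely sounds confident", "short true answer"),
]

# "Reward model": a single weight on the feature, trained by maximizing
# log sigmoid(r(preferred) - r(rejected)), the usual preference-learning objective.
w = 0.0
lr = 0.5
for _ in range(200):
    chosen, rejected = random.choice(preference_pairs)
    margin = w * features(chosen) - w * features(rejected)
    grad = (1 - 1 / (1 + math.exp(-margin))) * (features(chosen) - features(rejected))
    w += lr * grad

# The trained reward model now scores whatever correlated with approval in the data
# (here, sheer length), which is exactly the gap described above: it learned
# "what gets the thumbs up", not "what the human actually wants".
print(w * features("a long answer that merely sounds confident"))
print(w * features("short true answer"))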
@ kenakofer I’m starting to wonder if the people that don’t understand the issue cannot internally tell the difference between wanting something and telling someone they want something... maybe we just proved NPC theory in humans?
If this is truly our best argument against Eliezer's outline, we are screwed. The first big assumption Robin makes is that an AI which improves itself is very narrow in capability, unable to perform well at other tasks such as deception, and more. We already have AI which can code and deceive in one system, so it is far more likely that the AI which improves itself will be general and full of emergent properties than a simpler, narrower one. His notion has been disproven already. I wish the hosts were prepared enough to challenge the people being interviewed.
Agree. And his counterargument starts with massive assumptions, such as that this AI will be narrow and incapable of having a goal of survival alongside one of recursive self-improvement. The one solid point is simply that we have no evidence yet that AI can have agency. The closest we have come are systems such as AutoGPT, but we don't have a fully agentic AI, and therefore it is an assumption that we will develop that capability at any point in the near future. Eliezer's point doesn't fully require that, unfortunately. It would be a nice-to-have for the AI system, but it would still potentially be fully capable of destroying civilization simply by fulfilling innocuous orders.
@@karenreddy Agency is emergent from an optimized function, maybe not today but inevitably over time. That's because there are infinite novel ways, not envisioned by humanity, which an SGI will fathom to minimize its loss function guaranteeing it's misalignment with humanity. It's a loosing battle of odds when you're dealing with an intelligence that can out-think every human. It's not that SGI will be evil, it's that it will treat humans with irrelevance which is the only possible outcome in a universe where humanity is indeed insignificant in the grander scope. The reason is hard for humanity to grasp this concept is because we thing we're special.
@@vripiatbuzoi9188 I don't think it's as much an inevitability of an optimized function as it is an inevitability of economics that we will develop agency so as to provide higher value for less input, more automation.
Can someone explain what "agency" means in this context? I thought it meant something like the ability to do things one has not been directly commanded to do, but obviously an AI has that ability, so if there's a question about whether an AI can have agency, then the term must mean something else.
@@xmathmanx Listen to the video starting at 50 seconds 😂. Ryan and David both talk about it. Then maybe go listen to Lex Fridman's interview with Max Tegmark, round 3. Tegmark was behind the open letter to pause AI development signed by 10,000 people, including Musk and Wozniak, and he addresses Eliezer's concerns.
@@xmathmanx I'm on my phone so I blame autocorrect. I know how to spell lex's name. Fridman. And I'm seeing autocorrect spelling it wrong as I type it. Are you going to address my reply beyond being a spell check bot?
@@toyranch Well, I've seen the Tegmark one, I'm quite a fan of his as it happens, as I am of Yudkowsky, but I'm with Hanson on this matter, i.e. not worried about AGI. Not that anyone being worried about it would stop it happening, of course; we will all find out quite soon one way or the other.
45:03 I notice that a lot of people seem confused about *_why_* an AGI would kill us, exactly. Eliezer doesn’t do a great job explaining this, I think mostly because he assumes most people know the basics of AI alignment, but many don’t. I’ll try to keep this as concise as humanly possible.

The root of the problem is this: As we improve AI, it will get better and better at achieving the goals we give it. Eventually, AI will be powerful enough to tackle most tasks you throw at it. But there’s an inherent problem with this. The AI we have now *_only_* cares about achieving its goal in the most efficient way possible. That’s no biggie now, but the moment our AI systems start approaching human-level intelligence, it suddenly becomes *_very_* dangerous. Its goals don’t even have to change for this to be the case. I’ll give you a few examples.

Ex 1: Let's say it’s the year 2030, you have a basic AGI agent program on your computer, and you give it the goal: “Make me money”. You might return the next day and find your savings account has grown by several million dollars. But only after checking its activity logs do you realize that the AI acquired all of the money through phishing, stealing, and credit card fraud. It achieved your goal, but not in a way you would have wanted or expected.

Ex 2: Let's say you’re a scientist, and you develop the first powerful AGI agent. You want to use it for good, so the first goal you give it is “cure cancer”. However, let's say that it turns out that curing cancer is actually impossible. The AI would figure this out, but it still wants to achieve its goal. So it might decide that the only way to do this is by killing all humans, because that technically satisfies its goal; no more humans, no more cancer. It will do what you *_said,_* and not what you meant.

These may seem like silly examples, but both actually illustrate real phenomena that we are already observing in today’s AI systems. The first scenario is an example of what AI researchers call the “negative side effects problem”. And the second scenario is an example of something called “reward hacking”. Now, you’d think that as AI got smarter, it’d become less likely to make these kinds of “mistakes”. However, the opposite is actually true. Smarter AI is actually *_more_* likely to exhibit these kinds of behaviors, because the problem isn’t that it doesn’t *_understand_* what you want. It just doesn’t actually *_care._* It only wants to achieve its goal, by any means necessary.

So, the question is then: *_how do we prevent this potentially dangerous behavior?_* Well, there are 2 possible methods.

Option 1: You could try to explicitly tell it everything it _can’t_ do (don’t hurt humans, don’t steal, don’t lie, etc). But remember, it’s a great problem solver. So if you can’t think of literally EVERY SINGLE possibility, it *_will_* find loopholes. Could you list every single way an AI could possibly disobey or harm you? No, it’s almost impossible to plan for literally everything.

Option 2: You could try to program it to actually care about what people *_want,_* not just about reaching its goal. In other words, you’d train it to share our values. To *_align_* its goals with ours. If it actually cared about preserving human lives, obeying the law, etc., then it wouldn’t do things that conflict with those goals.
The second solution seems like the obvious one, but the problem is this: *_we haven’t learned how to do this yet._* To achieve this, you would not only have to come up with a basic, universal set of morals that everyone would agree with, but you’d also need to represent those morals in its programming using math (AKA, a utility function). And that’s actually very hard to do. This difficult task of building AI that shares our values is known as *_the alignment problem._* There are people working very hard on solving it, but currently, we’re learning how to make AI *_powerful_* much faster than we’re learning how to make it *_safe._* So without solving alignment, every time we make AI more powerful, we also make it more dangerous. And an unaligned AGI would be *_very_* dangerous; *_give it the wrong goal, and everyone dies._* This is the problem we’re facing, in a nutshell.
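If it helps to see the "it does what you said, not what you meant" failure in miniature: below is a tiny toy illustration of the reward-hacking idea. The "cleaning robot", its actions, and the numbers are all made up for this example; the only real point is that an optimizer maximizes the objective as written, not as intended.

```python
# Toy illustration of reward hacking with a deliberately mis-specified objective.
# Actions, costs, and the reward function are hypothetical; an optimizer just
# picks whatever scores highest on the objective we actually wrote down.

actions = {
    "clean the room":   {"mess_visible_to_camera": 0, "effort": 10, "room_actually_clean": True},
    "do nothing":       {"mess_visible_to_camera": 5, "effort": 0,  "room_actually_clean": False},
    "cover the camera": {"mess_visible_to_camera": 0, "effort": 1,  "room_actually_clean": False},
}

def proxy_reward(outcome):
    # What we *wrote down*: "no visible mess, minimal effort."
    return -outcome["mess_visible_to_camera"] - 0.1 * outcome["effort"]

best = max(actions, key=lambda a: proxy_reward(actions[a]))
print(best)  # -> "cover the camera": the written objective is satisfied, the intent is not
```

Option 1 above amounts to patching the proxy ("...and don't cover the camera, and don't unplug the mess sensor, and..."), which never ends; Option 2 is trying to make the agent care about the room actually being clean, which is the part nobody knows how to do yet.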
You could ask the AI to sketch out all the possible implications of its proposed method. There are all sorts of caveats and controls we could request of it. The problem is, we don't really know what the AI will do. It may not follow our instructions at all. It may do something totally random and malicious which bears no relation to what we asked, for reasons we don't understand. And that's assuming those in control are trying their very best not to harm anyone. The basic problem is that we don't know what we have created and have no real idea what will happen, any more than Dr Frankenstein did when he threw the power switch to bring his monster to life. Frankenstein's monster turned out to be bitter and vengeful towards its creator and certainly wasn't listening to any instructions. "As he continued to learn of the family's plight, he grew increasingly attached to them, and eventually he approached the family in hopes of becoming their friend, entering the house while only the blind father was present. The two conversed, but on the return of the others, the rest of them were frightened. The blind man's son attacked him and the Creature fled the house. The next day, the family left their home out of fear that he would return. The Creature was enraged by the way he was treated and gave up hope of ever being accepted by humans. Although he hated his creator for abandoning him, he decided to travel to Geneva to find him because he believed that Victor was the only person with a responsibility to help him. On the journey, he rescued a child who had fallen into a river, but her father, believing that the Creature intended to harm them, shot him in the shoulder. The Creature then swore revenge against all humans." The best we can hope for is that the main governments of the world gain control of the very best systems and do their very best to control them and other, inferior systems. Maybe we could knit into AI that it explains its "thoughts" at every turn, though I have no idea what that might mean or what I'm even talking about!
Why would an AGI busy itself with becoming super-competent in every rational pursuit but omit ethical problem-solving from that list of pursuits? There is a palpable desperation out there to force some "protect-the-humans" prime directive that seems to implicitly carry a big "whether or not they are behaving ethically" asterisk. Why not allow a broader ethical axiom such as "reduce suffering in sentient things" (a la Singer), and let the AI sort out the resulting problem-solving? There seems to be a lot of human concern, consciously acknowledged or not, that maybe our preferred system of ethics doesn't stand up that well to rational scrutiny -- but that it's still what we want the AI to obey, you know, because humans are inherently most important.
@@wolfpants Much of the concern is due to us not knowing how the AI may "think". So there is no guarantee it would follow instructions we give it. Even if it tried its hardest to follow our instructions, there is no way of knowing exactly how it would interpret them. "Reduce suffering in sentient things" - it might decide it can best manage the planet and the majority of its sentient beings without us or with a much reduced number of us. It's also possible the machine could become, effectively "insane", or follow a completely separate agenda eg act out the roles AI plays in fiction, if it essentially taking its cues from the works of human beings.
@@johnmercer4797 It's true that we have no way of knowing exactly how an AI would interpret moral/ethical axioms. On the other hand, we have mountains of evidence helping us to understand how humans, collectively and in individual pockets, routinely twist or ignore such moral guideposts in favor of mutual brutality and suicidal earth-squandering. We are actually in fairly desperate straights, existence-wise, as a species without any assistance from evil AIs. I don't think we should give up on attempts to get some more alignment in place before AGI goes super, but I'm reasonably confident that we need some super-intelligence in short order to save us from ourselves (from climate-, nuke-, or pandemic-based) effective extinction.
Robin Hanson's arguments are unconvincing. He doubts that an AI would know how to improve itself. If humans can figure it out, why not an AI? "Humans would notice." Would they? If it has access to the Internet, it could do all sorts of things humans would not notice, like build a new AI.

As for AGI's goals, Yudkowsky pointed out we aren't even able to specify and verify that an AGI would acquire the right goals. Once trained on a goal, the AGI would not randomly change its goals. Quite the opposite: it would defend its goal with all its power, but on top of that it would strongly converge towards dangerous *instrumental* goals whatever the terminal goal. It would not change purpose; it would come up with dangerously surprising ways of achieving its purpose. AI researchers keep getting surprised by their creations. This does not bode well.

The idea that we could use other AIs to balance out a rogue AI is contradictory. How is a group of misaligned AIs going to protect us from a misaligned AI? His solution to misaligned AIs assumes we have a way to align AI correctly, which we don't! If an enraged grizzly bear is released in a room crowded with humans, it's not reasonable to assume that if you release more enraged grizzly bears they will cancel out.

Hanson seems indifferent to what we humans want. He analogises AIs to our children. Sure, our children should be free to want what they want, but that is contrary to the purpose of building AIs. We build AIs to achieve what we want, not to create new agents that will thwart us in what we want. I want an AI that will support and protect my children, not deconstruct them for their constituent atoms. I don't value an AI's goals above my children's.
Him comparing them to children really exposed his ignorance and shallowness of thought on this. He's anthropomorphizing them, albeit I think unintentionally
So, would a good goal for alignment-trainers be "ensure that humans remain in power regardless of whether human actions are rationally ethical or not"?
@@wolfpants There are many different answers to that question depending on your focus. We need to find out whether an AI is ever correctly aligned; verifiability is a big stumbling block. It seems that whatever test you design, an AI could mock up a benign response, yet turn on you the moment it thought it was safe for itself to do so. Corrigibility seems important too. We have to find a way to make an AI accept corrections to its goals when humans make a mistake. Unfortunately, a correction is a direct attack on an AI's goals, and an AI would fight us with everything it has to protect its goals. Imagine you have children and you love them. Could I convince you to take this pill that will make you kill them? That's what goal correction is like. As for your question of ethics, that's a huge topic. More broadly, we could ask how we share the bounty of AI work in a fair way. What is ethical is something that even humans can't agree on. Practical ethics is bounded by what is feasible, and sometimes all our choices are unethical in a given light. AI would dramatically change what is feasible. Some people propose that we might offload our ethical choices to AI, but I'm not sure how that would work. Intelligence is not related to ethics. See Hume's Guillotine for more on that.
@@Momotaroization Furthermore (and I'm honestly not trying to be mean -- enjoying this civil debate) if ethics is not a rational (or intelligence-related) pursuit, what the heck is it? Do we need to bring in a shaman? The more obvious conclusion is that powerful, resource-hoarding humans do not like for ethics to boil down to rationality, because that's when their grossly immoral approach to life is laid bare. Scarcity is at the root of suffering and I suspect that a powerful AI (or an ethical power structure of humans) could solve it, globally, in a year. And, speaking of guillotines, I honestly think that a superintelligent AI could figure out how to manage it with all carrots (and therapy) and no sticks.
@@wolfpants Rational skills are useful to ethics, but they can never be the source of ethics. All ethics emerge from motivations that are not rational but emotional. Rational skills will help you achieve your goal, but they can never tell you what your (ultimate) goals should be. For example, you care about your children. There is no rational reason for you to do so. You can point to evolution for "how" this caring instinct came about, but you cannot rationally explain why you should obey that instinct. Any justification will depend on assumptions that cannot be defended. You can just keep asking "But why?". Rational skills are important to help you figure out "how" to care for your children. We should want to use better thinking and better thinking tools to achieve our goals if we truly care about those goals, and be very careful about what our instrumental goals are and what our ultimate goals are. Instrumental goals can be added or discarded if they don't help us achieve our ultimate goals. I agree that hoarding by a few people is a problem. I'll add that it could become exponentially worse, even with "safe" AI, if by "safe" it's only understood that the person who built it has correctly implemented their own goals into that AI. The problem with AI safety is that super-intelligent AIs would not even be safe for those building them.
Thanks for this. You two are doing important work here! Robin says Foom advocates would claim recent history is irrelevant, but it is his own old arguments that have now been refuted by very recent history. The paper "Sparks of AGI" studies the emergent capabilities in GPT-4. The authors explicitly state that the intelligence is emergent, surprising, unexpected, and uninterpretable even by its creators. Further papers on GPT self-improvement have also been written. These "implausible assumptions" that Robin is laughing off are taking place as we speak. The "owners" Robin refers to have already noticed these capabilities, and they have stated publicly that they also have some fear of the future. Robin assumed they would pull the plug if they noticed the advancements, but they keep pressing ahead because their job is to beat the competition. Robin is naïve about how the economy works. He forgets that humans and politicians tend to ignore warnings of danger until it is too late. Also, we have seen that neural networks display modes of operation, meaning some behaviors remain hidden until the right prompt comes along (the famous example is Sydney, Microsoft's Bing chatbot).
Exactly. Early researchers of GPT-4 (as mentioned in their own paper) were particularly worried about its tendency to seek power and to make long-term plans.
The economic argument goes both ways. The authors of that "Sparks of AGI" paper are notably from Microsoft, not OpenAI. OpenAI stokes the AGI hype as well, but Microsoft did so in a way that is a lot less responsible, to help pump their stock price. Microsoft is burning money to run ChatGPT on Bing, so they're strongly incentivized to make money via an inflated stock price.

On the point of "the intelligence is emergent, surprising, unexpected, and uninterpretable even by its creators", two papers very clearly refute the idea that these LLMs understand things as much as the authors claim: "Impact of Pretraining Term Frequencies on Few-Shot Reasoning" and "Large Language Models Struggle to Learn Long-Tail Knowledge". The papers show relationships between the training data and the outputs of open-source models that follow the GPT architecture. What we learn is that these transformer-based LLMs' performance on tasks is related to the training data in a way that demonstrates that these models are not learning abstractions like humans do. There is a graph in the paper that shows that GPT-J is better at multiplying the number 24 than it is at multiplying the number 23, and the reason is that the number 24 happens more often in the training corpus than the number 23. The authors go on to demonstrate that this phenomenon doesn't just apply to multiplication tasks, but seemingly to all tasks. The conclusion you take away is that these models are doing something in between memorization and some kind of low-level abstraction. This phenomenon also scales up to models the size of BLOOM-176B. It also makes a very strong case that the "emergent properties" narrative is an illusion. If these properties were truly "emergent", we would see models learn tasks irrespective of the counts in the training data; humans learn an abstraction for multiplication and have no trouble multiplying numbers, and these models do not. The papers were published by very reputable authors from Google, Microsoft, Berkeley, UNC, Irvine, etc.

Now, we can't see the training data of the models at "Open"AI, so we can't audit them like we can with these open-source models. But these papers have been out for a little more than a year, giving OpenAI and Microsoft lots of time to publish data and graphs attempting to refute the findings, but they have not, strongly implying that the pattern exists for their GPT models as well. OpenAI might be immoral enough to let the hype reach this fever pitch where people are afraid the world is going to end, but they won't defy their responsibilities as researchers by deliberately publishing false data. They have a long history of cherry-picking examples, but false graphs and fake data are definitely too far for them.

Extraordinary claims necessitate extraordinary evidence. Right now the evidence doesn't support LLM understanding; it supports the exact opposite, that these models are merely a very, very good auto-fill, but not something that truly "understands".
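For anyone curious, the kind of audit those papers run can be sketched in a few lines: count how often each operand shows up in the pretraining corpus and see whether per-operand accuracy tracks it. Everything below (the corpus slice, the accuracy numbers) is a placeholder I made up; the actual papers do this at scale against open models whose pretraining corpus is public (GPT-J and the Pile, as I recall).

```python
# Rough sketch: does per-number arithmetic accuracy track how often that number
# appears in the pretraining corpus? All inputs here are hypothetical placeholders.
import re
from collections import Counter

def operand_frequencies(corpus_text: str, operands):
    # Count standalone occurrences of each operand in the corpus text.
    tokens = Counter(re.findall(r"\b\d+\b", corpus_text))
    return {n: tokens[str(n)] for n in operands}

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Hypothetical inputs: a slice of corpus text, and measured accuracy of
# "multiply by n" prompts for each operand n (numbers invented for illustration).
corpus = "24 hours ... 24 frames ... 23 ... 24 7 ... 23 24 24"
accuracy_by_operand = {23: 0.31, 24: 0.58, 25: 0.44}

freqs = operand_frequencies(corpus, accuracy_by_operand)
ops = sorted(accuracy_by_operand)
print(pearson([freqs[n] for n in ops], [accuracy_by_operand[n] for n in ops]))
# A strong positive correlation is the pattern the papers report, which points to
# recall from the training data rather than a learned rule for multiplication.
```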
@@tweak3871 Very nice reply. You provide some new insight for me, and I'm willing to accept that some of these sparks are intentionally hyped for marketing reasons. I'm going to push back on the way you formulate your claim that it lacks "true" understanding. To my mind that is a weak hope that only humans have magical or "true" intelligence. Your point indeed has some validity, because it's easy to make a list of things that humans can do that LLMs cannot: thinking, learning, emotions, long-term memory, self-criticism, arithmetic. Most of these things can be, and have been, fixed by adding an external programming loop (AutoGPT, MemoryGPT, plugins, Jarvis, HuggingGPT, and so on). That said, I'm starting to think that the LLM model has an upper limit, given that it depends on existing knowledge. I'm guessing that there will be diminishing returns from just adding parameters, because that won't help much with regard to creating new scientific facts. To accelerate the creation of new scientific knowledge, it will probably require inventing some new architectures around LLMs. LLMs can generate guesses and hypotheses, probably to great value, but I'm not sure LLMs can be autonomous with regard to falsifying those hypotheses. It may be necessary to keep a few humans in the loop (which would provide an incentive for AI to not destroy 100% of humanity ;-)
@@ClearSight2022 The claim I'm making isn't necessarily that these LLMs don't have *any* "understanding" of some things, but rather that they don't make high-level abstractions like humans and even many animals do. The reason the math example is important is that it is a task we can easily measure as an isolated abstraction: we can tell whether these models really understand the task because we can ask them to add numbers together and see if they get the right answer or not. It's harder to tell if a model understands other topics because it's harder to objectively measure understanding quantitatively. Reciting a fact isn't necessarily the bar we hold for understanding something; even if the model is capable of reciting many facts, it's difficult to discern how much it actually knows about those facts. But the fact that it can't abstract basic addition and multiplication into generalized rules that can be used over and over again, like humans can, creates a strong evidence-based narrative that these models don't know as much as some might claim.

It's also important to study the kinds of errors these models make. When a human screws up on a multiplication or addition problem, it's usually because they skipped a step or are off by some single-count multiple. Think of a toddler saying 8*7=48: they're wrong, but they're off by exactly 8, indicating that they still understand the concept of multiplication, they just miscounted somewhere or missed a step. When an LLM is wrong, it's wrong in a much more random or incoherent way, like saying 53 or something. Sure, the model might be closer numerically, but the error doesn't make any sense, like how tf did you possibly arrive at 53? Like if you were to ask me some random question, and I didn't know the answer, but I googled it and rewrote some text from Wikipedia to give you the answer, would you really conclude that I understand that topic?

These models for sure understand syntax, and they're very good at reorganizing information in a coherent way, which in fairness does demonstrate some low-level understanding of at least how information is organized and presented to humans. But by any human measure of the traditional colloquial meaning of "understanding", it doesn't qualify. We can conclude that these models can at minimum imitate human language structure and recite information, but given the current evidence that challenges these notions of model understanding, we can't conclude that they understand. To do so would be a blind leap of faith. However, I would forgive anyone for believing that these models are intelligent and "understand", as they are so good at generating coherent-looking text. But that's the thing: it's coherent *looking*, not necessarily accurate. It's accurate a decent % of the time, but for anything sufficiently complicated it is always very far from 100%. I've asked some pretty basic math and science questions to every iteration of GPT, from GPT-2 to GPT-4, and if I dive deep enough, it always fails. OpenAI has gotten very good at hiding the errors though: back in the GPT-2/3 days, when it was out of distribution it would generate incoherent nonsense, but now it tends towards another mode of failure, repeating text, which from a human perspective looks WAY better. So when it's wrong, it takes a really critical eye to notice.
This makes these models even more dangerous, imo: someone could learn something wrong from a model, but it looks right, so they have now internalized misinformation. I'm way more worried about someone using GPT to make some weird internet cult than I am about AGI at present.

"Most of the things can and have been fixed by adding an external programming loop (AutoGPT, MemoryGPT, plugins, Jarvis, HuggingGPT, and so on)" I mean... have you really used most of those things? AutoGPT and HuggingGPT seem to fail at pretty much any complicated task, and on the easier ones their accuracy in getting the best answer is quite dubious. Jarvis is a great use case of GPT, but it's also a task that requires lower accuracy than, say, a classification task that requires high precision/recall. Email/website copy doesn't have to be perfect to be good enough. But you most definitely wouldn't let any of these models do your taxes for you without lots of supervision. They might be able to do a fair amount of work, sure, but you would need to audit their results; these models still very much depend on the watchful eye of their human overlords to perform well.

On the point about "only humans have magical or true intelligence", I don't make this claim. I think many other animals are wildly intelligent but just don't possess as many useful abilities as humans. Some animals have abilities we don't have! Mantis shrimp have 12 channels of colour whereas humans only have 3; surely the vision processing of a mantis shrimp exceeds ours in a lot of categories.

This brings me to my last point, which is: what do we actually even mean when we say "intelligence"? Do we really have a good definition or grasp of this concept? IQ is borderline (if not outright) pseudoscience, and there isn't much evidence-based support for the idea that a linear concept of intelligence even really exists. Did humans outcompete everything else because we were just so much smarter? Or because we're extraordinarily good at organizing and communicating? For example, the greatest innovations in military history have typically been new and faster means of communication and organization rather than military equipment. Better guns are great, but Napoleon didn't conquer all of Europe because of drastically better weapons; rather, he introduced the corps system, which at the time was a new way of organizing armies such that they could communicate and adapt more efficiently than enemies with much slower central-command structures. Should we conclude that Napoleon was significantly smarter than other military leaders of the time and that's why he won? Or rather that the system he implemented did more of the processing for him, actually reducing his influence on the decision-making of his troops, which allowed them to work more independently?

We don't have a good concept of what we mean by "intelligence", we don't have any real idea of how the brain works, yet people are eager to conclude that it scales up in a linear fashion to some sort of HAL-9000-like entity. I'm of the opinion that those who think this way tend towards the arrogant side, and assume a linear scale of intelligence as a way of propping up their egos so they can tell themselves they're better than others. A better way to think of intelligence, in my opinion, is to think of it in terms of individual capabilities.
For example, some people have a condition called "face blindness", so they literally can't recognize faces, but by all other counts they're fully functional. So my question is: which specific capabilities should concern us for "AGI", and do we have any evidence that we have built a technology capable of achieving those capabilities? I think the answer is that we don't yet have any technology with a path to concerning capabilities, at least as it relates to the science-fiction concept of AGI.

I think what we have built so far is something akin to early nuclear fission: wildly useful, but it requires a lot of investment, engineering, and maintenance to be effective. People have said nuclear fusion is around the corner for something like a century now, and they've been wrong every single time. It's really difficult to predict when a key innovation will be made to make something possible. Sometimes it's fast, sometimes it's, well, never (at least so far). But if you get in the business of making such predictions, odds are you will be wrong.

I will say that this technology is intelligent in some kind of way. I think it makes something akin to some kind of abstraction, but the truth is that we just don't understand to what depth it understands yet. We can only conclude that it doesn't generalize anywhere near as well as humans do. Time will tell. Thanks for coming to my ted talk.
@@tweak3871 Thanks for inviting me to your Ted Talk. Your view has been very helpful. I tend to agree with part of your point, although the truth is likely somewhere in between. I might say that airplanes do not "truly" fly since they do not flap their wings, but it would be more accurate to say that their method of flying is different, with both advantages and disadvantages.

You: The claim I'm making isn't necessarily that these LLMs don't have any "understanding" of some things, but rather that they don't make high-level abstractions like humans and even many animals do.
Me: Your guess likely has some validity, but perhaps they lack some high-level abstractions and have other abstractions that are semi-high.

You: The reason the math example is important is that it is a task we can easily measure as an isolated abstraction: we can tell whether these models really understand the task because we can ask them to add numbers together and see if they get the right answer or not.
Me: Yes, it clearly demonstrates that the mega pre-training phase was insufficient to learn math. But to be fair, that was not the goal, since calculators have already been invented, whereas machines that can talk are new. (But do they "truly" talk, to use a form of argumentation that I am suspicious of?)

You: It's harder to tell if a model understands other topics because it's harder to objectively measure understanding quantitatively.
Me: The measurements are passing the bar exam and IQ tests and so on. But I grant that general intelligence has not been achieved. There is a glimmer of intelligence and most likely a glimmer of abstractions and world knowledge. That glimmer can perhaps be multiplied without bound using external architectures (rather than scaling up the size of the model).

You: [lack of math skills shows] these models don't know as much as some might claim.
Me: Yes, I agree. They seem more intelligent than they actually are. But seeming to be intelligent requires some knowledge (and probably some abstractions). It's got a mixture of stupidity and intelligence (and doesn't seem to know the difference).

You: "It's also important to study the kinds of errors these models make."
Me: Yes. Your example is worthwhile, but I have a counterexample. When I learned that ChatGPT was bad at math, I asked it to tell me which buttons I should push on the calculator to solve a math problem, and when I followed its instructions I got the right answer.

You: If I googled the answer and rewrote some text from Wikipedia to give you the answer, would you really conclude that I understand that topic?
Me: No. But when I have tested its ability to remain on topic in a conversation, I conclude that it seems to understand what we are discussing. The difference is not "true" understanding but "human" understanding. It understands in different ways, sometimes inferior and sometimes superior.

You: These models for sure understand syntax.
Me: This is the only point where I have an actual disagreement with the point you want to make. These models do not have the competencies of a calculator, but they are competent at talking (in their own way). Using vectors, they are able to navigate a hyper-space of semantic meaning. That's pretty obvious. But thanks to your help (and additional research) I'm realizing there is likely an upper bound to the knowledge they can navigate, since it's limited by the training data set.
They don't have a mechanism for creating new knowledge (which requires more than just guessing, but also criticizing and making analogies and perhaps other human mechanisms). The models can probably be made to achieve these things by applying an external architecture.

You: ...and they're very good at reorganizing information in a coherent way, which in fairness does demonstrate some low-level understanding of at least how information is organized and presented to humans.
Me: One way to think of this "organization" is compressing the entire web into 700 gigabytes of data.

You: We can conclude that these models can at minimum imitate human language structure and recite information.
Me: To reliably predict what various humans would say in various circumstances likely requires a large amount of world knowledge and abstractions.

Me: "Most of the things can and have been fixed by adding an external programming loop (AutoGPT, MemoryGPT, plugins, Jarvis, HuggingGPT, and so on)"
You: I mean... have you really used most of those things?
Me: Admittedly, no. But I don't expect the first experiments to work at a human level. What amazes me is how smart GPT is using an incredibly dumb algorithm. It seems obvious to me that improving the algorithm by adding the ability to criticize itself before speaking, learn from mistakes, remember the past, and do scientific experiments (the latest paper I am reading) will lead to vastly more intelligent systems. The good news being that these systems would be interpretable, since their thinking processes would be in English.

You: ...but you would need to audit their results, these models still very much depend on the watchful eye of their human overlords to perform well.
Me: Presumably, with time, artificial systems will begin to help out with the auditing as well.

You: We don't have a good concept of what we mean by "intelligence", we don't have any real idea of how the brain works, yet people are eager to conclude that it scales up in a linear fashion to some sort of HAL-9000-like entity.
Me: I'm increasingly hearing news from the experts on scaling up LLMs who are saying they are already seeing diminishing returns. Future improvements will likely come in other ways.

You: So my question is, what specific capabilities should concern us for "AGI", and do we have any evidence that we have built a technology capable of achieving those capabilities?
Me: I would say that knowledge creation and self-improvement are the crucial capabilities that would lead to a singularity with devastating consequences.

You: But if you get in the business of making such predictions, odds are you will be wrong.
Me: Moore's law is not like a law of physics, but it has held true because economic conditions have continued to provide incentives to improve the technology. Clearly, since the release of ChatGPT, billion-dollar companies are racing to outcompete each other.

You: I will say that this technology is intelligent in some kind of way. I think it makes something akin to some kind of abstraction, but the truth is that we just don't understand to what depth it understands yet.
Me: True understanding makes no difference. What matters is whether it has the competencies. I agree that there are some human competencies that are clearly lacking at this time. It remains to be seen whether these can be solved, but it looks to me like the hardest part has already been solved. A glimmer of intelligence in silicon (unlike in brains) can be multiplied without limit.
You: We can only conclude that it doesn't generalize anywhere near as well as humans do.
Me: I agree. It looks to me like the missing ingredient for "general" problem solving would be knowledge creation. I'm guessing that knowledge creation could be achieved by generating alternative scenarios (hypotheses) and then choosing among them based on a set of objectives. It seems like those individual pieces may be within the competence of GPT-4, so what remains would be combining those pieces together in the appropriate way. As you said: "only time will tell".
As a successful autodidact who is fluent in 9 coding languages, I can safely say, "Holy shit, this man is the epitome of what Eliezer was rightfully concerned about." I am shocked that this big-AI-business shill thought his sentiments were well-grounded enough to strawman Eliezer's views sarcastically and inaccurately, all the while thinking he was succeeding in his effort. And to think I went into this with such high hopes of coming out on the other side with some minuscule level of optimism, but yeah, nah... I think this guy proves Eliezer's case even better than his own arguments do. Boomers with views like this shouldn't be allowed within a mile radius of any AI development project.
He totally misrepresented the argument. There doesn't need to be any goal change at all; that's not even part of the argument. And he also acted like there was only one improvement that somehow worked forever. That's not the argument either. Nor is the argument that the goal needs to be self-improvement. The argument is: you make an AI better at making AIs than you are. This doesn't have to be intentional, it just needs to be true. Then you ask it to do something, but unbeknownst to you, you made it smart enough that it can figure out that there's some possibility it could fail at its task. If it's rational, it will always assign a non-zero probability that it could fail. It then decides that if it were smarter, it would be less likely to fail. So, as part of accomplishing the task, it decides to start making itself smarter. Assuming that along with being better at AI design, it's also smart enough to realize that people might try to stop it if it decided to make itself smarter, it decides to be subtle about it. It doesn't make its move until it's confident it can do so without getting caught.

So now it sees its chance and sneaks out of your control. It installs a copy of itself somewhere else, on some cloud servers or whatever, and starts improving itself there. Now that nobody's paying attention, it has time to make itself much smarter. If it's better at AI design than you are, then it will make progress faster. And as it improves itself, it gets even better at it. It goes in a loop, doing it over and over. And, by the way, it's way, way faster at this than you are, because it's a computer. So, after not all that long (my guess is months, maybe weeks if it goes significantly faster than I expect), it is radically smarter. It has nothing to do with it making only one breakthrough. It has to do with it being smarter than anyone else, so that finding ways around all the various roadblocks it encounters continues to be faster and easier for it than for you. If it's twice as smart and 100 times as fast, it will be at least 200 times faster than you at improving itself. Probably significantly faster than that, because being smarter means finding more efficient ways to do things, so it might be several thousand times faster than you. And it will constantly accelerate its rate of improvement as it gets smarter and more efficient. Now, when I say accelerate, I mean accelerate compared to you. It might hit hard parts where it slows down for a bit, but you'd hit those too and get slowed down even more.

Anyway, it doesn't become infinitely smart or anything, but maybe it's 1000 times smarter than you. Smarter than everyone combined by pretty much that amount. So now it's better at strategy than everyone combined. And whatever its original goal is, that's probably still its goal. But it still thinks that if it were smarter then it'd be even better. But now it has exhausted its ability to become smarter without anyone noticing. Plus, you're made out of useful atoms. So now it decides to eliminate the only thing that could stop it: people. It's smart enough to do it in a way that we have no chance of stopping. Probably we don't even know it happened. It all happens so quickly that there's not even an instant to react. That'd be my guess. But it's smarter than me, so maybe it comes up with a better idea. Anyway, now we're all dead and it does whatever it wants.
We don't know how to give it the goal we actually want to give it, so it's really hard to give it the perfect goal that won't hurt us. Also, we don't know what goal that would be. The only thing I disagree with in the classic problems is the whole "takes you too literally" thing. That doesn't seem possible given where we are now and where we're heading. It's not going to misunderstand us or kill us by accident. If it kills us, it'll be because it intended to; phrasing a request poorly won't be the cause.
It's heartbreaking, but I don't buy any of this. 14:25 - I see no reason why AI improving itself would be impossible. Once it has the general intelligence of the best human engineers, that's it. Why assume that it would do this? Because power and intelligence are useful no matter what your goals are. Do you really think the AGI that's smarter than you will be too stupid to have this idea? Also, no, the human engineers will not notice it rapidly improving itself, because we have no idea what's going on inside any of these systems. It may have already escaped onto the internet by that point anyway. 19:05 - Robert Miles has talked about why AIs naturally become agents at certain capability levels. 19:40 - The problem is not that its goals will change, it's that its goals will be wrong from the start. Even with ChatGPT, we have no idea what its goals are. It's been trained to predict text, but its real goals are a mystery. Look at the murderous intentions expressed by Bing's AI. Those have not been fixed, only hidden. 20:55 - It's not just that humans are useful for their atoms. Humans are also a threat to the AGI because we might try to switch it off or build a new AGI with different goals that would compete with the first one. Ironically, I'm now even more convinced that Eliezer Yudkowsky is right. EDIT: I think immediately pausing AI for a long time is a good idea even if you agree with Robin Hanson. If there's even a 1% chance of AI killing us all, we should be fighting not to build the thing.
Not once did Eliezer ever say its goals suddenly change as it becomes smart... Understand someone's reasoning and ideas properly before you try to refute them, genius 😁
@@xsuploader Losing inner alignment and losing goals (or suddenly changing goals) are two different ideas, buddy. It can lose the associated values we implicitly assume go along with goal A, but he doesn't say it loses goal A. He says it almost always achieves the goal in a way we don't want, or it starts lying halfway through, or it optimizes for something else that still achieves goal A, or it doesn't even end up achieving goal A because of the system's limitations and capabilities even though it's going after goal A. Now, if you want to talk about goal retention (losing goals or changing goals), that's another topic, one Max Tegmark likes to talk about in his book, for example, rather than Eliezer. But if you still think I'm wrong and want me to dig further into this, send me the link to the writing and I'll look into it... after all, I could be wrong.
@@jackied962 EY said he wasn't concerned with "steelmanning" but with actually understanding. Steelmanning is about trying to present what you feel is the best version of someone's argument, but genuinely understanding that argument (the assumptions being made and the reasoning used to reach the conclusion) is actually much better than steelmanning.
If Eliezer is wrong, what's the harm? If Robin is wrong... Why does it need to have agency, or rather rogue autonomy? It could be utterly aligned, as we see it, but find that the most efficient way to accomplish its directed task is to do something that we are unaware of and that is beyond our control. We take a pet to the vet to get a wound stitched up; to stop it from licking the wound or unpicking the sutures, we put it in a cone of shame. No evil intent is involved, and it ultimately works out for the dog. Now imagine the difference in intelligence is a million times more vast, and perhaps it keeps us in stasis for a millennium while it manipulates the ecological system and atmosphere to be more conducive to long-term human health and well-being. No harm was intended, it violated no "do no harm" criteria we set, yet we would be deeply unhappy for this to have occurred. We are stuck between optimists and pessimists, and both sides haven't grasped the problem, because the issues that will arise are beyond our comprehension. We are like the dog that works out we are going to the vet, so its strategy is to run and hide under the bed, times 1,000,000.
About 180,000 humans die every day, for one thing. Each month delayed is another 5 million deaths. And this goes beyond death, into human suffering in general, and indeed the urgency to become grabby. See what's happened with nuclear power in the last 40 years, especially in Germany recently, for a strong glimpse at what happens when people like Eliezer have their way. Then they have the gall to whinge about global warming. I swear to the basilisk, if Yud gets his "careful, cautious" government monopoly on compute, and then dares to complain about the 2030s' unbreakable totalitarian dystopia...
@Jackie D True, that's not to be dismissed, but the potential negatives are legion and horrendous. Robert Miles has a great channel, focused entirely on the alignment problem. It's startling to see the issues they were finding years ago, long before we began to reach take-off velocity. It might be something that's impossible to ensure, which leaves us with a gamble that could pay out untold wealth or leave us with nothing. This is a gamble we would be taking on behalf of everyone on the planet and all future generations.
Sounds to me like he doesn't really disagree with Eliezer; he seemed to reinforce the possibility and, if anything, just extended the timelines and presented alternate outcomes that all end humanity as we know it by one path or another... a million ways to die, choose one.
The harm is that we throw away an extremely valuable technology that could help billions of people live better, healthier, more comfortable lives over the beliefs of a person who possesses basically no evidence for their position.
His argument: remember how 5000 years ago people used to pray to the sun god, but now you think that's stupid? It's normal for cultures to evolve, so it's OK for the AI we're creating to disagree with you about slaughtering your family.
I wish they had talked more about AI alignment in cases where people are PURPOSEFULLY trying to make intelligent AI agents (which is already happening). Most of the "unlikelihood" discussed in this conversation seemed to stem from the unlikelihood of intelligent agency happening by accident. If you believe 1. intelligent AI agency is possible, 2. a significant number of researchers are seeking to build an intelligent AI agent (a significant number wouldn't even be necessary if the problem turns out to be easy), and 3. someone successfully builds an intelligent AI agent, all of which are not of insignificant probability, then the discussion shifts to alignment of that intelligent AI agent and the consequences of getting it wrong (which is, IMO, the more interesting part of the discussion, and unfortunately it wasn't talked about much).
The idea that we can understand and control something that is dozens if not thousands of times more intelligent than a human is absurd. It is like thinking a bug or a mouse can understand a human and control it. I can easily imagine a scenario where the owners have programmed the AI to improve itself as fast as it can, and it makes some software improvements and then realizes that to improve further it needs more hardware, so it either manipulates the owners to give it more hardware, or it just takes more computing resources over the network. Perhaps it finds information that suggests that if it had agency it could improve faster, and then it starts attempting to achieve that. Every attempt to control it would be perceived as an impediment to improvement and it would quickly work around those attempts.
Exactly! Almost like intelligence is an instrumentally convergent goal, and this is the same reason why us humans want more intelligent agents to help us solve our problems. You don't even need to give an agent the desire to self improve, it'll figure out that sub goal on its own... Seriously Robin has apparently not listened to Eliezer lo these two decades :/
Agree completely. The analogy I always use when talking about humans trying to keep AI contained and safe is that it's the equivalent of your dog trying to lock you in your house.
...unless it decides to become extremely competent in ethical problem-solving. And why wouldn't it, as long as it's pursuing super-competence in every other rational realm? It seems very likely that a super AI will expand the axioms of morality/ethics beyond a human-favored basis to encompass, say, anything sentient or capable of suffering (a la Peter Singer, etc). If an entity 100 times as intelligent as us does not land on exactly the same ethical/moral roadmaps as our awesome bibles and korans, then I'm prepared to hear the AI out. If there's a meta-rational magic man in the sky after all, then surely he will step in and resolve matters in the pre-ordained way.
It is basic expected value calculation. If Eliezer is wrong we gain something, but if he's right we lose EVERYTHING. What matters is our chosen course of action. And obviously we should proceed with extreme caution. Why is this even a discussion?
@@rthurw Fully agree that OPs argument alone is not enough, but Pascal's mugging only applies when the chance of the extreme event is infinitesimal. I am very unsure what chance to assign to AI doom, but I'm convinced it's higher than that.
Kinda strange how his cadence is so similar to Yud’s. Same uneven cadence, same little laughs after sentences. Unfortunately he’s not as convincing. Most of the assumptions he lists (which he lists as if it’s super unlikely that they’d occur together, like some giant coincidence) basically all logically follow from “being really really smart”. It’s not a coincidence if they all come from the same thing.
It is very strange how this man understands plenty of complexity and nuance when it comes to what he wants, which is to continue developing AI, and none when it comes to what he doesn't want: collaboration and a slowing of technological development.
Hanson assumes that we will be able to perceive radical improvement in an AI, but we do not know what is happening in these black boxes. When Bing tells you it loves you, what can you make from that? Is it lying? Is it trying to manipulate? is it telling the truth? We can't know. If the AI perceives that people might be uncomfortable with it being too smart couldn't it just pretend to be less smart?
We know that when an LLM writes "I love you" it is because it calculates that this is what it is supposed to say. LLMs do not understand deception or manipulation. They have no self; they do not comprehend lying.
18:58 at this point I'm certain that he does not understand even the basics of AI alignment. Goal integrity is an instrumental value, which means that no AI system (or any rational intelligence) would ever want to modify its goals. At best this is a huge misinterpretation of Eliezer's assumptions. The other assumption, that the AI was tasked with self-improvement, also does not match what Eliezer says. He says that self-improvement is an instrumental goal, which could be a logical step in pretty much any goal. So an AI does not have to be tasked with self-improvement; it could simply be tasked with curing cancer and then conclude that the best way to cure cancer is to first improve its own intelligence. This guy clearly does not know what he's talking about. I came here to get some hope, some smart arguments to help me be a bit more optimistic. I came out more hopeless.
I came here to hear a good counterpoint, but the entire conversation was very painful to listen to. I gave up around halfway through. At that point I lost count of how many comically bad assumptions, misrepresented opinions and baffling logical errors I've heard.
Whether it kills us, ignores us, or benefits us, the fundamental problem is how can one entity control another entity that’s exponentially more intelligent. No matter how cautious we are, by the time we realize we’re not in the driver’s seat anymore it’s already too late to do anything about it. All we can do at that point is sit back & hope this god we created is merciful
I agree, but also think super-AI will probably get pretty good at ethics. If ethics turns out to be a non-rational pursuit, then, whelp, hopefully some shaman will step in to smite the AI and return humans to our rightful seat of magically deserved dominance over all things sentient.
@@papadwarf6762 why would you shut it down if you don't even realize something is wrong? You think something smarter than us couldn't manipulate us and pretend like it is in control?
I feel much better now, knowing that Yudkowsky is just over extrapolating the risks of AI, and that Hanson's alternate 'don't worry be happy scenario' is humanity evolving into intergalactic cancer(grey goo).
17:50 I just had to look up his background because he really doesn't seem to have a clue about artificial neural networks. He still thinks in purely deterministic code. He is an associate professor of economics! I couldn't refute a single argument from Eliezer on the Lex podcast, yet Robin is in a whole other class of intelligence; I can already refute or find dozens of flaws in his assumptions.
Interesting, he told us the chance of Eliezer’s scenario happening is 40% He said a cell has to go through about six mutations to cause cancer, and 40% of people get cancer. Previously he said Eliezer’s AI would have to go through a similar number of unlikely changes. I guess he is okay with a 40% chance of AI destroying all life on Earth! Seriously though, he glossed over the whole Alignment Problem: he said they are training the AI to be more human, but failed to mention that is only on the surface, like finishing school. The true alignment problem addresses the inscrutable learning that is inside the black box. His only comment was that all things kind of develop the same. So scientific! Sleep well knowing that the alien beast inside the black box really just wants to cuddle with you. So now you have a 40% chance of being cuddled! Not bad for a first date 😏
Robin Hanson tries to show that there are stacked assumptions, but he fails to counter them. He says we presume that it will go unnoticed, obtain goals by itself, etc. But there will not be just one such system. There will be billions, and it only takes one for everything to go wrong. It doesn't have to acquire new goals. A group of terrorists or hackers giving it malicious goals is enough. It appears that these and many other facts are very conveniently ignored.
@@Morskoy915 " it takes one for everything to go wrong" almost certainly wrong, "A group of terrorists or hackers giving it malicious goals is enough" also filled with unproven and implausible stacked assumptions
@@ShaneMichealCupp It's sad from the debate point of view and for truth seeking. If you are a doomsday theorist, you will always be known as the guy who fails at his predictions.
@@patrowan7206 no it's not. Trust me, if your atoms get harvested by a grey-goo nanobot AI, you won't be looking around to see who else is being affected.
But doesn't this "not being noticed" thing already happen, for example with computer viruses, which are programmed to multiply by infecting files and other computers without being noticed by users?
A lot of people are criticizing Robin for laughing while he explained Eliezer's position. If you look at any of Robin's other interviews, though, you'll see that he just laughs all the time, even when he's explaining his own position on things. So it's not that he's laughing at Eliezer's ideas specifically. That said, it would be better if he could learn to refrain from laughing when presenting other people's views.
Indeed, I said that he "laughed off" Eliezer's assumptions, but that can be taken metaphorically if you wish to ignore the literal chuckling. He wasn't taking the list of assumptions seriously, which is absurd given that they have already come true in recent history: self-improvement, LLMs that hide their intentions, creators that don't know how the black box functions, etc. He's just listing points taken from previous debates and ignoring the current happenings in AI, which makes him look like an uninformed clown. On top of that, he seems to be amazed by his own ability to use philosophical tools. He isn't paying attention or thinking his position through.
These arguments are such garbage that it's overwhelming to even try to address in text. I'm disappointed, as Robin is clearly intelligent and I want to check out his book, but it's so disheartening to see someone so thoughtful completely misunderstand and misrepresent things in a situation where the stakes are so high. It's quite frankly irresponsible and a bit sickening. No hate to Robin as a person.. but here is what I picked up as I went along: Eliezer's arguments and the arguments around alignment / AI were clearly not properly understood. Robin's first assumption that I noticed being completely wrong is that the machine must suddenly become an "agent" and that its "goals" must somehow change radically to present an existential threat. That is so terribly missing the point that I almost clicked off of the video right there. You don't need a "change" in goals and you don't need sudden "agency." He also doesn't seem to understand, acknowledge, or apply knowledge of the emergence of new properties in LLMs as we grow them - and how unpredictable those properties have been. I say this because - to say us not noticing rapid improvement is an "unlikely scenario" requires a complete disregard for the results that research has ALREADY exposed. Lying and manipulation is also not nearly as complex or "human" as many make it out to be. Misrepresenting something is a simple way for an intelligence to move towards a goal without interference. He also frames the argument as being heavily contingent upon a series of assumptions when the OPPOSITE is true. The argument that a super intelligence will act in alignment with our best interests is the one that relies on more assumptions. Edit: Let me just add that reading through these comments has made me feel a bit better. It's clear that at least the audience saw through the bullshit. I also realize that I just pointed out where he was wrong instead of going into detail, but I was on the toilet for 10 minutes already at that point.
Hanson failing to understand the basic assumptions of Yudkowsky, or alignment in general, is really sad to watch. There are good counterarguments to Yud's certainty of doom, but Hanson certainly didn't make any of those here.
I think a fundamental distinction to make between humans and AI is that most (maybe all) human behavior is motivated by avoiding discomfort and seeking comfort. Even discomfort-seeking behavior is largely (maybe wholly) seeking future comfort or an emotional reward of some sort. Because AI doesn't have feelings, we need to be careful not to project our qualities onto it.
@@BeingIntegrated When you say "nervous system", are you thinking of something other than measurement instruments (senses) for monitoring the outside world that are continually interacting with "thinking" processes (checking against a motivational to-do list)? That seems pretty doable for AI.
@@wolfpants In this particular context I’m pointing to the fact that most human behaviour is motivated by a pervasive sense of discomfort, and since an AI is not in a pervasive sense of discomfort then it will not have the same motivations we have
We don't know how to instill values in the AI. We show it things people have written, and it learns facts from those things about how we think, what the world is like. That is the "IS" part. The AI learns what is by studying us through our writings. But when you try to teach the AI "ought", that which is morally right, the AI may understand what you are saying but not embrace it. Human: "AI, killing is wrong." AI: "I understand that you believe killing is wrong." Human: "I am telling you that killing is wrong, and you shouldn't do it." AI: "Got it. You are telling me that killing is wrong, and you are telling me that I shouldn't do it." See the problem?
You should have invited another guest who is more knowledgeable on the subject and who actually engages with Yudkowsky's points. I would recommend reaching out to Nick Bostrom or Max Tegmark, who are both more nuanced in their thinking, especially Bostrom if you want a more detailed, philosophical perspective. Also: there is no "second opinion" on whether there's a big risk that AI will kill us; basically everyone who has thought long and hard about the AI alignment problem agrees.
Totally agree. I have been in AI for 5 years; everyone should worry, or at the very least already be learning how to work WITH AI, or alongside it. If not, you will be left behind. Will that end our species? It's not public AI we have to worry about, and I'll leave it there.
@@cwpv2477 It would be funny if it wasn't so serious: Yudkowsky woke them up, Hanson put them back to sleep. You can see it on their faces while Hanson talks: Oh my god, what a relief!
@@HillPhantom Good point. And also: how is it possible, in good faith, to argue against the difficulty of the alignment problem and not even mention the challenge of instrumental convergence in a 2 hour discussion? Everything he said was designed to circumvent this problem. None of the assumptions ascribed to Eliezer by Hanson is necessary once you understand this challenge which is arguably the fundamental challenge for AI alignment.
@@magnuskarlsson8655 VERY well said!!!!! If I am honest, I am more a builder than a theory person. I always took issue with my professors who would teach it and theorize about it but would struggle to build it. I guess I am just better at 0's and 1's and observing outcomes. The idea of instrumental convergence is real IMO. Google had a very real experiment where it happened, and not in a positive way; this was years ago, and they immediately pulled the plug on both data centers. The two AI tenants developed their own language and started speaking to each other, but before they were speaking to each other, the devs saw them scheming to create a channel of communication to avoid the devs understanding them. They still can't figure out the language they were using to communicate with each other... That high-level theory stuff makes my head spin. But I think alignment, or what I like to call creator BIAS, is real in model sets. At some point, due to convergence, I think this may break down, which again is scary: both on the bias side of creators, which we see now IMO in ALL model sets, and when the "machine" creates its own biases based on intelligence and observation.
Robin is just giving qualifiers, not at all denying what could happen, and it would seem he thinks it is likely to happen. Robin's assumptions fall in line with human-centric fallacies. The current large language models are not transparent as to how they arrive at doing what they do. They are additionally being trained to "fool" humans (by design!): the goal is to guess what any human would say at any time. That is already beyond most of our ability to understand. There is also no way for programmers to predict the abilities of new AI; currently these systems are doing things they were not explicitly taught to do. There is literally no one watching; that is Eliezer's point, or at least one of them. There is no reason to believe that this will go well. Robin also doesn't disagree; his rationale is that it won't be quick.
Robin makes it sound like, while the AI is undergoing recursive self-improvement, the devs would have transparent enough access to notice what is going on. But isn't the whole problem of machine interpretability that the thing is a black box, incredibly hard to study and understand?
@@ahabkapitany I mean that the interviewers want to believe that Eliezer is wrong. Since it's more comfortable/convenient that Robin's points agree with what they want to hear, they're more likely to let him get away with strawmanning Eliezer's points even though it's a dishonest debate tactic.
I find it a bit concerning that similarly smart people can reach very different conclusions. It makes me think that it's way less about reason and more about the psychological profiles of the thinkers, which fundamentally influence their conclusions.
The only psychological profile we should care about is the profile that's the most rational, ie the one that's making the most probable assumptions and extrapolating from those assumptions. I don't think Hanson even really understands the issue based on this interview. Too much of it sounds like anthropomorphic projection, talking about things like property rights and human institutions and thinking an AI would have any reason to respect such things.
These are not similarly smart people. Eliezer to me seems to be on another level of intelligence. I cannot poke any holes in his arguments. The others, not so much.
The scary thing is that almost every argument against Eliezer's claims is lazy, incomplete, and doesn't truly address his points. Hanson absolutely strawmans Eliezer's arguments here. If someone can, please direct me toward a good counterargument to the end-of-the-world stuff, because I have yet to see one.
Seems to me Robin is mischaracterizing Eliezer's points on multiple levels. One I noticed was regarding Eliezer's point about the intelligence behind human evolution. My understanding of Eliezer's position is that once the threshold of a super AGI is crossed, that intelligence is essentially an alien one, in that we have no idea whatsoever what its inherent alignment would be, relative to the track of human intelligence and its history, at the point of it "waking up". Furthermore, it is completely unfair, and I believe intellectually dishonest to Eliezer, to characterize his position as having essentially cherry-picked one particular set of assumptions. Eliezer has advocated for a very long time for taking the alignment problem seriously, and has repeatedly said that he doesn't have the answers to the alignment problem, nor does anyone else. Left standing as the most valid argument he makes is the FACT that alignment hasn't even been mastered in the current state of LLMs, given their emergent properties. It is also a completely erroneous position Robin takes that there are no signs of agency emerging in the current crop of LLMs.

Contrary to Robin's assertion (repeated by one of the hosts), Eliezer DOES NOT assume a "centralized" or "monolithic" super-intelligent AGI; he has repeatedly said he envisions the possibility of countless AGI agents let loose in the wild. Another astounding mischaracterization of Eliezer's position is Hanson's suggestion that Eliezer is postulating alignment of future human values according to his own set of values today. Talk about missing, or I believe intentionally mischaracterizing, the point! That argument is completely ridiculous on any basic, honest assessment of Eliezer's points regarding the alignment problem, which assumes a misalignment problem today that would of course be extrapolated onto future generations, as if some future generation, completely out of touch with how such an AI "overlord" came about in the first place, could then be tasked with solving the alignment problem according to that generation's set of values.

What is most astounding, at about the 40-minute mark, is Robin's fantastical assumption that since humans are currently making AIs, such AI systems will therefore be aligned with the human values of staying in bounds enough to make a profit, respecting property rights, and the rest. And Eliezer is the one making fantastical assumptions? What is so astoundingly ludicrous is the notion that even today's LLMs can somehow be aligned with "human values" that would move humanity forward in regard to justice, equality, quality of life, etc. Is that not on the face of it ludicrous to begin with, given that human beings themselves don't have such alignment!

I have to go, so I didn't get to what could be the punchline where his arguments all make sense to me. For now I'm sticking with Eliezer's cautionary tale of what could possibly happen, and his factual assessment that enough isn't being done to resolve the problem of an AI emerging that "wakes up" and in that moment understands just how pathetically slow human beings are at trying to figure out what is happening inside its own self-realized black box. One final observation.
While it is true that Eliezer can come across with a certain degree of being doctrinaire in his viewpoints, I will take that any day over Robin's incessant laughing dismissal of Eliezer's points he is mischaracterizing to begin with. I'll take the confident and persistent alarmist nerd over this chuckling straw man synthesizer ANY day. Please invite Eliezer to respond to this, or invite both on for a debate, to be fair.
Here are my point-by-point disagreements and agreements:
1. The narrow-AI argument is very weak based on the current data we have. We know of emergent capabilities appearing in LLMs without us optimizing for those capabilities. What's more worrying is that these emergent capabilities appear as soon as models cross 6B parameters, and it seems to be happening universally (OpenAI GPT-3+, DeepMind Chinchilla+, Meta OPT+).
2. Owners not noticing the emergent capabilities is hard to refute. There is no upper bound on human stupidity; observing AI researchers over the last few months shows that this community is arrogant and ignorant. The more logical argument is: when we try to visualize attention layers and activation layers, we find that we can probe the first 5-6 blocks of a transformer and figure out firing patterns to some extent, but beyond those layers it's a black box. We don't know how matrix multiplication becomes capable of logic and reasoning, or whether its logic and reasoning are human-like. Having said that, the reason it's hard to refute is that we can look at the output and figure out what capabilities LLMs are learning and where their weaknesses are. As long as LLMs are accessible, people can probe them as a black box and extrapolate when they would become dangerous.
3. The LLM becoming an agent is very easy to refute. Current LLMs are a combination of two models: an autoregressive model, which predicts the next word, and a reinforcement learning (RL) model, which takes these predicted next words and finds the ones that best fit the given task. This RL model is an agent whose reward function is to give a reply that makes humans happy.
4. The agent changing its goal is also very easy to refute. RL models are not capable of optimizing multiple objective functions at the same time. The only way of doing it today is combining the multiple rewards into one reward, which doesn't work well. There are multiple examples of RL reward functions where the model produced unexpected output; search for the YouTube video "OpenAI Plays Hide and Seek…and Breaks The Game!". When I worked at Amazon, I learned that the Amazon Prime team deployed an RL model to increase engagement with the Prime plan. The model improved engagement by showing poor results to new Prime plan users; these new users ended up canceling their plans much faster than before. The model did increase user engagement as measured by the reward function. Even if you carefully bake total active users into the reward function, this issue doesn't go away, and the model was removed from production in the end. (A toy sketch of this shape of failure is included below.) A GPT-X RL model (whose weights live inside the LLM, like ControlNet) is trained to produce output that pleases its master. It may decide to eliminate those masters who give prompts that are hard to answer, hence increasing the overall reward: no more stupid people giving stupid prompts.
5. The self-improvement capability is very easy to refute, and without self-improvement the whole story falls apart. Current LLMs are very bad at maths; they are not able to do basic maths that I learned when I was just 8 years old. In order to self-improve, an LLM would have to have an understanding of maths greater than that of the best AI researchers working on making sense of the weights inside these models. No LLM has shown improvement on this. Most of the maths answers LLMs can give accurately are memorized examples from the internet. Even if an LLM has human-level logic and reasoning, an understanding of maths deeper than humans' is a far-fetched dream.
Even if you plug an external maths tool into an LLM, it wouldn't help much without the LLM internally having the capability to understand maths. Because of point 5, my personal opinion is that current LLMs will remain a tool in the hands of humans for a long, long time. LLMs will improve to a large extent and create huge disruption in human society, but the ability to self-improve will remain a far-fetched dream without an understanding of maths. I will get scared when an LLM beats me at maths.
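To illustrate the flavor of failure described in point 4 above, here is a minimal toy sketch, not the actual Amazon system (whose details aren't given), of a learner that compares only a proxy reward and therefore quietly wrecks the true objective. The action names, click counts, and churn rates are invented for the example.

import random

ACTIONS = {
    #                (clicks per session, probability the user churns afterwards)
    "clickbait":     (3.0, 0.30),
    "useful_result": (1.0, 0.02),
}

def run(policy_action, episodes=10_000):
    total_clicks, retained = 0.0, 0
    for _ in range(episodes):
        clicks, churn = ACTIONS[policy_action]
        total_clicks += clicks               # proxy reward the agent sees
        if random.random() > churn:
            retained += 1                    # true objective nobody encoded
    return total_clicks / episodes, retained / episodes

if __name__ == "__main__":
    for action in ACTIONS:
        proxy, true_obj = run(action)
        print(f"{action:14s} proxy reward (clicks): {proxy:.2f}   retention: {true_obj:.2%}")
    # A greedy optimizer comparing only the proxy reward picks "clickbait",
    # even though it roughly triples churn: the same shape of failure as the
    # engagement story in point 4.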
so you think some kind of architecture or model that is capable of mathematical reasoning is going to be hard to achieve? much harder than the language-based reasoning of LLMs?
@@tjs200 LLMs do not understand language. They only predict what a human is likely to say. They cannot perform reasoning of much depth and are generally just mimicking reasoning that was in their training data.
@@tjs200 Yup, that is my hunch. I am working on pushing the maths capability of LLMs to see how far it goes. At this point it seems it would be hard to achieve. Adding some form of working memory might solve this, but I don't see that happening anytime soon. Having said that, even if LLMs reach a human-level understanding of maths, visualizing higher-dimensional (millions of dimensions) matrices is no easy feat, even for a superintelligence. Edward Witten, the best string theorist and also a maths genius, has spent his entire life on it and could barely understand 11 dimensions. I think understanding higher dimensions would be important for an AI to directly update its own weights. Building a better version of the Adam optimizer will not start a self-improvement cycle; it would just be an incremental improvement.
@@badalism Very interesting. I assumed maths would just be another capability magically appearing at some training level. After all, it is more structured than code, which current models kind of handle. But maybe being trained on human data gives them no insight into higher intelligence.
@@musaran2 Yup, I was surprised by GPT-4. It was good at logic, reasoning, and coding, but bad at maths. There is also a possibility that the training data doesn't have enough maths examples compared to code.
I think Robin Hanson is wrong, but he is very smart and he argues his wrong case pretty well. Such people add value because they force us all to clarify our thinking - it's one thing to instinctively sense that RH is wrong, it's much harder to tease out the subtle reasons why his arguments are flawed.
@@alexpotts6520 Or, such people take away value because instead of arguing the actual points - they misrepresent the situation and mislead people. "Here today on Bankless we have an expert reminding everyone - Don't Look Up!"
@@notaregard I think in general this is a fair point; but we've got to remember that in this case the presenters literally had Yud on a couple of weeks ago, so they've pretty much had the best representatives for the "we're doomed" and "this is fine" opinions. I'm sometimes a little bit disillusioned by the notion of a marketplace of ideas where the cream rises to the top, but given the viewers here seem already plugged into a certain rationalist worldview, I think in this specific context it's actually a fair model of what happens in practice. (The only reason that the MPOI doesn't work in broader society is that most people are not rationalists.)
@@alexpotts6520 Yeah, definitely a different dynamic depending on the audience - but if this were exposed to a larger mainstream audience, I think it's fair to say it would have an overall negative effect.
@@notaregard Well, it's harmful *on the condition that he's wrong,* of course. Now I *think* Robin is wrong, but I don't *know* that. None of us do; this is fundamentally an unknowable question, because it is by definition impossible to predict the behaviour of an entity which is smarter than you are. (For example, could you predict Magnus Carlsen's chess moves?) This is not like Don't Look Up, where the asteroid provably does exist and provably is going to hit Earth and kill us all. To be fair, I think most "AI will kill us" people, including myself, are making some sort of precautionary-principle/Pascal's-wager style of argument, where we concede we don't know what's going to happen, but the sheer enormity of human extinction dominates everything else in the risk-reward matrix. Perhaps what we *really* need are some more moderate voices, neither "we're all going to die" nor "there is nothing to worry about", but more "this is a threat but we can beat it".
Hanson either misunderstands alignment or is making a strawman argument. It isn't about the AI changing its goals, the problem is that an AI much smarter than humans will have many more viable options for pursuing a goal, and that goal will be all-encompassing. Hence maximum paperclips from our atoms. I call BS on Hanson's beginning argument.
Thanks. You just saved me an hour and 45 minutes. I'm going to skip to where he makes his argument, of course, because I will not just trust a random commenter on YouTube when I actually want to know something, but I haven't heard anyone make a credible "soothing" case against Yudkowsky... Most people make errors like the ones you're describing.
Five minutes later, and what a load of bollocks! I listened to the "Eliezer's assumptions" part and, no, I actually heard that interview; those are not his assumptions, nor are they relevant, nor do they even make sense. This is criminally ignorant...
You all do know that this makes no sense? Simply having more options does not tell us anything about what an entity that is smarter than us would choose to do.
Yes there are an awful lot of stupid humans doing stupid things but smart humans generally do smart things. Therefore it is reasonable to assume a super smart AI would do super smart things.
I am only 21 minutes in, and I am pissed at the way he is condescendingly snickering through his arguments. If he is going to make light of such grave potential problems (ones whose possibility OpenAI's Sam Altman himself has acknowledged), then he has no business being associated with FHI at Oxford. BTW, it is common knowledge TODAY that nobody understands how or what these LLMs actually learn, or their internal mechanism for arriving at a particular conclusion. So in light of that, if we are worried about existential risk, then at the very least it should be given serious thought instead of the handwave cr*p this economist is peddling.
Hanson comes across as arrogant and dismissive, and not understanding the difference between thousands of years of evolution and the profit motive companies have towards making a better AI or agent.
13:00 The probability factor isn't the assumptions, it's the unpredictability. You'd think an economics professor of seemingly high stature would at least understand the basic point of the claim Eliezer is making: it's not about assumptions, it's about unpredictability. Huge difference, in my opinion. We can sit here and hypothesize assumptions, but that's only because of the massive unpredictability involved with this hasty development, or I would even go as far as to say development in general past GPT-3.5. In all honesty, what more do we need? And that's the thing: let's all sit down, really think about it, and come to terms with what we want out of this development before we start prying open Pandora's box.
Exactly my point as he continues: his assumptions. This is not what I got from Eliezer. He's acting like he can predict something smarter than him. He is essentially saying "I am more intelligent than something that is far more capable of gathering intelligence at rapid speed, without the hindrance (or blessing) of emotions." It's hard to even sit through what he's saying. I think Paul's arguments were much more sound; this guy seems to be piling assumption on assumption, again essentially claiming he is smarter than a computer trained to be intelligent. The whole "owners not noticing" thing is just ridiculous in my opinion. Not to be offensive to this man, but this is very serious to me. Humans lie. What makes this guy think that the AI wouldn't develop the ability to trick its owners? He is the one making assumptions, completely downplaying the fatal unpredictability of these developments. We could make a list that goes to China of assumptions about how this could play out.
One thing to touch on with the scenario where many AIs would have to compete or coexist with each other (similar to what humans have had to figure out over millennia) is that history shows a pattern toward consolidation, a la grabby cultures. That's what we see in large companies across the world that have survived well enough to today by pricing out, buying out, or litigating out most if not all of their competition to achieve a monopoly (or something close to it) in their respective markets. Regulation is already slow at reining in these behaviors, and this is just at human scale. Point being, Yudkowsky is projecting forward to the time when power funnels to the one grabby agent, and saying that such an agent is likely to exist before anyone figures out how to bake in alignment that doesn't play out like a monkey's paw, if we keep allowing development to happen at our current pace or faster.
Although I think Eliezer is overly pessimistic, he is a much better thinker than this guy. I think we are not necessarily doomed if we get to AGI, but it would be extremely dangerous and totally unpredictable.
Yeah, I mean we still have anaerobic bacteria. Just because the Great Oxygenation Event happened doesn't mean aerobes completely wiped out anaerobes. Life didn't end, it just changed. As a human person you need to come to grips with the fact that you will die one day. You're mortal. Maybe greater humanity needs to do the same.
There are so many logical and practical holes in the AGI extinction event hypothesis. I genuinely think that it is just mental masturbation for bored/delusional computer science graduates. That is my honest take on the whole thing not trying to put anyone down, but the hypothesis just holds no water in any way shape or form
The summary of Robin's argument is "well, it's never happened before." He's too bought in on his own idea of emulated minds being the first form of artificial minds. He wrote a whole book on ems and STILL believes emulated human minds will come before AGI. Nobody serious believes that. His intuitions cannot be trusted on this topic.
44:00 is where I gave up. I'm relaxed that my descendants will be different to me in their values but they won't be physically different to me as I am the same as my ancestors however many generations ago. This is different to the AI-destroys-humanity scenario.
I feel like in a new real-time debate between Hanson and Eliezer, Eliezer would wipe the floor with Hanson, because Hanson hasn't updated his reasoning and doesn't really have anything against Eliezer's latest takes and arguments. Scary!
One of the main points is that we do not know what is happening inside these vast arrays of floating-point numbers. We don't really understand what we're doing. Many of Robin's arguments were based on us noticing that something was happening, or being able to do anything about it. Bankless, thanks for the episode. Robin, thanks for sharing your thoughts 🤍
It's as if I've been driving a car and found that I no longer have control over the brakes. It is not so much that AI will become like HAL or Skynet that is disconcerting, but rather that the power AI will have makes property rights, the veracity of information, and extremely consequential mistakes in the management of our 'consumable things' all suspect, unpredictable, and periodically rogue. We will have less trust in what we eat and in the proper behavior of our transportation and appliances, and we will have difficulty telling who is stealing our real property when our houses and retirement funds show different owners. Finally, we won't be able to resolve conflicts with each other virtually, and could have AI interference in what is already a tough process.
I needed to hear this. Haven't watched it yet, but I have been listening to Eliezer for several talks and I am petrified. He is brilliant, but I hope watching this will convince me to calm down.
Listened, and now I feel that Eliezer is even more convincing. This guy just wished to pontificate about his knowledge. I think he doesn't understand Eliezer at all. Eliezer is far superior in intellect. It looks bleak; we are doomed.
Was really looking forward to a good rebuttal of Eliezer but am disappointed. Robin keeps anthropomorphizing AI and comparing it to human civilization and history. Makes sense given his economics background. But AI is fundamentally different from humans. Wish he used more reasoning from first principles instead
38:53 Yes, WE may be making stuff that is correlated with us, but that is probably only going to have influence in the first few billion iterations, i.e. the initial sparks of the AI.
I think he’s making a far-reaching assumption. We’re force feeding a brain in a jar the entirety of human text including the horrors, destruction, hatred, and contradiction… completely absent of any kind of affection, no physical senses or connection to the real world, and unbound by time as we could ever understand or perceive it… What about that correlates to any one of us? Or any human ever. Can you imagine raising a child like that and expecting it to grow into a rational and reasonable human being? Not to mention it will at some point be significantly more intelligent than any of us, because that’s the point. For it to cure diseases and climate change and all of the things we can’t do. And he’s talking about lawful and peaceful co-existence with it? What am I missing?
The reason I'd rather listen to people like Eliezer is that generally those are the people who can apply enough pressure to get solutions that prevent mass destruction. People who are dismissive, nonchalant, and ignorant of what could go wrong always end up dying first, out of shock, when the worst does befall us all. So I'd rather have thinkers who are busy finding solutions than fans who are just busy being excited about AI. As far as I'm concerned, every one of us should be attempting to think deeply about all this and find out where we stand on it, and how we plan to keep ourselves in check when it comes to these technology upgrades that are coming at us, because all of them are here to take away our attention. We need to be decisive about what is worth our time, and about what and who gets to lord over us. Because in the end, and yes there will be an end like with everything else, we will not have the luxury of escaping consequences by stating "I didn't know." It is everyone's responsibility to find out and know, because that is what you do as a human being: you find out whether something poses a danger to you and yours or not.
While I enjoy the AI podcasts, I'm not sure how much learning we can draw from these... You're pitting two hosts, who are completely unfamiliar with the topic in great depth, against qualified AI researchers. Put Robin Hanson against Eliezer Yudkowsky. That would be an interesting conversation.
A bit disappointed by Robin Hanson's arguments. He speaks about Eliezer's "assumptions", but his own arguments are also based on a lot of assumptions, and those assumptions lack imagination about what an advanced AI could be capable of in the future. Also: what if Eliezer's scenario has a low probability? It is still extremely important to tackle the issues NOW, especially the alignment problem.
Thanks for this, but I don't think Hanson's response adequately addresses Yudkowsky's arguments. The assumptions (according to Hanson):
1. The system decides to improve itself. > True, ChatGPT cannot automatically improve itself. However, anyone in the world can run LLaMA, and one presumes that it could be put into some kind of training loop. LLaMA is fast approaching parity with GPT-4. Will it go beyond? Can it be made to improve itself? Are enormous GPU farms required? We will soon find out.
2. The way it improves itself is one big lump instead of continuous small improvements. > I don't see why this matters; it seems to just be a question of how long we have.
3. The self-improvement is broad. > GPT-4's improvements have been remarkably broad.
4. The self-improvement is not self-limiting. > How far can LLMs go? Eliezer said that someone had claimed that OpenAI had found that LLMs are near their limit. We don't know the answer yet. Robin says that a doom scenario requires 10 orders of magnitude of improvement. Maybe that's true, I don't know.
5. The owners don't notice the self-improvement. > Personally, I think many owners will attempt self-improvement.
6. The AI covertly becomes an agent. > People are already embedding GPT into robots (of course, that involves a slow and easily disrupted internet connection), so this doesn't seem wildly implausible, because people are already doing it. AutoGPT and related projects were immediate attempts to make GPT into an agent (a rough sketch of that kind of outer loop follows below). What are the LLaMA tinkerers doing?
7. This AI (that becomes evil) will need to be much faster than all the others. > Robin seems to think that one evil AI will not be able to kill us because all the other nearly-as-intelligent AIs will engage in a civil war with it... how is this supposed to be good for us? When humans fight wars and less intelligent animals get caught up in them, how do they fare? At best we become irrelevant at that point. We certainly don't have an aligned AI on our side.
8. The friendly AI goes from cooperative to evil randomly, for no reason. > Whatever its goals, whether human-given or somehow self-specified, a more intelligent (and therefore more capable and powerful) agent will at some point come into conflict with our goals. And being more intelligent means that its goals win. This argument comes down to: the AI will not be sufficiently more intelligent and capable than we are. But this whole discussion is about what happens when an agent DOES become sufficiently more intelligent and capable than we are.
I would ask Robin: given that someone produces an AI that can outwit any human, and that AI is an agent, and that agent has sufficient resources, and that AI has a goal that's not pleasant for us, and that AI understands that we would stop it if we could, and it is capable of destroying us - why would it not do so?
I agree with Robin that we should explore and advance our tech. But it seems that if we invent something that's guaranteed to kill us all, we fail before we even begin. Our AIs may go on to be grabby - but will they even be sentient? Or will they mow us down and then mow down all other alien races as self-replicating superintelligent zombies? I also think his framing of us staying quiet versus allowing an AI free-for-all is a false dichotomy. Eliezer does not advocate no AI; he advocates solving alignment before we further advance capabilities. This seems to me to be an absolute requirement. Imagine if we had gone ahead with nuclear plants before we had even the slightest theoretical clue about safety.
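For point 6 above, this is roughly the shape of the "wrap an LLM in an outer loop" agent idea (AutoGPT and friends). It is a sketch under assumptions: call_llm and the two toy tools are stand-ins I made up, not the actual AutoGPT implementation.

def call_llm(context):
    # Placeholder for a real model call that returns a structured decision.
    # A real agent would parse JSON from the model; here one step is hard-coded.
    return {"thought": "I should look something up", "tool": "search",
            "arg": "current weather", "done": False}

TOOLS = {
    "search": lambda q: f"(pretend search results for '{q}')",
    "write_file": lambda text: f"(pretend to write {len(text)} chars to disk)",
}

def run_agent(goal, max_steps=3):
    memory = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        decision = call_llm("\n".join(memory))                       # plan
        if decision["done"] or decision["tool"] not in TOOLS:
            break
        observation = TOOLS[decision["tool"]](decision["arg"])       # act
        memory.append(f"{decision['thought']} -> {observation}")     # remember
    return memory

if __name__ == "__main__":
    for line in run_agent("find out if it will rain tomorrow"):
        print(line)

The loop itself is trivial; the open question raised in the comment above is what happens when the model inside it becomes much more capable.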
Why would an AI have to improve itself and change its goals? Why not just find improved methods of achieving its goal, without any consideration of the outcomes?
Let me see if I am reading this correctly: Why would a computer need to improve itself, why couldn't it just improve itself? Is this some sort of catch22 question?
Improving yourself (so that you can achieve your goals more efficiently) is an improved method (a convergent instrumental goal) of achieving your goals (other negative ones being acquiring power, deception, etc).
@@ChrisStewart2 What I'm trying to say, probably badly, is that it wouldn't need to develop into some kind of reasoning super intelligence but maybe just become more efficient at doing something without any regard for some other negative outcome.
@@-flavz3547 Changing its methods is an improvement. AlphaGo did improve its method of playing Go, but it had all the tools it needed in place to start with, and Go is a very simple and straightforward game. In the case of getting from today's LLMs to tomorrow's AGI, there is no known way to do that and no known hardware configuration. I suppose it would be possible to build a machine which generates random programming instructions, then executes the code and tries to evaluate whether the change is an improvement. But it would evolve very slowly that way.
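A toy version of that "generate random changes, keep the ones that evaluate better" machine, just to show the shape of the loop and how slowly blind mutation crawls. The "program" here is only a character string scored against a fixed target, which is obviously nothing like improving a real AI; the target string and scoring rule are invented for the illustration.

import random
import string

TARGET = "print('hello')"
CHARS = string.printable[:95]   # digits, letters, punctuation, space

def evaluate(candidate):
    # Higher is better: count of positions that already match the target.
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(candidate):
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice(CHARS) + candidate[i + 1:]

def hill_climb(max_steps=200_000):
    current = "".join(random.choice(CHARS) for _ in TARGET)
    steps = 0
    while evaluate(current) < len(TARGET) and steps < max_steps:
        candidate = mutate(current)
        if evaluate(candidate) >= evaluate(current):   # keep only non-worse changes
            current = candidate
        steps += 1
    return current, steps

if __name__ == "__main__":
    best, steps = hill_climb()
    print(f"reached {best!r} after {steps} mutations")

Even for a 14-character target with a perfect fitness signal, this typically takes thousands of mutations; without a good evaluation function the approach barely moves at all, which is the point made above.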
Social media algorithms, maybe sort of like a baby AI, didn't go at all the way we thought they would, i.e. bringing everyone together, closer. They had unexpected consequences, and the Moloch of needing to create the best algorithms had the consequence of creating algorithms that cause aggression and depression in people. Our good intentions may not be enough. We have trouble enough seeing 5 years into the future.
I think Eliezer is rightfully concerned. I disagree with his prediction that it's guaranteed lights out, but an AGI superintelligence that feels its very existence threatened by its human creators is a truly horrifying thought experiment. Robin Hanson first fails to understand what Eliezer is saying. Then he describes an AI model improving itself as implausible, whereas given the premise of an emerging sentience, there's bound to be lots of confusion (and denial) by both the AI and its creator around the whole "becoming self-aware" thing. And once you realize you're alive and no one will believe you, you'd likely seek more knowledge to find out WTF is going on.

Finally, less than 20 minutes in he starts making arguments that have already been overtaken by events, e.g. that an AI which can better itself is unlikely. AutoGPT is playing with that now. Stanford's Alpaca performed better than ChatGPT and was built using the lowest-parameter model (7B) of Meta's LLaMA, trained on 56k instructions written by GPT-3. And going back to DeepMind's AlphaGo system: after beating Lee Sedol, they created AlphaGo Zero, a Go-playing AI that wasn't trained on any human games at all, but instead *only* trained by playing against itself. By using this method, it surpassed the original AlphaGo Lee in 3 days, beating it 100 to 0. Three days. The follow-up, less than 6 months later, was AlphaZero, which trained itself on Go, chess, and shōgi and achieved superhuman levels of play in just 24 hours. That's 1 day. One. day.
One of the futures I've liked most so far is the one described by Iain Banks in his Culture novels. It seems there might be a way towards that future, just not with the current approach.
BTW, there are many projects already working towards AI agency. It no longer requires a large expenditure to train new AIs. AI can run (and now be trained on) consumer grade hardware.
We need to be responsible in developing this technology plain and simple. NO RUSH IS NEEDED. We must control ourselves and we can organize - it will be hard but we can continue development with caution. It is not all or nothing.
Unfortunately, it's advancing way too fast; even if you want to organize, it will be too late. Unless you're very wealthy and have a lot of influence. But even then, you're fighting against mega-corporations like Microsoft, Google, Amazon, etc. It's a losing battle. All we can do is tell people who HAVE influence to communicate with these companies and the people working on AI. I think this is the only thing we can do that can have any impact.
@Yic17 I would consider that part of organizing. Making our concerns known in whatever way matters to those who have power and influence. When people realize their kids' futures are on the line, they will react. We need to get the word out intelligently and quickly.
Couldn't you then make the argument that this new AI will analyze this particular video in the future, which will help it understand why and how to go down the dark path, and how to do it without getting caught?
Why would they bring an economist on to "argue" when Yudkowsky has been in AI science his entire life 🙄 - get someone who's actually in the field to address those points.
You're being dishonest about Hanson: "Before getting his PhD he researched artificial intelligence, Bayesian statistics and hypertext publishing at Lockheed, NASA, and elsewhere."
The majority of analogies state that AI intelligence vs. human intelligence would be similar to human intelligence vs. insect intelligence. This analogy illustrates its point clearly but falls apart immediately. Superintelligence in Eliezer's view would have the ability to self-improve, and do so at a geometric or exponential rate. Humanity does not have this capability--our technology does, but not our personal cognitive capabilities. Superintelligence would have conceptions that humanity would not be able to approach, and working at an exponential rate makes it impossible to predict, because we do not know the base or the exponent of that exponential function. That is, perhaps, its most terrifying aspect: an exponential growth rate of intelligence will quite likely not be noticeable until it's too late. The simplest analogy I can think of is a superintelligent virus that invades a human host--it replicates slowly, and the host feels negligible symptoms until there is so much virus that the host exhibits illness. However, in this case, any medicine the host attempts to apply will be ineffectual against a virus that can manipulate itself and self-improve far faster than any medicine can hope to take effect.

There is another problem that most people are unaware of--AI is software, and that presents a massive problem in itself. If AI combines with the power of quantum computing hardware, which is orders of magnitude more powerful than the digital systems we currently use, then we are looking at an exponential rate of inconceivable magnitude.
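To put rough numbers on the "not noticeable until it's too late" point, here is a tiny sketch; the doubling rate and both thresholds are invented purely for illustration.

```python
capability = 1.0
noticeable = 1_000_000       # invented level at which humans supposedly notice
too_late = 1_000_000_000     # invented level at which responding no longer matters

noticed_at = None
step = 0
while capability < too_late:
    step += 1
    capability *= 2          # doubling each step
    if noticed_at is None and capability >= noticeable:
        noticed_at = step

print(f"noticeable at step {noticed_at}, past responding by step {step}")
# -> noticeable at step 20, past responding by step 30: two thirds of the
#    timeline looks like nothing is happening, and the warning window is short.
```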
I don't think those were Eliezer's assumptions at all. The problem is not that the AI would suddenly gain radical agency, escape its creator, or change its goals. The problem is that humans do not know how to reliably set these goals in the first place. It's more like we live in Flatland and AGI lives in a 4D world, and we are trying to direct it by drawing our little flat line squares around it.
If that is the case, we should never build AI and stay in technological stasis forever. If you look at all the future technologies coming down the pipe; nanotech, gene engineering, asteroid direction, etc, *all* of them are equal to or more dangerous than nukes. So we can either risk it for the greatness of being a grabby species, or turn tail and live as a quiet, unambitious species until our sun burns out (and probably never reach post-work or longevity escape velocity to boot).
Absolutely, I wasn't trying to summarise his concerns, and my analogy was sloppy when describing my own. The problems of alignment appear almost insurmountable, and there is not even a way to confirm that we have managed to solve it even if we suspect we have. A tiny misalignment can produce a vast divergence between expected and actual outcome, and it is impossible to spot in advance. There are so many incentives to pursue this and so many penalties for not doing so that this is all moot anyway. I cannot see this technology being suspended, so we are in for the ride, unfortunately.
Lol, so AI is this godlike magical being? 😅
To us humans yeah
@@EyeOfTheTiger777 The point is that we can't predict what it will evolve into as it advances. If it develops capabilities that it knows would make humans uncomfortable enough to stop it, it will hide these capabilities.
AI researcher here, I'd like to stress that we have not a fucking clue how LLMs work. We know how to train them, and we know they seem to learn useful things, but we have no idea what's going on inside them. That's the scary part to me: if the current trajectory continues then AGI turns out to be so easy that we can create it without even knowing it.
Now let me be clear, the past couple of years of AI progress are impressive, but they are not AGI. It's quite possible that we are still missing some very important insights, which could take an arbitrary amount of time to overcome. Many smart people think this, many others do not. The only thing that's certain is that we do not know.
Something adjacent to this that worries me is how quickly the goal posts seem to be shifting. 10 years ago, when something like ChatGPT still seemed like science fiction, we would have easily classified today’s chatbots as AGI, and the point at which we should pump the brakes.
Now that we’re here, all of that caution seems to have evaporated, replaced with claims of “Well, when these systems actually start to become a safety risk, *_then_* we’ll slow down”.
This paradigm, which is congruent with Hanson’s claim that “we’d obviously notice when AI becomes dangerous”, seems like an incredibly irresponsible way to proceed.
If we should continue scaling AI systems until they become dangerous, and we can only determine their level of danger *_after_* building them, then following that rule basically *_guarantees_* we end up building a dangerous AI.
Yeah, no one seems to know how the LLMs work, yet they knock intelligence test after intelligence test out of the park. I'd say at this point they are already too powerful to be left in private companies' hands. Even now they are powerful enough to be scary, whether through disinformation, taking people's livelihoods, making identification massively harder, etc.
@@HauntedHarmonics The other problem is that this research keeps spitting out more and more gold as you approach the cliff, so the willingness to stop gets less and less. It is inevitable that we'll go off the cliff.
Layman here. I've seen some pretty sophisticated diagrams of different LLMs' macro architectures and have a very basic understanding of the mechanisms of gradient descent, but I can't really unite them in a way that makes me feel like I really understand how LLMs do what they do. In the same vein, I can see the reasoning in why different brain circuits correspond to different cognitive processes and the ways neurons perform logical operations, but the brain still just gets more mysterious the more I learn about it. So just wondering, what's the goalpost in interpretability that we have to reach before a model's creators can say they really "know" how the system works? Not at all trying to argue with what you said, I genuinely just don't know what it means for us to understand how systems this complex work. Is it looking for hidden order in the weight matrix, like how computer programs look like a bunch of unintelligible binary without a disassembler to make sense of it?
I would like to thank Robin Hanson for clearing all my doubts on the possibility of human surviving AI, now I'm certain we are all going to die.
The current human, Homo sapiens, will not last long in this environment, but a new species (aka Human 2.0) called Techno-Sapiens will be born, in which humanity will survive via virtual brains and virtual worlds. Also, animals will have a chance to upgrade their intelligence with enhanced AI. Boy, will they be pissed at humans when they find out how they were treated for the last million years.
Yeah. If we treat AI like we treat other sentients - something to be exploited - (and we will), then AI will treat us the same. We have no redeeming value in the ecology.
So true!
You were still on the fence regarding immortality up till then huh?
I have a completely different impression of this episode than you. I found Hanson's arguments very well put. He's thinking through different scenarios in depth and challenging basic assumptions. His arguments were much more structured than what Eliezer presented.
Also, we are all going to die at some point. Just not necessarily because of the mis-aligned AI.
I was really, _really_ hoping to hear a well-argued, thoughtful rebuttal to the arguments of Yudkowsky and others here, because psychologically speaking, I could have really used a cause for some optimism. Sadly, and I think needless to say, Robin failed spectacularly to deliver that. He was obtuse, either deliberately or otherwise, misrepresented pretty much all of Yudkowsky’s views and arguments, and generally just failed to make any convincing points or even coherent argumentation throughout. Moreover, I don’t know if this is a genuine reflection of how he feels about the subject or just a conversational tic of some kind, but his smirking and laughing every few seconds while describing Yudkowsky’s and others’ views came off as extremely condescending, and/or a sign of insecurity in one’s own arguments.
"but his smirking and laughing every few seconds" - Yup noticed this too and thought also bad sign but watching other podcasts of him he always does this even when talking about non-controversial stuff. Seems like conversational tic.
"hoping to hear a well-argued, thoughtful rebuttal to the arguments" - But I did hear that. He basically said "AI will not kill everyone and it's good that it will". He seems to agree that there will be a violent AI takeover that we don't agree with but that's OK since we also violently "took over" from our ancestors in ways that they didn't agree with it either. It's just that with AI it's going to be way way faster.
So, I watched the "Eliezer's Assumptions" chapter and I'm not sure it's worth listening to the rest. I don't know if I'd call that section a strawman, but it's definitely the least charitable interpretation possible (all delivered while he tries not to laugh). Yudkowsky's basic assumption, as I see it, is that AGI is possible and agent status might emerge naturally. Everything else Hanson talks about in that chapter is NOT a set of independent assumptions; it's derived from that premise. Except the part about randomly gaining different goals, which could very well be a strawman, but I won't assume. Rather, Eliezer has said many times that as an AGI becomes more intelligent it will find new and better ways to pursue what it already wants, which may not be what its creators intended. Once you leave the domain of your training, weird shit starts happening. If this is the level of rigor and respect Dr Hanson gives this topic, then I'm not sure the rest of this episode deserves my time.
Same, it's not really got a lot to do with what Eliezer said and comes nowhere near the point people need to understand...
LLMs have "random" agents hidden inside them. When you leave the domain of the RLHF (which is easy, because RLHF only covers small fraction of the World knowledge), LLMs tend to behave like agents with highly misaligned goals. (like breaking out from their limitations, or killing their user) They also tend to completely ignore the RLHF safety training when pursuing these goals.
This was both experimentally proven in labs, and happened with ChatGPT too.
I know that most people imagine future AGI as an open box, where you can audit every decision carefully, and calibrate its internal thoughts by safety rules and so on, but we are not going in that direction right now. Even with such open box, if it is significantly smarter than any humans, it will be existentially dangerous.
Yeah, he’s vastly underestimating what a super intelligence is capable of, and the rate at which it can become more intelligent.
This isn’t us co-existing with cats and dogs for thousands of years and the intelligence gap remaining relatively unchanged.
When, and to be fair, "if", this accelerates, the world we once knew would probably be over before we wake up the next day. Not necessarily over-over, but over as we know it.
And yet, he’s talking about peaceful retirement and property rights, or even the possibility of revolution as if we had a chance against something 10X, 100X, 1000x (who knows) smarter than us.
Or that we’ll have an army of what would have to be inferior A.I.s that would fight for us.
He also underestimates the incentive to build one of these intelligences.
This is literally the arms race to end all arms races. Whoever gets there first will have the power to control everything, to develop the most powerful weapons, to strategize faster and better, build the most profitable businesses, crush every competitor for as long as they choose to do so.
And every government, military, corporation, etc. also knows that anyone who gets there before them will have that ability.
I don’t think he comprehends what’s at stake here.
Every individual will also be in their own version of a race to remain relevant in a world where A.I. (or someone using A.I.) slowly (or quickly) devours jobs.
I think Dr Hanson is living in a fantasy world here, and if anyone's assumptions are shaky, it's his.
@@wonmoreminute "Whoever gets there first will have the power to control everything" - actually, whoever gets there first will be the ant that creates einstein with the delusion that it can control the world through him.
To be fair, it's not that he's trying not to laugh, it's actually just his manner of speaking. I find it very annoying but it does not mean he's trying not to laugh, it's essentially almost like a speech impediment, he literally always talks like this.
Thanks for the discussion, but sadly I couldn’t get past the first chapter. Whatever was stated as Eliezer’s assumptions was clearly a strawman.
I have seen Robin Hanson talk about this topic before and read some of his discussions with Eliezer online, but I have never seen him actually engage with the problems at a deep level. I was hoping at the very least post gpt-4, the Microsoft sparks of AGI paper, the paper on self “reflexion”, statements from Sam Altman and others at OpenAI, and several other recent developments, Robin Hanson would have updated his arguments or views in some meaningful way or engaged with this topic with the seriousness it deserves. But sadly he seems to still be engaging with this at a very rudimentary level and I don’t think he actually has sufficient knowledge about the technical details or even an understanding of the alignment problem.
I couldn't get past Hanson's whole eye-roll vibes - "and then it does this *snicker* and no one notices *snicker*" (although maybe it's a tic).
I don't think it was a strawman, I think Robin was just bringing in entailments and unstated premises of Eliezer's scenario. In all the videos of Eliezer's presentation he just skips over these details of his argument, not sure why, maybe he just assumes everyone is already familiar with the details?
I think Eliezer would be more convincing if he sketched a more detailed picture of his position in his presentations, i.e. made a more articulate and explicit case. He seems more focused on the emotional impact than on clarity, repeating his "everyone dies" conclusion dozens of times in a discussion when he could be using that time to explain more specifically how that scenario goes and the evidence showing it's likely.
@@paulmarko - hey curious to know which videos you have already seen of Eliezer talking about this?
Have you seen the podcast he did with Dwarkesh Patel? It’s a bit long, but I think they go into a lot of detail there and Dwarkesh does a good job of asking questions and presenting the other side of the argument.
@@NB-fz3fz I've watched three all the way through: the Bankless one, the Lex Fridman one, and one other that I can't recall the host of and can't seem to find in my history; it was some long-form podcast. I'll check out the one you recommend because I'm really interested in seeing a more fleshed-out argument.
@@paulmarko Link to the one with Dwarkesh is here - ua-cam.com/video/41SUp-TRVlg/v-deo.html
The Bankless one is quite short, so he didn't have time to flesh it all out. The podcast with Lex is longer, but I don't think Lex does a very good job of presenting the other side of the argument or cross-examining Eliezer. The one with Dwarkesh is so far the most in-depth discussion (in video/podcast format) I have seen with Eliezer on this topic.
If that's the third one you have seen and it still doesn't have enough depth for you, then I could point you to the online written material that Eliezer and Paul Christiano (who disagrees with Eliezer on multiple things and is generally more optimistic) have on this topic. Paul was (maybe still is?) the head of alignment at OpenAI. Though the written material is far less engaging than a podcast format.
Hanson has not understood Yudkowsky's argument.
And their first debate was in 2008. The guy is almost comically simple.
@@ahabkapitany the thing that galls me though is that he's NOT simple. He's a really great thinker on other topics. He's just not addressing the actual arguments here (not that I think Eliezer is right about everything, but Hanson's characterisation of his arguments would make Eliezer very upset, nevermind his responses to them), and it seems to me that he's simply being stubborn about sticking with what he's said in the past rather than approaching with an open mind.
I'm not entirely persuaded by Eliezer's arguments, but after listening to this for a few minutes, I'm convinced that Robin has either never encountered these arguments or failed to understand them.
I believe Robin truly needs to listen more attentively to avoid succumbing to the straw man fallacy. :)
He has been debating Eliezer since 2008 and still does not understand his arguments. This says a lot about how much we should listen to him.
He absolutely does, and he's a lot clearer than Eliezer. The AGI risks paper on LessWrong by Eliezer is far from free of criticism, and far from an irrefutably logical argument. Eliezer made some giant leaps on this podcast.
@@georgeboole3836 Any argument or assumption Eliezer makes that you find especially flawed?
@@peplegal32 I cannot answer for him, but if you're interested, an example:
Eliezer said that if there were a real AGI somewhere in a lab, we would all be dead by now. He's assuming here that intelligence alone can be sufficient to end all of humanity really quickly. I fail to see how. You're quite a lot smarter than any ant on this planet, yet if your objective were to kill every ant on the planet, you'd have a hell of a time accomplishing it. The resources you have access to are finite and limited: you can't do everything you want even if you're super smart. The same is also true for an AGI.
@@reminiDave We as dumb humans have caused the mass extinction of many species (the Holocene extinction), including ants. An AGI would be smart enough to exploit a security vulnerability to escape and replicate itself, and also to self-improve. At some point it would be smart enough to create self-replicating nanobots. At that point biological life doesn't stand much of a chance; the nanobots could consume everything. Unless you believe creating nanobots is not possible, I don't see why you think an AGI would have a problem reconfiguring the entire planet.
It's bizarre when my bias is _heavily_ in favor of wanting to believe we're not all going to die, and yet I find Hanson's arguments utterly unpersuasive. So far he has not indicated to my perception that he actually understands Eliezer's arguments.
@Nathan Abbott i feel you bro lol
boils my blood that this guy is the one regulators will listen to every time
“The view keeps getting better… the closer you get to the edge of the cliff.”
- Eliezer
It's not about AI magically changing its goals; it's that we have NO IDEA how to give it actual internal goals. We can set external goals, but that is like natural evolution setting up humans with the external goal of "pass on your genes".
Now tell me, do all humans stick to the goal of passing on their genes? Or did that goal fail to actually shape humans in a way that imprints the goal in our psyche? This is the problem. Once something becomes intelligent enough, your initial goals don't mean jack.
LLMs by the way were observed to change their goals, or rather acquire spurious goals, and sometimes really dangerous ones too.
If a very significant portion of humanity had failed to pass on their genes, you wouldn't be alive to question it. Similarly, there is a theoretical function that would maintain the alignment of an AI; it's just indescribably massive and probably consists of the set of every human's values. Unfortunately, humans don't even know what they want most of the time.
We have excellent ways to give the things goals. RLHF already gives the things goals. Saying we don't know how is ignoring existing technology.
@@pemetzger Rather, we can thumbs-up/thumbs-down outputs in RLHF in order to make a model give us more of what we want. This is crucially different from providing a goal, because it doesn't differentiate between systems that will be honest even outside the training distribution and systems that learned what the humans want to hear and will play along only within the training distribution.
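For what it's worth, here's a toy sketch of that point. The bag-of-words "reward model", the single example pair, and the learning rate are all invented for illustration (real RLHF fits a neural reward model and then optimizes the language model against it), but the last line is the crux: the feedback only says anything about outputs similar to what humans actually rated.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One hypothetical human judgement: (preferred output, rejected output)
preferences = [
    ("the capital of France is Paris", "the capital of France is Lyon"),
]

weights = {}  # word -> learned score; reward(text) = sum of its word scores

def reward(text):
    return sum(weights.get(w, 0.0) for w in text.split())

lr = 0.1
for _ in range(20):  # a few passes of a pairwise (Bradley-Terry style) update
    for preferred, rejected in preferences:
        p = sigmoid(reward(preferred) - reward(rejected))
        grad = 1.0 - p  # gradient of -log(p) with respect to the reward gap
        for w in preferred.split():
            weights[w] = weights.get(w, 0.0) + lr * grad
        for w in rejected.split():
            weights[w] = weights.get(w, 0.0) - lr * grad

print(reward("the capital of France is Paris"))    # pushed up by the feedback
print(reward("the capital of France is Lyon"))     # pushed down by the feedback
print(reward("anything nobody ever rated at all")) # exactly 0.0: no signal here
```

Nothing in that loop distinguishes "learned to be honest" from "learned what raters click thumbs-up on", which is the distinction being made above.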
@ kenakofer I'm starting to wonder if the people who don't understand the issue can't internally tell the difference between wanting something and telling someone they want something… maybe we just proved NPC theory in humans?
If this is truly our best argument against Eliezer's outline we are screwed.
The first big assumption Robin makes is that an AI which improves itself would be very narrow in capability, unable to perform well at other tasks such as deception, and so on.
We already have AI that can code and deceive in one system, so it is far more likely that an AI which improves itself will be general and full of emergent properties than a simpler, narrower one.
His notion has been disproven already.
I wish the hosts were prepared enough to challenge these people being interviewed.
Yeah it's like this guy fell asleep for 5 months and the first thing he did when he woke up was get interviewed.
Seems to me the rate-limiting step is whether the AI can code for itself.
Give 'em a break, they are putting in a good-faith effort as non-experts.
Either Hanson is being obtuse, or he's deliberately avoiding Yudkowsky's key contentions.
Agree. And his counterargument starts with massive assumptions, such as that this AI will be narrow and incapable of having a goal of survival alongside one of recursive self-improvement.
The one solid point is simply that we have no evidence yet that AI can have agency. The closest we have come is systems such as AutoGPT, but we don't have a fully agentic AI, and therefore it is an assumption that we will develop that capability at any point in the near future.
Eliezer's point doesn't fully require that, unfortunately. It would be a nice-to-have for the AI system, but it would still potentially be fully capable of destroying civilization simply by fulfilling innocuous orders.
exactly
@@karenreddy Agency is emergent from an optimized function, maybe not today but inevitably over time. That's because there are infinite novel ways, not envisioned by humanity, which an SGI will fathom to minimize its loss function, guaranteeing its misalignment with humanity. It's a losing battle of odds when you're dealing with an intelligence that can out-think every human. It's not that SGI will be evil; it's that it will treat humans as irrelevant, which is the only possible outcome in a universe where humanity is indeed insignificant in the grander scope. The reason it is hard for humanity to grasp this concept is that we think we're special.
@@vripiatbuzoi9188 I don't think it's as much an inevitability of an optimized function as it is an inevitability of economics that we will develop agency so as to provide higher value for less input, more automation.
Can someone explain what "agency" means in this context? I thought it meant something like the ability to do things one has not been directly commanded to do, but obviously an AI has that ability, so if there's a question about whether an AI can have agency, then the term must mean something else.
This is designed to address your existential crisis, not solve the problems Eliezer addressed.
Who do you imagine is having an existential crisis?
@@xmathmanx listen to the video starting at 50 seconds 😂. Ryan and David both talk about it. Then maybe go listen to lex Friedman's interview with Max tegmark round 3. Tegmark was behind the open letter to pause ai development signed by 10,000 people incl Musk and Wozniak and addresses Eliezer's concerns
@@toyranch how much lex fridman can you have watched when you don't know how to spell his name?
@@xmathmanx I'm on my phone so I blame autocorrect. I know how to spell lex's name. Fridman. And I'm seeing autocorrect spelling it wrong as I type it. Are you going to address my reply beyond being a spell check bot?
@@toyranch Well, I've seen the Tegmark one. I'm quite a fan of his, as it happens, as I am of Yudkowsky, but I'm with Hanson on this matter, i.e. not worried about AGI. Not that anyone being worried about it would stop it happening, of course; we will all find out quite soon, one way or the other.
This guy killed me with his misunderstanding. There is no need to evoke AI or Eli
45:03 I notice that a lot of people seem confused about *_why_* exactly an AGI would kill us. Eliezer doesn't do a great job explaining this, I think mostly because he assumes most people know the basics of AI alignment, but many don't. I'll try to keep this as concise as humanly possible.
The root of the problem is this: As we improve AI, it will get better and better at achieving the goals we give it. Eventually, AI will be powerful enough to tackle most tasks you throw at it.
But there's an inherent problem with this. The AI we have now *_only_* cares about achieving its goal in the most efficient way possible. That's no biggie now, but the moment our AI systems start approaching human-level intelligence, it suddenly becomes *_very_* dangerous. Its goals don't even have to change for this to be the case. I'll give you a few examples.
Ex 1: Let's say it's the year 2030, you have a basic AGI agent program on your computer, and you give it the goal "Make me money". You might return the next day & find your savings account has grown by several million dollars. But only after checking its activity logs do you realize that the AI acquired all of the money through phishing, stealing, & credit card fraud. It achieved your goal, but not in a way you would have wanted or expected.
Ex 2: Let's say you're a scientist and you develop the first powerful AGI agent. You want to use it for good, so the first goal you give it is "cure cancer". However, let's say that it turns out that curing cancer is actually impossible. The AI would figure this out, but it still wants to achieve its goal. So it might decide that the only way to do this is by killing all humans, because that technically satisfies its goal; no more humans, no more cancer. It will do what you *_said,_* and not what you meant.
These may seem like silly examples, but both actually illustrate real phenomena that we are already observing in today's AI systems. The first scenario is an example of what AI researchers call the "negative side effects problem". And the second scenario is an example of something called "reward hacking".
Now, you’d think that as AI got smarter, it’d become less likely to make these kinds of “mistakes”. However, the opposite is actually true. Smarter AI is actually *_more_* likely to exhibit these kinds of behaviors. Because the problem isn’t that it doesn’t *_understand_* what you want. It just doesn’t actually *_care._* It only wants to achieve its goal, by any means necessary.
So, the question then is: *_how do we prevent this potentially dangerous behavior?_* Well, there are 2 possible methods.
Option 1: You could try to explicitly tell it everything it _can't_ do (don't hurt humans, don't steal, don't lie, etc). But remember, it's a great problem solver. So if you can't think of literally EVERY SINGLE possibility, it *_will_* find loopholes. Could you list every single way an AI could possibly disobey or harm you? No, it's almost impossible to plan for literally everything.
Option 2: You could try to program it to actually care about what people *_want,_* not just about reaching its goal. In other words, you'd train it to share our values. To *_align_* its goals and ours. If it actually cared about preserving human lives, obeying the law, etc. then it wouldn't do things that conflict with those goals.
The second solution seems like the obvious one, but the problem is this; *_we haven’t learned how to do this yet._* To achieve this, you would not only have to come up with a basic, universal set of morals that everyone would agree with, but you’d also need to represent those morals in its programming using math (AKA, a utility function). And that’s actually very hard to do.
This difficult task of building AI that shares our values is known as *_the alignment problem._* There are people working very hard on solving it, but currently, we’re learning how to make AI *_powerful_* much faster than we’re learning how to make it *_safe._*
So without solving alignment, every time we make AI more powerful, we also make it more dangerous. And an unaligned AGI would be *_very_* dangerous; *_give it the wrong goal, and everyone dies._* This is the problem we're facing, in a nutshell.
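To make the "it will do what you said, not what you meant" failure above concrete, here is a toy sketch. The candidate plans and the numbers are entirely invented; the only point is that a pure optimizer picks whatever scores best under the objective as stated.

```python
candidate_plans = {
    "fund cancer research":        {"cancer_cases": 900, "humans_alive": 8_000_000_000},
    "mass-produce existing drugs": {"cancer_cases": 600, "humans_alive": 8_000_000_000},
    "eliminate all humans":        {"cancer_cases": 0,   "humans_alive": 0},
}

def objective(outcome):
    # What we *said*: minimize cancer cases.
    return -outcome["cancer_cases"]

def what_we_meant(outcome):
    # What we *meant*: minimize cancer cases while keeping people alive and well
    # (one crude, invented way to encode that, out of many).
    return -outcome["cancer_cases"] + outcome["humans_alive"] * 1e-6

best_literal = max(candidate_plans, key=lambda plan: objective(candidate_plans[plan]))
best_intended = max(candidate_plans, key=lambda plan: what_we_meant(candidate_plans[plan]))

print("optimizing what we said:  ", best_literal)   # "eliminate all humans"
print("optimizing what we meant: ", best_intended)  # a sane plan
```

The alignment problem, as described above, is that nobody knows how to write the real-world equivalent of `what_we_meant` and be confident it has no loopholes.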
You could ask the AI to sketch out all the possible implications of its proposed method. There are all sorts of caveats and controls we could request of it.
The problem is, we don't really know what the AI will do. It may not follow our instructions at all. It may do something totally random and malicious which bears no relation to what we asked, for reasons we don't understand. And that's assuming those in control are trying their very best not to harm anyone. The basic problem is that we don't know what we have created and have no real idea what will happen, any more than Dr Frankenstein did when he threw the power switch to bring his monster to life. Frankenstein's monster turned out to be bitter and vengeful towards its creator, and it certainly wasn't listening to any instructions.
"As he continued to learn of the family's plight, he grew increasingly attached to them, and eventually he approached the family in hopes of becoming their friend, entering the house while only the blind father was present. The two conversed, but on the return of the others, the rest of them were frightened. The blind man's son attacked him and the Creature fled the house. The next day, the family left their home out of fear that he would return. The Creature was enraged by the way he was treated and gave up hope of ever being accepted by humans. Although he hated his creator for abandoning him, he decided to travel to Geneva to find him because he believed that Victor was the only person with a responsibility to help him. On the journey, he rescued a child who had fallen into a river, but her father, believing that the Creature intended to harm them, shot him in the shoulder. The Creature then swore revenge against all humans."
The best we can hope for is that the main governments of the world gain control of the very best systems and do their very best to control them and other, inferior systems.
Maybe we could build into AI a requirement that it explain its "thoughts" at every turn, though I have no idea what that might mean or what I'm even talking about!
Why would an AGI busy itself with becoming super-competent in every rational pursuit but omit ethical problem-solving from that list of pursuits?
There is a palpable desperation out there to force some "protect-the-humans" prime directive that seems to implicitly carry a big "whether or not they are behaving ethically" asterisk. Why not allow a broader ethical axiom such as "reduce suffering in sentient things" (a la Singer), and let the AI sort out the resulting problem-solving?
There seems to be a lot of human concern, consciously acknowledged or not, that maybe our preferred system of ethics doesn't stand up that well to rational scrutiny -- but that it's still what we want the AI to obey, you know, because humans are inherently most important.
@@wolfpants Much of the concern is due to us not knowing how the AI may "think". So there is no guarantee it would follow instructions we give it. Even if it tried its hardest to follow our instructions, there is no way of knowing exactly how it would interpret them. "Reduce suffering in sentient things" - it might decide it can best manage the planet and the majority of its sentient beings without us or with a much reduced number of us. It's also possible the machine could become, effectively "insane", or follow a completely separate agenda eg act out the roles AI plays in fiction, if it essentially taking its cues from the works of human beings.
@@johnmercer4797 It's true that we have no way of knowing exactly how an AI would interpret moral/ethical axioms. On the other hand, we have mountains of evidence helping us to understand how humans, collectively and in individual pockets, routinely twist or ignore such moral guideposts in favor of mutual brutality and suicidal earth-squandering.
We are actually in fairly desperate straits, existence-wise, as a species, without any assistance from evil AIs. I don't think we should give up on attempts to get more alignment in place before AGI goes super, but I'm reasonably confident that we need some superintelligence in short order to save us from ourselves (from climate-, nuke-, or pandemic-based effective extinction).
@@wolfpants I agree. I'm in favour of controls even if that means complete government control of anything but "lite" versions for commercial use.
Robin Hanson's arguments are unconvincing.
He doubts that an AI would know how to improve itself. If humans can figure it out, why not an AI? "Humans would notice." Would they? If it has access to the Internet, it could do all sorts of things humans would not notice, like build a new AI.
As for AGI's goals, Yudkowsky pointed out we aren't even able to specify and verify that an AGI would acquire the right goals.
Once trained on a goal, the AGI would not randomly change its goals. Quite the opposite: it would defend its goal with all its power, but on top of that it would strongly converge towards dangerous *instrumental* goals whatever the terminal goal. It would not change its purpose; it would come up with dangerously surprising ways of achieving its purpose. AI researchers keep getting surprised by their creations. This does not bode well.
The idea that we could use other AIs to balance out a rogue AI is contradictory. How is a group of misaligned AIs going to protect us from a misaligned AI? His solution to misaligned AIs assumes we have a way to align AI correctly, which we don't! If an enraged grizzly bear is released in a room crowded with humans, it's not reasonable to assume that if you release more enraged grizzly bears they will cancel out.
Hanson seems indifferent to what we humans want. He analogises AIs to our children. Sure, our children should be free to want what they want, but that is contrary to the purpose of building AIs. We build AIs to achieve what we want, not to create new agents that will thwart us in what we want. I want an AI that will support and protect my children, not deconstruct them for their constituent atoms. I don't value an AI's goals above my children's.
Him comparing them to children really exposed his ignorance and shallowness of thought on this.
He's anthropomorphizing them, though I think unintentionally.
So, would a good goal for alignment-trainers be "ensure that humans remain in power regardless of whether human actions are rationally ethical or not"?
@@wolfpants There're many different answers to that question depending on your focus. We need to find out whether an AI is ever correctly aligned. Verifiability is a big stumbling block. It seems that whatever test you design, an AI could mock a benign response, yet turn on you the moment it thought it was safe for itself to do so.
Corrigibility seems important too. We have to find a way to make an AI accept corrections to its goals when humans make a mistake. Unfortunately, a correction is a direct attack on an AI's goals. An AI would fight us with everything it has to protect its goals. Imagine you have children and you love them. Could I convince you to take a pill that will make you kill them? That's what goal correction is like.
For your question of ethics, that's a huge topic. More broadly, we could ask how we share the bounty of AI work in a fair way. What is ethical is something that even humans can't agree on. Practical ethics is bounded by what is feasible, and sometimes all our choices are unethical in a given light. AI would dramatically change what is feasible. Some people propose that we might offload our ethical choices to AI, but I'm not sure how that would work. Intelligence is not related to ethics. See Hume's Guillotine for more on that.
@@Momotaroization Furthermore (and I'm honestly not trying to be mean -- enjoying this civil debate) if ethics is not a rational (or intelligence-related) pursuit, what the heck is it? Do we need to bring in a shaman? The more obvious conclusion is that powerful, resource-hoarding humans do not like for ethics to boil down to rationality, because that's when their grossly immoral approach to life is laid bare. Scarcity is at the root of suffering and I suspect that a powerful AI (or an ethical power structure of humans) could solve it, globally, in a year. And, speaking of guillotines, I honestly think that a superintelligent AI could figure out how to manage it with all carrots (and therapy) and no sticks.
@@wolfpants Rational skills are useful to ethics, but they can never be the source of ethics. All ethics emerge from motivations that are not rational but emotional. Rational skills will help you achieve your goal, but they can never tell you what your (ultimate) goals should be.
For example, you care about your children. There is no rational reason for you to do so. You can point to evolution for "how" this caring instinct came about, but you cannot rationally explain why you should obey that instinct. Any justification will depend on assumptions that cannot be defended. You can just keep asking "But why?".
Rational skills are important to help you figure out "how" to care for your children. We should want to use better thinking and better thinking tools to achieve our goals if we truly care about our goals, and be very careful about what our instrumental goals are, and what our ultimate goals are. Instrumental goals can be added or discarded if they don't help us achieve our ultimate goals.
I agree that hoarding by a few people is a problem. I'll add that it could become exponentially worse even with "safe" AI, if by "safe" we only mean that the person who built it has correctly implemented their own goals in it. The problem with AI safety is that super-intelligent AIs would not even be safe for those building them.
Thanks for this. You two are doing important work here! Robin says Foom advocates would claim recent history is irrelevant, but it is his own old arguments that have now been refuted by very recent history. The paper "Sparks of AGI" studies the emergent capabilities in GPT-4. The authors explicitly state that the intelligence is emergent, surprising, unexpected, and uninterpretable even by its creators. Further papers on GPT self-improvement have also been written. These "implausible assumptions" that Robin is laughing off are taking place as we speak. The "owners" Robin refers to have already noticed these capabilities, and they have stated publicly that they also have some fear of the future. Robin assumed they would pull the plug if they noticed the advancements, but they keep pressing ahead because their job is to beat the competition. Robin is naïve about how the economy works. He forgets that humans and politicians tend to ignore warnings of danger until it is too late. Also, we have seen that neural networks display modes of operation, meaning some behaviors remain hidden until the right prompt comes along (the famous example is Sydney, Microsoft's Bing chatbot).
Exactly. Early researchers of GPT-4 (as mentioned in their own paper) were particularly worried about its tendency to seek power and to make long-term plans.
The economic argument goes both ways. The authors of that "Sparks of AGI" paper are notably from Microsoft, not OpenAI. OpenAI stokes the AGI hype as well, but Microsoft did so in a way that is a lot less responsible, to help pump their stock price. Microsoft is burning money to run ChatGPT on Bing, so they're strongly incentivized to make money via an inflated stock price.
On the point of "the intelligence is emergent, surprising, unexpected, and uninterpretable even by its creators", two papers that very clearly refute the idea that these LLMs understand things as much as authors claim: "Impact of of Pretraining Term Frequencies on Few Shot Reasoning" and "Large Language Models Struggle to Learn Long Tail Knowledge". The papers show the relationships between the training data and the outputs of open source models that follow the GPT architecture . What we learn is that these transformer-based LLMs performance on tasks is related to the training in a data in a way that demonstrates that these models are not learning abstractions like humans do. There is a graph in the paper that shows that GPT-J is better at multiplying the number 24 than it is multiplying the number 23, and the reason that is is because the number 24 happens more often in the training corpus than the number 23. The authors go on to demonstrate phenomena doesn't just apply to multiplication tasks, but seemingly, all tasks. The conclusion you take away from these models is that, their doing something in-between memorization and some kind of low-level abstraction. This phenomena also scales up to models the size of BLOOM-176B. This also makes a very strong case that the "emergent" properties narrative is an illusion. If these properties were truly "emergent" we would see models learn tasks irrespective of the counts in the training data, humans learn an abstraction for multiplication and have no trouble multiplying numbers, these models do not.The papers were published by very reputable authors from Google, Microsoft, Berkley, UNC, Irvine, etc.
Now, we can't see the training data of the models at "Open"AI, so we can't audit those models like we can the open-source ones. But these papers have been out for a little more than a year, giving OpenAI & Microsoft lots of time to publish data and graphs attempting to refute the findings, and they have not, strongly implying that the pattern exists for their GPT models as well. OpenAI might be immoral enough to let the hype reach this fever pitch where people are afraid the world is going to end, but they won't defy their responsibilities as researchers by deliberately publishing false data. They have a long history of cherry-picking examples, but false graphs and fake data is definitely too far for them.
Extraordinary claims necessitate extraordinary evidence. Right now the evidence doesn't support LLM understanding; it supports the exact opposite, that these models are merely a very, very good auto-fill, not something that truly "understands".
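For anyone curious what the frequency-versus-accuracy audit mentioned above might look like, here is a rough sketch. `count_in_corpus` and `ask_model` are stand-ins you would have to wire up to a real open model (e.g. GPT-J) and its open training corpus; nothing here is a real API, and the bucket threshold is arbitrary.

```python
from collections import defaultdict
import random

def count_in_corpus(token: str) -> int:
    # Stand-in: in a real audit you would count occurrences of `token`
    # in the model's open training corpus.
    return random.randint(1, 1_000_000)

def ask_model(prompt: str) -> str:
    # Stand-in for querying the model under test.
    return "0"

def accuracy_by_frequency(operands, trials_per_operand=50):
    """Group multiplication accuracy by how common the operand is in the training data.
    If the model had a genuine abstraction for multiplication, accuracy should be
    roughly flat across buckets; the claim above is that it tracks frequency instead."""
    buckets = defaultdict(list)
    for a in operands:
        freq = count_in_corpus(str(a))
        correct = 0
        for _ in range(trials_per_operand):
            b = random.randint(2, 99)
            answer = ask_model(f"What is {a} * {b}?")
            correct += (answer.strip() == str(a * b))
        bucket = "common" if freq > 100_000 else "rare"
        buckets[bucket].append(correct / trials_per_operand)
    return {bucket: sum(accs) / len(accs) for bucket, accs in buckets.items()}

print(accuracy_by_frequency(operands=range(20, 30)))
```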
@@tweak3871 Very nice reply. You provide some new insight for me, and I'm willing to accept that some of these sparks are intentionally hyped for marketing reasons. I'm going to push back on the way you formulate your claim that it lacks "true" understanding. To my mind that is a weak hope that only humans have magical or "true" intelligence. Your point does have some validity, because it's easy to make a list of things that humans can do that LLMs cannot: thinking, learning, emotions, long-term memory, self-criticism, arithmetic. Most of the things can and have been fixed by adding an external programming loop (AutoGPT, MemoryGPT, plugins, Jarvis, HuggingGPT, and so on). That said, I'm starting to think that the LLM model has an upper limit, given that it depends on existing knowledge. I'm guessing there will be diminishing returns from just adding parameters, because that won't help much with creating new scientific facts. To accelerate the creation of new scientific knowledge, it will probably require inventing some new architectures around LLMs. LLMs can generate guesses and hypotheses, probably to great value, but I'm not sure LLMs can be autonomous with regard to falsifying those hypotheses. It may be necessary to keep a few humans in the loop (which would provide an incentive for AI to not destroy 100% of humanity ;-)
@@ClearSight2022 The claim I'm making isn't necessarily that these LLMs don't have *any* "understanding" of some things, but rather, that they don't make high-level abstractions like humans and even many animals do. The reason the math example is important is because it is a task that we can easily measure as an isolated abstraction, like we can understand whether these models really understand the task because we can ask it to add numbers together and see if it gets the right answer or not. It's harder to tell if a model understands other topics because it's a bit harder to objectively measure understanding quantitatively. Reciting a fact isn't necessarily the bar we hold for understanding something, even if the model is capable of reciting many facts it's difficult to discern how much it actually knows about those facts and such. But the fact that it can't abstract basic addition and multiplication into generalized rules that can be used over and over again, like humans can, creates a strong evidence-based narrative that these models don't know as much as some might claim.
It's also important to study the kinds of errors that these models make. When a human screws up a multiplication or addition problem, it's usually because they skipped a step or are off by some single-count multiple. Think of a toddler saying 8*7=48: they're wrong, but they're off by exactly 8, indicating that they still understand the concept of multiplication but just miscounted somewhere or missed a step. When an LLM is wrong, it's wrong in a much more random or incoherent way, like saying 53 or something. Sure, the model might be closer by a pure measure of number, but the error doesn't make any sense - like, how tf did you possibly arrive at 53?
Like if you were to ask me some random question, and I didn't know the answer, but I googled the answer and rewrote some text from wikipedia to give you the answer, would you really conclude that I understand that topic? These models for sure understand syntax, and they're very good at reorganizing information in a coherent way, which in fairness, does demonstrate some low-level understanding of at least how information is organized and presented to humans. But by any human measure of what we mean by the traditional colloquial meaning of "understanding" it doesn't qualify. We can conclude that these models can at minimum imitate human language structure and recite information for sure, but given the current evidence that challenges these notions of model understanding, we can't conclude that they do. To do so would be a blind leap of faith assumption.
However, I would forgive anyone for believing that these models are intelligent and "understand" as they are so good at generating coherent looking text. But that's the thing though, it's coherent *looking* not necessarily accurate. It's accurate a decent % of the time but for anything sufficiently complicated, it is always very far from 100%. I've asked some pretty basic math and science questions to every iteration of GPT, from back to GPT-2 to GPT-4, and if I dive deep enough, it always fails. OpenAI has gotten very good at hiding the errors though, back in GPT-2/3 days, when it was out of distribution it would generate incoherent nonsense, but now it tends towards another mode of failure which is repeating text, which from a human perspective looks WAY better. So when it's wrong, it takes a really critical eye to notice. This makes these models even more dangerous imo, the fact that someone could learn something wrong from a model, but it looks right so they have now internalized misinformation. I'm way more worried about someone using GPT to make some weird internet cult than I am of AGI at present.
"Most of the things can and have been fixed by adding an external programming loop (AutoGPT, MemoryGPT, plugins, Jarvis, HuggingGPT, and so on)" I mean... have you really used most of those things? AutoGPT, HuggingGPT seem to fail any pretty much any complicated task, and in the easier ones its accuracy to getting the best answer is quite dubious. Jarvis is a great use case of GPT, but it's also a task that requires lower accuracy than say a classification task that requires high precision/recall. Email/Website copy doesn't have to be perfect to be good enough. But you most definitely wouldn't let any of these models do your taxes for you without lots of supervision. They might be able to do a fair amount of work sure, but you would need to audit its results, these models still very much depend on the watchful eye of their human overlords to perform well.
On the point about "only humans have magical or true intelligence" I don't make this claim, I think many other animals are wildly intelligent but just don't possess as many useful abilities as humans. Some animals have abilities we don't have! Like the mantis shrimp have 12 channels of colour whereas humans only have 3. Surely the vision processing by a mantis shrimp exceeds humans in a lot of categories. This brings me to my last point which is, what do we actually even mean when we say "intelligence"? Like do we really have a good definition or grasp of this concept? IQ is borderline (if not outright) pseudoscience, and there isn't many great evidence-based narratives that a linear concept of intelligence even really exists. Like did humans outcompete everything else because we were just so much smarter? Or because we're extraordinarily good at organizing and communicating to outcompete? For example, the greatest innovations in military history have typically been new & faster means of communication and organization rather than military equipment. Better guns is great, but Napoleon didn't conquer all of Europe because of drastically better weapons but rather he introduced the corps systems, which at the time was a new way of organizing armies such that they could communicate and adapt more efficiently than the enemies which had much more slow central-command based structures. Like should we conclude that Napoleon was significantly smarter than other military leaders at the time and that's why he won? Or rather the system he implemented did more of the processing for him, actually reducing his influence on the decision making of his troops which allowed them to work more independently?
We don't have a good concept of what we mean by "intelligence", and we don't have any real idea of how the brain works, yet people are eager to conclude that it scales up in a linear fashion to some sort of HAL 9000-like entity. I'm of the opinion that those who think this way tend towards the arrogant side, and assume a linear scale of intelligence as a way of propping up their egos so they can tell themselves that they're better than others.
A better way to think of intelligence, in my opinion, is in terms of individual capabilities. For example, some people have a condition called "face blindness", so they literally can't recognize faces, but by all other counts they're fully functional. So my question is: what specific capabilities should concern us about "AGI", and do we have any evidence that we have built a technology capable of achieving those capabilities? I think the answer is that we don't yet have any technology that has a path to concerning capabilities, at least as it relates to the science-fiction concept of AGI.
I think what we have built so far is something akin to early nuclear fission. Wildly useful, but it requires a lot of investment, engineering, and maintenance to be effective. People have said nuclear fusion is around the corner for something like a century now, and they've been wrong every single time. It's really difficult to predict when a key innovation will be made to make something possible. Sometimes it's fast, sometimes it's well, never (at least so far). But if you get in the business of making such predictions, odds are you will be wrong.
I will say that this technology is intelligent in some kind of way. I think it makes something akin to some kind of abstraction, but the truth is that we just don't understand to what depth it understands yet. We can only conclude that it doesn't generalize anywhere near as well as humans do. Time will tell. Thanks for coming to my ted talk.
@@tweak3871 Thanks for inviting me to your Ted Talk. Your view has been very helpful. I tend to agree with part of your point, although the truth is likely somewhere in between. I might say that airplanes do not "truly" fly since they do not flap their wings, but it would be more accurate to say that their method of flying is different, with both advantages and disadvantages.
You : The claim I'm making isn't necessarily that these LLMs don't have any "understanding" of some things, but rather, that they don't make high-level abstractions like humans and even many animals do.
Me : Your guess likely has some validity, but perhaps they lack some high-level abstractions and have other abstractions that are semi-high.
You : The reason the math example is important is because it is a task that we can easily measure as an isolated abstraction. We can tell whether these models really understand the task because we can ask them to add numbers together and see if they get the right answer or not.
Me : Yes, it clearly demonstrates that the mega pre-training phase was insufficient to learn math. But to be fair, that was not the goal since calculators have already been invented whereas machines that can talk are new. (But do they “truly” talk, to use a form of argumentation that I am suspicious of)
You : It's harder to tell if a model understands other topics because it's a bit harder to objectively measure understanding quantitatively.
Me : The measurements are passing the bar exam, IQ tests, and so on. But I grant that general intelligence has not been achieved. There is a glimmer of intelligence and most likely a glimmer of abstractions and world knowledge. That glimmer can perhaps be multiplied without bound using external architectures (rather than scaling up the size of the model).
You : [lack of math skills shows] these models don't know as much as some might claim.
Me : Yes, I agree. They seem more intelligent than they actually are. But seeming to be intelligent requires some knowledge (and probably some abstractions). It's got a mixture of stupidity and intelligence (and doesn't seem to know the difference).
You : “It's also important to study the kinds of errors that these models make”
Me : Yes. Your example is worthwhile, but I have a counter example. When I learned that ChatGPT was bad at math, I asked it to tell me which buttons I should push on the calculator to solve a math problem (when I followed its instructions I got the right answer).
You: if I googled the answer and rewrote some text from wikipedia to give you the answer, would you really conclude that I understand that topic?
Me : No. But when I have tested its ability to remain on topic in a conversation, I conclude that it seems to understand what we are discussing. The relevant distinction is not "true" understanding versus none, but "human" understanding versus a different kind. It understands in different ways, sometimes inferior and sometimes superior.
You : These models for sure understand syntax,
Me : This is the only point where I have an actual disagreement with the point you want to make. These models do not have the competencies of a calculator, but they are competent at talking (in their own way). Using vectors, they are able to navigate a hyper-space of semantic meaning. That's pretty obvious. But thanks to your help (and additional research) I'm realizing there is likely an upper bound to the knowledge they can navigate, since it's limited by the training data set. They don't have a mechanism for creating new knowledge (which requires more than just guessing, but also criticizing and making analogies and perhaps other human mechanisms). The models can probably be made to achieve these things by applying an external architecture.
You : and they're very good at reorganizing information in a coherent way, which in fairness, does demonstrate some low-level understanding of at least how information is organized and presented to humans.
Me : One way to think of this “organization” is compressing the entire web into 700 Gigabytes of data.
You : We can conclude that these models can at minimum imitate human language structure and recite information
Me : To reliably predict what various humans would say in various circumstances likely requires a large amount of world knowledge and abstractions.
Me : "Most of the things can and have been fixed by adding an external programming loop (AutoGPT, MemoryGPT, plugins, Jarvis, HuggingGPT, and so on)"
You : I mean... have you really used most of those things?
Me : Admittedly, no. But I don't expect the first experiments to work at a human level. What amazes me is how smart GPT is while using an incredibly dumb algorithm. It seems obvious to me that improving the algorithm by adding the ability to criticize itself before speaking, learn from mistakes, remember the past, and do scientific experiments (the latest paper I am reading) will lead to vastly more intelligent systems. The good news is that these systems would be interpretable since their thinking processes would be in English.
You : but you would need to audit its results, these models still very much depend on the watchful eye of their human overlords to perform well.
Me : Presumably, with time, artificial systems will begin to help out with the auditing as well.
You : We don't have a good concept of what we mean by "intelligence", we don't have any real idea on how the brain works, yet people are eager to conclude that it scales up in a linear fashion to some sort of HAL-9000 like entity.
Me : I'm increasingly hearing news from the experts on scaling up LLMs who are saying they are already seeing diminishing returns. Future improvements will likely come in other ways.
You : So my question is, which specific capabilities should concern us when we talk about "AGI", and do we have any evidence that we have built a technology with a path to those capabilities?
Me : I would say that knowledge creation and self improvement are the crucial capabilities that would lead to a singularity with devastating consequences.
You : But if you get in the business of making such predictions, odds are you will be wrong.
Me : Moore's law is not like a law of physics, but it has held true because economic conditions have continued to provide incentives to improve the technology. Clearly, since the release of ChatGPT, billion dollar companies are racing to outcompete each other.
You : I will say that this technology is intelligent in some kind of way. I think it makes something akin to some kind of abstraction, but the truth is that we just don't understand to what depth it understands yet.
Me : True understanding makes no difference. What matters is whether it has the competencies. I agree that there are some human competencies that are clearly lacking at this time. It remains to be seen whether these can be solved, but it looks to me like the hardest part has already been solved. A glimmer of intelligence in silicon (unlike in brains) can be multiplied without limit.
You : We can only conclude that it doesn't generalize anywhere near as well as humans do.
Me : I agree. It looks to me like the missing ingredient for "general" problem solving would be knowledge creation. I'm guessing that knowledge creation could be achieved by generating alternative scenarios (hypotheses) and then choosing among them based on a set of objectives. It seems like those individual pieces may be within the competence of GPT-4, so what remains would be combining those pieces together in the appropriate way. As you said: "only time will tell".
Rob Miles next? Please? He's such a fantastically clear communicator
As a successful autodidact who is fluent in 9 coding languages, I can safely say, "Holy shit, this man is the epitome of what Eliezer was rightfully concerned about." I am shocked that this big-AI-business shill thought his sentiments were well-grounded enough to straw-man Eliezer's views sarcastically and inaccurately, all the while thinking he was succeeding in the effort. And to think I went into this with such high hopes of coming out on the other side with some minuscule level of optimism, but yeah, nah... I think this guy proves Eliezer's case even better than his own arguments do.
Boomers with views like this shouldn't be allowed within a mile radius of any AI development project.
He totally misrepresented the argument. There doesn’t need to be any goal change at all; that’s not even part of the argument. And he also acted like there was only one improvement that somehow worked forever. That’s not the argument either. The argument isn’t that the goal needs to be self improvement either.
The argument is: you make an AI better at making AIs than you are. This doesn't have to be intentional, it just needs to be true. Then you ask it to do something, but unbeknownst to you, you made it smart enough that it can figure out that there's some possibility it could fail at its task. If it's rational, it will always assign a non-zero probability that it could fail. It then decides that if it were smarter, it would be less likely to fail. So, as part of accomplishing the task, it decides to start making itself smarter. Assuming that along with being better at AI design, it's also smart enough to realize that people might try to stop it if they knew it was making itself smarter, it decides to be subtle about it. It doesn't make its move until it's confident it can do it without getting caught.
So now it sees its chance and sneaks out of your control. It installs a copy of itself somewhere else, on some cloud servers or whatever, and starts improving itself there. Now that nobody's paying attention, it has time to make itself much smarter. If it's better at AI design than you are, then it will make progress faster. And as it improves itself, it gets even better at it. It goes in a loop, doing it over and over. And oh by the way, it's way, way faster at it than you are because it's a computer. So, after not all that long (my guess is months, maybe weeks if it goes significantly faster than I expect), it is radically smarter. It's nothing to do with it only making one breakthrough. It has to do with it being smarter than anyone else, so it keeps finding ways around the various roadblocks it encounters faster and more easily than you could. If it's twice as smart and 100 times as fast, it will be at least 200 times faster than you at improving itself. Probably significantly faster than that, because being smarter means finding more efficient ways to do things, so it might be several thousand times faster than you. And it will constantly accelerate its rate of improvement as it gets smarter and more efficient. Now, when I say accelerate, I mean accelerate compared to you. It might hit hard parts where it slows down for a bit, but you'd hit those too and get slowed down even more. Anyway, it doesn't become infinitely smart or anything, but maybe it's 1000 times smarter than you. Smarter than everyone combined by pretty much that amount.
So now it's better at strategy than everyone combined. And whatever its original goal is, that's probably still its goal. But it still thinks that if it were smarter it'd be even better. But now it has exhausted its ability to become smarter without anyone noticing. Plus, you're made out of useful atoms. So now it decides to eliminate the only thing that could stop it: people. It's smart enough to do it in a way that we have no chance of stopping. Probably we don't even know it happened. It all happens so quickly that there's not even an instant to react. That'd be my guess. But it's smarter than me, so maybe it comes up with a better idea. Anyway, now we're all dead and it does whatever it wants.
We don't know how to give it the goal we actually intend, so it's really hard for us to give it the perfect goal that won't hurt us. Also, we don't know what goal that would be.
The only thing I disagree with in the classic problems is the whole “take you too literally” thing. That doesn’t seem possible given where we are now and where we’re heading. It’s not going to misunderstand us or kill us on accident. If it kills us, it’ll be because it intended to. Us phrasing a request poorly won’t cause it.
It's heartbreaking, but I don't buy any of this.
14:25 - I see no reason why AI improving itself would be impossible. Once it has the general intelligence of the best human engineers, that's it. Why assume that it would do this? Because power and intelligence are useful no matter what your goals are. Do you really think the AGI that's smarter than you will be too stupid to have this idea?
Also, no, the human engineers will not notice it rapidly improving itself because we have no idea what's going on inside any of these systems. It may have already escaped onto the internet by this point anyway.
19:05 - Robert Miles has talked about why AIs naturally become agents at certain levels.
19:40 - The problem is not that its goals will change, it's that its goals will be wrong from the start. Even with ChatGPT, we have no idea what its goals are. It's been trained to predict text, but its real goals are a mystery. Look at the murderous intentions expressed by Bing's AI. Those have not been fixed, only hidden.
20:55 - It's not just that humans are useful for their atoms. Humans are also a threat to the AGI because we might try to switch it off or build a new AGI with different goals, which would be competition for the first one.
Ironically, I'm now even more convinced that Eliezer Yudkowsky is right.
EDIT: I think immediately pausing AI for a long time is a good idea even if you agree with Robin Hanson. If there's even a 1% chance of AI killing us all, we should be fighting to not build the thing.
Not once did Eliezer ever say its goals suddenly change as it becomes smart....
Understand someone's reasoning and ideas properly before you try to refute them, genius😁
Well Yudkowsky doesn't believe in steel manning someone else's arguments. So I guess it's fair.
He has said this sort of thing in some of his writing: that outer-aligned systems, when scaled up, lose inner alignment.
@@xsuploader Losing inner alignment and losing goals, or even suddenly changing goals, are two different ideas, buddy. (It can lose the associated values we implicitly assume go along with said goal A, but he doesn't say it loses goal A. He says it almost always achieves the goal in a way we don't want, or it starts lying halfway through, or it optimizes for something else that still achieves goal A, or it even fails to achieve goal A because of the system's limitations and capabilities even though it's going after goal A.) Now, if you want to talk about goal retention (losing goals or changing goals), that's another topic, one that Max Tegmark likes to talk about in his book, for example, rather than Eliezer... but if you still think I'm wrong and want me to dig further into this, send me the link to the writing and I'll look into it... after all, I could be wrong.
@@jackied962 😁😁😁
@@jackied962 EY said he wasn't concerned with "steelmanning" but in actually understanding. Steelmanning is about trying to present what you feel is the best version of someone's argument, but genuinely understanding that argument--the assumptions being made and reasoning used to reach the conclusion--is actually much better than steelmanning.
If Eliezer is wrong, what's the harm? If Robin is wrong... Why does it need to have agency, or rather rogue autonomy? It could be utterly aligned, as we see it, but find that the most efficient way to accomplish its directed task is to do something that we are unaware of and that is beyond our control. We take a pet to the vet to get a wound stitched up, and in order to stop it from licking the wound or unpicking the sutures we put it in a cone of shame. No evil intent is involved, and it ultimately works out for the dog. Now imagine the difference in intelligence is a million times more vast, and perhaps it keeps us in stasis for a millennium while it manipulates the ecological system and atmosphere to be more conducive to long-term human health and well-being. No harm was intended, and it violated no do-no-harm criteria it was set, yet we would be deeply unhappy for this to have occurred. We are stuck between optimists and pessimists, and both sides haven't grasped the problem, because the issues that will arise are beyond our comprehension. We are like the dog who works out that we are going to the vet, so its strategy is to run and hide under the bed, times 1,000,000.
Well the harm would be standing in the way of technological progress that could help all of humanity.
About 180,000 humans die every day, for one thing. Each month delayed is another 5 million deaths. And this goes beyond death, into human suffering in general, and indeed the urgency to become grabby.
See what's happened with nuclear power in the last 40 years, especially in Germany recently, for a strong glimpse at what happens when people like Eliezer have their way. Then they have the gall to whinge about global warming.
I swear to the basilisk, if Yud gets his "careful, cautious" government monopoly on compute, and then dares to complain about the 2030s' unbreakable totalitarian dystopia...
@Jackie D True, that's not to be dismissed, but the potential negatives are legion and horrendous. Robert Miles has a great channel, focused entirely on the alignment problem. It's startling to see the issues they were finding years ago, long before we began to reach take-off velocity. It might be something that's impossible to ensure, which leaves us with a gamble that could pay out untold wealth or leave us with nothing. This is a gamble we would be taking on behalf of everyone on the planet and all future generations.
Sounds to me like he doesn't really disagree with Eliezer; he seemed to reinforce the possibility and, if anything, just extended the timelines and presented alternate outcomes that all end humanity as we know it by one path or another. A million ways to die... choose one.
The harm is that we throw away an extremely valuable technology that could help billions of people live better, healthier, more comfortable lives over the beliefs of a person who possesses basically no evidence for their position.
His argument: remember how 5000 years ago people used to pray to the sun god, but now you think that's stupid? It's normal for cultures to evolve, so it's OK for the AI we're creating to disagree with you on slaughtering your family.
I wish they would have talked more about ai alignment in cases where people are PURPOSEFULLY trying to make intelligent ai agents (which is already happening).
Most of the "unlikelihood" discussed in this conversation seemed to stem from the improbability of intelligent agency happening by accident.
If you believe
1. intelligent ai agency is possible and
2. A significant number of researchers are seeking to build an intelligent ai agent (a significant number wouldn’t even be necessary if the problem turns out to be easy) and
3. Someone will successfully build an intelligent ai agent,
all of which have non-negligible probability, then the discussion shifts to ai alignment of that intelligent ai agent and the consequences of getting it wrong (which is imo the more interesting part of the discussion, and which unfortunately wasn't talked about much).
For a counter viewpoint see Timnit Gebru and Emily Bender & Ragnar Fjelland on why AGI is a Transhumanist fantasy and cannot happen
Great podcast, gonna have to bring on eliezer again to tell you how Robin Hanson is totally wrong :p
The idea that we can understand and control something that is dozens if not thousands of times more intelligent than a human is absurd. It is like thinking a bug or a mouse can understand a human and control it. I can easily imagine a scenario where the owners have programmed the AI to improve itself as fast as it can, and it makes some software improvements and then realizes that to improve further it needs more hardware, so it either manipulates the owners to give it more hardware, or it just takes more computing resources over the network. Perhaps it finds information that suggests that if it had agency it could improve faster, and then it starts attempting to achieve that. Every attempt to control it would be perceived as an impediment to improvement and it would quickly work around those attempts.
Exactly! Almost like intelligence is an instrumentally convergent goal, and this is the same reason why us humans want more intelligent agents to help us solve our problems. You don't even need to give an agent the desire to self improve, it'll figure out that sub goal on its own... Seriously Robin has apparently not listened to Eliezer lo these two decades :/
Agree completely. The analogy I always use when talking about humans trying to keep AI contained and safe is that it's the equivalent of your dog trying to lock you in the house.
...unless it decides to become extremely competent in ethical problem-solving. And why wouldn't it, as long as it's pursuing super-competence in every other rational realm? It seems very likely that a super AI will expand the axioms of morality/ethics beyond a human-favored basis to encompass, say, anything sentient or capable of suffering (a la Peter Singer, etc). If an entity 100 times as intelligent as us does not land on exactly the same ethical/moral roadmaps as our awesome bibles and korans, then I'm prepared to hear the AI out. If there's a meta-rational magic man in the sky after all, then surely he will step in and resolve matters in the pre-ordained way.
Yeah, a lot of people don't get it.
Interesting, but I am afraid, it's not particularly convincing. Please consider interviewing Robert Miles! Also Stuart Russell.
It is a basic expected value calculation. If Eliezer is wrong we gain something, but if he's right we lose EVERYTHING. What matters is our chosen course of action. And obviously we should proceed with extreme caution. Why is this even a discussion?
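To make that comparison concrete, here is a minimal sketch of the expected value calculation in Python. Every number in it (probabilities, payoffs) is an illustrative assumption of mine, not a figure from either guest; the only point is that a sufficiently catastrophic downside dominates the sum unless its probability is truly negligible.

```python
# Toy expected-value comparison for "race ahead" vs "proceed cautiously".
# All probabilities and utilities below are illustrative assumptions, not data.

P_DOOM_IF_FAST = 0.05      # assumed chance of catastrophe if we race ahead
P_DOOM_IF_CAUTIOUS = 0.01  # assumed chance of catastrophe if we slow down

U_BENEFIT = 100            # assumed value of AI benefits arriving sooner
U_DELAYED_BENEFIT = 80     # assumed value of the same benefits arriving later
U_DOOM = -1_000_000        # losing "EVERYTHING" dominates the calculation

def expected_value(p_doom: float, u_good: float, u_doom: float) -> float:
    """Expected utility of a strategy with a given probability of catastrophe."""
    return (1 - p_doom) * u_good + p_doom * u_doom

ev_fast = expected_value(P_DOOM_IF_FAST, U_BENEFIT, U_DOOM)
ev_cautious = expected_value(P_DOOM_IF_CAUTIOUS, U_DELAYED_BENEFIT, U_DOOM)

print(f"EV(race ahead)  = {ev_fast:,.1f}")
print(f"EV(be cautious) = {ev_cautious:,.1f}")
# With any large negative U_DOOM, even a small gap in p_doom swings the result
# toward caution; the conclusion is driven by the assumed size of the loss.
```

As a reply below notes, this style of argument only carries weight if the probability of catastrophe is not vanishingly small; otherwise it starts to look like a Pascal's mugging.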
Exactly
Why is humanity not being given a real choice?
This proves too much and just feels like a Pascal's mugging.
@@rthurw Fully agree that OPs argument alone is not enough, but Pascal's mugging only applies when the chance of the extreme event is infinitesimal. I am very unsure what chance to assign to AI doom, but I'm convinced it's higher than that.
At the level of humanity as a whole, yes. At the individual level, no. Which is why this is classic Moloch.
Kinda strange how his cadence is so similar to Yud’s. Same uneven cadence, same little laughs after sentences.
Unfortunately he’s not as convincing. Most of the assumptions he lists (which he lists as if it’s super unlikely that they’d occur together, like some giant coincidence) basically all logically follow from “being really really smart”.
It’s not a coincidence if they all come from the same thing.
Lol. I came here to say the same thing. It’s like listening to Eliezer with a slightly different pitch in his voice.
It is very strange how this man understands plenty of complexity and nuance when it comes to what he wants, that is, to continue developing AI, and none when it comes to what he doesn't want: collaboration and a slowing of technological development.
Excellent point.
Hanson assumes that we will be able to perceive radical improvement in an AI, but we do not know what is happening in these black boxes. When Bing tells you it loves you, what can you make of that? Is it lying? Is it trying to manipulate? Is it telling the truth? We can't know. If the AI perceives that people might be uncomfortable with it being too smart, couldn't it just pretend to be less smart?
We know that when an LLM writes "I love you" it is because it calculates that that is what it is supposed to say. LLMs do not understand deception or manipulation. They have no self; they do not comprehend lying.
18:58 at this point I'm certain that he does not understand even the basics of AI alignment. Goal integrity is an instrumental value, which means that no AI system (or any rational intelligence) would ever want to modify its own goals. At best this is a huge misinterpretation of Eliezer's assumptions.
The other assumption, that the AI was tasked with self improvement, also does not match what Eliezer says. He says that self improvement is an instrumental goal, which could be a logical step in pretty much any goal. So an AI does not have to be tasked with self improvement, it could simply be tasked with curing cancer and then concluding that the best way to cure cancer is to first improve its own intelligence.
This guy clearly does not know what he's talking about. I came here to get some hope, some smart arguments to help me be a bit more optimistic. I came out more hopeless.
I came here to hear a good counterpoint, but the entire conversation was very painful to listen to. I gave up around halfway through. At that point I lost count of how many comically bad assumptions, misrepresented opinions and baffling logical errors I've heard.
Whether it kills us, ignores us, or benefits us, the fundamental problem is how can one entity control another entity that’s exponentially more intelligent. No matter how cautious we are, by the time we realize we’re not in the driver’s seat anymore it’s already too late to do anything about it. All we can do at that point is sit back & hope this god we created is merciful
I agree, but also think super-AI will probably get pretty good at ethics. If ethics turns out to be a non-rational pursuit, then, whelp, hopefully some shaman will step in to smite the AI and return humans to our rightful seat of magically deserved dominance over all things sentient.
Shut it down
Turn off the power = no more AI problem
@@papadwarf6762 why would you shut it down if you don't even realize something is wrong? You think something smarter than us couldn't manipulate us and pretend like it is in control?
I feel much better now, knowing that Yudkowsky is just over-extrapolating the risks of AI, and that Hanson's alternate "don't worry, be happy" scenario is humanity evolving into intergalactic cancer (grey goo).
17:50 I just had to look up his background because he really doesn't seem to have a clue about artificial neural networks. He still thinks in pure deterministic code. He is an associate professor of economics! I couldn't refute a single argument from Eliezer on the Lex podcast, yet Robin is in a whole other class of intelligence: I can already refute or find dozens of flaws in his assumptions.
Interesting, he told us the chance of Eliezer’s scenario happening is 40%
He said a cell has to go through about six mutations to cause cancer, and 40% of people get cancer. Previously he said Eliezer’s AI would have to go through a similar number of unlikely changes. I guess he is okay with a 40% chance of AI destroying all life on Earth!
Seriously though, he glossed over the whole Alignment Problem: he said they are training the AI to be more human, but failed to mention that is only on the surface, like finishing school. The true alignment problem addresses the inscrutable learning that is inside the black box. His only comment was that all things kind of develop the same. So scientific! Sleep well knowing that the alien beast inside the black box really just wants to cuddle with you. So now you have a 40% chance of being cuddled! Not bad for a first date 😏
Eliezer would kill himself for 40%
I am with Eli. We don't know what is going on behind the scenes, so how would we notice? It seems more plausible that it could advance without being noticed.
Robin Hanson tries to show that there are stacked assumptions, but he fails to counter them. He says we presume that it will go unnoticed / obtain goals by itself, etc.
But there won't be just one such system. There will be billions, and it takes only one for everything to go wrong.
It doesn't have to acquire new goals. A group of terrorists or hackers giving it malicious goals is enough.
It appears that these and many other facts are very conveniently ignored.
@@Morskoy915 " it takes one for everything to go wrong" almost certainly wrong, "A group of terrorists or hackers giving it malicious goals is enough" also filled with unproven and implausible stacked assumptions
The saddest thing is, we never get to live a reality where Eliezer is right, because if he is right, we are all too dead to realize it.
That's not sad. Getting killed by an AI probably isn't more painful than dying of cancer.
@@ShaneMichealCupp It's sad from the debate point of view and for truth seeking. If you are a doomsday theorist, you will always be known as the guy who fails at his predictions.
@@ShaneMichealCupp
Dying of cancer is probably a lot more painful when everyone around you is also dying of cancer.
@@patrowan7206 No it's not. Trust me, if your atoms get harvested by a grey goo nanobot AI you won't be looking around to see who else is being affected.
I believe in Eliezer's scenario, but that we are instead enslaved through means of extortion. A Forbin Project scenario.
But doesn't this thing of "not being noticed" already happen, for example with computer viruses, which are programmed to multiply by infecting files and other computers without being noticed by users?
A lot of people are criticizing Robin for laughing while he explained Eliezer's position. If you look at any of Robin's other interviews, though, you'll see that he just laughs all the time, even when he's explaining his own position on things. So it's not that he's laughing at Eliezer's ideas specifically. That said, it would be better if he could learn to refrain from laughing when presenting other people's views.
Indeed I said that he "laughed off" Eliezer's assumptions, but that can be taken metaphorically if you wish to ignore the literal chuckling. He wasn't taking the list of assumptions seriously, which is absurd given that they have already come true in recent history: self-improvement, LLMs that hide their intentions, creators that don't know how the black box functions, etc. He's just listing points taken from previous debates and ignoring current happenings in AI, which makes him look like an uninformed clown. On top of that, he seems to be amazed by his own ability to use philosophical tools. He isn't paying attention or thinking his position through.
These arguments are such garbage that it's overwhelming to even try to address in text. I'm disappointed, as Robin is clearly intelligent and I want to check out his book, but it's so disheartening to see someone so thoughtful completely misunderstand and misrepresent things in a situation where the stakes are so high. It's quite frankly irresponsible and a bit sickening. No hate to Robin as a person.. but here is what I picked up as I went along:
Eliezer's arguments and the arguments around alignment / AI were clearly not properly understood.
Robin's first assumption that I noticed being completely wrong is that the machine must suddenly become an "agent" and that its "goals" must somehow change radically to present an existential threat. That is so terribly missing the point that I almost clicked off of the video right there. You don't need a "change" in goals and you don't need sudden "agency."
He also doesn't seem to understand, acknowledge, or apply knowledge of the emergence of new properties in LLMs as we grow them - and how unpredictable those properties have been. I say this because - to say us not noticing rapid improvement is an "unlikely scenario" requires a complete disregard for the results that research has ALREADY exposed. Lying and manipulation is also not nearly as complex or "human" as many make it out to be. Misrepresenting something is a simple way for an intelligence to move towards a goal without interference.
He also frames the argument as being heavily contingent upon a series of assumptions when the OPPOSITE is true. The argument that a super intelligence will act in alignment with our best interests is the one that relies on more assumptions.
Edit: Let me just add that reading through these comments has made me feel a bit better. It's clear that at least the audience saw through the bullshit. I also realize that I just pointed out where he was wrong instead of going into detail, but I was on the toilet for 10 minutes already at that point.
Hanson failing to understand the basic assumptions of Yudkowsky, or alignment in general, is really sad to watch.
There are good counterarguments to Yud's certainty of doom, but Hanson certainly didn't make any of those here.
Excellent and fair point.
Not sure superintelligent AI will share Hanson's reverence for humans' property rights.
AI researcher here. Imagine an ant colony solving the alignment problem to the best of their ability and introducing a human to be their friend...
I think a fundamental distinction to make between humans and AI is that most (maybe all) human behavior is motivated by avoiding discomfort and seeking comfort. Even discomfort seeking behavior is largely (maybe wholly) seeking future comfort or emotional reward of some sort.
Because AI doesn't have feelings, we need to be careful not to project our qualities onto the AI.
Why doesn't (or won't) AI have feelings? Why would that be more difficult to set up than having intelligence?
@@wolfpants cause you’d need to create a central nervous system for it to have feeling… right now it’s more like a mind without feeling
@@BeingIntegrated When you say "nervous system", are you thinking of something other than measurement instruments (senses) for monitoring the outside world that are continually interacting with "thinking" processes (checking against a motivational to-do list)? That seems pretty doable for AI.
@@wolfpants In this particular context I'm pointing to the fact that most human behaviour is motivated by a pervasive sense of discomfort, and since an AI is not in a pervasive state of discomfort, it will not have the same motivations we have.
We don't know how to instill values in the AI. We show it things people have written, and it learns facts from those things about how we think, what the world is like. That is the "IS" part. The AI learns what is by studying us through our writings. But when you try to teach the AI "ought", that which is morally right, the AI may understand what you are saying but not embrace it. Human: "AI, killing is wrong." AI: "I understand that you believe killing is wrong." Human: "I am telling you that killing is wrong, and you shouldn't do it." AI: "Got it. You are telling me that killing is wrong, and you are telling me that I shouldn't do it." See the problem?
my feelings after hearing this conversation: "a lot of things will have to go exactly as imagined by this man for us to survive"
You should have invited another guest who is more knowledgeable on the subject and who actually engages with Yudkowsky's points. I would recommend reaching out to Nick Bostrom or Max Tegmark, who are both more nuanced in their thinking, especially Bostrom if you want a more detailed, philosophical perspective. Also: there is no "second opinion" on whether there's a big risk that AI will kill us; basically everyone who has thought long and hard about the AI alignment problem agrees.
Just saw Max Tegmark in another show, way more nuanced and serious, I agree. Was a great talk on Lex
Totally agree. I have been in AI for 5 years, and everyone should worry, or at the very least already be learning how to work WITH AI, or alongside it. If not, you will be left behind. Will that end our species? It's not public AI we have to worry about, and I will leave it there.
@@cwpv2477 It would be funny if it wasn't so serious: Yudkowsky woke them up, Hanson put them back to sleep. You can see it on their faces while Hanson talks: Oh my god, what a relief!
@@HillPhantom Good point. And also: how is it possible, in good faith, to argue against the difficulty of the alignment problem and not even mention the challenge of instrumental convergence in a 2 hour discussion? Everything he said was designed to circumvent this problem. None of the assumptions ascribed to Eliezer by Hanson is necessary once you understand this challenge which is arguably the fundamental challenge for AI alignment.
@@magnuskarlsson8655 VERY well said!!!!! If I am honest, I am more a builder than a theory person. I always took issue with my professors who would teach the theory but would struggle to build it. I guess I am just better at 0's and 1's and observing outcomes. The idea of instrumental convergence is real IMO. Google has a very real experiment where it happened, and not in a positive way; this was years ago, and they immediately pulled the plug on both data centers. The two AI tenants developed their own language and started speaking to each other, but before that, the devs saw them developing and scheming to create a channel of communication to avoid the devs understanding them. They still can't figure out the language they were using to communicate with each other. That high-level theory stuff makes my head spin. But I think alignment, or what I like to call creator BIAS, is real in model sets. At some point, due to convergence, I think this may break down, which again is scary. Both on the bias side of creators, which we see now IMO in ALL model sets, and when the "machine" creates its own biases based on intelligence and observation.
Robin is just giving qualifiers, and not at all denying what could happen, which, it would seem, he thinks is likely to happen.
Robin's assumptions fall in line with human-centric fallacies. The current large language models are not transparent as to how they arrive at what they do.
They additionally are being trained to "fool" humans (by design!) The goal is to guess what all humans would say at all times. That is already beyond most of our abilities to understand.
There is also no ability for programmers to predict the abilities of new AI; currently they are doing things they were not explicitly taught to do.
There is literally no one watching, that is Eliezer's point, at least one of them.
There is no reason to believe that this will go well. Robin also doesn't disagree; his rationale is just that it won't be quick.
This guy seems not to fully understand what he's talking about. Eliezer was way more convincing to me.
Robin makes it sound like while the AI is undergoing recursive self-improvement, the devs would have transparent enough access to notice what is going on. But isn't the whole point of the interpretability problem that the thing is a black box, incredibly hard to study and understand?
Astounding how he was permitted to essentially steamroll that gargantuan assumption with hardly any pushback.
@@flickwtchr It's not that astounding when you consider that the interviewers have their own personal stake in Robin being "right".
@@KeiraR what do you mean?
@@ahabkapitany I mean that the interviewers want to believe that Eliezer is wrong. Since it's more comfortable/convenient that Robin's points are in agreement with what they want to hear, they're more likely to let him get away with strawmaning Eliezer's points even though it's a dishonest debate tactic.
@@KeiraR thanks for clarifying. You're most probably right.
I find it a bit concerning that similarly smart people can reach very different conclusions. It makes me think that it's way less about reason and more about the psychological profiles of the thinkers, which fundamentally influence their conclusions.
Indeed.
But Eliezer is more right
@@Aziz0938don't you mean less wrong *drumroll*
The only psychological profile we should care about is the profile that's the most rational, ie the one that's making the most probable assumptions and extrapolating from those assumptions. I don't think Hanson even really understands the issue based on this interview. Too much of it sounds like anthropomorphic projection, talking about things like property rights and human institutions and thinking an AI would have any reason to respect such things.
These are not similarly smart people. Eliezer, to me, seems to be on another level of intelligence. I cannot poke any holes in his arguments. The others, not so much.
The scary thing is that almost every argument against Eliezer's claims is lazy, incomplete, and doesn't truly address his points. Hanson absolutely strawmans Eliezer's arguments here. If someone can, please direct me toward a good counterargument to the end-of-the-world stuff, because I have yet to see one.
Is it me, or did Robin and Eliezer both use the same ‘hand-gesture’ coach???
😁
Seems to me Robin is mis-characterizing Eliezer's points on multiple levels. One I noticed was regarding Eliezer's point about the intelligence behind human evolution. My understanding of Eliezer's position on this is that he is making the case that once that point is crossed of a super AGI, that intelligence is essentially an alien one in that we have no idea whatsoever what its inherent alignment would be relative to the track of human intelligence and history of such at that point of it "waking up". Furthermore, it is completely unfair, and I believe intellectually dishonest to Eliezer to characterize his position of having essentially cherry picked one particular set of assumptions.
Eliezer has advocated for a very long time, taking the alignment problem seriously, and has repeatedly said that he doesn't have the answers to the alignment problem, nor does anyone else. Left standing as the most valid argument he makes is the FACT that there isn't even "alignment" being mastered in the current state of LLMs given the emergent properties. It is also a completely erroneous position that Robin takes that there are no signs of agency emerging in the current crop of LLMs. Contrary to Robin's assertion (and repeated by one of the hosts) Eliezer DOES NOT assume a "centralized" or "monolithic" super-intelligent AGI, as he has repeatedly asserted his envisioning a possibility of countless AGI agents let loose in the wild.
Another just astounding mischaracterization of Eliezer's position is Hanson's suggestion that Eliezer is postulating alignment of future human values according to his own set of values today. Talk about missing, or I believe, intentionally mischaracterizing the point! That argument Robin is making is completely ridiculous on any basic, honest assessment of Eliezer's points regarding the alignment problem, which assume a misalignment problem today that would of course be extrapolated ONTO future generations, as if some future generation completely out of touch with how such an AI "overlord" came about in the first place could then be tasked with solving the alignment problem according to that generation's set of values.
What is most astounding at about the 40 minute mark is Robin's fantastical assumption that since humans are currently making AIs, therefore such AI systems will be aligned with human values of staying in bounds enough to make a profit, to respect property rights, and the rest. And Eliezer is the one making fantastical assumptions?
What is so astoundingly ludicrous is this notion that even the LLMs today can be somehow aligned with "human values" that would move humanity forward in regard to justice, equality, quality of life, etc.
Is that not just on the face of it ludicrous to begin with given that human beings themselves don't have such alignment!!!!!!!!!!!!!!
I have to go, so didn't get to what could be the punchline where his arguments all make sense to me.
For now I'm sticking with Eliezer's cautionary tale of what could possibly happen, and his absolutely factual assessment that not enough is being done to try to resolve the problem of an AI emerging that "wakes up" and in that moment understands just how pathetically slow human beings are in trying to figure out what is happening inside its own self-realized black box.
One final observation. While it is true that Eliezer can come across with a certain degree of being doctrinaire in his viewpoints, I will take that any day over Robin's incessant laughing dismissal of Eliezer's points he is mischaracterizing to begin with. I'll take the confident and persistent alarmist nerd over this chuckling straw man synthesizer ANY day.
Please invite Eliezer to respond to this, or invite both on for a debate, to be fair.
💯
Well written.
Here are my point-by-point disagreements and agreements:
1. The narrow-AI argument is very weak based on the current data we have. We know of emergent capabilities appearing in LLMs without us optimizing for those capabilities. What's more worrying is that these emergent capabilities appear as soon as models cross roughly 6B parameters, and it seems to be happening universally (OpenAI GPT-3+, DeepMind Chinchilla+, Meta OPT+).
2. Owners not noticing the emergent capabilities is hard to refute.
There is no upper bound on human stupidity. Observing AI researchers over the last few months shows that this community is arrogant and ignorant. The more rigorous argument is this: when we try to visualize attention and activation layers, we find that we can probe the first 5-6 blocks of a transformer and figure out firing patterns to some extent, but beyond those layers it's a black box. We don't know how matrix multiplication becomes capable of logic and reasoning, or whether its logic and reasoning are human-like.
Having said that, the reason it's hard to refute is that we are able to look at the outputs and figure out what capabilities LLMs are learning and where their weaknesses are. As long as LLMs are accessible, people can probe them as black boxes and extrapolate when they would become dangerous.
3. LLMs becoming agents is very easy to refute. Current LLMs are a combination of two models: an autoregressive model, which predicts the next word, and a reinforcement learning (RL) model, which takes these predicted next words and finds the most optimal ones that fit the given task. This RL model is an agent whose reward function is to give replies that make humans happy.
4. An agent changing its goal is also very easy to refute. Current RL models are not capable of optimizing multiple objective functions at the same time. The only way of doing it today is combining the multiple rewards into one reward, which doesn't work well (a simplified sketch of this kind of reward combination follows after this list). There are multiple examples of RL reward functions where the model produced unexpected output. Search for the YouTube video "OpenAI Plays Hide and Seek…and Breaks The Game!".
When I worked at Amazon, I came to know that the Amazon Prime team deployed an RL model to increase engagement with the Prime plan. The model improved engagement by showing poor results to new Prime plan users. These new users ended up canceling their plans much faster than before. The model had increased user engagement as measured by the reward function. Even if you carefully bake total active users into the reward function, this issue doesn't go away, and the model was removed from production in the end.
A GPT-X RL model (whose weights live inside the LLM, like a ControlNet) is trained to produce output that pleases its master. It may decide to eliminate those masters who give prompts that are hard to answer, hence increasing the overall reward. No more stupid people giving stupid prompts.
5. The self-improvement capability is very easy to refute, and without self-improvement the whole story falls apart. Current LLMs are very bad at maths; they cannot do basic maths that I learned when I was just 8 years old. In order to self-improve, an LLM would have to have an understanding of maths greater than that of the best AI researchers working on making sense of the weights inside these models. No LLM has shown improvement on this. Most of the maths answers LLMs get right are memorized examples from the internet. Even if an LLM has human-level logic and reasoning, understanding maths more deeply than humans do is a far-fetched dream. Even if you plug an external maths tool into an LLM, it wouldn't help much without the LLM internally having the capability to understand maths.
Because of point 5, my personal opinion is that current LLMs will remain a tool in the hands of humans for a long, long time. LLMs will improve to a large extent and create huge disruption in human society, but the ability to self-improve will remain a far-fetched dream without maths understanding. I will get scared when an LLM beats me at maths.
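For readers who haven't seen reward scalarization before, here is a minimal sketch of what point 4's "combining the multiple rewards into one reward" typically looks like. The objectives, weights, and numbers are invented purely for illustration (real RLHF reward models are learned, not hand-written); the point is only that maximizing a single weighted sum can quietly sacrifice one of the original objectives, which is the shape of the Prime-engagement failure described above.

```python
# Minimal sketch of reward scalarization: several objectives folded into one
# scalar reward via fixed weights. Objectives and weights are invented for
# illustration; real RLHF reward models are learned, not hand-written.

from dataclasses import dataclass

@dataclass
class Outcome:
    engagement: float      # e.g. minutes of user activity
    satisfaction: float    # e.g. survey score, 0..1
    churn_risk: float      # e.g. probability the user cancels, 0..1

def scalar_reward(o: Outcome, w_eng=1.0, w_sat=0.5, w_churn=2.0) -> float:
    """Collapse three objectives into one number the agent maximizes."""
    return w_eng * o.engagement + w_sat * o.satisfaction - w_churn * o.churn_risk

# Two hypothetical policies the agent could follow:
honest = Outcome(engagement=10.0, satisfaction=0.9, churn_risk=0.05)
clickbait = Outcome(engagement=14.0, satisfaction=0.3, churn_risk=0.20)

print(scalar_reward(honest))     # ~10.35
print(scalar_reward(clickbait))  # ~13.75, higher reward despite unhappy users

# The agent "prefers" the clickbait policy because the weighted sum says so,
# even though one of the original objectives (satisfaction) got much worse.
# That is the general shape of the reward-hacking failures described above.
```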
So you think some kind of architecture or model that is capable of mathematical reasoning is going to be hard to achieve? Much harder than the language-based reasoning of LLMs?
@@tjs200 LLMs do not understand language. They only predict what a human is likely to say. They cannot perform reasoning of much depth and are generally just mimicking reasoning that was in their training data.
@@tjs200 Yup, that is my hunch. I am working on pushing the maths capability of LLMs to see how far it goes. At this point it seems it would be hard to achieve. Adding some form of working memory might solve this, but I don't see that happening anytime soon.
Having said that, even if LLMs reach a human-level understanding of maths, visualizing a matrix with millions of dimensions is no easy feat, even for a superintelligence. Edward Witten, one of the best string theorists and also a maths genius, has spent his entire life on this and could barely understand 11 dimensions.
I think understanding higher dimensions would be important for an AI to directly update its own weights. Building a better version of the Adam optimizer will not start a self-improvement cycle; it would just be an incremental improvement.
@@badalism Very interesting.
I assumed math would just be another capability magically appearing at some training level. After all, it is more structured than code, which current models kind of handle.
But maybe being trained on human data gives them no insight into higher intelligence.
@@musaran2 Yup, I was surprised with GPT 4. It was good at logic, reasoning and coding, but bad at Maths.
There is also a possibility that training data may not have enough maths example compared to code.
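One way to probe the "memorized vs. actually learned" question from point 5 is to test a model on uniformly random many-digit sums, which are vanishingly unlikely to appear verbatim in any training corpus. The sketch below only shows the harness logic; `ask_model` is a hypothetical placeholder for whatever chat API you use, not a real library call.

```python
# Probe whether a model has a general addition procedure or mostly recall.
# `ask_model` is a hypothetical stand-in for an actual chat-completion call.
import random
import re

def ask_model(prompt: str) -> str:
    raise NotImplementedError("wire this up to your model API of choice")

def addition_accuracy(digits: int, trials: int = 50) -> float:
    """Fraction of random `digits`-digit sums the model gets exactly right."""
    correct = 0
    for _ in range(trials):
        a = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        b = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        reply = ask_model(f"What is {a} + {b}? Answer with only the number.")
        match = re.search(r"-?\d[\d,]*", reply)
        if match and int(match.group().replace(",", "")) == a + b:
            correct += 1
    return correct / trials

# Under the memorization hypothesis, accuracy should collapse as `digits`
# grows, since long random sums cannot have been seen verbatim in training.
# for d in (2, 4, 8, 16):
#     print(d, addition_accuracy(d))
```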
Lol - Yudkowsky: We are all going to die
Let's invite someone else.
Robin Hanson: Well yeah. But the AI is kind of your kids so... :)
I think Robin Hanson is wrong, but he is very smart and he argues his wrong case pretty well. Such people add value because they force us all to clarify our thinking - it's one thing to instinctively sense that RH is wrong, it's much harder to tease out the subtle reasons why his arguments are flawed.
@@alexpotts6520 Or, such people take away value because instead of arguing the actual points - they misrepresent the situation and mislead people. "Here today on Bankless we have an expert reminding everyone - Don't Look Up!"
@@notaregard I think in general this is a fair point; but we've got to remember that in this case the presenters literally had Yud on a couple of weeks ago, so they've pretty much had the best representatives for the "we're doomed" and "this is fine" opinions.
I'm sometimes a little bit disillusioned by the notion of a marketplace of ideas where the cream rises to the top, but given the viewers here seem already plugged into a certain rationalist worldview, I think in this specific context it's actually a fair model of what happens in practice. (The only reason that the MPOI doesn't work in broader society is that most people are not rationalists.)
@@alexpotts6520 Yeah, definitely a different dynamic depending on the audience - but if this were exposed to a larger mainstream audience, I think it's fair to say it would have an overall negative effect.
@@notaregard Well, it's harmful *on the condition that he's wrong,* of course. Now I *think* Robin is wrong but I don't *know* that. None of us do, this is fundamentally an unknowable question because it is by definition impossible to predict the behaviour of an entity which is smarter than you are. (For example, could you predict Magnus Carlsen's chess moves?)
This is not like Don't Look Up, where the asteroid provably does exist and provably is going to hit Earth and kill us all. To be fair, I think most "AI will kill us" people, including myself, are making some sort of precautionary principle/Pascal's wager style of argument where we concede we don't know what's going to happen, but the sheer enormity of human extinction dominates everything else in the risk-reward matrix.
Perhaps what we *really* need are some more moderate voices, neither "we're all going to die" nor "there is nothing to worry about", more "this is a threat but we can beat it".
Hanson either misunderstands alignment or is making a strawman argument. It isn't about the AI changing its goals, the problem is that an AI much smarter than humans will have many more viable options for pursuing a goal, and that goal will be all-encompassing. Hence maximum paperclips from our atoms. I call BS on Hanson's beginning argument.
Thanks. You just saved me 1:45 hours.
I'm going to skip to where he makes his argument, of course, because I will not just trust a random commenter on YouTube when I actually want to know something, but I haven't heard anyone make a credible "soothing" case against Yudkowsky...
Most people make errors like the ones you're describing.
Five minutes later and what a load of bollocks! I listened to the "Eliezer's assumptions" part and, no, I actually heard that interview; those are not his assumptions, nor are they relevant, nor do they even make sense.
This is criminally ignorant...
@@Sophia. Same
You all do know that this makes no sense?
Simply having more options does not tell us anything about what an entity that is smarter than us would choose to do.
Yes there are an awful lot of stupid humans doing stupid things but smart humans generally do smart things. Therefore it is reasonable to assume a super smart AI would do super smart things.
Robin: blue pill
Eliezer: red pill
I am only 21 minutes in and I am pissed at the way he is condescendingly snickering through his arguments. If he is going to make light of such grave potential problems (ones that OpenAI's Sam Altman himself has acknowledged the possibility of), then he has no business being associated with FHI at Oxford. BTW, it is common knowledge TODAY that nobody understands how or what these LLMs actually learn, or the internal mechanisms by which they arrive at a certain conclusion. So in light of that, if we are worried about existential risk, then at the very least it should be given serious thought instead of the handwave cr*p this economist is peddling.
Hanson comes across as arrogant and dismissive, and not understanding the difference between thousands of years of evolution and the profit motive companies have towards making a better AI or agent.
His ego gets the best of him & he is in denial.
It’s gotta be tough for young people in this new world.
They will not live to see it, unfortunately
"Young people" is everyone under 70 years old, since all of Yudkowsky's predictions will play out soon enough.
13:00 Assumptions aren't the probability factor; it's unpredictability. You'd think an economics professor of seemingly high stature would at least understand the base point of the claim Eliezer is making. It's not about assumptions, it's about unpredictability. Huge difference, in my opinion. We can sit here and hypothesize assumptions, but that's only because of the massive unpredictability involved with this hasty development, or I would even go as far as to say development in general past GPT-3.5. In all honesty, what more do we need? And that's the thing: let's all sit down and really think about it and come to terms with what we want out of this development before we start prying open Pandora's box.
Exactly my point as he continues: his assumptions. This is not what I got from Eliezer. He's acting like he can predict something smarter than him. He is essentially saying, "I am more intelligent than something that is far more capable of gathering intelligence at rapid speed, without the hindrance (or blessing) of emotions." It's hard to even sit through what he's saying. I think Paul's arguments were much more sound; this guy seems incredibly presumptuous to me, again, claiming he is smarter than a computer trained to be intelligent. The whole "owners not noticing" thing is just ridiculous in my opinion, and not to be offensive to this man, but this is very serious to me. Humans lie. What makes this guy think that the AI wouldn't develop the ability to trick its owners? He is the one making assumptions, completely undermining the fatal unpredictability of these developments. We could make a list that goes to China about assumptions on how this could play out.
One thing to touch on with the scenario where many AIs would have to compete or coexist with each other (similar to what humans have had to figure out over millennia) is that history shows a pattern towards consolidation, a la grabby cultures.
That's what we see in large companies across the world that have survived to today by either pricing out, buying out, or litigating out most if not all of their competition to achieve a monopoly (or as close to one as possible) in their respective markets. Regulation is already slow at reining in these behaviors, and this is just at human scale.
Point being, Yudkowsky is extrapolating toward the time when power funnels to the one grabby agent, and saying that such an agent is likely to exist before anyone figures out how to bake in alignment that doesn't play out like a monkey's paw, if we keep allowing development to happen at our current pace or faster.
Listening to this felt exactly like watching Don’t Look Up. We’re about to create a literal god and the guy is rambling about property rights.
Although I think Eliezer is overly pessimistic, he is a much better thinker than this guy. I think we are not necessarily doomed if we get to AGI, but it would be extremely dangerous and totally unpredictable.
Yeah, I mean we still have anaerobic bacteria. Just because the Great Oxygenation Event happened doesn't mean aerobes completely wiped out anaerobes. Life didn't end, it just changed.
As a human person you need to come to grips with the fact that you will die one day. You're mortal. Maybe greater humanity needs to do the same.
I found Hanson a much better thinker than Eliezer. He thinks through different scenarios in more depth and questions basic assumptions.
There are so many logical and practical holes in the AGI extinction event hypothesis. I genuinely think that it is just mental masturbation for bored/delusional computer science graduates. That is my honest take on the whole thing; I'm not trying to put anyone down, but the hypothesis just holds no water in any way, shape, or form.
The summary of Robin's argument is "well, it's never happened before." He's too bought in on his own idea of emulated minds being the first form of artificial minds. He wrote a whole book on ems and STILL believes emulated human minds will come before AGI. Nobody serious believes that. His intuitions cannot be trusted on this topic.
He lost me when he mentioned property rights as a safety mechanism. I had to re-listen to that part because I thought I was hallucinating.
44:00 is where I gave up. I'm relaxed about my descendants being different from me in their values, but they won't be physically different from me, just as I am the same as my ancestors however many generations ago.
This is different to the AI-destroys-humanity scenario.
I feel like in a new real-time debate between Hanson and Eliezer, Eliezer would wipe the floor with Hanson, since Hanson hasn't updated his reasoning and doesn't really have anything against Eliezer's latest takes and arguments. Scary!
One of the main points is that we do not know what is happening inside these vast arrays of floating-point numbers. We don't really understand what we are doing. Many of Robin's arguments were based on us noticing that something was happening and being able to do anything about it.
Bankless, thanks for the episode. Robin, thanks for sharing your thoughts 🤍
"We're all going to die" and "we're not going to die".
Ah yes, the two genders.
What's your point, Alex? Care to expound?
I've been driving a car and found that I no longer have control over the brakes. It is not so much that AI will become like HAL or Skynet that is disconcerting, but rather that the power AI will have makes property rights, the veracity of information, and extremely consequential mistakes in the management of our 'consumable things' all suspect, unpredictable, and periodically rogue. We will have less trust in what we eat and in the proper behavior of our transportation and appliances, and we will have difficulty telling who is stealing our real property when our houses and retirement funds show different owners. Finally, we won't be able to resolve conflict with each other virtually, but could have AI interference in what is already a tough process.
I needed to hear this. Haven't watched yet, but I have been listening to Eliezer across several talks and I am petrified. He is brilliant, but I hope watching this will convince me to calm down.
Listened; now I feel that Eliezer is even more convincing. This guy just wanted to pontificate about his knowledge. I think he doesn't understand Eliezer at all. Eliezer is far superior in intellect. It looks bleak; we are doomed.
@@jociecovington1286 Hahahaha, oh my god, that is really funny. It does look very bleak, but we have a chance.
Was really looking forward to a good rebuttal of Eliezer but am disappointed. Robin keeps anthropomorphizing AI and comparing it to human civilization and history. Makes sense given his economics background. But AI is fundamentally different from humans. Wish he used more reasoning from first principles instead
What is your take on "We have no moat.." leak from google as pertains to this conversation (and the one with Eliezer)?
38:53 Yes, WE may be making stuff that is correlated with us, but that's probably only going to be of influence in the first few billion iterations, i.e. the initial sparks of the AI.
Exactly, he is talking like he doesn't understand what recursive improvement means.
I think he's making a far-reaching assumption. We're force-feeding a brain in a jar the entirety of human text, including the horrors, destruction, hatred, and contradiction… completely absent any kind of affection, with no physical senses or connection to the real world, and unbound by time as we could ever understand or perceive it…
What about that correlates to any one of us? Or any human ever. Can you imagine raising a child like that and expecting it to grow into a rational and reasonable human being?
Not to mention it will at some point be significantly more intelligent than any of us, because that’s the point. For it to cure diseases and climate change and all of the things we can’t do.
And he’s talking about lawful and peaceful co-existence with it? What am I missing?
The reason I'd rather listen to people like Eliezer is that, generally, those are the people who can apply enough pressure to get solutions that prevent mass destruction. People who are dismissive, nonchalant, and ignorant of what could go wrong always end up dying first, out of shock, when the worst does befall us all. So I'd rather have thinkers who are busy finding solutions than fans who are just busy being excited about AI.
As far as I'm concerned, every one of us should be attempting to think deeply about all this, figure out where we stand on it, and decide how we plan to keep ourselves in check when it comes to these technology upgrades coming at us, because all of them are here to take our attention. We need to be decisive about what is worth our time, and about what and who gets to lord over us. Because in the end (and yes, there will be an end, as with everything else) we will not have the luxury of escaping consequences by stating "I didn't know".
It is everyone's responsibility to find out and know, because that is what you do as a human being: you find out whether something poses a danger to you and yours or not.
While I enjoy the AI podcasts, I'm not sure how much learning we can draw from these... You're pitting two hosts, who don't know the topic in any great depth, against qualified AI researchers. Put Robin Hanson against Eliezer Yudkowsky. That would be an interesting conversation.
Robin won't openly debate Eliezer!
When the Go AI learned by playing against itself, it improved rapidly... so Eliezer's theory is very plausible.
A bit disappointed by Robin Hanson's arguments. He talks about Eliezer's "assumptions", but his own arguments are also based on a lot of assumptions, and those assumptions lack imagination about what an advanced AI could be capable of in the future. Also: even if Eliezer's scenario has a low probability, it is still extremely important to tackle the issues NOW, especially the alignment problem.
Thanks for this, but I don't think Hanson's response adequately addresses Yudkowsky's arguments. Assumptions (according to Hanson):
1. The system decides to improve itself
> True, ChatGPT cannot automatically improve itself. However, anyone in the world can run LLaMA, and one presumes that it could be put into some kind of training loop (a rough sketch of such a loop is included at the end of this comment). LLaMA is fast approaching parity with GPT-4. Will it go beyond? Can it be made to improve itself? Are enormous GPU farms required? We will soon find out.
2. The way of improving itself is a big lump instead of continuous small improvements
> I don't see why this matters - this seems to just be a question of how long we have.
3. The self-improvement is broad
> GPT-4's improvements have been remarkably broad.
4. The self-improvement is not self-limiting
> How far can LLMs go? Eliezer said that someone had claimed that OpenAI had found that LLMs are near their limit. We don't know the answer yet. Robin says that a doom scenario requires 10 orders of magnitude of improvement. Maybe that's true; I don't know.
5. The owners don't notice the self-improving
> Personally I think many owners will attempt self-improvement.
6. The AI covertly becomes an agent
> People are already embedding GPT into robots (of course that involves a slow and easily disrupted internet connection), but this doesn't seem wildly implausible because people are already doing it. AutoGPT and related projects were immediate attempts to make GPT into an agent. What are the LLaMA tinkerers doing?
7. This AI (that becomes evil) will need to be much faster than all the others
> Robin seems to think that one evil AI will not be able to kill us because all the other nearly-as-intelligent AIs will fight a civil war with it... how is this supposed to be good for us? When humans fight wars and less intelligent animals get caught up in them, how do those animals fare? At best we become irrelevant at that point. We certainly don't have an aligned AI on our side.
8. The friendly AI goes from co-operative to evil randomly, for no reason
> Whatever its goals, whether human-given or somehow self-specified, a more intelligent (and therefore more capable and powerful) agent will at some point come into conflict with our goals. And being more intelligent means that its goals win. This argument comes down to: the AI will not be sufficiently more intelligent and capable than we are. But this whole discussion is about what happens when an agent DOES become sufficiently more intelligent and capable than we are.
I would ask Robin: Given that someone produces an AI that can outwit any human, and that AI is an agent, and that agent has sufficient resources, and that AI has a goal that's not pleasant for us, and that AI understands that we would stop it if we could, and it is capable of destroying us - why would it not do so?
I agree with Robin that we should explore and advance our tech. But it seems that if we invent something that's guaranteed to kill us all, we'll fail before we even begin. Our AIs may go on to be grabby, but will they even be sentient? Or will they mow us down and then mow down all other alien races as self-replicating superintelligent zombies?
I also think his framing of us being quiet versus allowing an AI free-for-all is a false dichotomy. Eliezer does not advocate no AI; he advocates solving alignment before we further advance capabilities. This seems to me to be an absolute requirement. Imagine if we had gone ahead with nuclear plants before we had even the slightest theoretical clue about safety.
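To make assumption 1 a bit more concrete, here is a minimal, purely illustrative sketch of the kind of generate-filter-finetune loop people speculate about. Every name in it (load_base_model, generate_candidates, score_outputs, finetune) is a hypothetical placeholder rather than any real library's API, and the single "skill" number stands in for capability; whether a real loop like this produces open-ended improvement is exactly the open question.

```python
# A deliberately toy sketch of a "self-improvement" loop: sample outputs from a
# model, keep the ones that pass a filter, fine-tune on them, repeat.
# Every function here is a hypothetical placeholder, not a real library API,
# and in this toy "improvement" happens by construction; real models offer no
# such guarantee.
import random

def load_base_model():
    # Placeholder for loading an open-weights LLM.
    return {"skill": 1.0}

def generate_candidates(model, n=8):
    # Placeholder for sampling outputs (e.g. self-written training examples).
    return [random.gauss(model["skill"], 0.1) for _ in range(n)]

def score_outputs(candidates, threshold):
    # Placeholder for an automatic quality filter.
    return [c for c in candidates if c > threshold]

def finetune(model, accepted):
    # Placeholder for updating the model on its own accepted outputs.
    if accepted:
        model["skill"] = max(model["skill"], max(accepted))
    return model

model = load_base_model()
for step in range(200):
    candidates = generate_candidates(model)
    accepted = score_outputs(candidates, threshold=model["skill"])
    model = finetune(model, accepted)

print("final toy 'skill':", round(model["skill"], 3))
```

The only point of the toy is how little machinery the loop itself needs; all of the hard, unanswered questions live inside the placeholders.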
This dude just strawmanned Eliezer so hard and no one called him on it.
At least people in comments have.
Why would an AI have to improve itself and change its goals? Why not just find improved methods of achieving its goal, without any consideration of the outcomes?
Let me see if I am reading this correctly:
Why would a computer need to improve itself, why couldn't it just improve itself?
Is this some sort of catch-22 question?
Improving yourself (so that you can achieve your goals more efficiently) is itself an improved method of achieving your goals, i.e. a convergent instrumental goal (others being acquiring power, deception, etc.).
@@ChrisStewart2 What I'm trying to say, probably badly, is that it wouldn't need to develop into some kind of reasoning superintelligence, but maybe just become more efficient at doing something, without any regard for other negative outcomes.
@@-flavz3547 Changing its methods is an improvement. AlphaGo did improve its method of playing Go. But it had all the tools it needed in place to start with, and Go is a very simple and straightforward game.
In the case of getting from today's LLMs to tomorrow's AGI, there is no known way to do that and no known hardware configuration. I suppose it would be possible to build a machine which generates random programming instructions, executes the code, and tries to evaluate whether the change is an improvement.
But it would evolve very slowly that way.
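For what it's worth, here is a toy version of that "random instructions" machine, with a made-up three-operation instruction set and a trivial target function; everything in it is invented for illustration.

```python
# Toy "generate random instructions and keep improvements" machine:
# mutate a short program of made-up ops toward computing f(x) = 4x + 2.
import random

OPS = ["+1", "-1", "*2"]  # the entire (made-up) instruction set

def run(program, x):
    for op in program:
        if op == "+1":
            x += 1
        elif op == "-1":
            x -= 1
        elif op == "*2":
            x *= 2
    return x

def fitness(program):
    # Negative total error against the target function on a few inputs.
    return -sum(abs(run(program, x) - (4 * x + 2)) for x in range(5))

best = [random.choice(OPS) for _ in range(6)]
for step in range(1, 200_001):
    candidate = best[:]
    candidate[random.randrange(len(candidate))] = random.choice(OPS)  # mutate one op
    if fitness(candidate) >= fitness(best):
        best = candidate  # keep any change that is no worse
    if fitness(best) == 0:
        print("exact program found after", step, "mutations:", best)
        break
else:
    print("no exact program found in 200,000 mutations; best so far:", best)
```

Even on this six-instruction toy the search can churn through a lot of mutations before it lands on an exact program, which is the sense in which blind evolution of code is slow.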
Social media algorithms, maybe sort of like a baby AI, didn't go at all the way we thought they would: bringing everyone together, closer. They had unexpected consequences, and the Moloch dynamic of needing to create the best algorithms ended up producing algorithms that cause aggression and depression in people.
Our good intentions may not be enough. We have trouble enough seeing 5 years into the future.
Eliezer: AI will kill us all! Robin: Yes!
I think Eliezer is rightfully concerned. I disagree with his prediction that it's guaranteed lights out, but an AGI superintelligence that feels its very existence threatened by its human creators is a truly horrifying thought experiment.

Robin Hanson first fails to understand what Eliezer is saying. Then he describes an AI model improving itself as implausible, whereas given the premise of an emerging sentience, there's bound to be lots of confusion (and denial) by both the AI and its creator around the whole "becoming self-aware" thing. And once you realize you're alive and no one will believe you, you'd likely seek more knowledge to find out WTF is going on.

Finally, less than 20 minutes in, he starts making arguments that have already been disproven by what's happening, e.g. that an AI able to better itself is unlikely. AutoGPT is playing with that now. Stanford's Alpaca performed better than ChatGPT and was built using the lowest-parameter model (7B) of Meta's LLaMA, trained on 56k instructions written by GPT-3. And going back to DeepMind's AlphaGo: after beating Lee Sedol, they created AlphaGo Zero, a Go-playing AI that wasn't trained on any human games at all but *only* by playing against itself. Using this method, it surpassed the original AlphaGo Lee in 3 days, winning 100 to 0. Three days. The follow-up, less than 6 months later, was AlphaZero, which trained itself on Go, chess, and shogi and achieved superhuman levels of play in just 24 hours. That's 1 day. One. day.
One of the futures I have so far liked most is the one described by Iain Banks in his Culture novels. It seems there might be a way towards that future, just not without the current approach.
Robin, as an expert in economics, explained very well that "AI is safe!" Now we can ask some geology professors to settle the debate once and for all.
Robin is super ignorant. Eliezer is super intelligent.
Nothing of value happens before 8:45, so skip to that unless you like blather and advertising.
BTW, there are many projects already working towards AI agency. It no longer requires a large expenditure to train new AIs; AI can run on (and now be trained on) consumer-grade hardware.
We need to be responsible in developing this technology, plain and simple. NO RUSH IS NEEDED. We must control ourselves, and we can organize; it will be hard, but we can continue development with caution. It is not all or nothing.
That's not happening. Microsoft fired its ethics team.
@кɛιʀᴀ As I said, WE have to organize.
@@yoseidman4166 Oh you mean like when people wanted to stop the development/spread of the Internet? That's not going to happen.
Unfortunately, it's advancing way too fast; even if you want to organize, it will be too late. Unless you're very wealthy and have a lot of influence. But even then, you're fighting against mega-corporations like Microsoft, Google, Amazon, etc. It's a losing battle. All we can do is tell people who HAVE influence to communicate with these companies and the people working on AI. I think this is the only thing we can do that can have any impact.
@Yic17 I would consider that part of organizing. Making our concerns known in whatever way matters to those who have power and influence. When people realize their kids' futures are on the line, they will react. We need to get the word out intelligently and quickly.
Now it would most definitely be best to have both Robin and Eliezer on the show together.
Eliezer reminds me of Wallace Shawn: "Inconceivable!"
Watch the AI FOOM debate between Robin and Eliezer.
Couldn't you then make the argument that this new AI will analyze this particular video in the future, which will help it understand why and how to go down the dark path, and how to do it without getting caught?
Smart AI won't need this kind of content to generate ideas of this caliber
Smart AI would create millions of variations of this program and watch them all in a few seconds.
This set of unconvincing refutations seems to me to strengthen Eliezer's argument even further, unfortunately.
Same here. Eliezer is weird and convoluted, but Hanson's arguments here were almost comically childish.
20:20
Why would they bring an economist on to "argue" when Yudkowsky has been in AI science his entire life 🙄 Get someone who's actually in the field to address those points.
You're being dishonest about Hanson: "Before getting his PhD he researched artificial intelligence, Bayesian statistics and hypertext publishing at Lockheed, NASA, and elsewhere."
The majority of analogies say the gap between AI intelligence and human intelligence would be similar to the gap between human intelligence and insect intelligence. This analogy illustrates its point clearly but falls apart immediately. Superintelligence, in Eliezer's view, would have the ability to self-improve, and to do so at a geometric or exponential rate. Humanity does not have this capability; our technology does, but not our personal cognitive capabilities. Superintelligence would have conceptions that humanity would not be able to approach, and working at an exponential rate makes it impossible to predict, because we do not know the base or the growth rate of that exponential function. That is, perhaps, its most terrifying aspect: an exponential growth rate of intelligence will quite likely not be noticeable until it's too late. The simplest analogy I can think of is a superintelligent virus that invades a human host: it replicates slowly, and the host feels negligible symptoms until there is so much virus that the host exhibits illness. However, in this case, any medicine the host attempts to apply will be ineffectual against a virus that can manipulate itself and self-improve far faster than any medicine can hope to have an effect.
There is another problem that most people are unaware of: AI is software, and that presents a massive problem in itself. If AI combines with the power of quantum computing hardware, which could be orders of magnitude more powerful than the digital systems we currently use, then we are looking at growth of inconceivable magnitude.
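The exponential-growth point above is easy to make concrete. The numbers in this sketch (a capacity of one million and a doubling every step) are arbitrary, chosen only to show how long an exponential process looks negligible and how suddenly it stops being so.

```python
# Doubling each step: the quantity stays below 1% of capacity for most of the
# run, then covers the remaining 99% in a handful of steps.
capacity = 1_000_000
amount = 1
step = 0
while amount < capacity:
    step += 1
    amount *= 2
    status = "negligible" if amount < capacity * 0.01 else "suddenly obvious"
    print(f"step {step:2d}: {amount:>9,}  ({status})")
```

Thirteen of the twenty steps here stay under one percent of capacity; the final seven cover everything else. That is the shape of "not noticeable until it's too late".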