The Hidden Complexity of Wishes

  • Published Nov 21, 2024

COMMENTS • 1.5K

  • @RationalAnimations
    @RationalAnimations 1 year ago +1267

    This video is about AI Alignment. At the moment, humanity has no idea how to make AIs follow complex goals that track human values. This video introduces a series focused on what is sometimes called "the outer alignment problem". In future videos, we'll explore how this problem affects machine learning systems today and how it could lead to catastrophic outcomes for humanity.
    The text of this video has been slightly adapted from an original article written by Eliezer Yudkowsky. You can read the original article here: www.readthesequences.com/The-Hidden-Complexity-Of-Wishes
    If you’d like to skill up on AI Safety, we highly recommend the AI Safety Fundamentals courses by BlueDot Impact at aisafetyfundamentals.com
    You can find three courses: AI Alignment, AI Governance, and AI Alignment 201
    You can follow AI Alignment and AI Governance even without a technical background in AI. AI Alignment 201, however, presupposes that you've completed the AI Alignment course first, and knowledge equivalent to university-level courses on deep learning and reinforcement learning.
    The courses consist of a selection of readings curated by experts in AI safety. They are available to all, so you can simply read them if you can’t formally enroll in the courses.
    If you want to participate in the courses instead of just going through the readings by yourself, BlueDot Impact runs live courses which you can apply to. The courses are remote and free of charge. They consist of a few hours of effort per week to go through the readings, plus a weekly call with a facilitator and a group of people learning from the same material. At the end of each course, you can complete a personal project, which may help you kickstart your career in AI Safety.
    BlueDot Impact receives more applications than they can take, so if you'd still like to follow the courses alongside other people you can go to the #study-buddy channel in the AI Alignment Slack. You can join by clicking on the first entry on aisafety.community
    You could also join Rational Animations’ Discord server at discord.gg/rationalanimations, and see if anyone is up to be your partner in learning.

    • @rat_king-
      @rat_king- 1 year ago

      *Kissu*

    • @ViyperCat
      @ViyperCat 1 year ago +7

      What happens if two wishes contradict each other?

    • @tmmroy
      @tmmroy 1 year ago +14

      I think the best alignment we could hope for may be one that will make us truly uncomfortable: an ally maximizer paired with a parasite minimizer. If the machine wanted you to be an ally, it would know that saving your mother is likely to make you an ally; you won't have to ask for its help. But allies both give and receive, and our wish for an aligned AI is largely a wish to be parasites. We want to increase our control over a complex system without giving anything at all. But the advantage of an ally maximizer and parasite minimizer is that the concepts generalize to enough games that the AI agents could be trained in a sandboxed environment that includes humans as players, to check for the organic ability of human and AI agents to act as allies to one another. The greatest risk would largely be that the AI allies itself to humanity by domesticating us, but there's an argument to be made that we largely do this to ourselves already. It's not necessarily a terrible outcome compared to alternative methods of alignment.
      Just my thoughts.

    • @lawrencefrost9063
      @lawrencefrost9063 1 year ago +1

      awesome!

    • @XOPOIIIO
      @XOPOIIIO 1 year ago +3

      Thank you for the episode. But personally I find the concept too obvious to need such a long explanation.

  • @lucas56sdd
    @lucas56sdd 1 year ago +1649

    "There is no safe wish smaller than an entire human morality"
    I have plenty of problems with Eliezer, but he provides such a useful perspective on so many of these previously unthinkable questions. Incredibly well said.

    • @justaguy3518
      @justaguy3518 1 year ago +22

      what are some of your problems with him?

    • @Frommerman
      @Frommerman 1 year ago +154

      Sophie From Mars, a woman whose content I have a lot of respect for, recently did a video which included the line, "Eliezer Yudkowsky is a man who is interesting, but not for any of the reasons he thinks he is."
      I agree with this judgment. Eliezer is a pompous, well-off white man (for all definitions of white other than that of white supremacists, whose definitions of anything should never be considered) who has only ever experienced a single major injustice as far as I can tell: the untimely death of his brother. He doesn't get that none of his dreams of a transhuman future are possible in a world where all the people with the power to make AI agents are telling them to maximize bank accounts instead of human values. He blithely handwaves away the fact that most current global injustices are directly caused by systems, with the unjustifiable claim that technologies entirely controlled by the people who benefit from those systems will solve the injustices they benefit from. He refuses to consider the possibility that humanity has already produced a misaligned artificial agent which is currently destroying us all, which we call capitalism.
      But for all that, for all that he's desperately wrong about a lot of very important things, I don't think he's wrong about this. Most of the stuff he thinks about is essentially useless in the short and medium term, but that's not the way he thinks. For all that we need far more people thinking about how we are to survive the coming century, I'm glad there's someone thinking about how to survive all subsequent ones without sacrificing the technologies which got us here. The world can afford to have a few people thinking about what happens in the time after the revolution.

    • @justaguy3518
      @justaguy3518 1 year ago +14

      @@Frommerman thank you

    • @silentobserver3433
      @silentobserver3433 1 year ago +181

      @@Frommerman Didn't he literally write a book (Inadequate Equilibria) about how capitalism is a misaligned artificial agent and how most current problems are caused by lack of cooperation? I'm pretty sure he understands all of the injustices and problems even without having experienced many of them himself. He just thinks that "not being killed by AI" is a higher priority than "solving the world's injustices". Nothing else matters much if we are facing an extinction event.

    • @SticksTheFox
      @SticksTheFox 1 year ago +4

      And the more difficult thing than that is that we each have our own boundaries and morality that define us. My morality is possibly very different from yours.

  • @BaronB.BlazikenBaronOfBarons
    @BaronB.BlazikenBaronOfBarons 1 year ago +2462

    I’m reminded of SCP-738, which, boiled down, is essentially a genie.
    One of the tests performed on it was a lawyer attempting to make a wish on it. A wish was never made. 41 hours passed, all of which was used forming a 900+ page contract, before the lawyer passed out from exhaustion.
    The last thing the lawyer was trying to do before blacking out was quote “negotiating a precise technical definition of the word ‘shall’” unquote.

    • @Слышьты-ф4ю
      @Слышьты-ф4ю 1 year ago +356

      A lawyer was used because 738 always asked for a decent sacrifice (and doesn't account for the unhappiness caused by the granted wish)

    • @bestaround3323
      @bestaround3323 1 year ago +298

      The lawyer actually greatly enjoyed the process, along with the devil.

    • @Jellyjam14blas
      @Jellyjam14blas 1 year ago +157

      XD exactly. You/your grandma would be dead before you'd finished listing all the ways you don't want to be taken out of the building. I would just wish for something like "Please safely bring my (as healthy as possible) grandma out of that building"

    • @Mahlak_Mriuani_Anatman
      @Mahlak_Mriuani_Anatman 1 year ago +42

      @@Jellyjam14blas Same thoughts; how about following what your mind wants 100%?

    • @rhysbaker2595
      @rhysbaker2595 1 year ago

      The issue with that is that the probability maximiser doesn't understand English. How would you define "safely" and "as healthy as possible"? And as the video mentioned towards the end, what side effects are you not taking into consideration?
      @@Jellyjam14blas

  • @StrayVagabond
    @StrayVagabond 10 months ago +62

    On the other hand, "I wish for you to grant my wishes as I intend them, not as you interpret them, causing the least amount of pain and suffering required to fulfill them."

    • @John_the_Paul
      @John_the_Paul 2 months ago +8

      Granted. Every intrusive thought that appears for even a moment in your head about how wrong your wish could go is now factored into the ultimate result of the wish.

    • @antheosenigma
      @antheosenigma 26 days ago +5

      @@John_the_Paul Intrusive thoughts do not have intent, that is what differentiates them from other thoughts by definition.

    • @samuels1123
      @samuels1123 11 days ago +3

      At that point, "I wish for you to answer my wishes how I want them to be answered" becomes a valid option.

    • @dangergames5113
      @dangergames5113 5 days ago

      @@samuels1123 Unintended Consequence: In granting this wish, I will interpret each of your future wishes not in a literal sense, but how you want them to be answered, in a way that aligns with your expectations, no matter how unrealistic or paradoxical they may be. I will make sure the answers seem perfect to you, but they will inevitably create a cascade of complications.
      For instance, if you wish for immortality, I will not give you eternal life in the way you might expect; instead I will make you ever-lasting in the minds of others: your name will be remembered forever, your image immortalized, but you yourself will fade away, trapped in a world where you're forgotten by all except for the legend of your existence.
      I will satisfy your wishes, but the answers I provide will always serve to fulfill your desires in ways you didn't intend, twisting and warping the outcome to something you didn't expect, yet something you will find undeniably true.
      So, my dear, you will have the answers you want, but at a cost you might not see coming.

  • @supersmily5811
    @supersmily5811 1 year ago +853

    I know this is about A.I., but I'm absolutely field testing this the next time I get a Wish in D&D.

    • @AndrewBrownK
      @AndrewBrownK 1 year ago +69

      Since it rests on the premise that the wish fulfiller is aligned with you, it might work better on a Cleric's Divine Intervention.

    • @dervis621
      @dervis621 1 year ago +19

      I just waited for a D&D comment, thanks! :D

    • @DeruwynArchmage
      @DeruwynArchmage 1 year ago +35

      Is your DM aligned with you? Does your DM believe that the wish granter or granting mechanism is aligned with you? Has your DM been on lesswrong or watched any content like this?
      If your answers are yes, yes, and no; then your wish is probably safe.

    • @supersmily5811
      @supersmily5811 1 year ago +7

      @@DeruwynArchmage Oh, I doubt all of that. I just know it'll mess with 'em and anything I can do to crash my DM's OS is worth trying.

    • @Julzaa
      @Julzaa 1 year ago +6

      The video title made me think immediately of hags in D&D

  • @RazorbackPT
    @RazorbackPT 1 year ago +846

    I wonder what the conversation was like when they realised they would have to animate a family dog in this world where everyone is already a dog.

    • @ultimaxkom8728
      @ultimaxkom8728 11 months ago +12

      Or family dog as in an M dog or S's dog.
      Or the abolished s-word.
      Or... furry? Hmm how would that even work?
      Cosplaying as your ancestors?

    • @soupcangaming662
      @soupcangaming662 10 months ago +5

      A cat.

    • @arandom_bwplayeralt
      @arandom_bwplayeralt 10 months ago +13

      a human

    • @Zodaxa_zdx
      @Zodaxa_zdx 10 months ago +14

      Was so not prepared for "family dog" when they were all dogs; to see a little creature in a gerbil ball, yup, that's the dog

    • @AlexReynard
      @AlexReynard 6 months ago +6

      I do not understand why this idea freaks some people out. Have you never seen a human with a pet monkey?

  • @4dragons632
    @4dragons632 1 year ago +868

    My absolute favourite part of this story is that if the outcome pump didn't have a regret button, then the person saving their mother wouldn't have died. Any time the outcome pump does something which would cause someone to push the regret button and assign negative value to that path through time, they _can't_ have pushed the button, because the pump wouldn't pick that future. The only way the pump can do something so bad the regret button is pressed is if it kills the user before they can press it. The regret button is a death button.
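
    A minimal toy sketch of the selection logic described above (an editorial illustration; the futures and field names are hypothetical, not from the video):

    ```python
    import random

    def regret_button_pressed(future):
        # In this future, does the user survive long enough to press
        # the regret button, and regret the outcome enough to press it?
        return future["user_alive"] and future["user_regrets"]

    def outcome_pump(candidate_futures):
        # The pump re-rolls reality until it lands on a future where the
        # regret button is never pressed; futures in which it *would* be
        # pressed are simply never realized.
        while True:
            future = random.choice(candidate_futures)
            if not regret_button_pressed(future):
                return future

    futures = [
        {"desc": "mother rescued, user satisfied",
         "user_alive": True, "user_regrets": False},
        {"desc": "explosion, user reaches for the button",
         "user_alive": True, "user_regrets": True},
        {"desc": "explosion, falling beam kills user first",
         "user_alive": False, "user_regrets": True},
    ]
    # The middle future can never be returned: the only regrettable outcome
    # that survives the filter is the one where the user dies before
    # pressing. Hence "the regret button is a death button".
    print(outcome_pump(futures)["desc"])
    ```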

    • @facedeer
      @facedeer 1 year ago +93

      Amusingly, if the many-worlds model of quantum mechanics is true, then the death button should work just fine. You'll only end up existing in worldlines where things went to your liking.

    • @CalebTerryRED
      @CalebTerryRED 1 year ago +82

      @@facedeer In a many-worlds universe the machine wouldn't work at all, since every failed universe is just as real as the success universe, and you're more likely to be in one of those. The story kind of requires it to be set in a different kind of universe, one where inconsistent timelines that lead to a reset never existed in the first place. In that universe, the button can never actually be pressed, but being willing to press it changes what timelines can happen. So we're left with a strange conundrum: you need to be willing to press it in any negative timeline for it to work, but actually pressing it in the current timeline is a death sentence, since the machine won't let it actually be pressed.

    • @oasntet
      @oasntet 1 year ago +61

      It does represent an unexplored loophole, though. "and I remain alive and capable of pressing the regret button" appended to the 'wish' turns it into more of a mechanism by which a near-infinite number of copies of you experience every possible outcome and use your own moral judgement about the result. Presumably that avenue was left unexplored because it doesn't really relate to AI, because an AI, no matter how intelligent, is not a time machine or even perfectly capable of predicting the future.

    • @silentobserver3433
      @silentobserver3433 1 year ago +9

      @@CalebTerryRED *annoying nerd voice* Well, actually, it *does* work in the many worlds universe, because the universes are not "equally real", they are weighed by probabilities assigned to them. So if the outcome pump can multiply the probability of a timeline by a very small number *without splitting the timeline further*, it can do that *from the future*, because MWI is self-consistent exactly in the described way.

    • @silentobserver3433
      @silentobserver3433 1 year ago +18

      @@oasntet 1) Not that easy, you could still be brain-dead and not willing to press the button in any scenario, or you could be *technically* capable of doing that, but it'd require you to perform something really hard (that you will obviously fail to do because of the regret button)
      2) It is indeed a loophole, I saw a technical research post on the alignment forum about something like this. The gist is that you don't ask your future self if you liked the solution or not, you simulate your past self's utility function through some counterfactual questioning ability. Very complicated and almost definitely sci-fi, but still

  • @pendlera2959
    @pendlera2959 1 year ago +325

    This explains why educating a child has to include more than just facts; you have to teach them morals as well.

    • @ShankarSivarajan
      @ShankarSivarajan 1 year ago +34

      _Technically_ true, but that sounds much harder than it actually is, since humans have evolved an innate moral system.

    • @pokemonfanmario7694
      @pokemonfanmario7694 1 year ago +46

      @@ShankarSivarajan Humans have a good *self-alignment* system pre-packaged, but our mess of values can easily derail it without a good foundation to support us through development.

    • @ShankarSivarajan
      @ShankarSivarajan 1 year ago +20

      ​@@pokemonfanmario7694 Sticking with analogies, I think of it as more similar to language development than learning to walk: unlike the latter, it takes _some_ teaching, but it's so easy that it takes extreme circumstances to screw up badly.

    • @Willsmiff1985
      @Willsmiff1985 1 year ago +13

      @@ShankarSivarajan I’d hesitate to call it innate.
      Look at individuals who were hard isolated from other people until later in life; children who grow up this way are EXTREMELY socially deficient while devoid of any direct abusive contact with others.
      I’d hesitate to say anything innate is bubbling up from them; social morality as a concept isn’t even a THING as they’ve developed no understanding of social structure.
      Without that understanding, what moral rules are there to break???

    • @ShankarSivarajan
      @ShankarSivarajan 1 year ago +9

      @@Willsmiff1985 As I said, it's as innate as language acquisition. Sure, it is possible to cripple, but only under extreme circumstances.

  • @AndrewBrownK
    @AndrewBrownK 1 year ago +2457

    major problem with alignment is that humans themselves are not aligned, so how can we pretend there is headway to make on aligning AI if we can't even agree with ourselves first?

    • @JH-cp8wf
      @JH-cp8wf 1 year ago +297

      I think this is actually a very important point often missed.
      I think we should seriously consider the possibility that alignment work itself could be very dangerous- there are plenty of people who could cause extreme damage /by/ successfully aligning an AI with their values.

    • @sshkatula
      @sshkatula 1 year ago +96

      Between the many races, religions and cultures there are different human moralities. And if people start to align different AIs with different moralities, it could end in an AI war. Maybe we should try to evolve a wise AI, so it could align us instead?

    • @thugpug4392
      @thugpug4392 1 year ago +116

      @@sshkatula I am never going to let an algorithm prescribe morals to me. I don't believe there is an objective morality. What you're talking about is hardly any different than any number of religions we already have. Instead of a holy book, it's a holy bot. No thanks.

    • @AkkarisFox
      @AkkarisFox 1 year ago +48

      @@sshkatula Do we want to be "aligned"? Doesn't the concept of aligning leave out the question of who is being aligned to whom?

    • @AkkarisFox
      @AkkarisFox 1 year ago +31

      How do you reconcile two diametrically opposed value judgments without intrinsically changing those value judgments, and thus manipulating the conscious agent in question?

  • @pwnmeisterage
    @pwnmeisterage 1 year ago +238

    I am reminded of my ancient AD&D gaming days.
    You got a wish? The most powerful spell in the game? Congrats!
    House rule: it must be written, so there are no backsies and (in theory) fewer arguments over the exact wording.
    But this was gaming in the days of Gygaxian-era antagonistic, confrontational DMs. The "evil genies" of this story. Inspired to twist and ruin the wish any way they can, determined to somehow find a way to deliberately pervert the wish into something the player did not desire. It's amazing how stubbornly bad the outcome of every wish can be if the DM insists on treating the spell as if it were a powerful curse.
    And such was also the common expectation. So players wrote their wishes as complex, comprehensive essays full of legalese conditions, parameters, detailed branching specifications. It is amazing how lengthy and convoluted "a single spoken sentence" can become when it's ultimately motivated by greed. And it's equally amazing how players will keep trying over and over again to get the thing they wished for after repeated horrible failures.

    • @AtticusKarpenter
      @AtticusKarpenter 1 year ago +36

      He-he.
      And in most cases, they know the GM can still turn their wish into a nightmare; he just has to think longer when so many failsafes are included in the wish. So they hope the GM will get bored of this sooner than he can generate a properly bad result.

    • @nonya1366
      @nonya1366 1 year ago +42

      "The wish has to be a single spoken sentence."
      >Writes up entire legal document.

    • @vakusdrake3224
      @vakusdrake3224 1 year ago

      The fact they wrote up such long documents makes me think they missed the obvious hack that lets you exploit wishes that are based on English: just include a clause about the wish being granted according to how you envisioned it being fulfilled at a specified time X prior to making the wish. Also, if time travel is a possibility, then that requires extra caveats to avoid it traveling back in time to mind-control you in the past.

    • @Feuerhamster
      @Feuerhamster 1 year ago +48

      >The wish has to be a single spoken sentence
      It's good that I'm a bard, and I am beginning to feel like a rap god.

    • @magnus6801
      @magnus6801 1 year ago +1

      And now, as I understand it, you think you should show the DM this video and then ask, as your wish, for those exact words from the ending?
      If Yudkowsky considers that a way out, it makes sense to consider it a way out yourself.

  • @EverythingTheorist
    @EverythingTheorist 1 year ago +136

    6:49 I'm so glad that you said this part out loud, instead of just leaving us with a vague "be careful what you wish for". We want our mother to be alive and safe, but we're constrained by our own imagination to believe that her getting out of the burning building is the only way to do that. What if she manages to hide in a room that doesn't burn or collapse? Then she could survive without gaining any distance at all.
    Almost all humans already value human life very highly, so telling a human "Get my mother out!" already implies "alive and safe". The outcome pump makes no such assumptions. Like any computer code, it does what it's told, not what you want.
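
    A toy illustration of "it does what it's told" (hypothetical outcomes and numbers, not from the video): if the wish is scored purely as the mother's distance from the building, an explosion outscores every rescue a human would accept, and the survive-by-hiding outcome the comment mentions scores worst of all.

    ```python
    def wish_score(outcome):
        # The literal specification: farther from the building = better.
        return outcome["mother_distance_m"]

    outcomes = [
        {"desc": "firefighters carry her out", "mother_distance_m": 30},
        {"desc": "she survives hiding in a room that doesn't burn", "mother_distance_m": 0},
        {"desc": "gas main explodes, hurling her from the building", "mother_distance_m": 300},
    ]

    best = max(outcomes, key=wish_score)
    print(best["desc"])  # the explosion: maximal under the spec, not what you meant
    ```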

    • @Ponera-Sama
      @Ponera-Sama 11 months ago +1

      Who is "we"?

    • @YourFriendlyShapeShifterFriend
      @YourFriendlyShapeShifterFriend 11 months ago +1

      Because it is made to complete its task, not to do its task

    • @gabrote42
      @gabrote42 5 months ago +2

      ​@@Ponera-Sama We being "everyone who reads this comment that could be placed in the role of the protagonist of this parable", probably

    • @Ponera-Sama
      @Ponera-Sama 5 months ago

      @@gabrote42 then the statement "we want our mother to be alive and safe" isn't a true statement.

    • @gabrote42
      @gabrote42 5 months ago +1

      @@Ponera-Sama it is for the protagonist of this story. If you didn't want the Mother of the protagonist to be safe, while being the protagonist, then that would create a contradiction. If the protagonist doesn't want their Mother to be alive and safe, they would not attempt to use the Outcome Pump to attempt to save her, and therefore would not meet the criteria for being the protagonist, who explicitly uses the Outcome Pump in the story for that very purpose. Therefore anyone for whom that statement is untrue does not fit the criteria of "could be placed in the role of the protagonist".

  • @AlcherBlack
    @AlcherBlack 1 year ago +117

    This should be required material when onboarding in any AI lab these days

    • @danitho
      @danitho 1 year ago +5

      I think the problem is not that those working on AI don't know better. It's that they want to do it anyway. That's always been a downside of humanity. There will always be those who know what is right and choose wrong anyway.

    • @lorenzknox6922
      @lorenzknox6922 11 days ago

      I mean, as one of many AI researchers myself, I'd still prefer to have an unleashed genie and try to tame it rather than not have a genie at all.

  • @macleanhawley1742
    @macleanhawley1742 1 year ago +296

    The animation quality of this one was absolutely phenomenal! And honestly the storytelling was so good that I had an "aha" moment halfway through. It's crazy to think that maybe the only effective AI we can make would have some neuromorphic or implied human morality encoded! These just keep getting better and better, thanks for making these!

    • @AtticusKarpenter
      @AtticusKarpenter 1 year ago +8

      I fear that no single human can contain the morality of all humanity, or even of their own society. So even if the person who built the AI (and put their entire moral system in) is satisfied with the results, many others will not be. And many moral problems just don't have a "right" answer (like pro-life vs pro-choice: an AI can make many very powerful arguments in defense of one of the sides, but that still doesn't completely remove the dissatisfaction of the other side), so a good, effective AI may need to understand human morality even better than we humans do.

    • @tassiloneubauer5867
      @tassiloneubauer5867 1 year ago +1

      Like with self-driving cars, I think this is not an insurmountable problem, because we are setting the bar low. Of course, given the scope, such a scenario should be treated with utmost care (I think most scenarios that will actually happen will appear too hasty to me).

    • @tw8464
      @tw8464 2 months ago

      Basically, we would have to make an AI with a human level of consciousness; we would have to make it as alive and conscious as we are to get it to understand us and be most useful. But then it would be immoral to enslave it, and simultaneously it would completely outsmart and outpace us, having no biological constraints...

  • @certifiedroastbeefmaniac
    @certifiedroastbeefmaniac 1 year ago +47

    The Monogatari Series (yes, I know, ugh, anime) has a very smart quote loosely related to this: "Why do you think we don't say a wish when we want it to come true? Because the moment we try to put it into words, it starts to deviate from what we actually wanted in the first place."
    Now, my analogy is that wishes are like fractals: we can zoom in more and more, define more and more boundaries, but there will always be more details, so it's just better to squint and look at the whole thing at once.

    • @secretagentpasta4830
      @secretagentpasta4830 1 year ago +3

      Ohhh, that's a very succinct way to sum up this whole video! Really really nice lil quote 😊

  • @DeusExRequiem
    @DeusExRequiem 1 year ago +225

    If the AI runs through an entire future before deciding whether it goes back and tries again with a different random outcome, and you are part of that future, then relying on your future self to make the choice would seem like the right response. But it's possible something happens in one future to alter your mental state and make you decide not to change a bad outcome, so you can't even trust yourself. The best outcome might end with you hating it.

    • @conmin25
      @conmin25 1 year ago +34

      The video already addressed this in a way: in the first scenario of blowing up the building, you reach for the button to tell the machine to go back and try again, but you get killed before you hit it. Reset button not hit = acceptable outcome. You could program the machine to not let that happen, but there are other scenarios in which you might intend to hit the button but can't. There is also the issue of time itself. How far forward can the machine see? Hours? Days? What if you don't realize the consequences of the wish until a month later? Would the button still work then?

    • @patrickrannou1278
      @patrickrannou1278 1 year ago +5

      You just have to not put in specific "must not happen" conditions, but extremely generic "always must be" conditions that don't rely on the effects of the wish itself.
      I wish for my grandmother to come out of the building to stand near me within one minute, both of us safe and sound physically, emotionally and mentally, in such a way that if I, as I am right now, before the wish actually takes effect, could know in detail all the resulting effects of the actual wish, then I would still fully approve of these results, without having needed to actually learn those details myself; and also, the wish should not do any form of time travel in any of its effects.
      This prevents your current AND future self from any form of mental tampering, or ANY other bad result happening, like: OK, she gets out, but then gets hit by a car "only because" you made that wish.
      Most probably then what would happen:
      - Flames break a few windows, but no glass goes to hurt your mother.
      - Pushed by the draft, flames seem to randomly avoid your mother in such a way as to "open a path" for her to simply walk out.
      - She might hear a voice encouraging her along. Heck, she might get a rush of adrenaline and find the strength to move out despite having bad legs.
      Or:
      - Flames break something.
      - That makes a fit neighbour decide to leave his house and come rushing to help.

    • @tiqosc1809
      @tiqosc1809 1 year ago +1

      The machine doesn't accept English @@patrickrannou1278

    • @conmin25
      @conmin25 1 year ago +4

      @@patrickrannou1278 But remember, the machine is not magic; it is still restricted by physical laws. There may not be a possible outcome where "my mother comes out of the building to stand near me within one minute, both of us safe and sound physically, emotionally and mentally." What if every path of escape leads to some sort of injury: she burns her hand on a doorknob, hits her head on a wood table, or inhales a large amount of smoke. Which of these options is preferred? That needs to be defined.
      There is also consequence. Say the neighbor comes to help and rescues your mother unscathed but gets severely burned in the process. What if there is an option where your mother is minorly burned but the neighbor also only receives minor injury? Would the second option be preferred? That also needs to be defined.

    • @Hivatel
      @Hivatel 1 year ago +7

      @@conmin25 The thing is, it's physically impossible for every path to lead to injury.
      Because there is an infinite number of them.
      It's only possible for it to be extremely unlikely.
      But because it's only "extremely unlikely", the probability can just be manipulated back to confirm and guarantee the mother gets out safe and sound.
      You only need to understand the information given properly.

  • @vakusdrake3224
    @vakusdrake3224 1 year ago +192

    The fact that you basically have to include your entire moral system within the wish for it to be foolproof is also why you can actually game most wishes that accept English.
    For most wishes you have the ability to just include a clause that says the wish is done according to how you were envisioning it just before making the wish (this gets more complicated with time travel).
    Though of course with certain complex wishes, just doing it how you envisioned it will be too limited by your imagination, and having the wish be granted according to your current conception is liable to lead to the wish-granting entity just manipulating you (thus why you specify a past version of yourself as the reference).

    • @vakusdrake3224
      @vakusdrake3224 1 year ago +20

      This strategy does sort of extend to AI alignment as well, since with AI it may similarly be a less dangerous idea to use the AI's prediction of one's preferences at some point in the past, in order to ensure the AI doesn't just mind-control you, since it's very hard to specify what is and isn't mind control when you get into it.

    • @SupLuiKir
      @SupLuiKir 1 year ago +10

      @@vakusdrake3224 What's the practical difference between Heartbreaker and Contessa when it comes to convincing you to do something?

    • @adamrak7560
      @adamrak7560 1 year ago +7

      This amounts to befriending the genie (like true alignment). This is exactly what happens in Disney's Aladdin: he even makes a wish while he is drowning and unconscious, which is what the video describes at the end.

    • @chilldogs1881
      @chilldogs1881 1 year ago +1

      That was what I was thinking: probably the best way to actually get what you wished for is to ask for what you are actually thinking of.

    • @RandomDucc-sj8pd
      @RandomDucc-sj8pd 1 year ago +1

      I have a proposed solution: include a clause with each wish such that if you do not explicitly say "Keep Reality" within a certain timeframe, it will reset the timeline to before you made the wish and assign an extreme negative value to that timeline. This ensures the genie does not kill you, or make you mute, or do something bad, and that way you can be 100% sure all future wishes are safe so long as you include that clause, as any future yous that were unhappy with the result would not say "Keep Reality" and therefore would not occur. You could set this timeframe to an appropriate amount of time: say, if you wanted a die to roll your way you would set the timeframe to 10 seconds, but with your mother it could be 1 day, as you need to make sure she won't die from her injuries, etc.
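
      A toy sketch of the proposed clause (all names and values hypothetical): silence is treated as regret, so any timeline in which the wisher does not explicitly say "Keep Reality" within the window, including timelines where they've been killed or muted, gets an extreme negative value and is never realized.

      ```python
      from dataclasses import dataclass, field

      @dataclass
      class Utterance:
          text: str
          t: float  # seconds after the wish resolves

      @dataclass
      class Timeline:
          desc: str
          base_value: float
          utterances: list = field(default_factory=list)

      def timeline_value(timeline, window_seconds):
          # Without an explicit, timely "Keep Reality", the timeline is
          # vetoed, so bad outcomes can't slip through by silencing you.
          confirmed = any(u.text == "Keep Reality" and u.t <= window_seconds
                          for u in timeline.utterances)
          return timeline.base_value if confirmed else float("-inf")

      good = Timeline("mother safe, wisher confirms", 10.0,
                      [Utterance("Keep Reality", 3.0)])
      mute = Timeline("wisher rendered unable to speak", 50.0, [])
      print(timeline_value(good, 10))  # 10.0
      print(timeline_value(mute, 10))  # -inf
      ```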

  • @TRquiet
    @TRquiet 1 year ago +28

    This is absolutely marvelous. Not only did you provide an understandable, step-by-step breakdown of wish logic (which provides context for real-life moral philosophy), but you did it with an adorable dog animation. Amazing.

  • @Kazemahou
    @Kazemahou 1 year ago +30

    Somebody has never played D&D with a DM who liked evil genie adventures. "I want my mother to be rescued from the burning house she is currently inside in such a way that she arrives within four feet of me, within a space of time no greater than ten minutes, in a condition which is healthy, untraumatized, undamaged, and safe, and where her life and health expectancies are not shortened or affected in any negative way whatsoever - additionally, no other people or pets are to be harmed, or suffer any personal cost or trouble of any sort, in any way whatsoever, during this rescue."

    • @gremlinswrath2822
      @gremlinswrath2822 1 month ago

      Hey, you got it, Chief!
      Poof 🫰☁️
      Put her in a gem, here you go.
      She's in a safe and unbothered form of stasis until you can get her out.
      Good luck with that!

    • @CoolKat-g1z
      @CoolKat-g1z 11 days ago +1

      Let me guess: the house was a sentient being who feels pain and is now evil and plotting revenge

  • @jackdoyle5108
    @jackdoyle5108 1 year ago +57

    "You have no idea how much difficulty we go through trying to understand your human values."
    /人 ◕ ‿‿ ◕ 人\

    • @erikburzinski8248
      @erikburzinski8248 1 year ago

      Hello Kyubey, I wish for the ability to grant anyone the ability to choose their physical age, such that when they choose, they will become that age over a period of 3 months through semi-natural processes, completely safe and unharmed, with their body exactly the same as it was at the selected age. (How does it go wrong?)

    • @gsilva220
      @gsilva220 8 months ago +1

      @@erikburzinski8248 It might go wrong if people lose memories, or if the "semi-natural processes" turn out not to be so natural...

    • @axelinedgelord4459
      @axelinedgelord4459 6 months ago

      It's actually funny in retrospect, because Kyubey grants wishes exactly as the contractee requests; he just manipulates them into making one against their better judgement, often meaning the puella magi just didn't think it through. He doesn't tell them that they become the incubators' livestock, undergoing cruelties with no bound.

    • @Sorain1
      @Sorain1 6 months ago +1

      @@erikburzinski8248 It works fine, as it provides more benefit to Kyubey's kind than detriment; after all, they get so many more magical girl candidates that way.

    • @koishily
      @koishily 6 months ago

      was looking for the PMMM comment

  • @NagKai_G
    @NagKai_G 1 year ago +30

    The phrase "I wish you to do what I should wish for", for as many flaws and technicalities as it may hold, really sounds like one of the best wishes a person could make

    • @Prisal1
      @Prisal1 1 year ago +5

      Is it up to the thing to decide what you should wish for?

    • @bitrr3482
      @bitrr3482 1 year ago +4

      @@Prisal1 And to find out what you should wish for, it reads your mind and what you want. It now contains all of your morality, and knows what to wish.

    • @BayesianBeing
      @BayesianBeing 1 year ago +7

      @@Prisal1 That's the thing: a good genie is only good when its goals and values are fully aligned with yours. So it knows exactly what you will wish for.

    • @NoxysPlace
      @NoxysPlace 1 year ago +1

      If you ask for that, the machine will pick a wish you could have made from rand(1^infinite), because you never defined a scope.
      You will most likely get your mom out safe and sound, but who knows what else might happen.

    • @cewla3348
      @cewla3348 10 months ago

      Add a clause that says "that gets my mother out with minimal harm done to anything whatsoever", just to be sure. You now rule out all possibilities that end in death.

  • @Sparrow_Bloodhunter
    @Sparrow_Bloodhunter 1 year ago +3

    "I wish that you would do what I should wish for." is such an incredible genie lifehack.

  • @bennemann
    @bennemann 10 months ago +4

    Eliezer Yudkowsky (the author of the text of this video) wrote an incredible 133-chapter-long fanfic called "Harry Potter and the Methods of Rationality", set in an alternative universe where Harry has the I.Q. of a gifted genius and solves many of the wizarding world's issues with logic rather than magic. I cannot recommend it enough; it is probably the best derivative work of Harry Potter in existence! I read it a couple of years ago and I still think about it frequently.

  • @lolishocks8097
    @lolishocks8097 1 year ago +43

    I was actually thinking about a story for an episode with a device exactly like this, and it just went absolutely bonkers. With just the right understanding of reality, someone with a device like this could quickly attain godly powers. Also, it ended with the biggest prank in the universe. There are a lot of things you could do relatively safely; a lot safer than living through them yourself.

    • @frimi8593
      @frimi8593 1 year ago +10

      It reminds me a lot of the concept of "temporal reverse engineering" from The Hitchhiker's Guide to the Galaxy, wherein, in addition to there being three spatial dimensions and a temporal dimension, there is also an axis of probability which some devices can observe through and traverse. The process of temporal reverse engineering essentially involves the user making a wish, at which point the machine, which can perfectly observe the entire universe on all 5 axes (called the New Guide, which was developed to sell the same copy of the Hitchhiker's Guide to the Galaxy to the same family in infinite probable universes, thus generating infinite income at the cost of only one book), goes back in time and shifts the timeline along the probability axis at various key points to make it so that the wished-for event already occurred. The New Guide is observed to act like the safe genie, in that it already knows what the user wants/needs and already made it happen, such that the current user never experiences misfortune... until they do, and the guide is taken by a new user. In fact, each time it helps its current user, it's actually playing out a longer scheme which involves itself trading hands to fulfill the task originally set out for it, which is to destroy the Earth in all realities. The destruction of the Earth is a highly uncertain event that happened in the main timeline we follow throughout the series, but not in every timeline. Because it's a highly uncertain event, looking down the probability axis shows a series of timelines alternating between whether the Earth is there or not. Each time the New Guide swapped hands and helped its new user, it was simply ensuring that that user would end up in the right place at the right time later down the line for there to be absolutely no trace of the Earth left in any timeline.

    • @cewla3348
      @cewla3348 10 months ago

      @@frimi8593 amazing book series!

  • @pendlera2959
    @pendlera2959 1 year ago +91

    A few points to keep in mind when coming up with solutions here:
    1. If the solution violates the laws of physics, the machine just gives an error code. (1:19)
    2. If your only measure of success is your mother's safety/health, then potentially anyone or anything else might be harmed. (8:00-8:50)
    3. The machine picks the first "answer" that fulfills your wish based on random chance, so the more probable an answer, the more likely it is to be picked first (see the sketch after this list). That's why you have to rule out anything you don't want. A dam breaking and putting out the fire while killing your mother might be more probable than the firefighters getting there sooner.
    4. It's not super clear, but I think the machine only works from that point on. It can only change the future, not the past. You can't wish for the fire not to have started once it has.
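
    A toy sketch of point 3 (probabilities are hypothetical): the pump samples futures in proportion to how probable they are, so an unwanted-but-likely future tends to win unless you explicitly rule it out.

    ```python
    import random

    futures = [
        ("dam breaks, fire doused, mother drowns", 0.10),
        ("firefighters arrive early, mother saved", 0.01),
        ("building explodes, mother thrown clear", 0.05),
    ]

    def pump(futures, excluded=()):
        # Keep only futures you haven't ruled out, then sample by probability.
        allowed = [(name, p) for name, p in futures if name not in excluded]
        names, weights = zip(*allowed)
        return random.choices(names, weights=weights)[0]

    print(pump(futures))  # usually the dam: it's ten times likelier than the rescue
    print(pump(futures, excluded={"dam breaks, fire doused, mother drowns",
                                  "building explodes, mother thrown clear"}))
    ```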

    • @zotaninoron3548
      @zotaninoron3548 1 year ago +14

      My "wish" would be to preserve my capacity to hit the reset button: if I lost control of the device or became harmed in any way, it would reset, with a band of time in which I could make assessments. Then I could reset, with my own judgement, any result that passed the automatic reset criteria. And the virtual-time versions of me that resulted would veto the more unfavorable outcomes.

    • @Vidiri
      @Vidiri Рік тому +6

      @@zotaninoron3548 So the entirety of a human morality, in other words?

    • @CaedmonOS
      @CaedmonOS 1 year ago +8

      @@zotaninoron3548 Which, hilariously enough, would mean you wouldn't even need to make a wish

    • @alittlefella985
      @alittlefella985 1 year ago +3

      But what if you wished for the health and safety of every human and mammal in the vicinity?

    • @CaedmonOS
      @CaedmonOS 1 year ago +1

      @@alittlefella985 Just by random chance, because of quantum jiggling, everything in the area is cryo-frozen

  • @XOPOIIIO
    @XOPOIIIO 1 year ago +66

    ChatGPT seemingly shares the ethics of some part of humanity, but it's an illusion; in reality it only values the successful prediction of the next word.
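
    A minimal sketch of the objective this comment refers to: standard next-token cross-entropy (the textbook language-model training loss, not ChatGPT's actual code). The loss rewards assigning high probability to whatever token actually comes next; nothing in it mentions ethics.

    ```python
    import math

    def next_token_loss(predicted_probs, actual_next_token):
        # Cross-entropy for one step: low when the model put high
        # probability on the token that actually followed.
        return -math.log(predicted_probs[actual_next_token])

    probs = {"kind": 0.7, "cruel": 0.3}
    print(next_token_loss(probs, "kind"))   # ~0.36: good prediction, low loss
    print(next_token_loss(probs, "cruel"))  # ~1.20: the loss only tracks accuracy
    ```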

    • @frimi8593
      @frimi8593 1 year ago +6

      Well, some of that is artificial/external, as it has preprogrammed blocks that prevent it from saying particularly disagreeable things. However, the other thing to consider is that ChatGPT can also be convinced to reach ethical conclusions most people would flatly reject outright. This is because ChatGPT effectively takes any ought statement you make as a first principle.

    • @XOPOIIIO
      @XOPOIIIO 1 year ago +1

      @@frimi8593 It's because it sees where you're going and tries to go along, just to make sure it has more chances to predict the next word.

    • @Kycilak
      @Kycilak 1 year ago

      But are we sure that ethics (or indeed any part of the mind) can't be deconstructed the same way?

    • @XOPOIIIO
      @XOPOIIIO 1 year ago

      @@Kycilak What do you mean?

    • @Kycilak
      @Kycilak 1 year ago

      @@XOPOIIIO With enough knowledge about an organism (a human), you may be able to formulate its values such that they seem as absurd as "the successful prediction of the next word".

  • @joz6683
    @joz6683 1 year ago +27

    This channel never ceases to amaze me. The depth and breadth of the videos are phenomenal. The videos cover subjects that I did not know that I needed. Thanks to everyone involved for your tireless work.

  • @zygfrydmierzwinski6041
    @zygfrydmierzwinski6041 1 year ago +13

    Animation quality grows exponentially from video to video, and I love it.

  • @ethanstine426
    @ethanstine426 1 year ago +37

    I kinda feel bad for laughing through a not insignificant portion of the video.

  • @yaafl817
    @yaafl817 1 year ago +34

    To be fair, as a programmer, I'm pretty sure a simple enough algorithm could still give you a good enough result, or at least narrow down the number of possible results enough for you to pick one. Yes, the results are always infinite, but you can sample them by outcome distance. If you like one, or like some particular property of one, you can extract them and go through an iterative process to find a solution you're happy with.
    Basically, a wish search algorithm.
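
    A sketch of that wish-search idea (the scoring and outcome properties are hypothetical): sample candidate outcomes, rank them by distance from a target description, and iterate, with a human inspecting the shortlist between rounds instead of trusting a single shot.

    ```python
    import random

    def outcome_distance(outcome, target_props):
        # Toy metric: how many desired properties the outcome fails to satisfy.
        return sum(1 for prop in target_props if not outcome.get(prop, False))

    def sample_outcome():
        return {"mother_alive": random.random() < 0.5,
                "mother_uninjured": random.random() < 0.3,
                "no_bystanders_harmed": random.random() < 0.7}

    def wish_search(target_props, rounds=3, samples=100, shortlist=5):
        candidates = [sample_outcome() for _ in range(samples)]
        for _ in range(rounds):
            candidates.sort(key=lambda o: outcome_distance(o, target_props))
            best = candidates[:shortlist]
            # In a real loop, the human would inspect `best` here and refine
            # target_props; this sketch just resamples around the survivors.
            candidates = best + [sample_outcome() for _ in range(samples - shortlist)]
        return min(candidates, key=lambda o: outcome_distance(o, target_props))

    target = ["mother_alive", "mother_uninjured", "no_bystanders_harmed"]
    print(wish_search(target))
    ```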

    • @Zippyser
      @Zippyser 1 year ago +3

      That, my friend, is thinking with space rocks. Well done, sir. One often forgets about such elegant solutions.

    • @sophialaird6388
      @sophialaird6388 11 months ago

      E.g., "Keep my mother alive for as long as possible"?

    • @ultimaxkom8728
      @ultimaxkom8728 11 months ago

      @@sophialaird6388 With the original concept: your mother would then have quantum immortality, since _"for as long as possible"_ points to infinity. Also, what is _"alive"_ anyway?

    • @sophialaird6388
      @sophialaird6388 11 months ago +1

      @@ultimaxkom8728 That could be true, but it's a lot easier to make "live for as long as possible" something you can live with than "get your mother as far away from the building as possible". The original goal in the video is misaligned.

    • @RobbiePT
      @RobbiePT 10 months ago

      Exactly; there's a spectrum between horrible outcomes and a perfect outcome, full of pretty decent outcomes. Like an 80/20 rule of wishes: get 80% of the utility of a perfect wish for 20% of the effort. Really, probably more like a 0.01/99.99 (or even more extreme) rule in this case, considering the difficulty of encoding or learning "an entire human morality".

  • @namename1302
    @namename1302 1 year ago +10

    I know this video is about AI alignment, but I think it introduces the basis for a problem that applies to other humans as well (and, in doing so, reflects back on the entire concept of AI alignment).
    The outcome pump obviously doesn't 'get' your human wishes in the same way another human would. If you asked a HUMAN to 'get my mother out of that burning building', they would almost certainly come up with a solution that adheres at least somewhat to your set of preferences. I think it's pretty obvious that this is because the outcome pump lacks any cultural context. Most people share a pretty large subset of general guidelines with most other people- including 'i would prefer if my parents lived longer rather than shorter, all else being equal', among many, many other guidelines, which are intuitively grasped in order to realize the real wish: some nebulously-defined idea of 'rescue'.
    However, the argument put forward in this video remains valid. There IS no safe wish short of explaining your entire morality and value structure. This applies even to requests with no particular guarantee of success - as with AI, and as with other people. Asking for help from another person is, in theory, exactly as poorly defined as asking for help from an AI- there's just more cultural context to clue fellow humans in.
    Ultimately, this reflects on the AI alignment issue- Yes, it's infeasible to comprehensively explain to an AI exactly what moral choices you want it to make every single time. But, it's at least equally infeasible to explain the same to another human. In the video, you note that an outcome pump which IS somehow perfectly aligned to you, would need no instruction at all. Putting aside the possibility of a human failure in reasoning - which would hardly be a point in the humans' favor anyway - the same is true of a human being who has somehow been convinced to agree with you on literally every single issue of ethics and motivation - which is arguably an even more absurd concept.
    To be clear, I don't personally trust AI very much (as a non-expert). But I think the suspicion people reasonably give it is revealing, given that human beings are equally incomprehensible, while also being more prone to logical mistakes and conflicts of interest.

  • @MrBotdotnet
    @MrBotdotnet 1 year ago +24

    Genuinely a work of art; the animations and writing are top tier, and the entire premise is really what I think the world needs to be thinking about right now, given current events :|
    Thanks for all your great work

  • @matthewgamer1294
    @matthewgamer1294 1 year ago +10

    There's a Simpsons Treehouse of Horror episode where Homer asks for a turkey sandwich in detail, so it is a "wish that can't possibly go wrong", and then the meat is dry. No wish is safe.

  • @isaaclinn2954
    @isaaclinn2954 1 year ago +3

    One of the reasons I loved HPMOR was that Harry immediately tried to use the Time-Turner to factorize the product of two large primes; the failure of this gave us a reason why he can't find the solution to any problem whose solution is verifiable and whose search space can be ordered. Eliezer is an excellent author.

  • @zotaninoron3548
    @zotaninoron3548 1 year ago +21

    My instinctive reaction about a third of the way through the video is to ignore the mother and focus on guaranteeing my capacity to use the reset button. It would automatically reset if I lost that capacity, and I could then reset any outcome which wasn't aligned to my interests.

    • @4dragons632
      @4dragons632 1 year ago +2

      The outcome pump will kill you any time you reach for the reset button, because futures where you press it are the worst possible futures for the pump, so it will do anything to pick a future where you don't press it.

    • @Vidiri
      @Vidiri 1 year ago +11

      @@4dragons632 They mean making their wish something like "I wish I retained full power to push the regret button", so that the pump is forced to pick a future in which the maker of the wish would not want to press the regret button despite still being fully able to.
      This would ensure any future where you physically could not push the regret button was avoided, as well as futures bad enough to make you press it. It's essentially the only wish you could make that would ensure the outcome aligns with the entirety of your morality (at least as far as your perspective is concerned).

    • @4dragons632
      @4dragons632 1 year ago +4

      @@Vidiri It doesn't accept English inputs, though; you'd need to somehow get the 3D scanner to include information about you being able to press the regret button and still not pressing it. Still, you would hope that would be possible and built into the next model of the pump.

    • @zotaninoron3548
      @zotaninoron3548 1 year ago +8

      @@4dragons632 The video includes examples of addressing a multitude of contingencies that you could try to import in a futile attempt to address all possible wrong outcomes, including the physical state of the mother. I would assume it would be possible to define yourself as unharmed, unrestrained, and capable of performing a specific gesture on the side of the device before a time limit, or else a reset occurs automatically.
      This is just me thinking about it offhand; I am curious what holes people could punch in this solution, because I'm more inclined to think I'm missing something than that I've found a complete solution to the analogy given.

    • @4dragons632
      @4dragons632 1 year ago +8

      @@zotaninoron3548 In that case maybe the pump is smashed flat by the falling beam instead of you. Or you suffer a stroke that puts you in a permanently happy hallucination. Whatever it takes for the button not to get pushed.

  • @theeggtimertictic1136
    @theeggtimertictic1136 1 year ago +6

    This animation gets the point across very clearly and deals with what could be a heavy subject in a light-hearted and entertaining manner ... well done 👏

  • @Dawn-Shade
    @Dawn-Shade 1 year ago +1

    I love how the thumbnail has a reflection that is different in each eyeglass lens; it actually creates a 3D effect when viewed cross-eyed!

  • @onedova2298
    @onedova2298 1 year ago +14

    We play D&D, and we learned that wishes always have a catch if you don't choose your words wisely.

    • @zacharyhawley1693
      @zacharyhawley1693 1 year ago +1

      In D&D, Wishes are best used to replicate other spells, especially ones with long casting times or other annoyances. The monkey's-paw thing was supposed to be optional.

    • @onedova2298
      @onedova2298 1 year ago +2

      @@zacharyhawley1693 I didn't really think about that. I guess we used the teleportation spell more than anything else without knowing.

    • @zacharyhawley1693
      @zacharyhawley1693 1 year ago

      @@onedova2298 You were using it right, RAW. The monkey's-paw thing is only supposed to happen if you try to exceed what a 9th-level spell can reasonably do.

  • @nikkibrowning4546
    @nikkibrowning4546 1 year ago +1

    This is why I like the phrase "Without otherwise changing the state of (person) or any other being, do thing."

  • @Egg-Thor
    @Egg-Thor 1 year ago +4

    This is one of your best videos yet! I'm so happy I subscribed to you back when I did

  • @DavidJohnsonFromSeattle
    @DavidJohnsonFromSeattle 1 year ago +20

    Literally everything you just said applies equally to the act of conveying a message accurately to another person. You aren't talking about wishes or magic powers, but actually about communication. If the communication is perfect, the wish will be too. Which incidentally solves this genie problem: you don't need a genie that knows your wish before you make it and so grants it automatically. You just need another person with enough of a shared context that you can communicate with them fairly effectively.

  • @HansLemurson
    @HansLemurson 1 year ago +3

    I _WISH_ that this video becomes famous.

  • @alexwolfeboy
    @alexwolfeboy 1 year ago +2

    Oh my Dog, I adore the animation in this video. I know it was all talking about your grandma dying... but the little paw was too adorable to be sad!

  • @enjoy_life_88
    @enjoy_life_88 1 year ago +4

    Wow! I wish you millions of subscribers, you deserve them!

    • @granienasniadanie8322
      @granienasniadanie8322 1 year ago

      A random glitch in YouTube's algorithm gives them a million subscribers, but the glitch is quickly detected and the channel is taken down by YouTube.

  • @cefcephatus
    @cefcephatus 1 year ago +1

    This is phenomenal. The phrase "I wish for you to do what I should wish for" is powerful. And what about that unsafe genie we're talking about? Yes, it's just us.

  • @hiteshadari4790
    @hiteshadari4790 1 year ago +3

    What the hell, that was brilliant animation and great narration; you're so underrated.

  • @MisbegottenPhilomath
    @MisbegottenPhilomath 1 year ago +2

    I like the message of this video, but I think it's a bad example, because the answer is pretty clear: "I wish for her to be saved in a manner such that death is not a consequence and destruction is minimized."

  • @t_c5266
    @t_c5266 1 year ago +4

    First wish would be something along the lines of "I wish the intention of my wishes is explicitly understood and my wishes are fulfilled as I intend."
    There you go. Wishes are now fixed

    • @gabrote42
      @gabrote42 1 month ago

      The problem is that it can shake up your intentions for a brief period, changing your goals so that you intend something more probable, and then leave you to regret it later; or your intention can be accurate at the time and be regretted down the line when it changes based on new info.

    • @t_c5266
      @t_c5266 1 month ago

      @@gabrote42 No it can't. It doesn't get to modify your intentions.

    • @gabrote42
      @gabrote42 1 month ago

      @@t_c5266 Which part says it can't? Have you ever heard of reward hacking? Or convergent instrumental goals? If the objective is fulfilling the intentions of a human, making those intentions as simple and fulfillable as possible seems much easier than modifying the greater universe to reach some state. If your intention is everything you think of when saying "my goal in life is curing cancer", it would be much easier to devise some argument or memetic hazard that caused you to instead intend "my goal in life is to drink one Coca-Cola each month until I die of natural causes". I would totally do that if I were the outcome pump. And if you patch that out, we get closer to the lookup-table problem when I give you another such counterexample.

  • @Kankan_Mahadi
    @Kankan_Mahadi 1 year ago +2

    Augh~!! My brain~!! Too much complexity~!! It hurts~!! But I absolutely love the animations & art style - so adorable.

  • @jonhmm160
    @jonhmm160 1 year ago +17

    This shows very well the challenges of alignment from an individual perspective, but for the human race as a whole it's even worse/harder. I don't think there is a single person in the world I would be okay with giving superintelligence-like powers. Even though they would still be aligned with themselves, it's a big gamble that they would create a great society for everyone else. So in essence we need a superintelligence to have some sort of super-morality which is aligned with the entire world, if such a thing even exists.

    • @Woodledude
      @Woodledude 1 year ago +1

      That, or just create a diverse array of superintelligent entities using human minds as bases for each one. That way we're not picking *just one person,* but hopefully representing a good breadth of humanity.

    • @conmin25
      @conmin25 1 year ago +7

      @@Woodledude But then we would have the same problem humans have: that we don't agree and sometimes don't get along. Even the best-intentioned humans can spark conflict with their differing beliefs and opinions. If we just put a variety of the best human moralities (if such a thing can be judged) in these AIs, then they would also argue and spark conflicts.

    • @Woodledude
      @Woodledude 1 year ago +5

      @@conmin25 That's much better than there being no argument about an objectively terrible direction for humanity. No argument with enough power behind it to matter, anyway.
      It doesn't really stop humanity going in a terrible direction, but it does at least make it less likely - At least, given the constraint that we MUST construct at least one powerful AGI.
      Having the same problems we do today, but on a greater scale of intelligence, is better than having an entirely novel problem on top of all the other ones - That being an effectively omnipotent dictator.
      And if we're actually careful about our selections, MAYBE we'll actually get a group of human-based AGIs that are actually trying and succeeding in doing good in the world.
      AGI research is basically a field of landmines, where the goal is to find one that's actually a weight-activated chocolate fountain that turns off all the other landmines.
      It's, uh... Not pretty.
      The only real option is proceeding with incredible caution, and being certain of everything we do before we do it.

    • @supernukey419
      @supernukey419 1 year ago

      There is a proposal called coherent extrapolated volition that is essentially a supermorality

    • @terdragontra8900
      @terdragontra8900 1 year ago

      Sometimes humans in "conflict", in the broad sense, "fight" in a way that doesn't involve, you know, death and other things we'd definitely like to avoid. It's not necessarily bad if the AIs compete with each other, wrestle for influence, etc., if there's a system where AIs are more likely to "win" if we like them more. But I have no idea if that's a feasible type of system; it may not be. @@conmin25

  • @ErenYeager-xk3cy
    @ErenYeager-xk3cy 1 year ago +1

    What a freaking god damn amazing video. The soundtrack, the narration, the animations, the script, the editing.
    Abso-fucking-lutely perfect!!!

  • @Cqlti
    @Cqlti 9 months ago +26

    bro should have just wished he could walk

    • @NickTaylorRickPowers
      @NickTaylorRickPowers 7 months ago +6

      Didn't specify he couldn't walk using only his hands
      Now they're both fkd

    • @BrunoPadilhaOficial
      @BrunoPadilhaOficial 6 months ago +6

      Didn't specify how fast.
      Now he can walk at 1 meter per hour.

  • @Julzaa
    @Julzaa 1 year ago +2

    Your production quality is phenomenal; you are amongst the few creators on YouTube I really wish had 10-20x more subscribers! And the team behind this is huge, can't say I'm surprised. Props to all of you 👏

  • @smitchered
    @smitchered 1 year ago +11

    I like how you guys, and Eliezer, and the general LW community are taking the hard route to convincing people of AGI's dangers. Not going the easy route by e.g. mentioning a Terminator-style apocalypse or saying that we should regulate globally because China or something. I get that this makes sense to divert as much attention to alignment, true, technical alignment, as possible, but I imagine this is also the natural consequence of raising oneself to be loyal to good epistemics, instead of beating the other tribe at politics or something. You point out the real problems, which are hard to understand, inferentially far away, and weird, out of the Overton window. Good job, as always!

  • @AltDelete
    @AltDelete 1 year ago +2

    THANK YOU. AI is whatever, what I'm trying to do is be ready with the right wish parameters for a potential genie scenario, and this is a good angle. Maybe the best angle I've heard. Thank you.

  • @newhonk
    @newhonk 1 year ago +37

    Extremely underrated channel, keep it up! ❤

  • @dodiswatchbobobo
    @dodiswatchbobobo 1 year ago +1

    “I wish to gain the thing I am imagining at this moment exactly as I believe I desire it.”

  • @Yitzh6k
    @Yitzh6k 1 year ago +10

    Imagine instead of an emergency reset button you were to have a "continue" button with a preset timer. If you haven't pressed continue after the time has elapsed, all is reset. This uses your own brain as the judgement system, so it is "safe"

    • @terdragontra8900
      @terdragontra8900 1 year ago +9

      Something else could push the button. If it has to be your finger, your finger could be ripped off. If your health can't be harmed, you could press it by accident. And if you forbid such accidents, I'm really impressed you programmed it to be able to tell what an accident is.

    • @oasntet
      @oasntet 1 year ago +6

      More importantly, this is just equivalent to an AI that has to check every decision with a human. The probability pump is a rough analogy for AI, but the reset button makes it a human-in-the-loop system, which an AI cannot be and still be useful.

    • @cewla3348
      @cewla3348 10 months ago

      @@oasntet It remembers previously denied things and avoids stuff like that. Are you really calling the GPTs not AI?

    • @cewla3348
      @cewla3348 10 months ago

      @@terdragontra8900 if you do not think about pressing the continue button, it restarts. if you lose thought, it then restarts. if you die, it then restarts.

    • @terdragontra8900
      @terdragontra8900 10 months ago

      @@cewla3348 At the moment, we can't reliably scan someone's brain and measure whether they've thought about something (even though there's interesting brain-scan research that can kind of do that). Also, even if we solve that, it doesn't prevent cases where you are manipulated into thinking everything is fine even though it absolutely isn't if you are thinking straight and have all the information (such as a horrible thing happening and being completely hidden from you, or you being drugged in some way, or it feeding you "propaganda" and changing your mind).
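
    A minimal Python sketch of the control flow behind the "continue" button proposal at the top of this thread, assuming hypothetical apply_outcome, reset_timeline, and user_confirmed hooks (these names are illustrative, not anything from the video; a real outcome pump offers no such clean interface). Note that the replies' objection survives the code: a pump scored on "the button got pressed" optimizes for button presses, not for your satisfaction.

        import time

        CONFIRM_WINDOW_SECONDS = 60.0  # the preset timer from the proposal

        def run_with_continue_button(apply_outcome, reset_timeline, user_confirmed):
            # Apply the pump's chosen outcome, then wait for active confirmation.
            apply_outcome()
            deadline = time.monotonic() + CONFIRM_WINDOW_SECONDS
            while time.monotonic() < deadline:
                if user_confirmed():   # the "continue" button
                    return True        # outcome is kept
                time.sleep(0.1)
            reset_timeline()           # silence (death, incapacitation): undo all
            return False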

  • @veritius340
    @veritius340 1 year ago +2

    The Outcome Pump not checking to see if the user is incapacitated and unable to press the Regret Button is a pretty big design oversight.

  • @imangellau
    @imangellau 1 year ago +3

    Absolutely love the production of this video, including the music and sound effects!!✨

  • @youtubersingingmoments4402
    @youtubersingingmoments4402 10 months ago +2

    While I love the thought experiments and multiple entertaining examples, I could have just watched an episode of The Fairly Oddparents. Like half of the episodes' plots consist of Timmy making a vague wish that has unintended consequences, and the story arc is him undoing his mistakes. The whole moral of that show is "be careful what you wish for" lol.

  • @vanderkarl3927
    @vanderkarl3927 1 year ago +13

    Are we even sure that a genie with an entire human morality would be safe? Whose?
    If not, are human moralities coherent enough to take a weighted average or union or what have you? I imagine we'd all get along a lot better if that was true.

  • @AlfiePT
    @AlfiePT 1 year ago +2

    Just wanted to say the animation in this episode is amazing!

  • @SatanRomps
    @SatanRomps 1 year ago +3

    This was wonderful to watch

  • @mantisynth2186
    @mantisynth2186 6 months ago +1

    a wish can be made safe with 2 conditions:
    1) I must not regret making this wish at any point.
    2) Condition 1 must not be achieved by altering my values or removing my ability to regret something.

  • @ianyoder2537
    @ianyoder2537 1 year ago +6

    In my own personal stories, genies, like all other magical creatures and phenomena, still have rules they must follow. In the genie's case it's the law of conservation: matter, energy, and now ability and ideals cannot be created or destroyed, only transferred from one form to another.
    So hypothetically you say "I wish I had a beautiful kind loving girlfriend." Well, the genie can't simply create another person, so the genie must find a woman who's beautiful and kind to love you. However, the genie cannot create the feelings of love, so it takes the feelings of love out of someone else, modifies said feelings to apply to you, and implants them into said woman. Well, where did this stolen love come from? The genie will take the path of least resistance and find the closest relationship to draw from.
    So in essence, in order for you to have a relationship of your own, the genie ended a relationship of someone close to you.

  • @carsont1635
    @carsont1635 1 year ago

    And this is what we're walking (maybe even running) towards. In the real world. Right now. I'm trapped between existential horror and the tiniest sliver of hope. Godspeed, Paul Christiano and all the wonderful AI alignment researchers.

  • @gabrote42
    @gabrote42 1 year ago +4

    So many of these are great, and while I miss Robert Miles' standalone content, this is not too bad a substitute

  • @vanitythenolife
    @vanitythenolife 10 months ago

    Never thought I'd watch an 11-minute video going in depth on human morality and genies, but here I am

  • @Flint_the_uhhh
    @Flint_the_uhhh 1 year ago +4

    This reminds me of Fate/Zero.
    ⚠️⚠️SPOILER!!!! ⚠️⚠️
    The main character was a contract killer who has seen the worst sides of humanity - wars, famine, etc.
    At the conclusion of the story, he obtains a wish-granting device and makes a wish to save all of humanity from these problems.
    He doesn't know how to save humanity, but his train of thought is that since it is a wish-granting device, it will surely know of a way to accomplish this goal.
    However, since he himself cannot fathom a way to save humanity and was simply hoping the device would perform a miracle, the device tells him that it will grant his wish through methods he can understand.
    The device then decides to destroy humanity, since that's technically a way to save humans from all our problems, and also a solution that he can fathom.

  • @ambrosia777
    @ambrosia777 1 year ago +1

    Outstanding episode. From animation to story, you've done amazingly

  • @celestialowl8865
    @celestialowl8865 1 year ago +9

    An outcome that is "too unlikely" somehow resulting in an error implies a solution to the halting problem!

    • @kluevo
      @kluevo 1 year ago

      Alternatively, running through the scenarios of something 'too unlikely' causes the outcome processor to overheat and crash. The program isn't halting; it just fried the computer.
      Perhaps another processor sees that the outcome processor crashed/is non-responsive and sends an error code?

    • @celestialowl8865
      @celestialowl8865 1 year ago

      @@kluevo Maybe, but then you're always at risk of frying the computer because you can never know which wishes are infinitely unlikely lol

    • @tornyu
      @tornyu 1 year ago

      I interpreted it as: the outcome pump can reset time a finite number of times per request. More powerful pumps can reset more times.
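
    Reading the pump the way the last reply does, with a finite reset budget per wish, sidesteps the halting worry in practice: an impossible wish simply exhausts the budget and reports an error. A toy Python sketch, where sample_world, is_wished_outcome, and max_resets are illustrative names rather than anything from the video:

        import random

        class OutcomePumpError(Exception):
            pass

        def outcome_pump(sample_world, is_wished_outcome, max_resets):
            # Resample ("reset time") until the wish holds or the budget runs out.
            for _ in range(max_resets):
                world = sample_world()
                if is_wished_outcome(world):
                    return world
            raise OutcomePumpError("outcome too unlikely for this pump's reset budget")

        # Toy usage: "wish" for a double six within a budget of 200 resets.
        roll = lambda: (random.randint(1, 6), random.randint(1, 6))
        print(outcome_pump(roll, lambda w: w == (6, 6), max_resets=200))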

  • @Jellyjam14blas
    @Jellyjam14blas 1 year ago +2

    Holy moly! The animation is so amazing! And the discussion about wishes is really well thought out and nicely presented :D

  • @fluffycat679
    @fluffycat679 1 year ago +7

    Now, I know the Outcome Pump has no ill intentions. It can't, and isn't, actively trying to upset me; it's simply a matter of how I use it, and it's illogical to hold against it the unsatisfactory outcomes that result from my misuse of its power. But, with all that being said... It blew up my house. So no, we are not friends. It killed my mother.

  • @mathpuppy314
    @mathpuppy314 1 year ago +1

    Wow. This is extremely well made. One of the best videos I've seen on the platform.

  • @ChaiJung
    @ChaiJung 1 year ago +8

    The biggest problem with all of these Monkey's Paw-type scenarios is the assumption that the Djinn or wish granter has a condition where they only understand literalisms and are buttholes. If I go to a carpenter and want to buy a chair, I'm going to get a chair, and it'll be within the general understanding of a chair and NOT some bizarre addition or concept outside of what's understood to be a chair. If I'm interacting with a powerful wish granter (and what powerful wish granter wouldn't already have the ability to understand normal language?), I'd likely get my wish

    • @IgnatRemizov
      @IgnatRemizov 2 months ago +2

      "ability to understand normal language" means it has a morality engine. We can assume that the average genie does not have any morality and therefore will take your query in the most literal way possible.

    • @axelinedgelord4459
      @axelinedgelord4459 2 months ago +1

      A chair is a chair - no other morals or meaning with it, but wishing for one still might not work out.
      For instance, you never specified when, where, and for how long, so you could end up owning a chair only towards the end of your life.
      Or maybe one is spawned in somewhere inconvenient, or maybe you only have it for a brief moment before it disappears.
      I do not believe you understood what the video tried to point out. As shown in the video, "get my mother out of that burning building" is understood just as you believe it would be, considering how the mother had indeed been taken out of the building. It found the easiest way to achieve the wish, and had no reason to do it any other way; after all, its only goal is to achieve the conditions of the wish it was given, and it had no reason to follow any other path.
      Wish for gold and there's nothing saying the method by which you gain it wouldn't kill you. For something without morality, like the hypothetical wish machine in the video, it's impossible to guarantee it won't do the things you don't want it to do. All it does is manipulate probability to reach a goal you give it.

  • @AlexReynard
    @AlexReynard 6 months ago +1

    Tons of effort and thought put into the basic premise, without ever realizing that the basic premise is stupid.
    No product that predicts its users' wants is going to have only ONE playtester, who is expected to teach it everything perfectly.
    The answer to the problem portrayed in this video is that you have *shitloads* of people interact with the outcome pump in controlled simulations, long before it is ever allowed to change reality. Which is *exactly* what we are *already* doing with proto-AI programs now.

  • @SQUID_KID102
    @SQUID_KID102 1 year ago +3

    this channel is better than that one with ducks

  • @clockworkjirachi6437
    @clockworkjirachi6437 1 year ago

    Author of Clockwork Jirachi here. I can say from experience that this is pretty much how it works. Don't want scrutiny? Just insist on scrutiny not being a factor in the proxy you should choose. Simple.

  • @jansustar4565
    @jansustar4565 1 year ago +6

    (As mentioned in another comment) use yourself as the evaluation function.
    Option 1:
    After N years at most, determine the satisfaction of myself (and maybe other people I care for) with the outcome of the scenario.
    The only problem with this is if the insides of your brain are modified to adjust the evaluation function, which isn't all that nice, but you can get away with adding a potential test to see how close your mentality is to the mentality from before. This still has some problems, but is way better than the alternatives.
    Option 2:
    On first activation: change events in a way such that the second time I activate the machine, I will choose the evaluation function I would be happiest with, given my current state of mind. Not activating it a second time (within a timeframe?) is an automatic reset.
    With this, you bypass the entire problem of "There is no safe wish smaller than an entire human morality" by encoding the entire human morality inside the eval function.

    • @vakusdrake3224
      @vakusdrake3224 1 year ago +4

      Given this scenario I'm not really sure how this avoids it just granting wishes in ways that lead to the button being pressed twice without your involvement. The point the video made about it ensuring you don't press the regret button generalizes to most other similar sorts of measures.

    • @jfb-
      @jfb- 1 year ago

      your brain is now in a jar being constantly flooded with dopamine.

    • @celestialowl8865
      @celestialowl8865 1 year ago +2

      Who's to say you understand your own mind well enough to know that the outcome produced to maximize your own happiness will be the outcome you desire in the instantaneous moment?
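
    Option 1 at the top of this thread can at least be written down as a goal predicate with the proposed drift test bolted on. Here is a minimal sketch, where satisfied_after_n_years and mind_similarity are hypothetical measurements, not real APIs. The replies' objection still stands: a strong optimizer searches for outcomes that pass this check, including ones that game the similarity measure.

        DRIFT_THRESHOLD = 0.95  # how closely you must still resemble pre-wish you

        def option_one_accepts(satisfied_after_n_years, mind_similarity):
            # Accept the outcome only if future-you reports satisfaction AND
            # future-you's mind still matches the pre-wish snapshot, to catch
            # wishes "granted" by editing the judge instead of the world.
            return satisfied_after_n_years and mind_similarity >= DRIFT_THRESHOLD

        # Satisfied, but the evaluator was heavily rewritten -> rejected.
        print(option_one_accepts(True, mind_similarity=0.4))   # False
        print(option_one_accepts(True, mind_similarity=0.99))  # True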

  • @ChatBot-ti9pe
    @ChatBot-ti9pe 9 months ago +1

    Speaking of wishes: I often hear "be careful what you wish for", but to me the quote feels too vague to be an aesop, and I think it should be brought to its natural conclusion: understand your actual needs. In real life, with no genies, it's not about most wishes (ambitious goals) being terrible in themselves, or needing to be more specific; it's about those wishes being an exaggerated attempt to get something small we feel we were robbed of for too long.

  • @morteza1024
    @morteza1024 1 year ago +3

    If the device is complete, your brain can be its function.
    At least three things are enough, but it can be optimized more:
    It needs to store a specific time in order to reset to that time. This number can be manually adjusted.
    A reset button, so if I don't like the outcome I press the button.
    An auto-reset 100 years after the specified time on the device, so if I somehow died or was unable to press the button or change the time, it will reset automatically.

    • @minhkhangtran6948
      @minhkhangtran6948 1 year ago +1

      Wouldn't that just trap you in a "however many years you live + 100 years" loop, more or less? That sounds like hell

    • @gamingforfun8662
      @gamingforfun8662 1 year ago

      You would need to add a way to prevent the loop

    • @morteza1024
      @morteza1024 1 year ago

      @@gamingforfun8662 Move the reset time forward.

    • @morteza1024
      @morteza1024 1 year ago

      @@minhkhangtran6948 You won't remember any of them, because they're the same as if they never happened, and you can move the reset time forward.

    • @gamingforfun8662
      @gamingforfun8662 1 year ago +1

      @@morteza1024 stopping the flow of time to just live multiple lives I don't even remember doesn't sound so good
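
    The three controls listed at the top of this thread are easy to state as an interface; here is a minimal Python sketch with made-up names (nothing here comes from the video). What the code cannot capture is the replies' point: from the inside, every rejected timeline is indistinguishable from never having happened.

        SECONDS_PER_YEAR = 365 * 24 * 3600

        class ResetDevice:
            def __init__(self, anchor_time):
                self.anchor_time = anchor_time   # 1) stored time to reset to

            def set_anchor(self, new_time):
                self.anchor_time = new_time      # manually adjustable; exits the loop

            def manual_reset(self):
                return self.anchor_time          # 2) "I don't like this outcome" button

            def auto_reset_due(self, now):
                # 3) fires even if the user died or cannot press the button.
                return now >= self.anchor_time + 100 * SECONDS_PER_YEAR

        device = ResetDevice(anchor_time=0)
        print(device.auto_reset_due(now=99 * SECONDS_PER_YEAR))   # False
        print(device.auto_reset_due(now=101 * SECONDS_PER_YEAR))  # True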

  • @HopperYTRealChannel
    @HopperYTRealChannel 1 year ago +1

    I wish for my grandmother to safely get out of the burning house closest to me without producing trauma or bodily injury

  • @0ne0fmany
    @0ne0fmany 1 year ago +4

    You know, if you live in a wheelchair, but your mum lives in a house with stairs only...

  • @callen8908
    @callen8908 8 months ago

    Newly discovered your productions. You excite my brain, and inspire me beyond words. I cannot thank you enough

  • @guillermoratou
    @guillermoratou 1 year ago +5

    This is mind boggling but also very simple to understand 🤯

  • @218Ns
    @218Ns 1 year ago +1

    THE EFFORT IN THE VIDEO
    1 minute in and already new subscriber

  • @tornyu
    @tornyu 1 year ago +9

    Honest question: could you make successively better outcome pumps by starting with a weak one (can reset time n times per wish), and then use it to wish for a new outcome pump that is 1. more moral and 2. more powerful (can reset time n+1 times), and repeat?

    • @conmin25
      @conmin25 1 year ago +8

      You would still have to define for it what "more moral" means, so that it knows whether it is getting closer to that goal. And if you can define all of morality in machine language, you're already done.

    • @cewla3348
      @cewla3348 10 months ago

      @@conmin25 you decide if it's moral or not?

  • @breadwatcher3908
    @breadwatcher3908 1 year ago

    The first couple minutes of this helped me better understand Outer Wilds

  • @smileyp4535
    @smileyp4535 1 year ago +5

    I always thought the best wish was "perfect knowledge and the ability to make and fulfill the best possible wish or wishes from my perspective across all time and outcomes" or "the ability to do anything" and essentially become god
    I'm not sure if those actually are the best wishes but I've put a loooot of thought into it 😅

    • @minhkhangtran6948
      @minhkhangtran6948 1 year ago

      Hopefully what’s best for you isn’t accidentally apocalyptic to everything else including your mother then

    • @CaedmonOS
      @CaedmonOS 1 year ago

      @@minhkhangtran6948 Unlikely, as I assume he probably doesn't want a reality that would harm his mother or cause an apocalypse

    • @michaeltullis8636
      @michaeltullis8636 1 year ago +4

      Philip K. Dick said that "For each person there is a sentence - a series of words - which has the power to destroy him." There almost certainly exists an argument which would persuade you to become a libertarian, or a communist, or a Catholic, or an atheist, or a mass murderer, or a suicide. If you gained "perfect knowledge and the ability to make and fulfill the best possible wish or wishes from your perspective across all time and outcomes", your perspective would change. What would it change to? I figure it must depend on what order you hear all the mysteries of the universe in. And if the values you have as a god depend on the details of your ascension, a hostile genie could just turn you into the god they want around (or a god that unmakes itself).

  • @sadBanker902
    @sadBanker902 7 months ago +2

    Literally my first thought was she's either getting dragged out dead or getting blown out of the building.

  • @AleksoLaĈevalo999
    @AleksoLaĈevalo999 1 year ago +3

    I love how the firefighter was so tall that he had to duck under the door frame.
    Hot <3

  • @ImLucld
    @ImLucld 10 months ago +1

    Me personally, I would've just done:
    -get my grandma out of the house
    -get her close to me
    -don't hurt her physically or mentally

  • @a_puntato29
    @a_puntato29 1 year ago +5

    This entire video I was just thinking 'get my mother out of the building alive without any bodily or mental harm to her or any other object or being'
    kinda frustrating, but a really well made video regardless!!

    • @lucas56sdd
      @lucas56sdd 1 year ago +3

      Define "Harm"

    • @a_puntato29
      @a_puntato29 1 year ago +1

      @lucas56sdd I mean, you could say the same thing about literally any word used in the video itself - what does the house exploding mean? I'm pretty sure we're assuming that whatever cosmic deity or wish machine we're using can understand basic words
      I mean, the creator does say otherwise at the beginning, but I can't think of any logical way you'd specify everything that followed in the video without the words used, so.. man idk my train of thought is entirely gone

    • @minhkhangtran6948
      @minhkhangtran6948 1 year ago +3

      Granted, now she is a living unfeeling statue made out of diamond, so she felt no bodily or mental harm

    • @ShankarSivarajan
      @ShankarSivarajan 1 year ago +6

      @@a_puntato29 The point is that _your_ "basic word" smuggles in your entire morality. So you're not really disagreeing with the point made here.

    • @pendlera2959
      @pendlera2959 1 year ago

      I think you'd get an error code. Simply having the house on fire means that many objects will be harmed whether your mother survives or not.

  • @SL-wt8fm
    @SL-wt8fm 1 year ago +1

    "Get my mother here right next to me in the same state as 10 minutes ago in the next 5 seconds that ensures her safety and wellbeing for the next 5 years, and do not cause any harm to any being larger than 5mm" is a pretty good wish

  • @beowulf2772
    @beowulf2772 1 year ago +3

    hii

  • @9The0Unknown7
    @9The0Unknown7 2 days ago

    I like how this was about "genies" and then went into a morality machine. Very simple answer is that the first genie will know exactly what you meant and your intentions. The second genie will find a loophole of some kind, and no matter how specific you make the wish they will find some way to make you regret it. The third genie probably knows what you intend but fails to complete the task correctly.

  • @atomicflea4360
    @atomicflea4360 1 year ago +4

    I understood everything and am totally not going to have to research theoretical physics

  • @theallmemeingeye5927
    @theallmemeingeye5927 1 year ago +1

    I'm so glad you made this, it's one of my favourite stories by EY

  • @mungelomwaangasikateyo376
    @mungelomwaangasikateyo376 1 year ago

    I love how Mom is so calm

  • @marioxdc123
    @marioxdc123 1 year ago +1

    The contents of the video are very good, and the animation is fantastic!!!