"Bush hid the facts" Bug EXPLAINED

  • Published 24 Jun 2024
  • Let's reach 3000 likes 💜 Join us on a journey to uncover the most obscure Windows XP internals knowledge you will ever obtain. The "Bush hid the facts" bug got famous for seeming so politically loaded. However, this bug has nothing to do with politics. At all! In this video, we will look into WHY EXACTLY this bug happens. Well, 99% of the time.
    The two code files from the video can be found here:
    - flies.sh/IsTextUnicode
    - flies.sh/CensorOracle
    0:00 Bush hid the facts
    0:57 #1 - Pinpointing the issue
    3:18 #2 - Encoding detection
    3:56 #3 - Why is that an issue?
    5:49 #4 - The unicode detection is wrong
    6:21 #5 - What is wrong with it?
    7:56 #6 - The art of crafting
    10:00 #7 - Wikipedia
    11:22 #8 - Challenge
    11:57 Outro
    -------
    Music:
    Falling Between The Waves - TheShwarma
    • Falling Between The Wa...
    No CC license, used with friendly personal permission :)
    VHS Heroes by Punch Deck
    punchdeck.bandcamp.com
    Promoted by @RoyaltyFreePlanet - royaltyfreeplanet.com
    Creative Commons Attribution 3.0
    bit.ly/RFP_CClicense
    Don’t Go Outside by Vladmsorensen
    vladm.bandcamp.com
    Promoted by @RoyaltyFreePlanet - royaltyfreeplanet.com
    Creative Commons Attribution 3.0
    bit.ly/RFP_CClicense
    -------
    #bushhidthefacts #unicode #lookinside #flytech
  • Science & Technology

COMMENTS • 991

  • @nielot0330
    @nielot0330 11 months ago +2434

    I love computers.

  • @catomajorcensor
    @catomajorcensor 11 months ago +4161

    The question we should be asking is "how did Windows developers come up with the worst way to detect Unicode?"

    • @user-fw8ry1tq4y
      @user-fw8ry1tq4y 11 months ago +704

      - So... how we are going to detect unicode?
      - -Meth- Math

    • @RDCST
      @RDCST 11 months ago +42

      @@user-fw8ry1tq4y Meth?

    • @Cyberfishofant
      @Cyberfishofant 11 months ago +492

      @@RDCST i was gonna _crack_ a joke, but that wouldn't be funny

    • @FriedMonkey362
      @FriedMonkey362 11 months ago +204

      I mean it was good enough because it "rarely" occurred, still the amount of false positives is indeed too much and yeah it was pretty bad

    • @intron9
      @intron9 11 months ago +74

      It acts like a detector for windows's "Unicode" files that only have English characters, but with a lot of false positives.

  • @avitus27
    @avitus27 11 months ago +2327

    I can't believe it took me this long to realise: this channel is a fly explaining bugs.

    • @Axcyantol
      @Axcyantol 11 months ago +77

      wait what

    • @amberwingthefairycat
      @amberwingthefairycat 11 months ago +186

      @@Axcyantol A fly (the type of insect) explaining bugs (computer errors, but the word also means insect)

    • @TDXC------
      @TDXC------ 11 months ago +54

      He also... Well creates bugs too
      (By that I mean destroying windows)

    • @Axcyantol
      @Axcyantol 11 months ago +26

      @@amberwingthefairycat no i understood it, i was just surprised about it being a fly explaining bugs

    • @amberwingthefairycat
      @amberwingthefairycat 11 months ago +8

      @@Axcyantol Oops, sorry, haha.

  • @TwoNumbahNiens
    @TwoNumbahNiens 11 months ago +1228

    I can't believe George W. Bush would do this.

    • @alexandermorozov8593
      @alexandermorozov8593 11 months ago +199

      "W." stands for WordPad

    • @JeyPeyy
      @JeyPeyy 11 months ago +72

      Amazing how he started hiding the facts at least 7 years before it happened

    • @blueschnabeltier_
      @blueschnabeltier_ 11 months ago

      Now its George L. Bush, 'cause he hid the facts

    • @johndododoe1411
      @johndododoe1411 11 months ago

      George H. W. Bush (Sr.) was president from 1989 to 1993, and did secret government stuff with Nixon and Reagan.

    • @dylanbksp
      @dylanbksp 11 months ago +4

      do you know what i can believe bush did?

  • @shockwave952
    @shockwave952 11 months ago +1125

    15 years ago I made a "Windows XP Easter Eggs" video featuring this bug, and now I feel this strange sense of satisfaction finally knowing why it happens. Thanks FlyTech!

    • @Dog_Dogs
      @Dog_Dogs 11 months ago +13

      That video is best video.

    • @elesqueleto2010
      @elesqueleto2010 4 months ago

      i think you made the video a little too popular, still, good one 👍

  • @rogehmarbi
    @rogehmarbi 11 months ago +977

    I almost completely forgot WordPad exists, despite it being more powerful than Notepad.

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +317

      WordPad is too powerful for us mortals

    • @npoaccount9154
      @npoaccount9154 11 months ago +39

      @@FlyTechVideos yea, nothing can break it

    • @masonboone4307
      @masonboone4307 11 months ago +11

      Notepad is a descendant of wordpad

    • @zUltraXO
      @zUltraXO 11 months ago +138

      @@npoaccount9154 wasn't there a video about corrupting the heck out of windows 10 and everything was broken except for wordpad

    • @Fidumo
      @Fidumo 11 months ago +51

      @@npoaccount9154 as soon as i read that, i tried breaking it. i ended up creating a document filled with a creepy smiley face image with the size of 2,306,727,936 bytes (that's 2.3 gb). i cant open it because it's too big, and i think i corrupted the file by closing wordpad while it was saving the file. i still dont think this counts as breaking it, i think its just the file thats broken.

  • @MadsterV
    @MadsterV 11 months ago +557

    Note: What Windows Write calls "unicode" is UCS-2, a now-obsolete 16-bit encoding.
    What today we call unicode is usually UTF-8, a variable-width encoding that conveniently matches US-ASCII, though there's also UTF-16 and UTF-32.
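
A quick way to see the difference between the encodings named in this comment, sketched with Python's built-in codecs. The sample string is arbitrary, and since Python has no separate UCS-2 codec, `utf-16-le` stands in for it here (the bytes are the same for characters below U+10000).

```python
# Encode the same ASCII-only text three ways and compare the results.
sample = "Bush"  # arbitrary sample, not taken from the comment

utf8 = sample.encode("utf-8")       # one byte per ASCII character
utf16 = sample.encode("utf-16-le")  # one 16-bit unit per character here, no BOM
utf32 = sample.encode("utf-32-le")  # one 32-bit unit per character, no BOM

print(utf8.hex(" "))   # 42 75 73 68
print(utf16.hex(" "))  # 42 00 75 00 73 00 68 00
print(len(utf8), len(utf16), len(utf32))  # 4 8 16

# For pure ASCII input, the UTF-8 bytes are identical to the ASCII bytes.
utf8_matches_ascii = utf8 == sample.encode("ascii")
```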

    • @billy65bob
      @billy65bob 11 months ago +26

      There's not just those variants either.
      There's the obsolete UTF-7 too (for some reason), and the multi-byte encodings come in both Little Endian and Big Endian flavours.

    • @freyja5800
      @freyja5800 11 months ago +12

      @@weakspirit_ internally, yes. but when saving text it does default to utf8, since a text in 7-bit ascii and in utf8 are the same, and since in english you rarely need characters beyond that, using utf8 is more space efficient

    • @laurentverweijen9195
      @laurentverweijen9195 11 months ago +9

      UCS-2 and UTF-16 are more or less the same (and what windows / C# incorrectly call "unicode encoding".)

    • @Spartan322
      @Spartan322 11 months ago +8

      Technically ANSI Windows-1252 (what is functionally always used when Windows refers to "ANSI") is incompatible with UTF-8, as there are numerous byte values which UTF-8 reserves (for multi-byte sequences) but which Windows-1252 uses as printable single-byte characters. If any of those bytes exist in the string when read as UTF-8, decoding will break, which in a well-developed system will merely produce a few erroneous characters. Since Windows-1252 extends standard ASCII, most of the bytes in typical Windows-1252 text will read fine as UTF-8, specifically all the commonly used American characters. The problem only occurs when you have non-ASCII characters in either encoding: if you then try to read UTF-8 as Windows-1252 or Windows-1252 as UTF-8, you will get problems. I experienced this exact issue when dealing with a game that reads and writes Windows-1252 but has resources written in UTF-8, causing all sorts of weird problems.
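
The mismatch described here is easy to reproduce. A small sketch using Python's `cp1252` codec for Windows-1252; the word "café" is just an illustrative sample with one non-ASCII character.

```python
text = "café"  # illustrative sample with one non-ASCII character

as_utf8 = text.encode("utf-8")     # b'caf\xc3\xa9' (é takes two bytes)
as_cp1252 = text.encode("cp1252")  # b'caf\xe9'     (é takes one byte)

# UTF-8 bytes read as Windows-1252: both bytes of é are printable there,
# so nothing fails and you silently get mojibake.
mojibake = as_utf8.decode("cp1252")  # 'cafÃ©'

# Windows-1252 bytes read as UTF-8: 0xE9 announces a multi-byte sequence
# that never arrives, so a strict decoder rejects the input.
try:
    as_cp1252.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD for the broken sequence instead.
lenient = as_cp1252.decode("utf-8", errors="replace")  # 'caf\ufffd'
```

The ASCII part ("caf") survives in both directions, which matches the point that only non-ASCII characters cause trouble.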

    • @hellterminator
      @hellterminator 11 months ago +17

      @@laurentverweijen9195 UCS-2 can only represent ~6% of the reserved Unicode codepoints and ~69% of the ones already assigned, whereas UTF-16 can represent them all through surrogates. Don't get me wrong, UTF-16 is the worst Unicode encoding, but at least it *is* a Unicode encoding, unlike UCS-2.
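
To make the surrogate point concrete, here is a short Python sketch that lists the 16-bit code units UTF-16 produces for one character inside the 16-bit range and one outside it. The helper name `utf16_units` and the two sample characters are chosen for illustration only.

```python
def utf16_units(ch):
    """Return the 16-bit code units that UTF-16 uses for one character."""
    raw = ch.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

bmp_char = "\u2654"         # WHITE CHESS KING, fits in a single 16-bit unit
astral_char = "\U0001F600"  # an emoji above U+FFFF, out of reach for UCS-2

print([hex(u) for u in utf16_units(bmp_char)])     # ['0x2654']
print([hex(u) for u in utf16_units(astral_char)])  # ['0xd83d', '0xde00']
```

The second result is a surrogate pair: two units from the reserved D800 to DFFF range that together stand for one code point.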

  • @y7o4ka
    @y7o4ka 11 months ago +500

    Dude really dug into windows api's assembly to uncover a strange bug from 1994
    Good job!

    • @thevorkman_6551
      @thevorkman_6551 11 months ago +7

      I think he didn't dive in, he just wrote his own that works similarly...
      Maybe, idk

    • @aetimes2
      @aetimes2 11 months ago +43

      @@thevorkman_6551 He went into the code to figure out how it worked, and then wrote his own that worked the same

    • @mstech-gamingandmore1827
      @mstech-gamingandmore1827 11 months ago +56

      Don't forget the special... _sources_ ;)

    • @thevorkman_6551
      @thevorkman_6551 11 months ago +1

      @@mstech-gamingandmore1827 Yes)))

    • @mskiptr
      @mskiptr 11 months ago +35

      XP source leak probably

  • @ivirius.parody
    @ivirius.parody 11 months ago +292

    Ah yes. As a developer, I love to see people finding bugs in our lazy work

    • @AlphaFruit-hx4cw
      @AlphaFruit-hx4cw 11 months ago +12

      I'm a programmer in my free time, and this is actually facts. 😂

    • @dubl33_27
      @dubl33_27 11 months ago +11

      @@AlphaFruit-hx4cw which Bush hid

    • @nt-authority-system666
      @nt-authority-system666 11 months ago +2

      you're spitting FACTS

    • @righteous-bison
      @righteous-bison 6 months ago +1

      @@nt-authority-system666 bush wasn't

  • @jhgvvetyjj6589
    @jhgvvetyjj6589 11 months ago +176

    Aside from the statistical check, another heuristic uses the newline as a way to rule out Unicode since a word-aligned newline has the bytes 0D 0A, but U+0A0D is not assigned in Unicode. Also it apparently only detects based on the first 256 bytes or so, which might make the longest string challenge futile beyond that point.
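
The newline observation can be checked directly. A minimal sketch in Python: it reads the two bytes of a CRLF as one little-endian 16-bit unit and asks Python's bundled Unicode database what that code point is. The result depends on the Unicode version shipped with your Python build.

```python
import unicodedata

crlf = b"\r\n"  # the bytes 0D 0A

# Interpret the two bytes as a single little-endian 16-bit code unit.
as_code_unit = crlf.decode("utf-16-le")

print(hex(ord(as_code_unit)))              # 0xa0d
print(unicodedata.category(as_code_unit))  # Cn (unassigned)
```

So a CRLF that lands on a 16-bit boundary decodes to an unassigned code point, which is a strong hint that the data is not really 16-bit Unicode text.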

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +89

      oh wait, but that does mean that a non-word aligned newline could technically trigger it, right? should have spent more time researching nooooo

    • @Renteks-
      @Renteks- 11 months ago +19

      just make this comment your string so you can win the meta award

  • @Fasguy
    @Fasguy 11 months ago +190

    The amount of times i've seen this spewed about as an "easter egg" is nuts.
    Just like the "Bill Gates Sucks" "easter egg" in C64 BASIC.

    • @chri-k
      @chri-k 11 months ago +24

      what’s that?

    • @brinleyhamer729
      @brinleyhamer729 11 months ago +2

      yeah what’s that?

    • @Dj_Theorema
      @Dj_Theorema 11 months ago +25

      @@chri-k To keep it simple, C64 BASIC has a random number generator that always produces the same "random" sequence of floating-point numbers, all between 0 and 1, every time you turn on the machine. Using this "feature", someone wrote a small program (4 lines of code) that prints the sentence "Bill Gates Sucks" on screen
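
The general idea, a generator that starts from the same state and therefore always yields the same "random" values, can be shown with any seeded pseudo-random generator. This sketch uses Python's `random` module, which is not the C64's generator; the seed value and the helper name are arbitrary.

```python
import random

def first_values(seed, count=5):
    """Return the first few floats in [0, 1) from a generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]

run_a = first_values(64)  # "power on" once
run_b = first_values(64)  # "power on" again with the same starting state

same_sequence = run_a == run_b  # identical starting state, identical numbers
```

Because the sequence is fully predictable, a program can be written around it so that the "random" values spell out a chosen message.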

  • @egeakan7276
    @egeakan7276 11 months ago +162

    God, when I tell you I thought Bush was a literal bush from nature and I couldn't figure it out until this day...

    • @mucookul
      @mucookul 11 months ago +4

      Same

    • @Aeduo
      @Aeduo 11 months ago +14

      Sounds like it'd be related to homer simpson creeping out of that bush gif.

    • @WindowsDrawer
      @WindowsDrawer 11 months ago +1

      Same

    • @sadpeperoni7508
      @sadpeperoni7508 11 months ago +3

      My wife's bush hid some facts. I'm catholic, so I discovered the truth only after the marriage

    • @Liggliluff
      @Liggliluff 11 months ago +12

      Yeah, for people not from USA, or not familiar with USA's presidents, the name "Bush" is likely going to refer to an actual bush instead.

  • @Blaineworld
    @Blaineworld 11 months ago +76

    now i kind of want to know how the current unicode detection works

    • @keltrm
      @keltrm 11 months ago +11

      I tried disassembling it, but it seems to have been moved to the kernel (RtlIsTextUnicode)

    • @Milennium1902
      @Milennium1902 11 months ago +41

      It uses the 2 bytes FF and FE shown at 5:24. To make the glitch happen on modern Windows you put ÿþ into the beginning of a text file, then save it, and voila! You don't even need to input any special text after it.
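
Why typing ÿþ works: in Windows-1252 those two characters are exactly the bytes FF FE, which is the UTF-16 little-endian byte order mark. A small Python check, using `cp1252` as a stand-in for the "ANSI" code page (an assumption; other ANSI code pages map these bytes differently):

```python
import codecs

bom = codecs.BOM_UTF16_LE       # b'\xff\xfe'
as_ansi = bom.decode("cp1252")  # how the two BOM bytes look as "ANSI" text

# Saving text that starts with these two characters as ANSI therefore
# produces a file that begins with a UTF-16 LE byte order mark.
saved = "\u00ff\u00fehello".encode("cp1252")
starts_with_bom = saved.startswith(bom)

print(as_ansi, starts_with_bom)
```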

    • @lydianlights
      @lydianlights 11 months ago +15

      they probably do the sane standard thing and look for 0xFFFE at the start

    • @chri-k
      @chri-k 11 months ago +4

      @@Milennium1902**Icelandic obtains ÿ**

    • @hellterminator
      @hellterminator 11 months ago +18

      @@lydianlights They did that back then, too, but that's not nearly enough.
      The _presence_ of the BOM _confirms_ a file is unicode.
      The _absence_ of the BOM _does not_ mean a file is _not_ Unicode.
      That is to say, if there is no BOM, you still have to check if it's Unicode.

  • @SuperCaitball
    @SuperCaitball 11 months ago +44

    The "oracle" not guessing correctly on newlines might be due to differences in newline coding; given it's a Python script it may be using only LF line endings, but famously Windows always uses CRLF as its line endings.
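
If this guess is right, the effect would come from the extra CR byte shifting everything after the line break by one position, which swaps which bytes fall on even and odd offsets. A sketch of that shift in Python; the sample text is made up, and whether this actually explains the oracle's behaviour is not confirmed here.

```python
typed = "Bush hid\nthe facts"  # made-up two-line sample

lf_bytes = typed.encode("ascii")                          # LF only
crlf_bytes = typed.replace("\n", "\r\n").encode("ascii")  # CRLF, as Notepad saves it

# Byte offset of the first character after the line break in each version.
lf_offset = lf_bytes.index(b"the")      # 9
crlf_offset = crlf_bytes.index(b"the")  # 10

# The offsets differ in parity, so a check that treats even and odd
# offsets differently sees the second line differently in the two files.
parity_flipped = (lf_offset % 2) != (crlf_offset % 2)
```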

  • @boalbads
    @boalbads 11 months ago +45

    "we are warned not to change it" "So let's change it" I love this channel.

  • @Kiwifruit00
    @Kiwifruit00 11 months ago +46

    4:19 for anyone wondering why fly wrote it as 0x75 0x42 when the hex editor shows 0x42 0x75: it's because the file is encoded in "little endian", which means the low byte of each two-byte unit is stored first, so the byte pair 42 75 in the hex editor is read by the computer as the value 0x7542.
    i dont know why the computer does this, nor am i an expert in these kinds of things but i just wanted to share in case someone wants to know
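
The byte swap can be reproduced with Python's `struct` module. The bytes 42 75 are the ASCII letters "Bu"; read as one little-endian 16-bit value they become 0x7542.

```python
import struct

raw = b"Bu"  # bytes 0x42 0x75, in the order the hex editor shows them

little = struct.unpack("<H", raw)[0]  # little endian: first byte is the low byte
big = struct.unpack(">H", raw)[0]     # big endian: first byte is the high byte

print(hex(little), hex(big))  # 0x7542 0x4275

# Decoding the same two bytes as UTF-16 LE gives the character U+7542.
as_utf16 = raw.decode("utf-16-le")
```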

    • @johndododoe1411
      @johndododoe1411 11 months ago +4

      Windows uses little endian because the x86 CPUs do so .

    • @leap123_
      @leap123_ 11 months ago

      because x86 (which windows and every single os that supports x86 runs on) use little endian, so fuck intel i guess

    • @RedstoneNguyen
      @RedstoneNguyen 11 months ago +3

      Write a simple program converting from integer to string and you will find out why little endian is a thing. Btw, the numbers we are writing everyday is big endian.

    • @johndododoe1411
      @johndododoe1411 11 months ago +2

      @@RedstoneNguyen Both storage directions make perfect sense . Little endian for decimal digits is how Arabs write Arab numerals, big endian is how westerners write the same numbers with the same digits . Because computers gulp up entire binary numbers in one memory clock cycle it's entirely cultural for them too . The x86 and x80 CPU families belong to little endian design cultures, the 68000 and SPARC families belong to big endian design cultures . ARM and PowerPC hardware is bilingual in this matter .

    • @RedstoneNguyen
      @RedstoneNguyen 11 months ago

      @@johndododoe1411 i didnt say anything about culture. My idea is, little endian is mathematically simpler to implement than big endian.

  • @e_g..
    @e_g.. 11 months ago +24

    "This comment has the challenge shown for the longest strings that triggers the Windows glitch from the video you recorded. The video's bug shows that it's difficult for doing accidentally. Specially for the challenge proposed. I try using workarounds and the hardest one probably is the odd character words I have to usually put for those requirements. But sometimes I don't use odd length, which usually happens because the symbols are joined with the word. I created a small story for the text: The bus was not there for a bad reason, probably. Maybe someone found the reasons but I don't know? Which increases the words variety I can use for this! The current character counter is at 690 and I think I could add non-sense but that wouldn't get interesting enough. I'm a bit close here, 200 character distance. So, I could get another story using odd character count words only. There was a man named "e_g.". The challenge was waiting for him!! And so he broke the world record! FlyTech himself saw the text, and got amazed! The comment had too many characters and broke the record! I could not believe it! The story ended and thank you for reading this."
    this whole text has 1156 characters, which beats the previous record of 1016.
    the remaining balance was -399, which means you could add 2 more characters (following the rules from the challenge) without failing.
    a proof was made in a windows xp vm, and you can see the video proof in my channel or test it by yourself.

  • @maelmauron7530
    @maelmauron7530 11 months ago +40

    The 00 padding on non-unicode characters explains the fact that in a lot of files that are not text, some strings have spaces between each character... I've had this question since 2017 XD
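
The "spaces between each character" effect can be imitated in a few lines. This sketch encodes a sample string as UTF-16 LE and then views the bytes one at a time, the way a tool that assumes a single-byte encoding would. The sample word is arbitrary, and showing NUL as a space is just one way viewers render it.

```python
wide = "Notepad".encode("utf-16-le")  # each letter is followed by a 00 byte

# View the bytes one per character, as a single-byte viewer would.
as_single_byte = wide.decode("latin-1")  # 'N\x00o\x00t\x00...'

# Many viewers draw the NUL bytes as blanks, giving the spaced-out look.
shown = as_single_byte.replace("\x00", " ")
print(repr(shown))  # 'N o t e p a d '
```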

    • @maelmauron7530
      @maelmauron7530 11 months ago +4

      By not text I mean archives, executables, etc.

    • @chri-k
      @chri-k 11 months ago +14

      but most modern applications ( one exception being Windows itself ) use UTF-8 and not UTF-16. UTF-8 is fully backwards-compatible with ASCII, so this may not be the only reason.

    • @johndododoe1411
      @johndododoe1411 11 months ago +7

      It's not padding, it's the page number . Page 00 is mostly the same as Western ANSI code, there are about 7000 other pages to keep track of, including the ones with smiley faces .

    • @Liggliluff
      @Liggliluff 11 months ago +1

      All characters are Unicode characters. Some characters are also in ANSI, but they are also in Unicode.

  • @cathacker13
    @cathacker13 11 months ago +23

    for my whole life i thought bush hid the facts was an intentional easter egg so this was a very interesting video to me personally just because of that

  • @Biaanca5036
    @Biaanca5036 11 months ago +57

    I was never able to reproduce this bug when I was a little kid..
    But well, my boxed copy of XP was Service Pack 2 so I guess that's a given :P
    But I also don't remember what OS I was using at the time either.

  • @2520WasTaken
    @2520WasTaken 11 months ago +86

    Have you heard about the Russian city: Seversk? It has a humid continental climate maintaining a low temperature and receiving 530mm precipitation every year. Through its presence, nuclear weapons have been assembled there and stored. One serious nuclear catastrophe would occur in 1993 because a container holding a dangerous and radioactive substance exploded.
    Character count: 362
    Edit: I didn't actually test this on Windows XP notepad, but I used a script, and the script gave 7640 and 2542, and 7640 is just barely greater than 3*2542
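
The two numbers in this comment look like the two sums that the statistical check compares. A minimal sketch of that idea, assuming the check adds up the absolute differences between successive bytes at even offsets and at odd offsets, and then applies the factor of 3 the comment mentions. The function names are made up for illustration, this is not the commenter's script, and the real `IsTextUnicode` runs other tests as well.

```python
def byte_variation(data):
    """Return (even_sum, odd_sum): total absolute change between successive
    bytes at even offsets, and between successive bytes at odd offsets."""
    even = data[0::2]
    odd = data[1::2]
    even_sum = sum(abs(a - b) for a, b in zip(even, even[1:]))
    odd_sum = sum(abs(a - b) for a, b in zip(odd, odd[1:]))
    return even_sum, odd_sum

def looks_like_utf16le(data, weight=3):
    """Guess 'Unicode' when the even-offset bytes vary more than `weight`
    times as much as the odd-offset bytes (assumed rule, see above)."""
    even_sum, odd_sum = byte_variation(data)
    return even_sum > weight * odd_sum

# Real UTF-16 LE English text: the high bytes are all 00, so they never vary.
genuine = "hello world".encode("utf-16-le")
print(byte_variation(genuine)[1], looks_like_utf16le(genuine))  # 0 True
```

With the comment's figures, 3 * 2542 = 7626, and 7640 clears that by only 14, which is why the text "just barely" passes.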

    • @AdachiVlogsFIN
      @AdachiVlogsFIN 11 months ago +7

      Saves And Does Not Corrupt On Windows XP Media Center Edition 2005.

    • @moregirl4585
      @moregirl4585 11 months ago

      @@AdachiVlogsFIN Confirmed on mine

    • @negativeoverseerlolphdundg2665
      @negativeoverseerlolphdundg2665 11 months ago

      whats M69 doing here

    • @Sypaka
      @Sypaka 11 months ago

      NO WAY LMAO.

    • @fr4ctalz638
      @fr4ctalz638 11 months ago +1

      tried it on my windows XP vm it worked

  • @russelllukenbill
    @russelllukenbill 11 months ago +5

    When 7/11 (I say 7/11 because if I wrote the right date my comment won't post) happened, the very next day in school someone came in to school and showed a group of us in the computer lab if you typed 9 and then 11 into Word using wingdings, it was a plane flying into two buildings. Wingdings has been changed since then.

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +4

      i'm pretty sure nothing stops you from posting 9/11 in this comment section

    • @Xnoob545
      @Xnoob545 11 months ago

      @@FlyTechVideos a different commenter said that "11:44 I can't do that as any link (sometimes even links to YouTube) cause the immediate deletion of the comment, meaning it's visible to its author until page reload, edits fail ("Unknown error") and reloading the page makes it disappear. Only the creator of a video can post links in their own comments section without needing to be worried"
      youtube likes censoring the comments

  • @selfSplintered
    @selfSplintered 11 months ago +12

    Congratulations on being an official Wikipedia source! :D

  • @hhkl3bhhksm466
    @hhkl3bhhksm466 11 months ago +70

    Your videos are always entertaining and informative. Keep it up!

  • @proCaylak
    @proCaylak 11 months ago +13

    2:05 you don't have to disappoint them. there also was another Bush who was president between '89 and '93. In fact, he's the father of the Bush we all know and hate. I have no explanations for "~Flytech" though. 😅

    • @mossadgynist
      @mossadgynist 3 months ago

      He hid the fact that he personally assassinated JFK

  • @623-x7b
    @623-x7b 4 months ago +3

    When I was a kid I used to open up exe files in a text editor. I thought programmers had to remember a lot of characters and had special keyboards.

  • @keiyakins
    @keiyakins 11 months ago +5

    the correct way to do this is, of course:
    1. is there a BOM? If so, respect it.
    2. try to decode it as UTF-8. If it worked, you're done, its UTF-8 (or us-ascii but thats a proper subset so whatever)
    3. If you get here, complain to the user and make them figure it out.
    (if you're loading a document with more metadata you may of course use that too, I'm assuming plain text)
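
The three steps above can be sketched as a small Python function. It only knows the UTF-8 and UTF-16 byte order marks (UTF-32 is left out to keep it short), and the function name and the `ValueError` are choices made for this example.

```python
import codecs

# BOMs to look for, paired with a Python codec that consumes the BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def decode_text(data):
    """Decode plain-text bytes: trust a BOM, else try UTF-8, else give up."""
    # Step 1: a BOM, if present, settles the question.
    for bom, codec in BOMS:
        if data.startswith(bom):
            return data.decode(codec)
    # Step 2: strict UTF-8 (this also accepts plain ASCII).
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Step 3: no guessing; let the caller ask the user.
        raise ValueError("unknown encoding, please specify one") from None

print(decode_text(b"Bush hid the facts"))  # Bush hid the facts
```

With this approach "Bush hid the facts" is simply valid ASCII and comes back unchanged, because no statistical guess is ever made.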

    • @paradoxmo
      @paradoxmo 4 months ago

      UTF-8 didn’t exist when IsTextUnicode() was written. The Unicode encoding in use was UTF-16 (based on the earlier UCS-2). So this is legacy code from the early days of Unicode that was never updated.

  • @aprilnya
    @aprilnya 11 months ago +60

    mans casually drops "my windows crashes when opening notepad" like HUH

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +9

      ua-cam.com/users/shortsAtu7atNw-kw

    • @zerotwoisreal
      @zerotwoisreal 11 months ago +5

      what more can you expect from windows 11

    • @UltraCenterHQ
      @UltraCenterHQ 11 months ago

      I mean, it's an insider build. You would expect bugs

  • @toydotgame
    @toydotgame 11 months ago +9

    This also explains what happens when you load Unicode-encoded files such as .lnk and .url in Notepad or Vim etc, and it displays the weird spaced out lettering as the program assumes ANSI plaintext. Cool!

  • @ChloekabanOfficial
    @ChloekabanOfficial 11 months ago +5

    The first time I found out about the "Bush hid the facts" bug, I thought the text was referring to a literal bush.

    • @OrbitalCookie
      @OrbitalCookie 11 months ago +1

      You can hide facts in a bush ... ... ..

    • @ChloekabanOfficial
      @ChloekabanOfficial 11 months ago +1

      @@OrbitalCookie Ah yes, my favourite hobby.

  • @SsvbxxYT
    @SsvbxxYT 11 months ago +6

    But what if...in Windows NT 3.5, Bush was referring to George *H.W.* Bush?

  • @DavidWonn
    @DavidWonn 11 months ago +11

    NT4 Notepad runs on NT 3.51, though it will abruptly close at times. On these older versions, you can change Notepad's global font, and in some cases you may even be able to read the erroneous Unicode characters!

  • @CattopyTheWeb
    @CattopyTheWeb 11 months ago +108

    Very interesting bug, Fly! Thanks for the video

    • @Tocinos
      @Tocinos 4 months ago

      69th like

  • @lior_haddad
    @lior_haddad 11 months ago +16

    encoder? I hardly know 'er!

  • @PanoptesDreams
    @PanoptesDreams 11 months ago +4

    All these years later, this explains why XP notepad was such a pee-pee about opening random text docs.

  • @VukAndrijanic
    @VukAndrijanic 11 months ago +6

    Nice video keep it up! Will you upload more creepypasta videos like you did before?

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +15

      Thank you! I only ever uploaded 2 of them, and no, I am not planning to continue them as I came to dislike even the 2 videos that I already made.

  • @adrianv.v.4445
    @adrianv.v.4445 11 months ago +10

    Pairing a text-generating transformer with a minimizing function for the unicode check could be funny to see

    • @Brahvim
      @Brahvim 11 months ago

      Exactly what I thought!

  • @JustPyroYT
    @JustPyroYT 11 months ago +32

    Great Video! Very detailed explanation of the bug! :D

  • @kleinesfilmroellchen
    @kleinesfilmroellchen 11 months ago +4

    Even though Windows called it "Unicode", the less confusable and more accurate name is UTF-16.

  • @MarkSir
    @MarkSir 11 months ago +4

    You didn't dismiss the conspiracy. Bush, the old one, was US president from 1989 to 1993

  • @UltimatePerfection
    @UltimatePerfection 11 months ago +4

    2:15 Bush Sr predates Clinton though as the president.

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +2

      The Bush that "hid the facts" refers to the junior one, doesn't it? (Iraq war?)

    • @UltimatePerfection
      @UltimatePerfection 11 months ago +2

      @@FlyTechVideos Bush Sr. hid two facts though: That his son is a liar, and that he was a madman that almost started the WW3.

  • @_AE_EA_
    @_AE_EA_ 11 months ago +7

    Could be wrong on this but I think windows uses CRLF line endings so you would need to put \r\n into the oracle to replicate the notepad newline

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +6

      I tried it with \r\n after the video, and the oracle says that it's censored while Notepad still doesn't break

  • @pyromancy8439
    @pyromancy8439 11 months ago +3

    I have moved to Linux completely a long time ago, and every time I stumble upon Windows, I seriously don't understand how the most popular desktop operating system can STILL have significant issues with encoding. Today I was on a video call with my coworker and had the pleasure of witnessing a modern (2021 version) app with cyrillic text display umlauts and diacritics instead of actual text on English-configured Windows. Look, mum, we've had Unicode for 32 years now!
    P.S. yes, I know Windows uses UTF-16, I refer to UTF-8, which is used practically everywhere on the web.

  • @shaunclarke94
    @shaunclarke94 11 months ago +1

    Great info on how Unicode encoding is detected. I've always wondered about this.

  • @fgregerfeaxcwfeffece
    @fgregerfeaxcwfeffece 11 months ago +3

    Microsoft is well known to preserve traditional bugs. Even the Win10 installer still could not select a partition to install.

    • @Sypaka
      @Sypaka 11 months ago +2

      please explain in detail.

  • @FlyTechVideos
    @FlyTechVideos 11 months ago +8

    flies.sh/discord

  • @LeonAlkoholik67
    @LeonAlkoholik67 11 months ago +7

    Kinda expected that it's just another encoding issue. Even nowadays Notepad still has similar encoding issues like when you write a script and it contains some special characters in it... and then you realize your script is broken. Aside from that, you should never use Notepad for scripting anyway...

  • @UdderlyEvelyn
    @UdderlyEvelyn 11 months ago

    I remember when this bug was new when I was young, cool to see why it happened years later when I'd forgotten about it.

  • @joveaaron-real
    @joveaaron-real 11 months ago +5

    If Windows is smart enough to remove the first two unicode bytes (0xFF, 0xFE), why the hell didn't they use it to detect unicode as well?

    • @FlyTechVideos
      @FlyTechVideos 11 months ago +14

      They did - 0xFF 0xFE was consistently recognized as Unicode. The problem is that they assumed that text without this prefix can be Unicode as well, and they used the presented heuristic to guess

    • @joveaaron-real
      @joveaaron-real 11 months ago

      @@FlyTechVideos sounds like somebody in Microsoft didn't read the documentation all the way 🤣

    • @jhgvvetyjj6589
      @jhgvvetyjj6589 11 months ago +7

      Microsoft detects Unicode even without the byte order mark by design since other programs and/or platforms may save text files like that.

  • @ThatRandomToast
    @ThatRandomToast 11 months ago +4

    2:58 This variant of BSOD is caused by using VMware SVGA 3D

  • @ethangibbs4955
    @ethangibbs4955 11 months ago +3

    4:16 Saying that everything in "Unicode encoding" is 2 bytes is a bit misleading. This applies only to the implementation of Unicode used in very old versions of Windows (UCS-2), and does not apply to any modern, variable-width Unicode encoding. Notepad received support for UTF-8 and UTF-16 with Windows 7.

    • @ethangibbs4955
      @ethangibbs4955 11 months ago

      en.wikipedia.org/wiki/Unicode_in_Microsoft_Windows
      en.wikipedia.org/wiki/Windows_Notepad

  • @ZipplyZane
    @ZipplyZane 11 months ago +6

    I was hoping you'd then follow with how they fixed the bug. How does Notepad in Windows 7 detect Unicode?

    • @chri-k
      @chri-k 11 months ago +4

      by checking if the file starts with 0xFFFE

  • @cs127
    @cs127 11 months ago +5

    great video. your explanation was perfect!

  • @CrushedAsian255
    @CrushedAsian255 11 months ago +4

    Before watching , let me guess is it some kind of Unicode auto detect mode bug

  • @DerivativeOfLog7
    @DerivativeOfLog7 11 months ago

    Nice video as always, but this time the same wallpaper on all versions was a bit confusing especially watching on a smaller screen

  • @jetseverschuren
    @jetseverschuren 11 months ago +2

    And that's why everybody just uses UTF-8 nowadays

    • @jhgvvetyjj6589
      @jhgvvetyjj6589 11 months ago +1

      Which is easily mixed up with all the other 8-bit encodings leading to more glitches!

    • @jetseverschuren
      @jetseverschuren 11 months ago +1

      @@jhgvvetyjj6589 for text, really only plain ASCII or UTF-8 is used, at least in sensible systems

  • @quantumelle
    @quantumelle 11 months ago +4

    How about this one?
    He erected later a great monastery in which he lived forty years and had eight hundred and eight followers--they bound him tightly and carried him between them on their shoulders
    (-9)

  • @intron9
    @intron9 11 months ago +7

    "in unicode encoding, each character is 2 bytes"
    Not exactly,but close enough explanation...

    • @Gameplayer55055
      @Gameplayer55055 11 months ago +6

      UTF-16. Then they wanted some nice 🥵emojis and even ♔chess. Then UTF-32 appeared. yet UTF-8 is variable length which saves the space but not the nervecells of c++ devs

    • @Mnnvint
      @Mnnvint 11 months ago +2

      In the old Windows World's favorite unicode encoding. Which they got stuck with, even though it was a bad idea, because they were too eager to use unicode and more sensible unicode encodings hadn't caught on yet.

    • @jhgvvetyjj6589
      @jhgvvetyjj6589 11 months ago +1

      @@Gameplayer55055 ♔ fits in 16-bit character though

    • @cl00e9ment
      @cl00e9ment 11 months ago +2

      It looks like they encoded the code points directly. They did not use UTF-8 encoding or else ASCII characters would be only one byte, and they did not use UTF-16 encoding either because UTF-16 is not padded with NULL bytes. In other words, they succeed to mess up their Unicode implementation and invent a new encoding while Unicode was supposed to unify everything. Oh the irony...

    • @jhgvvetyjj6589
      @jhgvvetyjj6589 11 months ago +1

      @@cl00e9ment All ASCII characters do have null byte in high byte when represented in 16-bit integer though. 0x20 in 8-bit becomes 0x0020 in 16-bit, which becomes 0x20 0x00 in little endian, which is the correct little endian representation of space in UTF-16.
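
This reply's claim can be verified for the whole ASCII range with Python's standard UTF-16 codecs: every ASCII character encodes to its own byte value plus a 00 byte, in an order that depends on endianness.

```python
space = " "

le = space.encode("utf-16-le")  # low byte first:  20 00
be = space.encode("utf-16-be")  # high byte first: 00 20
print(le.hex(" "), "|", be.hex(" "))  # 20 00 | 00 20

# The same holds for every ASCII character: its UTF-16 LE form is the
# ASCII byte followed by a zero byte. That is valid UTF-16, not padding.
all_zero_extended = all(
    chr(b).encode("utf-16-le") == bytes([b, 0]) for b in range(128)
)
```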

  • @IvyANguyen
    @IvyANguyen 4 місяці тому +1

    Learnt something new in your video: Mojibake! I never knew this phenomenon had a specific name. I recall seeing it a lot in 1998 to 2001 on WebTV here in the US when viewing Japanese, Chinese, & Korean web pages. (Really, any pages that didn't use the Latin alphabet.)

  • @DacroyleYT
    @DacroyleYT 11 місяців тому +2

    Dang, I thought the "Bush hid the facts" thing was an easter egg until I saw this

  • @dedr4m
    @dedr4m 11 місяців тому +3

    Hmm, that's why this never got fixed!
    Somebody reported "this app can break" and the debugging team didn't get the message.

  • @stephaniethebatter7975
    @stephaniethebatter7975 11 місяців тому +3

    Bush didn't hide the facts, but Windows certainly did.
    (not part of the challenge, just a joke)

  • @loganiushere
    @loganiushere 11 місяців тому

    Did you include the carriage return when asking the oracle about new lines?

  • @martinligabue
    @martinligabue 11 місяців тому

    love that you cite the "sources"

  • @mfaizsyahmi
    @mfaizsyahmi 11 місяців тому +3

    I'd imagine Raymond Chen would write about this embarrassing algorithm implementation in his blog soon.

    • @FlyTechVideos
      @FlyTechVideos  11 місяців тому +5

      He already did: devblogs.microsoft.com/oldnewthing/20070417-00/?p=27223

  • @kaninchengaming-inactive-6529
    @kaninchengaming-inactive-6529 11 місяців тому +4

    Least broken microsoft product:

  • @ghost_ship_supreme
    @ghost_ship_supreme 11 місяців тому +1

    This would be an interesting game mechanic for secret codes

  • @JohnPaulBuce
    @JohnPaulBuce 11 місяців тому +1

    thanks for this, my issue in notepad before was probably caused by encoding

  • @Lopoi
    @Lopoi 11 місяців тому +7

    Can videos be used as citation? I thought it had to be academic articles or something like that

    • @FlyTechVideos
      @FlyTechVideos  11 місяців тому +11

      Not sure if they _can_ , but I think I've seen some. Don't take my word for it though

    • @rizkyadiyanto7922
      @rizkyadiyanto7922 11 місяців тому +1

      even blog posts can.

    • @shepardpower
      @shepardpower 11 місяців тому

      I think so

    • @wiger_
      @wiger_ 11 місяців тому +4

      generally self-published content is not supposed to be used as a reliable source for a citation, but in this case, i guess it could be used as a showcase of a behavior mentioned in the article

    • @renerpho
      @renerpho 11 місяців тому +3

      @@FlyTechVideos I've been working on that exact same question for a different video lately (a Karl Jobst documentary).
      The short version is that the video can not be used unless it is published by a reliable source. Since UA-cam videos are self-published, they don't count. An exception can be made if the person who posts the video is considered a subject matter expert. We've discussed that for Karl Jobst, but determined he doesn't qualify. For it to work, Jobst would have to have published articles about his work in trusted sources, outside of UA-cam, and he hasn't done so.
      What that means for your video:
      Have you published journal articles about PC bugs, under your name? (Just being cited by them is not enough.) If the answer is "yes" then that's great! Please give us a link to that. With some luck, that will make you pass as a subject matter expert, and THEN we can start thinking about citing your UA-cam video.
      P.S. For Jobst's video, we could avoid citing the video in the end. Jobst presented all his sources in the video. References to reliable sources that demonstrate that his conclusions were correct. Can you share such a source for your topic? In that case, we can start working on the Wikipedia article as well.

  • @netkv
    @netkv 11 місяців тому +15

    i completely forgot some insane systems don't use utf8

    • @Mnnvint
      @Mnnvint 11 місяців тому +2

      The windows world started using unicode before utf8 was invented. The Java world too. Sometimes it pays to be slow (although I remember switching my old redhat/mandrake systems over to default to utf8 was not fun either).

    • @johndododoe1411
      @johndododoe1411 11 місяців тому +1

      Unicode was created in the late 1980s. Microsoft and Java chose the early 16-bit Unicode and then had to use UTF-16 to encode the next few thousand pages. Then someone decided that any characters not handled by UTF-16 should be banned from all other encodings.

  • @hackdesigner
    @hackdesigner 11 місяців тому +1

    Visual C++ 2008 Express Edition? I see a man of culture!

  • @mfessi
    @mfessi 11 місяців тому +1

    21x johnsparks still does the trick.
    Source: "Windows Notepad - an Old Problem Surfaces Again"

  • @Foxy_AR
    @Foxy_AR 11 місяців тому +18

    9:45 if you were to spam a lot of these blocks, could you write secret messages?

    • @FlyTechVideos
      @FlyTechVideos  11 місяців тому +38

      If you consider mojibake secret, then yes

    • @nothing-lo8lh
      @nothing-lo8lh 11 місяців тому +1

      @@FlyTechVideos Well no because it's fixed in newer versions of windows.

    • @Gameplayer55055
      @Gameplayer55055 11 місяців тому +5

      I live in Ukraine and I've seen many university pages full of them (idiot devs, no )
      After writing essays in VS Code I can see some mysterious п»ї too. It's junk from the BOM, as far as I know

    • @aurastrike
      @aurastrike 11 місяців тому +2

      @@nothing-lo8lh You can cause it by adding ÿþ to the start of a .txt file
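
      The stray characters mentioned in the two replies above are what byte order marks look like when a program reads them with a legacy single-byte code page. A minimal sketch, assuming the viewer falls back to Windows-1251 (Cyrillic) or Windows-1252 (Western European); the variable names are hypothetical:

```python
import codecs

# UTF-8 BOM (EF BB BF) misread with single-byte code pages.
utf8_bom_as_cp1251 = codecs.BOM_UTF8.decode("cp1251")
utf8_bom_as_cp1252 = codecs.BOM_UTF8.decode("cp1252")

# UTF-16 little-endian BOM (FF FE) misread as Windows-1252.
utf16_bom_as_cp1252 = codecs.BOM_UTF16_LE.decode("cp1252")

print(utf8_bom_as_cp1251, utf8_bom_as_cp1252, utf16_bom_as_cp1252)
```

      Typing ÿþ into a file saved as ANSI therefore writes the bytes FF FE, which a reader that honors byte order marks will take as a UTF-16 signature.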

  • @liniarc
    @liniarc 11 місяців тому +3

    So I ran a few tests and found the exact letters and words don't influence notepad's unicode detection algorithm too significantly. By far the biggest factor is the location of the space and punctuation symbols. If the space and punctuation symbols occur primarily at an odd index, then the unicode detection algorithm can get a large bonus towards the odd/lower bytes since the ascii distances between the punctuation symbols and lowercase letters greatly exceeds the ascii distances between any two lowercase letters. This means that if you use words with an odd number of letters, most space symbols end up on the odd indices. However, writing sentences using exclusively odd number of letters for all the words isn't easy. You can therefore sometimes use pairs of even words for a nicer structure. By using these tips, you can write quite lengthy sentences which sound almost completely natural without having to recalculate a new score every other character. PS. This whole comment would get censored by notepad

    • @ZephyrysBaum
      @ZephyrysBaum 11 місяців тому

      Wow

    • @liniarc
      @liniarc 11 місяців тому

      Character count: 1016
      3 * Higher Diff - Lower: -85
      Tested successfully on Windows 2000 pro
      With the heuristic outlined, it's fairly easy to make arbitrarily long strings, especially if you aren't overly concerned about clarity, word flow, or long term sentence structures. I wrote the comment without significant scripting/code assistance (only checking the isUnicode value every sentence or so).
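
      The score quoted above ("3 * Higher Diff - Lower") can be reproduced with a small sketch. This is a reconstruction from what the comments in this thread describe, not the actual Windows code: the function names, the even-length requirement, and the exact comparison are assumptions.

```python
def diff_sum(values: bytes) -> int:
    """Sum of absolute differences between neighbouring byte values."""
    return sum(abs(a - b) for a, b in zip(values, values[1:]))


def score(data: bytes) -> int:
    """Hypothetical score: 3 * (high-byte differences) - (low-byte differences).

    The bytes are read as little-endian 16-bit units, so the bytes at even
    offsets are the low bytes and the bytes at odd offsets are the high bytes.
    """
    low, high = data[0::2], data[1::2]
    return 3 * diff_sum(high) - diff_sum(low)


def looks_like_utf16(data: bytes) -> bool:
    """Assumed rule: even length and low-byte variation > 3x high-byte variation."""
    return len(data) % 2 == 0 and len(data) >= 4 and score(data) < 0


bush = b"Bush hid the facts"
hello = b"Hello, world"
print(score(bush), looks_like_utf16(bush))
print(score(hello), looks_like_utf16(hello))

# What the misdetected bytes turn into when read as UTF-16 LE:
mojibake = bush.decode("utf-16-le")
```

      In "Bush hid the facts" every space lands at an even byte offset, so the low-byte sequence keeps jumping between 0x20 and lowercase letters while the high-byte sequence stays among letters. That matches the observation above that the position of spaces and punctuation matters far more than the words themselves.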

    • @ZephyrysBaum
      @ZephyrysBaum 11 місяців тому

      @@liniarc I’ll try making one I think

  • @gowindows6639
    @gowindows6639 11 місяців тому +1

    Finally, a video that explains everything! tysm, FlyTech

  • @Liggliluff
    @Liggliluff 11 місяців тому +2

    (4:15) Incorrect, Unicode itself doesn't require 2 bytes per character. Unicode is just a list of characters. It depends on what encoding you use. UTF-16 is what Windows uses, which requires 2 or 4 bytes per character, and there's also UTF-32, which requires 4 bytes per character, as well as UTF-8, which uses a variable number of bytes per character.
    (5:25) FF FE doesn't just mark it as Unicode; it specifically marks it as little-endian UTF-16, which is why it's 2 bytes per character. B is 42 00 as you show, but FE FF is also UTF-16 with the byte order reversed, where B is 00 42 instead. EF BB BF is UTF-8, for example.
    (6:40) Since UTF-8 (Unicode) is variable length, why can't this be an odd length? Characters can be anywhere from 1 to 4 bytes long (the original UTF-8 design allowed up to 6).
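
    The three signatures listed in this comment can be told apart with a simple prefix check. The sketch below is only an illustration: `sniff_bom` is a hypothetical helper, not a Windows API, and it deliberately ignores UTF-32, whose little-endian mark also begins with FF FE.

```python
import codecs
from typing import Optional

# Byte order marks mentioned in the comment, paired with a codec name.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# UTF-8 length per character: ASCII, Latin-1 letter, euro sign, emoji.
utf8_lengths = [len(ch.encode("utf-8")) for ch in ("B", "\u00e9", "\u20ac", "\U0001F975")]

print(sniff_bom(b"\xff\xfeB\x00"), sniff_bom(b"\xfe\xff\x00B"), sniff_bom(b"plain"))
print(utf8_lengths)
```

    A file without any of these marks gives the reader nothing to go on, which is the situation where a guessing heuristic has to take over.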

  • @cmyk8964
    @cmyk8964 11 місяців тому +6

    But why was that “low > 3*high” heuristic chosen?

    • @FlyTechVideos
      @FlyTechVideos  11 місяців тому +15

      A unicode string with "simple" characters usually has a lot of null bytes, e.g. 42 00 43 00 44 00 ... and the heuristic is engineered to detect exactly this. As we can see, this leads to false positives
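
      The byte pattern from this reply is easy to reproduce. A minimal sketch using Python's UTF-16 codec as a stand-in (the variable names are made up for the example):

```python
# "BCD" stored as little-endian UTF-16: each letter is followed by a null byte.
encoded = "BCD".encode("utf-16-le")
hex_dump = encoded.hex(" ")

low_bytes = encoded[0::2]    # the characters themselves
high_bytes = encoded[1::2]   # all zero for plain ASCII letters

print(hex_dump)
print(list(low_bytes), list(high_bytes))
```

      Because the high bytes never change, any measure of how much they vary is zero, while the low bytes vary freely. A check that compares the two therefore fires easily on genuine UTF-16 text of this kind.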

    • @TheGodOfAllThatWas
      @TheGodOfAllThatWas 11 місяців тому +3

      ​@@FlyTechVideos Is that logic true across most languages or is this an English bias? Are there any countries where it's not true?

  • @Dontlokhere
    @Dontlokhere 11 місяців тому +3

    Bush hid the facts
    畋样凭摩琠映捡獴
    Picking up mongoose

  • @MylarDaleToloMDTTV
    @MylarDaleToloMDTTV 11 місяців тому +1

    The bug is also included in Windows Longhorn pre-reset.

  • @bonkmaykr
    @bonkmaykr 11 місяців тому

    I grew up thinking this was an intentional easter egg! I guess people just thought it was intentional because Bush has had conspiracy theories about him and XP came out around back then. Thanks for showing us how this really works, was very cool

  • @Neubulae
    @Neubulae 11 місяців тому +3

    Try "联通", iirc this single word also caused issue therefore similar "rumors", or rather, memes, arose amongst Chinese communities about how China Unicom had vendetta with Microsoft whatsoever.

  • @pandrk
    @pandrk 11 місяців тому +18

    I'm just wondering, how can the windows notepad crash windows?

    • @RandomGeometryDashStuff
      @RandomGeometryDashStuff 11 місяців тому +1

      buggy gpu driver?

    • @sebastianx708pl
      @sebastianx708pl 11 місяців тому +4

      When I used the RTM version of Win 11 on VMware after release, it crashed into a BSOD within a few seconds of using Paint, so it could be a VM problem

    • @BlueFlame_00
      @BlueFlame_00 11 місяців тому +4

      10gb+ files

    • @johndododoe1411
      @johndododoe1411 11 місяців тому +1

      Win11 notepad is very buggy, it can't even display actual text files.

    • @RandomGeometryDashStuff
      @RandomGeometryDashStuff 11 місяців тому

      @@johndododoe1411 yes, microsoft made notepad slower than notepad++
      same with cmd.exe (old one still exists as conhost.exe)

  • @Lampe2020
    @Lampe2020 11 місяців тому +1

    11:44 I can't do that, as any link (sometimes even links to UA-cam) causes the immediate deletion of the comment, meaning it's visible to its author until the page is reloaded: edits fail ("Unknown error") and reloading the page makes it disappear. Only the creator of a video can post links in their own comments section without having to worry.

  • @KnakuanaRka
    @KnakuanaRka 4 місяці тому +1

    I’m guessing as for why it detected Unicode like this, the idea is that legitimate Unicode text will mostly contain symbols from a single alphabet, which are close together and thus will have high bytes that are close, so the sum of differences in high bytes will be low. If it isn’t, the high bytes (made of every second letter) will be all over the place, making the sum of differences higher. Would be good to have an explanation like that in the video.
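
    The guess in this comment, that text from a single alphabet has high bytes that barely move, can be illustrated with a short Cyrillic word. This is a sketch of the idea only, using Python's UTF-16 codec; it says nothing about how Windows actually implements the check.

```python
# A Cyrillic word ("Привет"), written with escapes to keep the source ASCII-only.
word = "\u041f\u0440\u0438\u0432\u0435\u0442"
encoded = word.encode("utf-16-le")

low_bytes = list(encoded[0::2])    # vary from letter to letter
high_bytes = list(encoded[1::2])   # the same block number every time

print([hex(b) for b in low_bytes])
print([hex(b) for b in high_bytes])
```

    Every letter here sits in the U+04xx block, so the high byte is constant while the low byte changes with each character. Plain ASCII text read two bytes at a time has no such structure, unless the spaces happen to line up as they do in "Bush hid the facts".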

    • @FlyTechVideos
      @FlyTechVideos  4 місяці тому +1

      You're right... thanks for the suggestion

  • @the-Gammaron
    @the-Gammaron 11 місяців тому +6

    "Legally not allowed to tell"
    Let me guess, reading leaked source code?
    Idk. You can't probably reply to this anyway.
    Or maybe it was a joke.

    • @Milenakos
      @Milenakos 11 місяців тому +6

      the word "sources" is italic

    • @the-Gammaron
      @the-Gammaron 11 місяців тому +1

      @@Milenakos Ahhhhhhh ok I'm dumb

    • @maximiliano_sv
      @maximiliano_sv 11 місяців тому +1

      @@the-Gammaron I still don't understand that about the sources, can someone explain it to me? is it some joke or something?

    • @ruben_balea
      @ruben_balea 11 місяців тому +1

      @@maximiliano_sv Basically Microsoft does not allow everybody to see the Windows source code or disassemble Windows to get the closest thing to the original source code, ChatGPT explains those things better than me:
      The source code of Windows is a set of instructions and programs written by Microsoft developers, forming the foundation of the operating system. While there are several reasons why an ordinary person cannot legally read the source code of Windows, here are some of the main ones:
      - Intellectual property rights: The source code of Windows is owned by Microsoft. The company invests significant resources, time, and effort in developing and improving their operating system. As a result, they have intellectual property rights over the code and can control who has access to it. This means they cannot allow just anyone to access and read their code without authorization.
      - Protection of trade secrets: The source code of Windows contains valuable and strategic information for Microsoft. Disclosing that code to unauthorized individuals could compromise the security, stability, and competitiveness of their operating system. Protecting their trade secrets is crucial for maintaining their market position and competitive advantage.
      - Risk of vulnerabilities and hacking: If the source code of Windows were available to the general public, it could be scrutinized by malicious individuals, including hackers and malware developers. This would increase the risk of discovering vulnerabilities and creating targeted attacks on the operating system, jeopardizing the security of millions of users.
      - License and end-user agreements: By using Windows, users agree to the terms and conditions set by Microsoft in their license and end-user agreement. These documents clearly state that users do not have the right to access, modify, or distribute the source code of the operating system.
      - In summary, the source code of Windows is owned by Microsoft and is protected by intellectual property rights and trade secrets. Its widespread access is not permitted to safeguard the security, competitiveness, and legal rights of Microsoft, as well as to prevent potential risks to users of the operating system.
      Disassembling parts of Windows to figure out the source code presents legal and practical challenges. Here are the reasons why an individual cannot easily disassemble Windows to obtain the source code:
      - Legal restrictions: Reverse engineering, which involves disassembling software to understand its underlying code, is subject to legal restrictions in many jurisdictions. Companies like Microsoft have legal protections in place to prevent unauthorized access, modification, and distribution of their software. Engaging in reverse engineering without proper authorization or a specific legal exception can infringe upon intellectual property laws.
      - Technical complexity: Disassembling software like Windows is a complex process that requires expertise in low-level programming languages, assembly code, and system architecture. It involves converting machine code (binary instructions) back into human-readable form. Understanding the disassembled code and piecing it together to reveal the original source code is a challenging and time-consuming task that requires significant skill and knowledge.
      - Incomplete information: Disassembling Windows and examining the disassembled code may provide insights into specific functions or algorithms, but it does not yield the complete source code. The source code of a complex software system like Windows consists of millions of lines of code, libraries, and dependencies, which cannot be fully reconstructed through disassembly alone.
      - Trade secrets and obfuscation: Software developers often use techniques such as code obfuscation to make disassembly and reverse engineering more difficult. Obfuscation intentionally complicates the disassembled code by adding irrelevant instructions, removing meaningful variable names, or applying other transformations. These measures aim to protect trade secrets and intellectual property by making it harder for unauthorized individuals to understand and reproduce the original source code.
      - In summary, legal restrictions, technical complexity, incomplete information, and code obfuscation make it challenging and legally risky for an individual to disassemble parts of Windows and obtain the complete source code. Reverse engineering is a highly specialized field that requires expertise, and engaging in it without proper authorization can have legal consequences.

    • @safwan6363
      @safwan6363 11 місяців тому

      Thank you lol

  • @coolmannder
    @coolmannder 11 місяців тому +4

    you can't make a video showing something in windows xp without dreamscape and bandicam

    • @FlyTechVideos
      @FlyTechVideos  11 місяців тому +7

      dreamscape is content-id protected and because i am not desiring to donate all my income to 007 sound system i would rather not
      but yes

    • @CattopyTheWeb
      @CattopyTheWeb 11 місяців тому

      lol

  • @DrKoneko
    @DrKoneko 4 місяці тому +2

    Whatever you say, fed.

  • @andyhu9542
    @andyhu9542 11 місяців тому +1

    I can see where this comes from: if someone types in Chinese (or other Unicode-only) language, the characters will be very close to each other in the encoding space. Therefore, the most significant byte would be very close to each other, while the least significant byte would essentially be random.

    • @jacky7204
      @jacky7204 11 місяців тому

      While if someone types with just a few Unicode characters thrown in, the ASCII values will produce the null bytes seen at 5:11, which also won't add to the most-significant-byte-difference counter.

  • @CBFNetworksInternational
    @CBFNetworksInternational 11 місяців тому +5

    Petition to make his videos way frequent:
    👇

    • @thepikachugamer
      @thepikachugamer 11 місяців тому +8

      That will overwork him

    • @CBFNetworksInternational
      @CBFNetworksInternational 11 місяців тому

      @@thepikachugamer Why? He is making videos every 4 months! Is it too rare?

    • @Alexander-oh8ry
      @Alexander-oh8ry 11 місяців тому +4

      ​@@CBFNetworksInternational Petition for you to 1. not decide that for another person and 2. to do your math

    • @CBFNetworksInternational
      @CBFNetworksInternational 11 місяців тому

      @@Alexander-oh8ry Petition to you to 1. do not send me messages if you're a alexander (yes, it's on purpose) and 2. not to make me do things that I shouldn't do today or after 2 weeks

    • @Alexander-oh8ry
      @Alexander-oh8ry 11 місяців тому +3

      @@CBFNetworksInternational dam somebody sounds insulted because they cant force a youtuber to do more videos and is called out for it
      (and because they cant do math)

  • @--zero
    @--zero 11 місяців тому +1

    at the start of the video i had figured it somehow wrote out a unicode byte order mark with ASCII text, i am happy i was wrong about that. however, that IsTextUnicode function feels kind of painful :(

  • @wolvvve
    @wolvvve 10 місяців тому

    this is still impossible for me to comprehend but when i can, even slightly, it’s very interesting

  • @human.earthling
    @human.earthling 11 місяців тому +1

    Nice music at 1:00

  • @MMMMMMarco
    @MMMMMMarco 11 місяців тому +1

    In unicode, a character isn't 2 bytes. Unicode itself is not an encoding, just a standard. UTF-8 is unicode where each character uses 1 byte or more, UTF-16 uses 2 bytes or more and UTF-32 always uses 4 bytes for each character.

  • @Tigrou7777
    @Tigrou7777 11 місяців тому +2

    It would be interesting to find out how they fixed it in newer versions of Windows (if that encoding is still relevant).
    Was it a quick fix, or did they choose a totally different method?

    • @Sypaka
      @Sypaka 11 місяців тому +1

      doesn't work on Windows 10... but it does on notepad2.

  • @MasonSchmidgall
    @MasonSchmidgall 4 місяці тому

    I think I know why line termination stops the bug from occurring. In Windows, lines are terminated using both the carriage return '\r' character and the newline '\n' character. If the characters ever appear together without a separating byte, then Notepad can instantly deduce that the file is ANSI. If they are separated by a byte, then it is Unicode.
    Of course, this is an unconfirmed theory of mine. However, it makes a lot of sense.
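
    The theory in this comment could be expressed as the sketch below. It is purely hypothetical: `crlf_hint` is an invented helper that encodes the commenter's unconfirmed idea, and nothing here is taken from Notepad or from `IsTextUnicode`.

```python
from typing import Optional


def crlf_hint(data: bytes) -> Optional[str]:
    """Guess the encoding from how a Windows line ending is laid out.

    Hypothetical rule from the comment:
      0D 0A        -> CR and LF are adjacent, so the text is single-byte (ANSI)
      0D 00 0A 00  -> a byte separates them, so the text is UTF-16 LE
    """
    if b"\r\x00\n\x00" in data:
        return "utf-16-le"
    if b"\r\n" in data:
        return "ansi"
    return None  # no line ending, so this check cannot decide


print(crlf_hint(b"Bush hid the facts\r\n"))
print(crlf_hint("Bush hid the facts\r\n".encode("utf-16-le")))
print(crlf_hint(b"Bush hid the facts"))
```

    Under this rule a single line without a terminator offers no hint at all, which would be consistent with the bug only appearing for text that has no line break.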

  • @AAFREAK
    @AAFREAK 4 місяці тому

    The fact I just remembered that Microsoft dropped support for WordPad, this video gives me yet another reason for me to be infuriated for that. WordPad has been faithful to me for other reasons, but this is a reminder of something I could have still benefitted when migrating to different OSes. Plus, seeing that even W11 crashed while using Notepad just tells me that it was a bad decision in the first place as it furthers the incompetence.
    Plus, I'm certain I remember this old bug from the days of the drink holder prank (which they've since patched over). It's nice this still can be done. Nostalgic, at least.

  • @edivaldobetta4437
    @edivaldobetta4437 4 місяці тому +1

    why do i love the style of the windows and why have i never seen it before

  • @JohnDlugosz
    @JohnDlugosz 11 місяців тому

    I posted an analysis of the algorithm as an answer to the post "how did Windows developers come up with...".
    TLDR; it can distinguish the patterns found in multi-byte ANSI vs 16-bit Unicode for Asian encoded text, and other languages fall into subsets of those distributions.

  • @Experiment626.Stitch
    @Experiment626.Stitch 11 місяців тому +1

    I like the glitch: the "Bush hid the facts" glitch

  • @BlazeYT_
    @BlazeYT_ 11 місяців тому

    This video also partly explains why editing metadata in Huawei Music may make the metadata show up as Chinese in other players: they're using Unicode, not ANSI like most other players, for detecting metadata in music