Could The Internet Send You The WRONG Thing?

  • Published 10 Nov 2024

COMMENTS • 632

  • @juaniththomas3591
    @juaniththomas3591 1 year ago +2581

    yes my son got sent adult videos after he tried to download homework questions, the internet is scary

    • @pofjiosgjsoges
      @pofjiosgjsoges 1 year ago +348

      Happens to me all the time. So much wasted data on my mobile plan...

    • @timothytorpy4837
      @timothytorpy4837 1 year ago +240

      Sounds more like someone got caught lol

    • @qazhr
      @qazhr 1 year ago +174

      Umm dude I think we need to talk about what's really going on with your son.

    • @ordinarryalien
      @ordinarryalien 1 year ago +72

      What was it like? I mean.. uh... I'd like to know so I can protect myself from it.

    • @Seelendrache
      @Seelendrache 1 year ago +41

      ​@@qazhr depends on how old the son is 😂

  • @miigon9117
    @miigon9117 1 year ago +654

    Programmer here: Linus mentioned that the data goes through a CRYPTOGRAPHIC hash. While those certainly can be used to verify file integrity, NON-cryptographic hash algorithms are more commonly used for that purpose since they are generally faster, while the cryptographic ones are generally reserved for, well, cryptography. (Difference: protection against active malicious manipulation vs. just plain transmission damage.)
    What makes a hash function cryptographic (or not)? Basically, it's how hard it is for a bad actor to crack it (or produce a collision). MD5 was once considered a cryptographic hash and was popular among websites as a means of storing passwords, but it has since been deprecated for security use and is now regarded as suitable only for file integrity verification, after people started finding fast ways to crack MD5 hashes (that is, to produce collisions).
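A minimal Python sketch of the distinction drawn above, using only the standard library. The sample data and variable names are made up for illustration; the point is just that both kinds of function notice accidental damage, while only the cryptographic one is designed to resist deliberate collisions.

```python
import hashlib
import zlib

data = b"The quick brown fox jumps over the lazy dog"

# Non-cryptographic: a 32-bit CRC, cheap to compute, aimed at accidental damage.
crc = zlib.crc32(data)

# Cryptographic: a 256-bit SHA-256 digest, designed to resist deliberate collisions.
sha = hashlib.sha256(data).hexdigest()

# Simulate transmission damage by flipping one bit in the first byte.
damaged = bytes([data[0] ^ 0x01]) + data[1:]

print(f"CRC-32 : {crc:08x} -> {zlib.crc32(damaged):08x}")
print(f"SHA-256: {sha[:16]}... -> {hashlib.sha256(damaged).hexdigest()[:16]}...")
```

Both values change after the bit flip, which is all that integrity checking against random errors needs.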

    • @techaddictdude
      @techaddictdude 1 year ago +11

      NERD alert!

    • @JustPlayerDE
      @JustPlayerDE 1 year ago +110

      @@techaddictdude you are literally watching ltt, we all are nerds here lol

    • @I3erow
      @I3erow 1 year ago +19

      dude MD5 is outdated and SHOULD NEVER be used for security related features anymore....

    • @JustPlayerDE
      @JustPlayerDE 1 year ago +51

      @@I3erow md5 is still fine to check if the file is not corrupted on transfer tho

    • @hubertnnn
      @hubertnnn 1 year ago +6

      @@I3erow It depends on how secure something should be.
      I keep using md5 for securing image downloads.
      It is fast and simple, and no one is going to waste time cracking an md5 hash to get access (remove watermark) to a $1 picture, it will cost more to do so than just buying the picture itself.

  • @CarlosCabrera-kn1jb
    @CarlosCabrera-kn1jb 1 year ago +715

    Actually Linus, as a CS Major it’s actually a miracle that information gets from point A to point B, fucking magic man, that’s why those low level devs always have a long beard, they’re magicians.

    • @GSBarlev
      @GSBarlev 1 year ago +91

      Forget Point A to Point B--the potential that a stray cosmic ray strikes your RAM and flips a bit at just the wrong time is why I'm super psyched that DDR5 has ECC built in.

    • @JazGalaxy
      @JazGalaxy 1 year ago +31

      Just… for your own information, don’t write a message to professional technologists and say “ as a CS Major..”

    • @akaiappears
      @akaiappears 1 year ago +49

      ​@@GSBarlev And God said, let this bit be switched: and the bit was switched. And God saw the bit, and it was fucking magic to them; and God divined low level devs into existence

    • @damienchambers3302
      @damienchambers3302 1 year ago +1

      ​@Gilad Barlev it does? That's awesome. Does it have any practical applications for the average user though?

    • @CarlosCabrera-kn1jb
      @CarlosCabrera-kn1jb 1 year ago +13

      @@JazGalaxy chill, it was just a joke my friendo.

  • @demophoon
    @demophoon 1 year ago +35

    Fun fact, every credit/debit card has a checksum built into the number so computers can quickly determine if users accidentally typoed their number in wrong when paying for things online. Many other numbers which humans are expected to enter manually usually are designed with these sorts of checks in mind like insurance numbers, IMEI numbers and even those lil survey codes you find on receipts

    • @I.____.....__...__
      @I.____.....__...__ 1 year ago +1

      The keyword being "many". Many also _don't_ have them. In fact, some of the survey-codes on receipts are practically human-readable and you can edit them as you please to fill out multiple surveys. (Not that there's any point; all you're doing is wasting your time giving them market-research for free. I'm not convinced they ever give ANYONE the cash prizes they claim to. 😒)

    • @bettercalldelta
      @bettercalldelta 1 year ago +1

      the difference is the "checksum" on credit card numbers is quite primitive, google Luhn's Algorithm, it's just the numbers added, multiplied and moduloed, even still, it's good enough to catch common mistakes like wrong and swapped digits
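The Luhn check mentioned above fits in a few lines. This is a sketch of the standard algorithm in Python; the function name `luhn_valid` and the sample numbers are illustrative (79927398713 is a commonly used textbook test value, not a real card).

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return len(digits) > 0 and total % 10 == 0

print(luhn_valid("79927398713"))  # valid test number
print(luhn_valid("79927398710"))  # one wrong digit
print(luhn_valid("79927398731"))  # last two digits swapped
```

The first call passes and the other two fail, matching the claim that a single wrong digit or a swap of two neighbouring digits is usually caught.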

    • @stevensavoie856
      @stevensavoie856 1 year ago +1

      I love that the word typoed is a typo.

    • @finadoggie
      @finadoggie 1 year ago

      Interestingly, Social Security Numbers specifically do not do this

  • @danimayb
    @danimayb 1 year ago +73

    Checksums are redundant pieces of information added to data that allow the receiver to verify if the data was received correctly. A simple checksum is to use only the first seven of the 8 bits in a byte for data. The eighth bit is a sum of the first 7 bits (modulo 2) that acts as a check for the first 7, hence the name 'checksum'. In a nutshell :)
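The seven-data-bits-plus-one-check-bit scheme described above can be sketched like this in Python. The helper names `add_parity` and `check_parity` are made up for the example, and placing the parity bit in the lowest position is an arbitrary choice.

```python
def add_parity(seven_bits: int) -> int:
    """Pack 7 data bits plus one even-parity bit into a single byte."""
    if not 0 <= seven_bits < 128:
        raise ValueError("expected a 7-bit value")
    parity = bin(seven_bits).count("1") % 2  # sum of the data bits, modulo 2
    return (seven_bits << 1) | parity

def check_parity(byte: int) -> bool:
    """A byte is consistent if it contains an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0

byte = add_parity(0b1011001)
print(check_parity(byte))          # intact byte
print(check_parity(byte ^ 0b100))  # one bit flipped: detected
print(check_parity(byte ^ 0b110))  # two bits flipped: slips through
```

A single parity bit catches any odd number of flipped bits but misses any even number, which is why stronger checksums exist.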

    • @southernflatland
      @southernflatland 1 year ago +4

      But wait, there's more!
      They make 16 bit checksums too!
      Utilized in SNES - JRR Tolkien's Lord of the Rings
      Edit: I should know, I mostly decoded them. Punch in "3P5" multiple times in a row, like 8 times in a row, tell me that don't unlock all the characters...

    • @swimfan6292
      @swimfan6292 1 year ago +1

      LT really needs to do more videos like these. For himself and his viewers... Many are clueless

    • @eksmad
      @eksmad 1 year ago +2

      Checksums as fast as possible .. YOU win! Instead of Linus.

    • @Rudxain
      @Rudxain 1 year ago

      Actually 🤓, the simplest (and fastest) checksum for binary computers is `xorsum`. It's "infinitely"- parallelizable, but doesn't have "avalanche-effect"
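A rough Python sketch of an XOR checksum, to show both properties named above. The function name `xorsum` follows the comment; the sample inputs are arbitrary.

```python
from functools import reduce
from operator import xor

def xorsum(data: bytes) -> int:
    """XOR all bytes together into a single 8-bit value."""
    return reduce(xor, data, 0)

whole = b"hello world"
left, right = whole[:5], whole[5:]

# Parallelizable: chunks can be summed independently and then combined.
print(xorsum(whole) == xorsum(left) ^ xorsum(right))

# No avalanche effect: reordering the bytes leaves the result unchanged.
print(xorsum(b"abc") == xorsum(b"cba"))
```

Both comparisons hold, so the checksum is cheap and easy to split across workers, but it cannot notice bytes arriving in the wrong order.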

  • @redaceFR
    @redaceFR 1 year ago +61

    7-zip can generate the checksums of files too ! It's the CRC option in the 7-zip menu in the explorer. It avoids downloading something else to check it or typing a command.

    • @lucasrem
      @lucasrem 1 year ago

      WHY YOU NEED LINUS ?????
      ads you need ?

  • @JV-pu8kx
    @JV-pu8kx 1 year ago +112

    Data transmitted with UDP does not get resent. The UDP protocol is used for things like video streaming where an occasional dropped packet won't be missed.

    • @BrianG61UK
      @BrianG61UK 1 year ago +22

      There is certainly no standard mechanism whereby UDP always gets resent if a packet is corrupted or lost. However, software that uses UDP might be able to tell when packets need to be resent and then do so. Consider, for example, QUIC. QUIC uses UDP, and packets that are lost most definitely do get resent. In fact, consider simple DNS using UDP. Most DNS clients will resend the outgoing packet if no response is received, it might only be resent once before giving up, or get resent but to a secondary DNS server instead, but it also might get resent to the same DNS server if a second one isn't specified.

    • @MrBleach163
      @MrBleach163 1 year ago +3

      I'm not sure that in video streaming one packet won't be missed given the complexity of compression algorithms...

    • @BrianMelancon
      @BrianMelancon 1 year ago +7

      Brian Gregory is correct. It's not the case that using UDP means there are no validation checks. It's just not included in the UDP protocol, and is instead left to the application layer to handle as appropriate to the situation. Almost every application that uses UDP does in fact do some sort of data validation. For example, Wireguard uses UDP. Traffic over Wireguard is encrypted and needs to be 100% accurate.

    • @thoria
      @thoria 1 year ago +3

      UDP also doesn't strictly require checksumming. With hardware generally always able to do it (see TOE, the TCP offload Engine protocol for more), it's almost always there, but it's not on *every* packet as Linus claimed, likely because the tangent would just eat up way more time than it's worth.

    • @BrianG61UK
      @BrianG61UK 1 year ago +2

      @@MrBleach163 Yes. For something like a Skype call it depends, but you'll probably be lucky if there isn't some kind of visible glitch, but the point is, you don't want to wait for retransmission and have the time delay keep increasing to the point where you're waiting ages for the person you're calling to respond to what you say.

  • @JV-pu8kx
    @JV-pu8kx 1 year ago +54

    I've heard that many of the free cloud storage services use checksums to save space. They run the hash and compare to what is already on their servers. If two matching files are uploaded, only one copy gets stored. It does not matter if the files were uploaded to separate accounts, only one copy is actually stored.

    • @DraxTrac
      @DraxTrac 1 year ago +3

      You know, I've always wondered about that.

    • @GSBarlev
      @GSBarlev 1 year ago +4

      Pretty sure it happens with Plex / Jellyfin metadata fetchers as well, which is why occasionally you'll get results that aren't just a little bit off, but, like, wildly off.

    • @lPlanetarizado
      @lPlanetarizado 1 year ago +4

      as a bug bounty guy, that opens options to find bugs, thanks

    • @blahorgaslisk7763
      @blahorgaslisk7763 1 year ago +12

      This is called deduplication, and is a staple feature in large storage systems. However, they should not stop at running a simple hash to decide if the files are the same. As a next step they check the file length, and then if it still matches they check the actual binary data. This has to be done because the checksums of two files can be the same even with different content and even different file sizes.
      Simple checksums like CRC32 are very easy to manipulate. An anime fansub group used to make sure their releases all had a CRC32 checksum that showed the episode number. So episode 1 had the checksum 01010101, episode 2 got hashed as 02020202 and so on. This is (marginally) harder with MD5 and a lot harder with SHA256 or better. But even without malicious intent there are only so many hash values possible in, say, 256 bits, so eventually two files will have the same hash. This means that a hash value can't guarantee that the file is what you think it is. It can only guarantee that the file hash is the same. So use it to check for transmission errors and file integrity, not as proof of content not being manipulated by a third party.
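The hash-then-length-then-bytes sequence described above could look roughly like this in Python. `DedupStore` is a hypothetical in-memory toy, not how any particular storage product works; it only illustrates that a matching hash is treated as a hint and the bytes are still compared.

```python
import hashlib

class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct blob."""

    def __init__(self):
        self._by_digest = {}  # hex digest -> list of distinct blobs with that digest

    def put(self, blob: bytes):
        digest = hashlib.sha256(blob).hexdigest()
        bucket = self._by_digest.setdefault(digest, [])
        for index, existing in enumerate(bucket):
            # Same hash is only a hint: confirm length, then the actual bytes.
            if len(existing) == len(blob) and existing == blob:
                return digest, index  # duplicate, nothing new stored
        bucket.append(blob)
        return digest, len(bucket) - 1

    def stored_copies(self) -> int:
        return sum(len(bucket) for bucket in self._by_digest.values())

store = DedupStore()
store.put(b"holiday video")  # uploaded by one account
store.put(b"holiday video")  # uploaded again by another account
store.put(b"tax return")
print(store.stored_copies())  # only the distinct blobs are kept
```

Keeping a list per digest means that two different blobs that happened to share a hash would both survive instead of one silently replacing the other.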

    • @CodeAsm
      @CodeAsm 1 year ago +2

      @@lPlanetarizado "hash collision" and can be very tricky. Id probably do it on multiple levels, data chunk wise, file wise and check metadata (file size, dates, entropy)

  • @theyruinedyoutubeagain
    @theyruinedyoutubeagain 1 year ago +273

    Not all CRCs are cryptographic, actually I'm pretty sure most fast checksums are not crypto hardened. Still, great video!

    • @GSBarlev
      @GSBarlev 1 year ago +33

      I mean yeah, the simplest checksum is just a parity bit.

    • @SelecaoOfMidas
      @SelecaoOfMidas 1 year ago +8

      I mean, even CRC32 is highly susceptible to collisions (files that are different from each other but have the same hash value), and SHA-1 has had the same issue since a practical collision was published in 2017. Most entities have moved on to hash algorithms like SHA-256 up to SHA-512, depending on the importance of the data and urgency vs compute cost per file.

    • @ShadowSlayer1441
      @ShadowSlayer1441 1 year ago +1

      Why would you use anything other than sha256?

    • @lordsponge10
      @lordsponge10 1 year ago +1

      @@ShadowSlayer1441 you would use SHA3-256 which is more robust to attacks. Most programs still use SHA2-256.

    • @tr7zw
      @tr7zw 1 year ago +5

      @@ShadowSlayer1441 Speed? Simplicity?

  • @SuperFromND
    @SuperFromND 1 year ago +70

    if you have 7-zip installed (which you really should if you don't, it's amazing), it actually adds all sorts of checksum-generation options to the right-click menu in windows, its really handy

    • @talon262
      @talon262 1 year ago +10

      And you're not limited on using 7-zip on just ZIP/RAR/other compressed archive format files to pull their checksums... you can use that built-in functionality for pretty much any file.

    • @Gohan1138
      @Gohan1138 1 year ago +3

      Winrar gang here

    • @TylerTMG
      @TylerTMG 1 year ago

      i use breezip from microsoft store

    • @rf8003
      @rf8003 1 year ago

      Peazip good as well...

    • @kvbc5425
      @kvbc5425 1 year ago

      WINrar 👑

  • @ArdentMoogle
    @ArdentMoogle 1 year ago +354

    And we're slowly moving to quantum-resistant hash functions, to avoid the issue of quantum computing in the future.

    • @Dinkleberg96
      @Dinkleberg96 1 year ago +5

      That will be a problem for security

    • @matthewparker9276
      @matthewparker9276 1 year ago +32

      ​@@Dinkleberg96 no it won't. By the time a quantum computer powerful enough to work on decrypting real internet packages exists all important things will be using quantum secure algorithms. It'll be Y2K all over again.

    • @robspiess
      @robspiess 1 year ago +31

      @@matthewparker9276 Not necessarily. Important long-term data which was encrypted with "good at the time" cryptographic ciphers are being saved for future quantum computers to decrypt. Even though we can't break it now, saving RSA-4096 encrypted "Who_Shot_JFK.docx" and "Herbs_and_Spices_v11.KFC" for computers 30 years from now could cause real national security issues.

    • @triciaf61
      @triciaf61 1 year ago

      @@robspiess i cant wait for "Herbs_and_Spices_v11.KFC" to get cracked and cause the USA to fall into utter chaos due to it revealing the real herbs and spices.

    • @WolvenSpectre
      @WolvenSpectre 1 year ago +1

      @@robspiess I think he thought the first reply was saying that Quantum Resistant Algos were bad for security, and not Quantum Computing will be bad for security. That post was kinda ambiguous.

  • @thoria
    @thoria 1 year ago +35

    I'm kind of surprised there was no mention of block-level CRC for storage media, the checksum that makes it possible for RAID-scrubbing to find faults, and for disks in general to be reasonably certain they're reading back the same values that were written in the first place, something almost everyone takes for granted.

    • @HarpaxA
      @HarpaxA 1 year ago +1

      He's talking abt Internet Checksum, not RAID

    • @thoria
      @thoria 1 year ago

      ​@@HarpaxA He, and the writers, are, but a large part of the runtime is spent on offline and local-network file-validation and things like passwords. The fact that this is a design consideration that allows data-integrity issues to be found in RAID when the multi-disk abstraction might otherwise hide problems until way too late is just one application and a way to get people's attention with a topic that seems to draw some number of views (yay, algorithm).
      Block-device-level checksums seem relevant to this topic specifically because there's an emphasis on "how does data reliably get from point A to point B?" and it needs to be stored and retrieved from somewhere. There's nothing about a magnetic head or voltage-assessment that provides assurance that read-mistakes won't happen without a checksum of their own.

  • @soup5344
    @soup5344 1 year ago +8

    personally i prefer the method of looking at the files and going "Yeah that seems about right"

    • @TylerTMG
      @TylerTMG 1 year ago

      wait THIS ISNT MY 8K TOY STORY 1 VIDEO

  • @Bacender
    @Bacender 1 year ago +4

    Hashtab is one of the best checksum tools for Windows. It adds a tab in the properties dialog of a file to let you compare checksums.

  • @tr7zw
    @tr7zw 1 year ago +26

    Noteworthy that a checksum on the download page being the same as the downloaded file doesn't mean that it hasn't been tampered with. If you're man in the middle-d or the site is compromised enough, hackers could also just replace the hash shown on the site to match the modified file.

    • @thorbear
      @thorbear 1 year ago +6

      Yeah, comparing checksums for downloaded files when the checksum and file are on the same server feels like security theater, just giving a (false) sense of security without actually adding any security.
      Doesn't the practice come from (and make more sense in) the scenario where a 3rd party file hosting service (or a mirror) is used to store the actual files, while only a link and a checksum is on the website itself, so you can verify that the file you get from the 3rd party is the one intended by the owner of the website?

    • @blahorgaslisk7763
      @blahorgaslisk7763 1 year ago +3

      Very important post! A hash is never proof of what the file contains, just that the file you ran the hash algorithm on has the same hash result as what you were told to expect. So use it to verify that the file wasn't corrupted in transmission or changed in some form. But don't rely on the content being what you expect just because the hash matches what's on the site you got it from.

    • @o0Donuts0o
      @o0Donuts0o 1 year ago

      Then sign your files…

    • @I.____.....__...__
      @I.____.....__...__ 1 year ago +1

      @@o0Donuts0o File-signing certificates are expensive af. There's no LetsEncrypt for that. 😕

    • @stayfunsteven2207
      @stayfunsteven2207 1 year ago

      The first thing that came to my mind when I first heard about checksums. But against other attacks it can be helpful. So it is obviously not useless.

  • @The_Life
    @The_Life 1 year ago +16

    Whenever he says "bad actors", I can't help but think actors who just suck at their job doing shady things

    • @Duaality.
      @Duaality. 1 year ago

      If they're acting at doing their job and they still suck, then I'd argue that still makes them a bad actor

  • @SamarthCat
    @SamarthCat 1 year ago +2

    4:23 not every service uses TCP, for example, most online games and realtime apps use UDP to reduce latency because packets don't have to arrive correctly.

  • @VFPn96kQT
    @VFPn96kQT 1 year ago +2

    There is a big difference between *Cryptographic* checksum and the one used for verification that a file arrived correctly such as TCP/IP protocol

  • @delofon
    @delofon 1 year ago +2

    1:33 If those bad actors could replace the download with a malicious one on some website, it would be no hassle for them to replace the checksum as well. MITM attacks are guarded against with protocols like TLS (formerly SSL). Checksums are not used to validate the security of a file but rather to confirm it was downloaded correctly from the origin (even though TCP handles it too) so that, in the worst-case scenario, your PC doesn't break down from an incorrect OS download.
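Checking a download against a published checksum, as described above, might look like this in Python. The helper names are invented for the example, and an in-memory `io.BytesIO` stands in for the downloaded file so the sketch runs on its own; with a real file you would pass `open(path, "rb")` instead.

```python
import hashlib
import hmac
import io

def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def matches_published(stream, published_hex):
    """Compare the computed digest with the value copied from the download page."""
    return hmac.compare_digest(sha256_of_stream(stream), published_hex.strip().lower())

# Stand-in for a downloaded file and the checksum shown next to the link.
download = io.BytesIO(b"pretend this is an OS image")
published = hashlib.sha256(b"pretend this is an OS image").hexdigest()
print(matches_published(download, published))
```

A match only says the bytes are the ones the page described; as the comment notes, it says nothing about whether the page itself can be trusted.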

    • @kpcraftster6580
      @kpcraftster6580 1 year ago +2

      Yes BUT... often downloads are hosted on a different domain than the checksums. Ideally you download the file from the least suspicious mirror and get copies of the checksum from multiple other sources.

  • @Alphalaneous
    @Alphalaneous 1 year ago +3

    Note that 7zip has a checksum viewer as well, so if you have that, you can view the checksum of a file easily

  • @bladewind0verlord
    @bladewind0verlord 1 year ago +2

    literally right after he said "make sure they don't get corrupted" at 3:40, my blender simulation used up the last of my RAM and made the video start stuttering and I swear to god I just assumed that it was just a gag for the video

  • @VivekYadav-ds8oz
    @VivekYadav-ds8oz 1 year ago +17

    Checksums are only useful if you're expecting errors not malicious intervention. Anybody could just change the source, and then re-hash the source and send that as the checksum. Encryption will be necessary regardless.

    • @monkeyoperator1360
      @monkeyoperator1360 1 year ago +1

      not exactly, most websites write the checksum out, so that you run the checksum yourself the file doesn't check itself

    • @evertchin
      @evertchin 1 year ago +1

      wrong.... this is why we use strong crypto as checksum, it will take you forever to rehash the contents to match the original checksum.

    • @fishyfish2679
      @fishyfish2679 1 year ago

      Yeah no, encryption alone does not mean attacker can't change plaintext. E.g. stream ciphers are vulnerable to known plaintext attacks. What you want is an unforgeable checksum, and in the field of cryptography you have two ways for that, digital signatures (software/drivers/official email etc), and message authentication codes (generally instant messaging). It's very common data that is assigned a MAC or digital signature is also encrypted, but unless we're talking about authenticated encryption, integrity and authenticity is provided by algorithms other than the encryption.
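A small Python sketch of the message authentication code idea mentioned above, using HMAC-SHA256 from the standard library. The helper names `make_tag` and `verify` and the sample messages are illustrative, and the sketch assumes both sides already share a secret key.

```python
import hashlib
import hmac
import secrets

# Secret key known only to sender and receiver (assumed to be shared beforehand).
key = secrets.token_bytes(32)

def make_tag(message: bytes, key: bytes) -> bytes:
    """Keyed checksum: cannot be recomputed by someone who lacks the key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes, key: bytes) -> bool:
    return hmac.compare_digest(make_tag(message, key), tag)

message = b"transfer 10 coins to alice"
tag = make_tag(message, key)

print(verify(message, tag, key))                         # untouched message
print(verify(b"transfer 99 coins to mallory", tag, key)) # tampered message
```

Unlike a plain hash, an attacker who swaps the message cannot produce a matching tag without the key, which is the "unforgeable checksum" property.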

    • @Gramini
      @Gramini 1 year ago

      @@evertchin OP meant that if you can change the file on someones server, you probably can also change the displayed checksum on the web page.

  • @HyperGadgets
    @HyperGadgets 1 year ago +1

    Couple of points:
    - the cryptographic hash function outputs aren't guaranteed to be unique, but are generally designed to avoid collisions.
    - passwords aren't just hashed (or at least they shouldn't be 😅), if they were, then if the Database was leaked, the attacker would be able to tell the simple passwords. Two people with the same password would then have the same hash output. This could also mean the attacker can generate hashes from a list of common passwords and compare against the database to find people with common passwords and hack their accounts.
    To get around this, a "salt" is added. The salt is randomly generated and when combined with the password and then hashed, it will create a new output, even if two users have the same password.
    This is why you should use unique/random passwords, because if the server doesn't salt the passwords, common passwords can be found easily and then anywhere you use that same password is then potentially compromised - even if the other places do salt them.
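The salting described above can be sketched in Python with the standard library's PBKDF2. The function names, the 16-byte salt, and the iteration count are assumptions chosen for the example; a real system would follow its own framework's guidance.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=600_000):
    """Return (salt, derived_key); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived

def verify_password(password, salt, expected, iterations=600_000):
    """Re-derive with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)

# Two users pick the same password but end up with different stored values.
salt_a, stored_a = hash_password("hunter2", iterations=1000)
salt_b, stored_b = hash_password("hunter2", iterations=1000)
print(stored_a != stored_b)
print(verify_password("hunter2", salt_a, stored_a, iterations=1000))
```

The low iteration count in the usage lines is only there to keep the demo quick. The salt is stored next to the derived value; its job is to make precomputed tables and cross-user comparisons useless, not to be secret.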

  • @phil2of3
    @phil2of3 1 year ago +3

    A good hacker that changes a file for something malicious one on some server would also change the checksum file at the same time

    • @GSBarlev
      @GSBarlev 1 year ago +2

      Except good opsec is to never store your checksums (or your salt) on the same server as your sensitive data.
      Checksums in my circles also tend to be cryptographically signed via PGP.

  • @mikejetzer4155
    @mikejetzer4155 1 year ago +1

    If you want to verify a file that's copied locally (i.e., both the source and destination file are on locally-accessible filesystems), doing a file compare (e.g., the Unix/Linux "cmp" command) should be much faster than doing a checksum, and will tell you exactly where the first different byte appears.
    I'm not a Windows guy, so I don't know how easy this is to do in Windows, but if you're going to get a third-party product to perform your checksums for you, you could probably get a third-party "cmp" program.
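A cross-platform way to get roughly what "cmp" does is a short Python function. This is a sketch, not a reimplementation of the real tool: `first_difference` is a made-up name, it reports a 0-based offset (cmp counts from 1), and `io.BytesIO` objects stand in for two files opened in binary mode.

```python
import io

def first_difference(a, b, chunk_size=65536):
    """Return the 0-based offset of the first differing byte, or None if identical."""
    offset = 0
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                if x != y:
                    return offset + i
            # No mismatch in the overlapping part: one stream simply ended early.
            return offset + min(len(chunk_a), len(chunk_b))
        if not chunk_a:
            return None  # both streams ended with no difference
        offset += len(chunk_a)

source = io.BytesIO(b"hello world")
copy = io.BytesIO(b"hello_world")
print(first_difference(source, copy))
```

For a plain yes/no answer on real files, the standard library's `filecmp.cmp(path_a, path_b, shallow=False)` does a byte-by-byte comparison as well.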

  • @dshcfh
    @dshcfh 1 year ago +1

    On the "Windows Explorer doesn't compute hashes" note;
    It wouldn't take a lot of resources to do that at all. They would only have to add a checksum middleman to the file transfer stream.
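The "checksum middleman" idea above amounts to hashing each chunk as it passes through the copy loop. A rough Python sketch follows; it says nothing about how Windows Explorer is actually implemented, `copy_with_checksum` is a hypothetical helper, and in-memory streams stand in for the source and destination files.

```python
import hashlib
import io

def copy_with_checksum(src, dst, chunk_size=1 << 20):
    """Copy src to dst chunk by chunk, hashing the data on the way through."""
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)  # the "middleman": one extra pass over bytes already in memory
        dst.write(chunk)
    return digest.hexdigest()

source = io.BytesIO(b"some file contents" * 1000)
destination = io.BytesIO()
written_hash = copy_with_checksum(source, destination)

# Verification step: re-read what landed at the destination and compare.
print(written_hash == hashlib.sha256(destination.getvalue()).hexdigest())
```

Hashing during the copy is cheap; the part that costs time is reading the destination back afterwards, which is what the verification step needs.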

    • @jkahgdkjhafgsd
      @jkahgdkjhafgsd 1 year ago +1

      with the number of cores medium & high-end systems have these days it's not like performance is a concern either (just make it optional)

  • @StubbornProgrammer
    @StubbornProgrammer 1 year ago +5

    Awww I really wanted Linus to mention salting in the password segment. I know it's too much of a tangent for such a short video but it's a neat solution to an unfortunately real security problem.

    • @o0Donuts0o
      @o0Donuts0o 1 year ago

      I think this comment section has all the salt covered over CRC vs checksum.

  • @grantjoseph2730
    @grantjoseph2730 1 year ago +1

    It's worth pointing out that the reason TCP's checksums aren't for security is because anyone who could replace the file being downloaded with malware could also just change the checksum to match the malware they inserted. That's why TLS/HTTPS uses an enhanced version of checksums called digital signatures that uses special encryption tricks to prove that the checksum was calculated by the server you're downloading the file from and not an attacker.

  • @christopherchappell8881
    @christopherchappell8881 1 year ago +4

    Did discover something interesting with MS Teams. Apparently, it is possible to get corrupted files sent out over Teams between users. Colleague of mine had a known good file direct from the manufacturer. They then sent that file via teams to several other users that needed access to the file but couldn't access the direct download. 2 of those it was sent to could not use the firmware file because the device they were updating kept throwing an error saying the file could not be validated. I had them send me their copy through a program that I know does checksums and when I compared the file size just on its face it was smaller than the verified original. So while it seems teams attempts to deliver files, I can say first hand that it's not guaranteed to arrive in one piece.

    • @shadamethyst1258
      @shadamethyst1258 1 year ago +1

      I mean, it's MS Teams, I wouldn't expect it to work properly for anything

    • @blahorgaslisk7763
      @blahorgaslisk7763 1 year ago

      A quick solution is to archive the file using 7zip and add the checksum to the file name. When the receiver run the file through 7zip to unarchive it will check the checksum and even if it matches it still will throw a fit when trying to unarchive the file if the archive has been changed in any way. This should be enough to catch any unintentional tampering, such as lost or corrupted packages.

    • @o0Donuts0o
      @o0Donuts0o 1 year ago

      Sooo it couldn’t be corrupted from pc to device requiring firmware? It’s just MS Teams? Lord your diagnosis skills are terrible.

  • @TheARN44
    @TheARN44 1 year ago +72

    I’m surprised that this video didn’t mention google registering the .zip domain.

    • @Bert-og9rk
      @Bert-og9rk 1 year ago +18

      I thought it was going to bring that up considering the thumbnail.

    • @MrSevenEleven
      @MrSevenEleven 1 year ago +8

      Why would it?

    • @Zikeji
      @Zikeji 1 year ago +4

      Given the title and the thumbnail that is exactly what I thought as well. Disappointed lol.

  • @erice6755
    @erice6755 1 year ago +3

    You should mention that there is a difference between UDP and TCP in this instance. Because if we're doing something over UDP it's not gonna bother with resending it, lost is lost at that point.

  • @neilalcoseba6978
    @neilalcoseba6978 1 year ago

    This is still used if you have a slow internet and constant disconnection when doing downloads. Checksum is a way to check if your downloaded file is not corrupted.

  • @kylejohnson779
    @kylejohnson779 1 year ago

    A TQ vid on cryptography, specifically password storage and Rainbow tables would be pretty cool as a sequel to this. Would love to see more security related content

  • @CoolJosh3k
    @CoolJosh3k 1 year ago +1

    CRC32 will do for a quick check, but for security it is best to use SHA256 to ensure nothing was tampered with.

  • @randomgeocacher
    @randomgeocacher 1 year ago +1

    Checksums vs Hashes vs Keyed Hash (MAC) and signatures could have been more clearly separated / explained. Fitting it into a technique format/speed is a challenge but would add a lot of value / clarity.

  • @6Twisted
    @6Twisted 1 year ago +1

    Pretty abysmal that Windows doesn't use checksums. I've had a few known corrupted files before and who knows how many unknown corrupted files.

  • @Raistling
    @Raistling 1 year ago

    TCP/IP doesn't actually do a checksum in that way.
    now, it has been a while since I read up on it, but if memory serves, then TCP checks on a per packet basis instead.
    it also uses a kind of "session" number in order to keep track of a session of communication.
    sending info from A to B would look something like this:
    A: Sending packets 1-14
    B: received packet 14
    A: sending packets 15-34
    B: received packet 31
    A: sending packets 32-42
    B: received packet 36
    A: sending packets 37-41
    So in addition to having a session token in all this information, A tags all sent packets with a number per packet as well. B will read every packet it gets until it has either read all packets or the packet it receives is not the one numerically after the last. so if it gets 1, 2, 3, 5, then it stops and sends back that it got packet 3.
    Notice how little data B actually uses by just sending a response of the latest packet it received in a series. This makes sure that TCP is not gonna use tons of data to communicate back and forth.
    But it still does communicate back and forth in order to keep signal integrity.
    UDP/IP on the other hand is not like that.
    UDP is like pouring a bucket of water down the drain. Most of it should arrive sequentially, but some might not arrive in order. Or at all. But it doesn't matter, since the receiver isn't checking it. Video streaming is done like this in order to keep up with the massive amounts of data being sent, where TCP might lag behind. But it comes at the cost of sometimes being out of order and have a little lag spike here and there.
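The acknowledgement rule described above (reply with the last packet that arrived in order) can be modelled in a few lines of Python. This is a simplified toy that follows the comment's per-packet numbering; real TCP numbers bytes rather than packets, and the function name `cumulative_ack` is invented for the example.

```python
def cumulative_ack(last_acked, arrived):
    """Return the highest packet number received with no gaps before it."""
    have = set(arrived)
    n = last_acked
    while n + 1 in have:
        n += 1
    return n

# Packets 1, 2, 3 and 5 arrive; 4 was lost, so the receiver can only confirm up to 3.
print(cumulative_ack(0, [1, 2, 3, 5]))  # prints 3

# Once 4 is retransmitted, the already-buffered 5 is confirmed as well.
print(cumulative_ack(3, [4, 5]))        # prints 5
```

The single number in the reply is why the return traffic stays small: one acknowledgement covers every packet up to that point.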

  • @TechX1320
    @TechX1320 1 year ago +5

    And yet hash collisions exist. We use this to crack files on some games to mod them

    • @GSBarlev
      @GSBarlev 1 year ago

      Yup. Since they mentioned Steam, it's worth noting that before SteamOS added the ability to directly change the boot animations, you could still swap in your own custom -Shrek supercut- video on the Steam Deck as long as it was precisely (down to the byte) the same length as the OG ani.

    • @TechX1320
      @TechX1320 1 year ago +1

      @@GSBarlev old-school game called combat arms, you can use hash collisions to modify the game files to create exploits like wallhacks.
      Hash collisions don't mean the same byte size. Generally when you perform hash collisions, the file gets bigger

    • @GSBarlev
      @GSBarlev 1 year ago

      @@TechX1320 True. I'm conflating checksums with hashes. But we're on the subject of file verification anyway, so I think the point is fair.

  • @ramavabray
    @ramavabray 1 year ago +1

    I use TeraCopy in windows to handle all file copy and moves because I can turn on its verify option as a default and never have to worry about it again.

  • @nvmuzrowrihk
    @nvmuzrowrihk 1 year ago +32

    Looking at the thumbnail, I thought this was about the new .zip TLD...

    • @BakersTuts
      @BakersTuts 1 year ago +1

      _sigh… unzips_

    • @mahdi9064
      @mahdi9064 1 year ago +1

      can you give more context ?

    • @TheDakes
      @TheDakes 1 year ago +2

      ​@@mahdi9064 In short: Google registered .zip (and .mov) tlds for its domain service. This is bad because many programs will automatically convert zip file names into links now, even if sent by a trusted person. So bad actors could now register domains of common file names to host malware.

    • @robertlinke2666
      @robertlinke2666 1 year ago

      @@TheDakes then it's probably good google snatched them before any actual malicious actors could. sure i don't trust google, and neither should anyone, but they won't use this to send you to malicious sites

  • @rawl1
    @rawl1 Рік тому +1

    Love it when u make 5 minute videos with 1 minute ad

  • @louisloudogtrottier3310
    @louisloudogtrottier3310 Рік тому

    TY for bringing that up.

  • @I3erow
    @I3erow Рік тому +1

    1:15 that doesn't mean that a strong hash value can cover for a weak password!! ALWAYS choose strong passwords guys

    • @TylerTMG
      @TylerTMG Рік тому

      so 1qaz2wsx3edc4rfv5tgb6yhn7ujm8ik9ol0p is weak?

    • @BrianG61UK
      @BrianG61UK Рік тому

      @@TylerTMG No capital letters, no punctuation ;-)
      Also I can see exactly how you typed it so I don't even need to remember it if I want to hack you, it's already written down on every QWERTY keyboard.

    • @TylerTMG
      @TylerTMG Рік тому

      @@BrianG61UK also how do i enable 2 factor?

  • @hikariyouk
    @hikariyouk Рік тому +1

    TCP/IP, along with a lot of other things, uses CRC-32, which categorically isn't a cryptographic hash (even if it's used as one sometimes).

  • @-B.H.
    @-B.H. Рік тому

    TeraCopy as a Windows file transfer replacement has been my go-to for years for this.

  • @prawny12009
    @prawny12009 Рік тому

    One thing you didn't mention is that corrupt file downloads can be deliberately induced by your ISP because of "traffic shaping".
    The worst part is that the download would have been faster and used less data/bandwidth if they had simply allowed the download to go unimpeded instead of forcing you to try over and over.

  • @Pixelcrafter_exe
    @Pixelcrafter_exe Рік тому +1

    The outputs of a hash function are not necessarily unique, since the input may be infinite but the output is finite. It's just highly unlikely to happen.

  • @MatthewSuffidy
    @MatthewSuffidy Рік тому +1

    Since checksums are a much smaller set of data than the data itself, it is possible for certain permutations of data to produce the same checksum, but improbable. That fact and others mean that computers are not necessarily totally reliable, but may be 1 in 1 × 10^10 reliable per bit or so.

    • @InfernosReaper
      @InfernosReaper Рік тому

      Improbable, but inevitable due to the sheer amount of files and limitations of the system, which is why it's a terrible way for companies to check data on people's phones to send to law enforcement agencies

  • @notenoughmonkeys
    @notenoughmonkeys Рік тому

    To address confusion: checksums/CRCs/hashes etc. are often used interchangeably, but they all have the same basic goal. Can you, with reasonable confidence, know that the file/data you have is the one you actually wanted?
    Simple checksums use very lightweight, insecure algorithms, but their only purpose is detecting simple corruption. There's no security component, meaning it's relatively trivial for a malicious actor to modify the file and tweak it such that it still has a valid, if not identical, checksum.
    When you bring in cryptography, the intent is to prevent that attack vector: whilst you can modify a file, doing it in such a way that leaves the file's hash unchanged is non-trivial.
    Any/all methods of hashing will suffer collisions by the very nature of containing less data than the thing they describe. I.e. you can't uniquely describe a 1 GB file using only 256 bytes of data; if that were true we'd all just download the file hash and magically reconstruct the original file from that.
    The essence of the more secure methods is to make collisions a function of chance, not intent.
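
A rough Python sketch of the difference described above. The byte-sum `simple_checksum` is a toy invented for illustration (not any real protocol's checksum), and the two messages are made up: swapping two bytes leaves the toy checksum unchanged while the SHA-256 digest changes.

```python
import hashlib

def simple_checksum(data: bytes) -> int:
    """Toy checksum (hypothetical): sum of all bytes modulo 256."""
    return sum(data) % 256

original = b"pay alice 10"
tampered = b"pay alice 01"  # same bytes in a different order, so the byte sum is identical

# The toy checksum cannot tell the two messages apart...
same_checksum = simple_checksum(original) == simple_checksum(tampered)

# ...while a cryptographic hash yields different digests for them.
same_sha256 = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()

print("toy checksum matches:", same_checksum)
print("SHA-256 matches:", same_sha256)
```

SHA-256 still has collisions in principle (its output is a fixed 32 bytes), but nobody is known to be able to produce one on purpose, which is the "chance, not intent" point.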

  • @sadravin1
    @sadravin1 Рік тому

    Video suggestion: how to clean and maintain a Linux OS. Example: in Windows you can delete temp files and stuff. How do we do that on Linux? When I use the command prompt to install apps and frameworks, how do I know how to remove the bloat and leftover files after install? What are the common practices for keeping it clean?

  • @electricz3045
    @electricz3045 Рік тому +2

    1:00 that's wrong, passwords aren't stored as hashes, they're encrypted. Hashing and encryption are not the same. A hash is one-way, so it can't be reversed (thus hashing passwords in the DB won't let users log in anymore, as it can't verify the password's correctness), while with encryption like AES or MD5 user authorization will work.

    • @CarlosCabrera-kn1jb
      @CarlosCabrera-kn1jb Рік тому +2

      Most DBs will store the password's hash. The encryption part goes from frontend to backend; the backend then transforms the password into its hashed form and stores/compares it against the hash saved in the DB. Nothing wrong with that.

    • @GSBarlev
      @GSBarlev Рік тому

      ​@@CarlosCabrera-kn1jb Yup. Technically it's possible that two passwords will share the same hash, but the likelihood (assuming good encryption) is far less than the odds that the key to your dad's 1998 Ford Taurus could also have started someone else's car (look it up)

    • @tercmd
      @tercmd Рік тому

      1. AES is _encryption_ and MD5 is _hashing._
      2. The same password will produce the same hash so the hashes can just be compared.

  • @chuckthetekkie
    @chuckthetekkie Рік тому

    And yet we still get corrupted downloads sometimes and have to manually download the file again.

  • @stefanos6505
    @stefanos6505 Рік тому

    Sometimes at low level shit goes wrong, but TCP also contains an ACK signal: if something does not arrive it will resend it.

  • @finkelmana
    @finkelmana Рік тому +1

    You better hope your password is not stored as a plain hash, as rainbow tables solve that problem. Salted hashes... well, that's different.
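
A minimal sketch of salted password hashing with Python's standard library, to make the salt point concrete. The function names, the example passwords, and the iteration count are assumptions for illustration, not a recommendation for production settings.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor only (assumption)

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

# The same password hashed twice gets two different salts, hence two different digests,
# so a precomputed table of unsalted hashes does not apply.
salt_a, stored_a = hash_password("hunter2")
salt_b, stored_b = hash_password("hunter2")
print(stored_a != stored_b)
print(verify_password("hunter2", salt_a, stored_a))
```

Login still works because the salt is stored next to the digest: the server re-hashes the submitted password with that salt and compares the results, never needing to reverse anything.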

  • @SongStudios
    @SongStudios Рік тому +1

    Yeah, I've been getting these weird "ads" or "sponsored segments" for every video I watch.

  • @riiiiiiiiiiiiiiiiip
    @riiiiiiiiiiiiiiiiip Рік тому +1

    Gosh darn it, Colton..

  • @ChitChat
    @ChitChat Рік тому

    Hashing is one way encryption and used to digitally sign files. Like for root servers handing out certificates for intermediates.

  • @shgysk8zer0
    @shgysk8zer0 Рік тому +1

    PGP / cryptographic signatures are even better still. Every computer and device should come with PGP/GPG... so useful! Even works for signing email. Anyone can generate a hash, and using HMAC requires sharing the password/key (which makes it easy to fake authenticity). Public key crypto is the only real solution.
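
A small sketch of the HMAC limitation mentioned above, using Python's standard `hmac` module. The key and messages are made-up placeholders. Because verifier and signer hold the same key, whoever can verify a tag can also produce a valid tag for different content.

```python
import hashlib
import hmac

shared_key = b"hypothetical-shared-secret"  # placeholder key, known to both parties

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

genuine = b"contents of the real download"
genuine_tag = make_tag(shared_key, genuine)

# Anyone holding the shared key can mint a valid tag for other content.
forged = b"contents of a tampered download"
forged_tag = make_tag(shared_key, forged)

print(verify_tag(shared_key, genuine, genuine_tag))  # genuine pair verifies
print(verify_tag(shared_key, forged, forged_tag))    # forged pair verifies too
print(verify_tag(shared_key, forged, genuine_tag))   # mismatched pair is rejected
```

Public-key signatures avoid this by splitting the roles: the private key signs, and the public key can only verify. Python's standard library has no signature scheme, so that half is not shown here.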

  • @Aeturnalis
    @Aeturnalis Рік тому +1

    3:13 skip ad

  • @OutlawJackC
    @OutlawJackC Рік тому

    I remember Tom Scott going on about websites that put their checksums on there,
    and he said if attackers are able to change the file being sent, it wouldn't be too difficult to change the hash on the website to match the hackers' file

  • @Deadi12
    @Deadi12 Рік тому +2

    Thought this was going to be a video on the osi model. This is just as good.

  • @Schalari
    @Schalari Рік тому

    This is the MOST informative Video I´ve watched so far. Thanks!!

  • @grayfox8547
    @grayfox8547 Рік тому +2

    Linus has been cooking in that sun

  • @MrSuspicious0
    @MrSuspicious0 Рік тому

    Hashtab is another great checksum utility for Windows. It adds a hash tab to the properties of any file, showing its hashes in many common hashing functions; you can paste in your hash and it will verify whether it's correct.

  • @dobelini303
    @dobelini303 Рік тому

    You should do a video on the Border Gateway Protocol (BGP). One of the most fundamental and cool pieces of internet infrastructure that even most software engineers have no idea about!

  • @semmu93
    @semmu93 Рік тому

    i expected you to talk about error correction codes and how they are used in transmit, would love to see a video of it from you!

  • @amikadm
    @amikadm Рік тому +2

    interesting question : could you "reverse" the SHA to get the file back from it ?

    • @Gramini
      @Gramini Рік тому

      Absolutely not. Those fancy hashing functions are lossy, so you lose details. SHA-1 is 160 bits / 20 bytes, SHA-256 is 256 bits / 32 bytes. If I give you the hash of my 3 MB file, well, you cannot restore it. That's also why those hashes are used for storing passwords, as they cannot be reversed.
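
The size argument above can be checked directly with Python's `hashlib`. In this sketch the two inputs are arbitrary examples; the digest length stays the same whether the input is 2 bytes or about 3 MB.

```python
import hashlib

small_input = b"hi"
large_input = b"\x00" * 3_000_000  # roughly 3 MB of zero bytes, an arbitrary example

# Digest length depends only on the algorithm, never on the input size.
sha1_sizes = (len(hashlib.sha1(small_input).digest()),
              len(hashlib.sha1(large_input).digest()))
sha256_sizes = (len(hashlib.sha256(small_input).digest()),
                len(hashlib.sha256(large_input).digest()))

print("SHA-1 digest bytes:", sha1_sizes)
print("SHA-256 digest bytes:", sha256_sizes)
```

With only 20 or 32 bytes of output, there is simply not enough information left to rebuild a multi-megabyte file.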

  • @shanent5793
    @shanent5793 Рік тому

    The cryptographic hash of a small file takes more than a trivial amount of time, and that time is constant for files smaller than the hash block size. Large files can take advantage of pipelining and amortize any required context switches. Hashing a gigabyte of data in one file will take much less time compared to hashing the same data divided into 2²⁸ four-byte long files.

  • @kethernet
    @kethernet Рік тому

    Worth noting that TCP and UDP use a small non-cryptographic checksum. It's only 16-bits, not nearly as long as the one the animation showed. That means random collisions are far more likely (but still pretty rare), where random bitflips could pass the check, and since the checksum is part of the packet itself, it doesn't provide meaningful security from intentional changes by a "man in the middle". HTTPS provides end-to-end security that prevents that, but basic TCP and UDP don't.
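
For reference, a sketch of the 16-bit ones' complement checksum of the kind TCP and UDP use, following the algorithm described in RFC 1071. The real protocols also cover a pseudo-header and the rest of the segment, which is left out here; the second pair of byte strings is made up to show a collision.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

# Worked example bytes from RFC 1071: the folded sum is 0xddf2, so the checksum is 0x220d.
rfc_example = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(rfc_example)))

# Reordering whole 16-bit words changes the data but not the checksum.
payload = b"\x12\x34\xab\xcd"
reordered = b"\xab\xcd\x12\x34"
print(internet_checksum(payload) == internet_checksum(reordered))
```

Because the sum is commutative, swapped words slip through, which is one reason this check only guards against accidental damage and not deliberate changes.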

  • @der_rechtsamwald
    @der_rechtsamwald Рік тому

    Hashtab inserts an extra "hash" tab into the file's properties, where you can also compare hashes

  • @tharsis
    @tharsis Рік тому +1

    Going by the thumbnail, here I thought this video was an incredibly speedy response to .zip top-level domains now being a thing, making phishing and tricking people into downloading malicious data stupidly easy.

  • @anderstroberg3704
    @anderstroberg3704 Рік тому

    Getting a hash for a large file while you are copying it only takes a trivial amount of time. You already have the file in memory; it's just a few extra instructions per byte. What takes time is getting a hash for a file you aren't reading anyway, as file operations are where the time is spent.
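
A minimal sketch of hashing during the copy itself, assuming SHA-256 and a 1 MiB chunk size (both arbitrary choices). `copy_with_hash` is a hypothetical helper, and the demo writes random data to a temporary directory.

```python
import hashlib
import os
import tempfile

def copy_with_hash(src_path: str, dst_path: str, chunk_size: int = 1024 * 1024) -> str:
    """Copy src to dst in chunks, hashing each chunk as it passes through memory."""
    hasher = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)  # the bytes are already in memory for the write
            dst.write(chunk)
    return hasher.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "source.bin")
    dst = os.path.join(tmp, "copy.bin")
    payload = os.urandom(3 * 1024 * 1024 + 17)  # random test data, not a multiple of the chunk size
    with open(src, "wb") as f:
        f.write(payload)

    copy_digest = copy_with_hash(src, dst)

    # Verification pass: re-read the destination and hash it again.
    with open(dst, "rb") as f:
        reread_digest = hashlib.sha256(f.read()).hexdigest()

print(copy_digest == reread_digest)
```

The hash computed during the copy costs no extra reads; the verification pass is the part that requires reading the destination back, which matches the point about file operations being where the time goes.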

  • @lpprogrammingllc
    @lpprogrammingllc Рік тому

    Checksums are also how automatically de-duplicating filesystems for incremental backups work. Each file is stored not by its name, but by its content hash. The file metadata then just records the hash of the content, and any other file with the same content will point to the same physical extent. Tahoe-LAFS leverages this for distributed files between friends, and Freenet uses a similar process to shard and distribute files pseudo-anonymously across the entire Freenet network.

    • @flameshana9
      @flameshana9 Рік тому

      Why do backup programs still make duplicates then? I've tried so many and they all do a painfully bad job at it.

    • @BrianG61UK
      @BrianG61UK Рік тому

      Best to use a long cryptographic hash for de-duplicating. A long CRC could work too. A simple sum is not really suitable for this, too much chance of collisions.

    • @BrianG61UK
      @BrianG61UK Рік тому

      @@flameshana9 In my experience it mainly gets used to avoid re-backing up a file (or block of data) that hasn't changed since last time it was backed up. Not to avoid backing up a file that is a copy of another file also on the source media.
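
A toy illustration of the content-hash idea from this thread: an in-memory store keyed by SHA-256 digest. `ContentStore` and the file names are invented for the example; this is not how Tahoe-LAFS, Freenet, or any particular backup tool is implemented.

```python
import hashlib

class ContentStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.blobs = {}  # hex digest -> content bytes
        self.names = {}  # file name -> hex digest (the "metadata")

    def put(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # store the bytes only if unseen
        self.names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        return self.blobs[self.names[name]]

store = ContentStore()
store.put("report.txt", b"same content")
store.put("report-copy.txt", b"same content")  # duplicate content, new name
store.put("notes.txt", b"different content")

print(len(store.names), "names,", len(store.blobs), "stored blobs")
```

A long cryptographic hash is used here for the reason given above: the store treats equal digests as equal content, so accidental collisions need to be vanishingly unlikely.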

  • @colt5189
    @colt5189 Рік тому

    I had to download a program that would verify a copy/paste, as Windows doesn't do it for some reason. Sometimes a copy/paste gets corrupted, and you don't know until you try to open the file later. And then that could be really bad if you don't have a backup.

  • @AlFasGD
    @AlFasGD Рік тому

    My computer literally crashed at 4:47 and rebooted on its own, a very creepy coincidence

  • @Tomcat2_kanal
    @Tomcat2_kanal Рік тому

    It is not only at the TCP/IP layer; even on the link layer, Ethernet frames carry a CRC of the frame...
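
A quick sketch of a CRC-32 check using Python's `zlib.crc32`, which uses the same generator polynomial as the Ethernet frame check sequence (the on-wire bit ordering and framing details are not modelled here). The input is the conventional `"123456789"` test string, and the single flipped bit is an invented corruption.

```python
import zlib

frame = b"123456789"  # conventional CRC test input, standing in for frame contents
crc = zlib.crc32(frame)
print(hex(crc))  # 0xcbf43926, the standard CRC-32 check value for this input

# Flip one bit, as line noise might, and recompute.
corrupted = bytearray(frame)
corrupted[0] ^= 0x01
crc_corrupted = zlib.crc32(bytes(corrupted))

print(crc_corrupted != crc)  # a single-bit error always changes the CRC
```

A receiver that recomputes the CRC and sees a mismatch simply drops the frame, leaving retransmission to higher layers such as TCP.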

  • @spayde960
    @spayde960 Рік тому

    The movie in the intro is called 1917, it's one of my favorite movies of all time

  • @bou222
    @bou222 Рік тому

    More of this!!!! plz and thank you

  • @cbremer83
    @cbremer83 Рік тому

    This is also how ZFS verifies file integrity on ZFS RAID arrays.

  • @austinbentley6234
    @austinbentley6234 Рік тому +2

    Don't wanna be that guy, but here are a few exceptions to the answer to the question in the title: hash collisions (weak algorithms), web cache poisoning, and request smuggling.

  • @KingLundh
    @KingLundh Рік тому

    Always nice to learn something useful from time to time.

  • @Progaros
    @Progaros Рік тому

    if you have 7-zip installed, you can right click a file and get "all" hash-sums

  • @98SE
    @98SE Рік тому +1

    I remember when the channel was called "Fast as Possible", god that was quite a long time ago now and I've been watching LMG since 2012!...

  • @PBandECHO
    @PBandECHO Рік тому

    I swear the steam validity verification takes longer than an actual new install.

  • @sorak185
    @sorak185 Рік тому +1

    The thumbnail made me think this was going to be about the .zip TLD issue currently going on...

  • @L9MN4sTCUk
    @L9MN4sTCUk Рік тому

    Resilient File System (ReFS) which comes with Workstation editions of Windows does this checking.

  • @CasterbalTV
    @CasterbalTV Рік тому

    *Teracopy* - It checks checksum after files transfer.

  • @amd2800barton
    @amd2800barton Рік тому

    Why have I never thought to use checksums when copying files on my own computer and network? I usually just resorted to verifying that the total byte count was identical.

  • @MrRom92DAW
    @MrRom92DAW Рік тому

    Checksums are important. People (or really just Apple sycophants) like to say ALAC is just as good as FLAC and it’s totally not a problem at all that Apple doesn’t let you use FLAC, because you can just convert between the two and lossless is lossless! Except that’s not the case at all because FLAC natively stores a checksum for every track and ALAC is entirely a-lacking in this regard.

  • @revcrussell
    @revcrussell Рік тому

    TCP/IP _needs_ the checksum because data collision is a real problem.

  • @tomrous
    @tomrous Рік тому

    When someone changes a file on a website, it's not a problem for them to change the checksum too.

  • @mort_brain
    @mort_brain Рік тому

    A quite tingling theme regarding your recent story =)

  • @carlanderson5068
    @carlanderson5068 Рік тому

    One point you missed: hashes aren't unique as you stated. That's why we can have collisions. After all, no finite set of values can be uniquely mapped to an infinite set of inputs. Intentional collisions are difficult to find currently due to provably hard math, but accidental collisions aren't what these are trying to protect against. Adding in details like the original file size can help reduce collisions even more, but doesn't completely eliminate the possibility (speaking mathematically).

  • @richardclark7679
    @richardclark7679 Рік тому

    Totally agree checksums should be ubiquitous. I can't count the number of times I've been bitten by this.

  • @Kelble
    @Kelble Рік тому

    Do a video on CRCs next! Kinda like a checksum

  • @anthonymorris8891
    @anthonymorris8891 Рік тому

    There was a forum I used to use that got confused and applied a scantily clad woman with the text SEND NUDES as my profile picture. I named my PFP the same as that one so the server was like, yeah these are the same.

  • @vladislavkaras491
    @vladislavkaras491 Рік тому

    Thanks for the video!

  • @Cxrruptwd
    @Cxrruptwd Рік тому +1

    421st comment - file hashes are used as checksums; checksums check the file for any corruption

  • @FlyboyHelosim
    @FlyboyHelosim Рік тому

    So how does that explain uploaded or downloaded files getting corrupted? This was especially an issue with a slow or unstable internet connection.

  • @00001Htheprogrammer
    @00001Htheprogrammer Рік тому

    But TCP already ensures a packet drop/corruption will raise an error, right?
    That means manually checking the full file isn't necessary?
    And if hackers want to tamper with the file, they can also easily change the checksum to the one calculated from the malicious file.