You make really interesting videos! Especially because they cover not very popular, but really important topics about Linux.
I was going to sleep, then this video popped up. Great explanation of various topics, you deserve more subs
very clear explanation you are a prodigious talent my friend
Thanks!
Minor stuff:
Tar files store each file's directory entry in a 512-byte header block, aligned to 512 bytes, ahead of the file's data. That's basically one sector on old disks. File data is then padded with zeros up to the next 512-byte boundary.
Zip files also put a directory entry (the local file header) in front of each file, with far fewer attributes (but more than gzip, I believe), so the archive can be written as a stream. Each directory entry is then repeated in the central directory at the end of the file, and the offset to the central directory is stored at the very end of the file.
IMHO the choices are:
zip for maximum compatibility,
tar.gz for compatibility within the Linux bubble and for archiving files including user rights (which is pointless except for backups),
7z for maximum compression, if it's worth the time, with relatively good compatibility within the IT bubble,
lz4 for maximum speed on large sparse data for internal use,
zstd as a trade-off between good speed and compression for general-purpose use (it beats ZIP's DEFLATE almost every time in both (de)compression speed and ratio), but only for internal use, as it's not widespread and it's not an archive format, so it needs a container such as 7z, and 7-Zip itself has to be patched to support it
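For anyone who wants to see the tar layout described above, here is a minimal Python sketch. It builds a tiny uncompressed archive in memory with the standard `tarfile` module and then looks at the raw bytes. The file name `hello.txt` and the payload are made up for illustration; the field offsets (name at bytes 0-99, octal size at bytes 124-135) are the ones from the ustar header layout.

```python
import io
import tarfile

BLOCK = 512  # size of a tar header block and the padding unit for file data


def build_tar(payload: bytes, name: str = "hello.txt") -> bytes:
    """Build an uncompressed tar archive in memory holding one file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


payload = b"hello tar\n"  # hypothetical file content
raw = build_tar(payload)

# The first 512 bytes are the header ("directory entry") for the file.
header = raw[:BLOCK]
name_field = header[:100].rstrip(b"\0")
# The size is stored as an octal ASCII string.
size_field = int(header[124:136].rstrip(b"\0 ").decode("ascii"), 8)

# The file data starts on the next 512-byte boundary and is zero padded.
data_block = raw[BLOCK:2 * BLOCK]
padding = data_block[size_field:]

print("name:", name_field)
print("size:", size_field)
print("data:", data_block[:size_field])
print("padding is all zeros:", padding == b"\0" * (BLOCK - size_field))
print("archive length is a multiple of 512:", len(raw) % BLOCK == 0)
```

A 10-byte file therefore costs at least 1024 bytes in the archive (one header block plus one data block), which is one reason tar is normally paired with a compressor.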
5:50 technically zip files store their metadata at the end of the archive, in the central directory, which is located via the End of Central Directory (EOCD) record.
That makes it easier to add new files, as you just need to append them and rewrite the metadata at the end of the file.
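The append behaviour is easy to check with a rough Python sketch. It assumes a plain, non-ZIP64 archive with no archive comment, and `read_eocd` is a helper written for this example, not a library function. It finds the EOCD record by searching backwards for its signature, then shows that appending a file leaves the existing entries untouched and only moves the central directory.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"
# signature, disk number, disk with central directory, entries on this disk,
# total entries, central directory size, central directory offset, comment length
EOCD_FMT = "<4sHHHHIIH"


def read_eocd(raw: bytes):
    """Return (total_entries, cd_size, cd_offset) from the EOCD record.

    Simplification: takes the last occurrence of the signature and does not
    handle ZIP64 or a signature appearing inside the archive comment.
    """
    pos = raw.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("no End of Central Directory record found")
    fields = struct.unpack_from(EOCD_FMT, raw, pos)
    _sig, _disk, _cd_disk, _n_here, n_total, cd_size, cd_offset, _clen = fields
    return n_total, cd_size, cd_offset


# Create an archive with one file (names and contents are made up).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "first file\n")
before = buf.getvalue()
n_before, _, off_before = read_eocd(before)

# Append a second file to the same archive.
with zipfile.ZipFile(buf, "a") as zf:
    zf.writestr("b.txt", "second file\n")
after = buf.getvalue()
n_after, _, off_after = read_eocd(after)

print("entries:", n_before, "->", n_after)
print("central directory offset:", off_before, "->", off_after)
print("old file data untouched:", after[:off_before] == before[:off_before])
```

The new entry is written where the old central directory used to start, and a fresh central directory plus EOCD record is written after it.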
This was very well explained. Thank you.
Awesome video!
Thanks for a very useful video!
Thank you for this
Good Video!
Great video, thanks!
Nice ❤
Accidentally I got to watch this video. I need a tutorial to convert a WhatsApp account backup in tar.gz to txt. Can you give me a tutorial?
wonderful explanation, ty😎
Thanks!
new subscriber here, love your content, if only you put that mic down
Would it be possible to convert all files you want to compress to plain text files prior to compression? If the DEFLATE alg works better on text files, that would seem like a good idea, no? Is it more efficient to convert an mp4 to text, then compress, than just compressing the mp4 directly? 🤔 So many questions! 😂 Thanks for the explanations Tony. Keep up the great work! 😁👍🏻
bruh what
mp4 is mp4, you can't "translate" it to txt, whatever that even means.
when he says txt compresses better, he's talking about text made of the 26 letters of the alphabet. the algorithm was designed to compress language well, and not really random stuff, because y'know, we use languages lmao
@NielsGx Yes, you're absolutely right. We use languages. Like Machine Code, Binary Coded Decimal, Binary, Assembly Code, Hexadecimal. The list goes on and on and on. You can represent an mp4 video (or any other file type) in whatever type of encoding you want. Then we transmit that data using modulation schemes like BPSK and QPSK, using beams of light to shoot the data down massive undersea cables from continent to continent. Literally anything is possible. Even the words you're reading from this comment right now have been transmitted as strings of 1s and 0s to explain this to you. But of course, what do I know? I've only been studying Electronic Engineering and Computer Science since before you lost all your milk teeth.
It's possible but there's no advantage. DEFLATE compresses text better than binary because natural text typically has less entropy.
When you convert binary to text (using hex, base64, base91 etc) you cannot magically remove that entropy, so you get seemingly random text that's bigger than the original data
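This is quick to try in Python. The sketch below uses `zlib` (DEFLATE with a small zlib wrapper) and random bytes as a stand-in for already-compressed data such as an mp4; that stand-in and the sample English sentence are assumptions made for the demo, not measurements on a real video.

```python
import base64
import os
import zlib


def deflate_size(data: bytes) -> int:
    """Size in bytes after zlib/DEFLATE compression at the highest level."""
    return len(zlib.compress(data, 9))


# Random bytes behave like well-compressed media: close to maximum entropy.
binary = os.urandom(100_000)

# Two common binary-to-text encodings of the same data.
as_base64 = base64.b64encode(binary)
as_hex = binary.hex().encode("ascii")

# Repetitive natural-language text for comparison.
english = b"the quick brown fox jumps over the lazy dog " * 2000

for label, data in [
    ("binary", binary),
    ("base64 text", as_base64),
    ("hex text", as_hex),
    ("english text", english),
]:
    print(f"{label:13} raw={len(data):7}  deflated={deflate_size(data):7}")
```

The text encodings inflate the input (base64 by about a third, hex by a factor of two), and DEFLATE can at best squeeze them back down to roughly the size of the original binary, never below it in any meaningful way.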
Awesome video, keep making more!
Woo!
Thank you for making this.
Thanks for the kind words!
great explanation.
Thanks!
tar tvf setuptools-58.0.2.tar.lz
find the file you want
tar xvf setuptools-58.0.2.tar.lz setuptools-58.0.2/tools/finalize.py
no need to extract it all :)
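The same list-then-extract-one-file idea can be sketched in Python with the standard `tarfile` module. Note the assumptions: the standard library has no lzip (`.tar.lz`) support, so this demo builds its own small `.tar.gz` in a temporary directory, and the `demo-1.0/...` paths and the helper names are invented for illustration.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def make_demo_archive(path: Path) -> None:
    """Write a small gzip-compressed tar with two made-up members."""
    files = {
        "demo-1.0/README.txt": b"readme\n",
        "demo-1.0/tools/finalize.py": b"print('done')\n",
    }
    with tarfile.open(path, "w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def list_members(path: Path) -> list[str]:
    """Rough equivalent of `tar tf`: list member names."""
    with tarfile.open(path, "r:*") as tar:
        return tar.getnames()


def read_one(path: Path, member: str) -> bytes:
    """Read a single member without unpacking the rest to disk."""
    with tarfile.open(path, "r:*") as tar:
        handle = tar.extractfile(member)
        if handle is None:
            raise ValueError(f"{member} is not a regular file")
        return handle.read()


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "demo-1.0.tar.gz"
    make_demo_archive(archive)
    names = list_members(archive)
    content = read_one(archive, "demo-1.0/tools/finalize.py")

print(names)
print(content)
```

Only the requested member is read out, but because tar has no index the reader still has to scan (and decompress) the archive sequentially until it reaches that member.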