Luigi Auriemma (the creator of QuickBMS, a utility often used for unpacking such formats) wrote a paper called "Overview of game file formats and archives" that goes deep into this topic as well.
Something that was alluded to when you mentioned the headers but wasn't fully explained is sectors - a CD is divided into sectors, each 2048 bytes large, and these are basically the smallest unit you can work with on the disc. So you can't read less than 2048 bytes at a time, and if a piece of data is split across two sectors, you have to read both. Therefore it is wise on a CD to align data to the nearest 2048-byte boundary, i.e. sacrifice a tiny amount of space on padding so that files fit cleanly into sectors instead of being split down the middle of one. Hence why it may be desirable to control the offset where each file begins :)
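If it helps to picture it, the alignment itself is just rounding each offset up to the next multiple of the sector size. A minimal sketch, where the only assumption is the standard 2048-byte data sector:

```c
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 2048u  /* standard CD-ROM data sector size */

/* Round an offset up to the next sector boundary. */
static uint32_t align_to_sector(uint32_t offset)
{
    return (offset + SECTOR_SIZE - 1) & ~(SECTOR_SIZE - 1);
}

int main(void)
{
    /* e.g. if the previous file ends at byte 5000, the next one
     * starts at 6144 (the beginning of sector 3) */
    printf("%u\n", align_to_sector(5000));  /* prints 6144 */
    return 0;
}
```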
Oh, I almost mentioned Auriemma's work here! Earlier ideas for this video were more narrative and chronological, and would've included their IPK script for another Artoon title being used as the basis for a BLiNX 2 script many years ago. That was what I used for my original Blinx jacket mod way back then.
I definitely ought to give that paper a read, too, so thanks for mentioning it (and likewise for the quick explanation of sectors that I didn't manage to fit in anywhere).
Packing everything into binary blobs was happening way before the CD-ROM era. The further back you go, to machines with slower media and smaller memory footprints, the more you'll find it everywhere. On machines with only 64KB of addressable RAM this was even more crucial: you could keep a block of data in an extended RAM bank, page that in, and unpack the data into addressable RAM. It still happens today :)
It also happens today. What comes to mind is that the Warriors series still packs its data into gigantic bin files (IIRC usually named LINKDATA for whatever reason). Although they are using their own engine, I think?
This is such a cool video! Thank you for taking time to document this in a condensed form, it's always really fascinating to learn what makes some of our favorite games tick, as well as the ingenuity around technical challenges. It reminds me that I really need to take the time to think about how I want to handle memory in my own games... Even if we generally don't have the same kinds of limitations, utilizing the most out of the variety of systems games can run on can be really interesting.
WAIT 32 SUBS ???????????????
your vid was amazing bro !!! I really like the ambient music in the bg
As an archive file format enjoyer this piqued my interest. Excellent video, hope to see more. :)
*AWESOME* work here! And thank you for taking the time to explain some of these concepts for those of us that have close to zero knowledge of these things as well. The explanation of how and why data is laid out on the disc was my favorite part.
This is a really well done video on something interesting but pretty obscure, good work
Nice video, loved the visuals helping to understand the concepts you explained.
A niche, detailed behind-the-scenes overview and reverse engineering of one very particular hidden technical aspect of an only moderately successful console, and the underlying subtle technical and creative decisions made by developers 20+ years ago in my fav childhood game that I love to death and that almost no one else in the world has ever played, let alone heard of? And it's simply a really good video too? Bro this checks literally all my boxes, ez sub :)
The algorithm has blessed me
Okay yeah, Subbed. Good explanation, good Visualisation, just the right amount of gamer snark. :)
I love this channel😊
great video
great video! it's always interesting to see how things were done back in the day.
i assume this is somewhat of a "lost art" - the ps5 and xbox series both copy the entire contents of the game disc to the SSD before you can play, so i imagine they don't use any of these techniques, and the switch, with its flash cartridges, has no need to worry about seek times or layer switching. i very much doubt that the modern xbox developer kit includes a blu-ray disc authoring tool, as cool as that would be 😄
i'd be interested in learning about the methods used on the PS3 and 360 - did the density of the PS3's blu-ray discs change anything? were slower, more space efficient compression methods used to take advantage of the more powerful processors? does installing an xbox 360 game to the HDD just copy-paste the files, or is it extracting archives?
good video 👍
This is awesome! ❤
Top tier!
I remember archive formats. But not from xbox. I remember them from games like Armor Command from Ripcord Games.
Great vid! What was the music used?
As best as I can tell, pretty much all of it seemed to be from Blinx 2's soundtrack. Was there a particular track that you were interested in?
@Revoker1221 oh man, if i were to remember 😆
Funnily enough, when you try to mod Meteos, you find that its planet parameters are stored in .mtp files.
you did not explain why games didn't use standard archive formats like zip or 7zip or tar or whatever
I suppose you're right. I meant for the answer to be that the developers can fine-tune their own format to their needs, like BLiNX 2 with its file duplication you won't see in those other formats.
Another detail is that most common archive formats use compression that favours high compression ratios in exchange for slower decompression. If you're trying to save space on a computer or minimize the amount of data you have to send over the internet, that's a good tradeoff, but for a game developer whose main goal is to get a game loading quickly, it can be undesirable. The LZSS used in a few games I've seen is pretty quick to decompress, on the other hand.
I also suspect there's an element of whether libraries are available for a given compression format with a licence that your studio is willing to use, and how easy an algorithm is to implement yourself otherwise. LZSS is again very easy to implement yourself, at least naively (as I've seen Climax do), and there's a public domain implementation that's ready to use without any restriction (as BLiNX 2 did).
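For a rough sense of why it's so cheap: the decoder is just flag bits choosing between "copy a literal byte" and "copy a short run from data you've already output". A minimal sketch of an LZSS-style decoder follows; the 12-bit distance / 4-bit length packing is purely an illustrative assumption, not the layout any particular game uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal LZSS-style decoder, for illustration only. Each control byte
 * carries 8 flags (LSB first): flag 1 = copy one literal byte; flag 0 =
 * read a 2-byte token holding a 12-bit back-reference distance and a
 * 4-bit length (actual length = value + 3), then copy those bytes from
 * data already written to the output. Returns the bytes produced. */
size_t lzss_decompress(const uint8_t *src, size_t src_len,
                       uint8_t *dst, size_t dst_cap)
{
    size_t in = 0, out = 0;

    while (in < src_len) {
        uint8_t flags = src[in++];
        for (int bit = 0; bit < 8 && in < src_len; bit++) {
            if (flags & (1u << bit)) {
                /* Literal: copy one byte straight through. */
                if (out >= dst_cap) return out;
                dst[out++] = src[in++];
            } else {
                /* Back-reference into the already-decoded output. */
                if (in + 1 >= src_len) return out;
                uint16_t token = (uint16_t)(src[in] | (src[in + 1] << 8));
                in += 2;
                size_t dist = (size_t)(token & 0x0FFF) + 1;
                size_t len  = (size_t)(token >> 12) + 3;
                if (dist > out) return out;   /* malformed input */
                for (size_t i = 0; i < len && out < dst_cap; i++, out++)
                    dst[out] = dst[out - dist];  /* overlapping copy is fine */
            }
        }
    }
    return out;
}
```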
Off the top of my head:
- Tar has no central index, so extracting an individual file means scanning through the archive (and, for a compressed .tar.gz, decompressing everything that comes before it) rather than seeking straight to it.
- Zip compresses files "in isolation"; that is, it isn't able to deduplicate data that is the same in two files. I presume that this was to support extracting individual files.
- 7z supports extracting individual files and deduplicating shared data, but likely was too uncommon to really be used "as is".
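To show the contrast, here's a hedged sketch of the kind of custom layout the video talks about - a made-up header of {name, offset, size} entries, not BLiNX 2's actual format. Once the table is read, grabbing any one file is a single seek and read, and the developer is free to place (or duplicate) the data wherever suits the disc. Assumes a little-endian header laid out exactly like the struct.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical archive layout, for illustration only:
 * [uint32 count][count x entry][raw file data...] */
struct entry {
    char     name[16];  /* zero-padded file name */
    uint32_t offset;    /* absolute offset into the archive */
    uint32_t size;      /* file size in bytes */
};

/* Look up one file by name and read it with a single seek. */
uint8_t *read_one_file(FILE *arc, const char *wanted, uint32_t *size_out)
{
    uint32_t count;
    if (fread(&count, sizeof count, 1, arc) != 1) return NULL;

    for (uint32_t i = 0; i < count; i++) {
        struct entry e;
        if (fread(&e, sizeof e, 1, arc) != 1) return NULL;
        if (strncmp(e.name, wanted, sizeof e.name) != 0) continue;

        uint8_t *buf = malloc(e.size);
        if (!buf) return NULL;
        /* jump straight to the file's data - no scanning past other files */
        if (fseek(arc, (long)e.offset, SEEK_SET) != 0 ||
            fread(buf, 1, e.size, arc) != e.size) {
            free(buf);
            return NULL;
        }
        *size_out = e.size;
        return buf;
    }
    return NULL;  /* not found */
}
```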