"One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs." -- Robert Firth
@@spencerwhite3400 but then you run into the problem of only having 3999 possible numbers, which makes it somewhat challenging to add 65,535 to anything
@@trejkaz My guess would be _automaton officiī_ or _automatum officiī,_ based on what little I remember about how to decline Latin. (Yours looks like what Google Translate spat out for me. That cannot be trusted for things like this since it knows nothing about Latin grammar, AFAIK; it just pattern matches against a database of translations from other sources. The Google Translate output appears to me to be grammatically incorrect.)
@@pomelo9518 This would be what's known as an SQL injection (simply put, SQL databases are the most common type of database; they've been around forever and stood the test of time). Basically, if the input into an SQL database like MySQL is not filtered properly, it can quickly lead to it being interpreted as code, which happens all the time. `DROP TABLE` is the SQL command to delete a database table, and the asterisk `*` is a so-called wildcard character (special characters to help you select stuff). An asterisk means "insert everything here", or in other words: Delete all database entries.
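For anyone who wants to see the idea in code, here is a minimal sketch using Python's built-in `sqlite3` module. The `dogs` table and the hostile name are made up for illustration, and this is not a claim about how the AKC system actually works. The unsafe variant is only built as a string and never executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (name TEXT)")  # hypothetical table

# A hostile "dog name" in the style of the classic injection joke.
hostile_name = "Spot'); DROP TABLE dogs; --"

# Unsafe pattern: the input is pasted straight into the SQL text, so the
# apostrophe ends the string early and the rest reads as SQL.
# (Built here only to show the problem; it is never executed.)
unsafe_sql = "INSERT INTO dogs (name) VALUES ('" + hostile_name + "')"

# Safer pattern: a placeholder keeps the input as data, never as code.
conn.execute("INSERT INTO dogs (name) VALUES (?)", (hostile_name,))

stored = conn.execute("SELECT name FROM dogs").fetchall()
print(stored)
```

With the placeholder, the whole hostile string simply ends up stored as a (very odd) dog name.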
But that uses special characters not on the approved list of special characters! Though interestingly, if you don't need a semicolon to terminate the statement, some SQL injection could theoretically be performed with just the space, apostrophe, and hyphen that ARE allowed... Though again, considering that they are allowed in the first place, they almost certainly have input sanitization or else unexpected string termination from the apostrophe would be the FIRST thing they'd discover.
The only thing I know about the American Kennel club is that they spent decades creating genetic abominations like the Pug. Pedigree is another word for horribly inbred
As a database professional, this pains me a great deal. There is no reason (other than a lack of will) that they could not extend that character limit, and absolutely no reason why they should not.
and like... why roman numerals? if the system would somehow break with more characters you could still have 100,000 different dogs with the same name using Arabic numerals. seems like more than enough
Turns out you are not the only one who finds this frustrating. I know someone who made a 15 minute video about it involving hours if not days of work.
I think it should be a crime to store numeric data in a database as anything other than.... a number. If you want the name to have a Roman numeral appended, just print it that way on the client. Makes LITERALLY EVERYTHING easier for EVERYONE!
My Dad once told me that in their company's product, they had a two-character field for the year (they started sometime in the early 90s). When the 00's came around, since it was infeasible at the time to upgrade the database format across all clients, they instead went 98 -> 99 -> A0 -> A1. I believe it's really fixed by now though.
@@kevinpaynter Nah... before starting a project i roll some dice. It's 19. Well i guess i'll implement a base19 numbering system and only use that. That's why the fibonacci script is already 2gb in size.
As soon as the numbers came up in the allowed character list, I started imagining dogs named like vanity plates. You just have to name your 38th dog Skyl4b and you're good to go! After a number of generations with similarly constructed names, we'll eventually get around to naming them 5ky148.
Someone's going to register their dog's name as just fifty zeroes "I wanted my dog to always appear at the beginning of any list" or would that be 50 A's?
Writing 3999 as MMMIM instead of MMMCMXCIX, as it should, made me shiver. But then you went ahead and proposed writing 3 as IIV. That's the most cursed thing I've ever seen in my life.
IL, IC and IM are perfectly legit Roman representations of 49, 99 and 999. This has come up in math class back when I was in school. The computing teacher was called in to arbitrate and he decided the argument in favor of me, that is, that they were acceptable Roman numbers.
Usually, the AKC registered name isn't a name. It's a manufacturer serial number. It usually includes the name of the kennel the dog is from, a code indicating the litter it is from, a name that often has to fit some theme set by the kennel, and sometimes other weirdness. A quick Google gives me a picture of a dog named "Ableaim Patent Pending MC"
There's also the 4th and possibly most boring/"intellectually challenging" option depending on who you might ask: migrate the database to a more modern standard c;
The rule about "no more than three in a row" and, indeed, the use of negatives is a partially modern invention. The Romans did regularly have to deal with numbers over 3999-the classic legion was larger than that, never mind the census data-and even older modern clocks have digits like VIIII instead of IX. There's no reason we can't update the rules further.
I once tried to search why "4" is written differently on clocks sometimes and couldn't find any "logic" to it. It's not tied to a location or a period of time, clocks with IIII were made even in 1960s, 1970s. It's a somewhat rare feature and looks like it was simply a choice of the maker.
@@vsm1456 There is a "theory" that IV may have been an old abbreviation for Jupiter at the time, and because having a clock read essentially 1 2 3 GOD 5 6 would have been "interesting", it was changed to IIII, and it just carried over. Once again, it's just a theory with no proof from what I could find.
The reality is, no dog registered with the kennel club is named “Spot”. They’ve all got names like Harry Potter secondary characters, although “Excelsior Ludlum Spottiswoode IV” might get called “Spot” at home
@@wintaaaaa Often there are spam accounts that copy real top comments that were made early on. You can recognize the spam ones from the profile picture, name or at the very latest the channel description. Links are bad, they want to steal your money :D
@@theEWDSDS That's fair, I wouldn't call that the ultimate tell either though. But in this particular case I agree that they probably aren't. I just thought it was valuable enough to educate people on it so they don't fall for it.
My boyfriend had a show dog which was named something like “Whittier’s A Midsummer Night’s Dream” for show, but had a completely different name at home. Then they get something like “ex. something or other.”
Going to guess the database was using Hollerith cards, like those that used to be used for census data. 80 columns of 10 digits, so likely they allocated 2 rows to store a representation of the Roman numerals, using 3 bits, and then 3 sets of numbers, and a check bit, over the 2 rows. Leaves 78 columns for use, with likely one taken for breed (512 breeds should be enough to handle any likely future use), leaving 77 for name of dog and a reference field to another card, containing address. Used machines that at the time were common, easy to get hold of as used and working, and also had the whole "computerised" feel to it. Easy to do lookup per field as to what would be allowed, and what translation to use, when sending to output as well.
So... is BZA like RZA or GZA? Are you part of the Wu-Tang Clan?! Or are you from the Board of Zoning Appeals? Either way, we need to talk about some stuff!
Nice theory but the problem is that the letters of the alphabet all fit in one column. The trick is that they use two holes from different rows in the same column to do so. One hole is in the top three rows and the other in the bottom 9. See the Wikipedia article titled Punch card. I think that Matt's theory that only 6 bytes were allocated is most likely the correct one. They used to allow only 8 bytes for names on computer graded standardized tests back in the 60s. It was always alarming to me that I could not put my entire last name on the answer sheet.
Matt: "And that's where most videos would end: here's a crazy fact, here's the reason behind it, job done. But not here! The Stand-up Maths policy is to try and fix things! And I reckon I've got some solutions so Americans can have more dogs with the same name." Me: "Arabic numerals. With six digits, you can count up to 999 999." Option 3: if you were to start at 39 instead of at 1, you can have two more dogs with the same name. I like this channel so much.
Even better! Instead of six characters which would take 6 bytes of storage, you could use a 32 bit unsigned int! You could save 2 bytes storage space per dog, but you could still have an absurd 4 294 967 296 distinct numbers per name (0 through 4 294 967 295)!
@@kacperfilipek8461 And you could have a function translate that to a roman numeral for display, well at least the lower values that can be represented by roman numeral.
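A rough sketch of that idea in Python: store a plain integer and only turn it into a Roman numeral when displaying. The function names and the fallback to Arabic digits above 3999 are my own assumptions for illustration, not anything the AKC does.

```python
# Value/symbol pairs for standard subtractive notation, largest first.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman(n: int) -> str:
    """Convert 1..3999 to a standard Roman numeral."""
    if not 1 <= n <= 3999:
        raise ValueError("standard Roman numerals cover 1..3999 only")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)

def display_name(name: str, number: int) -> str:
    """Hypothetical display helper: Roman suffix if possible, digits otherwise."""
    suffix = to_roman(number) if 1 <= number <= 3999 else str(number)
    return f"{name} {suffix}"

print(display_name("Spot", 37))    # Spot XXXVII
print(display_name("Spot", 5000))  # Spot 5000
```

The stored value stays a number the whole time, so sorting and counting keep working no matter how it is printed.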
"if you allow multiple Ms, which some people do" - some people would include the Romans. Although III and IV were the most common representations for 3 and 4, you can also find IIV AND IIII.
Yeah Roman numerals are way more versatile than people realize. Additive numbers (ex. IIII for 4) really were the most common variant until the subtractive IV became more popular in Medieval times. None of these are wrong as long as the standard is consistent. You know what the Romans didn’t do though? Use IL for 49 (and all subsequent derivatives of that). That’s the only Roman numeral trick that some people use that is objectively incorrect.
How we began: In America, only 37 dogs of the same breed can have the same name. How we ended: We could potentially register every observable atom in the universe in the Kennel Club, even if they were the same breed of atom.
90% of the atoms in the universe are the same breed, Hydrogen. The other 10% is Helium, and there are some rare elements other than hydrogen and helium
Option 5: Start a movement for everyone register their dogs' names using only the letters I, V and X in the hope of breaking their system somewhere down the line.
It's funny because it kind of makes the character longer. The number of characters is the same, but the actual width is different (III vs IIV, 13:21 pixel ratio)
@Johnny Rep I was about to comment about 18!! I’m so glad someone put that that is the way it works for numbers bigger than ten when using wording in Latin.
I can't believe I was right in the middle of a deep dive on Roman history, Roman mathematics and the Roman calendar, and look who drops into my feed! And Skylab too! 🐕
Regarding 8 as IIX the 18th legion used XIIX as their label which precisely follows that rule you mention. Yes, it was unusual but XIIX is nicely symmetric and so was probably the reason why it was chosen. Also of note is that it is not only read the same forward and backward but also upside down and that makes it very nice for a label which may appear in many directions during a battle depending on the angle of the banner with the ground.
Fun fact: there is actually precedent for usage of option II in Roman times; the Wikipedia article on Roman numerals specifically mentions IIX for 8 and variants as "Irregular subtractive notation" (not necessarily consistently - it shows both IIXX and XIIX for 18, for example).
So basically the Romans are doing exactly what Matt explained how he conceptualised Roman numerals ... if smaller digits come before larger digits, the smaller digits are negatives. Smart people, those Romans were.
@@justinwatson1510 you have to parse it as "II before XX," not "IIX before X." It's more complex than just left-to-right, otherwise XIV would be 11+5=16 instead of the 10+4=14 it actually is.
Even binary, the least dense numeric base, has more room with six digits than roman numerals. It also has zero, so you can make use of all 64 rather than just 63
although in terms of binary data, binary itself is much more space efficient. standard ASCII characters are 1 byte each, which is 8 bits, so the total amount of space used by 6 roman numeral characters is 6*8=48 bits. with 48 bits, you could store a 48-bit aka 6-byte integer (uint48) that would allow for 2^48=281474976710656 different unique identifiers for each dog name. then, if you really want to, just convert it to roman numerals once you're ready to display it somewhere
This might be less of a common issue than it might seem at first. AKC breeding dogs have 2 names, their "common" name ("Skylab"), but then they have these crazy gelical names like "Parker's Dazzling Stellar Cosmodog". So the AKC registered name tends to be very unique anyways.
Most breed registries do this. Often the registered name has the breeder's farm buried somewhere in it, so as the animal moves around (by sale or in lineage) one can instantly see the original breeder.
This real life solution is basically a practical implementation of the final option Matt presented. There's just so many unique options that it's almost inconceivable to need 37 iterations of the same text.
@@nolin132 Lots of competitive animal societies have the distinction in order to be unique when competing. Plenty of racing horses with a common name of something like 'Spot' or 'Blackie' have an official name more like 'Guyana Star Dweej' or 'It'sallinthechase'.
I'm pretty sure that the ultimate reason for this limitation is that the original system was likely coded in COBOL. COBOL has this thing for building data records using ASCII characters with a fixed width for each field, with a newline after each record. This makes reading through a file with a buttload of records in it quite easy since every record is exactly the same number of bytes, and finding a particular value within a record is easy for COBOL because it's always in exactly the same spot. This is what made/makes COBOL so good for this kind of record storage. Today, we'd use some kind of RDBMS but back when computers were new(-ish), such things didn't exist yet, and flat files written to tapes were the thing to use, so COBOL and the COBOL way of storing data made a ton of sense. In particular, knowing the size of each record in advance let you scroll through a tape very quickly and easily. This is important with tape since rewinding a tape is very slow, so overshooting the mark was very bad, and you couldn't just read the beginning and end of a file at the same time with a tape (which is needed for many modern storage methods, like ZIP files and RDBMSs), you had to read the whole thing from beginning to end. You also had to write the files from beginning to end without going backward, and it was unlikely that you'd be able to store the whole thing in memory at once, so you couldn't just make a table of contents for the beginning and then write out the rest of the file. You couldn't have generated the table of contents until after the whole thing was written so it would have to be written at the end, but you need the table of contents to navigate, so it needs to be at the beginning, but you can't do that, and OMG, it's not going to work. That's why you didn't usually put ZIP files onto tapes in the early days of ZIP files; their table of contents is at the end of the file. 
Tar files put it at the beginning, but at the cost of having to have the files already compiled onto a disk of some sort as an intermediate. The individual files could contain whatever tables of contents were necessary for navigation and because you were using a disk, you could write them to the beginning of the file after you'd already written the rest of the file's contents to disk. But you can't do that when you're writing directly to a tape with no disk storage intermediate. COBOL and its fixed-width record-based files were perfect for the era when you had tapes but no disks.
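To make the fixed-width point concrete, here is a small Python sketch (not COBOL) of why equal-sized records are easy to navigate: record number k always starts at byte k times the record size. The 36 + 6 column layout is purely an assumption for illustration, not the AKC's real format, and an in-memory buffer stands in for the flat file.

```python
import io

NAME_WIDTH = 36      # assumed width of the name field
NUMERAL_WIDTH = 6    # assumed width of the Roman numeral field
RECORD_SIZE = NAME_WIDTH + NUMERAL_WIDTH + 1  # +1 for the trailing newline

def pack_record(name: str, numeral: str) -> bytes:
    """Pad both fields with spaces so every record has the same length."""
    if len(name) > NAME_WIDTH or len(numeral) > NUMERAL_WIDTH:
        raise ValueError("field too long for its fixed-width slot")
    line = name.ljust(NAME_WIDTH) + numeral.ljust(NUMERAL_WIDTH) + "\n"
    return line.encode("ascii")

def read_record(f, index: int):
    """Jump straight to record `index`; no scanning or table of contents."""
    f.seek(index * RECORD_SIZE)
    raw = f.read(RECORD_SIZE).decode("ascii")
    name = raw[:NAME_WIDTH].rstrip()
    numeral = raw[NAME_WIDTH:NAME_WIDTH + NUMERAL_WIDTH].rstrip()
    return name, numeral

rows = [("SPOT", "I"), ("SKYLAB", "XXXVII"), ("REX", "IV")]
flat_file = io.BytesIO(b"".join(pack_record(n, r) for n, r in rows))

print(read_record(flat_file, 1))  # ('SKYLAB', 'XXXVII')
```

It also shows where a hard limit comes from: a numeral longer than its slot simply has nowhere to go without changing the size of every record.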
First thought was "what if they allowed an additional digit? How far can you get before hitting 8 numerals?" I was quite pleased to realise that 88 is actually the first Roman numeral of length 8 😁
I expected Matt to talk about the furthest you can get with a length of numerals. 8 characters would allow numbers up to 187, and 9 characters would get to 287. The longest possible Roman numeral is MMMDCCCLXXXVIII, or 3888.
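The length claims above are easy to check by brute force. A quick Python sketch, assuming standard subtractive notation and the usual 1..3999 range (helper names are mine):

```python
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman(n: int) -> str:
    """Standard Roman numeral for 1..3999."""
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)

def largest_run(max_len: int, limit: int = 3999) -> int:
    """Largest N such that every number from 1 to N fits in max_len characters."""
    n = 0
    while n < limit and len(to_roman(n + 1)) <= max_len:
        n += 1
    return n

for width in (6, 7, 8, 9):
    print(width, largest_run(width))

# The single longest numeral in the standard range.
longest = max(range(1, 4000), key=lambda n: len(to_roman(n)))
print(longest, to_roman(longest))
```

For widths 6, 7, 8 and 9 this gives 37, 87, 187 and 287, and the longest numeral comes out as 3888, matching the comments.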
Note on option 1: there was no actual rule "only three of each in a row" to the people for whom "roman numerals" were just "numerals". Even back when Julius Caesar still had things to say, IIII instead of IV could be found in official records.
I was listening to a podcast about the disappearance of the ninth legion and they said that it looked like some detachments wrote the legion's number as IX and others wrote it as VIIII. So as the inscriptions from the Rhine all write the number in the less common way, that probably means that it was only a single detachment that got sent there and not the whole legion.
Fun fact, I used to have a dog breeder as a client, and would see the bills for the name registrations. Most of the time the dog's registered name would be [Breeder name] [Actual Name]. Now, this was in Canada, so it was the Canadian Kennel Club, so their naming rules may be different. So if the breeder's business was called 'Starwood Country Kennel' they would register their dogs as Starwood Rover, or Starwood Spot. That way you never have to actually worry about having dogs with the same name.
matt did mention at the end that they have like 36 characters excluding the roman numeral suffix, which also allowed all 10 arabic numeral digits. so they could simply bodge an arabic numeral into the registered name anyway. if they wanted to preserve the breeder prefix, they could probably also devise some sorta 3-char acronym convention so they saved more space for the actual name. in short: there are literally innumerable ways to work around a deprecated and badly designed database, instead of just consigning themselves to its restrictions.
Horses can only be registered with a unique name, so in that same vein, you end up with ridiculous names like "JKB Frosty Cloud Dancer" but the horse is named "Dance".
My solution: IF one adds a space after a name, that doubles the number of names you can have. Therefore our number of names increases for every character not used under the limit. Such a system would probably annoy them greatly, however, so they would likely modernize the system.
And you could add more than one space so Spot______VI and Spot____VI would both be valid and different dog names. Edit: consider the names Skylab and Sky Lab, they are different names, and as such why does the space being at the end and being in the middle make a difference
"They have so overpowered the rest of the name that there are enough names for everything!" Oh, Matt, how could you apply combinatorics to the characters in the names but not to the atoms you're trying to name? Ok, so we name every atom, but we can also name, e.g., "that galaxy", and presumably things like people and, well, dogs. But what if I want a name for me and my dog? What if I want a name for everything in my galaxy except my dog? At the end of the day, we need names for every combination of atoms, and that's going to need rather a lot of names. Loved the video.
Every single fundamental particle gets a unique ID. To describe groups of particles, like protons, neutrons, atoms, molecules, dogs, planets, etc, you set the bits whose places correspond to the IDs of the particles that compose the group and that binary number can uniquely describe every possible group of particles you can name. (Seriously, binary numbers are really good at indexing power sets.) It would also imply that the universe itself can be described with the name of -1, because two's complement.
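A toy version of that bitmask naming in Python, with a four-particle "universe" so the numbers stay readable. The function names are made up for the example. Python integers behave like infinitely wide two's complement, which is why -1 works as "everything".

```python
UNIVERSE_SIZE = 4  # toy universe: particles with IDs 0, 1, 2, 3

def group_id(member_ids) -> int:
    """Set bit i for every particle i that belongs to the group."""
    gid = 0
    for i in member_ids:
        gid |= 1 << i
    return gid

def members_of(gid: int, universe_size: int = UNIVERSE_SIZE):
    """Recover the particle IDs whose bits are set in the group ID."""
    return [i for i in range(universe_size) if (gid >> i) & 1]

dog = group_id([0, 2])
print(dog, members_of(dog))   # 5 [0, 2]

# -1 has every bit set, so it names the group containing every particle.
print(members_of(-1))         # [0, 1, 2, 3]

# Every subset gets its own number: 2**n IDs for n particles.
all_groups = {group_id(members_of(g)) for g in range(2 ** UNIVERSE_SIZE)}
print(len(all_groups))        # 16
```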
After some Googling, it seems like the Romans actually had signs for the 1000s, like ↁ (5000) and ↂ (10000). But they had multiple ways of indicating 1000s. An alternative notation, and one that is easier to grasp, is the vinculum notation, where V̅ would be 5000. I'm not sure whether you could actually say that Roman numerals don't have the option to go above 3999. They could, but because there were multiple methods of handling that, we don't seem to teach them nowadays.
This is the comment I was looking for. I mean, *I* learned about vincula in school, so why on Earth didn't a professional mathematician do so? Seems Matt got a Parker math education.
He mentioned the vinculum in this video. Presumably the AKC doesn't (or didn't) have the ability to type a vinculum when they created the database (or didn't know about them at all).
This is SO cool!!! I wish I was taught this in school, but it makes sense that I wasn't. The only reason kids are taught Roman numerals these days is to understand places where they're traditionally used, and those symbols have fallen out of common usage.
Afaik the romans in daily practice actually did put 4 of the same characters after another. So instead of 9 = IX (as we would think), they wrote it VIIII. Same with the 4 having been written IIII instead of IV etc
Given how dogs are named in the AKC, it's unlikely to see 38 of the same name anyway. The majority of dogs' official AKC names include the kennel names where the dog was bred, and potentially the kennel name of the borrowed sire.
11:36 "I have become very focused on the Roman Numeral aspect -- don't get me wrong, had a great time." Having a great time with mathematics and sharing it with the world is what makes this channel awesome. :)
Some dogs are made of multiple atoms [citation needed] so we really need names long enough to handle the power set of all atoms in the observable universe.
@@QuantumHistorian Depends on your definition of "in" but taken one way it is in fact a regular occurrence anywhere multiple dogs conspire to become yet more dogs.
@@QuantumHistorian We need names for every possible combination of atoms. You wouldn't call a water molecule "That bloke with the hydrogen atom called Benedict DCLXVI"
Your solutions are nice, but I'm *way* more curious about why AKC's naming system has these limitations in the first place now. I'm trying to imagine a system where the roman numerals couldn't just be outright replaced with numbers, even pre-computerization.
When something stays the same for some time, it becomes tradition. And when something becomes tradition, rationality no longer applies. We still have 7-day weeks!
@@juanausensi499 I totally get your point but in this case I think it's probably just being lazy to update the database to accept more numerals because of some runtime issues etc.
We had a regex at work that only allowed up to four characters. It went for years before someone realized that it was missing 18 and a bunch of other ones. I think the fact that the gaps weren't consecutive made it easier to miss.
Or you know, they could store the numbers as a database defined integer, and have a function convert said integer into a roman numeral for display. The limit at that point is the size of an integer in your DB, which should be easily large enough.
i'm pretty certain that the institution is older than computers, so they are probably backed by some legacy mechanical system that is causing this issue
The "changing the sign" explanation makes perfect sense to me, and parsing is quite a bit easier with that in mind (you would only need an accumulator, a mapper, and the previous individual value). But Option 2 would make that parser a bit more difficult to implement.
option 2 is the same implementation detail as the regular interpretation, you just only allow two consecutive characters not three, and when you reach that threshold add the next character to the end, then pop off the left, then increment the right (just like how you already have to do it, just that popping off the left can only happen once normally, so I guess that explains "the previous individual value" has to become a stack/queue (of size 2?)) I II IIV IV V VI VII IIX IX X ...
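Here is one way the sign-change idea could look in Python. It is a variant rather than the exact parser described above: it reads right to left and keeps the largest value seen so far instead of just the previous value, which happens to cover the irregular forms like IIX, XIIX and IIXX as well. It does no validation, so treat it as a sketch.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def parse_roman(numeral: str) -> int:
    """Sum the symbols, flipping the sign of any symbol that has a larger one to its right."""
    total = 0           # the accumulator
    largest_seen = 0    # largest symbol value seen so far, scanning from the right
    for ch in reversed(numeral):
        value = VALUES[ch]          # the mapper
        if value < largest_seen:
            total -= value          # smaller before larger: counts as negative
        else:
            total += value
            largest_seen = value
    return total

for numeral in ["XIV", "MMMCMXCIX", "MMMIM", "IIV", "IIX", "XIIX", "IIXX"]:
    print(numeral, parse_roman(numeral))
```

Both spellings of 3999 from the video parse to the same value, and IIXX and XIIX both come out as 18.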
My partner related this story to me [web developer] while we were out walking our own dog last night. Never has a database design story caused me more psychic damage.
@@Nathan-dt2tu I guess you're new to the internet then? It's pretty common to use them interchangeably for comic effect, as though one is describing a 4e D&D power. Thanks for the splaining though.
I like the idea that "elementary particle" is technically a breed of dog, and the entire observable universe is technically America, so each particle should have its own unique name registered with the American Kennel Club.
If you are limited for naming to strings of length 6 made of the characters {"I", "V", "X", "L", "C", "D", "M"}, the optimal way to number dogs is to interpret the string in base-7, so e.g. "VXLDIL" could mean dog #22886, which is calculated by 1*7^5 + 2*7^4 + 3*7^3 + 5*7^2 + 0*7^1 + 3*7^0. This gives us the maximum of 7^6 = 117649 dogs, which is the most you could uniquely number using only these 7 Roman numeral characters.
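The arithmetic above can be checked with a few lines of Python. The digit order I=0, V=1, X=2, L=3, C=4, D=5, M=6 is inferred from the worked example, and the function names are just for illustration.

```python
DIGITS = "IVXLCDM"  # I=0, V=1, X=2, L=3, C=4, D=5, M=6
BASE = len(DIGITS)  # 7

def decode(code: str) -> int:
    """Read a string of Roman letters as a base-7 number."""
    n = 0
    for ch in code:
        n = n * BASE + DIGITS.index(ch)
    return n

def encode(n: int, width: int = 6) -> str:
    """Write n as exactly `width` base-7 digits using Roman letters."""
    if not 0 <= n < BASE ** width:
        raise ValueError("number does not fit in the given width")
    chars = []
    for _ in range(width):
        n, digit = divmod(n, BASE)
        chars.append(DIGITS[digit])
    return "".join(reversed(chars))

print(decode("VXLDIL"))        # 22886
print(encode(22886))           # VXLDIL
print(encode(BASE ** 6 - 1))   # MMMMMM, the last of the 117649 codes
```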
@@Zemnmez that's assuming you can only use the existing 7 roman numerals, and 6 characters. So that's the maximum you can get out of it within those constraints. Of course if you can add more symbols then you can cram however high base you want, but that involves changing how the data is stored. The assumption is whatever antiquated database they use can only store those 7 characters and 6 of them. On a punch card, that could be represented with 3 holes in 6 columns with zero room for expansion. If you could just change database, then you might as well just store regular arabic numbers for up to 999999 dogs, or just store it as a regular unsigned 16/24/32/64 bit number and have practically unlimited dogs.
Assuming you're allowed strings of up to and including length 6 then you can use bijective base 7 with a total of 7¹ + 7² + 7³ + 7⁴ + 7⁵ + 7⁶ = 137256 dogs.
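A sketch of bijective base 7 in Python, for anyone curious how the variable-length version works. Here the letters stand for the digits 1 to 7 (I=1 up to M=7) and there is no zero digit, so every string of length 1 to 6 is a distinct number. The mapping and names are my own choice for illustration.

```python
DIGITS = "IVXLCDM"  # I=1, V=2, X=3, L=4, C=5, D=6, M=7 (no zero digit)
BASE = len(DIGITS)

def encode_bijective(n: int) -> str:
    """Bijective base-7 representation of a positive integer."""
    if n < 1:
        raise ValueError("bijective numeration starts at 1")
    chars = []
    while n > 0:
        n, r = divmod(n - 1, BASE)
        chars.append(DIGITS[r])
    return "".join(reversed(chars))

def decode_bijective(code: str) -> int:
    """Inverse of encode_bijective."""
    n = 0
    for ch in code:
        n = n * BASE + DIGITS.index(ch) + 1
    return n

total = sum(BASE ** k for k in range(1, 7))
print(total)                     # 137256
print(encode_bijective(1))       # I
print(encode_bijective(8))       # II
print(encode_bijective(total))   # MMMMMM
```

Dog number 8 is the first one that needs two letters, and dog 137256 is the last one that fits in six.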
@@Nathan-dt2tu the rule is, minimize the number of characters, AND don't use more than 3. It really only should kick in on 8 or similar patterns like 80, where you'd be adding one digit plus three, instead of two subtracted from one. The video writing IIV kind of missed the point of minimizing, but it's still valid.
i love how mathematicians go on these weirdly absurd thought experiments, meanwhile the AKC probably hasn't even used roman numerals in decades and the rule only exists because it's been grandfathered in.
I think you have it backwards though: with the advent of phones and the internet, and other modes of instant communication, they only allow more than a single dog to have the same name because it is grandfathered in.
They could assign a separate identifier (like a number or code) and assign relationships by the coding instead of the name. That way even if two dogs are related and have the same name you can tell they’re two separate dogs.
@@stargate525 I don't think they meant using the identifiers to assign the dog's lineage lol, they just mean give each 'dog' in the system an ID number.
As a Hearing Impaired member of your fanbase, I must inform you that closed captioning is unavailable to me on this video. I have a very hard time understanding speech without CC, and so does my aging mother. I would appreciate it if you could accommodate us Hearing Impaired or others with audial processing issues. But that part about the atom dogs was great!
I know Tom Scott pays for CC, so that's another channel worth checking out. My understanding is that the automatic CC provided by YouTube takes time to generate, partly because there are hundreds of hours of content uploaded to YouTube every minute. The best way to prioritize this video for CC is to share it with friends and like/comment/subscribe, I guess.
@@saturatedodin476 > Saturated Odin clicks on a maths video > something something roman numerals something > scrolls to comments > sees "Labs are the best!" > stretches > clicks REPLY > cracks his knuckles > starts typing > "No" > refuses to elaborate further > hits enter > leaves
Or convert all the current Roman numerals to base 10 strings in the database (since it currently only accepts strings not integers) and display them as Roman numerals when retrieved. For example convert “XXXVII” to “37” in the database. That gives you 999,999 of each name.
Option 4: 6 characters is a storage for 6 characters - use 0-9 on those characters and have a function to decode it into Roman numerals if you want to present them as such.
Even if for some unimaginable reason the storage can only use letters from roman numerals, since there are 7 letters (I, V, X, L, C, D and M) you can basically just use base 7. That would add up to 7^6 = 117649 dog names (maybe 117648 if not accounting for zero).
Or, Option 4: The AKC drops Roman numerals INSIDE the database, and uses either base 10 (999,999 Dogs of the same breed with the same name) or base 16 (0xFF FFFF, or 16,777,215 dogs of the same breed with the same name). To keep their Display in Roman(ish) numerals, they could have a 'look-up table' in their database to pull up a string based on the value, adding new Roman(ish) numerals. We already have I=1 V=5 X=10 L=50 C=100 D=500 M=1,000, so, borrowing from Engineering notation, we can add T=5,000, P=10,000 E=50,000 Z=100,000 Y=50,000 ... Here the Engineering prefixes stop, so the 'No More than 3' rule and the 'no duplicate 5s' rule leaves us with ZZZY, 350,000. So we go with another Engineering habit, "when the 'numbers' run out, start using letters." So we'll have A=1,000,000 B=5,000,000 F=10,000,000, and now we can represent all 16x10^6 dogs with Roman(ish) numerals. Although it might just be easier to just use the hexadecimal numbers, the database is already working in that base internally (actually, it works in base 2, but 4 hex digits completely describe a 16-bit 'word,' the standard memory size in modern computers.)
@unsubtract those guys might be overwhelmed by the options there. Even counting puppy mills, has any breed even gotten close to that number, counting not only living dogs, but ancestors back to the establishment of the breed.
@Andrew Dreasler why skip G in between M and T tho? I can understand counting "k" as lost, but not G. :-B Also, the ZZZY is not quite the max (also, your typo for Y made you write it wrong, 350,000=ZZZE and in fact ZZZY is invalid), the max would be YZZZPZ or 890,000. ;-) [Similar to how the largest 6-character number that doesn't use M is DCCCXC = 890]
@@wasabi991011 because base64 is therefore ambiguous... unless you want to encode the operations (and potentially the spaces) as well, sacrificing any remaining bit of readability... :-s Also, it's not easy to work out a ballpark estimate of a number from its representation... which would be quicker to evaluate, A0F or oP, or maybe even MjU3NQ==(!) ?
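For what it's worth, the three strings above can be checked in Python, assuming `A0F` is read as hexadecimal, `oP` as digits in the standard base64 alphabet, and `MjU3NQ==` as ordinary base64 text encoding. The helper name is made up for the example.

```python
import base64
import string

# Standard base64 alphabet: A-Z, a-z, 0-9, +, /
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def from_base64_digits(text: str) -> int:
    """Treat each character as one base-64 digit of a number."""
    n = 0
    for ch in text:
        n = n * 64 + B64_ALPHABET.index(ch)
    return n

print(int("A0F", 16))                 # hexadecimal reading
print(from_base64_digits("oP"))       # base-64 digits reading
print(base64.b64decode("MjU3NQ=="))   # base64 encoding of the decimal text
```

All three readings land on 2575, which rather proves the point about ambiguity: the same number has three quite different looking spellings.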
Another question: Is this really an issue? Here in Germany people who register their dogs use fancy unique names like "Harold Vincent von der Vogelweyde" (the "last name" usually referring to the dog's sire) but call them "Spot" at home.
In the US (this video does reference the American Kennel Club), it is common for the dog's name to include the kennel name of the breeder. My puppy is named (kennel name)'s Delicate Sound of Thunder, but his call name is Thor.
So THAT'S why dogs are registered with such weird names. Growing up, we had a Great Dane named Rufus. Even though he was not registered with the AKC (at least I don't think he was), my mom insisted that his "official" name was Sir Rufusson of... "(something, something, something - I don't remember because we just called him Rufus).
regarding the "everything could have a different name" part: seeing as a dog is composed of many atoms, we should also allow names for groups of atoms, not just individual atoms. and there are so many more distinct atom groups than there are atoms. that powerset is gigantic
Option 2 was also my favourite. I read as a child that some people used the IIX notation. A few years later, I remembered this and agreed. I would actually comment about this notation, even if you haven't mentioned option 2. Some people might argue that you can't "change important history", but if we are still using them, they aren't really history.
8:15 In fact the real Romans did exactly what Matt suggests, mainly on their calendars. Eighteen could be written XIIX and twenty-eight XXIIX. (But they stuck with XXXVIII for 38, leaving them forever excluded from the American Kennel Club.)
That's probably also because that's how they _called_ those numbers. 18, 28, ... up to 88 were called "two-from-[the number two units up]" in Latin (duodeviginti, duodetriginta, etc.)
Huh, I still wonder if representing four as IIII is a modern invention or if it was also used by the Romans (I heard it was invented to stop people confusing IV and VI on the clock)
Not quite... or at least: not consistent. For example 18 can be, and was, written as both XVIII and XIIX. The epitaph of Centurion Marcus Caelius and the Fasti are good examples of this. In fact the subtractive method is the rarer of the two methods and probably only in widespread use because of Microsoft Excel's ROMAN() function
@@Domihork Searched around a little bit. Many sources give the 14th Century Wells Cathedral clock in England as the earliest known example. I saw some claims the practice dates back to Ancient Roman sundials, but not much evidence (no supporting images of Ancient Roman sundials, for one)
In No Man's Sky, I would always name each type of planet with a Roman numeral system up to 30, then switch to decimal after 30. It was a little confusing with each numeral sometime in the teens, so that's why I decided on the switch.
8:23 was really interesting to me, as in Latin that's actually how their numbers are named (e.g. 18 is duodeviginti, “two from twenty”), so it almost makes me wonder if that's actually how the Romans did it
That's how some French words work, too.. 17 18 19 are dix sept, dix huit, dix neuf (10 7, 10 8, 10 9) and 80 is really quatre vingt (4 20..as in 4x20, not related to weed)
By now I've given up on understanding how Matt thinks, and I'm completely fine with it. But the last thing I was expecting when I clicked on this video was that it would end with anything related to antimatter.
Here are a couple of other solutions: 1. Just expand the database field to use more characters. This is a simple thing to do (depending on the db system). 2. Use an integer instead of a character-based Roman numeral. 3. OK, so you want to keep the Roman numbers: convert the database to use an integer and, on display of that value, convert it on the fly to a Roman numeral. This could be done in a db "View" (assuming a SQL-type db) or in the UI.
That last bit made me laugh out loud about there being more options than there are atoms in the universe, but then I thought, "Hang on... it would not even be possible to STORE that many names unless we had a large number of spare universes to hold the data AND knew how to even do that!"
I love thinking about the implications of huge numbers. Once you get big enough, you run into issues of the universe not being big enough to handle them. A very fun existential thought experiment.
It is a funny thought. I wonder though how much the number of universes needed gets cut if we use compression. If we know that these hundred atoms in a row have the same dog name, we could store that as Spot (I:C) rather than [Spot I, Spot II, Spot IIV, Spot IV, etc.]
One thing I find interesting about Option 2 is that the word for 18 in Latin is "duodeviginti", literally meaning two from twenty. So by all rights it should be written XIIX.
Small point regarding the possible names, I fear you have overestimated as any name with multiple consecutive spaces would be confusing and open to error!
@@zsdavis Pretty sure there's no difference between a five character name and a 50 character name that ends with 45 spaces. But even if they didn't count like that, there would be 39^49 of length 49, 39^48 of length 48, and so on. And while those are big relative to the number of dogs that have ever existed, they are small relative to 39^50. You'd just be multiplying the total by the fast-converging geometric series 1 + 1/39 + 1/39^2 + 1/39^3 +... Which increases the total in the video from 1.32e81 to 1.36e81
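The geometric-series argument above is easy to check with exact integers. A minimal sketch, assuming the 39-character alphabet and 50-character limit used in the comment; the variable names are just for illustration.

```python
from fractions import Fraction

ALPHABET_SIZE = 39  # allowed characters, as assumed in the comment above
MAX_LENGTH = 50     # maximum name length, also taken from the comment

# Names of exactly the maximum length.
exact_length = ALPHABET_SIZE ** MAX_LENGTH

# Names of every length from 1 up to the maximum.
up_to_length = sum(ALPHABET_SIZE ** k for k in range(1, MAX_LENGTH + 1))

# How much bigger the total gets when shorter names are counted too.
ratio = float(Fraction(up_to_length, exact_length))
print(ratio)
```

The ratio is indistinguishable from 39/38, a factor of about 1.026, so counting shorter names separately barely moves the total.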
In the spirit of this video, I will argue against option 2. As the first "I" wouldn't be followed by a larger number, it would be additive; therefore, 8 = IIX = I + (-I) + X = 1 - 1 + 10 = 10. Fun video, though. Loved all the Skylabs as stars at the end.
@@Froudd While true it's about understanding the number and I think anyone who understands roman numerals would understand the meaning without someone telling them
Skylab. :) Golden retriever here. Considered naming him Au thinking it would be great fun to call out "Eh! You!" at the park. Decided to keep the au in the name and went with Tau. :)
As someone in the tech sector, I got chills when Matt suggested the possibility of registering every atom in the universe. Some things are not meant to be done at scale O_o
Option 4: remember that you live in the 21st goddamn century and use Arabic numerals like a normal person. Option 5: remember that disk space has become absurdly cheap (because we live in the 21st goddamn century) and just add more space for the stupid, pointless, self-important roman numeral suffix. Option 6: use the inbuilt GUID that literally every competent database schema already has. Option 7: store the suffix in Arabic numerals and just convert it to Roman numerals for display if you just REALLY INSIST on being a pretentious jackass.
I hate to disagree, but in the 21st century you are mostly using binary numbers (at least if you are working with databases and you don't want to waste disk space on ASCII or you don't want to do high precision decimal calculations like 0.1 + 0.2, but that should not be needed for keys in a database...). And with that you can show them in whatever display format you want...
Yeah, I guessed where this was going before he explained it, and I was expecting it to be Roman numerals stored in 8 characters, since powers of 2 are common in computing so 8 bits seems a logical amount to use. Actually now I'm curious to see how far you can go with these. 6 digits: 37, XXXVII; 7 digits: 87, LXXXVII; 8 digits: 187, CLXXXVII; 9 digits: 287, CCLXXXVII; 10 digits: 387, CCCLXXXVII; 11 digits: 887, DCCCLXXXVII; 12 digits: 1887, MDCCCLXXXVII; 13 digits: 2887, MMDCCCLXXXVII; 14 digits: 3887, MMMDCCCLXXXVII; 15 digits: 3999, MMMCMXCIX (to go any further you need a symbol for 5000 - that would get you up to 8887).
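The table in the comment above can be reproduced without building any numerals at all, just by counting characters per decimal digit. A rough sketch, assuming standard subtractive notation capped at 3999; `roman_length` and `max_consecutive` are made-up helper names.

```python
# Characters needed for each decimal digit 0-9 in standard subtractive notation:
# "", I, II, III, IV, V, VI, VII, VIII, IX (same pattern for tens and hundreds).
DIGIT_LENGTH = [0, 1, 2, 3, 2, 1, 2, 3, 4, 2]


def roman_length(n: int) -> int:
    """Number of characters in the standard Roman numeral for n (1..3999)."""
    if not 1 <= n <= 3999:
        raise ValueError("standard Roman numerals cover 1..3999")
    thousands, rest = divmod(n, 1000)
    # Each thousand is one M; the remaining digits use the lookup table.
    return thousands + sum(DIGIT_LENGTH[int(d)] for d in str(rest))


def max_consecutive(width: int) -> int:
    """Largest N such that every number 1..N fits in `width` characters."""
    n = 0
    while n < 3999 and roman_length(n + 1) <= width:
        n += 1
    return n


limits = {width: max_consecutive(width) for width in range(6, 16)}
for width, limit in limits.items():
    print(width, limit)
```

This gives 37 for a 6-character field and agrees with the rest of the list up to 3999 at 15 characters.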
@@thethiefmaster Storage might not be the only thing at play here. I assume they also print things involving the dogs name, which means that printing a name with a roman numeral greater than 6 digits *might* throw off all their templates, which they'd then also have to fix in addition to changing it in the database. tbh that's a somewhat weak chain of inference. *shrug*
It probably didn't use a database, they just allocated bytes in a binary file via C... depending on what year it was done. To read bytes from a file in C you have to more or less know what size to read and what size variable to put it in. You have to know how the variables are saved in the file. On a 16 bit system an int would be 16 bits, a short int would also be 16 bits, and a char would be 8 bits or one byte. (If I remember correctly, lol) A C string uses 8 bit chars: 7 bits for ASCII plus 1 bit for extended ASCII. A C string (array of chars, basically) terminates with the null terminator, so 6 characters written in ASCII would take (6+1) bytes. There was no Unicode or UTF-8 yet. To convert, you'd need to write something to read and then rewrite in the new format. Been a while since I've read a file manually like this, but that is the gist of it. They also might have written it in ASCII files instead of binary, but anyway, there could be many valid reasons. Back then memory conservation was pretty important, especially for millions of any sort of records held in memory with limited RAM. If you don't hold as much in RAM, then you have to access slow disks (which were a lot slower then) more often. Having said all that, if it wasn't written in C, it might have been written in something else with other limitations :P
...Or the best option: use the data bits to represent an integer in binary, then convert it and display it in Roman numerals. At best you could have 6 bytes (48 bits), or 281,474,976,710,656 dogs
It goes to show how terrible people are at data storage (and how many horror stories about it there are) that when you mentioned 38, I was able to immediately guess at the "max 6 characters" thing. That type of error happens way too often lol.
@Eyeguy640 : 1976 Olympics Gymnastics in Montreal. Programmers in charge of recording scores asked gymnastic officials, how many digits needed to record/display scores? They said, no one will score perfect 10 so 9.99. Then Nadia Comăneci, age 14, scored perfect 10.00. Long delay before score was displayed. It was displayed as 1.00. I was there.
The way I always remembered the slightly positional rule was that V is 5 IV is 1 before 5 = 4 and VI is 1 after 5 = 6. They could always just use normal letters instead of roman numerals. For standard uppercase only letters that'd give them 26!/20! (165,765,600) slots within the 6 characters.
26!/20! is the number of strings of 6 *distinct* characters, using a 26-character alphabet. If you're allowed to repeat characters, you can go up to 26^6 strings
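The two counts in this exchange can be checked directly. A tiny sketch using the standard library (`math.perm` needs Python 3.8 or later); the variable names are just for illustration.

```python
import math

# Strings of 6 distinct uppercase letters: 26!/20!
distinct = math.perm(26, 6)

# Strings of 6 uppercase letters with repeats allowed: 26^6
with_repeats = 26 ** 6

print(distinct, with_repeats)
```

That is 165,765,600 without repeats against 308,915,776 with repeats, so allowing repeated letters nearly doubles the number of slots.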
I know I'm a year late but at the 8:45 minute mark, by how you explained how roman numerals work. wouldn't IIX be 1+(-1)+10? meaning the 1's cancel each other out. Since only 1 I has a bigger numeral to the right of it.
From an engineer's perspective, replace the 6-character text field with a 32-bit unsigned integer. When printing the number, convert it to Roman numerals on the fly. 4 billion numbers with only 4 bytes instead of 6. But of course as Matt says, some numbers can't be represented easily in Roman numerals.
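A minimal sketch of that idea, not a claim about how the AKC system works: the suffix is stored as a 4-byte unsigned integer and only rendered as a Roman numeral for display. The function names, the little-endian layout, and the fallback to Arabic digits above 3999 are all assumptions made for illustration.

```python
import struct

ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int) -> str:
    """Standard subtractive Roman numeral for 1..3999."""
    if not 1 <= n <= 3999:
        raise ValueError("standard Roman numerals cover 1..3999")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def display_suffix(n: int) -> str:
    """Roman numeral where one exists, plain digits otherwise (an assumed policy)."""
    return to_roman(n) if 1 <= n <= 3999 else str(n)


def pack_suffix(n: int) -> bytes:
    """Store the suffix as a 32-bit unsigned integer (4 bytes, little-endian)."""
    return struct.pack("<I", n)


def unpack_suffix(raw: bytes) -> int:
    return struct.unpack("<I", raw)[0]


stored = pack_suffix(38)
print(len(stored), display_suffix(unpack_suffix(stored)))
```

The 38th dog takes 4 bytes on disk and still prints as XXXVIII; anything past 3999 would need some display rule of its own.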
And that's why you're just an engineer, and not a celebrity/politician/CEO and an elite member of a Kennel Club. Doing things an easy, simple, logical way? Plebs do that!
i'm pretty certain that the institution is older than computers, so they are probably backed by some legacy mechanical system that is causing this issue
@@eduardopupucon considering all living dogs in their database are younger than computers I see no reason why they can't archive old records and transfer to a new system for new records. If an organization of that size doesn't use a digital database to create, read and update records they are wasting money on human labor.
I wonder if their database was first computerized on a 6 bits/byte system. It would make sense why they limited things to 36-character names and 6-character Roman numerals. (EDIT: Such systems existed in the 1950s-70s, before 8 bits/byte was fully standardized -- largely in the '60s by IBM. With 8 bits/byte, the limits probably would've been 32-character names and 8-character Roman numerals.)
Slight correction: Your suggestion to rework the numeral system at 7:40 would actually be wrong. The reason you only ever subtract single numerals is because otherwise it leads to a lot of confusion once you get bigger numbers. *Especially* in the context of computers, you'd get some strange results. For instance, let's take 2008 in numerals: MMVIII. This is super simple to deal with, because we know that M is bigger than V, and V is bigger than I, so you just add everything together. However, if we instead did your suggestion and made it MMIIX, it would actually end up being 2010, because you would add M+M+I, then I-X. Now you could develop a system to account for this, but it would be pretty cumbersome, and more importantly you'd introduce a number of different numerals that mean exactly the same thing, which is really bad for any number system to have. XXMM would equal MCMLXXX (1980), CCL would equal LCCD (250), and even your own example (VIII = IIX) would lead to inconsistency and confusion. There's a good reason no Romans ever used IIX.
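The claim that a conventional parser reads MMIIX as 2010 can be checked with the usual "subtract a symbol if the next one is larger" rule. A short sketch of that textbook rule; `parse_standard` is a made-up name and the function does no validation.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def parse_standard(numeral: str) -> int:
    """Symbol-by-symbol rule: subtract a symbol only if the very next one is larger."""
    total = 0
    for i, ch in enumerate(numeral):
        value = VALUES[ch]
        next_value = VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < next_value else value
    return total


for numeral in ("MMVIII", "MMIIX", "IIX", "XIV"):
    print(numeral, parse_standard(numeral))
```

Under this rule MMVIII is 2008 while MMIIX comes out as 2010, and IIX on its own comes out as 10, because only the I directly in front of the X is subtracted.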
"The Parker Way. That's not gonna catch on, but I gonna give it a go." Matt, this is the Internet. Have you forgotten the Parker Square? If you say "it won't catch on", you're basically challenging the internet to make it a thing that will catch on no matter what.
"One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs." -- Robert Firth
@@spencerwhite3400 but then you run into the problem of only having 3999 possible numbers, which makes it somewhat challenging to add 65,535 to anything
Alan Perlis disagreed. Epigram 111:
Why did the Roman Empire collapse?
What is the Latin for office automation?
@@Curt_Sampson officium automation?
@@trejkaz My guess would be _automaton officiī_ or _automatum officiī,_ based on what little I remember about how to decline Latin.
(Yours looks like what Google Translate spat out for me. That cannot be trusted for things like this since it knows nothing about Latin grammar, AFAIK, it just pattern matches against a database of translations from other sources. The Google Translate output appears to me to be grammatically incorrect.)
The Romans assumed that the amount of territory controlled by the Roman Empire would always be positive.
I, for 1, like Roman numerals.
😂😂😂
Good one!
I like that a lot
That makes the II of us.
good one mate
"Assuming all atoms are the same breed of dog" - I wish more of my physics exams started with that assumption.
Fun fact: You can name more than 37 dogs “;drop table *” without the database being full. (no legal advice)
Bobby Tables at home.
Exploits of a dog owner...
someone explain to me what that does
@@pomelo9518 This would be what's known as an SQL injection (simply put, SQL databases are the most common type of database; they've been around for ever and stood the test of time). Basically, if the input into an SQL database like MySQL is not filtered properly, it can quickly lead to it being interpreted as code, which happens all the time. `DROP TABLE` is the SQL command to delete a database table, the asterisk `*` is a so-called wildcard character (special characters to help you select stuff). An asterisk means "insert everything here", or in other words: Delete all database entries.
But that uses special characters not on the approved list of special characters! Though interestingly, if you don't need a semicolon to terminate the statement, some SQL injection could theoretically be performed with just the space, apostrophe, and hyphen that ARE allowed... Though again, considering that they are allowed in the first place, they almost certainly have input sanitization or else unexpected string termination from the apostrophe would be the FIRST thing they'd discover.
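For anyone curious what "filtered properly" looks like in practice, the usual approach is to pass user input as a bound parameter, not to paste it into the SQL text. A minimal sketch using Python's built-in `sqlite3` module; the `dogs` table and its columns are invented for illustration. As far as I know `DROP TABLE` takes a table name, not `*`, so the name in the joke would not run as written anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, invented for this example.
conn.execute("CREATE TABLE dogs (name TEXT NOT NULL, suffix INTEGER NOT NULL)")

# A name that tries to close the string and smuggle in a second statement.
hostile_name = "Spot'; DROP TABLE dogs; --"

# The ? placeholders send the values separately from the SQL text,
# so the name is stored as data and never interpreted as code.
conn.execute("INSERT INTO dogs (name, suffix) VALUES (?, ?)", (hostile_name, 1))
conn.commit()

stored = conn.execute("SELECT name, suffix FROM dogs").fetchall()
print(stored)
```

The table survives and the odd name is simply stored as text, apostrophe and all.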
"The American Kennel Club used to keep track of dogs with the same name by using Roman numerals. They still do, but they used to, too." -Matt Hedberg
You win 😁
Yea, within context, that made sense, but quoted like this, it's hilarious. Definitely can be better phrased.
6:36 I find this vaguely erotic
The only thing I know about the American Kennel club is that they spent decades creating genetic abominations like the Pug.
Pedigree is another word for horribly inbred
Who's Matt Hedberg?
As a database professional, this pains me a great deal. There is no reason (other than a lack of will) that they could not extend that character limit, and absolutely no reason why they should not.
and like... why roman numerals? if the system would somehow break with more characters you could still have 100,000 different dogs with the same name using Arabic numerals. seems like more than enough
Turns out you are not the only one who finds this frustrating. I know someone who made a 15 minute video about it involving hours if not days of work.
As a professional programmer whose career started on punch cards in the 80's, this doesn't surprise me a bit!
I think it should be a crime to store numeric data in a database as anything other than.... a number. If you want the name to have a Roman numeral appended, just print it that way on the client. Makes LITERALLY EVERYTHING easier for EVERYONE!
My Dad once told me that in their company's product, they had a two-character field for the year (they started sometime in the early 90s). When the 00's came around, since it was infeasible at the time to upgrade the database format across all clients, they instead went 98 -> 99 -> A0 -> A1.
I believe it's really fixed by now though.
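The 98 -> 99 -> A0 -> A1 trick reads like a two-character field whose first character is allowed to run past 9 into the letters. A sketch of one way that could work; the base year of 1900, the digit set, and the function names are all guesses for illustration, since the comment does not describe the actual product.

```python
# First character: 0-9 then A-Z, counting decades since the base year (assumed).
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE_YEAR = 1900  # assumed epoch for the two-character field


def encode_year(year: int) -> str:
    """Pack a year into two characters: extended decade digit plus a normal last digit."""
    tens, ones = divmod(year - BASE_YEAR, 10)
    if not 0 <= tens < len(DIGITS):
        raise ValueError("year does not fit in two characters")
    return DIGITS[tens] + str(ones)


def decode_year(code: str) -> int:
    return BASE_YEAR + DIGITS.index(code[0]) * 10 + int(code[1])


for year in (1998, 1999, 2000, 2001):
    print(year, encode_year(year))
```

Under these assumptions the same two characters last until 2259 ("Z9"), at the cost of every reader of the field needing to know about the letters.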
This is why it's important to pick the correct data type when modelling a database
It's also important to pick the correct database when registering every atom in the known universe. I heard Mongo is webscale???
There is a distinct and shameful lack of Roman Numeral as a data type when creating databases.
@@kevinpaynter Nah... before starting a project i roll some dice. It's 19.
Well i guess i'll implement a base19 numbering system and only use that.
That's why the fibonacci script is already 2gb in size.
VARCHAR, done.
@@KenLou VARCHAR(6). FTFY.
As soon as the numbers came up in the allowed character list, I started imagining dogs named like vanity plates. You just have to name your 38th dog Skyl4b and you're good to go! After a number of generations with similarly constructed names, we'll eventually get around to naming them 5ky148.
Someone's going to register their dog's name as just fifty zeroes "I wanted my dog to always appear at the beginning of any list" or would that be 50 A's?
@@Treblaine Big discord servers moment
5ky148 sounds like something Elon Musk would name his dog.
@@Treblaine I guess single zero comes before fifty zeroes
@@DajesOfficial Hmm, but would the person stupid enough to try that even know that?
Why has Skylab been absent from your previous videos? The road to two million is paved with dog treats.
By the way the dog has appeared in three of the last four videos Matt's put out, I'm guessing A) the dog is new, and 2) Matt knows.
Skylab the Dog has his own channel: ua-cam.com/users/SkylabtheDog
@@jaapsch2 That is excellent, thank you.
"They're registered by breed, and therefore antimatter is a different list" is not a conclusion I would've expected going into this video.
Want to break your head? Atoms aren't fundamental units. One atom should hold a few names for its quarks, leptons, electrons, etc.
Writing 3999 as MMMIM instead of MMMCMXCIX, as it should, made me shiver.
But then you went ahead and proposed writing 3 as IIV. That's the most cursed thing I've ever seen in my life.
IL, IC and IM are perfectly legit Roman representations of 49, 99 and 999. This has come up in math class back when I was in school. The computing teacher was called in to arbitrate and he decided the argument in favor of me, that is, that they were acceptable Roman numbers.
Usually, the AKC registered name isn't a name. It's a manufacturer serial number. It usually includes the name of the kennel the dog is from, a code indicating the litter it is from, a name that often has to fit some theme set by the kennel, and sometimes other weirdness. A quick Google gives me a picture of a dog named "Ableaim Patent Pending MC"
I love the leap from “you could name every atom in the observable universe” to “everything can be dogs!”
And antimatter is a different breed
@@FrederickGrumieaux might need more names for all the bowsons
Elden Ring was right!
There's also the 4th and possibly most boring/"intellectually challenging" option depending on who you might ask: migrate the database to a more modern standard c;
I don't think they wanna spend the money moving all the punch cards around.
@@jarrod752 they would have to individually re-tape half of the punched holes!
do you wanna put Matt out of a job?
Or the even more boring option, switch to using Arabic numerals in the same field
@@jeffsergeant Sounds easy, but try changing primary key values with existing foreign key constraints in place. That's a nightmare.
The rule about "no more than three in a row" and, indeed, the use of negatives is a partially modern invention. The Romans did regularly have to deal with numbers over 3999 (the classic legion was larger than that, never mind the census data), and even older modern clocks have digits like VIIII instead of IX. There's no reason we can't update the rules further.
also.. 4 was often written as IIII
I once tried to search why "4" is written differently on clocks sometimes and couldn't find any "logic" to it. It's not tied to a location or a period of time, clocks with IIII were made even in 1960s, 1970s. It's a somewhat rare feature and looks like it was simply a choice of the maker.
@@vsm1456 I heard it was because it makes the clock divide nicely into 3 equal parts (the I part, the V part and the X part)
@@vsm1456 there is a "theory" that IV may have been an old abbreviation for Jupiter during that time..
and so because having a clock read essentially 1 2 3 GOD 5 6 would have been "interesting" it was changed to IIII, and it just carried over
once again it's just a theory with no proof from what i could find
@@weberman173 Another reason people mention is, "IIII" helps visually balance out "VIII" that's on the other half of the clock.
The reality is, no dog registered with the kennel club is named “Spot”. They’ve all got names like Harry Potter secondary characters, although “Excelsior Ludlum Spottiswoode IV” might get called “Spot” at home
Hmmm this comment is sussily similar to the one above it
@@wintaaaaa Often there are spam accounts that copy real top comments that were made early on. You can recognize the spam ones from the profile picture, name or at the very latest the channel description. Links are bad, they want to steal your money :D
@@vez3834 it's not a bot, it's from 2011
@@theEWDSDS That's fair, I wouldn't call that the ultimate tell either though. But in this particular case I agree that they probably aren't. I just thought it was valuable enough to educate people on it so they don't fall for it.
My boyfriend had a show dog which was named something like “Whittier’s A Midsummer Night’s Dream” for show, but had a completely different name at home. Then they get something like “ex. something or other.”
Going to guess the database was using Hollerith cards, like those that used to be used for census data. 80 columns of 10 digits, so likely they allocated 2 rows to store a representation of the Roman numerals, using 3 bits, and then 3 sets of numbers, and a check bit, over the 2 rows. Leaves 78 columns for use, with likely one taken for breed (512 breeds should be enough to handle any likely future use), leaving 77 for name of dog and a reference field to another card, containing address.
Used machines that at the time were common, easy to get hold of as used and working, and also had the whole "computerised" feel to it. Easy to do lookup per field as to what would be allowed, and what translation to use, when sending to output as well.
So... is BZA like RZA or GZA? Are you part of the Wu-Tang Clan?!
Or are you from the Board of Zoning Appeals?
Either way, we need to talk about some stuff!
Nice theory but the problem is that the letters of the alphabet all fit in one column. The trick is that they use two holes from different rows in the same column to do so. One hole is in the top three rows and the other in the bottom 9. See the Wikipedia article titled Punch card. I think that Matt's theory that only 6 bytes were allocated is most likely the correct one. They used to allow only 8 bytes for names on computer graded standardized tests back in the 60s. It was always alarming to me that I could not put my entire last name on the answer sheet.
XXIXIX
@@ianflemings4989 XXXIIX?
Matt: "And that's where most videos would end: here's a crazy fact, here's the reason behind it, job done. But not here! The Stand-up Maths policy is to try and fix things! And I reckon I've got some solutions so Americans can have more dogs with the same name."
Me: "Arabic numerals. With six digits, you can count up to 999 999."
Option 3: if you were to start at 39 instead of at 1, you can have two more dogs with the same name.
I like this channel so much.
Even better! Instead of six characters which would take 6 bytes of storage, you could use a 32 bit unsigned int! You could save 2 bytes storage space per dog, but you could still count to an absurd number of 4 294 967 295 dogs per name!
@@kacperfilipek8461 And you could have a function translate that to a roman numeral for display, well at least the lower values that can be represented by roman numeral.
@@kacperfilipek8461 we need more though. 4 billion I don’t think is enough
@@kacperfilipek8461 Why waste those 2 bytes? That's the kind of thinking that gave us Y2K! Use them all for 281,474,976,710,656 dogs.
At that point, just use the regular base 10 number system
"if you allow multiple Ms, which some people do" - some people would include the Romans. Although III and IV were the most common representations for 3 and 4, you can also find IIV AND IIII.
Yeah, Roman numerals are way more versatile than people realize. Additive numbers (e.g. IIII for 4) really were the most common variant until the subtractive IV became more popular in Medieval times. None of these are wrong as long as the standard is consistent.
You know what the Romans didn’t do though? Use IL for 49 (and all subsequent derivatives of that). That’s the only Roman numeral trick that some people use that is objectively incorrect.
@@HipsterShiningArmor CXCIIX = 198
How we began: In America, only 37 dogs of the same breed can have the same name.
How we ended: We could potentially register every observable atom in the universe in the Kennel Club, even if they were the same breed of atom.
somebody ask the k-club how many Fidos?
Obviously they're the same breed. It's atom breed
@@Данилтычкрейзи is that the hiss and tones i hear when nothing else? Sound of atom breed?
That escalated quickly.
90% of the atoms in the universe are the same breed, Hydrogen. The other 10% is Helium, and there are some rare elements other than hydrogen and helium
Option 4: The American Kennel Club stops acting like an elitist clique and just uses a sensible database based on unique ID numbers....
A kennel club ceasing to be elitist? Boy do I have news for you...
Good sir, they are not _acting_ like an elitist clique :-D
They use just numbers. I would say that it is elitist to claim that everyone should use the same notation convention as you.
Option 5: Start a movement for everyone to register their dogs' names using only the letters I, V and X in the hope of breaking their system somewhere down the line.
@@johanlugthart7782 unless one notation is objectively superior to the other which is the case here
I love how excited Matt is about Option III but never even mentions how it only allows II more dogs.
If you're speaking in the third person you'll need to pluralize "love" there.
@@charliedobbie8916 I loves????
@@Johnny-tw5pr (One loves)
... per breed and name!
@@charliedobbie8916 but he says "I love"
My database teacher would be mad for 5 different reasons if I handled things like they did.
Matt writing 3 as IIV is the most characteristically unnecessary change and I'm a fan of it
It's funny because it kind of makes the character longer. The number of characters is the same, but the actual width is different (III vs IIV, 13:21 pixel ratio)
IIV also uses more lines than III
If you allow more than 1 negative digits some numbers can be interpreted in several ways. For example is IVX=10-(5-1)=6 or (10-5-1)=4?
@Johnny Rep I was about to comment about 18!! I’m so glad someone put that that is the way it works for numbers bigger than ten when using wording in Latin.
For me it was MMMIM instead of MMMCMXCIX
I can't believe I was right in the middle of a deep dive on Roman history, Roman mathematics and the Roman calendar, and look who drops into my feed! And Skylab too! 🐕
youtube version of 6 degrees of separations gonna have fun with this vid lol
Same. I’m a Latin teacher and I’ve been teaching about numerals and abacuses this week!!!
Regarding 8 as IIX: the 18th legion used XIIX as their label, which precisely follows the rule you mention. Yes, it was unusual, but XIIX is nicely symmetric and that was probably the reason why it was chosen. Also of note is that it not only reads the same forward and backward but also upside down, which makes it very nice for a label that may appear in many directions during a battle depending on the angle of the banner with the ground.
Veritasium just posted a video about number 37... this is so unbelievable...
how am I supposed to live with that ???
Fun fact: there is actually precedent for usage of option II in Roman times; the Wikipedia article on Roman numerals specifically mentions IIX for 8 and variants as "Irregular subtractive notation" (not necessarily consistently - it shows both IIXX and XIIX for 18, for example).
The same way we have sometimes the irregular IIII = 4.
If you can *say* "duodeviginti" in Latin, you should be able to write IIXX too.
So basically the Romans are doing exactly what Matt explained how he conceptualised Roman numerals ... if smaller digits come before larger digits, the smaller digits are negatives.
Smart people, those Romans were.
Wouldn’t IIXX be a really long way of writing 2?
@@justinwatson1510 you have to parse it as "II before XX," not "IIX before X." It's more complex than just left-to-right, otherwise XIV would be 11+5=16 instead of the 10+4=14 it actually is.
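The "II before XX" reading can be written down as a rule: group identical symbols into runs, and subtract a whole run when the next run is made of a larger symbol. A sketch of that one possible interpretation, not an established standard; `parse_grouped` is a made-up name and there is no validation of the input.

```python
from itertools import groupby

VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def parse_grouped(numeral: str) -> int:
    """Subtract an entire run of equal symbols if the following run is larger."""
    # Collapse the numeral into (symbol value, run length) pairs, e.g. IIXX -> [(1, 2), (10, 2)].
    runs = [(VALUES[symbol], len(list(group))) for symbol, group in groupby(numeral)]
    total = 0
    for i, (value, count) in enumerate(runs):
        next_value = runs[i + 1][0] if i + 1 < len(runs) else 0
        sign = -1 if value < next_value else 1
        total += sign * value * count
    return total


for numeral in ("IIXX", "XIIX", "XIV", "IIX", "MMIIX", "IIV", "MMMIM"):
    print(numeral, parse_grouped(numeral))
```

With this rule IIXX and XIIX are both 18, XIV is still 14, and MMIIX is 2008. It does settle cases like IVX one particular way (as 4), which is a choice of the rule, not something forced by the notation.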
Even binary, the least dense numeric base, has more room with six digits than roman numerals
It also has zero, so you can make use of all 64 rather than just 63
great point. Roman numerals are really awful. But with so many things, we improve over time.
Unary is less dense. You run out at exactly 111111.
@@doctorwhouse3881 Lmao
@@doctorwhouse3881 Base 1/2 is even less dense.
although in terms of binary data, binary itself is much more space efficient. standard ASCII characters are 1 byte each, which is 8 bits, so the total amount of space used by 6 roman numeral characters is 6*8=48 bits. with 48 bits, you could store a 48-bit aka 6-byte integer (uint48) that would allow for 2^48=281474976710656 different unique identifiers for each dog name. then, if you really want to, just convert it to roman numerals once you're ready to display it somewhere
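Reusing the same 6 bytes for a 48-bit integer is straightforward to sketch. The big-endian layout and the helper names `pack_id` and `unpack_id` are assumptions for illustration.

```python
FIELD_BYTES = 6                       # same size as the 6-character field
MAX_ID = 2 ** (8 * FIELD_BYTES) - 1   # largest value that fits in 48 bits


def pack_id(n: int) -> bytes:
    """Store n in exactly 6 bytes, big-endian (layout is an assumption)."""
    if not 0 <= n <= MAX_ID:
        raise ValueError("value does not fit in 48 bits")
    return n.to_bytes(FIELD_BYTES, "big")


def unpack_id(raw: bytes) -> int:
    return int.from_bytes(raw, "big")


print(MAX_ID + 1, pack_id(38), unpack_id(pack_id(38)))
```

The count of distinct values is 281,474,976,710,656, matching the figure in the comment, and the 38th dog fits in the same space that could not hold XXXVIII.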
This might be less of a common issue than it might seem at first. AKC breeding dogs have 2 names, their "common" name ("Skylab"), but then they have these crazy gelical names like "Parker's Dazzling Stellar Cosmodog". So the AKC registered name tends to be very unique anyways.
I thought only cats were jellicle.
Most breed registries do this. Often the registered name has the breeder's farm buried somewhere in the name, so as the animal moves around (by sale or in lineage) one can instantly see the original breeder.
This real life solution is basically a practical implementation of the final option Matt presented. There's just so many unique options that it's almost inconceivable to need 37 iterations of the same text.
The fact that you need a second crazy "AKC certified" name IS the issue. If this rule didn't exist the dog could have one common name.
@@nolin132 Lots of competitive animal societies have the distinction in order to be unique when competing. Plenty of racing horses with a common name of something like 'Spot' or 'Blackie' have an official name more like 'Guyana Star Dweej' or 'It'sallinthechase'.
I'm pretty sure that the ultimate reason for this limitation is that the original system was likely coded in COBOL. COBOL has this thing for building data records using ASCII characters with a fixed width for each field, with a newline after each record. This makes reading through a file with a buttload of records in it quite easy since every record is exactly the same number of bytes, and finding a particular value within a record is easy for COBOL because it's always in exactly the same spot. This is what made/makes COBOL so good for this kind of record storage. Today, we'd use some kind of RDBMS but back when computers were new(-ish), such things didn't exist yet, and flat files written to tapes were the thing to use, so COBOL and the COBOL way of storing data made a ton of sense.
In particular, knowing the size of each record in advance let you scroll through a tape very quickly and easily. This is important with tape since rewinding a tape is very slow, so overshooting the mark was very bad, and you couldn't just read the beginning and end of a file at the same time with a tape (which is needed for many modern storage methods, like ZIP files and RDBMSs), you had to read the whole thing from beginning to end. You also had to write the files from beginning to end without going backward, and it was unlikely that you'd be able to store the whole thing in memory at once, so you couldn't just make a table of contents for the beginning and then write out the rest of the file. You couldn't have generated the table of contents until after the whole thing was written so it would have to be written at the end, but you need the table of contents to navigate, so it needs to be at the beginning, but you can't do that, and OMG, it's not going to work.
That's why you didn't usually put ZIP files onto tapes in the early days of ZIP files; their table of contents is at the end of the file. Tar files put it at the beginning, but at the cost of having to have the files already compiled onto a disk of some sort as an intermediate. The individual files could contain whatever tables of contents were necessary for navigation and because you were using a disk, you could write them to the beginning of the file after you'd already written the rest of the file's contents to disk. But you can't do that when you're writing directly to a tape with no disk storage intermediate. COBOL and its fixed width record-based files was perfect for the era when you had tapes but no disks.
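For the curious, the fixed-width idea can be sketched in a few lines of Python. This is only an illustration of the technique: the field widths and names (`NAME_WIDTH`, `SUFFIX_WIDTH`, `make_record`, `read_record`) are invented, and nothing here is the AKC's real layout.

```python
import io

NAME_WIDTH = 10    # hypothetical field widths, not a real record layout
SUFFIX_WIDTH = 6
RECORD_SIZE = NAME_WIDTH + SUFFIX_WIDTH + 1  # +1 for the trailing newline


def make_record(name: str, suffix: str) -> bytes:
    """Pad each field with spaces so every record has the same length."""
    if len(name) > NAME_WIDTH or len(suffix) > SUFFIX_WIDTH:
        raise ValueError("field too long for its fixed width")
    line = name.ljust(NAME_WIDTH) + suffix.ljust(SUFFIX_WIDTH) + "\n"
    return line.encode("ascii")


def read_record(f, index: int) -> tuple[str, str]:
    """Jump straight to record `index`; the offset is just index * RECORD_SIZE."""
    f.seek(index * RECORD_SIZE)
    raw = f.read(RECORD_SIZE).decode("ascii")
    name = raw[:NAME_WIDTH].rstrip()
    suffix = raw[NAME_WIDTH:NAME_WIDTH + SUFFIX_WIDTH].rstrip()
    return name, suffix


flat_file = io.BytesIO(
    make_record("SKYLAB", "I")
    + make_record("SPOT", "XXXVII")
    + make_record("ROVER", "IV")
)
print(read_record(flat_file, 1))  # ('SPOT', 'XXXVII')
```

With a layout like this, a seven-character suffix such as XXXVIII simply has nowhere to go, which would fit the guess above about where the limit comes from.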
First thought was "what if they allowed an additional digit? How far can you get before hitting 8 numerals?"
I was quite pleased to realise that 88 is actually the first Roman numeral of length 8 😁
I expected Matt to talk about the furthest you can get with a length of numerals. 8 characters would allow numbers up to 187, and 9 characters would get to 287. The longest possible Roman numeral is MMMDCCCLXXXVIII, or 3888.
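These figures are easy to check by brute force. A small Python sketch, assuming standard subtractive notation for 1 to 3999 (the helper names `to_roman` and `last_before_overflow` are mine):

```python
def to_roman(n: int) -> str:
    """Standard subtractive Roman numerals for 1..3999."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


def last_before_overflow(width: int) -> int:
    """Largest N such that every numeral from 1 to N fits in `width` characters."""
    n = 1
    while n <= 3999 and len(to_roman(n)) <= width:
        n += 1
    return n - 1


for width in (6, 7, 8, 9):
    print(width, last_before_overflow(width))
# 6 37
# 7 87
# 8 187
# 9 287

longest = max(range(1, 4000), key=lambda n: len(to_roman(n)))
print(longest, to_roman(longest))  # 3888 MMMDCCCLXXXVIII
```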
Repeat after me: “There's no eighter eight than eighty-eight!” 😁
@@vigilantcosmicpenguin8721 Not true, you can use a vinculum (a bar above the number) to multiply it by 1000, allowing for waayyy more than 3888
@@Perseagatuna They weren't talking about the size of the number, they were talking about the length of the name of the number.
LXXXVIII MILES PER HOUR!!!!!!!!!!
Note on option 1: there was no actual rule "only three of each in a row" to the people for whom "roman numerals" were just "numerals". Even back when Julius Caesar still had things to say, IIII instead of IV could be found in official records.
I think that was just an unspoken convention respected by most so there's no II/IIIIIIIIX (two) way to write a number.
I was listening to a podcast about the disappearance of the ninth legion and they said that it looked like some detachments wrote the legion's number as IX and others wrote it as VIIII.
So as the inscriptions from the Rhine all write the number in the less common way, that probably means that it was only a single detachment that got sent there and not the whole legion.
There was probably some guy who wrote numbers in Is just to annoy his workmates.
@@connordarvall8482 Called Parker ?
One of Caesar's legions was Legio XIIII, rendered that way.
Fun fact, I used to have a dog breeder as a client, and would see the bills for the name registrations. Most of the time the dog's registered name would be [Breeder name] [Actual Name]. Now, this was in Canada, so it was the Canadian Kennel Club, so their naming rules may be different. So if the breeder's business was called 'Starwood Country Kennel' they would register their dogs as Starwood Rover, or Starwood Spot. That way you never have to actually worry about having dogs with the same name.
matt did mention at the end that they have like 36 characters excluding the roman numeral suffix, which also allowed all 10 arabic numeral digits. so they could simply bodge an arabic numeral into the registered name anyway. if they wanted to preserve the breeder prefix, they could probably also devise some sorta 3-char acronym convention so they saved more space for the actual name.
in short: there are literally innumerable ways to work around a deprecated and badly designed database, instead of just consigning themselves to its restrictions.
We're in Australia, where owning a greyhound comes with mandatory registration, and our last name is used as his last name.
What do you do if you want to own 101 Dalmatians named "Spot"?
@@LiviuGelea Probably get asked if you had your name changed to Cruella
Horses can only be registered with a unique name, so in that same vein, you end up with ridiculous names like "JKB Frosty Cloud Dancer" but the horse is named "Dance".
My solution:
IF one adds a space after a name, that doubles the number of names you can have. Therefore our number of names increases for every character not used under the limit.
Such a system would probably annoy them greatly, however, so they would likely modernize the system.
And you could add more than one space so Spot______VI and Spot____VI would both be valid and different dog names.
Edit: consider the names Skylab and Sky Lab. They are different names, so why should it make a difference whether the space is at the end or in the middle?
"what's your dog's name?"
"Spot____34, or was it Spot_____34? Shoot."
Had to scroll to find this comment before I posted it… “Spot” is a good boy, so is “Spot “. “ Spot” however is a little rascal.
screw them, they're roman.
"No one can argue with that!.... people will argue with that :("
It's really nice to see how self aware you are :'D great video
Ow, you beat me to it, drats...
I was saying; on that all Karens in the world go like "I want to.."
( and then internet explodes )
"They have so overpowered the rest of the name that there are enough names for everything!" Oh, Matt, how could you apply combinatorics to the characters in the names but not to the atoms you're trying to name? Ok, so we name every atom, but we can also name, e.g., "that galaxy", and presumably things like people and, well, dogs. But what if I want a name for me and my dog? What if I want a name for everything in my galaxy except my dog? At the end of the day, we need names for every combination of atoms, and that's going to need rather a lot of names.
Loved the video.
And then you have to register each atom's name, and record that in a storage medium, which would take multiple atoms. We need a hyper or metaverse!
Even more names if you name groups of names.
...and then we break down the atoms into particles. Uh-oh, we are going to need more characters.
Power set of names....
Every single fundamental particle gets a unique ID. To describe groups of particles, like protons, neutrons, atoms, molecules, dogs, planets, etc, you set the bits whose places correspond to the IDs of the particles that compose the group and that binary number can uniquely describe every possible group of particles you can name. (Seriously, binary numbers are really good at indexing power sets.)
It would also imply that the universe itself can be described with the name of -1, because two's complement.
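A toy version of the bitmask idea in Python, assuming a universe of just four particles. The particle names and the helpers `group_id` and `as_signed` are made up for the example.

```python
particles = ["up_1", "up_2", "down_1", "electron_1"]  # toy universe, invented names
bit_of = {name: i for i, name in enumerate(particles)}  # particle ID = bit position


def group_id(members) -> int:
    """Name a group by setting one bit per member particle."""
    mask = 0
    for name in members:
        mask |= 1 << bit_of[name]
    return mask


def as_signed(mask: int, width: int) -> int:
    """Reinterpret a width-bit pattern as a two's-complement signed integer."""
    if mask & (1 << (width - 1)):
        return mask - (1 << width)
    return mask


proton = group_id(["up_1", "up_2", "down_1"])
everything = group_id(particles)

print(bin(proton))                            # 0b111
print(as_signed(everything, len(particles)))  # -1
print(2 ** len(particles))                    # 16
```

Every subset of the four particles gets its own number from 0 to 15, and the all-ones pattern for "the whole universe" reads as -1 when treated as a signed value of that width.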
After some Googling, it seems like the Romans actually had signs for the 1000s, like ↁ (5000) and ↂ (10000). But they had multiple ways of indicating 1000s. An alternative notation, and one that is easier to grasp, is the vinculum notation, where V̅ would be 5000. I'm not sure whether you could actually say that Roman numerals don't have the option to go above 3999. They could, but because there were multiple methods of handling that, we don't seem to teach them nowadays.
This is the comment I was looking for. I mean, *I* learned about the vinculum in school, so why on Earth didn't a professional mathematician do so? Seems Matt got a Parker math education.
He mentioned the vinculum in this video. Presumably the AKC doesn't (or didn't) have the ability to type a vinculum when they created the database (or didn't know about them at all).
This is SO cool!!! I wish I was taught this in school, but it makes sense that I wasn't. The only reason kids are taught Roman numerals these days is to understand places where they're traditionally used, and those symbols have fallen out of common usage.
Google translate does that - 5,000 is (V), all the way up to 4E6 - 1
Afaik the romans in daily practice actually did put 4 of the same characters after another. So instead of 9 = IX (as we would think), they wrote it VIIII. Same with the 4 having been written IIII instead of IV etc
Another option is to start the list at -27 (-XXVII) and go up to 37, which gives a run of 65 different choices
you'd need one character for the minus sign though
Romans didn't have a symbol for zero (only a word) so you'd have to allow the modern 0 or something if you wanted to use zero.
@Henry 1 Nice, but in traditional roman numerals it's always singles before another symbol - 1s, 10s, 100s (I, X, or C)
@@thethiefmaster or just omit the symbol altogether. Skylab 0 would simply be Skylab
@@Macieks300 Which I’ve accounted for, -XXVII is six digits, -XXVIII (-28) is seven
Given how dogs are named in the AKC, it's unlikely to see 38 of the same name anyway. The majority of dogs' official AKC names include the kennel names where the dog was bred, and potentially the kennel name of the borrowed sire.
I was here for the invention of Parker Numerals
Parker the llXth.
11:36 "I have become very focused on the Roman Numeral aspect -- don't get me wrong, had a great time." Having a great time with mathematics and sharing it with the world is what makes this channel awesome. :)
Some dogs are made of multiple atoms [citation needed] so we really need names long enough to handle the power set of all atoms in the observable universe.
Only if the same atom can be in multiple dogs. Which is clearly the case (albeit rarely simultaneously).
@@QuantumHistorian Depends on your definition of "in" but taken one way it is in fact a regular occurrence anywhere multiple dogs conspire to become yet more dogs.
@@SaraWolffs "conspire" 🤣 love that
Ok, the "citation needed" made me laugh good job sir/madam! XD
@@QuantumHistorian We need names for every possible combination of atoms. You wouldn't call a water molecule "That bloke with the hydrogen atom called Benedict DCLXVI"
Your solutions are nice, but I'm *way* more curious about why AKC's naming system has these limitations in the first place now. I'm trying to imagine a system where the roman numerals couldn't just be outright replaced with numbers, even pre-computerization.
When something stays the same for some time, it becomes tradition. And when something becomes tradition, rationality no longer applies.
We still have 7-day weeks!
The nerve to even suggest such a thing!
Filthy prole.
@@juanausensi499 I totally get your point but in this case I think it's probably just being lazy to update the database to accept more numerals because of some runtime issues etc.
We had a regex at work that only allowed up to four characters. It went for years before someone realized that it was missing 18 and a bunch of other ones. I think the fact that the gaps weren't consecutive made it easier to miss.
Gotta love that he didn't once mention the solution of using regular digits instead of roman numerals because that would be too easy
That may have been the whole point.
Or indeed increasing the length of the field. Literally one SQL statement.
@@trejkaz If the database is a SQL database and not a flat file program.
Or Base 32 which would give over a billion
Or you know, they could store the numbers as a database defined integer, and have a function convert said integer into a roman numeral for display. The limit at that point is the size of an integer in your DB, which should be easily large enough.
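Something like this, as a minimal sketch using Python's built-in `sqlite3`. The `dogs` table, its columns, and the SQL function name `roman` are all hypothetical; the real registry's schema is not known here.

```python
import sqlite3


def to_roman(n: int) -> str:
    """Integer to subtractive Roman numerals (just keeps adding Ms past 3999)."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


conn = sqlite3.connect(":memory:")
conn.create_function("roman", 1, to_roman)  # expose the Python helper to SQL
conn.execute("CREATE TABLE dogs (name TEXT, seq INTEGER)")  # hypothetical schema
conn.executemany("INSERT INTO dogs VALUES (?, ?)",
                 [("Skylab", 37), ("Skylab", 38)])

rows = conn.execute(
    "SELECT name || ' ' || roman(seq) FROM dogs ORDER BY seq"
).fetchall()
print(rows)  # [('Skylab XXXVII',), ('Skylab XXXVIII',)]
```

The stored value is a plain integer, so the 38th Skylab is no problem; the Roman numeral only exists at display time.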
but that utility function would take a whole 15 minutes to implement! Way too costly to change 😉
@@justthink124 And It would make sense! We are not allowed to do things correctly here.
i'm pretty certain that the institution is older than computers, so they are probably backed by some legacy mechanical system that is causing this issue
@@eduardopupucon aka human
@@mienzillaz humans would be able to count past 38, probably some punch card machine
I absolutely love how Matt never once touches on the obvious solution, yet everyone knows it's there. That's the comedy here.
The delivery of "... antimatter... that's, that's a completely different list" was gold
Antimatter would obviously be a different breed, so that allows for the same amount of dogs
The "changing the sign" explanation makes perfect sense to me, and parsing is quite a bit easier with that in mind (you would only need an accumulator, a mapper, and the previous individual value). But Option 2 would make that parser a bit more difficult to implement.
option 2 is the same implementation detail as the regular interpretation, you just only allow two consecutive characters not three, and when you reach that threshold add the next character to the end, then pop off the left, then increment the right (just like how you already have to do it, just that popping off the left can only happen once normally, so I guess that explains "the previous individual value" has to become a stack/queue (of size 2?))
I
II
IIV
IV
V
VI
VII
IIX
IX
X
...
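One possible way to parse both the regular numerals and the Option 2 ones, sketched in Python. This is not necessarily what either commenter had in mind: instead of a stack, it reads right to left and subtracts any symbol that is smaller than the largest symbol already seen to its right. The name `parse_roman` is just for this example.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def parse_roman(text: str) -> int:
    total = 0     # the accumulator
    largest = 0   # largest symbol value seen so far, scanning from the right
    for ch in reversed(text):
        value = VALUES[ch]      # the mapper
        if value < largest:
            total -= value      # a larger symbol follows somewhere: subtract
        else:
            total += value
            largest = value
    return total


option_2 = ["I", "II", "IIV", "IV", "V", "VI", "VII", "IIX", "IX", "X"]
print([parse_roman(s) for s in option_2])
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(parse_roman("XXXVII"), parse_roman("MCMXCIV"))  # 37 1994
```

Note that this sketch does no validation at all, so it will happily assign a value to strings that nobody would accept as a numeral.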
My partner related this story to me [web developer] while we were out walking our own dog last night. Never has a database design story caused me more psychic damage.
You might have meant psychological damage. Psychic damage is when Mewtwo attacks.
@@Nathan-dt2tu I guess you're new to the internet then? It's pretty common to use them interchangeably for comic effect, as though one is describing a 4e D&D power. Thanks for the splaining though.
I was wondering where I'd seen this video before, up until you plugged the podcast.
Say hi to Bec!
I like the idea that "elementary particle" is technically a breed of dog, and the entire observable universe is technically America, so each particle should have its own unique name registered with the American Kennel Club.
Manifest destiny at its finest. "Can it be observed? Then it is AMERICA!"
Americans already put their flag on the moon. That's one celestial body down, a couple sextillion to go!
So the strings in string theory are just all the dog hair getting everywhere?
@@charliedobbie8916 Indeed, getting Matted
Then I name my dog FDHJK'1116541CBNMG6816-81UI61A6
If you are limited for naming to strings of length 6 made of the characters {"I", "V", "X", "L", "C", "D", "M"}, the optimal way to number dogs is to interpret the string in base-7, so e.g. "VXLDIL" could mean dog #22886, which is calculated by 1*7^5 + 2*7^4 + 3*7^3 + 5*7^2 + 0*7^1 + 3*7^0.
This gives us the maximum of 7^6 = 117649 dogs, which is the most you could uniquely number using only these 7 Roman numeral characters.
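A quick Python sketch of that base-7 scheme, using the digit assignment implied by the worked example (I=0, V=1, X=2, L=3, C=4, D=5, M=6). The function names are mine.

```python
DIGITS = "IVXLCDM"  # I=0, V=1, X=2, L=3, C=4, D=5, M=6


def decode(code: str) -> int:
    """Read a string of Roman numeral letters as a base-7 number."""
    n = 0
    for ch in code:
        n = n * 7 + DIGITS.index(ch)
    return n


def encode(n: int, width: int = 6) -> str:
    """Write n as exactly `width` base-7 digits, padded with I (zero)."""
    if not 0 <= n < 7 ** width:
        raise ValueError("number does not fit in the field")
    chars = []
    for _ in range(width):
        n, digit = divmod(n, 7)
        chars.append(DIGITS[digit])
    return "".join(reversed(chars))


print(decode("VXLDIL"))  # 22886
print(encode(22886))     # VXLDIL
print(7 ** 6)            # 117649
```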
why is base-7 optimal? can't you just go for arbitrarily high bases? there are base encoding using the entire set of Unicode
@@Zemnmez since we are limited to a character set of size 7, we can't do any higher base
@@Zemnmez that's assuming you can only use the existing 7 roman numerals, and 6 characters. So that's the maximum you can get out of it within those constraints.
Of course if you can add more symbols then you can cram however high base you want, but that involves changing how the data is stored. The assumption is whatever antiquated database they use can only store those 7 characters and 6 of them. On a punch card, that could be represented with 3 holes in 6 columns with zero room for expansion.
If you could just change database, then you might as well just store regular arabic numbers for up to 999999 dogs, or just store it as a regular unsigned 16/24/32/64 bit number and have practically unlimited dogs.
Great idea! And, you could convert the result to base 10 (and _then_ to traditional Roman numerals) after reading the value from the database.
Assuming you're allowed strings of up to and including length 6 then you can use bijective base 7 with a total of 7¹ + 7² + 7³ + 7⁴ + 7⁵ + 7⁶ = 137256 dogs.
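Bijective base 7 can be sketched like this in Python. There is no zero digit, so here I=1 up to M=7, and every string of length 1 to 6 is a distinct dog. The helper names are made up.

```python
DIGITS = "IVXLCDM"  # here I=1, V=2, ..., M=7; there is no zero digit


def encode_bijective(n: int) -> str:
    """Map n >= 1 to its bijective base-7 string."""
    if n < 1:
        raise ValueError("bijective numeration starts at 1")
    chars = []
    while n > 0:
        n, r = divmod(n - 1, 7)
        chars.append(DIGITS[r])
    return "".join(reversed(chars))


def decode_bijective(code: str) -> int:
    n = 0
    for ch in code:
        n = n * 7 + DIGITS.index(ch) + 1
    return n


total = sum(7 ** k for k in range(1, 7))
print(total)                                # 137256
print(encode_bijective(1))                  # I
print(encode_bijective(8))                  # II
print(encode_bijective(total))              # MMMMMM
print(len(encode_bijective(total + 1)))     # 7
```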
I have been advocating for Parker Roman Numerals for nearly 3 decades now and I am glad that I'm no longer alone in this fight.
Welcome, fellows.
The problem is it lacks any sort of standard. Under Parker Roman Numerals, I could write IIIIIIIVXL to mean 28.
@@Nathan-dt2tu I'm kheul with that
@@Nathan-dt2tu the rule is, minimize the number of characters, AND don't use more than 3. It really only should kick in on 8 or similar patterns like 80, where you'd be adding one digit plus three, instead of two subtracted from one.
The video writing IIV kind of missed the point of minimizing, but it's still valid.
"Assuming all atoms are technically the same breed. Antimatter. That's a different list" that one got me good
i love how mathematicians go on these weirdly absurd thought experiments, meanwhile the AKC probably hasn't even used roman numerals in decades and the rule only exists because it's been grandfathered in.
I think you have it backwards tho: with the advent of phones and the internet, and other modes of instant communication, they only allow more than a single dog to have the same name because it is grandfathered in.
They use the naming system to keep track of maternal and paternal lineages, and it would wreak havoc to have two dogs with the same name.
They could assign a separate identifier (like a number or code) and assign relationships by the coding instead of the name. That way even if two dogs are related and have the same name you can tell they’re two separate dogs.
@@grandpaobvious "yes, this dog's dad was clearly the Skylab born in 1856."
Is there an age-out period for the registration?
@@stargate525 I don't think they meant using the identifiers to assign the dog's lineage lol, they just mean give each 'dog' in the system an ID number.
"NO ONE CAN ARGUE WITH THIS. people are gonna argue with this." i love it XD
Well, that happens when you come up with your own rules for an existing numeric system.
As a Hearing Impaired member of your fanbase, I must inform you that closed captioning is unavailable to me on this video. I have a very hard time understanding speech without CC, and so does my aging mother. I would appreciate it if you could accommodate us Hearing Impaired or others with audial processing issues.
But that part about the atom dogs was great!
Captions are auto generated. You might want to reach out to youtube about it.
I'm commenting so he sees this
I know Tom Scott pays for CC, so that's another channel worth checking out. My understanding is that the automatic CC provided by YouTube takes time to generate, partly because there are hundreds of hours of content uploaded to YouTube every minute.
The best way to prioritize this video for CC is to share it with friends and like/comment/subscribe, I guess.
Some channels allow people to submit closed captions, but that is the channel's choice.
@@tonymouannes They can also upload their own CC if I remember correctly, so he is able to do something about it if he's willing to manually add them.
When Skylab taps Matt’s arm to ask for another treat ❤️
Skylab is such a beautiful pup. Labs are the best!
No
SkyLab is a beautiful dog, however IMHO Border Collies are the best but don't tell my Jack Russell.
@@saturatedodin476
> Saturated Odin clicks on a maths video
> something something roman numerals something
> scrolls to comments
> sees "Labs are the best!"
> stretches
> clicks REPLY
> cracks his knuckles
> starts typing
> "No"
> refuses to elaborate further
> hits enter
> leaves
@@Fytrzaczek21 Establishing Dogminance
Or convert all the current Roman numerals to base 10 strings in the database (since it currently only accepts strings not integers) and display them as Roman numerals when retrieved. For example convert “XXXVII” to “37” in the database. That gives you 999,999 of each name.
Option 4: 6 characters is a storage for 6 characters - use 0-9 on those characters and have a function to decode it into Roman numerals if you want to present them as such.
Even if for some unimaginable reason the storage can only use letters from roman numerals, since there are 7 letters (I, V, X, L, C, D and M) you can basically just use base 7. That would add up to 7^6 = 117649 dog names (maybe 117648 if not accounting for zero).
Or, Option 4: The AKC drops Roman numerals INSIDE the database, and uses either base 10 (999,999 dogs of the same breed with the same name) or base 16 (0xFF FFFF, i.e. 16,777,215 dogs of the same breed with the same name).
To keep their Display in Roman(ish) numerals, they could have a 'look-up table' in their database to pull up a string based on the value, adding new Roman(ish) numerals. We already have I=1 V=5 X=10 L=50 C=100 D=500 M=1,000, so, borrowing from Engineering notation, we can add T=5,000, P=10,000 E=50,000 Z=100,000 Y=50,000 ... Here the Engineering prefixes stop, so the 'No More than 3' rule and the 'no duplicate 5s' rule leaves us with ZZZY, 350,000.
So we go with another Engineering habit, "when the 'numbers' run out, start using letters." So we'll have A=1,000,000 B=5,000,000 F=10,000,000, and now we can represent all 16x10^6 dogs with Roman(ish) numerals.
Although it might just be easier to just use the hexadecimal numbers, the database is already working in that base internally (actually, it works in base 2, but 4 hex digits completely describes a 16-bit 'word,' the standard memory size in modern computers.)
@unsubtract those guys might be overwhelmed by the options there. Even counting puppy mills, has any breed even gotten close to that number, counting not only living dogs, but ancestors back to the establishment of the breed.
@Andrew Dreasler why skip G in between M and T tho? I can understand counting "k" as lost, but not G. :-B
Also, the ZZZY is not quite the max (also, your typo for Y made you write it wrong, 350,000=ZZZE and in fact ZZZY is invalid), the max would be YZZZPZ or 890,000. ;-) [Similar to how the largest 6-character number that doesn't use M is DCCCXC = 890]
Why stop at base 16? Base 64 can be done in human-readable ascii, using both the upper- and lower-case alphabets, digits, and the symbols + and /
@@wasabi991011 because base64 is therefore ambiguous... unless you want to encode the operations (and potentially the spaces) as well, sacrificing any remaining bit of readability... :-s
Also, it's not easy to work out a ballpark estimate of a number from its representation... which would be quicker to evaluate, A0F or oP, or maybe even MjU3NQ==(!) ?
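For what it's worth, all three of those turn out to be the same number. A small Python check, assuming the usual Base64 alphabet order (A-Z, a-z, 0-9, +, /) is used as positional digits for "oP", and that the last one is ordinary Base64 text encoding; `from_base64_digits` is a made-up helper.

```python
import base64
import string

# Standard Base64 alphabet, used here as 64 positional digits
B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def from_base64_digits(text: str) -> int:
    """Read text as a base-64 positional number (not as encoded bytes)."""
    n = 0
    for ch in text:
        n = n * 64 + B64.index(ch)
    return n


print(int("A0F", 16))                # 2575
print(from_base64_digits("oP"))      # 2575
print(base64.b64decode("MjU3NQ=="))  # b'2575'
```

Which rather supports the point: none of the three is easy to size up at a glance.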
Another question: Is this really an issue?
Here in Germany people who register their dogs use fancy unique names like "Harold Vincent von der Vogelweyde" (the "last name" usually referring to the dog's sire) but call them "Spot" at home.
Harold Vincent von dear Vogelweyde is a great name!
@@cmcgeeeable Harold Vincent "Spot" von der Vogelweyde is even better!
@@user-sl6gn1ss8p According to the video, the quote character isn't allowed. - Spot - would be though :-)
Name him Joe Biden
In the US (this video does reference the American Kennel Club), it is common for the dog's name to include the kennel name of the breeder. My puppy is named (kennel name)'s Delicate Sound of Thunder, but his call name is Thor.
So THAT'S why dogs are registered with such weird names. Growing up, we had a Great Dane named Rufus. Even though he was not registered with the AKC (at least I don't think he was), my mom insisted that his "official" name was Sir Rufusson of... "(something, something, something - I don't remember because we just called him Rufus).
Congratulations on 1 million subscribers - you really deserve it!
Or as the American Kennel Club would say, "congratulations on MMMMMM... (with a thousand M's) subscribers."
I thought of option 2 while watching the video, a couple minutes before you showed it, that felt awesome lol
Absolutely Brilliant! I remember this being talked about on A Podcast of Unnecessary Detail. 10/10
regarding the "everything could have a different name" part:
seeing as a dog is composed of many atoms, we should also allow names for groups of atoms, not just individual atoms.
and there are so many more distinct atom groups than there are atoms. that powerset is gigantic
Option 2 was also my favourite. I read as a child that some people used the IIX notation. A few years later, I remembered this and agreed.
I would actually comment about this notation, even if you haven't mentioned option 2.
Some people might argue that you can't "change important history", but if we are still using them, they aren't really history.
Would you believe this is the first thing I thought of after Veritasium’s new video about the number 37?
Same, and I only realised after some minute of video, that's kinda strange
8:15 In fact the real Romans did exactly what Matt suggests, mainly on their calendars. Eighteen could be written XIIX and twenty-eight XXIIX. (But they stuck with XXXVIII for 38, leaving them forever excluded from the American Kennel Club.)
That's probably also because that's how they _called_ those numbers. 18, 28, ... up to 88 were called "two-from-[the number two units up]" in Latin (duodeviginti, duodetriginta, etc.)
Huh, I still wonder if representing four as IIII is a modern invention or if it was also used by the Romans (I heard it was invented to stop people confusing IV and VI on the clock)
Not quite... or at least: not consistent. For example 18 can and was written as XVIII and XIIX. The epitaph of Centurion Marcus Caelius and the Fasti are good examples of this. In fact the subtractive method is the rarer of the two methods and probably only in widespread use because of Microsoft Excel's ROMAN() function
Clock faces often use the "valid-but-non-standard" IIII roman numeral instead of IV to make the face look more "balanced" visually.
@@Domihork Searched around a little bit. Many sources give the 14th Century Wells Cathedral clock in England as the earliest known example. I saw some claims the practice dates back to Ancient Roman sundials, but not much evidence (no supporting images of Ancient Roman sundials, for one)
In No Man's Sky, I would always name each type of planet with a roman numeral system up to 30, then switch to decimal after 30. It was a little confusing with each numeral sometime in the teens, so that's why I decided on the switch.
I remember doing this when playing Civilization as a kid. Turned into a nightmare about XV in.
@@andrewciszewski9524 I learned roman numerals to read cornerstones on buildings, Those can get crazy in late 20th century buildings.
@@Radnugget I learned Roman numerals to keep track of the correct viewing order for the Rambo and Rocky sagas. :')
8:23 was really interesting to me, as in Latin that's actually how their numbers are named (e.g. 18 is duodeviginti, “two from twenty”), so it almost makes me wonder if that's actually how the Romans did it
After doing some research, (literally just checked wikipedia) I have found out that Romans actually did frequently use your ‘IIX’ system!
That's how some French words work, too.. 17 18 19 are dix sept, dix huit, dix neuf (10 7, 10 8, 10 9) and 80 is yearly quatre vingt (4 20..as in 4x20, not related to weed)
At 8:23 I thought: we should make seven a one syllable word. It would make Russel’s paradox number bigger.
Oooh, and 'deviginti' sounds so much like a root of 'twenty' . . .
@@stone5against1 Have you heard the french version of 1999? une mille, neuf cents, quatre vingt dix neuf! What a gob-full for nineteen ninety nine!
By now I've given up on understanding how Matt thinks, and I'm completely fine with it.
But the last thing I was expecting when I clicked on this video was that it would end with anything related to antimatter.
You nailed it, *as usual*, in great style, Matt. Congratulations. 🦴🎻XXXVII
Matt: "Everything is atoms."
Muons: "Do I look like an atom to you?"
I like the symmetry in the "length" (=number of digits) of the numbers for option 2, it gives a nice wavy line looking at the list starting at 8:45.
Here are a couple of other solutions:
1. Just expand the database field to use more characters. This is a simple thing to do (depending on db system)
2. Use an integer instead of a character based Roman numeral.
3. Ok so you want to keep the Roman numbers, convert the database to use an integer and on the display of that value convert it on the fly to a Roman numeral. This could be done on a db "View" (assuming a sql type db) or on the UI.
If they're using RN, it's likely an ASCII character! 7 bits (8 bits)....... 256^6
That last bit made me laugh out loud about there being more options than there are atoms in the universe, but then I thought, "Hang on... it would not even be possible to STORE that many names unless we had a large number of spare universes to hold the data AND knew how to even do that!"
One of the many reasons we haven't started sticking barcodes on elementary particles.
I love thinking about the implications of huge numbers. Once you get big enough, you run into issues of the universe not being big enough to handle them. A very fun existential thought experiment.
It is a funny thought. I wonder though how much the number of universes needed gets cut if we use compression. If we know that these hundred atoms in a row have the same dog name, we could store that as Spot (I:C) rather than [Spot I, Spot II, Spot IIV, Spot IV, etc.]
@@charliedobbie8916 thank goodness for that - I misread that post as 'the main reasons' at first scan .... phew!
missed opportunity to name the dog MattLab...
One thing I find interesting about Option 2 is that the word for 18 in Latin is "duodeviginti", literally meaning two from twenty. So by all rights it should be written XIIX.
No, it should be written "IIXX"!
Same for "duodecentum" (two from hundred, 98) => IIC
@@peterjansen7929 why not IXIX?
@@danielboyd4079 Because it's 2 from 20, not 1 from 10 plus another 1 from 10 (nor 10 plus 2 from 10, as jeboideiasque's suggestion would require).
Small point regarding the possible names, I fear you have overestimated as any name with multiple consecutive spaces would be confusing and open to error!
Only one space is allowed, You also can't register an existing name, that's why you use a kennel Name.
@@dogwalker666 Surely multiple nonconsecutive spaces are allowed, right?
But he also didn't take into account names shorter than 50 characters. So it seems like the number should be much larger, actually.
@@zsdavis Pretty sure there's no difference between a five character name and a 50 character name that ends with 45 spaces.
But even if they didn't count like that, there would be 39^49 of length 49, 39^48 of length 48, and so on. And while those are big relative to the number of dogs that have ever existed, they are small relative to 39^50. You'd just be multiplying the total by the fast-converging geometric series 1 + 1/39 + 1/39^2 + 1/39^3 +...
Which increases the total in the video from 1.32e81 to 1.36e81
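The size of that correction is easy to check exactly in Python. The 39 symbols and maximum length of 50 are taken from the comment above as assumptions; only the ratio is computed here, not the video's absolute total.

```python
from fractions import Fraction

SYMBOLS = 39   # character count assumed in the comment above
MAX_LEN = 50   # maximum name length assumed in the comment above

# Names of every length from 1 to 50, compared with length-50 names only
exact_total = sum(SYMBOLS ** k for k in range(1, MAX_LEN + 1))
ratio = Fraction(exact_total, SYMBOLS ** MAX_LEN)

print(float(ratio))             # about 1.0263
print(SYMBOLS / (SYMBOLS - 1))  # limit of 1 + 1/39 + 1/39^2 + ...
```

So allowing shorter names adds under 3% to the count, roughly in line with the step from 1.32e81 to 1.36e81 quoted above.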
In the spirit of this video, I will argue against option 2. As the first "I" wouldn't be followed by a larger number, it would be additive; therefore, 8 = IIX = I + (-I) + X = 1 - 1 + 10 = 10. Fun video, though. Loved all the Skylabs as stars at the end.
Exactly Matt is not following his own rule!
@@Froudd While true it's about understanding the number and I think anyone who understands roman numerals would understand the meaning without someone telling them
I would obviously name my other dog Salyut
the first I isn’t immediately followed by a larger number but it is followed by a larger number at some point, that’s all that matters
@Angus Exactly! Thanks for writing this so I didn't have to :)
Can't have too many "M"s or you'd end up with that Crash Test Dummies song.
Skylab. :) Golden retriever here. Considered naming him Au thinking it would be great fun to call out 'Eh! You!" at the park. Decided to keep the au in the name and went with Tau. :)
As someone in the tech sector, I got chills when Matt suggested the possibility of registering every atom in the universe. Some things are not meant to be done at scale O_o
It's not like we'd have anything to register them ON, anyway. :-)
Option 4: remember that you live in the 21st goddamn century and use Arabic numerals like a normal person.
Option 5: remember that disk space has become absurdly cheap (because we live in the 21st goddamn century) and just add more space for the stupid, pointless, self-important roman numeral suffix.
Option 6: use the inbuilt GUID that literally every competent database schema already has.
Option 7: store the suffix in Arabic numerals and just convert it to Roman numerals for display if you just REALLY INSIST on being a pretentious jackass.
Actually factual
dog owners who would register with a "kennel club" are by definition pompous and pretentious... checks out
Did roman numerals kill your dog or something?
I hate to disagree, but in the 21st century you are mostly using binary numbers (at least if you are working with databases and you don't want to waste disk space on ASCII or you don't want to do high-precision decimal calculations like 0.1 + 0.2, but that should not be needed for keys in a database...). And with that you can show them in whatever display format you want...
Do you mean the XXIst century? What about Elizabeth 2? To me it sounds like a Hollywood sequel...
I love how “use more than six digits” is never considered
Or converting the database to use a number field and only *render* it as roman numerals...
Yeah, I guessed where this was going before he explained it, and I was expecting it to be Roman numerals stored in 8 characters, since powers of 2 are common in computing so 8 bits seems a logical amount to use.
Actually now I'm curious to see how far you can go with these.
6 digits: 37, XXXVII
7 digits: 87, LXXXVII
8 digits: 187, CLXXXVII
9 digits: 287, CCLXXXVII
10 digits: 387, CCCLXXXVII
11 digits: 887, DCCCLXXXVII
12 digits: 1887, MDCCCLXXXVII
13 digits: 2887, MMDCCCLXXXVII
14 digits: 3887, MMMDCCCLXXXVII
15 digits: 3999, MMMCMXCIX (to go any further you need a symbol for 5000 - that would get you up to 8887).
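The list above can be reproduced with a short script. This is only a sketch: `to_roman` and `last_fitting` are illustrative names, standard subtractive notation is assumed, and the range stops at 3999 because there is no symbol for 5000. For each field width it finds the largest n such that every number from 1 to n fits.

```python
# Value/symbol pairs for standard subtractive notation, largest first.
PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman(n: int) -> str:
    """Convert 1..3999 to a Roman numeral."""
    out = []
    for value, symbol in PAIRS:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)

def last_fitting(width: int, limit: int = 3999) -> int:
    """Largest n such that every numeral from 1 to n fits in `width` characters."""
    for n in range(1, limit + 1):
        if len(to_roman(n)) > width:
            return n - 1
    return limit

for width in range(6, 16):
    n = last_fitting(width)
    print(f"{width} characters: {n}, {to_roman(n)}")
```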
@@thethiefmaster Storage might not be the only thing at play here. I assume they also print things involving the dog's name, which means that printing a name with a Roman numeral longer than 6 characters *might* throw off all their templates, which they'd then also have to fix in addition to changing it in the database.
tbh that's a somewhat weak chain of inference. *shrug*
It probably didn't use a database; they just allocated bytes in a binary file via C, depending on what year it was done. To read bytes from a file in C you more or less have to know what size to read and what size variable to put it in; you have to know how the variables are laid out in the file. On a 16-bit system an int would be 16 bits, a short int typically 16 bits as well, and a char 8 bits, or one byte. (If I remember correctly, lol) A C string uses 8-bit chars: 7 bits for ASCII plus 1 bit for extended ASCII. A C string (an array of chars, basically) ends with the null terminator, so 6 characters written in ASCII would take (6+1) bytes. There was no Unicode or UTF-8 yet. To convert, you'd need to write something to read the old format and rewrite it in the new one. It's been a while since I've read a file manually like this, but that is the gist of it. They also might have written it in ASCII files instead of binary. Anyway, there could be many valid reasons: back then memory conservation was pretty important, especially for millions of records of any sort held in memory with limited RAM. If you don't hold as much in RAM, you have to access the disks (which were a lot slower then) more often. Having said all that, if it wasn't written in C, it might have been written in something else with other limitations :P
...Or the best option: use the data bits to represent an integer in binary and convert it to Roman numerals for display. With 6 bytes (48 bits) you could have up to 281,474,976,710,656 dogs.
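To make the fixed-size-record idea concrete, here is a small sketch using Python's `struct` module in place of C. The layout is pure guesswork for illustration (a 36-byte name field and a 7-byte suffix field, i.e. 6 characters plus a NUL), and `write_record` / `read_record` are made-up helpers, not anything the kennel club is known to use. The point is just that the reader has to know the exact record size up front, and that a suffix longer than the field has nowhere to go.

```python
import io
import struct

# Hypothetical on-disk layout: 36-byte name + 7-byte suffix (6 chars + NUL).
RECORD = struct.Struct("36s7s")

def write_record(f, name: str, suffix: str) -> None:
    """Write one fixed-size record, refusing values that do not fit."""
    if len(name) > 36:
        raise ValueError("name does not fit in 36 characters")
    if len(suffix) > 6:
        raise ValueError("suffix does not fit in 6 characters")
    # struct pads each 's' field with NUL bytes up to its declared size.
    f.write(RECORD.pack(name.encode("ascii"), suffix.encode("ascii")))

def read_record(f):
    """Read one record; the size must be known in advance."""
    raw = f.read(RECORD.size)
    name, suffix = RECORD.unpack(raw)
    return name.rstrip(b"\0").decode("ascii"), suffix.rstrip(b"\0").decode("ascii")

buf = io.BytesIO()  # stands in for the binary file
write_record(buf, "SKYLAB", "XXXVII")
buf.seek(0)
print(read_record(buf))
```

Widening the suffix field would change `RECORD.size`, so every existing record would have to be read in the old layout and rewritten in the new one.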
It goes to show how terrible people are at data storage (and how many horror stories about it there are) that when you mentioned 38, I was able to immediately guess at the "max 6 characters" thing. That type of error happens way too often lol.
@Eyeguy640 : 1976 Olympics Gymnastics in Montreal. Programmers in charge of recording scores asked gymnastic officials, how many digits needed to record/display scores? They said, no one will score perfect 10 so 9.99. Then Nadia Comăneci, age 14, scored perfect 10.00. Long delay before score was displayed. It was displayed as 1.00. I was there.
The way I always remembered the slightly positional rule was that V is 5, IV is 1 before 5 = 4, and VI is 1 after 5 = 6. They could always just use normal letters instead of Roman numerals. For standard uppercase-only letters that'd give them 26!/20! (165,765,600) slots within the 6 characters.
"The way I always remembered the slightly positional rule is exactly the way everybody else remembers it."
26!/20! is the number of strings of 6 *distinct* characters, using a 26-character alphabet. If you're allowed to repeat characters, you can go up to 26^6 strings
@@esquilax5563 26p6 right?
altho yes its stupid to use roman numerals when u r space limited
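The two counts in this thread are easy to check. A quick sketch, assuming a 26-letter uppercase alphabet and exactly 6 characters: `math.perm(26, 6)` is 26!/20! (no repeated letters), while `26 ** 6` allows repeats.

```python
import math

distinct_letters = math.perm(26, 6)  # 26!/20!: ordered picks of 6 different letters
with_repeats = 26 ** 6               # any letter allowed in each of the 6 positions

print(f"{distinct_letters:,}")  # 165,765,600
print(f"{with_repeats:,}")      # 308,915,776
```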
I just felt a great disturbance in the force, as if every atom barked at the same time
As long as they don't wiggle in sync, all is fine. : )
I know I'm a year late, but at the 8:45 mark, going by how you explained how Roman numerals work, wouldn't IIX be 1+(-1)+10? Meaning the 1's cancel each other out, since only one I has a bigger numeral to the right of it.
From an engineer's perspective, replace the 6-character text field with a 32-bit unsigned integer. When printing the number, convert it to Roman numerals on the fly. 4 billion numbers with only 4 bytes instead of 6. But of course, as Matt says, some numbers can't be represented easily in Roman numerals.
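A rough sketch of that idea, with made-up helper names (`store`, `display`) and a little-endian 32-bit layout picked arbitrarily: the number is stored in 4 bytes and only turned into a numeral when it is shown. The converter here gives up outside 1..3999, which is the "can't be represented easily" part.

```python
import struct

PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman(n: int) -> str:
    """Render 1..3999 as a Roman numeral; anything else is rejected."""
    if not 1 <= n <= 3999:
        raise ValueError("no standard Roman numeral for this value")
    out = []
    for value, symbol in PAIRS:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)

def store(n: int) -> bytes:
    """Pack the suffix as a 4-byte unsigned integer (little-endian, assumed)."""
    return struct.pack("<I", n)

def display(raw: bytes) -> str:
    """Unpack the stored integer and convert it only for display."""
    (n,) = struct.unpack("<I", raw)
    return to_roman(n)

raw = store(38)
print(len(raw), display(raw))  # 4 XXXVIII
```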
That's what I was thinking too. No reason they can't migrate to a newer better database.
And that's why you're just an engineer, and not a celebrity/politician/CEO and an elite member of a Kennel Club. Doing things the easy, simple, logical way? Plebs do that!
i'm pretty certain that the institution is older than computers, so they are probably backed by some legacy mechanical system that is causing this issue
@@eduardopupucon considering all living dogs in their database are younger than computers I see no reason why they can't archive old records and transfer to a new system for new records. If an organization of that size doesn't use a digital database to create, read and update records they are wasting money on human labor.
I wonder if their database was first computerized on a 6 bits/byte system. It would make sense why they limited things to 36-character names and 6-character Roman numerals.
(EDIT: Such systems existed in the 1950s-70s, before 8 bits/byte was fully standardized -- largely in the '60s by IBM. With 8 bits/byte, the limits probably would've been 32-character names and 8-character Roman numerals.)
This video title had to be either Matt Parker or Tom Scott.
This is your finest work, Parker. More Skylab please.
Slight correction: Your suggestion to rework the numeral system at 7:40 would actually be wrong. The reason you only ever subtract single numerals is that otherwise it leads to a lot of confusion once you get to bigger numbers. *Especially* in the context of computers, you'd get some strange results. For instance, let's take 2008 in numerals: MMVIII. This is super simple to deal with, because we know that M is bigger than V, and V is bigger than I, so you just add everything together. However, if we instead did your suggestion and made it MMIIX, it would actually end up being 2010, because you would add M+M+I and then IX (X minus I, so 9).
Now you could develop a system to account for this, but it would be pretty cumbersome, and more importantly you'd introduce a number of different numerals that mean exactly the same thing, which is really bad for any number system to have. XXMM would equal MCMLXXX (1980), CCL would equal LCCD (250), and even your own example (VIII = IIX) would lead to inconsistency and confusion. There's a good reason no Romans ever used IIX.
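Here is a small sketch of the two readings being argued about in these comments. Both function names are invented for illustration. `parse_adjacent` is the usual rule (a symbol is subtracted only when the very next symbol is larger), and `parse_anywhere` is one possible way to formalise the looser rule (a symbol is subtracted when any later symbol is larger). Neither one validates its input.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def parse_adjacent(numeral: str) -> int:
    """Usual rule: subtract a symbol only if the next symbol is larger."""
    total = 0
    for i, ch in enumerate(numeral):
        value = VALUES[ch]
        nxt = VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < nxt else value
    return total

def parse_anywhere(numeral: str) -> int:
    """Looser rule (assumed): subtract a symbol if any later symbol is larger."""
    total = 0
    for i, ch in enumerate(numeral):
        value = VALUES[ch]
        later_max = max((VALUES[c] for c in numeral[i + 1:]), default=0)
        total += -value if value < later_max else value
    return total

for numeral in ["MMVIII", "MMIIX", "IIX", "XXMM", "LCCD"]:
    print(numeral, parse_adjacent(numeral), parse_anywhere(numeral))
```

Under the usual rule MMIIX reads as 2010 and IIX as 10; under the looser rule they read as 2008 and 8, but then XXMM and MCMLXXX both mean 1980, which is the duplicate-spelling problem described above.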
"The Parker Way. That's not gonna catch on, but I gonna give it a go."
Matt, this is the Internet. Have you forgotten the Parker Square? If you say "it won't catch on", you're basically challenging the internet to make it a thing that will catch on no matter what.
I mean, that’s what he’s hoping for? For his name to be associated with something positive?
@@AliceYobby yes, Parker way is nice and elegant but it's still kinda impractical, so the Parker property still applies
LOL Now I'm wondering if that's why Matt said it! XD