The Clever Way to Count Tanks - Numberphile

  • Published 22 Dec 2024

COMMENTS • 1.8K

  • @numberphile
    @numberphile 4 months ago +188

    See brilliant.org/numberphile for Brilliant and 20% off their premium service & 30-day trial (episode sponsor)

    • @Blutania
      @Blutania 4 months ago +11

      The video: 38 minutes ago
      The comment: 1 day ago
      *_time travel confirmed?_*

    • @rmsgrey
      @rmsgrey 4 months ago

      @@Blutania It's standard for videos to be uploaded to YouTube some time before they go live to everyone, so the uploader and, not infrequently, also patrons, channel members, or other privileged people who get given a link to the still private video can comment on it before it's published.

    • @Electronieks
      @Electronieks 4 months ago

      @@Blutania was private yesterday

    • @Electronieks
      @Electronieks 4 months ago +3

      Send this video to Ukraine 🇺🇦

    • @jerredhamann5646
      @jerredhamann5646 4 months ago +2

      It's likely they used that method, but a less math-heavy way of doing it is permissible. For one, you're going to be sending spies and spy planes to bases, storage yards and depots, and since a lot of these things are big and in the open, and since you can only form the number of tank units you have tanks for, you likely have a decent count of the number of units they have at a given time. If you know the enemy's serial numbering system, then the rise in the serial numbers over time from captured equipment will tell you their rates: if last month the highest serial numbers were in the low 1500s but now they are in the upper 1700s, it doesn't take a math PhD to figure out you're looking at about 270 tanks. Also, since the serial numbers tell number, date and location, they tell you something more important: the lag in their logistical system. If you know how long it takes the enemy to make and move stuff, you can predict movements and actions to some degree.

  • @dowesschule
    @dowesschule 4 months ago +5813

    You didn't just pull out the first and last, but also the middle tanks 15&16!

    • @AndreasHontzia
      @AndreasHontzia 4 months ago +170

      And 23. Illuminati!!!

    • @shripalmehta
      @shripalmehta 4 months ago +77

      there's a mathematician!

    • @docsigma
      @docsigma 4 months ago +217

      That's Numberwang!

    • @tonelemoan
      @tonelemoan 4 months ago +32

      SPOILER ALERT!11

    • @ilonachan
      @ilonachan 4 months ago +113

      the luckiest draw at the unluckiest time!

  • @GrimOrdnance
    @GrimOrdnance 4 months ago +404

    I adore the fact that you left the initial pull in the video, because that is the truth in probabilities. I appreciate your videos!

    • @atimholt
      @atimholt 25 days ago +2

      True randomness is clumpy. That's why music streaming services often don't use true randomness: you'll get too much serendipity that feels unshuffled.

  • @adsilcott
    @adsilcott 4 months ago +1496

    6:33 I love the way the turrets are pointing at their actual positions in the number line :)

    • @Denis_Bobrov
      @Denis_Bobrov 4 months ago +22

      Oh, I didn't notice it )

    • @LoveDoveDarling
      @LoveDoveDarling 4 months ago +36

      And how the treads are in motion on the tanks. Editor going above and beyond. Bravo!

    • @taeliantalittia612
      @taeliantalittia612 4 months ago +6

      4:47

    • @miketothe2ndpwr
      @miketothe2ndpwr 4 months ago +8

      It's such a little detail for nerds. Love it as well

    • @LoveDoveDarling
      @LoveDoveDarling 4 months ago +5

      @@miketothe2ndpwr I don’t think it’s exclusive to nerds. It’s for anyone who pays attention to details to appreciate.

  • @bittencourt16
    @bittencourt16 3 months ago +78

    I've just simulated 10,000 runs of this operation, for a number of tanks less than 100 and a number of guesses between 10 and 50. Using the maximum value as the total number of tanks gives an average error of 4, while using the maximum value + average gap gives an average error of 2.6. That method is simply 151% more precise!! Amazing!!!

    • @vitriolicAmaranth
      @vitriolicAmaranth 2 months ago +5

      What about other methods, like mean value * 2?

    • @evilduck5691
      @evilduck5691 19 days ago +1

      @@vitriolicAmaranth this is exactly where my mind went. I feel like it should almost be equivalent to mean gaps, but I probably just haven't thought hard enough

    • @bancabancabanca
      @bancabancabanca 10 days ago

      @@vitriolicAmaranth seems to work as well!
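
The comparison discussed in this thread can be sketched as a small simulation. This is a minimal sketch, not the commenter's code: the function names, the fixed population of 100 tanks, the sample size of 10, and the seed are all assumptions chosen for illustration. It compares three estimators: the maximum alone, the maximum plus the average gap, and twice the sample mean minus one.

```python
import random

def estimates(sample):
    """Return three estimates of the largest serial number from one sample."""
    k = len(sample)
    m = max(sample)
    mean = sum(sample) / k
    return {
        "max": m,                          # just use the largest serial seen
        "max_plus_gap": m + (m - k) / k,   # largest serial plus the average gap
        "twice_mean": 2 * mean - 1,        # population mean is (N + 1) / 2
    }

def mean_abs_errors(n_tanks=100, k=10, trials=5000, seed=0):
    """Average absolute error of each estimator over many random draws."""
    rng = random.Random(seed)
    totals = {"max": 0.0, "max_plus_gap": 0.0, "twice_mean": 0.0}
    for _ in range(trials):
        sample = rng.sample(range(1, n_tanks + 1), k)
        for name, value in estimates(sample).items():
            totals[name] += abs(value - n_tanks)
    return {name: total / trials for name, total in totals.items()}

if __name__ == "__main__":
    for name, err in mean_abs_errors().items():
        print(f"{name:>13}: {err:.2f}")
```

Under these assumptions the doubled mean is unbiased, like the gap method, but its estimates tend to scatter more widely, so "works as well" probably holds only on average rather than draw by draw.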

  • @redryder3721
    @redryder3721 4 months ago +3703

    I know it's irrelevant, but there's the old joke about letting three sheep loose in a field, but first labelling them "1" "2" and "4" so the person rounding them up spends ages looking for the 3rd.

    • @agranero6
      @agranero6 4 months ago +71

      I read about this prank in the book Show Me How or More Show Me How.

    • @SunroseStudios
      @SunroseStudios 4 months ago +90

      it's vaguely relevant!

    • @SwedishNeo
      @SwedishNeo 4 months ago +192

      It would also make sense in this case since the Germans wanted to make the appearance that they were building more tanks than they actually were. As such they could have skipped a couple of numbers in their serials. But I guess it would create too much chaos for the German mind to handle. xD

    • @hakanl2585
      @hakanl2585 4 months ago +130

      MI5 officer Peter Wright wrote in his book Spycatcher that MI5 bugged the Soviet embassy in Ottawa. MI5 marked all the listening cables with numbers from 1 upwards, but in case the Soviets found these cables, MI5 omitted some numbers, hoping that the Soviets would almost have to tear down the embassy in order to find the missing numbers. (But the trick did not work, since the Soviets had a spy within MI5 informing them how many cables there were and what numbers they had. So the Soviets never searched for the omitted numbers.)

    • @h.a.9880
      @h.a.9880 4 months ago +137

      @@SwedishNeo "New orders from Berlin: We are to skip a few serial numbers when imprinting parts, so our tank production looks bigger than it is..."
      - "But zat will bring dizorder to mein numbers!"

  • @LeonMatthews
    @LeonMatthews 4 months ago +131

    For several of my clients we incremented the serial number by some prime, rather than by one, in order to obfuscate the output somewhat. It also gave us some degree of parity checking on serial numbers later. Silly, really, but fun.

    • @accountxabcdef
      @accountxabcdef 2 months ago +9

      I would use a hash function: append a secret number to the normal serial number, hash the result, and use the hash as the official serial number. Then every unit has its own official serial number, you hold the secret so you can look up what the real serial number was, and nobody else can guess any other valid number, even if they know every number (except the secret) and your algorithm for creating them.

    • @dafrandle
      @dafrandle 2 months ago

      @@accountxabcdef
      I would use a uuid and convert it to numbers via a bespoke translation - just have a check to avoid the rare collision

    • @AndreyCizov
      @AndreyCizov 2 months ago

      isn't it quite easy to figure out that all numbers are incremented by a prime number?

    • @accountxabcdef
      @accountxabcdef 2 months ago

      @@AndreyCizov
      You would need to see a few machines bought at the same time. You cannot trust that all numbers will be used; often there is a gap when an updated version is introduced. And if the prime is changed at that point, have fun reverse engineering the prime...
      There will be enough people who are able to spot it or even reverse engineer it, but that number shouldn't be that great (it depends on the batch size, the amount sold to individual customers, and the price: the more expensive the product, the greater the reward for getting something free under warranty).
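
The prime-stride idea from the top of this thread, and the question of how easily an outsider could spot it, can be sketched as follows. This is a hedged illustration only: `STRIDE`, `START`, and all function names are made up here, and nothing is known about the scheme the original commenter actually used.

```python
from functools import reduce
from math import gcd

STRIDE = 7919    # hypothetical secret prime step between consecutive units
START = 100003   # hypothetical first serial number

def serial(unit_index):
    """Serial number for the unit_index-th unit (0-based)."""
    return START + unit_index * STRIDE

def looks_valid(number):
    """Cheap sanity check: valid serials sit on the stride grid."""
    return number >= START and (number - START) % STRIDE == 0

def guess_stride(observed):
    """An outsider's guess: the gcd of differences between observed serials."""
    ordered = sorted(observed)
    diffs = [b - a for a, b in zip(ordered, ordered[1:])]
    return reduce(gcd, diffs)

observed = [serial(i) for i in (3, 10, 15, 18, 24)]
print(guess_stride(observed))
```

With a handful of observed serials the greatest common divisor of their differences is a multiple of the stride, and often the stride itself, which suggests the obfuscation is light. The divisibility check is what gives the "parity checking" flavour: a mistyped serial usually falls off the grid.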

  • @courtney-ray
    @courtney-ray 4 months ago +562

    At 6:36 you were right on! The gap below your minimum observation WAS equal to the gap above the maximum observation and the true number of tanks!

    • @vez3834
      @vez3834 4 months ago +19

      Amazing accuracy!

    • @IDNeon357
      @IDNeon357 1 month ago

      The tank serial numbers were all encrypted by both allies and axis powers making this story entirely false.

    • @NTelling
      @NTelling 1 month ago +8

      @@IDNeon357 He addresses that in the video. He said the encryption was cracked.

    • @SpydersByte
      @SpydersByte 1 month ago +3

      @@IDNeon357 lol what? First of all, he said they deciphered the coding they were using, but also how would you know this? And why do you say it like it's established fact when it clearly isn't?

    • @SpydersByte
      @SpydersByte 1 month ago

      yeah, I'm surprised he didn't really point that out :D

  • @bjrnstrottman5637
    @bjrnstrottman5637 4 months ago +34

    My first instinct was to use the Central Limit Theorem to assume that the sample mean would approximately equal the population mean. Since we know the distribution is uniform and the population mean of a population of size n is (n+1)/2, twice our sample mean minus one should approximate the population size.
    Here our sample mean was 17, so this method estimates the population size as 2(17) - 1 = 33.
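
The arithmetic in this comment can be checked directly, and the claim that the method "should approximate" the population size can be tested by brute force. The sketch below is an illustration with assumed names: it averages the estimate over every possible 5-tank sample from 30 tanks, using exact fractions to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

def mean_estimate(sample):
    """Estimate the population size as twice the sample mean, minus one."""
    return 2 * Fraction(sum(sample), len(sample)) - 1

print(mean_estimate([1, 15, 16, 23, 30]))   # the first draw from the video

# Average the estimate over every possible 5-tank sample from 30 tanks.
n_tanks, k = 30, 5
all_samples = list(combinations(range(1, n_tanks + 1), k))
average = sum(mean_estimate(s) for s in all_samples) / len(all_samples)
print(average)
```

The average over all samples comes out at exactly the true count, so the estimator is unbiased; any single draw can still land some way off, as the 33 here does.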

  • @polyaddict
    @polyaddict 4 months ago +2498

    I love how British "they have a bit of a spy" is

    • @reidflemingworldstoughestm1394
      @reidflemingworldstoughestm1394 4 months ago +65

      It's not just a British thing. Sometimes I have myself a bit of a spy as well.

    • @diamondsmasher
      @diamondsmasher 4 months ago +75

      Personally, I have a bit of a lookey-loo

    • @dualcrocadile
      @dualcrocadile 4 months ago +20

      Sounds like a Karl Pilkington story

    • @Rubrickety
      @Rubrickety 4 months ago +9

      I'm glad I had a bit of a spy before making this exact same comment.

    • @bobknip
      @bobknip 4 months ago +8

      A bit of a stickybeak

  • @JS-mp7fy
    @JS-mp7fy 4 months ago +21

    I did this exact maths problem at high school in 1991, what a real blast from the past! Thank you!!!

  • @funnygeeks8126
    @funnygeeks8126 4 months ago +787

    1:22 "after the war, the allies can go into those tank making factories"
    I like how knowing if our math was right is more important than having won the war.

    • @trueriver1950
      @trueriver1950 4 months ago +11

      Of course!

    • @nekrataali
      @nekrataali 4 months ago

      The Cold War immediately took over international politics following WWII. Once the Allies realize the math was correct, it changes how they both conduct espionage and counteract it.

    • @michaelwright2986
      @michaelwright2986 4 months ago +43

      Always preparing for the next one.

    • @vinterskugge907
      @vinterskugge907 4 months ago +70

      @@michaelwright2986 Or, as they say, "The generals are always fully prepared for the previous war".

    • @mattiasthorslund6467
      @mattiasthorslund6467 4 months ago +38

      To the mathematicians, checking their math was the motivation to win the war

  • @user-fi4zi5il9z
    @user-fi4zi5il9z 4 months ago +12

    His enthusiasm is so contagious and it's so cool! The formula is surprisingly simple!

  • @fatsquirrel75
    @fatsquirrel75 4 months ago +1111

    Pointing out that lower numbers are more likely is such a good observation. Brady keeps highlighting his genius video after video.

    • @hylen26
      @hylen26 4 months ago +79

      I don't know about genius but he does ask some excellent questions.

    • @rjwiechman
      @rjwiechman 4 months ago +72

      As my late Father would have said, "Not cessinarily!". It is also true that more of the lower numbered tanks would have been destroyed or broken down and replaced and no longer in service.

    • @Yggdrasil42
      @Yggdrasil42 4 months ago +20

      Exactly. Another type of survivorship bias.

    • @freitchetsleimwor2406
      @freitchetsleimwor2406 4 months ago +5

      So the number line does not reflect a set of equally likely observations. Some of the serial numbers that are not yet observed are less likely to be observed than others.
      I think I am understanding this right, the not yet observed numbers between the maximum and minimum have a higher average probability of being observed than that of the numbers outside the bounds. And if the biases don't cancel each other out, the prediction is skewed. I'm sure this is a well known probability thing I'm just working this out

    • @boggisthecat
      @boggisthecat 4 months ago +5

      It’s a fairly obvious observation, I think. The mathematics being shown assumes that all objects appear at once, so no temporal complications. Presumably the mathematicians engaged in this work factor in the production dates where they were known.
      Another confounding problem is repair or rebuild. For example, Russia is taking old tanks and rebuilding them into modern configurations. So these tanks are not entirely produced from new - but serial numbering is going to be a mix of old and new, dependent upon components. (It’s very complicated in this case, because there are multiple variants and changes between foreign and domestic components. We know how many thermal sights Russia bought from Thales in France, but don’t know how many domestic equivalents are being produced, as an example. So if you get a Thales serial number it’s somewhat useful, but domestic ones require some time to aggregate the data. If you can’t capture enough data then it’s not going to work, but then there are other more obvious reasons for why the information isn’t necessarily helpful in this case.)
      ‘Spies’ typically rely upon stuff like observing rail shipments. This can be gamed (which Russia has a long history of doing, because they aren’t fools) to feed false information to your opponents, however. Serial numbers are much more solid, provided you can make sense of the systems being used. These are kept very secret, unsurprisingly.

  • @gustavakerman2566
    @gustavakerman2566 4 months ago +227

    Alternative title: Local British mathematician gets blindsided by sheer stupid luck

  • @MuffinsAPlenty
    @MuffinsAPlenty 4 months ago +770

    Watching James Grime explain mathematics is such a joy.

    • @perplexedon9834
      @perplexedon9834 4 months ago +12

      All my homies love James Grime

    • @fariesz6786
      @fariesz6786 4 months ago +9

      he's just that fun mixture of adorable, approachable, nerdy, and just proficient in his job

    • @stapler942
      @stapler942 4 months ago +5

      Due to Siivagunner I have this mental image of him approaching menacingly to tell me about *e*.
      But I agree, he is a joy to watch.

    • @derhesligebonsaibaum
      @derhesligebonsaibaum 4 months ago +3

      yeah, he always seems to have so much fun doing it

    • @warp9988
      @warp9988 4 months ago

      Making Math awesome.

  • @otaviodiniz5934
    @otaviodiniz5934 4 months ago +136

    Man, it's 11pm local time, I've been awake since 4am, my week was a rollercoaster, I'm mad about my job, I'm dealing with a woman who is getting on my nerves, my bank account is zeroed, I'm tired and pissed...
    But for some reason, his enthusiasm telling this story made me happy instantaneously.
    Thank you for this, God bless you and your beloved ones. Got a subscription.

    • @hydra8sk
      @hydra8sk 4 months ago +4

      Keep it up! Better times are ahead pal

    • @connorkapooh2002
      @connorkapooh2002 4 months ago +8

      Bro, in the future you will stumble upon your comment and you'll remember where you are at now in your life. You've made it this far, you'll keep going

    • @sirllamaiii9708
      @sirllamaiii9708 4 months ago +2

      You need money, brother? Any way I can help?

    • @arjanab6227
      @arjanab6227 4 months ago

      @@sirllamaiii9708 such a kind man you are, bless you sir

    • @panthermodern6572
      @panthermodern6572 4 months ago

      Hope you're doing better now. And even if you're not, it's all gonna be alright ;)

  • @jameswkirk
    @jameswkirk 4 months ago +684

    A company I worked for made computers & peripherals and used 64-bit random serial numbers. They had multiple manufacturing sites, and calculated that the odds of selecting two identical numbers were smaller than the odds of human bookkeeping errors when trying to coordinate multiple product lines.

    • @ragnkja
      @ragnkja 4 months ago +163

      So, like YouTube assigning video IDs, they decided that it was faster and more accurate to just check for duplicates, because the probability of the same number being assigned twice in the time it takes to check if it has already been used is extremely small.

    • @SaHaRaSquad
      @SaHaRaSquad 4 months ago +110

      @@ragnkja Even checking for duplicates would be unnecessary if cryptographic hashsums are used. The odds of getting randomly occurring collisions with them are so low that on average it would take much longer than the lifetime of the universe.

    • @Rivinwin
      @Rivinwin 4 months ago +14

      Lol, that's awesome. I love and hate it.

    • @Rivinwin
      @Rivinwin 4 months ago +51

      @@SaHaRaSquad Yah, treat a huge range of numbers as a domain, split it into segments and assign a segment to each factory, i.e. a 64-bit number where the top 3 or 4 bits are specific to each factory; increment the value at each factory independently of each other per product; assign a hash of that value as the product serial number 👍

    • @jurjenbos228
      @jurjenbos228 4 months ago +24

      Yep, if you use 64-bit numbers the probability of a single collision in the numbers starts to rise only after about 4 billion devices are manufactured. And even then: so what? Almost all numbers are unique.
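
The "about 4 billion" figure in this thread matches the usual birthday-problem rule of thumb: collisions become likely once the number of items nears the square root of the ID space, and the square root of 2^64 is 2^32, roughly 4.3 billion. A minimal sketch using the standard approximation (the function name and the sample counts are chosen here for illustration):

```python
import math

def collision_probability(n_items, bits=64):
    """Approximate chance that n_items random IDs of `bits` bits contain a repeat."""
    space = 2 ** bits
    # Birthday approximation: 1 - exp(-n(n-1) / (2 * space)).
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))

for n in (10**6, 10**8, 2**32, 10**10):
    print(f"{n:>14,d} devices: {collision_probability(n):.3g}")
```

For production runs in the millions the probability is tiny, which fits the company's reasoning that bookkeeping mistakes were the bigger risk.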

  • @b1oodzy
    @b1oodzy 4 months ago +102

    I thought I was smart with my calculation of (1+15+16+23+30)/5x2 = 34 but this guy pulls out a giant sheet of paper and introduces probabilities.

    • @Ryanmathewsc
      @Ryanmathewsc 3 months ago +14

      My mind went to the same place. As the sample size increases, the average should approach the median number. I wonder if the methods in the video offer a meaningful improvement over simply doubling the observed average.

    • @_..-.._..-.._
      @_..-.._..-.._ 2 months ago +2

      The x2 part didn’t make sense to me hmm 🤔

    • @b1oodzy
      @b1oodzy 2 months ago +15

      @@_..-.._..-.._ The first part of the equation calculates the average which is 17. To calculate the maximum you'd need to do x2 to get 34.

    • @Exaspatial
      @Exaspatial 2 months ago

      Same here

    • @felipea.barretto7503
      @felipea.barretto7503 2 months ago +2

      I did the same thing except I subtracted 1 to estimate 33. My reasoning is that if we have N tanks, all with equal probabilities, the expected average of the distribution is 1/N * (sum of 1 to N) = (N+1)/2 . Estimating this with the sample average μ, you get N = 2μ-1, which is why I subtracted the one.
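
On the question raised earlier in this thread, whether the method in the video meaningfully improves on doubling the average, the standard variance formulas for the two unbiased estimators give a direct comparison. Here $N$ is the true count, $k$ the sample size, $\bar{x}$ the sample mean and $m$ the sample maximum; the subscripted names are labels chosen for this note.

```latex
\hat{N}_{\text{mean}} = 2\bar{x} - 1, \qquad
\operatorname{Var}\bigl(\hat{N}_{\text{mean}}\bigr) = \frac{(N-k)(N+1)}{3k}

\hat{N}_{\text{gap}} = m + \frac{m}{k} - 1, \qquad
\operatorname{Var}\bigl(\hat{N}_{\text{gap}}\bigr) = \frac{(N-k)(N+1)}{k(k+2)}

\frac{\operatorname{Var}\bigl(\hat{N}_{\text{mean}}\bigr)}{\operatorname{Var}\bigl(\hat{N}_{\text{gap}}\bigr)} = \frac{k+2}{3}
```

With $k = 5$ the ratio is $7/3$, and it grows with the sample size, so the gap-based estimate tightens faster as more serial numbers are observed.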

  • @art1099
    @art1099 4 months ago +4606

    No War Thunder sponsor? Missed opportunity

    • @Nick-the-fox
      @Nick-the-fox 4 months ago +99

      This is targeting a different audience
      It's like an Opera GX sponsor on a non-gamer channel

    • @williamnathanael412
      @williamnathanael412 4 months ago +35

      What is war thunder

    • @Sjobling
      @Sjobling 4 months ago +326

      @@williamnathanael412 If you'd typed that into Google instead of the YouTube comments, you'd have an answer immediately. But now, you have a sarcastic response 7 minutes later instead.

    • @serinat_1408
      @serinat_1408 4 months ago +72

      Right here I have a bag of German tanks! Do you know where you can also find German tanks? WAR THUNDER!!!!

    • @alexscriabin
      @alexscriabin 4 months ago +19

      @@Nick-the-fox Dude, what is an "anti-gamer channel"? Is it just one that reports on game devs being overworked at FromSoft or that was anti-Gamergate ten years ago?

  • @caiocc12
    @caiocc12 4 months ago +5

    There's a thing called "fixed-format cryptography" which can be used to make sequential numbers look random. The nice thing about it is that the encrypted number is in the same domain as the plain number (i.e. the original numbers range from 0 to say, 1 million, the encrypted numbers will also be in that range), so the attacker doesn't know they are encrypted and thinks it's just a plain sequential number. I've used that to protect against brute-forcing IDs on a system, while keeping the IDs short enough to be encoded as a barcode
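
The technique described in this comment is usually called format-preserving encryption. The sketch below is a toy illustration of the idea and not the commenter's system: it builds a keyed permutation of 0..999,999 from a small Feistel network and "cycle-walks" any result that falls outside the range. The key, round count, and function names are assumptions, and a production system would more likely use a vetted construction such as NIST's FF1.

```python
import hashlib
import hmac

DOMAIN = 1_000_000          # IDs range over 0 .. 999_999
HALF_BITS = 10              # 2 * 10 bits covers 0 .. 1_048_575 >= DOMAIN
HALF_MASK = (1 << HALF_BITS) - 1
ROUNDS = 8

def _round_value(key, round_no, half):
    """Keyed pseudo-random function for one Feistel round."""
    msg = bytes([round_no]) + half.to_bytes(2, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big") & HALF_MASK

def _encrypt_once(key, value):
    left, right = value >> HALF_BITS, value & HALF_MASK
    for r in range(ROUNDS):
        left, right = right, left ^ _round_value(key, r, right)
    return (left << HALF_BITS) | right

def _decrypt_once(key, value):
    left, right = value >> HALF_BITS, value & HALF_MASK
    for r in reversed(range(ROUNDS)):
        left, right = right ^ _round_value(key, r, left), left
    return (left << HALF_BITS) | right

def encrypt_id(key, plain):
    """Map a sequential ID to a scrambled ID in the same range."""
    if not 0 <= plain < DOMAIN:
        raise ValueError("ID out of range")
    value = _encrypt_once(key, plain)
    while value >= DOMAIN:          # cycle-walk back into the domain
        value = _encrypt_once(key, value)
    return value

def decrypt_id(key, scrambled):
    """Recover the sequential ID from a scrambled one."""
    if not 0 <= scrambled < DOMAIN:
        raise ValueError("ID out of range")
    value = _decrypt_once(key, scrambled)
    while value >= DOMAIN:
        value = _decrypt_once(key, value)
    return value

key = b"demo key, not a real secret"
print([encrypt_id(key, i) for i in range(5)])
```

Because the mapping is a permutation of the same range, consecutive counters come out as distinct, unrelated-looking IDs of the same length, which is what keeps them barcode-friendly.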

  • @mark97199
    @mark97199 4 months ago +1060

    This only works if the serial numbers are sequential. Knowing this, the US named the third SEAL team "SEAL Team 6" to confuse Soviet intelligence.

    • @penfold-55
      @penfold-55 4 months ago +121

      And if you know where they start. For example, if the serial number was a date, this just wouldn't work (even though the numbers are sequential, they are not consecutive)

    • @AbstruseJoker
      @AbstruseJoker 4 months ago +69

      Dates would still reveal some info about how many tanks there are

    • @chickenwheel45
      @chickenwheel45 4 months ago +48

      He mentions that there's an encoding on top of this

    • @Sp4mMe
      @Sp4mMe 4 months ago +32

      Yeah, real world probably has a lot of further problems. Like what if one month all new tanks go to front X, one month they all go to front Y, and your information and rate of capture/observation is different, for example ... ?
      But then, you might also have some rough indications from observation planes or train schedules or something that might help correlate some gaps in your data. Of course, there might also be decoys and whatnot ... well, I'm sure a lot can be done there.

    • @BenjaminGatti
      @BenjaminGatti 4 months ago +18

      Serial numbers are by definition a subset of a series. You need to know the series.

  • @Wagon_Lord
    @Wagon_Lord 4 months ago +5

    I heard this story ages ago, but never understood how it worked. That "flipping the number line around" line makes so much sense; so simple once the trick's revealed. Lovely!

  • @EchosTackyTiki
    @EchosTackyTiki 4 months ago +232

    In arms production it's fairly common for factories to assign serial number ranges to particular products in advance, so the serial number ranges having gaps within them is relatively normal. It's also normal for them to start production at something like 10,000 if they expect to make in the tens of thousands of that particular item; that way all the items are serialized, but they also maintain the same number of digits in their serial numbers for uniformity without using a bunch of leading zeros. Overrunning that serial range usually results in a letter prefix or suffix being added.

    • @halfsourlizard9319
      @halfsourlizard9319 4 months ago +3

      By what metric is that better than using leading zeros? Or, why the aversion to leading zeros? (Also, why not just use GUIDs? Fixed size, convey identity but no other information, never going to run out.)

    • @mnxs
      @mnxs 4 months ago +14

      @@halfsourlizard9319 As for the GUIDs, because the use of serial numbers for arms predates the invention of GUIDs by 100+ years. So, in other words, tradition - why change when you already have a perfectly workable scheme.

    • @cidiousblack2136
      @cidiousblack2136 4 months ago

      @@halfsourlizard9319 When creating records people will often omit leading zeros when recording numbers, possibly out of laziness, possibly by convention. Forcing the leading digit to be a non-zero digit prevents this deletion from happening.
      Why care about leading zeros? The zeros still have meaning. For instance, the number of digits present can be helpful in indicating that a number in a record is a serial number specifically. Further, whenever number codes get concatenated it's important not to omit digits or this will change the shape of the number code, e.g. if the serial number were a concatenation of year-month-number. Granted, concatenated codes should be dash-separated or similar. But if we can't trust the clerk to put the leading zeros on the number, why would I trust the clerk to bother writing dashes between numbers?

    • @AmiiboDoctor
      @AmiiboDoctor 4 months ago +1

      It's normal now... but it wasn't normal then

    • @gaiamission7200
      @gaiamission7200 4 months ago

      @@AmiiboDoctor It was more normal then, actually. Sequential serialization is fairly rare

  • @eshed
    @eshed 4 months ago +20

    I used this method with serial numbers of accordions made in the late 30s by Hohner, a German company. Now I have a spreadsheet named "The German Accordion Problem" with more than 150 rows.

    • @dragoncurveenthusiast
      @dragoncurveenthusiast 4 months ago

      Cool!
      So, how many did they produce per month?

    • @eshed
      @eshed 4 months ago +2

      @@dragoncurveenthusiast
      Unfortunately they didn't mark the month in the serial number, but fortunately they didn't restart every month either.
      That means I could estimate the total number of accordions with serial numbers between 1934 and 1940 at around 860,000.

    • @crownhouse2466
      @crownhouse2466 4 months ago +1

      @@eshed That's a lot of accordions

    • @JtotheAKOB
      @JtotheAKOB 3 months ago +1

      @@eshed Are you sure they did not encode them, so that competing accordion producers could not estimate the number of accordions? :P

    • @eshed
      @eshed 3 months ago

      @@JtotheAKOB I'm relatively certain.
      Out of the 150, I have ~20 serial numbers for which I also know the actual production date. If you plot the numbers vs the dates, you get a lovely almost linear (R^2=0.995) graph. The only way I can think of to get this relationship while preventing accurate estimates, would be to randomly skip numbers with a constant probability.
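
The check described here, plotting serial numbers against known production dates and looking at R^2, can be sketched with an ordinary least-squares fit. The records below are made up purely for illustration and are not the commenter's data; the function name is also an assumption.

```python
from datetime import date

def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Made-up (production date, serial number) pairs, for illustration only.
records = [
    (date(1935, 3, 1), 120_500),
    (date(1936, 1, 15), 231_000),
    (date(1937, 6, 10), 418_200),
    (date(1938, 9, 5), 574_900),
    (date(1939, 11, 20), 731_400),
]
days = [d.toordinal() for d, _ in records]
serials = [s for _, s in records]
slope, intercept, r2 = linear_fit(days, serials)
print(f"about {slope:.0f} serial numbers per day, R^2 = {r2:.4f}")
```

An R^2 close to 1 suggests serials were issued at a steady rate, which is the pattern one would expect from a plain running counter.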

  • @K_Forss
    @K_Forss 4 months ago +254

    My immediate thought was that the average of a random subset should be the same as the average of the whole, so the number of tanks should be twice the mean of the picked ones 2*(1+15+16+23+30)/5=34 for the first pick and 2*(3+10+15+18+24)/5=28 for the second. My guess is that they used multiple estimate methods and weighted the results depending on inherent uncertainties/errors of the methods

    • @sanandanojha2988
      @sanandanojha2988 4 months ago +21

      Yeah, that's exactly what I was thinking! Although, I suppose that it might be more susceptible to outliers than the average distance method...

    • @journeymantraveller3338
      @journeymantraveller3338 4 months ago +18

      Same argument applies to the median. You can also get 95% confidence intervals for the mean and the median.

    • @Mayur7Garg
      @Mayur7Garg 4 months ago

      Why twice?

    • @yurie2388
      @yurie2388 4 months ago +12

      @@Mayur7Garg The average is roughly half of the total since you have both low and high numbers. Average tries to arrive at the middle point of the number set when all the numbers are unique and in series.
      (1+15+16+23+30)/5=17, which we know is too little since we have the number 30 in the series.

    • @Mayur7Garg
      @Mayur7Garg 4 months ago +8

      @@yurie2388 Basically it stems from the fact that the median and the mean would be identical for such a series. So if you know the mean, then you can use it like a median to assume that the final number is at twice the distance. But in that case, using the median in the first step directly is more appropriate. Also, one issue that I have with all these solutions including the one in the video is that they do not seem to work if the serial numbers do not start from 1 but from let us say 100.
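
The last point in this thread, serial numbers that start at some unknown value such as 100, can be handled by using only the spread between the smallest and largest observation. The sketch below is one standard way to do that; the function name is an assumption, and it presumes consecutive serials sampled without replacement.

```python
def estimate_count_unknown_start(sample):
    """Estimate how many serials exist when the first serial is unknown.

    Uses only the spread between the smallest and largest observation, so
    shifting every serial by a constant does not change the answer.
    """
    k = len(sample)
    if k < 2:
        raise ValueError("need at least two observations")
    spread = max(sample) - min(sample)
    return spread * (k + 1) / (k - 1) - 1

print(estimate_count_unknown_start([1, 15, 16, 23, 30]))        # 42.5
print(estimate_count_unknown_start([101, 115, 116, 123, 130]))  # 42.5
```

The price of not knowing the starting point is a noisier estimate: on the first draw from the video it gives 42.5, further from the true 30 than the methods that assume the serials start at 1.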

  • @stco2426
    @stco2426 4 months ago +6

    Cool. When I was studying population biology we were given a task to work out the number of taxis in a city and we used the capture, mark, recapture method, using the taxi number, rather than marking anything. So, just noting the numbers in a given time (capture and 'mark') and then noting the numbers in a given period, which was later (recapture v not seen before). There are all sorts of sample to population complexities and improvements to the estimate with longer observations (but issues with recounts if the obs period is too long). Also, an improvement if a third count period is used.
    I wonder if there are any seminal capture, mark recapture examples that Numberphile might comment on and re-create on brown paper?
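
The capture, mark, recapture idea in this comment is commonly computed with the Lincoln-Petersen estimate, or with Chapman's small-sample variant. A minimal sketch follows; the taxi numbers are hypothetical and the function names are chosen for this note.

```python
def lincoln_petersen(first_pass, second_pass):
    """Classic capture-recapture estimate of a population size."""
    first, second = set(first_pass), set(second_pass)
    recaptured = len(first & second)
    if recaptured == 0:
        raise ValueError("no overlap between passes; cannot estimate")
    return len(first) * len(second) / recaptured

def chapman(first_pass, second_pass):
    """Chapman's variant, which stays finite even with no recaptures."""
    first, second = set(first_pass), set(second_pass)
    recaptured = len(first & second)
    return (len(first) + 1) * (len(second) + 1) / (recaptured + 1) - 1

# Hypothetical taxi numbers noted in two observation periods.
morning = [12, 47, 88, 103, 150, 201, 230, 305]
evening = [47, 66, 103, 119, 230, 277, 310, 305, 400, 12]
print(lincoln_petersen(morning, evening), chapman(morning, evening))
```

Unlike the serial-number approach, this needs no assumption that the taxi numbers are consecutive; it only needs the two observation periods to sample the same population.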

  • @Limrasson
    @Limrasson 4 months ago +806

    His reaction to tank 30 immediately raised suspicion and I would have said "yeah, that's 30 tanks in the bag."

    • @dewhi100
      @dewhi100 4 months ago +62

      Yep "Tank 30, oh, hmm, interesting..."

    • @PixelPhobiac
      @PixelPhobiac 4 months ago +2

      🤣

    • @Alex-ff8si
      @Alex-ff8si 4 months ago +2

      300th like

    • @roffie
      @roffie 4 months ago +2

      30 got the dinks

    • @cubexyz199
      @cubexyz199 4 months ago +4

      I'm on the spectrum and I still cannot see it

  • @Ring_Zero
    @Ring_Zero 4 months ago +5

    We're using similar techniques with serial numbers to investigate production numbers for relatively rare camera models from the early 1970s.

  • @EXPLICITBG
    @EXPLICITBG 4 months ago +296

    Tanks for sharing

    • @volodyadykun6490
      @volodyadykun6490 4 months ago +2

      You know destroyers for bases, get ready for

    • @cubes_art7956
      @cubes_art7956 4 months ago +1

      Came here to say this.

    • @myc0p
      @myc0p 4 months ago +10

      I would like to extend my tanks to Ukraine 🇺🇦

    • @talananiyiyaya8912
      @talananiyiyaya8912 4 months ago +1

      Thanks*

    • @EXPLICITBG
      @EXPLICITBG 4 months ago

      @@talananiyiyaya8912 r/woosh

  • @Robi2009
    @Robi2009 1 month ago +2

    11:57 - the other problem would be the oldest tanks (i.e. built pre-1939) were either destroyed, removed from service or rebuilt into something else (like AA or AT platform) by the end of war

  • @MichaelDoornbos
    @MichaelDoornbos 4 months ago +61

    I love the "German Tank Problem." There's a great video on YouTube showing this method of counting the Commodore 1571 Disk Drives. Using this technique for "other real-world problems" is a fun exercise.

    • @Grunchy005
      @Grunchy005 4 months ago +9

      Upvote for Commodore 1571

    • @thekinginyellow1744
      @thekinginyellow1744 4 months ago +1

      Wow, not even from 8-bit guy!

    • @LuisRamos-jg1gf
      @LuisRamos-jg1gf 4 months ago

      What's the video called? 😊

    • @ampulka
      @ampulka 4 months ago +1

      found it: "How Many Commodore 1581 Disk Drives? The German Tank Problem"

  • @JackGremlin
    @JackGremlin 3 months ago +2

    I've done nothing but fail math all my life yet I find this video interesting enough to take notes and watch twice.

  • @jamesterwilliger3176
    @jamesterwilliger3176 4 months ago +560

    Spies be like "tank you very much" but the mathematicians be like "tanks but no tanks"

  • @svenlima
    @svenlima 4 months ago +3

    It's the same question we posed as kids: "How do you count a herd of sheep?" - "You count the legs and divide the number by 4." At the time we found that funny.

  • @macdofglasgow772
    @macdofglasgow772 4 months ago +61

    Excellent. I did laugh at the #1 and #30 thing. Always like Dr Grimes in these videos, I could listen to him just tell me interesting stuff all day.

    • @TheEvilCheesecake
      @TheEvilCheesecake 4 months ago

      It's just the one Grime actually

    • @chriswebster24
      @chriswebster24 4 місяці тому

      He was probably talking about him and his brother, together, the Dr. Grimes. His brother is a gynecologist.

  • @davidhershberger3673
    @davidhershberger3673 2 місяці тому

    (I apologize if this has been asked already!)
    Is there any benefit or issue (other than reducing the sample size) in setting aside the highest and lowest observations, to address the concern that you happened to pick the highest and lowest, and how does that affect the average gap? If you set aside the highest (30) and lowest (1) in your first go-around, you would just have 15, 16, 23. Gaps are 14, 0, and 6. Average gap = 6-2/3. If you rounded to a whole number, you'd get an estimated amount of 30.
    In the second example, you had 3, 10, 15, 18, and 24. Eliminating the highest and lowest leaves 10, 15, and 18. Avg gap = 5. Estimated max of 23 (18+5).
    Just having 3 used observations (4 gaps) makes for instability due to the small sample, but it seems to alleviate the concern that you may have observed the max without knowing it.
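
    A minimal sketch of the trimming idea above, assuming the bag from the video (30 tanks, 5 pulls) and the max-plus-average-gap formula m + m/k - 1. The function names are illustrative. It reproduces the two hand calculations and then averages the trimmed estimate over every possible 5-tank draw, to see whether discarding the extremes pulls the estimate down on average.

```python
from itertools import combinations

def average_gap_estimate(sample):
    """Maximum plus average gap: m + m/k - 1."""
    m, k = max(sample), len(sample)
    return m + m / k - 1

def trimmed_estimate(sample):
    """Drop the lowest and highest observation, then apply the same formula."""
    inner = sorted(sample)[1:-1]
    return average_gap_estimate(inner)

# The two draws from the video, with their extremes set aside.
print(trimmed_estimate([1, 15, 16, 23, 30]))
print(trimmed_estimate([3, 10, 15, 18, 24]))

# Average of the trimmed estimate over every possible draw of 5 from 30.
n_tanks, k = 30, 5
estimates = [trimmed_estimate(s) for s in combinations(range(1, n_tanks + 1), k)]
average_trimmed = sum(estimates) / len(estimates)
print(average_trimmed)
```

    Because the trimmed sample's maximum is really the second-largest observation, the long-run average lands below the true 30, so the trimmed version trades one worry for a systematic underestimate.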

  • @molieros
    @molieros 4 місяці тому +180

    James: There are 30 German tanks in the bag.
    Chuikov: We were aware of that.

    • @Alex-ff8si
      @Alex-ff8si 4 місяці тому +1

      50th like + first reply

    • @rogerxiao4458
      @rogerxiao4458 4 місяці тому +3

      Krebs: That seems unlikely.
      (Downfall movie reference if you don't get it.)

    • @TheBrad574
      @TheBrad574 4 місяці тому

      Someone read Cornelius Ryan's The Last Battle and his interview with Chuikov.
      I just noticed someone mentioned Downfall too. The book is the source material.

  • @lindhe
    @lindhe 4 місяці тому +1

    James is so good! Always a great video when he's in. Also: he always looks happy, even when picking bad samples.

  • @reedjasonf
    @reedjasonf 4 місяці тому +266

    The disgust in Dr. Grime's voice at 2:24 when he says "I'm NOT going to let you feel the weight of the bag! [Are you daft?]"

    • @reidflemingworldstoughestm1394
      @reidflemingworldstoughestm1394 4 місяці тому +40

      And rightfully so. Who gets to heft a German tank factory during a war?

    • @robinsparrow1618
      @robinsparrow1618 4 місяці тому +10

      the time code you put is after the moment you're talking about

    • @WofWca
      @WofWca 4 місяці тому +9

      2:16

    • @PeterNjeim
      @PeterNjeim 4 місяці тому +17

      ​@@robinsparrow1618this is a common phenomenon I've seen over the years. Someone will watch the video, after watching a funny part, they click pause, then copy the timestamp, forgetting that this time stamp is after the clip

    • @hdbrot
      @hdbrot 4 місяці тому +3

      ⁠​⁠​⁠@@PeterNjeimMaybe OP edits it in. Let‘s hope for the best :)

  • @RichardJBarbalace
    @RichardJBarbalace 4 місяці тому +4

    I think there may be a simpler and more accurate way to do the estimation. My first thought gave estimates of 34 and 28 for the two trials, beating Brady's estimates of 35 and 27.8 both times compared to the actual number 30. Assuming "everything is equal and random" (i.e., a uniform distribution), just take the average of the tank numbers and double it. This also balances all the potential gaps.

    • @ChristopheSmet123321
      @ChristopheSmet123321 4 місяці тому +1

      That is certainly a valid method as well, also unbiased (meaning on average you will be spot on). However, the "maximum plus average gap" method is more efficient, i.e., it has a lower mean squared error: the squared difference to the actual N will on average be smaller than using your method. And that is what you want from an estimator!
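
      The efficiency claim can be checked by brute force. The sketch below assumes the video's setup (30 tanks, 5 pulls, serial numbers starting at 1) and uses the twice-the-mean-minus-one variant so that both estimators are unbiased. It enumerates every possible draw and reports each estimator's bias and mean squared error. Function names are illustrative.

```python
from itertools import combinations

def max_plus_average_gap(sample):
    """Maximum plus average gap: m + m/k - 1."""
    m, k = max(sample), len(sample)
    return m + m / k - 1

def double_mean_minus_one(sample):
    """Twice the sample mean, minus one."""
    return 2 * sum(sample) / len(sample) - 1

def bias_and_mse(estimator, n_tanks, k):
    """Average error and average squared error over all draws of k from 1..n_tanks."""
    errors = [estimator(s) - n_tanks for s in combinations(range(1, n_tanks + 1), k)]
    bias = sum(errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return bias, mse

bias_gap, mse_gap = bias_and_mse(max_plus_average_gap, 30, 5)
bias_mean, mse_mean = bias_and_mse(double_mean_minus_one, 30, 5)
print("max + average gap:", bias_gap, mse_gap)
print("2 * mean - 1:     ", bias_mean, mse_mean)
```

      Both biases come out at zero (up to floating-point rounding), while the squared error of the mean-based estimator is more than twice as large for this bag size.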

  • @TheDuckofDoom.
    @TheDuckofDoom. 4 місяці тому +29

    I just tell the german book keeper that I think his records are sloppy, and he shows me all of his work to prove me wrong.

    • @MrZauberelefant
      @MrZauberelefant 4 місяці тому

      That was in a movie, wasn't it?

    • @lukasskymuh5910
      @lukasskymuh5910 4 місяці тому +3

      This would never work! .... not unless he plays war thunder...

  • @1_in_8billion
    @1_in_8billion Місяць тому

    Hey everyone, I just started learning how to use Octave and just for kicks I made a program to do this very estimation. (Thanks for sharing Numberphile, this is really neat stuff!) Here's the *script* if anyone wants to fiddle around with it: (I've added a percent error so you can see just how remarkably accurate this estimation is!)
    actualNumberOfTanks = ceil(rand*1000);
    disp(["actual number of tanks: ", num2str(actualNumberOfTanks)]);
    % never pull more tanks than the bag holds
    numberOfPicks = min(ceil(rand*100), actualNumberOfTanks);
    disp(["number of picks: ", num2str(numberOfPicks)]);
    % draw distinct serial numbers (sampling without replacement)
    tankNumberPicks = randperm(actualNumberOfTanks, numberOfPicks);
    disp("tanks randomly selected: ");
    disp(tankNumberPicks);
    % estimate = max + average gap = max + (max - picks)/picks
    maxPick = max(tankNumberPicks);
    estimatedNumberOfTanks = maxPick + (maxPick - numberOfPicks) / numberOfPicks;
    disp(["Estimated number of tanks: ", num2str(estimatedNumberOfTanks)]);
    percentError = round(((estimatedNumberOfTanks - actualNumberOfTanks)/actualNumberOfTanks)*100);
    disp(["percent error: ", num2str(percentError)]);
    %;D

  • @S1nwar
    @S1nwar 4 місяці тому +110

    4 8 15 16 23 42... he literally started drawing half the LOST numbers i was on the edge of my seat

    • @GunNNife
      @GunNNife 4 місяці тому +26

      Using the formula from this video, the Lost tank bag has 48 tanks.

    • @gabor6259
      @gabor6259 4 місяці тому

      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42
      4 8 15 16 23 42

    • @ethanbuttimer6438
      @ethanbuttimer6438 4 місяці тому +2

      Haha me too I literally just finished watching an episode

    • @Cerzus
      @Cerzus 4 місяці тому +3

      I was looking for this comment

    • @FilmscoreMetaler
      @FilmscoreMetaler 4 місяці тому

      ​@@GunNNife Too bad it's not 108

  • @nomansbrand4417
    @nomansbrand4417 4 місяці тому +1

    Wondering about the statement about N=max yielding the highest probability for a given distribution. Wouldn't you have to go for the *expected value* of N?
    Assuming an equal distribution for N>=max, you'd end up with N_expected = 30.6 and 38.6 for the seen examples. In the limiting case for high N, N_expected simply seems N/(pulls-2) larger than max.

  • @AloisMahdal
    @AloisMahdal 4 місяці тому +9

    I keep coming back to the Brady's question at 11:41 -- if in my distribution, lower numbers are more likely, would there be an easy correction for that?

    • @forasago
      @forasago 4 місяці тому

      You would have to come up with a formula for how much more likely the lower numbers are and calculate some kind of upward bias out of that. I don't think the answer could be considered "easy", no.

    • @ArcaneOath
      @ArcaneOath 3 місяці тому +3

      For the purposes of war estimations, I suspect you'd find the opposite true, particularly as time goes on - the data would become skewed towards newer serials for everything, as older models were destroyed or made inoperable.
      Probably best to hash military serial numbers at manufacture time though, regardless.

  • @adrianv.v.4445
    @adrianv.v.4445 2 місяці тому +2

    When calculating the average gap, you should just count the difference between tanks (e.g., between 15 and 1, count that as 14). That way, what you get is the actual (asymptotically) un-biased estimator of the number of tanks. If we do it your way, we get: MAX + (MAX - k)/k = MAX * (1+1/k) - 1, which when we let k->infinity, MAX->Actual_value and therefore we get the Actual_value - 1. It can be also proven that the estimator is biased for any k, outputting smaller values than the real one.
    If we don't add that -k to the formula (that is, we count the gaps as just the difference), the estimator we get is MAX + MAX/k = MAX * (1+1/k), which is the actual un-biased estimator we should use in this case. One may call this the adjusted Maximum Likelihood Estimator (MLE). As you said in the video, the MLE is just the MAX (the number most likely to be right), but it is biased. What we did with this trick was, as you explained, correct it.
    A more standardized way to compute this correction would have been to calculate the Expected Value of the MLE we got, to then apply the necessary multiplicative correction. That is, if it is necessary at all (MLE might as well be unbiased itself). This is one of the most used methods for estimating stuff out in the real world (when we are able to get an MLE).

    • @rennleitung_7
      @rennleitung_7 11 днів тому

      I agree, the definition of the distance looked a bit fishy to me. But as k infinity when k -> infinity. So I think, you might need a more sophisticated reason to make your point.

  • @bryan-nz
    @bryan-nz 4 місяці тому +36

    Have you ever done a video on "Hyper Log Log"? We use it in massive data systems for efficiently estimating the number of unique values. It is very interesting, and freakily accurate.

    • @EricKay_Scifi
      @EricKay_Scifi 4 місяці тому

      I've used that in BigQuery. APPROX_COUNT_DISTINCT is great for figuring out new data.

  • @kleddit6400
    @kleddit6400 Місяць тому +13

    1:51 “Is that a German tank or?” *every tank enthusiast goes oof*

  • @Canzandridas
    @Canzandridas 4 місяці тому +6

    Somewhere deep within my brain I'm pleased with this video because Dr Grime always reminds me of the young folk who went to ww2 saying they were adults when they weren't and this video is about tanks

  • @ritual_aftermath
    @ritual_aftermath 25 днів тому +1

    If I would have had a teacher like this when I was young, I'd likely have become a mathematician! Great video, thank you!

  • @mladengavrilovic8014
    @mladengavrilovic8014 4 місяці тому +51

    it would also make sense to calculate the average of the samples and multiply it by 2 as the average of consecutive numbers starting at 1 would be about n/2 and the average of the samples would also approach the same value.

    • @timseguine2
      @timseguine2 4 місяці тому +17

      Close. The average of the observations is an estimator of the mean of the serial numbers in the bag. You got that much right. But the average serial number is (n+1)/2. So you have to double it and then subtract one.

    • @raiseer
      @raiseer 4 місяці тому +5

      Was my first idea, too. They did basically the same with extra steps :)

    • @rianfelis3156
      @rianfelis3156 4 місяці тому +6

      The reason for those extra steps is that you usually have padding around the serial numbers, like just start counting at 1500 because the 15 means something else, and the last two digits are sequential. Which they did touch on, but not a lot.

    • @halbronk7133
      @halbronk7133 4 місяці тому +10

      This is the method I thought of too, but it turns out that the numbers you find other than the max aren't relevant. However many tanks there are, finding 1, 2, 3, 4, and 30 is the same as finding 26, 27, 28, 29, and 30 (as long as the serial numbers start at 1).

    • @timseguine2
      @timseguine2 4 місяці тому +6

      @@halbronk7133 "the numbers you find other than the max aren't relevant": This isn't precisely true. They are relevant in the sense that they produce a valid estimate for the maximum. The problem is that it ignores relevant information that we know about the problem (that the numbers are sequential without gaps). And usually when you don't use some piece of information to derive your answer then it is possible to do better.

  • @jbeckh2
    @jbeckh2 4 місяці тому

    This episode was great. If you come across more war history examples, please post them. My son loves war history and was fascinated by this. This helps to understand why math is important.

  • @brmolnar
    @brmolnar 4 місяці тому +45

    Seal Team 6 is named that to imply that there are at least 5 other Seal Teams. At least this is the common rumor.

    • @Laotzu.Goldbug
      @Laotzu.Goldbug 4 місяці тому +11

      This is actually true (at least according to Richard Marcinko's autobiography). Now presently there are well over six SEAL Teams (8?) but when Marcinko created a specialist SEAL unit in 1980 there were only two other ones, and "Seal Team 6" was a deliberate attempt at deceiving the Soviets.

  • @indranilroy4822
    @indranilroy4822 2 місяці тому

    I always find it fascinating how these equations can be derived after rigorous application of a simple general concept, like at the beginning of the video you can feel that the frequency of smaller numbers (hence more smaller gaps) would affect the estimate but the quantifying part takes time to visualize in its precise form

  • @LudicrousTachyon
    @LudicrousTachyon 4 місяці тому +10

    For electronics with network cards, companies are assigned ranges of MAC addresses as they are supposed to be universally unique. The range could allow one to estimate the number of devices they sell.

    • @trueriver1950
      @trueriver1950 4 місяці тому +2

      Life, including the Y-T algorithm, is strange indeed

    • @stargazer7644
      @stargazer7644 2 місяці тому

      The operative words here are "supposed to be". And nobody says they have to be assigned sequentially. Each organizationally unique identifier (OUI) can create 16 million unique MAC addresses. And you can have more than one OUI.

  • @RGD2k
    @RGD2k 4 місяці тому +1

    Murphy's original utterance: "I swear, if there is a wrong way to do something, we will find it"
    Sod: Murphy was an optimist.

  • @romansanders
    @romansanders 4 місяці тому +12

    Apple serial numbers were sequential until about 5 years ago. They even contained information about which factory produced the item and when.

    • @MrZauberelefant
      @MrZauberelefant 4 місяці тому +5

      They still should, trackability is vital information.

  • @RAFAELSILVA-by6dy
    @RAFAELSILVA-by6dy Місяць тому

    I got a solution to this problem in a maths challenge about eight years ago. My approach used conditional probabilities and the expected number of tanks in the bag would be:
    N = MAX(k -1)/(k-2)
    For five observations (k = 5), this gives N = MAX(4/3). This is higher than the average gap approach, which gives N = MAX(6/5) - 1
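
    To put numbers on that comparison, here is a small sketch that evaluates both formulas exactly as written in this comment, for the two five-tank draws from the video (maxima 30 and 24). The function names are illustrative, and the conditional-probability formula is taken from the comment as stated, not re-derived.

```python
from fractions import Fraction

def conditional_expectation_estimate(m, k):
    """The formula quoted above: MAX * (k - 1) / (k - 2); requires k > 2."""
    return Fraction(m * (k - 1), k - 2)

def average_gap_estimate(m, k):
    """The average-gap formula from the video: MAX * (k + 1) / k - 1."""
    return Fraction(m * (k + 1), k) - 1

for m in (30, 24):  # maxima of the two draws shown in the video
    high = conditional_expectation_estimate(m, 5)
    low = average_gap_estimate(m, 5)
    print(m, float(high), float(low))
```

    For k = 5 the first formula gives 40 and 32, against 35 and 27.8 for the average-gap approach, so it sits consistently higher.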

  • @aleksihermonen9017
    @aleksihermonen9017 4 місяці тому +61

    I was thinking about taking the average and doubling it. The idea being that the average would be approximately in the middle of the true number, so double the average would be close to the true number.

    • @PsychoMuffinSDM
      @PsychoMuffinSDM 4 місяці тому +5

      That's what I did, lol.

    • @xerkules2851
      @xerkules2851 4 місяці тому +4

      Same here. That method gives very similar estimates in these examples.

    • @TomVennix
      @TomVennix 4 місяці тому +8

      I think you can improve this estimate by subtracting 1 at the end, since the average of the numbers 1 up to and including N is (N+1)/2 rather than N/2. Denoting the sample average by X, your idea is that X should be approximately equal to (N+1)/2, which would imply that N is approximately equal to 2X-1.
      I'm actually curious to see how this performs (in general) compared to the method presented in the video.

    • @akshaj7011
      @akshaj7011 4 місяці тому

      That wouldn't work if the serial numbers didn't start from 1

    • @aleksihermonen9017
      @aleksihermonen9017 4 місяці тому +4

      @@akshaj7011 That's true, but the average gap wouldn't work either if it takes into account the gap from 0 to the first element.
      If the starting point would be unknown, i would probably use standard deviation in the same manner.

  • @sabinrawr
    @sabinrawr 3 місяці тому

    Brady's final questions show amazing insight. My favorite anecdote involves SEAL Team Six. There was not a 5, they just used the number to make people think there were more teams. I don't know if this story is true, but I like it and it shows that you have to know the parameters of the numbers instead of assuming a sequence starting with 1.

  • @Demasx
    @Demasx 4 місяці тому +26

    This feels like one of those widely usable maths that I won't be able to find an application for anytime soon... then when the time comes, I'll remember there's a solution but not what it is 😅 Bookmarking it now for that future occasion, haha

    • @EdMcF1
      @EdMcF1 2 місяці тому

      Ukrainians might find it useful

  • @ismbks
    @ismbks 2 місяці тому

    i missed this guy a lot, i remember binge watching his entire channel when i was in high school, brings me back

  • @aksela6912
    @aksela6912 4 місяці тому +19

    OK, what about this: As the sample size increases, the average of the sample will approach the average of the population, so let's estimate the average like that. For a uniform distribution starting at zero the maximum is simply two times the average, but in this example the minimum is one, so we'll just subtract one from our average. Using this method I get 32 and 28 tanks, respectively.

    • @cryme5
      @cryme5 4 місяці тому

      Or double the median. It would have been 32 and 30. Not sure which is usually closer, I feel like you need a Bayesian analysis with a prior.

    • @aksela6912
      @aksela6912 4 місяці тому

      Although these specific estimates have less error than the ones presented by James, on average his method will be better, at least for larger samples. I did some simulations, and for small samples, say three, it's pretty close, but James' method has a lot more bias.

    • @cryme5
      @cryme5 4 місяці тому

      ​@aksela6912 Funny thing is, no matter the prior you use, the posterior probability of N (the total number of tank) is just the prior truncated starting from M (the maximum of the observed serial numbers). In other words, a Bayesian answer, no matter the prior, should only depend on M (not even on the number of samples).

    • @aksela6912
      @aksela6912 4 місяці тому +2

      @@cryme5 For a uniform distribution the variance of the sample median will be greater than the variance of the sample mean, and as mean and median should be the same it will be better to use the one with less variance. I have to reiterate though, sample mean times two is a poor estimator, even if it feels more intuitive, and it feels like you're utilising the collected data better.

    • @EebstertheGreat
      @EebstertheGreat 4 місяці тому +1

      @@aksela6912 James's method is unbiased. If you observe n tanks and the maximum value you observe is m, then the minimum variance unbiased estimator is m + m/n - 1. Your estimator of twice the sample mean minus one is also unbiased, but its variance is higher. And it doesn't use the important information of the sample maximum, which means the estimate might actually give a value we _know_ is too small.
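
      The last point, that the mean-based estimate can fall below a serial number already seen, is easy to illustrate. This sketch assumes the video's bag (30 tanks, 5 pulls). It shows one such draw and then counts how often it happens across every possible draw. The function name is illustrative.

```python
from itertools import combinations

def double_mean_minus_one(sample):
    """Twice the sample mean, minus one."""
    return 2 * sum(sample) / len(sample) - 1

# One draw where the estimate is impossible: tank 30 was observed,
# yet the estimate says there are only 15 tanks.
sample = (1, 2, 3, 4, 30)
print(double_mean_minus_one(sample), max(sample))

# How often does this happen across all draws of 5 from 30?
n_tanks, k = 30, 5
total = 0
too_small = 0
for s in combinations(range(1, n_tanks + 1), k):
    total += 1
    if double_mean_minus_one(s) < max(s):
        too_small += 1
fraction_too_small = too_small / total
print(fraction_too_small)
```

      The max-based estimator m + m/n - 1 can never do this, since it is always at least m when n is no larger than m.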

  • @beal_a
    @beal_a 4 місяці тому +3

    IIUC, this is also a problem where frequentist and bayesian techniques arrive at different answers. I'd love to see an explanation of that.

  • @rPuck
    @rPuck 4 місяці тому +40

    Tanks for sharing!!!

  • @dattaprasadgodbole
    @dattaprasadgodbole 4 місяці тому

    Every part of this video - from finding out the numbers to objections raised - was brilliant. I love this video.

  • @JaniLaaksonen91
    @JaniLaaksonen91 4 місяці тому +23

    Would make a nice graph plotting your best guess of total tanks, pulling one tank at a time. Any time you get a new biggest number the plot would jump up, and when you get smaller numbers it will slowly decend as your average gap gets smaller. It would jerk up and down, approaching the actual total number.
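
    A rough sketch of the data behind such a graph, assuming the max-plus-average-gap formula from the video. The bag size of 300 and the 40 pulls are made-up values for illustration, as is the function name. It prints the running estimate after each pull; feeding the list to any plotting tool would give the jumpy curve described above.

```python
import random

def running_estimates(draws):
    """Estimate after each pull: running max + running max / pulls so far - 1."""
    estimates = []
    current_max = 0
    for count, serial in enumerate(draws, start=1):
        current_max = max(current_max, serial)
        estimates.append(current_max + current_max / count - 1)
    return estimates

rng = random.Random(1)                   # fixed seed so reruns match
draws = rng.sample(range(1, 301), 40)    # hypothetical bag of 300 tanks, 40 pulls
for count, (serial, estimate) in enumerate(zip(draws, running_estimates(draws)), start=1):
    print(count, serial, round(estimate, 1))
```

    With the video's first draw in the order 23, 15, 16, 1, 30, the running estimate goes 45, 33.5, about 29.7, 27.75, then jumps to 35 when tank 30 appears.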

    • @virt1one
      @virt1one 4 місяці тому +1

      agreed that would be nice to look at, though you'd want a larger set than 30. should start out a as a line jumping up and down but rapidly smoothing out. After it calmed down a bit you could probably do a bit of "eyeball extrapolation" to get a more accurate estimate than the last prediction.

  • @5000rgb
    @5000rgb 22 дні тому

    I appreciate the 500% more description. A lot of people muddle that up and would say 600% more. They skip past the fact that 6 times as many is 600% OF or 500% MORE.
    I think my estimation was a slightly different technique. If we take the arithmetic mean of all the tanks, we come up with a number that is half the total. So by taking the mean of the numbers on the tanks pulled out of the bag, we can double it.

  • @impossiblemission4ce
    @impossiblemission4ce 4 місяці тому +45

    First Enigma, now these tanks. Sometimes it feels as though James is gearing up for a time travel mission.

    • @talananiyiyaya8912
      @talananiyiyaya8912 4 місяці тому

      Obviously not...

    • @_invencible_
      @_invencible_ 4 місяці тому +3

      @@talananiyiyaya8912 nice try, MI6

    • @sandekv
      @sandekv 4 місяці тому +5

      He is winding down from one. He went there, helped Britain win, and came back.

    • @jimmyzhao2673
      @jimmyzhao2673 4 місяці тому +1

      @@sandekv He's slowly revealing that to us.

  • @bastawa
    @bastawa 4 місяці тому

    That was brilliant! your initial picks are exactly why it was so hard for me to grasp probability at school until I realized it is about multiple events and doesn’t work that great for a single event

  • @Diekyl
    @Diekyl 4 місяці тому +17

    At first, I was perplexed about the method of estimating monthly production with just serial numbers, but I am glad they explained they had a way to decode the month and factory of the tank as well. I assumed some of these numbers must have been intentionally hidden or misleading.

    • @suit1337
      @suit1337 4 місяці тому +2

      no, they were just contracted to different manufacturers and sub-models (Ausführung) and were assigned specific number ranges
      the gearboxes, or more specifically the engines with the geartrain attached, were often shared between different models, like the Panzer V Panther and Panzer VI Tiger, which shared the same engine platform and differed only in minor details and power
      in the later stages of the war it was not uncommon to use what was in stock or repair tanks with parts from different models

  • @meownezz
    @meownezz 4 місяці тому

    Information and mathematics once again showing their overwhelming and seemingly timeless relevance. 🙂

  • @MangoJones139
    @MangoJones139 4 місяці тому +4

    I really like Brady's talent for asking "good questions"

  • @alphakumar-g4q
    @alphakumar-g4q 4 місяці тому +2

    I have another method of solving this: average the numbers of the 5 random tanks we picked. That average will be close to the average of all the tank numbers, so avg of 5 = 17 = n(n+1)/(2n), where n = total number of tanks. We get n = 33 = total number of tanks.

  • @betabenja
    @betabenja 4 місяці тому +6

    6:24 scary camera pan

  • @mateodemicheli2420
    @mateodemicheli2420 4 місяці тому

    Awesome concept of a video, I love how you slowly explain each part of the puzzle and the graphs, it helped a lot. I'm subscribing right now

  • @YEASTY_COMMIE
    @YEASTY_COMMIE 4 місяці тому +5

    If you take the simpler formula of twice the average value of the tanks, it actually gives better prediction in this case (34 and 28, if I can still perform additions)

  • @jokoluna6978
    @jokoluna6978 4 місяці тому

    This video is brilliant! I knew about the story and always thought there was some really complicated math behind the scientists' work. Nicely explained, thanks! :)

  • @GeekRedux
    @GeekRedux 4 місяці тому +8

    12:17 "But we broke that code, okay? That's another story." Well, now we've got to hear it! Enigma, or something else?

    • @TheBendermen
      @TheBendermen 4 місяці тому +1

      The Enigma machines were for coded communications, I think. I think he meant that the serial numbers were coded, which isn't uncommon, since different companies and factories have different ways of doing things

  • @backwashjoe7864
    @backwashjoe7864 4 місяці тому +1

    We were not given a key piece of info at the beginning - what year did this occur in? German AFV (including tanks, but also tank destroyer, etc) production varied widely across WW2. For example, 3600 in 1941 and 19000 in 1944.

    • @techheck3358
      @techheck3358 4 місяці тому

      13:30

    • @backwashjoe7864
      @backwashjoe7864 4 місяці тому

      @@techheck3358 But it was not given at the beginning. When we were asked to guess which group was more accurate.

  • @cherifSheriff7544
    @cherifSheriff7544 4 місяці тому +4

    Question: does the formula get more precise with more observations or with a bigger number of tanks?

    • @maxxie8058
      @maxxie8058 4 місяці тому +3

      Agreed. If you observe 1000 tanks, is it better (on average of course) to treat that as one big observation or to split it into 10 smaller observations ? I feel like that would be useful info to have.

    • @StephanTrube
      @StephanTrube 4 місяці тому +3

      Yes, probably. The probability of encountering uncommonly extreme gaps becomes smaller and smaller the more samples you take. When you make as many observations as there are tanks, all leeway in gaps has vanished and your accuracy has reached 100%.

    • @hammerth1421
      @hammerth1421 4 місяці тому +4

      It should be. Law of large numbers, the statistical noise averages out.

  • @Grubik
    @Grubik 4 місяці тому +1

    Why not just take the average of the numbers on the tanks? I mean (sum of numbers divided by number of tanks, times two): ((1+15+16+23+30)/5)*2=34, which is closer to 30 than his calculated 35; the second try is ((3+10+15+18+24)/5)*2=28, which is again a bit closer to the actual 30 than his 27.8. Am I just being lucky, or is it a better way to do it?

  • @PhilBoswell
    @PhilBoswell 4 місяці тому +13

    UA-cam recommended me a short video by Hannah Fry about this very thing just this morning: I don't recall how old the video was but life is strange!

  • @heinaung6967
    @heinaung6967 4 місяці тому

    Thank you Brady for making these videos, every time I watch it motivates me to do my job better as an engineer/computer scientist

  • @pallavinavin4988
    @pallavinavin4988 4 місяці тому +12

    Love ur passion, professor

  • @David8n
    @David8n 4 місяці тому

    When i was doing stats at university the lecturer had us fill in a questionnaire on day one to give us some nice data to do analysis on (birthdays and such). It was all nice data except that there wasn't a single left hander in the class. Not one. There ought to have been about ten but there was zero. Credit to the lecturer, he rolled with it. His attitude was, "these things happen - we don't fudge our data". It was actually a great class.

  • @spencerarmon4491
    @spencerarmon4491 4 місяці тому +7

    Would be cool to see the mathematical derivation of calculating the expected value of the tanks using an infinite sum of the probability at the beginning

    • @Last_Resort991
      @Last_Resort991 4 місяці тому

      It's not an infinite sum when N is finite. It has N elements

    • @spencerarmon4491
      @spencerarmon4491 4 місяці тому +1

      @@Last_Resort991 to properly calculate the expected value, it would be an infinite sum from the max number seen to infinity

  • @orisphera
    @orisphera 4 місяці тому +2

    When I saw 23, I noticed that it was one of The Numbers. Then I saw 15, which also was one. Then, 16 followed, and it's also one of them. So, it started with half of The Numbers, although in a scrambled order

  • @Xelopheris
    @Xelopheris 4 місяці тому +19

    I literally saw the Hannah Fry video about this yesterday and kind of assumed that this would be a Hannah Fry numberphile video.

  • @axiezimmah
    @axiezimmah 4 місяці тому

    Before watching the video i had another method which i think works pretty well too.
    Because you're pulling random samples, the samples can be assumed to be distributed somewhat randomly along the whole range. So if you take the average of the numbers you have found, that can be assumed to be approximately the middle of the range. Multiply that by 2 and you should get close to the max of the range.

  • @WAMTAT
    @WAMTAT 4 місяці тому +4

    Nothing better than James talking WW2

  • @jack-xf6il
    @jack-xf6il 4 місяці тому +2

    This raises 2 questions: 1) how are you able to leave an unresolved Rubik's cube loose on your shelf; 2) why are your tanks pointing their turrets backwards?

    • @dmondot
      @dmondot 4 місяці тому +2

      Actually... The cube has a different purpose. If you watch carefully. the cube has changed configuration 4 times through the video. Most of the time by 1 or 2 moves.
      I think it's encoded messages to pass military information (about British tanks, obviously) to the enemies without looking suspicious.

    • @davestier6247
      @davestier6247 Місяць тому

      When tanks are being transported, whether by rail or on ships, they are almost always oriented with the turrets reversed to make more of them fit in a certain space. Not having the cannon barrel pointing forward usually means more of them can be crammed into a space.

  • @EXPLICITBG
    @EXPLICITBG 4 місяці тому +9

    “I will do one”
    Lo and behold, one he proceeded to do

  • @fespa
    @fespa 4 місяці тому +4

    Another great and entertaining video. Thank you. I would love to read the paper about the why the spies were so wrong.

    • @Jeff-jr4xw
      @Jeff-jr4xw 4 місяці тому +1

      Me too. I thought maybe they were being fed false information?

  • @FayCarllyle
    @FayCarllyle 4 місяці тому +1

    The way we communicate with others and with ourselves ultimately determines the quality of our lives.

  • @Eddy002
    @Eddy002 4 місяці тому +3

    I think the “failed” demo was perfect since you had to explain not only how it works, but also where the formula fails.
    Reminded me of school. The teacher would teach the easiest way to understand something, but then on a test it would be the hardest example/use of that formula. School failed, numberphile succeeded.

  • @shaun7163
    @shaun7163 4 місяці тому

    This guy is the absolute best and has been for years!

  • @luketurner314
    @luketurner314 4 місяці тому +10

    10:24 speedrun

    • @I-md6mq
      @I-md6mq 4 місяці тому +4

      I saw this comment at 10:23, dang..

  • @monkeyboyDylan
    @monkeyboyDylan 3 місяці тому

    I came up with another way to estimate the number of tanks. Not sure which method is superior. First, I determined the general formula for the average of a set of numbers counting sequentially from 1 to N. I rearranged it to solve for N. Then you take your sample, determine the average and estimate N. Using the same samples as in the video I got N=33 and N=27. The average answer was 30!!
    I derived this as follows (Avg(x) is the average of 1 to x):
    Avg(1) = 1
    Avg(2) = 1.5
    Avg(3) = 2
    Avg(4) = 2.5
    Avg(x) = x - ((x-1)*0.5)
    = x - (0.5x - 0.5)
    = 0.5x + 0.5
    x = 2(Avg - 0.5)
    = 2Avg - 1
    (23, 15, 16, 1, 30)
    Avg = 17
    x = 33
    (3, 10, 15, 18, 24)
    Avg = 14
    x = 27