Well, I can give you my phone number in 3 formats; international, formal and informal. But for this, I'll use my old number that only has 2 formats: international: 46 milliards, national: 763 million
@@yanwittmann As he demonstrated, when the cell is wide enough, it will show normal numbers. Your screen space (or zoom level) might make it not have enough space to display the whole number, giving only you the scientific notation.
it's not about the size of the tool. it's about using the right one at the right time. thats true for programming and woodworking (and, i guess, an other discipline...)
I once worked in a bank(big name, still in business). Money records were kept on a real database, but they used spreadsheets for some other daily things. When showing me how these spreadsheets worked they pulled out a pocket calculator to do the math and typed the answers in(for a few dozen cells).... After I finished face palming I showed them the purpose of spreadsheets, or computers in general for that matter
Training and its lack thereof is a major problem. There are so many features and techniques in Excel that would solve many problems but users don’t know them.
Богдан Кондратов You lead with a ‘ to force the cell to text. So from this video, if you want MARCH5 left as a name the just type ‘MARCH5. Excel is designed for non-technical management types to use comfortably without having hissy fits. If you want to use it for heavy lifting then you have to know how to get past the default training wheels. More problematic are the actual errors - bugs - in different versions of Excel. There is a real doozy with the MROUND() function where it doesn’t round correctly in certain cells. That bug evaded our efforts to figure out, but as it was aberrantly rounding down the work around was to add a fraction of the rounding value to the data: so MROUND(x,y) becomes MROUND(x+y*0.001,y). Annoying, but it gets around that bug.
@@olmostgudinaf8100 Python is not an exception. You are still identifying the type of a value even if the variables are dynamic. 3.14159 != "3.14159" "25/01/2020" != datetime.datetime(2020, 1, 25) There are plenty of strong statically typed languages that infer types as well. The problem with Excel is that it tries to parse every cell with a return type polymorphic parse function and its a pain (or people just don't know how) to specify the return type. An example of this done right is Haskell. It has the function read which is return type polymorphic and it will infer types but you can just add a type annotation if there is any ambiguity. In Python you have to use a separate parse function for each type.
"Why even use a spreadsheet if you aren't going to do any calculations"? ... says the guy who once made a spreadsheet that had nothing but numbers between 0-255 with conditional formatting.
@@p0gr Even then .csv files are a very common way to dump data in a format that's easily readable. Those are spreadsheet files with no calculations in them.
I've used spreadsheets with no calculations before, just to more easily get columnar formatting, because setting up multiple columns in the word processor isn't so bad (still a bit of a pain), but to then get the contents to line up properly can, depending on what the proper alignment is, be a huge hassle, while I can do the same in the spreadsheet in seconds.
Why user a spreadsheet without calculations? I use them all the time. Software exports SQL queries, applies various calculations, then effectively paste values the entire thing to send it out to people around the business. People can use Excel usually, it requires no extra software licences, easy to pivot and filter the data, etc. Also having #N/A is perfectly legitimate in a sheet as it's a way to make charts give null values rather than zero.
Also, I use spreadsheets just to design forms that require hand entry. Perhaps not for some, but for the spreadsheet savvy forms are often easier in Excel than in Word, especially if a person wants the tax form look.
I was going to make a similar comment. I, in a previous life, sent out a rather large spreadsheet. The one sent to supervisors and managers was flat (actually it was posted and an e-mail was sent out when I composed it.) The spreadsheet was actually built with VBA (Visual Basic for Applications) and had very few internal computations.
@@dhampson545 Excel will often corrupt CSV files when importing them, so if the user wants to perform additional processing on the data set and they don't know SQL, they will have problems
I've actually made plenty of spreadsheets at my job that are full of #DIV/0 errors without it being a mistake. Usually this happens when I'm comparing two datasets by finding ratios or relative error values between corresponding data points. If some of the values happen to be zero, I get a #DIV/0 error in those rows, but I also often don't care about these data points, so I ignore the #DIV/0 errors and get what I need from the other rows. One interesting way that spreadsheets fail us - one of our program spits out a CSV file that's often full of "-Infinity" values. When I try to open these in Excel, it sees the initial minus sign and assumes I'm trying to negate a range called "Infinity", which, of course, doesn't exist, so the sheet ends up full of #NAME? errors.
I completely agree - I often have spreadsheets where #Div/0 is the expected result and not a mistake. However, I have also discovered the Onerror function that can put in your chosen message if an error exists. This makes the spreadsheet more meaningful for other people who might think the error message is a mistake.
I also make spreadsheets that intentionally have a lot of errors. Usually it is because they are designed to take in some input set of data that varies in size and calculate useful information from it in a format that is easy to send out in an email. I use the iferror function to hide the errors but it's easier to just make the spreadsheet work for more lines that will ever really be used and hide the extras than to have to remake the spreadsheet everytime the size of the input data changes.
Exactly. On a less interesting level , I would imagine that most businesses have spreadsheets that contain errors due to the absence of data that isn't available yet.
Hell, I have plenty of spreadsheets that actually do a check for certain types of errors in their calculations. Because sometimes those are "valid results" that need to be accounted for.
I often make a formula and drag it down to apply the formula to multiple cells. If one of them is vlabk or has text in because it seperates two datasets then it might return an error, but it doesn't matter andI usually egnore them rather than bothering to add the IFERROR or manually deleting those rows.
For the last company I worked for, I once had a spreadsheet that was 96 MB. Believe it or not, this spreadsheet contained not images, audio, video or anything similar that would artificially increase the size of the file. The spreadsheet was full of server log information that over the course of several sheets was parsed into usable rows and columns which was then finally sorted via a complex pivot table and then several graphs were created from that detail. Each of the data sheets had just over 1 Million rows of data and into the triple letter columns on some of the sheets. This thing was an absolute beast and using it routinely crashed Excel. I recall I had a certain procedure for opening the file so my computer wouldn't lock up. In the end it was an extremely useful data set, albeit a horrible solution for solving the problem we had.
@@andykillsu But companies often don't buy that package when they are buying their Office packages. So it's not available to the poor workers. Found myself in that situation often.
Phillip Siebold That’s not true at all. The majority of companies have office 365 versions that come with Access. Any tier of office that comes with Outlook (very commonly used in companies) and Skype Business will have Access bundles with it. Now your company may not install it on your computer because you would have no use for it. But clearly the type of data and the way the OP was using it, should have been done an Access and even Power BI.
if you're actually parsing logs of real applications that are worth money, you can afford a system that's designed to scrape logs instead of rolling your own. text files and python are even enough for that.
@@andykillsu It isn't readily apparent to a novice that Access is there. Outlook, OneDrive, Word, Excel, Power Point, One Note, Skype, Calendar.... Why is THAT there? That must be the full list, or there'd be something more useful there instead.
My postal code starts with a zero. I ordered a dashcam once and the store left the zero out, then someone at the fulfillment center noticed that it was one digit short and added a zero at the end, shipping my order to a cargo terminal 1,300 km from me. And that's just the straight-line distance; by road, it's 1,700 km (21 h) and three border crossings away, or 2,000 km (28 h) if you want to stay in the country.
@@rosiefay7283 Probably more a stupid action by whomever set up the software/spreadsheet/database used in the first place, since probably all the retailer's colleagues are just typing the data in the boxes as they serve customers and/or fulfill orders, and are not expected to have the time or knowledge to see 'under the hood' so to speak in terms of how the boxes/rows/columns etc have been set up or formatted.
@@MrDannyDetail Rule #1 in programming: don't EVER store postal codes, phone numbers, etc. in numbers, but in strings. and still it goes wrong because of stupid mistakes people make
@@spicybaguette7706 It would be a non-issue if postal codes weren't numeric. If instead of a pseudo-random series of digits they were simply a pseudo-random series of letters, nobody would ever have to worry about storing them incorrectly, because a string would be the only option.
Spreadsheets don't have to contain formulas to be useful. Putting data in a table and being able to sort/filter/graph it is enormously useful. No calculations required.
Yep that and also taking downloaded csv and making dynamic table to better present the data (even if it is just re-ordering the column with no sum/count being done. At least it sorts them and remove duplicates).
The Irony is, sorting/filtering/graphs all require calculation. Either you have a Spreadsheet (not necessarily Excel) or a Table (no calculation, just a N-dimensional list of content).
That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong.
5:38 I have a spreadsheet that I use everyday that has eleven divided by zero errors. But that's okay because it's January and they're just taking the averages of data in months that haven't happened yet! So thery're not mistakes! It's possible that some of these obvious errors might not actually be mistakes in some rare cases!
I have lots of spreadsheets with DIV/0 errors too because I'm applying a formula globally to a whole bunch of data, and some of the base data happen to include 0s. Yeah I guess I could use an IFERROR function or something like that, but I don't really care unless the DIV/0 "infects" subsequent calculations somewhere. It's not like I am legit trying to divide by zero and confused why it won't work.
@@davicorosello1588 If it's for personal or team-internal use only, and everyone knows why that error is there and what it means, why do the additional work? Just for the random case a researcher comes along that only looks at the value in those cells without the surrounding context? Also, the dangerous errors are not those where Excel prints an error message, but those that look like a legit calculation and you only notice them when you check the formula (see the SUM vs AVERAGE example).
I mean that, even if you want to, your error could show a text explaining that error in a more comprehensivelly way instead of leaving it like "div/0", like for example "Not enough X data" or "data X is 0" or whatever, because you can have the same error but for many reasons. But, if you are pleased with what you have, i agree that is way better to do things the way everyone got used to it.
If they're refering to stuff that hasn't happened yet, maybe you should add logic to it that turns them into na() errors instead. That way, anyone reading it can see it's something to ignore rather than wondering if there's a calculation error.
I think he said as a prototype. A nice part is Excel is a OLEDB provider, which makes it pretty easy to use with ADO.NET. pretending it's just a slow limited database without all the hassel if setting up an SQL server. (Although Sqlite is a better option for that)
There are other classic errors as well - I once saw a spreadsheet analyzing a structure produced by a large engineering consultancy. The 'pin joint' analysis used lots of sine & cosine functions to compute the answers. The author(s) had inputted the angles in degrees but the Excel Trig functions work in radians. The resultant report passed two levels of checks my the consultancy and was issued to us as the client. My colleague a structural engineer took one look at the result and started laughing. When he recovered he passed it to me as an Electronics Engineer to see what I thought and it did not take long to realise that some of the forces through the mounting points had negative numbers indicating they were upward - an absolute nonsense given the structure and the loading. The consultancy were very embarrassed when the report (and the associated spreadsheet) were returned - they never explained how it passed the 3 level checks. I suspect it was because the previous person had already signed it.
Why use a spreadsheet when you're not doing calculations with it? Because I've yet to see the word processor that isn't an idiot when it comes to spacing and making lists.
Excel is an okay data exchange format. It's structured, readable by a lay person (with Excel, sheets or calc). And in the case of xlsx and ods a somewhat open format with built in compression. (A zip file with XML inside)
Word is a really great word processor if you know how to use it. Painfully, very few people know how to use it, including every teacher I had at university. Protip: The _outline view_ button will change your life. And for the love of all that is holy, don't manually change size, spacing, font, colour, italics, bolds etc. in your text, it will only lead to heartbreak, use _styles_ in stead, they were created for precisely that purpose.
Regarding the 44meg spreadsheet, I've worked at a medium size food factory where among other things would take daily production stocks and then input them into an Excel file. The file was about 25megs and had all the stocks of all the departments, that were then used to calculate orders of raw materials, production output estimates, and so on. It had hundreds and hundreds of rows and columns and most of them contained formulas of obscene proportions. We're talking a dozen wrapped lines long formulas that covered ALL the aspects of the production. Take any ready meal, look at the ingredients, cross-link them all and them multiply by a hundred. My understanding is that they were built over time by adding more and more references without any optimization (because of the sheer complexity of it). There were backups of backups of backups and as you can imagine the management was very touchy about anyone even breathing near those Excel files. Oh, and every day had a new file, with new inputs and outputs. So yeah...
I made a 0.1s measurement on a 5GS/s oscilloscope. Saved it as ascii, and opened in Excel to view the curves. It seems Excel has a limited nuber of rows. It could only show some of the 500 000 000 rows... My misstake :-)
@@SirLightfire I worked for a company that went through that it was a custom seed mix calculator that the owner had made. It has been recorded multiple times and is super impressive now.
I was once in charge of a spreadsheet that calculated where we stood among our competitors. The input for it was a separate excel file that I had to download each morning with roughly 300 attributes and histories for each of our 2000 competitors that was a couple MB. This 30MB spreadsheet would take that data, parse it by firm and then stitch it together and output a four page PDF (had to be printed to PDF manually) showing where everyone stood. No graphics, no special formatting, just minimal readability items and it was that big. I'd open it, tell it to update and go get coffee. If I was lucky when I got back it would be done. I didn't build it, but I knew the underlying data. If something looked off, I'd go back into the data file, and clean it up, then run it again.
I don't think he has much experience of the amount of data and number of formulae a business can end up sticking in a spreadsheet. Besides Enron collapsed when excel probably still had an approximately 66,000 row limit or something like that. So 20 years later their spreadsheets look pretty small.
@4:20 - If you have large datasets and / or complex calculations then spreadsheets can get quite large. On top of that, it isn’t feasible to compress some data transformation into tight and barely intelligible formulae if you want to check and maintain it. Once you have to do some real work with data it starts chewing up a lot of cells. If you use them as a way of capturing a system in total (I work in a calibration lab) then you potentially have a lot of non-data information present - we use a master workbook that processes structured data into a certificate with tables and other information, and the data may come from another system or from a procedure worksheet loaded into this master workbook. This makes for large spreadsheets with a lot of duplicated and reworked information, but auditing is very easy. The alternative approach of storing all of the data in a database can be useful for some purposes, but makes system changes a nightmare due to the importance of maintaining compatibility with earlier data. Good luck with auditing anything performed several generations of system modifications ago.
I have a spreadsheet I've made for work that actually uses errors to my advantage. I have to compare reports from two different programs. Basically it's an exception report to find out if a value is missing. All from imported data. I have several v lookups on one row looking at different sheets. Then I use "and" and "iserror" to see if every column contains an error. Add filters and bingo I found what's missing. Excel is awesome, but it's not made for databases. It's perfect for pulling data into from other sources and then analyzing it. That's what it is best at. At least in the business world.
I've had memory overflow (I assume?) errors where I paste an entire column of cells into another column and even if I've already marked both columns as plain text to leave the hell alone, autocorrect still noses into random cells and breaks things.
@@tparadox88 When you copy and past a cell, it copies the format. If you right click where you want to paste, you can specify exactly what you want to paste. Since excel was made for "day-to-day" use, copying and pasting everything including formulas and formats is typical. But you have more options such as just pasting text.
This. Just select the column and "edit cells", set the desired format and bish, bash, bosh, no more autoformat errors - until human error inevitably rears it's ugly mug.
@@geolawie Only if you don't know, what you're doing ;-) If you use "data -> from text" it will allow you to specify the separator, data-type of a cell etc. A lot of the really bad errors will go away if you do it that way.
@@tetsi0815 That is an extremely big "only if". I started my career as an accountant and was fully qualified as a Chartered Accountant in the UK. Accountants are clearly very big users of spreadsheets (my guess would be the biggest) and how much training did I get in avoiding spreadsheet errors? You guessed it, absolutely none whatsoever. Most people who use spreadsheets for business critical tasks have had the same amount of training that I had.
I tend to do this thing in Excel (and other spreadsheet programs) 1. Select All. 2. Align vertically and horizontally to center. if I'm not doing numbers, I also do this: 3. Change type to "plain text".
In my work, I have several times ended up with spreadsheets exceeding 300MB. For instance, I had one where I had several months worth of data at 10 second intervals, from multiple different sensors, and was running VBA code to sort, compare, and find trends. There were a few operations in there that took over 30 minutes to run (even with optimizations). But, it did what we needed it to do.
5:18 Because Word is bad at tables. And I do conditional formatting in Excel, not sure if that counts as calculation if I just check for value of certain word.
Excel also has things like Freeze Panes, AutoFilter, and can support tables/grids that are wider than an A4 page more easily than Word can. Not everyone has access to Access (ahem), and setting up an Access database for a single stand-alone table seems a touch eccentric.
Matt: most spreadsheets have an error My google sheets spreadsheet emulating minesweeper: *profuse sweating* Matt: a lot of spreadsheets run out of room fairly quickly My spreadsheet calculating the border of the Mandelbrot set to any accuracy: "first time?" Matt: Spreadsheets are wonderful! Me, playing Conway's game of life on excel: "yeah, I agree" Matt: what spreadaheet is 114Mb? Probably has an image. Also matt: takes a selfie, converts it into RGB and colours hundreds of thousands of cells to recreate the image subpixel by subpixel
Some years ago I did an excel training and one of the first things we learned was the importance of choosing the correct cell format for the data put in them. Most of these errors shown here are user errors that only happen because people don't bother to switch the cell type from standard to the one that matches the data they put in. And for large quantities of hard data you should be using a database in the first place.
5:25 Because it is a convenient way to neatly store a lot of information. Like for example results from an experiment to be processed elsewhere, Payments, passwords, and etc.
@@standupmaths What if they were only faulty because when analyzing the spreadsheets, it couldn't find the reference spreadsheet? I mean to use a reference spreadsheet, you generally need to have the other spreadsheet in the same location it was originally specified.
@@standupmaths But were they using a spreadsheet to calculate the number of errors in the spreadsheets? If so, how did they know that it did not contain errors?
Alfonso J. Ramos Microsoft has been delivering productivity applications for Apple computers since the dawn of personal computers. It even was mandatory when Microsoft dealt with the antitrust trial.
Many years ago I worked with an accountant who, in my opinion, could make a spreadsheet "sing." He could do almost anything - and very quickly. He was so good with it, and so used to it, that he would do text stuff with it as well. It was his automated assistant of choice!
@@herumuharman6305 Sure, but my point is the errors weren't actually errors and I suspect other people may also have done this, making a statistic about the number of errors in a bunch of spreadsheets harder to interpret. Not everyone makes spreadsheets how they should, me included when doing some small non mission critical calculations
And there's some cases where you do want an N/A to show, there's even a function for it na(). Eg, you have a table of marriages, and for some table of people you've got a column for spouse's name. If they have none, then N/A semantically makes sense there.
"Why even use a spreadsheet if you're not gonna do any calculations?" Really important for keeping data! As a teacher I have to provide a lot of data to management, e.g. test results. Sometimes management ask us to use a Google Sheet, and I can just open up my own spreadsheet and copy -> paste-as-values. Sometimes it's a Google Doc, and copy-paste usually fails spectactularly - I have to type each one by hand, taking ten times as long!
just because a spreadsheet has errors it doesn't mean the designer made a mistake and doesn't mean the sheet is not functional... maybe the designer opted not to use IFERROR. yes the sheet looks ugly with errors but doesn't necessarily mean it's broken.
Another frequent error: if you import a .csv-file (which stands for Comma Separated Values, used by many professional applications) containing two columns of integers into Excel, but your region is set to a country that uses a comma as decimal separator (basically all countries that haven't been British colonies), Excel will read it as a single column of non-integer numbers...
IIRC (not having Excel on my PC) in the initial import screen, you can change the character of the column delimiter from the default "," to "." (or whatever you choose?)
When I first started keeping track of my DVDs, the size of an Access document seemed so huge compared to putting it into Excel. Now it's JSON formatted.
Two things I would say. Firstly, if you're including CSVs in this, then of course they will have no calculations. This wasn't made clear in the video. Also, in regards to having things like #DIV/0! and such, some people have sheets where they will need to make calculations, but they haven't entered the data yet, for example: For my own interest, I'm keeping track of the number of British players in each Premier League's starting XI every week. I go through the lineups and manually enter the number between 0 and 11 for each team for each gameweek. At the end of the row I have a simple =AVG(A2:U2)/11 sum to show the percentage of players that started that week that were British. We are only on gameweek 24 or 25 at the moment, so there is still a lot of time left in the season, but when I update this sheet once a week I don't want to have to enter the data then drag the row above down or copy the formula every time, I just have it set up already so that once I've entered the data, the percentage shows. This leads to a full column of percentages followed by errors. There isn't really any errors happening, I'm just being lazy, but since the sheet is just for personal use, and if I show it to other people, it will just be at the end of the season once we have played 38 matches, it doesn't really matter. I suppose it is different if these are email attachments that people are sending to each other, but the point still stands I feel.
Hi Matt, I work as a data analyst. I use excel a lot. Its really easy to make a 41MB file, you just have to copy a format down to the bottom of the page. Also, divide by 0 errors etc are often not mistakes, just cases of not finding a certain value that has a LY figure or an index match that doesn't find a value
This is why it is very important to hire someone who has knowledge of using a database.....numbers/calculations and dates all right-format, and text left formats, keeping an eye on the outcome will clue you in on whether the spreadsheet altered your stuff as you enter it. The apostrophe is very important as well. I track my stock account, along with 2 bank accounts with 2 sub accounts (mine and my gf) with forward financing and budgeting against the debt-load for each year ahead of time.....all in a spreadsheet, no mistakes. Have someone who isn't qualified, and your business will surely turn into Enron.
I spend a lot my time with spreadsheets and I work in telephony - so the first rule is that telephone numbers are not numeric data they are character data and the cells containing them need to be formatted as character data (i.e. text) - then there is no problem. The same is true for any data that can be misconstrued as something it's not. Format the cells with the correct number format (which includes, text, dates, etc.) and then you don't get most of these errors. The 'general' number format is only the default. The problem is Excel is very accessible, but then you get people who don't really know how to use it using it for very important stuff.
A nice bonus of more modern version of Excel comes with the increased sheet size: It's easy to run the computer out of memory and making it grind to a halt. A colleague asked me to export a database table (lots of rows AND lots of columns) so he could work on it in Excel. All my replies of "This doesn't make sense" and "this wont work" were swept aside. So i exported a CSV file. Easy enough, wrote a tool in about 5 minutes to do it properly. He then spent the next few days trying to import a 3 Gigabyte CSV into Excel. It was technically inside both the row and column limits of Excel, but his computer would run out of RAM, start swapping to disk, run out of disk space and then crash either Excel or Windows. He kept trying different ways of loading it (double-clicking, "File->Open", marking the file in Explorer and then pressing Enter...), tried different Excel versions, even tried it on a few other computers. He was absolutely refusing to believe that Excel couldn't load such a massive file, and he kept on trying and trying and trying. It was glorious to watch.
These statistics could be horribly mistaken tho. Spreadsheets in corporations are quite often "connected" to databases via ODBC and get their data from them. Without the databases (usually Access databases in shared network folders) the spreadsheets will be broken. I wonder if the researcher accounted for this fact.
Of course 90% of spreadsheets contain an error! Spreadsheets are used by people who became Managers because they were not good enough at maths to become Engineers (and Engineers use more reliable tools than spreadsheets) I've had some Project Managers make all their decisions based on spreadsheets... (that's 99% of managers working at big companies, btw) Their whole job revolves around spreadsheets for performance measurements, etc. They provided use with some spreadsheets and asked us to fill it for them (as if it we were not busy enough with all the spec changes...) The spreadsheet contained a HUGE error that showballed in the calculation. Fixing the error would have been time consuming, absolutely nothing good would have come out of it (the figures would have looked abysmally bad), and we had actual work to do. Considering that our whole department was being shut down to be outsourced, we couldn't care less about the spreadsheet being accurate or not...
4:39 175 worksheets sounds like a lot, but I have a few that probably get close to that. I use worksheets that are structured identically to do calculations "vertically" between worksheets, and those calculations are usually easier to do and to troubleshoot (and less error prone) if you have shorter formulas and many worksheets vs. long convoluted formulas in one or two worksheets. Plus when I have many worksheets, I tend to add blank worksheets between them as "divider tabs", which further increases the number of worksheets.
A lot of the spreadsheets I have to make involve copy and pasting data from a database query to a spreadsheet so users can use the built-in functionality of the spreadsheet to filter and what ever else they like to do. Those users do not have access to query themselves, even if they did, they are business people, they don't know how to write queries. *Edit:* I typically convert every cell into a Text data-type so it won't remove the leading 0s of a value and other interesting conversions.
I once abused a spreadsheet to design speaker enclosures. It had all the Thiel Small parameters for every driver I could get. One neat feature was that I could enter a box volume and then sort by F3, and get a list of likely good matches. But it was slow at heck, so slow that automatic recalculation had to be turned off... This was back in the days when a 386 processor with the math co-processor was a powerful machine.
Some of my Excel problems is from having different decimal separator and digit grouping standard. "." For digit grouping and "," for decimal separator.
Many people hear "database" and think of a complicated scary computer thing. People, databases are *just* like excel. It's a grid of data, with tons of tools to sort, filter, and display that data. Many modern database programs can be learned in an hour, and can be run entirely on your laptop. The only difference is that they require you to specify what type of data goes in each column, and they are optimized for huge amounts of data. Both of these features solve allll of the problems in this video. It's not complicated: when all you are trying to do is store data, **use databases.**
Know the limitations of your tools, and when to change tools. Essentially every example listed is user error - poor data input, misconfiguration, etc. When values exceed the capabilities of Excel, you should have moved to a more capable tool (like a database) a long time ago already.
In case anyone are wondering, you can make Excel treat whatever input you give it as a string, without autocorrection, if you write it with an apostrophe first, like: '00974528 would be treated as 00974528 without removing zeroes
The 42.2% of spreadsheets were probably acting more like a database. They may not have calculation errors but they could have data entry errors. Of course, you wouldn't be able to detect those errors unless you knew what the correct values were supposed to be.
I was trying to type HSA into a my medical tracking spreadsheet (to show payment method) and it kept turning it to HAS. The usual tricks wouldn't work (formatting, copy and pasting from word, whatever). The only way I could get Excel to keep it as HSA was to put a period at the end.
@5.40 As an accounting graduate, I want to take a moment here to say that I don't see #DIV/O and #NULL errors appearing on a spreadsheet to mean someone made a mistake. When in my courses we used spreadsheets to compare income statements or balance sheets between companies or between periods, we would frequently have whole columns of division showing a % difference. It simply took more work to skip over the blank rows when setting that up, so unless someone went out of their way to make their spreadsheets neater, which doesn't make any difference at all to the important information we want out of the spreadsheet, it was more likely than not to contain a type of error like this. Sometimes I would make and save a spreadsheet that functioned like a form as well, and it might have errors of this kind show up before all the fields were filled in. Doesn't mean it isn't doing what we wanted it to.
Those auto-corrects are really frustrating, though, that much is true. Even if it's not re-typing your data, if you care about casing or are using text strings that look like common words but aren't and aren't supposed to be, it will consistently capitalize your first "word" and autocorrect your "spelling". If anyone knows of a way to turn off those "features", I'd honestly be glad to know about it.
The best reason I know of for having a spreadsheet without any calculations is because MS Excel is the most stable of Microsoft's Office programs. It won't crash or go wonky on you like Word or PowerPoint.
Could it be that the excel spreadsheets with errors were being sent over email to someone that was expected to resolve the errors? So you’d see a higher percentage of invalid sheets because people might be more likely to email a broken sheet. Though this may have been accounted for when they removed duplicates, but that would depend on how they determined duplicates.
They call me the king of the spreadsheets Got 'em all printed out on my bedsheets My new computer's got the clocks, it rocks But it was obsolete before I opened the box
I work with an actual 41MB spreadsheet and it is just numbers, filters, pivot tables and charts. It takes a good minute to load and about 15 seconds to apply a filter on one column.
I have firsthand experience with spreadsheets attacking. For an intro stats class, I had collected employment data, including asking people about how many hours per week they worked. Since exact precision wasn't important for what we were doing with that data point, we chose to ask it categorically. 0-5, 6-10, 11-20, 20+ hours worked per week. The spreadsheet we used to collect survey data very helpfully solved our little problems we had put in each cell, so we were left with data ranging from -5, -4, -9, and 20+. The data were small and the consequences were minimal but it was still incredibly frustrating.
Maybe parentheses would fix it. It might be reading it as the square of a negative number in the first one and subtracting the square of a positive number in the second one.
@@ashtonhoward5582 Yes, that is precisely it. Now I know this I take care but occasionally forget as a transfer a formula from the back of an envelope into Excel!
I've not encountered a lot of auto-correct problems in my spreadsheets, but I also converted a while back to using Google Spreadsheets instead of Excel. Surely you can just change your settings so that it won't auto-correct? Also, when it comes to the typical "errors" you can get (#NUM!, #DIV/0!, etc.) sometimes those can be intentional. I know I've sometimes had formulas that I expect to sometimes yield one of those errors, and I have an "IFERROR()" formula set up to do something in those cases. So just because a formula yields those errors does not mean there's an oversight or error in the intended function of the spreadsheet. Sometimes the user expects those errors, and designs the spreadsheet to deal with them.
I can only imagine that the spreadsheet wanted access to external resources, which weren't available to the spreadsheet risk group. I believe you can get a huge number of errors if your spreadsheet is based on that. Though I also don't want to believe that they would count that as an error, just because they didn't have these missing resources.
Just having a #NUM, #VALUE or similar doesn't mean there's a calculation error. Some of my sheets that have no errors generate such things intentionally (or intentionally don't trap or otherwise prevent them) as part of the way that they work.
I wonder how many of the emailed spreadsheets were somebody saying "I broke this spreadsheet, could you please fix it?" In general, I feel like "errors" is ill defined in this case. Returning "N/A" doesn't mean there's an error with the workbook.
@Богдан Кондратов well... i know the tools, and i don't appreciate these features. because there is no option to get rid of them. all those workarounds cost too much time.
Business users (non-IT) don't want to code. They need tools that help them solve their problems without coding. Heck, even IT these days don't want to code ("no code" being the buzzword) as code needs to be maintained or it breaks sooner or later.
The Hungarian version of Excel uses regional settings, for example decimal comma instead of decimal dot. I gave the input CSV file to students recently containing floating point numbers to 2 decimals using decimal dot (as if you read the file in any programming language, your program will expect decimal point), some of the students were interested how the data looks, and they opened the CSV file in Excel, which showed a lot of dates: every time if the integral part was between 1 and 12, and the fractional part between 1 and 31, it converted to date, and a mere type conversion produced some weird numbers. Libreoffice had no issues :)
"90% of all spreadsheets contain an error". Oh, actually, nevermind, it's only 5%. There was an error in our spreadsheet
😂
I can tell you there are lots of divide by zero errors in spreadsheets at my work at a very well known mega corporation.
60% of the time, it works 100% of the time.
Hu Go there are only 10 kinds of People who understand binary.
There are 2 types of people.
Those who can extrapolate from incomplete data.
someone: "what's your phone number"
me: *"1.4 billion"*
rounded off to the nearest floating point number, of course
Well, I can give you my phone number in 3 formats; international, formal and informal. But for this, I'll use my old number that only has 2 formats: international: 46 milliards, national: 763 million
@@yanwittmann apparently excel converts some numbers to scientific notation.
@@yanwittmann well, some people go into automaton mode after working for so long that mistakes like this happen
@@yanwittmann As he demonstrated, when the cell is wide enough, it will show normal numbers. Your screen space (or zoom level) might make it not have enough space to display the whole number, giving only you the scientific notation.
As in woodworking, more powerful tools allow you to make greater mistakes in less time.
Words to live by 😁
it's not about the size of the tool. it's about using the right one at the right time. thats true for programming and woodworking (and, i guess, an other discipline...)
True, Jim! Thinking of the bloke that had a pretty bad day in Tchernobyl back in the day...
As Brian Reid taught us, Power Tools Can Kill.
Such as accidental removal of digits.
I once worked in a bank(big name, still in business). Money records were kept on a real database, but they used spreadsheets for some other daily things. When showing me how these spreadsheets worked they pulled out a pocket calculator to do the math and typed the answers in(for a few dozen cells).... After I finished face palming I showed them the purpose of spreadsheets, or computers in general for that matter
This is so common that it’s sad 😢
Training and its lack thereof is a major problem. There are so many features and techniques in Excel that would solve many problems but users don’t know them.
As a programmer, declaring what type of data will be put into a variable/cell seems like a reasonable thing to do.
Called a database/table sceme
Karmanyaah Malhotra I assume you are not a Python programmer, then ;-)
Unless you're a python programmer, where you would consider that superfluous
Богдан Кондратов
You lead with a ‘ to force the cell to text. So from this video, if you want MARCH5 left as a name the just type ‘MARCH5.
Excel is designed for non-technical management types to use comfortably without having hissy fits. If you want to use it for heavy lifting then you have to know how to get past the default training wheels.
More problematic are the actual errors - bugs - in different versions of Excel. There is a real doozy with the MROUND() function where it doesn’t round correctly in certain cells. That bug evaded our efforts to figure out, but as it was aberrantly rounding down the work around was to add a fraction of the rounding value to the data: so MROUND(x,y) becomes MROUND(x+y*0.001,y). Annoying, but it gets around that bug.
@@olmostgudinaf8100 Python is not an exception. You are still identifying the type of a value even if the variables are dynamic.
3.14159 != "3.14159"
"25/01/2020" != datetime.datetime(2020, 1, 25)
There are plenty of strong statically typed languages that infer types as well. The problem with Excel is that it tries to parse every cell with a return type polymorphic parse function and its a pain (or people just don't know how) to specify the return type.
An example of this done right is Haskell. It has the function read which is return type polymorphic and it will infer types but you can just add a type annotation if there is any ambiguity. In Python you have to use a separate parse function for each type.
"Why even use a spreadsheet if you aren't going to do any calculations"?
... says the guy who once made a spreadsheet that had nothing but numbers between 0-255 with conditional formatting.
conditional formatting means formulas, and those count as calculations.
@@p0gr Even then .csv files are a very common way to dump data in a format that's easily readable. Those are spreadsheet files with no calculations in them.
I've used spreadsheets with no calculations before, just to more easily get columnar formatting, because setting up multiple columns in the word processor isn't so bad (still a bit of a pain), but to then get the contents to line up properly can, depending on what the proper alignment is, be a huge hassle, while I can do the same in the spreadsheet in seconds.
@@Murzac I view CSV files as closer to a way to store a matrix, than as a spreadsheet in the traditional sense.
@@Murzac well, csv could equally be considered a database table as much as a spreadsheet
Why user a spreadsheet without calculations? I use them all the time. Software exports SQL queries, applies various calculations, then effectively paste values the entire thing to send it out to people around the business. People can use Excel usually, it requires no extra software licences, easy to pivot and filter the data, etc.
Also having #N/A is perfectly legitimate in a sheet as it's a way to make charts give null values rather than zero.
Also, I use spreadsheets just to design forms that require hand entry. Perhaps not for some, but for the spreadsheet savvy forms are often easier in Excel than in Word, especially if a person wants the tax form look.
I was going to make a similar comment. I, in a previous life, sent out a rather large spreadsheet. The one sent to supervisors and managers was flat (actually it was posted and an e-mail was sent out when I composed it.) The spreadsheet was actually built with VBA (Visual Basic for Applications) and had very few internal computations.
If its for sharing SQL data- use CSV, which is universal.
@@dhampson545 One must realize that a lot of end users of the spreadsheet might not know the difference between CSV and CVS.
@@dhampson545 Excel will often corrupt CSV files when importing them, so if the user wants to perform additional processing on the data set and they don't know SQL, they will have problems
I've actually made plenty of spreadsheets at my job that are full of #DIV/0 errors without it being a mistake. Usually this happens when I'm comparing two datasets by finding ratios or relative error values between corresponding data points. If some of the values happen to be zero, I get a #DIV/0 error in those rows, but I also often don't care about these data points, so I ignore the #DIV/0 errors and get what I need from the other rows.
One interesting way that spreadsheets fail us - one of our program spits out a CSV file that's often full of "-Infinity" values. When I try to open these in Excel, it sees the initial minus sign and assumes I'm trying to negate a range called "Infinity", which, of course, doesn't exist, so the sheet ends up full of #NAME? errors.
I completely agree - I often have spreadsheets where #Div/0 is the expected result and not a mistake. However, I have also discovered the Onerror function that can put in your chosen message if an error exists. This makes the spreadsheet more meaningful for other people who might think the error message is a mistake.
I also make spreadsheets that intentionally have a lot of errors. Usually it is because they are designed to take in some input set of data that varies in size and calculate useful information from it in a format that is easy to send out in an email. I use the iferror function to hide the errors but it's easier to just make the spreadsheet work for more lines that will ever really be used and hide the extras than to have to remake the spreadsheet everytime the size of the input data changes.
Exactly. On a less interesting level , I would imagine that most businesses have spreadsheets that contain errors due to the absence of data that isn't available yet.
Hell, I have plenty of spreadsheets that actually do a check for certain types of errors in their calculations. Because sometimes those are "valid results" that need to be accounted for.
I often make a formula and drag it down to apply the formula to multiple cells. If one of them is vlabk or has text in because it seperates two datasets then it might return an error, but it doesn't matter andI usually egnore them rather than bothering to add the IFERROR or manually deleting those rows.
For the last company I worked for, I once had a spreadsheet that was 96 MB. Believe it or not, this spreadsheet contained not images, audio, video or anything similar that would artificially increase the size of the file. The spreadsheet was full of server log information that over the course of several sheets was parsed into usable rows and columns which was then finally sorted via a complex pivot table and then several graphs were created from that detail. Each of the data sheets had just over 1 Million rows of data and into the triple letter columns on some of the sheets. This thing was an absolute beast and using it routinely crashed Excel. I recall I had a certain procedure for opening the file so my computer wouldn't lock up. In the end it was an extremely useful data set, albeit a horrible solution for solving the problem we had.
You shouldn’t have been using Excel for something that large. That is why Access exists.
@@andykillsu But companies often don't buy that package when they are buying their Office packages. So it's not available to the poor workers. Found myself in that situation often.
Phillip Siebold That’s not true at all. The majority of companies have office 365 versions that come with Access. Any tier of office that comes with Outlook (very commonly used in companies) and Skype Business will have Access bundles with it. Now your company may not install it on your computer because you would have no use for it. But clearly the type of data and the way the OP was using it, should have been done an Access and even Power BI.
if you're actually parsing logs of real applications that are worth money, you can afford a system that's designed to scrape logs instead of rolling your own. text files and python are even enough for that.
@@andykillsu It isn't readily apparent to a novice that Access is there. Outlook, OneDrive, Word, Excel, Power Point, One Note, Skype, Calendar.... Why is THAT there? That must be the full list, or there'd be something more useful there instead.
My postal code starts with a zero. I ordered a dashcam once and the store left the zero out, then someone at the fulfillment center noticed that it was one digit short and added a zero at the end, shipping my order to a cargo terminal 1,300 km from me. And that's just the straight-line distance; by road, it's 1,700 km (21 h) and three border crossings away, or 2,000 km (28 h) if you want to stay in the country.
@@rosiefay7283 Probably more a stupid action by whomever set up the software/spreadsheet/database used in the first place, since probably all the retailer's colleagues are just typing the data in the boxes as they serve customers and/or fulfill orders, and are not expected to have the time or knowledge to see 'under the hood' so to speak in terms of how the boxes/rows/columns etc have been set up or formatted.
This confirms one basic rule: The idiot is always sitting in front of the computer screen.
@@MrDannyDetail Rule #1 in programming: don't EVER store postal codes, phone numbers, etc. in numbers, but in strings. and still it goes wrong because of stupid mistakes people make
@@spicybaguette7706 It would be a non-issue if postal codes weren't numeric. If instead of a pseudo-random series of digits they were simply a pseudo-random series of letters, nobody would ever have to worry about storing them incorrectly, because a string would be the only option.
@@MrDannyDetail Not a retailer, I ordered directly from the manufacturer.
Spreadsheets don't have to contain formulas to be useful. Putting data in a table and being able to sort/filter/graph it is enormously useful. No calculations required.
Hehehe, referencing Matt's earlier vlog ua-cam.com/video/iwzzv1biHv8/v-deo.html which is all about the importance of visually graphing your data.
Yep that and also taking downloaded csv and making dynamic table to better present the data (even if it is just re-ordering the column with no sum/count being done. At least it sorts them and remove duplicates).
The Irony is, sorting/filtering/graphs all require calculation. Either you have a Spreadsheet (not necessarily Excel) or a Table (no calculation, just a N-dimensional list of content).
If they used a spreadsheet to get that number of over 90%, then the chance of that number being wrong also is 90%
Presumably, there might be a different-than-90% chance that the chance of that number being wrong is 90%.
touche
exactly!
Your calculation is wrong. The 90% is about spreadsheets containing at least one error, not that a calculation is likely to be 90% wrong.
That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong, And That Has A 90% Chance Of Being Wrong.
5:38 I have a spreadsheet that I use everyday that has eleven divided by zero errors. But that's okay because it's January and they're just taking the averages of data in months that haven't happened yet! So thery're not mistakes! It's possible that some of these obvious errors might not actually be mistakes in some rare cases!
You can put some conditionings in those cells. If certaing cell or data is 0, show some text instead.
I have lots of spreadsheets with DIV/0 errors too because I'm applying a formula globally to a whole bunch of data, and some of the base data happen to include 0s. Yeah I guess I could use an IFERROR function or something like that, but I don't really care unless the DIV/0 "infects" subsequent calculations somewhere. It's not like I am legit trying to divide by zero and confused why it won't work.
@@davicorosello1588 If it's for personal or team-internal use only, and everyone knows why that error is there and what it means, why do the additional work? Just for the random case a researcher comes along that only looks at the value in those cells without the surrounding context?
Also, the dangerous errors are not those where Excel prints an error message, but those that look like a legit calculation and you only notice them when you check the formula (see the SUM vs AVERAGE example).
I mean that, even if you want to, your error could show a text explaining that error in a more comprehensivelly way instead of leaving it like "div/0", like for example "Not enough X data" or "data X is 0" or whatever, because you can have the same error but for many reasons.
But, if you are pleased with what you have, i agree that is way better to do things the way everyone got used to it.
If they're refering to stuff that hasn't happened yet, maybe you should add logic to it that turns them into na() errors instead.
That way, anyone reading it can see it's something to ignore rather than wondering if there's a calculation error.
He really said: Excel is Accessable. In a video where he says "Don't use Excel as a Database". I hope, that was intentional, for it is hilarious
Absolutely use Excel. Don't use it as a database. It's a spreadsheet program not a database program.
I think he said as a prototype. A nice part is Excel is a OLEDB provider, which makes it pretty easy to use with ADO.NET. pretending it's just a slow limited database without all the hassel if setting up an SQL server.
(Although Sqlite is a better option for that)
heck SQLExpress running locally works great too
I feel like the people making reply comments don't know this is a joke about Microsoft Access.
@@userPrehistoricman Yep
There are other classic errors as well - I once saw a spreadsheet analyzing a structure produced by a large engineering consultancy. The 'pin joint' analysis used lots of sine & cosine functions to compute the answers. The author(s) had inputted the angles in degrees but the Excel Trig functions work in radians. The resultant report passed two levels of checks my the consultancy and was issued to us as the client. My colleague a structural engineer took one look at the result and started laughing. When he recovered he passed it to me as an Electronics Engineer to see what I thought and it did not take long to realise that some of the forces through the mounting points had negative numbers indicating they were upward - an absolute nonsense given the structure and the loading. The consultancy were very embarrassed when the report (and the associated spreadsheet) were returned - they never explained how it passed the 3 level checks. I suspect it was because the previous person had already signed it.
Why use a spreadsheet when you're not doing calculations with it?
Because I've yet to see the word processor that isn't an idiot when it comes to spacing and making lists.
And if you're emailing them around, someone else may later want to do calculations with it.
I've probably emailed CSV files or former CSV files with no calculations yet plenty of times
Excel is an okay data exchange format.
It's structured, readable by a lay person (with Excel, sheets or calc). And in the case of xlsx and ods a somewhat open format with built in compression. (A zip file with XML inside)
Word is a really great word processor if you know how to use it. Painfully, very few people know how to use it, including every teacher I had at university. Protip: The _outline view_ button will change your life. And for the love of all that is holy, don't manually change size, spacing, font, colour, italics, bolds etc. in your text, it will only lead to heartbreak, use _styles_ in stead, they were created for precisely that purpose.
Also sorting, and filtering.
Regarding the 44meg spreadsheet, I've worked at a medium size food factory where among other things would take daily production stocks and then input them into an Excel file. The file was about 25megs and had all the stocks of all the departments, that were then used to calculate orders of raw materials, production output estimates, and so on. It had hundreds and hundreds of rows and columns and most of them contained formulas of obscene proportions. We're talking a dozen wrapped lines long formulas that covered ALL the aspects of the production. Take any ready meal, look at the ingredients, cross-link them all and them multiply by a hundred. My understanding is that they were built over time by adding more and more references without any optimization (because of the sheer complexity of it). There were backups of backups of backups and as you can imagine the management was very touchy about anyone even breathing near those Excel files. Oh, and every day had a new file, with new inputs and outputs. So yeah...
I made a 0.1s measurement on a 5GS/s oscilloscope. Saved it as ascii, and opened in Excel to view the curves. It seems Excel has a limited nuber of rows. It could only show some of the 500 000 000 rows... My misstake :-)
@@SirLightfire I worked for a company that went through that it was a custom seed mix calculator that the owner had made. It has been recorded multiple times and is super impressive now.
This is an abuse of spreadsheets - many of them should be replaced by actual databases.
I was once in charge of a spreadsheet that calculated where we stood among our competitors. The input for it was a separate excel file that I had to download each morning with roughly 300 attributes and histories for each of our 2000 competitors that was a couple MB. This 30MB spreadsheet would take that data, parse it by firm and then stitch it together and output a four page PDF (had to be printed to PDF manually) showing where everyone stood. No graphics, no special formatting, just minimal readability items and it was that big. I'd open it, tell it to update and go get coffee. If I was lucky when I got back it would be done. I didn't build it, but I knew the underlying data. If something looked off, I'd go back into the data file, and clean it up, then run it again.
I don't think he has much experience of the amount of data and number of formulae a business can end up sticking in a spreadsheet.
Besides Enron collapsed when excel probably still had an approximately 66,000 row limit or something like that. So 20 years later their spreadsheets look pretty small.
@4:20 - If you have large datasets and / or complex calculations then spreadsheets can get quite large. On top of that, it isn’t feasible to compress some data transformation into tight and barely intelligible formulae if you want to check and maintain it. Once you have to do some real work with data it starts chewing up a lot of cells.
If you use them as a way of capturing a system in total (I work in a calibration lab) then you potentially have a lot of non-data information present - we use a master workbook that processes structured data into a certificate with tables and other information, and the data may come from another system or from a procedure worksheet loaded into this master workbook. This makes for large spreadsheets with a lot of duplicated and reworked information, but auditing is very easy.
The alternative approach of storing all of the data in a database can be useful for some purposes, but makes system changes a nightmare due to the importance of maintaining compatibility with earlier data. Good luck with auditing anything performed several generations of system modifications ago.
I have a spreadsheet I've made for work that actually uses errors to my advantage. I have to compare reports from two different programs. Basically it's an exception report to find out if a value is missing. All from imported data. I have several v lookups on one row looking at different sheets. Then I use "and" and "iserror" to see if every column contains an error. Add filters and bingo I found what's missing.
Excel is awesome, but it's not made for databases. It's perfect for pulling data into from other sources and then analyzing it. That's what it is best at. At least in the business world.
i think the stuff covered in this video proves it falls very short of "perfect" for that use case
Matt narrated Humble Pi on Audible...I'm going to buy this now to use specifically when I want Matt to read me a maths-related bedtime story.
Literally a Parker Story :)
I often set the format of cells manually before populating them and this solves most issues.
I've had memory overflow (I assume?) errors where I paste an entire column of cells into another column and even if I've already marked both columns as plain text to leave the hell alone, autocorrect still noses into random cells and breaks things.
@@tparadox88 When you copy and past a cell, it copies the format. If you right click where you want to paste, you can specify exactly what you want to paste. Since excel was made for "day-to-day" use, copying and pasting everything including formulas and formats is typical. But you have more options such as just pasting text.
This. Just select the column and "edit cells", set the desired format and bish, bash, bosh, no more autoformat errors - until human error inevitably rears it's ugly mug.
All I'm hearing is "set format for cells before you begin writing stuff".
So y'know. BASIC Excel use.
You mean Visual BASIC for Applications ;)
If you open a CSV in Excel, it automatically applies its own formatting.
CSV and Micro$oft is a nightmare for itself.
@@geolawie Only if you don't know, what you're doing ;-) If you use "data -> from text" it will allow you to specify the separator, data-type of a cell etc. A lot of the really bad errors will go away if you do it that way.
@@tetsi0815 That is an extremely big "only if". I started my career as an accountant and was fully qualified as a Chartered Accountant in the UK. Accountants are clearly very big users of spreadsheets (my guess would be the biggest) and how much training did I get in avoiding spreadsheet errors? You guessed it, absolutely none whatsoever. Most people who use spreadsheets for business critical tasks have had the same amount of training that I had.
4:20 *laughs in financial analyst* Creating a 41mb spreadsheet is a normal Tuesday.
You should probably start working in Python. It’s pretty easy to get started with all the material being open source.
I tend to do this thing in Excel (and other spreadsheet programs)
1. Select All.
2. Align vertically and horizontally to center.
if I'm not doing numbers, I also do this:
3. Change type to "plain text".
In my work, I have several times ended up with spreadsheets exceeding 300MB. For instance, I had one where I had several months worth of data at 10 second intervals, from multiple different sensors, and was running VBA code to sort, compare, and find trends. There were a few operations in there that took over 30 minutes to run (even with optimizations). But, it did what we needed it to do.
5:18 Because Word is bad at tables.
And I do conditional formatting in Excel, not sure if that counts as calculation if I just check for value of certain word.
Excel also has things like Freeze Panes, AutoFilter, and can support tables/grids that are wider than an A4 page more easily than Word can. Not everyone has access to Access (ahem), and setting up an Access database for a single stand-alone table seems a touch eccentric.
Matt: most spreadsheets have an error
My google sheets spreadsheet emulating minesweeper: *profuse sweating*
Matt: a lot of spreadsheets run out of room fairly quickly
My spreadsheet calculating the border of the Mandelbrot set to any accuracy: "first time?"
Matt: Spreadsheets are wonderful!
Me, playing Conway's game of life on excel: "yeah, I agree"
Matt: what spreadaheet is 114Mb? Probably has an image.
Also matt: takes a selfie, converts it into RGB and colours hundreds of thousands of cells to recreate the image subpixel by subpixel
Some years ago I did an excel training and one of the first things we learned was the importance of choosing the correct cell format for the data put in them.
Most of these errors shown here are user errors that only happen because people don't bother to switch the cell type from standard to the one that matches the data they put in.
And for large quantities of hard data you should be using a database in the first place.
I am one of those people who uses Excel as a document. Its just handy when I want to show some data - stick stuff in boxes, arrange how you like.
5:25
Because it is a convenient way to neatly store a lot of information.
Like for example results from an experiment to be processed elsewhere,
Payments, passwords, and etc.
The date thing! The date thing! It drives me crazy!
andrew zuo check to see if it’s a setting you can turn off
Just change the cell format to "text"
@ andrew: Know your tools. Don't blame it on the knife when you grab it at the wrong end.
I've gotten used to always putting an apostrophe in front of the data I type into a cell, even if it's not needed.
Quick hack: type ' before entering anything and Excel will treat it as text and not auto-format.
Seems like just the sort of error my home state would make.
Hey, if there is one thing that brings us all together, it’s faulty spreadsheet references.
@@standupmaths What if they were only faulty because when analyzing the spreadsheets, it couldn't find the reference spreadsheet? I mean to use a reference spreadsheet, you generally need to have the other spreadsheet in the same location it was originally specified.
@@standupmaths But were they using a spreadsheet to calculate the number of errors in the spreadsheets? If so, how did they know that it did not contain errors?
6:16
Matt: Let's fire up the old Excel
Me: Microsoft should sponsor this
* sees apple logo *
Nevermind.
Also the fact that half the video is about Excel doing stupid things nobody asked for would probably put them off.
Alfonso J. Ramos Microsoft has been delivering productivity applications for Apple computers since the dawn of personal computers. It even was mandatory when Microsoft dealt with the antitrust trial.
@@becomepostal Indeed. Most Microsoft office applications started on macintosh before there was ever a proper version of windows in the first place...
Also, Excel behaves differently on Mac versus PC.
Many years ago I worked with an accountant who, in my opinion, could make a spreadsheet "sing." He could do almost anything - and
very quickly. He was so good with it, and so used to it, that he would do text stuff with it as well. It was his automated assistant of choice!
I've made some spreadsheets in the past where an error was itself an appropriate answer (or at least a usable substitute to an empty cell)
Weren't you supposed to use iferror for that?
@@herumuharman6305 pretty sure iferror was only available from 2007 onwards
@@herumuharman6305 Sure, but my point is the errors weren't actually errors and I suspect other people may also have done this, making a statistic about the number of errors in a bunch of spreadsheets harder to interpret. Not everyone makes spreadsheets how they should, me included when doing some small non mission critical calculations
@@thomasmckeown10, before iserror you could use if(iserror(•),•,•)
And there's some cases where you do want an N/A to show, there's even a function for it na().
Eg, you have a table of marriages, and for some table of people you've got a column for spouse's name. If they have none, then N/A semantically makes sense there.
"That's not my phone number, that's my FAX NUMBER"
-Matt Parker 2020
More interestingly, knowing Matt and a quick google make me think that might number might actually work, but I don't have a fax to test it :s
At what time does the quote "day-to-day scenarios" appear on the screen? 13:59
"Why even use a spreadsheet if you're not gonna do any calculations?"
Really important for keeping data! As a teacher I have to provide a lot of data to management, e.g. test results. Sometimes management ask us to use a Google Sheet, and I can just open up my own spreadsheet and copy -> paste-as-values.
Sometimes it's a Google Doc, and copy-paste usually fails spectactularly - I have to type each one by hand, taking ten times as long!
March 5 is my birthday. There you go, a piece of information you will never need or desire :')
just because a spreadsheet has errors it doesn't mean the designer made a mistake and doesn't mean the sheet is not functional... maybe the designer opted not to use IFERROR. yes the sheet looks ugly with errors but doesn't necessarily mean it's broken.
(just saying, the 90% statistic is probably misleading)
To answer 5:16 at least for me. We copy+paste as values to strip formulas and look-ups.
'If you want a cell value to remain a string, prepend it with an apostrophe.
"When spreadsheets attack"
_I was really hoping for a Battleship game_
There is one here if you'd like: github.com/rubberduck-vba/Battleship/releases
Another frequent error: if you import a .csv-file (which stands for Comma Separated Values, used by many professional applications) containing two columns of integers into Excel, but your region is set to a country that uses a comma as decimal separator (basically all countries that haven't been British colonies), Excel will read it as a single column of non-integer numbers...
well, you can set what excels uses as the separator when importing
IIRC (not having Excel on my PC) in the initial import screen, you can change the character of the column delimiter from the default "," to "." (or whatever you choose?)
@@kalolord Unless that changed recently, you can only set the field separator but not the decimal point
This video quality is way to good for a vlog about spreadsheets :D
When I first started keeping track of my DVDs, the size of an Access document seemed so huge compared to putting it into Excel. Now it's JSON formatted.
Two things I would say. Firstly, if you're including CSVs in this, then of course they will have no calculations. This wasn't made clear in the video.
Also, in regards to having things like #DIV/0! and such, some people have sheets where they will need to make calculations, but they haven't entered the data yet, for example: For my own interest, I'm keeping track of the number of British players in each Premier League's starting XI every week. I go through the lineups and manually enter the number between 0 and 11 for each team for each gameweek. At the end of the row I have a simple =AVG(A2:U2)/11 sum to show the percentage of players that started that week that were British. We are only on gameweek 24 or 25 at the moment, so there is still a lot of time left in the season, but when I update this sheet once a week I don't want to have to enter the data then drag the row above down or copy the formula every time, I just have it set up already so that once I've entered the data, the percentage shows. This leads to a full column of percentages followed by errors. There isn't really any errors happening, I'm just being lazy, but since the sheet is just for personal use, and if I show it to other people, it will just be at the end of the season once we have played 38 matches, it doesn't really matter.
I suppose it is different if these are email attachments that people are sending to each other, but the point still stands I feel.
I appreciate how steady your camera is. I would dearly love it if other walkabout UA-camrs had something similar on their selfie-stick.
Hi Matt, I work as a data analyst. I use excel a lot. Its really easy to make a 41MB file, you just have to copy a format down to the bottom of the page. Also, divide by 0 errors etc are often not mistakes, just cases of not finding a certain value that has a LY figure or an index match that doesn't find a value
I'm amazed at how you can walk down a busy street, and say this huge long script into a camera, without messing up.
This is why it is very important to hire someone who has knowledge of using a database.....numbers/calculations and dates all right-format, and text left formats, keeping an eye on the outcome will clue you in on whether the spreadsheet altered your stuff as you enter it. The apostrophe is very important as well. I track my stock account, along with 2 bank accounts with 2 sub accounts (mine and my gf) with forward financing and budgeting against the debt-load for each year ahead of time.....all in a spreadsheet, no mistakes.
Have someone who isn't qualified, and your business will surely turn into Enron.
I spend a lot my time with spreadsheets and I work in telephony - so the first rule is that telephone numbers are not numeric data they are character data and the cells containing them need to be formatted as character data (i.e. text) - then there is no problem. The same is true for any data that can be misconstrued as something it's not. Format the cells with the correct number format (which includes, text, dates, etc.) and then you don't get most of these errors. The 'general' number format is only the default. The problem is Excel is very accessible, but then you get people who don't really know how to use it using it for very important stuff.
THIS
A nice bonus of more modern version of Excel comes with the increased sheet size: It's easy to run the computer out of memory and making it grind to a halt. A colleague asked me to export a database table (lots of rows AND lots of columns) so he could work on it in Excel. All my replies of "This doesn't make sense" and "this wont work" were swept aside. So i exported a CSV file. Easy enough, wrote a tool in about 5 minutes to do it properly.
He then spent the next few days trying to import a 3 Gigabyte CSV into Excel. It was technically inside both the row and column limits of Excel, but his computer would run out of RAM, start swapping to disk, run out of disk space and then crash either Excel or Windows. He kept trying different ways of loading it (double-clicking, "File->Open", marking the file in Explorer and then pressing Enter...), tried different Excel versions, even tried it on a few other computers. He was absolutely refusing to believe that Excel couldn't load such a massive file, and he kept on trying and trying and trying. It was glorious to watch.
These statistics could be horribly mistaken tho. Spreadsheets in corporations are quite often "connected" to databases via ODBC and get their data from them. Without the databases (usually Access databases in shared network folders) the spreadsheets will be broken. I wonder if the researcher accounted for this fact.
Of course 90% of spreadsheets contain an error!
Spreadsheets are used by people who became Managers because they were not good enough at maths to become Engineers (and Engineers use more reliable tools than spreadsheets)
I've had some Project Managers make all their decisions based on spreadsheets... (that's 99% of managers working at big companies, btw)
Their whole job revolves around spreadsheets for performance measurements, etc.
They provided use with some spreadsheets and asked us to fill it for them (as if it we were not busy enough with all the spec changes...)
The spreadsheet contained a HUGE error that showballed in the calculation.
Fixing the error would have been time consuming, absolutely nothing good would have come out of it (the figures would have looked abysmally bad), and we had actual work to do.
Considering that our whole department was being shut down to be outsourced, we couldn't care less about the spreadsheet being accurate or not...
Being from Utah, the only thing that surprises me about that Department of Education fact is that I'd never heard it before.
4:39 175 worksheets sounds like a lot, but I have a few that probably get close to that. I use worksheets that are structured identically to do calculations "vertically" between worksheets, and those calculations are usually easier to do and to troubleshoot (and less error prone) if you have shorter formulas and many worksheets vs. long convoluted formulas in one or two worksheets. Plus when I have many worksheets, I tend to add blank worksheets between them as "divider tabs", which further increases the number of worksheets.
Don't let this dude's gatekeeping get to you. There are obviously more valid ways to use Excel than he understands.
A lot of the spreadsheets I have to make involve copy and pasting data from a database query to a spreadsheet so users can use the built-in functionality of the spreadsheet to filter and what ever else they like to do. Those users do not have access to query themselves, even if they did, they are business people, they don't know how to write queries. *Edit:* I typically convert every cell into a Text data-type so it won't remove the leading 0s of a value and other interesting conversions.
I once abused a spreadsheet to design speaker enclosures. It had all the Thiel Small parameters for every driver I could get. One neat feature was that I could enter a box volume and then sort by F3, and get a list of likely good matches. But it was slow at heck, so slow that automatic recalculation had to be turned off... This was back in the days when a 386 processor with the math co-processor was a powerful machine.
I had a spreadsheet in high school that I managed to get up to 5gb. All text, and formulas, no pictures.
You could get 83k errors in a spreadsheet by screwing up a single cell if your sheet is complex enough. But it does sound exciting and noteworthy.
Some of my Excel problems is from having different decimal separator and digit grouping standard. "." For digit grouping and "," for decimal separator.
If your region settings are correct that shouldn't matter...
But if your source data comes from multiple countries...
Yeah. Ouch.
@Богдан Кондратов You could search for . and replace it with , for example. Not the best work around, but it should work if done correctly.
Many people hear "database" and think of a complicated scary computer thing.
People, databases are *just* like excel. It's a grid of data, with tons of tools to sort, filter, and display that data. Many modern database programs can be learned in an hour, and can be run entirely on your laptop.
The only difference is that they require you to specify what type of data goes in each column, and they are optimized for huge amounts of data. Both of these features solve allll of the problems in this video.
It's not complicated: when all you are trying to do is store data, **use databases.**
Know the limitations of your tools, and when to change tools. Essentially every example listed is user error - poor data input, misconfiguration, etc. When values exceed the capabilities of Excel, you should have moved to a more capable tool (like a database) a long time ago already.
In case anyone are wondering, you can make Excel treat whatever input you give it as a string, without autocorrection, if you write it with an apostrophe first, like: '00974528 would be treated as 00974528 without removing zeroes
The 42.2% of spreadsheets were probably acting more like a database. They may not have calculation errors but they could have data entry errors. Of course, you wouldn't be able to detect those errors unless you knew what the correct values were supposed to be.
I was trying to type HSA into a my medical tracking spreadsheet (to show payment method) and it kept turning it to HAS. The usual tricks wouldn't work (formatting, copy and pasting from word, whatever). The only way I could get Excel to keep it as HSA was to put a period at the end.
Put an apostrophe before numbers to force excel to interpret as text.
Yeah, but what if you're importing data from an external source, instead of typing it in manually?
The transition between shots works really well
He's our friendly neighborhood mathematician
"A lot of those spreadsheets were password protected" LOL! The sign says 'please do not cross', we'd better go home.
90% of all spreadsheets are "Parker Spreadsheet"
@5.40 As an accounting graduate, I want to take a moment here to say that I don't see #DIV/O and #NULL errors appearing on a spreadsheet to mean someone made a mistake. When in my courses we used spreadsheets to compare income statements or balance sheets between companies or between periods, we would frequently have whole columns of division showing a % difference. It simply took more work to skip over the blank rows when setting that up, so unless someone went out of their way to make their spreadsheets neater, which doesn't make any difference at all to the important information we want out of the spreadsheet, it was more likely than not to contain a type of error like this.
Sometimes I would make and save a spreadsheet that functioned like a form as well, and it might have errors of this kind show up before all the fields were filled in. Doesn't mean it isn't doing what we wanted it to.
Those auto-corrects are really frustrating, though, that much is true. Even if it's not re-typing your data, if you care about casing or are using text strings that look like common words but aren't and aren't supposed to be, it will consistently capitalize your first "word" and autocorrect your "spelling". If anyone knows of a way to turn off those "features", I'd honestly be glad to know about it.
JavaScript - "I am the worst language when it comes to Dynamic Type"
Excel - "Hold my Spreadsheet"
The best reason I know of for having a spreadsheet without any calculations is because MS Excel is the most stable of Microsoft's Office programs. It won't crash or go wonky on you like Word or PowerPoint.
Long live Spider-Matt.
Spider-Matt,
Spider-Matt.
Does whatever the spider can't.
I think Matt makes for quite a Parker Peter.
@@garr_inc Glad your head did that too
Could it be that the excel spreadsheets with errors were being sent over email to someone that was expected to resolve the errors? So you’d see a higher percentage of invalid sheets because people might be more likely to email a broken sheet. Though this may have been accounted for when they removed duplicates, but that would depend on how they determined duplicates.
14:25 - And this, is when I liked the video.
They call me the king of the spreadsheets
Got 'em all printed out on my bedsheets
My new computer's got the clocks, it rocks
But it was obsolete before I opened the box
It seems that the errors found were only the "obvious" ones, but things like using SUM when AVERAGE was intended could not be caught so easily
I have never been so bored by and so interested in any video in my life.
I commonly work with Excel files that when I import CSV data the file ends up larger than 50MB
Also some times you want to find errors on purpose such as #DIV/0 ect
also define your variable type guys
@@mcdura agree with all 3 of your comments!
@@Shady_Lane hell yea
I work with an actual 41MB spreadsheet and it is just numbers, filters, pivot tables and charts. It takes a good minute to load and about 15 seconds to apply a filter on one column.
6:36 of course it is.
I have firsthand experience with spreadsheets attacking. For an intro stats class, I had collected employment data, including asking people about how many hours per week they worked. Since exact precision wasn't important for what we were doing with that data point, we chose to ask it categorically. 0-5, 6-10, 11-20, 20+ hours worked per week. The spreadsheet we used to collect survey data very helpfully solved our little problems we had put in each cell, so we were left with data ranging from -5, -4, -9, and 20+. The data were small and the consequences were minimal but it was still incredibly frustrating.
0:02 A ringing endorsement! 🤣
I feel like he's barely containing laughter from the realization he can use his audible advert time to advertise his book.
The "feature" that has caught me out more than once is that -A1^2+B1 gives a different result to B1-A1^2
Why, what happens? I'm assuming it is to do with the minus sign not being taken into account when squaring?
wait how is it different what happens lol
Maybe parentheses would fix it. It might be reading it as the square of a negative number in the first one and subtracting the square of a positive number in the second one.
@@ashtonhoward5582 Yes, that is precisely it. Now I know this I take care but occasionally forget as a transfer a formula from the back of an envelope into Excel!
DEC2HEX is Matt Parker's boy band name.
Thank you for providing a good example of why spreadsheets are not databases.
"40% of spreadsheets don't contain any calculations.
Why-oh-why would you ever have a spreadsheet without calculations???"
One word: database
One more word: It's also common practice to copy -> paste special -> values only before emailing spreadsheets.
@ especially when emailing your boss a forecast and you don't want him to find out it was generated with =RANDBETWEEN()
I've not encountered a lot of auto-correct problems in my spreadsheets, but I also converted a while back to using Google Spreadsheets instead of Excel. Surely you can just change your settings so that it won't auto-correct?
Also, when it comes to the typical "errors" you can get (#NUM!, #DIV/0!, etc.) sometimes those can be intentional. I know I've sometimes had formulas that I expect to sometimes yield one of those errors, and I have an "IFERROR()" formula set up to do something in those cases. So just because a formula yields those errors does not mean there's an oversight or error in the intended function of the spreadsheet. Sometimes the user expects those errors, and designs the spreadsheet to deal with them.
6:00 A spreadsheet with 83,273 mistakes?
A ... Parker spreadsheet?
I can only imagine that the spreadsheet wanted access to external resources, which weren't available to the spreadsheet risk group. I believe you can get a huge number of errors if your spreadsheet is based on that. Though I also don't want to believe that they would count that as an error, just because they didn't have these missing resources.
A Parker spreadsheet is where you gave it a go, but didn't quite get there. Having tens of thousands of mistakes is just being a bumbling idiot.
Just having a #NUM, #VALUE or similar doesn't mean there's a calculation error. Some of my sheets that have no errors generate such things intentionally (or intentionally don't trap or otherwise prevent them) as part of the way that they work.
0:40 I live in wisconsin and I been to the circus world museum and it's barabOOOOO.
" Sometimes the spreadsheet can hit the fan " one of the most underrated jokes of the century
Matt, these single long takes are really impressive. Well played 👏👏
I wonder how many of the emailed spreadsheets were somebody saying "I broke this spreadsheet, could you please fix it?"
In general, I feel like "errors" is ill defined in this case. Returning "N/A" doesn't mean there's an error with the workbook.
The moral of the story:
Learn to code
@Богдан Кондратов well... i know the tools, and i don't appreciate these features. because there is no option to get rid of them. all those workarounds cost too much time.
Business users (non-IT) don't want to code. They need tools that help them solve their problems without coding. Heck, even IT these days don't want to code ("no code" being the buzzword) as code needs to be maintained or it breaks sooner or later.
The Hungarian version of Excel uses regional settings, for example decimal comma instead of decimal dot. I gave the input CSV file to students recently containing floating point numbers to 2 decimals using decimal dot (as if you read the file in any programming language, your program will expect decimal point), some of the students were interested how the data looks, and they opened the CSV file in Excel, which showed a lot of dates: every time if the integral part was between 1 and 12, and the fractional part between 1 and 31, it converted to date, and a mere type conversion produced some weird numbers. Libreoffice had no issues :)
That feeling when you've read the Book, and all next videos on this channel are spoiled.
Dream speed run: