Scanf Basics: the good, the bad, and why so many pointers?

  • Published 16 Dec 2024

COMMENTS • 85

  • @benjaminrich9396
    @benjaminrich9396 1 year ago +50

    Jacob, these 'deeper look at the basics of C' kind of videos are really useful. Great stuff.

  • @AceAufWand
    @AceAufWand 1 year ago +24

    Scanf is full of amazing stuff when you consider it, especially its behavior when the format string contains text that is not a format specifier. For example, something like:
    result = scanf(" %*dinput%d%c", &x, &trailing_char);
    Which is probably the closest thing standard C has to regex.
    When starting with C, I thought that the best part of printf and scanf was the print and the scan, but once I started to understand what they did, I realized that the best part is actually the f.
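
Editor's sketch of the format string above, using sscanf so the input is visible in the code. The helper name `parse_tagged` and the sample inputs are made up for illustration; only the format string comes from the comment.

```c
#include <stdio.h>

/* Hypothetical helper: parses text shaped like " 12input34;".
   Returns the number of items stored (2 on a full match). */
int parse_tagged(const char *text, int *x, char *trailing_char)
{
    /* " "      skips any leading whitespace
       "%*d"    reads an integer and discards it (no argument consumed)
       "input"  must match the input text literally
       "%d"     stores the second integer in *x
       "%c"     stores the very next character, whitespace included */
    return sscanf(text, " %*dinput%d%c", x, trailing_char);
}
```

Called on `"  12input34;"` this should store 34 and `';'` and return 2. If the literal text does not match, the scan stops there and the return value tells you how many items were stored before the mismatch.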

  • @redcrafterlppa303
    @redcrafterlppa303 1 year ago +13

    11:43 calling fflush on stdin is undefined behavior according to the C language standard. It just happens to work most of the time. A better solution would be to read() from stdin until it's empty.

  • @beeeeeee42333
    @beeeeeee42333 1 year ago +3

    I really love that someone addresses scanf and printf as functions and goes on to explain pass by value and pass by reference, since many tutorials just present them as "the way to get input and display output"!

  • @BryanChance
    @BryanChance 1 year ago +6

    This explains so well why I love C. I initially had all the problems you described here with using scanf(). But when I read up on the function details, I found a solution which you described here (with the nextchar, buffer flush, and while loop). Now, some would say that's a lot of work for such a simple thing. Well, I've never written a C program that only uses scanf(). So there's a little bit of setup: put this in a function and call it from the rest of your code. What it gives you is the building blocks to read an input. Let's say my input is a huge file and I've pre-checked it and the data is clean; then it's faster to use scanf() without all the error checking. And it makes sense why scanf() works the way it does; it's not some kind of magic. And the buffer overflow? Again, put the check in your custom "input function" or something. LOL modern programmers.. (err, developers rather). At least in my experience, it takes some time to figure out C.. it's not Python, where you can just slurp an entire text file into an array, even if the file is 10 GB. LOL

  • @unperrier5998
    @unperrier5998 1 year ago +12

    A way to work around buffer overflows is to use "%ms" and provide the address of the string variable. scanf will allocate the string for you (similar to asprintf).
    It is part of the POSIX standard, so it is not a problem on Linux, but it is not available on Windows (which is compliant with the old POSIX-1 standard) and likely not on embedded systems.
    Note that it used to be "%as" (very old compilers/libc).
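
Editor's sketch of the "%ms" idea, assuming a POSIX.1-2008 C library such as glibc. It is not ISO C, so a strict compiler may warn about the `m` modifier. The helper name `first_word` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd copy of the first
   whitespace-delimited word in text, or NULL if there is none.
   The caller must free() the result. */
char *first_word(const char *text)
{
    char *word = NULL;

    /* With the POSIX 'm' modifier, the argument is a char ** and
       the library allocates a buffer large enough for the word. */
    if (sscanf(text, "%ms", &word) != 1)
        return NULL;
    return word;
}
```

The caller owns the returned buffer, which is the same contract asprintf uses.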

    • @anon_y_mousse
      @anon_y_mousse 1 year ago +1

      A better and standard compliant way is to just write your own getline() equivalent and make sure it's rock solid and use it everywhere. Or to find a good library that does it for you and use that. You can also just be fine with the fact that you might have more of the line in the buffer than you've read and just skip what's left since that's very likely what most people will do anyway.

    • @DatBoi_TheGudBIAS
      @DatBoi_TheGudBIAS 3 months ago

      Idk if it is safe (it probably is, it's very convenient), I just use %<number>s, where <number> is the size of the array - 1 (and the array is 0-initialized just for convenience).
      But I don't like %s cuz it stops at whitespace, so I actually use %[^\n], and yes, ik fgets exists, but I HATE that fgets consumes the Enter too
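
Editor's sketch combining the two ideas in this reply: a field width of "array size minus one" and the `[^\n]` scanset. The helper name and the 20-byte buffer are assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copies at most 19 characters of the first
   line of text into out (which must hold 20 bytes), spaces included.
   Returns 1 if anything was stored, 0 if the line was empty. */
int read_line_text(const char *text, char out[20])
{
    out[0] = '\0';  /* %[ stores nothing when the first char is '\n' */

    /* 19 = sizeof out - 1, leaving room for the terminating '\0'.
       [^\n] matches every character except a newline and, unlike fgets,
       leaves the newline itself unread. */
    return sscanf(text, "%19[^\n]", out) == 1;
}
```

One caveat worth knowing: on an empty line the scanset matches nothing, so the conversion fails and the newline stays in the input. Code reading from stdin this way still has to consume that newline somewhere.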

  • @NinaNanni
    @NinaNanni 1 year ago +1

    I swear you beautiful YouTube creators are going to make me the programmer I strive to be. Amazing and thorough work! I wondered why one function took 15 minutes and came here with no expectations, but you surely had beautiful tools to educate about. Thank you!

  • @mirrors.of.reality
    @mirrors.of.reality 1 year ago +1

    I am really glad I found your channel, very useful information and well presented. Thank you!

  • @megachar0x01
    @megachar0x01 1 year ago +1

    Just adding more info:
    The example code at 7:26 does create a buffer overflow, but it may only cause the program to terminate in a controlled way. With default compilation a stack canary is placed on the stack, and when it is clobbered the program exits. On top of that, ASLR makes it hard to get a shell. So, in short, to exploit even a simple buffer overflow we also need a memory leak.

  • @zxuiji
    @zxuiji 1 year ago +4

    8:26, last I checked you could use the ".*" modifier to constrain by variable length such as:
    #define LENG 31
    char name[LENG+1] = "";
    scanf("%.*s", LENG,name);
    At the very least I'm going to support either that or just * in my custom version of it later, using parsef for my naming scheme instead of scanf though, lines up nicely with printf :)

    • @31Uluberlu
      @31Uluberlu 1 year ago +1

      I'm afraid this format only works with printf:
      printf("%.*s", 5, "Hello World!"); // Hello
      not scanf.

    • @infastin3795
      @infastin3795 1 year ago

      It is not supported anywhere. The precision specifier is a printf-only thing.

    • @zxuiji
      @zxuiji 1 year ago +1

      @infastin3795 perhaps I mis-remembered then, oh well, I'm making a library that'll be a possible stand-in replacement for libc etc, not for the symbols like musl, but for cross platform stuff, it's called paw, everything is prefixed with paw as well so they can be used together, I'll post a link eventually once I'm satisfied the 1st version is reasonably feature complete, including threads, mutices, semaphores, graphics, ux, all that "fun" stuff that stdc chose to ignore at the start

    • @anon_y_mousse
      @anon_y_mousse 1 year ago

      @zxuiji Good, everyone should do that at least once in their career, especially if they intend on seriously using C.

    • @zxuiji
      @zxuiji 1 year ago

      @anon_y_mousse What, make a library? Make a custom printf/scanf? You weren't clear as to which of my comments you were replying. Side note, just yesterday I managed to finish my pseudo mutices, the main issue I always had with pthread_mutex_t etc is that it was never defined exactly what happens when a thread tries to delete the mutex at the same time another tries to lock it, after a number of re-thinks I finally arrived at an octal permissions based design.
      The mutex requires data to be attached to it during creation along with a type string & callback for what it should do when deleting said data (which can only be triggered when no thread is capable of attaching to the mutex), as a bonus I added a couple of prev/next pointers to create a linked list GC with, the only time the GC is ever searched is when a thread declares it is abandoning all mutices it has permission to attach to, the owning thread ends up in a blocked state but the rest just happily kill their own permissions and attachment count.
      It was a real task to create such a mutex but now I'm comfortable using it in a multi-threaded environment since I know exactly what will happen if one thread tries to delete the mutex while another is trying to lock it, the delete will just not happen because it will detect other threads still have permission to attach and just refuse to remove the owner's permissions when it tries, the owner's permissions have to be removed before deletion is triggered so the only way for something unexpected to happen is if the dev is being stupid by not clearing their pointer after revoking the permissions of their thread.

  • @ZeroCool2211
    @ZeroCool2211 1 year ago +4

    Just one note: fflush has undefined behaviour when it is used on stdin, because it was originally made for output streams such as stdout

  • @coderstubechannel
    @coderstubechannel 1 year ago +2

    This video on Scanf Basics is an absolute game-changer! It's exactly the type of content I've been searching for. I'm so glad I stumbled upon this video. It has inspired me to create more programming content on my channel. Thank you for sharing your knowledge, I can't wait to see more from you! 🦾

  • @kellingc
    @kellingc 1 year ago +1

    This is cool. I can think of several instances where I want mixed input say like the value then units like 13' 4" or 37F.

  • @aj.arunkumar
    @aj.arunkumar 1 year ago

    thanks jacob. i now understand why my scanfs didnt work in college days...

  • @lean.drocalil
    @lean.drocalil 1 year ago

    Yet another great video ❤

  • @HansBezemer
    @HansBezemer 1 year ago +6

    I've been programming in C since 1987 and I can honestly say I *NEVER* used scanf() in any of my programs. Its behavior is simply too murky for my taste. Like you said, I'd rather read in the whole shebang using fgets() and tokenize the whole bunch myself (not necessarily using strtok() for that).
    Just for fun, I've been doing a sscanf() like routine for my own Forth compiler - and still: some behavior of scanf() was baffling me. A few changes I made:
    (1) My ”SSCANF” really doesn't like whitespace - neither in the buffer nor in the format string. When it encounters it, it will vehemently look for the first ”non-white space” character and resume parsing from there. Which means that these format strings are equivalent: "%c %c %c" and "%c%c%c".
    (2) When parsing it takes a real good look at the delimiters you defined in the format string. If you define: "%s" and your buffer contains ”Hans Bezemer”, it will parse the entire string. However, if you define: "%s ", it will only parse ”Hans” and leave the rest of the buffer unparsed.
    The upside of all this is, is that strings in the buffer are *not* automatically delimited by whitespace. Take "Invoice issued by [%s] on %u-%u-%u". If we feed ”SSCANF” this buffer: "Invoice issued by [Hans Bezemer] on 2022-04-03", it will happily read the entire ”Hans Bezemer” - and not just ”Hans”.
    Still - although it was lots of fun to develop, I've never used it (yet) in my own Forth programs. I still don't trust it with real world data ;-)
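
Editor's sketch: standard sscanf can read the invoice line above too, but not with %s, which stops at the first whitespace. A scanset that excludes `]` does the job. The struct and function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical record for the sample invoice line. */
struct invoice {
    char issuer[32];
    unsigned year, month, day;
};

/* Returns 1 when all four fields were parsed, 0 otherwise. */
int parse_invoice(const char *line, struct invoice *inv)
{
    /* %31[^]] reads up to 31 characters that are not ']', spaces
       included. A ']' placed right after '^' is a literal member of
       the set, so the scanset is closed by the second ']'. The third
       ']' is ordinary text that must match the input. */
    int n = sscanf(line, "Invoice issued by [%31[^]]] on %u-%u-%u",
                   inv->issuer, &inv->year, &inv->month, &inv->day);
    return n == 4;
}
```

Fed "Invoice issued by [Hans Bezemer] on 2022-04-03", this should store the full name and the three date fields.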

    • @grimvian
      @grimvian 1 year ago +1

      Agreed: "I'd rather read in the whole shebang using fgets() and tokenize the whole bunch myself"
      As part of my training as a C beginner, I wrote all the string handling myself and getchar() would be my first try.

    • @anon_y_mousse
      @anon_y_mousse 1 year ago

      I know how you feel. When I first implemented my own libc I couldn't stand doing things in just the way the standard defined and wound up just writing a completely new definition of a library from scratch. I wound up adding data structures and algorithms and still use it to this day for all of my in-house projects.

    • @DatBoi_TheGudBIAS
      @DatBoi_TheGudBIAS 3 months ago

      I'm stuck to scanf cuz I don't want fgets to consume the enter keypress

  • @Thwy
    @Thwy 1 year ago +10

    as far as i know, fflush(stdin) has undefined behavior and it doesn't work with GCC on Linux.
    You're living dangerously there.

    • @HansBezemer
      @HansBezemer 1 year ago +2

      True. I needed to do that myself - and in order to achieve that (portable across several very different platforms and compilers) I had to read it until EOF.

    • @5cover
      @5cover 1 year ago

      ​@HansBezemer which should not work since stdin is never eof.
      The terminal just pauses your program and prompts when the buffer is empty.

    • @Thwy
      @Thwy 1 year ago

      @5cover The stdin can have an EOF.
      try to run
      ./myprogram < test.txt
      The stdin will be the file "test.txt" and it will reach EOF.

  • @23trekkie
    @23trekkie 1 year ago +1

    Typing in letter when program expects a number:
    Python - throws an error and ends 😎
    Pascal - throws an error and ends 😎
    Commodore 64 basic - just asks again 😎
    Go - assigns 0 to the variable 😎
    C - infinite loop until your CPU is on fire🤬
    (yes, I know I can walk around this using char array and fgets and sscanf functions, but still...)

    • @5cover
      @5cover 1 year ago

      Because scanf was designed to read trusted data (such as formatted data in files), it needs not check for errors.
      The C standard library doesn't offer a way to read user input from the console because there was no need for it. I mean, have you ever performed console input in a "real" project? Apart from the usual (y/n) confirmation (which can trivially be implemented with getchar, as you only need to read a single character), you seldom see console user input as all the information necessary is specified in the command parameters.
      But it's easy to implement it using fgets and functions such as strtol.
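
Editor's sketch of the fgets plus strtol approach mentioned here. Because fgets consumes the whole bad line, a letter typed where a number is expected cannot cause the endless loop described above. The function name, the 64-byte buffer and the retry policy are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads lines from in until one contains a
   whole number and nothing else. Returns 1 and stores the value on
   success, 0 when the input ends first. Lines longer than 63
   characters are handled in pieces, which a real program may want
   to treat as an error instead. */
int read_long(FILE *in, long *out)
{
    char line[64];

    while (fgets(line, sizeof line, in) != NULL) {
        char *end;
        long value = strtol(line, &end, 10);

        if (end == line)
            continue;       /* no digits; the bad line is already consumed */
        while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
            end++;
        if (*end != '\0')
            continue;       /* trailing junk such as "12x" */
        *out = value;
        return 1;
    }
    return 0;               /* EOF or read error */
}
```

An interactive program would print its prompt before each call and an error message after a rejected line. Range checking via errno is left out here to keep the loop short.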

  • @ukaase
    @ukaase 1 year ago

    hi jacob can u make a video about register level programming in embedded systems would be a nice topic. love so much your content

  • @kevinyonan2147
    @kevinyonan2147 1 year ago +2

    I do know a trick with scanf to have variable-size reading limits for strings. I build the format string itself at run time: I convert the size of the buffer to a numeric string, sandwich it between the '%' and the 's', and then use that as the format. If your buffers are always a fixed size, you can optionally use a macro that turns the int into a string literal: `#define INT_TO_STRING(x) #x`

    • @Hauketal
      @Hauketal 1 year ago

      There is a width option "*" for scanf. You add an extra item in the parameter list for the width.
      Example:
      char name [20];
      scanf ("%*s", sizeof name - 1, name);
      Works too if the size is not a literal, but a parameter to your function.

    • @kevinyonan2147
      @kevinyonan2147 1 year ago +2

      @Hauketal it doesn't work. the `*` skips stuff. you're thinking of the `*` for `printf`.
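
Editor's sketch of the compile-time variant from the top of this thread. When the width comes from a named constant rather than a bare literal, the stringify macro needs a second level so the constant is expanded before `#` is applied. All macro and function names here are hypothetical.

```c
#include <stdio.h>

#define STR_(x) #x
#define STR(x)  STR_(x)   /* extra level: NAME_LEN becomes 19 before # runs */
#define NAME_LEN 19       /* maximum characters, excluding the '\0' */

/* Hypothetical helper: reads one word of at most NAME_LEN characters. */
int read_name(const char *text, char name[NAME_LEN + 1])
{
    /* Adjacent string literals are concatenated, so the format is "%19s". */
    return sscanf(text, "%" STR(NAME_LEN) "s", name);
}
```

For buffer sizes known only at run time, the format string has to be built with snprintf instead, since in scanf a `*` means "read and discard" rather than "take the width from an argument".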

  • @DatBoi_TheGudBIAS
    @DatBoi_TheGudBIAS 3 months ago

    U should do a video on the safe versions of these functions. I have no idea why printf_s exists tbh, but scanf_s is understandable. It allows U to specify the length of the string, and others. Idk how exactly to use it

  • @xCwieCHRISx
    @xCwieCHRISx 1 year ago +1

    I use fgets combined with sscanf to read user input.

  • @AminKamali7
    @AminKamali7 6 months ago

    Bravo!!

  • @skeleton_craftGaming
    @skeleton_craftGaming 1 year ago

    Regarding the last question, at least: because C doesn't have C++ style references... In C++ you should always use std::cin (or std::ifstream for places where you would use fscanf) partially because of the aforementioned pointer issues...

  • @abdoemad552
    @abdoemad552 1 year ago

    Awesome ❤️❤️❤️
    Could you tell us the name of the font you are using in VS Code?

  • @31redorange08
    @31redorange08 1 year ago +1

    Why doesn't it skip the second scanf when there's apparently still the '\n' in the buffer?

    • @31Uluberlu
      @31Uluberlu 1 year ago

      Most formats including "%s" consume and discard leading whitespace characters. Only "%c", "%[...]" and "%n" don't.
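
Editor's sketch of that difference. The two hypothetical helpers differ only by one space in the format string.

```c
#include <stdio.h>

/* "%d%c": %c does not skip whitespace, so it stores whatever
   character follows the number, typically the newline. */
int read_raw(const char *text, int *n, char *c)
{
    return sscanf(text, "%d%c", n, c);
}

/* "%d %c": the space in the format consumes any run of whitespace
   (newlines included) before %c stores the next visible character. */
int read_skip(const char *text, int *n, char *c)
{
    return sscanf(text, "%d %c", n, c);
}
```

Given the text "42\ny", the first version stores the newline in `*c` and the second stores `'y'`.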

  • @noahvanmiert
    @noahvanmiert 1 year ago

    Hey, can you maybe make a video about filesystems?

  • @charankoppineni4498
    @charankoppineni4498 1 year ago +1

    Where can I buy this t shirt ?

    • @JacobSorber
      @JacobSorber 1 year ago

      It should now be available on my store

  • @tshaka_
    @tshaka_ 1 year ago

    There's also the bounds checked scanf_s from C11.

  • @germankoga8640
    @germankoga8640 1 year ago

    So I should better stop using scanf in favor of fgets? at least for strings, in case of numbers I guess I'm stuck with scanf

    • @RobBCactive
      @RobBCactive 1 year ago +1

      You can read input into a buffer with fgets, then process that data with sscanf was something he said.
      Using scanf is OK in personal programs but not in general professionally.
      In general reading a file format, you analyse the input returning a token to say what type of input text you found and have the value available, converting a string of digits into a number. It's called lexical analysis and parsing for the grammar rules. That way you can detect errors and avoid buffer over runs and overflow.

  • @nunyobiznez875
    @nunyobiznez875 1 year ago +1

    fflush(stdin) on an input stream is undefined behavior. It works on Windows, where there's poor standards compliance. But MSYS2, Cygwin, BSD, and Apple all should use fpurge(stdin) and Linux uses __fpurge(stdin) from Solaris, declared in stdio_ext.h. They really need to ISO standardize fpurge(), or even POSIX, but neither board likes moving too quickly and it hasn't been 40 years yet 🤣. It's a bit of a mess. I just took the time to write my own header, so I can use a fpurge() macro that'll select the correct function for the system, that can be dropped in and used everywhere, when I need to write portable code. Or there's also the minimally flawed alternative: while((getc(stdin) != '\n'));
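
Editor's sketch of the getc loop with its flaw addressed: the one-liner never terminates if the stream reaches end of file before a newline, so the version below also stops on EOF. The function name and the returned count are illustrative choices.

```c
#include <stdio.h>

/* Hypothetical helper: discards the rest of the current line.
   Returns the number of characters thrown away, not counting the
   newline itself. Uses only standard C, so it is portable where
   fflush(stdin) and fpurge() are not. */
int discard_line(FILE *stream)
{
    int c;          /* int, not char, so EOF can be told apart */
    int count = 0;

    while ((c = getc(stream)) != '\n' && c != EOF)
        count++;
    return count;
}
```

Note that this blocks on an interactive terminal when the buffer is already empty, so it is best called only after a read that is known to have left the rest of a line behind.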

  • @zxuiji
    @zxuiji 1 year ago

    Can avoid the whole pointer issue and verify input by just making a custom function:
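    #include <stdint.h>
    #include <stdio.h>
    /* Reads decimal digits from file; the first non-digit read is stored in *stopped. */
    uintmax_t parseju( FILE *file, int *stopped )
    {
        uintmax_t value = 0;
        unsigned int c = 0;
        int was = 0;
        while (1)
        {
            was = fgetc(file); /* fgetc needs the stream argument */
            c = (unsigned int)was - '0'; /* wraps to a large value for EOF and anything below '0' */
            if ( c > 9 )
                break;
            value *= 10;
            value += c;
        }
        if ( stopped ) *stopped = was; /* int rather than char, so EOF can be reported */
        return value;
    }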
    Don't remember how to "put back" the read character but you get the gist

    • @Hauketal
      @Hauketal 1 year ago +1

      This will result in bad values for anything entered before '0', like '+'.

    • @zxuiji
      @zxuiji 1 year ago

      @Hauketal That's fine, you're supposed to check for that yourself if you want to support it, the ONLY purpose of this function is to read digits, that can then be used by wrapper functions that need it, like floating point number readers for instance

    • @your-mom-irl
      @your-mom-irl 1 year ago +2

      ungetc
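
Editor's sketch of how ungetc fits a digit reader like the one in this thread: the first non-digit is pushed back so the next read sees it again. The function name is hypothetical and this is a simplified variant, not the commenter's code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reads a run of decimal digits from file and
   leaves the first non-digit in the stream. Returns 0 if there are
   no digits. Overflow wraps around, as unsigned arithmetic does. */
uintmax_t read_digits(FILE *file)
{
    uintmax_t value = 0;
    int c;

    while ((c = fgetc(file)) >= '0' && c <= '9')
        value = value * 10 + (uintmax_t)(c - '0');

    if (c != EOF)
        ungetc(c, file);    /* one character of pushback is guaranteed */
    return value;
}
```

A caller can then inspect the next character with fgetc to decide what follows, for example a decimal point or a unit suffix.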

  • @jittertn
    @jittertn 1 year ago

    There is scanf_s that protects against overflows since C11

  • @aaaowski7048
    @aaaowski7048 1 year ago

    >write my own scanf
    thats the first thing that came to my mind
    why not just use "read (1, buff, buffsize)"?

  • @user-sl6gn1ss8p
    @user-sl6gn1ss8p 1 year ago

    11:38 oh no, don't let stack overflow see that D :

  • @randomscribblings
    @randomscribblings 1 year ago

    scanf() has the ability to take length as a parameter.

  • @1873Winchester
    @1873Winchester 1 year ago +1

    Newbie to C here but I think I would try and use the isdigit function, it only checks a char at a time, but you can write a new function using isdigit, or just copy the one on stackoverflow. So if (0 == (isdigits(result))) or something like that.

  • @LizzieMignon-h1c
    @LizzieMignon-h1c 2 months ago

    Casper Street

  • @rosen8757
    @rosen8757 10 months ago

    scanf("%d") is not a valid way to read signed integers unless you know that every input fits into an int, scanf doesn't check for overflow and signed overflow is undefined behaviour. You have to use the strtol family of functions instead.
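
Editor's sketch of the strtol route for a value that must fit in an int. strtol reports out-of-range input through errno, which scanf("%d") has no way to do. The function name and the return convention are assumptions.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts the leading decimal number in text.
   Returns 1 and stores the value on success; returns 0 if there are
   no digits or the value does not fit in an int. Characters after
   the number are ignored here. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;                      /* strtol only sets errno on failure */
    value = strtol(text, &end, 10);

    if (end == text)
        return 0;                   /* nothing was converted */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                   /* out of range for long or for int */
    *out = (int)value;
    return 1;
}
```

The separate INT_MIN/INT_MAX check matters on platforms where long is wider than int, since strtol alone would accept values that an int cannot hold.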

  • @PavitraGolchha
    @PavitraGolchha 1 year ago

    Do you Rust?

    • @sumofat4994
      @sumofat4994 1 year ago +3

      Rust is no

    • @PavitraGolchha
      @PavitraGolchha 1 year ago

      @sumofat4994 Rust is trust

    • @robertstrickland9722
      @robertstrickland9722 1 year ago

      @sumofat4994 Rust is at least "meh". Meh enough for Linux to consider adding support for it.

    • @patryk_49
      @patryk_49 1 year ago +1

      @0xDEAD-C0DE Rust is slow and bloated.

    • @sumofat4994
      @sumofat4994 1 year ago +1

      @0xDEAD-C0DE You are delusional.

  • @Stopinvadingmyhardware
    @Stopinvadingmyhardware 1 year ago

    Tokenizing free knowledge.
    No

  • @CCoder--
    @CCoder-- 1 year ago +1

    😂 this is just for fun..
    #include <stdio.h>
    int main (void)
    {
        char a[10]; // suppose we have an array of 10 char
        int n=9; // hence we can fill a max of 9 characters + '\0';

        // generating the scanf format using sprintf();
        char tmp[10];
        sprintf (tmp,"%%%ds",n); // here we are generating the string "%9s" which will be the format for scanf
        printf ("enter a string:");
        scanf (tmp,&a); // as the replies point out, this should be plain a; &a has type char (*)[10]
        printf ("string=\"%s\"",a);
        return 0;
    }

    • @CCoder--
      @CCoder-- 1 year ago

      @rustycherkas8229 Sorry, my mistake in the line scanf(tmp, &a);
      but the program will still work. I got lucky here 🤓
      The address of an entire array has the same value as the address of the first element of the array.

    • @rustycherkas8229
      @rustycherkas8229 1 year ago

      @CCoder-- I've deleted my comment. You're right about the array, and about "getting lucky"... Suggest you "tweak" the code to avoid pedantic comments from pedantic people like me... 😁

    • @CCoder--
      @CCoder-- 1 year ago

      @rustycherkas8229 No, it's all right 😀, pointers are a bit confusing.
      You can keep your previous comment to help others understand.

    • @rustycherkas8229
      @rustycherkas8229 1 year ago

      @CCoder-- Kinda sorry I brought it up... @06:35 Jacob makes a point about dropping the "address of" from "&name"... imho, better that it is NOT present for those times when a block of code is factored out into a separate function... Easy to overlook an array becoming a mere pointer (when received as a parameter). AND, it confuses the sh*t out of newbies, too! 🤣 That's worth a LOT.... 🤣🤣🤣

  • @FEFFeX
    @FEFFeX 1 year ago +1

    Your videos are awesome
    Huge thanks

  • @papasmurf9146
    @papasmurf9146 1 year ago

    This isn't a pure macro version of what you're trying to accomplish -- and it will only have a %4s if you send a char* instead of a char str[20]; But then again, I only spent a few minutes on it.
    #include <stdio.h>
    /*------------------------------------------------------------------------------
    // In appending the line number, if we don't have an indirect CONCAT then
    // __LINE__ gets appended to the parameter X and not the actual line
    // number. In order to give the pre-processor a chance to convert it,
    // we use CONCAT_INDIRECT.
    //----------------------------------------------------------------------------*/
    #define CONCAT(a,b) a ## b
    #define CONCAT_INDIRECT(a,b) CONCAT(a,b)
    #define APPEND_LINE(X) CONCAT_INDIRECT(X,__LINE__)
    char* string_input_size(char* format, size_t size)
    {
        sprintf(format, "%%%zus", size); /* %zu is the conversion for size_t */
        return format;
    }
    /* The format buffer holds '%', up to 20 digits, 's' and the '\0'. */
    #define INTERNAL_SCANF(str, FUNC) \
        char APPEND_LINE(format)[24]; \
        string_input_size(APPEND_LINE(format), sizeof(str)-1); \
        FUNC(APPEND_LINE(format), str)
    #define SCANF(str) INTERNAL_SCANF(str, scanf)
    /* sscanf and fscanf take the source as their first argument, so
       these two would need an extra parameter before they are usable. */
    #define SSCANF(str) INTERNAL_SCANF(str, sscanf)
    #define FSCANF(str) INTERNAL_SCANF(str, fscanf)
    int main(int argc, char* argv[])
    {
        char name[20];
        printf("Name: ");
        SCANF(name);
        printf("You entered '%s'\n", name);
    }

  • @kiyotaka31337
    @kiyotaka31337 1 year ago

    using FORTIFY_SOURCE with gcc prevents some simple buffer overflows, This would be a cool trick to show

    • @HansBezemer
      @HansBezemer 1 year ago

      Careful and defensive programming prevents almost all buffer overflows. Also very effective against memory leaks.