Another solution I like and use quite often when I don't know the dimensions of the matrix in advance is to actually just store matrices in one dimension, like int *matrix = new int[rows * cols], then index it with matrix[row * cols + col]. This way the memory is assured to be contiguous and it's easy and efficient to just memcpy the whole array if you need to duplicate it.
@@doradiv10 no you index it by, for instance for the cell where x = row = 1 and y = col = 2: cell = matrix[row*cols+col] = matrix[1*3 + 2] = matrix[5] the index will never be out of bounds as your matrix has size (rows*cols) and the max index is: (rows-1)*cols+(cols-1) = (rows * cols) - 1, which corresponds to the last cell of the (rows*cols) 1D array.
@@doradiv10 If I may give you a piece of advice, though you probably know it already: matrix libraries is a solved problem in most environments :) Now I still think any developer should develop a Matrices / Vectors library at least once in their lives as an exercise. It's not very complicated but reasonably enough to be interesting. I very strongly suggest you to code your Matrix library (should you choose to go down that path) using Test Driven Development because if you choose to roll your own lib it will have to be rock solid and TDD is a good and appropriate method to maximize the chances of that happening.
Please correct me if am wrong, is there a bug in the for loop from the main scope, specifically accessing memory which is not yours? MATRIX_ROWS same for both loops
In "main" and "print_matrix" functions of C program (before refactoring) you have serious bug: you're writing and reading (respectivly) 4 bytes (1 position) behind 2 dimensional array. Second for loop should have condition: col < MATRIX_COLS instead of col < MATRIX_ROWS.
I was taught arrays in slightly different way, and that used to confuse me. An array (the subscript syntax) is a syntactic sugar over pointers. So when one is doing int array[5]; and printing it with printf("%d", array[3]); for example, then they are effectively doing *(array + 3) and printing that value. The name given to an array is always a pointer to the first location, that is, the zeroth cell. So array[x] is the same as *(array + x) and it is pointer math. Having said that, at 16:57, instead of dynamically allocating the memory, you can still create the matrix as int matrix[MATRIX_ROWS][MATRIX_COLS]; and let the compiler actually reserve the memory for you by allocating the values in stack, and when you need to pass to the function, pass a pointer instead. Then you can pass the matrix address to the function where the signature becomes this: void printMatrix(int *matrix, int rows, int cols); And when we need to do matrix[y][x], we access it as *(matrix + x + y * cols). However, one thing that bothered me is I wasn't able to access it as matrix[y][x], even if I pass matrix to the function as a doubly pointer. When tried using IDEOne, it crashed with a Segmentation Fault at runtime. Any idea what might be going on? My program can be found here: ideone.com/GdWlFk (working version). The version that fails: ideone.com/zIg4A7
I prefer doing 1D way - accessing [x + y * columns]. With that allocation is needed only once. It can also work in more dimensions [x + y * columns + z * columns * rows].
One thing to keep in mind is that in C/C++, "const" doesn't actually mean compile time constant and really just means immutable. (C++ does have constexpr for that, but C, at least at the moment, does not) If you use a "const int" as an array specifier for a variable, it will be a VLA, regardless of if the value can be deduced at compile time. This is different from C++ (which doesn't have VLAs). Currently in C the only way to have "MATRIX_ROWS"/"MATRIX_COLS" be a compile time constant is to either use macros or enums. Also, for C23, the committee passed a proposal that makes it such that only stack allocated VLAs with automatic storage duration will be optional with all other types of VMTs/VLAs being mandatory. This means VLAs as function parameters and pointers to VLAs will be mandatory. (this also means "sizeof(int[rows][colls])" will be mandatory)
Good lesson. The next video could be about using one allocation for all ints at once and then just pointing to them. It not only saves many allocations, but it makes sure that the allocated memory is sequential, which might make a performance difference.
Many commenters have suggested flattening the matrix into a 1D array, and indexing it with matrix[row*cols + col], as a flattened matrix is nicer to the cache, and easier to allocate and deallocate. I would like to point out that you can have a flattened matrix and yet keep an array of pointers to rows that lets you have the more convenient syntax matrix[row][col]. This is done as follows: int **matrix = calloc(MATRIX_ROWS, sizeof *matrix); *matrix = calloc(MATRIX_ROWS * MATRIX_COLS, sizeof **matrix); for (int row = 1; row < MATRIX_ROWS; row++) matrix[row] = *matrix + row * MATRIX_COLS;
@18:30 (switching to alloc'd buffers) The use of "**" probably scares beginners. The alternate equivalent is, of course, int *matrix[] = calloc.... Even though they are the same, the square bracket version provides a clue (especially in the parameter list of the print function) that the var points to a contiguous array... void main( int argc, char *argv[] ) { ......
I kind of already knew all this, but only kind of, so this was a great refresher. Like others here I would also like to see how you would do a version where the entire matrix is allocated all at once as an array of int's, and then you manually access them given a row and col. I did this recently in one of my projects.
Hey Jacob, excellent video as always, thank you! Definitely didn't know about those peculiar ways of passing 2D arrays in C functions. You will often encounter another kind of way around this issue, and that is casting everything to a 1D array (as memory is just one string of bytes), and then access it by something like; { array[ row * MATRIX_COLS + col ] = ( row
Did you know... Assume following int a[10]; a[5] = 3; This is the same as *(a+5)=3. This is also the same as *(5+a) =3. So one can write 5[a] = 3 and it should compile and work. Similar pointer math is behind matrices as well. You can show people how compiler internally calculate the address of memory location. Also rows pointers version of matrix is more flexible in a way that each row can have different number of elements, which is not true matrix... It's more like array of arrays.
That's probably the way I would do it. You lose a lot of performance with an array of pointers. First, it's a double indirection. Also, it's not very cache friendly. Modern CPUs like things to be stored one right after the other for optimal performance.
@@ayecaptainjack6930 in that case allocation of Matrix is just malloc(sizeof(MX_TYPE) * MAX_X * MAX_Y). Getting n,m element is matrix[n + m * MAX_X] and you still can make it even easier with inlined functions or macros.
I'll just say, the C++ version is very old school. Prefer std::array over C style array, or use std::vector. this is why some of us love C++, otherwise one may as well just use C if you are going to new and delete everything.
You know, std::array is essentially just a struct on a template. If you don't need the added features of a true C++ Container, there's no feasible reason to demonize primitive arrays. I'm a big fan of C++, but I hardly ever follow the "the C++ way even if just for the hell of it" advices 😁. Regarding vectors, I've had cases (specifically, writing an emulator in C++) in which going with a C-styled array gave me performance, std::vector was introducing too much overhead, even when previously reserved more memory. Standard C is (mostly) standard and valid C++ and when picking a style between the two, I always go with the simplest and most straightforward (I usually printf over std::cout, too).
*Double pointers are matrices\2D arrays.* This makes understanding and *working with* double pointers (ex: **argv) much easier, especially because how C threats pointers in a special way.
Nah that’s not true. Declaring a pointer vs declaring an array are two different things; char *a = “foo” vs char b[] = “bar” behave unlike each other (a cannot be indexed as it points to space in the text area of its process memory, but b can). The confusion arises between arrays and pointers when passing arguments; an array is passed as an arg via a pointer to its first value.
defining the size of an argument with other arguments, that's a new trick to me, still, I think I'll stick to **mat rather than mat[rows][cols], it's cleaner, more consistent (binary wise) and still simple to understand
Nice video. If I ever decided to use C/CPP for working with matrices, it would be for performance reasons (and, well, I tend to prefer performance in just about any case). This leads me to one, for me interesting, topic - using vectored instructions for number crunching. Would it be possible to touch this in some upcoming video? How to use them, how to determine whether they are available on that specific machine etc...
Alternatively, this works in all compiles for both C and C++ with no need for VLA and no need for allocating arrays on the heap: void print_matrix(int rows, int cols, int matrix[]) { for (int row = 0; row < rows; row++) { printf("%d:\t", row); for (int col = 0; col < cols; col++) printf("%x ", matrix[row*cols + col]); printf(" "); } } int main() { int matrix[MATRIX_ROWS][MATRIX_COLS]; for (int row = 0; row < 3; row++) for (int col = 0; col < 4; col++) matrix[row][col] = (row
I prefer 1D arrays with pointers math. It just a more general way to think about data and i don't care how exactly data is structured - just allocate/free/dosmth this much memory and throw it inside function, highlevel code don't care and everything in code looks the same.
Great video as always! I have a question about using 2D arrays like these on embedded system. The bug in the second for loop could cause some serious problems, since the memory is usually limited. How would you debug it? Is it worth to wrap the 2D arrays in a struct or class and add boundary checks on embedded system, or it adds too much overhead? What's your take on this? I know, it depends on a lot of things, but a general idea or some aspects worth to consider would be very helpful.
Your videos are really nice. Could you focus more on the linking process? And debugging existing real applications? And cross-compiling? Where does that "unable to find symbol" error come from? What's __dso_handle anyway? Why is the linker complaining about a library if I'm giving it the correct path in -L and --verbose is showing me it's actually finding it but ignoring it? While your videos are great for begginers, it seems as though you're passively contributing to the college falicy that newly-grads will start working in developing new software, when 99% of the time they'll be supporting existing (often legacy) apps. There are *a lot* of linker errors and debugging issues that make sense to clarify. Just a suggestion.
why does nobody teach the right way to do 2D arrays in C? this is a naive way that have a lot of cons to it, the better way to do an array of size N * M is to do "int (*arr)[M] = malloc(sizeof(int) * N * M);" here it is the exact same things as if you had declared a static array like "int arr[N][M];", but it is dynamically allocated, you don't need loops to allocate it, and to free it, and you can copy it with just "memcpy(destArr, srcArr, sizeof(int) * N * M);" and the speed benefits of having it contigus in memory.
In my opinion this nested array allocation is bad for teaching, because it makes beginners think, that as much as possible should be allocated on the heap. I think that it is important for beginners to think about the layout of nested C arrays. The important thing is that is missing here is that such nested C arrays are just simple arrays, that are indexed in a certain way. The index is computed by multiplying the number of colums with the row and adding the column to it. Implementing this on a simple 1D array shouldn't be hard. When trying to make it compatible with C++/any C compiler, this is what is usually used in common apis. It is also the most efficient, because no additional memory is used for pointers and the allocation control structures in the allocator used for allocating all the rows. And also it is efficient, because the access does not have an additional indirection, that comes with the additional pointer dereference, which would mean that it is more cache friendly.
You're explanations are very chaotic, you don't finish ideas, you keep going back and forth between different ideas and it's just way too much information thrown at once.
yall keep losing me by going off tangents and showing all the other things we will show later, or something about compiler parts or some other thing or maybe this thing... damn.
Another solution I like and use quite often when I don't know the dimensions of the matrix in advance is to actually just store matrices in one dimension, like int *matrix = new int[rows * cols], then index it with matrix[row * cols + col]. This way the memory is assured to be contiguous and it's easy and efficient to just memcpy the whole array if you need to duplicate it.
This is how non sparse matrices are actually handled in general in libraries like eigen c++
That’s actually really clever
@@doradiv10 no you index it by, for instance for the cell where x = row = 1 and y = col = 2:
cell
= matrix[row*cols+col]
= matrix[1*3 + 2]
= matrix[5]
the index will never be out of bounds as your matrix has size (rows*cols) and the max index is:
(rows-1)*cols+(cols-1)
= (rows * cols) - 1,
which corresponds to the last cell of the (rows*cols) 1D array.
@@doradiv10 If I may give you a piece of advice, though you probably know it already: matrix libraries is a solved problem in most environments :)
Now I still think any developer should develop a Matrices / Vectors library at least once in their lives as an exercise. It's not very complicated but reasonably enough to be interesting.
I very strongly suggest you to code your Matrix library (should you choose to go down that path) using Test Driven Development because if you choose to roll your own lib it will have to be rock solid and TDD is a good and appropriate method to maximize the chances of that happening.
Please correct me if am wrong, is there a bug in the for loop from the main scope, specifically accessing memory which is not yours? MATRIX_ROWS same for both loops
In "main" and "print_matrix" functions of C program (before refactoring) you have serious bug: you're writing and reading (respectivly) 4 bytes (1 position) behind 2 dimensional array. Second for loop should have condition: col < MATRIX_COLS instead of col < MATRIX_ROWS.
That thing was bugging me entire video ngl
I was waiting for it to crash, but oh well
not seriously a bug. kinda of a typo.
@@hatkidchan_ It may crash or not, but it will change value of next variable in memory you don't wan't for sure.
@@yangliu3754 Reding and writing memory outside of range isn't typo, but serious bug, you can try to investigate for hours, days or weeks.
@@wiktorwektor123 as long as you are happy. you are just missing the point. this is not a video about how to write a typo or even bug free program.
I was taught arrays in slightly different way, and that used to confuse me. An array (the subscript syntax) is a syntactic sugar over pointers. So when one is doing int array[5]; and printing it with printf("%d", array[3]); for example, then they are effectively doing *(array + 3) and printing that value. The name given to an array is always a pointer to the first location, that is, the zeroth cell. So array[x] is the same as *(array + x) and it is pointer math.
Having said that, at 16:57, instead of dynamically allocating the memory, you can still create the matrix as int matrix[MATRIX_ROWS][MATRIX_COLS]; and let the compiler actually reserve the memory for you by allocating the values in stack, and when you need to pass to the function, pass a pointer instead. Then you can pass the matrix address to the function where the signature becomes this:
void printMatrix(int *matrix, int rows, int cols);
And when we need to do matrix[y][x], we access it as *(matrix + x + y * cols). However, one thing that bothered me is I wasn't able to access it as matrix[y][x], even if I pass matrix to the function as a doubly pointer. When tried using IDEOne, it crashed with a Segmentation Fault at runtime. Any idea what might be going on?
My program can be found here: ideone.com/GdWlFk (working version).
The version that fails: ideone.com/zIg4A7
I prefer doing 1D way - accessing [x + y * columns]. With that allocation is needed only once. It can also work in more dimensions [x + y * columns + z * columns * rows].
Yeh this flattened multidimensional array approach is nice.Although a bit hard to read,but really nice
Yeah in C I tend to do this as well. I think that's how most C++ matrix libraries deal with this as well.
@@mario7501 Now, how did I deduce by myself that that was the technique I needed?
One thing to keep in mind is that in C/C++, "const" doesn't actually mean compile time constant and really just means immutable. (C++ does have constexpr for that, but C, at least at the moment, does not) If you use a "const int" as an array specifier for a variable, it will be a VLA, regardless of if the value can be deduced at compile time. This is different from C++ (which doesn't have VLAs).
Currently in C the only way to have "MATRIX_ROWS"/"MATRIX_COLS" be a compile time constant is to either use macros or enums.
Also, for C23, the committee passed a proposal that makes it such that only stack allocated VLAs with automatic storage duration will be optional with all other types of VMTs/VLAs being mandatory. This means VLAs as function parameters and pointers to VLAs will be mandatory. (this also means "sizeof(int[rows][colls])" will be mandatory)
3:53 Who else was waiting for him to notice that MATRIX_ROWS in the inner loop should be MATRIX_COLS?
I still am. 😀
Sorry about that. It's things like this that make me wish UA-cam let me update videos. I understand why they don't, but ...😫
@@JacobSorber LOL, maybe to remake it? Funniest thing was when you COPIED loops... I was sure, at that moment, you will see typo...
Good lesson. The next video could be about using one allocation for all ints at once and then just pointing to them. It not only saves many allocations, but it makes sure that the allocated memory is sequential, which might make a performance difference.
Many commenters have suggested flattening the matrix into a 1D array, and indexing it with matrix[row*cols + col], as a flattened matrix is nicer to the cache, and easier to allocate and deallocate. I would like to point out that you can have a flattened matrix and yet keep an array of pointers to rows that lets you have the more convenient syntax matrix[row][col]. This is done as follows:
int **matrix = calloc(MATRIX_ROWS, sizeof *matrix);
*matrix = calloc(MATRIX_ROWS * MATRIX_COLS, sizeof **matrix);
for (int row = 1; row < MATRIX_ROWS; row++)
matrix[row] = *matrix + row * MATRIX_COLS;
And you can also iterate through the entire matrix with a single loop, which is convenient.
@18:30 (switching to alloc'd buffers) The use of "**" probably scares beginners.
The alternate equivalent is, of course, int *matrix[] = calloc....
Even though they are the same, the square bracket version provides a clue (especially in the parameter list of the print function) that the var points to a contiguous array...
void main( int argc, char *argv[] ) { ......
Huh, this video sure was timely. Thanks Jacob.
I was literally just looking at this topic yesterday!
Great as always. Happy that I can follow this time.
I kind of already knew all this, but only kind of, so this was a great refresher. Like others here I would also like to see how you would do a version where the entire matrix is allocated all at once as an array of int's, and then you manually access them given a row and col. I did this recently in one of my projects.
Hey Jacob, excellent video as always, thank you! Definitely didn't know about those peculiar ways of passing 2D arrays in C functions. You will often encounter another kind of way around this issue, and that is casting everything to a 1D array (as memory is just one string of bytes), and then access it by something like; { array[ row * MATRIX_COLS + col ] = ( row
Thank you for this video, very informative.
Best teacher i love your channel
Thank you.
Jacob: "I don't wanna confuse anyone"
.
.
.
Also Jacob: uses
😂. I try.
Flat indexing for the win.
I am confused about why you didn't get an access violation/out of bounds error with both nested for loops due to using MATRIX_ROWS with col
A question: why did you use "col < MATRIX_ROWS" instead of "col < MATRIX_COLS" in both of your for loops?
It's a bug, second loop should be col
Yep, that's called a typo (and a bug). 😫 Thanks for catching it.
@@JacobSorber Oh, OK. Good video anyways. :)
@@JacobSorber In this case it's bug, not a typo ;)
@@wiktorwektor123 It's both.
How would the double pointer method work for static or fixed length array (preserving the array indexing)?
Did you know... Assume following
int a[10];
a[5] = 3;
This is the same as *(a+5)=3.
This is also the same as *(5+a) =3.
So one can write 5[a] = 3 and it should compile and work.
Similar pointer math is behind matrices as well. You can show people how compiler internally calculate the address of memory location.
Also rows pointers version of matrix is more flexible in a way that each row can have different number of elements, which is not true matrix... It's more like array of arrays.
I think you missed the method of allocating the whole matrix as a single continuous array and then using pointer math to get to the element.
That’s making things a lot more complicate tho
That's probably the way I would do it. You lose a lot of performance with an array of pointers. First, it's a double indirection. Also, it's not very cache friendly. Modern CPUs like things to be stored one right after the other for optimal performance.
That's how I did it in my matrix class, which also handles resizing matrices, adding and multiplying them, Gaussian elimination, and determinants.
@@ayecaptainjack6930 I had say it's easier and saves you trouble later.
@@ayecaptainjack6930 in that case allocation of Matrix is just
malloc(sizeof(MX_TYPE) * MAX_X * MAX_Y).
Getting n,m element is
matrix[n + m * MAX_X]
and you still can make it even easier with inlined functions or macros.
I'll just say, the C++ version is very old school. Prefer std::array over C style array, or use std::vector. this is why some of us love C++, otherwise one may as well just use C if you are going to new and delete everything.
You know, std::array is essentially just a struct on a template. If you don't need the added features of a true C++ Container, there's no feasible reason to demonize primitive arrays. I'm a big fan of C++, but I hardly ever follow the "the C++ way even if just for the hell of it" advices 😁. Regarding vectors, I've had cases (specifically, writing an emulator in C++) in which going with a C-styled array gave me performance, std::vector was introducing too much overhead, even when previously reserved more memory. Standard C is (mostly) standard and valid C++ and when picking a style between the two, I always go with the simplest and most straightforward (I usually printf over std::cout, too).
❤️ from India
Dr. Sorber, what's the difference between a vector and a two-dimensional array? I'd like to see a video on making a class out of this.
*Double pointers are matrices\2D arrays.* This makes understanding and *working with* double pointers (ex: **argv) much easier, especially because how C threats pointers in a special way.
Nah that’s not true. Declaring a pointer vs declaring an array are two different things; char *a = “foo” vs char b[] = “bar” behave unlike each other (a cannot be indexed as it points to space in the text area of its process memory, but b can). The confusion arises between arrays and pointers when passing arguments; an array is passed as an arg via a pointer to its first value.
@@italianbeans877 Both a and b behave the same, you can index a no problem and it will do what you expect. They mean slightly different things though.
defining the size of an argument with other arguments, that's a new trick to me, still, I think I'll stick to **mat rather than mat[rows][cols], it's cleaner, more consistent (binary wise) and still simple to understand
Nice video. If I ever decided to use C/CPP for working with matrices, it would be for performance reasons (and, well, I tend to prefer performance in just about any case).
This leads me to one, for me interesting, topic - using vectored instructions for number crunching.
Would it be possible to touch this in some upcoming video?
How to use them, how to determine whether they are available on that specific machine etc...
#include
Alternatively, this works in all compiles for both C and C++ with no need for VLA and no need for allocating arrays on the heap:
void print_matrix(int rows, int cols, int matrix[]) {
for (int row = 0; row < rows; row++) {
printf("%d:\t", row);
for (int col = 0; col < cols; col++)
printf("%x ", matrix[row*cols + col]);
printf("
");
}
}
int main() {
int matrix[MATRIX_ROWS][MATRIX_COLS];
for (int row = 0; row < 3; row++)
for (int col = 0; col < 4; col++)
matrix[row][col] = (row
hi great video is there a way to store output of a counter to a 2d array?
I prefer 1D arrays with pointers math. It just a more general way to think about data and i don't care how exactly data is structured - just allocate/free/dosmth this much memory and throw it inside function, highlevel code don't care and everything in code looks the same.
For my sudoku program, I used triple pointers to make my columns and cells constant time access lol
Great video as always! I have a question about using 2D arrays like these on embedded system. The bug in the second for loop could cause some serious problems, since the memory is usually limited. How would you debug it? Is it worth to wrap the 2D arrays in a struct or class and add boundary checks on embedded system, or it adds too much overhead? What's your take on this? I know, it depends on a lot of things, but a general idea or some aspects worth to consider would be very helpful.
do you can explained maxtrix[row][col]=(row
Your videos are really nice. Could you focus more on the linking process? And debugging existing real applications? And cross-compiling?
Where does that "unable to find symbol" error come from? What's __dso_handle anyway? Why is the linker complaining about a library if I'm giving it the correct path in -L and --verbose is showing me it's actually finding it but ignoring it? While your videos are great for begginers, it seems as though you're passively contributing to the college falicy that newly-grads will start working in developing new software, when 99% of the time they'll be supporting existing (often legacy) apps. There are *a lot* of linker errors and debugging issues that make sense to clarify. Just a suggestion.
why does nobody teach the right way to do 2D arrays in C?
this is a naive way that have a lot of cons to it, the better way to do an array of size N * M is to do "int (*arr)[M] = malloc(sizeof(int) * N * M);"
here it is the exact same things as if you had declared a static array like "int arr[N][M];", but it is dynamically allocated,
you don't need loops to allocate it, and to free it, and you can copy it with just "memcpy(destArr, srcArr, sizeof(int) * N * M);"
and the speed benefits of having it contigus in memory.
That's still a VMT.
In C11 if you want to define __STDC_NO_VLA__, you also can't support VMTs.
Thankfully that was recently changed for C23.
In my opinion this nested array allocation is bad for teaching, because it makes beginners think, that as much as possible should be allocated on the heap. I think that it is important for beginners to think about the layout of nested C arrays. The important thing is that is missing here is that such nested C arrays are just simple arrays, that are indexed in a certain way. The index is computed by multiplying the number of colums with the row and adding the column to it. Implementing this on a simple 1D array shouldn't be hard. When trying to make it compatible with C++/any C compiler, this is what is usually used in common apis. It is also the most efficient, because no additional memory is used for pointers and the allocation control structures in the allocator used for allocating all the rows. And also it is efficient, because the access does not have an additional indirection, that comes with the additional pointer dereference, which would mean that it is more cache friendly.
It really irks me how C++ can't handle 2D arrays.
for(int... this declaration is already too advance for mee
You're explanations are very chaotic, you don't finish ideas, you keep going back and forth between different ideas and it's just way too much information thrown at once.
yall keep losing me by going off tangents and showing all the other things we will show later, or something about compiler parts or some other thing or maybe this thing... damn.
I do not trust this ''Jacob''.