Another problem is IDs with (unknown) gaps in between. Here, finding the cursor value for a given page becomes even more difficult or even impossible (this is precisely why, with offset paging, the database cannot do an index seek but has to count rows by scanning).
A small addition from me, however: this technique is wonderful for updating or deleting in large tables. You can update/delete rows in blocks by working with TOP and WHERE ID < x in a loop.
Nice addition to the discussion, thanks
@@MilanJovanovicTech You're welcome! I discovered you on LinkedIn and saw that the videos are even better 😉
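A minimal sketch of the batched-delete idea from the comment above, using Python's built-in `sqlite3` rather than SQL Server. SQLite has no `TOP`, so a `LIMIT` subquery stands in for it; the `sales` table, the `delete_in_batches` helper and the batch size are all assumptions made for illustration.

```python
import sqlite3


def delete_in_batches(conn, cutoff_id, batch_size):
    """Delete rows with id < cutoff_id in small batches; return the total deleted."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM sales WHERE id IN ("
            "  SELECT id FROM sales WHERE id < ? ORDER BY id LIMIT ?)",
            (cutoff_id, batch_size),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            break  # nothing left below the cutoff
        total += cur.rowcount
    return total


# Hypothetical demo data: 1000 rows with ids 1..1000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (id, amount) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 1001)],
)
conn.commit()

deleted = delete_in_batches(conn, cutoff_id=501, batch_size=100)
remaining = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
smallest_id = conn.execute("SELECT MIN(id) FROM sales").fetchone()[0]
```

Each batch is its own small transaction, so locks are held briefly and the loop can be stopped and resumed at any point.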
Hi there, great videos btw. I am just wondering, how do you know that the cursor = 200400 corresponds to the page that you are trying to access? thanks
Just a setup after checking the DB - helps with benchmarks
@@MilanJovanovicTech but I mean, how can I apply this to a table where I don’t know which number corresponds to which page? For example, I have a table that can have logically deleted rows, so the number itself can’t be used to calculate a page.
@@emma-vi the app or the search screen logic will need to cache the value to use as the cursor.
@@osman3404 but if you add an order or a filter that doesn't work anymore right? because the cursor uses a number as an anchor to calculate the next rows
He just doesn't know it and can't give you the answer. In fact, identifying the cursor for a specific page is very tricky, and sometimes impossible if your ID is a Guid or otherwise non-sequential. Normally, the only situation I can see where we could leverage cursor pagination is infinite scroll, where we always know the exact next cursor position. For normal paging that allows the user to jump to an arbitrary page, a cursor won't work! And the way he presented his example is confusing enough for your question to arise :D
Thank you very much. It's nice to know that I've done everything right, even without having seen this video beforehand. I am currently working with realtime data, and saving the last ID helps to keep the next query fast. However, I find your example with the fixed value 200400 somewhat misleading. It's completely ok for benchmarks, but in real life I don't even know which ID I have as a starting point at the beginning. The video would have benefited from providing a practical approach to this. I personally know what to do, but @emma-vi's question shows the need for clarification.
Of course, of course... I needed something stable for the benchmark. In a real world example, we'd fetch the first page, and then use the ID of the last record as the cursor.
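A rough sketch of that flow (fetch the first page, then reuse the ID of the last record as the cursor), written with Python's built-in `sqlite3` instead of EF Core. The `sales` table, the `fetch_page` helper and the starting cursor of 0 (which assumes positive IDs) are illustrative assumptions, not the code from the video. The demo data deliberately has gaps in the IDs to show that the cursor never has to be computed from a page number.

```python
import sqlite3


def fetch_page(conn, cursor_id, page_size):
    """Return (rows, next_cursor). cursor_id=None means 'start from the beginning'."""
    start = 0 if cursor_id is None else cursor_id  # assumes ids are positive
    rows = conn.execute(
        "SELECT id, amount FROM sales WHERE id > ? ORDER BY id LIMIT ?",
        (start, page_size),
    ).fetchall()
    # The cursor for the next page is the id of the last row just read.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


# Hypothetical demo data: ids 1..30 with every third row deleted (gaps).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (id, amount) VALUES (?, ?)",
    [(i, float(i)) for i in range(1, 31)],
)
conn.execute("DELETE FROM sales WHERE id % 3 = 0")

page1, cursor1 = fetch_page(conn, None, 5)
page2, cursor2 = fetch_page(conn, cursor1, 5)
page1_ids = [row[0] for row in page1]
page2_ids = [row[0] for row in page2]
```

The client only has to hold on to `cursor1` between requests; the gaps left by the deleted rows do not affect the result.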
Nice! Thank you, perhaps a naive question but you picked an id to use as your cursor. How would you know what to use as your cursor in a real life scenario? Maybe I misunderstood something.
It should be a column that's sortable in creation order
In theory, you can add a map of IDs corresponding to pages. You will then need to update it periodically, and whenever you query a page, also query an offset (the number of records added since the last map update) and add it to the record ID from the map. This adds the ability to go to an arbitrary page (if you use sequential numeric IDs and sort the paged records by that ID). It may perform better than offset...limit if the paged table contains many more records.
That would be a mess to maintain for each user and with records being added and removed
@@MilanJovanovicTech it sure would be. Not so much if you don't need to maintain maps for different users (all users see the same list of records and therefore have the same pages) but still quite messy. And I completely forgot about the removal of records being a thing... It could be accounted for relatively easily as part of the periodical map updates, but between updates some pages would intersect. OR there could be a list of records removed since last map update which can be used to calculate proper offsets when needed... But anyway, it is indeed a very complicated and messy solution that wouldn't make sense for probably anyone
Is there a concept of keeping page IDs in a cache to use them as cursor boundaries to speed up the queries? I would never hard-code an ID like this, but I can imagine collecting these IDs, for instance every day, and caching them.
It's obviously just for demo purposes 😅 Typically you will save this value on the client side
I think your cursor example needs an orderBy
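The PK index is already sorted, so in this case it wouldn't change the result. An explicit OrderBy is still the only thing that guarantees the order, though, and it matters if you're using a non-indexed column or traversing the index in the opposite order.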
Great video. Thanks Milan
My pleasure!
I would like to know about Dapper with Clean Architecture. Can you make a video showing a proper implementation?
I touched on that in a recent CQRS video
Order by will break cursor pagination, for example order by name
Yep, cursor pagination sucks if you need random sorting
Just add an expression to the arguments to detect the order field and then use EF method when building the query.
Ah, and of course made the method generic :)
That won’t fix the underlying database call though and the reasoning behind this video. You will need good covering indexes
Wow ❤
Does it work for Guid primary key?
Not really, you need something you can sort on that also matches when the records were created. Otherwise, you might get a different result each time as more records are added/removed from the DB.
How practical is cursor pagination in a real-life scenario given:
1. You almost always want to allow the data to be sorted by user-defined criteria - order date, order value, etc.
2. Once real-life scenarios such as order deletion/cancellation/reversals start to occur the id becomes very brittle, especially when you add a where clause to the select e.g. you want to page all non-cancelled, non-reversed orders
It's not for general purpose pagination, but fits perfectly for use cases where you need an infinite-scroll solution. Examples could be social media timelines, e-commerce catalogs, e-mail, etc.
What if your ID is not auto incremented or if it's a guid?
Then you'd need another auto-incrementing column (or sortable column at least) - a good example is a CreatedOnUtc column
Great Video Milan. Is there a way for this to work with Strongly Typed Ids?
Yes, but too cumbersome for my liking. The juice ain't worth the squeeze.
Where do you find this kind of information?
Research
OData is the best solution for me , for about a decade
I never had a chance to use it
Great video
But what if our Id is a Guid? How do you do cursor pagination then?
You need something else that's sortable to pair it with a GUID. An auto-incrementing integer, CreatedOn column, etc.
I'm guessing cursor pagination would not work when you throw in filtering.
Nope, works fine with filtering. Doesn't work with random sort orders.
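To illustrate why filtering doesn't get in the way, here is a small sketch with Python's built-in `sqlite3`: the filter is simply another condition next to the cursor condition, and the cursor is still the ID of the last row read. The `orders` table, its `status` column and the `fetch_active_page` helper are assumptions made up for this example.

```python
import sqlite3


def fetch_active_page(conn, cursor_id, page_size):
    """Page through non-cancelled orders only; the cursor is the last id read."""
    rows = conn.execute(
        "SELECT id, status FROM orders "
        "WHERE status <> 'cancelled' AND id > ? "
        "ORDER BY id LIMIT ?",
        (cursor_id, page_size),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


# Hypothetical demo data: every fourth order is cancelled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, ?)",
    [(i, "cancelled" if i % 4 == 0 else "active") for i in range(1, 21)],
)

first, cursor = fetch_active_page(conn, 0, 4)  # 0 assumes positive ids
second, _ = fetch_active_page(conn, cursor, 4)
first_ids = [row[0] for row in first]
second_ids = [row[0] for row in second]
```

Cancelled rows are skipped on every page without any bookkeeping, because the next page always starts strictly after the last ID that was actually returned.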
What theme do you use in your Visual Studio? It's fire Man
It's ReSharper syntax highlighting
Woow
God bless you brother
Thank you
But how do you know the cursor initially?
For a number: 0, or max(int)/max(long)
Basically, a default cursor value that every real value is greater than (or smaller than, when paging backwards).
How will this approach work if the Id is a Guid?
It won't.
You need another column you can sort by time of creation, like a CreatedOnUtc column
I would not call this cursor pagination - it's just selecting by clustered index. Cursor pagination is a technique that actually uses the database feature CURSOR: you define a query (where you can use ORDER BY with expressions other than just Id) and then FETCH pages.
Nope, this is cursor/keyset pagination. A DB cursor is something else.
Cursor pagination is faster, but you can only order the data by the cursor column (here, the ID). So if the user is fine with that order, cursor pagination is the better choice, but if they want to sort by some other column, offset pagination is better.
Yes
Then which sorting should be used
What if the ID used as the cursor was a Guid and not a sequential ID?
Won't work in that case, you need something that's sortable + grows in "creation order".
If you have to use a Guid, you'd need another column to handle the creation order part. A good solution could be a CreatedOn column
On the second example, your CountAsync should be before your pagination. If you kept it like it is you don’t need 2 roundtrips lol.
How would that change the number of round trips?
@@MilanJovanovicTech rewatch your video at 4:30. It’s easy to overlook. You wrote CountAsync after ToListAsync and said it takes 2 trips. You accidentally called CountAsync on the already paged query instead of before paging.
If you are indeed doing what you coded, one round trip would be enough, but I’m guessing you didn’t mean to call count on the paginated results.
For real pagination you should call CountAsync on the entire filtered query, then call ToListAsync with the pagination applied.
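The ordering described in the comment above can be sketched like this with Python's built-in `sqlite3` (the thread is about EF Core's `CountAsync`/`ToListAsync`, so this is only an analogy): count the whole filtered set first, then fetch one page of it. The `sales` table, the `amount` filter and the `offset_page` helper are assumptions for illustration and do not reproduce the code from the video.

```python
import sqlite3


def offset_page(conn, min_amount, page, page_size):
    """Return (total_count, rows) for a 1-based page of the filtered set."""
    # Round trip 1: count the entire filtered set, before any paging.
    total = conn.execute(
        "SELECT COUNT(*) FROM sales WHERE amount >= ?",
        (min_amount,),
    ).fetchone()[0]
    # Round trip 2: fetch only the requested page of that same filtered set.
    rows = conn.execute(
        "SELECT id, amount FROM sales WHERE amount >= ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (min_amount, page_size, (page - 1) * page_size),
    ).fetchall()
    return total, rows


# Hypothetical demo data: 100 rows where amount equals the id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (id, amount) VALUES (?, ?)",
    [(i, i) for i in range(1, 101)],
)

total, rows = offset_page(conn, min_amount=51, page=2, page_size=10)
row_ids = [row[0] for row in rows]
```

Counting the rows of the page itself would only ever return at most `page_size`, which is why the count has to run against the unpaged, filtered query.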
What about if the Id is a Guid?
Does the cursor approach still work?
It "works" in theory - you can sort a Guid, right? But it's practically useless, because a Guid is random. You want something that is increasing with creation time, like a numeric PK does.
@@MilanJovanovicTech In this scenario, what approach do you suggest? When we only have Guids as keys?
@@Leobraic Sort by creation date, and then use Skip().Take()
Interesting that the final version will not work with the UI shown in the thumbnail 😉
If I were in a situation where a primitive, unsortable next-page implementation was needed, my data would not be in a SQL DB anyway.
It would work - so long as you go to the next/prev page only 😁
But why on earth would anybody want to have a pagination that filters by ID?
A pagination means: Where(any condition) & OrderBy(any column) & any page & any size
Think about your Gmail inbox. You see the latest 50 emails, and can navigate to the next page, etc. The emails are sorted in the order that they arrive to your inbox - i.e. creation order. Which is exactly what an integer PK gives you - creation order.
@@MilanJovanovicTech At which moment do you decide to create/update the cursors in order to fetch the latest 50 emails? On each email arrival, do you delete the cursor for said user, recalculate which Id the cursor should have to be the 51st element, then offset all the records from that point (by deleting and recreating them with Id+1) to make room for the cursor? Yes, you could potentially use a step of 2 or bigger so you always have room for creating cursors, but there is also the issue of this cursor being hardcoded in the code. Seems very unwieldy when it comes to updating the cursors.
@@matiasmiraballes9240 you don't need to update a cursor each time. All you need to do is order emails by id descending, and do the cursor pagination backwards: where Id < x
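A small sketch of that backwards variant with Python's built-in `sqlite3`: newest first, `id < cursor`, and no stored or precomputed cursors. The `emails` table and the `fetch_newest` helper are assumptions for illustration; instead of a max(int) sentinel, the first page simply omits the cursor condition. The demo inserts a new email between the two page requests to show that the second page does not shift.

```python
import sqlite3


def fetch_newest(conn, before_id, page_size):
    """Return (rows, next_cursor), newest first. before_id=None means the first page."""
    if before_id is None:
        rows = conn.execute(
            "SELECT id, subject FROM emails ORDER BY id DESC LIMIT ?",
            (page_size,),
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id, subject FROM emails WHERE id < ? ORDER BY id DESC LIMIT ?",
            (before_id, page_size),
        ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


# Hypothetical demo data: ten emails with ids 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (id INTEGER PRIMARY KEY, subject TEXT)")
conn.executemany(
    "INSERT INTO emails (id, subject) VALUES (?, ?)",
    [(i, f"mail {i}") for i in range(1, 11)],
)

page1, cursor = fetch_newest(conn, None, 3)
# A new email arrives while the user is still looking at page 1.
conn.execute("INSERT INTO emails (id, subject) VALUES (11, 'mail 11')")
page2, _ = fetch_newest(conn, cursor, 3)
page1_ids = [row[0] for row in page1]
page2_ids = [row[0] for row in page2]
```

With offset paging, the new row would have pushed email 8 onto page 2 a second time; here the cursor anchors the next page to what the user has already seen.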
ok, great, but what if it was a different order, for example, an order by price?
Then you'd need an index on that column, and a way to solve duplicates.
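One common way to "solve duplicates" is to make the cursor a pair: the sort column plus the unique ID as a tie-breaker. The sketch below uses Python's built-in `sqlite3`; the `products` table, the index name and the `fetch_by_price` helper are all assumptions for illustration. The same shape would apply to the CreatedOn plus Guid combination mentioned elsewhere in this thread.

```python
import sqlite3


def fetch_by_price(conn, cursor, page_size):
    """cursor is None (first page) or the (price, id) pair of the last row read."""
    if cursor is None:
        rows = conn.execute(
            "SELECT id, price FROM products ORDER BY price, id LIMIT ?",
            (page_size,),
        ).fetchall()
    else:
        last_price, last_id = cursor
        rows = conn.execute(
            "SELECT id, price FROM products "
            "WHERE price > ? OR (price = ? AND id > ?) "
            "ORDER BY price, id LIMIT ?",
            (last_price, last_price, last_id, page_size),
        ).fetchall()
    next_cursor = (rows[-1][1], rows[-1][0]) if rows else None
    return rows, next_cursor


# Hypothetical demo data with duplicate prices.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER)")
# Index matching the sort order, so each page can be served by a seek.
conn.execute("CREATE INDEX ix_products_price_id ON products (price, id)")
conn.executemany(
    "INSERT INTO products (id, price) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 10), (4, 20), (5, 10), (6, 30)],
)

p1, c1 = fetch_by_price(conn, None, 2)
p2, c2 = fetch_by_price(conn, c1, 2)
p3, c3 = fetch_by_price(conn, c2, 2)
ids = [[row[0] for row in page] for page in (p1, p2, p3)]
```

Without the ID tie-breaker, a page boundary falling inside a run of equal prices would either skip rows or repeat them.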
Is the cursor the ID of the last record in the Sale table?
Nope. It's an ID of some specific record in the table (near the end of this table, if I understand Milan correctly).
The ID of the last record that you READ in the current page - and it denotes the start of the next page
@@MilanJovanovicTech, thanks Milan. Wouldn't it be more readable if "Last()" were used instead of [^1]? But I appreciate the approach shown; from my understanding it'll do the job when working with arrays.
What about when there is more than one filter?
Filters aren't problematic for this approach.
However, random sorting order is.
Cursor pagination is faster, but not good to use in combination with Guid IDs, filtering, ordering and searching.
Agreed
👋
That's a LIKE
Thank you! :)
Great video Milan
Thanks a lot!
The naive version would be faster if you only had 10 rows in the table...🤣
Everything is fast with a small database
that's not pagination at all :)
What is it then?
@@MilanJovanovicTech just an approach to fetch nearby items; there is no page number