Real world example would be where you have a grid with 50 columns where 30 of them can be filtered and all need to be sortable... then the SQL is the King. But nice example, I apreciate your work, keep going.
Exactly this. Most real world grids allow user to dynamically sort on any of the columns, and page. Probably should mention in the video that if you need dynamic sorting you can rule out cursor pagination..
You can just change Cursor >= request.Cursor to Cursor > request.Cursor rather than returning take + 1. This is fine for databases that use some form of auto increment as the first value is 1, so you don’t have an issue with off by 1 errors
Great video but 2 important concerns: 1. If you can't order by the given cursor column, for example if your Id column is Guid this might not work as intended 2. The OrderBy call should come before the Where call (in theory you first need to sort by that cursor, then filter)
1. Agreed, you'd need something sortable like a CreatedAt column, like Pedro suggested 2. It doesn't make a difference since it's converted into SQL, and SQL is deterministic when it comes to execution order.
@@MilanJovanovicTech I like the fact that it is deterministic but that is SQL. If we make abstraction of that, it's good to know in general the order of operation matters when paging by cursor
@@MortvmMM @Milan Jovanović I think PKs should always be created with sequential logic. Using `Guid.NewGuid()` like you said has no sequential logic, and hence would make index fragmentation on your PK column go up like crazy, impacting performance and hence DB maintainability. 2 possibilities are either: 1) ONLY create upon insert (using SQL's NEWSEQUENTIALID() method), or 2) Use EFCore library's SequentialGuid ValueGeneration method. Don't know if mixing both generation methods in the same column would achieve the intended purpose though, so I'd suggest sticking to one for the same table. Hope this helps.
@@islandparadise if using SQLServer you should use ValueGeneratedOnAdd (if using EF) and let the Guid be generated by the SQL engine, that way it will be sequential. This doesn't work on Postgres, because Postgres has a different logic for clustered indexes, or something along those lines... (i still don't fully understand it TBH) 😅
You can still order by different columns while doing keyset (cursor) pagination. Even there are syntactic sugars for that in some databases, it is written like (a, b) > (?, ?). For example in PostgreSQL: select * from posts where (name, id) > (“john”, 10) order by name limit 25 This syntax expands to name > “john” or (name = “john” and id > 10) If your db hasn’t this syntax, you can always use second syntax Also in EFCore Npgsql has translation for this syntax to postgresql, it is EF.Functions.GreaterThan(ValueTuple.Create(a, b), ValueTuple.Create(x, y))
Hi, thanks for video. It seems 'cursor' is a little bit confusing term in this video. I mean the 'Cursor' in SQL DB (forward-only pointer to row in query result) is not that is in this video - just Id. In this video the main point is using advantages of clustering index for fetching range of rows ordered by such index, for example pages. This approach has best performance because, you know, rows are physically ordered by clustering index within db table. EF is using clustering index for Id field by default.
Definitely a bad term in video name and code. I tought i will see something similar to what I saw in 30 years old legacy code, but using EF - paggination/scrolling build on top of sql cursors with fetch count based on amount of list items being draw/visible.
Hi Milan. What about if I have my primary key is GUID. What should I do with millions of data to be fast. I face this issue with skip input of millions query is time out
@@MilanJovanovicTech let say ur example above ID is GUID not long what is the best way to speed up the query. Thanks u for reply Milion. I'm ur fan for next videos
Cursor should not be whatever your primary key is, cursor must be whatever field (or fields) you sort by. For example, you do not order by guid, therefore there is no need to use it in cursor, instead you probably sort by datetime, so that's your first cursor candidate. If you use more than one field in sorting, or even have the customizable endpoint where you can control the sorting order per request - then you need to figure out how to transform fields for this particular sort order back to cursor, and also translate the cursor into individual field values to search by.
This should work with Guid too. Guids are indeed sortable. However, you typically don't want to use them as primary keys. With random guids, every insert will be on a random page of your table, and can lead to terrible performance. You might look into using newsequentialid() in TSql, or there are various implementations of sequential guid algorithms if you need to allocate outside the DB.
I think the approach is very limited since most of the search will require parameters like patter to search based on example tag, product name and not rely on Id? How would cursor solve this?
You can use use non-cluster index wtih a particular column (or set of columns) that are used for your filtering pattern. In the request to database you use both index and have the same speed effect. The only difference in the code that was showed is that you a more complex Where( in the video && extra columns), but the idea remains.
Hmmm interesting 😂 But you see the timings on the video right? The cold start was at 1600ms (cursor=0), then you do a new call (the one with cursor 51) that took 160ms, then a new call with cursor 101 that took 175ms. No where near the 30ms before the cursor respone refactoring. I think the additional Take() call is to blame.
If you build some feed like Twitter/Instagram/TikTok have then consider using cursor pagination, but if you need ability to navigate to specific page (for example you're developing a porn site) then offset pagination may be more suitable
@@MilanJovanovicTech also if you did not mention (or I missed): cursor pagination helps prevent page data shift in case if new records have been added to the database
I often use this pattern except that I do not include the cursor in the result. The next cursor is simply the id of the last item, you don't have to query one more item this way.
Hi, thanks for example. Do you know some efficient way for offset (not cursor) pagination with filtering, that could not be transmitted into where statement (for example, some sort of auth rule that could be applied only for materialized result after query execution). Would be great to watch your new video with such implementation
Why didn't you just changed the >= to > to solve the duplicate problem? It results in much less code, you can use the ID of the last element in the returned page.
Well done for mentioning a sensible use case for cursor pagination. Video could have been improved by discussing the implications of dynamic sort order for cursor pagination viability!
Good video. I've always known this as keyset pagination. BTW: There is a keyset pagination library for EF core. It does a lot of work for you. I suggest don't just use this for infinite scrolling, use it even for paged grids. It does mean you lose the ability to navigate to a specific page, but with search/filter features and or an index that's not really that important. Not only is perf better for a single query, but overall scalability is better because your database server is doing less work.
Looks like performance went back to Skip/Take performance after all that work. Might not be as beneficial since Skip/Take is fairly common and understood.
Not really, my PC was throwing fits. Skip/Take slows down linearly the further away from the start. Cursor/keyset pagination has constant performance wherever you are. There are many articles about it, with benchmarks.
Hi, I have question, it's a bit not about the topic in this video. I'm using cqrs, when I handle my command with creating Observation, I need to send notification to fron-end that is based on some logic through SignalR. I rise domain event with MediatR Notification and Handling all logic there. But now I also need to update my Patient in the same transaction(when i Create observation), as I know CQRS can update only one aggregate per command? So what should I do in this case.
However this only works if you are ordering by ids, if you are ordering by something like datetime you would need to have a different implementation. And may even be hustle
That's where it becomes a little tricky. You would need to include indexes for the filter columns for it to be fast, and then everything else should end up being the same. I would Google 'Keyset pagination' or 'Cursor pagination' and explore the topic some more.
I don't know if I can agree with that sample. While it's a good showcase for the cursor it's way too simplified. In almost all cases you don't do a paging by ascending order. It's more by descending order. Like you've mentioned in your last example. You want to show the latest post in your feed and add older posts while scrolling down. But it's more difficult than shown in your example. If you order descending, you also have to save your start position for the skip/take approach. Cursor is fine though since you return the next starting cursor position. Maybe an additional (huge) advantage to the cursor approach. If you don't save your state (last position or start point) for the descending order, you can mess up your paging if new items get added. Seen that error a lot. Especially for junior devs and wanted to mention that.
Actually... You just pass the MaxValue as the cursor. And if an indexed is sorted in ASC, it can also be traversed in DESC order from behind. So you can actually use the same index for both sort orders, but just need to change the cursor logic. 😁
This is amazing. Excuse my english, I surely remember that in a video of you, i heard about how to optimize the send email operation. The thing is, when a new user registered successfully, I will send email before return the result of request. But you introduce a way that sending email in background, and now i lost thay video. Please i need your help to reference that video or you can give me some keyword search. Appreciate a lot, thank you so much
How about sorting? Used something similar to delete old data from a database with hundreds of millions of rows, reduced batch jobs time by 60%. Previous solution was hibernate though 🤭😉
Want to master Clean Architecture? Go here: bit.ly/3PupkOJ
Want to unlock Modular Monoliths? Go here: bit.ly/3SXlzSt
Real world example would be where you have a grid with 50 columns where 30 of them can be filtered and all need to be sortable... then the SQL is the King. But nice example, I apreciate your work, keep going.
In that case you need vanilla offset pagination 😁
Exactly this. Most real world grids allow user to dynamically sort on any of the columns, and page. Probably should mention in the video that if you need dynamic sorting you can rule out cursor pagination..
You can just change Cursor >= request.Cursor to Cursor > request.Cursor rather than returning take + 1. This is fine for databases that use some form of auto increment as the first value is 1, so you don’t have an issue with off by 1 errors
Cool!
Great video but 2 important concerns:
1. If you can't order by the given cursor column, for example if your Id column is Guid this might not work as intended
2. The OrderBy call should come before the Where call (in theory you first need to sort by that cursor, then filter)
For the number 1, if you have a CreatedAt column I think it's a good way to proceed with the OrderBy.
1. Agreed, you'd need something sortable like a CreatedAt column, like Pedro suggested
2. It doesn't make a difference since it's converted into SQL, and SQL is deterministic when it comes to execution order.
@@MilanJovanovicTech I like the fact that it is deterministic but that is SQL. If we make abstraction of that, it's good to know in general the order of operation matters when paging by cursor
@@MortvmMM @Milan Jovanović I think PKs should always be created with sequential logic. Using `Guid.NewGuid()` like you said has no sequential logic, and hence would make index fragmentation on your PK column go up like crazy, impacting performance and hence DB maintainability. 2 possibilities are either:
1) ONLY create upon insert (using SQL's NEWSEQUENTIALID() method), or
2) Use EFCore library's SequentialGuid ValueGeneration method.
Don't know if mixing both generation methods in the same column would achieve the intended purpose though, so I'd suggest sticking to one for the same table.
Hope this helps.
@@islandparadise if using SQLServer you should use ValueGeneratedOnAdd (if using EF) and let the Guid be generated by the SQL engine, that way it will be sequential. This doesn't work on Postgres, because Postgres has a different logic for clustered indexes, or something along those lines... (i still don't fully understand it TBH) 😅
You can still order by different columns while doing keyset (cursor) pagination. Even there are syntactic sugars for that in some databases, it is written like (a, b) > (?, ?). For example in PostgreSQL:
select * from posts where (name, id) > (“john”, 10) order by name limit 25
This syntax expands to name > “john” or (name = “john” and id > 10)
If your db hasn’t this syntax, you can always use second syntax
Also in EFCore Npgsql has translation for this syntax to postgresql, it is EF.Functions.GreaterThan(ValueTuple.Create(a, b), ValueTuple.Create(x, y))
Becomes too cumbersome
Hi, thanks for video.
It seems 'cursor' is a little bit confusing term in this video.
I mean the 'Cursor' in SQL DB (forward-only pointer to row in query result) is not that is in this video - just Id.
In this video the main point is using advantages of clustering index for fetching range of rows ordered by such index, for example pages.
This approach has best performance because, you know, rows are physically ordered by clustering index within db table.
EF is using clustering index for Id field by default.
Unfortunate, but I didn't come up with the name
it's also know as keyset. So you can use that or you can use "last" or "lastseen" as your parameter.
Definitely a bad term in video name and code. I tought i will see something similar to what I saw in 30 years old legacy code, but using EF - paggination/scrolling build on top of sql cursors with fetch count based on amount of list items being draw/visible.
At 13:26 the milliseconds spiked to a thousand. Is it because of the extra logic added to return the cursor position?
No, my PC just slowed down 😅
Cursor pagination should have constant time in production
@@MilanJovanovicTech Ooh i see. And probably the API was warming up after restarting. Thank you for your great video. God bless you.
This is what i was looking for in the comments. This comment must be pinned.
Thanks for sharing the concept Milan.
Sure thing!
Good work, I like the blog also. Good content.
Much appreciated!
This is super awesome! Thanks for sharing Milan!
Glad you liked it!
Awesome video. There is a problem with cursor pagination when you need to apply a filter, so keep that in mind.
Filtering should work fine. Arbitrary sorting is a problem
Thank for this wonderful concept.
I've an interesting video on this coming next week, with benchmarks and execution plans
Hi Milan. What about if I have my primary key is GUID. What should I do with millions of data to be fast. I face this issue with skip input of millions query is time out
Guid isn't sortable, so you can't use it with cursor pagination. You need a unique column that is also sortable.
@@MilanJovanovicTech let say ur example above ID is GUID not long what is the best way to speed up the query. Thanks u for reply Milion.
I'm ur fan for next videos
@@Nisa-Julie There isn't one, really. Unless you use a sortable GUID. 🤷♂
The biggest cost with GUID will be sorting, not paging.
Cursor should not be whatever your primary key is, cursor must be whatever field (or fields) you sort by. For example, you do not order by guid, therefore there is no need to use it in cursor, instead you probably sort by datetime, so that's your first cursor candidate. If you use more than one field in sorting, or even have the customizable endpoint where you can control the sorting order per request - then you need to figure out how to transform fields for this particular sort order back to cursor, and also translate the cursor into individual field values to search by.
This should work with Guid too. Guids are indeed sortable. However, you typically don't want to use them as primary keys. With random guids, every insert will be on a random page of your table, and can lead to terrible performance. You might look into using newsequentialid() in TSql, or there are various implementations of sequential guid algorithms if you need to allocate outside the DB.
Where is the DB index see, can I see the query plan?
I didn't post it, but you can read some more here: use-the-index-luke.com/no-offset
I think the approach is very limited since most of the search will require parameters like patter to search based on example tag, product name and not rely on Id? How would cursor solve this?
Then simply use offset pagination if you need flexible paging like that
You can use use non-cluster index wtih a particular column (or set of columns) that are used for your filtering pattern. In the request to database you use both index and have the same speed effect. The only difference in the code that was showed is that you a more complex Where( in the video && extra columns), but the idea remains.
that sounds great to me, but what about when my primary key is of type guid?
Then you're screwed 😅 Since it's not sortable
13:47 Check that response time after the cursor response refactor. Back at 180ms instead of 30ms. So almost no profit gain at all…
Cold start.
Nope. The cold start was at 1600ms and the subsequent calls are at 180ms.
@@brechtlaitem It's faster. Trust me 😅
I'll put my money on it.
Hmmm interesting 😂 But you see the timings on the video right? The cold start was at 1600ms (cursor=0), then you do a new call (the one with cursor 51) that took 160ms, then a new call with cursor 101 that took 175ms.
No where near the 30ms before the cursor respone refactoring.
I think the additional Take() call is to blame.
@@brechtlaitem That Take is in-memory, so I doubt it made a difference. I'll do a proper benchmark to prove my point.
At the end with the complete cursor implementation, the performance is a BIT better than the approach with Take and Skip.
Disregard my tests in Debug mode, it's MUCH faster with cursor pagination
What if the user is in first page and clicks on 10th page instead of second page? How would that work in Cursor approach?
There is no 10th page with cursors, really. Only the next page, and in some cases previous. It's a slightly different concept.
If you build some feed like Twitter/Instagram/TikTok have then consider using cursor pagination, but if you need ability to navigate to specific page (for example you're developing a porn site) then offset pagination may be more suitable
Agreed, it's a good solution for a specific problem. Not a general solution for all paging.
@@MilanJovanovicTech also if you did not mention (or I missed): cursor pagination helps prevent page data shift in case if new records have been added to the database
Very well explained. Thanks.
Glad it was helpful!
I often use this pattern except that I do not include the cursor in the result. The next cursor is simply the id of the last item, you don't have to query one more item this way.
I do exactly same
But you always need one more DB query for last page to know there are no more results.
Hi, thanks for example. Do you know some efficient way for offset (not cursor) pagination with filtering, that could not be transmitted into where statement (for example, some sort of auth rule that could be applied only for materialized result after query execution). Would be great to watch your new video with such implementation
That's a tough one
I agree. But at the same time very often this is a real requirements
Why didn't you just changed the >= to > to solve the duplicate problem? It results in much less code, you can use the ID of the last element in the returned page.
Kind of missed it on the first try, and improvised later 😅
@@MilanJovanovicTech Haha it happens. Thanks for the honest answer. 🙂
Great video thanks.
Does the processing of large data by batch works better with a cursor pagination than with a skip take ?
It should be faster if you were loading batches one by one
What elastic search topics are they under the hood ?
Where did you see elastic search here? 🤔
Well done for mentioning a sensible use case for cursor pagination. Video could have been improved by discussing the implications of dynamic sort order for cursor pagination viability!
I think I'll touch on one more video to showcase that, plus also include some benchmarks so people can understand the performance difference.
Good video.
I've always known this as keyset pagination.
BTW: There is a keyset pagination library for EF core. It does a lot of work for you.
I suggest don't just use this for infinite scrolling, use it even for paged grids. It does mean you lose the ability to navigate to a specific page, but with search/filter features and or an index that's not really that important. Not only is perf better for a single query, but overall scalability is better because your database server is doing less work.
Yes, it's also known as Keyset pagination! I do believe google uses this for Gmail? Since I can only go forward/backward through my email list.
Awesome video, not very long and quite educating. I subbed and liked so please make more
Thanks for the sub! More videos coming 😁
Is the example code shared on github?
I share the code on my Patreon: www.patreon.com/milanjovanovic
/api/user?sort=age
In these case what to do coz tabel get unordered there will no cursor
I don't recommend cursor pagination if you need random sorting capability
Looks like performance went back to Skip/Take performance after all that work. Might not be as beneficial since Skip/Take is fairly common and understood.
Not really, my PC was throwing fits. Skip/Take slows down linearly the further away from the start. Cursor/keyset pagination has constant performance wherever you are. There are many articles about it, with benchmarks.
Hi, I have question, it's a bit not about the topic in this video. I'm using cqrs, when I handle my command with creating Observation, I need to send notification to fron-end that is based on some logic through SignalR. I rise domain event with MediatR Notification and Handling all logic there. But now I also need to update my Patient in the same transaction(when i Create observation), as I know CQRS can update only one aggregate per command? So what should I do in this case.
Update both of them in a single transaction, be practical.
Or update Patient asynchronouslt.
@@MilanJovanovicTech okay, thanks, if u start googling there is a big holy war between ppl can we update 2 aggregates in one transaction or not :)
Please make a video use case of cancelation token ,how ,why use this cancelation token
Alright! I've been planning to do it for a while, but I kept postponing it 😅
However this only works if you are ordering by ids, if you are ordering by something like datetime you would need to have a different implementation. And may even be hustle
Yes, that's very important. Sortability is a requirement for this to work.
Very nice video. Could you also share with us how do you add to this different kind of filtering data?
That's where it becomes a little tricky. You would need to include indexes for the filter columns for it to be fast, and then everything else should end up being the same. I would Google 'Keyset pagination' or 'Cursor pagination' and explore the topic some more.
@@MilanJovanovicTechhi milan...how about if i changed order by into Name column....????....
I don't know if I can agree with that sample. While it's a good showcase for the cursor it's way too simplified. In almost all cases you don't do a paging by ascending order. It's more by descending order.
Like you've mentioned in your last example. You want to show the latest post in your feed and add older posts while scrolling down.
But it's more difficult than shown in your example. If you order descending, you also have to save your start position for the skip/take approach. Cursor is fine though since you return the next starting cursor position.
Maybe an additional (huge) advantage to the cursor approach.
If you don't save your state (last position or start point) for the descending order, you can mess up your paging if new items get added.
Seen that error a lot. Especially for junior devs and wanted to mention that.
Actually... You just pass the MaxValue as the cursor. And if an indexed is sorted in ASC, it can also be traversed in DESC order from behind. So you can actually use the same index for both sort orders, but just need to change the cursor logic. 😁
@@MilanJovanovicTech For the starting point, yes. You can start with int.MaxValue, DateTime.MaxValue and do checks like index
@@MarcusKaseder True, but it's very useful to know about Cursor pagination when you run into a situation where you need it
@@MilanJovanovicTech True, and also about its limitations 😉
This is amazing. Excuse my english, I surely remember that in a video of you, i heard about how to optimize the send email operation. The thing is, when a new user registered successfully, I will send email before return the result of request. But you introduce a way that sending email in background, and now i lost thay video. Please i need your help to reference that video or you can give me some keyword search. Appreciate a lot, thank you so much
Outbox pattern
This is also called keyset pagination
Yes!
first query cursor = 0 , take 51 so it should return 51 records not 50 ? why this happening ?
Take = Limit = Page Size
Excellent 👌😊👌
Thanks a lot 😊
You could use Tuple as a return value or? Not necessary to create response class object.
I find tuples to be less readable than response class objects.
I prefer classes when working with queries, and I use tuples typically for internal results
The time it took still around 160ms?
I really need to do a benchmark video about this to prove I'm not bulshitting 😅
😆 I just noticed the value in postman....
Awesome video
Thanks
Glad you liked it!
shouldn't you skip the first record though? You would still have 50 as your first record, no?
Depends on how you implement it. You can either use the last element as the index or the N+1th element
Nice video bro! Please, raise a little bit the gain of your mic, the volume of your voice is low....
Ah damn it, thanks for the feedback! Will do.
@@MilanJovanovicTech Don't worry, it's not a big problem, just an improvement to make it better :)
I like the part where you copy it into a list twice 😂
Well...
Thank you
You're welcome
I wanna know why the time got longer
Cold start, need to do a proper benchmark on this 😅
How about sorting?
Used something similar to delete old data from a database with hundreds of millions of rows, reduced batch jobs time by 60%.
Previous solution was hibernate though 🤭😉
Tell me more 🤔
It is perfect, Unfortunately It can not be used for all situations such as for GUID and PageNumber clickable.
Yes, it's a good for to for a specific problem.
Great
Thanks :)