The New ID To Replace GUIDs and Integers in .NET

Поділитися
Вставка
  • Опубліковано 20 жов 2024
  • Use code SUMMER24 and get 30% off ANY course on Dometrain: bit.ly/dsummer24
    Check out our new OpenTelemetry course: dometrain.com/...
    Become a Patreon and get special perks: / nickchapsas
    Hello, everybody. I'm Nick, and in this video, I will show you a new type of ID being added in .NET 9, which solves the biggest problem of using GUIDs or UUIDs in databases as primary keys.
    Workshops: bit.ly/nickwor...
    Don't forget to comment, like and subscribe :)
    Social Media:
    Follow me on GitHub: github.com/Elf...
    Follow me on Twitter: / nickchapsas
    Connect on LinkedIn: / nick-chapsas
    Keep coding merch: keepcoding.shop
    #csharp #dotnet

КОМЕНТАРІ • 281

  • @EricSellers-m3k
    @EricSellers-m3k 3 місяці тому +119

    That one guy on your team: Mmm yes. Starting to prep for the Y10K problem now. 😏

    • @Kingside88
      @Kingside88 3 місяці тому +2

      Yeah and makeing big panic to PM

    • @NickSteffen
      @NickSteffen 3 місяці тому

      @@Kingside88 lmao, I still remember a product I worked on where they decided to use 4 characters followed by 9 numerals incrementing server names (after an app name) just in case we had a lot of servers. We never had more than 100 sites with the naming scheme and since the names were unique to each site the number was never incremented more than 4. Meanwhile 30 poor technicians had to type all 13 numbers in tens of of thousands of times while they were working on this product. That naming scheme is long dead now, and will never reach the billions of servers that 9 digits would accommodate.

    • @memoryleakerz
      @memoryleakerz 3 місяці тому

      You're joking but there's a project manager out there, who's grand grand grand children will be running that stand up meeting on GUIDv7 being about to overflow next year and allocating budget for ir

    • @NickAskew
      @NickAskew 3 місяці тому +1

      I'm old enough to remember people who thought 2000 was a long way off 🙂. But I seriously don't think we would need to panic about 8000 years from now.

  • @Jallenbah
    @Jallenbah 3 місяці тому +51

    A nice little bonus of this implementation is if you do happen to have a table which doesn't store creation time information, or if some issue results in that field being overwritten or not written, you can reverse engineer the Id to find the record creation time.

    • @SmoothNanners
      @SmoothNanners 3 місяці тому +33

      Another point to note is, for some types of data, if you're exposing IDs, you may not want the creation time to be exposed for security reasons.

    • @krajekdev
      @krajekdev 3 місяці тому +6

      @@SmoothNanners Thanks, great comment.

    • @Kitulous
      @Kitulous 3 місяці тому +5

      and it means that everyone else can as well

  • @garage30rc
    @garage30rc 3 місяці тому +205

    I always feel slightly guilty when I create a new GUID and don't persist it in a database or use it in a test. In the whole vastness of eternity, that GUID only had a miniscule chance at existing, and I discarded after just a few milliseconds.

    • @z0nx
      @z0nx 3 місяці тому +35

      Only to be experienced by you, man of stardust

    • @T___Brown
      @T___Brown 3 місяці тому +1

      Lol

    • @markparker6499
      @markparker6499 3 місяці тому

      😂

    • @stefan-d.grigorescu
      @stefan-d.grigorescu 3 місяці тому +18

      Hm. Didn't thought that way.
      I just like to run my console app that prints new guids in a while loop and watch them go out of screen view.

    • @ImmoLandwerth
      @ImmoLandwerth 3 місяці тому +15

      It’s as if millions of GUIDs cried out in terror and were suddenly discarded.

  • @GiovanniOrlandoi7
    @GiovanniOrlandoi7 3 місяці тому +12

    So glad this new option still maps to the guid type. Thanks for the video!

  • @BillyBraga
    @BillyBraga 3 місяці тому +106

    6:25 "And I will also be dead" 🤣

    • @marcelagraz5915
      @marcelagraz5915 3 місяці тому +2

      😂

    • @da3dsoul
      @da3dsoul 3 місяці тому +1

      I really feel for the first developer to say that throughout their early 20s and ends up living 10000 years to see the fallout of whatever fallacy they reference. Then again, MS doesn't even want to support Windows 10...

    • @knifekitty_ls
      @knifekitty_ls 3 місяці тому

      that actually solves a lot of problems

    • @ab1ray
      @ab1ray 3 місяці тому

      I came to the comments as soon as I heard this 😂

  • @yannolaf
    @yannolaf 3 місяці тому +13

    We use GUID (v4) as the primary key and a generic (auto inc) as the clustered index as a workaround against clustered index fragmentation. Not perfect, but it works ;)

  • @Spylak
    @Spylak 3 місяці тому +23

    Hello Nick, a comparison video with ULID would be interesting I think.

    • @MoehreMoe100
      @MoehreMoe100 3 місяці тому +4

      ULID just has a few more bytes of randomness because it doesn't have to specify a version

  • @lucaswhite12
    @lucaswhite12 3 місяці тому +45

    LOL I was using this more than
    10 years ago!
    To be clear, the ids were generated on the DB side, and it was a SQL function that did exactly this, generated a new GUID and overwrote part of it with the timestamp so they would be ordered.

    • @levmatta
      @levmatta 3 місяці тому +8

      Me too, but this is better than a custom solution.

    • @Udodisgustme
      @Udodisgustme 3 місяці тому +1

      Cool

    • @pilotboba
      @pilotboba 3 місяці тому +1

      @@levmatta There are some systems where you need to generate an id on the client. Or at least it is much more efficient or eliminates the need to temp id's on client side.

    • @lucaswhite12
      @lucaswhite12 2 місяці тому

      ​@@pilotbobaOh, definitely! It's a good thing! I just find it a bit amusing how some things "persist" 🤭 in our dev world!

  • @antonmartyniuk
    @antonmartyniuk 3 місяці тому +35

    Nick forgot to put an Easter Egg Console.WriteLine in the Program.cs

    • @nickchapsas
      @nickchapsas  3 місяці тому +26

      I knew I missed something!

  • @KritX01
    @KritX01 3 місяці тому +9

    I've been using ULID since it's also time based

  • @victorgarcia3526
    @victorgarcia3526 Місяць тому

    I was looking for this! Thanks Nick!

  • @uncommonbg
    @uncommonbg 3 місяці тому

    There already are packages that do this. I’m using that in some of my new projects. Works really nicely combined with EF value generators.

  • @SlyEcho
    @SlyEcho 3 місяці тому +9

    I don’t think this is compatible with how SQL Server stores unique identifiers. They are ordered by the byte array part first if I remember correctly.

    • @bfrytech
      @bfrytech 2 місяці тому

      It's not. A lot of these going around.

  • @MayronWoW
    @MayronWoW 3 місяці тому +5

    That's a much needed improvement. I wish they could be shorter (such as the IDs used for youtube video IDs) but I'm not sure how easy that would be. Using IDs generated from timestamps feels safer, though.

  • @xlerb2286
    @xlerb2286 3 місяці тому +1

    Being able to order by creation date/time solves my biggest gripe about Guids. Too bad I'm retired now and hopefully will never need to create another unique id ;)

  • @IhorBahlaiev
    @IhorBahlaiev 3 місяці тому +1

    Usually, when I need to have primary key of GUID type, I'm doing the pair columns - GUID non clustered (primary key) + INT/BIGINT clustered autoincrement for avoiding full table sort after inserting new row.

    • @NotACat20
      @NotACat20 3 місяці тому

      Inserting never requires "full table sort". Clustered index is all about sorting on leaf nodes (inside pages), while represented with tree on branches, exactly in same way as with regular index.

  • @utubekade
    @utubekade 3 місяці тому

    Great feature. Thanks for sharing. Hope they find a more efficient way before they release this.

  • @jpfl17
    @jpfl17 3 місяці тому +1

    Nice video, Nick! Regards from Brazil.

  • @vamvdotnet
    @vamvdotnet 3 місяці тому +2

    A question: isn`t this feature what Chris Pattersons`s (MassTransit`s creator) NewId nuget Package delivers for us already?

  • @scdecade
    @scdecade 2 місяці тому

    If the guid is the unique clustered key in the database table then the performance is good. If the situation calls for it you can split apart the primary key and the clustered index

  • @Mr__B.
    @Mr__B. 20 днів тому

    Summary of UUID v7 Problems:
    Insertion performance: Still random-like and could lead to B-tree fragmentation.
    Memory and storage overhead: Larger size (128-bit) than simpler numeric keys.
    Indexing and querying: Slower due to the complexity of comparing longer, larger keys.
    Time dependency: Vulnerable to clock issues (drift, unsynchronization).
    Human readability: Difficult for humans to interact with directly.
    Information disclosure: Potential exposure of time-related information.
    Key collisions: Extremely rare but theoretically possible.
    Compatibility: Not fully supported everywhere yet.
    Unnecessary complexity: In applications where global uniqueness isn’t a must.

  • @Marfig
    @Marfig 3 місяці тому +5

    Well, the version 7 UUID cannot be used by services that expose these IDs to third-parties, through API calls for instance, or to frontends. The timestamp and counter portions of the UUIS have an implied security risk. You wouldn't want your customers to estimate how many invoices you generated last month, or when a certain part was added to your catalogue. It needs to be said that in relational databases, performant IDs that are also secure is still an unresolved problem.
    If you are in a scenario where security trumps performance, you definitely should go with version 4. Or better yet, go with another library like nanoid that can produce low collision IDs with a much smaller index size.

    • @nickchapsas
      @nickchapsas  3 місяці тому +5

      You can’t estimate how many of anything was generated last month with v7. You can just get when it was generated

    • @Marfig
      @Marfig 3 місяці тому

      @@nickchapsas That's correct. It's not version 7 that has a counter segment.

    • @TGeorgie
      @TGeorgie 3 місяці тому +1

      What customer cares when a company generated their invocie or when they added something to the catalogue? What can they even do with that info? I guess depending on the info it might be useful to competitors? Maybe? But if the creation date is not something you absolutely need to hide then there's no issue and quite a bit of benefit.

    • @Marfig
      @Marfig 3 місяці тому

      @@TGeorgie Security isn't evaluated like that. There are other scenarios were revealing the timestamp of a record generation can be a security risk. The reason why Version 7 UUID was never a popular choice for database primary keys is exactly because of the increased vulnerability to enumeration attacks, and the information a timestamp can reveal about the system it was generated on. Many companies in fact have the simple rule that no shared ID should reveal anything about the system or the data itself and should be completely random. If you work on the financial sector for instance, go on and try creating a version UUID on your database to serve as a PK. You'll get quite the PR comments.

    • @deado7282
      @deado7282 3 місяці тому +3

      "You wouldn't want your customers to estimate how many invoices you generated last month, or when a certain part was added to your catalogue."
      Even if you wouldn't want to, you have to. Sequential order-numbers are legally required in many cases. You can usually compose them, but it still has to contain a incremented number.
      You might be able to hide this number on your website for instance. But you have to disclose it customers who bought something anyway. Making the hiding part kinda pointless.

  • @haxi52
    @haxi52 3 місяці тому

    I have an `Id` type in my domain of all my projects. It implicitly converts from/to string/guid and a new static method that does similar to here, where the first few bytes are time ordered. I also like to encode two bytes of an enum that tell me which "table" the id belongs to.

  • @cwevers
    @cwevers 3 місяці тому +20

    Doesn't this eat into the randomness/uniqueness? Since they are both the same length, adding a potentially overlapping part (the time).

    • @SilasPeters
      @SilasPeters 3 місяці тому

      I also wonder what the chances are now

    • @nick21087
      @nick21087 3 місяці тому +10

      Well a collision could now ONLY occur within the same ms? So chance of collision is probably less overall?

    • @VladSandu79
      @VladSandu79 3 місяці тому +1

      What about brute forcing ? I feel like there are fewer posible results now, for each ms

    • @OhhCrapGuy
      @OhhCrapGuy 3 місяці тому +28

      Considering that there are still 74 bits of entropy every millisecond, and that a GUID from one millisecond can never collide with a GUID from another millisecond, there are advantages to this GUID compared to old style ones.
      And the most important one is that your "risk" of collision resets all the way to zero every millisecond. You could generate 1000 GUIDs a second at a constant rate of 1 per ms for 700 years, and have 100% certainty that there would never be a duplicate GUID.
      Now, if you're inserting more than one a millisecond, the risk of collision per millisecond is the birthday paradox problem, with the number of "days in a year" being 2^74, and the number of "people" is the number you insert per millisecond.
      And what's more, unlike regular GUIDs, you can do these inserts for 700 years without the odds of a collision ever increasing, because the odds per millisecond are completely isolated from each other, because if you don't have a collision yet, the odds of the first guid for a millisecond colliding is 0.
      The odds of a second guid in a millisecond is 1 in 2^74

    • @condar15
      @condar15 3 місяці тому +5

      Specifically looking at rough estimates you're looking at needing about 1e37 (ten trillion, trillion, trillion) generations in a single given millisecond to have a 50% chance of a conflict. Even if we assume there are enough UUIDs generated each millisecond to create some conflict, the chance that conflict occurs within the same system (which is what we really care about) is effectively non-existent.

  • @andreaskristensen3948
    @andreaskristensen3948 2 місяці тому

    Is it possible to retirieve the date time from the version 7 guid i.e. a method on the type?

  • @reikooters
    @reikooters 3 місяці тому

    I'm using the RT.Comb package for this at the moment, but would be interesting to see how this compares

  • @leerothman2715
    @leerothman2715 3 місяці тому

    Probably not a good idea on a noSql container/collection if it’s the partition key as you’ll end up with a hot partition?

  • @AslamNazeerShaikh
    @AslamNazeerShaikh 3 місяці тому

    Thanks for sharing wonderful knowledge with us ❤🎉

  • @bryanstackpole1951
    @bryanstackpole1951 3 місяці тому

    Does it play nice with sequential guids?

  • @Grumpicles
    @Grumpicles 3 місяці тому

    I love it when I can defer a problem with a statement like "I'm not going to worry about it, I'll be dead." 😂

  • @NickAskew
    @NickAskew 3 місяці тому

    I was trying to get my head around why it would be important that the GUID, UUID, or identifier is in order of creation. The old fashioned increment a counter system worked fine for smaller systems and automatically ensured this order. So if a table were clustered by the ID and that ID is an incrementing number then the new row can reliably be appended and no sorting is done, but why would you cluster by ID as that would lead always to clusters of size 1? OK so perhaps the issue is nothing to do with clustered indexes but just indexes. Each time a new row is added where the primary key is primary means of retrieval, the index b-tree will need to be adjusted in all cases. Is it faster to add data that is roughly in order to a b-tree? Well I guess it means that you are affecting a more predictable part of the tree rather than the GUID scatter gun approach.

  • @T___Brown
    @T___Brown 3 місяці тому +1

    I switched to mongodb style objectid. Its larger and better for url

  • @gunnarliljas8459
    @gunnarliljas8459 3 місяці тому +7

    It must be noted that this is REALLY BAD idea for SQL Server. MS SQL's uniqueidentifier is not sorted in the way UUIDv7 is. Don't use this with SQL Server.

    • @screamiboy
      @screamiboy 3 місяці тому

      Yes valid and important point in this discussion! There are some v8 implementation, which are solving the issues in MSSQL, but we have to keep in mind, that v8 is a custom format without general specification.

  • @tofu1687
    @tofu1687 3 місяці тому

    We use EF or Database generated Guid which are sorted... but system dependant (ie. merging data from 2 databases on different machines might have an index issue)

  • @danku1013
    @danku1013 3 місяці тому +12

    EF Core supports sequential GUIDs for a long time.

    • @SmoothNanners
      @SmoothNanners 3 місяці тому +4

      I believe it's only with SQL server with NEWSEQUENTIALID set as the database generated value and its implementation is different from UUID v7. Other DBs might need a plugin or don't support it.

  • @rupenanjaria
    @rupenanjaria 2 місяці тому

    How are you able to try this? pls share. Can we do it in SharpLab?

  • @MadafakinRio
    @MadafakinRio 3 місяці тому +9

    I'll stick to my auto-increment integers, thanks.

    • @TheCodeCreator
      @TheCodeCreator 3 місяці тому

      I guess I will too.

    • @TheLucanicLord
      @TheLucanicLord 3 місяці тому +1

      As will I. And my axe!

    • @TheXenoEnder
      @TheXenoEnder 2 місяці тому

      As long as you never need to partition your data for distribution that should be fine.

  • @magnuspersson9740
    @magnuspersson9740 3 місяці тому

    But if you create several directly after each other like in your example they are not sequential if I understood correctly? Since they get the same time stamp.

  • @leandroteles7857
    @leandroteles7857 3 місяці тому +11

    Sir, did you really just run a benchmark of GUID generation? Are you aware that you had just stolen 10000+ GUIDs from the universe? 🤣🤣🤣

  • @NickAskew
    @NickAskew 3 місяці тому

    I'm slightly confused here. I don't see how they are time ordered if the bytes after the time are not byte ordered unless the difference in the Unix time is less than 70 milliseconds. But I suppose even then occasionally two servers generating the new UUID could generate a UUID within that window of time.

  • @alexdubinsky6098
    @alexdubinsky6098 3 місяці тому +2

    what about sequential guid? has been around for years

  • @silkfire
    @silkfire 3 місяці тому

    Is this Guid similar to the NewId package?

  • @ernstgreiner5927
    @ernstgreiner5927 3 місяці тому +3

    If one billion (1E9) people create one billion GUIDs per day on one billion planets, it would take more than 930 million Earth-years to exhaust all these GUIDs.
    Just how to grasp the vastness of GUID...

  • @onelopez
    @onelopez 3 місяці тому

    I had to create my own Guid that uses the time and also noticed that there's a performance hit whenever using this version. But it's worth the trade off imo for the speed gains in indexing

    • @Daniel15au
      @Daniel15au 3 місяці тому

      The difference shown in this video is around 40ns. That's tiny. Even if you were to create 1 million IDs, that's just an extra 40ms.

  • @LilPozzer
    @LilPozzer 3 місяці тому +1

    Hi Nick. How does someone migrate an existing huge microservices app from V4 to V7?

    • @HappyCheeryChap
      @HappyCheeryChap 3 місяці тому

      Just change your generator for new rows from now on.
      Don't waste time changing existing data.
      The sooner you switch, the less you'll care about the older rows sooner too.

  • @SmoothNanners
    @SmoothNanners 3 місяці тому

    Will they also add a property to get the version of a Guid?

  • @atlesmelvr1997
    @atlesmelvr1997 3 місяці тому +1

    sequential guids have been available for a very long time

  • @djupstaten2328
    @djupstaten2328 3 місяці тому

    It seems obvious that time should be in the code. Now mix in some serially unique info from the hardware creating the hash (like the ATPO of the CPU) and you have "place" also, making it impossible to duplicate. It could end up being hefty but probably not too bad.

    • @JW-lv1fm
      @JW-lv1fm 3 місяці тому

      Love this idea, wonder how many bits would be needed however. This might also reduce randomness/collision reduction capability.
      It might be better to just store the location in a separate field, or new dataType (GUIDL) that has an extra 8 bytes for location?
      This is only useful where GUIDs come from random sources anyways, so not needed for everyone's use case.

  • @venussmodzhd1886
    @venussmodzhd1886 3 місяці тому +3

    What about ULID?

  • @Wfmike
    @Wfmike 3 місяці тому +1

    How does that compare to the newid package.

  • @carlosbrockmann7604
    @carlosbrockmann7604 3 місяці тому

    Do you think this new Version could be migrated to SQL-server?

  • @Corvin_
    @Corvin_ 3 місяці тому

    I looked into this a while back and honestly I didn't see why this would be better than using an incremental int + guid column combo. 1 less column to index i guess. I'm not an expert though so if someone knows please enlighten me.

  • @zsolter14
    @zsolter14 3 місяці тому

    So how does this solve that Guids are much larger than for example longs? Guids need twice as much storage than longs and sometimes it matters. What I am trying to say is that this new Guid will only replace the Guid IDs, but the long or int still has a use-case.

  • @Max_Jacoby
    @Max_Jacoby 2 місяці тому

    I always use autoincrement. I don't like the idea of possible collisions however small the chance is. Storing 128 bit instead of 32 is also a thing. It gives 12GB of data overhead if you have 1B records.

  • @NickDarlow
    @NickDarlow 3 місяці тому

    What build of .NET 9 is this in? It doesn't seem to be in preview 6?

  • @Vosoo-e9r
    @Vosoo-e9r 19 днів тому

    In some cases i would use 5 symbols hash with db existence checks

  • @KvapuJanjalia
    @KvapuJanjalia 3 місяці тому

    EF Core's sequential GUID generator does the same thing, no?

  • @unexpectedkAs
    @unexpectedkAs 3 місяці тому

    I expected a benchmark comparison between the nrw Guid and the existing libs.

  • @Ulfius1984
    @Ulfius1984 3 місяці тому +4

    The samples created at 4:30 were not sequential

    • @phizc
      @phizc 3 місяці тому

      That's not a guarantee with v7. The guarantee is to be grouped by milliseconds and highly unlikely to have a collision. Technically the last 78 bits doesn't have to be all random. They could contain a counter as well. I just skimmed Wikipedia's entry about it and I'm not sure why and how many bits has to be random.
      Tldr; Order is not guaranteed if you create multiple UIID v7 per millisecond.

    • @pilotboba
      @pilotboba 3 місяці тому

      I'm not sure the string representation of a guid matches it's binary representation which is how it's stored and sorted. I could be wrong about this, and it would be nice to see a video about it.

    • @Daniel15au
      @Daniel15au 3 місяці тому +1

      ​@@pilotbobathe string matches - it's just a hex representation of the bytes.

  • @Udodisgustme
    @Udodisgustme 3 місяці тому

    Can we parse timestamp back from this new guid easily?

  • @ashbjorn
    @ashbjorn 3 місяці тому +1

    How does this compare to a Sequential Guid?

    • @JW-lv1fm
      @JW-lv1fm 3 місяці тому

      A guid can be generated on any system anywhere, and mostly guarantee no duplicate key.
      A sequential requires knowing the last value, forcing you to lookup from the database as well.

    • @Jallenbah
      @Jallenbah 3 місяці тому +1

      @@JW-lv1fm No it doesn't. Sequential GUID is generally produces a portion of the data using a complete UTC timestamp accurate to the millisecond, so there is no need at all to know the last value.

    • @JW-lv1fm
      @JW-lv1fm 3 місяці тому

      @@Jallenbah Ahh, my bad. I didn't know "Sequential GUID" was a special name for an implementation. Thought he was using the general term. I suppose he did uppercase it to indicate.

  • @OhhCrapGuy
    @OhhCrapGuy 3 місяці тому

    Well, this is better than making CreateDate my Clustered key and making my ID primary key non-clustered... Which is very much not the solution for a lot of problems, but it has worked on occasion...

  • @Stoney_Eagle
    @Stoney_Eagle 3 місяці тому

    Great... Just built my whole database and its logic using ULID's...

  • @Grigorii-j7z
    @Grigorii-j7z 3 місяці тому

    Such a niche feature. GUIDs are fine in 99.9% of scenarios.

  • @SmoothNanners
    @SmoothNanners 3 місяці тому

    How much benefit does it provide over v4 in a PostgreSQL table for insertions, finds, etc if the primary column is of uuid type that uses b-tree index and you're not ever sorting that column? I read that column type is pretty optimized.

    • @michaelkopljan578
      @michaelkopljan578 3 місяці тому

      PK is always Cluster Key if is not told otherwise by table generation. You should read what is Cluster Key and how table pages are working, what is fill factor and etc.... GUID is worst PK you could think about it as it produce fragmation all the time in pages

    • @diamondkingdiamond6289
      @diamondkingdiamond6289 3 місяці тому

      @@michaelkopljan578this is not true, the primary key in PostgreSQL is NOT a clustered index.

    • @diamondkingdiamond6289
      @diamondkingdiamond6289 3 місяці тому

      @@michaelkopljan578this is not true, the primary key is NOT a clustered index in PostgreSQL

    • @diamondkingdiamond6289
      @diamondkingdiamond6289 3 місяці тому +1

      @@michaelkopljan578this is not true. Primary keys are NOT clustered indexes in PostgreSQL. Also Nick, before you delete my comment again, do some research for gods sake. (If that is you)

    • @HappyCheeryChap
      @HappyCheeryChap 3 місяці тому

      Regardless of speed... v4 also uses more storage space in indexes. Both on disk and in RAM. That was enough for me to switch to a sequential version alone.
      I'm actually using my own version where the date and HH:MM are human readable. Loving how practical and convenient that is.

  • @JohanNordberg
    @JohanNordberg 3 місяці тому +1

    Don't think I've ever sorted something on PK. Don't really see how using a GUID as a PK with an unique index and a timestamp column with a sorted index would be a performance problem.
    Imo there is something wrong with the design if PK's are used to sort on creation order.

    • @JollyGiant19
      @JollyGiant19 3 місяці тому +1

      It’s very common with int PKs in my experience. This lets one have the ergonomics of int PKs with the single round trip insertion performance of GUIDs. That’s all really.

    • @pilotboba
      @pilotboba 3 місяці тому +1

      PK by definition is a sort.. and in SQL Server, it isn't only a sort, it is the order the rows are written to disk. So, if you aren't using a highly non-unique ever increasing value for it, you are getting a high amount of table fragmentation. Also, every lookup in a db eventually uses the PK.
      There are ways around it, mentioned in the video, turning off clustering on the guid pk and creating a clustered index on some other field.
      In SQL Server, if you a table without an index it is called a heap and not as performant to retrieve data from.

  • @Grimlock1979
    @Grimlock1979 3 місяці тому +3

    4:24 Those are not ordered, though.

    • @omriliad659
      @omriliad659 3 місяці тому

      was looking for that comment. it really needs an emphasis that while it aims to be sorted, it will not be sorted within the same second

  • @JulienMarpault
    @JulienMarpault 3 місяці тому

    Impossible to find this new method in preview 6. I found the code in the sources on Github. Did you used a nightly for this video ?

    • @nickchapsas
      @nickchapsas  3 місяці тому

      I'm on preview 7 but I thought they added it in preview 6

    • @JulienMarpault
      @JulienMarpault 3 місяці тому

      @@nickchapsas Nope, your a bit early :) Preview 7 will be released in one month, we will have to be patient.

  • @ekeke7125
    @ekeke7125 3 місяці тому

    So basically using GUIDv4 means I don't care about time order, using GUIDv7 means I care about time order.
    I honestly would never go this way. If the data I have require a primary key and then later I need ability to sort by insertion time, that's basically a change of requirement, I'd rather implement Inserted as a separate datetime/timestamp column

  • @winstochurgle9133
    @winstochurgle9133 3 місяці тому

    All I understood from the video is that the UUID takes longer to create than the standard GUID. Then what is the advantage of UUID?

  • @Grudzin7
    @Grudzin7 3 місяці тому

    If you change half of the digits then you lose a lot of possible combinations, going from 2^128 ~3,4e+38 (time independent) to 2^64 ~1,8e+19 for each millisecond. So while inserting a milion records at once (assuming all have same unix timestamp) there is around 2.7e-8 chance of the any collision. It's small, but significantly bigger than original 1.4e-27 with "old" guid.

    • @JW-lv1fm
      @JW-lv1fm 3 місяці тому +1

      That's okay, if you have a collision - just wait 1 ms and try again. :) Those odds get reset every ms.

    • @Grudzin7
      @Grudzin7 3 місяці тому

      @@JW-lv1fm Not really... The case I'm referring to is when you insert a milion records in a single transaction (that's why all records have the same timestamp) and single collision will invalidate whole transaction and force rollback changes. This can be costly. Although it's not the case for most of the applications it may happen for example in some migration or integration apps and cause some trouble.

    • @JW-lv1fm
      @JW-lv1fm 3 місяці тому +2

      @@Grudzin7 Feed it an incrementing timestamp? Check for collision via hashset prior to insertion?
      Technically it gets the current timestamp every creation, so if you are batch processing, doing 1m records in the same ms seems unlikely.
      The use case means you can just stick with V4 too.
      Very few migrations/integrations should be inserting 1m in the same transaction, that's bad for many reasons.

    • @Grudzin7
      @Grudzin7 3 місяці тому

      ​@@JW-lv1fm Then is 999 999 good? Or maybe 999 998? .... 800 000? ... 10 000? 100? 2? where is magic barrier after which there is too many records in single transaction? I've never heard of one. Quite opposite - I remember inserting few millions of records in single transaction by myself, not with same timestamp though. In many cases that is the only way you can keep the integrity of the data, even if something fails in the middle - that is whole point of having the transaction. It all depends what you are doing and what the primary goal is. I can think of several valid scenarios to have millions of records with the same timestamp, and for those scenarios this "new" guid significantly increases random probability of failure.
      And checking guids for collision? God, please, no. That's exactly why we have guids (globally unique!) in the first place.

  • @levmatta
    @levmatta 3 місяці тому

    I had a custom Guid implementation that did a similar thing.
    I can say that this is half the problem.
    Storing in the database is also terrible.
    We need 128bit int fields. Or Nick any other ideias? (That work across Oracle, SQL, Postgres)

    • @JW-lv1fm
      @JW-lv1fm 3 місяці тому

      124bit? Just store the first 124bits of this 128bit field.
      You know the first half is time, and the trailing bits are just random.
      I'm curious why 124bit though...

    • @ghaf222
      @ghaf222 3 місяці тому

      You can also store as CHAR(36)?

    • @levmatta
      @levmatta 3 місяці тому

      @@ghaf222 it's what I am doing now. It is terrible!!! All indexes blow up in size and decrease performance

    • @levmatta
      @levmatta 3 місяці тому

      @@JW-lv1fm 128, my error

  • @TomislavBukovac
    @TomislavBukovac 3 місяці тому

    One of guid use cases was to mitigate solvable German tank problem. This is reintroducing it....

    • @phizc
      @phizc 3 місяці тому

      If being able to estimate the number of records is more important than DB performance, v4 is probably better. Nice thing is that you can choose whether to continue using v4 or use v7 if performance is more important.
      Note: I didn't even know that the German tank problem meant before googling..
      They would have to get many samples for IDs generated in a single millisecond to get any estimate of the total number of records, and there could be bursty behavior in the generation.

  • @farshadgoodarzi7077
    @farshadgoodarzi7077 3 місяці тому +1

    I think perhaps Snowflake ids are much more efficient, they are small in size (just a long number), sequential ordered by time just like UUID v7, no fragmentation in database (as UUID v4 has this kind of fragmentation because they are not necessarily sequential)

  • @Daniel15au
    @Daniel15au 3 місяці тому

    Why's they keep the long format for the new version? It could be way more compact if they used base62 encoding (using 0-9, a-z, A-Z, and a few symbols) rather than hex encoding (which is just 0-9 and a-f)

  • @rrahman-h6n
    @rrahman-h6n 3 місяці тому

    What about Ulid?

  • @SheepyTaggerNep
    @SheepyTaggerNep 3 місяці тому

    So it solves the Year 2038 problem by writing the unix time in 48 bits instead of 32?

  • @_miranHorvat
    @_miranHorvat 3 місяці тому

    So guids get predictable?

  • @jorgeisazac
    @jorgeisazac 26 днів тому

    You bring good content, thanks!, Just a little sugestion given your accent is a little funny, how if you try to talk a little bit slower? It would help your listteners understans what you are trying to say that as I told you is very interesting. regards!

  • @roelbrook7559
    @roelbrook7559 3 місяці тому

    Didn't this already happen on SQL Server when you use auto-incrementing Guid types for the ID column? UUID7 is not really "new" - it is already a couple of years old.
    As per the RFC, you still need rand_a. So they seem to take the rand_a from a UUID4 - which I guess is a quick way to implement this.

    • @leppyr64
      @leppyr64 3 місяці тому

      If you let SQL Server generate the guid with NEWSEQUENTIALID, yes. However if you have a number different clients generating their own guids and inserting, no.

  • @amirchip
    @amirchip 3 місяці тому

    What about NanoId and friends?

  • @andriiyustyk9378
    @andriiyustyk9378 3 місяці тому

    That"s good but not perfect because the time resolution is 1ms. I think it would be better if random data had fewer bits, and time - more.
    If you have high loaded app and need to know for sure what the record was inserted first - only UUID v7 ID is not enough

  • @isnotnull
    @isnotnull 3 місяці тому

    is this only for faster insert into?

    •  3 місяці тому

      In SQL Server fragmentation will come with a cost for read also. But as multiple has mentioned, in that case this would not help.

  • @CkW38
    @CkW38 3 місяці тому +3

    7:08 You said 70 instead of 69. I'm disappointed in you Nick.

  • @lalithprasadsrigiriraju
    @lalithprasadsrigiriraju 3 місяці тому

    I am good with Integers

  • @urzaaaaa
    @urzaaaaa 3 місяці тому

    Will this be added to EF core?

    • @pilotboba
      @pilotboba 3 місяці тому +1

      EF core already generates a sequential id by default. You can also create a convention or convertor for Guid type that uses this if you want.

    • @urzaaaaa
      @urzaaaaa 3 місяці тому

      @@pilotboba But it generates integers by default. Guid v7 by default would be nice.

    • @pilotboba
      @pilotboba 3 місяці тому

      @@urzaaaaa
      They might change the default SequentialGuidValueGenerator to guid v7. Perhaps create an issue for this. I will upvote it if you do.

    • @pilotboba
      @pilotboba 3 місяці тому

      @@urzaaaaa If the PK is a Guid it generates a "sequential" guid when you add the entity to the context.

  • @greencol
    @greencol 3 місяці тому

    The current Guid has 122 bits of entropy (i.e 6 of the bits are known/guessable), and this has far less entropy than that, making it unsuitable for e.g. resource IDs in APIs in some circumstances.

    • @JollyGiant19
      @JollyGiant19 3 місяці тому +2

      But that entropy resets every 1ms.

    • @greencol
      @greencol 3 місяці тому

      @@JollyGiant19 The point is that a whole block of bits are now guessable that weren't before. An attacker attempting to guess an ID has a much smaller ID space to search. OWASP guidance for unguessable IDs is 128 random bits, the existing Guid already fell short of this by 6 bits.

  • @Crozz22
    @Crozz22 3 місяці тому

    Doesn't this increase chances of collisions dramatically? Enough to worry about it?

    • @JollyGiant19
      @JollyGiant19 3 місяці тому +2

      Entropy resets every 1ms so it depends on your load. Most services won’t do enough inserting every millisecond for it to matter.

    • @pilotboba
      @pilotboba 3 місяці тому

      @@JollyGiant19 It's "globallly" unique. So not just have to worry about guids in your system. (although that's generally the case).
      But, yea, I'm not worried about a collision with my own system at all.

  • @solidid9368
    @solidid9368 3 місяці тому

    hmm I prefer that the DBMS generates the IDs for me... but nice to know

  • @stevenj8261
    @stevenj8261 3 місяці тому

    Using MassTransit NewId implementation already time ordered

  • @porcinetdu6944
    @porcinetdu6944 3 місяці тому

    Can someone reexplain please, what is the winning it ? The fact that you can order ? I don’t understand why you could not and now you can

    • @JollyGiant19
      @JollyGiant19 3 місяці тому +2

      GUID V4 is random so sorting gives a random output. GUID V7 has the first bits set to a timestamp so when ordered it’ll be the order of insertion (same as sequential PKs) instead of random.

    • @porcinetdu6944
      @porcinetdu6944 3 місяці тому

      @@JollyGiant19 thanks you it is clearer now

  • @MattThomson
    @MattThomson 2 місяці тому

    Here is a shoutout to all the developers in year 10000 using Chat GPT 472 with C#3 using .Net 1337 in the neural software link Microsoft Visual Universe Simulator Debugger (Education Edition) who run into this bug

  •  2 місяці тому

    Guid is the same size as an IPv6 address and they don't seem to be losing sleep over it.

  • @kurtnelle
    @kurtnelle 2 місяці тому

    So it's not globally unique but it should be system unique; which is unique enough. Also
    Developer: No need to worry about this, since in 8,000 years I will be dead
    Science: We found the cure for death!
    Developer: Shit..

  • @markippo
    @markippo 3 місяці тому +2

    To be honest, from rdbs perspective, guids not being naturally sorted is the only one good thing that comes with them. And that's an asset only in highly concurrent environment when the last page contention problem occurs. But there are better (but more complicated) solutions for that. Never use guids, uuids, strings as a table primary keys (clustered). Nowadays apart from high index degradation, lower speed of dml operations (inserts), they're simply bad for a query performance. Each joining predicate costs you 256 bits instead of 64. Finały if someone run an analytical query - that will eat insane amount of ram and for sure introduce a table spooling. Guids are never good, use classic autoincremented ints instead, change them to bigints only if that's really necessary.

  • @MrCommanderKid
    @MrCommanderKid 3 місяці тому

    Cool🎉🎉🎉

  • @RockTheCage55
    @RockTheCage55 3 місяці тому

    in terms of datababes & how much UUIDs screw up indexes ULID & SnowflakeId are still better.

  • @Spirch
    @Spirch 3 місяці тому +1

    this V7 isnt going to go well with sql server, fragmentation hell, sadly

  • @7eper
    @7eper 3 місяці тому +1

    But usually, don't we let the database create the ID?

    • @JollyGiant19
      @JollyGiant19 3 місяці тому +1

      Some systems do, some don’t.

  • @Marfig
    @Marfig 3 місяці тому +1

    Version 7 GUIDs were never a popular option for primary keys, and it was not because people had to generate them through other means 🤭They are actually worse than version 4. Because not only you are going to still have a hugely fragmented index table (SQL server doesn't sort them the way you think), but your primary key first 8 bytes are also theoretically more vulnerable to enumeration attacks with the random data being reduced from 16 to 8 bytes. No performance benefit and reduced security. Don't use version 7 GUIDs as primary keys.

    • @Rcomian
      @Rcomian 3 місяці тому

      do you have any details about the sorting of the guid data type works in SQL server?

    • @Marfig
      @Marfig 3 місяці тому

      @@Rcomian It essentially looks at the byte groups divided by the '-' character and sorts by individual byte in the following arrangement from right to left: It starts at bytes 10th to 15th as big endian, then the 8th and the 9th as big endian, then the 6th and the 7th, then the 4th and 5th, finally the 0th to 3rd. If you look for "SQL Server GUID sort algorithm. Why?", you'll get a discussion around the topic. Because of the way sort happens, version 7 GUIDS are still highly fragmented, not just as much as version 4. So on very large tables with tenths of millions of records you will see a difference when inserting records. But only on this scenario should you consider them (I wouldn't even then if I were you. See discussion on "CUID2 Note on K-Sortable/Sequential/Monotonically Increasing Ids"). Otherwise, a version 4 GUID, or better yet a NanoId or Cuid2 should be your options.