Excellent speaker. No snarkiness, no cheesy clip-art. Slowly eases more deeply into the subject, in a way where you never feel like they skipped a step and are now lost
This has to be the most concise talk I've seen in a long time.
Probably hit a new benchmark for how crisp talks can be without being overbearing.
That book is just awesome! Highly recommended if anyone is interested...
Superb speaker! An excellent author as well.
This dude is a freaking legend. The book is phenomenal. Complex topics explained so well
Such a breath of fresh air after spending tons of time watching marketing-related talks about "new technology XYZ". Thanks! That's awesome!
What a great talk! Such an interesting topic explained in an easy to understand manner. Also, great slides!
What a gem the content, and the delivery. Enlightening. Thanks!!
My understanding of the automerge algorithm:
1. Treat the whole document as a list of characters and give each character a unique identifier that is totally ordered (think: timestamp each event with (event's logical clock, process id) gives a total order among all events)
2. Record each edit and send them to the other concurrently editing user (e.g. Msg={insert character "a" with id 4a after existing character 2a}).
3. Each user applies all the local and incoming/external edits. When a conflict arises, rely on the total order to make a "sensible" resolution (e.g. in case of inserting, "smaller" inserts take precedence over "larger" inserts, so that both users end up with the same string).
It seems to me this algorithm is a special case of a CRDT (on the `insert` method of a `list` data structure, so that concurrent `insert`s are guaranteed to leave the `list` in an identical state on both replicas). In general, we can do this to any method on any data structure, and we want the implementations to be *commutative*, which conveniently ensures "ending up in the same state after applying all operations, even if ops are applied in different order".
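The three steps above can be sketched roughly as follows. This is a minimal illustration in the style of an RGA list CRDT, not Automerge's actual implementation; `ListReplica`, `InsertOp`, and the tuple layout are hypothetical names chosen for the example. It assumes each ID is a (logical counter, replica name) pair, that an operation only arrives after the element it refers to, and it breaks ties by placing the larger ID first (the opposite direction would need a different placement rule; what matters is that every replica uses the same one).

```python
from typing import List, Optional, Tuple

# An element ID is (logical counter, replica name). Python compares tuples
# lexicographically, which gives the total order described above.
ElemId = Tuple[int, str]
# An insert operation: (ID of the new element, character, ID of the element it follows).
# A parent of None means "insert at the head of the list".
InsertOp = Tuple[ElemId, str, Optional[ElemId]]


class ListReplica:
    """One user's copy of the document (hypothetical sketch, inserts only)."""

    def __init__(self) -> None:
        self.elems: List[Tuple[ElemId, str]] = []

    def apply(self, op: InsertOp) -> None:
        new_id, char, parent = op
        if any(eid == new_id for eid, _ in self.elems):
            return  # already applied, so redelivery is harmless
        if parent is None:
            pos = 0
        else:
            # Assumes causal delivery: the parent element is already present.
            pos = next(i for i, (eid, _) in enumerate(self.elems) if eid == parent) + 1
        # Concurrent inserts at the same position: skip over elements whose ID
        # is larger, so every replica settles on the same order.
        while pos < len(self.elems) and self.elems[pos][0] > new_id:
            pos += 1
        self.elems.insert(pos, (new_id, char))

    def text(self) -> str:
        return "".join(char for _, char in self.elems)


# Two users concurrently insert after the same character.
base: InsertOp = ((1, "a"), "H", None)
op_a: InsertOp = ((2, "a"), "i", (1, "a"))
op_b: InsertOp = ((2, "b"), "o", (1, "a"))

r1, r2 = ListReplica(), ListReplica()
for op in (base, op_a, op_b):
    r1.apply(op)
for op in (base, op_b, op_a):  # same ops, different arrival order
    r2.apply(op)
print(r1.text(), r2.text())
```

Both replicas end up with the same string even though the two concurrent inserts arrive in opposite orders, which is the commutativity property the comment describes.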
He talks faster than I can think! Great explanation. The book is also brilliant. Had to read it multiple times to take in all the details :)
I thought I was the only one. 😄
I can see how language models could be used to resolve conflicts in a "natural way". "Hi mom dad!" would become "Hey mom and dad!" or in "Hey everyone folks" it would understand the redundancy and automatically suggest sensible alternatives.
thanks for explaining things clearly! Especially "push is the JS operation for updating an array" (around 28:00). Makes it easier for people unfamiliar with a particular language to understand!
The punch line is 40:28 to 42:28. Very clever.
Mr Kleppmann is awesome
Excellent talk from an excellent speaker.
Pretty nice rundown of CRDTs. I've been using them in non-document-based systems. For example, an order book. In an order book, you have very specific operations, and they map very nicely onto the CRDT ops and types. My system is also running a hybrid consensus/collaborative algorithm because there are cases where the book wants to reject certain changes, but it scales incredibly well.
Fantastic book, and brilliant man 👌
I have greatly enjoyed this speaker's previous talks on append-only logs and others. This talk was fabulous in that it was thought provoking, charismatic and he's out there on stage doing his thing... However I have a few issues regarding many points claimed. Probably the biggest point is CRDTs vs consensus. If one is talking about _bitcoin_ and its hard-coded consensus, then yes, they are only superficially similar. But consensus in general is an algorithm for choosing the next state (not just the next block). The last section about automerge basically delineates their naive (meant literally, not pejoratively) conflict resolution for edge cases. But their algorithm essentially is another hard-coded consensus with implied limitations on state shapes and the transformations allowed as a tradeoff for the mathematical backing of the expected deterministic outcomes.
Still a very good talk that is helping me get where I need to go, so a big Thank You!
Really nice talk!
brilliant speaker
Excellent !!
I was waiting for an example of "User One changes property X of entity A, while user Two deletes entity A completely" - but alas...
how do you stop 'breaking changes' on code where multiple editors are trying to fix a bug.
simply thanks
Is this Tom Scott dev version?
Also, this dude seems cool to work with.
Great.. Thanks !
The comments from 10:00 to 12:00 about operational transformation (OT) are out of date. The correctness problem of OT was solved around 2006, which was published in a 2010 JCSCW paper by Li & Li. A family of OT algorithms, called ABT*, was developed by Shao & Li from 2009 to 2011. More details about this line of work are provided in a short essay written in 2011, which can be found on my LinkedIn profile.
can you please share your Linkedin profile?
Is there any kind of immutable api for Automerge?
what happens if the current data for both users is A=1,
User1 deletes/expires the record with key A
and User2 updates/upserts the record with key A to value 2?
ah ic, 41:00 by assigning priority of the writer
So, in a conflict situation the data from the higher node ID always goes first. I propose that the ID should be a GUID to enable nodes to join without the need of any centralized server.
For your consideration,
Ben (Node: ffffffff-ffff-ffff-ffff-ffffffffffff)
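A rough sketch of the proposal above: each node picks a random UUID as its ID without asking any server, and ties between equal counters are broken by comparing the IDs. The `Register` class and `new_node_id` helper are hypothetical names for illustration, and comparing UUIDs as canonical lowercase strings is an assumption of this sketch, not something the talk specifies.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Tuple

# (logical counter, node ID as canonical lowercase UUID text)
Stamp = Tuple[int, str]


def new_node_id() -> str:
    """Generate a node ID locally; no central server is involved."""
    return str(uuid.uuid4())


@dataclass
class Register:
    """A single value where the write with the larger stamp wins (hypothetical sketch)."""
    value: Any = None
    stamp: Stamp = (0, "")

    def merge(self, value: Any, stamp: Stamp) -> None:
        # Tuples compare counter first, then node ID, so equal counters
        # are resolved in favour of the higher node ID.
        if stamp > self.stamp:
            self.value, self.stamp = value, stamp


ben = "ffffffff-ffff-ffff-ffff-ffffffffffff"
other = new_node_id()

r1, r2 = Register(), Register()
r1.merge("from other", (1, other))
r1.merge("from Ben", (1, ben))
r2.merge("from Ben", (1, ben))  # same writes, opposite arrival order
r2.merge("from other", (1, other))
print(r1.value, "|", r2.value)
```

With this tie-break rule, a node that picks the all-f ID wins every conflict where the counters are equal, which is presumably the joke in the signature.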
Actually, you could store the change log in a blockchain if you want it to be censorship resistant and guaranteed immutable.
Immutable, but not consistent from each user's input. Imagine typing a document with a 15s to 15min lag: not possible. You've got a good idea, if you don't realize how slow it would actually be. If you know some about BC stuff you also hear him saying that side channels are NOT allowed.
@@rallokkcaz Someone is replying to my 2y old comment. I am delighted :D I've changed my view on BCs fundamentally in the meantime. They are not good for anything but money.
At 32:42: how about using "time" as a way to decide which change to pick? So the change that was made later gets preferred?
But according to whose clock?
@@tanders12 why not unix time?
@@DaraulHarris I believe that is a subject of debate, as the speaker said. Sure thing, you can choose the latest for conflict resolution, but it wouldn't be the perfect strategy for all cases. You never know what's better because it is up to the business rules to decide what the expected behavior is. What if the discarded early change was really important? Simply taking the latest wouldn't be the way to go.
@@DaraulHarris How do you make sure that everyone's clocks are correct? Clock skew is a major problem in distributed systems.
While this may sound like a good idea, it would introduce unnecessary dependence and, with that, edge cases that could mess everything up. The main benefit of CRDTs is that (finally!) there's no need to worry about such.
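To illustrate the "whose clock?" replies in this thread: a logical (Lamport) clock orders events without reading wall-clock time at all, so clock skew cannot affect the result. This is a minimal sketch; the `LamportClock` class and its method names are made up for the example.

```python
from typing import Tuple

Stamp = Tuple[int, str]  # (logical counter, node ID)


class LamportClock:
    """Logical clock: counts events instead of reading the system time."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counter = 0

    def tick(self) -> Stamp:
        # Called for every local change; returns that change's timestamp.
        self.counter += 1
        return (self.counter, self.node_id)

    def observe(self, stamp: Stamp) -> None:
        # Called when a remote change arrives: never fall behind what we have seen.
        self.counter = max(self.counter, stamp[0])


a, b = LamportClock("a"), LamportClock("b")
first = a.tick()       # user A makes a change
b.observe(first)       # user B receives it...
reply = b.tick()       # ...and then makes a change of their own
concurrent = a.tick()  # A edits again without having seen B's change

print(first < reply)       # B's change is ordered after the one it saw
print(concurrent < reply)  # equal counters, so the node ID breaks the tie
```

A change made after seeing another always gets a larger timestamp, however wrong either machine's system clock is. Concurrent changes still need a tie-break, and as noted above, whether that choice is the right one depends on the application.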