thank you for this detailed explanation. my professors were terrible at online teaching and explaining what this algorithm is and what it is good for. Its awesome you do live session, use chat and twitch. Thats how digital learning is supposed to work in 2020!
Thanks for your great explanation and analysis! I'm having distributed systems this semester and none of the tutorials were better than yours. You saved my life!
If i could not find this video i would never going to understand this topic. This is so best and so interactive Thank you so much for clearing up the things
Thank you for these lectures, Professor. Been loving them. I had a couple of questions w.r.t the example starting at 37:00: 1(a). After P1 records its state, it starts to record messages on all its incoming channels. In addition to this, wouldn't P1 also record events that it executed locally (C for instance) until it receives marker messages from P2 and P3? 1(b). At 49:00, which is roughly the point at which it has heard back from its peer processes, we see that H->D is part of the snapshot. But since C happened before D, C must be part of the snapshot too. If that's the case, why does P1 even need to take a first snapshot of its state at the beginning (S1)? Why not wait until it hears back from the last of its peers before taking one single snapshot that captures all changes until that point? What's the significance of that initial recording of states? 2. Is P1 allowed to send non-marker messages as well over the network in the time it waits after sending its marker messages? For example, could C have been a message over the network to either P2 or P3? Is that allowed? 3. Also, what does it mean to "record messages from an incoming channel"? If message H-D for example denotes a request that requires A to execute an operation (maybe an RPC) on itself which would change its state, would "recording" this message mean only noting the request in a queue without actually executing it? How is it different from "recording the state" (for example, S1)?
"In addition to this, wouldn't P1 also record events that it executed locally (C for instance) until it receives marker messages from P2 and P3?" -- No, it doesn't do anything to update its own recorded process state. It only needs to record channel states. "At 49:00, which is roughly the point at which it has heard back from its peer processes, we see that H->D is part of the snapshot. But since C happened before D, C must be part of the snapshot too. If that's the case, why does P1 even need to take a first snapshot of its state at the beginning (S1)? Why not wait until it hears back from the last of its peers before taking one single snapshot that captures all changes until that point? What's the significance of that initial recording of states?" -- There are two issues here. First, it sounds like you think of event D as part of the recorded snapshot, but it isn't really. You should think of the *message* from P2 to P1 as being part of C21's recorded channel state, but that's different from saying that event D is in the snapshot. Second, think about what would happen if every process had to wait to hear back from all the other processes before it could snapshot its own state. "Is P1 allowed to send non-marker messages as well over the network in the time it waits after sending its marker messages? For example, could C have been a message over the network to either P2 or P3? Is that allowed?" -- Sure! One of the really nice thing about the Chandy-Lamport algorithm is that it still works even if application messages (non-marker messages) are being sent and received at the same time as the snapshot is being taken. "Also, what does it mean to "record messages from an incoming channel"? If message H-D for example denotes a request that requires A to execute an operation (maybe an RPC) on itself which would change its state, would "recording" this message mean only noting the request in a queue without actually executing it? How is it different from "recording the state" (for example, S1)?" -- Well, I'm not sure what you mean by "requires A to execute an operation"; maybe you meant "requires P1 to execute an operation"? If so, then sure, the operation could be actually executed on P1, but any resulting event on P1 would not be part of the recorded process state. The recorded channel state on C21, however, would include the contents of the message. (This is discussed more at the end of my blog post here: decomposition.al/blog/2019/04/26/an-example-run-of-the-chandy-lamport-snapshot-algorithm/ .)
Thank you again. Will the entire course be posted on UA-cam? Will be there "practice" lectures or a way to create a distributed system to solve a real problem? Maybe using Python or some programming language?
The course has a significant programming project (decomposition.al/CSE138-2020-03/course-overview.html#course-project) that gives students a chance to put into practice some of the concepts discussed in lecture. We generally don't discuss the details of the programming assignments much (if at all) during lecture, though. I'll be posting all of my lectures as they happen, yes. Not sure about guest lectures yet.
@@lindseykuperwithasharpie Thank you for making this course available for all. I checked the link you provided above (composition.al/CSE138-2020-03/course-overview.html#course-project) but it seems we cannot reach the programming assignments anymore right?
In the explanation of the example that you give (23:50) there is something confusing. You say, P1 has seen the marker already and so it stops the recording on C_21. But, the very first marker that P1 sees actually the marker from P2 (as a response to the marker sent by P1).
Good question! Actually, sending a marker message counts as "seeing" a marker message. So, when P1 gets the marker message from P2, it has already "seen" one marker message (the one it sent in the first place). My use of the word "see" instead of "receive" is intentional here.
@@lindseykuperwithasharpie I thought, maybe marker message from P2 to P1 (as a reply to P1's marker message) contains the snapshot of its state + contents of the marker message from P1. Thus P1 assumes this message as "seen before". Thanks much for taking your time and replying the message Lindsey :)
My blog post at decomposition.al/blog/2019/04/26/an-example-run-of-the-chandy-lamport-snapshot-algorithm/, which is about this same example, may help (especially the discussion at the end).
Good evening (our clocks are not synchronised though hence [1, 0] ), professor! Jokes apart, this is extremely good content available on Distributes Systems. Thanks a lot for these videos and knowledge!! Which book would you recommend to supplement these videos?
Thanks for watching. I don't think there's a textbook that covers all the bases, which is why this course doesn't have a textbook. I do recommend Martin Kleppmann's "Designing Data-Intensive Applications" for practical distributed systems knowledge, but it doesn't cover all the under-the-hood topics that we cover (e.g., snapshot algorithms, consensus algorithms, consistent hashing); for those things, there's lots of good supplementary material that I've been giving pointers to during the lectures. Is there a specific topic that you want more reading material on?
@@lindseykuperwithasharpie Thanks for the reply, professor! Not a specific topic as such but I wanted to take cloud computing course after this so wanted to understand the distributed systems very well before that. Will check out the supplementary material you've pointed out in the lectures. Thanks again!
Timestamp 26:30 What exactly happened to point D here ? I understand that "The final state of channel C21 is set to the sequence of messages that arrived whilst recording was active" but why did we not take a separate snapshot at Point D? Also, can you please paste the link of the blog post you talked about in the video at timestamp 35:33?
Event D isn't included in P1's recorded process state because P1 recorded its process state prior to event D happening. However, the recorded state of channel C21 includes the message that arrives at D later. You can think of the snapshot as capturing the moment during which that message was in flight from P2 to P1. The blog post is decomposition.al/blog/2019/04/26/an-example-run-of-the-chandy-lamport-snapshot-algorithm/ .
The confusing part is in case one: Start recording incoming messages on incoming channels except C(ki) and in case two: Pi stops recording on channel C(ki) Why does Pi stop recording on channel C(ki), when has it even started?
In case two, this isn't the first marker message that P_i has seen. This means that P_i must either be an initiator, or it must have already received a first marker message. If it's an initiator, then it was recording on all its incoming channels. If it's already received a marker message previously, then it was recording on all its incoming channels except the one(s) on which it's already received a marker message.
Sure, let's say channel C_ki has some other messages in it. Since channels are FIFO, those messages must have been sent by P_k *after* P_k sent a marker message to P_i. And P_k snapshotted its own process state immediately *before* sending that marker message, so there's no record in P_k's part of the snapshot of those messages being sent. Therefore, we don't want P_i's part of the snapshot to record the messages being *received*, because if it did, the snapshot would be inconsistent. Life goes on after a snapshot is taken. Messages keep being sent and received. The snapshot doesn't capture all of these messages; that's the nature of a snapshot.
Chandy Lamport algo started 7:20
thank you for this detailed explanation. my professors were terrible at online teaching and explaining what this algorithm is and what it is good for. Its awesome you do live session, use chat and twitch. Thats how digital learning is supposed to work in 2020!
Thanks for watching!
Thanks for your great explanation and analysis! I'm having distributed systems this semester and none of the tutorials were better than yours. You saved my life!
Thanks for watching.
If i could not find this video i would never going to understand this topic. This is so best and so interactive Thank you so much for clearing up the things
Thanks a lot. this helped clear lot of my doubts
thank you for explanation
Thanks for watching.
Thank you for these lectures, Professor. Been loving them. I had a couple of questions w.r.t the example starting at 37:00:
1(a). After P1 records its state, it starts to record messages on all its incoming channels. In addition to this, wouldn't P1 also record events that it executed locally (C for instance) until it receives marker messages from P2 and P3?
1(b). At 49:00, which is roughly the point at which it has heard back from its peer processes, we see that H->D is part of the snapshot. But since C happened before D, C must be part of the snapshot too. If that's the case, why does P1 even need to take a first snapshot of its state at the beginning (S1)? Why not wait until it hears back from the last of its peers before taking one single snapshot that captures all changes until that point? What's the significance of that initial recording of states?
2. Is P1 allowed to send non-marker messages as well over the network in the time it waits after sending its marker messages? For example, could C have been a message over the network to either P2 or P3? Is that allowed?
3. Also, what does it mean to "record messages from an incoming channel"? If message H-D for example denotes a request that requires A to execute an operation (maybe an RPC) on itself which would change its state, would "recording" this message mean only noting the request in a queue without actually executing it? How is it different from "recording the state" (for example, S1)?
"In addition to this, wouldn't P1 also record events that it executed locally (C for instance) until it receives marker messages from P2 and P3?" -- No, it doesn't do anything to update its own recorded process state. It only needs to record channel states.
"At 49:00, which is roughly the point at which it has heard back from its peer processes, we see that H->D is part of the snapshot. But since C happened before D, C must be part of the snapshot too. If that's the case, why does P1 even need to take a first snapshot of its state at the beginning (S1)? Why not wait until it hears back from the last of its peers before taking one single snapshot that captures all changes until that point? What's the significance of that initial recording of states?" -- There are two issues here. First, it sounds like you think of event D as part of the recorded snapshot, but it isn't really. You should think of the *message* from P2 to P1 as being part of C21's recorded channel state, but that's different from saying that event D is in the snapshot. Second, think about what would happen if every process had to wait to hear back from all the other processes before it could snapshot its own state.
"Is P1 allowed to send non-marker messages as well over the network in the time it waits after sending its marker messages? For example, could C have been a message over the network to either P2 or P3? Is that allowed?" -- Sure! One of the really nice thing about the Chandy-Lamport algorithm is that it still works even if application messages (non-marker messages) are being sent and received at the same time as the snapshot is being taken.
"Also, what does it mean to "record messages from an incoming channel"? If message H-D for example denotes a request that requires A to execute an operation (maybe an RPC) on itself which would change its state, would "recording" this message mean only noting the request in a queue without actually executing it? How is it different from "recording the state" (for example, S1)?" -- Well, I'm not sure what you mean by "requires A to execute an operation"; maybe you meant "requires P1 to execute an operation"? If so, then sure, the operation could be actually executed on P1, but any resulting event on P1 would not be part of the recorded process state. The recorded channel state on C21, however, would include the contents of the message. (This is discussed more at the end of my blog post here: decomposition.al/blog/2019/04/26/an-example-run-of-the-chandy-lamport-snapshot-algorithm/ .)
Thank you again. Will the entire course be posted on UA-cam? Will be there "practice" lectures or a way to create a distributed system to solve a real problem? Maybe using Python or some programming language?
The course has a significant programming project (decomposition.al/CSE138-2020-03/course-overview.html#course-project) that gives students a chance to put into practice some of the concepts discussed in lecture. We generally don't discuss the details of the programming assignments much (if at all) during lecture, though.
I'll be posting all of my lectures as they happen, yes. Not sure about guest lectures yet.
@@lindseykuperwithasharpie Thank you for informing me.
@@lindseykuperwithasharpie Thank you for making this course available for all. I checked the link you provided above (composition.al/CSE138-2020-03/course-overview.html#course-project) but it seems
we cannot reach the programming assignments anymore right?
@@baristuncer7597 The programming assignments have never been public; sorry.
In the explanation of the example that you give (23:50) there is something confusing. You say, P1 has seen the marker already and so it stops the recording on C_21. But, the very first marker that P1 sees actually the marker from P2 (as a response to the marker sent by P1).
Good question! Actually, sending a marker message counts as "seeing" a marker message. So, when P1 gets the marker message from P2, it has already "seen" one marker message (the one it sent in the first place). My use of the word "see" instead of "receive" is intentional here.
@@lindseykuperwithasharpie I thought, maybe marker message from P2 to P1 (as a reply to P1's marker message) contains the snapshot of its state + contents of the marker message from P1. Thus P1 assumes this message as "seen before".
Thanks much for taking your time and replying the message Lindsey :)
Nope, marker messages don't contain state snapshots or anything like that.
27:37 thanks for the video, have a quick question about this part, can you clarify a bit more when you said D will be included in the snapshot
My blog post at decomposition.al/blog/2019/04/26/an-example-run-of-the-chandy-lamport-snapshot-algorithm/, which is about this same example, may help (especially the discussion at the end).
i wish you taught at my university SFU
Good evening (our clocks are not synchronised though hence [1, 0] ), professor! Jokes apart, this is extremely good content available on Distributes Systems. Thanks a lot for these videos and knowledge!! Which book would you recommend to supplement these videos?
Thanks for watching. I don't think there's a textbook that covers all the bases, which is why this course doesn't have a textbook. I do recommend Martin Kleppmann's "Designing Data-Intensive Applications" for practical distributed systems knowledge, but it doesn't cover all the under-the-hood topics that we cover (e.g., snapshot algorithms, consensus algorithms, consistent hashing); for those things, there's lots of good supplementary material that I've been giving pointers to during the lectures. Is there a specific topic that you want more reading material on?
@@lindseykuperwithasharpie Thanks for the reply, professor! Not a specific topic as such but I wanted to take cloud computing course after this so wanted to understand the distributed systems very well before that. Will check out the supplementary material you've pointed out in the lectures. Thanks again!
Timestamp 26:30
What exactly happened to point D here ? I understand that
"The final state of channel C21 is set to the sequence of messages that arrived whilst recording was active"
but why did we not take a separate snapshot at Point D?
Also, can you please paste the link of the blog post you talked about in the video at timestamp 35:33?
Event D isn't included in P1's recorded process state because P1 recorded its process state prior to event D happening. However, the recorded state of channel C21 includes the message that arrives at D later. You can think of the snapshot as capturing the moment during which that message was in flight from P2 to P1.
The blog post is decomposition.al/blog/2019/04/26/an-example-run-of-the-chandy-lamport-snapshot-algorithm/ .
The confusing part is in case one: Start recording incoming messages on incoming channels except C(ki) and in case two: Pi stops recording on channel C(ki)
Why does Pi stop recording on channel C(ki), when has it even started?
In case two, this isn't the first marker message that P_i has seen. This means that P_i must either be an initiator, or it must have already received a first marker message. If it's an initiator, then it was recording on all its incoming channels. If it's already received a marker message previously, then it was recording on all its incoming channels except the one(s) on which it's already received a marker message.
I am a little confused about “Pi marks channel Cki as empty” at 29:12, what do we do if Cki has some other events?
Sure, let's say channel C_ki has some other messages in it. Since channels are FIFO, those messages must have been sent by P_k *after* P_k sent a marker message to P_i. And P_k snapshotted its own process state immediately *before* sending that marker message, so there's no record in P_k's part of the snapshot of those messages being sent. Therefore, we don't want P_i's part of the snapshot to record the messages being *received*, because if it did, the snapshot would be inconsistent.
Life goes on after a snapshot is taken. Messages keep being sent and received. The snapshot doesn't capture all of these messages; that's the nature of a snapshot.
@@lindseykuperwithasharpie that helps a lot, thanks ; )
Mam , what should be done if at the termination we want all channel states to be empty?
I think you'd need a different snapshot algorithm in that case.
20:13