🚀 neetcode.io/ - I created a FREE site to make interview prep a lot easier, hope it helps! ❤
nice initiative
you have no idea how helpful your videos have been to me
I went through two articles and two YouTube videos before I realized you have a video on Prim's Algorithm (yes, I'm kind of oblivious)
and then it became pretty clear to me.
Thank you man, cheers
Excellent video. The main takeaway is that it's just like BFS with a priority queue instead of the regular queue; also, you'll be pushing duplicate entries, so the heap can grow to about n^2 entries and the total cost of popping from the min-heap is O(n^2 * log n).
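To make the "BFS with a priority queue" analogy concrete, here's a rough side-by-side sketch (mine, not from the video) of the two traversal loops over some adjacency list `adj`; the only structural changes are the queue type, tracking a running cost, and skipping stale duplicate pops:
```
import heapq
from collections import deque

def bfs(adj, src):
    # FIFO queue: explore nodes in order of discovery
    seen, q, order = {src}, deque([src]), []
    while q:
        node = q.popleft()
        order.append(node)
        for nei in adj[node]:
            if nei not in seen:
                seen.add(nei)
                q.append(nei)
    return order

def prim(adj, src):
    # Same loop shape, but adj[node] = [(weight, nei), ...] and a min-heap
    # always hands back the cheapest frontier edge; duplicate entries pushed
    # for the same node are skipped when popped, so the heap can hold ~n^2 items.
    seen, total, heap = set(), 0, [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node in seen:
            continue
        seen.add(node)
        total += cost
        for w, nei in adj[node]:
            if nei not in seen:
                heapq.heappush(heap, (w, nei))
    return total
```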
As other commenters have mentioned, we don't need to generate all edges ahead of time. It's more efficient to track the 'remaining' points and only generate edges for points that still need to be visited: for each remaining unvisited node, we calculate the edge cost and push it to the min-heap directly. Here's my code (don't get hung up on the lambda, it's just an inline function):
def minCostConnectPoints(self, points: List[List[int]]) -> int:
    manhattan = lambda p1, p2: abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])
    pq = [(0, tuple(points[0]))]
    mincost = 0
    toVisit = set([tuple(x) for x in points])
    while pq:
        curCost, curNode = heapq.heappop(pq)
        if curNode not in toVisit:
            continue
        toVisit.remove(curNode)
        mincost += curCost
        if len(toVisit) == 0:
            break
        for n in toVisit:
            heapq.heappush(pq, (manhattan(curNode, n), n))
    return mincost
Genius!
Small suggestion, you can avoid this line: "if len(toVisit) == 0: break"
If your while loop condition is instead:
```
while toVisit:
```
@Naveenslab
In this LeetCode problem, the min-heap is just a tool to get the next neighboring edge with minimum cost. As soon as we've visited all the points, the algorithm is finished and the min-heap can be discarded, because any remaining elements we didn't pop are bigger than the minimum. Remember, we are trying to find the minimum cost among points; we don't care about all possible costs, since those would also include costs larger than the minimum possible answer.
Why is it more efficient? Looks like it's the same complexity.
@DavidDLee
In the original solution, you're generating all possible edges for every point.
By doing it like this, you're only generating edges between your curNode and the ones in "toVisit". The larger the length of points, the more time you'll end up saving.
@iamthecondor I see. But it's still the same complexity.
I am so used to your coding style that I coded this question up by myself in C++ and it's the same, literally the same as how you did it. Great list.
The tutorial is totally smooth and helpful.
You are definitely one of my best DS and algo online mentors for job hunting (the other is the genius Erik Demaine ...)
Again, appreciate your knowledge sharing and thank you so much!!!
Precalculating all the edge lengths is quite time consuming. The distance from one vertex to every remaining vertex should be calculated inside the while loop instead, since the number of remaining vertices shrinks after each step.
He explained at least 3 times in the video that he wanted to keep the Prim's portion separate from the parts not related to Prim's.
18:48
You are the go-to channel whenever I want to learn anything
Great walkthrough - LC Explore does a good job of explaining algorithms, but doesn't do anything for implementing. Walking through with code was very helpful.
This channel is the best when it comes to LC solutions. Could you please make a video on "Subarray sum equals K" and "Maximum size subarray sum equals k". These problems are based on the concept of hashmap in 2 sum, yet when it comes to implementation, they are not that trivial.
bro i have struggled soo much with these.
Yes, I will try to do them in the near future
Since we are finding the MST of a dense graph (a graph with many edges), in this case the complete graph, Prim's algo is more efficient in basically every case.
Thank you for such a detailed video explaining Prim's algorithm and building up the solution.
What a beautiful explanation! I am going to implement this in C++. Thanks for the video!! Learning a lot from your channel and I am also aiming to become a Noogler soon :)
Came across this alternate solution to implement the same Prim's algo
This one is simpler :)
--------------------------------------------
d = dict()
# Take the first point and make its cost 0, all others infinity
for i, (x, y) in enumerate(points):
    if i:
        d[(x, y)] = float("inf")
    else:
        d[(x, y)] = 0
ans = 0
# While there are still more points left
while d:
    # Get the point with minimum cost
    x, y = min(d, key=d.get)
    # Add to the running cost and remove the point from the dict of points
    ans += d.pop((x, y))
    # Relax/re-calculate the cost of all remaining points from this point
    for x1, y1 in d:
        d[(x1, y1)] = min(d[(x1, y1)], abs(x - x1) + abs(y - y1))
return ans
Nice solution and explanation! We actually only need one of the two checks for whether a point is already visited; having both doesn't affect the overall TC.
This is such a great and well-explained video.
You can think about the complexity like this: when you have n points optimally connected and want to include another point, you need to consider connecting it from every single already-connected point. So each inclusion of another point costs O(n log n).
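To spell out that count with some rough arithmetic (my own, for the lazy heap-based version where stale duplicate entries are simply skipped on pop):

$$\underbrace{n}_{\text{points added}} \times \underbrace{n}_{\text{edges pushed per point}} \times \underbrace{O(\log n^2)}_{\text{per heap operation}} = O(2\,n^2 \log n) = O(n^2 \log n)$$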
You saved me! Thank you. Forever grateful
You sir, are a legend.
You are an awesome teacher, thank you for your videos ❤️
honestly if you taught a course, i would buy it. i dont normally consider them but you teach different :)
Really grateful for the awesome explanation
Awesome video NeetCode you are best 🎉
OK, can you clarify why we didn't just compare each node with every other node, take the min values, and add them up? That would be n^2. Is it because we want to prevent a cycle?
As of now this solution is getting TLE on LC because of a huge test case.
To avoid it, calculate the Manhattan distance inside the Prim's algorithm part, and only keep the neighbour indices in the adjacency hashmap.
adjList = {i: [] for i in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        adjList[i].append(j)
        adjList[j].append(i)

for nei in adjList[node]:
    x1, y1 = points[node]
    x2, y2 = points[nei]
    dist = abs(x1 - x2) + abs(y1 - y2)
    if nei not in visit:
        heapq.heappush(minHeap, [dist, nei])

Everything else can stay the same!
import java.util.*;

class point {
    int x;
    int y;

    public point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    @Override
    public boolean equals(Object obj) {
        point p = (point) obj;
        if (this.x == p.x && this.y == p.y) {
            return true;
        }
        return false;
    }
}

class edge {
    point p1;
    point p2;
    int weight;

    public edge(point p1, point p2, int weight) {
        this.p1 = p1;
        this.p2 = p2;
        this.weight = weight;
    }
}

class Solution {
    public int prims(HashMap<point, List<point>> adjList, HashSet<point> visited,
                     PriorityQueue<edge> minHeap, int sum) {
        edge top = minHeap.poll();
        if (top == null) {
            return sum;
        }
        while (visited.contains(top.p2)) {
            top = minHeap.poll();
            if (top == null) {
                return sum;
            }
        }
        sum += top.weight;
        point start = top.p2;
        visited.add(start);
        List<point> neighbours = adjList.get(start);
        neighbours.forEach((neighbour) -> {
            if (!visited.contains(neighbour)) {
                int weight = Math.abs(start.x - neighbour.x) + Math.abs(start.y - neighbour.y);
                minHeap.add(new edge(start, neighbour, weight));
            }
        });
        return prims(adjList, visited, minHeap, sum);
    }

    public int minCostConnectPoints(int[][] points) {
        if (points.length == 1) {
            return 0;
        }
        HashMap<point, List<point>> adjList = new HashMap<>();
        PriorityQueue<edge> minHeap = new PriorityQueue<>((e1, e2) -> {
            return Integer.compare(e1.weight, e2.weight);
        });
        for (int i = 0; i < points.length; i++) {
            for (int j = 0; j < points.length; j++) {
                if (i != j) {
                    List<point> temp = adjList.getOrDefault(new point(points[i][0], points[i][1]),
                            new ArrayList<>());
                    temp.add(new point(points[j][0], points[j][1]));
                    adjList.put(new point(points[i][0], points[i][1]), temp);
                }
            }
        }
        minHeap.add(new edge(null, new point(points[0][0], points[0][1]), 0));
        int sum = 0;
        HashSet<point> visited = new HashSet<>();
        sum = prims(adjList, visited, minHeap, sum);
        return sum;
    }
}
very helpful! clear explanation!
Thank you for your video, this is very valuable! There is an optimized Prim's method in the LeetCode solutions. The time complexity can be optimized to O(N^2). Could you talk a little about that?
As others have pointed out in the comment section, you can track the remaining points (instead of the frontier set). In this particular question, the distance to each remaining point needs to be re-evaluated every time a new point is added to the MST. The trick to achieving O(n^2) is to track the next closest point during that re-evaluation, such that getting the next closest remaining point in each iteration is O(1).
def minCostConnectPoints(self, points: list[list[int]]) -> int:
    if not points:
        return 0
    total_dist = 0
    # invariant: the last element in this list will have the minimum distance
    distance_and_points: list[list[int or float, int, int]] = [[float('inf'), p[0], p[1]] for p in points]
    distance_and_points[-1][0] = 0
    while distance_and_points:
        dist, x, y = distance_and_points.pop()
        total_dist += dist
        if not len(distance_and_points):
            break
        # update distance of all remaining points
        min_dist_idx = 0
        for i in range(len(distance_and_points)):
            dist2, x2, y2 = distance_and_points[i]
            distance_and_points[i][0] = min(dist2, abs(x2 - x) + abs(y2 - y))
            if distance_and_points[i][0] < distance_and_points[min_dist_idx][0]:
                min_dist_idx = i
        # move the closest point to the end of distance_and_points by swapping
        distance_and_points[-1], distance_and_points[min_dist_idx] = distance_and_points[min_dist_idx], distance_and_points[-1]
    return total_dist
You can calculate distances on the fly.
You don't have to build the full adjacency list, because Prim's is greedy. Every time you add a node, you don't need anything related to it for the next iterations...
Just treat all nodes as unconnected components and start connecting them using the minimum distance from the frontier.
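A minimal sketch of that idea (mine, not the video's code), keeping only a set of remaining point indices and computing Manhattan distances inside the loop, so no adjacency list is ever built:
```
import heapq

def min_cost_connect_points(points):
    n = len(points)
    remaining = set(range(1, n))  # indices not yet connected; point 0 starts the tree
    heap = [(abs(points[0][0] - points[j][0]) + abs(points[0][1] - points[j][1]), j)
            for j in remaining]
    heapq.heapify(heap)
    total = 0
    while remaining:
        cost, i = heapq.heappop(heap)
        if i not in remaining:
            continue  # stale entry, i was already connected via a cheaper edge
        remaining.remove(i)
        total += cost
        # push fresh distances from the newly connected point to whatever is left
        for j in remaining:
            d = abs(points[i][0] - points[j][0]) + abs(points[i][1] - points[j][1])
            heapq.heappush(heap, (d, j))
    return total
```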
I did the same thing as the code in this video in JavaScript and it took 2 seconds to run on LeetCode; changing it to be like Bellman-Ford brought it down to 152 ms. The Bellman-Ford-style complexity on this should be O(n^2) runtime and O(n) memory. I think Prim's algorithm as implemented in this video is O(n^2 + n log n + n^2 log n) runtime (build the adj list, pop the heap n times, push to the heap n*n times), since it pushes to the heap n^2 times in the nested for loop. It would be cheaper to just recalculate the distance to every point than to push to the heap for every element on every iteration. I think proper Prim's algorithm is meant to be different.
Great Explanation !!!
When building the adj list, I thought you'd want to add every other node? But you are only adding nodes j for node i where j > i.
So you did:
for i in range(N):
    for j in range(i + 1, N):
        # build adj list
but instead, should it be:
for i in range(N):
    for j in range(N):
        if i == j:
            continue
        # build adj list
I think I know the answer: because this is an undirected (bidirectional) graph, you don't need to check nodes you already iterated over, since those 'neighbors' were already assigned.
He does exactly what you are thinking, he's just being a little clever about it. We don't have to start from j = 0 each time because he adds two edges per inner-loop iteration: adj[i].append([dist, j]) AND adj[j].append([dist, i]).
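A quick sketch of that symmetric build (mine, following the pattern described above; `manhattan` is just an assumed helper), showing that each unordered pair {i, j} is handled once but recorded in both adjacency lists:
```
def build_adj(points):
    # Each pair (i, j) with j > i is visited once, but the edge is stored
    # in both directions, so adj ends up symmetric anyway.
    manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
    n = len(points)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            dist = manhattan(points[i], points[j])
            adj[i].append([dist, j])
            adj[j].append([dist, i])
    return adj
```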
Thanks for the tutorial. I’d feel more comfortable with matrix indexing tho.
Isn't line 26 redundant, though? Since we are already continuing at line 21 if a node has already been seen. I just checked, and it works just fine when I comment out line 26.
I don't think it's necessary, but it may save some extra while-loop cycles.
Great video! Thank you!
Can you cover other algorithms like Kruskal's, Dijkstra's, etc.?
We wantsssss it, we needsssss it. @Neetcode
Great video. Isn't the time complexity O(n^2 log n^2) instead of O(n^2 log n)? Can't the heap size reach higher than n?
True, but by the properties of logs you can bring the 2 down from the exponent inside the log and make it the coefficient, giving 2 * n^2 * log n. Then the coefficient gets removed, because constant factors don't matter in big-O.
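Written out (just the standard log rule, nothing specific to this problem):

$$O(n^2 \log n^2) = O(n^2 \cdot 2 \log n) = O(n^2 \log n)$$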
For a dense graph like this, an adjacency matrix would be better.
Agreed. Excellent video nonetheless.
As always, thank you !!
Please, could you also someday add a video on Kosaraju's algorithm to the course on your website?
Great video
We push a neighbor only when it's not in the visit set, so in that case do we still have to check again and continue when popping?
Would doing the distance calculations inside Prim's be more optimal? If so, how would we do it?
You are the legend
At 15:35, we add the edge from the 2nd node to the 1st node since it has the min cost of 9, but won't that violate the line "if nei not in visit" (line 26 in the code), since the 1st node is already in the visit set?
The 2nd node with cost 9 was added to the heap while we were exploring the neighbors of the 1st node. And yes, like he said, the same node will be added to the heap several times, but with different costs.
shouldn't we heapify the list first?
your channel is a goldmine
LeetCode doesn't ask for it, but note that the algorithm presented does not output the tree itself, only its cost.
you mentioned that there are some other algorithms for this type of problem, will you be doing videos on those anytime as well?
You can use Union-Find with Kruskal's algorithm for this type of question.
How is it different from Dijkstra's algo? @neetcode
Hello everyone. Please, how can we apply this to an actual question? I have an assignment and went through the video. I understood the first part, but I have issues applying it to my assignment. Many thanks
Isn't your solution O(N^2) instead of O(N^2 log N)? I see it as: creating the adj list takes O(N^2), then Prim's takes O(N log N).
A for-loop inside the while loop plus a heappop = O(n^2 log n).
How is it n^2 * log(n) and not instead n^2 * log(n^2)? If the heap can have n^2 elements in it, each heap operation is going to be log(n^2), no?
Ah, I just (re-)learned that log(n^2) = 2 * log(n) = O(log(n))
How can you start at an arbitrary node instead of starting at the node with the minimum-weight edge connected to it?
Because the first node you select has dist 0, and later you spread out looking for the cheapest edge connected to this node with dist 0.
So nice
Can you drop a link to C++ code for this question?
This looks like O((E + N) log N) where E = N^2, so O(N^2 log N)? Isn't there an O(N log N) solution?
The solution is great, however one of the test cases fails:
`points = [[0,0],[1,1],[1,0],[-1,1]]`
Calculating the edges uses (i+1) in the inner loop, which is causing the issue.
Don't you need to heapify minH?
We use the heapq.heappush and heapq.heappop functions to push and pop. They take care of ensuring the heap is maintained properly.
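For what it's worth, a tiny illustration (assuming the video's pattern of starting the list with a single [cost, node] entry, which is already a valid heap on its own):
```
import heapq

minH = [[0, 0]]              # a one-element list is trivially a valid heap, no heapify needed
heapq.heappush(minH, [7, 2])
heapq.heappush(minH, [3, 1])
print(heapq.heappop(minH))   # [0, 0]  (push/pop maintain the heap property as they go)
print(heapq.heappop(minH))   # [3, 1]
```
You would only need heapq.heapify if you started from an unordered list that already had more than one element in it.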
is this a TSP?
Java solution with a custom min-heap implementation. I intentionally avoided Java's built-in PriorityQueue min-heap, for learning purposes.
import java.util.*;

record HeapValue(int cost, int node) {}

class Solution {
    public int minCostConnectPoints(int[][] points) {
        ArrayList<HeapValue> heap = new ArrayList<>();
        HashMap<Integer, ArrayList<HeapValue>> map = buildMap(points);
        HashSet<Integer> set = new HashSet<>();
        int cost = 0;
        insert(heap, new HeapValue(0, 0));
        while (set.size() < points.length) {
            HeapValue currentNode = remove(heap);
            if (set.contains(currentNode.node())) continue;
            cost += currentNode.cost();
            set.add(currentNode.node());
            if (!map.containsKey(currentNode.node())) continue;
            for (HeapValue neighbor : map.get(currentNode.node())) {
                if (!set.contains(neighbor.node())) {
                    insert(heap, neighbor);
                }
            }
        }
        return cost;
    }

    private HeapValue remove(ArrayList<HeapValue> heap) {
        HeapValue min = heap.get(0);
        heap.set(0, heap.get(heap.size() - 1));
        heap.remove(heap.size() - 1);
        heapify(heap, 0);
        return min;
    }

    private void insert(ArrayList<HeapValue> heap, HeapValue node) {
        heap.add(node);
        int length = heap.size() - 1;
        while ((length - 1) / 2 >= 0 && heap.get((length - 1) / 2).cost() > node.cost()) {
            HeapValue tmp = heap.get((length - 1) / 2);
            heap.set((length - 1) / 2, node);
            heap.set(length, tmp);
            length = (length - 1) / 2;
        }
    }

    private void heapify(ArrayList<HeapValue> heap, int node) {
        if (heap.size() == 0) return;
        int smaller = node;
        int leftChild = 2 * node + 1;
        int rightChild = 2 * node + 2;
        if (leftChild < heap.size() && heap.get(leftChild).cost() < heap.get(smaller).cost()) {
            smaller = leftChild;
        }
        if (rightChild < heap.size() && heap.get(rightChild).cost() < heap.get(smaller).cost()) {
            smaller = rightChild;
        }
        if (smaller != node) {
            HeapValue tmp = heap.get(node);
            heap.set(node, heap.get(smaller));
            heap.set(smaller, tmp);
            heapify(heap, smaller);
        }
    }

    private HashMap<Integer, ArrayList<HeapValue>> buildMap(int[][] points) {
        HashMap<Integer, ArrayList<HeapValue>> map = new HashMap<>();
        for (int i = 0; i < points.length; i++) {
            int[] currentPoint = points[i];
            for (int j = i + 1; j < points.length; j++) {
                int[] nextPoint = points[j];
                int distance = Math.abs(currentPoint[0] - nextPoint[0]) + Math.abs(currentPoint[1] - nextPoint[1]);
                if (map.containsKey(i)) {
                    map.get(i).add(new HeapValue(distance, j));
                } else {
                    ArrayList<HeapValue> list = new ArrayList<>();
                    list.add(new HeapValue(distance, j));
                    map.put(i, list);
                }
                if (map.containsKey(j)) {
                    map.get(j).add(new HeapValue(distance, i));
                } else {
                    ArrayList<HeapValue> list = new ArrayList<>();
                    list.add(new HeapValue(distance, i));
                    map.put(j, list);
                }
            }
        }
        return map;
    }
}
Sorry for being a stickler, but I couldn't help noticing from a couple of your videos that you spell Dijkstra's algorithm incorrectly 😅
Yeah, he also pronounces it wrong.
wow Thanks
🐐
❤❤
Popping the min is an O(1) operation, not O(log n).
Reading the min is O(1), but popping is O(log n). When popping, the min element is removed, which means the heap must be reordered.
it's pronounced dike-struh lmao 💀
Sorry buddy you use a lot of words but still not clear. A lot of the Indian youtubers explain much more clearly.
So you can go to those Indian's video, hater.
Yeah true, All those didi and bhaiya of yours true..... Those cancers of youtube, You are right..😊