😅 This is the kind of project I really like watching, but I have a question: what can it really do besides some basic stuff, anything like computing with a lot of cores (that may be too hard 😮)? In my opinion, this is an interesting project and I love it. Thank you for making the video, hope you have a great day 🎉🎉🎉
Exactly. I think that to blink an LED, some FPGA will beat 10x RISC-V on number of I/Os, speed, and price. It would be interesting to have some idea what it is for, how much it costs per flop, what the alternatives are in terms of price, performance, and so on. It's like building a cluster with Raspberry Pis, when you can take an i7, save money and have much better performance.
I think this is an amazing achievement. I would love to see you demonstrate its speed with some "sha-1" cracking or comparison testing against a Raspberry Pi 5 and a mid-range PC over a long duration, 24 hr minimum, to see how far 17 GHz can go in a day
if the collision detection is waiting a random time using the ID as the seed so they're always different, why not just use the ID as the amount of wait time directly?
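A rough sketch of the tradeoff behind this question, in Python. It assumes 256 node IDs (0 to 255) and a hypothetical random backoff window of 4 slots; neither number comes from the video, and the function names are made up for illustration. Using the ID as the delay can never re-collide, but the wait scales with the ID range rather than with how many nodes actually collided, and high IDs always lose.

```python
import random


def id_backoff(node_id: int) -> int:
    """Deterministic backoff: wait as many slots as your own ID."""
    return node_id


def random_backoff(rng: random.Random, window: int) -> int:
    """Random backoff: wait a uniformly chosen slot in [0, window)."""
    return rng.randrange(window)


def same_slot_probability(colliders: int, window: int) -> float:
    """Chance that at least two colliders pick the same random slot.

    With carrier sensing this is an upper bound on re-collision
    (exact for two colliders), because only a tie on the earliest
    slot actually collides again.
    """
    if colliders > window:
        return 1.0
    p_all_distinct = 1.0
    for k in range(colliders):
        p_all_distinct *= (window - k) / window
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    num_ids = 256  # assumed ID range
    window = 4     # assumed random backoff window, in slots
    mean_id_wait = sum(id_backoff(i) for i in range(num_ids)) / num_ids
    mean_random_wait = (window - 1) / 2
    print(f"ID as delay: mean wait {mean_id_wait} slots, no re-collision")
    print(f"random:      mean wait {mean_random_wait} slots, "
          f"tie chance for 2 nodes {same_slot_probability(2, window):.2f}")
```

So the ID scheme trades a guaranteed result for a much longer average wait and a fixed pecking order, which may or may not matter for this bus.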
Hello, nice video! How do you apply solder paste to the pads? Do you use a stencil mask? Do you have any tips on how to deal with a situation where no stencil mask is available?
An absolutely awesome project with great prospects, and the only limitation is the user's imagination. However, I have 2 questions: 1. can it run Doom? 2. in what language was the code written? Was it C/C++?
Despite missing half of the explanation bcs I have no idea what the terms used mean, I nevertheless found everything fascinating. For me, it's like our modern day version of an art painting. Can u tell me which kind of university degree/knowledge/skills are necessary for such a project? And good job! 👍
I feel like to avoid collisions it might be more consistent to simply have a synchronized counter across all the processors based on the clock, then index each processor uniquely, modulo the time counter by the number of processors, and then when the processor's number comes up from that modulo operator, you are allowed to send data. That way there is literally no way to ever have a collision (ideally). I'm sure it's a bit more complex than that. Of course, the amount of time it takes before any given processor can send data is anywhere between 0 and n, where n is the number of processors... Could be bad if there are a ton of processors... I suppose the random time approach is *potentially* faster, although any time I hear "random" in this sort of application I am a bit suspicious hahah.
The technique you're describing is called "Time-division multiplexing", or TDM for short. In its simplest form every node would get a fixed number of bytes to send during their turn, but a more complicated scheme could dynamically change the size of time windows depending on how much data a node reports it has available... moving bytes is the first step to building anything like that though.
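The simplest fixed-slot form of this can be sketched in a few lines of Python. This is only an illustration: it assumes every node already shares the same tick counter, and `ticks_per_slot` is a made-up parameter for how long each turn lasts.

```python
def slot_owner(tick: int, num_nodes: int, ticks_per_slot: int = 1) -> int:
    """Return the ID of the node allowed to drive the bus at this tick."""
    return (tick // ticks_per_slot) % num_nodes


def worst_case_wait(num_nodes: int, ticks_per_slot: int = 1) -> int:
    """Ticks until a node's next slot starts if it just missed the start of its own."""
    return num_nodes * ticks_per_slot - 1


if __name__ == "__main__":
    # Four nodes, two ticks per slot: the owner repeats every eight ticks.
    print([slot_owner(t, num_nodes=4, ticks_per_slot=2) for t in range(10)])
    print(worst_case_wait(num_nodes=256, ticks_per_slot=8))
```

The worst-case wait grows linearly with the node count, which is the cost the original comment worried about; in exchange nobody ever collides as long as the counters stay in sync.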
What an epic project! Subscribed Probably dumb question, but could the collision detection be replaced by a queueing system where an mcu can request the bus then get serviced fifo? Maybe that would be slower. Would be cool to see a map-reduce algorithm running on this beast.
That becomes a latency/bandwidth tradeoff... if you can request large chunks of dedicated time, you can shift bytes out at full speed, while both collision detection and turnaround (the system settling after currents potentially change direction on the backplane, also peak EMI) inherently slow down the timing required. Many systems use a fast clock with added guard intervals, clock cycles where nobody drives the bus.
SWD is a 2 wire ARM variant of JTAG (normally 4 or 5 wires), unlikely to appear on non-ARM chips. CH32V003 uses a different 1 wire debug interface they call SWD or SDI.
I think your decision to not put everything on one big shared bus was the smart approach. Each input pin on the bus has a small amount of parasitic capacitance, which increases bus loading and requires additional drive current from the output pin driving the bus. That increases dI/dt which means more radiative EMI and crosstalk, and distorts the edges. This is less of a problem with an open drain setup, but still causes slower edge transitions and ringing. The long traces will have a lot of inductance which, left undamped, also tends to cause a lot of ringing. Longer traces also mean you're getting to the point where you're having to model them as transmission lines, since the Nyquist frequency of the design is set by the rise/fall time (not the clock!) and that's very fast on modern ICs - it's pretty common to see frequency components in the 300-800MHz range during transitions, so if you're running traces further than about 9cm you can no longer treat them as lumped lines. Once you get to this sort of scale you typically want to be using bus redrivers to break the bus up into smaller segments to avoid SI/EMI problems.
If you start finding that you have SI issues once you add all the boards, two things you can do are reducing the pullup resistor value and adding a small resistor in series with each IO line. Right now with 5.1kΩ pullups you've got that classic sharkfin shaped clock, where the pullup resistor takes a while to overcome all the parasitic capacitance on the board. You can speed that rising edge up by reducing that pullup resistance - bodging a second 5.1kΩ resistor on top will do that. The falling edge is very fast because the IO pins are actively pulling the bus to ground. This causes big dI/dt spikes at the falling edge, while all that charge stored in the parasitic capacitances rushes through the low impedance path created by the active low-side FET. You can moderate that dI/dt with a small value resistor (e.g. 22Ω) in series with each of the IOs, so the bus is still strongly pulled down but the current isn't controlled only by the Rds(on) of the low-side FET in the IO. Since you've already spun the boards this might be kinda tricky to add - maybe something for a rev2/3? :)
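To put rough numbers on the two suggestions, here is a first-order sketch in Python. The bus capacitance (100 pF), supply voltage (3.3 V) and FET on-resistance (10 Ω) are assumptions picked for illustration, not measurements from the video; only the 5.1 kΩ and 22 Ω values come from the comment. It treats the bus as a single lumped RC, which ignores the trace inductance and ringing discussed above.

```python
import math


def parallel(r1: float, r2: float) -> float:
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def rise_time_10_90(r_pullup: float, c_bus: float) -> float:
    """10% to 90% rise time of an RC pull-up: ln(9) * R * C (about 2.2 RC)."""
    return math.log(9) * r_pullup * c_bus


def peak_discharge_current(vdd: float, r_series: float, rds_on: float) -> float:
    """Initial current when the low-side FET starts discharging the bus."""
    return vdd / (r_series + rds_on)


C_BUS = 100e-12  # assumed total parasitic bus capacitance, in farads
VDD = 3.3        # assumed supply voltage, in volts
RDS_ON = 10.0    # assumed on-resistance of the IO's low-side FET, in ohms

if __name__ == "__main__":
    single = rise_time_10_90(5100, C_BUS)
    doubled = rise_time_10_90(parallel(5100, 5100), C_BUS)
    print(f"5.1k pull-up:      {single * 1e9:.0f} ns rise time")
    print(f"two 5.1k stacked:  {doubled * 1e9:.0f} ns rise time")
    print(f"no series R:  {peak_discharge_current(VDD, 0, RDS_ON) * 1e3:.0f} mA peak")
    print(f"22R series:   {peak_discharge_current(VDD, 22, RDS_ON) * 1e3:.0f} mA peak")
```

Under these assumptions, halving the pull-up halves the rise time, and the 22 Ω series resistor cuts the initial discharge current to roughly a third.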
It also doesn't hurt that he left the "repetition" and modularity to the board copies. Kind of made me think of repeating code where a loop should be implemented. It would be easier to maintain and rid of bugs, and it leaves the mind-numbing repetition to the manufacturing. Not to mention he can expand the cluster as needed.
It was fun meeting you in person during CCC last year. Strange to see you pop up in a comment section though.
@@modernsolutions6631 I've never been to CCC! EMF Camp 2020, maybe?
@@gsuberland My bad. Then I must have confused you with someone else.
Dude - use a foot-operated vacuum pen - much quicker & easier than tweezers!
I would like one! Any recommendations?
@@johboh I guess the Pixel Pump might be a good candidate. I have not used one myself, but the fact that it's an open project is a good thing. Of course there are cheaper and less capable options, but if you do regular board assemblies, buying a decent and a bit more expensive tool once will save you a lot of time and money over time.
I thought this was a joke at first because I was imagining him controlling the position of the pen with his feet and the vacuum button with his hand
Makes me wish I'd done electrical engineering at university. This level of dev is beyond my capability of simple analog electronics, I'm like a monkey with a spanner. Not enough time in the day now to reskill but your work is inspiring and why I'm subscribed.
Want an easy start? Watch Ben Eater videos! Start with the breadboard series, then the 6502!
@@thek3743 Ben Eater is the GOAT. 100% great series. His 12(?) part networking series is also great.
Same here. I'm a software developer, so I don't have much time, but I've always been interested in electrical.
I love the random clock variations on the blink sketch. Fun source of entropy.
It's one of the nightmares of electrical designers. Very hard to synchronize different components at high speed.
It's also sensitive to temperature, so if you have a thermal gradient across the ICs you'll find that some drift faster than others.
@@gsuberland heating up half of the boards sounds like a cool idea
@@king_james_official hot* idea
@@siz1700 ha ha ha!!! (with long pauses in between)
So awesome. IMO Fiasco would be a cool code name for a project or chip.
fiasco 256, that way there can also be a fiasco 10000
The L4Re Microkernel is named Fiasco.
Not sure if you've done this already, but it might make sense for you to have a separate "subnet" for each blade and then only send transmitted data on the inter-blade bus if the destination is outside of that subnet.
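A minimal sketch of that routing decision in Python, assuming 16 nodes per blade and node IDs numbered consecutively within each blade (both are assumptions for illustration, as are the function names):

```python
NODES_PER_BLADE = 16  # assumed blade size


def blade_of(node_id: int, nodes_per_blade: int = NODES_PER_BLADE) -> int:
    """Blade index, assuming IDs are assigned consecutively per blade."""
    return node_id // nodes_per_blade


def needs_backplane(src: int, dst: int,
                    nodes_per_blade: int = NODES_PER_BLADE) -> bool:
    """True if the packet has to leave the sender's blade."""
    return blade_of(src, nodes_per_blade) != blade_of(dst, nodes_per_blade)


def off_blade_fraction(total_nodes: int,
                       nodes_per_blade: int = NODES_PER_BLADE) -> float:
    """Share of all (src, dst) pairs with src != dst that cross blades."""
    crossing = sum(
        needs_backplane(s, d, nodes_per_blade)
        for s in range(total_nodes)
        for d in range(total_nodes)
        if s != d
    )
    return crossing / (total_nodes * (total_nodes - 1))


if __name__ == "__main__":
    print(needs_backplane(3, 12))   # same blade
    print(needs_backplane(15, 16))  # neighbouring IDs, different blades
    print(f"{off_blade_fraction(256):.3f}")
```

With uniformly random destinations almost all traffic still crosses blades, so the subnet only pays off if the workload keeps most communication local to a blade.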
Dude, design GPU already
oh my god, you're reinventing the ethernet
@@monad_tcp that sort of sub-networked interconnect is common in CPU design as well
@@tophyr mfw everything is just ethernet
@@cabbose2552 Some things are Arcnet...
first time I've seen tape and tray of parts being used, kudos. i did inkdot for a year because i loved the simplicity and focus it required. they moved me to pin refurbishing when they found out i could do it easily
I am a software person, and 15 years ago my electronics partner and I built cards that each have three microchip processors communicating with each other on the card over fast serial links on pulled-up lines. These cards communicated with other similar cards over distances of 10 km on a pair of wires that also carried the power, for agricultural use in the field.
Watching your pick and place makes me want to both go into electronics and stay the heck away from it.
"your"??
sounds insane in performance, but actually would be roughly equal to 4 cores running at just over 3 GHz due to the low clock speed.
that said it does show it is possible, and if this works it will also work with much faster RISC-V chips.
actually in some ARM architectures the cores were designed to be kind of used like this so you could just keep scaling them. there was actually some 1000-core ARM CPU somewhere around 2013 or so; sadly it never took off, since back then multithreading didn't really practically exist yet, as in basically no software used it, and things like handling large amounts of data at once weren't a thing yet.
that said, RISC-V is open source, so it should be possible to actually make a RISC-V CPU which directly combines tons of cores.
if you plan to make something like that I do have a better way for you to try out than using a single bus (or a few buses), since using buses like that can work but can have problems. I roughly designed a new experimental way of doing such multichip communication for the Raspberry Pi Foundation some years ago, actually to try and get them to make a board with way more cores. essentially it is a method giving quite some bandwidth but also large buffering, with chips being able to get the data when they are ready instead of needing to accept it directly. that said, in some cases direct buses might be more useful; luckily in a full CPU design you can make many more buses. both have advantages and weaknesses depending on the loads.
Ever heard of the "transputer", a 1980s commercial computer made of a collection of thousands of tiny weak processors working in parallel for advanced scientific tasks?
Your cluster reminds me of it.
Retrobytes channel made a video on it several months ago.
we did basic programming on them in the 90's. used for fft audio processing
that sounds pretty much like a gpu with its shader units
Yep I was reading an article on the chips & cheese blog the other day about a Qualcomm mobile GPU & that's what I was thinking @destiny_02
Reminds me of TIS-100
that's how a modern video card works!!! they have thousands of units (they're called differently among gpu manufacturers) that run in parallel executing small programs called shaders, which (oversimplifying now) all determine the color of EVERY pixel on your screen tens of times a second
At 10 pins free per 48 MHz CPU, you could connect 20,040 LEDs (or 6 million if they are combined). Enough to make a small terminal screen...or play Bad Apple. With each pin handling 90 LEDs at 48 MHz, this thing would push pixels like a monster.
Just need the timing to be perfect.....
that is what a gpu is..
I love it when blink goes out of sync... it looks like one of Big Clive's "supercomputers" except it really is a supercomputer!
It reminds me of the Lost in Space equipment in the early 1960s!
Great danger.
As always, an amazing project. The funky music for hand SMD assembly *almost* made it look enjoyable 😂
Can it run Doom?
sounds like a reasonable end goal
8:36 Holy rise time Batman!
The Signal Integrity engineer just started breaking out in a cold sweat
Your message collision scheme is remarkably similar to how CAN works. It seems you've independently discovered an excellent system. Very impressive.
Kind of, but CAN has a priority system and allows the message of the highest priority transmitter to go through. This is especially important in automotive applications.
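For anyone curious how that priority system works: CAN arbitrates bit by bit on a wired-AND bus, where a 0 (dominant) overrides a 1 (recessive), so the lowest identifier wins without the frame being destroyed. A small Python model of the idea, assuming unique 11-bit identifiers sent MSB first (the function name is made up; this is a sketch of the principle, not of any particular controller):

```python
def arbitrate(ids: list[int], id_bits: int = 11) -> int:
    """Return the identifier that wins bitwise arbitration.

    Every contender sends its ID MSB first. The bus carries the AND of
    all bits driven onto it, so any 0 wins. A node that sent 1 but
    reads back 0 stops transmitting and retries after the frame.
    """
    contenders = set(ids)
    for bit in range(id_bits - 1, -1, -1):
        bus_level = min((node >> bit) & 1 for node in contenders)  # wired-AND
        contenders = {n for n in contenders if (n >> bit) & 1 == bus_level}
    (winner,) = contenders  # unique IDs leave exactly one node
    return winner


if __name__ == "__main__":
    print(hex(arbitrate([0x123, 0x065, 0x064])))
```

The winner's frame goes through untouched, which is the key difference from a collide, back off and retry scheme.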
It's CSMA-CD. Used most commonly in 802.3 (commonly ethernet) communications.
Random wait on collision is a common solution.
However if things get very busy, efficiency will drop like lots of confused cars with no road rules.
You are very close to the original Ethernet CSMA/CD protocol. The XOR checksum has the problem that two colliders can cancel each other: two single-bit errors could result in a correct checksum, making a packet "appear" good. That is why Ethernet uses a CRC. Further, if you detect a collision you "jam" the whole packet with alternating ones and zeros to really mess it up and then do your randomized backoff. What you will find, and you are not the first, is that as you scale, the collisions will increase and the bandwidth will be insufficient. The cores will be data starved. This was the case with the Intel MICs (Knights Corner). They used PCI-E but the issue is the same: multidrop and star topologies oversubscribe easily. You will note datacenters (home of enormous clusters) use leaf-spine (and other) interconnects to mitigate this. But fun nonetheless. So you have a huge number of cores - what will you do with it? What would others in the comments run?
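The cancellation problem is easy to reproduce. The Python sketch below flips the same bit in two different bytes of a made-up four-byte payload: the XOR checksum is unchanged, while a CRC catches it. Note that it uses a small CRC-8 (polynomial 0x07, initial value 0) purely for illustration; Ethernet itself uses a 32-bit CRC.

```python
from functools import reduce


def xor_checksum(data: bytes) -> int:
    """XOR of all bytes, as in a simple 8-bit checksum."""
    return reduce(lambda a, b: a ^ b, data, 0)


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


if __name__ == "__main__":
    good = bytes([0x12, 0x34, 0x56, 0x78])
    # Flip bit 0 in the first two bytes: the two errors cancel under XOR.
    bad = bytes([0x12 ^ 0x01, 0x34 ^ 0x01, 0x56, 0x78])
    print("XOR fooled:", xor_checksum(good) == xor_checksum(bad))
    print("CRC fooled:", crc8(good) == crc8(bad))
```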
That is awesome! I love to see it start doing some real work!
I would use an active pullup (constant current source) on the bus with so many devices on the bus. It could be a current mirror with two P MOSFET transistors (e.g. BSS84). With 5 mA current, it would probably speed up the communications a lot.
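A quick estimate of what that would buy, as a Python sketch. The 5 mA figure is from the comment and 5.1 kΩ is the pull-up mentioned elsewhere in the thread; the 100 pF bus capacitance and 3.3 V supply are assumptions for illustration. A constant current charges the capacitance along a linear ramp, so the time is C·ΔV/I instead of the exponential RC curve.

```python
import math


def ramp_time_10_90(i_source: float, c_bus: float, vdd: float) -> float:
    """10% to 90% rise time with a constant-current pull-up: t = C * dV / I."""
    return c_bus * (0.8 * vdd) / i_source


def rc_rise_time_10_90(r_pullup: float, c_bus: float) -> float:
    """10% to 90% rise time with a resistive pull-up: ln(9) * R * C."""
    return math.log(9) * r_pullup * c_bus


C_BUS = 100e-12  # assumed bus capacitance, in farads
VDD = 3.3        # assumed supply voltage, in volts

if __name__ == "__main__":
    t_current = ramp_time_10_90(5e-3, C_BUS, VDD)
    t_resistor = rc_rise_time_10_90(5100, C_BUS)
    print(f"5 mA current source: {t_current * 1e9:.1f} ns")
    print(f"5.1k resistor:       {t_resistor * 1e9:.0f} ns")
    print(f"speed-up:            {t_resistor / t_current:.0f}x")
```

The flip side is that whichever IO pulls the line low has to sink the full 5 mA for as long as it holds the bus down.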
wow, this is incredible to see the idea from start. You're awesome!
Ok, but can it run Crisis?
in theory, yes... i guess we will see.
first doom, then quake, then crisys then half life 🤓
doom first quake halflife crisys. it should run it. but without any gpu it may be animated gif gameplay...
Crysis*
@@sharma_harsh Who cares about video games? Important question is: Can it run BOINC projects?
"...actually 273 but okay" is the best subtitle for a video in the history of the platform.
You're going to run Game of Life on that thing, aren't you?
That or Bad Apple.
That or we get rickrolled.
This project is great.
It reminds me of the beginnings of parallel processing in the 80s, which used the Motorola MC68000, Inmos T400/T800 and parallel computers like the BBN Butterfly, Connection Machine CM-2 and similar from that period.
I have always been fascinated with parallel machines, and when I saw some projects 20-25 years ago where people made their own PCBs with MC68000s and mounted multiple of them in the case of a 386 computer and thus obtained better performance in some calculations, I wished that I could do the same, but unfortunately I never got to it...
As soon as the prices of DUAL computers dropped, I immediately got one so I could have my own parallel machine... somewhere around 2000 (Dual Celeron)...
Before that I got a SEGA MEGA DRIVE because it has two processors (Z80 and MC68000) which work in parallel...
It was incredible for me at the time...
Now I see people making emulator PCBs with RPI PICO microcontrollers that plug in like a T400/T800 and form a transputer computer...
Recently, the idea of building a Parallella computer with lots of (ARM?) Epiphany RISC microcontroller has been popular, but I see that it has not spread to the masses.
I'd love to see some advanced things you can do with this parallel wonder of 256 processors.
It would be fantastic if you could make some tests with fractal calculations, FEM analysis, 3D graphics or comparison with ordinary computers, maybe some x86 computer emulator???
Finally, it would be fantastic if you could offer a Parallel Computer for Educational purposes to some educational institutions, so that students could easily understand problems and challenges of parallel programming... Of course at a reasonable price...
Best regards, and I look forward to the further development of your project...
Speaking about check sums: an 8-bit checksum might work for most cases, however there's about a 1/256 probability that a random collision won't be detected as such & then not be handled accordingly - this might or might not be a problem depending on how you further handle those cases.
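The 1/256 figure can be checked by brute force. The Python sketch below assumes the checksum is a plain XOR over the payload (as discussed elsewhere in the thread) and enumerates every possible frame made of one payload byte plus one checksum byte, standing in for completely garbled bus data.

```python
from functools import reduce
from itertools import product


def xor_checksum(data: bytes) -> int:
    """XOR of all payload bytes."""
    return reduce(lambda a, b: a ^ b, data, 0)


def count_accepted(payload_len: int = 1) -> tuple[int, int]:
    """Try every possible frame and count how many pass the check.

    Returns (accepted, total). Each frame is payload_len payload bytes
    followed by one checksum byte.
    """
    accepted = total = 0
    for frame in product(range(256), repeat=payload_len + 1):
        *payload, checksum = frame
        total += 1
        if xor_checksum(bytes(payload)) == checksum:
            accepted += 1
    return accepted, total


if __name__ == "__main__":
    accepted, total = count_accepted(1)
    print(f"{accepted} of {total} random frames pass: 1/{total // accepted}")
```

Exactly one checksum value matches any given payload, so a uniformly random frame slips through 1 time in 256 regardless of payload length.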
Many kudos for attempting such a "mega-project". No pain no gain...
that's going to be fun to program
Amazing work, what a project! 😮👍
That is a lot of CPU power for some random blinking LEDs :)
That could have been achieved with far fewer resources and much less effort indeed
This is the coolest thing I've seen in a while!
Wow just discovered. Awesome. Can't wait for the next!!
You may want to decrease the resistance on your clock line. A slow rise time can cause one of the processors to miss a clock and become out of sync with the host.
This is how they made supercomputers in the Cray days. Nowadays the system bus interconnect is replaced by networking. But this is still a superior way to build these clusters.
According to the sales materials, those microcontrollers have a max clock speed (without active cooling or overvolting) of 48 MHz and change. They also have 32 bit wide memory and data busses. Some specific versions of them have multiple additional ports and busses and only need some breakout logic (or a third party microcontroller board with sufficient GPIO pins) to fully expose them for programming purposes. It feels like a bit of a waste to use them as individual switches for individual LEDs.
But that's just me. YMMV.
I know people who would go over the edge for your random parts placement :) "All values of similar resistors have to face the same directions"... LOL. Nice one. Wish I had more time to join the livestreams again ...
how is your comment 2h old ? the video was uploaded 5min ago 🤔
@@valet_noir Patrons get early access, even if this means only 2 hours in bitluni terms. Other YouTubers are a bit more generous here ;)
All values of all components must face the same direction!!! ;)
You may be nuts, but that's much of the fun of watching. This project is a delightful sprawl, full of potential and hurdles. What do you want it to become, beyond the LED art? I mean, is there a target functionality or is the journey the goal?
Well, I guess we'll find out.
This is good for the Ghostbusters' backpack, if they have a vision device in an invisible spectrum band, all these processors doing raster scans off a large unidirectional antenna array
Cool project, thanks for sharing.
If you send a considerable number of broadcasts, it makes sense to have a bit after the source address that is set only for a broadcast, so you can skip the target address completely.
This only makes sense if you send a lot of broadcast messages, as every unicast message is then 1 bit longer
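A rough sketch of that trade-off, assuming 8-bit addresses and a header laid out as source, flag, optional target. Neither the address width nor the layout is taken from the video, and the function names are made up for illustration.

```python
ADDR_BITS = 8  # ASSUMED address width, not from the video

def to_bits(value, width):
    """Most-significant-bit-first list of bits."""
    return [(value >> i) & 1 for i in reversed(range(width))]

def header_bits(src, dst=None):
    """Header = source address, broadcast flag, then target only for unicast."""
    bits = to_bits(src, ADDR_BITS)
    if dst is None:
        bits.append(1)                 # broadcast: flag set, target omitted
    else:
        bits.append(0)                 # unicast: flag clear, target follows
        bits += to_bits(dst, ADDR_BITS)
    return bits

def mean_header_bits(broadcast_fraction):
    """Average header length with the flag scheme."""
    return ADDR_BITS + 1 + (1 - broadcast_fraction) * ADDR_BITS

# Without the flag every message carries source + target.
PLAIN_HEADER_BITS = 2 * ADDR_BITS
```

Under these assumptions the flag pays off once more than 1 in `ADDR_BITS` messages is a broadcast (above 12.5% for 8-bit addresses); below that, the extra bit on every unicast costs more than it saves.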
A year of waiting every day for a new one was worth it 🥹🥹
I have to admit, watching them go out of sync is beautiful. Am I weird to like that more than synchronized blinking?
This project reminded me of the Game of Life automaton, the KISS principle, and the CD part of CSMA/CD.
Amazing video! you are teaching a lot of stuff with this.
this is nuts! , i love it!
Awesome work 😮
I've got two questions:
1) what could this be used for?
2) for the waiting time after a collision, couldn't you use the ID itself as a delay? Or maybe force them to report in order, maybe using a master or calling the next one in line
CSMA/CD reinvented :)
You could use the now free command pin to sync all the clocks together
man I wish I had your skillset.
You are integrating systems for the s100 bus.
Game of Life; Each Cell (group) has a finite time to check for- Food, Friend or Foe in adjacent blocks. Movement is turn based.
Food, a limited, randomly placed resource, extends (life) up to 10 turns. Finding a Friend, adds a chance of 1-2 new Cells each turn.
Each Foe can remove 1 adjacent Cell (not of its group) per turn, adding a chance for their group to grow next turn.
*time is finite* for all Cells. Friend, Food or Foe. Meaning- the simulation ends, and you get to see a nice pattern of what groups fizzled, which ones flourished.
would be really cool to see this do something like a phone or computer software benchmark... with the lights it would be very satisfying... knowing the computer is actually computing... do the same for the hd/ssd and gpu 🤓🤓🤓 actual functional led display / rgb lighting
Heck of a great project! And custom CSMA!
Have you tried running the LINPACK benchmark on it? Would be really interested to see compute performance.
A true work of art!
My hats off to you! 🍻
after 2 minutes you already deserve a like!
Your collision detection is very similar if not the same as CAN Bus collision detection (I need to check but I believe it’s at least close)
This is so awesome!
😅 This is the kind of project I really like watching, but I have a question: what can it really do besides some basic stuff? Anything like computing with a lot of cores (that may be too hard 😮)? In my opinion, this is an interesting project and I love it. Thank you for making the video, hope you have a great day 🎉🎉🎉
Exactly. I think that for blinking an LED, an FPGA will beat 10x RISC-V on number of I/Os, speed, and price. It would be interesting to have some idea of what it is for, how much one FLOP costs, and what the alternatives are in terms of price, performance, and so on. It's like building a cluster out of Raspberry Pis when you could take an i7, save money, and get much better performance.
Very interesting and good job, but what's the next step?
I think this is an amazing achievement. I would love to see you demonstrate its speed with some SHA-1 cracking, or a comparison test against a Raspberry Pi 5 and a mid-range PC over a long duration (24 hours minimum) to see how far 17 GHz can go in a day.
0:50 imagine the reveal of an ai being sentient by the blinking lights getting faster and faster and then it stops, and just starts showing text?
3:30 LumenPnP when? My hand and eyes hurt just watching all that placement! ( I probably just have a low tolerance though lol)
The blink looked like game of life 😂
if the collision detection is waiting a random time using the ID as the seed so they're always different, why not just use the ID as the amount of wait time directly?
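A minimal sketch of what using the ID directly as the wait time would look like, assuming unique IDs, a hypothetical 10 µs slot, and nodes that still listen before talking. None of this is the author's code.

```python
def retry_times(colliding_ids, slot_us=10):
    """Retry delay per node if the wait is simply id * slot_us (slot_us is ASSUMED)."""
    return {node_id: node_id * slot_us for node_id in colliding_ids}

def retry_order(colliding_ids, slot_us=10):
    """Order in which the colliding nodes get back on the bus."""
    times = retry_times(colliding_ids, slot_us)
    return sorted(times, key=times.get)

print(retry_order([7, 3, 200]))   # lowest ID always goes first
print(retry_times([7, 3, 200]))   # node 200 waits 2000 us every time
```

Unique IDs give unique delays, so the same nodes cannot collide twice in a row. The trade-off is fairness: low IDs always win, and a high ID pays its full delay after every collision, whereas a random wait spreads the delay evenly over time.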
Hello, nice video!
How do you apply solder paste to the pads? Do you use a stencil mask? Do you have any tips on how to deal with a situation where no stencil mask is available?
this is cool. thank you.
this is really incredible. i'd love to see a collab between you and @beneater !!! Really great work.
An absolutely awesome project with great prospects and the limitation is the users imagination.
however, I have 2 questions:
1. can it run doom?
2. in what language was the code written? Was it C/C++?
I would love to see a super cluster with the cluster and a 2040 I/O chip?
So you are the guy that created the brain of skynet! I knew it!
Bro cooks so much, he wouldn't even need a reflow oven
GAME OF LIFE on this would be insane
nice work ....
I have been planning to do this with sg2002s.
just discovered your channel and this is super cool! what was your career path that got you into electronics? thanks!
That's HUGE!!
Did you just create a pretty good random number generator with those blinking leds? Looks much cooler than those lava lamps
I just discovered your channel and immediately subscribed. This project mesmerizes me. Keep on!
I wonder how the Green Arrays chips handle communicating between CPUs.
Wow that's insane
I wonder if you could use this for parallel processing like how GPUs do it
Despite missing half of the explanation because I have no idea what the terms used mean, I nevertheless found everything fascinating. For me, it's like our modern-day version of a painting. Can you tell me which kind of university degree/knowledge/skills are necessary for such a project? And good job! 👍
so i only understood like 20% of this. what is this for? whats the computing power? how does it compare to a "modern" equivalent?
Could u check the flux link? it also refers to the syringe page! thx!!
I feel like to avoid collisions it might be more consistent to simply have a synchronized counter across all the processors based on the clock, then index each processor uniquely, modulo the time counter by the number of processors, and then when the processor's number comes up from that modulo operator, you are allowed to send data. That way there is literally no way to ever have a collision (ideally). I'm sure it's a bit more complex than that. Of course, the amount of time it takes before any given processor can send data is anywhere between 0 and n, where n is the number of processors... Could be bad if there are a ton of processors... I suppose the random time approach is *potentially* faster, although any time I hear "random" in this sort of application I am a bit suspicious hahah.
The technique you're describing is called "Time-division multiplexing", or TDM for short. In its simplest form every node would get a fixed number of bytes to send during their turn, but a more complicated scheme could dynamically change the size of time windows depending on how much data a node reports it has available... moving bytes is the first step to building anything like that though.
Ah! Interesting! TIL hahah.
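For anyone curious, the simplest fixed-slot version of that modulo scheme could be sketched like this. It assumes every node shares the same tick counter and has a unique index from 0 to `num_nodes - 1`; the function names are made up for illustration.

```python
def slot_owner(tick, num_nodes, ticks_per_slot=1):
    """Index of the node that owns the bus at this tick."""
    return (tick // ticks_per_slot) % num_nodes

def may_send(node_id, tick, num_nodes, ticks_per_slot=1):
    """True if node_id is allowed to drive the bus at this tick."""
    return slot_owner(tick, num_nodes, ticks_per_slot) == node_id

def ticks_until_slot(node_id, tick, num_nodes, ticks_per_slot=1):
    """Ticks until the START of node_id's next slot (0 if it starts right now)."""
    cycle = num_nodes * ticks_per_slot
    return (node_id * ticks_per_slot - tick) % cycle

# Exactly one node owns any given tick, so collisions cannot happen
# as long as the counters stay in sync.
owners = [n for n in range(256) if may_send(n, 1000, 256)]
print(owners)
```

The worst-case wait grows linearly with the node count: a node that has just missed the start of its slot waits almost a full cycle of `num_nodes * ticks_per_slot` ticks, even if nobody else has anything to send.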
5:18 > We can hear you laughing. I like your enthusiasm
Why not CAN Bus? It already has a good data arbitration mechanism and checksum, no need to reinvent the wheel :)
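For reference, CAN arbitration works bit by bit on a wired-AND bus: every contender sends its identifier MSB first, a 0 (dominant) overrides a 1 (recessive), and a node that sends 1 but reads back 0 drops out. A small simulation of that idea, assuming unique 11-bit identifiers (this models the principle only, not a real CAN controller):

```python
def arbitrate(ids, id_bits=11):
    """Return the identifier that wins CAN-style bitwise arbitration.

    ids must be non-empty and unique.
    """
    contenders = set(ids)
    for bit in reversed(range(id_bits)):          # MSB first
        # Wired-AND bus: if any contender sends 0, the bus reads 0.
        bus = min((i >> bit) & 1 for i in contenders)
        # Nodes that sent 1 but read 0 lose and stop transmitting.
        contenders = {i for i in contenders if (i >> bit) & 1 == bus}
    (winner,) = contenders
    return winner

print(hex(arbitrate([0x65, 0x64, 0x7F])))
```

The lowest identifier always wins, and the winning frame is not corrupted, so no retry or backoff is needed for it. That is the main difference from detecting a collision and waiting.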
crazy! in a good way!
Could you also add a small FPGA to run or manage the cluster?
What’s the music playing during assembly at 4:13?
What might be some of the use cases for the Megacluster?
And then add some RAM and storage for a BIOS, both of which are quite inexpensive at the moment?
What an epic project! Subscribed
Probably dumb question, but could the collision detection be replaced by a queueing system where an mcu can request the bus then get serviced fifo? Maybe that would be slower.
Would be cool to see a map-reduce algorithm running on this beast.
That becomes a latency/bandwidth tradeoff... if you can request large chunks of dedicated time, you can shift bytes out at full speed, while both collision detection and turnaround (the system settling after currents potentially change direction on the backplane, also peak EMI) inherently slow down the timing required. Many systems use a fast clock with added guard intervals, clock cycles where nobody drives the bus.
Would you be able to upload those first streams in which you made the cluster and the protocol? They're not on Twitch or YouTube...
Really nice project. What are you using for the top view shots?
4:39
love your video, just one thing: aren't those violet LEDs rather than pink? :-D
nice project, but I have one question. I want to try this CH32V003 chip, but can I use another SWD debugger, or does it need to be the e-link debugger?
SWD is a 2 wire ARM variant of JTAG (normally 4 or 5 wires), unlikely to appear on non-ARM chips. CH32V003 uses a different 1 wire debug interface they call SWD or SDI.
how many GBit/s is your outside communication? Do you use QSFP??