I think it's worth clarifying the distinction between building a Gaussian Splat scene and actually rendering the scene on screen, since both steps go by the same name and that can be confusing. Another thing to mention is that it's much faster and less error-prone to generate the scene automatically by providing location and angular data to the Gaussian builder. Most phones have accelerometers and gyroscopes, which helps a lot with auto-generating splats; otherwise you'd have to remember the exact location and angle of the camera to place each picture into the scene manually. BTW, do the pixels get turned into voxels before being converted to Gaussians?
I want to see this done with sports events. I imagine flying through a sports scene in 3D while the action is in motion. How many camera angles would be needed? Could it run at speed? I guess each frame would need to be trained and stored (weeks of training). I think each frame is tens of megabytes, so it might be too much to actually be executed. Your thoughts?
A research team has already made some short 3D videos using animated gaussians, recorded inside a dome covered in video cameras. Google "Dynamic 3D Gaussians".
I've got two questions: when we split a Gaussian, how do we avoid having them both converge to the same place and size? Is there a repulsion term in the loss function, like electrons repelling each other? And how do we avoid Gaussians going fully transparent? Also, I don't think it would handle specular reflections or anything that depends on the position of the camera.
In GS they optimize and adaptively densify/control the Gaussians in order to tackle over-reconstructed areas and under-reconstructed ones. In the case of under-reconstruction, meaning that space needs to be filled, a new Gaussian is created as a clone of one in the under-reconstructed area and placed along the direction of the positional gradient. In over-reconstructed areas, the target Gaussian is instead split, with its scale divided by a factor of 1.6 (determined experimentally). To determine the positions of the Gaussians obtained from the splitting process, they perform normal sampling based on the Gaussian's scaling values (mean centered at the Gaussian's mean, standard deviation based on its scaling parameters). They multiply these samples by the original rotation matrix of the Gaussian being split, to align them with its current spatial orientation, and then use the sampled values as offsets from the original mean to determine the two new means for each pair of newly generated Gaussians.
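A rough NumPy sketch of the split step described above (function and parameter names are illustrative, not from the reference code; the 1.6 shrink factor is the one quoted):

```python
import numpy as np

def split_gaussian(mean, scale, rot, phi=1.6, n_children=2):
    """Sketch of the adaptive-density 'split' step.

    mean  : (3,)  centre of the over-reconstructed Gaussian
    scale : (3,)  per-axis standard deviations
    rot   : (3,3) rotation matrix giving the Gaussian's orientation
    phi   : experimentally chosen shrink factor (1.6 in the paper)
    """
    children = []
    for _ in range(n_children):
        # Sample an offset in the Gaussian's local frame: N(0, scale).
        local = np.random.normal(0.0, scale)
        # Rotate the sample into world space so it follows the
        # Gaussian's current orientation, then offset from the mean.
        child_mean = mean + rot @ local
        # Each child keeps the orientation but is shrunk by phi.
        children.append((child_mean, scale / phi, rot))
    return children
```

The key detail is that the offsets are drawn in the Gaussian's own frame and rotated, so children of a long thin splat spread along its long axis.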
Fully transparent Gaussians are simply pruned as they do not contribute to the final rasterized image (more recent papers adopt smarter pruning techniques). Regarding specular reflections, the video did not go into detail, but the 3D Gaussians are basically 3D ellipsoids, and on the surface of an ellipsoid you can define spherical harmonic functions; these are used to represent the color of the Gaussians as a function of the viewing direction.
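To make "color as a function of viewing direction" concrete, here is a minimal sketch using only degree-0 and degree-1 real spherical harmonics (real 3DGS goes up to degree 3, and the exact coefficient layout here is an assumption for illustration):

```python
import numpy as np

SH_C0 = 0.28209479177387814   # constant for Y_0^0
SH_C1 = 0.4886025119029199    # constant shared by Y_1^{-1}, Y_1^0, Y_1^1

def sh_color(coeffs, view_dir):
    """coeffs: (4, 3) SH coefficients, one row per basis function,
    one column per RGB channel; view_dir: unit-length (3,) vector."""
    x, y, z = view_dir
    # Degree-0 term is view-independent; degree-1 terms vary linearly
    # with the viewing direction, giving cheap directional shading.
    basis = np.array([SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x])
    # The 0.5 offset recentres the colour, a common convention.
    return np.clip(basis @ coeffs + 0.5, 0.0, 1.0)
```

With all coefficients zero the splat is a flat mid-grey from every angle; non-zero degree-1 coefficients tint it differently depending on where you look from.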
@@stefanoscolapasta that's neat, so the 3D Gaussians fill out the same-colored borders of the objects and then each Gaussian is given a view-dependent function of color?
I have to give this a rewatch, but I feel a segment is missing. How are the Gaussians generated? I follow the parts of the video discussing what they are and the benefits they confer in render time, flexibility and application. But I missed how to actually produce my own Gaussian burst of my house, as recommended at the end of the video.
There's still a pre-rendering step where the gaussians are generated, like in conventional photogrammetry. You feed the images into a simple neural network that builds the gaussians. It's just adding a second layer of data to the point cloud.
Initially the images are passed through a classical Structure From Motion algorithm in COLMAP which estimates a coarse point cloud of the scene. Those points are used as initial 3D positions to place the first 3D gaussians.
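The initialization described here (positions from the SfM points, with the initial isotropic size taken from the average distance to the three nearest neighbours, as another comment notes) can be sketched like this; the brute-force distance matrix is purely for illustration, real code would use a k-d tree:

```python
import numpy as np

def init_gaussians(points):
    """Seed one isotropic Gaussian per SfM point.
    points: (N, 3) coarse point cloud from COLMAP.
    Returns (means, scales) where scales[i] is the mean distance
    from point i to its 3 nearest neighbours."""
    # All pairwise distances (O(N^2), fine for a sketch).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude each point's self-distance
    nearest3 = np.sort(d, axis=1)[:, :3]   # the 3 closest real neighbours
    scales = nearest3.mean(axis=1)
    return points.copy(), scales
```

This heuristic makes splats small where the point cloud is dense (lots of detail captured) and large where it is sparse, which is exactly the behaviour the optimizer then refines.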
There are algorithms that can extract the 3D scene from a NeRF to a ply or other 3D format, Marching Cubes for example. Also, Gaussian splatting doesn't capture Phong illumination, or difficult illumination in general, unless you overfit the model and make really small Gaussians. So photorealism with Gaussian splatting is debatable. Last but not least, there are new models like F2-NeRF that are way faster than the NeRF you are using, implementing 3D multiresolution hash encoding that helps the model converge faster. After seeing the video I tend to believe that Gaussian splatting is like patching a 3D shape with small bell-shaped Plasticine objects to create a 3D scene that looks like the one in the image. This even bakes in camera bias, i.e. different illumination from different angles due to the light captured by the camera.
This is like complaining about jpeg or mp4. The point isn't photorealism, it's that it's a viable realtime technique that performs like conventional rendering on cheap hardware. Realtime optimization is all about doing things "worse".
For those of you wondering how people end up at companies like Meta or Spotify, it's oftentimes dudes like this. The handful of people out of a PhD program in computer science who just LOVE writing splatting APIs in Rust or Go.
I read the original paper and am familiar with the technique. Was excited to see y'all do a video on it, but this video could definitely have used better production to explain Gaussians. For example, it was 13 minutes before you showed an image of a Gaussian blob.
@@hannah42069 oh yeah I've seen that one. It appears that it can only be recorded in a fixed space using multiple cameras, so it's not too useful in the way Gaussian splatting was intended to capture arbitrary still scenes. Not quite what I had in mind.
With static and stiff objects it shouldn't be too hard, as the Gaussians can keep their shape and size. But I imagine it becomes really complicated once every Gaussian itself has to animate as well.
One could try to find a transformation of a connected subset of Gaussians that, when applied and rendered, minimizes the difference to an image where the transformation has occurred. You would then have found a joint in the object and its corresponding transformation.
0:40 running off a web server and sending the images? The data is loaded from a server, yes, but it's being rendered completely locally, in the browser, no? That would not affect the performance at all.
Just saying it's doing gradient descent doesn't tell us it's particularly cheap. I mean, backpropagation in a neural net is just a form of gradient descent.
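The point that gradient descent is the shared machinery here can be made concrete with a minimal sketch (the objective is a made-up toy, not anything from the video):

```python
def grad_descent(f_grad, x0, lr=0.1, steps=200):
    """Plain gradient descent: repeatedly step against the gradient.
    Backprop in a neural net is just an efficient way to compute
    f_grad for millions of parameters; the update rule is exactly this."""
    x = x0
    for _ in range(steps):
        x -= lr * f_grad(x)
    return x

# Toy objective f(x) = (x - 3)^2, gradient 2(x - 3); minimum at x = 3.
x_min = grad_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Whether you call the parameter updates "learning" is semantics; 3DGS and neural nets both run this loop, just on different parameterizations.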
In the mid 1990s, I was part of a project to create the instrumentation for an autonomous mining vehicle. It used a distance-measuring instrument. This comment was made not publicly visible by the "algorithm" until I made it exclude the technicalities involved. Curious, that.
In the mid-1990s, I was part of a project to create the instrumentation for an autonomous mining vehicle that travelled from GPS station to station with its mineral load to dump it. It used a medium-power visible red laser distance-measuring system that produced thousands of distance measurements per second while scanning them using a shaft motor with phase-locked-loop speed control, which had a mirror mounted on the motor to produce a steady laser ranging rate that would scan the scene above the vehicle; at each station visited there would be a medium-sized barcode at a certain distance above the vehicle. Computers at the time were not terribly fast, but now they are much faster, so all the distance samples could be recorded nicely on a computer storage device. The catch would be that all this data would soon fill most storage devices unless a good compression method was used to save space. This would be a very fast way to create a 3D game-level map for use in games. If an area escaped being scanned, your game would appear to have a hole in the game map in certain places. Most likely some hacker players would find a way to get outside your map and shoot players still inside it, who would be unable to respond to their cheating opponents. My first method was to use a crystal-controlled digital phase-locked loop, but the mining vehicle would be subject to so many bumps on the high-speed road surface that the vibration levels would damage the crystal frequency reference mounted on the circuit board.
The engineer found another accurate reference method based on analog techniques with no vulnerable circuit components, which would not be destroyed by the vehicle moving over uneven road conditions at relatively high speeds. The shaft motor was a three-phase brushless DC motor driven by a MOSFET H-bridge on each of its three phases. The motor would produce its own pulses via Hall-effect sensors inside the motor, which could be used as feedback to the phase-locked-loop motor speed regulator circuit that controlled the H-bridge phase sequencing. All that would be necessary to use such a laser ranging instrument would be to tilt the laser scanning box by a small fixed angle after enough samples were taken through the window, which had a fog-mitigation circuit fitted to the glass because this thing was going to operate in all kinds of weather. The older man who worked with me was already 45 to 55 years old at the time, so he may well have died in the meantime, and I was a decade or two younger than him. Time, wars and various diseases continue to be enemies of us all. Wikipedia reference: a 256-year-old Chinese herbalist dies in 1933. A video about medicine rightly claimed that it was NOT its job to greatly extend our lifespans, but the opposite, presiding over a program to ensure each of us dies once our useful years have passed. Apparently this guy was human, but if you read his description, he was about 7 feet tall. He ended up having 20 or more wives and many children with each, all of whom he easily outlived. To this day, he probably has many descendants.
The technical details were able to be translated into the target Greek language by Google translation tools. There is nothing in there of an offensive nature. Why do they do this?
I would like to see MORE reimagining of tech. Like, I want to do synth world behavior to allow for plant growth simulation using UV and element exposure with a magnetic loop extrude method: basically, what direction it's extruded in is based on the moon, and what direction it faces to bake is based on the sun.
It renders what is in the reflection with another set of gaussians. It's cool because it actually creates a mirror world. I saw someone zoom in through the reflection in a pool of water which was pretty trippy.
The lighting is just whatever was captured by the camera. In a sense there are no "surfaces", the only conventional 3D simulation used is the original point cloud. Adding simulated lighting will require additional developments. (Maybe photographing the subject in multiple lighting conditions, for a start. Could be as simple as abusing the camera's flash, or waving a light around as you record video.)
@@ElTrolldego I looked into it, and NeRF works the same way, it's just capturing existing highlights and reflections as an artifact. Versions of NeRF with dynamic lighting are either using a specialized photography setup, like I figured, or generating a conventional 3D model and lighting that. Neither technique can get correct physical material information from arbitrary photos, that data has to come from somewhere. I think the original NeRF presenters were slightly hyping up its ability to "handle" specular and reflections, it's not deliberate like that implies, it's just a side-effect. Both techniques just reproduce whatever is shown in the source images, they don't actually know the difference between a window, a mirror, and a swimming pool.
Using fuzzy colored blobs instead of triangles lets you create images using less information, in the same way that a painting can look more "correct" to our perception than a photograph. They're underselling how performant this is, the tree scene they show here probably runs about as fast as something like Minecraft on a cheap modern GPU. I'm guessing the office PC they use in the demo has integrated graphics.
Can I just leave this here; an idea came to me and I put the question to an LLM:

Improving 3D Gaussian splatting by incorporating soap bubble dynamics to capture image information is a fascinating idea. This proposition involves melding two seemingly different concepts: Gaussian splatting, primarily used in volume rendering and graphics, and the physics of soap bubble dynamics, a study in fluid mechanics and minimal surfaces. Here's a concept on how to approach this problem:

### 1. Understanding the Basics

**Gaussian Splatting:** In graphics, Gaussian splatting is a technique used in volume rendering where each data point contributes to its surrounding pixels in a manner that resembles a Gaussian distribution. This results in smooth transitions and blending between the points.

**Soap Bubble Dynamics:** Soap bubbles form minimal surface areas for a given volume because of surface tension. The dynamics of soap bubbles, including how they merge and form shapes, are governed by principles that minimize surface area while maintaining volume.

### 2. Concept Integration Proposal

The goal of incorporating soap bubble dynamics into 3D Gaussian splatting is to achieve a more natural and minimalistic representation of the data, possibly leading to improvements in the visualization's accuracy and aesthetic. Here's a step-by-step approach:

**Step 1: Data Preparation and Initial Splatting**
- Start with standard 3D Gaussian splatting to create an initial rendering of your volume data.
- Ensure your data is normalized and correctly scaled for the visualization process.

**Step 2: Incorporating Soap Bubble Dynamics**
- **Surface Tension as a Guiding Principle:** Modify the Gaussian distribution parameters dynamically based on an algorithm inspired by surface tension. This could involve adjusting the spread of the Gaussian function to more closely resemble the way soap bubbles naturally adjust their surfaces to minimize the area for a given volume.
- **Minimal Surface Calculation:** Introduce calculations to adapt the attributes of each "splat" such that it contributes to forming a minimal surface, analogous to the way soap films span frames. This can mean adjusting the splat sizes based on their relative positions and data densities to more accurately reflect an equilibrium state you'd find in a soap bubble structure.

**Step 3: Optimization and Rendering Adjustments**
- **Adaptive Resolution:** Utilize soap bubble mechanics to adjust the rendering resolution and detail dynamically, focusing computational resources on areas of higher complexity or data density.
- **Simulation of Light and Transparency:** Model the interaction of light with the "bubble-like" structures in your volume rendering to simulate the iridescence and translucency seen in soap bubbles.

### 3. Computational Challenges and Solutions

Implementing these concepts computationally will require advanced algorithms for:
- Dynamically adjusting Gaussian parameters based on soap bubble dynamics principles.
- Efficiently computing minimal surfaces across the data points in real-time or near-real-time for interactive visualization.
- Optimizing the computational performance to handle the additional complexity introduced by these calculations.

### 4. Potential Application Areas

Such an approach could significantly impact fields requiring high-detail volume rendering, like medical imaging, scientific visualization, and CGI, by providing a more intuitive and visually appealing representation of complex 3D data.

### 5. Conclusion

While the idea is complex and would require substantial research and development effort to fully realize, incorporating soap bubble dynamics into 3D Gaussian splatting presents a promising avenue for enhancing visualizations by leveraging the inherent beauty and efficiency of natural structures. The fusion of these concepts could lead to breakthroughs in how we represent and interact with 3D volumetric data.
I learned about Gaussian splatting at uni last year, so I thought it was an older technique. Didn't know the paper on it was published basically only a few months before that. It definitely seems way too simple to be something only developed last year.
Could you please try and do this with pictures / videos of any of the moon landings? That would be so awesome. I already tried with A11 but it never worked; there weren't that many pictures.
@@vibaj16 yeah, ap11 hasn't, but maybe others. Or do you know if there's a way to give some kind of positional and angle information with the pictures? It wouldn't be easy to extract those, but it should be possible to estimate some.
Posts into suggestion box: "Meshtastic". Geek level 11. It's becoming quite popular in the UK, but quite quiet (English is awesome!) in Nottingham. It's geeky, it involves ESP32 computers and Chirp modulation and witchcraft? I still don't understand it, so I need an adult to explain it to me. Thank you.
So basically objects become collections of colored fuzz balls instead of meshes. I feel like this solves problems in rendering foliage. I wonder if it's possible to render both kinds at the same time.
Hypothetically could this rendering system be used for doing a 3d render like for a video game? As in, generating the actual gaussians to be the render instead of trying to take pictures for your initial data.
Even if the tech behind them is really, REALLY cool, this ultimately boils down to just the 3D equivalent of pre-rendered backgrounds like the old Resident Evils and point & click games used: almost completely static and unreactive to change. Even worse, you're limited to what you can get pictures or video of, which would be pretty artistically limiting. So, I wouldn't really count on Gaussian splats making their way into games in any significant way.
That's just a limitation of the way the gaussians are currently generated, there's no reason they can't have more lighting information. But yeah, it's for rendering 3D photos, not simulated worlds. Maybe someday.
Yeah, you could always include more information to support dynamic lighting, but I meant static in the sense that nothing moves or can be meaningfully interacted with. Granted, that's kind of just a problem with *a lot* of fancy graphics innovations (tech demos of unmoving environments always looking 100x better than anything that actually gets shipped), but still. The 3D photos are cool either way.
@@NotSlaya49 Research on 3DGS is booming right now, recent papers bind 3D Gaussians to the surface of a mesh which is used as a reference for dynamic scenes. I see the future as being a hybrid classic triangle meshes + 3D gaussians together.
Oooh what if this could be coupled with the Apollonian Gasket algorithm ? Let the gaussians be circles always, and let them fill the shape?... and then in 3d...?
Idk, this wasn't the best explanation. Should've used bigger drawings and maybe planned it out more. Your videos are usually so helpful; this one kinda missed for me.
Again, I don't understand the explanations here. You basically detail the second part of the algorithm, the part about how to render the scene once the Gaussians are "there", and yes, it's pretty much rasterizing some blobs with a pretty fade-out gradient towards their edges. But the main issue is HOW the Gaussians got there in the first place from your set of pictures or video clip. That's the main headache of the whole approach: how to populate an empty 3D volume with a point cloud that's somehow deduced from 2D input sets.
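For what it's worth, the missing first part is an optimization loop: render the current Gaussians, compare against the input photos, and gradient-step every Gaussian's parameters. A 1D toy version of that loop, with finite differences standing in for the differentiable rasterizer and all numbers made up for illustration:

```python
import numpy as np

xs = np.linspace(-5, 5, 200)

def render(means, amps, sigma=0.8):
    # "Rasterize" a 1D mixture of Gaussian blobs onto the xs axis.
    return sum(a * np.exp(-(xs - m) ** 2 / (2 * sigma ** 2))
               for m, a in zip(means, amps))

target = render([-1.5, 2.0], [1.0, 0.7])   # stands in for the input photos

means = np.array([-1.0, 1.0])              # rough initial guesses
amps = np.array([0.5, 0.5])

def loss(m, a):
    return np.mean((render(m, a) - target) ** 2)

loss0 = loss(means, amps)
lr, eps = 0.2, 1e-4
for _ in range(1000):
    # Finite-difference gradients; real 3DGS gets these from autodiff
    # through its differentiable rasterizer.
    g_m = np.array([(loss(means + eps * e, amps) - loss(means, amps)) / eps
                    for e in np.eye(2)])
    g_a = np.array([(loss(means, amps + eps * e) - loss(means, amps)) / eps
                    for e in np.eye(2)])
    means -= lr * g_m
    amps -= lr * g_a
```

The real method does exactly this shape of loop in 3D, over millions of Gaussians and many camera views, with SfM points providing the starting positions.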
"Posh Point Cloud" is the best short description of Gaussian Splatting I've heard.
Posh, pish... whats a few percent in taxes between friends?
I love this two person interview format. It’s so dynamic.
Flows much better than the cameraman - guest format. Having the cameraman intervene here every now and again also worked very well!
Also gives credit to the young scientific staff, that is often the power house in academia.
What I like most about Gaussian splatting is that it is basically a non-neural-network approach that is competitive with, if not better than, neural network approaches.
CloudWolf! you should try making a gaussian splat rendering datapack for Minecraft
I mean, it still uses gradient descent for optimization, which makes it pretty close to neural networks
@@DanieleCapellini gradient descent does not feature backpropagation or connected layers, and it doesn't learn; it's just a method for solving a problem that is not solvable analytically.
But it does do backpropagation, with the differentiable rasterizer @@Cl0udWolf
@@Cl0udWolf I wonder what method neural networks use to "learn" optimal changes of parameters based on the gradients they've calculated during backpropagation.
Loved the Pound particles haha
I love how in traditional computer science nerd fashion whenever either of them speaks the other one just looks extremely awkward, like they don't know what to do with themselves.
love the new format btw
seriously, do more of these
please
Ah, great. Now I have "I don't know what to do with myself" of the White Stripes playing on loop in my head.
lol. Settle down man.
It's not just awkward nerd waiting, when smart people like this talk shop they are constantly thinking through what's being said, so it's also a display of intent listening and conceiving follow up statements or questions. E.g. in your typical talk show with charismatic people, no one actually listens to what the other person says, they just pretend to pay attention while waiting for their turn to say something pre baked or to latch on to a fragment of a sentence said by the other party. But here we see both of them mulling over what the other says and then trying to fill in any gaps they think need filling. It's a much deeper level of engagement in conversation, that's why they focus their gaze on e.g. their hands or the table, to keep themselves free from distraction and focus on the words.
Loool
@@luicecifer The Hardest Button to Button...you're welcome
Eye contact makes a lot greater difference to the video than I would have guessed.
Really great content.
love how enthusiastic Lewis and Mike are about these techniques!
This is so cool! I've been looking for a neat explanation on what actually Gaussian Splatting is for a long while now. And who better to explain it. Perfect!
I'm always pleasantly surprised when I see a Computerphile video with Mike pop up in my recommendations.
Yes ❤ please more videos about techniques like this. Also reeeaally enjoyed the 'old' computer vision and neural net videos. They inspired me to pursue a career in this direction! Thank you
0:23 and later, Humble Pi in the background of a video released on Pi Day, well done
I've been trying to understand this for so long 😅 Thank you for the clear explanation!
Gotta love that MGS clip at 6:24.
Small correction 5:58
The 3D Gaussian's mean is indeed placed at the location of the corresponding SfM point as it represents the closely estimated position during initialization. The size of the initial isotropic Gaussian is determined by the average distance to the three closest neighboring points.
I was always a bit put off by NeRFs, since a scene is a NN, with all the problems that entails. GS has such incredible potential.
The energy and the passion here, wonderful!
This is such a cool topic, and such a great explanation. Thank you!
Nice! SMERFs next! Or maybe how to generate a 3dgs point cloud from only 2 or 3 cameras?
What a nice explanation! We've come a long way from the ray tracing in the early, but at the time mind-boggling, 'Wolfenstein' and 'Doom' engines.
ray casting*
you two drawing pictures of pine tree renders reminds me of the brothers drawing the trees in Myst by smearing big green cones.
There is a certain fuzziness and artifacting that is unique to splatting, which begins to be a bit of an eyesore the more you work with them. Almost like you are wearing goggles with Vaseline smeared in random spots. I hope we discover some way to overcome that. Maybe combining traditional rendering and splatting and doing boolean operations to cull the splatting errors?
Maybe applying something like a de-noising model as a post-processing layer?
I think most of the artifacts are due to insufficient camera angles for training. It ends up positioning some splats in mid air, in a way that works for the training angles, but not when you move around.
Also, the more splats you use (and the smaller their minimal size), the more detail you can capture, and sharper the details can become. However at the expense of performance.
An alternative would be to use asymmetric splats, with adjustable degrees of blur on each side of the ellipse (like angled spotlight projections). That could represent sharper edges with fewer splats. But depending how costly more complex splats are to render, it may not be worth it and be slower than rendering a larger number of symmetric gaussians.
@@Pixelarter If the splat positioning were restrained in 3D space for each object being represented, I think it would automatically rein in the artifacting. Like a low-poly, textureless version of what is being rendered, used as a guide to restrain the splats to within those bounds.
That's just due to the splats being large, which is due to not enough camera angles. The more of the scene you give the algorithm (especially close-up shots of things), the more detail (i.e. smaller splats) it can produce, but that means more effort for both you and the computer. It's probably best to use this technique for relatively small objects, and to remove large splats in the sky and replace them with a more traditional skybox technique.
It's just the opposite drawback of triangles. With triangles, a lack of detail appears as overly sharp points and edges, while in gaussians low detail creates blur instead.
Great, but the rendering of splats does not use a depth buffer. All the splats are gaussian - meaning that even at 100% opacity, the edges will be transparent, eventually fading to 0 opacity. What's usually done is that you sort all the splats and render them from back to front. That's the current pinnacle of the technique. Some opt for sorting only once every Nth frame, which greatly improves performance, but other than that, there's a high degree of overdraw. And they're not like circles - their projection may look like a fluffy cloud, but they're volumetric by all accounts. Aside from color, opacity and spherical harmonics, they have a position, a rotation and a scale - all in 3D, so they look different depending on the angle you're viewing them from.
Maybe a set of depth buffers (where each iteration traverses one transparent object)? Let's say that after traversing the first object the accumulated alpha in an area is equivalent to 1, everything behind could then be skipped
That requires sorting! @@nicolas.predella
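The back-to-front compositing described above is just the painter's algorithm with alpha blending. A minimal sketch in Python - the splat representation here is deliberately simplified to a single (depth, color, alpha) per pixel, and the function name is my own:

```python
# Hedged sketch of back-to-front splat compositing at one pixel.
# Each "splat" is reduced to (depth, color, alpha) after projection.

def composite_back_to_front(splats):
    """Sort splats far-to-near and alpha-blend them with the 'over' operator."""
    color = 0.0  # single grayscale channel over a black background
    for depth, c, a in sorted(splats, key=lambda s: -s[0]):
        color = a * c + (1 - a) * color  # standard "over" blend
    return color

# A near, mostly opaque white splat dominates a far black one:
print(composite_back_to_front([(10.0, 0.0, 0.9), (1.0, 1.0, 0.9)]))  # → 0.9
```

The front-to-back variant suggested in the reply above would instead accumulate alpha and stop early once it saturates, which is the usual motivation for that ordering.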
Creating a universe of mini Mike Pounds...how unintentionally sweet
We used this splatting to do Ortho rectification for line sensors 20 years ago.
Is there a reason for using gaussians rather than, say, some kind of wavelets to allow automatic compression by rounding high frequency information?
The gaussians are rendered in real time as textured polygons by retail graphics hardware.
It's all because of hardware. It's the same reason video games aren't using NURBS for meshes even though they'd look awesome.
Can't wait to see how this is used in video games
The first game with gaussian splatting will be revolutionary.
there's a game called Unrecord that uses Unreal Engine's photogrammetry assets that's been getting a lot of attention. For the purposes of a game, photogrammetry is the same as gaussian splatting, if not better.
Superb stuff, as always.
Not going to pretend I understand everything, but that's down to my neural network not being all that I'd like it to be.
I think it's worth clarifying the distinction between building a Gaussian Splat scene and actually rendering the scene on screen, as some people might be confused about these 2 steps being named after the same thing.
Another thing to mention, is that it's much faster and less error-prone to automatically generate the scene by providing location and angular data to the Gaussian builder. Most phones have accelerometers and gyroscopes, which helps a lot with auto-generating splats, otherwise you'd have to remember the exact location and angle of the camera to place each picture into the scene manually.
BTW, do the pixels get turned into voxels before being converted to Gaussians?
I want to see this as a video with sports events. I imagine flying through a sports scene in 3D while the action is in motion. How many camera angles would be needed? Would the speed be quick? I guess each frame would need to be trained and stored (weeks of training). I think each frame is tens of megabytes, so it might be too much to actually be executed. Your thoughts?
A research team has already made some short 3D videos using animated gaussians, recorded inside a dome covered in video cameras. Google "Dynamic 3D Gaussians".
Wow! This reminds me of a Bob Ross painting, where the whole scene is constructed by "smears" of colour
I've got two questions: When we split a gaussian, how do we avoid having them both converging to the same place and size? Is there a repulsion term in the loss function? Like, electrons repelling each other?
And how do we avoid gaussians going fully transparent?
Also, I don't think it would handle specular reflections or anything that depend on the position of the camera.
In GS they optimize and adaptively densify/control the Gaussians in order to tackle over-reconstructed areas and under-reconstructed ones.
In the case of under-reconstruction, meaning that space needs to be filled, a new Gaussian is created as a clone of one in the under-reconstructed area and placed along the direction of the positional gradient.
In over-reconstructed areas instead, the target Gaussian is split, with its scale divided by a factor of 1.6 (determined experimentally). To determine the positions of the two resulting Gaussians, they sample from a normal distribution centered at the original Gaussian's mean, with standard deviation given by the Gaussian's scaling parameters. These samples are multiplied by the original rotation matrix of the Gaussian being split, to align them with its spatial orientation, and then used as offsets from the original mean to obtain the two new means.
Fully transparent Gaussians are simply pruned as they do not contribute to the final rasterized image (more recent papers adopt smarter pruning techniques).
Regarding specular reflections, in this video they did not go into detail, but the 3D Gaussians are basically 3D ellipsoids, and over an ellipsoid's surface you can define spherical harmonic functions; these are used to represent the color of each Gaussian as a function of the viewing direction.
@@stefanoscolapasta that's neat, so the 3D gaussians fill out the same-colored borders of the objects, and then each gaussian is given a view-dependent color function?
@@stefanoscolapasta they look like 2D ellipses (as they have to be for traditional rasterization)
I don't understand this, but from examples I've seen GS can handle reflections pretty well
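The clone/split densification rule described a few comments up can be sketched roughly like this in Python (function and parameter names are mine, and the sampling is a simplified reading of what was described, not the paper's exact code):

```python
import numpy as np

def split_gaussian(mean, scale, rotation, factor=1.6, rng=None):
    """Split one 3D Gaussian into two smaller ones.

    New means are drawn from N(0, scale) in the Gaussian's local frame,
    rotated into world space, and used as offsets from the original mean.
    Scales shrink by `factor` (1.6 in the paper, found experimentally).
    """
    rng = rng or np.random.default_rng(0)
    new_scale = scale / factor
    children = []
    for _ in range(2):
        offset = rotation @ (rng.standard_normal(3) * scale)
        children.append((mean + offset, new_scale))
    return children

kids = split_gaussian(np.zeros(3), np.array([1.0, 0.5, 0.1]), np.eye(3))
```

Fully transparent Gaussians would then simply be dropped from the list during the pruning pass mentioned above.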
I have to give this a rewatch, but I feel a segment is missing. How are the gaussians generated? I acknowledge the parts of the video discussing what they are and the benefits they confer in render time, flexibility and application. But I missed how to actually produce my own gaussian splat of my house, as recommended at the end of the video
There's still a pre-rendering step where the gaussians are generated, like in conventional photogrammetry. You feed the images into a simple neural network that builds the gaussians. It's just adding a second layer of data to the point cloud.
Initially the images are passed through a classical Structure From Motion algorithm in COLMAP which estimates a coarse point cloud of the scene. Those points are used as initial 3D positions to place the first 3D gaussians.
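That initialization step can be sketched as follows: each SfM point becomes the mean of a Gaussian, and (per the 3DGS paper) the initial isotropic scale is the mean distance to the three nearest neighboring points. The brute-force neighbor search here is just for illustration; real implementations use a k-d tree:

```python
import numpy as np

def initial_scales(points, k=3):
    """Isotropic scale per SfM point: mean distance to its k nearest neighbors."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # exclude each point from its own neighbors
    knn = np.sort(d, axis=1)[:, :k]       # k smallest neighbor distances per point
    return knn.mean(axis=1)

pts = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
print(initial_scales(pts))  # first point is distance 1 from all three others
```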
First time I laughed out loud on computerphile 😂
Haha, there’s a NERF gun on the desk 😂
There are algorithms that can extract the 3D scene from the NeRF to a ply or other 3D format - Marching Cubes, for example. Also, Gaussian Splatting doesn't capture Phong illumination, or difficult illumination in general, unless you overfit the model and make really small gaussians. So photorealism with gaussian splatting is debatable. Last but not least, there are new models like F2-NeRF that are way faster than the NeRF you are using, implementing 3D multiresolution hash encoding that helps the model converge faster. After seeing the video, I tend to believe that gaussian splatting is like patching a 3D shape with small bell-shaped Plasticine objects to create a 3D scene that looks like the one in the images. This even defies the bias introduced by the camera, i.e. different illumination from different angles due to the light it captures
This is like complaining about jpeg or mp4. The point isn't photorealism, it's that it's a viable realtime technique that performs like conventional rendering on cheap hardware. Realtime optimization is all about doing things "worse".
Gaussian Splating - If it's dynamic, take a 360 degree shot of a dancer and use the camera layout scheme of the dome
this episode is so good!
So basically they're like 3D brushstrokes or fluffy voxels...
Kind of reminds me of how the graphics in "Dreams" for Playstation 4 are rendered
For those of you wondering how people end up at companies like Meta or Spotify, it's oftentimes dudes like this. The handful of people out of a PhD program in computer science who just LOVE writing splatting APIs in Rust or Go.
Dying of laughter here with the millions of Mike Pounds :') :')
Not rendering something that isn't there, makes it more useful / reliable in some applications.
like medical ? but where can it be useful
Impressionism in 3d 😱
Next step - apply jpeg compression for 10x performance and/or quality
How well does it do with perspective correct drawings and recreating their look? How many are needed?
The Multiverse of Mikeness
I read the original paper and am familiar with the technique. Was excited to see y'all do a video on it, but this video could definitely have used better production to explain Gaussians. For example, it was 13 minutes before you showed an image of a gaussian blob.
If you want to play with Gaussian Splatting the iOS app Scaniverse will scan rooms and objects using this technique
Is there potential for animated/moving scenes with Gaussian splatting?
yes, but as of now it's only for pre-recorded movement. there's a video on youtube somewhere, i think "4d gaussian splatting"
@@hannah42069 oh yeah I've seen that one. It appears that it can only be recorded in a fixed space using multiple cameras, so it's not too useful in the way Gaussian splatting was intended to capture arbitrary still scenes. Not quite what I had in mind.
With static and stiff objects it shouldn't be too hard, as the gaussians can keep their shape and size. But I imagine it becomes really complicated once every gaussian itself has to animate as well.
One could try to find a transformation of a connected subset of gaussians that, when applied and rendered, minimizes the difference to an image where the transformation has occurred. You have now found a joint in the object and its corresponding transformation
0:40 Running off a web server and sending the images? The data is loaded from a server, yes, but it's being rendered completely locally, in the browser, no? That would not affect the performance at all
Just saying it's doing gradient descent doesn't tell us it's particularly cheap. I mean, backpropagation in a neural net is just a form of gradient descent.
METAL GEAR MENTIONEDDD
In the mid 1990s, I was part of a project to create the instrumentation for an autonomous mining vehicle. It used a distance measuring instrument. This comment was made not publicly visible by the "algorithm" until I made it exclude the technicalities involved. Curious, that.
Referenced things must have negative connotations?
My user name likely has me on some blacklist?
Maybe if I translate the original comment into Greek?
In the mid 1990s, I was part of a project to create the instrumentation for an autonomous mining vehicle that drove from GPS station to station with a load of minerals to dump. It used a medium-power visible red laser distance-measuring system that produced thousands of distance measurements per second while sweeping them with a phase-locked-loop speed-controlled spindle motor, which had a mirror mounted on it to produce a constant laser ranging rate scanning the scene above the vehicle; at each station visited there would be a moderately sized barcode a certain distance above the vehicle. Computer speeds back then were not very fast, but now computers are much faster, so all the distance samples can be recorded nicely onto a storage device. The catch would be that all this data would soon fill most storage devices unless a good compression method was used to save space.
This would be a very fast way to create a 3D game-level map for use in games. If an area avoided being scanned, your game would appear to have a hole in the map in certain spots. Most likely, some hacker players would find a way to get outside your map and shoot players who were still inside it, unable to respond to their cheating opponents. My first method was to use a crystal-controlled digital phase-locked loop, but the mining vehicle would be subject to so many high-speed road-surface bumps that the vibration levels would damage the crystal frequency reference mounted on the circuit board. The engineer found another accurate reference method based on analog techniques, with no vulnerable circuit components that would be damaged by the vehicle travelling over uneven road surfaces at relatively high speeds.
The spindle motor was a three-phase brushless DC motor driven by a MOSFET H-bridge on each of its three phases. The motor generated its own pulses via Hall-effect sensors inside the motor, which could be used as feedback to the phase-locked-loop motor speed regulator circuit that controlled the H-bridge phase sequencing. All that would be necessary to use such a laser ranging instrument would be to tilt the laser scanning box by a small fixed angle once enough samples had been taken through the window, which had a fog-mitigation circuit fitted to the glass because this thing was going to operate in all sorts of weather. The older man who worked with me was already 45 to 55 years old at the time, so he may well have died in the meantime; I was a decade or two younger than him. Time, wars and various diseases continue to be enemies of us all.
Reference
Wikipedia: 256-year-old Chinese herbalist dies in 1933
A video about medicine rightly claimed that it was NOT medicine's job to greatly extend our lifespans, but the opposite: to preside over a program ensuring each of us dies once our useful years have passed. Apparently this guy was human, but if you read his description, he was about 7 feet tall. He ended up having 20 or more wives and many children with each, all of whom he easily outlived. To this day he probably has many descendants.
The technical details were able to be translated into the target Greek language by Google translation tools. There is nothing in there of an offensive nature. Why do they do this?
So it's basically a 3D gaussian mixture model where the number of gaussians is learned?
Exactly right, no magic really ;)
i would like to see MORE reimagining tech
Like, I want to do synthetic-world behavior to allow for plant-growth simulation using UV and element exposure with a magnetic-loop extrude method.
Basically, what direction it is extruded in is based on the moon, and what direction it faces for baking is based on the sun.
How does Gaussian Splatting handle specular surfaces?
It renders what is in the reflection with another set of gaussians. It's cool because it actually creates a mirror world. I saw someone zoom in through the reflection in a pool of water which was pretty trippy.
The lighting is just whatever was captured by the camera. In a sense there are no "surfaces", the only conventional 3D simulation used is the original point cloud. Adding simulated lighting will require additional developments. (Maybe photographing the subject in multiple lighting conditions, for a start. Could be as simple as abusing the camera's flash, or waving a light around as you record video.)
@@trickster721 Advantage NeRF then, if I understood that correctly, since they can handle view-dependent reflection and refraction.
@@ElTrolldego I looked into it, and NeRF works the same way, it's just capturing existing highlights and reflections as an artifact. Versions of NeRF with dynamic lighting are either using a specialized photography setup, like I figured, or generating a conventional 3D model and lighting that. Neither technique can get correct physical material information from arbitrary photos, that data has to come from somewhere.
I think the original NeRF presenters were slightly hyping up its ability to "handle" specular and reflections, it's not deliberate like that implies, it's just a side-effect. Both techniques just reproduce whatever is shown in the source images, they don't actually know the difference between a window, a mirror, and a swimming pool.
This will easily get to the limit of perceivable graphics once you have hardware made for rendering gaussians on a screen
Noob question: how is this functionality different from photogrammetry meshing?
Using fuzzy colored blobs instead of triangles lets you create images using less information, in the same way that a painting can look more "correct" to our perception than a photograph. They're underselling how performant this is, the tree scene they show here probably runs about as fast as something like Minecraft on a cheap modern GPU. I'm guessing the office PC they use in the demo has integrated graphics.
Can I just leave this here? An idea came to me and I put the question to an LLM:
Improving 3D Gaussian splatting by incorporating soap bubble dynamics to capture image information is a fascinating idea. This proposition involves melding two seemingly different concepts: Gaussian splatting, primarily used in volume rendering and graphics, and the physics of soap bubble dynamics, a study in fluid mechanics and minimal surfaces. Here's a concept on how to approach this problem:
### 1. Understanding the Basics
**Gaussian Splatting:** In graphics, Gaussian splatting is a technique used in volume rendering where each data point contributes to its surrounding pixels in a manner that resembles a Gaussian distribution. This results in smooth transitions and blending between the points.
**Soap Bubble Dynamics:** Soap bubbles form minimal surface areas for a given volume because of surface tension. The dynamics of soap bubbles, including how they merge and form shapes, are governed by principles that minimize surface area while maintaining volume.
### 2. Concept Integration Proposal
The goal of incorporating soap bubble dynamics into 3D Gaussian splatting is to achieve a more natural and minimalistic representation of the data, possibly leading to improvements in the visualization's accuracy and aesthetic. Here’s a step by step approach:
**Step 1: Data Preparation and Initial Splatting**
- Start with standard 3D Gaussian splatting to create an initial rendering of your volume data.
- Ensure your data is normalized and correctly scaled for the visualization process.
**Step 2: Incorporating Soap Bubble Dynamics**
- **Surface Tension as a Guiding Principle:** Modify the Gaussian distribution parameters dynamically based on an algorithm inspired by surface tension. This could involve adjusting the spread of the Gaussian function to more closely resemble the way soap bubbles naturally adjust their surfaces to minimize the area for a given volume.
- **Minimal Surface Calculation:** Introduce calculations to adapt the attributes of each "splat" such that it contributes to forming a minimal surface, analogous to the way soap films span frames. This can mean adjusting the splat sizes based on their relative positions and data densities to more accurately reflect an equilibrium state you’d find in a soap bubble structure.
**Step 3: Optimization and Rendering Adjustments**
- **Adaptive Resolution:** Utilize soap bubble mechanics to adjust the rendering resolution and detail dynamically, focusing computational resources on areas of higher complexity or data density.
- **Simulation of Light and Transparency:** Model the interaction of light with the "bubble-like" structures in your volume rendering to simulate the iridescence and translucency seen in soap bubbles.
### 3. Computational Challenges and Solutions
Implementing these concepts computationally will require advanced algorithms for:
- Dynamically adjusting Gaussian parameters based on soap bubble dynamics principles.
- Efficiently computing minimal surfaces across the data points in real-time or near-real-time for interactive visualization.
- Optimizing the computational performance to handle the additional complexity introduced by these calculations.
### 4. Potential Application Areas
Such an approach could significantly impact fields requiring high-detail volume rendering, like medical imaging, scientific visualization, and CGI, by providing a more intuitive and visually appealing representation of complex 3D data.
### 5. Conclusion
While the idea is complex and would require substantial research and development effort to fully realize, incorporating soap bubble dynamics into 3D Gaussian splatting presents a promising avenue for enhancing visualizations by leveraging the inherent beauty and efficiency of natural structures. The fusion of these concepts could lead to breakthroughs in how we represent and interact with 3D volumetric data.
I learned about Gaussian splatting at uni last year, so I thought it was an older technique. Didn't know the paper on it was published basically only a few months before that.
It definitely seems way too simple to be something only developed last year.
It actually takes a lot of inspiration from EWA Volume Splatting from 2001.
Has Matt Parker paid up for the product placement?
I'd love to see the particle explosion but with the original gaussians instead of replacing them with a sprite.
Could you please try and do this with pictures / videos of any of the moon landings? That would be so awesome. I already tried with A11 but it never worked, though there weren't that many pictures.
probably not enough pictures for a decent reconstruction
@@vibaj16 yeah, A11 hasn't, but maybe others.
Or do you know if there is a possibility to give some kind of positional and angle information along with the pictures? It wouldn't be easy to extract those, but it should be possible to estimate some.
I don't think overfitting means that.
Oh the excitement 😅
so my main question is, is this going to give us a higher quality version of the 8800 Blue Lick Road house in VR?
I'm still not sure: do both rendering techniques use the same sample data?
I was interested in photogrammetry a while ago; amazing how neural networks are doing it now
It's like fiberglass insulation lidar
It is like a blanket of voxels (or pixels in a non-plane surface)
What NPU do I need to render this with 2MP photos?
Posts into suggestion box: "Meshtastic".
Geek level 11.
It's becoming quite popular in the UK, but quite quiet (English is awesome!) in Nottingham. It's geeky, it involves ESP32 computers and chirp modulation and witchcraft? I still don't understand it, so I need an adult to explain it to me.
Thank you.
So basically objects become collections of colored fuzz balls instead of meshes. I feel like this solves problems in rendering foliage. I wonder if it's possible to render both kinds at the same time.
So what about 4D Gaussian splatting as a VR video format?
Is there a link where we can download a sample scene to navigate inside ?
Whats the previous video?
NERFs (No, not that kind) - Computerphile
Can't post a link but you can look it up.
ua-cam.com/video/wKsoGiENBHU/v-deo.html - now in description as well -Sean
coooorrr that lewis guy is sexy
Hypothetically could this rendering system be used for doing a 3d render like for a video game?
As in, generating the actual gaussians to be the render instead of trying to take pictures for your initial data.
We're already using rasterization in video games, just on triangles instead of gaussians.
Even if the tech behind them is really, REALLY cool, this ultimately boils down to just the 3D equivalent of pre-rendered backgrounds like the old Resident Evils and point & click games used: almost completely static and unreactive to change. Even worse, you're limited to what you can get pictures or video of, which would be pretty artistically limiting.
So, I wouldn't really count on Gaussian Splats making their way into games in any significant way.
That's just a limitation of the way the gaussians are currently generated, there's no reason they can't have more lighting information. But yeah, it's for rendering 3D photos, not simulated worlds. Maybe someday.
Yeah, you could always include more information to support dynamic lighting, but I meant static in the sense that nothing moves or can be meaningfully interacted with. Granted, that's kind of just a problem with *a lot* of fancy graphics innovations (tech demos of unmoving environments always looking 100x better than anything that actually gets shipped), but still. The 3D photos are cool either way.
@@NotSlaya49 Research on 3DGS is booming right now, recent papers bind 3D Gaussians to the surface of a mesh which is used as a reference for dynamic scenes.
I see the future as being a hybrid classic triangle meshes + 3D gaussians together.
Oooh what if this could be coupled with the Apollonian Gasket algorithm ? Let the gaussians be circles always, and let them fill the shape?... and then in 3d...?
Like a gaussian mixture model?
Idk this wasn’t the best explanation. Should’ve used bigger drawings and maybe planned it out more. You’re videos are usually so helpful, this kinda missed it for me
so these are basically little particles instead of vertices that form faces
Still vertices. Just a bunch of two-triangle squares sent to the GPU, where the shader draws a transparent circle on them.
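Something like this, in numpy rather than a fragment shader (the falloff constant and names here are my own; real implementations evaluate the projected 2D covariance of the splat rather than a fixed radial falloff):

```python
import numpy as np

def splat_alpha(uv, opacity=1.0):
    """Per-fragment alpha on the quad: a 2D Gaussian falloff from the centre.

    uv: fragment coordinates on the quad in [-1, 1]^2, centre at (0, 0).
    """
    r2 = np.sum(np.asarray(uv) ** 2)
    return opacity * np.exp(-4.0 * r2)   # falloff constant chosen for illustration

print(splat_alpha([0.0, 0.0]))  # → 1.0 at the centre, fading towards the edges
```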
Men I don’t understand much but by the excitement of them I got very excited too 🤣
Is gaussian splatting what polycam uses?
When gaussians in a piece of code become physical ;-), you realize how much traditional cv folks hate neural networks :P.
Matt Parkers book in a research office... Sure... :)
A gradient can be descended in many ways, but backpropagation is what makes AI
It doesn't learn because it's not deep
100fps? Well wait until consumers demand 1024 times the density of blobs.
I feel like you glossed over the details a bit too much.
It was a Gaussian approximation of the details
Is it neural network based or not? Mentioned "training" multiple times in reference to Gaussian splatting.
It isn’t neural network based. The training is gradient descent based.
... it's almost like ANNs aren't the only way to create machine-learnt models! ;P
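To illustrate the thread's point that gradient descent needs no network at all, here's a toy sketch that fits a single parameter by following a hand-derived gradient (everything here is illustrative, not from the 3DGS code):

```python
def fit_mean(data, lr=0.1, steps=200):
    """Minimize mean squared error to `data` over a single scalar `mu`."""
    mu = 0.0
    for _ in range(steps):
        grad = sum(2 * (mu - x) for x in data) / len(data)  # d/dmu of the MSE
        mu -= lr * grad                                     # plain gradient step
    return mu

print(fit_mean([1.0, 2.0, 3.0]))  # converges to the sample mean, 2.0
```

No layers, no backprop graph - just a loss, its gradient, and repeated updates, which is the same optimization loop 3DGS runs over the Gaussians' parameters.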
Again I don't understand the explanations here, you basically detail the second part of the algorithm which is the part about how to render the scene once the gaussians are "there", and yes, it's pretty much rasterizing some blobs with a pretty fade-out gradient towards their edges. But the main issue here is HOW did the gaussians get there in the first place from your set of pictures or video clip? Cause that's the main headache of the whole approach, how to populate an empty 3D volume with a point cloud that's *somehow* being deduced out of 2D input sets.
Aren't there negative gaussians? You could cut out the crescent of a crescent moon.
That was an introduction, but I'd like to see some rendered examples.
8:22 Lewis should have drawn bigger, and used more pages.
Have computer scientists reinvented the atom?
Ending is gold haa
Put Mike to the background to get more views 😂
I prefer meshes
This seems like fuzzy voxels.
It's so simple! I could probably code it in an afternoon.