[Seminar] 3D Gaussian Splatting for Real-Time Radiance Field Rendering

  • Published Oct 20, 2024

COMMENTS • 30

  • @IIIXRLab
    @IIIXRLab  3 months ago

    JooHyun, thank you for your presentation. Here are my questions (Q) and feedback (F):
    [Q1] I wonder whether the 16 x 16 tiles are sufficient for high-quality rendering. Is the best tile size independent of the final image resolution?
    [Q2] How does this technique handle sorting-based alpha blending when it becomes infeasible, such as with transparent objects or twisted polygons?
    [Q3] On the evaluation page, Gaussian Splatting shows more than 100 FPS even with 30K iterations, which is quite surprising. Was this performance measured including the initialization stage like the Structure from Motion (SfM) stage, or after initialization was completed?
    [F1] It is highly recommended to speak louder. Even if your English isn't perfect, audiences are more likely to overlook errors if they can clearly understand the context.
    [F2] Several pieces of information and figures were presented too briefly. When you introduce information or figures, ensure you explicitly explain them. For instance, say, "This concept is briefly... you can find more information here..." Providing more detail helps novice readers understand better.
    [F3] When presenting evaluation results, clearly state the conditions and units used.
    I believe that too much feedback at once can be overwhelming, so please focus on these three points. By keeping these in mind, I am confident we will see even better presentations in the future. Thank you again for your effort.
    H. Kang

    • @박주현-123
      @박주현-123 3 months ago

      [A1] The paper adhered to the 16x16 tile size across various resolutions. The PSNR scores for the generated scenes, which exceed 27, indicate that this tile size is sufficient for high-quality rendering.
      [A2] All objects, including transparent ones and twisted polygons, are represented using multiple 3D Gaussians, with their positions given as [x, y, z] coordinates. These Gaussians are sorted by depth relative to the image plane, so sorting-based alpha blending functions the same way it does for ordinary objects.
      [A3] It was measured after the initialization had been completed.
      [Feedback] Thank you for your thoughtful reviews. I'll be sure to keep them in mind.
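The depth-sorting and front-to-back compositing described in [A2] can be sketched in a few lines. This is an illustrative Python sketch for a single pixel, not the official CUDA rasterizer; the `depth`/`alpha`/`color` field names are my own.

```python
# Hypothetical sketch of sorting-based alpha blending for one pixel.
# Each splat contributes (depth, color, alpha); field names are
# illustrative, not from the official 3DGS code.

def blend_pixel(splats):
    """Composite depth-sorted splats front-to-back for a single pixel."""
    # Sort by depth relative to the image plane (nearest first).
    splats = sorted(splats, key=lambda s: s["depth"])
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for s in splats:
        contrib = s["alpha"] * transmittance
        for c in range(3):
            color[c] += contrib * s["color"][c]
        transmittance *= 1.0 - s["alpha"]
        if transmittance < 1e-4:  # early termination, as in tile rasterizers
            break
    return color

# A semi-transparent red splat in front of an opaque blue one:
front = {"depth": 1.0, "alpha": 0.5, "color": (1.0, 0.0, 0.0)}
back = {"depth": 2.0, "alpha": 1.0, "color": (0.0, 0.0, 1.0)}
print(blend_pixel([back, front]))  # [0.5, 0.0, 0.5]
```

Because the splats are sorted before blending, transparent objects contribute partial color in front of whatever lies behind them, which is why the scheme works for them too.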

  • @정승재_teclados078
    @정승재_teclados078 3 months ago

    Thank you for sharing a basic summary of 3D Gaussian splatting.
    I appreciated the organization of the flow of 3DGS according to the pseudocode.
    I have one question:
    Why does the design of the rasterizer, including the tile-based rasterizer and the process of sorting by depth and tile ID, enable parallelism? How does this impact real-time rendering performance?

    • @박주현-123
      @박주현-123 3 months ago

      I assume computing the Gaussian list for each tile instead of for each pixel contributes to the increase in overall rasterization speed.
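The per-tile list idea can be sketched roughly as follows. This is an illustrative Python sketch, not the official CUDA implementation: each projected Gaussian is binned into the 16x16-pixel tiles its screen-space bounding box overlaps, so each tile's list can later be processed by an independent thread block in parallel. The `bbox` field is an assumption for illustration.

```python
# Illustrative sketch: binning projected Gaussians into 16x16-pixel
# tiles so each tile's list can be rasterized independently.
TILE = 16

def bin_gaussians(gaussians, width, height):
    """Map each Gaussian's screen-space bounding box to the tiles it overlaps."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    tile_lists = {}  # (tx, ty) -> list of Gaussian indices
    for i, g in enumerate(gaussians):
        xmin, ymin, xmax, ymax = g["bbox"]  # pixel coordinates
        for ty in range(max(0, ymin // TILE), min(tiles_y, ymax // TILE + 1)):
            for tx in range(max(0, xmin // TILE), min(tiles_x, xmax // TILE + 1)):
                tile_lists.setdefault((tx, ty), []).append(i)
    return tile_lists

# One Gaussian inside a single tile, one spanning two tiles:
gs = [{"bbox": (2, 2, 10, 10)}, {"bbox": (10, 2, 20, 10)}]
lists = bin_gaussians(gs, 64, 64)
print(lists)  # {(0, 0): [0, 1], (1, 0): [1]}
```

Since tiles do not share state during blending, the GPU can assign one thread block per tile and rasterize all tiles concurrently, which is the main source of the real-time performance.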

  • @황주영-j7y
    @황주영-j7y 3 months ago

    Thank you for sharing your overall knowledge of 3D Gaussian splatting.
    I have two main questions:
    1. The points in the point cloud obtained by SfM seem to be used as the means of the initial 3D Gaussians, but if you look at the pictures that appear when you search for SfM point clouds, you can see that the points cover the surface of the object.
    I'm wondering whether each of these points really becomes the mean of a 3D Gaussian.
    2. I also wonder how these 3D Gaussian splatting methods could be incorporated into holograms.

    • @박주현-123
      @박주현-123 3 months ago

      A1. The mean of a Gaussian can be interpreted as the center of the Gaussian, just like the central point of a circle. Initializing the 3D Gaussians from SfM point clouds means that the [x, y, z] coordinates of the points are set as the centers of the Gaussians; the attributes of covariance, color, and opacity are then attached to them.
      A2. That would be my research topic :)
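The initialization described in A1 can be sketched as follows. This is a hypothetical Python sketch with made-up field names and initial values, not the paper's code: each SfM point becomes the mean of one Gaussian, and the remaining attributes are attached with placeholder defaults that training later refines.

```python
# Hypothetical initialization sketch: each SfM point becomes the mean
# of one 3D Gaussian; covariance (scale + rotation), color, and opacity
# are attached afterwards. Field names and defaults are illustrative.

def init_gaussians(sfm_points, sfm_colors):
    gaussians = []
    for xyz, rgb in zip(sfm_points, sfm_colors):
        gaussians.append({
            "mean": xyz,            # [x, y, z] of the SfM point = Gaussian center
            "scale": [0.01] * 3,    # small isotropic covariance to start
            "rotation": [1.0, 0.0, 0.0, 0.0],  # identity quaternion
            "color": rgb,           # initialized from the point's color
            "opacity": 0.1,         # low initial opacity, refined by training
        })
    return gaussians

cloud = [[0.0, 0.0, 1.0], [0.5, -0.2, 2.0]]
colors = [[255, 0, 0], [0, 255, 0]]
print(len(init_gaussians(cloud, colors)))  # 2
```

So yes, every SfM point covering the object does yield one Gaussian center; densification later adds or removes Gaussians where the initial coverage is insufficient.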

  • @김병민_dm
    @김병민_dm 2 months ago

    Thank you for the clear presentation. Regarding the 3D Gaussian culling part, does it mean that Gaussians partially included in the view frustum are not rendered? As far as I know, in traditional frustum culling techniques, if an object is partially within the view frustum, that object becomes a target for rendering.

    • @박주현-123
      @박주현-123 2 months ago +1

      The confidence of each Gaussian on the frustum boundary is computed, and only those whose 99% confidence region still intersects the view frustum are included in the rendering.
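The boundary test can be sketched in one dimension. This is a simplified illustrative Python sketch, not the official implementation: the view frustum is reduced to a slab [near, far] along the view axis, and a Gaussian is kept only if its 99% confidence interval overlaps that slab.

```python
# Simplified sketch of confidence-based culling: a Gaussian near the
# frustum border is kept only if its 99% confidence region still
# intersects the view volume. Here the "frustum" is reduced to a 1D
# slab [near, far] along the view axis.

K_99 = 2.58  # ~2.58 standard deviations cover 99% of a 1D Gaussian

def keep_gaussian(center_z, sigma_z, near, far):
    """Keep if [center - k*sigma, center + k*sigma] overlaps the slab."""
    lo = center_z - K_99 * sigma_z
    hi = center_z + K_99 * sigma_z
    return hi >= near and lo <= far

# Center is outside the slab, but its 99% interval reaches the near plane:
print(keep_gaussian(0.5, 0.3, 1.0, 10.0))   # True
# Same center with a tight Gaussian: safely culled.
print(keep_gaussian(0.5, 0.05, 1.0, 10.0))  # False
```

So partially visible Gaussians are not simply discarded; culling only drops those whose contribution to the frustum is statistically negligible.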

  • @tjswodud-c6c
    @tjswodud-c6c 3 months ago

    Thanks for a great presentation on Gaussian Splatting. I gained a lot of knowledge thanks to you.
    I have one question: what are the limitations of Gaussian Splatting in your opinion? Compared to traditional methods like NeRF, you explained the advantages of being able to render realistic scenes while using less memory, but you didn't mention anything about limitations, so I'm curious about your thoughts.
    Thank you.

    • @박주현-123
      @박주현-123 3 months ago +1

      Here are some limitations to 3DGS:
      1. Large memory footprint: some follow-up research tries to address this problem, reducing the memory required to 10% of the original version while minimizing the quality loss.
      2. Unstable surfaces: the generated surfaces can be unstable, leading to messy meshes when mesh extraction algorithms are applied. This issue is being tackled with techniques like surface regularization.

  • @홍성은-iiixr
    @홍성은-iiixr 3 months ago

    Nice presentation. It was a really touching seminar.
    As I understand it, the optimization process iterates over removing, cloning, and splitting Gaussians.
    Are there any techniques for stopping these iterations, like early stopping, in this area?
    Experiments show that more iterations give better results,
    but I think so many iterations pay an inefficient cost for simple scenes compared to complex ones.
    Thank you!

    • @박주현-123
      @박주현-123 3 months ago

      I reviewed the official code, but did not find any stopping mechanisms.
      It seems the authors empirically determined the appropriate number of iterations, adjusting it based on the complexity of the target scene.

  • @misong-kim
    @misong-kim 2 months ago

    Thank you for your presentation.
    1. If the output of 3DGS is a collection of 3D Gaussians, does this represent a separate form of 3D object representation compared to commonly used methods like point clouds, meshes, or implicit functions?
    2. If so, are there any methodologies or research on converting this data type into meshes for use in games, and are there any challenges associated with this process?

    • @박주현-123
      @박주현-123 2 months ago

      A1. Yes. 3D Gaussians can be considered an extended form of point clouds, where the points in space are treated as the centers of 3D anisotropic Gaussians.
      A2. There are some methods that transform 3D Gaussians into other data types, such as the marching cubes algorithm.

  • @포로뤼
    @포로뤼 3 months ago

    Thank you for the good seminar :)
    I have one question. Why do point clouds often have holes or discontinuities? How does Gaussian splatting address these issues and provide more robustness?

    • @박주현-123
      @박주현-123 3 months ago

      The holes and discontinuities in point clouds often result from occlusion and limited viewpoints in multi-view images, feature mapping errors in the point-cloud generation algorithm, etc.
      3DGS addresses these issues by employing anisotropic 3D Gaussians to fill the gaps, which yields more complete, continuous scenes than raw point clouds.

  • @2l302
    @2l302 3 months ago

    Thank you for the excellent presentation. I have one question: why does this paper use a tile-based rasterizer instead of a traditional rasterizer? I'm curious if there are specific benefits to using a tile-based rasterizer in the context of this paper.

    • @박주현-123
      @박주현-123 3 months ago

      A tile-based rasterizer leverages GPU parallel processing by instantiating a thread block for each tile, enabling fast rasterization.

  • @vvuonghn
    @vvuonghn 1 month ago

    Hi, could you please share your slides?

  • @노성래99
    @노성래99 3 months ago

    Thank you for presentation.
    1) I think advanced techniques for adaptive density control will be a major research point. Do you have any ideas about this?
    2) In my opinion, Gaussian splatting can be interpreted as a generalized stochastic point cloud. Do other explicit 3D representations (meshes, voxels, ...) have the potential to advance like 3DGS?

    • @박주현-123
      @박주현-123 3 months ago +1

      A1. One method would be to adopt a function that calculates the smoothness of the surface into the training pipeline. The smoothness level can then be leveraged as the criterion for determining whether to apply the densification method.
      A2. I assume mesh-based approaches may work in a similar manner. The initialization of meshes can be done via point clouds generated from SfM, and the "densification" process can be replaced with subdivision of the mesh surfaces.
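For context on what such a criterion would plug into, the adaptive density control of 3DGS can be sketched roughly as follows. This is a hedged Python sketch in the spirit of the paper, with illustrative thresholds and field names: Gaussians with a large accumulated positional gradient are densified; small ones are cloned (under-reconstruction) and large ones are split into smaller children (over-reconstruction).

```python
# Hedged sketch of adaptive density control in the spirit of 3DGS.
# Thresholds and field names are illustrative, not the paper's values.

GRAD_THRESH = 0.0002   # positional-gradient threshold for densification
SCALE_THRESH = 0.01    # below this scale: clone; above: split

def densify(gaussians):
    new = []
    for g in gaussians:
        new.append(g)
        if g["grad"] <= GRAD_THRESH:
            continue  # well-reconstructed region: leave as is
        if max(g["scale"]) < SCALE_THRESH:
            # Under-reconstruction: clone the small Gaussian in place.
            new.append(dict(g))
        else:
            # Over-reconstruction: add a smaller child (a real
            # implementation samples two children and removes the parent).
            child = dict(g)
            child["scale"] = [s / 1.6 for s in g["scale"]]
            new.append(child)
    return new

gs = [{"grad": 0.001, "scale": [0.005]},   # cloned
      {"grad": 0.001, "scale": [0.05]},    # split
      {"grad": 0.0001, "scale": [0.05]}]   # untouched
print(len(densify(gs)))  # 5
```

A surface-smoothness score, as suggested in A1, could replace or augment the gradient test on the `continue` line.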

  • @critbear
    @critbear 3 months ago

    I think Gaussian splatting requires a lot of memory. Is it possible to compare the required memory with NeRF or traditional methods? And could it be a big drawback?

    • @박주현-123
      @박주현-123 3 months ago

      3DGS typically requires 700 MB to 1.2 GB for the scenes presented, whereas NeRFs only need a few megabytes (e.g., 5 MB for vanilla NeRF). This significant difference in data size can be a major drawback when transmitting data over networks.

  • @SeungWonSeo-q3s
    @SeungWonSeo-q3s 3 months ago

    Thank you for the presentation. I have one question.
    From what I understood, each Gaussian seems to have only one color parameter. If I want to represent a mirror using 3D Gaussian splatting, I am curious how this would be possible. I think the color would need to change depending on the camera angle.
    How can this issue be resolved? If I am mistaken, please feel free to correct me.

    • @박주현-123
      @박주현-123 3 months ago

      In Gaussian Splatting, mirrors do not function as one might typically expect. Gaussians are optimized based on multi-view images of the scene, meaning that objects reflected in the mirror are already captured in these images. In addition, each Gaussian stores spherical harmonic coefficients rather than a single fixed color, so its apparent color can change with the viewing direction. Therefore, the Gaussians constituting the mirror object only need to "memorize" and predict what they should display for a given camera viewpoint.
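The view-dependent color mechanism can be sketched with a degree-1 spherical harmonics evaluation. The basis constants below are the real SH constants for degrees 0 and 1, but the coefficient values are made up for illustration; the official code evaluates higher degrees the same way.

```python
# Sketch of view-dependent color via spherical harmonics (SH), as used
# in 3DGS. Each Gaussian stores SH coefficients per color channel; the
# displayed color is evaluated from the viewing direction, so apparent
# color changes with the camera angle. Coefficient values are made up.

# Real SH basis constants for degrees 0 and 1.
C0 = 0.28209479177387814
C1 = 0.4886025119029199

def eval_sh_deg1(coeffs, direction):
    """coeffs: [c0, c1x, c1y, c1z] per channel; direction: unit vector."""
    x, y, z = direction
    return [c[0] * C0 - c[1] * C1 * y + c[2] * C1 * z - c[3] * C1 * x
            for c in coeffs]

# With these made-up coefficients, the color brightens when viewed from +z:
coeffs = [[1.0, 0.0, 0.5, 0.0]] * 3  # same coefficients for each RGB channel
front = eval_sh_deg1(coeffs, (0.0, 0.0, 1.0))
side = eval_sh_deg1(coeffs, (1.0, 0.0, 0.0))
print(front[0] > side[0])  # True: color varies with viewing direction
```

This is what lets a "mirror" Gaussian output a different color per viewpoint, approximating the reflection it observed during training.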

  • @RounLee0927
    @RounLee0927 3 months ago

    Did they subdivide the image into 16 x 16 tiles just for parallel processing? Doesn’t that affect the quality of the result?
    I’m also curious about the effect of changing the number of tiles; is it better to use a larger number of tiles if I have enough GPU?

    • @박주현-123
      @박주현-123 3 months ago

      A1. Yes, tile-based subdivision is designed for GPU parallel processing. The authors note that alpha blending within tiles can be approximate in some configurations, but these errors become negligible as the splats converge.
      A2. Since one thread block is created per tile, increasing the number of tiles would speed up both training and inference.

  • @한동현Han
    @한동현Han 3 months ago

    Is there a more effective way to handle very thin 3D objects such as ropes, paper, or steel mesh?
    Do you think 3D Gaussian splatting could replace the traditional vertex-mesh representation of 3D scenes? What challenges remain to be solved in your opinion?

    • @박주현-123
      @박주현-123 3 months ago

      A1. I haven't yet found papers that propose 3D Gaussians specially designed for specific types of objects. It could be a good research topic, though.
      A2. One of the crucial attributes of a 3D representation is ease of manipulation. However, scenes generated with 3D Gaussians cannot yet be manipulated as intuitively as those created with meshes, which leaves much room for improvement.

    • @한동현Han
      @한동현Han 3 months ago

      ​@@박주현-123 And here's my feedback for you.
      I think video demonstrations are a perfect reference for understanding how Gaussian splatting and novel view synthesis work. I know there are a lot of good videos to refer to; it would be good to include one next time.