ShanghaiTech Digital Human
[SIGGRAPH Asia 2024] V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians
Project: authoritywang.github.io/v3/
Experiencing high-fidelity volumetric video as seamlessly as 2D videos is a long-held dream. However, current dynamic 3DGS methods, despite their high rendering quality, face challenges in streaming on mobile devices due to computational and bandwidth constraints. In this paper, we introduce V^3 (Viewing Volumetric Videos), a novel approach that enables high-quality mobile rendering by streaming dynamic Gaussians. Our key innovation is to view dynamic 3DGS as 2D videos, facilitating the use of hardware video codecs. Additionally, we propose a two-stage training strategy that reduces storage requirements while maintaining rapid training speed. The first stage employs hash encoding and a shallow MLP to learn motion, then prunes Gaussians to meet the streaming requirements; the second stage fine-tunes the remaining Gaussian attributes using a residual entropy loss and a temporal loss to improve temporal continuity. This strategy, which disentangles motion and appearance, maintains high rendering quality with compact storage. We also design a multi-platform player to decode and render 2D Gaussian videos. Extensive experiments demonstrate the effectiveness of V^3, which outperforms prior methods by enabling high-quality rendering and streaming on common devices, a capability not previously demonstrated. As the first approach to stream dynamic Gaussians on mobile devices, our companion player offers users an unprecedented volumetric video experience, including smooth scrolling and instant sharing.
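The abstract's core trick — treating per-frame Gaussian attributes as 2D images so hardware video codecs can compress the stream — can be sketched in a few lines. This is an illustrative reconstruction only, not the paper's actual layout: the function names, 8-bit quantization, and simple row-major packing are all assumptions.

```python
import numpy as np

def pack_attributes_to_image(attrs, lo, hi, width=1024):
    """Quantize per-Gaussian float attributes to 8 bits and tile them
    row-major into a 2D image, so a standard video codec can treat the
    per-frame attribute tensors as ordinary video frames."""
    n, c = attrs.shape
    q = np.clip((attrs - lo) / (hi - lo), 0.0, 1.0)      # normalize to [0, 1]
    q = np.round(q * 255).astype(np.uint8)               # 8-bit quantization
    height = -(-n // width)                              # ceil(n / width) rows
    img = np.zeros((height, width, c), dtype=np.uint8)   # zero-padded image
    img.reshape(-1, c)[:n] = q                           # row-major packing
    return img

def unpack_image_to_attributes(img, n, lo, hi):
    """Invert the packing: recover the first n quantized attribute rows."""
    c = img.shape[-1]
    q = img.reshape(-1, c)[:n].astype(np.float32) / 255.0
    return lo + q * (hi - lo)
```

A real pipeline would choose the packing layout and bit depth per attribute (positions typically need more precision than opacities) and rely on the codec's inter-frame prediction for temporal compression; the round trip above only loses up to half a quantization step per value.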
Penghao Wang, Zhirui Zhang, Liao Wang, Kaixin Yao, Siyuan Xie, Jingyi Yu, Minye Wu, Lan Xu. V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians.
Views: 1,279

Videos

[SIGGRAPH 2024] DressCode: Autoregressively Sewing and Generating Garments from Text Guidance
Views: 199 · 2 months ago
Project Page: ihe-kaii.github.io/DressCode/ Arxiv: arxiv.org/abs/2401.16465 Apparel's significant role in human appearance underscores the importance of garment digitalization for digital human creation. Recent advances in 3D content creation are pivotal for digital human creation. Nonetheless, garment generation from text guidance is still nascent. We introduce a text-driven 3D garment generat...
Instant Facial Gaussians Translator for Relightable and Interactable Facial Rendering
Views: 1.1K · 2 months ago
Project: dafei-qin.github.io/TransGS.github.io/ Arxiv: coming soon The advent of digital twins and mixed reality devices has increased the demand for high-quality and efficient 3D rendering, especially for facial avatars. Traditional and AI-driven modeling techniques enable high-fidelity 3D asset generation from scans, videos, or text prompts. However, editing and rendering these assets often i...
[SIGGRAPH 2024] CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
Views: 6K · 4 months ago
Project Page: sites.google.com/view/clay-3dlm Arxiv: arxiv.org/abs/2406.13897 Demo: hyperhuman.deemos.com/rodin In the realm of digital creativity, our potential to craft intricate 3D worlds from imagination is often hampered by the limitations of existing digital tools, which demand extensive expertise and effort. To narrow this disparity, we introduce CLAY, a 3D geometry and material generato...
[SIGGRAPH Asia 2024] LetsGo: Large-Scale Garage Rendering via LiDAR-Assisted Gaussian Primitives
Views: 1.1K · 5 months ago
Project: zhaofuq.github.io/LetsGo/ Arxiv: arxiv.org/pdf/2404.09748 Large garages are ubiquitous yet intricate scenes that present unique challenges due to their monotonous colors, repetitive patterns, reflective surfaces, and transparent vehicle glass. Conventional Structure from Motion (SfM) methods for camera pose estimation and 3D reconstruction often fail in these environments due to poor c...
[SIGGRAPH Asia 2024] Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos
Views: 2.5K · 5 months ago
Project: nowheretrix.github.io/DualGS/ Volumetric video represents a transformative advancement in visual media, enabling users to navigate immersive virtual experiences freely and narrowing the gap between digital and real worlds. However, the need for extensive manual intervention to stabilize mesh sequences and the generation of excessively large assets in existing workflows impedes broader ...
[SIGGRAPH 2024] Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
Views: 2.3K · 5 months ago
Project: sites.google.com/view/media2face Arxiv: arxiv.org/abs/2401.15687 The synthesis of 3D facial animations from speech has garnered considerable attention. Due to the scarcity of high-quality 4D facial data and well-annotated abundant multi-modality labels, previous methods often suffer from limited realism and a lack of flexible conditioning. We address this challenge through a trilogy. W...
[CVPR2024] HOI-M3: Capture Multiple Humans and Objects Interaction within Contextual Environment
Views: 797 · 7 months ago
Project Page: juzezhang.github.io/HOIM3_ProjectPage/ Arxiv: arxiv.org/pdf/2404.00299 Humans naturally interact with both others and the surrounding multiple objects, engaging in various social activities. However, due to fundamental data scarcity, recent advances in modeling human-object interactions mostly focus on perceiving isolated individuals and objects. In this paper, we introduce HOI-M3...
[CVPR2024] I’M HOI: Inertia-aware Monocular Capture of 3D Human-Object Interactions
Views: 538 · 7 months ago
Project Page: afterjourney00.github.io/IM-HOI.github.io/ Arxiv: arxiv.org/abs/2312.08869 We are living in a world surrounded by diverse and “smart” devices with rich modalities of sensing ability. Conveniently capturing the interactions between us humans and these objects remains far-reaching. In this paper, we present I’m-HOI, a monocular scheme to faithfully capture the 3D motions of both the...
[CVPR2024] BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics
Views: 984 · 10 months ago
Project Page: godheritage.github.io/ Arxiv: arxiv.org/abs/2312.07937 The recently emerging text-to-motion advances have spurred numerous attempts at convenient and interactive human motion generation. Yet, existing methods are largely limited to generating body motions only without considering the rich two-hand motions, let alone handling various conditions like body dynamics or texts. To break...
[CVPR2024] HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting
Views: 12K · 11 months ago
Project Page: nowheretrix.github.io/HiFi4G/ Arxiv: arxiv.org/abs/2312.03461 We have recently seen tremendous progress in photo-real human modeling and rendering. Yet, efficiently rendering realistic human performance and integrating it into the rasterization pipeline remains challenging. In this paper, we present HiFi4G, an explicit and compact Gaussian-based approach for high-fidelity human pe...
[CVPR2024] VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams
Views: 2.1K · 11 months ago
Project Page: aoliao12138.github.io/VideoRF/ Arxiv: arxiv.org/abs/2312.01407 Neural Radiance Fields (NeRFs) excel in photorealistically rendering static scenes. However, rendering dynamic, long-duration radiance fields on ubiquitous devices remains challenging, due to data storage and computational constraints. In this paper, we introduce VideoRF, the first approach to enable real-time streamin...
[CVPR2024] OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers
Views: 1.3K · 11 months ago
Project Page: tr3e.github.io/omg-page/ Arxiv: arxiv.org/abs/2312.08985 We have recently seen tremendous progress in realistic text-to-motion generation. Yet, the existing methods often fail or produce implausible motions with unseen text inputs, which limits the applications. In this paper, we present OMG, a novel framework, which enables compelling motion generation from zero-shot open-vocabul...
[SIGGRAPH 2023] HACK: Learning a Parametric Head and Neck Model for High-fidelity Animation
Views: 7K · 1 year ago
Project Page: sites.google.com/view/hack-model Arxiv: arxiv.org/abs/2305.04469 Significant advancements have been made in developing parametric models for digital humans, with various approaches concentrating on parts such as the human body, hand, or face. Nevertheless, connectors such as the neck have been overlooked in these models, with rich anatomical priors often unutilized. In this paper,...
[SIGGRAPH 2023] DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
Views: 15K · 1 year ago
Project: sites.google.com/view/dreamface Arxiv: arxiv.org/pdf/2304.03117.pdf Web demo: hyperhuman.deemos.com HuggingFace: huggingface.co/spaces/DEEMOSTECH/ChatAvatar Emerging Metaverse applications demand accessible, accurate, and easy-to-use tools for 3D digital human creations in order to depict different cultures and societies as if in the physical world. Recent large-scale vision-language a...
[CVPR2023] ReRF: Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos
Views: 3.3K · 1 year ago
[TVCG (IEEEVR2023)] LiDAR-aid Inertial Poser: Large-scale Human Motion Capture
Views: 372 · 1 year ago
[AAAI2023] HybridCap: Inertia-aid Monocular Capture of Challenging Human Motions
Views: 472 · 1 year ago
[CVPR2023] HumanGen: Generating Human Radiance Fields with Explicit Priors
Views: 663 · 1 year ago
[CVPR2023] NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions
Views: 811 · 1 year ago
[CVPR2023] Instant-NVR: Instant Neural Volumetric Rendering for Human-object Interactions
Views: 1.1K · 1 year ago
[CVPR2023] Relightable Neural Human Assets from Multi-view Gradient Illuminations
Views: 981 · 1 year ago
[SIGGRAPH ASIA 2022] Human Performance Modeling and Rendering via Neural Animated Mesh
Views: 2.6K · 2 years ago
[SIGGRAPH ASIA 2022] SCULPTOR: Skeleton-Consistent Face Creation with a Learned Parametric Generator
Views: 3.6K · 2 years ago
[SIGGRAPH ASIA 2022] Video-driven Neural Physically-based Facial Asset for Production
Views: 13K · 2 years ago
[SIGGRAPH2022] NIMBLE: A Non-rigid Hand Model with Bones and Muscles
Views: 2.3K · 2 years ago
[SIGGRAPH2022] Artemis: Articulated Neural Pets with Appearance and Motion Synthesis
Views: 2.7K · 2 years ago
[CVPR2022] HumanNeRF: Efficiently Generated Human Radiance Field from Sparse Inputs
Views: 3.8K · 2 years ago
[CVPR2022] Fourier PlenOctree for Dynamic Radiance Field Rendering in Real-time
Views: 2K · 2 years ago
[CVPR2022] NeuralHOFusion: Neural Volumetric Rendering under Human-object Interactions
Views: 987 · 2 years ago