
Fusion-3D, developed by CoCoSys scholars Sixu Li, Yang (Katie) Zhao, Chaojian Li, Zhifan Ye, and other members of PI Prof. Yingyan (Celine) Lin’s lab at Georgia Tech, presents an end-to-end acceleration framework for real-time 3D intelligence. By accelerating computation and data movement across the algorithm, architecture, and system-integration levels, the framework significantly improves the throughput of 3D intelligence applications. Recognized for its impact, Fusion-3D received the Best Paper Award at MICRO 2024, one of the top conferences in computer architecture, with the citation highlighting it as a “compelling end-to-end demonstration of an emerging application of importance.”
1. Tell us about the findings of your recent work entitled “Fusion-3D: Integrated Acceleration for Instant 3D Reconstruction and Real-Time Rendering” and how it supports the goals of CoCoSys.
The key insight of Fusion-3D is the discovery of domain-specific multi-level independence (see figure below) inherent to 3D intelligence. Unlike general-purpose computational tasks, 3D intelligence exhibits unique dataflow patterns that allow computations to be divided along these independence boundaries and processed efficiently in parallel. Dividing the work this way maximizes runtime efficiency, while a final aggregation phase preserves high accuracy. This independence is not only pivotal for 3D intelligence but could also inspire solutions in other domains that require parallel computation.
Fusion-3D aligns with the goals of CoCoSys, which aims to build the next generation of collaborative human-AI systems. For such systems to function effectively, AI must be capable of perceiving and understanding the physical world, including its spatial context and nearby environments, to communicate seamlessly and interactively with humans. Fusion-3D acts as the “eyes” and “memories” of these systems, enabling fast and accurate 3D reconstruction and rendering, as well as intelligent applications built on top of them. This technology empowers AI to “see” and process real-world environments in real time, laying the groundwork for robust perception and interaction.

2. How do your research findings push the boundaries of what we currently know or can do in the field?
Fusion-3D advances efficient 3D intelligence by providing the first unified solution for simultaneous real-time rendering (recalling the environment) and instant reconstruction (memorizing the environment). These two capabilities are critical for human-AI systems that need to perceive, understand, and respond to dynamic real-world environments. Prior to Fusion-3D, no solution could achieve both at the required speed and efficiency, especially on resource-constrained edge devices (e.g., operating within only 5 watts of power). By closing this gap, Fusion-3D makes real-time AI perception of the physical world feasible, providing a foundation for future intelligent systems.
3. What are some real-world applications or examples of your research that people might encounter in their daily lives? (Note: Try to share an example or analogy to help a non-expert understand your research.)
Fusion-3D has far-reaching applications in both industry and everyday life. For example:
- Rescue Robots: In disaster scenarios, robots equipped with Fusion-3D can rapidly reconstruct and analyze environments, providing first responders with critical, real-time 3D visuals of dangerous or inaccessible areas.
- Telepresence: Imagine video calls where participants not only see each other but also interact within realistic, reconstructed 3D environments. Fusion-3D enables such immersive communication experiences.
- AR/VR Navigation: Users could scan their surroundings with a smartphone and instantly generate a 3D map for augmented reality navigation or virtual tours.
In short, Fusion-3D is like giving AI a pair of “3D glasses,” enabling it to visualize and interact with the world in ways humans find intuitive.
4. What inspired you to pursue this research, and why do you think it is important?
Fusion-3D is the result of a two-year research effort driven by Prof. Lin’s group, based on the vision that 3D intelligence will be essential for future AI systems. The reasoning is simple: for AI to function effectively alongside humans, it must be able to perceive and understand the 3D world as we do. This journey began in 2022 when Chaojian, a senior Ph.D. student, and Sixu, a new group member, explored efficient solutions for neural rendering pipelines under Prof. Lin’s guidance.
Their first milestone was an efficient rendering system that could generate high-quality images from reconstructed scenes, akin to “recalling a specific viewpoint of a memory.” That work was accepted at ICCAD 2022 and has since become the conference’s most-cited paper. The next phase addressed efficient 3D reconstruction—teaching computers to “memorize” environments—resulting in Instant-3D, a paper accepted at ISCA 2023. Building on these successes, the team developed Fusion-3D, an all-in-one solution enabling computers to “see,” “memorize,” and “recall” environments, regardless of the scale of the 3D scenes.
This work is crucial because effective human-AI collaboration requires AI to perceive and understand the 3D physical world. By achieving these capabilities, Fusion-3D directly supports CoCoSys’s mission to design intelligent and collaborative human-AI systems.