Google's impressive 3D video conference - here's how it works

(Image credit: Google)

Google had given us a peek into what it saw as the next generation of video chats, where participants appear in a 3D chat booth. At a time when rivals such as Zoom are coming up with innovations of their own, this seemed to be the pinnacle of multi-location collaboration.

Now, the company has gone a step further and shared the technology behind the experiment, which has been named Project Starline. However, this isn't the first attempt at a 3D interface. Snapchat's Spectacles 3 features two cameras designed to let users create 3D video that captures depth.


Key breakthroughs include 3D imaging, real-time compression and a 3D display. (Courtesy: Google)  (Image credit: Google Blog Screengrab)

The research paper provides intricate details on how Google created the tech that was demoed at the 2021 I/O conference. It describes some of the challenges involved in tricking the human brain into perceiving depth in what is essentially a 2D image.

The image of a person sitting a couple of feet away needs to be high resolution and free of artefacts. It also needs to be aligned to the relative position of the other person in the booth. Then there is the matter of making the sound appear to emanate from the other person's mouth, as well as maintaining eye contact between the two participants.

In the past, we have seen experiments with VR headsets by several companies such as Meta (formerly Facebook). What Google is experimenting with could result in 3D experiences that do not require bulky headsets and trackers.

Here's how it all comes together

The research paper also describes the sort of hardware required to resolve these issues. For starters, the system is built around a 65-inch 8K panel running at 60Hz. Around it are three pods that capture colour imagery and depth data. In addition, there are four tracking cameras, four microphones, two loudspeakers and infrared projectors to capture multiple viewpoints and depth maps.

All of this equipment is used to capture colour images from four viewpoints and three depth maps, for a total of seven video streams. Audio is captured at 44.1kHz and encoded at 256 Kbps. This is where Google's expertise in capturing and transmitting data comes into play, as the bandwidth required ranges from 30Mbps to 100Mbps depending on variables such as the clothes the participants wear and the magnitude of their gestures.
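To put the bandwidth range in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the 30 to 100 range refers to megabits per second, which is how network bandwidth is usually quoted, and that the rate stays constant during a call. The function names are ours, purely for illustration, and are not taken from Google's paper.

```python
# Back-of-the-envelope conversion of the reported bandwidth range.
# Assumption: the 30-100 figures are megabits per second (Mbps)
# and the bit rate is constant for the duration of the call.

BITS_PER_BYTE = 8

# Four colour views plus three depth maps, as described in the article.
VIDEO_STREAMS = 4 + 3


def mbps_to_megabytes_per_second(mbps: float) -> float:
    """Convert a bit rate in megabits/s to megabytes/s."""
    return mbps / BITS_PER_BYTE


def gigabytes_per_minute(mbps: float) -> float:
    """Data sent in one minute at a constant bit rate, in decimal GB."""
    megabytes = mbps_to_megabytes_per_second(mbps) * 60
    return megabytes / 1000


if __name__ == "__main__":
    print(f"Video streams: {VIDEO_STREAMS}")
    for rate in (30, 100):
        print(
            f"{rate} Mbps ~ {mbps_to_megabytes_per_second(rate):.2f} MB/s, "
            f"{gigabytes_per_minute(rate):.3f} GB per minute"
        )
```

Under these assumptions, even the low end of the range is far above what a typical 2D video call consumes, which helps explain why compression is counted among the key breakthroughs.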

The system also relies on four high-end Nvidia graphics cards to encode and decode all the data, with latency averaging around 105.8 milliseconds. You could say it would be just as lifelike as a 2D video call over a broadband connection in India. Google has already tested the system across three of its sites, drawing good reactions from the attendees, the paper says.
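One way to get a feel for that latency figure is to express it in display frames. The short Python sketch below does the arithmetic using the 60Hz refresh rate and the 105.8 millisecond average quoted in this article. The helper names are hypothetical, and the result is only a rough illustration, not a measurement from the paper.

```python
# Express the reported average latency as a number of display frames.
# Both constants come from the figures quoted in the article.

REFRESH_HZ = 60      # refresh rate of the 8K panel
LATENCY_MS = 105.8   # reported average latency in milliseconds


def frame_time_ms(refresh_hz: float) -> float:
    """Duration of a single display frame in milliseconds."""
    return 1000.0 / refresh_hz


def latency_in_frames(latency_ms: float, refresh_hz: float) -> float:
    """How many display frames elapse during the given latency."""
    return latency_ms / frame_time_ms(refresh_hz)


if __name__ == "__main__":
    frames = latency_in_frames(LATENCY_MS, REFRESH_HZ)
    print(f"One frame at {REFRESH_HZ}Hz lasts {frame_time_ms(REFRESH_HZ):.2f} ms")
    print(f"{LATENCY_MS} ms of latency is about {frames:.1f} frames")
```

In other words, what you see is a little over six frames behind what the other person is actually doing.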

The company claims that as many as 117 participants have held 308 meetings over nine months, with each meeting averaging about 35 minutes.
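Those numbers are easier to appreciate when added up. The quick Python sketch below totals the meeting time using only the figures quoted above. It treats the 35-minute average as exact, so the totals are approximate.

```python
# Rough totals from the trial figures quoted in the article.
# Assumption: the 35-minute average is treated as exact.

MEETINGS = 308
AVG_MINUTES = 35
MONTHS = 9

total_minutes = MEETINGS * AVG_MINUTES
total_hours = total_minutes / 60
meetings_per_month = MEETINGS / MONTHS

if __name__ == "__main__":
    print(f"Total meeting time: {total_minutes} minutes (~{total_hours:.0f} hours)")
    print(f"Average meetings per month: {meetings_per_month:.1f}")
```

That works out to roughly 180 hours of 3D calls, or about 34 meetings a month across the three sites.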

For now, the company has shared no details on whether the new system will see the light of day soon or how it plans to commercialize it. Nor can one expect cost figures to leak out at this juncture, given that Google itself may be using internal resources for Project Starline.

Raj Narayan

A media veteran who turned a gadget lover fairly recently. An early adopter of Apple products, Raj has an insatiable curiosity for facts and figures which he puts to use in research. He engages in active sport and retreats to his farm during his spare time.