The graphics wizards at NVIDIA have figured out how we can all have high-quality videoconferences even with crappy bandwidth.
The standard method of videoconferencing uses a camera capturing pixels that must be transmitted over the connection. For every second we speak on camera, moving our faces, millions of pixels must be sent. As the system chokes on all of the pixels, the image quality is dialed down.
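To put a rough number on "millions of pixels," here is a back-of-the-envelope sketch. The 1280x720 resolution, 30 frames per second, and 24 bits per pixel are assumptions chosen for illustration, not figures from NVIDIA. Real videoconferencing apps compress the stream heavily, so the raw numbers only show the scale of the problem.

```python
def raw_pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of pixels a camera produces each second, before any compression."""
    return width * height * fps


def raw_bits_per_second(width: int, height: int, fps: int, bits_per_pixel: int = 24) -> int:
    """Uncompressed bit rate, assuming a fixed number of bits per pixel."""
    return raw_pixels_per_second(width, height, fps) * bits_per_pixel


# Assumed example stream: 720p at 30 frames per second.
pixels = raw_pixels_per_second(1280, 720, 30)
bits = raw_bits_per_second(1280, 720, 30)
print(pixels)  # 27648000
print(bits)    # 663552000
```

Even at this modest assumed resolution, the camera generates tens of millions of pixels every second, which is why the image gets degraded when the connection can't keep up.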
NVIDIA's system, Maxine, does not work by transmitting pixels. Instead, the videoconference starts with a keyframe: a still image of the speaker's face. Then, as the speaker begins talking and moving their face, Maxine's AI-powered software captures only facial keypoints and transmits those over the network. Software on the receiving side then interprets those keypoints and re-renders the speaker's face accordingly.
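The keyframe-then-keypoints idea can be sketched in a few lines. This is not Maxine's actual code or API; the keypoint count (68), the 32-bit float encoding, and every name below are hypothetical, picked only to show how small a per-frame payload can be. The AI re-rendering step is left as a placeholder.

```python
import struct
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float]  # (x, y) position of one facial landmark

NUM_KEYPOINTS = 68  # assumption for illustration; the real count is not stated


def encode_keypoints(points: List[Keypoint]) -> bytes:
    """Pack (x, y) pairs into little-endian 32-bit floats for sending."""
    flat = [coord for point in points for coord in point]
    return struct.pack(f"<{len(flat)}f", *flat)


def decode_keypoints(payload: bytes) -> List[Keypoint]:
    """Unpack a payload produced by encode_keypoints back into (x, y) pairs."""
    count = len(payload) // 4  # 4 bytes per 32-bit float
    flat = struct.unpack(f"<{count}f", payload)
    return list(zip(flat[0::2], flat[1::2]))


class Receiver:
    """Holds the one-time keyframe and the most recent keypoints."""

    def __init__(self, keyframe: bytes) -> None:
        self.keyframe = keyframe  # still image, sent once at the start of the call
        self.latest: Optional[List[Keypoint]] = None

    def on_packet(self, payload: bytes) -> List[Keypoint]:
        # Placeholder: a real receiver would feed the keyframe plus these
        # keypoints to a generative model that re-renders the face.
        self.latest = decode_keypoints(payload)
        return self.latest


# Sender side: one frame's worth of (made-up) keypoints.
frame_points = [(i * 0.5, i * 0.25) for i in range(NUM_KEYPOINTS)]
packet = encode_keypoints(frame_points)
print(len(packet))  # 544
```

Under these assumptions, each frame costs 544 bytes on the wire after the initial keyframe, compared with millions of bytes for an uncompressed frame of pixels.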
It's quite clever, and the difference is very noticeable:
Here's what it looks like on video. Note that the software can even change the angle of your gaze:
NVIDIA refers to Maxine as AI video compression. You can learn more about it here.