Dnext

May 8, 2024 2:51am

Vidu is a Chinese video generation AI that is rumored to be competitive with OpenAI's Sora (neither is available to the public yet). It's a collaboration between Tsinghua University in Beijing and a company called Shengshu Technology.

"Vidu is capable of producing 16-second clips at 1080p resolution -- Sora by comparison can generate 60-second videos. Vidu is based on a Universal Vision Transformer (U-ViT) architecture, which the company says allows it to simulate the real physical world with multi-camera view generation. This architecture was reportedly developed by the Shengshu Technology team in September 2022 and as such would predate the diffusion transformer (DiT) architecture used by Sora."

"According to the company, Vidu can generate videos with complex scenes adhering to real-world physics, such as realistic lighting and shadows, and detailed facial expressions. The model also demonstrates a rich imagination, creating non-existent, surreal content with depth and complexity. Vidu's multi-camera capabilities allows for the generation of dynamic shots, seamlessly transitioning between long shots, close-ups, and medium shots within a single scene."

"A side-by-side comparison with Sora reveals that the generated videos are not at Sora's level of realism."

Meet Vidu, A New Chinese Text to Video AI Model - Maginative

#solidstatelife #ai #genai #computervision #videogeneration