
GOOGLE SHOWS OFF LUMIERE, A SPACE-TIME DIFFUSION MODEL FOR REALISTIC AI VIDEOS 

Shubham Sharma @mr_bumss
January 24, 2024 12:57 PM

Lumiere
Image Credit: Lumiere GitHub

As more and more enterprises double down on generative AI, research groups are racing to build more capable offerings for them. Case in point: Lumiere, a space-time diffusion model proposed by researchers from Google, the Weizmann Institute of Science and Tel Aviv University for realistic video generation.

The paper detailing the technology has just been published, although the models remain unavailable to test. If that changes, Google could introduce a very strong player into the AI video space, which is currently dominated by the likes of Runway, Pika and Stability AI.


The researchers claim the model takes a different approach from existing players
and synthesizes videos that portray realistic, diverse and coherent motion – a
pivotal challenge in video synthesis.


WHAT CAN LUMIERE DO?

At its core, Lumiere (French for "light") is a video diffusion model that lets users generate realistic and stylized videos, with options to edit them on command.

Users can give text inputs describing what they want in natural language, and the model generates a video portraying it. Users can also upload an existing still image and add a prompt to transform it into a dynamic video. The model also supports additional features such as inpainting, which edits masked regions of a video according to text prompts; cinemagraphs, which add motion to specific parts of an otherwise still scene; and stylized generation, which takes the reference style from one image and generates videos in that style.
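
Lumiere has no public API at the time of writing, but as a purely hypothetical sketch, the five task types described above map onto request shapes along these lines (every class and field name here is invented for illustration, in Python):

# Purely illustrative: Lumiere has no public interface at the time of writing.
# Every class and field name below is invented to summarize the task types
# described in the paper, not to document any real API.
from dataclasses import dataclass
from typing import Optional

@dataclass
class LumiereRequest:
    task: str                                # one of the five task types below
    prompt: str                              # natural-language description
    image_path: Optional[str] = None         # conditioning still image, if any
    mask_path: Optional[str] = None          # region to edit or animate
    style_image_path: Optional[str] = None   # reference image for stylized generation

# Text-to-video: prompt only.
text_to_video = LumiereRequest(task="text_to_video", prompt="a bear playing a guitar")

# Image-to-video: animate an uploaded still image according to the prompt.
image_to_video = LumiereRequest(task="image_to_video", prompt="waves rolling in",
                                image_path="beach.png")

# Video inpainting: edit a masked region with a text prompt.
inpainting = LumiereRequest(task="inpainting", prompt="replace with a red balloon",
                            image_path="frame.png", mask_path="mask.png")

# Cinemagraph: add motion only inside the masked part of a still scene.
cinemagraph = LumiereRequest(task="cinemagraph", prompt="steam rising",
                             image_path="coffee.png", mask_path="steam_region.png")

# Stylized generation: borrow the visual style of a reference image.
stylized = LumiereRequest(task="stylized_generation", prompt="a dancing fox",
                          style_image_path="watercolor.png")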

“We demonstrate state-of-the-art text-to-video generation results, and show that
our design easily facilitates a wide range of content creation tasks and video
editing applications, including image-to-video, video inpainting, and stylized
generation,” the researchers noted in the paper.




While these capabilities are not new to the industry, having already been offered by players like Runway and Pika, the authors claim that most existing models tackle the added temporal dimension of video generation with a cascaded approach. First, a base model generates distant keyframes, and then subsequent temporal super-resolution (TSR) models generate the missing frames between them in non-overlapping segments. This works, but it makes temporal consistency difficult to achieve, often leading to restrictions on video duration, overall visual quality, and the degree of realistic motion these models can generate.
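
As a toy sketch of that cascaded pipeline (not the actual models, which are large diffusion networks), keyframe generation followed by independent per-segment temporal super-resolution can be pictured like this, with the segment boundaries being exactly where temporal consistency tends to break down:

# Toy illustration of the cascaded keyframe + TSR pipeline described above.
# Real systems use large diffusion models; here keyframes are random tensors
# and "temporal super-resolution" is plain linear interpolation, just to show
# that each segment is filled in independently of its neighbors.
import numpy as np

def generate_keyframes(num_keyframes: int, height: int, width: int) -> np.ndarray:
    # Stand-in for the base model: distant keyframes only.
    rng = np.random.default_rng(0)
    return rng.random((num_keyframes, height, width, 3))

def temporal_super_resolution(start: np.ndarray, end: np.ndarray, factor: int) -> np.ndarray:
    # Stand-in for one TSR call: fill the frames between two keyframes.
    # Each segment is processed in isolation (non-overlapping), which is the
    # source of the consistency problem the paper points out.
    alphas = np.linspace(0.0, 1.0, factor, endpoint=False)
    return np.stack([(1 - a) * start + a * end for a in alphas])

def cascaded_pipeline(num_keyframes: int = 6, factor: int = 16) -> np.ndarray:
    keys = generate_keyframes(num_keyframes, 64, 64)
    segments = [temporal_super_resolution(keys[i], keys[i + 1], factor)
                for i in range(num_keyframes - 1)]
    return np.concatenate(segments + [keys[-1:]])

video = cascaded_pipeline()
print(video.shape)  # (81, 64, 64, 3): 5 segments of 16 frames + final keyframe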

Lumiere addresses this gap by using a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, in a single pass through the model, leading to more realistic and coherent motion.

“By deploying both spatial and (importantly) temporal down- and up-sampling and
leveraging a pre-trained text-to-image diffusion model, our model learns to
directly generate a full-frame-rate, low-resolution video by processing it in
multiple space-time scales,” the researchers noted in the paper.
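
A minimal sketch of that space-time down- and up-sampling idea, assuming a PyTorch-style (batch, channels, time, height, width) tensor layout; this is a toy module illustrating the principle, not the paper's actual Space-Time U-Net:

# Toy sketch of joint spatial *and* temporal down-/up-sampling, the key idea
# the paper highlights. It only shows how a network can process a whole clip
# at a coarse space-time scale and then restore the full frame rate and
# resolution in a single forward pass.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySpaceTimeUNet(nn.Module):
    def __init__(self, channels: int = 3, hidden: int = 32):
        super().__init__()
        self.encode = nn.Conv3d(channels, hidden, kernel_size=3, padding=1)
        # Downsample time, height and width together (factor 2 each).
        self.down = nn.MaxPool3d(kernel_size=2)
        self.bottleneck = nn.Conv3d(hidden, hidden, kernel_size=3, padding=1)
        self.decode = nn.Conv3d(hidden, channels, kernel_size=3, padding=1)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (batch, channels, time, height, width)
        h = F.relu(self.encode(video))
        h = self.down(h)                      # coarse space-time scale
        h = F.relu(self.bottleneck(h))
        # Upsample back to the full frame rate and resolution at once.
        h = F.interpolate(h, size=video.shape[2:], mode="trilinear",
                          align_corners=False)
        return self.decode(h)

clip = torch.randn(1, 3, 80, 64, 64)   # 80 frames, as in the paper
out = ToySpaceTimeUNet()(clip)
print(out.shape)  # torch.Size([1, 3, 80, 64, 64])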



The video model was trained on a dataset of 30 million videos, along with their text captions, and is capable of generating 80 frames at 16 fps, which works out to five seconds of video (80 frames / 16 frames per second = 5 seconds). The source of this data, however, remains unclear at this stage.


PERFORMANCE AGAINST KNOWN AI VIDEO MODELS

When comparing the model with offerings from Pika, Runway, and Stability AI, the
researchers noted that while these models produced high per-frame visual
quality, their four-second-long outputs had very limited motion, leading to
near-static clips at times. Imagen Video, another player in the category, produced reasonable motion but lagged in terms of quality.

“In contrast, our method produces 5-second videos that have higher motion
magnitude while maintaining temporal consistency and overall quality,” the
researchers wrote. They said users surveyed on the quality of these models also
preferred Lumiere over the competition for text and image-to-video generation.



While this could be the beginning of something new in the rapidly moving AI video market, it is important to note that Lumiere is not yet available to test. The researchers also note that the model has certain limitations: it cannot generate videos consisting of multiple shots, or videos involving transitions between scenes, which remains an open challenge for future research.

VentureBeat's mission is to be a digital town square for technical
decision-makers to gain knowledge about transformative enterprise technology and
transact. Discover our Briefings.



