MICROSOFT’S LATEST MODEL PUSHES THE BAR IN AI VIDEO WITH TRAJECTORY-BASED
GENERATION

Shubham Sharma @mr_bumss
January 9, 2024 11:09 AM

Image Credit: Venturebeat made with Ideogram


AI companies are racing to master video generation. Over the last few months,
several players in the space, including Stability AI and Pika Labs, have
released models that generate videos from text and image prompts. Building on
that work, Microsoft AI has released a model that aims to deliver more granular
control over how a video is produced.

Dubbed DragNUWA, the project supplements the established approaches of text- and
image-based prompting with trajectory-based generation. Users can move objects,
or entire video frames, along specific trajectories. This offers an easy way to
control video generation across semantic, spatial and temporal aspects while
maintaining high-quality output.

Microsoft has open-sourced the model weights and demo for the project, allowing
the community to try it out. However, it is important to note that this is still
a research effort and remains far from perfect.


WHAT MAKES MICROSOFT DRAGNUWA UNIQUE?

Historically, AI-driven video generation has revolved around text, image or
trajectory-based inputs. Each approach has produced good results, but each has
struggled to deliver fine-grained control over the desired output.



The combination of text and images alone, for instance, fails to convey the
intricate motion details present in a video. Meanwhile, images and trajectories
may not adequately represent future objects, and trajectories and language can
result in ambiguity when expressing abstract concepts. An example would be
failing to differentiate between a real-world fish and a painting of a fish.

To work around this, in August 2023, Microsoft’s AI team proposed DragNUWA, an
open-domain diffusion-based video generation model that brought together all
three factors – images, text and trajectory – to facilitate highly controllable
video generation from semantic, spatial and temporal aspects. This allows the
user to strictly define the desired text, image and trajectory in the input to
control aspects like camera movements, including zoom-in or zoom-out effects, or
object motion in the output video.

For instance, one could upload the image of a boat in a body of water and add a
text prompt “a boat sailing in the lake” as well as directions marking the
boat’s trajectory. This would result in a video of the boat sailing in the
marked direction, giving the desired outcome. The trajectory provides motion
details, language gives details of future objects and images add the distinction
between objects.
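
To make the boat example concrete, here is a minimal Python sketch of how the three inputs might be bundled and how a hand-drawn drag path could be turned into one target position per video frame. It is an illustration only, not DragNUWA's implementation or API: the `GenerationRequest` class, the `resample` helper, the file name `boat.png` and the pixel coordinates are all hypothetical, and the even spacing by arc length is an assumption made for simplicity.

```python
from dataclasses import dataclass
from math import hypot

Point = tuple[float, float]


@dataclass
class GenerationRequest:
    """Hypothetical bundle of the three controls described in the article."""
    image_path: str          # the starting image (spatial control)
    prompt: str              # the text prompt (semantic control)
    trajectory: list[Point]  # the drag path in pixel coordinates (temporal control)


def resample(points: list[Point], num_frames: int) -> list[Point]:
    """Return num_frames positions spaced evenly by arc length along the path."""
    if len(points) < 2 or num_frames < 2:
        raise ValueError("need at least two points and two frames")
    # Length of each straight segment between consecutive drag points.
    seg_lengths = [
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    total = sum(seg_lengths)
    if total == 0:
        return [points[0]] * num_frames  # no motion requested
    positions = []
    seg, walked = 0, 0.0  # current segment and path length before it
    for i in range(num_frames):
        target = total * i / (num_frames - 1)
        # Advance to the segment that contains the target distance.
        while seg < len(seg_lengths) - 1 and walked + seg_lengths[seg] < target:
            walked += seg_lengths[seg]
            seg += 1
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        frac = (target - walked) / seg_lengths[seg] if seg_lengths[seg] else 0.0
        positions.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return positions


request = GenerationRequest(
    image_path="boat.png",                    # hypothetical file
    prompt="a boat sailing in the lake",
    trajectory=[(0.0, 0.0), (10.0, 0.0)],     # hypothetical drag path
)
per_frame = resample(request.trajectory, num_frames=3)
```

In this sketch, a longer drag path spread over the same number of frames produces larger steps between consecutive positions, which is one simple way to picture how a longer trajectory would ask for more motion.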

DragNUWA in action



RELEASED ON HUGGING FACE

In the early 1.5 version of DragNUWA, which has just been released on Hugging
Face, Microsoft has tapped Stability AI’s Stable Video Diffusion model to
animate an image or an object within it along a specific path. Once matured,
this technology could make video generation and editing a piece of cake. Imagine
being able to transform backgrounds, animate images and direct motion paths just
by drawing a line here or there.



AI enthusiasts are excited about the development, with many calling it a big
leap in creative AI. However, it remains to be seen how the research model
performs in the real world. In its tests, Microsoft claimed that the model was
able to achieve accurate camera movements and object motions with different drag
trajectories.

“Firstly, DragNUWA supports complex curved trajectories, enabling the generation
of objects moving along the specific intricate trajectory. Secondly, DragNUWA
allows for variable trajectory lengths, with longer trajectories resulting in
larger motion amplitudes. Lastly, DragNUWA has the capability to simultaneously
control the trajectories of multiple objects. To the best of our knowledge, no
existing video generation model has effectively achieved such trajectory
controllability, highlighting DragNUWA’s substantial potential to advance
controllable video generation in future applications,” the company researchers
noted in the paper.

The work adds to the growing mountain of research in the AI video space. Just
recently, Pika Labs made headlines by opening access to its text-to-video
interface that works just like ChatGPT and produces high-quality short videos
with a range of customizations on offer.

© 2024 VentureBeat. All rights reserved.