venturebeat.com Open in urlscan Pro
192.0.66.2  Public Scan

URL: https://venturebeat.com/ai/apple-researchers-achieve-breakthroughs-in-multimodal-ai-as-company-ramps-up-investments/
Submission: On March 24 via manual from SG — Scanned from SG

Text Content


APPLE RESEARCHERS ACHIEVE BREAKTHROUGHS IN MULTIMODAL AI AS COMPANY RAMPS UP
INVESTMENTS

Michael Nuñez @MichaelFNunez
March 15, 2024 1:31 PM

Credit: VentureBeat made with Midjourney



Apple researchers have developed new methods for training large language models
on both text and images, enabling more powerful and flexible AI systems, in what
could be a significant advance for artificial intelligence and for future Apple
products.

The work, described in a research paper titled “MM1: Methods, Analysis &
Insights from Multimodal LLM Pre-training” that was quietly posted to arxiv.org
this week, demonstrates how carefully combining different types of training data
and model architectures can lead to state-of-the-art performance on a range of
AI benchmarks.


“We demonstrate that for large-scale multimodal pre-training using a careful mix
of image-caption, interleaved image-text, and text-only data is crucial for
achieving state-of-the-art few-shot results across multiple benchmarks,” the
researchers explain. By training models on a diverse dataset spanning visual and
linguistic information, the MM1 models were able to excel at tasks like image
captioning, visual question answering, and natural language inference.
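The "careful mix" of data sources the researchers describe can be pictured as weighted sampling over three pools. The sketch below is illustrative only: the weights in `MIX_WEIGHTS` and the helper `sample_sources` are hypothetical, since the article does not report the ratios the MM1 team used.

```python
import random
from collections import Counter

# Hypothetical mixing weights, for illustration only.
# The article names the three data types but not their proportions.
MIX_WEIGHTS = {
    "image_caption": 0.4,
    "interleaved_image_text": 0.4,
    "text_only": 0.2,
}


def sample_sources(weights, n, seed=0):
    """Draw n data-source names with probability proportional to their weight."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)


# Decide which pool each of 1,000 training examples would come from.
batch = sample_sources(MIX_WEIGHTS, 1000)
counts = Counter(batch)
print(counts)
```

Changing the weights shifts how often each data type appears in a batch, which is the knob the quoted finding is about.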


SCALING VISUAL COMPONENTS IS KEY

The researchers also found that the choice of image encoder and the resolution
of input images had a major impact on model performance. “We show that the image
encoder together with image resolution and the image token count has substantial
impact, while the vision-language connector design is of comparatively
negligible importance,” they said. This suggests that continued scaling and
refinement of the visual components of these multimodal models will be key to
unlocking further gains.
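To see why resolution and image token count move together, consider a Vision Transformer-style encoder that cuts a square image into non-overlapping patches, one token per patch. This is a rough sketch under that assumption; the article does not say which encoder or patch size MM1 uses, the sizes below are illustrative, and a connector may further reduce the token count.

```python
def patch_token_count(image_size, patch_size):
    """Tokens produced when a square image is split into square patches.

    Assumes a ViT-style encoder with non-overlapping patches (an assumption,
    not a detail taken from the article).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Illustrative resolutions with a hypothetical 14-pixel patch.
for size in (224, 336, 448):
    print(size, patch_token_count(size, 14))
```

Doubling the side length from 224 to 448 pixels quadruples the number of tokens (256 to 1,024), so higher resolution gives the language model more visual detail at a higher compute cost.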


Surprisingly, the largest 30-billion-parameter MM1 model exhibited strong
in-context learning abilities, allowing it to perform multi-step reasoning over
multiple input images using few-shot “chain-of-thought” prompting. This points
to the potential for large multimodal models to tackle complex, open-ended
problems that require grounded language understanding and generation.


APPLE’S BILLION-DOLLAR AI BET

The MM1 research comes as Apple has been ramping up its investments in
artificial intelligence in an effort to catch up with rivals like Google,
Microsoft, and Amazon, which have raced ahead in integrating generative AI
capabilities into their products. The company is on track to spend $1 billion
per year on AI development, according to a recent Bloomberg report.

Sources say Apple is working on a large language model framework called “Ajax”
as well as a chatbot known internally as “Apple GPT.” The goal is to integrate
these technologies into Siri, Messages, Apple Music and other apps and services.
For example, AI could be used to auto-generate personalized playlists, assist
developers in writing code, or engage in open-ended conversation and task
completion.

“We view AI and machine learning as fundamental technologies, and they’re
integral to virtually every product that we ship,” Apple CEO Tim Cook said
during a recent earnings call. “I’m not going to get into details about what it
is, because — as you know, we don’t — we really don’t do that. But you can bet
that we’re investing, we’re investing quite a bit, we’re going to do it
responsibly and it will — you will see product advancements over time that where
the — those technologies are at the heart of them.”


THE HIGH STAKES OF THE AI ARMS RACE

Apple has a history of being a fast follower rather than a first mover when it
comes to major technology shifts. But with AI poised to transform every aspect
of the digital landscape, the stakes are high for the iPhone maker to stay
competitive. The MM1 research shows that Apple has the talent and resources to
make cutting-edge advances. But it remains to be seen if the notoriously
secretive company can move quickly enough to keep pace in the escalating AI arms
race.

Many eyes will be on Apple’s Worldwide Developers Conference in June, where the
company is expected to unveil new AI-powered features and developer tools. In
the meantime, smaller AI advances like the Keyframer animation tool and
performance enhancements coming out of Apple’s research labs show steady
progress is being made behind the scenes. 

As Cook hinted during a recent earnings call: “We’re excited to share details of
our ongoing work in AI later this year.” That work, it is now clear, includes
ambitious efforts to master multimodal intelligence at the largest scales. The
age of pervasively helpful and human-like AI may arrive sooner than we think —
and Apple intends to play a major part in shaping it.

© 2024 VentureBeat. All rights reserved.