


WHY SMALL LANGUAGE MODELS ARE THE NEXT BIG THING IN AI

James Thomason@jathomason
April 12, 2024 12:06 PM

Credit: VentureBeat using Midjourney


In the AI wars, where tech giants have been racing to build ever-larger language
models, a surprising new trend is emerging: small is the new big. As progress in
large language models (LLMs) shows some signs of plateauing, researchers and
developers are increasingly turning their attention to small language models
(SLMs). These compact, efficient and highly adaptable AI models are challenging
the notion that bigger is always better, promising to change the way we approach
AI development.


ARE LLMS STARTING TO PLATEAU?

Recent performance comparisons published by Vellum and HuggingFace suggest that
the performance gap between LLMs is quickly narrowing. This trend is
particularly evident in specific tasks like multi-choice questions, reasoning
and math problems, where the performance differences between the top models are
minimal. For instance, in multi-choice questions, Claude 3 Opus, GPT-4 and
Gemini Ultra all score above 83%, while in reasoning tasks, Claude 3 Opus,
GPT-4, and Gemini 1.5 Pro exceed 92% accuracy.

Interestingly, even smaller models like Mixtral 8x7B and Llama 2 70B are
showing promising results in certain areas, such as reasoning and multi-choice
questions, where they outperform some of their larger counterparts. This
suggests that the size of the model may not be the sole determining factor in
performance and that other aspects like architecture, training data, and
fine-tuning techniques could play a significant role.

The latest research papers announcing new LLMs all point in the same direction:
“If you just look empirically, the last dozen or so articles that come out,
they’re kind of all in the same general territory as GPT-4,” says Gary Marcus,
the former head of Uber AI and author of “Rebooting AI,” a book about building
trustworthy AI. Marcus spoke with VentureBeat on Thursday. 



“Some of them are a little better than GPT-4, but there’s no quantum leap. I
think everybody would say that GPT-4 is a quantum step ahead of GPT-3.5. There
hasn’t been any [quantum leap] in over a year,” said Marcus. 

As the performance gap continues to close and more models demonstrate
competitive results, it raises the question of whether LLMs are indeed starting
to plateau. If this trend persists, it could have significant implications for
the future development and deployment of language models, potentially shifting
the focus from simply increasing model size to exploring more efficient and
specialized architectures.


DRAWBACKS OF THE LLM APPROACH

LLMs, while undeniably powerful, come with significant drawbacks. First,
training LLMs requires enormous amounts of data to fit their billions or even
trillions of parameters. This makes the training process extremely
resource-intensive, and the computational power and energy consumption required
to train and run LLMs are staggering. The resulting costs make it difficult for
smaller organizations or individuals to engage in core LLM development. At an
MIT event last year, OpenAI CEO Sam Altman stated the cost of training GPT-4
was at least $100 million.
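To make the scale concrete, a back-of-envelope calculation (using illustrative figures, not vendor-published specs) shows why weight storage alone puts frontier-scale models out of reach of consumer hardware:

```python
# Back-of-envelope sketch: memory needed just to hold a model's weights at
# inference time. The parameter counts and precisions below are illustrative
# assumptions, not published specifications for any particular model.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# A hypothetical 70B-parameter LLM stored in 16-bit precision (2 bytes/param):
llm_gb = weight_memory_gb(70e9, 2.0)   # ~140 GB -- needs multiple datacenter GPUs

# A hypothetical 2B-parameter SLM, 4-bit quantized (0.5 bytes/param):
slm_gb = weight_memory_gb(2e9, 0.5)    # ~1 GB -- fits comfortably on a laptop

print(f"70B model @ fp16: ~{llm_gb:.0f} GB")
print(f"2B model @ 4-bit: ~{slm_gb:.0f} GB")
```

Actual serving also needs memory for activations and the KV cache, so real requirements are higher still; the gap between the two classes of model only widens.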

The complexity of tools and techniques required to work with LLMs also presents
a steep learning curve for developers, further limiting accessibility. There is
a long cycle time for developers, from training to building and deploying
models, which slows down development and experimentation. A recent paper from
the University of Cambridge shows companies can spend 90 days or longer
deploying a single machine learning (ML) model.  


Another significant issue with LLMs is their propensity for hallucinations –
generating outputs that seem plausible but are not actually true or factual.
This stems from the way LLMs are trained to predict the next most likely word
based on patterns in the training data, rather than having a true understanding
of the information. As a result, LLMs can confidently produce false statements,
make up facts or combine unrelated concepts in nonsensical ways. Detecting and
mitigating these hallucinations is an ongoing challenge in the development of
reliable and trustworthy language models.

“If you’re using this for a high-stakes problem, you don’t want to insult your
customer, or get bad medical information, or use it to drive a car and take
risks there. That’s still a problem,” cautions Marcus.

The scale and black-box nature of LLMs can also make them challenging to
interpret and debug, which is crucial for building trust in the model’s outputs.
Bias in the training data and algorithms can lead to unfair, inaccurate or even
harmful outputs. As seen with Google Gemini, techniques to make LLMs “safe” and
reliable can also reduce their effectiveness. Additionally, the centralized
nature of LLMs raises concerns about the concentration of power and control in
the hands of a few large tech companies.


ENTER SMALL LANGUAGE MODELS (SLMS)


Enter small language models. SLMs are more streamlined versions of LLMs, with
fewer parameters and simpler designs. They require less data and training
time—think minutes or a few hours, as opposed to days for LLMs. This makes SLMs
more efficient and straightforward to implement on-site or on smaller devices. 

One of the key advantages of SLMs is their suitability for specific
applications. Because they have a more focused scope and require less data, they
can be fine-tuned for particular domains or tasks more easily than large,
general-purpose models. This customization enables companies to create SLMs that
are highly effective for their specific needs, such as sentiment analysis, named
entity recognition, or domain-specific question answering. The specialized
nature of SLMs can lead to improved performance and efficiency in these targeted
applications compared to using a more general model.
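The specialization argument can be illustrated with a deliberately tiny toy: a bag-of-words sentiment scorer standing in for the idea of a small, narrow model. This is only a sketch of the principle; a production SLM would be a compact transformer fine-tuned on labeled in-domain text (for example, with HuggingFace's Trainer API), not a word-count model.

```python
# Toy illustration only: a minimal bag-of-words sentiment scorer. It stands in
# for the idea that a small model trained on narrow, in-domain data can handle
# a focused task; it is NOT how real SLMs are built.
from collections import Counter

def train(examples):
    """Count word frequencies per label from (text, label) pairs."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Label by which class's vocabulary overlaps the input more."""
    words = text.lower().split()
    pos_score = sum(counts["pos"][w] for w in words)
    neg_score = sum(counts["neg"][w] for w in words)
    return "pos" if pos_score >= neg_score else "neg"

# A handful of hypothetical in-domain product reviews:
model = train([
    ("great product works well", "pos"),
    ("love the battery life", "pos"),
    ("terrible support very slow", "neg"),
    ("broken on arrival", "neg"),
])

print(predict(model, "battery works great"))  # pos
print(predict(model, "slow and broken"))      # neg
```

The point of the toy is the trade-off it makes visible: inside its narrow domain the model is cheap, auditable, and predictable; outside it, it simply has no opinion worth trusting, which is exactly the scoping decision SLM adopters make deliberately.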


Another benefit of SLMs is their potential for enhanced privacy and security.
With a smaller codebase and simpler architecture, SLMs are easier to audit and
less likely to have unintended vulnerabilities. This makes them attractive for
applications that handle sensitive data, such as in healthcare or finance, where
data breaches could have severe consequences. Additionally, the reduced
computational requirements of SLMs make them more feasible to run locally on
devices or on-premises servers, rather than relying on cloud infrastructure.
This local processing can further improve data security and reduce the risk of
exposure during data transfer.

SLMs are also less prone to undetected hallucinations within their specific
domain compared to LLMs. SLMs are typically trained on a narrower and more
targeted dataset that is specific to their intended domain or application, which
helps the model learn the patterns, vocabulary and information that are most
relevant to its task. This focus reduces the likelihood of generating
irrelevant, unexpected or inconsistent outputs. With fewer parameters and a more
streamlined architecture, SLMs are less prone to capturing and amplifying noise
or errors in the training data. 

Clem Delangue, CEO of the AI startup HuggingFace, suggested that up to 99% of
use cases could be addressed using SLMs, and predicted 2024 will be the year of
the SLM. HuggingFace, whose platform enables developers to build, train and
deploy machine learning models, announced a strategic partnership with Google
earlier this year. The companies have subsequently integrated HuggingFace into
Google’s Vertex AI, allowing developers to quickly deploy thousands of models
through the Google Vertex Model Garden. 


GEMMA SOME LOVE, GOOGLE


After initially ceding its early advantage in LLMs to OpenAI, Google is
aggressively pursuing the SLM opportunity. In February, Google introduced
Gemma, a new series of small language models designed to be more efficient and
user-friendly. Like other SLMs, Gemma models can run on various everyday
devices, like smartphones, tablets or laptops, without needing special hardware
or extensive optimization.


Since its release, Gemma has been downloaded more than 400,000 times on
HuggingFace in the past month, and a few exciting projects are already
emerging. For example, Cerule is a powerful image and language model that
combines Gemma 2B with Google’s SigLIP, trained on a massive dataset of images
and text. Cerule leverages highly efficient data selection techniques, which
suggests it can achieve high performance without requiring an extensive amount
of data or computation. This means Cerule might be well-suited for emerging edge
computing use cases. 

Another example is CodeGemma, a specialized version of Gemma focused on coding
and mathematical reasoning. CodeGemma offers three different models tailored
for various coding-related activities, making advanced coding tools more
accessible and efficient for developers. 


THE TRANSFORMATIVE POTENTIAL OF SMALL LANGUAGE MODELS


As the AI community continues to explore the potential of small language models,
the advantages of faster development cycles, improved efficiency, and the
ability to tailor models to specific needs become increasingly apparent. SLMs
are poised to democratize AI access and drive innovation across industries by
enabling cost-effective and targeted solutions. The deployment of SLMs at the
edge opens up new possibilities for real-time, personalized, and secure
applications in various sectors, such as finance, entertainment, automotive
systems, education, e-commerce and healthcare.

By processing data locally and reducing reliance on cloud infrastructure, edge
computing with SLMs enables faster response times, improved data privacy, and
enhanced user experiences. This decentralized approach to AI has the potential
to transform the way businesses and consumers interact with technology, creating
more personalized and intuitive experiences in the real world. As LLMs face
challenges related to computational resources and potentially hit performance
plateaus, the rise of SLMs promises to keep the AI ecosystem evolving at an
impressive pace.



© 2024 VentureBeat. All rights reserved.