venturebeat.com
Effective URL: https://venturebeat.com/ai/why-small-language-models-are-the-next-big-thing-in-ai/
WHY SMALL LANGUAGE MODELS ARE THE NEXT BIG THING IN AI

James Thomason @jathomason
April 12, 2024 12:06 PM
Credit: VentureBeat using Midjourney

In the AI wars, where tech giants have been racing to build ever-larger language models, a surprising new trend is emerging: small is the new big. As progress in large language models (LLMs) shows some signs of plateauing, researchers and developers are increasingly turning their attention to small language models (SLMs). These compact, efficient and highly adaptable AI models are challenging the notion that bigger is always better, promising to change the way we approach AI development.

ARE LLMS STARTING TO PLATEAU?

Recent performance comparisons published by Vellum and HuggingFace suggest that the performance gap between LLMs is quickly narrowing. This trend is particularly evident in specific tasks like multi-choice questions, reasoning and math problems, where the performance differences between the top models are minimal. For instance, in multi-choice questions, Claude 3 Opus, GPT-4 and Gemini Ultra all score above 83%, while in reasoning tasks, Claude 3 Opus, GPT-4 and Gemini 1.5 Pro exceed 92% accuracy.
Interestingly, even smaller models like Mixtral 8x7B and Llama 2 70B are showing promising results in certain areas, such as reasoning and multi-choice questions, where they outperform some of their larger counterparts. This suggests that model size may not be the sole determining factor in performance, and that other aspects like architecture, training data and fine-tuning techniques could play a significant role.

The latest research papers announcing new LLMs all point in the same direction: "If you just look empirically, the last dozen or so articles that come out, they're kind of all in the same general territory as GPT-4," says Gary Marcus, the former head of Uber AI and author of "Rebooting AI," a book about building trustworthy AI. Marcus spoke with VentureBeat on Thursday.
"Some of them are a little better than GPT-4, but there's no quantum leap. I think everybody would say that GPT-4 is a quantum step ahead of GPT-3.5. There hasn't been any [quantum leap] in over a year," said Marcus.

As the performance gap continues to close and more models demonstrate competitive results, it raises the question of whether LLMs are indeed starting to plateau. If this trend persists, it could have significant implications for the future development and deployment of language models, potentially shifting the focus from simply increasing model size to exploring more efficient and specialized architectures.

DRAWBACKS OF THE LLM APPROACH

LLMs, while undeniably powerful, come with significant drawbacks. First, training LLMs requires enormous amounts of data and models with billions or even trillions of parameters. This makes the training process extremely resource-intensive, and the computational power and energy consumption required to train and run LLMs are staggering. The resulting high costs make it difficult for smaller organizations or individuals to engage in core LLM development. At an MIT event last year, OpenAI CEO Sam Altman stated the cost of training GPT-4 was at least $100 million.

The complexity of the tools and techniques required to work with LLMs also presents a steep learning curve for developers, further limiting accessibility. Developers face a long cycle time from training to building and deploying models, which slows down development and experimentation. A recent paper from the University of Cambridge shows companies can spend 90 days or longer deploying a single machine learning (ML) model.

Another significant issue with LLMs is their propensity for hallucinations – generating outputs that seem plausible but are not actually true or factual.
This stems from the way LLMs are trained to predict the next most likely word based on patterns in the training data, rather than having a true understanding of the information. As a result, LLMs can confidently produce false statements, make up facts or combine unrelated concepts in nonsensical ways. Detecting and mitigating these hallucinations is an ongoing challenge in the development of reliable and trustworthy language models.

"If you're using this for a high-stakes problem, you don't want to insult your customer, or get bad medical information, or use it to drive a car and take risks there. That's still a problem," cautions Marcus.

The scale and black-box nature of LLMs can also make them challenging to interpret and debug, which is crucial for building trust in the model's outputs. Bias in the training data and algorithms can lead to unfair, inaccurate or even harmful outputs. As seen with Google Gemini, techniques to make LLMs "safe" and reliable can also reduce their effectiveness. Additionally, the centralized nature of LLMs raises concerns about the concentration of power and control in the hands of a few large tech companies.

ENTER SMALL LANGUAGE MODELS (SLMS)

Enter small language models. SLMs are more streamlined versions of LLMs, with fewer parameters and simpler designs. They require less data and training time—think minutes or a few hours, as opposed to days for LLMs. This makes SLMs more efficient and straightforward to implement on-site or on smaller devices.

One of the key advantages of SLMs is their suitability for specific applications. Because they have a more focused scope and require less data, they can be fine-tuned for particular domains or tasks more easily than large, general-purpose models. This customization enables companies to create SLMs that are highly effective for their specific needs, such as sentiment analysis, named entity recognition or domain-specific question answering.
The specialized nature of SLMs can lead to improved performance and efficiency in these targeted applications compared to using a more general model.

Another benefit of SLMs is their potential for enhanced privacy and security. With a smaller codebase and simpler architecture, SLMs are easier to audit and less likely to have unintended vulnerabilities. This makes them attractive for applications that handle sensitive data, such as in healthcare or finance, where data breaches could have severe consequences. Additionally, the reduced computational requirements of SLMs make them more feasible to run locally on devices or on-premises servers, rather than relying on cloud infrastructure. This local processing can further improve data security and reduce the risk of exposure during data transfer.

SLMs are also less prone to undetected hallucinations within their specific domain compared to LLMs. SLMs are typically trained on a narrower, more targeted dataset that is specific to their intended domain or application, which helps the model learn the patterns, vocabulary and information most relevant to its task. This focus reduces the likelihood of generating irrelevant, unexpected or inconsistent outputs. With fewer parameters and a more streamlined architecture, SLMs are also less prone to capturing and amplifying noise or errors in the training data.

Clem Delangue, CEO of the AI startup HuggingFace, suggested that up to 99% of use cases could be addressed using SLMs, and predicted 2024 will be the year of the SLM. HuggingFace, whose platform enables developers to build, train and deploy machine learning models, announced a strategic partnership with Google earlier this year. The companies have subsequently integrated HuggingFace into Google's Vertex AI, allowing developers to quickly deploy thousands of models through the Google Vertex Model Garden.
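The resource gap behind the local-deployment argument above can be sketched with back-of-the-envelope arithmetic. The model sizes, the 1-trillion-token training budget, and the ~6 FLOPs-per-parameter-per-token rule of thumb below are illustrative assumptions, not figures from the article:

```python
def training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb training compute: ~6 FLOPs per parameter per training token."""
    return 6.0 * params * tokens

def fp16_weight_gb(params: float) -> float:
    """Approximate memory needed to hold the weights at fp16 (2 bytes per parameter)."""
    return 2.0 * params / 1e9

# Illustrative sizes: a hypothetical 175B-parameter LLM vs. a 2B-parameter SLM,
# each trained on an assumed 1 trillion tokens.
for name, params in [("LLM (175B)", 175e9), ("SLM (2B)", 2e9)]:
    print(f"{name}: ~{training_flops(params, 1e12):.1e} FLOPs to train, "
          f"~{fp16_weight_gb(params):.0f} GB of weights at fp16")
```

Under these assumptions the training-compute gap is nearly 90x, and 4 GB of fp16 weights fits in an ordinary laptop's memory while 350 GB does not — the practical force behind running SLMs on devices rather than in the cloud.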
GEMMA SOME LOVE, GOOGLE

After initially forfeiting its advantage in LLMs to OpenAI, Google is aggressively pursuing the SLM opportunity. In February, Google introduced Gemma, a new series of small language models designed to be more efficient and user-friendly. Like other SLMs, Gemma models can run on everyday devices such as smartphones, tablets or laptops, without needing special hardware or extensive optimization.

Since Gemma's release, the trained models have seen more than 400,000 downloads on HuggingFace in the past month, and a few exciting projects are already emerging. For example, Cerule is a powerful image and language model that combines Gemma 2B with Google's SigLIP, trained on a massive dataset of images and text. Cerule leverages highly efficient data selection techniques, which suggests it can achieve high performance without requiring an extensive amount of data or computation. This means Cerule might be well-suited for emerging edge computing use cases.

Another example is CodeGemma, a specialized version of Gemma focused on coding and mathematical reasoning. CodeGemma offers three different models tailored to various coding-related activities, making advanced coding tools more accessible and efficient for developers.

THE TRANSFORMATIVE POTENTIAL OF SMALL LANGUAGE MODELS

As the AI community continues to explore the potential of small language models, the advantages of faster development cycles, improved efficiency and the ability to tailor models to specific needs become increasingly apparent. SLMs are poised to democratize AI access and drive innovation across industries by enabling cost-effective, targeted solutions. The deployment of SLMs at the edge opens up new possibilities for real-time, personalized and secure applications in sectors such as finance, entertainment, automotive systems, education, e-commerce and healthcare.
By processing data locally and reducing reliance on cloud infrastructure, edge computing with SLMs enables faster response times, improved data privacy and enhanced user experiences. This decentralized approach to AI has the potential to transform the way businesses and consumers interact with technology, creating more personalized and intuitive experiences in the real world. As LLMs face challenges related to computational resources and potentially hit performance plateaus, the rise of SLMs promises to keep the AI ecosystem evolving at an impressive pace.

© 2024 VentureBeat. All rights reserved.