MIT RESEARCHERS DEVELOP SELF-LEARNING LANGUAGE MODELS THAT OUTPERFORM LARGER
COUNTERPARTS

Victor Dey
June 1, 2023 7:00 AM

Image credit: VentureBeat generated with Midjourney




Researchers at the MIT Computer Science and Artificial Intelligence Laboratory
(CSAIL) have achieved a groundbreaking advance in language modeling, in a field
dominated by large language models (LLMs).

The CSAIL team has pioneered an innovative approach to language modeling that
challenges the conventional belief that smaller models possess limited
capabilities. The research introduces a scalable, self-learning model that
outperforms counterparts up to 500 times its size on specific language
understanding tasks, all without relying on human-generated annotations.


The algorithm developed by the MIT team, named “SimPLE” (Simple Pseudo-Label
Editing), utilizes self-training, a technique that allows the model to learn
from its own predictions, thereby eliminating the need for additional annotated
training data. The method was devised to tackle a known weakness of
self-training: the inaccurate pseudo-labels a model can generate for itself.
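
To make the concept concrete, here is a minimal sketch of a generic
self-training loop with pseudo-labels. It is not MIT CSAIL's implementation;
the classifier, the confidence threshold and the round count are illustrative
assumptions.

```python
# A minimal sketch of self-training with pseudo-labels (illustrative,
# not MIT CSAIL's code). Each round, the model retrains, labels the
# unlabeled pool, and keeps only its most confident predictions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def self_train(texts, labels, unlabeled, rounds=3, threshold=0.9):
    """Train, pseudo-label confident unlabeled examples, and fold them
    back into the training set for the next round."""
    texts, labels, pool = list(texts), list(labels), list(unlabeled)
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    for _ in range(rounds):
        model.fit(texts, labels)
        if not pool:
            break
        probs = model.predict_proba(pool)      # (n_pool, n_classes)
        preds = probs.argmax(axis=1)
        conf = probs.max(axis=1)
        keep = conf >= threshold               # trust only confident predictions
        texts += [t for t, k in zip(pool, keep) if k]
        labels += [model.classes_[p] for p, k in zip(preds, keep) if k]
        pool = [t for t, k in zip(pool, keep) if not k]
    return model
```

The confidence threshold guards against folding wrong predictions back into the
training set, which is exactly the failure mode SimPLE's pseudo-label editing
is designed to address.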

Notably, the research team claims that this approach significantly enhances the
model’s performance across various tasks, surpassing notable models such as
Google’s LaMDA and FLAN, as well as GPT models.




A REVOLUTION (BUT LIMITED IN SCOPE)

In their paper, “Entailment as Robust Self-Learners,” the MIT research team
argues that while recent advances in language generation with LLMs have brought
about a revolution, these models have a distinct limitation when it comes to
understanding tasks.


“Digital calculators are better than GPT-4 in arithmetic because they are
designed based on arithmetic principles,” Hongyin Luo, MIT CSAIL postdoctoral
associate and research lead author, told VentureBeat. “Our small model is
trained to grasp the core principle of language understanding — contextual
entailment, while LLMs do not explicitly learn about it. With a clear goal of
learning contextual entailment, the parameter efficiency of our model is much
higher than LLMs, thus achieving good performance on NLU tasks.”

The research also states that, simply put, a competent contextual entailment
model must also excel as a natural language understanding (NLU) model.

Moreover, the CSAIL team believes that the implications of this research go
beyond mere enhancements in performance. It challenges the prevailing notion
that larger models are inherently superior, highlighting the potential of
smaller models as equally powerful and environmentally sustainable alternatives.


ENHANCING LANGUAGE MODEL UNDERSTANDING THROUGH TEXTUAL ENTAILMENT


The MIT CSAIL team focused on textual entailment to enhance the model’s
comprehension of diverse language tasks. Textual entailment denotes the
connection between two sentences, whereby if one sentence (the premise) is true,
it is probable that the other sentence (the hypothesis) is also true.

By training models to recognize these relationships, the researchers were able
to generate prompts that test whether specific information is entailed by a
given sentence or phrase across a variety of tasks. This zero-shot adaptation
significantly enhanced the model’s versatility and adaptability.
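
As an illustration of this reformulation (using an off-the-shelf NLI model, not
the CSAIL models themselves), Hugging Face's zero-shot classification pipeline
scores candidate labels by entailment; the model name and hypothesis template
below are assumptions chosen for demonstration.

```python
# Recasting sentiment classification as textual entailment. The input
# text serves as the premise; each candidate label is slotted into a
# hypothesis, and the most strongly entailed hypothesis wins.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="roberta-large-mnli")

result = classifier(
    "The service was slow and the food arrived cold.",   # premise
    candidate_labels=["positive", "negative"],
    hypothesis_template="The sentiment of this review is {}.",
)
print(result["labels"][0])   # expected: "negative"
```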

MIT’s Luo told VentureBeat that although LLMs have showcased impressive
abilities in generating language, art and code, they carry considerable
computational costs and privacy risks when handling sensitive data. Conversely,
smaller models have historically fallen behind their larger counterparts in
multi-tasking and weakly supervised tasks.

To address these challenges, the MIT CSAIL researchers employed a natural
language-based logical inference dataset to develop smaller models that
outperformed much larger models. In addition, by incorporating the concept of
textual entailment, researchers endowed the models with the ability to
comprehend a broad spectrum of tasks.


ADAPTING WITHOUT ADDITIONAL TRAINING

These models underwent training to ascertain whether specific information was
entailed by a given sentence or phrase, thereby enabling them to adapt to
various tasks without requiring additional training.

“The benefit of self-training is that the model can automatically label a large
amount of data (create pseudo-labels), but the risk is that the pseudo-labels
contain wrong predictions, which might mislead the model or cause overfitting,”
said Luo. “Our SimPLE method outperforms all self-training baselines. The method
combines two classic AI strategies for robustness: Uncertainty estimation and
voting, and provides a more accurate set of predictions.”
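
A rough sketch of that pseudo-label editing step follows. It combines agreement
across repeated stochastic predictions (voting) with average confidence
(uncertainty estimation) to decide which pseudo-labels to keep; the thresholds
and ensembling scheme are illustrative assumptions, not the paper's exact
procedure.

```python
# Pseudo-label editing in the spirit of SimPLE (illustrative sketch).
import numpy as np

def edit_pseudo_labels(prob_runs, min_agreement=0.8, min_confidence=0.7):
    """prob_runs: (n_runs, n_examples, n_classes) class probabilities from
    repeated predictions (e.g., several dropout-enabled forward passes).
    Returns voted labels and a boolean mask of pseudo-labels to keep."""
    n_runs, n_examples, n_classes = prob_runs.shape
    votes = prob_runs.argmax(axis=2)                # (n_runs, n_examples)
    labels = np.empty(n_examples, dtype=int)
    keep = np.zeros(n_examples, dtype=bool)
    for i in range(n_examples):
        counts = np.bincount(votes[:, i], minlength=n_classes)
        label = int(counts.argmax())
        agreement = counts[label] / n_runs          # voting: fraction of runs agreeing
        confidence = prob_runs[:, i, label].mean()  # uncertainty: mean probability
        labels[i] = label
        keep[i] = agreement >= min_agreement and confidence >= min_confidence
    return labels, keep
```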


Luo explained that language model training has traditionally required manual
data annotation by humans or the use of LLM APIs. However, human annotation
often means exposing sensitive data to annotators, compromising privacy.
Similarly, transmitting data to third-party annotators or to OpenAI’s API can
result in the inadvertent exposure of highly sensitive information.

“Our method allows data annotation without seeing the data,” he explained. “An
annotator only needs to write a template that describes the task. With this
template, our system predicts the relationship between the response and the
question, generating high-quality labels. By doing this, the dataset is
annotated without sharing any data with the annotator.”
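
A small illustration of the template idea (the wording and label set below are
hypothetical, not taken from the paper):

```python
# Hypothetical template-based labeling: the annotator writes only the
# template; an entailment model then scores each (text, hypothesis) pair,
# so the annotator never sees the underlying data.
TEMPLATE = "This news article is about {}."
LABELS = ["sports", "politics", "technology"]

def hypotheses(labels, template=TEMPLATE):
    """Turn candidate labels into entailment hypotheses."""
    return [template.format(label) for label in labels]

print(hypotheses(LABELS))
# ['This news article is about sports.',
#  'This news article is about politics.',
#  'This news article is about technology.']
```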


REDEFINING AI MODEL DEVELOPMENT THROUGH SELF-TRAINING

MIT’s research team asserts that the collection of smaller models exhibits
versatility across a wide array of AI tasks — ranging from sentiment
classification to news categorization — and demonstrates exceptional
proficiency in discerning the relationship between two textual components.


The models can also infer sentiment from statements and ascertain the subject
matter of news articles based on their content. The researchers achieved
remarkable outcomes by reimagining various NLU tasks as entailment tasks.

According to Luo, the self-trained entailment models, which comprise 350 million
parameters, outperform supervised language models with 137 to 175 billion
parameters. He firmly believes that this pioneering work has the potential to
redefine the AI and ML landscape, providing a language modeling solution that is
more scalable, dependable and cost-effective.

“The core of the model is predicting entailment relations, while LLMs predict
‘how to make things read similar to the training data,’” said Luo.

“This makes our model more suitable and efficient for language understanding,”
Luo added. “Our model performs better than LLMs and traditional BERT-based
models trained with human-generated labels.”


PAVING THE WAY FOR COST-EFFICIENT LANGUAGE MODEL TRAINING


The paper that outlines this research, authored by Luo, James Glass and Yoon
Kim, is scheduled to be presented in July at the Meeting of the Association for
Computational Linguistics in Toronto, Canada. The project received support from
the Hong Kong Innovation AI program.

With its pioneering approach, the research strives to establish the groundwork
for future AI technologies that prioritize scalability, privacy preservation and
sustainability.

Luo said that the model contains only 1/500th the parameters of GPT-3-175B
(roughly 350 million versus 175 billion), making it significantly easier to
deploy and resulting in faster inference. The CSAIL team emphasized that,
through this research, organizations can now deploy efficient, robust
multi-task models without compromising data privacy or relying on expensive
computational resources.

“Our next step involves employing the entailment models in various
language-related tasks,” said Luo. “Currently, we are engaged in co-training
with LLMs to leverage their advantages and further enhance the capabilities of
our efficient self-trained models. Additionally, we are working on applying
entailment models to measure the alignment between a claim and fact/moral
principles, which benefits detecting machine and human-generated misinformation,
hate speech and stereotypes.”




