NEW DEEP REINFORCEMENT LEARNING TECHNIQUE HELPS AI TO EVOLVE

Ben Dickson, @BenDee983
October 28, 2021 2:20 PM

Image Credit: Andriy Onufriyenko // Getty Images

Hundreds of millions of years of evolution have produced a variety of life-forms, each intelligent in its own fashion. Each species has evolved to develop innate skills, learning capacities, and a physical form that ensures survival in its environment.

But despite being inspired by nature and evolution, the field of artificial intelligence has largely focused on creating the elements of intelligence separately and fusing them together after the development process. While this approach has yielded great results, it has also limited the flexibility of AI agents in some of the basic skills found in even the simplest life-forms.

In a new paper published in the scientific journal Nature Communications, AI researchers at Stanford University present a new technique that can help take steps toward overcoming some of these limits.
Called “deep evolutionary reinforcement learning,” or DERL, the new technique uses a complex virtual environment and reinforcement learning to create virtual agents that can evolve both in their physical structure and in their learning capacities. The findings could have important implications for the future of AI and robotics research.

EVOLUTION IS HARD TO SIMULATE

In nature, the body and brain evolve together. Across many generations, every animal species has gone through countless cycles of mutation to grow limbs, organs, and a nervous system that support the functions it needs in its environment. Mosquitoes are equipped with thermal vision to spot body heat. Bats have wings to fly and an echolocation apparatus to navigate dark spaces. Sea turtles have flippers to swim with and a magnetic field detector system to travel very long distances. Humans have an upright posture that frees their arms and lets them see the far horizon, hands and nimble fingers that can manipulate objects, and a brain that makes them the best social creatures and problem solvers on the planet.

Interestingly, all these species descended from the first life-form that appeared on Earth several billion years ago. Driven by the selection pressures of their environments, the descendants of those first living beings evolved in many directions.

Studying the evolution of life and intelligence is interesting, but replicating it is extremely difficult. An AI system that tried to recreate intelligent life the way evolution did would have to search an enormous space of possible morphologies, which is computationally very expensive and requires a huge number of parallel and sequential trial-and-error cycles.

AI researchers use several shortcuts and predesigned features to overcome some of these challenges. For example, they fix the architecture or physical design of an AI or robotic system and focus on optimizing the learnable parameters. Another shortcut is the use of Lamarckian rather than Darwinian evolution, in which AI agents pass their learned parameters on to their descendants. Yet another approach is to train different AI subsystems separately (vision, locomotion, language, etc.) and then bolt them together into a final AI or robotic system. While these approaches speed up the process and reduce the cost of training and evolving AI agents, they also limit the flexibility and variety of the results that can be achieved.

DEEP EVOLUTIONARY REINFORCEMENT LEARNING

In their new work, the researchers at Stanford aim to bring AI research a step closer to the real evolutionary process while keeping the costs as low as possible. “Our goal is to elucidate some principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control,” they wrote in their paper.

Within the DERL framework, each agent uses deep reinforcement learning to acquire the skills required to maximize its goals during its lifetime. DERL uses Darwinian evolution to search the morphological space for optimal solutions: when a new generation of AI agents is spawned, they inherit only the physical and architectural traits of their parents (along with slight mutations). None of the learned parameters are passed on across generations.
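To make that division of labor concrete, below is a minimal, self-contained Python sketch of the two-level loop. It is not the Stanford implementation: random_genotype, mutate, train_lifetime, and the generational selection scheme are illustrative placeholders for the paper's Unimal genotypes, mutation operators, per-lifetime deep reinforcement learning, and tournament-based evolution. The one property it preserves is the Darwinian constraint the article describes: children inherit a mutated body plan, never their parents' learned parameters.

```python
# Minimal sketch of DERL-style two-level search (illustrative, not the authors' code).
# Outer loop: Darwinian selection over morphologies (genotypes).
# Inner loop: each agent "learns" during its lifetime; the learned result is
# discarded, and only the (mutated) genotype is inherited by offspring.
import random

random.seed(0)

def random_genotype():
    """Hypothetical body plan: a handful of limb-length genes."""
    return [random.uniform(0.2, 1.0) for _ in range(random.randint(2, 6))]

def mutate(genotype):
    """Darwinian variation: tweak, add, or remove a limb gene."""
    child = list(genotype)
    op = random.choice(["tweak", "add", "remove"])
    if op == "tweak":
        i = random.randrange(len(child))
        child[i] = min(1.0, max(0.2, child[i] + random.gauss(0, 0.1)))
    elif op == "add":
        child.append(random.uniform(0.2, 1.0))
    elif op == "remove" and len(child) > 2:
        child.pop(random.randrange(len(child)))
    return child

def train_lifetime(genotype, steps=200):
    """Stand-in for per-lifetime deep RL: returns a fitness score
    (think: distance walked after the agent has learned to move)."""
    learned_skill = sum(random.random() for _ in range(steps)) / steps
    body_quality = sum(genotype) / len(genotype)
    return body_quality * learned_skill  # learned weights are NOT inherited

def evolve(pop_size=16, generations=20):
    population = [random_genotype() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=train_lifetime, reverse=True)
        parents = scored[: pop_size // 4]             # keep the fittest quarter
        population = [mutate(random.choice(parents))  # children start untrained
                      for _ in range(pop_size)]
    return max(population, key=train_lifetime)

if __name__ == "__main__":
    best = evolve()
    print("best body plan:", [round(g, 2) for g in best])
```

In the real system, the inner step would run a full deep reinforcement learning algorithm inside a physics simulator and return the reward the agent earns after learning; the cost of that inner loop is exactly why the shortcuts described earlier are so common.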
“DERL opens the door to performing large-scale in silico experiments to yield scientific insights into how learning and evolution cooperatively create sophisticated relationships between environmental complexity, morphological intelligence, and the learnability of control tasks,” the researchers wrote.

SIMULATING EVOLUTION

For their framework, the researchers used MuJoCo, a virtual environment that provides highly accurate rigid-body physics simulation. Their design space is called Universal Animal (Unimal), and its goal is to create morphologies that learn locomotion and object-manipulation tasks in a variety of terrains.

Each agent in the environment is defined by a genotype that describes its limbs and joints. The direct descendant of each agent inherits the parent’s genotype and goes through mutations that can create new limbs, remove existing limbs, or make small modifications to characteristics such as the degrees of freedom or the size of limbs.

Each agent is trained with reinforcement learning to maximize rewards in various environments. The most basic task is locomotion, in which the agent is rewarded for the distance it travels during an episode. Agents whose physical structures are better suited for traversing terrain learn faster to use their limbs for moving around.

To test the system’s results, the researchers generated agents in three types of terrain: flat (FT), variable (VT), and variable terrain with modifiable objects (MVT). The flat terrain puts the least selection pressure on the agents’ morphology. The variable terrains, on the other hand, force the agents to develop a more versatile physical structure that can climb slopes and move around obstacles. The MVT variant adds the challenge of requiring the agents to manipulate objects to achieve their goals.

THE BENEFITS OF DERL

Above: Deep evolutionary reinforcement learning generates a variety of successful morphologies across different environments. Image Credit: TechTalks

One of the interesting findings of DERL is the diversity of its results. Other approaches to evolutionary AI tend to converge on a single solution because new agents directly inherit both the physique and the learnings of their parents. But because DERL passes only morphological data on to descendants, the system ends up creating a diverse set of successful morphologies, including bipeds, tripeds, and quadrupeds with and without arms.

At the same time, the system shows traits of the Baldwin effect, which suggests that agents that learn faster are more likely to reproduce and pass on their genes to the next generation. DERL shows that evolution “selects for faster learners without any direct selection pressure for doing so,” according to the Stanford paper.

“Intriguingly, the existence of this morphological Baldwin effect could be exploited in future studies to create embodied agents with lower sample complexity and higher generalization capacity,” the researchers wrote.

Finally, the DERL framework also supports the hypothesis that more complex environments give rise to more intelligent agents. The researchers tested the evolved agents on eight different tasks, including patrolling, escaping, manipulating objects, and exploration. Their findings show that, in general, agents that evolved in variable terrains learn faster and perform better than AI agents that have only experienced flat terrain.
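As a rough illustration of the genotype and mutation operators described in the simulation section above, the sketch below represents a body plan as a tree of limbs and applies the three kinds of mutation the article lists: growing a new limb, removing a limb, and perturbing a limb’s size or degrees of freedom. The Limb class, the mutate and flatten helpers, and the TERRAINS mapping are hypothetical stand-ins for illustration only; they are not the Unimal data structures, and no physics simulation or reward is attached.

```python
# Illustrative limb-tree genotype with the three mutation types the article
# describes: add a limb, remove a limb, or tweak a limb's parameters.
# Class names and fields are hypothetical, not the paper's Unimal format.
import copy
import random
from dataclasses import dataclass, field

@dataclass(eq=False)
class Limb:
    length: float                  # limb size
    dof: int                       # degrees of freedom of the joint to the parent
    children: list = field(default_factory=list)

def flatten(root: Limb) -> list:
    """All limbs in the tree, root included."""
    out, stack = [], [root]
    while stack:
        limb = stack.pop()
        out.append(limb)
        stack.extend(limb.children)
    return out

def mutate(root: Limb) -> Limb:
    """Return a mutated copy of the body plan; the parent is left untouched."""
    child_plan = copy.deepcopy(root)
    limbs = flatten(child_plan)
    target = random.choice(limbs)
    op = random.choice(["grow", "remove", "tweak"])
    if op == "grow":
        target.children.append(Limb(length=random.uniform(0.1, 0.5),
                                    dof=random.choice([1, 2])))
    elif op == "remove" and target is not child_plan and not target.children:
        for limb in limbs:                       # detach a leaf limb
            if target in limb.children:
                limb.children.remove(target)
    else:  # tweak size and joint degrees of freedom
        target.length = max(0.05, target.length + random.gauss(0, 0.05))
        target.dof = random.choice([1, 2])
    return child_plan

# The three evolution settings from the article, as a simple config mapping.
TERRAINS = {
    "FT":  {"variable": False, "objects": False},   # flat terrain
    "VT":  {"variable": True,  "objects": False},   # variable terrain
    "MVT": {"variable": True,  "objects": True},    # variable terrain + objects
}

if __name__ == "__main__":
    torso = Limb(length=0.4, dof=2, children=[Limb(0.3, 1), Limb(0.3, 1)])
    print(len(flatten(mutate(torso))), "limbs after one mutation")
```

Selection pressure enters only through which body plans survive the reinforcement learning evaluation in each terrain, which is why flat terrain (FT) constrains morphology far less than the variable (VT) and object-filled (MVT) settings.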
These findings also appear to be in line with a hypothesis from DeepMind researchers: that a complex environment, a suitable reward structure, and reinforcement learning can eventually lead to the emergence of all kinds of intelligent behaviors.

AI AND ROBOTICS RESEARCH

The DERL environment captures only a fraction of the complexity of the real world. “Although DERL enables us to take a significant step forward in scaling the complexity of evolutionary environments, an important line of future work will involve designing more open-ended, physically realistic, and multiagent evolutionary environments,” the researchers wrote.

In the future, the researchers plan to expand the range of evaluation tasks to better assess how the agents can enhance their ability to learn human-relevant behaviors.

The work could have important implications for the future of AI and robotics, pushing researchers toward exploration methods that are much closer to natural evolution. “We hope our work encourages further large-scale explorations of learning and evolution in other contexts to yield new scientific insights into the emergence of rapidly learnable intelligent behaviors, as well as new engineering advances in our ability to instantiate them in machines,” the researchers wrote.

Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics. This story originally appeared on Bdtechtalks.com. Copyright 2021.