NEW DEEP REINFORCEMENT LEARNING TECHNIQUE HELPS AI TO EVOLVE

Ben Dickson, @BenDee983
October 28, 2021 2:20 PM

Image Credit: Andriy Onufriyenko // Getty Images

Hundreds of millions of years of evolution have produced a variety of life-forms, each intelligent in its own fashion. Each species has evolved to develop innate skills, learning capacities, and a physical form that ensures survival in its environment.

But despite being inspired by nature and evolution, the field of artificial intelligence has largely focused on creating the elements of intelligence separately and fusing them together after the development process. While this approach has yielded great results, it has also limited the flexibility of AI agents in some of the basic skills found in even the simplest life-forms.

In a new paper published in the scientific journal Nature Communications, AI researchers at Stanford University present a new technique that can help take steps toward overcoming some of these limits.
Called “deep evolutionary reinforcement learning,” or DERL, the new technique uses a complex virtual environment and reinforcement learning to create virtual agents that can evolve both in their physical structure and in their learning capacities. The findings could have important implications for the future of AI and robotics research.

EVOLUTION IS HARD TO SIMULATE

In nature, the body and brain evolve together. Across many generations, every animal species has gone through countless cycles of mutation to grow limbs, organs, and a nervous system that support the functions it needs in its environment. Mosquitoes are equipped with thermal vision to spot body heat. Bats have wings to fly and an echolocation apparatus to navigate dark spaces. Sea turtles have flippers to swim with and a magnetic field detector system to travel very long distances. Humans have an upright posture that frees their arms and lets them see the far horizon, hands and nimble fingers that can manipulate objects, and a brain that makes them the best social creatures and problem solvers on the planet.

Interestingly, all these species descended from the first life-form that appeared on Earth several billion years ago. Driven by the selection pressures of their environments, the descendants of those first living beings evolved in many directions.

Studying the evolution of life and intelligence is interesting, but replicating it is extremely difficult. An AI system that tried to recreate intelligent life the way evolution did would have to search an enormous space of possible morphologies, which is computationally very expensive and requires a huge number of parallel and sequential trial-and-error cycles.

AI researchers use several shortcuts and predesigned features to overcome some of these challenges. For example, they fix the architecture or physical design of an AI or robotic system and focus on optimizing the learnable parameters. Another shortcut is the use of Lamarckian rather than Darwinian evolution, in which AI agents pass their learned parameters on to their descendants. Yet another approach is to train different AI subsystems separately (vision, locomotion, language, etc.) and then bolt them together into a final AI or robotic system. While these approaches speed up the process and reduce the cost of training and evolving AI agents, they also limit the flexibility and variety of the results that can be achieved.

DEEP EVOLUTIONARY REINFORCEMENT LEARNING

In their new work, the researchers at Stanford aim to bring AI research a step closer to the real evolutionary process while keeping the costs as low as possible. “Our goal is to elucidate some principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control,” they wrote in their paper.

Within the DERL framework, each agent uses deep reinforcement learning to acquire the skills required to maximize its goals during its lifetime. DERL uses Darwinian evolution to search the morphological space for optimal solutions: when a new generation of AI agents is spawned, they inherit only the physical and architectural traits of their parents (along with slight mutations). None of the learned parameters are passed on across generations.
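To make that division of labor concrete, below is a minimal, self-contained Python sketch of the two-level loop. It is not the Stanford implementation: random_genotype, mutate, train_lifetime, and the generational selection scheme are illustrative placeholders for the paper's Unimal genotypes, mutation operators, per-lifetime deep reinforcement learning, and tournament-based evolution. The one property it preserves is the Darwinian constraint the article describes: children inherit a mutated body plan, never their parents' learned parameters.

```python
# Minimal sketch of DERL-style two-level search (illustrative, not the authors' code).
# Outer loop: Darwinian selection over morphologies (genotypes).
# Inner loop: each agent "learns" during its lifetime; the learned result is
# discarded, and only the (mutated) genotype is inherited by offspring.
import random

random.seed(0)

def random_genotype():
    """Hypothetical body plan: a handful of limb-length genes."""
    return [random.uniform(0.2, 1.0) for _ in range(random.randint(2, 6))]

def mutate(genotype):
    """Darwinian variation: tweak, add, or remove a limb gene."""
    child = list(genotype)
    op = random.choice(["tweak", "add", "remove"])
    if op == "tweak":
        i = random.randrange(len(child))
        child[i] = min(1.0, max(0.2, child[i] + random.gauss(0, 0.1)))
    elif op == "add":
        child.append(random.uniform(0.2, 1.0))
    elif op == "remove" and len(child) > 2:
        child.pop(random.randrange(len(child)))
    return child

def train_lifetime(genotype, steps=200):
    """Stand-in for per-lifetime deep RL: returns a fitness score
    (think: distance walked after the agent has learned to move)."""
    learned_skill = sum(random.random() for _ in range(steps)) / steps
    body_quality = sum(genotype) / len(genotype)
    return body_quality * learned_skill  # learned weights are NOT inherited

def evolve(pop_size=16, generations=20):
    population = [random_genotype() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=train_lifetime, reverse=True)
        parents = scored[: pop_size // 4]             # keep the fittest quarter
        population = [mutate(random.choice(parents))  # children start untrained
                      for _ in range(pop_size)]
    return max(population, key=train_lifetime)

if __name__ == "__main__":
    best = evolve()
    print("best body plan:", [round(g, 2) for g in best])
```

In the real system, the inner step would run a full deep reinforcement learning algorithm inside a physics simulator and return the reward the agent earns after learning; the cost of that inner loop is exactly why the shortcuts described earlier are so common.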
“DERL opens the door to performing large-scale in silico experiments to yield scientific insights into how learning and evolution cooperatively create sophisticated relationships between environmental complexity, morphological intelligence, and the learnability of control tasks,” the researchers wrote.

SIMULATING EVOLUTION

For their framework, the researchers used MuJoCo, a virtual environment that provides highly accurate rigid-body physics simulation. Their design space is called Universal Animal (Unimal), and its goal is to create morphologies that learn locomotion and object-manipulation tasks in a variety of terrains.

Each agent in the environment is defined by a genotype that describes its limbs and joints. The direct descendant of each agent inherits the parent’s genotype and goes through mutations that can create new limbs, remove existing limbs, or make small modifications to characteristics such as the degrees of freedom or the size of limbs.

Each agent is trained with reinforcement learning to maximize rewards in various environments. The most basic task is locomotion, in which the agent is rewarded for the distance it travels during an episode. Agents whose physical structures are better suited for traversing terrain learn faster to use their limbs for moving around.

To test the system’s results, the researchers generated agents in three types of terrain: flat (FT), variable (VT), and variable terrain with modifiable objects (MVT). The flat terrain puts the least selection pressure on the agents’ morphology. The variable terrains, on the other hand, force the agents to develop a more versatile physical structure that can climb slopes and move around obstacles. The MVT variant adds the challenge of requiring the agents to manipulate objects to achieve their goals.

THE BENEFITS OF DERL

Above: Deep evolutionary reinforcement learning generates a variety of successful morphologies across different environments. Image Credit: TechTalks

One of the interesting findings of DERL is the diversity of its results. Other approaches to evolutionary AI tend to converge on a single solution because new agents directly inherit both the physique and the learnings of their parents. But because DERL passes only morphological data on to descendants, the system ends up creating a diverse set of successful morphologies, including bipeds, tripeds, and quadrupeds with and without arms.

At the same time, the system shows traits of the Baldwin effect, which suggests that agents that learn faster are more likely to reproduce and pass on their genes to the next generation. DERL shows that evolution “selects for faster learners without any direct selection pressure for doing so,” according to the Stanford paper.

“Intriguingly, the existence of this morphological Baldwin effect could be exploited in future studies to create embodied agents with lower sample complexity and higher generalization capacity,” the researchers wrote.

Finally, the DERL framework also supports the hypothesis that more complex environments give rise to more intelligent agents. The researchers tested the evolved agents on eight different tasks, including patrolling, escaping, manipulating objects, and exploration. Their findings show that, in general, agents that evolved in variable terrains learn faster and perform better than AI agents that have only experienced flat terrain.
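As a rough illustration of the genotype and mutation operators described in the simulation section above, the sketch below represents a body plan as a tree of limbs and applies the three kinds of mutation the article lists: growing a new limb, removing a limb, and perturbing a limb’s size or degrees of freedom. The Limb class, the mutate and flatten helpers, and the TERRAINS mapping are hypothetical stand-ins for illustration only; they are not the Unimal data structures, and no physics simulation or reward is attached.

```python
# Illustrative limb-tree genotype with the three mutation types the article
# describes: add a limb, remove a limb, or tweak a limb's parameters.
# Class names and fields are hypothetical, not the paper's Unimal format.
import copy
import random
from dataclasses import dataclass, field

@dataclass(eq=False)
class Limb:
    length: float                  # limb size
    dof: int                       # degrees of freedom of the joint to the parent
    children: list = field(default_factory=list)

def flatten(root: Limb) -> list:
    """All limbs in the tree, root included."""
    out, stack = [], [root]
    while stack:
        limb = stack.pop()
        out.append(limb)
        stack.extend(limb.children)
    return out

def mutate(root: Limb) -> Limb:
    """Return a mutated copy of the body plan; the parent is left untouched."""
    child_plan = copy.deepcopy(root)
    limbs = flatten(child_plan)
    target = random.choice(limbs)
    op = random.choice(["grow", "remove", "tweak"])
    if op == "grow":
        target.children.append(Limb(length=random.uniform(0.1, 0.5),
                                    dof=random.choice([1, 2])))
    elif op == "remove" and target is not child_plan and not target.children:
        for limb in limbs:                       # detach a leaf limb
            if target in limb.children:
                limb.children.remove(target)
    else:  # tweak size and joint degrees of freedom
        target.length = max(0.05, target.length + random.gauss(0, 0.05))
        target.dof = random.choice([1, 2])
    return child_plan

# The three evolution settings from the article, as a simple config mapping.
TERRAINS = {
    "FT":  {"variable": False, "objects": False},   # flat terrain
    "VT":  {"variable": True,  "objects": False},   # variable terrain
    "MVT": {"variable": True,  "objects": True},    # variable terrain + objects
}

if __name__ == "__main__":
    torso = Limb(length=0.4, dof=2, children=[Limb(0.3, 1), Limb(0.3, 1)])
    print(len(flatten(mutate(torso))), "limbs after one mutation")
```

Selection pressure enters only through which body plans survive the reinforcement learning evaluation in each terrain, which is why flat terrain (FT) constrains morphology far less than the variable (VT) and object-filled (MVT) settings.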
These findings also appear to be in line with a hypothesis from DeepMind researchers: that a complex environment, a suitable reward structure, and reinforcement learning can eventually lead to the emergence of all kinds of intelligent behaviors.

AI AND ROBOTICS RESEARCH

The DERL environment captures only a fraction of the complexity of the real world. “Although DERL enables us to take a significant step forward in scaling the complexity of evolutionary environments, an important line of future work will involve designing more open-ended, physically realistic, and multiagent evolutionary environments,” the researchers wrote.

In the future, the researchers plan to expand the range of evaluation tasks to better assess how the agents can enhance their ability to learn human-relevant behaviors.

The work could have important implications for the future of AI and robotics, pushing researchers toward exploration methods that are much closer to natural evolution. “We hope our work encourages further large-scale explorations of learning and evolution in other contexts to yield new scientific insights into the emergence of rapidly learnable intelligent behaviors, as well as new engineering advances in our ability to instantiate them in machines,” the researchers wrote.

Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics. This story originally appeared on Bdtechtalks.com. Copyright 2021.