
Submitted URL: https://www.codeproject.com/News.aspx?ntag=19837497854098772&_z=7139521
Effective URL: https://www.infoworld.com/article/3711100/reining-in-the-bs-in-ai.html
Submission: On November 21 via api from US — Scanned from CA



REINING IN THE BS IN AI


LARGE LANGUAGE MODELS TRAINED ON QUESTIONABLE STUFF ONLINE WILL PRODUCE MORE OF
THE SAME. RETRIEVAL AUGMENTED GENERATION IS ONE WAY TO GET CLOSER TO TRUTH.


By Matt Asay

Contributor, InfoWorld | Nov 19, 2023 6:00 pm PST


alberto clemares exposito / Shutterstock



Even people outside tech seem to have heard of Sam Altman’s ouster from OpenAI
on Friday. I was with two friends the next day (one works in construction and
the other in marketing), and both were talking about it. Generative AI (genAI)
seems to have finally gone mainstream.

What it hasn’t done, however, is escape the gravitational pull of BS, as Alan
Blackwell has stressed. No, I don’t mean that AI is vacuous, long on hype, and
short on substance. AI is already delivering for many enterprises across a host
of industries. Even genAI, a small subset of the overall AI market, is a
game-changer for software development and beyond. And yet Blackwell is correct:
“AI literally produces bullshit.” It makes up stuff that sounds good based on
training data.

Even so, if we can “box it in,” as MIT professor of AI Rodney Brooks describes,
genAI has the potential to make a big difference in our lives.



‘CHATGPT IS A BULLSHIT GENERATOR’

Truth is not fundamental to how large language models function. LLMs are “deep
learning algorithms that can recognize, summarize, translate, predict, and
generate content using very large data sets.” Note that “truth” and “knowledge”
have no place in that definition. LLMs aren’t designed to tell you the truth. As
detailed in an OpenAI forum, “Large language models are probabilistic in nature
and operate by generating likely outputs based on patterns they have observed in
the training data. In the case of mathematical and physical problems, there may
be only one correct answer, and the likelihood of generating that answer may be
very low.”
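
A toy sketch can make that forum point concrete. The scores below are invented
for illustration and come from no real model; the helper names are hypothetical
too. The sketch only assumes that a model turns scores into probabilities and
then samples from them, so the one correct answer (17 * 23 = 391) is simply one
candidate among several and need not be the most likely.

```python
import math
import random

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores for the text that follows "17 * 23 = ".
# Invented numbers: the correct answer "391" is deliberately not the top score.
toy_scores = {"391": 1.0, "381": 1.2, "401": 0.9, "400": 1.1}

def sample_answer(scores, rng):
    """Pick one candidate at random, weighted by its probability."""
    probs = softmax(scores)
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

probs = softmax(toy_scores)
rng = random.Random(0)  # fixed seed so repeated runs agree
samples = [sample_answer(toy_scores, rng) for _ in range(1000)]
share_correct = samples.count("391") / len(samples)

print(f"P(correct) = {probs['391']:.2f}, sampled share = {share_correct:.2f}")
```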


That’s a nice way of saying you might not want to rely on ChatGPT to do basic
multiplication problems for you, but it could be great at crafting an answer on
the history of algebra. In fact, channeling Geoff Hinton, Blackwell says, “One
of the greatest risks is not that chatbots will become super intelligent, but
that they will generate text that is super persuasive without being
intelligent.”

It’s like “fake news” on steroids. As Blackwell says, “We’ve automated
bullshit.”



This isn’t surprising, given that the primary sources for the LLMs underlying
ChatGPT and other genAI systems are Twitter, Facebook, Reddit, and “other huge
archives of bullshit.” And “there is no algorithm in ChatGPT to check which
parts are true,” such that the “output is literally bullshit,” says Blackwell.

What to do?



‘YOU HAVE TO BOX THINGS IN CAREFULLY’

The key to getting some semblance of useful knowledge out of LLMs, according to
Brooks, is “boxing in.” He says, “You have to box [LLMs] in carefully so that
the craziness doesn’t come out, and the making stuff up doesn’t come out.” But
how does one “box an LLM in”?

One critical way is through retrieval augmented generation (RAG). I love
how Zachary Proser characterizes it: “RAG is like holding up a cue card
containing the critical points for your LLM to see.” It’s a way to augment an
LLM with proprietary data, giving the LLM more context and knowledge to improve
its responses.
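
The cue-card idea can be sketched in a few lines. This is a minimal
illustration, not Proser’s code or any particular product’s API: the documents,
the function names, and the prompt wording are all made up, no model is
actually called, and retrieval is done by naive word overlap as a stand-in for
the vector search a real RAG system would use.

```python
import re

# Hypothetical in-house snippets standing in for "proprietary data".
DOCUMENTS = [
    "Refunds are issued within 14 days of a return being received.",
    "Support is available by chat on weekdays from 9am to 5pm.",
    "Orders over 50 dollars ship free within the United States.",
]

def words(text):
    """Lowercase a string and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = words(question)
    ranked = sorted(documents, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Put the retrieved snippets in front of the question, cue-card style."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt("How many days do refunds take?", DOCUMENTS)
print(prompt)
```

The resulting prompt is what would be handed to the LLM, so the model answers
with the relevant snippet in view rather than from its training data alone.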

RAG depends on vectors, a foundational element in a variety of AI use cases. A
vector embedding is just a long list of numbers that describes features of a
data object, such as a song, an image, a video, or a poem. Embeddings are stored
in a vector database and capture the semantic meaning of objects in relation to
other objects. Similar objects are grouped together in the vector space: the
closer two objects are, the more similar they are. (For example, “rugby” and
“football” will be closer to each other than “football” and “basketball.”)
You can then query for related entities that are similar based on their
characteristics, without relying on synonyms or keyword matching.
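
Here is a rough sketch of what “closer” means in practice. The three-number
vectors below are invented for illustration (real embedding models produce
hundreds or thousands of dimensions), and cosine similarity is assumed as the
closeness measure; it is a common choice, though vector databases support
others.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented toy embeddings; the values are assumptions, not model output.
EMBEDDINGS = {
    "rugby": [0.9, 0.8, 0.1],
    "football": [0.8, 0.9, 0.2],
    "basketball": [0.3, 0.4, 0.9],
}

def nearest(word, embeddings):
    """Find the other entry whose vector is most similar to the given word's."""
    query = embeddings[word]
    others = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    return max(others, key=lambda pair: pair[1])[0]

rugby_football = cosine_similarity(EMBEDDINGS["rugby"], EMBEDDINGS["football"])
football_basketball = cosine_similarity(
    EMBEDDINGS["football"], EMBEDDINGS["basketball"]
)
print(nearest("football", EMBEDDINGS), rugby_football > football_basketball)
```

Note that the lookup never compares the words themselves, only their vectors,
which is why no synonym list or keyword match is needed.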

As Proser concludes, “Since the LLM now has access to the most pertinent and
grounding facts from your vector database, it can provide an accurate answer for
your user. RAG reduces the likelihood of hallucination.” Suddenly, your LLM is
much more likely to give you a true response, not merely a response that sounds
true. This is the sort of “boxing in” that can make LLMs actually useful and not
hype.


Otherwise, it’s just automated bullshit.


Matt Asay runs developer relations at MongoDB. The views expressed herein are
Matt’s and do not reflect those of his employer.


Copyright © 2023 IDG Communications, Inc.


