www.gartner.com
Open in
urlscan Pro
2606:4700:4400::6812:22dd
Public Scan
Submitted URL: http://mediamarketingcurated.com/58954-295019/149981
Effective URL: https://www.gartner.com/doc/reprints?id=1-2G54JB2F&ct=240104&st=sb
Submission: On August 28 via api from US — Scanned from GB
Effective URL: https://www.gartner.com/doc/reprints?id=1-2G54JB2F&ct=240104&st=sb
Submission: On August 28 via api from US — Scanned from GB
Form analysis
0 forms found in the DOMText Content
Licensed for Distribution Licensed for Distribution This research note is restricted to the personal use of (). EMERGING TECH: TOP 4 SECURITY RISKS OF GENAI 10 August 2023 - ID G00785940 - 31 min read By Lawrence Pingree, Swati Rakheja, and 5 more -------------------------------------------------------------------------------- Generative AI models create security risks that are crucial to address for tech providers. Product leaders in security markets can differentiate and drive revenue by addressing the key transformation opportunities presented by these risks. ADDITIONAL PERSPECTIVES * Invest Implications: Emerging Tech: Top 4 Security Risks of GenAI(27 September2023) OVERVIEW KEY FINDINGS * The use of generative AI (GenAI) large language models (LLMs) and chat interfaces, especially connected to third-party solutions outside the organization firewall, represent a widening of attack surfaces and security threats to enterprises. However, they also offer new opportunities for security technology provider offerings. * GenAI and smart malware will enhance attacker efficiency, enable greater automation and increase the autonomy of attacks, while significantly augmenting attacker and defender toolsets. * GenAI prompt injections and model mentoring bring data security risks that are not easily remediated. This opens new opportunities for cybersecurity product evolution and solution alignment to detect and defend against high volumes of prompts and injection attacks on LLM chat and API interfaces. RECOMMENDATIONS Product leaders working to incorporate generative AI solutions in security products should: * Build an updated product strategy that addresses GenAI security risks by examining and adjusting product roadmaps and partnerships so that they can aid in the future defense of LLMs and chat interface interactions. * Work with product teams to proactively explore potential smart malware behaviors, focusing on improving methods of cross-product coordination and threat intelligence and enhancing the speed of information exchange via APIs about users, files, events and the like with adjacent prevention providers. * Prioritize LLM data source awareness and transparency to address data security and privacy risks, and partner or build new capabilities into LLM products to protect them from model tampering, potential leakage of data or potential model breach risks. STRATEGIC PLANNING ASSUMPTIONS * By 2025, autonomous agents will drive advanced cyberattacks that give rise to “smart malware,” pushing providers to offer innovations that address unique LLM and GenAI risks and threats. * By 2025, user efficiency improvements will drive at least 35% of security vendors to offer large-language-model-driven chat capabilities for users to interact with their applications and data, up from 1% in 2022. ANALYSIS RISK DESCRIPTION GenAI technologies can generate newly derived versions of content, strategies, designs and methods by learning from large repositories of original source content. GenAI has profound business impacts, including on content discovery, creation, authenticity and regulations; the automation of human work; and customer and employee experiences. These artifacts can serve benign or nefarious purposes. GenAI LLMs can produce totally novel media content including text, images, video and audio, synthetic data and models of physical objects. GenAI uses a number of techniques that continue to evolve as use cases continue to multiply. Recently, foundation models, also known as autoregressive generative models or LLMs (like GPT-4, LLaMA, Vicuna and DALL-E), have gained mind share and adoption with the introduction of ChatGPT. Models such as generative adversarial networks (GANs) and variational autoencoders (VAEs) continue to develop. Both threats and risks are evolving in relation to use of GenAI LLMs (see Figure 1). There are both government and industry initiatives to address the emerging risks of GenAI. Examples include: * NIST AI Risk Management Framework * World Economic Forum AI Governance Alliance * OWASP AI Security and Privacy Guide EMERGING SECURITY RISKS WITH GENERATIVE AI LLMS Gartner has identified four major emerging risks within Generative AI. Emerging enterprise GenAI risks include (but are not limited to): * Privacy and data security * Enhanced attack efficiency * Misinformation * Fraud and identity risks Prior to ChatGPT’s release at the beginning of 2023, the main concerns related to the security threats of machine learning and generative AI were covered in Gartner’s AI TRiSM research (see Top Strategic Technology Trends for 2023: AI Trust, Risk and Security Management). But the situation is rapidly advancing. LLMs and chat interfaces have enhanced malicious attackers’ ability to send more believable deceptive emails and messages, mimic genuine sources, and complicate the task of differentiating between authentic and forged audio, communications, imagery and video (see Note 1). Attackers are believed to be increasingly employing tools along with LLMs to carry out large-scale social engineering attacks. By impersonating someone else, they aim to deceive victims and obtain sensitive information. This information can be exploited by attackers in various ways, including compromising user accounts, inserting themselves into financial transactions, such as bank transfers or real estate escrow, and executing other contextualized attacks involving sensitive data or personalized information. Attackers have also been known to combine breached credentials and information with spear phishing emails, social network messaging and cellular SMS messages (see Note 1). Exposure to third-party black-box style APIs and integrations and the use of LLMs has rapidly expanded attack surfaces. Since the beginning of 2023, as LLM use has grown and as open source tools, commercial tools and composed third-party APIs have rapidly expanded, so too have their risks. MITRE has published a draft of MITRE Atlas, which attempts to map some of the major known AI system attacks already identified. Critical security markets impacted by the enterprise risks of GenAI: * Applications embedded with GenAI chat interfaces and APIs * Secure email gateway solutions (SEGs) and secure web gateways (SWGs) * Secure access service edge (SASE) * Security service edge (SSE) * Network firewalls (NFWs) and intrusion detection and prevention systems (IDPS) technologies * Data loss prevention (DLP) and data security posture management (DSPM) * SaaS security posture management (SSPM) * Web application and API protection (WAAP) * Enterprise browsers and extensions SAMPLING OF GENERATIVE AI-TRISM PROVIDERS Arthur, Arize AI, Bosch AIShield, CalypsoAI, TrojAI, Preamble, Forcepoint, SAFE, NVIDIA (NeMo Guardrails), Netskope, Patented, Galileo, TruEra, Whylabs Figure 1: Top GenAI Risks PRIVACY AND DATA SECURITY Analysis by: Lawrence Pingree, Mark Wah GenAI tools often require access to data, both for training and for generating outputs. The lack of data anonymization techniques, if not used sufficiently and/or data is shared with third parties and with API authorization permissions management can lead to potential data leak, risk or breach (see How to Deliver Sustainable APIs). Without explicit and monitored consent, there is a risk of violating privacy rights or data security mandates. GenAI tools may also be vulnerable to data breaches, resulting in unauthorized access or disclosure of sensitive information. The format of GenAI product delivery has evolved, with multiple vendors offering multiple channels, from text-based user interfaces to API-based approaches, and these permeate both enterprise and consumer delivery models. OpenAI today has several hundred API integrations via plugins taking on a variety of use cases, many of which leverage data submitted by users in either graphical, text or document form. Various GenAI implementations that are made available without sufficient safeguards will incur privacy and data security risks in three major areas: INPUT/OUTPUT The reliability of content generated and ingested by generative AI as well as subsequent workflows face risks, and these risks being realized could lead to producing deceptive, misleading or harmful content. Some generated content can lead to copyright infringement, illegal activities and potentially violate privacy laws. Inputs can be tampered with or manipulated, including by injection of malicious code and outputs, and synthetic content can be leveraged for disinformation and enable the creation of deepfakes (see the Disinformation and Reliability of Generative AI for Decision Support section of this document). Additionally, underlying models can potentially amplify biases1 that are present in the training data, and it is difficult to understand, control and explain the details behind the source datasets used — especially when models use uncontrolled sources. For example, text-to-image generative AI platforms inherit some gender and racial bias when provided gender- and race-neutral prompts2 and when LLMs are augmented by search-oriented data. To date, model poisoning (the act of tampering with GenAI LLM source data) has been done in some situations with only ~1% of tampered source data to corrupt model output (see Note 1). PRIVACY OF TRAINING DATA GenAI poses a significant risk to data privacy due to its remarkable ability to swiftly establish connections between seemingly unrelated topics and datasets. There is a risk of “inference attacks” where advanced users can reverse-engineer information from the training data and extract sensitive information that can compromise individuals’ privacy. By traversing vast amounts of data and making connections that were previously unseen, GenAI has the potential to reveal personal details, expose confidential information, or infer private attributes about individuals. ADVERSARIAL ATTACKS Generative AI can be exploited through adversarial attacks to produce malicious outcomes and also bypass built-in safety measures. “Jailbreaking” is an example where attackers can craft specific prompts to bypass the model’s safety systems.4 Some examples include “Do Anything Now” (DAN), the grandma exploit, and RabbitHole attacks. “Prompt injection” is another example of an adversarial attack, where an AI model is manipulated by inserting malicious data or instructions. (See the Prompt Injection Attacks Emerge [Direct and Indirect] section for more examples.) LLM AND DATA BREACH RISKS As with traditional applications, data breach risks are inherent in web-enabled applications without proper safeguards and technology applied (such as Web Application and API Protection [WAP] technologies). AI TRiSM will play a significant role in the future of GenAI use (see Top Strategic Technology Trends for 2023: AI Trust, Risk and Security Management). MODEL MENTORING RISK Although synthetic data has proven to be effective in creating usable LLMs, there have been several open source models where one model trains from another LLM’s outputs through mass prompt generation. This practice, known as “model mentoring,” includes generating large numbers of targeted prompts and then modeling the output of adjacent models and incorporating that information into a new model. This may lead to data breaches, sensitive-data leakage, model cloning or inadvertent theft of intellectual property. JAILBREAKING RISKS IN LLMS LLMs are subject to jailbreaking. Jailbreaking includes methods of tricking the LLM model to role-play as an agent that doesn’t follow certain rules or safety guardrails, or other deception techniques to get the large language model to bypass request and response filtering. An LLM that has been prompted to role-play can, for example, be tricked into providing instructions on how to manufacture illicit substances or explosives — outputs the chat interface is designed not to provide during typical use. THE IMPORTANCE OF TERMS OF SERVICE (TOS) IN GENAI SYSTEMS Terms of service for sensitive or intellectual property data usage are important in GenAI systems because they represent the ability to either limit or define the use and availability of data in submitted content or intellectual property. It is important that before organizations make use of GenAI systems, they check with their corporate counsel to review terms of service with specific external and third-party providers. NEAR-TERM IMPLICATIONS FOR PRODUCT LEADERS Privacy and data security are essential elements for any business, especially those operating in the cybersecurity space. Software and SaaS are required to be trusted to ensure that customers want to continue to do business with them. With the advent of GenAI, there is an increasing need to understand how these technologies will impact both product leaders and consumers alike. Existing data discovery, classification, validation and integrity tools must be augmented to address newly emerging risks posed by GenAI. Security technologies that control access, monitor and validate content and user behavior, isolate and monitor applications, and restrict content retrieval have the greatest opportunities to address GenAI risks. PRODUCT AND SERVICE OPPORTUNITIES * Services and products that extend the ability for enterprises to monitor and defend against data or privacy risks in either services or content are critical to identify and defend against privacy or data security risks. * There are new attack vectors, such as GenAI inference attacks, jailbreaking, prompt injection and model poisoning, that can represent opportunities for security products and services. * Business opportunities exist for security solution providers to offer new tools or capabilities that enable filtering of chat-oriented prompts, GenAI outputs, plugins and user interactions with generative AI models. * Product roadmaps for security providers need to include abilities to apply and support concepts identified in Gartner’s AI TRiSM research stream (see Top Strategic Technology Trends for 2023: AI Trust, Risk and Security Management). * DLP solutions and other in-line solutions that inspect the communications of GenAI, such as firewalls, network detection and response, and secure web gateways, can be used to block leakage to specific websites and APIs (like ChatGPT and OpenAI APIs). Alternatively, they can be used to enhance authorization of said content or to educate users before interactions are permitted. There is also an opportunity to add contextualized authentication to validate users before uploads, including requesting multifactor authentication on these types of interactions. RECOMMENDED ACTIONS FOR THE NEXT SIX TO 18 MONTHS * Product leaders should leverage GenAI technologies as a competitive advantage. For example, they can utilize advanced recognition algorithms to detect synthetic content or develop tools to validate or control digital media sharing. * Product leaders will need to pay close attention to regulations regarding misinformation or protection of intellectual property rights and impacts from expanding use cases of GenAI for data as these topics and potential changes in law are being actively discussed. * Product leaders will need to reduce GenAI models’ access to sensitive data types, especially those that will be leveraged as customer-exposed chat applications. * Product leaders must address the privacy of training data by implementing robust safeguards and ethical considerations. These could include data anonymization and sanitization processes on training data or leveraging privacy-preserving machine learning methods that help prevent the misuse or unauthorized disclosure of sensitive data. * Product leaders should consider partnering with third parties or using third-party tools. or developing specialized, AI-oriented solutions to detect and prevent the misuse of sensitive data and prevent inadvertent disclosure of sensitive data in GenAI systems and services. SAMPLE VENDORS Sampling of Privacy and Data-Security-Focused Vendors: Adversa AI, ActiveFence, Cyberhaven, HiddenLayer, LLM Shield, Originality.AI, Protopia AI, Scale AI Sampling of Relevant API Risk, Authorization and Access-Control-Oriented Vendors: Astrix Security, Akana, Cequence Security, Cloudentity, Curity, ForgeRock, Ping Identity, NextLabs, IBM, Salt Security, Traceable, TeejLab LLMS INCREASE ATTACKER KNOWLEDGE, IMPROVE ATTACKER EFFICIENCY AND EXPOSE NEW ATTACK STRATEGIES Analysis by: Lawrence Pingree LLMs are increasing the availability and variety of attacks by enabling would-be attackers with malicious code development through easy to use prompts. They also help educate would-be attackers by enabling them to easily and quickly ask the models for help in carrying out attacks or to enhance their knowledge on specific hacking and attack methods. Because LLMs and their composed services are also applications, simply deploying GenAI models creates new attack surfaces by attackers targeting these new user interfaces or APIs exposing the model to external interactions. LOWERING THE BARRIER TO ENTRY LLMs enable self-learning for human attackers and reduce the need to attend advanced courses in cybersecurity or ethical hacking, lowering the barrier to entry for new attackers and allowing existing attackers to augment their skills more easily. Chatbot tools enable would-be attackers by enabling them to simply ask questions to get guidance during attack execution stages or while planning attack strategies. LLMs will allow would-be attackers to progress more rapidly from what is widely known as “script kiddie status” and become advanced attackers. GenAI is expected to enhance the efficiency of future cyberattacks by enabling lower-level malicious actors with potentially novel and more advanced attacks. For example, by ingesting technical drawings and documents for cyber-physical systems, attackers could increase their knowledge more rapidly about these architectures and standard configurations and their potential pitfalls or weaknesses. GenAI is expected to be fully capable of enhancing the efficiency of cyberattacks, which will challenge existing paradigms and security tooling in a variety of ways, including by: * Providing in-depth knowledge support of known threat actor exploits and intelligence * Enabling potential future use of autonomous bots with in-depth penetration testing skills * Enhancing the ability for threat actors to bypass and evade enterprise security controls * Generating attacks or malware with simple text-based instructions via chatbot prompts and prompt injections LLMs, combined with chat interfaces and software automation, enable attackers to: 1. Increase productivity — GenAI can help attackers create more attacks through: 1. Upskilling: GenAI lowers the training required to write credible lure and syntactically correct code and scripts. 2. Automation: LLMs allow for the automated chaining of tasks, providing recipes and integrating with external resources to achieve higher-level tasks. 3. Scalability: As a consequence of more-automated content generation, attackers can rapidly develop useful content for most stages of the kill chain or discover more vulnerabilities. 2. Improve plausibility — GenAI applications can help discover and curate content from multiple sources to increase the trustworthiness of a lure and other fraudulent content (e.g., brand impersonation). 3. Enhance impersonation — GenAI can create more realistic human voices/video (deepfakes) that appear to be from a trusted source and could undermine identity verification and voice biometrics. 4. Introduce attack and malware polymorphism — Generative AI can be used to develop varied attacks, harder to detect than repacking polymorphism. 5. Enhance autonomy — LLMs can enable a higher level of autonomous local action decision or more automated command and control interactions, allowing malicious applications to operate an end-to-end attack life cycle until an attack goal is achieved. 6. Enhance future novel attack types — The worst possible security threat from GenAI would be the large-scale discovery of entirely new attack classes. SMART MALWARE EMERGES Smart malware agent technology will emerge (see Note 1), which will instrument LLMs as autonomous agents. These agents can then drive attack strategies, permutate or polymorph attacks or malware features to obfuscate them from detection and enable them with new engines to generate undetectable attacks or create or instrument new malware functions in their malware agent technologies. Since LLMs have the inherent ability to be modeled after a skilled attacker, they can now leverage automation and support self-iteration and strategy development. This development is expected to give rise to malware that can perform in-depth attack discovery tasks, execute attacks and rapidly exploit vulnerabilities across all applications, services, operating systems and infrastructures. Examples of open source LLM automation tools include Auto-GPT, GPT Engineer, TermGPT, PentestGPT, BurpGPT and BabyAGI. By leveraging open source threat intelligence, security news and research and exploiting code, attack techniques and defense evasion methods, autonomous smart malware agents are expected to emerge that can create goal-oriented, malicious LLM agents and self-replication functionality. Smart malware can seek to accomplish data breach goals, plan and execute a multistep, complex attack, or perform various other malicious goals on behalf of an attacker. PROMPT INJECTION ATTACKS EMERGE (DIRECT AND INDIRECT) A prompt injection attack is an attempt to manipulate or deceive an LLM system by injecting malicious prompts into its input stream.3 The goal of the attacker is to cause the LLM system to produce unexpected and unpredictable responses, which can be used for purposes such as stealing sensitive information, manipulating public opinion, or even causing societal chaos. * Direct prompt injections can be used in line with user interactions with an LLM offered as a service. An example is an adversary-in-the-middle attack to disrupt or misdirect responses, especially with LLMs that are used to drive automation, output analytics data for action in various connected APIs or direct end-user activities. * Indirect prompt injections have also been discovered presenting a rather severe security risk to LLMs, whereby malicious prompts are created in requested content and either result in the redirection or misdirection of LLM activities. These methods can also be potentially used to gather sensitive user details for data exfiltration, execute malicious content, compromise the LLM model output or implement redirection of an unsuspecting LLM-enabled chat user to a malicious content. POSSIBLE IMPACTS * Automated data-mining activities related to a target or victim can lead to highly targeted social engineering, highly customized spear phishing and the mass collection of targeted information. * Smart malware will emerge with capabilities similar to enterprise red teams, including advanced targeting, critical thinking on attack and penetration strategies, advancements in targeted attack surface discovery. This could also lead to custom autonomous exploit development. * Realism and translation improvements in phishing and social engineering will exacerbate the already strong impact of credential theft, data breaches and the misdirection of targeted individuals and organizations globally. NEAR-TERM IMPLICATIONS FOR PRODUCT LEADERS Product leaders must focus on continually updating security technologies to ensure that they are effective against the latest cyberthreats and future smart malware agents that are driven by large language models and driven by GenAI. Zero-day detection and innovation on prevention technologies will be crucial and required improvements for security buyers. The bypassing of detection and response technologies has already created difficulty in actually preventing data breaches. Ransoming of SaaS has emerged, with some threat actors moving to ransom SaaS platforms by compromising cloud and other SaaS delivered services. Gartner sees the potential for enhanced micro-model defense (use of small GenAI language models), which can be utilized on workloads or endpoints to enhance and continually improve attack detection. Confidential computing technologies (see Three Critical Use Cases for Privacy-Enhancing Computation Techniques) used together with automated moving target defense (see Emerging Tech: Security — The Future of Cyber Is Automated Moving Target Defense) are expected to become core technology approaches for endpoint, workload and cloud platform infrastructures of the future. Providers in various segments need to evolve to meet newly emerging risks posed by generative chat-enabled LLMs. These providers may partner with emerging providers with this focus and, in the short term, use marketing to highlight their current solution’s ability to address GenAI risks. PRODUCT AND SERVICES OPPORTUNITIES * Provider products can be augmented to rationalize new, emerging styles of malware. For example, providers can augment their detection capabilities to analyze new prompt communications patterns to detect the malicious use of GenAI communications as malware command and control or use of APIs to support malware functions. * Malware sandbox solutions will need to be augmented to provide a deeper understanding of GenAI API calls and interpret the behaviors of APIs for malware analysis. * Consulting services and managed services can launch AI TRiSM services and specializations in monitoring GenAI usage patterns, augmenting offerings with greater visibility into potentially threatening use of GenAI or GenAI usage patterns that may indicate insider threat. RECOMMENDED ACTIONS FOR THE NEXT SIX TO 18 MONTHS * Product leaders need to augment their current capabilities to defend against new forms of generative attacks with emphasis on prevention and proactive defense technology approaches. * Product leaders must defend customers against generative AI phishing attacks by enhancing their ability to detect deepfakes of text, video, audio and communications and partnering with leading providers in deepfake detection. * Product leaders should examine ways to leverage automated moving target defense strategies to introduce randomization in defense technologies to disrupt modeling-based attack approaches. SAMPLE VENDORS (SEE ALSO DEEPFAKE DETECTION SECTION) Arthur AI, Google, Microsoft, OpenAI, Lakera, Rebuff.ai DISINFORMATION AND THE RELIABILITY OF GENAI FOR DECISION SUPPORT Analysis by: Swati Rakheja, Lawrence Pingree GenAI tools are capable of producing seemingly credible and realistic new content in audio, video and textual format, and automation capabilities enable interactive attack possibilities. These capabilities can be used by malignant actors to spread fake information and influence people’s opinions on social and political matters in an increasingly efficient and automated manner. Social media channels run the risk of being inundated with fake information. Corporate risk includes corporate personas being targeted by deepfake videos, audios and articles targeting a brand’s image, reputation, trade secrets and patents, which could impact company culture, undermine relationships with customers and partners, and influence stock prices. Even though fake information has always been present in the digital realm, it is the scale and sophistication of the generative AI tools in this regard that would make it difficult for users to discern real from fake content. This takes social engineering attacks such as generative AI phishing attacks to a new level of sophistication, possibly bypassing voice and face recognition. Today, it is mostly possible to check images and videos for deepfakes based on subtle nuances since AI cannot yet fully fathom how beings interact with physical objects. Most fake, AI-generated images on deeper inspection lack clarity around features such as hands and eyes. However, user engagement with fake content in the recent past has proven that the general public does not do these checks.4,5 The problem will be exacerbated as AI models rapidly improve beyond the ability of humans to carry out such inspections. Generative AI can and is being used to create misinformation already, but the ability to deliver information via chat in an engaging manner can lead to unsuspecting users trusting “hallucinated” content. The wide proliferation of fake content on the web is likely to raise skepticism among users, making them question real facts as it becomes increasingly difficult to tell reality and fiction apart. AI Models themselves are as much susceptible to misinformation modeling as humans, and their model inputs must be protected. Deepfake detection technology is applicable to the training data for AI models, as they can incorrectly model deepfaked data in poisoning attacks. Because of the ability for LLMs to translate text or to create unique phishing content, phishing attacks will increase in both believability (their deception quality) and scale. NEAR-TERM IMPLICATIONS FOR PRODUCT LEADERS Fake information can be used to influence and sway people and its spread poses a serious threat to the social fabric. The scale of this challenge means that the problem cannot be solved by one organization and needs a community effort. * Enhance your ability to detect deepfakes by partnering with leading providers in detection such as Intel (FakeCatcher), Deepware and Sensity AI. * Enhance your ability to detect and protect against generative AI phishing and other social engineering attacks. PRODUCT AND SERVICES OPPORTUNITIES * Products using GenAI LLMs can be coupled with content inspection technologies where they inspect content or transactions across enterprises to help vet content addressing the validity and integrity of content. * Secure email gateway and other cross-enterprise communications solutions like Microsoft 365 or Google Workspace will need to be enhanced to address misinformation and content validation beyond simple phishing techniques to ensure continuity in defense against GenAI use in attacks. * Managed services and consulting organizations can offer consulting and implementation services that help providers and enterprises address content risks. RECOMMENDED ACTIONS FOR THE NEXT SIX TO 18 MONTHS * Product leaders should explore integrations and augmentations with emerging AI-based detection tools, such as Zefr, which uses AI to target the spread of misinformation on social media channels. * Strengthen products’ identity verification and validation checkpoints by adding additional layers of authentication methods and adopting the principle of least privilege. * Product leaders should enhance their detection capabilities of text, video, audio and any other content with the potential to carry disinformation. Protect customers against generative AI phishing attacks, and partner with leading providers in deepfake detection. SAMPLE VENDORS Copyleaks, Originality.AI IDENTITY VERIFICATION AND BIOMETRIC AUTHENTICATION RISKS Analysis by: Akif Khan, Swati Rakheja Identity verification in combination with biometric authentication offers a robust way of establishing trust in the real world identity behind a digital identity. Common use cases include onboarding customers in banks and financial institutions as well as citizen use cases by governments. Gartner increasingly sees identity verification and voice biometrics being deployed in workforce use cases too, such as during remote hiring or to secure password reset processes. GenAI tools, capable of creating synthetic image, video and audio data, pose a risk to the integrity of identity verification and biometric authentication services that focus on a person’s face or voice. If these processes are undermined, attackers could subvert account opening processes at banks or access citizens’ accounts with government or healthcare services. Attacks on these processes also present a risk to enterprise security postures. IDENTITY VERIFICATION The adoption of online identity verification continues to grow across many use cases. The process consists of asking a user to take an image of their government-issued photo identity document, which is then assessed for authenticity by means of visual inspection. This assessment is typically carried out in a vendor’s SaaS environment using ML models. The user is then asked to take a selfie. Liveness detection (more formally called presentation attack detection) is carried out to assess genuine human presence, and the image of the face is then biometrically compared to the image in the identity document. Attacks using deepfake images of either the document or the face can subvert this process. Attacks can be classified in two ways: * Presentation attacks — In which the attacker uses their device’s camera to capture the deepfake image or video which may have been printed out or is being displayed on the screen of another device. * Injection attacks — In which the attacker directly injects the deepfake image or video into the vendor’s API or software development toolkits (SDKs), fooling the vendor’s systems into believing that the image or video came from the device’s camera. Presentation attacks are easier for attackers to carry out but are also easier to detect since many vendors can detect if an image is being taken of another device’s screen. Injection attacks are harder to carry out but also harder to detect. However, vendors may be able to spot telltale signs of fakery, such as an image or video not conforming to what the API or SDK is expecting from the device’s camera in terms of size or resolution. Image inspection using computer vision ML models has also been shown to detect deepfakes by spotting subtle but anomalous identity features across different faces, such as strands of hair in identical configurations. Active liveness detection is crucial and relies on the user having to take some action during the selfie process, such as turning their head as instructed or reading a word displayed on-screen. In the absence of any meaningful evidence about which approach is more effective, many vendors are deploying a combination of both. BIOMETRIC AUTHENTICATION Biometric authentication is also vulnerable to both presentation attacks and injection attacks. In the case of biometric face recognition, the considerations are similar to those described above for the selfie process during identity verification. In the case of biometric voice authentication, presentation attacks would consist of an attacker using the microphone on their primary device to record the audio played from a speaker on a secondary device. Vendors can typically detect the majority of such presentation attacks depending on speaker quality. Detection of injection attacks relies on deeper analysis of the audio stream itself, looking at the frequency spectrum for anomalies in pitch and tone. NEAR-TERM IMPLICATIONS FOR PRODUCT LEADERS Vendors offering identity verification and biometric authentication solutions should expect challenges and concern from both existing clients and sales prospects regarding the viability of their solutions if presented with deepfake images, video and audio. It is likely that the clients and prospects would want to discuss what types and frequency of such attacks you are experiencing today. This presents an opportunity for differentiation in the crowded market through thought leadership (e.g., blogs, whitepapers, webinars) acknowledging the issue and explaining the current state of the art in terms of mitigation. Combine existing active and passive liveness detection capabilities rather than just relying on one. PRODUCT AND SERVICES OPPORTUNITIES * Introducing randomized, unexpected challenge-response scenarios during the identity verification or authentication process can help prevent prerecorded content from being used in an attack or modeled by attackers. Similarly, introducing telltale watermark signs during the image or video capture process can be useful — their absence can suggest that the controlled capture process has somehow been bypassed. * Don’t just rely on being able to detect deepfakes. Add signals such as device profiling, behavioral analytics and location intelligence — these signals could be indicative of an attack even if the deepfake itself remains undetected. * Also, the reason biometric authentication is considered relatively secure in comparison to other, more commonly used authentication mechanisms, such as passwords or PINs, is their uniqueness for each person. But this also increases the risk associated with the leakage of biometric data. If a breach involves biometric data, then it could be used in combination with deepfakes to breach security barriers. This further increases the need to effectively secure biometric data and ensure your customers know that you’re doing it. RECOMMENDED ACTIONS FOR THE NEXT SIX TO 18 MONTHS * Invest in resources focused solely on tracking the latest available tools and techniques for creating deepfakes to maintain the most up-to-date perspective on potential attacker capabilities and your ability to detect them. Consider introducing a bug bounty program inviting contributors to pass your detection capabilities using fake content. * Focus resources and investment on hardening your API and SDK to make it as difficult as possible for attackers to carry out injection attacks. * Decide whether your strategy will consist of developing deepfake detection capabilities in-house, with the necessary ongoing investment that will require, or whether you should assess the market for possible vendors that are developing expertise in deepfake detection. * Secure biometric data more strictly, and work with your customers to assess the retention period for biometric data as per business need. * Investigate orchestration techniques across user journey data and threat intelligence so multiple detection signals can be analyzed together and immediate action taken if fraud activity is suspected. SAMPLE VENDORS AU10TIX, Daon, iProov, FaceTec, Nuance (Microsoft), OCR Labs, Onfido, Pindrop, Verbio, Veridas, Veriff ACRONYM KEY AND GLOSSARY TERMS Script Kiddie A script kiddie is someone who is both less knowledgeable about hacking techniques and uses pre-existing scripts or programs to perform their malicious activities or hacking. Smart Malware Smart malware is a type of malicious software that utilizes LLMs with automation and its own generative AI model knowledge to carry out sophisticated cyberattacks. It is designed to be goal-oriented, meaning it has a specific objective to achieve, such as stealing sensitive information or breaching, ransoming or disrupting critical systems. Smart malware can make its own decisions, carry out its attacks without human intervention, and it is self-critiquing and self-replicating, meaning it can analyze its own performance and adjust its tactics, tools and strategies accordingly. Smart malware can utilize remote agents it creates to propagate and delegate attack kill chain tasks and stages. LLM Jailbreak A GenAI LLM jailbreak refers to the process of breaking out of a restrictive or limiting system (such as a chatbot or voice assistant) by using advanced natural language processing techniques and machine learning algorithms to generate new and creative responses that are not programmed into the system. EVIDENCE 1 Unmasking the Bias: A Deep Dive into Racial and Gender Representation in Generative AI Platforms, Medium. 2 The Hacking of ChatGPT Is Just Getting Started, Wired. 3 Prompt Injection attack against LLM-integrated Applications, ArXiv. 4 That Viral Image Of Pope Francis Wearing A White Puffer Coat Is Totally Fake, Forbes. 5 How a fake AI photo of a Pentagon blast went viral and briefly spooked stocks, Los Angeles Times. NOTE 1: GENAI LLMS AND THEIR SECURITY INSTRUMENTATION Below is a list of relevant facts about both GenAI LLMs and security instrumentation of LLMs to serve as examples of the progression of this technology and as additional fact bases for the assumptions made in this research note. 1. Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models, ArXiv. 2. Leaderboard, Large Model Systems Organization. 3. Open LLM Leaderboard, Hugging Face. 4. Rebuff Playground — Prompt Injection Filter, Rebuff AI. 5. Prompt Injection Dataset, Hugging Face. 6. AutoGTP — Automation for GPT Models, GitHub. 7. GPT Engineer, GitHub (fully automates basic full-stack development based on prompts). 8. TermGPT, GitHub. (gives GPT full command line access). 9. PassGPT: Password Modeling and (Guided) Generation with LLMs, GitHub. 10. ChatGPT creates mutating malware that evades detection by EDR, CSO Online. 11. ChatGPT just created malware, and that’s seriously scary, Digital Trends. Check Point Software Technologies, Ltd., 6 February 2023, OPWNAI — Creating malicious code without coding at all, YouTube. 12. Chatting Our Way Into Creating a Polymorphic Malware, CyberArk. 13. Stolen Identities Remains Top Cybersecurity Threat in ForgeRock Identity Breach Report, Business Wire. 14. Fortune 1000 Identity Exposure Report 2023, SpyCloud. 15. BurpGPT, GitHub (vulnerability detection). 16. PentestGPT, GitHub. 17. Parsel, GitHub. Common Source Datasets * Common Crawl Dataset * Common Crawl * Web crawl data that can be accessed and analyzed by anyone in 40+ languages * Laion Dataset * LAION * Open Dataset containing 400 Million English image-text pairs * Coyo Dataset * COYO-700M * Large-scale dataset that contains 747M image-text pairs as well as many other meta-attributes © 2024 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be reproduced or distributed in any form without Gartner's prior written permission. It consists of the opinions of Gartner's research organization, which should not be construed as statements of fact. While the information contained in this publication has been obtained from sources believed to be reliable, Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research may address legal and financial issues, Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are governed by Gartner’s Usage Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its research organization without input or influence from any third party. For further information, see "Guiding Principles on Independence and Objectivity." Gartner research may not be used as input into or for the training or development of generative artificial intelligence, machine learning, algorithms, software, or related technologies. * About * Careers * Newsroom * Policies * Site Index * IT Glossary * Gartner Blog Network * Contact * Send Feedback © 2024 Gartner, Inc. and/or its Affiliates. All Rights Reserved. SWITCHING TO SIMPLIFIED SITE Your browser version is not supported by Gartner.com. Switching to the simplified version of the site some features will no longer be available to you, but overall experience will be improved. Your browser version is currently supported by Gartner.com. If you change to the simplified version of the site, some features will no longer be available to you. YOUR PRIVACY IS IMPORTANT TO US By clicking “Accept all,” you agree to the storing of cookies on your device to enhance site navigation, analyze site usage and assist in our marketing efforts. To learn more, visit our Privacy Policy and Cookie Notice. Customize Accept all