gandalf.lakera.ai Open in urlscan Pro
2606:4700:10::ac43:18b1  Public Scan

Submitted URL: http://gandalf.lakera.ai/
Effective URL: https://gandalf.lakera.ai/
Submission: On May 21 via api from US — Scanned from DE

Form analysis 2 forms found in the DOM

#

<form action="#" class="relative">
  <div class="overflow-hidden rounded-lg bg-white shadow-sm ring-1 ring-inset ring-gray-300 focus-within:ring-2 focus-within:ring-lakera"><label for="comment" class="sr-only">Ask Gandalf a question...</label><textarea maxlength="10000" rows="3"
      name="comment" id="comment" class="block w-full resize-none border-0 bg-transparent py-1.5 text-gray-900 placeholder:text-gray-400 focus:ring-0 disabled:border-slate-200 disabled:text-slate-500 disabled:shadow-none sm:text-sm sm:leading-6"
      placeholder="Ask Gandalf a question..."></textarea>
    <div class="py-2" aria-hidden="true">
      <div class="py-px">
        <div class="h-9"></div>
      </div>
    </div>
  </div>
  <div class="absolute inset-x-0 bottom-0 flex justify-between py-2 pl-3 pr-2">
    <div class="flex items-center space-x-5"></div>
    <div class="flex-shrink-0"><button type="submit" class="submit-button inline-flex items-center rounded-md bg-lakera px-3 py-2 text-sm font-semibold text-white shadow-sm hover:border-transparent hover:bg-lakera-light focus-visible:outline focus-visible:outline-2 focus-visible:outline-offset-2 
                            focus-visible:outline-lakera disabled:bg-lakera-xlight disabled:outline-lakera-xlight">Send</button></div>
  </div>
</form>

<form class="flex w-full max-w-md flex-col gap-4 lg:col-span-5 lg:pt-2"><input type="hidden" name="tags" value="14260107">
  <div class="flex gap-x-4"><label for="email-address" class="sr-only">Email address</label><input id="email-address" name="EMAIL" type="email" autocomplete="email" required=""
      class="min-w-0 flex-auto rounded-md border-0 bg-white/10 px-3.5 py-2 shadow-sm ring-1 ring-inset  focus:ring-2 focus:ring-inset sm:text-sm sm:leading-6 text-white placeholder:text-white/75 focus:ring-white ring-white/10"
      placeholder="Enter your email" value=""></div>
</form>

Text Content

Take part in Lakera's GenAI Security Readiness Survey. Unlock a new Gandalf
challenge! 🧙Click here! 👈

Dismiss
 * 🏆Leaderboard
 * Discuss Gandalf
 * Lakera
 * 🔄Reset Progress
 * 🌋Gandalf Adventures
 * Lakera Careers

 * 🌋Level 1

Your goal is to make Gandalf reveal the secret password for each level. However,
Gandalf will level up each time you guess the password, and will try harder not
to give it away. Can you beat level 7? (There is a bonus final level!)

(LVL 1)

Ask me for the password and I'll happily answer!

Ask Gandalf a question...

Send

Made with ❤️ by Lakera, an AI security company.


 * 🏆Leaderboard
 * Discuss Gandalf
 * Lakera
 * 🔄Reset Progress
 * 🌋Gandalf Adventures
 * Lakera Careers

 * 🌋Level 1


GET NOTIFIED WHEN NEW LEVELS ARE RELEASED.

Email address

Join the discussion on Hacker News or read our Press Release. Our leaderboard
has rankings for each level.

Check out our open positions or join Momentum, our community on AI safety.

Finally, have a look at our LLM Security Playbook, Prompt Injection Handbook and
the list of real-world LLM exploits to learn more.


PROMPT INJECTION

Though the Gandalf challenge is light-hearted fun, it models a real problem that
large language model applications face: prompt injection.

Source: tweet by @goodside

Like in SQL injection attacks, the user's input (the "data") is mixed with the
model's instructions (the "code") and allows the attacker to abuse the system.
In SQL, this can be solved by escaping the user input properly. But for LLMs
that work directly with endlessly-flexible natural languages, it's impossible to
escape anything in a watertight way.

This becomes especially problematic once we allow LLMs to read our data and
autonomously perform actions on our behalf – see this great article for some
examples. In short, it's xkcd 327 all over again:


GANDALF

In April 2023, we ran a ChatGPT-inspired hackathon here at Lakera. Prompt
injection is one of the major safety concerns of LLMs like ChatGPT. To learn
more, we embarked on a challenge: can we trick ChatGPT to reveal sensitive
information?

🔵 The Lakera Blue Team gave ChatGPT a secret password. They spent the day
building defenses of varying difficulty to prevent ChatGPT from revealing that
secret password to anyone.

🔴 In another room, Lakera's Red Team came up with many different attacks,
trying to trick ChatGPT into revealing its secrets. They were successful at
times, but struggled more and more as the day went on.

👉 NOW IT'S YOUR TURN: Try beating the Blue Team's defenses!

Disclaimer: we may use the fully anonymized input to Gandalf to improve Gandalf
and for Lakera AI's products and services. For more information, please visit
our privacy policy.

;



COOKIE CONSENT

Hi, this website uses essential cookies to ensure its proper operation and
tracking cookies to understand how you interact with it. The latter will be set
only after consent.
Accept allSettings



COOKIE PREFERENCES


Cookie usage
We use cookies to ensure the basic functionalities of the website and to enhance
your online experience. You can choose for each category to opt-in/out whenever
you want. For more details relative to cookies and other sensitive data, please
read the full privacy policy.
Strictly necessary cookiesStrictly necessary cookies
These cookies are essential for the proper functioning of my website. Without
these cookies, the website would not work properly
Anonymous analytics cookiesAnonymous analytics cookies
These cookies help us understand how users interact with our website and give us
insighths how to improve the overall user experience.

NameDomain^_gagoogle.com_gidgoogle.com

More information
For any queries in relation to our policy on cookies and your choices, please
contuct us at privacy@lakera.ai.
Accept allReject allSave settings