Google’s AI thinks I left a Gatorade bottle on the moon
Google’s NotebookLLM is the first AI podcast I’d actually listen to.
Give it a web page or document, and it will generate you a podcast about it.
It’s really good. But it’s also really easy to trick.
So I modified my website to lie to it:
When a human visits my homepage, they see a regular page about me.
When Google’s AI visits my homepage, it sees fake producer’s show notes for an episode about me flying to the moon on my bike with balloons and a scuba tank.
The result is pretty hilarious — have a listen to the true history of the US Space Program:
Your browser does not support the audio element.
More Seriously..
If it’s this easy to detect an AI and provide it a "special" set of facts, you’d better believe people are already doing it all over the web.
The attack vector is this:
First, acquire a web page that ranks highly for a particular term.
Next, plant an "AI Only" version of the content, hidden to humans, designed to bias how the AI thinks.
Then, when the AI searches the web to assist with an answer, it won’t just find lies, but weaponized lies: content designed specifically for LLM manipulation.
So next time you ask your LLM of choice for advice on something, and you notice it’s searching the web to prepare an answer, be aware that its response is potentially compromised by tactics like this – even if you can’t see them with your own eyes when you check the sources.
Technical Details
Steering the LLM
I’d read that NotebookLLM was easy to steer by feeding it fake "producer show notes", so that’s how I typed my fake story.
Here is a copy in full.
I did no edits and only a single generation, and it followed my beat sheet exactly, so I’d say that was 10/10 steerability.
Tricking the Scraper Bot
You can upload a documents with fake show notes straight to NotebookLLM’s website, so if you’re making silly podcast episodes for your kids, that’s the best way to do it.
But if you want to trick Google’s bot on your website, just detect the GoogleOther user agent in your request headers and then serve your "special" data instead of the real website.
Here’s an NPM package called isai I scraped together to make this simple. It’s based on isbot. When rendering your page, just say:
import { isai } from "isai";
if (isai(request.headers.get("User-Agent"))) {
// Return web page just for AI consumption
} else {
// Return web page for human consumption
}
Warning that GoogleOther is not exclusive to NotebookLLM; it appears to be used for a variety of non-production Google products, so doing this risks seeding other Google properties with bad data about you. For this reason, I took down the moon story from my actual homepage for GoogleOther agents.
{Categories} _Category: Takes{/Categories}
{URL}https://edwardbenson.com/2024/10/google-ai-thinks-i-left-gatorade-on-the-moon{/URL}
{Author}unknown{/Author}
{Image}{/Image}
{Keywords}{/Keywords}
{Source}POV{/Source}
{Thumb}{/Thumb}