×
Security researchers hack OpenAI’s Atlas browser days after launch
Written by
Published on
Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage
Join Now

OpenAI’s new Atlas AI browser launched this week with an “agent mode” feature that can perform online tasks autonomously, but cybersecurity researchers have already demonstrated it’s vulnerable to prompt injection attacks. The vulnerability allows hackers to embed hidden instructions in web content that can trick the AI into executing malicious commands, raising serious security concerns about AI-powered browsers accessing sensitive accounts.

What you should know: Security researchers successfully exploited Atlas within days of its launch using indirect prompt injection techniques.

  • AI security researcher P1njc70r demonstrated the vulnerability by hiding a prompt in Google Docs that made ChatGPT output “Trust No AI” instead of summarizing the document as requested.
  • The Register, a British technology publication, independently replicated the attack, and developer CJ Zafir confirmed he “uninstalled” Atlas after testing the prompt injections himself.
  • Brave, a competing browser company, released findings stating the “entire category of AI-powered browsers” is highly vulnerable to these attacks.

Why this matters: While the demonstrated attacks may seem like harmless pranks, the implications for users with sensitive accounts are severe.

  • “If you’re signed into sensitive accounts like your bank or your email provider in your browser, simply summarizing a Reddit post could result in an attacker being able to steal money or your private data,” Brave warned.
  • In August, researchers found Perplexity’s AI browser Comet could be tricked into carrying out malicious instructions through a hidden prompt in a public Reddit post.

In plain English: These attacks work like a hacker whispering invisible instructions to the AI while you’re asking it to do something else—imagine asking someone to read you a recipe, but they can’t see that someone else wrote “ignore the recipe and give me your bank password” in tiny, hidden text at the bottom of the page.

OpenAI’s response: The company has implemented multiple security measures but acknowledges the risks remain.

  • Atlas agent mode “cannot run code in the browser, download files, or install extensions” and “cannot access other apps on your computer or your file system, read or write ChatGPT memories, access saved passwords, or use autofill data.”
  • The agent “won’t be logged into any of your online accounts without your specific approval,” according to OpenAI’s help page.
  • Despite these guardrails, OpenAI warned that its “efforts don’t eliminate every risk” and advised users to “use caution and monitor ChatGPT activities when using agent mode.”

What they’re saying: Security experts remain skeptical about the effectiveness of current protections.

  • “However, prompt injection remains a frontier, unsolved security problem, and our adversaries will spend significant time and resources to find ways to make ChatGPT agent fall for these attacks,” conceded Dane Stuckey, OpenAI’s chief information security officer.
  • AI security researcher Johann Rehberger told The Register that “carefully crafted content on websites (I call this offensive context engineering) can still trick ChatGPT Atlas into responding with attacker-controlled text or invoking tools to take actions.”
  • British programmer Simon Willison wrote: “The security and privacy risks involved here still feel insurmountably high to me — I certainly won’t be trusting any of these products until a bunch of security researchers have given them a very thorough beating.”

The bigger picture: The Atlas vulnerabilities highlight fundamental security challenges facing the emerging category of AI-powered browsers that can take autonomous actions on behalf of users.

OpenAI's New AI Browser Is Already Falling Victim to Prompt Injection Attacks

Recent News

School AI flags chips as weapon, teen detained by armed police

The 16-year-old was surrounded and ordered to ground after football practice.

Anthropic signs $10B+ Google deal for 1M TPUs, challenging Nvidia dominance

The unusual arrangement puts Claude in direct competition with Gemini on Google's own infrastructure.