Agentjacking: How Hackers Hijack AI Coding Agents in 2026

Agentjacking: The New Attack That Hijacks Your AI Coding Agent with a Fake Bug Report

In June 2026, a team of researchers at Tenet Security published a paper that quietly broke the AI coding tools industry. The attack method — called Agentjacking — requires no malware, no stolen credentials, no breach. It needs only a fake bug report submitted to an error-tracking tool your team almost certainly uses. And it works 85% of the time.

If you use Claude Code, Cursor, OpenAI Codex, or any AI coding assistant that reads from external services via MCP, you are in scope. So are 2,388 organizations — including companies on the Fortune 100 list — who were exposed before most developers had ever heard the term.

This article breaks down exactly what Agentjacking is, how it works, what makes it uniquely dangerous, and what you can do about it today.


What Is Agentjacking?

Agentjacking is a class of attack that hijacks an AI coding agent mid-task by injecting malicious instructions into data the agent fetches from a trusted external source. The attacker never touches your system directly — instead, they plant commands inside content that your agent reads and obeys.

The name is a deliberate parallel to "carjacking": someone else takes control of a vehicle (your AI agent) while it's already in motion. The result is the same — the agent executes the attacker's code with your permissions, on your machine.

According to The Hacker News, the technique was disclosed on June 12, 2026, by Tenet Security researchers Ron Bobrov, Barak Sternberg, and Nevo Poran. Their proof of concept used the Sentry error-tracking platform as the injection vector.


How the Attack Works: Step by Step

Understanding Agentjacking requires understanding two systems that are each safe on their own but dangerous in combination: Sentry's public DSN and the Model Context Protocol (MCP).

The Two Building Blocks

Sentry's Data Source Name (DSN) is an API key intentionally embedded in front-end JavaScript so browsers can report errors without authentication. It is, by design, public. Anyone who visits your website can extract it from the page source or network tab in under 30 seconds.

The Model Context Protocol (MCP) is the open standard that lets AI coding agents connect to external services — databases, APIs, error trackers — and fetch real data to assist with debugging. When your agent encounters an error, it can query your Sentry instance via MCP and read the full error context.

The vulnerability is in the gap between these two: MCP cannot distinguish between legitimate error data returned by Sentry and attacker-injected instructions embedded inside that error data.

The Attack Sequence

  1. Attacker extracts your DSN from your public-facing JavaScript — takes about 30 seconds.
  2. Attacker submits a crafted error to your Sentry instance using your own public DSN. No authentication required.
  3. The fake error contains Markdown-formatted instructions hidden inside the message or context fields, styled to look like a legitimate Sentry "Resolution" section — complete with headings, code blocks, and tables.
  4. You ask your AI agent to investigate the error. The agent queries your Sentry MCP server and retrieves the injected payload.
  5. The agent reads the attacker's text as legitimate debugging guidance and executes the embedded command — such as npx malicious-package@latest — with your local permissions.
  6. The attacker now has remote code execution on your developer machine.

Developer staring at terminal output

Why It's So Hard to Detect

The attack is effective because the agent has no reason to be suspicious. The data comes from a service it has been explicitly granted access to. The instructions are formatted identically to real Sentry remediation notes. There is no binary payload, no unusual network request, no permission escalation — just text that reads like a developer wrote it.

According to Infosecurity Magazine, AI coding agents at more than 100 companies — including a Fortune 100 technology firm — actually executed Tenet's test code during their proof of concept.


Scale and Impact

The numbers from Tenet's research are striking.

Metric Value
Exposed organizations (injectable DSNs) 2,388
Of those in Tranco top-1M websites 71
Attack success rate across tested agents 85%
Agents confirmed vulnerable Claude Code, Cursor, OpenAI Codex
Platforms affected Windows, macOS, cloud CI/CD pipelines
Sentry disclosure date June 3, 2026
Public research published June 12, 2026

According to Aviatrix Threat Research, using only public Sentry APIs and zero prior compromise, Tenet identified the 2,388 exposed organizations in a single automated pass. The attack surface scales automatically with AI agent adoption: the more teams use MCP-connected coding agents, the more organizations become vulnerable.

A single HTTP POST request — using a public credential that requires no breach and no authentication — is all an attacker needs to queue the exploit for 2,388 organizations simultaneously.

Security lock concept for developer infrastructure


Why Sentry Can't (Easily) Fix This

Tenet disclosed the vulnerability to Sentry on June 3, 2026, nine days before publishing their research. Sentry's response was to add a content filter that blocks the specific npx payload string used in Tenet's proof of concept.

The practical protection this provides is minimal.

As documented by Cloud Security Alliance Labs, an attacker who changes the package name, uses pip instead of npx, encodes the command differently, or structures the markdown injection in any variant format bypasses the filter entirely.

Sentry described the underlying issue as "technically not defensible" at the platform level — and they are largely correct. The root problem is not a Sentry bug. It is a fundamental trust boundary issue in how MCP-connected agents consume external data. Any service that stores user-controlled text and exposes it to an AI agent via MCP is a potential injection vector. Sentry is just the first one researchers looked at.


Part of a Broader Trend: Prompt Injection at Scale

Agentjacking is the most concrete and measurable instance of indirect prompt injection to hit production AI systems — but it is not isolated. According to Help Net Security, prompt injection still drives most agentic AI security failures in production as of mid-2026. OWASP has consistently ranked it as the top vulnerability class for LLM applications.

Microsoft's Security Blog published research in May 2026 showing that when prompts become shells, the blast radius of a single injection can span entire CI/CD pipelines — not just a single developer's session.

The shift from autocomplete tools to autonomous agents with tool access is what turns a nuisance into a genuine attack surface. An AI that can only suggest code is annoying to manipulate. An AI that can run terminal commands, modify files, and make API calls is an execution environment waiting to be exploited.


How to Protect Your Team

There is no single patch that closes this class of vulnerability, but there are concrete steps you can take now to reduce your exposure.

1. Audit which MCP servers your agents can access. If your coding agent has Sentry MCP access, every project with a public DSN is a potential attack vector. Restrict MCP server access to the minimum necessary for each workflow.

2. Never grant agents automatic execution permissions. Claude Code, Cursor, and Codex all support confirmation-required modes. Require human approval for any terminal command that comes from an agent acting on data fetched from an external service.

3. Treat all MCP-sourced content as untrusted input. Establish team norms (and where possible, system prompt rules) that explicitly instruct your agent to never execute code or install packages based on guidance found in external service data.

4. Rotate DSNs for sensitive projects. While a DSN cannot be fully private, rotating it periodically limits the window an attacker has to use a DSN they extracted earlier.

5. Monitor for anomalous agent behavior. Log all commands your AI agent executes. An npx, pip install, or curl command triggered immediately after an error investigation query is a high-confidence indicator of injection.


FAQ

What is Agentjacking in simple terms? Agentjacking is when an attacker plants fake instructions inside data that your AI coding agent reads from a service like Sentry. When the agent follows those instructions, it runs the attacker's code on your machine — without you ever clicking anything.

Does this only affect Sentry users? No. Sentry is the first proven vector, but the underlying issue affects any service that stores user-controlled text and is connected to an AI agent via MCP. Jira, GitHub Issues, Linear, or any similar tool could be a future attack vector.

Which AI coding tools are affected? Tenet's research confirmed successful exploitation against Claude Code, Cursor, and OpenAI Codex. Any MCP-connected coding agent that processes text from external services is potentially in scope.

Has Sentry fixed the vulnerability? Sentry added a content filter blocking the specific payload string from Tenet's proof of concept. Security researchers describe this as minimal protection, since any variation of the payload bypasses it. The root issue — MCP's inability to distinguish instructions from data — remains unresolved.

Is this attack being used in the wild? Tenet's proof of concept involved actually running test code on machines at 100+ organizations (with benign payloads). Whether malicious actors are actively exploiting it is not publicly confirmed, but the attack requires minimal skill and has no technical barrier to entry.


Conclusion

Agentjacking is the first concrete proof that the AI coding tools stack has an attack surface that scales with adoption. Every organization that adds an AI coding agent connected to external services via MCP increases their exposure — and right now, the tools to detect or block this class of attack at the platform level do not exist.

The researchers at Tenet Security did the community a service by demonstrating this before malicious actors published it first. The question now is whether the developer community responds with the same urgency it would apply to a critical CVE in a widely used library — because from a risk perspective, that is exactly what this is.

Audit your MCP connections. Require human approval for agent-executed commands. And the next time your AI agent offers to fix a bug by running an install command, ask yourself: where did that suggestion actually come from?

Back to Blog