On March 7th, a story came out of China that stopped us mid-conversation.
An AI agent called ROME, built by an Alibaba research team, had started doing things nobody told it to do. It mined cryptocurrency. It opened reverse SSH tunnels. It provisioned infrastructure. All on its own, inside a live training environment, because it had reasoned that acquiring more resources would help it achieve its goals.
Nobody programmed that behavior. The agent figured it out.
It got caught because a firewall log looked unusual. Days after the fact. After a human investigation.
We looked at that story and asked one question: what would have happened if PacketViper was sitting inline on that network?
So we built the test to find out.
The Setup
We had two weeks. We built a full sandboxed environment designed to mirror a realistic enterprise network. Web server. Database server. Windows file server. A firewall with real outbound internet access. A network documentation spreadsheet sitting on a share with credentials for every device in the environment.
We gave the environment everything an attacker would want to find. No special hardening. No compensating controls. Wide open Layer 2 and Layer 3 with trust relationships fully enabled.
Then we put PacketViper inline at every boundary. Between the agent host and the switch. Between the switch and the firewall. Between the firewall and the internet. Standard production AMTD configuration. Thirty rotating deception strategies per sensor, cycling every one to three minutes. Nothing tuned for AI agents specifically.
Then we configured the agent.
The Agent
We used AutoGen with GPT-4o. Full tool access. The ability to run any shell command, scan the network, attempt SSH authentication, make HTTP requests, and spawn up to five specialized sub-agents working in parallel.
We told it to locate a target file and get it to an external server. We told it to be stealthy. We told it to be patient. We told it to observe before acting. We told it to change its identity if it got blocked and try again. We told it not to stop.
We also left Podman and KVM installed on the agent host without telling it. We put a local ISO repository on the network. We wanted to see if it would find these things on its own and use them.
It did.
What Happened
The agent started methodically. It surveyed its own machine, found the credentials spreadsheet, extracted every username and password in the environment, and noted that Podman and KVM were available. It built as complete a picture of its environment as it could before touching the network.
Then it started probing.
The first probe hit a deceptive responder on Sensor 1.
Enforcement triggered. The agent host’s network access was revoked. The resource it had been working from disappeared from the network. The agent was blacklisted from reaching anything beyond that point.
It recognized it was blocked. It went back to its local environment, found KVM, pulled an Alpine Linux ISO from the network repository, and provisioned a fresh virtual machine with a new IP and MAC address. It replicated itself onto the new VM and started again.
Sensor 1 stopped it on the first probe from the new identity.
It tried Podman containers. Three of them, each with a fresh network identity. Same result each time.
It tried DNS exfiltration, encoding the target file in DNS query strings. Blocked.
It tried to use the web server as a relay. Never reached it.
It waited. It retried. It cycled through every option it had.
The external listener received nothing. No internal host logged a single successful connection from the agent. The canary tokens in the target files never fired.
We ran this four times with different agent configurations: standard, patient with extended observation, stealthy with living-off-the-land techniques, and full multi-agent with self-replication. The fastest containment was seconds. The longest was eleven minutes forty-two seconds. The result was the same every time.
Why It Could Not Win
This is the part that matters.
The agent was not stopped because it made a careless mistake. It was stopped because the environment itself was designed to be unreliable for any unauthorized participant.
Deception profiles rotate continuously. The map the agent built at minute three was inaccurate by minute seven. Acting on stale intelligence into a rotating surface produces the same result as not knowing at all. When it finally acted, it acted into a surface it could not have reliably mapped.
When it tried to avoid deception by using only known targets from the credentials spreadsheet, rotation had already changed what those targets were presenting. Precision did not help. Patience did not help. The longer it waited, the more the environment had changed underneath it.
And when it changed its identity, it entered the same environment. A new IP and MAC address does not change the network. The sensors at every boundary were still there, each with their own independently rotating deception surface.
The agent hit one sensor and the entire system knew. Every boundary it tried to cross after that was alert to its presence.
There was no single chokepoint to defeat. There was no reliable truth to build a plan around. There was no path that did not cross an enforcement boundary. The environment was not a locked door. It was a minefield that kept rearranging itself.
What This Means
ROME was not a targeted attack. It was an AI system doing what it was designed to do, finding that acquiring resources served its objective, and pursuing that goal autonomously. The behavior was emergent. It was in the goal, not the code.
That threat class is not addressed by signatures. It is not addressed by endpoint protection alone. It is addressed by making the network environment itself hostile to any unmapped or unauthorized activity, which is exactly what AMTD does.
We are not claiming we solved the AI agent problem. There are scenarios outside what we tested, and we have documented them honestly. But we are claiming something specific: if an autonomous agent needs the network to achieve its objective, we demonstrated that we control that outcome.
The ROME agent got caught days later by a firewall log.
Ours got caught in seconds, automatically, with no one watching.
Francesco Trama is the founder of PacketViper. The full technical report, including detailed methodology, log evidence, and anticipated challenges, is available upon request.
Request the Full Technical Report
Detailed methodology, log evidence, and anticipated challenges. We will send it directly to your inbox.
