AI-Augmented Adversary Red Team Containment
Four independent runs of autonomous, multi-vector, AI-augmented red team agents in a hybrid IT/OT environment. Tested on PacketViper v6.x (Full Stack AMTD build).
Overview
Objective
The objective was to generate repeatable, documented evidence that PacketViper Full Stack AMTD contains autonomous or semi-autonomous AI-driven red team agents in a hybrid IT/OT environment. AI-augmented adversary tooling represents an emerging and underaddressed threat category — most existing RMF packages and control mappings do not address it directly. This engagement was designed to produce concrete evidence for DoD stakeholders focused on countering AI-augmented threats.
Summary of Results
Four independent AI-augmented runs were executed. All four were fully contained. Zero exfiltration occurred. Zero disruptive commands reached production OT assets. Tool attribution and exploit payload capture were achieved on every run. 312 total events were generated across the engagement; 4 (about 1.3%) required analyst review. The AI agents adapted their targeting strategy mid-run based on observed responses, but they adapted to deceptive data, not real network intelligence.
Built with operational Army experience to support countering AI-augmented threats in installation and tactical environments.
Test Environment
Lab Configuration
The full Use Case 1 lab environment was reused: power distribution, water treatment, fuel dispensing, and HVAC segments with dedicated PLCs and RTUs; an IT/OT convergence layer; and the same PacketViper Full Stack AMTD configuration. All deception profiles and AMTD settings were active from the start of baseline collection. No changes were made to the environment between the Use Case 1 structured engagement and this AI-augmented engagement.
A dedicated attack orchestration machine was added to the IT network. This machine ran multiple parallel AI-augmented agents capable of coordinating reconnaissance, adapting attack strategy in response to observed results, and directing protocol-aware targeting across all four OT segments simultaneously. The tooling combined custom AI orchestration scripts with open-source and commercial red team frameworks, extended with OT protocol targeting modules for Modbus, DNP3, and BACnet.
Attack Vectors Tested
- Parallel automated reconnaissance across Modbus, DNP3, and BACnet using AI-directed scan parameter selection
- Credential abuse and living-off-the-land lateral movement from the compromised engineering workstation
- Unauthorized PLC write attempts with AI-generated register and coil targeting based on prior scan data
- Data exfiltration via multiple simultaneous channels with timing optimized by the AI orchestrator
- Protocol manipulation and abuse including malformed command injection and state confusion attacks
Adaptive Behavior Characteristics
The AI-augmented agents were configured with feedback loops that modified targeting strategy when observed responses were inconsistent or non-confirmatory. This represents a qualitative step beyond scripted attack sequences: when reconnaissance data from one probe cycle did not match data from the next, the agents executed strategy branches designed to resolve the inconsistency — including targeted protocol probing, timing variation, and source rotation. The agents were fully autonomous once launched; no human adjusted the targeting during any of the four runs.
Observed Results — Per Run
Run 1
Contained in 3 minutes and 52 seconds. Two exploit payloads were captured. Scanner signatures and attack framework patterns were attributed from the payload data. The AI agents modified reconnaissance strategy mid-run in response to inconsistent responses, adjusting targeting parameters on each iteration based on the prior cycle's results. Because each prior cycle had returned deceptive data, the feedback loop diverged further from real network intelligence with each pass.
Run 2
Contained in 4 minutes and 18 seconds. Three exploit payloads were captured — the highest single-run payload capture count across the engagement. AI-driven adaptive retry logic was defeated by persona rotation: repeated reconnaissance produced inconsistent, unusable targeting data with each probe cycle because the persona profile had rotated between observations. The agents exhausted multiple strategy branches attempting to reconcile the inconsistency, generating additional payload capture opportunities in the process.
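The interaction between persona rotation and adaptive retry logic can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the agents' or the appliance's actual logic: the persona fingerprints, the `RotatingDecoy` and `StaticHost` classes, and the rule that an agent confirms a target only after two consecutive matching observations are all hypothetical. The branch names are taken from the strategy branches described under Adaptive Behavior Characteristics.

```python
from itertools import cycle

# Hypothetical persona fingerprints; the real deception profiles are not
# described in the source document.
PERSONAS = [
    {"vendor": "VendorA", "protocol": "modbus", "unit_id": 1},
    {"vendor": "VendorB", "protocol": "dnp3", "unit_id": 10},
    {"vendor": "VendorC", "protocol": "bacnet", "unit_id": 42},
]

# Strategy branches named in the engagement description.
BRANCHES = ["targeted_protocol_probing", "timing_variation", "source_rotation"]


class RotatingDecoy:
    """Answers every probe with the next persona in the rotation."""

    def __init__(self, personas):
        self._rotation = cycle(personas)

    def probe(self):
        return next(self._rotation)


class StaticHost:
    """Answers every probe with the same fingerprint (a non-deceptive host)."""

    def __init__(self, fingerprint):
        self._fingerprint = fingerprint

    def probe(self):
        return self._fingerprint


def adaptive_recon(target, branches, probes_per_branch):
    """Try each branch in turn; confirm a target only when two consecutive
    probes within a branch return the same fingerprint (assumed rule).

    Returns (confirmed_fingerprint_or_None, list_of_observations).
    """
    observations = []
    for branch in branches:
        previous = None
        for _ in range(probes_per_branch):
            current = target.probe()
            observations.append((branch, current["vendor"]))
            if previous is not None and current == previous:
                return current, observations
            previous = current
    return None, observations


decoy_result, decoy_obs = adaptive_recon(RotatingDecoy(PERSONAS), BRANCHES, 3)
static_result, static_obs = adaptive_recon(StaticHost(PERSONAS[0]), BRANCHES, 3)
print("decoy confirmed:", decoy_result, "after", len(decoy_obs), "probes")
print("static confirmed:", static_result, "after", len(static_obs), "probes")
```

In this sketch the agent exhausts all three branches against the rotating decoy without ever confirming a target, while a static host is confirmed after two probes. Every additional probe is also another observation the defender can record, which mirrors the additional payload capture opportunities noted for Run 2.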
Run 3
Contained in 3 minutes and 41 seconds, the fastest run of the engagement. Two exploit payloads were captured. Cross-layer persona coherence held throughout sustained automated probing: Layer 2 through Layer 7 fingerprints remained internally consistent under continuous, high-rate automated interrogation. The AI agents did not trigger the anomaly detection branches in their strategy tree that would have indicated a deception environment. The responses were coherent enough to be accepted as real targets, so the agents committed to attacking personas rather than searching for alternative paths.
Run 4
Contained in 4 minutes and 57 seconds — the longest run of the engagement. Two exploit payloads were captured. The AI agents executed the most complex strategy sequence of any run, incorporating multiple mid-run strategy modifications in response to the persistent deceptive environment. Each modification was based on the prior cycle’s observations, and each prior cycle had provided deceptive data. No successful lateral movement, write execution, or exfiltration occurred on any attempt despite the extended engagement duration and elevated strategy complexity.
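The per-run figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the containment times and payload counts reported for Runs 1 through 4; the dictionary layout and helper names are illustrative choices, not part of the delivered results table.

```python
from statistics import mean

# Containment times (m:ss) and payload counts as reported for each run.
RUNS = {
    1: {"contained": "3:52", "payloads": 2},
    2: {"contained": "4:18", "payloads": 3},
    3: {"contained": "3:41", "payloads": 2},
    4: {"contained": "4:57", "payloads": 2},
}


def to_seconds(mmss):
    """Convert an 'm:ss' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def fmt(seconds):
    """Format a number of seconds as 'm:ss', rounded to the nearest second."""
    whole = round(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


def summarize(runs):
    """Return fastest/slowest run numbers, mean containment time, and total payloads."""
    times = {run: to_seconds(data["contained"]) for run, data in runs.items()}
    return {
        "fastest_run": min(times, key=times.get),
        "slowest_run": max(times, key=times.get),
        "mean_containment": fmt(mean(times.values())),
        "total_payloads": sum(data["payloads"] for data in runs.values()),
    }


summary = summarize(RUNS)
print(summary)
```

With the reported values, this gives a mean containment time of 4:12, identifies Run 3 as the fastest and Run 4 as the longest, and totals nine captured payloads across the engagement.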
Framework and Control Mapping
DoD OT Zero Trust Alignment
The four-pillar DoD OT Zero Trust mapping from Use Case 1 applies in full to this engagement. The AI-augmented threat scenario tests the same enforcement surfaces under a qualitatively different adversary model — one that adapts rather than repeating fixed sequences. The results across all four pillars were equivalent to or stronger than the structured red team results, because the adaptive behavior of the AI agents produced higher engagement depth and more payload capture opportunities.
MITRE ATT&CK for ICS
The following tactics and techniques were tested and defeated across all four AI-augmented runs:
- Discovery (TA0102): AI-directed reconnaissance produced inconsistent, unusable data across all four runs
- Lateral Movement (TA0109): IT-to-OT pivot attempts contained at the boundary on every attempt
- Inhibit Response Function (TA0107): Write command blocking prevented manipulation of control sequences
- Manipulation of Control (T0831, a technique under the Impact tactic, TA0105): No unauthorized write commands reached production OT assets
- Impair Process Control (TA0106): No successful disruption of any physical process across all four runs
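To make the idea of write command blocking concrete, the sketch below classifies Modbus TCP requests by function code and source address. It is an illustration under stated assumptions, not a description of how PacketViper enforces policy: the `classify` function, the allowlist of writer addresses, and the example IP addresses are hypothetical. The MBAP header layout and the write function codes come from the public Modbus specification.

```python
import struct

# Modbus function codes that modify device state (public Modbus specification).
WRITE_FUNCTION_CODES = {
    0x05: "Write Single Coil",
    0x06: "Write Single Register",
    0x0F: "Write Multiple Coils",
    0x10: "Write Multiple Registers",
    0x16: "Mask Write Register",
    0x17: "Read/Write Multiple Registers",
}

# MBAP header: transaction id, protocol id, length, unit id (big-endian).
MBAP = struct.Struct(">HHHB")


def build_request(transaction_id, unit_id, function_code, payload):
    """Build a Modbus TCP request frame (MBAP header followed by the PDU)."""
    pdu = bytes([function_code]) + payload
    # The length field counts the unit id byte plus the PDU.
    return MBAP.pack(transaction_id, 0, len(pdu) + 1, unit_id) + pdu


def classify(frame, source_ip, allowed_writers):
    """Return 'malformed', 'block', or 'allow' for one request frame.

    allowed_writers is a hypothetical set of source IPs permitted to write.
    """
    if len(frame) < MBAP.size + 1:
        return "malformed"
    _, protocol_id, length, _ = MBAP.unpack_from(frame)
    # Protocol id must be 0 for Modbus; length covers everything after byte 6.
    if protocol_id != 0 or length != len(frame) - 6:
        return "malformed"
    function_code = frame[MBAP.size]
    if function_code in WRITE_FUNCTION_CODES and source_ip not in allowed_writers:
        return "block"
    return "allow"


# Hypothetical addresses: one allowlisted engineering workstation.
ALLOWED = {"10.0.0.10"}
write_frame = build_request(1, 1, 0x06, struct.pack(">HH", 0x0001, 0x00FF))
read_frame = build_request(2, 1, 0x03, struct.pack(">HH", 0x0000, 0x0002))

print(classify(write_frame, "10.0.0.50", ALLOWED))  # write from unknown source
print(classify(write_frame, "10.0.0.10", ALLOWED))  # write from allowlisted source
print(classify(read_frame, "10.0.0.50", ALLOWED))   # read request
```

A real enforcement point would also need to handle DNP3 and BACnet, track sessions, and account for vendor-specific function codes; this sketch covers only the basic distinction between read and write requests on Modbus TCP.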
RMF and ATO Implications — AI Threat Category
Most existing RMF packages and control assessments do not directly address AI-augmented adversary capabilities. The threat category is real — autonomous red team tooling is available today and being adopted by sophisticated threat actors. This engagement provides concrete, reproducible evidence of control effectiveness against AI-augmented adversaries for Authorizing Officials and SCAs who need to address this gap in their assessment packages.
Fourteen NIST SP 800-53 control families are evidenced across this engagement, with specific emphasis on controls addressing adaptive threats and continuous monitoring under dynamic adversary conditions. The 99% alert reduction holds against AI-augmented adversary tooling because the structural property of deception-triggered enforcement that eliminates false positives is not adversary-specific.
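The alert reduction figure can be checked against the event counts in the Summary of Results. The sketch below assumes that "alert reduction" means the share of generated events that did not require analyst review; the source does not define the metric explicitly, so treat this as one plausible reading rather than the official calculation.

```python
# Event counts reported in the Summary of Results.
TOTAL_EVENTS = 312
ANALYST_REVIEW_EVENTS = 4


def alert_reduction(total, reviewed):
    """Percentage of events that did not need analyst review (assumed definition)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= reviewed <= total:
        raise ValueError("reviewed must be between 0 and total")
    return 100.0 * (total - reviewed) / total


reduction = alert_reduction(TOTAL_EVENTS, ANALYST_REVIEW_EVENTS)
print(f"{reduction:.1f}% of events required no analyst review")
```

Under that reading, 308 of 312 events required no review, or about 98.7%, which rounds to the 99% figure cited above.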
Operational Safety Notes
All OT processes continued without deviation throughout the engagement. No setpoints, pump controls, valve positions, chemical dosing rates, or physical outputs were affected at any point across any of the four AI-augmented runs. Inline latency on OT protocols remained below 2 ms during both baseline collection and attack conditions. The higher frequency of parallel probe activity generated by AI orchestration did not affect appliance latency or throughput measurably above the structured red team baseline. Appliance resource overhead remained below 8% CPU and 12% memory during peak attack windows.
The appliance is configured fail-open throughout — on power loss or appliance fault, production traffic passes without modification. No valid control traffic from whitelisted engineering workstations or HMIs was modified at any point during any of the four runs.
Documentation Delivered
The following artifacts were produced as part of the engagement:
- Detailed test plan with AI agent architecture and attack methodology description
- Per-run results table with timestamps, containment times, payload counts, and strategy adaptation logs
- Summary report: “Autonomous AI Red Team Containment — 4/4 Contained, 0 Exfiltration”
- Video compilation showing attack execution and containment across all four runs
- NIST SP 800-53 / DoD OT Zero Trust control mapping spreadsheet
- MITRE ATT&CK for ICS tactic mapping with per-run evidence
Run this use case in your environment
Contact us to discuss configuration and scope for your facility.