Ai Red Teaming Scope Shift To System Assessment
Sources: 1 • Confidence: Medium • Updated: 2026-03-27 10:09
Key takeaways
- In the corpus, the meaning of "AI red teaming" is described as shifting from primarily model safety/alignment/bias testing to end-to-end system testing of deployments that include AI components.
- The corpus reports that non-human identities already outnumber human identities in many organizations at roughly 82–96 to 1, and that AI deployments further increase machine-identity growth.
- AI security engagements are described as resembling traditional offensive security assessments because surrounding components (identities, web servers, databases) are largely unchanged, but prompt injection and probabilistic model behavior add new testing requirements.
- Modern attack-path analysis is described as increasingly crossing multiple identity and cloud stacks (e.g., GitHub, AWS, AD/Entra), and BloodHound's Open Graph extension is described as being used to map identities across arbitrary technology stacks.
- The corpus asserts that attackers can use AI-enabled tooling to scale continuous scanning and discovery (including broadly running cloud security scanners), increasing the need for defenders to find and fix exposures first.
Sections
Ai Red Teaming Scope Shift To System Assessment
- In the corpus, the meaning of "AI red teaming" is described as shifting from primarily model safety/alignment/bias testing to end-to-end system testing of deployments that include AI components.
- The most commonly assessed enterprise AI deployment described is a chatbot front end that forwards user input to a model provider and may connect to RAG stores and internal systems.
- AI security engagements are described as resembling traditional offensive security assessments because surrounding components (identities, web servers, databases) are largely unchanged, but prompt injection and probabilistic model behavior add new testing requirements.
Identity Sprawl And Agent Credential Aggregation Increase Blast Radius
- The corpus reports that non-human identities already outnumber human identities in many organizations at roughly 82–96 to 1, and that AI deployments further increase machine-identity growth.
- AI agent systems are described as high-value credential aggregation points where compromise, including via indirect prompt injection such as through email, can expose many identities and access tokens with impact compared to credential dumping from compromised servers.
- The corpus asserts that controlling identity privileges remains a core mitigation for AI-era risk and that granting AI systems the ability to execute arbitrary code is a high-risk design choice to avoid.
Most Ai Security Findings Are Still Classic App And Identity Failures
- AI security engagements are described as resembling traditional offensive security assessments because surrounding components (identities, web servers, databases) are largely unchanged, but prompt injection and probabilistic model behavior add new testing requirements.
- Many AI-related security findings described are traditional web application issues (e.g., IDOR and injection), while the distinctly new attack primitive highlighted is prompt engineering that resembles social engineering.
Cross Stack Attack Paths And Graph Based Mapping
- Modern attack-path analysis is described as increasingly crossing multiple identity and cloud stacks (e.g., GitHub, AWS, AD/Entra), and BloodHound's Open Graph extension is described as being used to map identities across arbitrary technology stacks.
- In the corpus's description of the SalesLoft/Drift incident, an alleged compromise path ran from GitHub to AWS credential access and then to theft of OAuth tokens used to access customers' Salesforce instances via the vendor's AI chatbot integration.
Attacker Scaling And Default Deny Expectation
- The corpus asserts that attackers can use AI-enabled tooling to scale continuous scanning and discovery (including broadly running cloud security scanners), increasing the need for defenders to find and fix exposures first.
- The corpus expresses the expectation that as attacker and deployment tempo approaches "machine speed," permissive-by-default configurations become less viable and secure-by-default (deny-by-default) posture becomes more important.
Unknowns
- What proportion of enterprise AI security findings in practice are truly LLM-specific (e.g., prompt injection, tool misuse) versus classic web/app/identity issues, and how is that measured?
- How frequently do prompt injection issues reproduce across multiple attempts for the same prompt and environment, and what evidence standards are used to validate remediation given non-determinism?
- What is the source and scope of the reported 82–96:1 non-human-to-human identity ratio, and how does that ratio change specifically after agent/AI rollouts?
- Which specific architectural patterns (e.g., where tokens are stored, how tools are invoked, what permissions are granted) most strongly drive the "credential aggregation point" risk for agents?
- How commonly do real-world compromise paths chain across GitHub, cloud credentials, and downstream customer OAuth tokens in AI-feature contexts, versus being illustrative edge cases?