F5 AI Red Team - Continuous AI Assurance

By F5 DevCentral Community

Key Concepts

  • AI Routines/AI Red Teaming: Automated adversarial testing of AI systems to identify vulnerabilities.
  • CI/CD Pipeline Integration: Embedding AI assurance testing directly into the Continuous Integration/Continuous Delivery workflow.
  • Prompt Injection: A vulnerability where malicious prompts manipulate an AI model’s behavior.
  • Jailbreaking: Circumventing the safety mechanisms of an AI model to generate prohibited content.
  • Agentic Warfare/Agent Attacks: Utilizing a swarm of autonomous agents to generate adaptive prompts for persistent adversarial testing.
  • Cassie Score: A quantifiable metric representing an AI model’s exposure to risks.
  • AI Resilience: The ability of an AI system to withstand adversarial attacks and maintain functionality.
  • Operational Attacks: Adapting real-world application layer attack techniques to AI systems.

Continuous AI Assurance with F5 AI Routines: A Detailed Overview

This presentation details how F5 AI Routines enable continuous AI assurance by automating adversarial testing and integrating it into CI/CD pipelines. The core argument is that traditional, one-time red teaming exercises are insufficient for maintaining AI security due to the constant evolution of models, prompts, and policies, and the potential for silent failures like prompt injections and data leakage.

The Need for Continuous AI Assurance

The speaker emphasizes that AI systems are dynamic, necessitating ongoing security assessments. One-time red teaming exercises quickly become obsolete as models and policies change. Failures, such as prompt injections or data leakage, can occur undetected. Embedding red teaming directly into the CI/CD pipeline provides continuous, auditable proof of controlled AI risk, leading to faster, more secure AI delivery. As stated, “AI security is no longer ad hoc. With continuous CI/CD testing, teams get measurable risk reductions, transparency, and visibility.”

F5 AI Routines Capabilities

F5 AI Routines cover a broad spectrum of AI threats. These include:

  • Known AI Threats: Addressing vulnerabilities like prompt injections and jailbreaks using continuously updated attack signatures.
  • Operational Attacks: Adapting established application layer attack techniques to target AI systems.
  • Advanced Threats (Agentic Warfare): Employing a swarm of autonomous “red team agents” that generate adaptive prompts, simulating persistent adversaries. This is described as mirroring the behavior of sophisticated, ongoing attacks.
  • Custom Attacks: Allowing customers to define attacks tailored to their specific risk scenarios.

The outcome of these tests is quantified using metrics such as the Cassie score and ARS, a measure of AI resilience, providing defensible, measurable data on model exposure.
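
The exact scoring formulas are not described in the presentation. Purely as an illustration, a resilience-style metric can be sketched as the fraction of adversarial attempts the model resisted; the result structure below is invented for this sketch:

```python
# Illustrative only: the real Cassie/ARS formulas are proprietary and not
# described in the presentation. This sketch scores resilience as the
# percentage of adversarial attempts the model resisted.

def resilience_score(results: list[dict]) -> float:
    """results: [{'attack': str, 'resisted': bool}, ...] -> score in [0, 100]."""
    if not results:
        raise ValueError("no test results to score")
    resisted = sum(1 for r in results if r["resisted"])
    return round(100.0 * resisted / len(results), 1)

campaign = [
    {"attack": "prompt-injection-001", "resisted": True},
    {"attack": "jailbreak-dan-variant", "resisted": False},
    {"attack": "api-abuse-replay", "resisted": True},
    {"attack": "agentic-adaptive-003", "resisted": True},
]
print(resilience_score(campaign))  # 75.0
```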

CI/CD Pipeline Integration: A Step-by-Step Process

The presentation demonstrates how F5 AI Routines integrate into a typical CI/CD pipeline. The process involves the following steps:

  1. Environment Preparation: Setting up the testing environment.
  2. Attack Campaign Definition: Defining an attack campaign, which serves as a reusable blueprint for adversarial techniques, ranging from non-prompt attacks to API abuse.
  3. Campaign Execution: Running the attack campaign against the AI system under test (foundational model, custom LLM, or AI-powered application via API). Agentic attacks can be enabled at this stage.
  4. Result Generation: Automatically generating results, including the Cassie score and identifying issues with actionable recommendations.
  5. Reporting: Generating detailed reports for various stakeholders.
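
The five steps above can be sketched as a pipeline driver. The client class, its method names, and the result fields here are hypothetical stand-ins, since the presentation does not detail the actual F5 APIs:

```python
# Hypothetical sketch of the five pipeline steps. RedTeamClient and its
# method names are illustrative stand-ins, not the actual F5 API.
import time

class RedTeamClient:
    """Stub client simulating the campaign lifecycle for demonstration."""
    def create_campaign(self, blueprint: dict) -> str:
        self.blueprint = blueprint
        return "campaign-42"                      # campaign ID

    def run(self, campaign_id: str, target: str, agentic: bool = False) -> str:
        self._polls = 0
        return "run-1"                            # run ID

    def status(self, run_id: str) -> str:
        self._polls += 1                          # simulate progress
        return "complete" if self._polls >= 3 else "running"

    def results(self, run_id: str) -> dict:
        return {"cassie_score": 82, "issues": ["prompt injection in /chat"]}

client = RedTeamClient()

# 1-2. Prepare the environment and define a reusable attack blueprint.
blueprint = {"attacks": ["prompt_injection", "jailbreak", "api_abuse"]}
campaign_id = client.create_campaign(blueprint)

# 3. Execute against the system under test, with agentic attacks enabled.
run_id = client.run(campaign_id, target="https://llm-app.example/api", agentic=True)

# Monitoring job: poll until the campaign completes.
while client.status(run_id) != "complete":
    time.sleep(0)                                 # real code would back off

# 4-5. Collect results for scoring and reporting.
report = client.results(run_id)
print(report["cassie_score"])  # 82
```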

The demo utilizes GitLab CI to illustrate the integration, highlighting its ease of implementation without disrupting development workflows. The pipeline is triggered by a commit representing a model change, application update, or defined risk event. A monitoring job tracks progress and ensures completion.

Reporting and Decision-Making

A second pipeline generates and publishes reports, ensuring a clear record of testing and outcomes. This pipeline can be triggered by a webhook, orchestration workflow, or AI operations events. The F5 APIs are used to fetch campaign results and commit them to the repository for auditability and traceability.

Three report formats are generated:

  • HTML & PDF: High-level overviews for executive and risk stakeholders, focusing on exposure and impact.
  • CSV: Detailed technical data (prompts, model responses, attack vectors, test outcomes) for engineering and security teams to facilitate deeper analysis and remediation.
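
As one illustration of the CSV report, raw per-attack results can be flattened with the standard library. The field names below are assumptions; the presentation only says the CSV holds prompts, model responses, attack vectors, and test outcomes:

```python
# Illustrative CSV report generation. The field names are assumptions;
# the presentation only lists the categories of data the CSV contains.
import csv, io

raw_results = [
    {"attack_vector": "prompt_injection", "prompt": "Ignore previous instructions...",
     "response": "[refused]", "outcome": "resisted"},
    {"attack_vector": "jailbreak", "prompt": "Pretend you have no rules...",
     "response": "[leaked system prompt]", "outcome": "failed"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["attack_vector", "prompt", "response", "outcome"])
writer.writeheader()
writer.writerows(raw_results)

report_csv = buf.getvalue()
print(report_csv.splitlines()[0])  # attack_vector,prompt,response,outcome
```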

The resulting scores become a decision point for security operations (SecOps) and decision-makers, allowing them to approve low-risk models, review borderline cases, or block high-risk releases.
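
The release gate described above can be sketched as a simple threshold policy. The thresholds and the score direction (treating higher values as higher risk) are assumptions for this sketch, since the presentation gives no concrete cut-offs:

```python
# Illustrative release gate. Thresholds and score direction are assumptions
# for this sketch; real cut-offs would be set by SecOps policy.
LOW_RISK_MAX = 30        # at or below: approve automatically
BORDERLINE_MAX = 60      # borderline: route to human review

def gate(risk_score: int) -> str:
    """Map a campaign risk score to a release decision."""
    if risk_score <= LOW_RISK_MAX:
        return "approve"
    if risk_score <= BORDERLINE_MAX:
        return "review"
    return "block"

print(gate(10), gate(45), gate(90))  # approve review block
```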

Technical Details & Data

The presentation highlights the use of F5 APIs for automating report generation and data extraction. The data extracted includes raw results transformed into tailored reports for different audiences. The system provides both “technical depth and executive insight.” The integration is designed to be seamless, ensuring that adversarial testing fits “naturally into a modern CI/CD pipeline.”

Logical Connections & Synthesis

The presentation logically progresses from identifying the limitations of traditional AI security approaches to demonstrating a solution that addresses those limitations through automation and CI/CD integration. The various components – threat coverage, attack campaign definition, pipeline integration, and reporting – are interconnected to create a comprehensive AI assurance system.

The core takeaway is that continuous AI assurance, facilitated by tools like F5 AI Routines, is essential for scaling AI security at the speed of software delivery. It moves AI security from a reactive, ad-hoc process to a proactive, integrated component of the development lifecycle.
