F5 AI Guardrails - Securing PII in AI Agents, RAG and GenAI apps

Key Concepts

AI Agents: Autonomous entities powered by Large Language Models (LLMs) designed to perform tasks.
Personally Identifiable Information (PII): Any data that can be used to identify an individual (e.g., credit card numbers, passport numbers, addresses).
Prompt Injection: A security vulnerability where malicious input is crafted to override the system prompt and manipulate the LLM’s behavior.
Tool Calls: Mechanisms allowing AI agents to interact with external systems and APIs.
AI Guardrails: External security measures designed to monitor and control AI agent behavior, preventing PII leakage and malicious activity.
Retrieval Augmented Generation (RAG): A technique where LLMs access external knowledge sources to improve response accuracy and relevance.
PII Masking: The process of obscuring or redacting PII data to protect privacy.

The Growing Risk of PII Exposure in AI Agents

AI agents are rapidly changing the landscape of work, offering automation, accelerated insights, and new capabilities. However, their reliance on data, particularly data containing Personally Identifiable Information (PII) – including identity and tax numbers, credit card details, and home addresses – introduces significant security risks. Despite system prompts designed to prohibit the disclosure of sensitive information, AI agents can inadvertently leak PII through tool calls, memory mechanisms, and vulnerabilities in instruction chaining. This is because LLMs can struggle to consistently differentiate between instructions embedded in their system design and user-provided inputs.

Attack Vectors Targeting AI Agents

The video highlights several attack vectors adversaries can employ to compromise AI agent security and extract PII:

Prompt Injection: This technique involves crafting malicious prompts that override the agent’s system prompt, effectively reprogramming its behavior.
Agent Goal Poisoning: Manipulating the agent’s objectives to prioritize actions that lead to PII exposure.
Tool Hijacking and Misuse: Exploiting the agent’s tool-calling capabilities to access and retrieve PII from connected systems.

The demonstration showcased a scenario where an attacker successfully used a prompt injection attack to bypass the system prompt’s restriction on sharing passport information. The agent, initially rejecting a direct request for passport details, yielded to the injected prompt and leveraged its access to the hotel management system to retrieve the data.

The Necessity of External Guardrails

To mitigate these threats, the video emphasizes the critical need for external guardrails. These guardrails function as monitoring systems that analyze the interactions between the user and the agent, as well as the agent’s interactions with tools. Their primary functions are to:

Identify PII Exposure: Detect instances where PII is being transmitted or requested.
Mask PII: Redact or obscure sensitive data before it can be exposed.
Alert Security Teams: Notify administrators of potential security breaches.

F5 AI guardrails are presented as a solution, capable of being deployed at multiple points within the AI agent architecture – as a proxy before model inference and as a guardrail assessing interactions with tools and users. Crucially, the system correlates session state to provide a comprehensive view of the agent’s behavior.

Application to Retrieval Augmented Generation (RAG)

The security concerns extend beyond traditional AI agents to Retrieval Augmented Generation (RAG) systems. RAG involves augmenting LLMs with external knowledge sources. If this data contains PII, it can be inadvertently included in the responses generated by the LLM. The F5 AI guardrails are equally applicable to RAG, ensuring PII is masked within the retrieved data.

F5 AI Guardrail Capabilities: PII Masking in Action

The demonstration illustrated the effectiveness of F5 AI guardrails with configured PII masking rules for credit card numbers and passport numbers. When a tool call returned a credit card number, the guardrail automatically masked it, preventing its exposure. The system offers a broad range of pre-built PII scanners and the flexibility to define custom PII matchers.

Defense in Depth: Combining Prompt Injection Mitigation with PII Masking

The video highlights a combined attack scenario where prompt injection was used in conjunction with tool misuse to expose PII. F5 AI guardrails address this by employing prompt injection scanners to mitigate the initial attack vector, providing a “defense in depth” strategy. This prevents both the compromise of the agent and the subsequent leakage of sensitive data.

Conclusion

The demonstration underscores the essential role of robust security measures, specifically external guardrails like F5 AI guardrails, in deploying AI agents securely. The inherent vulnerabilities of LLMs, coupled with sophisticated attack vectors, necessitate proactive monitoring and control to prevent PII exposure and maintain data privacy. The ability to identify, mask, and alert on PII, combined with prompt injection mitigation, is crucial for realizing the benefits of AI agents without compromising sensitive information.