Real-time fraud detection with AlloyDB AI

By Google Cloud Tech

Key Concepts

  • Vector Embeddings: Numerical representations of data that capture semantic meaning, used here for anomaly detection.
  • AlloyDB AI: A Google Cloud database service integrated with AI capabilities for real-time inference.
  • ScaNN (Scalable Nearest Neighbors): A Google-developed vector indexing algorithm optimized for high-velocity, large-scale data.
  • Fraud Detection Thresholds: The mathematical boundary used to classify transactions as fraudulent vs. legitimate.
  • Hybrid AI Approach: Combining vector-based similarity search with LLM (Gemini) reasoning to improve classification accuracy.
  • In-Database Inference: Executing AI models directly within the database to minimize latency and maximize throughput.

1. Real-Time Fraud Detection Framework

Paul Ramsey, a Product Manager at Google, demonstrates how Cymbal Investments (a fictional financial firm) uses AlloyDB to modernize fraud detection. The process has three steps:

  • Data Preparation: Converting 2 million financial transactions into "feature strings" (textual representations of table data).
  • Embedding Generation: Using AlloyDB’s google_ml.embedding functions to transform these strings into vector embeddings in minutes.
  • Indexing: Using ScaNN, an algorithm based on 14 years of Google research, which requires 4x less memory than traditional methods and is optimized for scales exceeding 10 billion vectors.
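
The data preparation step can be illustrated with a minimal Python sketch of turning one table row into a "feature string". The summary does not specify the string format or the table schema, so the "column: value" layout and every column name below are assumptions made for illustration only.

```python
def to_feature_string(row: dict) -> str:
    """Flatten one table row into a 'column: value' text string for embedding."""
    # Sort the keys so the same row always produces the same string.
    return ", ".join(f"{key}: {row[key]}" for key in sorted(row))


# Hypothetical transaction row; these column names are illustrative only.
txn = {
    "amount": 12.50,
    "merchant_category": "grocery",
    "country": "US",
    "hour_of_day": 14,
}

feature_string = to_feature_string(txn)
print(feature_string)
```

A string like this is what would then be passed to the embedding function; keeping the field order stable means identical rows map to identical text, and therefore to identical vectors.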

2. The Mathematical Trade-off in Fraud Detection

The system identifies anomalies by calculating the vector distance between each incoming transaction and known fraudulent patterns. A distance threshold decides which transactions are flagged, and each setting trades recall against false positives:

  • Threshold 0.021: Baseline setting achieving 79% recall.
  • Threshold 0.011: Increases recall to 93% but results in a 20% false-positive rate (flagging valid transactions).
  • Threshold 0.031: Reduces false positives but allows 25% of fraudulent transactions to go undetected (false negatives).
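
The rates quoted above are standard confusion-matrix quantities. The sketch below shows how they could be computed from labeled results; it assumes that "false-positive rate" means the share of legitimate transactions that were flagged, which the summary does not state explicitly. The sample data is made up. Note that the false-negative rate is always 1 minus recall, so letting 25% of fraud go undetected corresponds to 75% recall.

```python
def classification_rates(flagged: list[bool], is_fraud: list[bool]) -> dict:
    """Compute recall, false-negative rate, and false-positive rate."""
    pairs = list(zip(flagged, is_fraud))
    tp = sum(1 for f, y in pairs if f and y)          # fraud, flagged
    fn = sum(1 for f, y in pairs if not f and y)      # fraud, missed
    fp = sum(1 for f, y in pairs if f and not y)      # legitimate, flagged
    tn = sum(1 for f, y in pairs if not f and not y)  # legitimate, passed

    fraud_total = tp + fn
    legit_total = fp + tn
    return {
        "recall": tp / fraud_total if fraud_total else 0.0,
        "false_negative_rate": fn / fraud_total if fraud_total else 0.0,
        "false_positive_rate": fp / legit_total if legit_total else 0.0,
    }


# Made-up example: 4 fraudulent and 6 legitimate transactions.
is_fraud = [True, True, True, True, False, False, False, False, False, False]
flagged = [True, True, True, False, True, False, False, False, False, False]

rates = classification_rates(flagged, is_fraud)
print(rates)
```

Running the same function over the results produced at each candidate threshold gives one row of the trade-off per threshold.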

3. Hybrid AI: Breaking the Tie with Gemini

To overcome the limitations of static thresholds, the system employs a hybrid approach using Gemini. When a transaction falls near the threshold, the system triggers a natural language analysis:

  • Methodology: The system queries the LLM with specific context: "Based on this user's typical spending patterns, does this look like a card testing attempt?"
  • Outcome: This hybrid approach improves recall and reduces false negatives by over 5% each, effectively resolving the "tie" between the vector model's uncertainty and the need for accuracy.
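
The routing idea can be sketched in a few lines of Python: only transactions whose distance lands close to the threshold are escalated to the LLM. The width of the "near the threshold" band is not given in the summary, so the margin below is a hypothetical value, as are the function names and the spending-summary text. The question appended to the prompt is the one quoted above.

```python
THRESHOLD = 0.021  # baseline threshold from the summary
MARGIN = 0.002     # hypothetical width of the "near the threshold" band

QUESTION = (
    "Based on this user's typical spending patterns, "
    "does this look like a card testing attempt?"
)


def needs_llm_review(distance: float,
                     threshold: float = THRESHOLD,
                     margin: float = MARGIN) -> bool:
    """Return True when the vector distance is too close to the threshold to call."""
    return abs(distance - threshold) <= margin


def build_prompt(spending_summary: str) -> str:
    """Combine per-user context (hypothetical text) with the question for the LLM."""
    return f"{spending_summary}\n\n{QUESTION}"


# Made-up distances: the first is borderline, the second is clear-cut.
borderline = needs_llm_review(0.0205)
clear_cut = needs_llm_review(0.035)
print(borderline, clear_cut)
```

Restricting LLM calls to the borderline band keeps the more expensive reasoning step for the cases where the vector distance alone is least informative.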

4. Performance and Scalability Benchmarks

A primary concern with LLM integration is latency. AlloyDB addresses this through architectural optimizations:

  • Array-Based Processing: Delivers a 2,000x performance improvement, processing 7,700 rows per second.
  • Optimized AI Functions: By using native in-database execution and surrogate models, the system achieves 100,000 rows per second.
  • Efficiency Gains: This represents a 23,000x improvement over traditional "row-at-a-time" database calls.
  • Cost Impact: These optimizations result in a 6,000x cost reduction, bringing the price of intelligent inference down to 1/10th of a cent per transaction.
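
To put the throughput figures in perspective, the short calculation below estimates how long each rate would take to work through a dataset the size of the 2 million transactions mentioned earlier. This is back-of-the-envelope arithmetic, not a benchmark: it assumes the 23,000x figure is measured against the 100,000 rows-per-second rate, and it applies the dataset size from the embedding step to the inference rates, neither of which the summary states directly.

```python
TOTAL_ROWS = 2_000_000       # dataset size from the data preparation step
ARRAY_RATE = 7_700           # rows/second with array-based processing
OPTIMIZED_RATE = 100_000     # rows/second with optimized AI functions
OPTIMIZED_SPEEDUP = 23_000   # quoted improvement over row-at-a-time calls


def seconds_to_process(rows: int, rows_per_second: float) -> float:
    """Time needed to process `rows` at a steady rate."""
    return rows / rows_per_second


array_seconds = seconds_to_process(TOTAL_ROWS, ARRAY_RATE)
optimized_seconds = seconds_to_process(TOTAL_ROWS, OPTIMIZED_RATE)

# Implied row-at-a-time rate, derived from the quoted speedup (an assumption).
baseline_rate = OPTIMIZED_RATE / OPTIMIZED_SPEEDUP
baseline_days = seconds_to_process(TOTAL_ROWS, baseline_rate) / 86_400

print(f"array-based: {array_seconds:.0f} s")
print(f"optimized:   {optimized_seconds:.0f} s")
print(f"row-at-a-time (implied): {baseline_days:.1f} days")
```

Under these assumptions the optimized path finishes in seconds, the array-based path in a few minutes, and the implied row-at-a-time baseline in days.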

5. Conclusion

The presentation highlights that AlloyDB is not merely a storage solution but a high-speed inference engine. By combining vector search (ScaNN) with LLM-based reasoning (Gemini) and in-database execution, organizations can achieve real-time fraud detection that is both highly accurate and economically scalable. The core takeaway is that integrating AI directly into the database layer eliminates the performance bottlenecks typically associated with external API calls for inference.
