Building a life-saving MCP server on Cloud Run (Avalanche demo)

Key Concepts

MCP (Model Context Protocol): A standardized way to connect real-time data to Large Language Models (LLMs). It is a self-describing, AI-optimized API specification: the server itself tells the model which tools exist and how to call them.
LLM (Large Language Model): AI models like Gemini, capable of understanding and generating human-like text. They require external data sources for up-to-date information.
Cloud Run: A fully managed serverless execution environment on Google Cloud, ideal for applications with variable traffic patterns.
Serverless Computing: A cloud computing execution model where the cloud provider dynamically manages the allocation of machine resources.
Hallucinations (in LLMs): Instances where an LLM generates incorrect or nonsensical information. MCP aims to reduce these by grounding responses in factual data.
JSON Schema: A vocabulary that allows you to annotate and validate JSON documents. Used in MCP to define tool inputs.

Avalanche Safety & MCP Server on Google Cloud Run: A Detailed Overview

This discussion centers on using a Model Context Protocol (MCP) server, deployed on Google Cloud Run, to provide accurate, real-time avalanche data to Large Language Models (LLMs) and AI-powered applications. The goal is to improve the safety of winter sports enthusiasts by giving them access to up-to-date avalanche risk information.

The Need for MCP Servers & Real-Time Data

Kristoff explains that LLMs, while powerful, lack inherent knowledge of constantly changing data like avalanche conditions. “An LLM would not intrinsically know about most up-to-date data.” Traditional APIs are cumbersome for AI integration, requiring complex prompting and parsing. MCP addresses this by offering a standardized, self-describing interface.

The self-describing nature of MCP is crucial. The server explicitly informs the LLM about available tools – in this case, tools for accessing avalanche data – and how to use them, using plain English descriptions and JSON schemas. This eliminates the need for intricate prompt engineering.
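
As a rough illustration of what "self-describing" means in practice, the sketch below shows one tool entry of the kind an MCP server lists for its clients: a name, a plain-English description, and a JSON Schema for the inputs. The identifier get_region_id, the argument name mountain_name, and the description text are assumptions made for this example, not taken from the actual server. The small checker handles only the `required` and `type` keywords and is not a full JSON Schema validator.

```python
# Hypothetical tool descriptor, shaped like one entry of an MCP tool listing.
# The tool name, argument name, and wording are assumptions for illustration.
REGION_TOOL = {
    "name": "get_region_id",
    "description": "Look up the avalanche-report region ID for a mountain name.",
    "inputSchema": {
        "type": "object",
        "properties": {"mountain_name": {"type": "string"}},
        "required": ["mountain_name"],
    },
}

# Maps the JSON Schema type names used above to Python types.
_JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_arguments(tool, arguments):
    """Return a list of problems; an empty list means the arguments look valid.

    Only `required` and simple `type` keywords are checked.
    """
    schema = tool["inputSchema"]
    problems = []
    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required argument: {key}")
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unknown argument: {key}")
        elif not isinstance(value, _JSON_TYPES[spec["type"]]):
            problems.append(f"{key} should be of type {spec['type']}")
    return problems


print(check_arguments(REGION_TOOL, {"mountain_name": "Solstein"}))
print(check_arguments(REGION_TOOL, {}))
```

Because the schema travels with the tool, a client can reject a malformed call before it ever reaches the server, and the model can read the description to decide when the tool applies.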

Avalanche Warning MCP Server Functionality

The demonstrated Avalanche MCP server, developed by the Austrian Avalanche Warning Service, exposes two key tools:

get region ID: Takes a mountain name as input and returns a unique region ID. This ID is essential for querying specific avalanche reports.
get bulletin: Requires a region ID and a date (optionally a language) as input. It then returns the avalanche danger rating and related data in a structured format.
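
To make the second tool's inputs concrete, here is a minimal sketch of the request a client might send to invoke it. MCP carries tool invocations as JSON-RPC 2.0 messages with the method "tools/call"; the tool identifier get_bulletin, the argument names region_id, date, and language, and all the values are hypothetical placeholders, since the summary gives only the spoken tool names.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Tool name, argument names, and values are hypothetical placeholders.
request = build_tool_call(
    1,
    "get_bulletin",
    {"region_id": "EXAMPLE-REGION-1", "date": "2025-01-15", "language": "en"},
)
print(json.dumps(request, indent=2))
```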

The LLM autonomously chains these tools together. For example, when asked about the avalanche danger level at the Solstein mountain, the LLM first uses get region ID to obtain the correct ID, then uses get bulletin with that ID to retrieve the current report. This process was demonstrated with a query to the Gemini CLI, resulting in a “grounded answer based on the official records” with “no hallucinations.”
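
The chaining described above can be sketched with two stand-in functions. This is only an illustration of the data flow: in the real setup the model decides on each call itself and the server queries the warning service. The function names, the region ID, the date, and the danger level below are all made up for the example.

```python
# Stand-in data; every value here is invented for illustration only.
_REGIONS = {"solstein": "EXAMPLE-REGION-1"}
_BULLETINS = {("EXAMPLE-REGION-1", "2025-01-15"): {"danger_level": 3}}


def get_region_id(mountain_name):
    """Step 1: resolve a mountain name to a region ID."""
    return _REGIONS[mountain_name.strip().lower()]


def get_bulletin(region_id, date, language="en"):
    """Step 2: fetch the bulletin for a region ID and date."""
    report = dict(_BULLETINS[(region_id, date)])
    report["language"] = language
    return report


def danger_for_mountain(mountain_name, date):
    """Chain both tools: the output of step 1 is the input of step 2."""
    region_id = get_region_id(mountain_name)
    return get_bulletin(region_id, date)


print(danger_for_mountain("Solstein", "2025-01-15"))
```

The point of the sketch is the dependency between the calls: the bulletin cannot be requested until the region ID is known, which is why the model has to plan two steps rather than one.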

Deploying to Cloud Run: Scalability & Cost Efficiency

The discussion highlights the advantages of deploying the MCP server to Google Cloud Run. Kristoff emphasizes that while running MCP servers locally is suitable for personal use, a public service requires a scalable and accessible endpoint.

Cloud Run is presented as an ideal solution due to its serverless nature and auto-scaling capabilities. Avalanche report traffic is inherently “spiky” – high during and after snowfall, and low during off-peak times. Cloud Run’s ability to “scale to zero” ensures that resources are only consumed when needed, minimizing costs.

“Cloud Run scales to zero. So if no one is asking about the avalanches, almost nothing is getting charged. But if a blizzard hits or champagne powder falls, traffic will spike suddenly. Cloud Run will scale up automatically.”
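
The scaling behavior in the quote can be approximated with a deliberately simple model: no traffic means no instances, and otherwise enough instances are started to cover the concurrent requests. The per-instance concurrency of 80 and the cap of 100 instances are illustrative parameters chosen for this sketch, not settings quoted in the talk, and a real autoscaler also weighs other signals such as CPU utilization.

```python
import math


def instances_needed(concurrent_requests, per_instance_concurrency=80, max_instances=100):
    """Estimate instance count under a simple request-based scaling model."""
    if concurrent_requests <= 0:
        return 0  # scale to zero: no traffic, no running instances
    wanted = math.ceil(concurrent_requests / per_instance_concurrency)
    return min(wanted, max_instances)


# Quiet day, light use, fresh snowfall, and a spike that hits the cap.
for load in (0, 5, 400, 20000):
    print(load, instances_needed(load))
```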

Deployment is streamlined with a single command that builds a container from the code and provides a publicly accessible URL.

Deployment Process (Simplified)

Code: Utilize the Avalanche MCP server code provided by the Austrian Avalanche Warning Service.
Deployment Command: Execute a single command to deploy the code as a Cloud Run service.
Containerization: Cloud Run automatically builds a container image from the code.
URL Provisioning: Cloud Run provides a unique URL for accessing the deployed MCP server.
Integration: The URL can then be used by LLMs (like Gemini) or AI agents to access real-time avalanche data.
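
The summary does not reproduce the exact deployment command. One plausible form is a source-based deploy with the gcloud CLI, which builds the container and returns a service URL. The sketch below only assembles and prints such a command. The service name avalanche-mcp and the region europe-west1 are assumptions, and --allow-unauthenticated is included on the assumption that the endpoint should be publicly reachable.

```python
import shlex


def build_deploy_command(service, region, source_dir="."):
    """Assemble a source-based Cloud Run deploy command as an argument list."""
    return [
        "gcloud", "run", "deploy", service,
        "--source", source_dir,          # build the container from this directory
        "--region", region,
        "--allow-unauthenticated",       # assumption: the service is public
    ]


# Service name and region are hypothetical.
command = build_deploy_command("avalanche-mcp", "europe-west1")
print(shlex.join(command))
```

To actually deploy, the list could be passed to subprocess.run(command, check=True) on a machine where the gcloud CLI is installed and authenticated.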

Data & Statistics (Implied)

While specific traffic statistics weren't provided, the discussion implies a significant variance in query volume based on weather conditions. The example of "champagne powder" falling illustrates a scenario where demand for avalanche reports would dramatically increase.

Key Takeaways

Kristoff summarizes the main benefits:

MCP for Real-Time Data: MCP effectively connects AI applications and agents to dynamic data sources like avalanche warnings.
Cloud Run for Serverless Scalability: Cloud Run provides a cost-effective and scalable platform for hosting MCP servers, automatically adjusting resources based on demand.

Synthesis/Conclusion

The conversation demonstrates a practical application of MCP and Cloud Run to address a critical safety concern in winter sports. By providing a standardized and scalable way to deliver real-time avalanche data to LLMs, this approach empowers users with accurate information, potentially saving lives. The combination of a self-describing protocol like MCP and a serverless platform like Cloud Run represents a powerful paradigm for building AI-powered applications that rely on constantly updated information.