Why Your AI UX Is Broken (and It's Not the Model's Fault) — Mike Christensen, Ably
By AI Engineer
Key Concepts
- Direct HTTP Streaming: The standard pattern where a client establishes a persistent point-to-point connection with an agent to receive LLM responses.
- Server-Sent Events (SSE): A standard for pushing updates from a server to a client over HTTP; inherently one-way.
- Durable Sessions: A stateful, shared intermediary layer between the agent and the client that decouples the two, allowing for persistent, multi-device, and multi-agent interactions.
- Pub/Sub (Publish/Subscribe): A messaging pattern where publishers (agents) and subscribers (clients) communicate through a shared channel rather than direct connections.
- Resilient Delivery: The ability for a stream to survive network drops, page refreshes, or device switching.
- Live Control: The ability for a client to send upstream signals (e.g., "stop," "steer," or "follow-up") to an agent while it is actively generating content.
1. The Limitations of Direct HTTP Streaming
The speaker, Mike Christensen (Staff Engineer at Ably), argues that the current industry standard—direct HTTP streaming—is fundamentally limited because it couples the user experience to a single, fragile connection.
- Fragility: If a user’s Wi-Fi drops or they refresh the page, the stream is severed.
- Lack of Continuity: Because the connection is a "private pipe," a second device or tab that opens the app cannot see the response that is still in progress.
- One-Way Constraint: SSE is strictly one-way, so the only signal a client can send over that connection is to close it. The agent cannot tell a deliberate "stop" from a dropped connection, which creates ambiguity: should it stop generating (making a later resume impossible) or keep generating and buffer the response for a potential resume (wasting tokens if the user really meant to stop)?
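The stop-versus-resume ambiguity can be written out as a small decision table. This is an illustrative sketch only: the `Intent` and `Policy` names and the outcome strings are assumptions made for this example, not part of any SDK. The point is that the server observes nothing but "connection closed," so whichever fixed policy it adopts mishandles one of the two cases.

```python
from enum import Enum


class Intent(Enum):
    """What the user actually meant. The server never sees this."""
    CANCEL = "user pressed stop"
    NETWORK_DROP = "connection was lost"


class Policy(Enum):
    """Hypothetical server policies for reacting to a closed SSE connection."""
    STOP_ON_CLOSE = "treat every close as a cancel"
    KEEP_GENERATING = "treat every close as a drop and buffer the output"


def outcome(policy: Policy, intent: Intent) -> str:
    # The policy must be chosen without knowing `intent`.
    if policy is Policy.STOP_ON_CLOSE:
        if intent is Intent.CANCEL:
            return "ok: generation cancelled"
        return "bad: response lost, nothing to resume"
    if intent is Intent.CANCEL:
        return "bad: tokens spent on an unwanted response"
    return "ok: client can resume from the buffer"


def policies_that_handle_both() -> list:
    """Return every policy that gives an 'ok' outcome for both intents."""
    return [
        p for p in Policy
        if all(outcome(p, i).startswith("ok") for i in Intent)
    ]
```

Under these assumptions `policies_that_handle_both()` returns an empty list, which is one way to read the talk's claim that resume and cancel are mutually exclusive over SSE.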
2. The "Durable Sessions" Framework
To solve these issues, engineering teams are moving toward Durable Sessions. This architecture introduces a persistent, stateful medium between the agent and the client.
- Decoupling: The agent writes events to the session rather than directly to the client, and the client reads from the session.
- Resumability: Because the session stores events (e.g., in Redis or a Pub/Sub channel), a client that reconnects can request a replay of missed events based on sequence numbers.
- Multi-Surface Sync: Since the session is a shared resource, multiple tabs or devices can subscribe to the same session, ensuring all surfaces show the same live state.
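The three properties above can be sketched with a minimal in-memory session. This is a toy stand-in under stated assumptions: a real deployment would back the log with a durable store such as Redis or a pub/sub channel, and `DurableSession`, `Event`, and `last_seen_seq` are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    seq: int   # position in the session log, starting at 1
    data: str  # e.g. one text chunk written by the agent


@dataclass
class DurableSession:
    """In-memory stand-in for a durable, shared session log."""
    log: List[Event] = field(default_factory=list)
    subscribers: Dict[str, Callable[[Event], None]] = field(default_factory=dict)

    def publish(self, data: str) -> Event:
        # Decoupling: the agent appends to the session, never to a client.
        event = Event(seq=len(self.log) + 1, data=data)
        self.log.append(event)
        for deliver in list(self.subscribers.values()):
            deliver(event)
        return event

    def subscribe(self, client_id: str, deliver: Callable[[Event], None],
                  last_seen_seq: int = 0) -> None:
        # Resumability: replay everything after last_seen_seq, then go live.
        for event in self.log[last_seen_seq:]:
            deliver(event)
        self.subscribers[client_id] = deliver

    def unsubscribe(self, client_id: str) -> None:
        self.subscribers.pop(client_id, None)


def run_demo():
    session = DurableSession()
    laptop: List[Event] = []
    phone: List[Event] = []

    session.subscribe("laptop", laptop.append)
    session.publish("Hello")
    session.unsubscribe("laptop")   # Wi-Fi drops; the agent keeps writing
    session.publish(", ")
    session.publish("world")

    # The laptop reconnects and asks for everything after the last seq it saw.
    session.subscribe("laptop", laptop.append, last_seen_seq=laptop[-1].seq)
    # Multi-surface sync: a second device joins and replays from the start.
    session.subscribe("phone", phone.append)
    session.publish("!")

    return ("".join(e.data for e in laptop), "".join(e.data for e in phone))
```

In this sketch both devices end up with the same text even though one of them was offline for part of the response. Sequence numbers are what make the replay exact: the client states the last event it saw, and the session supplies only what came after.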
3. Advanced Interaction Models
- Bidirectional Control: By moving from SSE to WebSockets or a Pub/Sub-based transport, clients gain an upstream channel to send commands (e.g., "change the flight date to Wednesday") while the agent is still working.
- Concurrent/Multi-Agent Architectures: In complex systems, an orchestrator agent often delegates tasks to sub-agents. Instead of forcing the orchestrator to proxy all granular updates from sub-agents, a durable session allows all agents to write independently to the channel. This simplifies the architecture and provides the client with full visibility into all sub-tasks.
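Live control can be sketched as an agent that checks a shared channel for upstream signals between steps of its work. Everything here is hypothetical: the `Channel` class, the `stop` and `steer` signal names, and the step strings are assumptions chosen to mirror the flight-booking example, and a generator is used only so the demo can interleave client and agent actions in a single thread.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple


class Channel:
    """Hypothetical shared channel carrying agent output and client control signals."""

    def __init__(self) -> None:
        self.output: List[str] = []
        self.control: Deque[Tuple[str, str]] = deque()

    def send_control(self, kind: str, payload: str = "") -> None:
        # Upstream path: a client publishes a signal while the agent is working.
        self.control.append((kind, payload))


def run_agent(channel: Channel, plan: Iterable[str]) -> Iterator[str]:
    """Work through `plan`, honouring control signals between steps."""
    steps = deque(plan)
    while steps:
        while channel.control:
            kind, payload = channel.control.popleft()
            if kind == "stop":
                channel.output.append("[stopped]")
                return
            if kind == "steer":
                # Put the revision ahead of the remaining plan.
                steps.appendleft(f"revise: {payload}")
        step = steps.popleft()
        channel.output.append(step)
        yield step


def run_demo() -> List[str]:
    channel = Channel()
    agent = run_agent(
        channel, ["search flights", "pick Tuesday flight", "book flight"]
    )
    next(agent)  # the agent completes its first step
    channel.send_control("steer", "change the flight date to Wednesday")
    next(agent)  # the agent picks up the steer before continuing
    channel.send_control("stop")
    for _ in agent:  # the agent sees the stop and exits
        pass
    return channel.output
```

Because "stop" arrives as an explicit message rather than as a closed connection, the agent no longer has to guess whether the client wants to cancel or has merely disconnected.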
4. Real-World Application: Ably AI Transport
The speaker introduces Ably AI Transport as a practical implementation of this pattern.
- Functionality: It acts as a drop-in layer that handles the "plumbing" of durable sessions, including:
  - Event Materialization: Automatically converting text chunks into complete responses.
  - Multiplexing: Managing concurrent activity from multiple agents or clients.
  - Fanout: Synchronizing state across multiple devices.
- Human-in-the-loop: The demo showcased a support scenario where a human agent can join an existing session, view the full history of the AI-user interaction, and take over seamlessly.
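To make event materialization and multiplexing concrete, here is a generic sketch of folding interleaved chunks from several agents into complete responses. It does not use or represent Ably AI Transport's API: the `Chunk` fields, the `materialize` function, and the sample agent names are all assumptions for illustration.

```python
from typing import Dict, Iterable, NamedTuple


class Chunk(NamedTuple):
    response_id: str    # hypothetical key tying a chunk to one response
    agent: str          # which agent wrote the chunk
    text: str           # partial text
    done: bool = False  # True on the final chunk of a response


def materialize(chunks: Iterable[Chunk]) -> Dict[str, dict]:
    """Fold a multiplexed stream of chunks into one record per response."""
    responses: Dict[str, dict] = {}
    for chunk in chunks:
        entry = responses.setdefault(
            chunk.response_id,
            {"agent": chunk.agent, "text": "", "complete": False},
        )
        entry["text"] += chunk.text
        entry["complete"] = entry["complete"] or chunk.done
    return responses


# Two agents writing independently to the same session, interleaved.
STREAM = [
    Chunk("r1", "flights", "Found 3 "),
    Chunk("r2", "hotels", "Checking "),
    Chunk("r1", "flights", "flights.", done=True),
    Chunk("r2", "hotels", "availability"),
]

RESULT = materialize(STREAM)
```

Here `RESULT["r1"]` holds the finished text "Found 3 flights." while `RESULT["r2"]` is still marked incomplete. A participant who joins late, such as the human support agent in the demo, could read materialized records like these instead of replaying every individual chunk.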
5. Notable Quotes
- "The health of that stream is essentially tied to the health of that end client's connection." — Explaining the primary failure point of direct HTTP streaming.
- "Resume and cancel are mutually exclusive when you're using SSE." — Highlighting the technical conflict in current AI SDKs.
- "The best AI products do better than a simple sequential request-response pattern; they let you communicate with the agent while it's working."
6. Synthesis and Conclusion
The transition from direct HTTP streaming to a Durable Session model is essential for moving from "fragile demos" to "production-grade AI products." By decoupling the agent from the client via a persistent, addressable, and resumable layer (like Pub/Sub channels), developers can achieve:
- Resilience: Seamless recovery from network interruptions.
- Multi-device parity: Consistent experiences across mobile and desktop.
- Rich Interaction: Bidirectional control and multi-agent visibility without complex, custom-built orchestration logic.
This architectural shift allows engineering teams to stop building "plumbing" and focus on the core value of their AI agents.