Agents on the Canvas in tldraw — Steve Ruiz, tldraw
By AI Engineer
Key Concepts
- Tldraw: An online whiteboard platform and SDK that allows developers to build canvas-based applications.
- Make Real: A project enabling users to draw UI prototypes on a canvas and convert them into functional HTML/CSS prototypes using AI.
- Agentic Loops: A framework where AI models generate an output, review it, and iterate until a specific goal is achieved.
- Multi-Agent Orchestration: A system where multiple AI agents (e.g., "Fairies") collaborate on a shared canvas, delegating tasks and managing state.
- Local-First/File-Over-App: A design philosophy where applications operate on local files, allowing for greater user control and integration with powerful, potentially risky, AI scripting tools.
- Script Injection: The process of allowing an AI to modify the runtime environment or code of an application directly.
1. Tldraw: From Whiteboard to AI Canvas
Steve Ruiz, founder of Tldraw, explains that Tldraw is not just a whiteboard but a "hackable canvas runtime" built with React. The platform’s SDK allows developers to integrate canvas functionality into other products (e.g., Replit’s Agent Canvas, Luba AI). The core mission is to prove the SDK's versatility by building AI-driven collaborative tools.
2. Evolution of AI on the Canvas
- Make Real (2023): One of the first projects to allow non-technical users to create functional prototypes from hand-drawn sketches. It demonstrated that a vision model can interpret a rough sketch well enough to produce a working HTML/CSS prototype.
- Tldraw Computer: An early experiment in chaining prompts to allow AI to act as a collaborator on the canvas, completing diagrams or graphs rather than just generating static images.
- Agentic Loops: Moving beyond "one-shot" generation, Tldraw implemented loops where the AI produces an output, reviews it, and iterates. This mimics the workflow of modern coding agents.
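The generate, review, iterate cycle described above can be sketched roughly as follows. This is a minimal illustration, not tldraw's implementation: `agentic_loop`, `toy_generate`, and `toy_review` are hypothetical names, and the two toy functions stand in for real model calls.

```python
from typing import Callable, Optional, Tuple


def agentic_loop(
    generate: Callable[[Optional[str], Optional[str]], str],
    review: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 5,
) -> Tuple[Optional[str], int]:
    """Generate an output, review it, and retry with feedback until accepted."""
    draft: Optional[str] = None
    feedback: Optional[str] = None
    for iteration in range(1, max_iterations + 1):
        # Each round sees the previous draft and the reviewer's feedback.
        draft = generate(draft, feedback)
        accepted, feedback = review(draft)
        if accepted:
            return draft, iteration
    # Iteration budget exhausted: return the last draft anyway.
    return draft, max_iterations


# Toy stand-ins for model calls (hypothetical, for illustration only).
def toy_generate(previous: Optional[str], feedback: Optional[str]) -> str:
    # Start with one box and add another on every retry.
    return "box" if previous is None else previous + " box"


def toy_review(draft: str) -> Tuple[bool, str]:
    # Goal for this toy example: the diagram contains at least three boxes.
    count = draft.count("box")
    if count >= 3:
        return True, "looks complete"
    return False, f"need {3 - count} more box(es)"


result, rounds = agentic_loop(toy_generate, toy_review)
print(result, rounds)
```

The `max_iterations` cap is a design choice of this sketch: without it, a reviewer that never accepts would keep the loop running indefinitely.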
3. Multi-Agent Collaboration: "Fairies"
The "Fairies" project (fairies.tldraw.com) represents the next step in agentic interaction:
- Shared State: Multiple agents (Fairies) exist on the canvas simultaneously. They can "see" each other's work and coordinate actions.
- Leader-Follower Framework: When given a complex task, one agent is elected as the "leader." It scouts the canvas, creates a to-do list, and delegates specific sub-tasks to other agents.
- Visualizing Thinking: Unlike traditional terminal-based agents, these agents provide visual feedback on the canvas, allowing users to see each agent's "thought process" and progress in real time.
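As a rough sketch of the leader-follower flow described above: one agent is chosen as leader, it builds a to-do list, and the sub-tasks are handed out to the remaining agents. Everything here is an assumption for illustration. The summary does not say how the leader is elected or how work is split, so this sketch picks the first agent and delegates round-robin; `Agent`, `elect_leader`, `plan_tasks`, and `delegate` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Agent:
    name: str
    assigned: List[str] = field(default_factory=list)


def elect_leader(agents: List[Agent]) -> Agent:
    # Assumption: pick the first agent. The real election rule is not described.
    return agents[0]


def plan_tasks(goal: str, canvas_shapes: List[str]) -> List[str]:
    # Stand-in for the leader scouting the canvas and writing a to-do list:
    # one sub-task per shape it finds.
    return [f"{goal}: {shape}" for shape in canvas_shapes]


def delegate(tasks: List[str], followers: List[Agent]) -> Dict[str, List[str]]:
    # Assumption: simple round-robin assignment across followers.
    for index, task in enumerate(tasks):
        followers[index % len(followers)].assigned.append(task)
    return {agent.name: agent.assigned for agent in followers}


agents = [Agent("fairy-1"), Agent("fairy-2"), Agent("fairy-3")]
leader = elect_leader(agents)
followers = [agent for agent in agents if agent is not leader]

todo = plan_tasks("label", ["circle", "square", "arrow"])
assignments = delegate(todo, followers)
print(leader.name, assignments)
```

In a shared-canvas setting the `assignments` mapping would live in shared state so that every agent, and the user, can see who is working on what.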
4. Technical Challenges and Solutions
- Vision Model Limitations: Ruiz notes that vision models struggle with structured data because training data for images often conflicts with web-based coordinate systems (e.g., the Y-axis orientation in Cartesian graphs vs. web DOM coordinates).
- Safety vs. Agency: To allow agents to perform more powerful tasks (like modifying application code), Tldraw moved toward a desktop-based Electron wrapper. The desktop app opens a local HTTP port, through which the AI can perform "script injection" to modify the application's behavior.
- The "Sharp Tool" Philosophy: Ruiz argues that for maximum agency, users must accept the risks of allowing AI to modify local files. He compares this to "OpenClaude" or similar tools, emphasizing that these are powerful, "sharp" tools that require user responsibility.
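The Y-axis mismatch mentioned under Vision Model Limitations comes down to a sign convention: in a Cartesian graph, y grows upward from the origin, while in web and DOM coordinates, y grows downward from the top-left corner. A minimal sketch of the conversion, assuming a fixed-height drawing area with no pan or zoom (the function names and `canvas_height` parameter are hypothetical):

```python
from typing import Tuple


def cartesian_to_screen(x: float, y: float, canvas_height: float) -> Tuple[float, float]:
    """Convert a point with y measured upward from the bottom edge
    into a point with y measured downward from the top edge."""
    return x, canvas_height - y


def screen_to_cartesian(x: float, y: float, canvas_height: float) -> Tuple[float, float]:
    """Inverse conversion; the flip is its own inverse."""
    return x, canvas_height - y


# A point 20 units above the bottom of a 100-unit-tall area
# sits 80 units below the top in screen coordinates.
screen_point = cartesian_to_screen(10, 20, 100)
round_trip = screen_to_cartesian(*screen_point, 100)
print(screen_point, round_trip)
```

A model that has mostly seen plotted graphs may place a "higher" point at a larger y value, which renders lower on a web canvas; applying a flip like this at the boundary is one way to reconcile the two conventions.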
5. Notable Quotes
- "It felt like I was handing my keyboard to some other AI rather than someone collaborating with me." — Steve Ruiz, on the limitations of early agentic loops.
- "If you really want to maximize the agency... then you kind of just need to hand that to the user and say good luck." — On the necessity of local-first, high-access AI environments.
6. Synthesis and Conclusion
The presentation highlights a shift in AI interaction from static generation to active, collaborative agency. By moving agents from the sidebar onto the canvas, Tldraw enables a more intuitive, visual, and collaborative workflow. The future of these tools lies in "local-first" architectures where AI has deep access to the user's environment, allowing it to act as a true partner in creation, provided the user is willing to manage the inherent risks of such powerful, unconstrained access.