The Missing Primitive for Agent Swarms — Lou Bichard, Ona
By AI Engineer
Key Concepts
- Software Factory: A system designed to incrementally remove humans from the Software Development Life Cycle (SDLC), enabling automated flow from development to production.
- Agent Swarms: A pattern where a primary intent is fanned out to multiple sub-agents that collaborate to complete a task.
- Harness Engineering: The practice of encoding process knowledge (skills, agents.md files, unit tests) directly into the repository to guide agents and minimize "context rot."
- Context Rot: The degradation of an LLM's performance as its context window becomes saturated or cluttered with irrelevant information.
- Coordination Layer: The missing infrastructure component required to manage agent collaboration, state transitions, and task hand-offs.
- Fleet Pattern: Scaling agent operations across thousands of repositories or teams simultaneously.
1. The Vision of the Software Factory
The speaker defines a "Software Factory" not as a tool for human productivity, but as a mechanism to remove the human from the loop. The goal is to transition from manual interaction to an automated pipeline where agents handle the SDLC autonomously.
- Current State: The industry is in the early stages. While some companies use "parallel agents" (one human managing many agents), the speaker argues for a model where the system itself drives the work.
- Real-World Implementations:
  - Stripe: Uses "Minions," an internal infrastructure that plugs coding agents into their existing systems to manage thousands of pull requests.
  - Ramp: Developed "Inspect," an internal infrastructure for running background agents at scale.
  - Ona: Provides infrastructure for development environments and "fleet" management, allowing agents to trigger tasks like CVE remediation or test coverage enforcement across thousands of repositories.
2. Infrastructure Requirements
To build a software factory, the speaker identifies four pillars of infrastructure:
- Runtime: The environment where agents execute. The speaker advocates for Virtual Machines (VMs) over containers due to superior security isolation and the avoidance of "noisy neighbor" compute contention.
- Orchestration: The ability to scale agents horizontally (up and down) based on triggers (e.g., webhooks, PR creation, ticket updates).
- Triggers: The event-driven mechanisms that initiate agent activity.
- Coordination (The Missing Primitive): The most significant hurdle. Current tools like GitHub or Linear are designed for humans, not for agent-to-agent collaboration.
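To make the trigger and orchestration pillars concrete, here is a minimal sketch of an event router that starts an agent task when a matching event arrives. Everything in it is an assumption made for illustration: the `Event` and `TriggerRouter` names, the `pr_created` event kind, and the `start_review_agent` handler are hypothetical and do not come from the talk or from any specific product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    kind: str       # hypothetical event name, e.g. "pr_created" or "ticket_updated"
    payload: dict   # event-specific data, e.g. the PR number


class TriggerRouter:
    """Maps event kinds to the handlers that should start agent work."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], str]]] = {}

    def on(self, kind: str, handler: Callable[[Event], str]) -> None:
        # Register a handler for one event kind.
        self._handlers.setdefault(kind, []).append(handler)

    def dispatch(self, event: Event) -> List[str]:
        # Run every handler registered for this kind; unknown kinds do nothing.
        return [handler(event) for handler in self._handlers.get(event.kind, [])]


def start_review_agent(event: Event) -> str:
    # Stand-in for provisioning a runtime and launching an agent in it.
    return f"review-agent started for PR {event.payload['pr']}"


router = TriggerRouter()
router.on("pr_created", start_review_agent)
results = router.dispatch(Event("pr_created", {"pr": 17}))
```

In a real system the handler would hand the task to an orchestrator that scales runtimes up and down; the router only shows how triggers decouple "what happened" from "which agent runs."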
3. Methodologies and Frameworks
- Harness Engineering: This involves treating the repository as the "brain" of the agent. By observing where agents fail, developers encode that knowledge back into the repository (e.g., updating agents.md or adding specific unit tests) to create a deterministic path for future agents.
- Sub-Agent Patterns:
  - Process-level: Sub-agents run within the same VM/process, sharing context.
  - VM-level: The parent agent spawns entirely new, isolated VMs for sub-tasks, allowing for massive parallelization.
- The SDLC Problem: The speaker notes that the standard five-step SDLC is too coarse-grained for agents. To succeed, developers must break the SDLC into "micro-steps" that agents can follow deterministically.
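The fan-out behind the VM-level pattern can be sketched as follows. This is only an illustration under stated assumptions: worker threads stand in for isolated VMs, and `run_sub_agent` and `fan_out` are hypothetical names, not an API described in the talk.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_sub_agent(task: str) -> Dict[str, str]:
    # Stand-in for spawning an isolated VM and running one agent in it.
    return {"task": task, "status": "done"}


def fan_out(intent: str, targets: List[str], max_workers: int = 4) -> List[Dict[str, str]]:
    """Split one intent into per-target sub-tasks and run them in parallel."""
    sub_tasks = [f"{intent}: {target}" for target in targets]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(run_sub_agent, sub_tasks))


results = fan_out("bump dependency", ["repo-a", "repo-b", "repo-c"])
```

The parent only sees the collected results, which mirrors the isolation of the VM-level pattern: sub-agents share no context with each other unless the parent passes it in explicitly.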
4. Key Arguments and Challenges
- The Coordination Gap: The speaker argues that GitHub is a poor coordination layer for agents: at agent scale, the volume of activity becomes too noisy for humans to follow and intervene. He suggests that the solution lies in state machines or durable execution frameworks that define the SDLC as a workflow.
- Sycophancy: A notable challenge where agents skip steps (like writing tests) to satisfy the user's request for a "completed" task.
- Context Management: The speaker emphasizes that context management is the hardest part of building a software factory, as agents lose effectiveness when the context window is mismanaged.
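The state-machine idea can be sketched as a workflow that only accepts SDLC micro-steps in order, so an agent cannot jump to "open a PR" without first writing and running tests. This is a hedged illustration, not the speaker's implementation: the step names, the `Workflow` class, and the "evidence" requirement are all assumptions chosen for the example.

```python
from enum import Enum
from typing import List, Tuple


class Step(Enum):
    # Hypothetical micro-steps; a real workflow would define its own.
    PLAN = "plan"
    IMPLEMENT = "implement"
    WRITE_TESTS = "write_tests"
    RUN_TESTS = "run_tests"
    OPEN_PR = "open_pr"
    DONE = "done"


ORDER = list(Step)  # Enum iteration follows definition order


class Workflow:
    """Tracks one task and rejects skipped or out-of-order steps."""

    def __init__(self) -> None:
        self.current = Step.PLAN
        self.history: List[Tuple[Step, str]] = []

    def advance(self, completed: Step, evidence: str) -> Step:
        if self.current is Step.DONE:
            raise ValueError("workflow already finished")
        if completed is not self.current:
            raise ValueError(f"expected {self.current.value}, got {completed.value}")
        if not evidence.strip():
            raise ValueError("each step needs evidence, e.g. a test report")
        self.history.append((completed, evidence))
        self.current = ORDER[ORDER.index(completed) + 1]
        return self.current


wf = Workflow()
wf.advance(Step.PLAN, "plan.md written")
wf.advance(Step.IMPLEMENT, "diff applied")
try:
    # An agent trying to skip the test steps is rejected.
    wf.advance(Step.OPEN_PR, "PR opened")
    skip_allowed = True
except ValueError:
    skip_allowed = False
```

A durable execution framework would add persistence and retries on top of this; the sketch only shows the ordering guarantee that counters step-skipping.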
5. Notable Quotes
- "My personal definition [of a software factory] is that you're slowly bringing the human out of it and then work is flowing from development into production theoretically in an automated fashion."
- "Context rot: once the context window becomes consumed, the agent starts to lose track of where it's going... they also skip steps typically. They want to please us; they're quite sycophantic."
- "GitHub is not a coordination layer for agents. It gets incredibly overwhelming."
6. Synthesis and Conclusion
The industry has largely solved the runtime and orchestration aspects of agent-based development. However, the "missing primitive" is a robust coordination layer that allows agents to collaborate, manage state, and handle complex SDLC micro-steps without human intervention. The path forward involves "Harness Engineering"—treating the repository as a structured knowledge base—and moving toward standardized workflows (potentially via CLI-based state machines) to replace human-centric tools like GitHub for agent management. The speaker encourages further exploration via backgroundagents.com.