
Why the Real AI Agent Era Is Still 5 Years Away | Yutori, Abhishek Das

By EO



Key Concepts

Agentic Workflows: AI systems designed to perform sequences of actions on the web to complete complex tasks.
Non-determinism: The tendency of AI models to produce different outputs or fail unpredictably, which the speaker argues is currently being "normalized" in the industry.
Long-Horizon Workflows: Tasks requiring a long sequence of steps (10–50+), where per-step errors compound so that the end-to-end success rate falls off exponentially with the number of steps.
Evals and Guardrails: Systematic testing frameworks used to monitor agent performance and identify failure points.
Dogfooding: The practice of using one's own product internally to refine quality and user experience.
Grad-CAM (Gradient-weighted Class Activation Mapping): A technique for making deep learning models more interpretable by visualizing which parts of an input (e.g., an image) the model focuses on to make a prediction.
Proof of Work: The concept that AI agents must provide transparency regarding the steps taken to reach a conclusion to build user trust.

1. The Challenge of Agentic Reliability

Abhishek Das, co-founder and co-CEO of Yutori, argues that the current state of "agentic" products is plagued by low reliability. He highlights a critical mathematical problem: in a 50-step workflow, even with a 90% success rate per step, the cumulative probability of success is extremely low. If step failures are independent, it is 0.9^50, or roughly 0.5%.
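The arithmetic behind this claim can be checked with a short sketch. It assumes every step succeeds or fails independently with the same probability, which is a simplification and not something stated in the talk; the function names are illustrative.

```python
def workflow_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


def required_per_step_success(target: float, steps: int) -> float:
    """Per-step success rate needed to reach an end-to-end target."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    if steps <= 0:
        raise ValueError("steps must be positive")
    return target ** (1.0 / steps)


if __name__ == "__main__":
    for steps in (10, 25, 50):
        p = workflow_success_probability(0.9, steps)
        print(f"{steps:>2} steps at 90% per step: {p:.2%} end to end")
    needed = required_per_step_success(0.9, 50)
    print(f"Per-step rate needed for 90% over 50 steps: {needed:.4f}")
```

Under these assumptions, a 50-step workflow needs a per-step success rate of about 99.8% to finish correctly 90% of the time, which is why small per-step improvements matter so much for long-horizon tasks.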

The "Normalization of Slop": Das criticizes the industry trend of accepting non-deterministic, unreliable AI behavior. He asserts that if an agent cannot perform a task correctly on the first try, it is not yet "good enough" for production.
Backtracking: A core requirement for robust agents is the ability to recognize a mistake, backtrack, and attempt a different branch of logic, rather than continuing to fail.
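The backtracking idea can be illustrated with a generic depth-first search. This is a minimal sketch on a toy problem, not a description of how Yutori's agents work; `solve_with_backtracking`, the action names, and the number puzzle are all hypothetical.

```python
from typing import Callable, Iterable, List, Optional


def solve_with_backtracking(
    state: int,
    is_goal: Callable[[int], bool],
    candidate_actions: Callable[[int], Iterable[str]],
    apply_action: Callable[[int, str], Optional[int]],
    max_depth: int,
) -> Optional[List[str]]:
    """Return a list of actions that reaches the goal, or None if none is found."""
    if is_goal(state):
        return []
    if max_depth == 0:
        return None
    for action in candidate_actions(state):
        next_state = apply_action(state, action)
        if next_state is None:
            # The action failed: skip it and try the next branch.
            continue
        rest = solve_with_backtracking(
            next_state, is_goal, candidate_actions, apply_action, max_depth - 1
        )
        if rest is not None:
            return [action] + rest
        # This branch was a dead end: fall through and try another action.
    return None


# Toy problem (hypothetical): reach 10 starting from 1 without overshooting.
def toy_apply(state: int, action: str) -> Optional[int]:
    value = state * 2 if action == "double" else state + 3
    return value if value <= 10 else None  # overshooting counts as a failed step


plan = solve_with_backtracking(
    state=1,
    is_goal=lambda s: s == 10,
    candidate_actions=lambda s: ["double", "add3"],
    apply_action=toy_apply,
    max_depth=6,
)
print(plan)
```

Here the search first doubles its way to 8, finds that every action from 8 overshoots, backs up to 4, and takes the other branch instead of repeating the failing one.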

2. Product Philosophy and User Experience

Das emphasizes that in an era where LLMs make prototyping easy, the true differentiator is taste, craft, and intuition.

The 80/20 Approach: While user feedback is vital, Das argues that builders must also rely on intuition to solve problems users haven't explicitly articulated.
Reducing "Paper Cuts": He cites the example of auto-filling 2FA codes on mobile devices—a feature users didn't explicitly request but which significantly improves daily life by removing friction.
Trust through Transparency: To build user trust, Yutori's products include a "proof of work" feature. Users can click a button to see exactly which websites the agent visited and what data it analyzed, mirroring the interpretability goals of his earlier research, Grad-CAM.
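One way to picture a "proof of work" record is an answer that carries its own trail of visited pages. The sketch below is an assumption about what such a structure could look like; the class names, fields, and example.com URLs are hypothetical and are not taken from Yutori's product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VisitedSource:
    url: str      # page the agent opened
    finding: str  # what it took from that page


@dataclass
class AgentAnswer:
    answer: str
    sources: List[VisitedSource] = field(default_factory=list)

    def record(self, url: str, finding: str) -> None:
        """Log a visited page alongside what was learned from it."""
        self.sources.append(VisitedSource(url, finding))

    def proof_of_work(self) -> str:
        """Render the answer together with the steps that led to it."""
        lines = [f"Answer: {self.answer}", "Sources visited:"]
        for index, source in enumerate(self.sources, start=1):
            lines.append(f"  {index}. {source.url}: {source.finding}")
        return "\n".join(lines)


result = AgentAnswer(answer="The cheaper option is Store B.")
result.record("https://example.com/store-a", "listed price 42 USD")
result.record("https://example.com/store-b", "listed price 39 USD")
report = result.proof_of_work()
print(report)
```

Keeping the trail on the answer object itself means the "show your work" view can be rendered on demand without re-running the agent.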

3. Methodology: Building Yutori

The development process at Yutori is rooted in the scientific method: forming hypotheses, designing experiments, and iterating.

Internal Testing: The team practices rigorous "dogfooding," dedicating 1.5 hours weekly to testing new features. They run dozens of internal experiments, only shipping the most reliable ones to production.
Evals: Every production query is subjected to comprehensive evaluation frameworks to identify specific domains where the agent struggles, allowing for targeted model improvement.
The "Attention to Detail" Philosophy: Das believes that if a company demonstrates extreme care in the visible parts of a product, users will inherently trust the invisible, complex backend processes.
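The evals point above, finding the specific domains where the agent struggles, amounts to grouping pass/fail results by domain. The following is a minimal sketch under that assumption; the domain labels, sample results, and 0.8 threshold are made up for illustration and do not reflect Yutori's actual framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pass_rate_by_domain(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Aggregate (domain, passed) eval results into a pass rate per domain."""
    passed_count: Dict[str, int] = defaultdict(int)
    total_count: Dict[str, int] = defaultdict(int)
    for domain, passed in results:
        total_count[domain] += 1
        if passed:
            passed_count[domain] += 1
    return {d: passed_count[d] / total_count[d] for d in total_count}


def weakest_domains(rates: Dict[str, float], threshold: float) -> List[str]:
    """Domains whose pass rate is below the threshold, worst first."""
    failing = [d for d, rate in rates.items() if rate < threshold]
    return sorted(failing, key=lambda d: rates[d])


# Hypothetical eval results: (domain, did the agent complete the task?)
sample_results = [
    ("travel", True),
    ("travel", False),
    ("shopping", True),
    ("shopping", True),
    ("forms", False),
]
rates = pass_rate_by_domain(sample_results)
print(rates)
print(weakest_domains(rates, threshold=0.8))
```

Sorting the failing domains worst-first gives a simple priority list for targeted model improvement.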

4. Vision for the Future

Das envisions a shift in how humans interact with the web:

Higher Abstraction: Within 5–10 years, users will interact with the web at a higher level of abstraction, delegating "digital chores" to proactive AI agents.
Accessibility: Agents will democratize technology, allowing non-technical users (like his parents) to interact with complex websites simply by stating their intent, rather than learning the UI of every new platform.
Human-Agent Collaboration: The goal is not to replace humans, but to handle mundane tasks, freeing humans to focus on more meaningful, creative work.

5. Notable Quotes

"If it's not good enough to work on the first try, it's not good enough."
"I don't like the normalization of slop and non-determinism and poor reliability, especially with agentic products."
"It is important for models to be able to convey not just the final prediction... but also the proof of work."

Synthesis

The core takeaway is that the future of AI agents depends on moving away from the current culture of "shipping broken things" toward a culture of high-reliability engineering. By combining rigorous evaluation, transparent "proof of work" mechanisms, and a design-first approach to user experience, Yutori aims to transition AI from a novelty that fails frequently to a reliable, proactive assistant that handles the complexities of the web on behalf of the user.
