The Future of Knowledge Assistants: Jerry Liu

By AI Engineer

Key Concepts

  • Knowledge Assistants
  • Retrieval-Augmented Generation (RAG)
  • Large Language Models (LLMs)
  • Agentic RAG
  • Multi-Agent Systems
  • Data Processing (Parsing, Chunking, Indexing)
  • Query Understanding and Planning
  • Tool Use
  • Microservices

1. Introduction: The Future of Knowledge Assistants

  • Jerry, co-founder and CEO of LlamaIndex, discusses the future of knowledge assistants.
  • LLMs are being used in enterprises for document processing, knowledge search, conversational agents, and generative workflows.
  • The goal is to build an interface that can take any task as input and return a relevant output (short answer, research report, structured output).

2. Limitations of Basic RAG Pipelines

  • Naive RAG pipelines have limitations:
    • Naive data processing.
    • Lack of complex query understanding and planning.
    • No sophisticated interaction with other services.
    • Statelessness (no memory).
  • Basic RAG is considered a "glorified search system" and cannot answer complex questions.
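The "glorified search system" pattern can be made concrete with a minimal sketch. This is illustrative only (not LlamaIndex code): retrieval is naive keyword overlap, the top chunk is stuffed into a prompt, and each call is stateless. The `llm` callable is a hypothetical stand-in for a real model API.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str
    text: str

def retrieve(query: str, chunks: list[Chunk], top_k: int = 1) -> list[Chunk]:
    """Rank chunks by raw word overlap with the query -- no query
    understanding, no planning, just a one-shot search."""
    q_words = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.text.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def answer(query: str, chunks: list[Chunk], llm) -> str:
    """Stateless: every call starts from scratch, with no memory of
    earlier turns and no interaction with other services."""
    context = "\n".join(c.text for c in retrieve(query, chunks))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)
```

Each of the limitations above is visible here: one fixed retrieval step, no decomposition of the question, and nothing carried over between calls.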

3. From Simple Search to Context-Augmented Research Assistant

  • Three steps to move beyond simple RAG:
    • Advanced Data and Retrieval Modules.
    • Advanced Single Agent Query Flows (Agentic RAG).
    • General Multi-Agent Task Solver.

4. Advanced Data and Retrieval Modules

  • Data quality is crucial for LLM applications ("garbage in, garbage out").
  • Data processing translates raw data into a format suitable for LLMs.
  • Main components: parsing, chunking, and indexing.
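Of the three components, chunking is the easiest to sketch. The following is a simplified character-window version (real pipelines typically split on tokens or sentence boundaries); the overlap exists so that facts straddling a chunk boundary still land intact in at least one chunk.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```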

4.1 Parsing

  • Good PDF parsing is essential to extract complex documents into well-structured representations.
  • Example: Using Llama Parse on the Caltrain schedule yields better results than PyPDF due to its ability to understand spatially laid-out text.
  • Good parsing reduces hallucinations.
  • Llama Parse has processed tens of millions of pages.

4.2 Indexing

  • Advanced indexing modules model heterogeneous data within a document.
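One way to read "modeling heterogeneous data" is keeping distinct node types (prose, tables, etc.) in separate stores so each can be queried with type-appropriate logic. A toy sketch, not the LlamaIndex API:

```python
from collections import defaultdict

class HeterogeneousIndex:
    """Toy index storing text nodes and table nodes separately, so a
    query layer can route to whichever representation fits the question."""

    def __init__(self):
        self.nodes = defaultdict(list)  # node_type -> list of nodes

    def add(self, node_type: str, content):
        self.nodes[node_type].append(content)

    def query(self, node_type: str, predicate):
        """Filter nodes of one type with type-appropriate logic
        (e.g. structured predicates over table rows)."""
        return [n for n in self.nodes[node_type] if predicate(n)]
```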

5. Advanced Single Agent Query Flows (Agentic RAG)

  • Building an agentic RAG layer on top of existing data services enhances query understanding.
  • Key components of agentic QA systems:
    • Function calling and tool use.
    • Query planning (sequential or DAG-style).
    • Conversation memory (statefulness).
  • Agentic RAG uses LLMs extensively during query understanding and processing.
  • Everything is an LLM interacting with data services as tools.
  • Agent reasoning loops (e.g., while loop over function calling or ReAct) enable personalized QA systems.
  • Can handle complex questions, maintain user state, and access structured data.
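The "while loop over function calling" pattern can be sketched in a few lines. Here the `policy` callable is a hypothetical stand-in for an LLM with function calling: given the scratchpad so far, it either requests a tool call or returns a final answer, and tool results are fed back as state.

```python
def agent_loop(task: str, policy, tools: dict, max_steps: int = 5):
    """Minimal agent reasoning loop: call the policy, execute the
    requested tool, append the observation, repeat until an answer."""
    scratchpad = [("task", task)]
    for _ in range(max_steps):
        action = policy(scratchpad)  # {"tool": ..., "args": ...} or {"answer": ...}
        if "answer" in action:
            return action["answer"]
        result = tools[action["tool"]](**action["args"])
        scratchpad.append((action["tool"], result))
    raise RuntimeError("agent did not finish within max_steps")
```

The data services from the previous sections would plug in as entries in `tools`; the scratchpad is the (per-task) state that basic RAG lacks.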

6. Multi-Agent Task Solvers

  • Single agents have limitations; specialist agents perform better.
  • Agents are increasingly interfacing with other agents, suggesting a multi-agent future.

6.1 Why Multi-Agents?

  • Specialization and reliable operation over focused tasks.
  • Parallelization for faster task completion.
  • Potential cost and latency savings by using weaker models with fewer tools per agent.
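The parallelization point can be sketched with a thread pool: independent specialist agents (here just callables standing in for focused sub-agents) work on the same task concurrently instead of one generalist agent handling everything in sequence.

```python
from concurrent.futures import ThreadPoolExecutor

def run_specialists(task: str, agents: dict) -> dict:
    """Fan a task out to specialist agents in parallel and collect
    each agent's result by name."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(agent, task)
                   for name, agent in agents.items()}
        return {name: f.result() for name, f in futures.items()}
```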

6.2 Challenges in Building Multi-Agent Systems

  • Balancing unconstrained agent interaction with explicit control.
  • Defining the proper service architecture for agents in production.

7. Llama Agents: Agents as Microservices

  • Llama Agents is a new repo (in alpha) that represents agents as microservices.
  • Goal: To move agents out of notebooks and into production.
  • Each agent is a separate service that can communicate with others through a central API.
  • Enables scalable, deployable, and reusable agents.

7.1 Core Architecture

  • Each agent is a separate service.
  • Agents can be written with LlamaIndex or other frameworks.
  • Agents communicate via a message queue.
  • Orchestration happens through a control plane (inspired by Kubernetes).
  • Orchestration can be explicit (defined flows) or implicit (LLM orchestrator).
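The explicit-orchestration case can be sketched with the standard library: a toy control plane routes a message through an ordered flow of registered agent services via a shared queue (a stand-in for a real message broker). This illustrates the architecture, not the actual Llama Agents API.

```python
import queue

class ControlPlane:
    """Toy orchestrator: agents register a handler, and each task is
    routed through an explicit, predefined flow of services."""

    def __init__(self, flow: list[str]):
        self.flow = flow       # ordered agent names
        self.agents = {}       # name -> handler callable

    def register(self, name: str, handler):
        self.agents[name] = handler

    def run(self, task):
        # Explicit orchestration: pass the message through each
        # service in turn via the queue.
        q = queue.Queue()
        q.put(task)
        for name in self.flow:
            q.put(self.agents[name](q.get()))
        return q.get()
```

An implicit (LLM-orchestrator) variant would replace the fixed `flow` list with a model deciding, per message, which service to route to next.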

7.2 Demo: Llama Agents on a Basic RAG Pipeline

  • Demo shows how to run Llama Agents on a trivial RAG pipeline with a query rewriting service and a default RAG agent.
  • Agents communicate through an API protocol.
  • Enables launching multiple client requests and processing tasks concurrently.
  • The goal is to turn even trivial logic into deployable services.

8. Call to Action

  • Llama Agents is in Alpha mode; feedback is welcome.
  • Dozens of initial tutorials are available.
  • Check out the discussions tab for the roadmap.
  • If interested in data quality, join the waitlist for Llama Cloud.

9. Conclusion

  • Multi-agent systems are a core component of the future of knowledge assistants.
  • Llama Agents aims to provide a production-grade framework for building multi-agent systems.
  • Data quality and advanced data processing are essential for effective LLM applications.
