Coding AI Research Assistant in Python

By NeuralNine

Key Concepts

  • LangChain: An agentic framework used to coordinate the AI research assistant.
  • Pydantic: A data validation library used to enforce a "Structured Output" schema, ensuring the AI returns data in a predictable JSON format.
  • MCP (Model Context Protocol): A standard for connecting AI assistants to external data sources (in this case, scientific literature).
  • Consensus MCP: A specific server that provides access to peer-reviewed scientific literature, grounding the assistant's answers in real citations rather than hallucinated ones.
  • Vibe Coding: A development approach where design and UI components are generated by AI (e.g., using Claude Code) rather than manually coded.
  • Structured Output: A technique where the LLM is forced to adhere to a predefined class structure (e.g., Paper, Formula, Report) to ensure programmatic usability.

1. Project Overview and Architecture

The goal of the project is to build an AI research assistant that generates academic reports (papers, formulas, and trends) based on user-provided topics and constraints. The development follows a three-step methodology:

  1. CLI Proof of Concept: A basic Python script using LangChain and Pydantic to generate structured JSON.
  2. UI Integration: Using "vibe coding" to wrap the logic in a Flask web application.
  3. Grounding with Consensus: Integrating the Consensus MCP server to replace hallucinated data with real, peer-reviewed scientific literature.

2. Technical Implementation Details

  • Environment Management: The project uses uv for dependency management and .env files to securely store the OpenAI API key.
  • Structured Schema: The assistant uses Pydantic BaseModel classes to define the expected output:
    • Paper: Includes title, authors, year, venue, URL, and relevance description.
    • Formula: Includes name, LaTeX source code, description, and reference.
    • Report: The top-level container aggregating papers, formulas, and trends.
  • Model Invocation: The system uses init_chat_model (GPT-5) with with_structured_output to ensure the LLM returns data matching the defined schema.
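The schema described above can be sketched with Pydantic and LangChain. The exact field names are illustrative (the video does not list them verbatim), and the LLM call is guarded behind an API-key check so the sketch runs without credentials:

```python
import os
from typing import List
from pydantic import BaseModel, Field

class Paper(BaseModel):
    title: str
    authors: List[str]
    year: int
    venue: str
    url: str
    relevance: str = Field(description="Why this paper matters for the topic")

class Formula(BaseModel):
    name: str
    latex: str = Field(description="LaTeX source for the formula")
    description: str
    reference: str

class Report(BaseModel):
    topic: str
    papers: List[Paper]
    formulas: List[Formula]
    trends: List[str]

# Guarded: only call the model if an OpenAI API key is configured.
if os.environ.get("OPENAI_API_KEY"):
    from langchain.chat_models import init_chat_model

    llm = init_chat_model("gpt-5", model_provider="openai")
    structured_llm = llm.with_structured_output(Report)
    report = structured_llm.invoke("Recent work on diffusion models, 2023-2024")
    print(report.model_dump_json(indent=2))
```

Because with_structured_output binds the Report schema to the model, the result is a validated Report instance rather than free-form text, which is what makes the later UI work possible.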

3. The "Grounding" Problem and Solution

A significant issue identified during the CLI phase was that the LLM frequently hallucinated papers, especially when constrained by specific dates (e.g., 2026).

  • The Solution: Integrating the Consensus MCP server.
  • Methodology:
    • Install langchain-mcp-adapters.
    • Configure a MultiServerMCPClient using the MCP remote tool via NPX.
    • Transition from a simple chat model to an Agent capable of tool use.
    • Use load_mcp_tools to allow the agent to query the Consensus database in real-time.
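The client configuration from the methodology above might look like the following sketch. The server name "consensus" and the exact npx arguments are assumptions based on the video's description of bridging the remote server via mcp-remote; the import is wrapped so the sketch runs even where langchain-mcp-adapters is not installed:

```python
# Server configuration for langchain-mcp-adapters: the remote Consensus
# endpoint is bridged over stdio via the npx mcp-remote tool, as in the
# video. The server name "consensus" is an illustrative assumption.
consensus_server = {
    "consensus": {
        "command": "npx",
        "args": ["-y", "mcp-remote", "https://mcp.consensus.app/mcp"],
        "transport": "stdio",
    }
}

try:
    from langchain_mcp_adapters.client import MultiServerMCPClient

    client = MultiServerMCPClient(consensus_server)
except ImportError:
    # langchain-mcp-adapters not installed in this environment.
    client = None
```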

4. Step-by-Step Framework for MCP Integration

  1. Client Setup: Initialize MultiServerMCPClient with the Consensus URL (https://mcp.consensus.app/mcp).
  2. Session Management: Use async with client.session to create a context for tool execution.
  3. Tool Loading: Use await load_mcp_tools(session) to inject scientific search capabilities into the agent.
  4. Agent Execution: Replace the standard invoke with an agentic invoke that utilizes the tool_strategy to populate the Report schema.

5. Notable Quotes and Insights

  • "We cannot just get randomly structured free-form text from the model and then build a UI around it. We need to have certain fields that are always there." — Explaining the necessity of Pydantic for UI development.
  • "This is day and night difference in the quality of the results... this is what allows you to build reliable research technology." — On the impact of grounding the model with Consensus versus relying on raw LLM generation.

6. Synthesis and Conclusion

The project demonstrates that while LLMs are powerful for synthesis, they are unreliable for academic research due to hallucinations. By combining LangChain's agentic capabilities with Pydantic's structured output and MCP-based grounding, developers can create robust research tools. The final application successfully retrieves real-world papers, renders LaTeX formulas, and identifies trends, providing a professional-grade workflow for academic writing.
