Medical AI Project: Side Effects Tracker in Python
By NeuralNine
Side Effects Tracker for Medical Drugs in Python - Detailed Summary
Key Concepts:
- MCP (Model Context Protocol): A standardized way for agents to interact with tools and APIs.
- Arcade.dev: A platform for building, deploying, and managing MCP servers and tools.
- Flask: A Python web framework used to create the web interface.
- LangChain: A framework for developing applications powered by language models.
- API (Application Programming Interface): A set of rules and specifications that lets software programs communicate with each other; here, the clinicaltrials.gov API.
- Tool Integration: Connecting external services (like APIs and Slack) to an agent for enhanced functionality.
- Data Aggregation: Combining data from multiple sources (clinical trial results) to provide a consolidated view.
- Slack Integration: Using Slack as a communication channel for notifications and updates.
- ORM (Object-Relational Mapper): A technique that maps rows in a relational database to objects in an object-oriented programming language. SQLAlchemy is used here.
1. Project Overview & Technical Architecture
The project aims to build a side effects tracker for medical drugs. Users will input a drug name via a web interface, and the system will return a list of recently reported side effects from clinical trials, along with an automated Slack notification for new updates.
The technical architecture comprises:
- MCP Server: Built from scratch using Python and Arcade.dev, responsible for communicating with the clinicaltrials.gov API.
- Arcade.dev Platform: Provides security, reliability, and management for the MCP server.
- Flask Application: Serves as the web interface, handling user input and displaying results.
- Slack Integration: Delivers real-time notifications about new side effect reports.
- LangChain Agent: Powers the application logic, utilizing the MCP server and Slack tools.
2. Data Acquisition from clinicaltrials.gov API
The project utilizes the publicly available clinicaltrials.gov API (https://clinicaltrials.gov/api/v2) to retrieve side effect data. Specifically, the /studies endpoint is used.
Key API parameters:
- query: The drug name to search for (e.g., "paracetamol").
- page_size: The number of results per page (set to 25).
- sort: Sorting criteria, set to "results_first_post_date" to sort by date.
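As a minimal sketch of how these parameters could be assembled into a request URL, the snippet below uses only the standard library. The parameter names are taken as written in this summary; the live API may spell them differently, so check them against the API documentation before relying on this. The helper name build_studies_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://clinicaltrials.gov/api/v2"


def build_studies_url(drug_name: str, page_size: int = 25) -> str:
    """Build the GET URL for the /studies endpoint (hypothetical helper)."""
    # Parameter names follow this summary; they are assumptions about the API.
    params = {
        "query": drug_name,
        "page_size": page_size,
        "sort": "results_first_post_date",
    }
    return f"{BASE_URL}/studies?{urlencode(params)}"


print(build_studies_url("paracetamol"))
```

Building the query string with urlencode keeps drug names containing spaces or special characters safely escaped.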
The API response is a complex JSON structure. Relevant data points for side effect extraction are located within:
- studies section
- protocol section
- results field (indicating if results are available)
- serious_events within the adverse_events module in the study_results section. This contains the side effect term and the number affected/at risk.
3. Proof of Concept (API Interaction)
A test_api.py script was created to demonstrate basic API interaction. This script:
- Imports the requests library.
- Defines the base API URL.
- Constructs a dictionary of API parameters.
- Sends a GET request to the /studies endpoint.
- Parses the JSON response.
- Iterates through the studies, checking for results and serious events.
- Extracts side effect terms and calculates probabilities (number affected / number at risk).
- Aggregates side effects, counting occurrences to avoid redundancy.
- Filters results to only include side effects with a probability greater than 0.01 (1%).
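The extraction, aggregation, and filtering steps above can be sketched as follows. This is an illustration under stated assumptions, not the video's actual code: the dictionary keys (results, study_results, adverse_events, serious_events, term, num_affected, num_at_risk) are hypothetical stand-ins for the real response fields, and pooling the counts across studies before dividing is one possible way to aggregate. The function works on an already parsed response, so no network call is needed to try it.

```python
from collections import defaultdict


def aggregate_side_effects(response: dict, threshold: float = 0.01) -> dict:
    """Pool serious events per term and keep those above the threshold."""
    totals = defaultdict(lambda: {"affected": 0, "at_risk": 0, "reports": 0})
    for study in response.get("studies", []):
        # Skip studies that have no posted results (key name is an assumption).
        if not study.get("results"):
            continue
        events = (
            study.get("study_results", {})
            .get("adverse_events", {})
            .get("serious_events", [])
        )
        for event in events:
            at_risk = event.get("num_at_risk", 0)
            if at_risk <= 0:
                continue  # avoid division by zero later
            entry = totals[event["term"]]
            entry["affected"] += event.get("num_affected", 0)
            entry["at_risk"] += at_risk
            entry["reports"] += 1

    filtered = {}
    for term, entry in totals.items():
        probability = entry["affected"] / entry["at_risk"]
        if probability > threshold:
            filtered[term] = {
                "probability": probability,
                "reports": entry["reports"],
            }
    return filtered
```

Counting the number of reports per term alongside the pooled probability means a term that appears in several studies is listed once rather than repeated.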
4. MCP Server Implementation with Arcade.dev
Arcade.dev is used to create and deploy the MCP server. The process involves:
- Using arcade new side_effects_mcp to generate a project template.
- Installing the requests package within the project's virtual environment using uv add requests.
- Defining a tool named get_side_effects_for_drug with a drug_name parameter (annotated as a string).
- Copying the API interaction code (from test_api.py) into the tool function.
- Adding a docstring to the tool function describing its purpose.
- Deploying the server using arcade deploy --e source side_effects_mcp/server.py. This requires authentication with Arcade.dev.
- Accessing the deployed server and its tools through the Arcade.dev dashboard.
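The shape of the tool function described above might look roughly like the sketch below. It deliberately omits the Arcade-specific registration (the generated project template supplies that), and the helper fetch_side_effects is a hypothetical placeholder returning made-up values where the request code from test_api.py would go. Only the function name, the annotated drug_name parameter, and the docstring come from the description above.

```python
from typing import Annotated


def fetch_side_effects(drug_name: str) -> list[str]:
    """Hypothetical placeholder for the request code copied from test_api.py."""
    # Illustrative canned values only, not real clinical data.
    canned = {"example_drug": ["example effect A", "example effect B"]}
    return canned.get(drug_name.strip().lower(), [])


def get_side_effects_for_drug(
    drug_name: Annotated[str, "Name of the drug to look up"],
) -> list[str]:
    """Return recently reported serious side effects for the given drug."""
    return fetch_side_effects(drug_name)
```

The annotation and docstring matter because they are what an agent sees when deciding whether and how to call the tool.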
5. Flask Application Development
A Flask application (agent_app) is built to provide a user interface. Key components:
- Database Model (models.py): Defines two SQLAlchemy models, Drug and SideEffectReport, establishing a one-to-many relationship between drugs and their reported side effects.
- Tool Definitions (agent.py): Defines four local tools: list_drugs, create_drug, create_side_effect, and list_side_effects.
- MCP Tool Integration (agent.py): A function get_mcp_tools establishes a connection to the deployed Arcade MCP server using an HTTP transport and API key authentication.
- Agent Creation (agent.py): Uses langchain.agents.create_agent to create an agent powered by GPT-4.1, utilizing both the local tools and the MCP tools.
- Flask Routes (main.py):
  - /: Renders an HTML template (index.html) displaying a list of drugs.
  - /query: Handles POST requests with a drug name, invokes the LangChain agent, and returns the results as JSON.
- Configuration (main.py): Sets up the Flask application, initializes the database, and loads API keys from an environment file (.env).
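To make the one-to-many relationship and the four local tools concrete, here is a dependency-free sketch. It swaps SQLAlchemy for the standard library's sqlite3 module and uses plain functions instead of LangChain tools, so it shows the data shape rather than the project's actual code. The table and column names (side_effect, probability) are assumptions.

```python
import sqlite3

# In-memory database; the real project uses SQLAlchemy models instead.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE drug (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE side_effect_report (
    id INTEGER PRIMARY KEY,
    drug_id INTEGER NOT NULL REFERENCES drug(id),
    side_effect TEXT NOT NULL,
    probability REAL NOT NULL
);
""")


def list_drugs() -> list[str]:
    """Return all stored drug names in alphabetical order."""
    return [row[0] for row in conn.execute("SELECT name FROM drug ORDER BY name")]


def create_drug(name: str) -> int:
    """Insert a drug and return its row id."""
    cur = conn.execute("INSERT INTO drug (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


def create_side_effect(drug_name: str, side_effect: str, probability: float) -> int:
    """Attach a side effect report to an existing drug."""
    row = conn.execute("SELECT id FROM drug WHERE name = ?", (drug_name,)).fetchone()
    if row is None:
        raise ValueError(f"unknown drug: {drug_name}")
    cur = conn.execute(
        "INSERT INTO side_effect_report (drug_id, side_effect, probability) "
        "VALUES (?, ?, ?)",
        (row[0], side_effect, probability),
    )
    conn.commit()
    return cur.lastrowid


def list_side_effects(drug_name: str) -> list[tuple[str, float]]:
    """Return (side_effect, probability) pairs for one drug, oldest first."""
    rows = conn.execute(
        "SELECT r.side_effect, r.probability FROM side_effect_report r "
        "JOIN drug d ON d.id = r.drug_id WHERE d.name = ? ORDER BY r.id",
        (drug_name,),
    )
    return [(row[0], row[1]) for row in rows]
```

Each report row carries the id of exactly one drug, while a drug can have any number of reports, which is the one-to-many relationship the models express.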
6. Slack Integration
The Slack integration is facilitated through Arcade.dev's tool catalog. The agent uses the send_message tool to send notifications to a specific Slack channel ("all-neural9"). The system automatically sends a Slack message whenever new side effect information is discovered and added to the database.
7. System Workflow & Agent Logic
The agent's workflow is defined in the system prompt:
- List all drugs in the database.
- If the drug exists, list all side effects for that drug.
- Retrieve new side effects from the clinicaltrials.gov API.
- If new side effects are found that are not already in the database, create new SideEffectReport entries.
- Send a Slack message to the user informing them of the new information.
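In the project this comparison is left to the agent, guided by the system prompt. As a deterministic illustration of the same logic, the sketch below finds side effects not yet stored and builds a notification text only when something is new. Both function names and the message wording are hypothetical, and the case-insensitive matching is an assumption.

```python
from typing import Optional


def find_new_side_effects(known: list[str], fetched: list[str]) -> list[str]:
    """Return fetched terms that are not already known, ignoring case."""
    seen = {term.strip().lower() for term in known}
    new_terms = []
    for term in fetched:
        key = term.strip().lower()
        if key not in seen:
            seen.add(key)  # also deduplicates within the fetched list
            new_terms.append(term.strip())
    return new_terms


def build_slack_message(drug_name: str, new_terms: list[str]) -> Optional[str]:
    """Return a notification text, or None when there is nothing new."""
    if not new_terms:
        return None
    return f"New side effects reported for {drug_name}: {', '.join(new_terms)}"
```

Returning None when nothing changed mirrors the rule that a Slack message is sent only when new information was found.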
8. Data & Statistics
- The API returns data in JSON format, requiring parsing to extract relevant information.
- Side effect probabilities are calculated as number_affected / number_at_risk.
- A probability threshold of 0.01 (1%) is used to filter out less significant side effects.
- The database stores drug names and side effect reports, allowing for historical tracking of side effect data.
9. Notable Quotes
- “You are a helpful assistant for finding side effects of drugs.” – System prompt for the LangChain agent.
- “If you ever find new information that was not already in the DB, send a Slack message that informs the user about the update.” – System prompt emphasizing proactive notification.
Conclusion:
This project demonstrates a comprehensive approach to building a side effects tracker using Python, Arcade.dev, Flask, LangChain, and the clinicaltrials.gov API. The architecture leverages MCP for secure and reliable tool integration, enabling an agent to retrieve data, process it, store it in a database, and proactively notify users of new information via Slack. The project highlights the power of combining these technologies to create a valuable application for medical professionals and patients. Further development could include more sophisticated data analysis, visualization, and user interface enhancements.