How I Build and Ship Custom AI Solutions for Clients
By Dave Ebbelaar
Key Concepts
- GenAI Launchpad: A proprietary, standardized framework/architecture used by Datalumina for all client projects, ensuring consistency in file structure, Docker configuration, and backend modules.
- Two-Week Sprint Model: A project management methodology where development is broken into two-week cycles of 10–20k EUR each to maintain agility and manage client expectations.
- Claude Code: An AI-assisted development tool used to accelerate coding, planning, and debugging.
- Human-in-the-Loop (HITL): A design pattern where AI handles the majority of tasks (e.g., 80%), while complex edge cases are escalated to human operators.
- Observability Stack: The combination of Langfuse (for LLM trace tracking and prompt optimization) and Sentry (for application error tracking and monitoring).
- Bare Metal Deployment: Using Hetzner VMs with Docker Compose and Caddy (reverse proxy) for cost-effective, reliable production hosting.
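The Human-in-the-Loop pattern listed above can be sketched as a confidence-based router. This is a minimal illustration under stated assumptions, not the implementation described in the video: the `Prediction` type, the `route` function, and the 0.8 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical model output: an answer plus a confidence score in [0, 1]."""
    item_id: str
    answer: str
    confidence: float


def route(predictions, threshold=0.8):
    """Split predictions into an automated queue and a human-review queue.

    Items at or above the (assumed) threshold are handled automatically;
    everything else is escalated to a human operator.
    """
    automated, escalated = [], []
    for p in predictions:
        if p.confidence >= threshold:
            automated.append(p)
        else:
            escalated.append(p)
    return automated, escalated


if __name__ == "__main__":
    batch = [
        Prediction("ticket-1", "refund approved", 0.93),
        Prediction("ticket-2", "unclear request", 0.41),
    ]
    auto, human = route(batch)
    print("automated:", [p.item_id for p in auto])
    print("escalated:", [p.item_id for p in human])
```

Where the confidence score comes from (model log-probabilities, a validator, a second LLM call) is a design decision the source does not specify.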
1. Discovery and Use Case Selection
The discovery phase is the most critical stage for project success. Dave Ebbelaar emphasizes:
- Problem vs. Solution: Clients often request specific AI tools without understanding the underlying problem. The goal is to identify high-impact, simple use cases rather than "moonshots."
- ROI Focus: Every project must have a clear Return on Investment. If a task can be automated with simple tools (e.g., n8n or Zapier), AI may not be necessary.
- Red Flags: Avoid clients who want AI just for the sake of "doing AI," those who lack clear success criteria, or those who expect 100% accuracy from day one.
- Common Use Cases: Document processing, content generation, customer support, internal knowledge assistants, and data extraction.
2. Scoping and Iterative Development
- The 80% Rule: Initial AI builds typically reach 70–80% accuracy. It is vital to educate clients that LLMs are not deterministic like traditional software and require iterative improvement to reach 90%+.
- Proof of Concept (PoC) vs. MVP:
  - PoC: Used when the technical feasibility is uncertain; results in a demo/report.
  - MVP: A tangible, deployable product that provides immediate value.
- Client Involvement: Clients must be informed that their domain expertise is required for feedback loops, which are essential for refining the AI’s performance.
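The feedback loop described above depends on measuring accuracy against examples labeled by the client's domain experts. The sketch below shows one simple way to do that; the function names `accuracy` and `failures` and the sample data are hypothetical, and real projects may need fuzzier matching than strict equality.

```python
def accuracy(predicted, expected):
    """Fraction of predictions that exactly match the expert-provided labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("need at least one labeled example")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def failures(inputs, predicted, expected):
    """Return (input, predicted, expected) triples for every wrong prediction."""
    return [(i, p, e) for i, p, e in zip(inputs, predicted, expected) if p != e]


if __name__ == "__main__":
    # Hypothetical evaluation set: document IDs, model outputs, expert labels.
    docs = ["doc1", "doc2", "doc3", "doc4"]
    model_out = ["invoice", "contract", "invoice", "receipt"]
    labels = ["invoice", "contract", "receipt", "receipt"]
    print(f"accuracy: {accuracy(model_out, labels):.0%}")
    print("to review with the client:", failures(docs, model_out, labels))
```

Re-running the same evaluation set after each prompt or pipeline change gives a concrete number to track from the initial 70–80% toward 90%+.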
3. Technical Architecture and Stack
Datalumina uses a highly standardized stack to enable rapid deployment:
- Backend: Python (FastAPI, Celery for task queuing, Redis for storage, PostgreSQL).
- Frontend: Next.js (integrated with Supabase for authentication).
- LLM Provider: Azure OpenAI (chosen primarily for data compliance and security requirements).
- Standardization: By using the "GenAI Launchpad," every project starts with the same folder structure and Docker configuration, allowing the team to switch between projects seamlessly.
4. Development Workflow and "Claude Code"
- Efficiency: The team uses AI-assisted coding to ship in two weeks what previously took two months.
- Workflow:
  - Start in Plan Mode before executing code.
  - Maintain CLAUDE.md files to provide context to the AI, as every session starts fresh.
  - Use "Superpowers" (a set of skills for large, long-running tasks) and the "skill creator" tool to automate repetitive development tasks.
- Team Structure: The company avoids scaling headcount, preferring to scale through AI-driven productivity and a small team of co-founders/subcontractors.
5. Deployment and Security
- Infrastructure: Deploying on bare-metal Hetzner VMs is preferred over managed cloud services (AWS/Azure) for cost-efficiency (10x cheaper).
- Security:
  - Block all IPs by default.
  - Use a VPN (e.g., NordLayer) to provide a static IP for whitelisting database access.
  - Use Caddy as a reverse proxy to handle HTTPS automatically.
  - Keep only ports 80 and 443 open.
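The "block all IPs by default, then whitelist a static VPN IP" rule above amounts to a default-deny allowlist check. The sketch below illustrates only that logic, using Python's standard `ipaddress` module; the function name `is_allowed` and the addresses (taken from documentation-reserved ranges) are hypothetical. In practice such a rule would typically be enforced by a firewall rather than in application code, and the source does not say which firewall is used.

```python
import ipaddress


def is_allowed(client_ip, allowlist):
    """Default-deny check: True only if client_ip falls inside an allowlisted network.

    allowlist is an iterable of CIDR strings, e.g. ["203.0.113.10/32"].
    Unparseable addresses are rejected rather than raising.
    """
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False
    return any(addr in ipaddress.ip_network(net) for net in allowlist)


if __name__ == "__main__":
    # Hypothetical static IP assigned by the team's VPN.
    vpn_allowlist = ["203.0.113.10/32"]
    for ip in ("203.0.113.10", "198.51.100.7"):
        print(ip, "->", "allow" if is_allowed(ip, vpn_allowlist) else "deny")
```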
6. Monitoring and Observability
- Langfuse: Tracks LLM traces, allowing the team to drill down into specific steps of a DAG (Directed Acyclic Graph) to see exactly where an AI output failed.
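To illustrate what per-step tracing of a DAG makes possible, the sketch below runs steps in dependency order and records one trace entry per step, stopping at the first failure. It uses only the standard library (`graphlib`, Python 3.9+) and does not use or imitate the Langfuse SDK; `run_dag`, the step names, and the trace format are hypothetical.

```python
from graphlib import TopologicalSorter


def run_dag(steps, deps, payload):
    """Run steps in dependency order and record one trace entry per step.

    steps: mapping of step name -> function(dict) -> dict
    deps:  mapping of step name -> set of step names it depends on
    Execution stops at the first failing step, so the last trace entry
    shows exactly where the pipeline broke.
    """
    trace = []
    for name in TopologicalSorter(deps).static_order():
        try:
            payload = steps[name](payload)
            trace.append({"step": name, "status": "ok", "output": payload})
        except Exception as exc:
            trace.append({"step": name, "status": "error", "error": repr(exc)})
            break
    return trace


if __name__ == "__main__":
    def extract(data):
        return {**data, "text": "Invoice total: 120 EUR"}

    def classify(data):
        # Simulated failure in the middle of the pipeline.
        raise ValueError("unknown document type")

    def respond(data):
        return {**data, "reply": "done"}

    steps = {"extract": extract, "classify": classify, "respond": respond}
    deps = {"extract": set(), "classify": {"extract"}, "respond": {"classify"}}
    for entry in run_dag(steps, deps, {}):
        print(entry)
```

A dedicated tool adds what this sketch lacks: persisted traces, prompt and token details, and a UI for drilling into a single step.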
- Sentry: Monitors application errors. When an error occurs, the team can copy the stack trace as Markdown, feed it into the AI coding agent, and receive an immediate fix.
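The "copy the stack trace as Markdown" step can be approximated locally with the standard `traceback` module. The sketch below is not Sentry's export feature or API; `exception_to_markdown` is a hypothetical helper that shows the kind of Markdown snippet one might paste into an AI coding agent.

```python
import traceback

FENCE = "`" * 3  # Markdown code fence, built here to keep this example readable


def exception_to_markdown(exc):
    """Render an exception and its stack trace as a Markdown snippet."""
    stack = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return (
        f"## {type(exc).__name__}: {exc}\n\n"
        f"{FENCE}text\n"
        f"{stack}"
        f"{FENCE}\n"
    )


if __name__ == "__main__":
    try:
        {"total": 120}["currency"]  # simulated application bug
    except KeyError as err:
        print(exception_to_markdown(err))
```

The heading carries the error type and message, and the fenced block preserves the trace's indentation so file paths and line numbers stay intact when pasted.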
7. Synthesis and Conclusion
The "long game" for an AI development company is to transition from a project-based freelancer to a long-term software partner. By focusing on recurring maintenance, predictable revenue, and standardized architectures, developers can avoid the "feast or famine" cycle of freelance work. The current "golden era" of AI allows for massive output, but the core value remains in the process, reliability, and ability to maintain systems rather than just the AI models themselves.
Key Takeaway: "Build the system once, make sure it can adapt to new models, and don't reinvent the wheel."