Read these if you want to build AI applications
By Thu Vu
AI Book Recommendations: A Deep Dive
Key Concepts: Large Language Models (LLMs), AI Engineering, Foundation Models, Transformers, Attention Mechanism, Pre-training, Fine-tuning, Retrieval Augmented Generation (RAG), LLMOps, AI Applications, AI Development Pipeline.
1. Build a Large Language Model (From Scratch) by Sebastian Raschka
- Main Topic: Building an LLM from the ground up to gain a deep understanding of its inner workings.
- Key Points:
- Provides a step-by-step guide to building an LLM similar to GPT-2.
- Emphasizes the importance of understanding through creation, echoing Richard Feynman's quote: "What I cannot create, I do not understand."
- Highlights real-world advantages of custom LLMs in specialized fields like finance or medicine, where they can outperform general-purpose models.
- Addresses data privacy concerns by allowing users to train models on their own data, avoiding third-party sharing.
- Covers essential concepts like Transformers, the attention mechanism, pre-training, and fine-tuning (a minimal attention sketch in PyTorch follows this section).
- Step-by-Step Process: The book guides the reader through the entire process of building an LLM, from understanding the underlying theory to implementing it in code.
- Technical Details:
- Uses Python and PyTorch, a deep learning library.
- Includes an appendix on PyTorch for beginners.
- Provides clear diagrams and illustrations to explain complex ideas.
- Resources: The book's code is available on a public GitHub repository.
- Target Audience: Machine learning researchers, AI researchers, and engineers who want a deep understanding of LLMs.
- Prerequisites: Basic to intermediate Python skills.
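As a taste of the kind of code the book walks through, here is a minimal sketch of single-head scaled dot-product self-attention in PyTorch. It is a generic illustration of the attention mechanism rather than code taken from the book; the class name and dimensions are arbitrary.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Minimal single-head scaled dot-product self-attention (illustrative only)."""
    def __init__(self, embed_dim: int):
        super().__init__()
        # Learned projections for queries, keys, and values
        self.W_q = nn.Linear(embed_dim, embed_dim, bias=False)
        self.W_k = nn.Linear(embed_dim, embed_dim, bias=False)
        self.W_v = nn.Linear(embed_dim, embed_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, embed_dim)
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        # Similarity of every token with every other token, scaled by sqrt(d)
        scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
        weights = torch.softmax(scores, dim=-1)
        # Each output token is a weighted mix of all value vectors
        return weights @ v

# Example: a batch of 2 sequences, 8 tokens each, 32-dimensional embeddings
x = torch.randn(2, 8, 32)
print(SelfAttention(32)(x).shape)  # torch.Size([2, 8, 32])
```

A full GPT-style model stacks many such attention layers (with multiple heads, causal masking, and feed-forward blocks), which is the structure the book builds up step by step.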
2. AI Engineering: Building Applications with Foundation Models by Chip Huyen
- Main Topic: AI Engineering as a new discipline focused on leveraging existing Foundation Models to build robust and reliable AI applications.
- Key Points:
- Emphasizes the shift from developing new AI models to building applications using existing Foundation Models.
- Highlights the rapid growth of open-source AI engineering frameworks.
- Provides a comprehensive guide to AI engineering, covering almost everything needed to bring Foundation Models into real-world applications.
- Discusses the importance of considering whether an AI solution is necessary before implementation, cautioning against "fear of missing out."
- Covers foundational knowledge on building AI applications using Foundation Models like LLMs and LMMs (Large Multimodal Models).
- Provides an overview of available Foundation Models and their use cases (coding, image/video production, writing, education, chatbots, workflow automation).
- Explains technical terms like unsupervised pre-training, supervised fine-tuning, and reinforcement learning from human feedback (RLHF).
- Discusses scaling laws and the relationship between model size, data, computing power, and cost.
- Covers techniques to adapt Foundation Models to specific use cases: prompt engineering, Retrieval Augmented Generation (RAG), AI agents, and fine-tuning (see the RAG sketch at the end of this section).
- Addresses the importance of data privacy and potential misuses.
- Provides frameworks and metrics to evaluate Foundation Models and AI systems.
- Arguments:
- The biggest opportunities in AI lie in building AI applications, not just developing Foundation Models.
- AI engineering is a distinct discipline from machine learning engineering.
- Companies should carefully consider whether an AI solution is necessary before implementing it.
- Technical Details:
- Explains the attention mechanism and Transformer architecture.
- Resources: The book comes with a GitHub repository containing chapter summaries and resources.
- Target Audience: Individuals working in companies that want to integrate AI into their operations.
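To make the adaptation techniques above concrete, here is a minimal sketch of the Retrieval Augmented Generation (RAG) pattern: embed the documents, retrieve the ones most similar to the query, and place them in the prompt sent to a Foundation Model. The embed function is a hypothetical placeholder (a real system would call an embedding model), and none of this is code from the book.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; in practice this would call an embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(128)

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query by cosine similarity."""
    q = embed(query)
    scores = []
    for doc in documents:
        d = embed(doc)
        scores.append(float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d))))
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Assemble a prompt that grounds the model's answer in retrieved context."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

docs = ["Our refund window is 30 days.", "Support is available 24/7.", "Shipping takes 3-5 days."]
print(build_rag_prompt("How long do refunds take?", docs))
```

In production, the ad-hoc embedding and similarity search would be replaced by an embedding model and a vector database, and the assembled prompt would be sent to an LLM API.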
3. LLM Engineer's Handbook: Master the Art of Engineering Large Language Models from Concept to Production by Paul Iusztin and Maxime Labonne
- Main Topic: A hands-on guide for engineers to build, optimize, and deploy LLM-based applications.
- Key Points:
- Focuses on the technical details of an LLM development pipeline.
- Covers data collection, data engineering, supervised fine-tuning, Retrieval Augmented Generation (RAG), pipeline evaluation, cloud deployment, and LLMOps.
- Uses a recurring example: building an "LLM Twin," a version of yourself projected into an LLM that captures your writing style, voice, and personality.
- Explains how to deploy the AI system professionally on the cloud, including setting up virtual machines, databases, and a CI/CD/CT (continuous integration, delivery, and training) pipeline.
- Introduces tools such as ZenML, Docker, Comet ML, Opik, MongoDB, Qdrant, and AWS SageMaker.
- Step-by-Step Process: The book guides the reader through the entire LLM development pipeline, from data collection to deployment.
- Technical Details:
- Focuses on tooling and architecture.
- Explains how to implement each part of the LLM development pipeline.
- Covers LLMOps, which extends MLOps with components like prompt monitoring, guardrails, and human-in-the-loop feedback (see the sketch at the end of this section).
- Case Study: Building an LLM Twin using social media posts and fine-tuning/RAG techniques.
- Target Audience: AI engineers and developers.
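As a flavor of the LLMOps components the handbook describes, below is a minimal sketch of a guardrail-plus-prompt-monitoring wrapper around a model call. The blocked-terms policy and the dummy model are hypothetical stand-ins for illustration; the book implements these concerns with dedicated tooling rather than hand-rolled code.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llmops")

BLOCKED_TERMS = {"password", "credit card"}  # hypothetical guardrail policy

def guarded_generate(prompt: str, generate: Callable[[str], str]) -> str:
    """Wrap a model call with a simple input guardrail and prompt/response logging."""
    # Guardrail: refuse prompts that request sensitive data
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        logger.warning("Blocked prompt: %r", prompt)
        return "Sorry, I can't help with that request."

    # Prompt monitoring: log every prompt/response pair for later review
    logger.info("Prompt: %r", prompt)
    response = generate(prompt)
    logger.info("Response: %r", response)
    return response

def fake_model(prompt: str) -> str:
    """Dummy model; a real deployment would call a hosted LLM endpoint here."""
    return f"(model answer to: {prompt})"

print(guarded_generate("Summarize my last three posts", fake_model))
print(guarded_generate("What's my credit card number?", fake_model))
```

Human-in-the-loop feedback would sit on top of this: the logged prompt/response pairs are reviewed and fed back into evaluation or fine-tuning data.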
Synthesis/Conclusion
The three books offer complementary perspectives on AI. Raschka's book provides a foundational understanding of LLMs by guiding readers through building one from scratch. Huyen's book focuses on AI engineering, emphasizing the importance of building robust applications using existing Foundation Models. Iusztin and Labonne's handbook offers a practical, hands-on guide for engineers to build, optimize, and deploy LLM-based applications. Together, these books provide a comprehensive overview of the current state of AI and the skills needed to succeed in this rapidly evolving field.