Small Bets, Big Impact: Building GenBI at a Fortune 100 – Asaf Bord, Northwestern Mutual

GenBI: Balancing Innovation and Stability at Northwestern Mutual

Key Concepts: GenBI (Generative AI + Business Intelligence), Data Democratization, Risk Aversion, Incremental Delivery, Metadata, RAG (Retrieval-Augmented Generation), LLM (Large Language Model), Semantic Layer, Sunk Cost Bias.

Introduction & The GenBI Concept

The presentation focuses on Northwestern Mutual’s journey into GenBI, a fusion of Generative AI and Business Intelligence. GenBI aims to let users answer business questions directly from data, without relying on BI teams to generate and interpret reports. This democratization of data access is the core driver behind the initiative. Asaf, the speaker, humorously notes that the presentation wasn’t created with GenAI, as an initial attempt using GPT-3 was completely disrupted by GPT-5, highlighting the rapid evolution and potential instability of these tools.

Northwestern Mutual: A Unique Landscape for GenAI

Northwestern Mutual possesses significant advantages for GenAI implementation: abundant data, substantial financial resources, and access to top-tier talent. However, the company’s inherent risk aversion, stemming from its long-term commitments to clients (life insurance policies spanning decades), presents a significant challenge. The company’s core principle is described as “generational responsibility” – essentially, “don’t mess up.” The central question addressed is how to balance this need for stability with the desire for innovation.

Four Core Challenges in Implementing GenBI

The team identified four key challenges in pursuing GenBI:

Novelty: No one had previously implemented GenBI in the same way, requiring a pioneering approach.
Real-World Data: A deliberate decision was made to use messy, real-world data (160 years of historical data) rather than synthesized or cleansed data. This was to ensure the solution’s effectiveness in a realistic production environment. The gap between proof-of-concept and production is significant, especially in GenAI.
Trust Building: Establishing trust in the system’s accuracy and reliability was crucial, both with end-users and with company leadership. The speaker acknowledges the well-documented concerns around accuracy and bias in GenAI.
Budget & Impact: Securing investment for a novel, unproven project within a risk-averse organization required demonstrating potential value and managing expectations.

The Value of Real Data & User Involvement

Using actual, messy data was prioritized to ensure the solution could handle the complexities of a real-world production environment. This approach also fostered collaboration with data professionals who possess deep subject matter expertise. These professionals provided valuable validation, real-world examples of questions, and served as a crucial evaluation resource. Importantly, involving business users in the research process from the outset created buy-in and a sense of ownership. Users actively requested production deployment once they saw the potential.

Building Trust: A Crawl, Walk, Run Approach

To build trust, a phased rollout was implemented:

Phase 1 (Crawl): Release to BI experts, individuals capable of independently verifying results and providing constructive feedback. This functioned like a “GitHub Copilot” for BI professionals, accelerating their existing workflows.
Phase 2 (Walk): Expand access to business managers – those familiar with BI outputs and capable of identifying inaccuracies. This group was expected to be more forgiving of errors and provide valuable feedback.
Phase 3 (Run): (Future) Potential release to executives, requiring a significantly higher level of accuracy and conciseness. The speaker acknowledges this phase may not be achievable.

A key architectural decision was to initially focus on delivering existing, verified reports and dashboards rather than attempting to generate SQL queries directly. The team found that 80% of BI team effort was spent directing users to the correct reports and assisting with their interpretation. This approach aligned expectations and built inherent trust by delivering familiar assets in a faster, more interactive way.

Incremental Delivery & Risk Mitigation

The team adopted a highly iterative, incremental delivery process to secure leadership buy-in and manage risk. Each six-week sprint resulted in a tangible deliverable, providing visibility into progress and allowing for course correction. This approach guarded against “sunk cost bias,” the tendency to continue funding a project simply because of prior investment. The ability to demonstrate value at each stage and the option to halt investment at any point were critical to securing ongoing support.

Architectural Overview & Workflow

The GenBI architecture comprises four key agents:

Metadata Agent: Understands the context of the business question by interacting with the data catalog and documentation.
RAG Agent: Retrieves relevant, certified reports and dashboards.
SQL Agent: Generates SQL queries to extract additional data when necessary, leveraging existing reports as a starting point.
BI Agent: Translates the data into a business-friendly answer.

The workflow is as follows: a business question is received, the metadata agent establishes context, the RAG agent searches for existing reports, the SQL agent generates queries if needed, and the BI agent delivers the final answer. A feedback loop ensures continuous improvement. Each of these agents can be independently packaged and deployed as a standalone product.
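The workflow above can be sketched as a small pipeline. This is a minimal illustration only, not Northwestern Mutual's implementation: every name here (Report, Answer, answer_question, the agent functions, the catalog layout) is hypothetical. Keyword matching stands in for the LLM and retrieval calls, the SQL step builds a trivial query rather than starting from existing reports, and the feedback loop is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    name: str
    description: str
    certified: bool = True


@dataclass
class Answer:
    text: str
    sources: list[str] = field(default_factory=list)
    used_sql: bool = False


def metadata_agent(question: str, catalog: dict[str, str]) -> set[str]:
    """Establish context: find catalog terms mentioned in the question.

    Stand-in for an LLM consulting the data catalog and documentation.
    """
    words = {w.strip("?.,").lower() for w in question.split()}
    return {term for term in catalog if term in words}


def rag_agent(context: set[str], reports: list[Report]) -> list[Report]:
    """Retrieve certified reports whose description mentions a context term."""
    return [
        r for r in reports
        if r.certified and any(t in r.description.lower() for t in context)
    ]


def sql_agent(context: set[str], catalog: dict[str, str]) -> str:
    """Fallback: propose a query against the first table mapped from the context."""
    tables = sorted(catalog[t] for t in context)
    return f"SELECT * FROM {tables[0]} LIMIT 100" if tables else ""


def bi_agent(reports: list[Report], sql: str) -> Answer:
    """Turn whatever was found into a business-facing answer."""
    if reports:
        names = [r.name for r in reports]
        return Answer(text="See certified report(s): " + ", ".join(names), sources=names)
    if sql:
        return Answer(text="No certified report matched; proposed query: " + sql, used_sql=True)
    return Answer(text="No matching report or data found.")


def answer_question(question: str, catalog: dict[str, str], reports: list[Report]) -> Answer:
    context = metadata_agent(question, catalog)
    found = rag_agent(context, reports)
    # SQL is generated only when no existing report answers the question.
    sql = "" if found else sql_agent(context, catalog)
    return bi_agent(found, sql)


# Hypothetical catalog (business term -> table) and report inventory.
catalog = {"premium": "policy_premiums", "claims": "claims_fact"}
reports = [Report("Monthly Premium Dashboard", "Premium revenue by month")]

from_report = answer_question("What was premium revenue last month?", catalog, reports)
from_sql = answer_question("How many claims were filed?", catalog, reports)
```

Keeping each agent as a separate function with plain inputs and outputs reflects the point that each one can be packaged and deployed independently. It also makes the report-first ordering explicit: the SQL path runs only when retrieval comes back empty.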

Quantifiable Results & Future Directions

The RAG agent automated approximately 80% of the work previously performed by two full-time BI team members, whose primary task was directing users to the correct reports.
Metadata enrichment, driven by LLM interactions, resulted in measurable improvements in LLM performance during A/B testing, validating the value of improved metadata.
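The talk does not describe how the A/B comparison was scored. One simple way to frame it, assuming each answer is graded correct or incorrect by a reviewer, is to compare accuracy on the same question set with and without the enriched metadata. The function names and the grades below are invented purely to exercise the code; they are not results from the talk.

```python
def accuracy(outcomes: list[bool]) -> float:
    """Fraction of graded answers marked correct."""
    if not outcomes:
        raise ValueError("need at least one graded answer")
    return sum(outcomes) / len(outcomes)


def compare_variants(baseline: list[bool], enriched: list[bool]) -> dict[str, float]:
    """Compare accuracy of the baseline arm against the enriched-metadata arm."""
    base = accuracy(baseline)
    enr = accuracy(enriched)
    return {"baseline": base, "enriched": enr, "lift": enr - base}


# Made-up grades for four questions per arm (True = answer judged correct).
result = compare_variants(
    baseline=[True, False, False, True],
    enriched=[True, True, False, True],
)
```

A real evaluation would need a much larger question set and a significance check before attributing any lift to the metadata change.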

Future steps include evaluating third-party GenBI tools (like Databricks Genie), further enriching the data catalog with metadata, and exploring new pricing models for SaaS products in the GenAI era. The speaker suggests a shift from seat-based pricing to usage-based or value-based pricing, reflecting the increased productivity enabled by GenAI.

Conclusion

Northwestern Mutual’s GenBI journey demonstrates a pragmatic approach to integrating GenAI into a risk-averse organization. By prioritizing real data, incremental delivery, trust-building, and quantifiable results, the team has successfully secured ongoing investment and laid the foundation for a future where data is more accessible and actionable for all. The emphasis on understanding the value delivered by GenAI, rather than simply the technology itself, is a key takeaway.
