From Physics to the Future: Brian Granger on Project Jupyter in the Age of AI

By The New Stack

Key Concepts

  • Project Jupyter: An open-source project focused on interactive computing, research, education, and knowledge sharing.
  • IPython Notebook: The precursor to the Jupyter Notebook, built on top of IPython, which began as an interactive shell.
  • Jupyter Notebook: A web-based interactive computing environment that allows users to create and share documents containing live code, equations, visualizations, and narrative text.
  • Kernel: The computational engine that executes code in a Jupyter Notebook.
  • Modular and Extensible Architecture: A design principle for Jupyter that emphasizes reusable and adaptable building blocks.
  • Community and Governance: The organizational and decision-making structures of an open-source project.
  • AI in Computing: The integration of artificial intelligence into software development and user workflows.
  • Technical Debt: The implied cost of additional rework caused by choosing an easy (limited) solution now instead of using a better approach that would take longer.
  • Altair: A statistical visualization library for Python.
  • Visualization Grammar: A system for describing and creating visualizations based on fundamental components.
  • Sustainability: The long-term viability of an open-source project, encompassing technical, community, and governance aspects.

Project Jupyter: Origins, Evolution, and Future

This summary details a conversation with Brian Granger, co-creator of Project Jupyter and Senior Principal Technologist at AWS, discussing the history, development, and future of Jupyter Notebooks.

Origins of Jupyter Notebooks

  • Early Influences: Brian Granger and Fernando Pérez, co-creators of Jupyter, were inspired by their experience with scientific computing tools like Mathematica during their graduate studies in physics.
  • IPython's Genesis (Late 1990s - Early 2000s): Fernando Pérez initially created IPython as an interactive shell, aiming to bring a more powerful and user-friendly command-line experience to Python.
  • The Vision for a Notebook (2004): Granger and Pérez discussed wanting a web-based notebook experience in Python, similar to what they had used in other environments, to support their research and teaching.
  • Development Timeline (2004-2011): The initial version of the IPython Notebook took approximately seven years to develop, reflecting a deliberate and careful engineering approach.

The Problem Jupyter Solved

  • Focus on Research and Education: The original intent of IPython Notebook (and subsequently Jupyter) was not for software engineering or application development, but rather for tackling complex problems requiring human reasoning and thinking, particularly in research and education.
  • Messy Problems: Jupyter was designed to handle "messy problems" that are common in scientific inquiry and learning.

Key Design Principles and Learnings

  • Modular and Extensible Architecture (What Went Right):
    • Inspired by the modularity of physics principles (e.g., Newton's laws), the team focused on creating a small number of reusable, extensible, and modular building blocks.
    • Key components include the notebook format, the kernel message protocol, and higher-level APIs for Jupyter Server and JupyterLab extensions.
    • This architecture allows for flexibility and the ability to solve a wide range of problems with a core set of tools.
  • Community and Governance (What Could Be Improved):
    • The team lacked experience in building and managing open-source communities and governance structures.
    • They had to revisit and adapt their governance model at different stages, realizing they should have addressed these challenges earlier.
    • The team was slow to recognize the transition from a hobbyist project with a small, familiar group to a large-scale project involving corporations and a wider community.
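
To make two of the building blocks named above more concrete, here is a minimal sketch in Python of a notebook document and a kernel message. The top-level keys follow the publicly documented version 4 notebook format and the Jupyter messaging protocol; the helper names (make_notebook, markdown_cell, code_cell, execute_request), the session and username values, and the protocol version string are illustrative assumptions, not part of any Jupyter library and not something shown in the conversation.

```python
import json
import uuid
from datetime import datetime, timezone


def markdown_cell(text):
    """A narrative cell: just text plus (empty) metadata."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(code):
    """A code cell that has not been executed yet (no outputs, no count)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }


def make_notebook(cells):
    """Wrap cells in the top-level structure of a version 4 notebook."""
    return {"nbformat": 4, "nbformat_minor": 4, "metadata": {}, "cells": cells}


def execute_request(code, session):
    """Build the dict form of a message asking a kernel to run some code."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": "example-user",  # hypothetical value
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",  # illustrative protocol version
    }
    content = {
        "code": code,
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    return {
        "header": header,
        "parent_header": {},
        "metadata": {},
        "content": content,
    }


notebook = make_notebook([
    markdown_cell("# A messy problem"),
    code_cell("total = sum(range(10))\ntotal"),
])

# A notebook on disk is this structure serialized as JSON.
notebook_json = json.dumps(notebook, indent=1)

# The same code, packaged as a request to a kernel.
message = execute_request(notebook["cells"][1]["source"], session="demo-session")
```

The point of the sketch is the separation: the document (a JSON file) and the execution engine (a kernel that answers messages) know nothing about each other beyond these small, stable structures, which is what lets different front ends and different language kernels be combined.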

The Current Era of Jupyter: AI Integration

  • The AI Challenge: The current era for Jupyter, and indeed for many in the tech landscape, is defined by the question of "what do we do with AI?"
  • Jupyter's Unique Role: Unlike traditional Integrated Development Environments (IDEs) focused on building, testing, and deploying applications, Jupyter's core purpose is to facilitate thinking, collaboration, and knowledge sharing.
  • Deliberate Approach to AI: The Jupyter community is intentionally taking a slow and thoughtful approach to integrating AI, focusing on how AI can enhance thinking, collaboration, and knowledge sharing.
  • Developing Software Building Blocks: While answers are still emerging, the community is developing software building blocks to test and iterate on these AI-related questions.

Challenges of Scale in Open Source

  • Technical Complexity: Jupyter is a large open-source project with over 100 GitHub repositories across multiple organizations, leading to a significant number of GitHub issues (thousands in some repos).
  • AI as a Catalyst for Development: The developers building Jupyter, who are largely software engineers, are finding AI to be a powerful tool for accelerating their work on the project.
  • Impact on Resourcing and Prioritization: AI is changing calculations regarding resourcing, prioritization, and technical debt.
  • Example: Jupyter Server Reimplementation: Brian Granger demonstrated how an AI coding agent could extract an OpenAPI specification from a part of Jupyter Server, reimplement it in Go, and generate a test suite with 70% coverage in about half an hour. This was previously considered an "insane" undertaking.
  • Rethinking Technical Debt and Sunk Costs: AI is making previously dismissed options, like rewriting core components, viable, fundamentally altering how technical debt and sunk costs are perceived.
  • AI Slop vs. Productive Contributions: While some projects receive a lot of AI-generated "slop" in pull requests, Jupyter, being past the initial hype curve and on a "plateau of productivity," has not experienced this issue significantly. This also means it might not be a primary target for new contributors.

Altair: A Statistical Visualization Library

  • Original Vision: Brian Granger, along with Jake VanderPlas, co-created Altair with the goal of addressing the growing importance of tabular data visualization.
  • Inspiration from Visualization Grammar: They were drawn to the visualization grammar developed by Jeff Heer's group at the University of Washington, which offered a more modular and expressive approach than fixed, "pre-composed sentence" functions for specific plot types.
  • Modular and Flexible Grammar: Altair was designed around a modular and flexible grammar, enabling the expression of various visualizations using a small set of building blocks.
  • Current Status: Granger notes he hasn't actively worked on Altair in years, with Jake VanderPlas and others now leading its development.
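
The idea of a visualization grammar can be sketched without any plotting library. The example below is a rough illustration, assuming a Vega-Lite-style specification in which a chart is described by its data, a mark, and encodings that map data fields to visual channels. The helper functions chart and encode and the sample rows are hypothetical and are not Altair's API; they only show how two different plots fall out of the same small set of building blocks.

```python
import json


def encode(field, field_type, **extra):
    """Map a data field to a visual channel, with optional extras
    such as an aggregate."""
    spec = {"field": field, "type": field_type}
    spec.update(extra)
    return spec


def chart(rows, mark, **channels):
    """Compose a chart description from data, a mark, and encodings."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": mark,
        "encoding": channels,
    }


# Made-up sample rows, for illustration only.
rows = [
    {"origin": "A", "horsepower": 130, "mpg": 18.0},
    {"origin": "A", "horsepower": 165, "mpg": 15.0},
    {"origin": "B", "horsepower": 95, "mpg": 24.0},
]

# A scatter plot: point marks, two quantitative axes.
scatter = chart(
    rows,
    "point",
    x=encode("horsepower", "quantitative"),
    y=encode("mpg", "quantitative"),
)

# A bar chart of mean mpg per origin: same building blocks,
# different mark and encodings.
bars = chart(
    rows,
    "bar",
    x=encode("origin", "nominal"),
    y=encode("mpg", "quantitative", aggregate="mean"),
)

scatter_json = json.dumps(scatter, indent=2)
```

Neither plot needed a dedicated "scatter" or "bar chart" function; changing the mark and the encodings is enough, which is the sense in which a grammar is more expressive than a fixed menu of plot types.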

Recognition and Future of Jupyter

  • ACM Software System Award: Jupyter received the ACM Software System Award, an honor also given to systems like Unix and the World Wide Web, which reflects the breadth of its impact.
  • Responsibility and Sustainability: This award highlights the responsibility that comes with Jupyter being a critical dependency for entire industries (academic research, education, corporations).
    • Sustainability: A key concern is the long-term technical and governance sustainability of the project, ensuring that notebooks created today will run a decade from now.
  • Competition and Uniqueness:
    • While the landscape has become more competitive with the rise of traditional IDEs and generative AI, Jupyter remains unique due to its focus on thinking, collaboration, and knowledge sharing, rather than pure software engineering.
    • The community is motivated to avoid a "slow death" of irrelevance by understanding user needs and building valuable tools.

Conclusion

Project Jupyter has evolved from a research and education tool into critical infrastructure for several industries. Its modular architecture has been key to its success, while community and governance have been areas of ongoing learning. AI is poised to significantly affect both how Jupyter is developed and how it supports thinking, collaboration, and knowledge sharing, with a strong emphasis on long-term sustainability and continued relevance in a competitive landscape.
