How to use KerasHub with Hugging Face

By Google for Developers

KerasHub: Mixing Model Architectures and Checkpoints Across Frameworks

Key Concepts:

  • Model Architecture: The blueprint or structure of a machine learning model, defined in code using frameworks like JAX, PyTorch, or Keras.
  • Model Weights (Checkpoints): Numerical parameters of a model tuned during training, representing the learned knowledge. A checkpoint is a saved snapshot of these weights.
  • KerasHub: A Python library that simplifies working with model architectures. It is built on Keras and supports the PyTorch, JAX, and TensorFlow backends.
  • Hugging Face Hub: A repository for sharing model checkpoints, often in the SafeTensors format.
  • SafeTensors: A secure and efficient format for storing and sharing model weights.
  • Backends: The underlying framework used to run the model (JAX, PyTorch, TensorFlow).

1. Introduction & Problem Statement

The video introduces KerasHub as a response to a practical problem in the AI landscape: pre-trained model weights are hard to reuse across frameworks, because architecture and weights are traditionally tightly coupled to the framework that produced them. KerasHub decouples the two, so users can combine an architecture with weights from various sources regardless of the original framework. This makes the large collection of models on platforms like Hugging Face Hub usable with whichever backend the user prefers (JAX, PyTorch, or TensorFlow).

2. Defining Model Architecture vs. Model Weights

A clear distinction is made between model architecture and model weights. The architecture is likened to a blueprint: the code that defines the model’s structure and operations. Frameworks like JAX, PyTorch, and Keras are used to describe this structure. Model weights, conversely, are the numerical parameters learned during training, representing the model’s knowledge. These weights are often saved as “checkpoints,” snapshots of the model’s state at a specific point in training, typically when performance is best. Combining the architecture (code) with the weights (checkpoint) yields a functional model.
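The split between architecture and weights can be illustrated with a toy model in plain Python. This is only a sketch: the `TinyLinear` class and its JSON checkpoint format are invented for illustration and are unrelated to how Keras or KerasHub store models.

```python
import json


class TinyLinear:
    """Architecture: the code that defines how inputs become outputs."""

    def __init__(self, weights=None, bias=0.0):
        # Weights: the numbers that training would normally tune.
        self.weights = list(weights) if weights is not None else []
        self.bias = bias

    def forward(self, inputs):
        # Weighted sum plus bias: the "operations" part of the blueprint.
        return sum(w * x for w, x in zip(self.weights, inputs)) + self.bias

    def save_checkpoint(self):
        # A checkpoint is a serialized snapshot of the learned numbers.
        return json.dumps({"weights": self.weights, "bias": self.bias})

    @classmethod
    def from_checkpoint(cls, blob):
        # Architecture (this class) + weights (the blob) = a usable model.
        state = json.loads(blob)
        return cls(state["weights"], state["bias"])


trained = TinyLinear([2.0, -1.0], bias=0.5)
checkpoint = trained.save_checkpoint()
restored = TinyLinear.from_checkpoint(checkpoint)
prediction = restored.forward([3.0, 4.0])
```

The checkpoint string contains no code at all, only numbers, which is why the same weights can in principle be loaded by any implementation of the same architecture.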

3. KerasHub’s Role & Functionality

KerasHub is presented as a Python library designed to streamline work with model architectures. It provides access to many popular machine learning models, allowing users to load a desired architecture with minimal code. Crucially, KerasHub is built on Keras and therefore supports the three major deep learning frameworks as backends: PyTorch, JAX, and TensorFlow.

The library’s power lies in its ability to integrate with resources like the Hugging Face Hub, a large repository of community-shared model checkpoints, often stored in the SafeTensors format. KerasHub includes built-in converters that automatically load these Hugging Face checkpoints into a Keras model, irrespective of the framework originally used to create them. For example, a checkpoint trained in PyTorch can be loaded into a KerasHub model running on JAX or TensorFlow.
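Conceptually, a converter of this kind has to map each tensor stored in the checkpoint onto the matching variable in the target architecture. The sketch below shows only that renaming idea. The tensor names and the `convert_checkpoint` helper are hypothetical and are not KerasHub's actual converter code, which would also need to handle details such as reshaping or transposing tensors.

```python
# Hypothetical mapping from checkpoint tensor names to target variable names.
HYPOTHETICAL_NAME_MAP = {
    "model.embed_tokens.weight": "token_embedding/embeddings",
    "model.norm.weight": "final_norm/scale",
}


def convert_checkpoint(source_tensors, name_map):
    """Return a new dict keyed by the target architecture's variable names."""
    converted = {}
    for source_name, tensor in source_tensors.items():
        if source_name not in name_map:
            # Fail loudly instead of silently dropping a weight.
            raise KeyError(f"No mapping for checkpoint tensor {source_name!r}")
        converted[name_map[source_name]] = tensor
    return converted


# Plain lists stand in for real tensors in this illustration.
source = {
    "model.embed_tokens.weight": [[0.1, 0.2], [0.3, 0.4]],
    "model.norm.weight": [1.0, 1.0],
}
converted = convert_checkpoint(source, HYPOTHETICAL_NAME_MAP)
```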

4. Practical Demonstration: Loading Models from Hugging Face Hub

The video demonstrates the process with three examples:

  • Mistral (Cybersecurity Focused): A fine-tuned Mistral model named "lily" (from Hugging Face Hub) was loaded using the MistralCausalLM class from KerasHub. The from_preset method was used, specifying the Hugging Face model path with the prefix "hf://".
  • Llama 3 (General Purpose): A fine-tuned Llama 3 checkpoint named "xverify" was loaded using the Llama3CausalLM class, again using the from_preset method and the "hf://" prefix.
  • Gemma (Multilingual Translation): A fine-tuned Gemma model, "ERA X translator," was loaded using the Gemma3CausalLM class and the same loading procedure.

Each model was loaded and run with only two lines of code, showcasing the simplicity of the process.
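A minimal sketch of what such a load-and-run call might look like follows. It assumes the `keras_hub` package is installed and that the class names match the three demos. The `load_and_generate` helper and the `DEMO_CLASSES` table are hypothetical conveniences, not part of KerasHub, and the exact repository paths used in the video are not reproduced here.

```python
# KerasHub model classes matching the three demos described above.
DEMO_CLASSES = {
    "Mistral": "MistralCausalLM",
    "Llama 3": "Llama3CausalLM",
    "Gemma": "Gemma3CausalLM",
}


def load_and_generate(class_name, preset, prompt, max_length=64):
    """Load a causal LM from a preset path and generate text from a prompt."""
    # Imported lazily so this sketch can be defined without KerasHub installed.
    import keras_hub

    model_class = getattr(keras_hub.models, class_name)
    # The "two lines" from the video: load the checkpoint, then run it.
    model = model_class.from_preset(preset)
    return model.generate(prompt, max_length=max_length)
```

A call would look like `load_and_generate(DEMO_CLASSES["Mistral"], "hf://<user>/<repo>", "some prompt")`, where `<user>/<repo>` is a placeholder for a real Hugging Face repository. Running it downloads the full checkpoint, which for models of this size is typically several gigabytes.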

5. Step-by-Step Process for Loading Checkpoints

The demonstrated process can be summarized as follows:

  1. Select Backend: Specify the desired backend (JAX, PyTorch, or TensorFlow) before Keras is imported, for example through the KERAS_BACKEND environment variable.
  2. Choose Model Class: Identify the KerasHub class corresponding to the desired model architecture (e.g., MistralCausalLM, Llama3CausalLM, Gemma3CausalLM).
  3. Load from Preset: Use the from_preset method of the chosen class.
  4. Specify Hugging Face Path: Provide the Hugging Face model path, prefixed with "hf://".
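Steps 1 and 4 can be sketched with two small helpers. The `KERAS_BACKEND` environment variable and its values `jax`, `torch`, and `tensorflow` are standard Keras configuration. The helper functions `select_backend` and `hf_preset` are hypothetical names introduced only for this example.

```python
import os

# Backend identifiers as Keras spells them (note "torch", not "pytorch").
SUPPORTED_BACKENDS = ("jax", "torch", "tensorflow")


def select_backend(name):
    """Step 1: choose the backend. Must run before keras or keras_hub is imported."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}")
    os.environ["KERAS_BACKEND"] = name
    return name


def hf_preset(repo_id):
    """Step 4: turn a Hugging Face repository id into an 'hf://' preset path."""
    return repo_id if repo_id.startswith("hf://") else "hf://" + repo_id


backend = select_backend("jax")
# "some-user/some-model" is a placeholder, not a real repository.
preset = hf_preset("some-user/some-model")
```

Steps 2 and 3 then amount to calling `from_preset(preset)` on the chosen KerasHub class. Setting the variable after Keras has already been imported has no effect on the running process, so the backend choice belongs at the very top of a script or notebook.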

6. Key Arguments & Perspectives

The central argument is that KerasHub makes machine learning model development more flexible and efficient. By decoupling architecture and weights, it lets users:

  • Leverage Community Resources: Utilize the extensive collection of fine-tuned models on Hugging Face Hub.
  • Maintain Framework Control: Choose the preferred backend framework (JAX, PyTorch, TensorFlow) without being constrained by the original framework of the checkpoint.
  • Accelerate Experimentation: Quickly experiment with different models and checkpoints with minimal code.

This perspective is supported by the live demonstration, showcasing the ease with which different models were loaded and run.

7. Notable Quotes

  • “KerasHub gives you incredible flexibility. It separates the model's architecture from its weights, allowing you to bridge that gap between frameworks and checkpoint sources.” – Yufeng Guo
  • “They say three is a lucky number. So, we'll do one more for good measure.” – Yufeng Guo (demonstrating the ease of loading multiple models)

8. Technical Vocabulary

  • JAX, PyTorch, TensorFlow: Popular deep learning frameworks.
  • Causal LM: Causal Language Model – a type of language model that predicts the next token in a sequence.
  • SafeTensors: A secure and efficient format for storing and sharing model weights.
  • Backend: The underlying framework used to run the model.
  • Checkpoint: A snapshot of a model's weights at a specific point in time.
  • Preset: A pre-configured bundle of settings and weights for loading a model, identified by a name or a path such as one prefixed with "hf://".
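To make the "Causal LM" entry concrete, the sketch below generates text one token at a time, always appending the most likely next token. It is a toy: the `BIGRAM_COUNTS` table is invented, and it looks only at the last token, whereas a real causal language model conditions on all preceding tokens.

```python
# Invented counts of which token follows which; stands in for a trained model.
BIGRAM_COUNTS = {
    "the": {"model": 3, "weights": 1},
    "model": {"loads": 2, "runs": 1},
    "loads": {"weights": 4},
}


def predict_next(token):
    """Return the most frequent follower of `token`, or None if unknown."""
    candidates = BIGRAM_COUNTS.get(token)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def generate(prompt_tokens, max_new_tokens):
    """Greedy left-to-right generation: each new token depends only on the past."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens[-1])
        if next_token is None:
            break
        tokens.append(next_token)
    return tokens


sentence = generate(["the"], max_new_tokens=5)
```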

9. Logical Connections

The video progresses from identifying a problem (framework limitations) to presenting a solution (KerasHub). It first establishes the foundational concepts of model architecture and weights, then explains how KerasHub bridges the gap between them. The practical demonstrations reinforce the benefits of this approach, showing its ease of use and flexibility.

10. Synthesis & Conclusion

KerasHub addresses a real source of friction in modern machine learning. By decoupling model architecture from weights and integrating with resources like Hugging Face Hub, it lets developers experiment more freely, build on community contributions, and keep control over their preferred framework. The key takeaway is that a checkpoint's original framework no longer dictates the backend it runs on: as the demonstrations show, a Hugging Face checkpoint can be loaded and run on JAX, PyTorch, or TensorFlow in about two lines of code.
