
Open Source Friday with Prachi Sethi and Open Mind

By GitHub

Open Source Robotics AI Cognition Robot Software Development


Key Concepts

OM1: An open-source, hardware-agnostic software platform designed to provide "cognition" to robots.
Cognitive Layer: The integration of Large Language Models (LLMs) into robotics to enable natural language interaction, reasoning, and autonomous decision-making.
Hardware Agnostic: Software designed to run across various robot form factors (e.g., Unitree Go2, G1, TurtleBot, Limxron) and hardware configurations.
System Prompting: A configuration method used to define a robot's personality, behavior, and operational constraints.
Teleoperation: Remote control of a robot, often facilitated through cloud-based simulators.
SLAM (Simultaneous Localization and Mapping): A process in which a robot builds a map of its environment while simultaneously tracking its own position within that map.
Plugin Architecture: A modular system allowing developers to add support for new hardware, sensors, and actions via specific interfaces.

1. Overview of OM1

OM1 is an open-source project that acts as a "connector" for various robotic form factors. It bridges the gap between hardware and AI by passing sensor inputs (camera, microphone, LiDAR) to an LLM, which then triggers specific robotic actions. The platform is designed to be accessible: it can run on hardware as simple as a Raspberry Pi, or on a local machine running macOS or Ubuntu.

2. Architecture and Workflow

The OM1 architecture follows a modular flow:

Sensor Layer: Collects data from hardware (video, audio, spatial sensors).
Processing Layer: Inputs are sent to an LLM (e.g., Gemini, GPT, or Llama, or a model reached through OpenRouter).
Cognitive Layer: The LLM interprets the user's intent based on a System Prompt (which defines the robot's persona) and the current sensor data.
Action Layer: The LLM maps the intent to predefined actions (e.g., "walk forward") via the Hardware Abstraction Layer (HAL).
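The four layers above can be sketched as a single processing cycle. This is a minimal illustration, not OM1's actual code: SensorReading, stub_llm, HardwareAbstractionLayer, and run_cycle are hypothetical names, and the LLM is replaced by a keyword-matching stub so the example runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SensorReading:
    source: str  # e.g. "microphone", "camera", "lidar"
    text: str    # sensor data already summarized as text for the LLM


# Hypothetical system prompt defining the robot's persona.
SYSTEM_PROMPT = "You are a friendly robot dog. Reply with exactly one action name."


def build_prompt(system_prompt: str, readings: List[SensorReading]) -> str:
    """Combine the persona and the current sensor data into one prompt."""
    lines = [system_prompt, "", "Current sensor data:"]
    lines += [f"- {r.source}: {r.text}" for r in readings]
    return "\n".join(lines)


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; it only does keyword matching."""
    if "come here" in prompt.lower():
        return "walk forward"
    return "stand still"


class HardwareAbstractionLayer:
    """Maps predefined action names to (here, simulated) hardware calls."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self._actions: Dict[str, Callable[[], None]] = {
            "walk forward": lambda: self.log.append("walk forward"),
            "stand still": lambda: self.log.append("stand still"),
        }

    def execute(self, action: str) -> bool:
        handler = self._actions.get(action)
        if handler is None:
            return False  # the LLM asked for something that is not predefined
        handler()
        return True


def run_cycle(
    readings: List[SensorReading],
    llm: Callable[[str], str],
    hal: HardwareAbstractionLayer,
) -> str:
    """One pass: sensors -> prompt -> LLM -> predefined action."""
    prompt = build_prompt(SYSTEM_PROMPT, readings)
    action = llm(prompt).strip().lower()
    if not hal.execute(action):
        action = "stand still"  # fall back to a safe default
        hal.execute(action)
    return action


if __name__ == "__main__":
    hal = HardwareAbstractionLayer()
    heard = [SensorReading("microphone", "User said: come here")]
    print(run_cycle(heard, stub_llm, hal))
```

Restricting the LLM's output to a fixed set of action names, with a safe fallback for anything unrecognized, is one way to keep free-form model output from reaching the hardware directly.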

3. Step-by-Step: Getting Started

Prerequisites: Install uv (package manager), portaudio, and ffmpeg.
Configuration:
- Clone the repository.
- Generate an API key via the OM1 portal.
- Update the configuration file (e.g., conversation.json or spot.json) with the API key and desired LLM settings.
- Define the history_length to control how much context the robot retains.
Execution: Run uv run src/run.py followed by the name of the configuration to load (e.g., spot). uv automatically installs the necessary dependencies in a virtual environment, and the conversation agent then initializes.
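To illustrate what history_length controls, here is a minimal sketch of a bounded conversation history. Only history_length comes from the summary above; the api_key field name, the placeholder value, and the ConversationHistory class are assumptions for illustration and may not match OM1's real configuration schema.

```python
import json
from collections import deque
from typing import Deque, List, Tuple

# Hypothetical configuration text; real OM1 config files may use other keys.
CONFIG_TEXT = '{"api_key": "YOUR_API_KEY", "history_length": 2}'


def load_config(text: str) -> dict:
    """Parse the config and check that history_length is a usable value."""
    config = json.loads(text)
    length = config.get("history_length")
    if not isinstance(length, int) or length < 0:
        raise ValueError("history_length must be a non-negative integer")
    return config


class ConversationHistory:
    """Keeps only the most recent history_length exchanges as LLM context."""

    def __init__(self, history_length: int) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=history_length)

    def add(self, user_text: str, robot_text: str) -> None:
        self._turns.append((user_text, robot_text))

    def as_context(self) -> List[Tuple[str, str]]:
        return list(self._turns)


if __name__ == "__main__":
    config = load_config(CONFIG_TEXT)
    history = ConversationHistory(config["history_length"])
    history.add("Hi", "Hello!")
    history.add("What is your name?", "I am Spot.")
    history.add("Sit down", "Sitting.")
    print(history.as_context())  # the first exchange has been dropped
```

A larger history_length gives the LLM more context per request at the cost of longer prompts.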

4. Simulation and Testing

For users without physical robots, OM1 provides a Cloud Simulator.

Functionality: Users can test full autonomy, including SLAM map generation and navigation.
Process: Users access the cloud teleoperation interface, clone the repo within the cloud environment, and use the same API key to interact with a virtual robot.
Advanced Features: The platform supports autonomous charging, where the robot monitors its battery levels and returns to a docking station when necessary.
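The autonomous charging behavior can be pictured as a small state machine. The sketch below is an assumption about how such logic might look, not OM1's implementation: the Mode names, the 20% and 90% thresholds, and next_mode are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    WORKING = "working"
    RETURNING_TO_DOCK = "returning_to_dock"
    CHARGING = "charging"


# Hypothetical thresholds, chosen only for illustration.
LOW_BATTERY_PERCENT = 20.0
RESUME_PERCENT = 90.0


def next_mode(mode: Mode, battery_percent: float, at_dock: bool) -> Mode:
    """Return the robot's next mode given its battery level and position."""
    if mode is Mode.WORKING and battery_percent <= LOW_BATTERY_PERCENT:
        return Mode.RETURNING_TO_DOCK
    if mode is Mode.RETURNING_TO_DOCK and at_dock:
        return Mode.CHARGING
    if mode is Mode.CHARGING and battery_percent >= RESUME_PERCENT:
        return Mode.WORKING
    return mode  # no transition applies


if __name__ == "__main__":
    mode = Mode.WORKING
    for battery, docked in [(55.0, False), (18.0, False), (15.0, True), (95.0, True)]:
        mode = next_mode(mode, battery, docked)
        print(battery, docked, mode.value)
```

Using separate low and resume thresholds keeps the robot from oscillating between working and docking when the battery level hovers around a single cutoff.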

5. Contributing to OM1

The project encourages community contributions, particularly in two areas:

Hardware Support: Creating "connectors" for new robot form factors by mapping the hardware's native SDK/API to OM1’s action interface.
Sensor Integration: Adding support for new sensors (e.g., smoke detectors, humidity sensors) by creating input plugins.
Contribution Framework:
- Actions: Define an interface and connector to map hardware-specific functions to OM1 keywords.
- Inputs: Create a provider/input plugin to handle specific data formats (e.g., MJPEG for cameras).
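As a rough sketch of the two contribution types, the example below pairs an action connector with an input plugin. Every class and method name here (ActionConnector, FakeVendorSDK, SensorInput, HumidityInput, and so on) is hypothetical; OM1's real base classes and method signatures should be taken from the repository.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple


class FakeVendorSDK:
    """Stand-in for a robot's native SDK; it only records the calls it gets."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []

    def move(self, vx: float, vy: float) -> None:
        self.calls.append(("move", vx, vy))

    def sit(self) -> None:
        self.calls.append(("sit",))


class ActionConnector(ABC):
    """Hypothetical interface: turn an action keyword into hardware calls."""

    @abstractmethod
    def connect(self, keyword: str) -> bool:
        ...


class FakeRobotConnector(ActionConnector):
    def __init__(self, sdk: FakeVendorSDK) -> None:
        # Map platform-level keywords to vendor-specific function calls.
        self._dispatch: Dict[str, Callable[[], None]] = {
            "walk forward": lambda: sdk.move(0.5, 0.0),
            "sit": sdk.sit,
        }

    def connect(self, keyword: str) -> bool:
        handler = self._dispatch.get(keyword)
        if handler is None:
            return False  # keyword not supported on this hardware
        handler()
        return True


class SensorInput(ABC):
    """Hypothetical interface: turn raw sensor data into text for the LLM."""

    @abstractmethod
    def read(self) -> str:
        ...


class HumidityInput(SensorInput):
    def __init__(self, read_raw: Callable[[], float]) -> None:
        self._read_raw = read_raw  # function returning relative humidity in %

    def read(self) -> str:
        value = self._read_raw()
        return f"Humidity is {value:.0f}%."


if __name__ == "__main__":
    sdk = FakeVendorSDK()
    connector = FakeRobotConnector(sdk)
    connector.connect("walk forward")
    print(sdk.calls)
    print(HumidityInput(lambda: 61.4).read())
```

Keeping the vendor SDK behind a connector means the rest of the platform only ever deals in keywords and text, which is what makes adding a new robot or sensor a self-contained contribution.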

6. Key Takeaways

Accessibility: By supporting Raspberry Pi and cloud simulation, OM1 lowers the barrier to entry for software developers interested in robotics.
Modularity: The plugin-based architecture ensures that the platform can scale to support virtually any robot, provided a connector is built.
Community Growth: The project is actively seeking first-time contributors to help expand hardware compatibility and sensor support, and plans to open GitHub issues labeled "good first issue" for newcomers.
Synthesis: OM1 moves robotics away from rigid, pre-programmed behaviors and toward flexible, LLM-driven cognitive agents that interact with their environment in real time.
