GitHub - OpenMind/OM1: Modular AI runtime for robots

Technical Paper | Documentation | X

OpenMind's OM1 is a modular AI runtime that lets developers create and deploy multimodal AI agents across digital environments and physical robots, including humanoids, phone apps, quadrupeds, educational robots such as TurtleBot 4, and simulators such as Gazebo and Isaac Sim. OM1 agents can process diverse inputs such as web data, social media, camera feeds, and LIDAR, and can perform physical actions including motion, autonomous navigation, and natural conversation. The goal of OM1 is to make it easy to create highly capable, human-focused robots that are easy to upgrade and (re)configure for different physical form factors.

Capabilities of OM1

Modular Architecture: Designed with Python for simplicity and seamless integration.
Data Input: Easily handles new data and sensors.
Hardware Support via Plugins: Supports new hardware through plugins for API endpoints and for specific robot hardware connections to ROS2, Zenoh, and CycloneDDS. (We recommend Zenoh for all new development.)
Web-Based Debugging Display: Monitor the system in action with WebSim (available at http://localhost:8000/) for easy visual debugging.
Pre-configured Endpoints: Supports Text-to-Speech, multiple LLMs from OpenAI, xAI, DeepSeek, Anthropic, Meta, Gemini, NearAI, Ollama (local), and multiple Visual Language Models (VLMs) with pre-configured endpoints for each service.

Architecture Overview

Getting Started

To get started with OM1, let's run the Spot agent. Spot uses your webcam to capture and label objects. These text captions are then sent to the LLM, which returns movement, speech and face action commands. These commands are displayed on WebSim along with basic timing and other debugging information.
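The data flow described above can be pictured as a small loop. The sketch below is not OM1's code: `label_frame`, `fake_llm`, `tick`, and `Commands` are hypothetical stand-ins for the camera labeler, the LLM call, and the returned command set. It only illustrates the order of the steps (caption first, then LLM, then commands).

```python
from dataclasses import dataclass


@dataclass
class Commands:
    """Hypothetical container for the three command types the LLM returns."""
    move: str
    speak: str
    face: str


def label_frame(objects: list[str]) -> str:
    """Stand-in for the webcam labeler: turn detected objects into a caption."""
    if not objects:
        return "You see nothing of interest."
    return "You see " + ", ".join(objects) + "."


def fake_llm(caption: str) -> Commands:
    """Stand-in for the LLM: map a text caption to action commands."""
    if "person" in caption:
        return Commands(move="wag tail", speak="Hello there!", face="smile")
    return Commands(move="stand still", speak="", face="think")


def tick(objects: list[str]) -> Commands:
    """One pass of the loop: label the frame, then ask the LLM what to do."""
    return fake_llm(label_frame(objects))


if __name__ == "__main__":
    print(tick(["person", "chair"]))
```

In the real agent the commands would be shown on WebSim rather than printed; the hard-coded rules here exist only so the sketch runs without a camera or an API key.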

Package Management and VENV

You will need the uv package manager.

Install Dependencies

For macOS

brew install portaudio ffmpeg

For Linux

sudo apt-get update
sudo apt-get install portaudio19-dev python3-dev ffmpeg

Clone the Repo

git clone https://github.com/OpenMind/OM1.git
cd OM1
git submodule update --init
uv venv

Obtain an OpenMind API Key

Obtain your API Key at OpenMind Portal.

Create your account on OpenMind Portal if you haven't yet.
Go to the dashboard and create a new API key.
Copy the generated API key.
Edit config/spot.json5 and replace the openmind_free placeholder with your API key. Or copy .env.example to .env (cp .env.example .env) and add your key to the new .env file.

Alternatively, you can set your API key in your shell startup file (.bashrc on Linux, .zshrc on macOS)

vi ~/.bashrc # for Linux
vi ~/.zshrc # for macOS

Add the following to the file

export OM_API_KEY="<your_api_key>"

Then reload the file

source ~/.bashrc # for Linux
source ~/.zshrc # for macOS
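To make the options above concrete, here is a minimal sketch of how a key could be resolved from either the config file or the `OM_API_KEY` environment variable. The helper `resolve_api_key` and its precedence rule (a real key in the config wins, otherwise fall back to the environment) are assumptions for illustration; OM1's actual lookup order may differ.

```python
import os

# Placeholder value that config/spot.json5 ships with, per the steps above.
PLACEHOLDER_KEY = "openmind_free"


def resolve_api_key(config_key: str, env=None) -> str:
    """Hypothetical helper: pick the API key to use.

    A non-placeholder key from the config file is used as-is. Otherwise the
    OM_API_KEY environment variable is consulted, and if that is unset the
    config value (possibly the placeholder) is returned unchanged.
    """
    env = os.environ if env is None else env
    if config_key and config_key != PLACEHOLDER_KEY:
        return config_key
    return env.get("OM_API_KEY", config_key)


if __name__ == "__main__":
    key = resolve_api_key(PLACEHOLDER_KEY)
    print("using placeholder key" if key == PLACEHOLDER_KEY else "using personal key")
```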

OMCU

OMCU is the computational unit for billing on OpenMind's platform. The free plan provides 50 OMCU renewed monthly.

Upgrade your plan here for additional credits.

Launching OM1

Run

uv run src/run.py spot

After launching OM1, the Spot agent will interact with you and perform (simulated) actions. For more help connecting OM1 to your robot hardware, see getting started.

Note: This is just an example agent configuration. If you want to interact with the agent and see how it works, make sure ASR and TTS are configured in spot.json5.
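Once the agent is running, a quick way to confirm that WebSim is being served is to request its page. The sketch below uses only the Python standard library; the function name `websim_is_up` is hypothetical, and the default URL is the WebSim address given earlier (http://localhost:8000/).

```python
import urllib.error
import urllib.request


def websim_is_up(url: str = "http://localhost:8000/", timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP requests at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("WebSim reachable:", websim_is_up())
```

This only checks that an HTTP server is listening; it does not verify that the page is actually WebSim.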

What's Next?

Try out some examples.
Add new inputs and actions.
Design custom agents and robots by creating your own json5 config files with custom combinations of inputs and actions.
Change the system prompts in the configuration files (located in /config/) to create new behaviors.

Interfacing with New Robot Hardware

OM1 assumes that robot hardware provides a high-level SDK that accepts elemental movement and action commands such as backflip, run, gently pick up the red apple, move(0.37, 0, 0), and smile. An example is provided in src/actions/move/connector/ros2.py:

...
elif output_interface.action == "shake paw":
    if self.sport_client:
        self.sport_client.Hello()
...
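As a self-contained illustration of the same idea, the sketch below maps action strings to SDK calls. It is not the code in ros2.py: it swaps the if/elif chain for a dictionary lookup, and `FakeSportClient`, `MoveConnector`, the `"stand up"` action, and the `StandUp` method are hypothetical names introduced so the example runs without robot hardware. Only `"shake paw"` and `Hello()` come from the excerpt above.

```python
from typing import Callable, Dict, List, Optional


class FakeSportClient:
    """Hypothetical stand-in for a robot SDK client; records calls instead of moving."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def Hello(self) -> None:
        self.calls.append("Hello")

    def StandUp(self) -> None:  # hypothetical method, for illustration only
        self.calls.append("StandUp")


class MoveConnector:
    """Hypothetical connector that turns action strings into SDK calls."""

    def __init__(self, sport_client: Optional[FakeSportClient]) -> None:
        self.sport_client = sport_client

    def connect(self, action: str) -> bool:
        """Run the SDK call for `action`; return False if nothing was done."""
        if self.sport_client is None:
            return False  # no hardware client available
        handlers: Dict[str, Callable[[], None]] = {
            "shake paw": self.sport_client.Hello,
            "stand up": self.sport_client.StandUp,
        }
        handler = handlers.get(action)
        if handler is None:
            return False  # unknown action: ignore rather than raise
        handler()
        return True


if __name__ == "__main__":
    client = FakeSportClient()
    MoveConnector(client).connect("shake paw")
    print(client.calls)
```

A lookup table keeps each new action to a single line, at the cost of being less flexible than an if/elif chain when actions need arguments or extra checks.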

If your robot hardware does not yet provide a suitable HAL (hardware abstraction layer), traditional robotics approaches such as RL (reinforcement learning) in concert with suitable simulation environments (Unity, Gazebo), sensors (such as hand mounted ZED depth cameras), and custom VLAs will be needed for you to create one. It is further assumed that your HAL accepts motion trajectories, provides battery and thermal management/monitoring, and calibrates and tunes sensors such as IMUs, LIDARs, and magnetometers.

OM1 can interface with your HAL via USB, serial, ROS2, CycloneDDS, Zenoh, or websockets. For an example of an advanced humanoid HAL, please see Unitree's C++ SDK. Frequently, a HAL, especially ROS2 code, will be dockerized and can then interface with OM1 through DDS middleware or websockets.

Recommended Development Platforms

OM1 is developed on:

NVIDIA Thor (running JetPack 7.0) - full support
Jetson AGX Orin 64GB (running Ubuntu 22.04 and JetPack 6.1) - limited support
Mac Studio with Apple M2 Ultra with 48 GB unified memory (running macOS Sequoia)
Mac Mini with Apple M4 Pro with 48 GB unified memory (running macOS Sequoia)
Generic Linux machines (running Ubuntu 22.04)

OM1 should also run on other platforms (such as Windows) and on single-board computers such as the Raspberry Pi 5 16GB.

Introduction to BrainPack

From research to real-world autonomy, a platform that learns, moves, and builds with you.

The BrainPack is designed to be mounted directly onto a robot to bring together mapping, object recognition, remote control, and self charging, giving humanoids and quadrupeds what they need to navigate, remember, and act with purpose.

Full Autonomy Guidance

We're excited to introduce full autonomy for Unitree Go2 and G1 with the BrainPack. Full autonomy has five services that work together in a loop without manual intervention:

om1
OM1-ros2-sdk – A ROS 2 package that provides SLAM (Simultaneous Localization and Mapping) capabilities for the Unitree Go2 robot using an RPLiDAR(S2L) sensor, the SLAM Toolbox and the Nav2 stack.
om1-avatar – A modern React-based frontend application that provides the user interface and avatar display system for OM1 robotics software.
om1-video-processor – A Docker-based solution that enables real-time video streaming, face recognition, and audio capture for OM1 robots.
om1-system-setup – Sets up Wi-Fi and monitors and manages Docker containers.

Simulator Support

OM1 integrates with popular robotics simulators to enable rapid prototyping and testing without physical hardware. We currently support Gazebo with Unitree Go2 and Isaac Sim with Unitree Go2 and G1.

Gazebo

Full support for Gazebo with ROS2 integration. Ideal for testing autonomous SLAM map generation and navigation stacks, sensor simulation, and multi-robot scenarios.

See Gazebo to get started.

Isaac Sim

NVIDIA Isaac Sim support for physics-accurate simulation with GPU acceleration.

Requires NVIDIA GPU and CUDA support. See Isaac Sim Setup to get started.

Detailed Documentation

More detailed documentation can be accessed at docs.openmind.org.

Contributing

Please make sure to read the Contributing Guide before making a pull request.

License

This project is licensed under the terms of the MIT License, which is a permissive free software license that allows users to freely use, modify, and distribute the software. The MIT License is a widely used and well-established license that is known for its simplicity and flexibility. By using the MIT License, this project aims to encourage collaboration, modification, and distribution of the software.
