
The Future is Chrome — MIT Reality Hack 2026

Booster K1 Robot with tin foil hat

Our Booster K1 robot, ready for emotional expression and XR interaction (complete with protective tin foil hat! 🛸)

Robot dripped out

Our robot absolutely dripped out and ready to express emotions in style! 💯

GANTASMO Team with Robot

The team (in GANTASMO gear) carefully observing our robot's emotional state - because even robots need emotional support! 😄

Unity project targeting Samsung GalaxyXR (AndroidXR) for mixed-reality interaction with a Booster K1 robot.

This repo is built on top of the Android XR Samples for Unity project and keeps the sample scenes as a platform-feature sandbox while adding hack-specific robot + networking experiments.

📰 Read more about this project on Devpost → (you can also find lots of other cool projects from the hackathon there!)

🏆 Awards & Recognition

  • 🥈 Runner Up - Google DeepMind Prize - Recognized for breakthrough AI innovation, combining LLM-powered robot control, real-time voice AI, and multi-modal XR interactions into a cohesive system that demonstrates the future of human-robot collaboration
  • 🌟 Reality Amplifier Prize - Honored for exceptional community leadership, technical mentorship, and collaborative spirit in supporting fellow teams throughout the hackathon, embodying the true spirit of innovation and knowledge sharing

📑 Quick Navigation

🎯 Project Overview

This project creates a complete pipeline from XR interactions to physical robot control, enabling users to control a Booster K1 humanoid robot through multiple natural interaction modalities:

  • 🎮 XR Gestures: Hand tracking, gaze, and UI interactions
  • 🗣️ Voice Commands: Real-time voice AI via Gemini Live
  • 💬 Natural Language: Text prompts that generate complex robot animations
  • 🎭 Emotional Expressions: Pre-defined emotional postures and animations

The system integrates sophisticated AI capabilities (LLM-powered animation generation, voice AI) with real-time robot control, creating a seamless experience where users can naturally express intent and see it executed on a physical robot.
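
To make the pipeline concrete, the sketch below shows what an LLM-generated keyframe sequence might look like. The field names, units, and values are illustrative assumptions rather than the project's actual schema; see the keyframe animation documentation for the real format.

```python
# Hypothetical example of an LLM-generated keyframe sequence.
# Field names and units are illustrative assumptions; see the keyframe
# animation README in booster_robotics_sdk for the actual schema.
wave_animation = {
    "name": "wave_right_arm",
    "keyframes": [
        {
            "time_s": 0.0,
            # Right-hand end-effector target in the robot's torso frame (meters),
            # assumed to be clamped to validated workspace bounds by the server.
            "right_hand": {"position": [0.25, -0.20, 0.30], "orientation_rpy": [0.0, 0.0, 0.0]},
            "head": {"pitch": 0.0, "yaw": 0.0},
        },
        {
            "time_s": 0.5,
            "right_hand": {"position": [0.25, -0.35, 0.45], "orientation_rpy": [0.0, 0.0, 0.3]},
            "head": {"pitch": 0.0, "yaw": -0.2},
        },
    ],
}
```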

🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    GalaxyXR Headset (Unity)                      │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  XR Interactions: Hands, Gaze, Voice, UI                  │  │
│  └──────────────┬───────────────────────────────────────────┘  │
│                 │                                                 │
│  ┌──────────────▼───────────────────────────────────────────┐  │
│  │  Unity C# Integration Layer                               │  │
│  │  • BoosterRobotService (HTTP client)                      │  │
│  │  • BoosterRobotController (MonoBehaviour)                 │  │
│  │  • ExampleRobotXRIntegration (XR event handlers)          │  │
│  │  • GeminiLiveAudioClient (WebSocket voice AI)             │  │
│  └──────────────┬───────────────────────────────────────────┘  │
└─────────────────┼───────────────────────────────────────────────┘
                  │ HTTP / WebSocket
                  ▼
┌─────────────────────────────────────────────────────────────────┐
│              Python Backend Services                             │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Keyframe Animation API Server (FastAPI)                  │  │
│  │  • /generate_from_prompt → LLM generates keyframes       │  │
│  │  • /execute_sequence → Robot execution controller         │  │
│  │  • /stock_animations → Pre-defined emotional postures     │  │
│  └──────────────┬───────────────────────────────────────────┘  │
│                 │                                                 │
│  ┌──────────────▼───────────────────────────────────────────┐  │
│  │  LLM Provider Layer (Gemini/OpenAI/Anthropic)            │  │
│  │  • Natural language → Structured keyframe JSON           │  │
│  │  • Workspace bounds validation                            │  │
│  │  • Posture generation with IK constraints                 │  │
│  └──────────────┬───────────────────────────────────────────┘  │
│                 │                                                 │
│  ┌──────────────▼───────────────────────────────────────────┐  │
│  │  Booster Robotics SDK (C++/Python via pybind11)          │  │
│  │  • DDS-based real-time communication                      │  │
│  │  • High-level API (B1LocoClient)                          │  │
│  │  • Automatic inverse kinematics (IK)                       │  │
│  │  • End-effector control (MoveHandEndEffectorV2)          │  │
│  └──────────────┬───────────────────────────────────────────┘  │
└─────────────────┼───────────────────────────────────────────────┘
                  │ DDS (FastDDS)
                  ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Booster K1 Robot (22 DoF)                    │
│  • 4-DOF arms (shoulder pitch/roll, elbow pitch, wrist pitch)   │
│  • 2-DOF head (pitch, yaw)                                       │
│  • 12-DOF legs (locomotion)                                      │
└─────────────────────────────────────────────────────────────────┘

📖 For detailed architecture diagrams, see Architecture Documentation
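
The flow through this stack can be exercised end-to-end with plain HTTP. The sketch below drives the backend from Python for testing (the Unity BoosterRobotService makes equivalent calls); the endpoint names come from the diagram above, while the payload fields, host, and port are assumptions.

```python
# Minimal sketch of the headset -> backend flow, exercised from Python for
# testing. Endpoint names come from the architecture above; request/response
# field names, host, and port are assumptions.
import requests

BACKEND = "http://192.168.1.50:8000"  # placeholder address of the FastAPI server

# 1. Natural language -> keyframe sequence via the LLM provider layer.
resp = requests.post(
    f"{BACKEND}/generate_from_prompt",
    json={"prompt": "wave both arms excitedly"},
    timeout=30,
)
resp.raise_for_status()
sequence = resp.json()  # assumed to contain the generated keyframes

# 2. Execute the generated sequence on the robot (or a simulator).
requests.post(f"{BACKEND}/execute_sequence", json=sequence, timeout=30).raise_for_status()

# 3. Alternatively, list the pre-defined emotional postures.
stock = requests.get(f"{BACKEND}/stock_animations", timeout=10).json()
print("Available stock animations:", stock)
```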

Key Innovations

  • LLM-Powered Animation Generation: Convert natural language descriptions into executable robot keyframe sequences
  • Multi-Modal Interaction: Combine voice, gestures, gaze, and text for flexible robot control
  • Real-Time Visualization: Preview animations in 3D before executing on physical robot
  • Production-Ready Architecture: Complete SDK integration, comprehensive documentation, and deployment guides

Goals

  • Use hands, gaze, and spatial understanding on GalaxyXR to drive robot interactions.
  • Establish a reliable comms path from headset → robot (direct network, and/or an Arduino bridge).
  • Enable natural language control of robot through LLM-powered animation generation.
  • Support real-time voice interactions with AI that can control the robot.
  • Optionally run a second client (Meta Quest 3) that colocates with GalaxyXR for shared context via Photon.

Target Hardware

  • Samsung GalaxyXR (primary XR device)
  • Booster K1 robot (primary physical agent)
  • Arduino (optional bridge for sensors/actuation and rapid prototyping)
  • Meta Quest 3 (optional secondary client for shared/remote view)

👥 Team Overview

This project is the result of collaborative work from five team members:

| Contributor | Primary Focus | Key Deliverables |
|---|---|---|
| Daniel | Samsung Galaxy XR Build Setup & TextMesh Pro Integration | Samsung Galaxy XR build configuration, Android XR development environment setup, OpenXR Unity Port integration, TextMesh Pro integration, Gemini material enhancements |
| Marcel | UI System Foundation & Visual Polish | Complete UI system (scenes, prefabs, scripts), custom scribble-style assets, audio feedback, custom fonts (Nexa), sprite animation system, logo animations |
| Josh | Voice Concierge Addon & Gemini Live API | Gemini Live API implementation (backend logic and framework code), complete Gemini Live voice persona AI with mood detection, spatial UI, audio-reactive visualization |
| Alif | Booster Robotics Integration | Python keyframe animation system (extends Booster SDK), React 3D frontend (Three.js simulator), Unity C# integration, robot SDK connection, kinematic conversion system, server integration, Gemini Live voice control, comprehensive documentation |
| Jose | Arduino Uno Q Integration & Architecture Support | Arduino Uno Q integration framework (Python bridge, WebSocket communication, emotion-based LED control), system architecture clarification, concept iteration (LED strip library compatibility issues prevented full integration) |

📖 For detailed contributions, see Team Contributions Documentation

Tech Stack Summary

Core Platform

  • Unity 6000.2.9f1 with Android XR (OpenXR)
  • XR Interaction Toolkit + XR Hands
  • TextMesh Pro for advanced text rendering

Robot Control & AI

  • Booster Robotics SDK (C++/Python with DDS communication)
  • FastAPI Python server for keyframe generation
  • LLM Providers: Gemini 2.0 Flash, OpenAI GPT-4, Anthropic Claude
  • Gemini Live API for real-time voice AI
  • React + Three.js for 3D visualization

📖 For complete tech stack details, see Tech Stack Documentation

Quick Start

Prerequisites

  1. Install Unity 6000.2.9f1 with Android Build Support
  2. Install OpenJDK (via Unity Hub module or system install)
  3. Install Android SDK + NDK tools (via Unity Hub or Android Studio)

Build & Run

  1. Open the project in Unity
  2. Open Build Profiles and switch to Android
  3. Import TextMeshPro essentials: Window → TextMeshPro → Import TMP Essential Resources
  4. Build and run to device

📖 For detailed setup instructions, see Getting Started Guide

Key Features

🎬 LLM-Powered Keyframe Animation System

Generate complex robot animations from natural language prompts:

  • Support for Gemini, OpenAI, and Anthropic APIs
  • Automatic workspace validation and safety clamping (sketched below)
  • 8 pre-defined emotional animations (50 keyframes each)
  • Interactive 3D visualization frontend

📖 See Booster Robotics Documentation for details
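
The workspace validation and safety clamping mentioned above amounts to forcing every requested end-effector target into a verified reachable region before it is handed to the SDK. A minimal sketch of that idea, with made-up bounds (the real limits live in the keyframe animation system):

```python
# Sketch of workspace clamping: keep an end-effector target inside a safe box.
# The bound values here are made up for illustration; the real limits are
# defined by the keyframe animation system and the robot's kinematics.
WORKSPACE_BOUNDS = {  # meters, in the robot's torso frame (assumed convention)
    "x": (0.10, 0.40),
    "y": (-0.45, 0.45),
    "z": (0.00, 0.55),
}

def clamp_target(position):
    """Clamp an [x, y, z] end-effector target to the validated workspace."""
    return [
        max(lo, min(hi, value))
        for value, (lo, hi) in zip(position, WORKSPACE_BOUNDS.values())
    ]

print(clamp_target([0.60, -0.50, 0.30]))  # -> [0.40, -0.45, 0.30]
```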

🗣️ Voice Control

Two voice AI implementations:

  1. Alif's Robot Control: Voice commands → Robot actions (see the sketch after this list)

    • "Wave your arms" → Robot executes animation
    • Real-time bidirectional voice streaming
    • Integrated with keyframe animation system
  2. Josh's Voice Concierge: General voice AI assistant

    • Mood detection and visualization
    • Spatial UI panels
    • Audio-reactive visualization
    • See Assets/VoiceConciergeAddon/README.md
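
As a rough sketch of the robot-control path, a transcribed voice command can be matched against known phrases and otherwise forwarded to the LLM keyframe generator. The phrase matching and payloads below are simplified assumptions; the real wiring goes through the Gemini Live client and the keyframe animation server.

```python
# Simplified sketch of routing a transcribed voice command to robot actions.
# The phrase matching and payloads are assumptions; the real integration goes
# through the Gemini Live client and the keyframe animation server.
import requests

BACKEND = "http://192.168.1.50:8000"  # placeholder FastAPI server address

STOCK_PHRASES = {
    "wave your arms": "wave",        # assumed stock animation names
    "look surprised": "surprised",
}

def handle_voice_command(transcript: str) -> None:
    text = transcript.lower().strip()
    for phrase, animation in STOCK_PHRASES.items():
        if phrase in text:
            # Known phrase -> trigger a pre-defined emotional animation.
            requests.post(f"{BACKEND}/execute_sequence", json={"stock": animation}, timeout=30)
            return
    # Anything else -> let the LLM generate a keyframe sequence from the prompt.
    requests.post(f"{BACKEND}/generate_from_prompt", json={"prompt": transcript}, timeout=30)

handle_voice_command("Hey robot, wave your arms!")
```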

🎨 XR UI System

Complete UI system for Samsung Galaxy XR passthrough mode:

  • Custom scribble-style assets (Marcel's contribution)
  • TextMesh Pro integration
  • Hand tracking and gaze interaction
  • Audio feedback system

📖 See Team Contributions for Marcel's UI system and Daniel's TextMesh Pro integration

🤖 Robot Integration

Multiple interaction modalities converge on robot control:

  • Natural language → Animation (LLM keyframe generation)
  • Voice → Direct control (Gemini Live)
  • XR Gestures → Robot commands (hand tracking, gaze)
  • UI Buttons → Robot actions

📖 See Robot Integration Documentation for architecture and flows

Project Structure

The-Future-is-Chrome-MIT-Reality-Hack-2026/
├── Assets/
│   ├── UI/                          # Marcel's UI system foundation
│   ├── TextMesh Pro/                # TextMesh Pro integration (Daniel)
│   ├── VoiceConciergeAddon/         # Josh's voice AI addon
│   ├── BoosterRobotics/             # Alif's robot integration
│   │   ├── Scripts/                 # Unity C# integration layer
│   │   └── booster_robotics_sdk/   # Complete SDK + keyframe system
│   ├── ArduinoUnoQ/                 # Jose's Arduino integration (experimental)
│   └── AndroidXRUnitySamples/        # Baseline Android XR samples
├── docs/                            # Detailed documentation
│   ├── GETTING_STARTED.md
│   ├── ARCHITECTURE.md
│   ├── TEAM_CONTRIBUTIONS.md
│   ├── TECH_STACK.md
│   ├── BOOSTER_ROBOTICS.md
│   ├── ROBOT_INTEGRATION.md
│   ├── ANDROID_XR_SAMPLES.md
│   └── DEVPOST_STORY.md
├── robot-server/                    # Simple robot HTTP server
└── README.md                        # This file

Documentation

Main Documentation

  • Getting Started Guide: docs/GETTING_STARTED.md - Setup, build, and deployment instructions
  • Architecture: docs/ARCHITECTURE.md - Detailed architecture diagrams and data flows
  • Team Contributions: docs/TEAM_CONTRIBUTIONS.md - Per-contributor breakdown
  • Tech Stack: docs/TECH_STACK.md - Complete tech stack details
  • Devpost Story: docs/DEVPOST_STORY.md - Project story and submission write-up

Component-Specific Documentation

  • Android XR Samples: ANDROID_XR_SAMPLES.md - Baseline showcase samples
  • Voice Concierge: Assets/VoiceConciergeAddon/README.md - Josh's voice AI addon
  • Booster Robotics Unity: Assets/BoosterRobotics/README.md - Unity integration guide
  • Keyframe Animation: Assets/BoosterRobotics/booster_robotics_sdk/example/high_level/keyframe_animation/README.md - Complete system docs
  • Arduino Uno Q Integration: Assets/ArduinoUnoQ/README.md - Jose's Arduino integration framework (experimental)
  • Robot Server: robot-server/README.md - Simple HTTP server API

Safety Notes

⚠️ Important Safety Guidelines:

  • Keep an immediate physical and software stop path available during robot testing
  • The server includes an emergency stop endpoint
  • Always test animations in the visualization frontend before executing on physical robot
  • Start with low-speed movements and gradually increase
  • Ensure clear workspace around robot
  • Have a spotter present during initial testing
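
For reference, an emergency-stop route in a FastAPI server can be as small as the sketch below. The actual route name and stop behavior in robot-server/ may differ, and the helper that halts motion here is hypothetical.

```python
# Sketch of an emergency-stop endpoint; the real server's route name and the
# exact SDK call used to halt motion may differ from this illustration.
from fastapi import FastAPI

app = FastAPI()

@app.post("/emergency_stop")
def emergency_stop():
    # Placeholder: in the real server this would command the robot (via the
    # Booster SDK client) to stop all motion and hold its current posture.
    cancel_running_animation()  # hypothetical helper
    return {"status": "stopped"}

def cancel_running_animation():
    """Hypothetical helper that aborts any in-flight keyframe sequence."""
    pass
```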

Colocation / Multi-Client (Planned)

Photon and Styly are planned to synchronize shared state between devices (e.g., GalaxyXR ↔ Quest 3) and to support colocation experiments.

Expected setup once Photon is added:

  • Choose a Photon stack (PUN or Fusion) and add the SDK to the project.
  • Configure Photon App ID in project settings.
  • Define a minimal shared-state schema (anchors/poses/interaction events).
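
One possible shape for that minimal shared-state schema, sketched here as Python dataclasses for readability (field names are assumptions; the real schema would be defined in C# alongside the chosen Photon stack):

```python
# Illustrative sketch of a minimal shared-state schema for colocation.
# Field names are assumptions; the actual schema would live in C# next to the
# chosen Photon stack (PUN or Fusion).
from dataclasses import dataclass

@dataclass
class SharedAnchor:
    anchor_id: str
    position: tuple[float, float, float]          # meters, shared reference frame
    rotation: tuple[float, float, float, float]   # quaternion (x, y, z, w)

@dataclass
class InteractionEvent:
    source_device: str   # e.g. "galaxy_xr" or "quest3"
    event_type: str      # e.g. "robot_command", "ui_press"
    payload: dict
    timestamp_ms: int
```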

License

See LICENSE file for details.

Contributing

See CONTRIBUTING.md for contribution guidelines.


Built for MIT Reality Hack 2026 🚀

For questions or issues, please check the documentation or open a GitHub issue.
