Maxence Boels

Artificial Intelligence Researcher

LeRobot + SO-101 Arm

#ROBOTICS #MANIPULATION #VLA

Project Overview (in progress)

This project focuses on developing and training Vision-Language-Action (VLA) policies using the LeRobot framework with an SO-101 robotic arm. The goal is to create robust manipulation policies through imitation learning and to contribute meaningful improvements back to open-source robotics projects.

The system combines visual servoing with language understanding to perform instruction-following manipulation tasks, and is trained on real-world demonstrations as well as in simulation.
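At a high level, the control loop encodes the camera frame and the instruction, fuses the two, and decodes joint targets. Below is a toy sketch of that fusion step; every name and dimension is illustrative, not LeRobot's API:

```python
import torch
import torch.nn as nn

class TinyVLAPolicy(nn.Module):
    """Illustrative VLA policy: fuses image and instruction features into joint actions."""
    def __init__(self, vision_dim=512, text_dim=512, action_dim=6):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(vision_dim + text_dim, 256),
            nn.ReLU(),
            nn.Linear(256, action_dim),  # one target per SO-101 joint
        )

    def forward(self, image_feat, text_feat):
        return self.fuse(torch.cat([image_feat, text_feat], dim=-1))

policy = TinyVLAPolicy()
img = torch.randn(1, 512)   # stand-in for camera features
txt = torch.randn(1, 512)   # stand-in for instruction embedding
action = policy(img, txt)   # shape (1, 6): joint position targets
```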

Project Progress

LeSurgeon System Overview

Overview of the LeSurgeon system architecture

ZED Surgical Cameras

Integration of ZED cameras for enhanced depth perception (a depth-capture sketch follows this gallery)

LeRobot SO-101 Arm Setup

Hardware setup and integration with LeRobot framework

Robot Arm in Action

SO-101 arm performing manipulation tasks during testing
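Depth frames from the ZED cameras are grabbed through Stereolabs' Python SDK (pyzed). A minimal sketch, assuming the ZED SDK is installed and a camera is attached; the depth mode and units here are choices for illustration:

```python
import pyzed.sl as sl

# Open the ZED camera with depth enabled.
zed = sl.Camera()
init_params = sl.InitParameters()
init_params.depth_mode = sl.DEPTH_MODE.ULTRA
init_params.coordinate_units = sl.UNIT.METER

if zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("failed to open ZED camera")

runtime = sl.RuntimeParameters()
image, depth = sl.Mat(), sl.Mat()

if zed.grab(runtime) == sl.ERROR_CODE.SUCCESS:
    zed.retrieve_image(image, sl.VIEW.LEFT)        # left RGB frame
    zed.retrieve_measure(depth, sl.MEASURE.DEPTH)  # per-pixel depth in meters
    depth_np = depth.get_data()                    # numpy float32 array

zed.close()
```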

Technical Details

Hardware Components

  • SO-101 6-DOF Robotic Arm (see the connection sketch after this list)
  • High-resolution RGB cameras for visual perception
  • Force/torque sensors for haptic feedback
  • Custom end-effector tools for various manipulation tasks
  • Real-time control system with low-latency communication
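As a rough sketch of how the arm is driven from Python, the snippet below follows the general shape of LeRobot's robot interface. The module path, class names, method names, and action keys are all assumptions that vary across LeRobot releases, and the serial port is machine-specific; consult the LeRobot docs for your version:

```python
# Assumed-API sketch: names below mirror the shape of LeRobot's robot
# interface but may differ across releases; check the LeRobot docs.
from lerobot.common.robots.so101_follower import SO101Follower, SO101FollowerConfig  # assumed path

config = SO101FollowerConfig(port="/dev/ttyACM0")  # serial port is machine-specific
robot = SO101Follower(config)
robot.connect()

obs = robot.get_observation()             # joint positions + camera frames (assumed)
action = {"shoulder_pan.pos": 0.0}        # example joint target (key name is an assumption)
robot.send_action(action)
robot.disconnect()
```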

Software Stack

  • LeRobot Framework for policy training and deployment
  • PyTorch for deep learning model development
  • OpenCV for computer vision and image processing
  • Diffusion policies for smooth action generation (see the denoising sketch after this list)
  • Vision-language models for instruction understanding
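A diffusion policy generates an action by iteratively denoising a noise sample. Below is a minimal, untrained sketch of that reverse (denoising) loop in plain PyTorch; it omits the proper noise schedule and the observation conditioning that a real diffusion policy, such as LeRobot's, relies on:

```python
import torch
import torch.nn as nn

# Toy noise-prediction network; a real diffusion policy conditions on observations.
denoiser = nn.Sequential(nn.Linear(6 + 1, 64), nn.ReLU(), nn.Linear(64, 6))

def sample_action(steps=50):
    """DDPM-style reverse process over a 6-DOF action (schedule simplified)."""
    a = torch.randn(1, 6)  # start from pure noise
    with torch.no_grad():
        for t in reversed(range(steps)):
            t_embed = torch.full((1, 1), t / steps)  # crude timestep conditioning
            eps = denoiser(torch.cat([a, t_embed], dim=-1))
            a = a - eps / steps  # step against the predicted noise
    return a

action = sample_action()  # shape (1, 6): denoised joint targets
```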

Key Features

Imitation Learning Pipeline

A robust data collection system records human demonstrations and trains policies that generalize to new scenarios and objects.
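A minimal sketch of the training side: behavior cloning regresses demonstrated actions from observation features. The tensors below are random stand-ins for a batch sampled from recorded demonstrations, and the network shape is illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in policy: maps fused image+instruction features to 6-DOF joint targets.
policy = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 6))
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-4)

obs = torch.randn(32, 1024)        # batch of observation features from demonstrations
demo_actions = torch.randn(32, 6)  # demonstrated joint targets

pred = policy(obs)
loss = F.mse_loss(pred, demo_actions)  # behavior-cloning objective
optimizer.zero_grad()
loss.backward()
optimizer.step()
```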

Vision-Language Integration

Natural language instruction processing combined with visual understanding enables flexible task specification and execution.
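As one concrete way to obtain instruction and image embeddings, the sketch below uses CLIP via Hugging Face transformers. The model choice is an assumption for illustration; the project's actual vision-language model may differ:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224))  # placeholder for a camera frame
inputs = processor(
    text=["pick up the red block"], images=image,
    return_tensors="pt", padding=True,
)

with torch.no_grad():
    outputs = model(**inputs)

text_feat = outputs.text_embeds    # (1, 512) instruction embedding
image_feat = outputs.image_embeds  # (1, 512) visual embedding
```

Embeddings like these are what a fusion head (such as the toy policy sketched earlier) would consume to produce actions.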

Open Source Contributions

All improvements, benchmarks, and SO-101 support will be contributed back to the LeRobot community to benefit the broader robotics research ecosystem.