CRV 2026

IRIS

Learning-Driven Cinema Robot Arm
for Visuomotor Motion Control

Qilong Cheng · Matthew Mackay · Ali Bereyhi

Abstract

A cinema robot that learns from demonstration.

IRIS is a low-cost, 3D-printed 6-DOF cinema robot arm that learns cinematic camera motions from human demonstrations via goal-conditioned visuomotor imitation learning. Built from roughly $992 in materials, its learned policy reaches 97% of the expert's visual alignment score with about 6× lower jerk than its human teachers. All hardware designs, simulation, and training code are open source.

Open-Source Hardware

Sub-$1K 6-DOF arm with quasi-direct-drive actuators. STEP files, BOM, and wiring docs released.

High-Fidelity Simulation

MuJoCo physics twin with analytical FK/IK, RRT* and potential-field planners, and cinema shot modes.
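To illustrate what a potential-field planner does, here is a minimal sketch in Cartesian space. It is not taken from the IRIS codebase: the function names, gains, influence radius, and step size are all assumptions chosen for illustration, and the released planner may well operate in joint space instead.

```python
import numpy as np

def potential_field_step(pos, goal, obstacles, k_att=1.0, k_rep=0.5,
                         influence=1.0, step=0.05):
    """One step along the combined attractive + repulsive force (hypothetical helper)."""
    pos = np.asarray(pos, dtype=float)
    goal = np.asarray(goal, dtype=float)
    # Attractive force: pulls linearly toward the goal.
    force = k_att * (goal - pos)
    for obs in obstacles:
        diff = pos - np.asarray(obs, dtype=float)
        dist = np.linalg.norm(diff)
        # Classic repulsive term: grows as the point nears the obstacle,
        # and vanishes at the influence radius.
        if 1e-9 < dist < influence:
            force += k_rep * (1.0 / dist - 1.0 / influence) / dist**2 * (diff / dist)
    norm = np.linalg.norm(force)
    if norm < 1e-9:
        return pos  # zero net force: at the goal or stuck in a local minimum
    # Fixed step length, capped by the remaining distance to the goal.
    step_len = min(step, np.linalg.norm(goal - pos))
    return pos + step_len * force / norm

def plan(start, goal, obstacles, max_iters=2000, tol=1e-3):
    """Iterate steps until the goal is within tol or max_iters is reached."""
    goal = np.asarray(goal, dtype=float)
    path = [np.asarray(start, dtype=float)]
    for _ in range(max_iters):
        if np.linalg.norm(path[-1] - goal) < tol:
            break
        path.append(potential_field_step(path[-1], goal, obstacles))
    return np.array(path)

# Usage: a straight 1 m move with no obstacles.
path = plan([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], obstacles=[])
```

Potential fields are cheap to evaluate but can stall in local minima, which is one common reason to pair them with a sampling-based planner such as RRT*.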

Visuomotor Imitation Learning

Goal-conditioned CVAE transformer trained from kinesthetic demonstrations. 9 ablation variants evaluated.
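For readers unfamiliar with CVAE policies, the sketch below shows the standard training objective: an action reconstruction term plus a KL penalty on the latent. The page does not state the exact loss IRIS uses, so the L1 reconstruction term, the `beta` weight, and the function names are assumptions. Goal conditioning would live inside the encoder and decoder networks, which are not shown.

```python
import numpy as np

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps so gradients can flow through mu and logvar."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def cvae_loss(pred_actions, target_actions, mu, logvar, beta=1.0):
    """Reconstruction + beta * KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    # L1 reconstruction error between predicted and demonstrated actions (assumed choice).
    recon = np.mean(np.abs(pred_actions - target_actions))
    # Closed-form KL divergence to a standard normal, summed over latent
    # dimensions and averaged over the batch.
    kl_per_sample = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=-1)
    kl = np.mean(kl_per_sample)
    return recon + beta * kl, recon, kl

# Usage with a hypothetical batch of 4 samples, 6 joint targets, 2 latent dimensions.
rng = np.random.default_rng(0)
mu = np.ones((4, 2))
logvar = np.zeros((4, 2))
z = reparameterize(mu, logvar, rng)
actions = np.zeros((4, 6))
total, recon, kl = cvae_loss(actions, actions, mu, logvar)
```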

Full ROS Stack

200 Hz RS-485 driver, joint calibration, sim-to-real live mirroring, teach-and-repeat workflow.
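As a rough sketch of the "repeat" half of teach-and-repeat, the snippet below resamples a recorded joint trajectory onto a uniform 200 Hz grid (a 5 ms period) by linear interpolation. It assumes demonstrations are stored as timestamped joint-angle samples; the function name and data layout are hypothetical, and the actual ROS driver is not shown.

```python
import numpy as np

def resample_trajectory(t_rec, q_rec, rate_hz=200.0):
    """Linearly interpolate recorded joint angles onto a fixed-rate time grid.

    t_rec: (N,) increasing timestamps in seconds.
    q_rec: (N, J) joint angles, one column per joint.
    """
    t_rec = np.asarray(t_rec, dtype=float)
    q_rec = np.asarray(q_rec, dtype=float)
    period = 1.0 / rate_hz
    # Number of whole control periods that fit in the recording, plus the first sample.
    n = int(np.floor((t_rec[-1] - t_rec[0]) / period + 1e-9)) + 1
    t_out = t_rec[0] + period * np.arange(n)
    # np.interp is 1-D, so interpolate each joint column separately.
    q_out = np.column_stack(
        [np.interp(t_out, t_rec, q_rec[:, j]) for j in range(q_rec.shape[1])]
    )
    return t_out, q_out

# Usage: two recorded samples 10 ms apart for a 6-joint arm.
t_out, q_out = resample_trajectory([0.0, 0.01], [[0.0] * 6, [1.0] * 6])
```

Each row of `q_out` would then be sent as one setpoint per control tick.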

Demo

See IRIS in action.

IRIS — Cinema Shot Execution

Crane — Vertical rise while tracking a fixed subject

Dolly — Linear push-in / pull-out

Pan — Lateral arc sweep

Kinesthetic teaching — operator physically guides the arm

CVAE Full policy deployed on real hardware — 46.2% task success

Results

97% of expert alignment. 6× smoother.

Method	Success rate	Visual alignment	Jerk (m/s³)
Expert	90.0%	0.874	3.64
CVAE Full	90.0%	0.847	0.61
Incremental	0.0%	0.636	0.83
RGB Only	0.0%	0.584	1.65
Visual	0.0%	0.536	1.59
RRT*	10.0%	0.636	0.22

Visual alignment is the ResNet18-feature cosine similarity to the goal image (higher is better). Jerk is in m/s³ (lower is smoother).
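The two metrics can be sketched as follows. This is a minimal illustration, not the evaluation code: plain vectors stand in for ResNet18 features, jerk is estimated by a third finite difference of position, and averaging the jerk magnitude over the trajectory is an assumption, since the page does not say how jerk is aggregated. The last lines check the headline ratios against the CVAE Full and Expert rows of the table.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_jerk(positions, dt):
    """Mean jerk magnitude (m/s^3) from positions sampled every dt seconds.

    positions: (T, 3) camera positions in metres, with T >= 4.
    """
    positions = np.asarray(positions, dtype=float)
    # The third finite difference approximates the third time derivative.
    jerk = np.diff(positions, n=3, axis=0) / dt**3
    return float(np.mean(np.linalg.norm(jerk, axis=-1)))

# Headline ratios, using the CVAE Full and Expert rows of the table.
alignment_ratio = 0.847 / 0.874   # fraction of expert visual alignment
jerk_ratio = 3.64 / 0.61          # expert jerk divided by policy jerk
print(round(alignment_ratio, 3))  # 0.969
print(round(jerk_ratio, 2))       # 5.97
```

The ratios round to the reported 97% and 6×.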

Citation

Cite this work.

@inproceedings{cheng2026iris,
  title     = {{IRIS}: Learning-Driven Task-Specific Cinema Robot Arm
               for Visuomotor Motion Control},
  author    = {Qilong Cheng and Matthew Mackay and Ali Bereyhi},
  booktitle = {23rd Conference on Robots and Vision},
  year      = {2026},
  url       = {https://arxiv.org/abs/2602.17537}
}

