To execute reaching, grasping, or manipulation motions, a humanoid robot must cope with inaccurate object localization, noisy sensor data, and a dynamic environment. We therefore use techniques based on Position-Based Visual Servoing (PBVS) to enable robust and reactive interaction with the environment. By fusing the sensor channels from motors, vision, and haptics, the visual servoing framework enables the humanoid robots of the ARMAR series to grasp objects and open doors in a kitchen.
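The core of a PBVS controller can be illustrated with a minimal sketch: given the visually estimated hand pose and the target pose, the controller computes a 6-D pose error and commands a proportional Cartesian velocity. The function below is an illustrative example, not the actual ARMAR implementation; the gain `lam` and the pose representation (4x4 homogeneous transforms) are assumptions.

```python
import numpy as np

def pbvs_velocity(T_hand, T_target, lam=1.0):
    """Proportional PBVS law (sketch): Cartesian twist that drives the
    visually estimated hand pose toward the target pose.
    T_hand, T_target: 4x4 homogeneous transforms in a common frame."""
    # Translation error between target and hand
    e_t = T_target[:3, 3] - T_hand[:3, 3]
    # Rotation error R_err = R_target * R_hand^T, converted to axis-angle
    R_err = T_target[:3, :3] @ T_hand[:3, :3].T
    angle = np.arccos(np.clip((np.trace(R_err) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(angle, 0.0):
        e_r = np.zeros(3)
    else:
        axis = np.array([R_err[2, 1] - R_err[1, 2],
                         R_err[0, 2] - R_err[2, 0],
                         R_err[1, 0] - R_err[0, 1]]) / (2.0 * np.sin(angle))
        e_r = angle * axis
    # Proportional control: 6-D twist (linear velocity, angular velocity)
    return lam * np.concatenate([e_t, e_r])
```

Because the hand pose is re-estimated from vision in every cycle, errors in the kinematic model and in object localization are corrected continuously rather than accumulating.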
To exploit the full grasping capabilities of a humanoid robot, bimanual grasping and manipulation motions must be executed robustly. The bimanual visual servoing framework developed at H²T at KIT enables the robots to robustly execute dual-arm grasping and manipulation tasks. To this end, the target objects and both hands are tracked alternately, and a combined open-/closed-loop controller positions the hands with respect to the targets. The control framework for reactive positioning of both hands via position-based visual servoing fuses the sensor data streams from the vision system, the joint encoders, and the force/torque sensors.
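One control cycle of such a scheme might look like the following sketch: in each cycle only one hand is visually re-tracked (alternating between arms), the other arm runs open-loop on its forward-kinematic pose estimate, and each arm halts when its force/torque sensor signals contact. All names, the force threshold, and the simple proportional law are hypothetical illustrations, not the framework's actual interface.

```python
import numpy as np

def bimanual_step(t, hands_visual, hands_fk, targets, wrenches,
                  force_limit=15.0, lam=1.0):
    """One cycle of a hypothetical bimanual PBVS sketch.
    t            : cycle counter (selects which arm is visually tracked)
    hands_visual : latest visually estimated hand positions, per arm
    hands_fk     : forward-kinematic hand position estimates, per arm
    targets      : target positions, per arm
    wrenches     : 6-D force/torque readings, per arm
    Returns one commanded velocity vector per arm."""
    commands = []
    for i in (0, 1):                      # 0 = left arm, 1 = right arm
        # Haptic channel: stop the arm once contact force is exceeded
        if np.linalg.norm(wrenches[i][:3]) > force_limit:
            commands.append(np.zeros(3))
            continue
        # Alternating tracking: arm (t % 2) gets the fresh visual pose
        # (closed loop); the other arm uses joint encoders (open loop)
        hand = hands_visual[i] if i == t % 2 else hands_fk[i]
        commands.append(lam * (targets[i] - hand))
    return commands
```

The fusion of vision, joint encoders, and force/torque sensing shown here is what allows both coupled tasks (grasping a wok with both hands) and decoupled ones (pouring from a carton into a cup).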
Videos
- A Position-Based Visual Servoing controller enables the humanoid robot ARMAR-III to hand over a cup from the left hand to the right.
- Reactive Visual Servoing is used to robustly grasp a door handle.
- A box of mashed potatoes is robustly grasped by ARMAR-III; the internal view of the visually controlled movement is shown.
- The humanoid robot ARMAR-III uses Reactive Visual Servoing techniques to grasp a moving cup.
- The humanoid robot ARMAR-III uses its visual servoing system to grasp an apple juice carton in the fridge.
- Bimanual grasping and manipulation: ARMAR-III uses the Bimanual Visual Servoing controller for grasping a wok with both hands resulting in a coupled bimanual manipulation.
- Bimanual Visual Servoing for simultaneously grasping a cup and a beverage carton with both hands. The juice is poured into the cup by executing a decoupled dual-arm manipulation.
- ARMAR-III is asked to grasp the cereal box in different setups. The RRT-based Multi-EEF planner implicitly selects the hand and a feasible grasping pose with an associated IK solution, and generates a collision-free grasping trajectory. The planned trajectory is executed with high accuracy thanks to the visually guided motion execution, which continuously observes and corrects the relative pose of hand and target.