Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon

OpenAI builds reinforcement learning based system giving robots human like dexterity

Save for later
  • 3 min read
  • 31 Jul 2018

article-image

Researchers at OpenAI have developed a system trained with reinforcement learning algorithms which is dexterous in-hand manipulation. Termed as Dactyl, this system can solve object orientation tasks entirely in a simulation without any human input. After the system’s training phase, it was able to work on a real robot without any fine-tuning.

Using humanoid hand systems to manipulate objects has been a long-standing challenge in robotic control. Current techniques remain limited in their ability to manipulate objects in the real world. Although robotic hands have been available for quite some time, they were largely unable to utilize complex end-effectors to perform dexterous manipulation tasks.

The Shadow Dexterous Hand, for instance, has been available since 2005 with five fingers and 24 degrees of freedom. However, it did not see large-scale adoption because of the difficulty of controlling such complex systems.

Now OpenAI researchers have developed a system that trained control policies allowing a robot hand to perform complex in-hand manipulations. This systems shows unprecedented levels of dexterity and discovers different hand grasp types found in humans, such as the tripod, prismatic, and tip pinch grasps. It is also able to display dynamic behaviors such as finger gaiting, multi-finger coordination, the controlled use of gravity, and application of translational and torsional forces to the object.

How does the OpenAI system work?

  1. First, they used a large distribution of simulations with randomized parameters to collect data for the control policy and vision-based pose estimator.
  2. The control policy receives observed robot states and rewards from the distributed simulations. It then learns to map observations to actions using RNN and reinforcement learning.
  3. The vision-based pose estimator renders scenes collected from the distributed simulations. It then learns to predict the pose of the object from images using a CNN, trained from the control policy.
  4. The object pose is predicted from 3 camera feeds with the CNN. These cameras measure the robot fingertip locations using a 3D motion capture system and give them to the control policy to produce an action for the robot.
  5. Unlock access to the largest independent learning library in Tech for FREE!
    Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
    Renews at €18.99/month. Cancel anytime


openai-reinforcement-learning-giving-robots-human-like-dexterity-img-0

OpenAI blog


You can place a block in the palm of the Shadow Dexterous hand and the Dactyl can reposition it into different orientations. For example, it can rotate the block to put a new face on top.

openai-reinforcement-learning-giving-robots-human-like-dexterity-img-1

OpenAI blog


According to OpenAI, this project completes a full cycle of AI development that OpenAI has been pursuing for the past two years. “We’ve developed a new learning algorithm, scaled it massively to solve hard simulated tasks, and then applied the resulting system to the real world.

You can read more about Dactyl on OpenAI blog. You can also read the research paper for further analysis.

AI beats human again – this time in a team-based strategy game
OpenAI charter puts safety, standards, and transparency first
Introducing Open AI’s Reptile: The latest scalable meta-learning Algorithm on the block