Robot beats humans at curling thanks to deep learning program

Sept. 23 (UPI) — Thanks to a new deep learning program, a curling robot, appropriately named Curly, was able to win three out of four matches against curlers from South Korea’s national teams.

Researchers detailed the software behind the breakthrough in a new paper published Wednesday in the journal Science Robotics.

Curling is sometimes described as a hybrid of bowling and chess — on ice. During gameplay, teams of two take turns throwing large “stones” across 150 feet of ice toward a bulls-eye target. The sport requires a combination of precise physical performance and strategic thinking.

“The game of curling can be considered a good testbed for studying the interaction between artificial intelligence systems and the real world,” Seong-Whan Lee, professor of cognitive engineering at Korea University, told UPI in an email.

Often, artificial intelligence systems perform well in simulations but struggle when applied in the real world. The problem is known as the “sim-to-real” gap.

In the computer lab, deep learning systems can learn from millions of actions in repeated simulations.

“In the real world, we may not even be able to perform hundreds of actions for the purpose of learning in each case,” Lee said. “Moreover, the system can never fully replicate the real world.”

Ice is one of many environments that are especially difficult to simulate. And because Curly must physically interact with the environment, simulation is even more difficult.

“In robotics, the sim-to-real gap has to do with visual perception, which means that the simulated world looks different from the real world,” Johannes Andreas Stork told UPI in an email.

Stork was not involved in the research, but wrote a commentary article on it for Science Robotics.

“An example would be that a driverless car would see many more different cars and houses than one could possibly simulate,” said Stork, a professor of machine learning at Örebro University in Sweden.

In curling, with each throw, the ice conditions change. To compete against humans, researchers had to train Curly to judge uncontrollable environmental conditions and adapt.

Researchers supplied Curly with what they call a “deep reinforcement learning” system, a trial-and-error learning system that helps Curly compensate for uncertainties and take adaptive actions. Curly learns from each throw, allowing the robot to make corrections on subsequent throws.
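The idea of correcting from one throw to the next can be illustrated with a much simpler stand-in than deep reinforcement learning. The sketch below is a toy feedback loop, not the researchers' system: it assumes a stone that slows under constant friction, and the target distance, friction values and function names are all hypothetical. After each throw, the robot compares where the stone stopped with where it was aimed and revises its estimate of the ice friction.

```python
G = 9.81  # gravitational acceleration in m/s^2


def simulate_throw(speed, friction):
    """Toy physics: stopping distance of a stone under constant friction."""
    return speed ** 2 / (2.0 * friction * G)


def adaptive_throws(target, true_friction, guess_friction, n_throws=5):
    """Throw repeatedly, revising the friction estimate after every throw.

    Returns the signed distance error (metres) of each throw.
    """
    friction_est = guess_friction
    errors = []
    for _ in range(n_throws):
        # Release speed that would reach the target if the estimate were right
        speed = (2.0 * friction_est * G * target) ** 0.5
        distance = simulate_throw(speed, true_friction)
        errors.append(distance - target)
        # Friction implied by the observed stopping distance
        implied = speed ** 2 / (2.0 * G * distance)
        # Move the estimate halfway toward the observation (hypothetical step size)
        friction_est = 0.5 * friction_est + 0.5 * implied
    return errors


# Hypothetical numbers: the robot assumes the ice is "slower" than it really is
errors = adaptive_throws(target=30.0, true_friction=0.02, guess_friction=0.03)
```

In this toy setup the first throw overshoots, and each later throw lands closer to the target as the estimate improves. The real system faces ice that keeps changing and conditions it cannot measure directly, which is why the researchers turned to a learned policy rather than a fixed rule like this one.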

“It is no longer necessary to identify the exact conditions on the ice sheet explicitly and therefore it is not necessary to [have] a simulation that is exactly like the real world,” Stork said. “The simulation only had to change during training such that the policy has to learn to adapt. This is how the sim-to-real gap is addressed in this work.”
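Stork's point, that the simulation only has to vary during training, resembles what machine learning researchers often call domain randomization. The sketch below is a deliberately small illustration of that general idea and should not be read as the method used in the paper. It assumes the same toy friction physics, a one-parameter "policy" (a correction gain) and a simple search over candidate gains in place of deep reinforcement learning. All names and numbers are hypothetical.

```python
import random

G = 9.81           # gravitational acceleration in m/s^2
TARGET = 30.0      # hypothetical target distance in metres
NOMINAL_MU = 0.02  # hypothetical friction the robot assumes before any throw


def run_episode(gain, mu, n_throws=4):
    """Return the absolute distance error of each throw on ice with friction mu."""
    speed = (2.0 * NOMINAL_MU * G * TARGET) ** 0.5
    errors = []
    for _ in range(n_throws):
        distance = speed ** 2 / (2.0 * mu * G)
        error = distance - TARGET
        errors.append(abs(error))
        # Adaptive policy: slow down after an overshoot, speed up after a short throw
        speed *= 1.0 - gain * error / TARGET
    return errors


def train_gain(n_episodes=200, seed=0):
    """Pick the gain with the lowest total error across randomized ice conditions."""
    rng = random.Random(seed)
    # Every training episode gets a different, randomly drawn friction value
    frictions = [rng.uniform(0.01, 0.03) for _ in range(n_episodes)]
    candidates = [i / 10 for i in range(11)]  # gains 0.0, 0.1, ..., 1.0
    return min(
        candidates,
        key=lambda k: sum(sum(run_episode(k, mu)) for mu in frictions),
    )


best_gain = train_gain()
# Try the trained policy on one particular sheet it was never tuned for
held_out = run_episode(best_gain, mu=0.025)
```

Because no single friction value is correct during training, the only way for the policy to score well is to react to the errors it observes. That is the sense in which the simulator does not need to match the real ice exactly.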

When the research team, scientists and engineers from Germany and Korea, combined their deep reinforcement learning system with a separate, previously developed strategy planning model, their artificial intelligence curling robot system was able to outperform expert curlers.

“We succeeded not only in terms of strategic planning but also with respect to the real-time adaptation within the real curling game setting,” Lee said.

Unlike human curling teams, which have three members, Curly uses only two robots — no sweepers. Curly relies on a skipper, the component in charge of aiming strategy, and a thrower, the component in charge of throwing mechanics.

Using the new artificial intelligence system, the two components communicate to identify throwing errors, interpret changing ice conditions and make adjustments accordingly, all while accounting for a strategy that shifts with each stone thrown by the human team.

Scientists hope their new deep reinforcement learning system can be adapted for a variety of complex real-world applications, including drone navigation.

“The approach presented in this work is suitable for problems where we have a task or environment that is changing over time and where the exact conditions are difficult to perceive from sensor data,” Stork said. “An industry-related scenario that I currently work with would be a plant for processing ore from a mine.”

“The composition of the ore changes over time, depending from where it was mined and the plant’s operation has to adapt to these changes,” he said. “Here, it is too expensive to analyze in a chemical lab on a continual basis.”
