(Nanowerk News) Researchers led by the University of California San Diego have developed a new model that trains a quadrupedal robot to perceive its surroundings more clearly in 3D. This improved perception allows the robot to autonomously traverse challenging terrain, including stairs, rocky ground, and paths filled with gaps, while clearing obstacles in its way.
The researchers will present their work (“Neural Volumetric Memory for Visual Locomotion Control”) at the 2023 Conference on Computer Vision and Pattern Recognition (CVPR), which will take place from June 18 to 22 in Vancouver, Canada.
“By giving the robot a better understanding of its environment in 3D, it can be used in more complex real-world environments,” said study senior author Xiaolong Wang, a professor of electrical and computer engineering at the UC San Diego Jacobs School of Engineering.
The robot is equipped with a forward-facing depth camera on its head. The camera is tilted down at an angle that provides a good view of both the scene in front of it and the terrain below.
To improve the robot’s 3D perception, the researchers developed a model that first takes 2D images from the camera and translates them into 3D space. It does this by viewing a short video sequence consisting of the current frame and several previous frames, then extracting pieces of 3D information from each 2D frame. The model also incorporates information about the robot’s leg movements, such as joint angles, joint velocities, and distance from the ground. It compares the information from the previous frames with information from the current frame to estimate the 3D transformation between the past and the present.
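The geometry behind that last step can be illustrated with plain rigid-body transforms. The sketch below is not the paper's model; it only shows, with hypothetical per-frame camera poses (which in the real system would come from the learned estimate), how a relative 3D transformation maps a point seen in a past frame into the current frame:

```python
import numpy as np

def make_pose(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous camera pose from a 3x3 rotation and a 3-vector."""
    pose = np.eye(4)
    pose[:3, :3] = rotation
    pose[:3, 3] = translation
    return pose

def relative_transform(pose_past: np.ndarray, pose_current: np.ndarray) -> np.ndarray:
    """Transform that re-expresses points from the past frame in the current frame."""
    return np.linalg.inv(pose_current) @ pose_past

# Toy example: the robot walks 0.5 m forward along z between frames, no rotation.
past = make_pose(np.eye(3), np.array([0.0, 0.0, 0.0]))
current = make_pose(np.eye(3), np.array([0.0, 0.0, 0.5]))

T_rel = relative_transform(past, current)
point_past = np.array([1.0, 0.0, 2.0, 1.0])  # a 3D point observed in the past frame
point_current = T_rel @ point_past           # the same point, expressed in the current frame
```

Because the camera moved 0.5 m toward the point, the point appears 0.5 m closer in the current frame.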
The model fuses all of that information so it can use the current frame to synthesize the previous frames. As the robot moves, the model checks each synthesized frame against the frame the camera actually captured. If they match, the model knows it has learned the correct representation of the 3D scene; otherwise, it makes corrections until they do.
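A toy version of this check-and-correct loop is sketched below, with a 1D pixel shift standing in for the real learned 3D warp (all names and numbers are illustrative, not from the paper): candidate motions are scored by how well the synthesized past frame matches the captured one, and the best-matching estimate is kept.

```python
import numpy as np

def synthesize_past(current: np.ndarray, shift: int) -> np.ndarray:
    """Warp the current frame by an estimated motion (here, a 1D pixel shift)."""
    return np.roll(current, shift)

def reconstruction_error(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between a synthesized frame and a captured frame."""
    return float(np.mean((a - b) ** 2))

# Toy frames: the camera slid 2 pixels between the past and current frame.
past_frame = np.array([0., 1., 2., 3., 4., 5., 6., 7.])
current_frame = np.roll(past_frame, -2)

# "Correct until it matches": keep the candidate motion whose synthesized
# past frame best agrees with the frame the camera actually captured.
best_shift = min(
    range(-3, 4),
    key=lambda s: reconstruction_error(synthesize_past(current_frame, s), past_frame),
)
```

In the real system the comparison is a learned, self-supervised loss over images rather than an exhaustive search, but the matching principle is the same.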
That 3D representation is used to control the robot’s movement. By synthesizing visual information from the past, the robot can remember what it has seen, as well as the previous actions its legs performed, and use that memory to inform its next movements.
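As a rough sketch of how such a memory could feed a controller, the snippet below concatenates a memory feature with proprioceptive state and the previous action, then maps it to joint targets. The feature sizes are made up, and a single random linear layer stands in for the learned policy network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative inputs (dimensions are assumptions, not from the paper):
memory_feat = rng.standard_normal(32)  # aggregated 3D scene memory feature
proprio = rng.standard_normal(12)      # joint angles, joint velocities, etc.
prev_action = rng.standard_normal(12)  # last commanded joint targets

# Fuse memory, body state, and action history into one observation vector.
obs = np.concatenate([memory_feat, proprio, prev_action])

# Stand-in for the learned policy: one linear layer plus a squashing
# nonlinearity, producing 12 joint targets bounded to [-1, 1].
W = rng.standard_normal((12, obs.size)) * 0.01
action = np.tanh(W @ obs)
```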
“Our approach allows the robot to build a short-term memory of its 3D environment so it can act better,” said Wang.
The new study builds on the team’s previous work, in which researchers developed an algorithm that combines computer vision with proprioception (a sense of movement, direction, speed, location, and touch) to enable four-legged robots to walk and run on uneven ground while avoiding obstacles. The advance here is that by improving the robot’s 3D perception, and combining it with proprioception, the researchers show that the robot can traverse more challenging terrain than before.
“What’s exciting is that we have developed a model that can handle different types of challenging environments,” said Wang. “That’s because we’ve created a better understanding of the 3D environment that makes the robot more versatile across a variety of scenarios.”
However, this approach has limitations. Wang noted that their current model does not guide the robot to a specific goal or destination. When deployed, the robot simply takes a straight path; if it sees an obstacle, it avoids it by walking away along another straight path. “The robot doesn’t control exactly where it goes,” he said. “In the future, we want to incorporate more planning techniques and complete the navigation pipeline.”