An Efficient Robotic Pushing and Grasping Method in Cluttered Scene

IEEE Transactions on Cybernetics · 2024

Cited by 9

ABS 3

Sheng Yu
Di‐Hua Zhai
Yuanqing Xia
Yuyin Guan

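Summary

The paper proposes EPGNet, an end-to-end pushing and grasping network that uses EfficientNet-B0 and a cross-fusion module to achieve high accuracy and efficiency with fewer parameters. Experiments show that it outperforms single-stage methods and is comparable to multistage methods.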

Abstract

Pushing and grasping (PG) are crucial skills for intelligent robots, enabling them to perform complex grasping tasks in various scenarios. PG methods can be categorized into single-stage and multistage approaches. Single-stage methods are faster but less accurate, while multistage methods offer high accuracy at the expense of time efficiency. To address this trade-off, a novel end-to-end PG method called efficient PG network (EPGNet) is proposed in this article. EPGNet achieves high accuracy and efficiency simultaneously. To optimize performance with fewer parameters, EfficientNet-B0 is used as the backbone of EPGNet. Additionally, a novel cross-fusion module is introduced to enhance network performance in robotic PG tasks. This module fuses local and global features, helping the network handle objects of varying sizes in different scenes. EPGNet consists of two branches that predict pushing and grasping actions, respectively. Both branches are trained simultaneously within a Q-learning framework. Training data is collected through trial and error as the robot performs PG actions. To bridge the gap between simulation and reality, a unique PG dataset is proposed, and a YOLACT network is trained on it to facilitate object detection and segmentation. A comprehensive set of experiments is conducted in simulated environments and real-world scenarios. The results demonstrate that EPGNet outperforms single-stage methods and offers competitive performance compared to multistage methods, all while utilizing fewer parameters. A video is available at https://youtu.be/HNKJjQH0MPc.
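
The two-branch Q-learning setup mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, hypothetically, that each branch outputs a small 2-D map of Q-values (one per candidate location), that the robot executes the primitive and location with the highest value, and that training uses the standard one-step Q-learning target r + gamma * max Q(s', a'). The names `best_action` and `td_target`, the map shapes, and the discount `gamma = 0.5` are illustrative assumptions, not details from the paper.

```python
from typing import List, Tuple

# Hypothetical representation: one Q-value per candidate location.
QMap = List[List[float]]


def best_action(push_q: QMap, grasp_q: QMap) -> Tuple[str, int, int, float]:
    """Return (primitive, row, col, q) with the highest Q-value over both maps."""
    best = ("push", 0, 0, float("-inf"))
    for name, q_map in (("push", push_q), ("grasp", grasp_q)):
        for r, row in enumerate(q_map):
            for c, q in enumerate(row):
                if q > best[3]:
                    best = (name, r, c, q)
    return best


def td_target(reward: float, next_push_q: QMap, next_grasp_q: QMap,
              gamma: float = 0.5, terminal: bool = False) -> float:
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        return reward
    return reward + gamma * best_action(next_push_q, next_grasp_q)[3]


# Toy Q-maps standing in for the outputs of the two branches (assumed values).
push_q = [[0.1, 0.4], [0.2, 0.3]]
grasp_q = [[0.0, 0.2], [0.9, 0.1]]

action = best_action(push_q, grasp_q)      # primitive and location to execute
target = td_target(1.0, push_q, grasp_q)   # regression target for the chosen Q-value
```

In this sketch both branches share a single greedy selection step, so pushing is chosen only when some push location is expected to be worth more than every grasp location.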

Robotics · Computer Vision · Artificial Intelligence · Deep Learning
