Abstract
Robotic arms are currently in the spotlight of the industry of the future, but their efficiency faces significant challenges. Efficient robotic grasping, replacing human work, requires visual support. In this paper, we propose to augment end-to-end deep learning grasping with an object detection model in order to improve the efficiency of grasp pose prediction. The accurate position of an object is difficult to obtain from the depth image due to the absence of labels in the point cloud in an open environment. In our work, the detection information is fused with the depth image to obtain an accurate 3D mask of the point cloud, guiding the classical GraspNet to generate more accurate grippers. The detection-driven 3D mask method also allows the design of a priority scheme that increases the adaptability of grasping scenarios. The proposed grasping method is validated on multiple benchmark datasets, achieving state-of-the-art performance.
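The core fusion step described above, lifting a 2D detection into a 3D mask over the point cloud, can be sketched as a pinhole back-projection of the depth pixels inside a detected bounding box. The function name, the box format `(x0, y0, x1, y1)`, and the intrinsics `fx, fy, cx, cy` below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def mask_point_cloud(depth, box, fx, fy, cx, cy):
    """Back-project depth pixels inside a 2D detection box into a
    3D point cloud in the camera frame (a minimal sketch, assuming
    a pinhole camera model and depth in meters).

    depth: (H, W) depth image; box: (x0, y0, x1, y1) pixel bounds.
    Returns an (N, 3) array of masked 3D points.
    """
    x0, y0, x1, y1 = box
    # Pixel grid restricted to the detection box (the "3D mask" region)
    v, u = np.mgrid[y0:y1, x0:x1]
    z = depth[y0:y1, x0:x1]
    valid = z > 0                      # drop missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    # Standard pinhole back-projection: pixel -> camera coordinates
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

The masked points could then be passed to a grasp network such as GraspNet in place of the full scene cloud, restricting grasp proposals to the detected object.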
References
Papacharalampopoulos A, Makris S, Bitzios A et al (2016) Prediction of cabling shape during robotic manipulation. Int J Adv Manuf Technol 82:123–132. https://doi.org/10.1007/s00170-015-7318-5
Le MT, Lien JJJ (2022) Robot arm grasping using learning-based template matching and self-rotation learning network. Int J Adv Manuf Technol 121(3–4):1915–1926. https://doi.org/10.1007/s00170-022-09374-y
Dang AT, Hsu QC, Jhou YS (2022) Development of human–robot cooperation for assembly using image processing techniques. Int J Adv Manuf Technol 120(5–6):3135–3154. https://doi.org/10.1007/s00170-022-08968-w
Miller AT, Knoop S, Christensen HI et al (2003) Automatic grasp planning using shape primitives. In: 2003 IEEE International Conference on Robotics and Automation 1824–1829. https://doi.org/10.1109/ROBOT.2003.1241860
Jiang Y, Moseson S, Saxena A (2011) Efficient grasping from rgbd images: learning using a new rectangle representation. In: 2011 IEEE International conference on robotics and automation pp 3304–3311. https://doi.org/10.1109/ICRA.2011.5980145
Redmon J, Angelova A (2015) Real-time grasp detection using convolutional neural networks. In: 2015 IEEE international conference on robotics and automation (ICRA) pp. 1316–1322. https://doi.org/10.48550/arXiv.1412.3128
Piater JH (2002) Learning visual features to predict hand orientations. Computer Science Department Faculty Publication Series, p 148. https://doi.org/10.1007/1-84628-102-4_21
Xiang Y, Schmidt T, Narayanan V et al (2017) Posecnn: a convolutional neural network for 6d object pose estimation in cluttered scenes. arXiv preprint arXiv:1711.00199. https://doi.org/10.48550/arXiv.1711.00199
Yan X et al (2018) Learning 6-DOF grasping interaction via deep geometry-aware 3D representations. In: 2018 IEEE International Conference on Robotics and Automation pp. 3766–3773. https://doi.org/10.48550/arXiv.1708.07303
Le TT, Le TS, Chen YR et al (2021) 6D pose estimation with combined deep learning and 3D vision techniques for a fast and accurate object grasping. Robot Auton Syst 141:103775. https://doi.org/10.1016/j.robot.2021.103775
Wolnitza M, Kaya O, Kulvicius T et al (2022) 3D object reconstruction and 6D-pose estimation from 2D shape for robotic grasping of objects. arXiv preprint arXiv:2203.01051. https://doi.org/10.48550/arXiv.2203.01051
Gupta H, Thalhammer S, Leitner M et al (2022) Grasping the inconspicuous. arXiv preprint arXiv:2211.08182. https://doi.org/10.48550/arXiv.2211.08182
Jin M, Li J, Zhang L (2022) DOPE++: 6D pose estimation algorithm for weakly textured objects based on deep neural networks. PloS One 17(6):e0269175. https://doi.org/10.1371/journal.pone.0269175
Huang R, Mu F, Li W et al (2022) Estimating 6D object poses with temporal motion reasoning for robot grasping in cluttered scenes. IEEE Robot Autom Lett. https://doi.org/10.1109/LRA.2022.3147334
ten Pas A, Gualtieri M, Saenko K et al (2017) Grasp pose detection in point clouds. Int J Robot Res 36(13–14): 1455–1473. https://doi.org/10.48550/arXiv.1706.09911
Mahler J, Matl M, Satish V et al (2019) Learning ambidextrous robot grasping policies. Sci Robot 4(26):eaau4984. https://doi.org/10.1126/scirobotics.aau4984
Metzner M, Albrecht F, Fiegert M et al (2022) Virtual training and commissioning of industrial bin picking systems using synthetic sensor data and simulation. Int J Comput Integr Manuf 1–10. https://doi.org/10.1080/0951192X.2021.2004618
Mallick A, del Pobil AP, Cervera E (2018) Deep learning based object recognition for robot picking task. Proceedings of the 12th International Conference on Ubiquitous Information Management and Communication pp. 1–9. https://doi.org/10.1145/3164541.3164628
Zeng A, Song S, Yu KT et al (2022) Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching. Int J Robot Res 41(7): 690–705. https://doi.org/10.48550/arXiv.1710.01330
Zhang Z, Zheng C (2022) Simulation of Robotic Arm Grasping Control Based on Proximal Policy Optimization Algorithm. J Phys: Conf Ser IOP Publishing 2203(1): 012065. https://ma.x-mol.com/paperRedirect/1499629009831354368
Wang T, Chen Y, Qiao M et al (2018) A fast and robust convolutional neural network-based defect detection model in product quality control. Int J Adv Manuf Technol 94:3465–3471. https://doi.org/10.1007/s00170-017-0882-0
Zhengming Li, Jinlong Z (2020) Detection and positioning of grab target based on deep learning. Inf Control 49(2):147–153
Kato H, Nagata F, Murakami Y et al (2022) Partial depth estimation with single image using YOLO and CNN for robot arm control. In: 2022 IEEE International Conference on Mechatronics and Automation (ICMA) pp. 1727–1731. https://doi.org/10.1109/ICMA54519.2022.9856055
Wang CY, Bochkovskiy A, Liao HYM (2022) YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv preprint arXiv:2207.02696. https://doi.org/10.48550/arXiv.2207.02696
Glenn Jocher et al (2021) ultralytics/yolov5: v5.0 – YOLO v5-P6 1280 models, AWS, Supervise.ly and YouTube integrations. https://doi.org/10.5281/ZENODO.4679653
Morrison D, Corke P, Leitner J (2018) Closing the loop for robotic grasping: a real-time, generative grasp synthesis approach. arXiv preprint arXiv:1804.05172. https://doi.org/10.48550/arXiv.1804.05172
Fang HS, Wang C, Fang H et al (2022) AnyGrasp: robust and efficient grasp perception in spatial and temporal domains. arXiv preprint arXiv:2212.08333. https://doi.org/10.48550/arXiv.2212.08333
Mahler J, Liang J, Niyaz S et al (2017) Dex-net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics. arXiv preprint arXiv:1703.09312. https://doi.org/10.48550/arXiv.1703.09312
Mahler J, Matl M, Liu X et al (2018) Dex-net 3.0: Computing robust vacuum suction grasp targets in point clouds using a new analytic model and deep learning. In: 2018 IEEE International Conference on robotics and automation (ICRA) pp. 5620–5627. https://doi.org/10.48550/arXiv.1709.06670
Fang HS, Wang C, Gou M, Lu C (2019) GraspNet: a large-scale clustered and densely annotated dataset for object grasping. arXiv preprint arXiv:1912.13470. https://doi.org/10.48550/arXiv.1912.13470
Fang HS, Wang C, Gou M, Lu C (2020) GraspNet-1Billion: a large-scale benchmark for general object grasping. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition pp. 11444–11453
Bottarel F, Vezzani G, Pattacini U et al (2020) GRASPA 1.0: GRASPA is a robot arm grasping performance benchmark. IEEE Robot Autom Lett 5(2): 836–843. https://doi.org/10.48550/arXiv.2002.05017
Wang C, Fang HS, Gou M et al (2021) Graspness discovery in clutters for fast and accurate grasp detection. Proceedings of the IEEE/CVF International Conference on Computer Vision pp. 15964–15973. https://doi.org/10.1109/ICCV48922.2021.01566
Mehrkish A, Janabi-Sharifi F (2022) Grasp synthesis of continuum robots. Mech Mach Theory 168:104575. https://doi.org/10.1016/j.mechmachtheory.2021.104575
Tung K, Su J, Cai J et al (2022) Uncertainty-based exploring strategy in densely cluttered scenes for vacuum cup grasping. In: 2022 International Conference on Robotics and Automation (ICRA) pp. 3483–3489. https://doi.org/10.1109/ICRA46639.2022.9811599
Ren S, He K, Girshick R et al (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence pp. 1137–1149
Funding
This work is partially funded by the China Scholarship Fund.
Author information
Contributions
All authors equally contributed to the content of this article.
Ethics declarations
Ethics approval
The authors declare that they have no conflicts of interest in the development and publication of this research.
About this article
Cite this article
Li, L., Cherouat, A., Snoussi, H. et al. Detection-driven 3D masking for efficient object grasping. Int J Adv Manuf Technol 129, 4695–4703 (2023). https://doi.org/10.1007/s00170-023-12574-9