CN118721200A - Visual servo control method, device and storage medium for dual-arm collaborative robot - Google Patents
- Publication number
- CN118721200A (application CN202410934517.1A)
- Authority
- CN
- China
- Prior art keywords
- camera
- arm
- coordinate system
- calculating
- target object
- Prior art date
- Legal status: Pending (assumed; not a legal conclusion)
Classifications
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1679—Programme controls characterised by the tasks executed
- B25J9/1682—Dual arm manipulator; Coordination of several manipulators
- B25J9/1692—Calibration of manipulator
- B25J9/1694—Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
- B25J9/1697—Vision controlled systems
Abstract
The present invention provides a visual servo control method, device and storage medium for a dual-arm collaborative robot, relating to the field of mechanical automation. The method includes: acquiring image information from a binocular vision system; establishing the coordinate relationship between the robotic arms and the target object based on camera calibration and hand-eye calibration; performing a static binocular vision calibration only once, at the initial pose, and thereafter obtaining the spatial three-dimensional coordinates of the target object through coordinate-system mapping relationships, thereby realizing dynamic binocular vision; recognizing the target object with an improved YOLO object detection algorithm; determining the position of the target object from the detection result; determining the master-slave relationship between the first and second robotic arms according to that position; controlling the master arm to reach the grasping pose and controlling the slave arm to carry its camera to an observation pose determined by the pose of the master arm; and performing optimal pose estimation of the target object and controlling the master arm to grasp it accurately. The invention improves the robustness and accuracy of a dual-arm collaborative robot in dynamic environments.
Description
Technical Field
The present invention relates to the field of mechanical automation, and in particular to a visual servo control method, device and storage medium for a dual-arm collaborative robot.
Background Art
Because robot working environments are diverse and tasks are increasingly complex, single-arm robots, constrained by the environment and by their own structure, are often unable to perform complex tasks. Compared with single-arm robots, dual-arm robots are more flexible and more versatile, adapt better to changes in the working environment, and can handle a wider variety of tasks. A dual-arm robot is not simply two single-arm robots placed side by side; it completes tasks cooperatively through an effective coordinated control mechanism. Since the first dual-arm robot appeared, dual-arm robots have been understood as robot systems with high flexibility and strong cooperative capability; they have become one of the most active research topics in robotics and a major trend in its future development.
To enhance a robot's ability to identify targets accurately and make correct decisions in complex, changing environments, existing systems integrate various sensors. Among them, the vision sensor is the most critical, because it provides real-time information about the surroundings and enables the robot to perceive and understand the shape, position and features of objects. Depending on where the camera is mounted relative to the robot, configurations are divided into "eye-in-hand" (EIH) and "eye-to-hand" (ETH). Robots equipped with vision sensors can interact with their environment more effectively and perform a wide variety of tasks. The integration of vision makes robots more intelligent, better able to adapt to complex and changing working environments, and able to collaborate more harmoniously with humans to complete complex and varied tasks. Vision has therefore become a key technology in robot applications, and the vision sensor a key component of robot intelligence.
A multi-camera vision system can measure an object only if the intrinsic parameters of the cameras and the relative pose between the two cameras are known; after calibration, the camera poses in the system must not change, so a conventional multi-camera system is static. If the relative pose between the two cameras changes while a task is being executed, the system must be recalibrated, otherwise the task cannot be completed. In complex, dynamic environments this fixed multi-camera arrangement is therefore severely limited, making the robot imprecise and unstable when performing tasks.
Consequently, how to improve the robustness and accuracy of dual-arm collaborative robots in dynamic environments has become an urgent technical problem.
Summary of the Invention
The present invention aims to solve at least one of the technical problems existing in the prior art or related art, and discloses a visual servo control method, device and storage medium for a dual-arm collaborative robot that improve the robustness and accuracy of the robot in dynamic environments.
A first aspect of the invention discloses a visual servo control method for a dual-arm collaborative robot, comprising: acquiring image information of a target object captured by a first camera mounted at the end of a first robotic arm and a second camera mounted at the end of a second robotic arm; establishing the coordinate relationship between the robotic arms and the target object based on camera calibration and hand-eye calibration; performing a static binocular vision calibration only once, at the initial pose, and thereafter obtaining the spatial three-dimensional coordinates of the target object through coordinate-system mapping relationships, thereby realizing dynamic binocular vision; recognizing the target object with an improved YOLO object detection algorithm, formed by adding an ECA attention mechanism and an SA attention mechanism to the YOLO-v5s model; determining the position of the target object from the detection result; determining the master-slave relationship between the first robotic arm and the second robotic arm according to that position; controlling the master arm to reach the grasping pose, and controlling the slave arm to carry its camera to an observation pose determined by the pose of the master arm; estimating the optimal pose of the target object using a position-based visual servo control method combined with the EK-SVSF pose estimation algorithm and an OWA-based stepwise data fusion algorithm; and, according to the optimal pose estimate, controlling the master arm to correct the deviation through joint motion so as to grasp the target object accurately.
According to the visual servo control method for a dual-arm collaborative robot disclosed in the present invention, preferably, the first camera and the second camera are depth cameras that move with the free ends of the first robotic arm and the second robotic arm, respectively.
According to the visual servo control method for a dual-arm collaborative robot disclosed in the present invention, preferably, the calculation process of camera calibration includes: fixing the camera and continuously capturing images of a calibration board; calculating the pixel coordinates of the corner points; calculating the homography matrix; solving for the intrinsic parameter matrix; solving for the extrinsic parameter matrix; solving for the lens distortion coefficients; and, based on a minimum reprojection-error objective, iterating with maximum likelihood estimation to optimize all parameters obtained from the camera calibration.
According to the visual servo control method for a dual-arm collaborative robot disclosed in the present invention, preferably, the calculation process of hand-eye calibration includes: calculating the mapping matrix from the arm end-effector coordinate system to the base coordinate system; calculating the mapping matrix from the world coordinate system to the camera coordinate system; and calculating the mapping matrix from the camera coordinate system to the arm end-effector coordinate system.
According to the visual servo control method for a dual-arm collaborative robot disclosed in the present invention, preferably, the calculation process of dynamic binocular vision includes: calculating the mapping from the first camera to the base coordinate system at the initial pose; calculating the mapping from the base coordinate system to the first camera at the n-th pose; calculating the mapping from the world coordinate system to the base coordinate system; calculating the mapping from the world coordinate system to the first camera at the n-th pose; calculating the mapping from the world coordinate system to the second camera at the n-th pose; calculating the mapping from the second camera to the first camera at the n-th pose; calculating the dynamic baseline at the n-th pose from the relative translation vector of the first and second cameras; obtaining depth information from the dynamic baseline according to the triangulation principle; and obtaining the three-dimensional coordinates of the spatial point from the depth information and the projection of the target point on the image plane of the second camera, using the camera intrinsics and the projection relationship.
According to the visual servo control method for a dual-arm collaborative robot disclosed in the present invention, preferably, the improved YOLO target detection algorithm specifically includes: adding an ECA module and an SA module before every convolution block of the Neck layer of the YOLO-v5s model.
According to the visual servo control method for a dual-arm collaborative robot disclosed in the present invention, preferably, the method further includes: replacing the nearest-neighbor interpolation in the upsampling module of the YOLO-v5s model with deconvolution, with a convolution kernel size of 4 and a stride of 2, back-propagating gradients through learned parameters so that low-resolution feature maps are mapped back to the high-resolution space.
According to the visual servo control method for a dual-arm collaborative robot disclosed in the present invention, preferably, the step of determining the master-slave relationship between the first robotic arm and the second robotic arm specifically includes: determining the center point of the target object through the vision system; calculating the Euclidean distance Ll between the target center point and the first robotic arm and the Euclidean distance Lr between the target center point and the second robotic arm; when Ll < Lr, selecting the first robotic arm as the master arm and the second robotic arm as the slave arm; and when Ll > Lr, selecting the second robotic arm as the master arm and the first robotic arm as the slave arm.
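As an illustrative, non-limiting sketch of this master-slave selection rule (the function and variable names below are hypothetical, not taken from the patent), the choice reduces to a Euclidean-distance comparison between the target center and each arm's base position:

```python
import numpy as np

def choose_master_arm(target_center, left_base, right_base):
    """Pick the master arm as the one whose base is closer to the target center.

    target_center, left_base, right_base: 3D points (x, y, z) in the same
    world/base coordinate system, e.g. obtained from the binocular vision system.
    Returns "left" or "right" for the master arm; the other arm is the slave.
    """
    target = np.asarray(target_center, dtype=float)
    l_l = np.linalg.norm(target - np.asarray(left_base, dtype=float))   # Ll
    l_r = np.linalg.norm(target - np.asarray(right_base, dtype=float))  # Lr
    return "left" if l_l < l_r else "right"

# Example: the target is closer to the right arm base, so the right arm is the master.
print(choose_master_arm((0.4, -0.3, 0.2), (0.0, 0.35, 0.0), (0.0, -0.35, 0.0)))
```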
A second aspect of the invention discloses a visual servo control device for a dual-arm collaborative robot, comprising: a memory for storing program instructions; and a processor for invoking the program instructions stored in the memory to implement the visual servo control method for a dual-arm collaborative robot of any of the above technical solutions.
A third aspect of the invention discloses a computer-readable storage medium storing program code, the program code being used to implement the visual servo control method for a dual-arm collaborative robot of any of the above technical solutions.
The beneficial effects of the invention include at least the following. A dynamic binocular vision calibration method with stronger adaptability to environmental changes is proposed: target localization is achieved through a conventional binocular calibration performed once plus coordinate-system mapping relationships, and it exhibits strong robustness and accuracy in dynamic environments. To address the long inference time of YOLO-v5s in dynamic environments and its tendency to lose image information, an improved YOLO object detection algorithm is proposed that introduces ECA and SA modules into the Neck network and replaces the nearest-neighbor interpolation in upsampling with deconvolution, effectively improving detection accuracy. A position-based master-slave planning strategy is proposed, and a position-based visual servo control method combined with the EK-SVSF pose estimation algorithm and an OWA-based stepwise data fusion algorithm is used to obtain the optimal pose of the target object, improving the grasping accuracy of the manipulator.
Brief Description of the Drawings
FIG. 1 is a schematic workflow diagram of a visual servo control method for a dual-arm collaborative robot according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the coordinate systems involved in the camera imaging process according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of the mapping from the world coordinate system to the pixel coordinate system according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of hand-eye calibration according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of the dynamic binocular vision calibration model according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of the triangulation principle according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of the network model of the improved YOLO object detection algorithm according to an embodiment of the present invention.
FIG. 8 is a schematic flow chart of the position-based visual servo control method according to an embodiment of the present invention.
FIG. 9 is a schematic diagram of the OWA-based stepwise data fusion process according to an embodiment of the present invention.
FIG. 10 is a schematic block diagram of a visual servo control device for a dual-arm collaborative robot according to an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be understood more clearly, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Many specific details are set forth in the following description to facilitate a full understanding of the invention; however, the invention can also be implemented in ways other than those described here, and it is therefore not limited to the specific embodiments disclosed below.
According to an embodiment of the invention, the disclosed visual servo control method for a dual-arm collaborative robot comprises: acquiring image information of a target object captured by a first camera mounted at the end of a first robotic arm and a second camera mounted at the end of a second robotic arm; establishing the coordinate relationship between the robotic arms and the target object based on camera calibration and hand-eye calibration; performing a static binocular vision calibration only once, at the initial pose, and thereafter obtaining the spatial three-dimensional coordinates of the target object through coordinate-system mapping relationships, thereby realizing dynamic binocular vision; recognizing the target object with an improved YOLO object detection algorithm, formed by adding an ECA attention mechanism and an SA attention mechanism to the YOLO-v5s model; determining the position of the target object from the detection result; determining the master-slave relationship between the first and second robotic arms according to that position; controlling the master arm to reach the grasping pose, and controlling the slave arm to carry its camera to an observation pose determined by the pose of the master arm; estimating the optimal pose of the target object using a position-based visual servo control method combined with the EK-SVSF pose estimation algorithm and an OWA-based stepwise data fusion algorithm; and, according to the optimal pose estimate, controlling the master arm to correct the deviation through joint motion so as to grasp the target object accurately.
In this embodiment, as shown in FIG. 1, the method comprises three main stages. (1) Initialization: before the control system runs, some preparation is required, including calibration of the vision system and parameter configuration. Calibration of the vision system includes camera calibration, hand-eye calibration and binocular vision calibration; parameter configuration includes the camera intrinsics, the structural and motion parameters of the robotic arms, and the communication parameters between each component and the host. (2) Recognition and localization: the arm can only grasp a target it can find, so the cameras are controlled to capture images of the target, the host recognizes the target with the object detection algorithm, and the three-dimensional coordinates of the target center are obtained through binocular vision calibration. (3) Visual servo control: the master-slave relationship of the two arms is determined by the position-based master-slave planning strategy; the position-based servo control method drives the dual-arm collaborative robot; the primary and secondary cameras capture images of the target and estimate its pose; the optimal pose is obtained by the data fusion algorithm; and the master arm is controlled to grasp accurately. Overall, the cameras recognize and localize the target and send its position to the main control system; the main control system determines the master-slave relationship from the distances between the left and right arm bases and the target position; it then controls the master arm to bring its end-mounted camera close to the target for pose estimation, while the slave arm, according to the master arm's pose, moves its end-mounted camera into the optimal observation pose and completes the pose estimation of the master arm and the target; the two pose estimates are sent back to the main control system for data fusion, yielding the optimal pose of the target, with which the master arm is controlled to grasp accurately.
According to the above embodiment, preferably, the dual-arm system uses RM65 ultra-lightweight six-axis robotic arms equipped with EG2-4C two-finger electric grippers, the vision system uses RealSense D435 cameras, and the main control system uses an NVIDIA Jetson Nano; the software framework is built on ROS.
According to the above embodiment, the specific process and principle of camera calibration are as follows. The camera imaging process converts three-dimensional space points into two-dimensional image points, and camera calibration is the inverse process of solving for the parameters of the camera imaging model. These parameters include intrinsic parameters related to the internal structure of the camera, such as the principal point and focal length, and extrinsic parameters related to the camera's position, such as rotation and translation. The imaging process involves the transformations between four coordinate systems, as shown in FIG. 2: the pixel coordinate system o-uv, the image coordinate system O-xy, the camera coordinate system Oc-XcYcZc and the world coordinate system Ow-XwYwZw. P is a point in space, its image point on the camera image is p(x, y), and the distance between Oc and O is the focal length f. Solving for the camera imaging model parameters can be understood as the mapping from the world coordinate system to the pixel coordinate system; as shown in FIG. 3, it consists of three parts: a rigid-body transformation, a perspective projection and an affine transformation. The mapping can be expressed by equation (3.1).
In equation (3.1), (u, v) are the pixel coordinates; A is the intrinsic parameter matrix of the calibration, which involves the focal length f, the principal point coordinates (u0, v0) and the distortion coefficients; T is the extrinsic parameter matrix, composed of the rotation matrix R and the translation vector t; dX and dY are the physical dimensions of one pixel on the camera sensor in the X and Y directions; and θ is the angle between the horizontal and vertical axes of the sensor, which is 90° in the ideal case.
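As a hedged reconstruction (the patent gives equation (3.1) only as an image), a standard pinhole-projection form consistent with the symbols defined above would be:

```latex
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \underbrace{\begin{bmatrix}
\dfrac{f}{dX} & -\dfrac{f\cot\theta}{dX} & u_0 \\[4pt]
0 & \dfrac{f}{dY\sin\theta} & v_0 \\[4pt]
0 & 0 & 1
\end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} R & t \end{bmatrix}}_{T}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}
\qquad (3.1)
```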
In the actual imaging process, however, physical factors in lens manufacturing and use cause the camera's real behaviour to deviate from the ideal pinhole model, so object shapes are distorted or stretched in the image, affecting the faithful representation of geometry and size; this is lens distortion. Let (x̂, ŷ) denote the distorted image coordinates and (x, y) the undistorted image coordinates. The mathematical model of radial distortion is given by equation (3.2), where r is the distance from an image point to the image center, i.e. r² = x² + y², and k1, k2 and k3 are the radial distortion coefficients. The mathematical model of tangential distortion is given by equation (3.3), where p1 and p2 are the tangential distortion coefficients.
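The images for equations (3.2) and (3.3) are likewise not reproduced in this text; under the usual Brown-Conrady convention, with (x̂, ŷ) the distorted and (x, y) the ideal coordinates, the standard radial and tangential models consistent with the coefficients named above are:

```latex
% Radial distortion (3.2)
\hat{x} = x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \qquad
\hat{y} = y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)

% Tangential distortion (3.3)
\hat{x} = x + \big(2 p_1 x y + p_2 (r^2 + 2x^2)\big), \qquad
\hat{y} = y + \big(p_1 (r^2 + 2y^2) + 2 p_2 x y\big)
```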
The present invention uses Zhang Zhengyou's camera calibration method. To bring the camera parameters as close as possible to their true values, this method first obtains initial values by linear modelling and then corrects them by nonlinear modelling, yielding parameters with low error. The specific steps are as follows.
(1) Image acquisition: fix the camera and continuously capture images of the calibration board. To ensure diverse and accurate data, the board must remain within the camera's field of view while its distance, angle and orientation relative to the camera are varied continuously during acquisition.
(2) Compute the corner pixel coordinates: use an image detection algorithm to compute the pixel coordinates of every corner point on the calibration board.
(3) Compute the homography matrix M: in Zhang's calibration the homography is the product of the intrinsic matrix A and the extrinsic matrix T. Assuming the Xw and Yw axes of the world coordinate system are parallel to the calibration-board plane and the Zw axis is perpendicular to it, the Z value of every board point in world coordinates is 0, so equation (3.1) can be rewritten as equation (3.4). Eliminating the scale factor Zc yields the mapping from the world coordinate system to the pixel coordinate system, as shown in equation (3.5).
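The images for equations (3.4) and (3.5) are not reproduced here; a reconstruction consistent with Zhang's method and the notation used below (R1, R2 the first two columns of R; M1, M2, M3 the columns of M) would be:

```latex
% (3.4): planar case with Z_w = 0
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= A \begin{bmatrix} R_1 & R_2 & t \end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix}

% (3.5): homography between the board plane and the image plane
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
\simeq M \begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix},
\qquad M = A \begin{bmatrix} R_1 & R_2 & t \end{bmatrix}
```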
理论上四个角点就可以求得单应性矩阵M,但是由于噪音等因素的影响,一般会利用最小二乘法对尽量多的角点进行单应性矩阵估计,从而求得最佳的单应性矩阵M。Theoretically, the homography matrix M can be obtained from four corner points. However, due to the influence of factors such as noise, the least squares method is generally used to estimate the homography matrix for as many corner points as possible to obtain the optimal homography matrix M.
(4)求内参矩阵A:在张正友标定算法中,一般使用式(3.6)作为相机标定的内参矩阵。其中,γ为扭曲参数,α和β为单位距离上像素的个数,(u0,v0)为像主点坐标。(4) Find the intrinsic parameter matrix A: In Zhang Zhengyou's calibration algorithm, formula (3.6) is generally used as the intrinsic parameter matrix for camera calibration. Among them, γ is the distortion parameter, α and β are the number of pixels per unit distance, and (u 0 ,v 0 ) is the coordinate of the image principal point.
令B=A-TA-1,如式(3.7)所示,可以看出矩阵B为对称矩阵,只有6个未知量,因此理论上三张标定板图片就可以求得矩阵B。Let B = A-TA-1, as shown in formula (3.7), it can be seen that matrix B is a symmetric matrix with only 6 unknown quantities. Therefore, theoretically, matrix B can be obtained by three calibration plate images.
但是为了提高标定结果的稳定性和准确性,得到更可靠的相机内参矩阵,一般会利用最小二乘法对尽量多的标定板图片进行拟合。利用矩阵分解算法得到A和A-1,A中未知量的表达式如式(3.8)所示。However, in order to improve the stability and accuracy of the calibration results and obtain a more reliable camera intrinsic parameter matrix, the least squares method is generally used to fit as many calibration plate images as possible. Using the matrix decomposition algorithm, A and A -1 are obtained. The expression of the unknown quantity in A is shown in formula (3.8).
(5)求外参矩阵:根据A可以得到相机标定的外参矩阵,如式(3.9)所示。其中,λ=1/‖A-1M1‖=1/‖A-1M2‖,(5) Obtaining the extrinsic parameter matrix: Based on A, we can obtain the extrinsic parameter matrix of the camera calibration, as shown in equation (3.9). Where, λ = 1/‖A -1 M 1 ‖ = 1/‖A -1 M 2 ‖,
[R t]=[R1 R2 R3 t]=[λA-1M1λA-1M2 R1×R2λA-1M3](3.9)[R t]=[R 1 R 2 R 3 t]=[λA -1 M 1 λA -1 M 2 R 1 ×R 2 λA -1 M 3 ](3.9)
(6)求镜头畸变系数:在张正友标定方法中加入了低阶的径向畸变,其模型的矩阵形式如式(3.10)所示,其中(u^,^v)是存在畸变的像素坐标。(6) Calculate the lens distortion coefficient: Low-order radial distortion is added to Zhang Zhengyou’s calibration method. The matrix form of the model is shown in equation (3.10), where (u^,^v) is the pixel coordinate with distortion.
将式(3.10)简写成Dk=d,利用最小二乘法求解得到畸变系数,如式(3.11)所示。Formula (3.10) is simplified as Dk=d, and the distortion coefficient is obtained by using the least squares method, as shown in formula (3.11).
k=(DTD)-1DTd (3.11)k=(D T D) -1 D T d (3.11)
(7)优化参数:通过式(3.12)的重投影误差最小函数,利用极大似然估计进行迭代[54],对相机标定得到的所有参数进行优化,其中为角点Mij对应的像素坐标。(7) Optimizing parameters: All parameters obtained by camera calibration are optimized by iterating the reprojection error minimization function of equation (3.12) using maximum likelihood estimation [54] , where is the pixel coordinate corresponding to the corner point Mij .
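As an illustrative sketch (not part of the patent), the complete Zhang-style pipeline of steps (1) to (7) is available in OpenCV; the chessboard size, square size and file path below are assumptions made for the example:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)        # inner chessboard corners (assumed)
square = 0.025          # board square size in metres (assumed)

# World coordinates of the board corners, all on the Z_w = 0 plane as in step (3).
obj = np.zeros((pattern[0] * pattern[1], 3), np.float32)
obj[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_pts, img_pts, image_size = [], [], None
for path in glob.glob("calib/*.png"):                        # images from step (1)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    image_size = gray.shape[::-1]
    found, corners = cv2.findChessboardCorners(gray, pattern)    # step (2)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_pts.append(obj)
        img_pts.append(corners)

# Steps (3)-(7): homographies, intrinsic matrix A, extrinsics [R|t], distortion
# coefficients and the reprojection-error refinement are performed internally.
rms, A, dist, rvecs, tvecs = cv2.calibrateCamera(obj_pts, img_pts, image_size, None, None)
print("RMS reprojection error:", rms)
print("Intrinsic matrix A:\n", A)
print("Distortion coefficients (k1, k2, p1, p2, k3):", dist.ravel())
```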
As shown in FIG. 4, according to the above embodiment, the specific process and principle of hand-eye calibration are as follows. The invention uses an eye-in-hand (EIH) system: the robotic arm is moved to two different poses to observe the calibration board, and the board is kept within the camera's field of view.
In the EIH configuration the camera (Camera) is fixed relative to the arm end (End) and moves relative to the base (Base), so the quantity to be determined is the mapping matrix from Camera to End, denoted eTc. In addition, the mapping from End to Base is denoted bTe, the mapping from World to Camera is denoted cTw, and the mapping from World to Base is denoted bTw.
Let P be a point in space whose coordinates in the Base, World, Camera and End frames are Pb(Xb, Yb, Zb), Pw(Xw, Yw, Zw), Pc(Xc, Yc, Zc) and Pe(Xe, Ye, Ze), respectively. The procedure for solving eTc is as follows.
(1) Mapping matrix bTe from End to Base
Pb is obtained from Pe by a rigid transformation, as in equation (3.13); the rotation matrix bRe = RzRyRx and the translation vector bte can be obtained from the forward kinematics of the robotic arm:
P_b = {}^{b}R_e\,P_e + {}^{b}t_e \qquad (3.13)
Writing equation (3.13) in homogeneous matrix form gives equation (3.14). The transformation from Pe to Pb can then be expressed by equation (3.15), and the End-to-Base mapping matrix bTe by equation (3.16).
(2) Mapping matrix cTw from World to Camera
Pc is obtained from Pw through the extrinsic matrix of the camera calibration, as in equation (3.17), where cRw and ctw are the rotation matrix and translation vector of Pw relative to Pc:
P_c = {}^{c}R_w\,P_w + {}^{c}t_w \qquad (3.17)
Converting equation (3.17) into matrix form in the same way gives the transformation of Pw relative to Pc, equation (3.18); the World-to-Camera mapping matrix cTw is then expressed by equation (3.19).
P_c = {}^{c}T_w\,P_w \qquad (3.18)
(3) Mapping matrix eTc from Camera to End
Let the transformation of Pc relative to Pe be equation (3.20) and the transformation of Pw relative to Pb be equation (3.21):
P_e = {}^{e}T_c\,P_c \qquad (3.20)
P_b = {}^{b}T_w\,P_w \qquad (3.21)
Combining equations (3.15), (3.18) and (3.20) gives equation (3.22):
P_b = {}^{b}T_e\,P_e = {}^{b}T_e\,{}^{e}T_c\,P_c = {}^{b}T_e\,{}^{e}T_c\,{}^{c}T_w\,P_w \qquad (3.22)
From equations (3.21) and (3.22) the relationship between the mapping matrices follows, as in equation (3.23):
{}^{b}T_w = {}^{b}T_e\,{}^{e}T_c\,{}^{c}T_w \qquad (3.23)
Because the relative position of the base and the calibration board does not change, and the relative position of the gripper and the camera does not change, bTw and eTc in equation (3.23) are both constant. Keeping the board within the camera's field of view and moving the arm to two different poses therefore yields equation (3.24). Equation (3.24) is the classical "AX = XB" problem, in which X = eTc is the transformation matrix to be determined, the relative End-to-Base motion is obtained from the robot's forward kinematics, and the relative camera motion is obtained from the extrinsic matrices of the camera calibration. Solving it with the method proposed by Tsai yields the transformation eTc of the Camera relative to the End, equation (3.25), which is the result of the hand-eye calibration.
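As an illustrative, non-authoritative sketch of this AX = XB step, OpenCV's calibrateHandEye solves it from paired end-effector and board poses, with Tsai's method as one selectable variant; the ground-truth transforms below are synthetic assumptions used only to fabricate self-consistent example data:

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)

def homog(rvec, t):
    """Build a 4x4 homogeneous transform from a rotation vector and a translation."""
    R, _ = cv2.Rodrigues(np.asarray(rvec, float).reshape(3, 1))
    H = np.eye(4)
    H[:3, :3], H[:3, 3] = R, np.asarray(t, float)
    return H

# Synthetic ground truth: X = eTc (Camera -> End) and bTw (World -> Base), cf. Eq. (3.23).
X = homog([0.1, -0.2, 0.3], [0.05, 0.0, 0.10])
bTw = homog([0.0, 0.3, -0.1], [0.50, 0.20, 0.0])

R_g2b, t_g2b, R_t2c, t_t2c = [], [], [], []
for _ in range(5):                                                     # five robot poses
    bTe = homog(rng.normal(size=3) * 0.5, rng.normal(size=3) * 0.3)    # forward kinematics
    cTw = np.linalg.inv(X) @ np.linalg.inv(bTe) @ bTw                  # camera extrinsics
    R_g2b.append(bTe[:3, :3]); t_g2b.append(bTe[:3, 3].reshape(3, 1))
    R_t2c.append(cTw[:3, :3]); t_t2c.append(cTw[:3, 3].reshape(3, 1))

# Solve AX = XB with Tsai's method; the result should recover X = eTc.
R_c2g, t_c2g = cv2.calibrateHandEye(R_g2b, t_g2b, R_t2c, t_t2c,
                                    method=cv2.CALIB_HAND_EYE_TSAI)
eTc = np.eye(4); eTc[:3, :3] = R_c2g; eTc[:3, 3] = t_c2g.ravel()
print(eTc)
```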
As shown in FIG. 5, according to the above embodiment, the invention further discloses the algorithm that realizes dynamic binocular vision.
The dynamic binocular vision calibration model is shown in FIG. 5. The pose before the cameras move is called the initial pose, and a pose reached during the motion is called the n-th pose. FIG. 5 marks the optical centers of the left and right cameras at the initial pose and at the n-th pose, as well as the image points of the space point P on the left and right image planes at the initial pose and at the n-th pose.
In dynamic binocular vision the key is to obtain the dynamic baseline and the depth information. In the notation below, R-Camera and L-Camera denote the right and left cameras, R-Base and L-Base the bases of the arms carrying them, Prc, Plc, Pre, Ple, Prb, Plb and Pw the coordinates of a point P in the right/left camera, right/left end, right/left base and world frames, and rbTrc(0), rc(n)Trb, etc. the corresponding mapping matrices, where (0) refers to the initial pose and (n) to the n-th pose. The calculation proceeds as follows.
(1) Compute the mapping rbTrc(0) from R-Camera to R-Base at the initial pose. From equations (3.15) and (3.20), the transformation of Prc relative to Prb at the initial pose is obtained as equation (3.29), where rbRrc(0) and rbtrc(0) are the rotation matrix and translation vector of Prc relative to Prb at the initial pose, rbRre(0) and rbtre(0) those of Pre relative to Prb, and re(0)Rrc(0) and re(0)trc(0) those of Prc relative to Pre. The R-Camera-to-R-Base mapping rbTrc(0) at the initial pose is then expressed by equation (3.30).
(2) Compute the mapping rc(n)Trb from R-Base to R-Camera at the n-th pose. In the same way as for rbTrc(0), the transformation of Prb relative to Prc at the n-th pose is obtained as equation (3.31), where rc(n)Rrb and rc(n)trb are the rotation matrix and translation vector of Prb relative to Prc at the n-th pose. The R-Base-to-R-Camera mapping rc(n)Trb at the n-th pose is then expressed by equation (3.32).
(3) Compute the mapping rbTw from World to R-Base. Using the conventional binocular vision calibration method, the transformation of Pw relative to Prc at the initial pose is obtained as equation (3.33), in which the rotation matrix and translation vector of the right camera relative to the calibration object, obtained by camera calibration at the initial pose, appear [59]. Combining equations (3.29) and (3.33) gives the transformation of Pw relative to Prb, equation (3.34), where rbRw and rbtw are the rotation matrix and translation vector of Pw relative to Prb. The World-to-R-Base mapping rbTw is then expressed by equation (3.35).
(4) Compute the mapping rc(n)Tw from World to R-Camera at the n-th pose. Combining equations (3.31) and (3.34) gives the transformation of Pw relative to Prc at the n-th pose, equation (3.36), where rc(n)Rw and rc(n)tw are the rotation matrix and translation vector of Pw relative to Prc at the n-th pose. The World-to-R-Camera mapping rc(n)Tw at the n-th pose is then expressed by equation (3.37).
(5) Compute the mapping lc(n)Tw from World to L-Camera at the n-th pose. In the same way as for rc(n)Tw, the transformation of Pw relative to Plc at the n-th pose is obtained as equation (3.38), where lc(n)Rw and lc(n)tw are the rotation matrix and translation vector of Pw relative to Plc at the n-th pose, lc(n)Rlb and lc(n)tlb those of Plb relative to Plc, and lbRw and lbtw those of Pw relative to Plb. The World-to-L-Camera mapping lc(n)Tw at the n-th pose is then expressed by equation (3.39).
(6) Compute the mapping rc(n)Tlc(n) from L-Camera to R-Camera at the n-th pose. Combining equations (3.36) and (3.38) gives the transformation of Plc relative to Prc at the n-th pose, equation (3.40), where rc(n)Rlc(n) and rc(n)tlc(n) are the rotation matrix and translation vector of Plc relative to Prc at the n-th pose. The mapping rc(n)Tlc(n) is precisely the relative pose of the left and right cameras at the n-th pose; its expression is given in equation (3.41).
(7) Compute the dynamic baseline and the depth information. In equation (3.41), rc(n)tlc(n) is the relative translation vector of the left and right cameras, so the dynamic baseline b at the n-th pose is obtained as the norm of rc(n)tlc(n), as shown in equation (3.42).
Depth is then obtained from the triangulation principle, as shown in FIG. 6. In the figure, Il is the image plane of the left camera, Ol the optical center of the left camera with projection Ol' on Il, and Pl(ul, vl) the projection of the target point P on Il; Ir is the image plane of the right camera, Or the optical center of the right camera with projection Or' on Ir, and Pr(ur, vr) the projection of P on Ir. The distance between Pl and Pr equals the baseline b minus the disparity d, as in equation (3.43). Because the left and right cameras have identical specifications, they share the same focal length f. z is the perpendicular distance from P to the baseline b, i.e. the depth:
|P_l P_r| = b - d = b - (u_l - u_r) \qquad (3.43)
By similar triangles the depth is obtained as in equation (3.44), where b is the dynamic baseline.
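The image for equation (3.44) is not reproduced here; the standard depth-from-disparity relation following from the similar triangles of FIG. 6, consistent with the symbols above, would be:

```latex
z = \frac{f\,b}{d} = \frac{f\,b}{u_l - u_r} \qquad (3.44)
```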
Finally, z is combined with (ul, vl) and converted, using the camera intrinsics and the projection relationship, into coordinates (Xc, Yc, Zc) in the camera coordinate system; (Xc, Yc, Zc) is then converted, through the coordinate-system mapping relationships, into coordinates (X, Y, Z) in the world coordinate system, i.e. the three-dimensional coordinates of the space point. The above procedure constitutes the dynamic binocular vision calibration method: binocular calibration is performed only once, before the two cameras move, and during motion the spatial three-dimensional coordinates of the target are obtained purely through the coordinate-system mapping relationships, without recalibration, which effectively reduces the influence of external environmental factors on the calibration result.
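As an illustrative sketch of this final step (the function and variable names are assumed, not the patent's), depth is taken from the dynamic baseline and disparity, back-projected into camera coordinates with the intrinsics, and then mapped into the world frame through a known camera-to-world transform:

```python
import numpy as np

def pixel_to_world(u_l, v_l, u_r, A, baseline, T_world_from_cam):
    """Recover the 3D world coordinates of a point seen by both cameras.

    u_l, v_l : pixel coordinates of the point in the left image
    u_r      : horizontal pixel coordinate in the right image (same scanline assumed)
    A        : 3x3 intrinsic matrix (identical left/right cameras assumed)
    baseline : dynamic baseline b = ||rc(n)t_lc(n)|| at the current pose
    T_world_from_cam : 4x4 transform from the left camera frame to the world frame
    """
    fx, fy = A[0, 0], A[1, 1]
    cx, cy = A[0, 2], A[1, 2]
    disparity = u_l - u_r
    z = fx * baseline / disparity            # Eq. (3.44): z = f*b / (u_l - u_r)
    x = (u_l - cx) * z / fx                  # back-project with the intrinsics
    y = (v_l - cy) * z / fy
    p_cam = np.array([x, y, z, 1.0])
    return (T_world_from_cam @ p_cam)[:3]    # map into the world frame

# Example with made-up numbers: f = 600 px, principal point (320, 240), b = 0.12 m.
A = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])
print(pixel_to_world(350.0, 250.0, 310.0, A, 0.12, np.eye(4)))
```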
According to the above embodiment, the improved YOLO object detection algorithm proposed by the invention is formed by introducing an attention mechanism into the YOLO-v5s network structure, for three reasons. First, the attention mechanism makes the network focus more on the target object, which helps improve the detection accuracy and recall of YOLOv5s. Second, it helps extract the important features in an image and suppress noise and irrelevant information, enhancing the robustness of YOLOv5s in complex scenes (occlusion, illumination changes, etc.). Third, it guides the YOLOv5s network to learn image features more effectively, which helps accelerate model convergence and reduce resource consumption.
An attention mechanism is a technique for enhancing the expressive power and accuracy of a deep learning model; it helps the model selectively focus on features in the input that are relevant to the current task, so that key information is better captured and exploited. Common attention mechanisms include the channel-attention modules SE (Squeeze-and-Excitation), CA (Channel Attention Module) and ECA (Efficient Channel Attention Module), the spatial attention SA (Spatial Attention), and the hybrid CBAM (Convolutional Block Attention Module). Because channel attention loses part of the spatial information when embedding channels, and spatial attention struggles to capture global dependencies, a hybrid mechanism combining channel and spatial attention is chosen. CBAM, which learns feature-map information through global pooling, loses some local information and pays insufficient attention to small-scale objects. The invention therefore combines the channel attention ECA with the spatial attention SA into a new hybrid attention mechanism, referred to in this embodiment as the ECA_SA module. Compared with CBAM, ECA_SA places more emphasis on adaptive weighting separately along the channel and spatial dimensions and uses separable convolution operations, so it performs better on small-scale objects.
The choice of where to insert the attention mechanism is also crucial for optimizing the model. The Neck network lies between the backbone and the output layer and acts as the bridge between them in the YOLO-v5s model; it is responsible for further processing, fusing and aggregating the features extracted by the backbone. Within the Neck, adding attention before the convolution operations helps the network capture correlations between features better. The invention therefore adds an ECA_SA module before every convolution block in the Neck network; the new network model is called YOLO-v5s_ECA_SA, and its structure is shown in FIG. 7. When adding an ECA_SA module, its channel count must match those of the modules immediately before and after it, and its connectivity with the other modules of the network must be checked.
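The patent does not spell out the internals of the ECA_SA module; as a hedged sketch, one plausible composition consistent with the description (ECA channel attention followed by SA spatial attention, with equal input and output channel counts so the block can be dropped in front of a Neck convolution) is:

```python
import torch
import torch.nn as nn

class ECA_SA(nn.Module):
    """Illustrative ECA + spatial-attention block (one possible design, not the patent's exact module)."""

    def __init__(self, channels: int, eca_kernel: int = 3, sa_kernel: int = 7):
        super().__init__()
        # ECA: channel attention from a 1D convolution over the pooled channel descriptor.
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.eca_conv = nn.Conv1d(1, 1, kernel_size=eca_kernel,
                                  padding=eca_kernel // 2, bias=False)
        # SA: spatial attention from channel-wise average and max maps.
        self.sa_conv = nn.Conv2d(2, 1, kernel_size=sa_kernel,
                                 padding=sa_kernel // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention (ECA): weight each channel while keeping the spatial map intact.
        w = self.avg_pool(x).view(b, 1, c)
        w = self.sigmoid(self.eca_conv(w)).view(b, c, 1, 1)
        x = x * w
        # Spatial attention (SA): weight each location from pooled channel statistics.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.max(dim=1, keepdim=True).values], dim=1)
        s = self.sigmoid(self.sa_conv(s))
        return x * s

# Same channel count in and out, so it can sit in front of a Neck convolution block.
y = ECA_SA(256)(torch.randn(2, 256, 40, 40))
print(y.shape)   # torch.Size([2, 256, 40, 40])
```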
According to the above embodiment, preferably, the method further includes replacing the original nearest-neighbor interpolation of the YOLO model with a deconvolution method, which back-propagates gradients through learned parameters and maps low-resolution feature maps back to the high-resolution space. Specifically, the nearest-neighbor interpolation in the upsampling module (Upsample) of the YOLOv5s_ECA_SA network model is replaced by deconvolution; the parameters to be configured for the deconvolution are listed in Table 1.
Table 1. Deconvolution parameter configuration
  Convolution kernel size: 4
  Convolution stride: 2
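As a hedged illustration of this substitution (the channel count and padding below are assumptions; the text above only fixes the kernel size at 4 and the stride at 2), a learnable 2x upsampling layer replacing nearest-neighbor upsampling could look like:

```python
import torch
import torch.nn as nn

channels = 256  # assumed channel count of the feature map being upsampled

# Original YOLO-v5s upsampling: fixed nearest-neighbour interpolation, no learnable parameters.
nearest_up = nn.Upsample(scale_factor=2, mode="nearest")

# Replacement: deconvolution with kernel 4 and stride 2; padding=1 (an assumption)
# keeps the output exactly twice the input resolution.
deconv_up = nn.ConvTranspose2d(channels, channels, kernel_size=4, stride=2, padding=1)

x = torch.randn(1, channels, 20, 20)
print(nearest_up(x).shape)  # torch.Size([1, 256, 40, 40])
print(deconv_up(x).shape)   # torch.Size([1, 256, 40, 40]); same size, but learned
```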
根据上述实施例,优选地,基于位置的视觉伺服控制方法(PBVS)的基本思想是机械臂末端执行器的位姿误差与视觉特征点或特征描述子之间的误差联系起来,并通过调整机器人的关节角度来最小化这些误差,其基本流程如图8所示。通过PBVS的基本流程可以知道,想要使机械臂末端执行器准确的到达预期位姿需要两个关键参数:关节空间的速度和位置。根据动态双目视觉标定方法可以得到目标物的位姿,从而可以构建PBVS系统。系统构建完成后需要获得各关节的运动速度,从而控制关节运动进行抓取。According to the above embodiments, preferably, the basic idea of the position-based visual servo control method (PBVS) is to link the posture error of the end effector of the robot with the error between the visual feature points or feature descriptors, and minimize these errors by adjusting the joint angles of the robot. The basic process is shown in Figure 8. It can be seen from the basic process of PBVS that two key parameters are required to make the end effector of the robot accurately reach the expected posture: the speed and position in the joint space. According to the dynamic binocular vision calibration method, the posture of the target object can be obtained, so that the PBVS system can be constructed. After the system is built, it is necessary to obtain the movement speed of each joint to control the joint movement for grasping.
The position in joint space can be adjusted continuously according to the visual servo error. The PBVS error is expressed by equation (5.33), where s is the state quantity obtained from visual measurement at time t, l(t) is the three-dimensional position at time t, r is the three-dimensional orientation, and s* is the desired state quantity.
e(t) = s(l(t), r) - s*    (5.33)
Let r = [Rx, Ry, Rz]^T, whose norm equals the rotation angle θ; θ and r can then be expressed by equations (5.34) and (5.35).
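Equations (5.34) and (5.35) are not reproduced above; under the usual axis-angle convention they would take the form below, where R is the rotation matrix of the orientation error and u is the unit rotation axis. This reconstruction is an assumption based on the standard formulation, not a quotation of the patent's formulas.

```latex
\begin{align}
\theta &= \lVert r \rVert = \arccos\!\left(\frac{\operatorname{tr}(R)-1}{2}\right), \tag{5.34}\\
r &= \theta\,u, \qquad
u = \frac{1}{2\sin\theta}
\begin{bmatrix} R_{32}-R_{23} \\ R_{13}-R_{31} \\ R_{21}-R_{12} \end{bmatrix}. \tag{5.35}
\end{align}
```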
In the ideal state, s* = [0, 0]^T. Let the rotation matrix and translation vector from the target-object coordinate system to the robot-arm end coordinate system be tRe and tTe; equation (5.33) can then be rewritten as equation (5.36). tRe and tTe are obtained from the dynamic binocular vision calibration.
e(t) = (tTe, r)^T    (5.36)
Let the linear-velocity and angular-velocity vectors of the robot-arm end be ve and we, respectively; the end velocity is then Ve = (ve, we), and its relation to the state quantity is given by equation (5.37):
ṡ = Ls Ve    (5.37)
where Ls is the feature Jacobian matrix, whose expression is given in equation (5.38):
Let Le = Ls. From equations (5.36) and (5.37), the derivative of the error is obtained, as shown in equation (5.39).
ė = Le Ve    (5.39)
Assuming the error decreases exponentially, the gripper velocity is given by equation (5.40), where λ is the proportional gain and Le+ denotes the full-rank pseudo-inverse of Le.
Using equations (5.36) and (5.38), equation (5.40) can be rewritten as equation (5.41).
From equation (5.41), the linear velocity ve and angular velocity we of the robot-arm end can be obtained, as shown in equation (5.42).
The pose error e(t), the linear velocity ve and the angular velocity we are sent to the robot-arm controller; the motion angle and speed of each joint are then solved by kinematic analysis, so that the arm can be controlled to grasp the target object accurately.
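A compact sketch of this command computation, assuming the pose error e = (tTe, r) of equation (5.36) and the standard block form of the interaction matrix (the patent's exact Le is given in equation (5.38); the function names below are hypothetical), could be:

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]_x of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def pbvs_velocity(t_te, r, lam=0.5):
    """Commanded end-effector twist (v_e, w_e) from the pose error.

    t_te : translation of the target expressed in the end frame (from the stereo calibration)
    r    : axis-angle orientation error, r = theta * u
    lam  : proportional gain lambda of eq. (5.40)
    The block interaction matrix below (with the rotational part approximated by the
    identity) is a common PBVS choice; the patent's exact L_e is that of eq. (5.38).
    """
    e = np.hstack([t_te, r])                              # pose error e(t), eq. (5.36)
    L_e = np.block([[-np.eye(3), skew(t_te)],
                    [np.zeros((3, 3)), np.eye(3)]])
    V_e = -lam * np.linalg.pinv(L_e) @ e                  # eq. (5.40)/(5.41)
    return V_e[:3], V_e[3:]                               # v_e and w_e of eq. (5.42)
```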
According to the above embodiments, although PBVS can control the grasping pose of the robot arm precisely, it is sensitive to illumination and viewpoint changes and demands accurate feature-point matching, so PBVS alone can hardly achieve precise grasping in a dynamic environment. A pose-estimation algorithm based on the extended Kalman smooth variable structure filter (EK-SVSF) is therefore added to the visual servo control system of the present invention. The Kalman filter provides the optimal estimate of the system state by predicting and updating that state. The state prediction ω̂k/k-1, the covariance prediction Pk/k-1 and the predicted measurement ẑk/k-1 of the visual servo system are given by equation (5.43),
where A is the system dynamics matrix, ω̂k-1/k-1 is the posterior state estimate vector, Pk-1 is the posterior error covariance matrix, Qk-1 is the system noise covariance matrix, and f is the image mapping function.
From equation (5.43) the prediction error ez,k/k-1 is obtained, as shown in equation (5.44), where ξk-1 is the camera measurement.
The smoothing boundary-layer value ψk is then obtained, as shown in equation (5.45), where Fk is the observation matrix, Sk is the measurement-error covariance matrix, and Ē is the diagonal matrix formed from E, whose expression is given in equation (5.46), where γ is the convergence rate of the SVSF.
E = |ez,k/k-1| + γ|ez,k-1/k-1|    (5.46)
Assume a constant boundary-layer value ψc. When ψk > ψc, the EK-SVSF gain Kk is a function of the a-priori measurement error ez,k/k-1 and the a-posteriori measurement error ez,k-1/k-1, i.e. the standard SVSF gain; when ψk < ψc, the gain Kk is the standard EKF gain. The expression of the EK-SVSF gain is given in equation (5.47), where F+ is the pseudo-inverse of the measurement matrix F, diag denotes a diagonal matrix, ∘ denotes the Hadamard product, and sat denotes the saturation function applied to ez,k/k-1/ψk in the SVSF.
The updated state estimate ω̂k/k and covariance Pk of the system are then computed from Kk, as shown in equation (5.48).
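A schematic of the gain-switching logic described above might look like the sketch below; the matrix shapes, the saturation handling and the helper names are assumptions, and the authoritative expressions are those of equations (5.43) to (5.48).

```python
import numpy as np

def ek_svsf_gain(e_pred, e_post, F, P_pred, S, psi, psi_c, gamma):
    """Switch between the SVSF gain and the EKF gain (schematic of eq. (5.47)).

    e_pred : a-priori measurement error e_{z,k/k-1}
    e_post : a-posteriori measurement error e_{z,k-1/k-1}
    F      : observation matrix;  P_pred : predicted covariance;  S : innovation covariance
    psi    : smoothing boundary-layer width psi_k;  psi_c : fixed boundary width
    gamma  : SVSF convergence rate
    """
    if np.all(psi > psi_c):
        # Standard SVSF gain: pseudo-inverse of F times a saturated, weighted error term.
        E = np.abs(e_pred) + gamma * np.abs(e_post)            # eq. (5.46)
        sat = np.clip(e_pred / psi, -1.0, 1.0)
        ratio = np.where(np.abs(e_pred) > 1e-12, E * sat / e_pred, 0.0)
        return np.linalg.pinv(F) @ np.diag(ratio)
    # Otherwise the standard EKF gain.
    return P_pred @ F.T @ np.linalg.inv(S)
```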
According to the above embodiments, the pose estimate from a single camera is affected by the external environment, which degrades its accuracy and is unfavorable for grasping tasks in a dynamic environment. To improve the robustness and stability of the dual-arm system, data-fusion techniques are typically used to integrate data from multiple sensors in a specific way. The sensors of the present invention are the left and right cameras: when the master arm performs a grasping task, the master eye collects information about the target object and estimates its pose, while the slave arm moves to the side of the master arm to observe the target object and the master arm's end effector and produce its own pose estimate. The present invention therefore fuses the two pose estimates to obtain the optimal pose estimate.
To obtain a better pose estimate, the present invention fuses the EK-SVSF pose-estimation results from the two cameras using the ordered weighted averaging (OWA) operator. The OWA operator performs a weighted average based on the order relation between nodes to obtain the aggregated result; the OWA-based distributed data-fusion process is shown in Figure 9.
OWA performs distributed data fusion on the EK-SVSF pose-estimation results, so the OWA weights are determined from the covariance matrix Pk/k computed from the posterior error ek/k; the calculation is given in equation (5.49), where ek/k is the difference between the current state ωk and the state estimate ω̂k/k. Each time an error estimate is made, its value can be read from the diagonal elements of Pk/k and used to update the weights.
According to the principle of OWA-based distributed data fusion, the lower the estimation error, the higher the weight, and vice versa; the weight is therefore the reciprocal of the error value. Assuming the first elements of the two covariance matrices Pk/k are P1 and P2, the weights are given by equation (5.50); applying these weights to the corresponding data and taking the weighted average yields the optimal pose estimate.
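For two cameras this reduces to a simple inverse-error weighted average; a minimal sketch of equations (5.49) and (5.50), with variable names assumed for illustration, is:

```python
import numpy as np

def owa_fuse(pose_left, P1, pose_right, P2):
    """Fuse two EK-SVSF pose estimates with inverse-error OWA weights (eq. (5.50)).

    pose_left, pose_right : pose estimates from the two cameras
    P1, P2                : corresponding first diagonal elements of the covariance matrices P_k/k
    """
    w1 = (1.0 / P1) / (1.0 / P1 + 1.0 / P2)   # lower error -> larger weight
    w2 = 1.0 - w1
    return w1 * np.asarray(pose_left) + w2 * np.asarray(pose_right)
```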
According to the above embodiments, the present invention takes guiding a dual-arm collaborative robot with a visual servo system as its goal, a dynamic environment as its background, and two position-variable EIH cameras as its basis, and optimizes the visual servo control method of the dual-arm collaborative robot around the key technologies of binocular vision calibration, the target-detection algorithm, visual servo control and pose estimation, thereby improving the robustness and accuracy of the visual servo control system of the dual-arm collaborative robot. Specifically: to address the problem that a binocular system is easily occluded and thus fails in a dynamic environment, a visual servo control scheme for a dual-arm collaborative robot is designed and a dynamic binocular visual servo system is proposed, in which two position-variable EIH cameras guide the motion of the dual-arm collaborative robot. The hardware system consists mainly of two collaborative robot arms and two depth cameras, and both the vision processing and the motion of the arms are controlled through the ROS system. A dynamic binocular vision calibration method with stronger adaptability to environmental changes is proposed; it uses the vision modules on the collaborative robots to build a dynamic binocular vision system and performs calibration through the traditional binocular calibration method and the coordinate-system mapping relations. The YOLOv5s network model is selected as the target-detection algorithm and is improved to address its shortcomings in detecting targets in dynamic environments, including introducing the ECA_SA module into the Neck network and replacing the nearest-neighbor interpolation in the upsampling with deconvolution. Position-based visual servo control is adopted, combined with the EK-SVSF pose-estimation algorithm and the OWA-based distributed data-fusion algorithm, to obtain the optimal pose estimate of the target object.
As shown in FIG. 10, according to yet another embodiment of the present invention, a visual servo control device 100 for a dual-arm collaborative robot is disclosed, comprising: a memory 101 for storing program instructions; and a processor 102 for calling the program instructions stored in the memory to implement the visual servo control method for a dual-arm collaborative robot according to the above embodiments.
According to yet another embodiment of the present invention, a computer-readable storage medium is disclosed; the storage medium stores program code, and the program code is used to implement the visual servo control method for a dual-arm collaborative robot according to the above embodiments.
All or part of the steps in the methods of the above embodiments can be completed by a program controlling the relevant hardware; the program may be stored in a readable storage medium, and the storage medium includes read-only memory (ROM), random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact-disc read-only memory (CD-ROM) or other optical-disc storage, magnetic-disk storage, magnetic-tape storage, or any other readable medium that can be used to carry or store data.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may have various changes and variations. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410934517.1A CN118721200A (en) | 2024-07-12 | 2024-07-12 | Visual servo control method, device and storage medium for dual-arm collaborative robot |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118721200A true CN118721200A (en) | 2024-10-01 |
Family
ID=92868716
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410934517.1A Pending CN118721200A (en) | 2024-07-12 | 2024-07-12 | Visual servo control method, device and storage medium for dual-arm collaborative robot |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118721200A (en) |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119037764A (en) * | 2024-11-01 | 2024-11-29 | 天津云圣智能科技有限责任公司 | Unmanned aerial vehicle hangar and unmanned aerial vehicle control method |
| CN119037764B (en) * | 2024-11-01 | 2025-03-04 | 天津云圣智能科技有限责任公司 | A drone hangar and a drone control method |
| CN119567255A (en) * | 2024-12-03 | 2025-03-07 | 蔚来汽车科技(安徽)有限公司 | Collaborative robot guiding and positioning method, system, intelligent device and storage medium |
| CN119526411A (en) * | 2024-12-11 | 2025-02-28 | 华中科技大学 | A dual-arm robot arm posture calibration method and system |
| CN119871401A (en) * | 2025-01-16 | 2025-04-25 | 北京控制工程研究所 | Multi-robot multi-mode arm-hand-eye relation intelligent calibration method based on online reconstruction-rendering-matching |
| CN119871401B (en) * | 2025-01-16 | 2025-07-15 | 北京控制工程研究所 | Multi-robot multi-mode arm-hand-eye relation intelligent calibration method based on online reconstruction-rendering-matching |
| CN120161788A (en) * | 2025-03-06 | 2025-06-17 | 国网江苏省电力有限公司盐城供电分公司 | An autonomous control method for underground pipeline robot based on visual servoing |
| CN120508135A (en) * | 2025-07-17 | 2025-08-19 | 上海大风技术有限公司 | Unmanned aerial vehicle autonomous accurate power exchange method based on visual servo and co-location |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | | |
| SE01 | Entry into force of request for substantive examination | | |