Research on Saliency Object Detection and Visual Servoing Control Fused with Human Vision
Chinese Abstract

With the continuing development of exploration activities in space, the deep sea, and similar domains, robots need to perform a variety of task operations in extreme environments (such as vacuum and high pressure). Because such environments are unstructured (uncertain and complex), robots find it difficult to visually perceive and respond quickly to changes in the scene. By contrast, when facing such complex and uncertain environments, the human visual system concentrates its limited brain processing resources on important objects and processes them preferentially. This selective attention mechanism improves the human ability to respond quickly to changes in unstructured environments. Inspired by this biological mechanism, a large body of work has proposed saliency detection algorithms to imitate it. Saliency object detection algorithms can rapidly localize object regions in a scene, which makes them well suited as an image preprocessing step for fast and accurate object detection; they have therefore been widely applied in many research fields, such as object recognition, object tracking, and human-computer interaction. However, most studies consider only saliency detection in simple scenes, and for scenes with complex textures, changing backgrounds, and cluttered colors they struggle to highlight the salient object with consistently high saliency values. Based on an analysis of the shortcomings of existing saliency algorithms, this dissertation proposes two saliency object detection algorithms that fuse human gaze information and successfully applies them to robot servoing control. The dissertation carries out the following work on saliency object detection, human gaze information modeling, and visual servoing control:

To use human gaze information effectively for saliency object detection, a particle-filter-based method for modeling gaze information is proposed. First, based on the temporal and spatial characteristics of gaze data, a temporal-spatial model of fixation points is established to describe the fixation data; the particle filter is then combined with this model to process the fixation data and generate a gaze density map. Experiments verify that the proposed method characterizes the region of the salient object well.

To address the difficulty existing algorithms have with complex backgrounds, a saliency object detection algorithm based on multi-scale manifold learning is proposed. First, a fixation prediction model and a graph-based superpixel extraction method are used to indicate the salient region and coarsely detect the salient object; the superpixel extraction method then provides a multi-scale representation of the scene, the proposed manifold learning regularization framework evaluates saliency at each scale, and the multiple saliency maps are integrated to refine the coarse detection into the final result. Experiments show that on the challenging Judd-A dataset the F-measure of the proposed algorithm is 6.2% higher than that of the best current method.

To improve the completeness and accuracy of salient objects in complex environments, a multi-graph saliency object detection algorithm fused with human gaze information is proposed. By analyzing the characteristics of complex environments, color and other features are extracted to characterize the superpixel units; multiple undirected graphs are constructed from these features and several distance metrics; the extracted gaze density map is thresholded and the corresponding regions are taken as foreground saliency seeds, completing the fusion of gaze information; and, exploiting the complementarity of the multiple undirected graphs, an improved multi-graph optimization manifold learning framework performs the multi-graph saliency evaluation and generates a pixel-level saliency map. Experiments show that on the Judd-A dataset the F-measure of the proposed algorithm is 20.7% higher than that of the best current method.

Building on saliency object detection, a robot visual servoing method is proposed to grasp the task target accurately. An improved iterative threshold segmentation algorithm segments the salient object region, and an improved genetic algorithm matches the image moment features of the segmented regions to accomplish object recognition; the image Jacobian matrix of uncalibrated visual servoing is derived for a monocular "Eye-in-Hand" camera configuration, and a visual servoing controller is designed based on a quasi-Newton method; the image feature error is computed from the homography matrix obtained by feature matching, completing the robot motion control.

To overcome the difficulty autonomous visual servoing control has in unstructured environments, a robot servoing control method that fuses human vision is proposed. By analyzing the characteristics of human-machine interaction devices such as the hand controller and the gaze tracker, corresponding interaction mapping control methods fused with human vision are designed for each device, enabling direct control of the robot; borrowing ideas from collision detection, a tracking-distance-based method for modeling the autonomous control region is proposed, dividing the interaction scene into direct-control and autonomous-control regions; and, by analyzing the characteristics of visual servoing control, a method for computing the radius of the autonomous control region is proposed, realizing automatic switching between control modes.

Integrating the above results, a teleoperation prototype system for capturing non-cooperative space targets is constructed; the prototype is described in detail in terms of its system framework and functional-module development, and the proposed gaze-fused saliency object detection and visual servoing control methods are verified through experiments.

Keywords: saliency object detection, human vision, visual servoing, gaze information modeling, multi-scale manifold learning, feature matching, human-computer interaction

English Abstract

With the development of space and deep-sea exploration, robots need to carry out various tasks in extreme environments (such as vacuum and high pressure). Because of the unstructured characteristics (uncertainty and complexity) of such environments, robots cannot quickly perceive and respond to changes in the scene. In contrast, when facing such complex and uncertain environments, the human visual system can focus its limited brain processing resources on important objects and prioritize their processing. This selective attention mechanism enables humans to respond quickly to changes in unstructured environments. Inspired by this biological mechanism, a large number of saliency detection algorithms have been proposed to imitate the human visual attention mechanism. Saliency object detection algorithms can rapidly localize object regions in a scene, which makes them well suited as an image preprocessing step for quick and accurate object detection, so they have been widely used in many research fields, such as object recognition, object tracking, and human-computer interaction. However, most research focuses only on saliency detection in simple scenes and struggles to highlight salient objects with consistently high saliency values in scenes with complex texture, volatile backgrounds, and cluttered color. Based on an analysis of the shortcomings of existing saliency algorithms, this dissertation proposes two saliency detection algorithms fused with human visual information and successfully applies them to robot visual servoing control. The following work is carried out on saliency object detection, human gaze information modeling, and visual servoing control.

To perform saliency object detection effectively with human visual information, a particle-filter-based method for modeling human gaze information is proposed. Based on the temporal and spatial characteristics of eye gaze data, a temporal-spatial model of fixation points is established to describe the fixation data; combining the particle filter with this model, the fixation data are processed to generate a gaze density map (a minimal sketch is given below). Experimental results show that the proposed method expresses the region of the salient object well.

To address the difficulty existing algorithms have with complex backgrounds, a saliency detection algorithm based on multi-scale manifold learning is proposed. A fixation prediction model and a graph-based superpixel extraction method indicate the salient region and coarsely detect the salient object; the superpixel extraction method provides a multi-scale representation of the scene; based on the proposed manifold learning regularization framework, saliency is estimated at each scale (sketched below), and the resulting saliency maps are integrated to refine the coarse detection into the final result. Experimental results show that the F-measure of the proposed algorithm is 6.2% higher than that of the second-best method on the Judd-A dataset.

To improve the integrity and accuracy of salient objects in complex scenes, a multi-graph saliency detection algorithm fused with human visual information is proposed. By analyzing the characteristics of complex scenes, color and other features are extracted to characterize the superpixel units, and multiple undirected graphs are constructed from these features and several distance measures. The extracted density map of eye gaze points is thresholded, and the corresponding regions are taken as foreground saliency seeds, completing the fusion of eye gaze information. By exploiting the complementary characteristics of the multiple undirected graphs, an improved multi-graph optimization manifold learning framework performs the multi-graph saliency estimation and generates a pixel-level saliency map (a seed-and-fusion sketch is given below). Experimental results show that the F-measure of the proposed algorithm is 20.7% higher than that of the second-best method on the Judd-A dataset.
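The gaze-modeling step above admits a compact illustration. Below is a minimal sketch of one way to turn a time-ordered fixation sequence into a density map with a particle filter; the random-walk motion model, noise scales, particle count, and hit-accumulation scheme are illustrative assumptions, not the dissertation's actual temporal-spatial model.

```python
# Minimal particle-filter sketch for gaze density estimation (assumptions:
# fixations given as (x, y) points; Gaussian random-walk motion model).
import numpy as np
from scipy.ndimage import gaussian_filter

def gaze_density_map(fixations, shape, n_particles=500,
                     motion_std=15.0, obs_std=25.0, rng=None):
    """Build a normalized 2-D gaze density map of size `shape` = (h, w)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = shape
    fixations = np.asarray(fixations, dtype=float)
    # Initialize particles around the first fixation.
    particles = fixations[0] + rng.normal(0.0, obs_std, size=(n_particles, 2))
    density = np.zeros(shape)
    for fx in fixations[1:]:
        # Predict: spatial random-walk motion model.
        particles += rng.normal(0.0, motion_std, size=particles.shape)
        # Update: weight particles by proximity to the newest fixation.
        d2 = np.sum((particles - fx) ** 2, axis=1)
        weights = np.exp(-d2 / (2.0 * obs_std ** 2))
        if weights.sum() < 1e-12:           # degenerate case: reset uniformly
            weights = np.ones(n_particles)
        weights /= weights.sum()
        # Resample proportionally to the weights.
        particles = particles[rng.choice(n_particles, n_particles, p=weights)]
        # Accumulate particle hits into the map.
        xs = np.clip(particles[:, 0].astype(int), 0, w - 1)
        ys = np.clip(particles[:, 1].astype(int), 0, h - 1)
        np.add.at(density, (ys, xs), 1.0)
    # Smooth the hits into a continuous density and normalize to [0, 1].
    density = gaussian_filter(density, sigma=obs_std / 2.0)
    return density / (density.max() + 1e-12)
```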
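The "manifold learning regularization" evaluation builds on graph-based manifold ranking over superpixels. The following single-scale sketch uses the standard closed-form ranking f* = (D - alpha * W)^(-1) y common in graph-based saliency work; the feature choice (mean Lab color), Gaussian affinity, and parameter values are assumptions for illustration.

```python
# Single-scale graph-based manifold ranking sketch (illustrative parameters).
import numpy as np

def manifold_ranking(features, adjacency, seeds, alpha=0.99, sigma=0.1):
    """Rank superpixels by relevance to seed (query) superpixels.

    features  : (n, d) array, e.g. mean Lab color per superpixel
    adjacency : (n, n) boolean array, True where two superpixels are neighbors
    seeds     : indices of query superpixels (e.g. gaze-indicated foreground)
    """
    n = len(features)
    # Edge weights: Gaussian affinity on feature distance, kept only on edges.
    d2 = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2)) * adjacency
    D = np.diag(W.sum(axis=1))
    # Query indicator vector.
    y = np.zeros(n)
    y[list(seeds)] = 1.0
    # Closed-form ranking: f* = (D - alpha * W)^(-1) y.
    f = np.linalg.solve(D - alpha * W, y)
    return (f - f.min()) / (f.max() - f.min() + 1e-12)
```

In the multi-scale algorithm this evaluation would run at several superpixel granularities, and the per-scale maps would then be integrated to refine the coarse detection.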
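For the multi-graph algorithm, the abstract specifies two concrete steps: thresholding the gaze density map to obtain foreground seeds, and evaluating saliency over several feature graphs. The sketch below reuses manifold_ranking from the previous block; the threshold ratio and the plain averaging fusion are stand-in assumptions, since the dissertation uses an improved multi-graph optimization framework rather than averaging.

```python
# Illustrative sketch: gaze-derived seeds plus a naive multi-graph fusion.
import numpy as np

def gaze_seeds(density_map, superpixel_labels, thresh_ratio=0.6):
    """Superpixels whose mean gaze density exceeds a fraction of the peak."""
    thresh = thresh_ratio * density_map.max()
    return [sp for sp in np.unique(superpixel_labels)
            if density_map[superpixel_labels == sp].mean() > thresh]

def multi_graph_saliency(feature_sets, adjacency, seeds):
    """Run manifold ranking once per feature graph, then fuse the maps.

    feature_sets : list of (n, d_i) arrays, one per feature/distance graph
    """
    maps = [manifold_ranking(f, adjacency, seeds) for f in feature_sets]
    return np.mean(maps, axis=0)  # naive fusion; the thesis optimizes jointly
```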
On the basis of saliency object detection, a robot visual servoing control method is proposed to grasp the task target accurately. An improved iterative threshold segmentation algorithm segments the salient object region, and image moment features of the segmented regions are matched with an improved genetic algorithm to accomplish object recognition. For a monocular "Eye-in-Hand" camera configuration, the image Jacobian matrix of uncalibrated visual servoing is derived, and the visual servoing controller is designed based on a quasi-Newton method (illustrative sketches of the segmentation and Jacobian-estimation steps follow the keywords below). Based on the homography matrix obtained by feature matching, the image feature error is calculated and the robot motion control is completed.

To address the difficulty autonomous visual servoing control has with unstructured environments, a robot servoing control method fused with human vision is proposed. By analyzing the characteristics of human-computer interaction devices such as the hand controller and the gaze tracker, corresponding interaction mapping control methods fused with human vision are designed for each device, realizing direct control of the robot. Borrowing the idea of collision detection, an autonomous-control-region modeling method based on tracking distance is proposed, which divides the human-computer interaction scene into direct-control and autonomous-control regions. By analyzing the characteristics of visual servoing control, a method for calculating the radius of the autonomous control region is proposed, which realizes automatic switching of the control mode.

Based on the achievements of the above research, a teleoperation prototype system for the capture of non-cooperative space targets is constructed. The system is described in detail in terms of its framework and functional-module development, and the saliency detection and visual servoing control methods fused with human vision are verified by experiments.

Keywords: Saliency Object Detection, Human Vision, Visual Servoing, Eye Gaze Information Modeling, Multi-Scale Manifold Learning, Feature Matching, Human-Computer Interaction (HCI)
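The abstract does not detail the improved iterative threshold segmentation, but the baseline such methods improve on is the classic isodata-style iteration, sketched here under that assumption.

```python
# Classic iterative threshold selection sketch (the baseline the improved
# segmentation presumably builds on; `eps` is an illustrative tolerance).
import numpy as np

def iterative_threshold(gray, eps=0.5):
    """Iterate t <- (mean of pixels <= t + mean of pixels > t) / 2."""
    gray = np.asarray(gray, dtype=float)
    t = gray.mean()
    while True:
        low, high = gray[gray <= t], gray[gray > t]
        if low.size == 0 or high.size == 0:  # near-constant image: give up
            return t
        t_new = 0.5 * (low.mean() + high.mean())
        if abs(t_new - t) < eps:
            return t_new
        t = t_new
```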
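For uncalibrated visual servoing, a quasi-Newton scheme estimates the image Jacobian online from observed motions instead of deriving it from camera calibration. The Broyden-style rank-1 update and pseudo-inverse control law below are a minimal sketch; the gain and the exact controller design are illustrative assumptions, not the dissertation's derivation.

```python
# Uncalibrated visual servoing sketch: Broyden-style Jacobian estimation
# plus a pseudo-inverse control law (gain and dimensions are illustrative).
import numpy as np

def broyden_update(J, dq, ds):
    """Rank-1 update so the estimate explains the latest motion:
    J <- J + ((ds - J @ dq) dq^T) / (dq^T dq)."""
    denom = float(dq @ dq)
    if denom > 1e-12:
        J = J + np.outer(ds - J @ dq, dq) / denom
    return J

def servo_step(J, feature_error, gain=0.5):
    """Joint increment driving the image feature error toward zero:
    dq = -gain * J^+ @ e."""
    return -gain * (np.linalg.pinv(J) @ feature_error)
```

In a control loop, each iteration would apply servo_step, observe the feature change ds produced by the commanded dq, and refresh the estimate with broyden_update; the mode-switching logic would hand control back to the operator whenever the tracked distance leaves the computed autonomous-control radius.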
