2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA): Latest Papers

Automatic Building Footprint Extraction and Regularisation from LIDAR Point Cloud Data
M. Awrangjeb, Guojun Lu
This paper presents a segmentation of LIDAR point cloud data for automatic extraction of building footprints. Using the ground height information from a DEM (Digital Elevation Model), the non-ground points (mainly buildings and trees) are separated from the ground points. Points on walls are removed from the set of non-ground points. The remaining non-ground points are then divided into clusters based on height and local neighbourhood. Planar roof segments are extracted from each cluster of points following a region-growing technique. Planes are initialised using coplanar points as seed points and then grown using plane compatibility tests. Once all the planar segments are extracted, a rule-based procedure is applied to remove tree planes, which are small in size and randomly oriented. The neighbouring planes are then merged to obtain individual building boundaries, which are regularised using a new feature-based technique. Corners and line segments are extracted from each boundary and adjusted using the assumption that each short building side is parallel or perpendicular to one or more neighbouring long building sides. Experimental results on five Australian data sets show that the proposed method offers a higher correctness rate in building footprint extraction than a state-of-the-art method.
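The first step described in this abstract, separating non-ground points from ground points using DEM heights, might look roughly like the sketch below. The function name `split_ground`, the `dem_height` callable, and the 1.0 m threshold are illustrative assumptions; the abstract does not state the parameters the authors actually use.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z), assumed to be in metres


def split_ground(
    points: List[Point],
    dem_height: Callable[[float, float], float],
    min_height: float = 1.0,  # hypothetical threshold, not taken from the paper
) -> Tuple[List[Point], List[Point]]:
    """Split points into (ground, non_ground) by height above the DEM."""
    ground: List[Point] = []
    non_ground: List[Point] = []
    for x, y, z in points:
        # Height of the point above the terrain surface at (x, y).
        if z - dem_height(x, y) > min_height:
            non_ground.append((x, y, z))
        else:
            ground.append((x, y, z))
    return ground, non_ground
```

The later steps (wall removal, clustering, region growing, regularisation) would operate on the `non_ground` list and are not sketched here.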
DOI: https://doi.org/10.1109/DICTA.2014.7008096
Published: 2014-11-01
Citations: 8
Near-Miss Event Detection at Railway Level Crossings
Sina Aminmansour, F. Maire, C. Wullems
Modelling of socio-economic costs by the Australian railway industry in 2010 estimated that level crossing accidents cost more than AU$116 million annually. To better understand the causal factors of these accidents, a video analytics application is being developed to automatically detect near-miss incidents using forward-facing videos from trains. Because near-miss events occur more frequently than collisions, detecting them makes more safety data available for analysis. The application will improve the objectivity of near-miss reporting by providing quantitative data about the position of vehicles at level crossings through the automatic analysis of video footage. In this paper we present a novel method for detecting near-miss occurrences at railway level crossings from video data recorded on trains. Our system detects and localizes vehicles at railway level crossings. It also detects the position of the railway to calculate the distance from each detected vehicle to the railway centerline. The system logs the positions of the vehicles and the railway centerline in a database for further analysis by the safety data recording and analysis system, which determines whether or not the event is a near-miss. We present preliminary results of our system on a dataset of videos taken from a train that passed through 14 railway level crossings. We demonstrate the robustness of our system by showing its results on day and night videos.
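The distance computation mentioned in this abstract can be illustrated with a small geometric sketch. It assumes, purely for illustration, that the centerline is approximated locally by a straight segment and that vehicle and track positions are already expressed in a common ground-plane coordinate system; the abstract does not say how the authors represent either. The name `distance_to_centerline` is hypothetical.

```python
import math
from typing import Tuple

Point2D = Tuple[float, float]


def distance_to_centerline(p: Point2D, a: Point2D, b: Point2D) -> float:
    """Shortest distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy)
```

A logged distance of this kind could then be compared against a safety margin by a downstream analysis system, which is where the abstract places the near-miss decision.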
DOI: https://doi.org/10.1109/DICTA.2014.7008119
Published: 2014-11-01
Citations: 8
Reflective Features Detection and Hierarchical Reflections Separation in Image Sequences
Di Yang, Srimal Jayawardena, Stephen Gould, Marcus Hutter
Computer vision techniques such as Structure-from-Motion (SfM) and object recognition tend to fail on scenes with highly reflective objects because the reflections behave differently from the true geometry of the scene. Such image sequences may be treated as two layers superimposed on each other: the non-reflection scene source layer and the reflection layer. However, decomposing the two layers is a very challenging task as it is ill-posed, and common methods rely on prior information. This work presents an automated technique for detecting reflective features with a comprehensive analysis of the intrinsic, spatial, and temporal properties of feature points. A support vector machine (SVM) is proposed to learn reflection feature points. Predicted reflection feature points are used as priors to guide the reflection layer separation. This gives more robust and reliable results than performing layer separation alone.
DOI: https://doi.org/10.1109/DICTA.2014.7008127
Published: 2014-11-01
Citations: 1
A Modular Learning Approach for Fish Counting and Measurement Using Stereo Baited Remote Underwater Video
F. Westling, Changming Sun, Dadong Wang
An approach is suggested for automating fish identification and measurement using stereo Baited Remote Underwater Video footage. Simple methods for identifying fish are not sufficient for measurement, since the snout and tail points must be found, and the stereo data should be incorporated to find a true measurement. We present a modular framework that ties together various approaches in order to develop a generalised system for automated fish detection. A method is also suggested for using machine learning to improve identification. Experimental results indicate the suitability of our approach.
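To illustrate why the snout and tail points and the stereo data are both needed for a true measurement, here is a minimal sketch of length estimation from a rectified stereo pair. It assumes a standard pinhole model with focal length in pixels, a known baseline, and already matched snout and tail points with known disparities. All names and parameters are hypothetical and are not taken from the paper.

```python
import math
from typing import Tuple


def triangulate(
    u: float, v: float, disparity: float,
    focal_px: float, baseline_m: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Back-project a left-image pixel (u, v) with a given disparity to 3D (metres)."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity  # depth from the rectified stereo relation
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


def fish_length(
    snout: Tuple[float, float, float],  # (u, v, disparity) of the snout point
    tail: Tuple[float, float, float],   # (u, v, disparity) of the tail point
    focal_px: float, baseline_m: float, cx: float, cy: float,
) -> float:
    """Euclidean distance between the triangulated snout and tail points."""
    p = triangulate(*snout, focal_px, baseline_m, cx, cy)
    q = triangulate(*tail, focal_px, baseline_m, cx, cy)
    return math.dist(p, q)
```

Measuring in the image alone would give a length in pixels that changes with the fish's distance and orientation; the 3D distance does not.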
DOI: https://doi.org/10.1109/DICTA.2014.7008086
Published: 2014-11-01
Citations: 13
Performance Evaluation of 3D Local Surface Descriptors for Low and High Resolution Range Image Registration
S. A. A. Shah, Bennamoun, F. Boussaïd
Despite the advent and popularity of low-cost commercial sensors (e.g., Microsoft Kinect), research in 3D vision still primarily focuses on the development of advanced algorithms geared towards high-resolution data. This paper presents a comparative performance evaluation of renowned state-of-the-art 3D local surface descriptors for the task of registering both high- and low-resolution range image data. The datasets used in these experiments are the renowned high-resolution Stanford 3D models dataset and the challenging low-resolution Washington RGB-D object dataset. Experimental results show that the performance of certain local surface descriptors is significantly affected by low-resolution data.
DOI: https://doi.org/10.1109/DICTA.2014.7008123
Published: 2014-11-01
Citations: 12
Novel Evaluation Index for Image Quality
Sheikh Md. Rabiul Islam, Xu Huang, K. Le
Indexes used for image quality evaluation are computational models that measure the quality of images in a perceptually consistent manner. This paper presents a novel evaluation index for assessing image quality. The index modifies the traditional Structural Similarity Index Measure (SSIM) by adding another factor that reflects the shape of the brightness histogram of the assessed image. The proposed index is therefore a combination of four major factors: luminance, contrast, structure, and shape of the histogram. The index is mathematically simple and applicable to various image processing tasks. For demonstration, a new image de-noising approach using an adaptive shrinkage threshold in the shearlet domain is used. Experimental results show that the new image quality index gives better prediction accuracy and better prediction monotonicity than PSNR, HQI, UIQI and SSIM.
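For readers unfamiliar with the baseline being modified, the sketch below computes the standard SSIM formula, which combines luminance, contrast and structure, over a single global window. This is a simplification: SSIM is normally computed over local windows and averaged. The fourth, histogram-shape factor proposed in the paper is not specified in the abstract and is therefore not reproduced here. The function name `global_ssim` and the use of population statistics are illustrative choices.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], dynamic_range: float = 255.0) -> float:
    """Standard SSIM formula evaluated once over all pixels of two equal-sized images."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variances and covariance (a simplifying choice for this sketch).
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Conventional stabilising constants with K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Two identical images score 1; any difference in mean, variance or correlation lowers the score.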
DOI: https://doi.org/10.1109/DICTA.2014.7008120
Published: 2014-11-01
Citations: 1
Pedestrian Lane Detection in Unstructured Environments for Assistive Navigation
M. Le, S. L. Phung, A. Bouzerdoum
Automatically finding paths is a crucial and challenging task in autonomous navigation systems. The task becomes more difficult in unstructured environments such as indoor or outdoor scenes with unmarked pedestrian lanes under severe illumination conditions, complex lane surface structures, and occlusion. This paper proposes a robust method for pedestrian lane detection in such unstructured environments. The proposed method detects the walking lane in a probabilistic framework integrating both appearance of the lane region and characteristics of the lane borders. The vanishing point is employed to identify the lane borders. We propose an improved vanishing point estimation method based on orientation of color edges, and use pedestrian detection for occlusion handling. The proposed pedestrian lane detection method is evaluated on a new data set of 2000 images collected from various indoor and outdoor scenes with different types of unmarked lanes. Experimental results and comparisons with other existing methods on the new data set have shown the efficiency and robustness of the proposed method.
DOI: https://doi.org/10.1109/DICTA.2014.7008122
Published: 2014-11-01
Citations: 9
Unsupervised Image Classification by Probabilistic Latent Semantic Analysis for the Annotation of Images
Abass A. Olaode, G. Naghdy, Catherine A. Todd
Image annotation has been identified as a suitable means of eliminating the semantic gap that has made the accuracy of content-based image retrieval unsatisfactory. However, existing methods for automatic image annotation depend on supervised learning, which can be difficult to implement because the manually annotated training samples it requires are not always readily available. This paper argues that unsupervised learning via Probabilistic Latent Semantic Analysis provides a more suitable machine learning approach for image annotation, especially because it can base categorisation on the latent semantic content of the image samples, which can bridge the semantic gap present in content-based image retrieval. This paper therefore proposes an unsupervised image categorisation model in which the semantic content of images is discovered using Probabilistic Latent Semantic Analysis, after which the images are clustered into unique groups based on semantic content similarities using the K-means algorithm, thereby providing suitable annotation exemplars. A common problem with categorisation algorithms based on Bag-of-Visual-Words modelling is the loss of accuracy due to the spatial incoherency of that modelling. This paper therefore also examines the effectiveness of the spatial pyramid as a means of eliminating spatial incoherency in Probabilistic Latent Semantic Analysis classification.
DOI: https://doi.org/10.1109/DICTA.2014.7008133
Published: 2014-11-01
Citations: 16
Regularized Least-Squares Coding with Unlabeled Dictionary for Image-Set Based Face Recognition
M. Uzair, A. Mian
Image set based face recognition provides more opportunities compared to single mug-shot face recognition. However, modelling the variations in an image set is a challenging task. We propose a computationally efficient and accurate image set modelling technique. The idea is to reconstruct each image set sample with an unlabeled dictionary using the computationally efficient regularized least squares. The reconstruction coefficients form a latent representation of an image set and efficiently model its underlying structure. We propose max and sum pooling to aggregate the latent representations into a single compact feature vector representation per set. We then perform Linear Discriminant Analysis on the pooled reconstruction coefficients to increase the discrimination and reduce the dimensionality of the proposed features. The proposed algorithm is extensively evaluated for the task of image set based face recognition on the Honda/UCSD, CMU Mobo and YouTube celebrities datasets. Experimental results show that the proposed algorithm outperforms current state-of-the-art image set classification algorithms in terms of both accuracy and execution time.
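The coding and pooling steps described in this abstract can be sketched with NumPy as below. The sketch assumes the usual closed-form regularized least-squares solution, codes = (DᵀD + λI)⁻¹DᵀX, and assumes that the max-pooled and sum-pooled vectors are concatenated; the abstract confirms neither detail, and the names `rls_codes`, `pool_codes` and the value of `lam` are hypothetical. The subsequent Linear Discriminant Analysis step is omitted.

```python
import numpy as np


def rls_codes(D: np.ndarray, X: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """Regularized least-squares codes.

    D: (d, k) unlabeled dictionary with one atom per column.
    X: (d, n) image set with one sample per column.
    Returns a (k, n) matrix of reconstruction coefficients.
    """
    k = D.shape[1]
    gram = D.T @ D + lam * np.eye(k)
    # Solve the normal equations instead of forming an explicit inverse.
    return np.linalg.solve(gram, D.T @ X)


def pool_codes(A: np.ndarray) -> np.ndarray:
    """Aggregate per-sample codes into one vector per set (max and sum pooling)."""
    # Concatenating the two pooled vectors is an assumption of this sketch.
    return np.concatenate([A.max(axis=1), A.sum(axis=1)])
```

Because the system matrix depends only on the dictionary, it could be factorized once and reused for every image set, which is one plausible source of the efficiency the abstract refers to.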
DOI: https://doi.org/10.1109/DICTA.2014.7008128
Published: 2014-11-01
Citations: 1
Rank Minimization or Nuclear-Norm Minimization: Are We Solving the Right Problem?
Yuchao Dai, Hongdong Li
Low-rank methods, or rank minimization, have received considerable attention from the computer vision community in recent years. Due to the inherent computational complexity of rank problems, the non-convex rank function is often replaced by its convex relaxation, i.e. the nuclear norm. Thanks to recent progress in the field of compressive sensing (CS), vision researchers who practise CS are fully aware of the convex relaxation gap, as well as of the conditions (e.g. the Restricted Isometry Property) under which the relaxation is tight (i.e. with nil gap). In this paper, however, we wish to alert potential users of low-rank methods that focusing too much on the relaxation gap and optimization may obscure the "big picture" of the original vision problem. In particular, this paper shows that for many commonly cited low-rank problems, the nuclear-norm minimization formulation of the original rank-minimization problem does not necessarily lead to the desired solution. Degenerate solutions and multiplicity seem to exist often, if not always. Even if a certain nuclear-norm minimization solution is a provably tight relaxation, this solution can be meaningless in its particular context. We therefore advocate that, in solving vision problems via nuclear-norm minimization, special care must be taken and domain-dependent prior knowledge must be taken into account. This paper summarizes recent relevant theoretical results, provides original analysis, and uses real examples to demonstrate the practical implications.
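The multiplicity issue raised in this abstract can be seen in a toy matrix completion problem. The example is not taken from the paper: complete the 2×2 matrix [[1, x], [y, 1]] whose diagonal is observed. Since the nuclear norm of a square matrix is at least the absolute value of its trace, no completion has nuclear norm below 2. The sketch compares three completions; the helper names are hypothetical.

```python
import numpy as np


def complete(x: float, y: float) -> np.ndarray:
    """Fill the two unobserved off-diagonal entries of the toy matrix."""
    return np.array([[1.0, x], [y, 1.0]])


def nuclear_norm(A: np.ndarray) -> float:
    """Sum of singular values."""
    return float(np.linalg.svd(A, compute_uv=False).sum())


def rank(A: np.ndarray) -> int:
    """Numerical rank with an explicit tolerance."""
    return int(np.linalg.matrix_rank(A, tol=1e-9))


candidates = {
    "identity": complete(0.0, 0.0),  # full rank
    "all-ones": complete(1.0, 1.0),  # rank one
    "skewed": complete(2.0, 0.5),    # also rank one
}

for name, A in candidates.items():
    print(f"{name:9s} rank={rank(A)} nuclear_norm={nuclear_norm(A):.3f}")
```

The identity and all-ones completions both attain the minimum nuclear norm of 2 even though their ranks differ, while the skewed completion has minimal rank but a larger nuclear norm. The nuclear-norm minimizer is therefore not unique here and can have higher rank than a valid low-rank completion, so extra prior knowledge is needed to pick a meaningful solution.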
DOI: https://doi.org/10.1109/DICTA.2014.7008126 · Published: 2014-11-01
Citations: 13