
Latest publications from the 2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)

Passenger Flow Estimation with Bipartite Matching on Bus Surveillance Cameras
Shunta Komatsu, Ryosuke Furuta, Y. Taniguchi
To formulate bus schedules and routes, bus companies monitor and gather data on passenger counts and the boarding section of each passenger for several days a year. However, this monitoring is currently performed manually and incurs considerable labor cost. To solve this problem, recent proposals analyze the images taken by the surveillance cameras installed in most modern Japanese buses. The previous methods identify boarding sections regardless of the payment method (e.g., IC cards) by matching people across the images obtained from different surveillance cameras. In this paper, we propose an improved method for estimating boarding sections that uses minimum-weight perfect matching on a bipartite graph, under the assumption that there is a one-to-one correspondence between the people appearing in two surveillance camera images. In addition, the proposed method takes into account the boarding-direction estimates output by person detection and tracking. To further improve the estimation accuracy, we employ a time constraint to handle the restricted movement of passengers on a bus. To confirm the effectiveness of the proposed method, we conduct experiments on images taken by actual bus surveillance cameras. The results show that the proposed method achieves significantly better results than the previous method.
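As an illustration of the matching step, minimum-weight perfect matching on a bipartite graph with a time constraint can be implemented with SciPy's assignment solver. The appearance embeddings and timestamps below are toy stand-ins for detector outputs; this is not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_passengers(board_feats, alight_feats, board_times, alight_times):
    """Match boarding detections to alighting detections via minimum-weight
    perfect matching on a bipartite graph. Appearance distance is the edge
    cost; pairs violating the time constraint (alighting before boarding)
    are forbidden with a prohibitively large cost."""
    cost = np.linalg.norm(
        board_feats[:, None, :] - alight_feats[None, :, :], axis=-1)
    forbidden = alight_times[None, :] <= board_times[:, None]
    cost[forbidden] = 1e9  # time constraint: cannot alight before boarding
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return list(zip(rows.tolist(), cols.tolist()))

# Toy example: two passengers seen by both cameras.
board = np.array([[0.0, 1.0], [1.0, 0.0]])
alight = np.array([[1.1, 0.1], [0.1, 0.9]])
pairs = match_passengers(board, alight,
                         board_times=np.array([0, 1]),
                         alight_times=np.array([5, 6]))
print(pairs)  # each boarding detection paired with its nearest alighting one
```

The one-to-one correspondence assumption from the abstract is exactly what makes the assignment-problem formulation applicable.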
{"title":"Passenger Flow Estimation with Bipartite Matching on Bus Surveillance Cameras","authors":"Shunta Komatsu, Ryosuke Furuta, Y. Taniguchi","doi":"10.1109/MIPR51284.2021.00038","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00038","url":null,"abstract":"To formulate the schedules and routes of buses, bus companies monitor and gather data on the number of passengers and the boarding sections for each passenger several days a year. The problem is, however, that this monitoring is currently performed manually and requires a great deal of human cost. To solve this problem, recent proposals analyze the images taken by the surveillance cameras installed in most modern Japanese buses. The previous methods make it possible to identify the boarding sections regardless of the payment method like IC cards by matching people in the images obtained from different surveillance cameras. In this paper, we propose an improved method for estimating boarding sections; it uses minimum weight perfect matching on a bipartite graph; the assumption is that there exists one-to-one correspondence between people appearing in two surveillance camera images. In addition, the proposed method takes the boarding direction estimates output by person detection and tracking into account. To further improve the estimation accuracy, we employ a time constraint to handle the restricted movement of passengers on a bus. To confirm the effectiveness of the proposed method, we conduct experiments on the images taken by actual bus surveillance cameras. 
The results show that the proposed method achieves significantly better results than the previous method.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117098935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Fact-checking Assistant System for Textual Documents*
Tomoya Furuta, Yumiko Suzuki
This paper proposes a system for identifying which parts of a textual document editors should fact-check. By identifying the descriptions that need fact-checking, our system reduces editors’ time and effort. To accomplish this, we construct a machine-learning-based sentence classifier that assigns parts of a document to four classes according to the necessity of fact-checking. We assume that there are typical descriptions that contain misinformation. Therefore, if we collect documents together with their revised versions, labeled by whether each revision is a correction, we can construct the classifier by learning from this dataset. To build the dataset, we extract from the Wikipedia edit history a set of sentences that have been revised more than once; the labels indicate the degree of correction applied by editors. We develop a Web-based system to demonstrate the proposed approach: given input text, the system predicts which parts of the text editors should re-confirm.
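A minimal sketch of such a four-class sentence classifier, using TF-IDF features and logistic regression. The class names and toy training sentences below are invented; the paper's actual labels come from Wikipedia edit history.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical four classes, ordered by necessity of fact-checking.
CLASSES = ["no-check", "low", "medium", "must-check"]

# Toy stand-ins for sentences labeled from revision history.
train_sentences = [
    "The city was founded in 1872 by settlers.",
    "Some people say the building is old.",
    "The population is 1,234,567 as of 2020.",
    "It is a nice place to visit.",
]
train_labels = ["must-check", "low", "must-check", "no-check"]

# TF-IDF unigrams/bigrams feed a multiclass logistic regression.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(train_sentences, train_labels)

pred = clf.predict(["The tower is 333 meters tall."])[0]
print(pred)
```

A real system would train on many thousands of labeled revision pairs rather than this toy set.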
{"title":"A Fact-checking Assistant System for Textual Documents*","authors":"Tomoya Furuta, Yumiko Suzuki","doi":"10.1109/MIPR51284.2021.00046","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00046","url":null,"abstract":"This paper proposes a system for identifying which parts of textual documents the editors should do fact-checking. Using our system, we can reduce editors’ time and efforts by identifying descriptions that need fact-checking. To accomplish this purpose, we construct a machine-learning-based classifier of sentences, which classifies a part of documents into four classes: according to the necessity of fact-checking. We assume that there are typical descriptions which contain misinformation. Therefore, if we collect the documents and their revised documents, and labels whether their revisions are corrections or not, we can construct the classifier by learning the dataset. To construct this classifier, we build a dataset that includes a set of sentences which are revised more than once, from Wikipedia edit history. The labels indicate the degree of sentence corrections by editors. We develop a Web-based system for demonstrating our proposed approach. When we input texts, the system predicts which parts of the texts the editors should re-confirm the facts.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123557809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
TLV-Bandit: Bandit Method for Collecting Topic-related Local Tweets
Carina Miwa Yoshimura, H. Kitagawa
Twitter hosts a large and diverse amount of information, forming a corpus of data valuable to a wide range of institutions, from marketing firms to governments. Collected tweets enable analyses such as public-opinion surveys, marketing analysis, or target analysis of users who live in a specific area. To collect useful data for a given task, the ability to capture tweets on a specific topic sent from a specific area is needed. However, performing this kind of task on a sizable data source such as the Twitter stream using just the Twitter API is a major challenge because of usage restrictions and the scarcity of geotags. In this work, we propose "TLV-Bandit", which collects topic-related tweets sent from a specific area based on a bandit algorithm, and analyze its performance. The experimental results show that, considering the three aspects of the collection requirements, Locality (sent from the target area), Similarity (topic-related), and Volume (number of tweets), our proposed method collects the target tweets more efficiently than other methods.
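The bandit-driven collection loop can be illustrated with a simple epsilon-greedy policy; the paper's specific TLV-Bandit reward design is not reproduced here, and the arm names and hit rates below are invented for the simulation.

```python
import random

def epsilon_greedy_collect(arms, reward_fn, rounds=500, eps=0.1, seed=0):
    """Epsilon-greedy bandit: each arm is a candidate search query for the
    target area/topic; the reward is 1 when a fetched tweet satisfies both
    the locality and topic-similarity requirements."""
    rng = random.Random(seed)
    counts = {a: 0 for a in arms}
    values = {a: 0.0 for a in arms}   # running mean reward per arm
    total = 0
    for _ in range(rounds):
        if rng.random() < eps:
            arm = rng.choice(arms)                    # explore
        else:
            arm = max(arms, key=lambda a: values[a])  # exploit best estimate
        r = reward_fn(arm, rng)
        counts[arm] += 1
        values[arm] += (r - values[arm]) / counts[arm]  # incremental mean
        total += r
    return total, values

# Simulated hit rates: probability a query yields a topic-related local tweet.
hit_rates = {"query_a": 0.1, "query_b": 0.5, "query_c": 0.2}
total, est = epsilon_greedy_collect(
    list(hit_rates), lambda a, rng: int(rng.random() < hit_rates[a]))
print(total, est)
```

In a real collector, pulling an arm would issue a rate-limited API query, which is exactly why allocating pulls to high-yield queries matters.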
{"title":"TLV-Bandit: Bandit Method for Collecting Topic-related Local Tweets","authors":"Carina Miwa Yoshimura, H. Kitagawa","doi":"10.1109/MIPR51284.2021.00016","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00016","url":null,"abstract":"Twitter hosts a large and diverse amount of information that makes up a corpus of data valuable to a wide range of institutions from marketing firms to governments. Collection of tweets can enable analysis like surveys of public opinions, marketing analysis or target analysis to users who live in a specific area. To collect useful data for a given task, the ability to capture tweets related to a specific topic sent from a specific area is needed. However, performing this kind of task on significantly sizable data sources such as the twitter stream data using just the Twitter API is a big challenge because of limitation relating to usage restrictions and lack of geotags. In this work, we propose \"TLV-Bandit\", which collects topic-related tweets sent from a specific area based on the bandit algorithm and analyze its performance. 
The experimental results show that our proposed method can collect efficiently the target tweets in comparison to other methods when considering the three aspects of collection requirements: Locality (sent from the target area), Similarity (topic-related) and Volume (number of tweets).","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132610253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Learning-based Tensor Decomposition with Adaptive Rank Penalty for CNNs Compression
Deli Yu, Peipei Yang, Cheng-Lin Liu
Low-rank tensor decomposition is a widely used strategy for compressing convolutional neural networks (CNNs). Existing learning-based decomposition methods encourage low-rank filter weights during training via regularizers such as the filters’ pair-wise force or the nuclear norm. However, these methods cannot obtain a satisfactory low-rank structure. We propose a new method with an adaptive rank penalty to learn more compact CNNs. Specifically, we transform the rank constraint into a differentiable one and impose its adaptive violation-aware penalty on the filters. Moreover, this paper is the first work to integrate learning-based decomposition and group decomposition to achieve a better trade-off, especially for the tough task of compressing 1×1 convolutions. The obtained low-rank model can be easily decomposed while nearly retaining full accuracy, without an additional fine-tuning process. The effectiveness is verified by compression experiments on VGG and ResNet with CIFAR-10 and ILSVRC-2012.
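As background for the low-rank idea: a 1×1 convolution is a matrix multiply over channels, so truncated SVD shows how a low-rank weight factors into two thin layers with far fewer parameters. This is a generic sketch of the decomposition principle, not the authors' adaptive-penalty training.

```python
import numpy as np

def low_rank_decompose(W, rank):
    """Factor a (out_ch, in_ch) weight matrix into two smaller factors via
    truncated SVD -- the principle behind decomposing a 1x1 convolution,
    which is a matrix multiply over channels."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # (out_ch, rank): 1x1 conv to `rank` channels
    B = Vt[:rank, :]             # (rank, in_ch): 1x1 conv from input channels
    return A, B

rng = np.random.default_rng(0)
# Construct a weight that is exactly rank 4, so truncation at rank 4 is lossless.
W = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 32))
A, B = low_rank_decompose(W, rank=4)
params_before = W.size            # 64 * 32 = 2048
params_after = A.size + B.size    # 64*4 + 4*32 = 384
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(params_before, params_after, float(err))
```

The paper's contribution is making the *trained* weights land close to such a low-rank form, so this post-hoc factorization loses almost no accuracy.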
{"title":"Learning-based Tensor Decomposition with Adaptive Rank Penalty for CNNs Compression","authors":"Deli Yu, Peipei Yang, Cheng-Lin Liu","doi":"10.1109/MIPR51284.2021.00014","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00014","url":null,"abstract":"Low-rank tensor decomposition is a widely-used strategy to compress convolutional neural networks (CNNs). Existing learning-based decomposition methods encourage low-rank filter weights via regularizer of filters’ pair-wise force or nuclear norm during training. However, these methods can not obtain the satisfactory low-rank structure. We propose a new method with an adaptive rank penalty to learn more compact CNNs. Specifically, we transform rank constraint into a differentiable one and impose its adaptive violation-aware penalty on filters. Moreover, this paper is the first work to integrate the learning-based decomposition and group decomposition to make a better trade-off, especially for the tough task of compression of 1×1 convolution.The obtained low-rank model can be easily decomposed while nearly keeping the full accuracy without additional fine-tuning process. The effectiveness is verified by compression experiments of VGG and ResNet on CIFAR-10 and ILSVRC-2012. 
Our method can reduce about 65% parameters of ResNet-110 with 0.04% Top-1 accuracy drop on CIFAR-10, and reduce about 60% parameters of ResNet-50 with 0.57% Top-1 accuracy drop on ILSVRC-2012.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128078325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Predicting inquiry from potential renters using property listing information
Takeshi So, Y. Arai
In this study, we examined how accurately the number of inquiries from potential tenants about rental housing can be predicted from the attributes of the housing, using multiple statistical methods, and compared the results. The purpose of this study is to present these results as case studies. Confusion matrices were calculated for the results of three methods, classical logistic regression, RandomForest, and XGBoost, and the prediction accuracies were verified. The results showed that XGBoost was the most accurate, followed by logistic regression. Because the differences in accuracy among the statistical methods are not significant, it is sometimes desirable to use logistic regression, which is easy to interpret from a business-application perspective. It is thus important in business to take into account accuracy, ease of interpretation, and the research structure when selecting the most appropriate statistical method.
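The evaluation protocol (fit several classifiers, compare accuracies and confusion matrices) can be sketched with scikit-learn. Synthetic data stands in for the listing attributes, and scikit-learn's GradientBoostingClassifier stands in for XGBoost to keep the example dependency-free.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for listing attributes vs. a binary "inquiry" label.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "gboost": GradientBoostingClassifier(random_state=0),  # XGBoost stand-in
}
results = {}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    pred = m.predict(X_te)
    # Keep both the headline accuracy and the full confusion matrix,
    # since the study compares methods on both.
    results[name] = (accuracy_score(y_te, pred), confusion_matrix(y_te, pred))

for name, (acc, cm) in results.items():
    print(name, round(acc, 3))
```

On real listing data, the interpretability comparison would also look at the logistic regression coefficients, which is the point the abstract makes about business use.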
{"title":"Predicting inquiry from potential renters using property listing information","authors":"Takeshi So, Y. Arai","doi":"10.1109/MIPR51284.2021.00053","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00053","url":null,"abstract":"In this study, we deduced how accurate the number of inquiries from potential tenants for housing available for rent can be predicted based on the attributes of the housing, using multiple statistical methods, and compared the results. The purpose of this study is to show these results as case studies. Confusion matrices were calculated based on the results deduced with three methods – the classical logistic regression, RandomForest, and XGBoost – and prediction accuracies were verified. The results showed that the accuracy of XGBoost was the highest, followed by that of logistic regression. It is sometimes desirable to use logistic regression, which is easy to interpret from the perspective of application to business, because the differences in accuracy among the statistical methods are not significant. It is thus important in business to take into account the accuracy, ease of interpretation, and research structure and select the most appropriate statistical method.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"88 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131780399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Identifying Maturity Rating Levels of Online Books
Eric Brewer, Yiu-Kai Ng
With the huge number of books available nowadays, it is a challenge to determine reading materials suitable for a reader, especially books that match the maturity levels of children and adolescents. Analyzing the age-appropriateness of books can be time-consuming, since it can take a human up to three hours to read a book, and the relatively low cost of creating literary content makes it even more difficult to discover age-suitable materials to read. To solve this problem, we propose a maturity-rating-level detection tool based on neural network models. Given the text of a book, the proposed model predicts its content rating level in each of seven categories: (i) crude humor/language; (ii) drug, alcohol, and tobacco use; (iii) kissing; (iv) profanity; (v) nudity; (vi) sex and intimacy; and (vii) violence and horror. The empirical study demonstrates that the mature content of online books can be accurately predicted by computers through natural language processing and machine learning techniques.
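Predicting a separate rating level for each of the seven categories is a multi-output classification problem. Below is a minimal scikit-learn sketch; the category keys, toy texts, and rating levels are invented for illustration and are not the paper's neural model or data.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# The seven rating categories from the abstract (keys are ad hoc).
CATEGORIES = ["crude_humor", "substance_use", "kissing", "profanity",
              "nudity", "sex_intimacy", "violence_horror"]

texts = ["a gentle tale of friendship",
         "a grim war story full of battles",
         "teen romance with a first kiss",
         "a quiet garden mystery"]
# Toy rating level per category per book (0 = none, 1 = mild, 2 = mature).
levels = np.array([[0, 0, 0, 0, 0, 0, 0],
                   [1, 0, 0, 1, 0, 0, 2],
                   [0, 1, 2, 0, 1, 1, 0],
                   [0, 0, 0, 0, 0, 0, 0]])

# One classifier per category, sharing a single TF-IDF representation.
model = make_pipeline(
    TfidfVectorizer(),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)))
model.fit(texts, levels)
pred = model.predict(["a war story with battles"])[0]
print(dict(zip(CATEGORIES, pred.tolist())))
```

The actual system trains neural networks on full book texts; the multi-output structure is the part this sketch demonstrates.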
{"title":"Identifying Maturity Rating Levels of Online Books","authors":"Eric Brewer, Yiu-Kai Ng","doi":"10.1109/MIPR51284.2021.00032","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00032","url":null,"abstract":"With the huge amount of books available nowadays, it is a challenge to determine appropriate reading materials that are suitable for a reader, especially books that match the maturity levels of children and adolescents. Analyzing the age-appropriateness for books can be a time-consuming process, since it can take up to three hours for a human to read a book, and the relatively low cost of creating literary content can cause it to be even more difficult to discover age-suitable materials to read. In order to solve this problem, we propose a maturity-rating-level detection tool based on neural network models. The proposed model predicts a book’s content rating level within each of the seven categories: (i) crude humor/language; (ii) drug, alcohol, and tobacco use; (iii) kissing; (iv) profanity; (v) nudity; (vi) sex and intimacy; and (vii) violence and horror, given the text of the book. The empirical study demonstrates that mature content of online books can be accurately predicted by computers through the use of natural language processing and machine learning techniques. 
Experimental results also verify the merit of the proposed model that outperforms a number of baseline models and well-known, existing maturity ratings prediction tools.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131264851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Pop’n Food: 3D Food Model Estimation System from a Single Image
Shu Naritomi, Keiji Yanai
Dietary calorie management has been an important topic in recent years, and various methods and applications for image-based food calorie estimation have been published in the multimedia community. Most existing methods estimate food calorie amounts using 2D-based image recognition. However, since actual food is a 3D object, the accuracy of calorie estimation with 2D-based methods is limited. Therefore, in our previous work, we proposed a method to reconstruct the 3D shapes of a dish (food and plate) and of the plate alone (without food) from a single 2D image, and to estimate a more accurate food volume. Research on such 3D reconstruction has been active recently, and it is necessary to qualitatively evaluate what kind of 3D shape has been reconstructed. However, checking a large number of 3D models reconstructed from many images requires many steps and is tedious. Against this background, this demo paper introduces an application named "Pop’n Food" with the following two functions: (1) a web application that visualizes a large number of images to check the learning results and the 3D models generated from them; (2) a web application that selects an image from a browser and generates and visualizes a 3D model in real time. This demo system is based on our previous work, Hungry Networks. Demo video: https://youtu.be/YyIu8bL65EE
{"title":"Pop’n Food: 3D Food Model Estimation System from a Single Image","authors":"Shu Naritomi, Keiji Yanai","doi":"10.1109/MIPR51284.2021.00041","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00041","url":null,"abstract":"Dietary calorie management has been an important topic in recent years, and various methods and applications on image-based food calorie estimation have been published in the multimedia community. Most of the existing methods of estimating food calorie amounts use 2D-based image recognition. However, since actual food is a 3D object, there is a limit to the accuracy of calorie estimation using 2D-based methods. Therefore, in our previous work, we proposed a method to reconstruct the 3D shape of the dish (food and plate) and a plate (without foods) from a single 2D image and estimate a more accurate food volume. Such researches on 3D reconstruction have been active recently, and it is necessary to qualitatively evaluate what kind of 3D shape has been reconstructed. However, checking a large number of 3D models reconstructed from a large number of images requires many steps and is tedious. Against this background, this demo paper introduces an application named \"Pop’n Food\" which has the following two functions: (1) A web application for visualizing a large number of images to check the learning results and the 3D model generated from them. (2) A web application that selects an image from a browser and generates and visualizes a 3D model in real-time. This demo system is based on our previous work named Hungry Networks. 
Demo video: https://youtu.be/YyIu8bL65EE","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116841943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Entity Resolution of Japanese Apartment Property Information Using Neural Networks
Y. Kado, Takashi Hirokata, Koji Matsumura, Xueting Wang, T. Yamasaki
In Japan, many real estate companies and agencies create apartment room property records and register them with real estate portal sites for advertising. These room records include the attributes of the apartment building. However, the building attribute values are not entered by referring to a common building database; they are created and entered arbitrarily by each company or agency. For effective use of property information, apartment rooms must be linked to the correct apartment building. In this regard, aggregating property information belonging to the same building (entity resolution) is typically performed by a rule-based process that statistically considers the similarity of attributes such as the building name, the number of floors, or the year and month the building was built. However, because property information is stored per room and registered by different businesses, the corresponding building information may be inconsistent, incomplete, or inaccurate. Therefore, entity resolution using a rule-based method is insufficient and requires extensive manual post-processing. This study proposes an entity resolution method for apartment properties using neural networks whose inputs contain traditional property attributes as well as new attributes obtained from phonetic and semantic pre-processing of building names.
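One way to sketch the pairwise attribute features such a matcher consumes, using plain string similarity as a stand-in for the paper's phonetic/semantic preprocessing of building names; the records below are invented.

```python
from difflib import SequenceMatcher

def pair_features(rec_a, rec_b):
    """Pairwise features for deciding whether two room listings refer to the
    same apartment building: a fuzzy building-name similarity plus agreement
    of the numeric attributes mentioned in the abstract (floors, build date)."""
    name_sim = SequenceMatcher(None, rec_a["name"].lower(),
                               rec_b["name"].lower()).ratio()
    same_floors = float(rec_a["floors"] == rec_b["floors"])
    same_built = float(rec_a["built"] == rec_b["built"])
    return [name_sim, same_floors, same_built]

a = {"name": "Sakura Heights", "floors": 10, "built": "2005-04"}
b = {"name": "SAKURA HEIGHTS BLDG", "floors": 10, "built": "2005-04"}
c = {"name": "Riverside Court", "floors": 3, "built": "1998-11"}

f_ab = pair_features(a, b)  # likely the same building despite the name variant
f_ac = pair_features(a, c)  # different building: low similarity, no agreement
print(f_ab)
print(f_ac)
```

In the paper these features (plus phonetic/semantic name attributes) feed a neural network that outputs a match decision, replacing hand-tuned rule thresholds.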
{"title":"Entity Resolution of Japanese Apartment Property Information Using Neural Networks","authors":"Y. Kado, Takashi Hirokata, Koji Matsumura, Xueting Wang, T. Yamasaki","doi":"10.1109/MIPR51284.2021.00052","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00052","url":null,"abstract":"In Japan, there are many real estate companies and agencies, who create apartment room property records and register them to some real estate portal sites to be advertised. The apartment room records include the apartment building attributes information. However, the building attributes values are not entered by referring to the common building database but are arbitrarily created and entered by each company or agency. For effective use of property information, apartment rooms must be linked to the correct apartment building. In this regard, aggregating property information belonging to the same building (entity resolution) is typically performed by a rule-based process that statistically considers the similarity of attributes such as the building name, number of floors, or year/month the building was built. However, when property information is stored by room and registered by different businesses, the corresponding building information may be inconsistent, incomplete, or inaccurate. Therefore, entity resolution using a rule-based method is insufficient and requires extensive manual post-processing. This study proposes an entity resolution method for apartment properties using neural networks with inputs containing traditional property attributes and new attributes obtained from the phonetic and semantic pre-processing of building names. 
The experimental results show that the proposed method improves entity resolution accuracy.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117214568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Demo Paper: Ad Hoc Search On Statistical Data Based On Categorization And Metadata Augmentation
T. Okamoto, H. Miyamori
In this paper, we describe a system for ad hoc search on statistical data based on categorization and metadata augmentation. The documents covered by this paper consist of metadata extracted from governmental statistical data and the body of the corresponding statistical data. The metadata is characterized by short document length, and the main body of the statistical data is almost always composed of numbers, apart from titles, headers, and comments. We developed a categorical search that narrows the set of documents to be retrieved by category, in order to properly capture the scope of the problem domain intended by a given query. In addition, to compensate for the short document length of the metadata, we implemented a method that extracts the header information of each table from the main body of the statistical data to augment the documents to be searched. As the ranking model, we adopted BM25, which can be adjusted with a few parameters to take term frequency and document length into account.
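BM25's behavior on short metadata-style documents can be sketched directly from its standard formula; the toy tokenized documents below are invented.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25:
    a saturating term-frequency weight, scaled by inverse document frequency
    and normalized by document length relative to the average length."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()
    for d in docs:
        df.update(set(d))           # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            norm = 1 - b + b * len(d) / avgdl   # length normalization
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * norm)
        scores.append(s)
    return scores

docs = [["population", "by", "prefecture", "2020"],
        ["household", "income", "survey"],
        ["population", "census", "summary", "tables", "population"]]
scores = bm25_scores(["population", "census"], docs)
print(scores)  # the last document matches both query terms and scores highest
```

The k1 and b parameters are the "few parameters" the abstract refers to: k1 controls term-frequency saturation and b controls how strongly document length is normalized, which matters when metadata documents are uniformly short.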
{"title":"Demo Paper: Ad Hoc Search On Statistical Data Based On Categorization And Metadata Augmentation","authors":"T. Okamoto, H. Miyamori","doi":"10.1109/MIPR51284.2021.00043","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00043","url":null,"abstract":"In this paper, we describe the system of ad hoc search on statistical data based on categorization and metadata augmentation. The documents covered by this paper consist of metadata extracted from the governmental statistical data and the body of the corresponding statistical data. The metadata is characterized by the fact that its document length is short, and the main body of statistical data is almost always composed of numbers, except for titles, headers, and comments. We newly developed the categorical search that narrows the set of documents to be retrieved by category in order to properly capture the scope of the problem domain intended by the given query. In addition, to compensate for the short document length of metadata, we implemented a method of extracting the header information of the table from the main body of statistical data to augment documents to be searched. As a ranking model, we adopted BM25, which can be adjusted with few parameters to take into account term frequency and document length.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114896429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
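The demo paper above ranks documents with BM25; the abstract does not state its parameter settings. A minimal sketch of the standard Okapi BM25 per-term score, with the usual default parameters `k1=1.2` and `b=0.75` assumed:

```python
import math

def bm25_score(tf, df, N, dl, avgdl, k1=1.2, b=0.75):
    """Okapi BM25 contribution of one query term to one document's score.

    tf:    term frequency in the document
    df:    number of documents containing the term
    N:     total number of documents in the collection
    dl:    length of this document (in terms)
    avgdl: average document length in the collection
    """
    # Inverse document frequency (the "+1" form avoids negative values).
    idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
    # Saturating term-frequency component with length normalization via b.
    return idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))
```

The `b` parameter controls how strongly long documents are penalized, which is exactly the term-frequency/document-length trade-off the abstract mentions; the system's header-augmented metadata documents would simply be scored term by term with this function and summed over query terms.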
Respective Volumetric Heatmap Autoencoder for Multi-Person 3D Pose Estimation
Minghao Wang, Long Ye, Fei Hu, Li Fang, Wei Zhong, Qin Zhang
Using heatmaps to predict body joint locations has become one of the best-performing pose estimation approaches; however, these methods often place high demands on memory and computation, which makes them difficult to apply in practice. This paper proposes an effective compression method to reduce the size of heatmaps, the Respective Volumetric Heatmap Autoencoder (RVHA), which represents the ground-truth heatmaps with a smaller data size; an RVHA-based pose estimation framework is then built to recover human joint locations from monocular RGB images. Thanks to our compression strategy, which takes each human joint's volumetric heatmap as an individual input frame, our method performs favorably compared to the state of the art on the JTA dataset.
{"title":"Respective Volumetric Heatmap Autoencoder for Multi-Person 3D Pose Estimation","authors":"Minghao Wang, Long Ye, Fei Hu, Li Fang, Wei Zhong, Qin Zhang","doi":"10.1109/MIPR51284.2021.00070","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00070","url":null,"abstract":"Using heatmaps to predict body joint locations has become one of the best performing pose estimation methods, however, these methods often have the high demands for memory and computation, which make them difficult to apply into practice. This paper proposes an effective compression method to reduce the size of heatmaps, namely lies Respective Volumetric Heatmap Autoencoder(RVHA) to represent the ground truth heatmaps with smaller data size, then a RVHA-based pose estimation framework is built to achieve the human joint locations from monocular RGB images. Thanks to our compression strategy which takes each human joint volumetric heatmap as an input frame individually, our method performs favorably when compared to state of the art on the JTA datasets.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133013558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
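The RVHA paper above works with per-joint volumetric heatmaps; the autoencoder itself is not specified in the abstract. A minimal sketch of the underlying representation: a joint location encoded as a 3D Gaussian heatmap, and the location recovered by taking the volume's argmax (the grid size and sigma are illustrative assumptions):

```python
import numpy as np

def make_heatmap(shape, center, sigma=1.5):
    """Encode a joint location as a 3D Gaussian volumetric heatmap."""
    grid = np.indices(shape)  # one coordinate grid per axis
    d2 = sum((g - c) ** 2 for g, c in zip(grid, center))
    return np.exp(-d2 / (2 * sigma ** 2))

def joint_from_heatmap(vol):
    """Recover the (x, y, z) voxel index of the heatmap's peak."""
    return np.unravel_index(np.argmax(vol), vol.shape)
```

Storing one such float volume per joint is what makes the memory cost high (a 64³ volume per joint per person); RVHA's contribution is learning a compact latent code for each joint's volume instead of storing it densely.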
Journal
2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)