Passenger Flow Estimation with Bipartite Matching on Bus Surveillance Cameras
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00038
Shunta Komatsu, Ryosuke Furuta, Y. Taniguchi
To plan bus schedules and routes, bus companies monitor and gather data on the number of passengers and the boarding section of each passenger several days a year. However, this monitoring is currently performed manually and incurs substantial labor costs. To solve this problem, recent proposals analyze the images taken by the surveillance cameras installed in most modern Japanese buses. These previous methods identify boarding sections regardless of the payment method (e.g., IC cards) by matching people across images obtained from different surveillance cameras. In this paper, we propose an improved method for estimating boarding sections that uses minimum-weight perfect matching on a bipartite graph, under the assumption that there is a one-to-one correspondence between the people appearing in two surveillance camera images. In addition, the proposed method takes into account the boarding direction estimates output by person detection and tracking. To further improve estimation accuracy, we employ a time constraint that reflects the restricted movement of passengers on a bus. To confirm the effectiveness of the proposed method, we conduct experiments on images taken by actual bus surveillance cameras. The results show that the proposed method performs significantly better than the previous method.
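As an illustration of the matching step described above, here is a minimal sketch using SciPy's Hungarian-algorithm solver; the cost matrix, timestamps, and the large-constant handling of the time constraint are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical appearance-dissimilarity costs between people seen by
# camera A (rows) and camera B (columns).
cost = np.array([[0.2, 0.9, 0.8],
                 [0.7, 0.1, 0.6],
                 [0.9, 0.8, 0.3]])

# Time constraint: a passenger cannot leave the bus before boarding it,
# so pairs whose exit time precedes the entry time get a prohibitive cost.
entry_t = np.array([10.0, 25.0, 40.0])    # hypothetical timestamps (s)
exit_t = np.array([300.0, 20.0, 500.0])
BIG = 1e6  # large finite cost (np.inf could make the problem infeasible)
cost = np.where(entry_t[:, None] >= exit_t[None, :], BIG, cost)

# Minimum-weight perfect matching on the bipartite graph
# (Hungarian algorithm).
rows, cols = linear_sum_assignment(cost)
for r, c in zip(rows, cols):
    print(f"camera-A person {r} -> camera-B person {c}")
```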
{"title":"Passenger Flow Estimation with Bipartite Matching on Bus Surveillance Cameras","authors":"Shunta Komatsu, Ryosuke Furuta, Y. Taniguchi","doi":"10.1109/MIPR51284.2021.00038","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00038","url":null,"abstract":"To formulate the schedules and routes of buses, bus companies monitor and gather data on the number of passengers and the boarding sections for each passenger several days a year. The problem is, however, that this monitoring is currently performed manually and requires a great deal of human cost. To solve this problem, recent proposals analyze the images taken by the surveillance cameras installed in most modern Japanese buses. The previous methods make it possible to identify the boarding sections regardless of the payment method like IC cards by matching people in the images obtained from different surveillance cameras. In this paper, we propose an improved method for estimating boarding sections; it uses minimum weight perfect matching on a bipartite graph; the assumption is that there exists one-to-one correspondence between people appearing in two surveillance camera images. In addition, the proposed method takes the boarding direction estimates output by person detection and tracking into account. To further improve the estimation accuracy, we employ a time constraint to handle the restricted movement of passengers on a bus. To confirm the effectiveness of the proposed method, we conduct experiments on the images taken by actual bus surveillance cameras. The results show that the proposed method achieves significantly better results than the previous method.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117098935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Fact-checking Assistant System for Textual Documents
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00046
Tomoya Furuta, Yumiko Suzuki
This paper proposes a system for identifying which parts of textual documents editors should fact-check. Using our system, editors can save time and effort by focusing on the descriptions that need fact-checking. To accomplish this, we construct a machine-learning-based sentence classifier that assigns each part of a document to one of four classes according to the necessity of fact-checking. We assume that misinformation tends to appear in typical patterns of description. Therefore, if we collect documents together with their revised versions, and label whether each revision is a correction, we can train the classifier on the resulting dataset. To construct this classifier, we build a dataset of sentences that were revised more than once, drawn from the Wikipedia edit history; the labels indicate the degree to which editors corrected each sentence. We develop a Web-based system to demonstrate the proposed approach: given input text, the system predicts which parts of the text editors should re-confirm.
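The abstract does not specify the classifier's architecture, so the following is only a minimal sketch of a four-class sentence classifier using a TF-IDF plus logistic regression baseline; the sentences and labels are invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented sentences; labels 0-3 stand for the four classes of
# fact-checking necessity (0 = none ... 3 = must be checked).
sentences = [
    "The city was founded in 1872.",
    "The bridge spans 1,991 meters.",
    "Many residents consider the view lovely.",
    "The park is a pleasant place to walk.",
]
labels = [3, 3, 0, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(sentences, labels)
print(clf.predict(["The tower was completed in 1958."]))
```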
{"title":"A Fact-checking Assistant System for Textual Documents*","authors":"Tomoya Furuta, Yumiko Suzuki","doi":"10.1109/MIPR51284.2021.00046","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00046","url":null,"abstract":"This paper proposes a system for identifying which parts of textual documents the editors should do fact-checking. Using our system, we can reduce editors’ time and efforts by identifying descriptions that need fact-checking. To accomplish this purpose, we construct a machine-learning-based classifier of sentences, which classifies a part of documents into four classes: according to the necessity of fact-checking. We assume that there are typical descriptions which contain misinformation. Therefore, if we collect the documents and their revised documents, and labels whether their revisions are corrections or not, we can construct the classifier by learning the dataset. To construct this classifier, we build a dataset that includes a set of sentences which are revised more than once, from Wikipedia edit history. The labels indicate the degree of sentence corrections by editors. We develop a Web-based system for demonstrating our proposed approach. When we input texts, the system predicts which parts of the texts the editors should re-confirm the facts.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123557809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TLV-Bandit: Bandit Method for Collecting Topic-related Local Tweets
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00016
Carina Miwa Yoshimura, H. Kitagawa
Twitter hosts a large and diverse amount of information that makes up a corpus of data valuable to a wide range of institutions, from marketing firms to governments. Collecting tweets enables analyses such as surveys of public opinion, marketing analysis, or analysis targeting users who live in a specific area. To collect useful data for a given task, the ability to capture tweets related to a specific topic and sent from a specific area is needed. However, performing this kind of task on a sizable data source such as the Twitter stream using just the Twitter API is a big challenge because of usage restrictions and the scarcity of geotags. In this work, we propose "TLV-Bandit", which collects topic-related tweets sent from a specific area based on a bandit algorithm, and analyze its performance. The experimental results show that our proposed method collects the target tweets more efficiently than other methods with respect to the three collection requirements: Locality (sent from the target area), Similarity (topic-related), and Volume (number of tweets).
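As a sketch of how a bandit algorithm can drive such collection (the paper's exact arm and reward design are not given here), the snippet below runs UCB1 over hypothetical query arms, with a simulated stand-in for the Twitter API and a reward that combines the Locality and Similarity requirements over each batch:

```python
import math
import random

# Hypothetical arms: candidate search queries used to pull tweets.
arms = ["query_a", "query_b", "query_c"]
counts = {a: 0 for a in arms}
values = {a: 0.0 for a in arms}

def collect_tweets(query):
    # Stand-in for the Twitter API: simulate a batch of 20 tweets with
    # random topic/area flags (a real system would call the API here).
    return [{"on_topic": random.random() < 0.5,
             "in_area": random.random() < 0.3} for _ in range(20)]

def reward(batch):
    # Hypothetical reward: fraction of the batch that is both
    # topic-related (Similarity) and inside the target area (Locality);
    # the batch size itself reflects Volume.
    hits = sum(t["on_topic"] and t["in_area"] for t in batch)
    return hits / max(len(batch), 1)

for t in range(1, 101):
    untried = [a for a in arms if counts[a] == 0]
    if untried:
        arm = untried[0]  # play every arm once first
    else:
        # UCB1: exploit the running mean plus an exploration bonus.
        arm = max(arms, key=lambda a: values[a]
                  + math.sqrt(2 * math.log(t) / counts[a]))
    r = reward(collect_tweets(arm))
    counts[arm] += 1
    values[arm] += (r - values[arm]) / counts[arm]  # incremental mean

print(max(arms, key=lambda a: values[a]))  # best-performing query so far
```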
{"title":"TLV-Bandit: Bandit Method for Collecting Topic-related Local Tweets","authors":"Carina Miwa Yoshimura, H. Kitagawa","doi":"10.1109/MIPR51284.2021.00016","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00016","url":null,"abstract":"Twitter hosts a large and diverse amount of information that makes up a corpus of data valuable to a wide range of institutions from marketing firms to governments. Collection of tweets can enable analysis like surveys of public opinions, marketing analysis or target analysis to users who live in a specific area. To collect useful data for a given task, the ability to capture tweets related to a specific topic sent from a specific area is needed. However, performing this kind of task on significantly sizable data sources such as the twitter stream data using just the Twitter API is a big challenge because of limitation relating to usage restrictions and lack of geotags. In this work, we propose \"TLV-Bandit\", which collects topic-related tweets sent from a specific area based on the bandit algorithm and analyze its performance. The experimental results show that our proposed method can collect efficiently the target tweets in comparison to other methods when considering the three aspects of collection requirements: Locality (sent from the target area), Similarity (topic-related) and Volume (number of tweets).","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132610253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning-based Tensor Decomposition with Adaptive Rank Penalty for CNNs Compression
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00014
Deli Yu, Peipei Yang, Cheng-Lin Liu
Low-rank tensor decomposition is a widely used strategy for compressing convolutional neural networks (CNNs). Existing learning-based decomposition methods encourage low-rank filter weights during training via regularizers based on pairwise forces between filters or the nuclear norm. However, these methods cannot obtain a satisfactory low-rank structure. We propose a new method with an adaptive rank penalty to learn more compact CNNs. Specifically, we transform the rank constraint into a differentiable one and impose an adaptive violation-aware penalty on the filters. Moreover, this paper is the first work to integrate learning-based decomposition with group decomposition to achieve a better trade-off, especially for the difficult task of compressing 1×1 convolutions. The obtained low-rank model can be easily decomposed while nearly retaining full accuracy, without an additional fine-tuning process. The effectiveness is verified by compression experiments with VGG and ResNet on CIFAR-10 and ILSVRC-2012. Our method reduces the parameters of ResNet-110 by about 65% with a 0.04% Top-1 accuracy drop on CIFAR-10, and the parameters of ResNet-50 by about 60% with a 0.57% Top-1 accuracy drop on ILSVRC-2012.
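For intuition about why a low-rank weight matrix compresses a layer, here is a sketch of the standard truncated-SVD factorization of a 1×1 convolution into two smaller ones; this shows only the decomposition step applied to a stand-in weight, not the paper's adaptive rank penalty:

```python
import numpy as np

out_c, in_c, rank = 256, 256, 64
W = np.random.randn(out_c, in_c)  # stand-in for a learned 1x1 conv kernel

# W ~= (U * sqrt(S)) @ (sqrt(S) * Vt): two stacked 1x1 convolutions.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
W1 = Vt[:rank, :] * S[:rank, None] ** 0.5   # first 1x1 conv: in_c -> rank
W2 = U[:, :rank] * S[None, :rank] ** 0.5    # second 1x1 conv: rank -> out_c

# Parameter count drops from out_c*in_c to rank*(in_c + out_c).
print(W.size, W1.size + W2.size)
# Relative approximation error introduced by truncating to `rank`.
print(np.linalg.norm(W - W2 @ W1) / np.linalg.norm(W))
```

The paper's contribution is to drive the learned weights toward such a low-rank structure during training, so that this truncation loses almost nothing.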
{"title":"Learning-based Tensor Decomposition with Adaptive Rank Penalty for CNNs Compression","authors":"Deli Yu, Peipei Yang, Cheng-Lin Liu","doi":"10.1109/MIPR51284.2021.00014","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00014","url":null,"abstract":"Low-rank tensor decomposition is a widely-used strategy to compress convolutional neural networks (CNNs). Existing learning-based decomposition methods encourage low-rank filter weights via regularizer of filters’ pair-wise force or nuclear norm during training. However, these methods can not obtain the satisfactory low-rank structure. We propose a new method with an adaptive rank penalty to learn more compact CNNs. Specifically, we transform rank constraint into a differentiable one and impose its adaptive violation-aware penalty on filters. Moreover, this paper is the first work to integrate the learning-based decomposition and group decomposition to make a better trade-off, especially for the tough task of compression of 1×1 convolution.The obtained low-rank model can be easily decomposed while nearly keeping the full accuracy without additional fine-tuning process. The effectiveness is verified by compression experiments of VGG and ResNet on CIFAR-10 and ILSVRC-2012. Our method can reduce about 65% parameters of ResNet-110 with 0.04% Top-1 accuracy drop on CIFAR-10, and reduce about 60% parameters of ResNet-50 with 0.57% Top-1 accuracy drop on ILSVRC-2012.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128078325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting inquiry from potential renters using property listing information
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00053
Takeshi So, Y. Arai
In this study, we examined how accurately the number of inquiries from potential tenants about rental housing can be predicted from the attributes of the housing, using multiple statistical methods, and compared the results. The purpose of this study is to present these results as case studies. Confusion matrices were computed for three methods (classical logistic regression, random forest, and XGBoost) and their prediction accuracies were verified. The results showed that XGBoost was the most accurate, followed by logistic regression. Because the differences in accuracy among the methods are not large, it is sometimes preferable to use logistic regression, which is easy to interpret from the perspective of application to business. It is thus important in business to take into account accuracy, ease of interpretation, and the research design when selecting the most appropriate statistical method.
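A minimal sketch of this comparison protocol, using synthetic data in place of the property-listing dataset (model settings are illustrative defaults; xgboost is a third-party package):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic stand-in for "received an inquiry" vs. "did not".
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "xgboost": XGBClassifier(eval_metric="logloss"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(name, accuracy_score(y_te, pred))
    print(confusion_matrix(y_te, pred))
```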
{"title":"Predicting inquiry from potential renters using property listing information","authors":"Takeshi So, Y. Arai","doi":"10.1109/MIPR51284.2021.00053","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00053","url":null,"abstract":"In this study, we deduced how accurate the number of inquiries from potential tenants for housing available for rent can be predicted based on the attributes of the housing, using multiple statistical methods, and compared the results. The purpose of this study is to show these results as case studies. Confusion matrices were calculated based on the results deduced with three methods – the classical logistic regression, RandomForest, and XGBoost – and prediction accuracies were verified. The results showed that the accuracy of XGBoost was the highest, followed by that of logistic regression. It is sometimes desirable to use logistic regression, which is easy to interpret from the perspective of application to business, because the differences in accuracy among the statistical methods are not significant. It is thus important in business to take into account the accuracy, ease of interpretation, and research structure and select the most appropriate statistical method.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"88 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131780399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Identifying Maturity Rating Levels of Online Books
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00032
Eric Brewer, Yiu-Kai Ng
With the huge number of books available nowadays, it is a challenge to determine reading materials that are suitable for a reader, especially books that match the maturity levels of children and adolescents. Analyzing the age-appropriateness of books can be time-consuming, since it can take a human up to three hours to read a book, and the relatively low cost of creating literary content makes it even more difficult to discover age-suitable reading material. To solve this problem, we propose a maturity-rating-level detection tool based on neural network models. Given the text of a book, the proposed model predicts its content rating level in each of seven categories: (i) crude humor/language; (ii) drug, alcohol, and tobacco use; (iii) kissing; (iv) profanity; (v) nudity; (vi) sex and intimacy; and (vii) violence and horror. The empirical study demonstrates that the mature content of online books can be accurately predicted through natural language processing and machine learning techniques. Experimental results also verify the merit of the proposed model, which outperforms a number of baseline models and well-known existing maturity-rating prediction tools.
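One simple way to realize per-category rating prediction (a baseline sketch, not the paper's neural architecture) is a multi-output classifier over TF-IDF features, with one rating-level output per category; the texts and levels below are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

texts = ["first book text sample", "second book text sample",
         "third book text sample"]
# Hypothetical rating levels (0 = none, 1 = mild, 2 = moderate) for the
# seven categories listed in the abstract, one column per category.
y = [[0, 0, 1, 0, 0, 0, 2],
     [1, 0, 0, 2, 1, 1, 0],
     [0, 1, 0, 0, 0, 0, 1]]

X = TfidfVectorizer().fit_transform(texts)
# One classifier per category, trained jointly over shared features.
clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print(clf.predict(X[:1]))  # predicted level for each of the 7 categories
```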
{"title":"Identifying Maturity Rating Levels of Online Books","authors":"Eric Brewer, Yiu-Kai Ng","doi":"10.1109/MIPR51284.2021.00032","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00032","url":null,"abstract":"With the huge amount of books available nowadays, it is a challenge to determine appropriate reading materials that are suitable for a reader, especially books that match the maturity levels of children and adolescents. Analyzing the age-appropriateness for books can be a time-consuming process, since it can take up to three hours for a human to read a book, and the relatively low cost of creating literary content can cause it to be even more difficult to discover age-suitable materials to read. In order to solve this problem, we propose a maturity-rating-level detection tool based on neural network models. The proposed model predicts a book’s content rating level within each of the seven categories: (i) crude humor/language; (ii) drug, alcohol, and tobacco use; (iii) kissing; (iv) profanity; (v) nudity; (vi) sex and intimacy; and (vii) violence and horror, given the text of the book. The empirical study demonstrates that mature content of online books can be accurately predicted by computers through the use of natural language processing and machine learning techniques. Experimental results also verify the merit of the proposed model that outperforms a number of baseline models and well-known, existing maturity ratings prediction tools.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131264851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pop’n Food: 3D Food Model Estimation System from a Single Image
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00041
Shu Naritomi, Keiji Yanai
Dietary calorie management has been an important topic in recent years, and various methods and applications for image-based food calorie estimation have been published in the multimedia community. Most existing methods for estimating food calorie amounts use 2D image recognition. However, since actual food is a 3D object, there is a limit to the accuracy of calorie estimation with 2D-based methods. Therefore, in our previous work, we proposed a method to reconstruct, from a single 2D image, the 3D shape of both the dish (food and plate) and the plate alone (without food), and to estimate food volume more accurately. Research on 3D reconstruction has been active recently, and it is necessary to qualitatively evaluate the kinds of 3D shapes that are reconstructed. However, checking the many 3D models reconstructed from a large number of images requires many steps and is tedious. Against this background, this demo paper introduces an application named "Pop’n Food" with the following two functions: (1) a web application that visualizes a large number of images and the 3D models generated from them, for checking the learning results; and (2) a web application that lets the user select an image in the browser and generates and visualizes a 3D model in real time. The demo system is based on our previous work, Hungry Networks. Demo video: https://youtu.be/YyIu8bL65EE
{"title":"Pop’n Food: 3D Food Model Estimation System from a Single Image","authors":"Shu Naritomi, Keiji Yanai","doi":"10.1109/MIPR51284.2021.00041","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00041","url":null,"abstract":"Dietary calorie management has been an important topic in recent years, and various methods and applications on image-based food calorie estimation have been published in the multimedia community. Most of the existing methods of estimating food calorie amounts use 2D-based image recognition. However, since actual food is a 3D object, there is a limit to the accuracy of calorie estimation using 2D-based methods. Therefore, in our previous work, we proposed a method to reconstruct the 3D shape of the dish (food and plate) and a plate (without foods) from a single 2D image and estimate a more accurate food volume. Such researches on 3D reconstruction have been active recently, and it is necessary to qualitatively evaluate what kind of 3D shape has been reconstructed. However, checking a large number of 3D models reconstructed from a large number of images requires many steps and is tedious. Against this background, this demo paper introduces an application named \"Pop’n Food\" which has the following two functions: (1) A web application for visualizing a large number of images to check the learning results and the 3D model generated from them. (2) A web application that selects an image from a browser and generates and visualizes a 3D model in real-time. This demo system is based on our previous work named Hungry Networks. Demo video: https://youtu.be/YyIu8bL65EE","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116841943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Entity Resolution of Japanese Apartment Property Information Using Neural Networks
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00052
Y. Kado, Takashi Hirokata, Koji Matsumura, Xueting Wang, T. Yamasaki
In Japan, many real estate companies and agencies create apartment room property records and register them with real estate portal sites to be advertised. These room records include attribute information about the apartment building. However, the building attribute values are not entered by referring to a common building database; instead, they are created and entered arbitrarily by each company or agency. For effective use of property information, apartment rooms must be linked to the correct apartment building. Aggregating property records that belong to the same building (entity resolution) is typically performed by a rule-based process that statistically considers the similarity of attributes such as the building name, the number of floors, or the year and month the building was built. However, because property information is stored per room and registered by different businesses, the corresponding building information may be inconsistent, incomplete, or inaccurate. Rule-based entity resolution is therefore insufficient and requires extensive manual post-processing. This study proposes an entity resolution method for apartment properties using neural networks whose inputs contain both traditional property attributes and new attributes obtained from phonetic and semantic pre-processing of building names. The experimental results show that the proposed method improves entity resolution accuracy.
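A stdlib-only sketch of the kind of pairwise features such a matcher might consume; difflib string similarity stands in here for the paper's phonetic and semantic preprocessing, the records are invented, and in the actual system a neural network, not this feature list alone, makes the match decision:

```python
from difflib import SequenceMatcher

def pair_features(a, b):
    # a, b: building records for the same room listing from two agencies.
    return [
        # Surface similarity of building names (the paper instead derives
        # phonetic/semantic representations before comparing).
        SequenceMatcher(None, a["name"], b["name"]).ratio(),
        1.0 if a["floors"] == b["floors"] else 0.0,  # floor count agrees
        1.0 if a["built"] == b["built"] else 0.0,    # build date agrees
    ]

rec1 = {"name": "グランドメゾン青山", "floors": 10, "built": "2005-04"}
rec2 = {"name": "グランドメゾンAOYAMA", "floors": 10, "built": "2005-04"}
print(pair_features(rec1, rec2))  # feature vector fed to the classifier
```

Note how the surface similarity of the two names is low even though they plausibly denote the same building; this is exactly the gap the phonetic preprocessing is meant to close.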
{"title":"Entity Resolution of Japanese Apartment Property Information Using Neural Networks","authors":"Y. Kado, Takashi Hirokata, Koji Matsumura, Xueting Wang, T. Yamasaki","doi":"10.1109/MIPR51284.2021.00052","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00052","url":null,"abstract":"In Japan, there are many real estate companies and agencies, who create apartment room property records and register them to some real estate portal sites to be advertised. The apartment room records include the apartment building attributes information. However, the building attributes values are not entered by referring to the common building database but are arbitrarily created and entered by each company or agency. For effective use of property information, apartment rooms must be linked to the correct apartment building. In this regard, aggregating property information belonging to the same building (entity resolution) is typically performed by a rule-based process that statistically considers the similarity of attributes such as the building name, number of floors, or year/month the building was built. However, when property information is stored by room and registered by different businesses, the corresponding building information may be inconsistent, incomplete, or inaccurate. Therefore, entity resolution using a rule-based method is insufficient and requires extensive manual post-processing. This study proposes an entity resolution method for apartment properties using neural networks with inputs containing traditional property attributes and new attributes obtained from the phonetic and semantic pre-processing of building names. The experimental results show that the proposed method improves entity resolution accuracy.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117214568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Demo Paper: Ad Hoc Search On Statistical Data Based On Categorization And Metadata Augmentation
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00043
T. Okamoto, H. Miyamori
In this paper, we describe a system for ad hoc search over statistical data based on categorization and metadata augmentation. The documents covered by this paper consist of metadata extracted from governmental statistical data together with the body of the corresponding statistical tables. The metadata is characterized by short document lengths, and the main body of the statistical data is almost entirely numeric, except for titles, headers, and comments. We newly developed a categorical search that narrows the set of documents to be retrieved by category, in order to properly capture the scope of the problem domain intended by a given query. In addition, to compensate for the short document length of the metadata, we implemented a method that extracts table header information from the main body of the statistical data to augment the documents to be searched. As the ranking model, we adopted BM25, which can be tuned with few parameters to take term frequency and document length into account.
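For reference, the BM25 score used as the ranking model can be computed as below; this self-contained sketch uses common default parameters (k1 = 1.2, b = 0.75) and toy documents, not the paper's corpus or tuning:

```python
import math
from collections import Counter

docs = [text.split() for text in [
    "population statistics by prefecture",
    "monthly labor force survey statistics",
    "household income and expenditure survey",
]]
k1, b = 1.2, 0.75  # the two tunable BM25 parameters
N = len(docs)
avgdl = sum(len(d) for d in docs) / N          # average document length
df = Counter(term for d in docs for term in set(d))  # document frequency

def bm25(query, doc):
    tf = Counter(doc)
    score = 0.0
    for q in query.split():
        if q not in tf:
            continue
        # Smoothed IDF (+1 inside the log keeps it positive).
        idf = math.log((N - df[q] + 0.5) / (df[q] + 0.5) + 1)
        # Term-frequency saturation with document-length normalization.
        score += idf * tf[q] * (k1 + 1) / (
            tf[q] + k1 * (1 - b + b * len(doc) / avgdl))
    return score

for d in docs:
    print(round(bm25("statistics survey", d), 3), " ".join(d))
```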
{"title":"Demo Paper: Ad Hoc Search On Statistical Data Based On Categorization And Metadata Augmentation","authors":"T. Okamoto, H. Miyamori","doi":"10.1109/MIPR51284.2021.00043","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00043","url":null,"abstract":"In this paper, we describe the system of ad hoc search on statistical data based on categorization and metadata augmentation. The documents covered by this paper consist of metadata extracted from the governmental statistical data and the body of the corresponding statistical data. The metadata is characterized by the fact that its document length is short, and the main body of statistical data is almost always composed of numbers, except for titles, headers, and comments. We newly developed the categorical search that narrows the set of documents to be retrieved by category in order to properly capture the scope of the problem domain intended by the given query. In addition, to compensate for the short document length of metadata, we implemented a method of extracting the header information of the table from the main body of statistical data to augment documents to be searched. As a ranking model, we adopted BM25, which can be adjusted with few parameters to take into account term frequency and document length.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114896429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Respective Volumetric Heatmap Autoencoder for Multi-Person 3D Pose Estimation
Pub Date: 2021-09-01 | DOI: 10.1109/MIPR51284.2021.00070
Minghao Wang, Long Ye, Fei Hu, Li Fang, Wei Zhong, Qin Zhang
Using heatmaps to predict body joint locations has become one of the best-performing approaches to pose estimation; however, these methods often have high memory and computation demands, which makes them difficult to apply in practice. This paper proposes an effective compression method to reduce the size of heatmaps, the Respective Volumetric Heatmap Autoencoder (RVHA), which represents the ground-truth heatmaps with a smaller data size. An RVHA-based pose estimation framework is then built to recover human joint locations from monocular RGB images. Thanks to our compression strategy, which takes each human joint's volumetric heatmap as an individual input frame, our method performs favorably compared with the state of the art on the JTA dataset.
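To make the heatmap-compression idea concrete, here is a toy 3D convolutional autoencoder in PyTorch that encodes one joint's volumetric heatmap to a compact code and reconstructs it; the layer sizes and resolution are illustrative assumptions, not RVHA's actual architecture:

```python
import torch
import torch.nn as nn

class HeatmapAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: two stride-2 3D convolutions shrink each spatial
        # dimension by 4x, yielding a compact volumetric code.
        self.enc = nn.Sequential(
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: transposed convolutions restore full resolution.
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(8, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

# Batch of single-joint volumetric heatmaps (one "frame" per joint,
# mirroring the per-joint input strategy described in the abstract).
x = torch.rand(2, 1, 32, 32, 32)
model = HeatmapAutoencoder()
recon = model(x)
print(x.shape, recon.shape)  # reconstruction matches input resolution
print(nn.functional.mse_loss(recon, x).item())  # reconstruction loss
```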
{"title":"Respective Volumetric Heatmap Autoencoder for Multi-Person 3D Pose Estimation","authors":"Minghao Wang, Long Ye, Fei Hu, Li Fang, Wei Zhong, Qin Zhang","doi":"10.1109/MIPR51284.2021.00070","DOIUrl":"https://doi.org/10.1109/MIPR51284.2021.00070","url":null,"abstract":"Using heatmaps to predict body joint locations has become one of the best performing pose estimation methods, however, these methods often have the high demands for memory and computation, which make them difficult to apply into practice. This paper proposes an effective compression method to reduce the size of heatmaps, namely lies Respective Volumetric Heatmap Autoencoder(RVHA) to represent the ground truth heatmaps with smaller data size, then a RVHA-based pose estimation framework is built to achieve the human joint locations from monocular RGB images. Thanks to our compression strategy which takes each human joint volumetric heatmap as an input frame individually, our method performs favorably when compared to state of the art on the JTA datasets.","PeriodicalId":139543,"journal":{"name":"2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133013558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}