Pub Date: 2018-09-01 DOI: 10.1109/ICAWST.2018.8517241
Atsushi Kawamura, B. Chakraborty
Feature subset selection is an optimization problem in pattern classification and data mining: the goal is to achieve high classification accuracy with a small number of features and low computational cost. Most approaches pair a search algorithm with a fitness function that is based either on intrinsic characteristics of the data (the filter type) or on the classification accuracy of the classifier used (the wrapper type) to find the optimum feature subset. Both approaches have their respective merits and demerits. Although many algorithms have been developed so far, none of them works equally well for all data sets, especially very high dimensional ones. In this work, a new feature evaluation measure, based on a concept borrowed from topic modelling in text processing, has been developed. The proposed measure is used as the fitness function of evolutionary computational search techniques to design a filter-type feature subset selection approach. Simulation experiments with various benchmark data sets assess the efficiency of the proposed approach in comparison with the popular conventional filter-type feature selection algorithms mRMR and CFS. The proposed approach is found to select fewer features with comparable classification accuracy. The proposed algorithms work better for higher dimensional features and may prove to be an effective solution for feature selection on very high dimensional data.
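The filter-plus-evolutionary-search idea described above can be illustrated with a minimal sketch. The abstract does not define the topic-modelling-based measure, so the fitness below is a stand-in (mean feature-label correlation minus mean feature-feature correlation minus a size penalty), and the search is a simple (1+1) evolutionary algorithm over bit masks. The names `filter_fitness`, `evolve_subset`, and `size_penalty` are hypothetical and not taken from the paper.

```python
import random
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)

def filter_fitness(mask, columns, labels, size_penalty=0.05):
    """Stand-in filter score (an assumption, not the paper's measure):
    mean relevance minus mean redundancy minus a penalty per selected feature."""
    chosen = [c for c, keep in zip(columns, mask) if keep]
    if not chosen:
        return float("-inf")  # an empty subset is never acceptable
    relevance = mean(abs(pearson(c, labels)) for c in chosen)
    pairs = [(a, b) for i, a in enumerate(chosen) for b in chosen[i + 1:]]
    redundancy = mean(abs(pearson(a, b)) for a, b in pairs) if pairs else 0.0
    return relevance - redundancy - size_penalty * len(chosen)

def evolve_subset(columns, labels, generations=200, seed=0):
    """(1+1) evolutionary search: mutate the current mask, keep the child if not worse."""
    rng = random.Random(seed)
    n = len(columns)
    best = [True] * n  # start from the full feature set
    best_fit = filter_fitness(best, columns, labels)
    for _ in range(generations):
        # Flip each bit independently with probability 1/n.
        child = [(not bit) if rng.random() < 1.0 / n else bit for bit in best]
        fit = filter_fitness(child, columns, labels)
        if fit >= best_fit:
            best, best_fit = child, fit
    return best, best_fit

# Toy data: feature 0 tracks the label, feature 1 duplicates it, feature 2 is noise.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0.1, 0.2, 0.1, 0.3, 0.9, 0.8, 1.0, 0.9],
    [0.1, 0.2, 0.1, 0.3, 0.9, 0.8, 1.0, 0.9],
    [0.5, 0.1, 0.9, 0.4, 0.5, 0.2, 0.8, 0.4],
]
mask, score = evolve_subset(columns, labels)
print(mask, round(score, 3))
```

Because the score never consults a classifier, this is a filter-type search; replacing `filter_fitness` with cross-validated classifier accuracy would turn the same loop into a wrapper-type search. On this toy data the search tends to keep just one of the two duplicated informative features.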
{"title":"A New Filter Evaluation Function for Feature Subset Selection with Evolutionary Computation","authors":"Atsushi Kawamura, B. Chakraborty","doi":"10.1109/ICAWST.2018.8517241","DOIUrl":"https://doi.org/10.1109/ICAWST.2018.8517241","url":null,"abstract":"Feature subset selection is an optimization problem to achieve high classification accuracy with low number of features and low computational cost in the area of pattern classi- fication or data mining. There are various approaches to obtain this. Basically a search algorithm is used with a fitness function either based on intrinsic characteristics of the data, known as filter type, or based on classification accuracy of the classifier used, known as the wrapper type, to find out the optimum feature subset. Both the approaches have respective merits and demerits. Though lots of algorithms are developed so far, none of them works equally well for all the data sets, specially for very high dimensional data sets. In this work, a new feature evaluation measure based on the concept borrowed from topic modelling in text processing, has been developed. The proposed measure is used as a fitness function of evolutionary computational search techniques for designing filter type feature subset selection approach. Simulation experiments with various benchmark data sets have been done for assessing the efficiency of the proposed approach in comparison to the popular conventional filter type feature selection algorithms mRMR and CFS. It is found that the proposed approach is better in terms of selecting lesser number of features with comparable classification accuracy. 
The proposed algorithms work better for higher dimensional features and can be proved as an effective solution of feature selection for very high dimensional data.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"21 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133089108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2018-09-01 DOI: 10.1109/ICAWST.2018.8517170
Oscar Karnalim, Lisan Sulistiani
This paper contributes to the development of source code plagiarism detection that is more closely aligned with the human perspective. Three evaluation mechanisms that directly relate the human perspective to the evaluated approaches are proposed: think-aloud, aspect-oriented, and empirical mechanisms. Using those mechanisms, a comparative study of attribute-based and structure-based plagiarism detection approaches (i.e., the two popular approach categories in source code plagiarism detection) is conducted. According to that study, the structure-based approach is more effective than the attribute-based one; its signature aspect and resulting similarity degrees are more closely related to human preferences. In addition, the structure-based approach is related to most of the human-oriented aspects for suspecting source code plagiarism.
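To make the two approach categories concrete, here is a small sketch under simplifying assumptions. The attribute-based score is taken to be the cosine similarity of token-frequency vectors, and the structure-based score is taken to be an order-sensitive match ratio over token sequences computed with `difflib.SequenceMatcher`. Both are stand-ins chosen for illustration, not the specific detectors evaluated in the paper, and the rough regex lexer is likewise an assumption.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher

def tokens(source):
    """Very rough lexer: identifiers, integer literals, and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)

def attribute_similarity(a, b):
    """Cosine similarity of token-frequency vectors; ignores token order."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def structure_similarity(a, b):
    """Share of tokens in order-preserving matching blocks; sensitive to token order."""
    return SequenceMatcher(None, tokens(a), tokens(b), autojunk=False).ratio()

original = "int s = 0; for (int i = 0; i < n; i++) { s += a[i]; } print(s);"
shuffled = "print(s); for (int i = 0; i < n; i++) { s += a[i]; } int s = 0;"

print(attribute_similarity(original, shuffled))
print(structure_similarity(original, shuffled))
```

The two snippets contain exactly the same tokens in a different statement order, so the attribute-based score stays at (numerically) 1 while the structure-based score drops below 1. This is the kind of behavioral difference a comparison against human judgments would probe.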
{"title":"Which Source Code Plagiarism Detection Approach is More Humane?","authors":"Oscar Karnalim, Lisan Sulistiani","doi":"10.1109/ICAWST.2018.8517170","DOIUrl":"https://doi.org/10.1109/ICAWST.2018.8517170","url":null,"abstract":"This paper contributes in developing source code plagiarism detection that is more aligned with human perspective. Three evaluation mechanisms that directly relate human perspective with evaluated approaches are proposed: think-aloud, aspectoriented, and empirical mechanism. Using those mechanisms, a comparative study toward attribute-and structure-based plagiarism detection approach (i.e., two popular approach categories in source code plagiarism detection) is conducted. According to that study, structure-based approach is more effective than the attribute-based one; its signature aspect and resulted similarity degrees are more related to human preferences. In addition, such approach is related to most human-oriented aspects for suspecting source code plagiarism.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132955029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2018-09-01 DOI: 10.1109/ICAWST.2018.8517197
Masataka Mito, K. Murata, Daisuke Eguchi, Yuichiro Mori, M. Toyonaga
In recent years, the big-data approach has become important in various business operations and sales judgment tactics. However, numerous privacy problems limit the progress of big-data analysis technologies. To mitigate such problems, this paper proposes several privacy-preserving methods, such as anonymization, extreme value record elimination, and fully encrypted analysis. Nevertheless, fears of privacy cracking remain and prevent the open use of big data by external organizations. We propose a big-data reconstruction method that does not intrinsically use private data. The method uses only the statistical features of the big data, i.e., its attribute histograms and their correlation coefficients. To verify whether valuable information can be extracted using this method, we evaluate the data with a Self-Organizing Map (SOM) as one of the big-data analysis tools. The results show that the same pieces of information are extracted from our reconstructed data and from the original big data.
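The idea of rebuilding records from attribute histograms and correlation coefficients alone can be sketched as follows. The abstract does not say how the correlation is imposed, so this example assumes a Gaussian copula for two attributes: correlated standard normals are mapped to uniforms and then pushed through each histogram's inverse CDF. The function names, the example histograms, and the value of `rho` are all hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate
from statistics import NormalDist, mean, pstdev

def sample_from_histogram(u, edges, counts):
    """Inverse-CDF lookup: map u in [0, 1] to a value inside the bin it falls in."""
    total = sum(counts)
    cdf = list(accumulate(c / total for c in counts))
    k = min(bisect_left(cdf, u), len(counts) - 1)
    lo = cdf[k - 1] if k > 0 else 0.0
    width = cdf[k] - lo
    frac = (u - lo) / width if width > 0 else 0.0
    frac = min(max(frac, 0.0), 1.0)  # guard against floating-point overshoot
    return edges[k] + frac * (edges[k + 1] - edges[k])

def reconstruct(hist_x, hist_y, rho, n, seed=0):
    """Generate n synthetic (x, y) records from two histograms and a target correlation.
    Assumption: a Gaussian copula, so the realized Pearson correlation only approximates rho."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + (1.0 - rho * rho) ** 0.5 * rng.gauss(0.0, 1.0)
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        rows.append((sample_from_histogram(u1, *hist_x),
                     sample_from_histogram(u2, *hist_y)))
    return rows

def pearson(xs, ys):
    """Pearson correlation, used here only to check the synthetic data."""
    mx, my = mean(xs), mean(ys)
    cov = mean((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

# Hypothetical published statistics: (bin edges, bin counts) per attribute.
hist_age = ([20, 30, 40, 50, 60], [10, 40, 35, 15])
hist_spend = ([0, 100, 200, 300], [50, 30, 20])
rows = reconstruct(hist_age, hist_spend, rho=0.6, n=1000)
print(round(pearson([r[0] for r in rows], [r[1] for r in rows]), 2))
```

No original record is read at any point; only the histograms and the correlation coefficient enter the generator. The synthetic rows could then be fed to an analysis tool such as a SOM in place of the original data.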
{"title":"A Data Reconstruction Method for The Big-Data Analysis","authors":"Masataka Mito, K. Murata, Daisuke Eguchi, Yuichiro Mori, M. Toyonaga","doi":"10.1109/ICAWST.2018.8517197","DOIUrl":"https://doi.org/10.1109/ICAWST.2018.8517197","url":null,"abstract":"In recent years, the big-data approach has become important within various business operations and salesjudgment tactics. Contrarily, numerous privacy problems limit the progress of their analysis technologies. To mitigate such problems, this paper proposes several privacy-preserving methods, i.e., anonymization, extreme value record elimination, fully encrypted analysis, and so on. However, privacy-cracking fears still remain that prevent the open use of big-data by other, external organizations. We propose a big-data reconstruction method that does not intrinsically use privacy data. The method uses only the statistical features of big-data, i.e., its attribute histograms and their correlation coefficients. To verify whether valuable information can be extracted using this method, we evaluate the data by using Self Organizing Map (SOM) as one of the big-data analysis tools. The results show that the same pieces ofinformation are extracted from our data and the big-data.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132444650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2018-09-01 DOI: 10.1109/ICAWST.2018.8517171
Yuto Yoshizawa, Y. Watanobe
In recent years, the importance of programming skills has been increasing due to advances in information and communication technologies. However, the difficulty involved in learning programming is a major problem for novices. Therefore, we propose a logic error detection algorithm based on structure pattern and error degree. Structure pattern is an index of similarity based on abstract syntax trees, and error degree is a measure of appropriateness for feedback. In the present paper, we define structure pattern and error degree and present the proposed algorithm. Implementation and experimentation using actual data are also considered.
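As a rough illustration of a similarity index based on abstract syntax trees, the sketch below flattens each program into its sequence of AST node type names and compares the sequences. It uses Python's `ast` module and `difflib.SequenceMatcher` purely as assumptions for the example; the paper's actual definition of structure pattern, its target language, and its error degree measure are not given in the abstract and are not reproduced here.

```python
import ast
from difflib import SequenceMatcher

def node_types(source):
    """Flatten a program into the sequence of its AST node type names."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]

def structure_similarity(a, b):
    """Similarity in [0, 1] between two programs, ignoring identifiers and literal values."""
    return SequenceMatcher(None, node_types(a), node_types(b), autojunk=False).ratio()

reference = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def total(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
buggy = "def total(xs):\n    s = 0\n    for x in xs:\n        s = x\n    return s\n"

print(structure_similarity(reference, renamed))  # same structure, different names
print(structure_similarity(reference, buggy))    # `s = x` instead of `s += x`
```

Renaming variables leaves the node-type sequence unchanged, while the logic error (assignment instead of accumulation) swaps an `AugAssign` node for an `Assign` node and lowers the score. A detector along these lines could use such differences against a correct reference solution to localize candidate logic errors.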
{"title":"Logic Error Detection Algorithm for Novice Programmers based on Structure Pattern and Error Degree","authors":"Yuto Yoshizawa, Y. Watanobe","doi":"10.1109/ICAWST.2018.8517171","DOIUrl":"https://doi.org/10.1109/ICAWST.2018.8517171","url":null,"abstract":"In recent years, the importance of programming skills is increasing due to advances in information and communication technologies. However, the difficulty involved in learning programming is a major problem for novices. Therefore, we propose a logic error detection algorithm based on structure pattern and error degree. Structure pattern is an index of similarity based on abstract syntax trees, and error degree is a measure of appropriateness for feedback. In the present paper, we define structure pattern and error degree and present the proposed algorithm method. Implementation and experimentation using actual data are also considered.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126343496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2018-09-01 DOI: 10.1109/ICAWST.2018.8517172
S. Das, B. Chakraborty
Nowadays, social media and microblogging sites are the most popular forms of communication. The most useful application on these platforms is opinion mining, or sentiment classification, of the users. In this work, an automated method is proposed to analyze and summarize opinions on a product in a structured, product-aspect-based manner. The proposed method will help potential buyers get a complete picture from a comprehensible representation of the reviews, without going through all the reviews manually.
{"title":"Aspect Aware Optimized Opinion Analysis of Online Product Reviews","authors":"S. Das, B. Chakraborty","doi":"10.1109/ICAWST.2018.8517172","DOIUrl":"https://doi.org/10.1109/ICAWST.2018.8517172","url":null,"abstract":"Now-a-days social media and micro blogging sites are the most popular form of communication. The most useful application on these platforms is Opinion mining or Sentiment classification of the users. Here, in this work an automated method has been proposed to analyze and summarize opinions on a product in a structured, product aspect based manner. The proposed method will help future potential buyers to acquire complete idea, from a comprehensible representation of the reviews, without going through all the reviews manually.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125970124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this study, an assistance system is developed for pharmacists to improve dispensing quality through two functions: timely notification and real-time monitoring. During drug dispensation, the system obtains the patient's prescription issued by the doctor and drives light-emitting diodes (LEDs) for notification. Because some drug names, shapes, colors, or packages are very similar, pharmacists waste a lot of time finding the correct drugs. With LED notification, pharmacists pick up the drugs from the correct cabinets and save dispensing time. Second, the system monitors pharmacist actions with infrared (IR) sensors. An alarm is raised if a pharmacist picks up an incorrect drug or misses a drug item, even when the correct LEDs are turned on. In addition, a web-based information system is designed for drug dispensing and inventory management. During dispensation, patient information and drug data are displayed on the screen for notification.
{"title":"The Assistance for Drug Dispensing Using LED Notification and IR Sensor-based Monitoring Methods","authors":"Chin-Chuan Han, Hao-Pu Lin, Chao-Hsu Chang, Chang-Hsing Lee, Jau-Ling Shih, Chunlan Hsu, Jen-Chih Chang","doi":"10.1109/ICAWST.2018.8517168","DOIUrl":"https://doi.org/10.1109/ICAWST.2018.8517168","url":null,"abstract":"In this study, an assistant system is developed for pharmacist to improve the dispensing quality by two functions: notification in time and monitoring in real time. During drug dispensation, the system gets the patient’s prescription issued from doctors, and drives the light-emitting diode (LED) for notification. Since some drug titles, shapes, colors, or packages are very similar, pharmacists waste lots of time to find the correct drugs. With LED notification, pharmacists pick up the drugs from the correct cabinets, and save the dispensing time. Second, the system monitors pharmacist actions by the infrared (IR) sensors. An alarm is given if pharmacists pick up the incorrect drugs or lost the drug items, even the correct LEDs are turn on. In addition, a web-based information system is designed for drug dispensing and inventory management. During the dispensation, patient information and drug data are displayed on the screen for notification.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"28 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114025312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper proposes a framework that combines deep-learned features with conventional machine learning models, in which the objective function is optimized by quadratic programming or quasi-Newton methods, instead of an end-to-end deep learning approach that uses variants of stochastic gradient descent. First, a temporal segmentation algorithm based on a learning-to-rank scheme is used to detect abrupt changes in frame appearance in a video sequence. Afterward, a peak-searching algorithm, statistics-sensitive non-linear iterative peak-clipping (SNIP), is employed to find the local maxima of the filtered video sequence after rank pooling, where each local maximum corresponds to a key frame in the video. Simulations show that the new approach outperforms the main state-of-the-art works on four public video datasets.
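The peak-searching step can be sketched with the core SNIP clipping rule, which repeatedly replaces each sample by the smaller of itself and the average of its two neighbors at distance p, for growing p. This leaves a slowly varying baseline. The sketch below omits the log-log-square-root transform often applied before clipping, and the rule for picking key frames (local maxima of the residual above `min_height`) is an assumption made for illustration. The input `scores` is a made-up per-frame signal, not data from the paper.

```python
def snip_baseline(signal, max_half_width):
    """Estimate a slowly varying baseline by iterative peak clipping (core SNIP step)."""
    base = list(signal)
    n = len(base)
    for p in range(1, max_half_width + 1):
        prev = base[:]  # clip against the values from the previous iteration
        for i in range(p, n - p):
            base[i] = min(prev[i], 0.5 * (prev[i - p] + prev[i + p]))
    return base

def find_peaks(signal, max_half_width=3, min_height=0.5):
    """Indices whose baseline-subtracted value is a local maximum of at least min_height.
    The selection rule is a hypothetical choice, not taken from the paper."""
    base = snip_baseline(signal, max_half_width)
    residual = [s - b for s, b in zip(signal, base)]
    return [
        i for i in range(1, len(signal) - 1)
        if residual[i] >= min_height
        and residual[i] > residual[i - 1]
        and residual[i] >= residual[i + 1]
    ]

# Made-up per-frame scores: a gentle upward drift with two abrupt spikes.
scores = [1.0, 1.1, 1.2, 4.0, 1.3, 1.4, 1.5, 1.6, 5.0, 1.7, 1.8, 1.9]
peaks = find_peaks(scores)
print(peaks)  # [3, 8]
```

Subtracting the clipped baseline removes the drift, so the two spikes at indices 3 and 8 stand out as candidate key frames even though the underlying signal keeps rising.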
{"title":"Video Summarization: How to Use Deep-Learned Features Without a Large-Scale Dataset","authors":"Didik Purwanto, Yie-Tarng Chen, Wen-Hsien Fang, Wen-Chi Wu","doi":"10.29007/21Q3","DOIUrl":"https://doi.org/10.29007/21Q3","url":null,"abstract":"This paper proposes a framework incorporating deep-learned features with the conventional machine learning models within which the objective function is optimized by using quadratic programming or quasi-Newton methods instead of an end-to-end deep learning approach which uses variants of stochastic gradient descent algorithms. A temporal segmentation algorithm is first scrutinized by using a learning to rank scheme to detect the abrupt changes of frame appearances in a video sequence. Afterward, a peak-searching algorithm, statisticssensitive non-linear iterative peak-clipping (SNIP), is employed to acquire the local maxima of the filtered video sequence after rank pooling, where each of the local maxima corresponds to a key frame in the video. Simulations show that the new approach outperforms the main state-of-the-art works on four public video datasets.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"157 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126686345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Non-compositional multi-word expressions present great challenges to natural language processing applications. In this paper, we present a method for modeling non-compositional expressions based on the assumption that the meaning of an expression depends on its context. Context words can therefore be used to select documents and to separate documents in which the expression has different meanings. Deviation from a baseline is measured using serendipity (i.e., the pointwise effect size). We used this statistical measure to mark which patterns are over- and under-represented and to decide whether the pattern under scrutiny belongs to the meaning selected by the context words. We used the Google search engine to obtain document frequency estimates. When used with Google document frequency estimates, the serendipity measure closely mirrors some human intuitions about the preferred alternative.
{"title":"Modeling Non-Compositional Expressions using a Search Engine","authors":"Cheikh M. Bamba Dione, Christer Johansson","doi":"10.29007/4JL9","DOIUrl":"https://doi.org/10.29007/4JL9","url":null,"abstract":"Non-compositional multi-word expressions present great challenges to natural language processing applications. In this paper, we present a method for modeling non-compositional expressions based on the assumption that the meaning of expressions depends on context. Therefore, context words can be used to select documents and separate documents where the expression has different meanings. Deviation from a baseline is measured using serendipity (i.e. the pointwise effect size). We used this statistical measure to mark which patterns are over-and under-represented and to take a decision if the pattern under scrutiny belongs to the meaning selected by the context words or not. We used the Google search engine to find document frequency estimates. When used with Google document frequency estimates, the serendipity measure closely mirrors some human intuitions on the preferred alternative.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127070778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-12-13 DOI: 10.1109/ICAWST.2018.8517177
Hong-Yen Chen, Chung-Yen Su
Complicated and deep neural network models can achieve high accuracy for image recognition. However, they require a huge amount of computation and a large number of model parameters, which makes them unsuitable for mobile and embedded devices. MobileNet was therefore proposed; it reduces the number of parameters and the computational cost dramatically. The main idea of MobileNet is to use a depthwise separable convolution. Two hyper-parameters, a width multiplier and a resolution multiplier, are used to trade off accuracy against latency. In this paper, we propose a new architecture to improve MobileNet. Instead of using the resolution multiplier, we use a depth multiplier and combine it with either fractional max pooling or max pooling. Experimental results on the CIFAR database show that the proposed architecture can reduce the computational cost and increase the accuracy simultaneously. This work is partly supported by the Ministry of Science and Technology, R.O.C., under Contract No. MOST 106-2221-E-003-011.
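The saving from a depthwise separable convolution can be checked with a short cost calculation. The sketch below counts multiply-accumulate operations for one layer using the standard cost model for MobileNet: a k x k kernel, M input channels, N output channels, and an f x f output feature map, with the width multiplier `alpha` scaling the channel counts and the resolution multiplier `rho` scaling the feature map size. The layer sizes are made up for the example, and the depth multiplier proposed in this paper is not modeled because the abstract does not define it.

```python
def standard_conv_cost(k, m, n, f):
    """Multiply-accumulates of a standard k x k convolution: k*k*M*N*f*f."""
    return k * k * m * n * f * f

def separable_conv_cost(k, m, n, f, alpha=1.0, rho=1.0):
    """Multiply-accumulates of a depthwise separable convolution.
    alpha (width multiplier) thins the channels; rho (resolution multiplier) shrinks the map."""
    m, n, f = int(alpha * m), int(alpha * n), int(rho * f)
    depthwise = k * k * m * f * f   # one k x k filter per input channel
    pointwise = m * n * f * f       # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 64 -> 128 channels, 32 x 32 feature map.
std = standard_conv_cost(3, 64, 128, 32)
sep = separable_conv_cost(3, 64, 128, 32)
print(std, sep, round(sep / std, 4))                       # 75497472 8978432 0.1189
print(separable_conv_cost(3, 64, 128, 32, alpha=0.5))      # 2392064
```

The ratio equals 1/N + 1/k^2, here 1/128 + 1/9, so a 3 x 3 separable layer costs roughly eight to nine times less than its standard counterpart. Halving the width multiplier cuts the cost further, roughly quadratically in `alpha` for the dominant pointwise term.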
{"title":"An Enhanced Hybrid MobileNet","authors":"Hong-Yen Chen, Chung-Yen Su","doi":"10.1109/ICAWST.2018.8517177","DOIUrl":"https://doi.org/10.1109/ICAWST.2018.8517177","url":null,"abstract":"Complicated and deep neural network models can achieve high accuracy for image recognition. However, they require a huge amount of computations and model parameters, which are not suitable for mobile and embedded devices. Therefore, MobileNet was proposed, which can reduce the number of parameters and computational cost dramatically. The main idea of MobileNet is to use a depthwise separable convolution. Two hyper-parameters, a width multiplier and a resolution multiplier are used to the trade-off between the accuracy and the latency. In this paper, we propose a new architecture to improve the MobileNet. Instead of using the resolution multiplier, we use a depth multiplier and combine with either Fractional Max Pooling or the max pooling. Experimental results on CIFAR database show that the proposed architecture can reduce the amount of computational cost and increase the accuracy simultaneously 1.This work is partly supported by Ministry of Science and Technology, R.O.C. under Contract No. MOST 106-2221-E-003-011.","PeriodicalId":277939,"journal":{"name":"2018 9th International Conference on Awareness Science and Technology (iCAST)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133152471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}