Pub Date: 2019-12-01 DOI: 10.3966/160792642019122007022
C. Kuo, C. Hou, Chu-Sing Yang
In recent years, network technology has developed rapidly. However, the Internet has been subject to a variety of attacks. Several notable attack events have been reported, such as the use of flooding traffic against widely used message boards, the installation of malware in an automated teller machine to steal more than 80 million, and the use of WannaCry to encrypt users’ files and demand ransoms. Most of these attacks cannot be defended against using a single method. Network-based intrusion detection systems (NIDSs) and host-based IDSs (HIDSs) can determine whether a system has been attacked, but a NIDS alone cannot detect web-based attacks or system vulnerabilities. Thus, this paper proposes a risk assessment system (RAS) that integrates a NIDS and a HIDS to detect suspicious behaviors and assess the risk value of Internet Protocol (IP) addresses. The RAS analyzes attacks and suspicious behaviors using the NIDS and HIDS. Furthermore, the system quantifies the influence of attackers in suspicious events by using PageRank. Finally, the RAS derives the risk value of every IP address to warn users of an attack and protect hosts and devices.
Title: The Study of a Risk Assessment System based on PageRank. Journal: Journal of Internet Technology, vol. 20, pp. 2255-2264.
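As a rough illustration of how PageRank could quantify attacker influence, here is a minimal power-iteration sketch. It is not the RAS implementation. The graph construction is an assumption: each suspicious event becomes a directed edge from the affected host to the suspected source IP address, so that score accumulates on sources implicated in many events. The damping factor of 0.85 and all addresses are illustrative values.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly,
        # so the scores keep summing to 1.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical events: (affected host, suspected source IP address).
    events = [
        ("10.0.0.5", "203.0.113.9"),
        ("10.0.0.6", "203.0.113.9"),
        ("10.0.0.7", "203.0.113.9"),
        ("10.0.0.5", "198.51.100.2"),
    ]
    for ip, score in sorted(pagerank(events).items(), key=lambda kv: -kv[1]):
        print(f"{ip}\t{score:.4f}")
```

A real system would also need a rule for turning these scores into a risk value per IP address, which the abstract does not specify.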
Pub Date: 2019-09-01 DOI: 10.3966/160792642019092005017
Yun-Fei Jia, Z. Zhou, R. Wu
The dynamic behavior of software systems has attracted growing attention through the phenomenon of software aging. Software aging is caused by deterioration of the runtime environment, such as the gradual loss of memory or CPU cycles. The dynamic behavior of an aged software system can be described by a set of evolving resource variables, including CPU usage, I/O bandwidth, and available memory. From this point of view, an aging software system is analogous to a dynamic system. Control theory provides sound and rigorous mathematical principles for analyzing dynamic systems and building controllers for them. This paper uses control theory to build and analyze a control model and applies control techniques to an aged web server. First, we treat the software system as a black box and conduct controlled experiments to establish the relationship between input and output. Then, these input-output pairs are used to build a control model via a system identification method. Finally, a PI (proportional-integral) controller is designed to adjust the aged state of the software system, and software rejuvenation techniques are customized for the web server. Performance testing shows that our approach can accurately track the reference value set by the website administrator.
Title: A feedback control approach for preventing system resource exhaustion caused by software aging. Journal: Journal of Internet Technology, vol. 20, pp. 1513-1522.
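The PI control loop mentioned in the abstract can be sketched in a few lines. This is an illustration only: the first-order plant model `y[k+1] = a*y[k] + b*u[k]`, the gains `kp` and `ki`, and the reference value are all assumptions, since the abstract does not give the identified model or the controller parameters.

```python
def simulate_pi(reference, kp=0.5, ki=0.2, a=0.8, b=0.5, steps=200):
    """Simulate a discrete PI controller on an assumed first-order plant.

    Returns the plant output at every step.
    """
    y = 0.0          # measured output, e.g. a resource-usage level
    integral = 0.0   # running sum of the tracking error
    history = []
    for _ in range(steps):
        error = reference - y
        integral += error
        u = kp * error + ki * integral   # PI control law
        y = a * y + b * u                # assumed plant response
        history.append(y)
    return history


if __name__ == "__main__":
    trace = simulate_pi(reference=70.0)
    print(f"output after {len(trace)} steps: {trace[-1]:.3f}")
```

The integral term is what removes the steady-state error: with a proportional term alone, this assumed plant would settle below the reference.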
Pub Date: 2019-07-01 DOI: 10.3966/160792642019072004028
Feng Sun, Zhenjiang Zhang, Yi-Chih Kao, Tian-zhou Li, Bo Shen
With the development of artificial intelligence, deep learning has attracted more and more attention. Although deep neural networks have made remarkable progress in many domains, including computer vision and natural language processing, recent studies show that they are vulnerable to adversarial attacks, which take legitimate images with imperceptible perturbations as input and mislead the model into predicting incorrect outputs. We consider the key to an adversarial attack to be the imperceptible perturbation added to the input, so eliminating the effect of this added noise is of great significance. Thus, we design a new, efficient model based on the residual image that can detect potential adversarial attacks. We design a method to obtain a residual image that captures the possible perturbations. Based on this residual image, the detection mechanism determines whether an input is an adversarial image. A series of experiments shows that the new detection method can detect adversarial attacks effectively.
Title: A new method to detect the adversarial attack based on the residual image. Journal: Journal of Internet Technology, vol. 20, pp. 1297-1304.
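The abstract does not say how the residual image is computed or how the detector uses it. The following sketch therefore swaps in one simple, assumed approach purely for illustration: the residual is the difference between an image and a 3x3 mean-filtered copy of it, and an image is flagged when the mean absolute residual exceeds a hypothetical threshold. Images are grayscale lists of rows with values in [0, 1].

```python
def mean_filter(image):
    """3x3 mean filter; out-of-range neighbours are clamped to the border."""
    height, width = len(image), len(image[0])
    smoothed = []
    for r in range(height):
        row = []
        for c in range(width):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), height - 1)
                    cc = min(max(c + dc, 0), width - 1)
                    total += image[rr][cc]
            row.append(total / 9.0)
        smoothed.append(row)
    return smoothed


def residual_image(image):
    """Assumed residual: the image minus its smoothed version."""
    smoothed = mean_filter(image)
    return [[p - s for p, s in zip(row, srow)]
            for row, srow in zip(image, smoothed)]


def residual_score(image):
    """Mean absolute value of the residual image."""
    residual = residual_image(image)
    count = len(image) * len(image[0])
    return sum(abs(v) for row in residual for v in row) / count


def looks_adversarial(image, threshold=0.01):
    """Flag the image when its residual score exceeds a hypothetical threshold."""
    return residual_score(image) > threshold
```

Natural textures also produce high-frequency residuals, so a fixed threshold like this would misfire on real photographs; a practical detector would more plausibly learn a classifier over the residual.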
Pub Date: 2019-05-01 DOI: 10.3966/160792642019052003013
Binbin Zhang, Jerry Chun‐wei Lin, Qiankun Liu, Philippe Fournier-Viger, Y. Djenouri
In recent years, analyzing transactional data has become an important data analytics task, since it can reveal important information in several domains for recommendation, prediction, and personalization. Nonetheless, transactional data sometimes contains sensitive and confidential information such as personal identifiers and information about sexual orientation, medical diseases, and religious beliefs. Such information can be analyzed using various data mining algorithms, which may cause security threats to individuals. Several algorithms have been proposed to hide sensitive information in databases, but most of them assume that sensitive information is the same for all users, which is an unrealistic assumption. Hence, this paper presents a (k, p)-anonymity framework to hide personal sensitive information. The developed ANonymity for Transactional database (ANT) algorithm can hide multiple pieces of sensitive information in transactions. In addition, it lets users assign sensitivity values to indicate how sensitive each piece of information is. The designed anonymity algorithm ensures that the percentage of anonymized data does not exceed a predefined maximum sensitivity threshold. Results of several experiments indicate that the proposed algorithm outperforms the state-of-the-art PTA and Gray-TSP algorithms in terms of information loss and runtime.
Title: A (k, p)-anonymity Framework to Sanitize Transactional Database with Personalized Sensitivity. Journal: Journal of Internet Technology, vol. 20, pp. 801-808.
Pub Date: 2019-05-01 DOI: 10.3966/160792642019052003017
Po-Sen Huang, Po-Sheng Chiu, Jia-Wei Chang, Yueh-Min Huang, Ming-Che Lee
Short-text semantic similarity is an essential technique in natural language search and is widely used in social network analysis and opinion mining to find unknown knowledge. Such similarity measures are usually applied to short texts of 10-20 words. Like spoken utterances, short texts do not necessarily follow formal grammatical rules. The limited information contained in short texts and their syntactic and semantic flexibility make similarity measurement difficult. Therefore, this study designed and tested a part-of-speech-based short-text similarity algorithm to address those problems. The effects of evaluating different parts of speech are thoroughly discussed. The proposed algorithm achieved its best performance when using word measures corresponding to different parts of speech.
Title: A study of using syntactic cues in short-text similarity measure. Journal: Journal of Internet Technology, vol. 20, pp. 839-850.
Pub Date: 2019-05-01 DOI: 10.3966/160792642019052003024
Chi-Lun Lo, Chi-Hua Chen, Jin-Li Hu, Kuen-Rong Lo, Hsun-Jung Cho
This study adopts a fuel consumption estimation method to measure the quantity of fuel consumed in each vehicle speed interval (i.e., a cost function) in accordance with individual driving behaviors. Furthermore, a mobile app is designed to consider the best responses of other route plan apps (e.g., the shortest route plan app and the fast route plan app) and plan the most fuel-efficient route according to the quantity of fuel consumed. The numerical analysis results show that the proposed fuel-efficient route plan app can effectively support fuel saving in the logistics industry.
Title: A fuel-efficient route plan method based on game theory. Journal: Journal of Internet Technology, vol. 20, pp. 925-932.
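The idea of a per-speed-interval fuel cost function can be illustrated with a small sketch. The fuel rates, interval boundaries, and routes below are invented for illustration, and the game-theoretic part of the method (best responses to other route plan apps) is not modeled here; the code only compares candidate routes by estimated fuel use.

```python
# Hypothetical cost function: (min speed, max speed, litres per km).
FUEL_RATE_L_PER_KM = [
    (0, 20, 0.12),
    (20, 40, 0.09),
    (40, 60, 0.07),
    (60, 80, 0.06),
    (80, 200, 0.08),
]


def fuel_rate(speed_kmh):
    """Look up the assumed fuel rate for the interval containing the speed."""
    for low, high, rate in FUEL_RATE_L_PER_KM:
        if low <= speed_kmh < high:
            return rate
    raise ValueError(f"speed out of range: {speed_kmh}")


def route_fuel(segments):
    """Total fuel for a route given as (distance in km, speed in km/h) segments."""
    return sum(distance * fuel_rate(speed) for distance, speed in segments)


def most_fuel_efficient(routes):
    """Return the name of the route with the lowest estimated fuel use."""
    return min(routes, key=lambda name: route_fuel(routes[name]))


if __name__ == "__main__":
    routes = {
        "shortest": [(10, 15)],   # short but congested
        "fastest": [(14, 90)],    # longer, high-speed road
        "balanced": [(12, 65)],   # moderate distance and speed
    }
    print(most_fuel_efficient(routes))
```

With these made-up numbers, neither the shortest nor the fastest route uses the least fuel, which is the situation a fuel-efficient planner is meant to exploit.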
Pub Date: 2019-03-01 DOI: 10.3966/160792642019032002003
Yi-Wen Liao, Yueh-Min Huang, Y. Huang, Zhiyun Su, C. Wei
Programming instruction helps students develop critical-thinking and problem-solving skills, and is also important for the development of professional talent in the information technology sector. Cooperative learning has been shown to promote innovative educational applications while enhancing learning motivation and performance. Drawing on the People, Activities, Context, Technologies (PACT) framework and the theory of technology readiness, this study explores factors that affect continued use intention and learning performance for students using an online cooperative programming (OCP) platform. A total of 120 students were invited to participate in the programming course. Students were asked to complete programming projects on the OCP platform and fill out a questionnaire. A causal model was proposed and examined using partial least squares regression. The results show that optimism/innovativeness in technology readiness and trust among team members both have a positive impact on user satisfaction with the platform. In addition, both user satisfaction and trust encourage continued use intention and improve learning performance. The findings can help educators and researchers promote the development of programming teaching and learning.
Title: An Empirical Evaluation of Online Cooperative Programming Platforms Based on the PACT Framework and Technology Readiness. Journal: Journal of Internet Technology, vol. 20, pp. 345-352.
Pub Date: 2019-01-01 DOI: 10.3966/160792642019102006029
Yu Zhang, Bo Shen, Yi-Chih Kao, Hsin-Hung Chou
Driving behavior has been shown to have a great influence on road safety. Recognizing driving characteristics is an essential part of reducing traffic fatalities and developing intelligent traffic systems. In this paper, we propose a method of driving characteristics recognition that mines vehicle operation data, such as GPS position, velocity, and direction, collected through the On-Board Diagnostic (OBD) port of vehicles. Based on the features extracted from the vehicle operation sequence, we employ the K-means algorithm to cluster and recognize different driving characteristics after normalization and dimensionality reduction of the features. Analysis and experimental results indicate that the proposed method is useful for mining effective information from vehicle operation data sequences.
Title: A Method of Driving Characteristics Recognition on Vehicle Operation Sequence. Journal: Journal of Internet Technology, vol. 20, pp. 2007-2014.
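The normalize-then-cluster pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the two features (mean speed and harsh-acceleration events per km) are hypothetical, the dimensionality reduction step is omitted, and the initial centroids are picked deterministically at evenly spaced indices rather than at random.

```python
def min_max_normalize(rows):
    """Scale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # A constant column has zero range; divide by 1.0 instead.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)]
            for row in rows]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=100):
    """Lloyd's algorithm with deterministic, evenly spaced initial centroids."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical per-trip features: (mean speed km/h, harsh accelerations per km).
    trips = [(20, 0.5), (22, 0.6), (21, 0.4), (80, 3.0), (85, 3.2), (82, 2.9)]
    labels, _ = k_means(min_max_normalize(trips), k=2)
    print(labels)
```

Normalizing first matters because K-means uses Euclidean distance: without it, the feature with the largest numeric range (here, speed) would dominate the clustering.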
Pub Date: 2019-01-01 DOI: 10.3966/160792642019012001023
Jui-Hung Chang, Chien-Yuan Tseng
This study explored the correlation between popular tourism-related search terms and the number of tourists, a topic of practical interest to the tourism industry. When individuals sign up for a tour planned by a travel agency, they often have no idea whether the minimum number of participants will be reached until it is confirmed a few days before departure, which is a great inconvenience when arranging itineraries. Hence, predicting whether the minimum number of participants in a tour can be reached matters to both the tourism industry and tourists. In this regard, the number of Taiwanese traveling to Japan was predicted based on the popularity of keywords concerning travel in Japan on Google Trends. The scores of popular words concerning travel in Japan on the Google search engine and words concerning travel in Japan mentioned in tourism articles in e-news networks were also summarized. The experimental results indicated that the popularity of tourism keywords on Google was highly correlated with the number of Taiwanese tourists traveling to Japan. When the number of Taiwanese traveling to Japan within the following month was classified by an artificial neural network (ANN) model, the mean square error was 0.13. Furthermore, by matching data from travel agencies in Taiwan with the Google Trends data, the research predicted whether tours to Hokkaido would reach the minimum requirement for participants. The prediction accuracy of the ANN model was 68%.
Title: Analyzing Google Trends with Travel Keyword Rankings to Predict Tourists into a Group. Journal: Journal of Internet Technology, vol. 20, pp. 247-256.
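The correlation between keyword popularity and tourist numbers can be checked with a Pearson correlation coefficient. The abstract does not state which correlation measure was used, so Pearson is an assumption here, and the monthly figures in the usage example are invented for illustration rather than taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical monthly values: search popularity score and
    # outbound tourists (in thousands) for the same months.
    search_score = [55, 60, 72, 80, 66, 90]
    tourists_k = [310, 335, 390, 420, 350, 470]
    print(f"r = {pearson(search_score, tourists_k):.3f}")
```

A coefficient near 1 would support using the search score as a predictor, though correlation alone does not show how far ahead of travel the searches occur.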
Pub Date: 2019-01-01 DOI: 10.3966/160792642019012001021
C. Kao, Jia-Wei Chang, Tzone-I Wang, Yueh-Min Huang, Po-Sheng Chiu
Appropriate collocations in English research papers make the text flow more smoothly and express ideas more precisely. As a result, the paper is easier for the reader to understand, and the purpose of sharing research outcomes is accomplished. However, for non-native English speakers, choosing and using collocations is very difficult. For this reason, this study draws on a large amount of high-quality academic literature to establish a collocation corpus and adopts natural language processing techniques and statistical methods to develop a collocation recommendation system. The system allows users to enter sentences, automatically detects the locations and types of collocations, and recommends synonymous collocations in accordance with their semantics and frequency of use. The fitness of the collocations recommended by the system for beginning sentences reaches 73.1%. Writers of academic papers can use the system to select appropriate collocations, reduce erroneous use of collocations, and improve the quality of their papers.
Title: Design and Development of the Sentence-based Collocation Recommender with Error Detection for Academic Writing. Journal: Journal of Internet Technology, vol. 20, pp. 229-236.