Electronic voting (e-voting) improves the convenience of voting and the efficiency of vote counting. With the rise of blockchain technology, many studies have proposed e-voting systems based on it; however, most consider remote voting and pay less attention to voting conducted at polling stations. In view of this research gap, this study proposes a blockchain-based system for e-voting at polling stations. We designed a system process that integrates blockchain technology and voters' biometrics to prevent data tampering and imposter voting. The blockchain platform adopted in the proposed system is Hyperledger Fabric (HF); accordingly, we specify how to deploy a blockchain network customized for this study based on the HF framework. Finally, this study provides a property analysis of the proposed system and implements a simulation system. In conclusion, the proposed system meets the requirements of e-voting from two aspects. First, in terms of technology, the system adopts blockchain, digital signatures, and biometric identification to achieve eligibility and immutability. Second, in terms of system process, the roles of government agencies and inspectors are incorporated to achieve transparency, receipt-freeness, and accessibility.
{"title":"An On-Site Electronic Voting System Using Blockchain and Biometrics","authors":"Shu-Fen Tu, Ching-Sheng Hsu, Bo-Long You","doi":"10.34028/iajit/20/5/13","DOIUrl":"https://doi.org/10.34028/iajit/20/5/13","url":null,"abstract":"Electronic voting (e-voting) improves the convenience of voting and the efficiency of vote counting. With the rise of blockchain technology, many studies have proposed e-voting systems using blockchain technology. Most of them consider the usage scenario of remote voting and pay less attention to that of voting conducted at polling stations. In view of this research gap, this study proposed a blockchain-based system for e-voting at polling stations. We designed a system process integrating blockchain technology and voter’s biometrics to prevent data tampering and imposter voting. In addition, the blockchain platform adopted in our proposed system is Hyperledger Fabric (HF). Therefore, based on the HF framework, we specified how to deploy a blockchain network customized for this study. Finally, this study provided a property analysis of the proposed system and implemented a simulation system. In conclusion, the proposed system meets the requirements e-voting from two aspects. First, in terms of technology, our system adopted blockchain, digital signature, and biometric identification to achieve eligibility and immutability. Second, in terms of system process, the roles of government agencies and inspectors are incorporated to achieve transparency, receipt-freeness and accessibility","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115801103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: The Role of Analyzing Computer Network Outputs in Confronting Crime (original title: دور تحليل مخرجات شبكات الحاسب الآلي في مواجهة الجريمة). Author: عبد المجيد, محمد صادق عبد الرزاق. DOI: 10.26735/16585933.2018.002.
We use lattice basis reduction for a ciphertext-only attack on RSA. Our attack is applicable under conditions where known attacks are not and, contrary to known attacks, it does not require prior knowledge of a part of the message or key, a small encryption key e, or message broadcasting. Our attack succeeds when a vector comprising a message and its exponent is likely to be the shortest in the lattice and meets Minkowski's Second Theorem bound. We conducted experiments for messages, keys, and encryption/decryption keys with sizes from 40 to 8193 bits, with tens of thousands of successful RSA cracks. It took about 45 seconds to crack 2001 messages of 2050 bits, with large public key values related to Euler's totient function and private keys of the same order. Based on our findings, for RSA not to be susceptible to the proposed attack, we recommend avoiding the RSA public key form used in our experiments.
{"title":"Ciphertext-Only Attack on RSA Using Lattice Basis Reduction","authors":"","doi":"10.34028/iajit/18/2/13","DOIUrl":"https://doi.org/10.34028/iajit/18/2/13","url":null,"abstract":"We use lattice basis reduction for ciphertext-only attack on RSA. Our attack is applicable in the conditions when known attacks are not applicable, and, contrary to known attacks, it does not require prior knowledge of a part of a message or key, small encryption key, e, or message broadcasting. Our attack is successful when a vector, comprised of a message and its exponent, is likely to be the shortest in the lattice, and meets Minkowski's Second Theorem bound. We have conducted experiments for message, keys, and encryption/decryption keys with sizes from 40 to 8193 bits, with dozens of thousands of successful RSA cracks. It took about 45 seconds for cracking 2001 messages of 2050 bits and for large public key values related with Euler’s totient function, and the same order private keys. Based on our findings, for RSA not to be susceptible to the proposed attack, it is recommended avoiding RSA public key form used in our experiments","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"304 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121825847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Using Interactive Videos on YouTube to Address Security Threats in Information and Communication Technology (original title: استخدام فيديوهات تفاعلية على موقع يوتيوب (YouTube) للتعامل مع التهديدات الأمنية في مجال تكنولوجيا المعلومات والاتصالات). Author: يسلم السقاف. DOI: 10.26735/16585933.2018.004.
Route planning studies are of great importance for quadrotors, which are widely used in military and civil fields, to perform their duties autonomously and efficiently. In this study, a method is proposed that optimizes Quadrotor Route Planning (QRP) with time priority and energy priority by using a Genetic Algorithm (GA). For the proposed method, the Wind Effected QRP-App (WEQRP-App) desktop software was developed with the Visual Studio C# programming language. The developed WEQRP-App uses real location information from the GMAP.Net map plugin and real wind data from the website of the General Directorate of Meteorology. Using the Wind Effected Quadrotor Route Planning (WEQRP) method, more realistic planning was made before the flight, and flight efficiency was increased in terms of time and energy priority. Thus, it is foreseen that safer and less costly autonomous flights will take place, compared to the Standard QRP (SQRP) created without taking the wind effect into account, by avoiding problems that arise from unexpected energy consumption during the flight mission. When the results obtained from the proposed method were examined, WEQRP was observed to provide improvements of up to 13.5% in flight time and up to 27.4% in energy consumption compared to SQRP.
{"title":"Optimization of Quadrotor Route Planning with Time and Energy Priority in Windy Environments","authors":"H. Incekara, M. Selek, F. Basçiftçi","doi":"10.34028/iajit/20/5/11","DOIUrl":"https://doi.org/10.34028/iajit/20/5/11","url":null,"abstract":"Route planning studies are of great importance for quadrotors, which are widely used in military and civil fields, to perform their duties autonomously and efficiently. In the study, a method is proposed that enables Quadrotor Route Planning (QRP) to be optimized with time priority and energy priority by using Genetic Algorithm (GA). For the proposed method, the Wind Effected QRP-App (WEQRP-App) desktop software was developed with the Visual Studio C# programming language. In the developed WEQRP-App, real location information from GMAP.Net map plugin and real wind data from the website of the General Directorate of Meteorology were used. Using Wind Effected Quadrotor Route Planning (WEQRP) method, more realistic planning was made before the flight and the flight efficiency was increased in terms of time and energy priority. Thus, it is foreseen that safer and less costly autonomous flights will take place when compared to the Standard QRP (SQRP) created without taking into account the wind effect, by avoiding the problems that arise due to unexpected energy consumption during the flight mission. When the results obtained from the proposed method were examined, it has been observed that WEQRP provides the improvements up to 13,5% in flight times and up to 27,4% in energy consumption according to SQRP","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116966552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Electronic Health Records (EHRs) include highly sensitive data such as medical images, prescriptions, medical test results, and patients' medical histories. Due to security concerns, these sensitive data cannot be transmitted over the network in their original form; hence, encryption is performed prior to transmission. To increase the speed of data transfer and to overcome storage issues, data are usually transferred through the cloud. Hence, to ensure the security and scalability of the data, a third-party encryption called re-encryption is performed at the proxy cloud. This re-encryption ensures that the data can be reliably transmitted through the network. In this research, a novel scheme called blockchain-based EHR data sharing using chaotic re-encryption (BC-EDS-CR) is proposed. In the proposed scheme, re-encryption is performed using chaos theory, which ensures that the cloud administrator cannot access the medical data. Metrics such as Peak Signal to Noise Ratio (PSNR), Mean Square Error (MSE), Structural Similarity Index (SSIM), entropy, and correlation coefficient are used to evaluate the scheme. The proposed scheme was found to outperform existing methods, achieving a PSNR of 57.66, an SSIM of 0.985, and an MSE of 0.058.
{"title":"Blockchain-based Scalable and Secure EHR Data Sharing using Proxy Re-Encryption","authors":"Naresh Sammeta, L. Parthiban","doi":"10.34028/iajit/20/5/2","DOIUrl":"https://doi.org/10.34028/iajit/20/5/2","url":null,"abstract":"Electronic Health Record (EHR) includes highly sensitive data like medical images, prescriptions, medical test result, medical history of patients, etc., These sensitive data cannot be transmitted in its original form in the network due to security issues. Hence, encryption is done prior to transmission. To increase the speed of data transfer and to overcome the storage issues, data is usually transferred through the cloud. Hence, to ensure the security and scalability of the data, a third-party encryption called re-encryption is performed at the proxy cloud. This re-encryption ensures that the data can be reliably transmitted through the network. In this research, a novel scheme called block-chain based EHR data sharing using chaotic re-encryption (BC-EDS-CR) is proposed. In the proposed scheme, re-encryption is performed using chaos theory. The proposed re-encryption scheme ensures that the cloud administrator cannot access the medical data. Metrics such as Peak Signal to Noise Ratio (PSNR), Mean Square Error (MSE), Structural Similarity Index (SSIM), entropy and correlation coefficient are used in evaluating this scheme. It was found that the proposed scheme outperforms the existing methods by achieving a PSNR of 57.66, SSIM of 0.985 and MSE of 0.058.","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125182353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GovdeTurk is a tool for stemming, morphological labeling, and verb negation for the Turkish language. We designed comprehensive finite automata to represent Turkish grammar rules. Based on these automata, GovdeTurk finds the stem of a word by removing inflectional suffixes in a longest-match strategy. Levenshtein distance is used to correct spelling errors that may occur during suffix removal. Morphological labeling identifies the functionality of a given token. Nine different dictionaries are constructed, one for each specific word type, and are used in stemming and morphological labeling. The verb negation module is developed for lexicon-based sentiment analysis. GovdeTurk is tested on a dataset of one million words, and the results are compared with Zemberek and the Turkish Snowball algorithm. While the closest competitor, Zemberek, has a stemming accuracy of 80%, GovdeTurk achieves 97.3%. The morphological labeling accuracy of GovdeTurk is 93.6%. With these results, our model outperforms its competitors.
{"title":"GovdeTurk: A Novel Turkish Natural Language Processing Tool for Stemming, Morphological Labelling and Verb Negation","authors":"","doi":"10.34028/iajit/18/2/3","DOIUrl":"https://doi.org/10.34028/iajit/18/2/3","url":null,"abstract":"GovdeTurk is a tool for stemming, morphological labeling and verb negation for Turkish language. We designed comprehensive finite automata to represent Turkish grammar rules. Based on these automata, GovdeTurk finds the stem of the word by removing the inflectional suffixes in a longest match strategy. Levenshtein Distance is used to correct spelling errors that may occur during suffix removal. Morphological labeling identifies the functionality of a given token. Nine different dictionaries are constructed for each specific word type. These dictionaries are used in the stemming and morphological labeling. Verb negation module is developed for lexicon based sentiment analysis. GovdeTurk is tested on a dataset of one million words. The results are compared with Zemberek and Turkish Snowball Algorithm. While the closest competitor, Zemberek, in the stemming step has an accuracy of 80%, GovdeTurk gives 97.3% of accuracy. Morphological labeling accuracy of GovdeTurk is 93.6%. With outperforming results, our model becomes foremost among its competitors","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126598058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Component Based Software System (CBSS) provides an easy and efficient way to develop new software applications with the help of existing software components of similar functionality. It increases the reusability of software components and reduces the development time, cost, and effort of software developers. To select the appropriate component, it becomes essential to assess the reusability of software components so that a suitable component can be selected for reuse in another application. Assessing the reusability of a CBSS requires several factors to be considered. In this paper, four reusability sub-factors, Interface Complexity (IC), Understandability (Un), Customizability (Co), and Reliability (Re), are used as input variables, and reusability is assessed using the Fuzzy Inference System (FIS) and Adaptive Neuro Fuzzy Inference System (ANFIS) approaches, as these two are commonly used for assessing quality factors. For the experimental work, a case study was conducted in which rules for assessing reusability from the four factors were generated from feedback collected from researchers and academicians through an online survey. Reusability was assessed for ten different values of the input variables. The experiment shows that the results obtained from the ANFIS method were closer to the original values: the Root Mean Square Error (RMSE) of the FIS results was 6.05%, which was reduced to 2.20% by applying the ANFIS approach. This research work will help software developers and researchers assess the reusability of software components and make informed decisions about which components to reuse in new software applications, reducing their effort, time, and cost of development.
{"title":"Application of Intelligent Adaptive Neuro Fuzzy Method for Reusability of Component Based Software System","authors":"Jyoti Agarwal, M. Kumar, Mugdha Sharma, Deepak Verma, Richa Sharma","doi":"10.34028/iajit/20/5/10","DOIUrl":"https://doi.org/10.34028/iajit/20/5/10","url":null,"abstract":"Component Based Software System (CBSS) provides an easy and efficient way to develop new software application with the help of existing software components of similar functionalities. It increases the reusability of software components and reduce the development time, cost and effort of software developers. To select the appropriate component, it become essential to assess the reusability of software components so that suitable component can be selected to reuse in another application. For assessing the reusability of CBSS, several factors are required to be considered. In this paper, four reusability sub-factors Interface Complexity (IC), Understandability (Un), Customizability (Co) and Reliability (Re) are used as input variables and reusability is assessed using Fuzzy Inference System (FIS) and Adaptive Neuro Fuzzy Inference System (ANFIS) approach because these two approaches are commonly used approach for assessing the quality factors. For experimental work, one case study has been done where rules are generated to assess reusability using four different reusability factors by taking feedback from researchers and academicians using online survey. Reusability was assessed for ten different values of input variables. Experiment shows that results obtained from ANFIS method were closer to the original values. Root Mean Square Error (RMSE) of FIS results was 6.05% which was further reduced by the application of ANFIS approach and finally 2.20% of RMSE was achieved. This research work will be helpful for software developers and researchers to assess the reusability of software components and they will be able to take corrective decision for choosing the appropriate component to be reused in new software applications, which will reduce their effort, time and cost of development","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124627536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feature selection is a feasible way to improve the speed and performance of machine learning models. Optimization algorithms do a significant job of searching for optimal variables in the feature space. Recent feature selection methods depend purely on various metaheuristic algorithms to search for a good combination of features without considering the importance of individual features, which makes classification models suffer from local optima or overfitting problems. In this paper, a novel hybrid feature subset selection technique based on Regularized Neighborhood Component Analysis (RNCA) and Binary Teaching Learning Based Optimization (BTLBO) is introduced to overcome these problems. The RNCA algorithm assigns weights to the attributes based on their contribution to building the classification models. The BTLBO algorithm computes the fitness of individuals with respect to the feature weights and selects the best ones. The results of similar feature selection methods are compared with the proposed hybrid model, which shows better performance in terms of classification accuracy, recall, and AUC on breast cancer datasets.
{"title":"Hybrid Feature Selection based on BTLBO and RNCA to Diagnose the Breast Cancer","authors":"Mohan Allam, Nandhini Malaiyappan","doi":"10.34028/iajit/20/5/5","DOIUrl":"https://doi.org/10.34028/iajit/20/5/5","url":null,"abstract":"Feature selection is a feasible solution to improve the speed and performance of machine learning models. Optimization algorithms are doing a significant job in searching for optimal variables from feature space. Recent feature selection methods are purely depending on various meta heuristic algorithms for searching a good combination of features without considering the importance of individual features, which makes classification models to suffer from local optima or overfitting problems. In this paper, a novel hybrid feature subset selection technique is introduced based on Regularized Neighborhood Component Analysis (RNCA) and Binary Teaching Learning Based Optimization (BTLBO) algorithms to overcome the above problems. RNCA algorithm assigns weights to the attributes based on their contribution in building the learning models for classification. BTLBO algorithm computes the fitness of individuals with respect to the weights of features and selects the best ones. The results of similar feature selection methods are matched with the proposed hybrid model and proved better performance in terms of classification accuracy, recall and AUC measures over breast cancer datasets.","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130067661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In order to serve a diversified user base with a range of purposes, general search engines offer results for a wide variety of topics and content categories on the internet. While Focused Crawlers (FCs) deliver more specialized and targeted results within particular domains or verticals, general search engines give wider coverage of the web. For a vertical search engine, the performance of the focused crawler is extremely important, and several ways of improving it have been applied. We propose an intelligent focused crawler that uses Reinforcement Learning (RL) to prioritize hyperlinks for long-term profit. Our implementation differs from other RL-based works by encouraging learning at an early stage, using a decaying ϵ-greedy policy to select the next link, which enables the crawler to use the experience gained to improve its performance on more relevant pages. With infertility rates increasing all over the world, many people need to search for information about the related issues and the available artificial reproduction treatments. Hence, we consider the infertility domain as a case study and collected web pages from scratch. We compare the performance of crawling tasks following ϵ-greedy and decaying ϵ-greedy policies. Experimental results show that crawlers following a decaying ϵ-greedy policy perform better.
{"title":"Focused Crawler Based on Reinforcement Learning and Decaying Epsilon-Greedy Exploration Policy","authors":"Parisa Begum Kaleel, Shina Sheen","doi":"10.34028/iajit/20/5/14","DOIUrl":"https://doi.org/10.34028/iajit/20/5/14","url":null,"abstract":"In order to serve a diversified user base with a range of purposes, general search engines offer search results for a wide variety of topics and material categories on the internet. While Focused Crawlers (FC) deliver more specialized and targeted results inside particular domains or verticals, general search engines give a wider coverage of the web. For a vertical search engine, the performance of a focused crawler is extremely important, and several ways of improvement are applied. We propose an intelligent, focused crawler which uses Reinforcement Learning (RL) to prioritize the hyperlinks for long-term profit. Our implementation differs from other RL based works by encouraging learning at an early stage using a decaying ϵ-greedy policy to select the next link and hence enables the crawler to use the experience gained to improve its performance with more relevant pages. With an increase in the infertility rate all over the world, searching for information regarding the issues and details about artificial reproduction treatments available is in need by many people. Hence, we have considered infertility domain as a case study and collected web pages from scratch. We compare the performance of crawling tasks following ϵ-greedy and decaying ϵ-greedy policies. Experimental results show that crawlers following a decaying ϵ-greedy policy demonstrate better performance","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126811990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}