Big Data Analytics in Tracking COVID-19 Spread Utilizing Google Location Data
Mei Wyin Yaw, Prajindra Sankar Krishnan, Chai Phing Chen, Sieh Kiong Tiong

Mobility data collected from location trackers on mobile phones shows that the COVID-19 epidemic and the adoption of social-distancing policies drastically altered people's visiting patterns. Transmission, however, is governed less by the volume of visitors than by the frequency and duration of concurrent occupancy at particular places. Understanding how people mix in various settings is therefore essential for targeting legislation, guiding contact tracing, and informing prevention initiatives. This study proposes a method for reducing the virus's propagation among on-campus university students, built around a self-developed Google History Location Extractor and Indicator that operates on real movement data. The platform enables academics and policymakers to simulate human mobility and epidemic conditions under various control measures and to assess how an outbreak might evolve. It provides tools for identifying prospective contacts, analyzing individual infection risks, and reviewing the effectiveness of campus regulations. By focusing more precisely on probable virus carriers during screening, the proposed multi-functional platform simplifies decisions on epidemic control measures, ultimately helping to manage and prevent future outbreaks.
{"title":"Big Data Analytics in Tracking COVID-19 Spread Utilizing Google Location Data","authors":"Mei Wyin Yaw, Prajindra Sankar Krishnan, Chai Phing Chen, Sieh Kiong Tiong","doi":"10.18080/jtde.v11n3.771","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.771","url":null,"abstract":"According to mobility data that records mobility traffic using location trackers on mobile phones, the COVID-19 epidemic and the adoption of social distance policies have drastically altered people’s visiting patterns. However, rather than the volume of visitors, the transmission is controlled by the frequency and length of concurrent occupation at particular places. Therefore, it is essential to comprehend how people interact in various settings in order to focus legislation, guide contact tracking, and educate prevention initiatives. This study suggests an effective method for reducing the virus’s propagation among university students enrolled on-campus by creating a self-developed Google History Location Extractor and Indicator software based on actual data on people’s movements. The platform enables academics and policymakers to model the results of human mobility and the epidemic condition under various epidemic control measures and assess the potential for future advancements in the epidemic’s spread. It provides tools for identifying prospective contacts, analyzing individual infection risks, and reviewing the success of campus regulations. By more precisely focusing on probable virus carriers during the screening process, the suggested multi-functional platform makes it easier to decide on epidemic control measures, ultimately helping to manage and avoid future outbreaks.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Customer Churn Prediction through Attribute Selection Analysis and Support Vector Machine
Jia Yi Vivian Quek, Ying Han Pang, Zheng You Lim, Shih Yin Ooi, Wee How Khoh
Accurate customer churn prediction can alert businesses to customers who are likely to leave, so that proactive retention actions can be taken. Predicting churn is not easy, especially as database sample sizes grow. Attribute selection is therefore vital in machine learning, both to make complex attribute sets comprehensible and to identify the essential variables. In this paper, a customer churn prediction model is proposed based on attribute selection analysis and a Support Vector Machine. The proposed model improves churn prediction performance with reduced feature dimensions by identifying the most significant attributes of customer data. First, exploratory data analysis and preprocessing are performed to understand the data and improve its quality. Next, two filter-based attribute selection techniques, Chi-squared and Analysis of Variance (ANOVA), are applied to the preprocessed data to select relevant features. The selected features are then fed into a Support Vector Machine for classification. A real-world telecom database is used for model assessment. The empirical results demonstrate that ANOVA outperforms the Chi-squared filter in attribute selection. Furthermore, with merely ~50% of the features, ANOVA-based feature selection performs better than using the full feature set.
{"title":"Customer Churn Prediction through Attribute Selection Analysis and Support Vector Machine","authors":"Jia Yi Vivian Quek, Ying Han Pang, Zheng You Lim, Shih Yin Ooi, Wee How Khoh","doi":"10.18080/jtde.v11n3.777","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.777","url":null,"abstract":"An accurate customer churn prediction could alert businesses about potential churn customers so that proactive actions can be taken to retain the customers. Predicting churn may not be easy, especially with the increasing database sample size. Hence, attribute selection is vital in machine learning to comprehend complex attributes and identify essential variables. In this paper, a customer churn prediction model is proposed based on attribute selection analysis and Support Vector Machine. The proposed model improves churn prediction performance with reduced feature dimensions by identifying the most significant attributes of customer data. Firstly, exploratory data analysis and data preprocessing are performed to understand the data and preprocess it to improve the data quality. Next, two filter-based attribute selection techniques, i.e., Chi-squared and Analysis of Variance (ANOVA), are applied to the pre-processed data to select relevant features. Then, the selected features are input into a Support Vector Machine for classification. A real-world telecom database is used for model assessment. The empirical results demonstrate that ANOVA outperforms the Chi-squared filter in attribute selection. Furthermore, the results also show that, with merely ~50% of the features, feature selection based on ANOVA exhibits better performance compared to full feature set utilization.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136280321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving Phishing Email Detection Using the Hybrid Machine Learning Approach
Naveen Palanichamy, Yoga Shri Murti

Phishing emails pose a severe risk to online users, necessitating effective identification methods to safeguard digital communication. Detection techniques are continuously researched to keep pace with evolving phishing strategies. Machine learning (ML) is a powerful tool for automated phishing email detection, but individual techniques such as support vector machines and Naive Bayes have proven slow or ineffective at spam filtering. This study provides a phishing email detector and reliable classifier that uses a hybrid ML classifier with term frequency-inverse document frequency (TF-IDF) and an effective feature extraction technique (FET), evaluated on a real-world dataset from Kaggle. Exploratory data analysis is conducted to better understand the dataset and to identify conspicuous errors and outliers that would hinder detection. The FET converts the email text into a numerical representation usable by ML algorithms. Model performance is evaluated using accuracy, precision, recall, F1 score, the receiver operating characteristic (ROC) curve, and the area under the ROC curve. The findings indicate that the hybrid model utilising TF-IDF achieved superior performance, with an accuracy of 87.5%. The paper offers valuable guidance on using ML to identify phishing emails and highlights the importance of combining multiple models.
{"title":"Improving Phishing Email Detection Using the Hybrid Machine Learning Approach","authors":"Naveen Palanichamy, Yoga Shri Murti","doi":"10.18080/jtde.v11n3.778","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.778","url":null,"abstract":"Phishing emails pose a severe risk to online users, necessitating effective identification methods to safeguard digital communication. Detection techniques are continuously researched to address the evolution of phishing strategies. Machine learning (ML) is a powerful tool for automated phishing email detection, but existing techniques like support vector machines and Naive Bayes have proven slow or ineffective in handling spam filtering. This study attempts to provide a phishing email detector and reliable classifier using a hybrid machine classifier with term frequency-inverse document frequency (TF-IDF) and an effective feature extraction technique (FET) on a real-world dataset from Kaggle. Exploratory data analysis is conducted to enhance understanding of the dataset and identify any conspicuous errors and outliers to facilitate the detection process. The FET converts the data text into a numerical representation that can be used for ML algorithms. The model’s performance is evaluated using accuracy, precision, recall, F1 score, receiver operating characteristic (ROC) curve and area under the ROC curve metrics. The research findings indicate that the hybrid model utilising TF-IDF achieved superior performance, with an accuracy of 87.5%. The paper offers valuable knowledge on using ML to identify phishing emails and highlights the importance of combining various models.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136280322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICT-driven Transparency: Empirical Evidence from Selected Asian Countries
Ajmal Hussain

We live in the digital era, and ICTs have become necessities in the contemporary world. This study investigates transparency in Asia through ICT diffusion, using the Driscoll-Kraay standard-error technique. We use panel data for 17 Asian countries from 2010 to 2019, with control of corruption as a proxy for transparency. The results show that ICTs have a positive effect on the control of corruption. The other determinants of transparency considered here, political stability and effective governance, also have a positive effect on control of corruption. ICT policy can therefore play an important role in curbing corruption. There is a strong case for ICT diffusion: the evidence suggests that effective governance has helped reduce corruption in the Asian region and that ICT can support surveillance-based accountability systems in public institutions.
{"title":"ICT-driven Transparency: Empirical Evidence from Selected Asian Countries","authors":"Ajmal Hussain","doi":"10.18080/jtde.v11n3.658","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.658","url":null,"abstract":"We are living in the digital era and ICTs have become necessities in this contemporary world. The aim of this study is to investigate transparency in Asia through ICT’s diffusion by using Driscoll-Kraay standard error technique. We used panel data for 17 Asian countries from 2010 to 2019 and we use control of corruption as a proxy for transparency checking. The results show that ICTs leave a positive effect on the control of corruption. Other determinants of transparency in this paper, such as political stability and effective governance, have a positive effect on control of corruption. ICT policy can play an important role in curbing corruption. So, there is a strong need for ICT diffusion, suggesting that effective governance helped to reduce corruption in the Asian region and establish a surveillance-based system in public institutions.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language Independent Models for COVID-19 Fake News Detection
Wei Kitt Wong, Filbert Hilman Juwono, Ing Ming Chew, Basil Andy Lease
In an era where massive amounts of information spread easily through social media, fake news detection is increasingly used to prevent widespread misinformation, especially misinformation about COVID-19. Databases have been built, and machine-learning algorithms have been used, to identify patterns in news content and filter out false information. This paper provides a brief overview, ranging from public-domain datasets to the deployment of several machine learning models and feature extraction methods. As a case study, a mixed-language dataset is presented, consisting of COVID-19 tweets labelled as fake or real news. To perform the detection task, a classification model is implemented using language-independent features. These features are numerical inputs that are invariant to the language used; they are therefore suitable for investigation, as many regions of the world have similar linguistic structures. The classification task can be performed by black-box or white-box models, each with its own advantages and disadvantages, and we compare the performance of the two approaches. Simulation results show that the performance difference between black-box and white-box models is not significant.
{"title":"Language Independent Models for COVID-19 Fake News Detection","authors":"Wei Kitt Wong, Filbert Hilman Juwono, Ing Ming Chew, Basil Andy Lease","doi":"10.18080/jtde.v11n3.789","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.789","url":null,"abstract":"In an era where massive information can be spread easily through social media, fake news detention is increasingly used to prevent widespread misinformation, especially fake news regarding COVID-19. Databases have been built and machine-learning algorithms have been used to identify patterns in news content and filter the false information. A brief overview, ranging from public domain datasets through the deployment of several machine learning models, as well as feature extraction methods, is provided in this paper. As a case study, a mixed language dataset is presented. The dataset consists of tweets of COVID-19 which have been labelled as fake or real news. To perform the detection task, a classification model is implemented using language-independent features. In particular, the features offer numerical inputs that are invariant to the language type; thus, they are suitable for investigation, as many regions in the world have similar linguistic structures. Furthermore, the classification task can be performed by using black box or white box models, each having its own advantages and disadvantages. In this paper, we compare the performance of the two approaches. Simulation results show that the performance difference between black box models and white box models is not significant.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136280327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proposal of a Measurement Scale and Test of the Impacts on Purchase and Revisit Intention
Salma Ayari, Imène Ben Yahia

Online immersion is considered a determining factor in web surfers' reactions, and its importance may be greater in a 3D-enriched environment. However, little marketing research has explored it, and less still has investigated its impact on consumer behaviour on an enriched commercial website. Moreover, its operationalization in the existing literature shows many weaknesses. Accordingly, the objective of this study is two-fold: to test the impact of immersion on purchase and revisit intentions for a 3D-enriched commercial website and, to that end, to propose a measurement scale of immersion tailored to this specific context. Following Churchill's framework and the recommendations of Rossiter, several methodological instruments were used: two focus groups (the first with 4 experts, the second with 18 consumers) and three surveys (140 students; 350 Internet users; 200 Internet users). The confirmatory factor analysis resulted in an 8-item scale exhibiting evidence of reliability and validity. Predictive validity was confirmed, since the impacts of immersion on intentions to buy and to revisit the website are significant. The proposed scale may help academics conduct better and more reliable studies of online consumer behaviour.
{"title":"Proposal of a Measurement Scale and Test of the Impacts on Purchase and Revisit Intention","authors":"Salma Ayari, Imène Ben Yahia","doi":"10.18080/jtde.v11n3.657","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.657","url":null,"abstract":"Online immersion is considered as a determining factor of web surfers’ reactions. Its importance may be greater in a 3D-enriched environment. However, little research has explored it in marketing and less has investigated its impact on consumer behaviour in an enriched commercial website. In addition, when it comes to its operationalization, many weaknesses are noticed in the existing literature. Accordingly, the objective of this study is two-fold: in order to test the impact of immersion on purchase and revisit intentions to a 3D-enriched commercial website, a scale measurement of immersion tailored to this specific context is proposed. Following Churchill’s framework and the recommendations of Rossiter, a number of methodological instruments, including two focus groups (the first with 4 experts; the second with 18 consumers) and three surveys (first: 140 students; second: 350 Internet users; third: 200 Internet users), are used. The confirmatory factor analysis resulted in an 8-item scale which seems to exhibit evidence of reliability and validity. The predictive validity was confirmed since the impacts of immersion on the intentions to buy and revisit the website are significant. The proposed scale measure may help academics conduct better and more reliable studies on consumer behaviour online.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136280328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Building a Fortress Against Fake News
Nafiz Fahad, Kah Ong Michael Goh, Md. Ismail Hossen, Connie Tee, Md. Asraf Ali
Given the prevalence of fake news in today's tech-driven era, there is an urgent need for an automated mechanism to curb its dissemination effectively. This research demonstrates the impacts of fake news through a literature review and establishes a reliable system for identifying it using machine learning (ML) classifiers. By combining CNN, RNN, and ANN models, a novel hybrid model is proposed that detects fake news with 94.5% accuracy. Prior studies have successfully employed ML algorithms to identify false information by analysing textual and visual features in standard datasets. The literature review emphasises the consequences of fake news for individuals, economies, societies, politics, and free expression. The proposed hybrid model, trained on extensive data and evaluated using accuracy, precision, and recall, outperforms existing models. This study underscores the importance of developing automated systems to counter the spread of fake news and calls for further research in this domain.
{"title":"Building a Fortress Against Fake News","authors":"Nafiz Fahad, Kah Ong Michael Goh, Md. Ismail Hossen, Connie Tee, Md. Asraf Ali","doi":"10.18080/jtde.v11n3.765","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.765","url":null,"abstract":"Given the prevalence of fake news in today’s tech-driven era, an urgent need exists for an automated mechanism to effectively curb its dissemination. This research aims to demonstrate the impacts of fake news through a literature review and establish a reliable system for identifying it using machine (ML) learning classifiers. By combining CNN, RNN, and ANN models, a novel model is proposed to detect fake news with 94.5% accuracy. Prior studies have successfully employed ML algorithms to identify false information by analysing textual and visual features in standard datasets. The comprehensive literature review emphasises the consequences of fake news on individuals, economies, societies, politics, and free expression. The proposed hybrid model, trained on extensive data and evaluated using accuracy, precision and recall measures, outperforms existing models. This study underscores the importance of developing automated systems to counter the spread of fake news and calls for further research in this domain.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136280331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Blockchain Technology for Tourism Post COVID-19
Mohd Norman Bin Bakri, Han Foon Neo, Chuan-Chin Teo
During the pandemic, the tourism industry was one of the most severely impacted sectors. With vaccines now widely available, governments are working to develop systems that can generate digital vaccine certificates and PCR lab test results, verifying that a person has been fully vaccinated or has tested negative before they enter business premises, travel overseas, or cross state borders. However, building a digital COVID-19 pass on a centralised system creates several problems, including high susceptibility to failure, slow and inefficient information transmission, and security vulnerabilities. The goal of this research is to offer a new digital COVID-19 pass based on the proposed "SmartHealthCard" blockchain technology. SmartHealthCard is a decentralised application (dApp) that encrypts and hashes user data and stores it safely in a distributed database. It features privacy preservation, GDPR compliance, self-sovereignty, KYC compliance, and data integrity. The initiative has the potential to benefit the public, healthcare professionals, service providers, and governments. SmartHealthCard enables quick verification of tamper-proof COVID-19 tests and vaccine records, aiding COVID-19 transmission control while respecting users' right to privacy.
{"title":"Blockchain Technology for Tourism Post COVID-19","authors":"Mohd Norman Bin Bakri, Han Foon Neo, Chuan-Chin Teo","doi":"10.18080/jtde.v11n3.764","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.764","url":null,"abstract":"During the pandemic, the tourism industry was one of the most severely impacted sectors. As vaccines are now widely available, each government is working to develop a system that can generate a digital vaccine certificate and PCR lab test result to verify that a person has been fully vaccinated or has a negative PCR test result, in order to allow them to enter business premises, travel overseas or cross state borders. However, the use of centralised systems in the development of the digital COVID-19 pass system results in a number of challenges, including the system’s high susceptibility to failures, sluggish and inefficient information transmission, and vulnerability. The goal of this research is to offer a new digital COVID-19 pass based on the proposed “SmartHealthCard” blockchain technology. SmartHealthCard is a decentralised application (dApp) encrypting and hashing user data and safely storing it in a distributed database. Privacy preservation, GDPR compliance, self-sovereignty, KYC compliance and data integrity are featured. This initiative has the potential to benefit the public, healthcare professionals, service providers and the government. SmartHealthCard enables quick verification of tamper-proof COVID-19 tests/vaccines, aiding in COVID-19 transmission control while respecting the user’s right to privacy.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Phishing Message Detection Based on Keyword Matching
Keng-Theen Tham, Kok-Why Ng, Su-Cheng Haw

This paper proposes Naïve Bayes-based algorithms for phishing detection, specifically in spam emails. It compares probability-based and frequency-based approaches and investigates the impact of imbalanced datasets and of stemming, a natural language processing (NLP) technique. Results show that the two algorithms perform similarly in spam detection, with the choice between them depending on factors such as efficiency and scalability. Accuracy is influenced by the dataset configuration and by stemming. Imbalanced datasets yield high accuracy on majority-class emails but struggle to classify minority-class emails, whereas balanced datasets yield high overall accuracy for both spam and ham identification. The study finds that stemming has a minor impact on performance, occasionally decreasing accuracy because it groups distinct words under a single stem. Balancing the dataset is crucial for improving performance and achieving accurate spam email detection. Hence, both probability-based and frequency-based Naïve Bayes algorithms are effective for phishing detection on balanced datasets: the frequency-based approach with a balanced dataset and stemming achieves a balance between recall and precision, while the probability-based method with a balanced dataset and no stemming prioritises overall accuracy.
{"title":"Phishing Message Detection Based on Keyword Matching","authors":"Keng-Theen Tham, Kok-Why Ng, Su-Cheng Haw","doi":"10.18080/jtde.v11n3.776","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.776","url":null,"abstract":"This paper proposes to use the Naïve Bayes-based algorithm for phishing detection, specifically in spam emails. The paper compares probability-based and frequency-based approaches and investigates the impact of imbalanced datasets and the use of stemming as a natural language processing (NLP) technique. Results show that both algorithms perform similarly in spam detection, with the choice between them depending on factors such as efficiency and scalability. Accuracy is influenced by the dataset configuration and stemming. Imbalanced datasets lead to higher accuracy in detecting emails in the majority class, while they struggle to classify minority-class emails. In contrast, balanced datasets yield overall high accuracy for both spam and ham email identification. This study reveals that stemming has a minor impact on algorithm performance, occasionally decreasing in accuracy due to word grouping. Balancing the dataset is crucial for improving algorithm performance and achieving accurate spam email detection. Hence, both probability-based and frequency-based Naïve Bayes algorithms are effective for phishing detection using balanced datasets. The frequency-based approach, with a balanced dataset and stemming, achieves a balanced performance between recall and precision, while the probability-based method with a balanced dataset and no stemming prioritises overall accuracy.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"160 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136280319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Utilizing Mobility Tracking to Identify Hotspots for Contagious Disease Spread
Mei Wyin Yaw, Prajindra Sankar Krishnan, Chai Phing Chen, Sieh Kiong Tiong

The incidence of serious infectious illnesses is a significant global health problem. The COVID-19 pandemic, which spread around the world, brought on an extraordinary humanitarian crisis: the spread of new viruses put established healthcare institutions under tremendous strain and created a number of pressing problems. Predicting the future movement and pattern of an illness is important for decreasing infections and maximizing recoveries. This paper utilizes mobility tracking to identify hotspots for contagious disease spread. The study collects and analyzes mobility data from UNITEN students, drawn from Google Maps data over a period of two weeks. The paper describes the data collection process, the data pre-processing steps, and the application of the HDBSCAN algorithm for hotspot clustering. The results demonstrate the effectiveness of HDBSCAN in identifying hotspots from the mobility data. The findings highlight the potential of mobility tracking for disease surveillance and provide insights for public health interventions and preventive measures.
{"title":"Utilizing Mobility Tracking to Identify Hotspots for Contagious Disease Spread","authors":"Mei Wyin Yaw, Prajindra Sankar Krishnan, Chai Phing Chen, Sieh Kiong Tiong","doi":"10.18080/jtde.v11n3.775","DOIUrl":"https://doi.org/10.18080/jtde.v11n3.775","url":null,"abstract":"A significant global health problem nowadays is the incidence of serious infectious illnesses. An extraordinary humanitarian crisis has been brought on by the current COVID-19 pandemic, which has spread around the world. The spread of new viruses has put established healthcare institutions under tremendous strain and created a number of pressing problems. It is important to predict the future movement and pattern of the illness in order to decrease infectious instances and maximize recovered cases. This research paper aims to utilize mobility tracking as a means to identify hotspots for contagious disease spread. The study focuses on collecting and analyzing mobility data from UNITEN students using Google Map data over a period of two weeks. The paper describes the data collection process, data pre-processing steps, and the application of the HDBSCAN algorithm for hotspot clustering. The results demonstrate the effectiveness of HDBSCAN in identifying hotspots based on the mobility data. The findings highlight the potential of mobility tracking for disease surveillance and provide insights for public health interventions and preventive measures.","PeriodicalId":37752,"journal":{"name":"Australian Journal of Telecommunications and the Digital Economy","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136280329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}