Pub Date: 2024-06-11 | DOI: 10.1007/s40745-024-00547-y
Muhammad Ali Faisal, Murat Donduran
In this study, we use a novel approach to explore possible connections between foreign exchange and stock returns using Turkish financial data from 2005 to 2022. Our method is a two-stage technique. The first stage decomposes each time series into separate intrinsic mode functions (IMFs) with the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) algorithm. The extracted IMFs are then used to construct high- and low-frequency components through a fine-to-coarse algorithm. In the second stage, we apply the cross-quantilogram technique to analyze quantile dependence between the original return series and the frequency components obtained in the first stage. The results reveal several important insights. First, a relatively stronger effect runs from stock returns to exchange rate returns over the sample period. Second, tail dependence is apparent, as the returns are discernibly linked. Third, tail dependence is more pronounced in the high-frequency component than in the low-frequency component. Last, the structure of dependence remained largely constant throughout the sample period analyzed.
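The fine-to-coarse recomposition described above can be sketched as follows. This is a minimal illustration, assuming the IMFs are already extracted and ordered finest to coarsest (e.g. by a CEEMDAN implementation such as the PyEMD package's); the one-sample t-test and the 0.05 threshold used to locate the split point are assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np
from scipy.stats import ttest_1samp

def fine_to_coarse(imfs, alpha=0.05):
    """Split IMFs (finest first) into high- and low-frequency components.

    Partial sums of IMFs are tested against a zero mean; the first partial
    sum whose mean departs significantly from zero marks the start of the
    low-frequency part (the slowly varying trend-like modes).
    """
    imfs = np.asarray(imfs)
    split = imfs.shape[0]  # default: everything is high-frequency
    for i in range(1, imfs.shape[0] + 1):
        partial = imfs[:i].sum(axis=0)
        if ttest_1samp(partial, 0.0).pvalue < alpha:
            split = i - 1
            break
    high = imfs[:split].sum(axis=0)
    low = imfs[split:].sum(axis=0)
    return high, low
```

By construction, the two components always add back up to the sum of the IMFs, so no signal content is lost in the split.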
{"title":"A Two-Stage Analysis of Interaction Between Stock and Exchange Rate Markets: Evidence from Turkey","authors":"Muhammad Ali Faisal, Murat Donduran","doi":"10.1007/s40745-024-00547-y","DOIUrl":"10.1007/s40745-024-00547-y","url":null,"abstract":"<div><p>In this study, we use a novel approach to explore possible connections between foreign exchange and stock returns using Turkish financial data from 2005 to 2022. Our method involves a two-stage technique. The first stage begins by decomposing individual time series signals into separate intrinsic mode functions (IMFs) with a complete ensemble empirical mode decomposition with added noise algorithm. Extracted IMFs are then used to construct high and low-frequency components through a fine-to-coarse algorithm. In the second phase, we utilized a cross-quantilogram technique to analyze the dependence in quantiles of the original return series along with frequency components obtained in the previous stage. Results revealed several important insights. Firstly, a relatively higher effect ran from stock returns to exchange rate returns for the pertinent period. Secondly, tail dependence is apparent, as returns are discernibly linked. Thirdly, the tail dependence in the returns is more profound in the high-frequency composition than in the low-frequency component. 
Lastly, the structure of dependence has stayed mostly constant throughout the sample period analyzed.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 1","pages":"171 - 198"},"PeriodicalIF":0.0,"publicationDate":"2024-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141359846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-08 | DOI: 10.1007/s40745-024-00550-3
K. P. Muhammed Niyas, P. Thiyagarajan
Early detection of dementia is a major concern for physicians, who draw on multimodal data to accomplish it; patients' baseline-visit data are mainly used for this task. Modern machine learning techniques give physicians an empirical, evidence-based approach to predicting a patient's diagnostic status. This paper proposes an ensemble majority-voting classifier for improving dementia detection from baseline-visit data. The ensemble consists of logistic regression, random forest, and naive Bayes classifiers. The proposed ensemble classifier achieved a balanced classification accuracy (BCA) of 92% and an F1-score of 0.92 for classifying demented and non-demented patients. Our results suggest that the ensemble majority-voting classifier improves both BCA and F1-score for predicting dementia on the multimodal Open Access Series of Imaging Studies (OASIS) data. These results are promising and highlight the value of ensemble models for dementia detection using multimodal data.
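The three base learners named in the abstract map directly onto scikit-learn's hard-voting ensemble. The sketch below shows the construction only; the hyperparameters (`max_iter`, `n_estimators`) are illustrative assumptions, not the paper's settings.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

def build_dementia_ensemble(random_state=0):
    """Hard (majority) voting over logistic regression, random forest,
    and Gaussian naive Bayes: the predicted class is the one chosen by
    the majority of the three base classifiers."""
    return VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=100, random_state=random_state)),
            ("nb", GaussianNB()),
        ],
        voting="hard",
    )
```

The returned object exposes the usual `fit`/`predict` interface, so it can be dropped into any scikit-learn evaluation pipeline (cross-validation, balanced-accuracy scoring, etc.).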
{"title":"Improving Dementia Prediction Using Ensemble Majority Voting Classifier","authors":"K. P. Muhammed Niyas, P. Thiyagarajan","doi":"10.1007/s40745-024-00550-3","DOIUrl":"10.1007/s40745-024-00550-3","url":null,"abstract":"<div><p>Early detection of dementia patients in advance is a great concern for the physicians. That is why physicians make use of multi modal data to accomplish this. The baseline visit data of the patients are mainly utilized for this task. Modern Machine Learning techniques provide empirical evidence based approach to physicians for predicting the diagnosis status of the patients. This paper proposes an ensemble majority voting classifier approach for improving the detection of dementia using baseline visit data. The ensemble model consists of Logistic Regression, Random Forest, and Naive Bayes Classifiers. The proposed ensemble classifier reported with a BCA, F1-score of 92%, 0.92 for classifying demented and non-demented patients. Our results suggest that the prediction using the ensemble majority voting classifier improves the Balanced Classification Accuracy, F1-score for predicting dementia on the multi modal data of Open Access Series Imaging Dataset. The results using ensemble models are promising and highlight the importance of using ensemble models for dementia detection using multimodal data.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 3","pages":"947 - 967"},"PeriodicalIF":0.0,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141369288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-08 | DOI: 10.1007/s40745-024-00537-0
Hari Krishna Kalidindi, N. Srinivasu
The healthcare industry is modernizing with the support of artificial intelligence and blockchain technologies. Healthcare data are gathered through surveys from different governing bodies and from sources such as the Web of Science, yet researchers have struggled to develop effective classification approaches. In recently developed models, deep learning is used to obtain better generalization and training performance from massive amounts of data, and better learning models can be built by pooling data from organizations such as research centers, testing labs, and hospitals. Each healthcare institution, however, must maintain proper data privacy, so these organizations need learning systems that are both efficient and accurate. Lung cancer is among the most dangerous diseases worldwide; early identification followed by appropriate treatment can save lives, which makes computer-aided diagnosis (CAD) models essential for supporting healthcare applications, and automated lung cancer detection models have accordingly been developed to identify cancer from different modalities of medical images. At the same time, privacy concerns restrict clinical data sharing between organizations on legal and ethical grounds, and it is for these security reasons that blockchain comes into focus: healthcare professionals access patient clinical records through the blockchain, which ensures the security of patient data. Artificial intelligence contributes numerous techniques, large quantities of data, and decision-making capability, and combining these technological advances can democratize healthcare, reduce costs, and enhance service efficiency. This paper therefore reviews lung cancer detection approaches in the context of secure data sharing to guide future research.
We systematically review lung cancer detection models based on machine learning (ML) and deep learning (DL) algorithms, categorizing the well-performing techniques of recent years. As an extended review, we also examine the simulation platforms, datasets, and performance measures used. The survey discusses open challenges and research findings to support future work and offers suggestions to researchers and practitioners for enhancing the secure transmission of medical data.
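The tamper-evidence property that motivates blockchain here can be illustrated with a toy hash chain. This is a minimal sketch of the general idea (each block commits to its record and to the previous block's hash), not any specific blockchain platform or the systems surveyed in the paper.

```python
import hashlib
import json

def make_block(record, prev_hash):
    """Append-only block: any change to `record` changes this block's
    hash, which in turn breaks every later block's `prev` link."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return {"record": record, "prev": prev_hash,
            "hash": hashlib.sha256(payload.encode()).hexdigest()}

def verify_chain(chain):
    """Recompute every hash and check each block links to its predecessor."""
    for i, block in enumerate(chain):
        payload = json.dumps({"record": block["record"], "prev": block["prev"]},
                             sort_keys=True)
        if hashlib.sha256(payload.encode()).hexdigest() != block["hash"]:
            return False
        if i > 0 and block["prev"] != chain[i - 1]["hash"]:
            return False
    return True
```

Verification fails the moment any stored record is altered, which is the guarantee that makes shared clinical records auditable without trusting a single custodian.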
{"title":"A Comprehensive Study and Research Perception towards Secured Data Sharing for Lung Cancer Detection with Blockchain Technology","authors":"Hari Krishna Kalidindi, N. Srinivasu","doi":"10.1007/s40745-024-00537-0","DOIUrl":"10.1007/s40745-024-00537-0","url":null,"abstract":"<div><p>Modernization in the healthcare industry is happening with the support of artificial intelligence and blockchain technologies. Collecting healthcare data is done through any Google survey from different governing bodies and data available on the Web of Sciences. However, the researchers continually suffered on developing effective classification approaches. In the recently developed models, deep learning is used for better generalization and training performance using a massive amount of data. A better learning model is built by sharing the data from organizations like research centers, testing labs, hospitals, etc. Each healthcare institution requires proper data privacy, and thus, these industries desire to use efficient and accurate learning systems for different applications. Among various diseases in the world, lung cancer is one of a hazardous diseases. Thus, early identification of lung cancer and followed by the appropriate treatment can save a life. Hence, the Computer Aided Diagnosis (CAD) model is essential for supporting healthcare applications. Therefore, an automated lung cancer detection models are developed to identify cancer from the different modalities of medical images. As a result, the privacy concern in clinical data restricts data sharing between various organizations based on legal and ethical problems. Hence, for these security reasons, the blockchain comes into focus. Here, there is a need to get access to the blockchain by healthcare professionals for displaying the clinical records of the patient, which ensures the security of the patient’s data. 
For this purpose, artificial intelligence utilizes numerous techniques, large quantities of data, and decision-making capability. Thus, the medical system must have democratized healthcare, reduced costs, and enhanced service efficiency by combining technological advancement. Therefore, this paper aims to review several lung cancer detection approaches in data sharing to help future research. Here, the systematic review of lung cancer detection models is done based on ML and DL algorithms. In recent years, the fundamental well-performed techniques have been discussed by categorizing them. Furthermore, the simulation platforms, dataset utilized, and performance measures are evaluated as an extended review. This survey explores the challenges and research findings for supporting future works. This work will produce many suggestions for future professionals and researchers for enhancing the secure data transmission of medical data.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 2","pages":"757 - 797"},"PeriodicalIF":0.0,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141368507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Real estate contributes significantly to the broader stock market and garners substantial attention, from individual households up to the country's overall economy. Predicting real estate trends is therefore important for investors, policymakers, and stakeholders making informed decisions. However, accurate forecasting remains challenging due to the market's complex, volatile, and nonlinear behavior. This study develops a unified computational framework for implementing state-of-the-art deep learning architectures, the long short-term memory (LSTM) network, the gated recurrent unit (GRU), the convolutional neural network (CNN), their variants, and their hybridizations, to predict the next day's closing price of the real estate index S&P 500-60. We incorporate diverse data sources by integrating real estate-specific indicators on top of fundamental data, macroeconomic factors, and technical indicators, capturing multifaceted features. Several models of varying complexity are constructed using different architectures and configurations. Model performance is evaluated with standard regression metrics, and statistical analysis is employed for model selection and validation to ensure robustness. The experimental results show that the base GRU model, followed by the bidirectional GRU model, offers a superior fit with high accuracy in predicting the index's closing price. We additionally tested the constructed models on the Vanguard Real Estate Index Fund ETF and the Dow Jones U.S. Real Estate Index for robustness and obtained consistent outcomes. The proposed framework can easily be generalized to model sequential data in various other domains.
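Recurrent models like the LSTM and GRU described above consume fixed-length windows of past observations with the next day's value as the label. The windowing step can be sketched as below; the lookback length and feature layout are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def make_windows(features, target, lookback):
    """Turn a (T, F) feature matrix into (samples, lookback, F) windows,
    pairing each window with the next day's target, the input shape that
    LSTM/GRU layers expect."""
    X, y = [], []
    for t in range(lookback, len(features)):
        X.append(features[t - lookback:t])  # the `lookback` days before t
        y.append(target[t])                 # next-day value to predict
    return np.asarray(X), np.asarray(y)
```

The resulting `X` tensor feeds directly into any sequence model (Keras, PyTorch, etc.), and the same helper works for the daily closing-price series and technical indicators alike.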
{"title":"Real Estate Market Prediction Using Deep Learning Models","authors":"Ramchandra Rimal, Binod Rimal, Hum Nath Bhandari, Nawa Raj Pokhrel, Keshab R. Dahal","doi":"10.1007/s40745-024-00543-2","DOIUrl":"10.1007/s40745-024-00543-2","url":null,"abstract":"<div><p>Real estate significantly contributes to the broader stock market and garners substantial attention from individual households to the overall country’s economy. Predicting real estate trends holds great importance for investors, policymakers, and stakeholders to make informed decisions. However, accurate forecasting remains challenging due to it’s complex, volatile, and nonlinear behavior. This study develops a unified computational framework for implementing state-of-the-art deep learning model architectures the long short-term memory (LSTM), the gated recurrent unit (GRU), the convolutional neural network (CNN), their variants, and hybridizations, to predict the next day’s closing price of the real estate index S &P500-60. We incorporate diverse data sources by integrating real estate-specific indicators on top of fundamental data, macroeconomic factors, and technical indicators, capturing multifaceted features. Several models with varying degrees of complexity are constructed using different architectures and configurations. Model performance is evaluated using standard regression metrics, and statistical analysis is employed for model selection and validation to ensure robustness. The experimental results illustrate that the base GRU model, followed by the bidirectional GRU model, offers a superior fit with high accuracy in predicting the closing price of the index. We additionally tested the constructed models on the Vanguard Real Estate Index Fund ETF and the Dow Jones U.S. Real Estate Index for robustness and obtained consistent outcomes. 
The proposed framework can easily be generalized to model sequential data in various other domains.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 4","pages":"1113 - 1156"},"PeriodicalIF":0.0,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141267658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-30 | DOI: 10.1007/s40745-024-00532-5
Mojtaba Zeinali Najafabadi, Ehsan Bahrami Samani
Generalized linear mixed effect models (GLMEMs) are widely applied to correlated non-Gaussian data such as those found in longitudinal studies. In survival analysis, the Cox proportional hazards (PH) and accelerated failure time (AFT) regression models are two well-known approaches to modeling time-to-event (TTE) data. In this article, we develop joint models of longitudinal count (LC) and TTE data, with extensions including fixed effects and parametric random effects. The LC response is inflated at two points k and l (k < l), and we use members of the (k, l)-inflated power series distribution (PSD) family as the distribution of this response. For the TTE process, the Cox PH model and an AFT model based on a flexible hazard function are proposed separately. One goal of the present paper is to evaluate and compare, via extensive simulations, the performance of the joint models of (k, l)-inflated LC and TTE data under the two approaches. Estimation is through the penalized likelihood method, with a focus on efficient computation and effective parameter selection. To aid efficient computation, the joint likelihoods of the observations and the latent random effects are used instead of the marginal likelihood of the observations. Finally, a real AIDS data example is presented to illustrate the potential applications of our joint models.
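A (k, l)-inflated PSD places extra probability mass at the two inflation points and spreads the remaining weight over an ordinary PSD member. As an assumed illustration (the abstract does not say which PSD members are used), here is the pmf of a (k, l)-inflated Poisson:

```python
import math

def kl_inflated_poisson_pmf(x, lam, k, l, w_k, w_l):
    """PMF of a (k, l)-inflated Poisson: extra mass w_k at x = k and
    w_l at x = l, with weight (1 - w_k - w_l) on Poisson(lam)."""
    base = math.exp(-lam) * lam ** x / math.factorial(x)
    p = (1.0 - w_k - w_l) * base
    if x == k:
        p += w_k
    if x == l:
        p += w_l
    return p
```

Setting w_k = w_l = 0 recovers the plain Poisson, so the inflated model nests the standard one, which is what makes likelihood-ratio comparisons between the two straightforward.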
{"title":"Analysis of the HIV/AIDS Data Using Joint Modeling of Longitudinal (k,l)-Inflated Count and Time to Event Data in Clinical Trials","authors":"Mojtaba Zeinali Najafabadi, Ehsan Bahrami Samani","doi":"10.1007/s40745-024-00532-5","DOIUrl":"10.1007/s40745-024-00532-5","url":null,"abstract":"<div><p>Generalized linear mixed effect models (GLMEMs) are widely applied for the analysis of correlated non-Gaussian data such as those found in longitudinal studies. On the other hand, the Cox (proportional hazards, PHs) and the accelerated failure time (AFT) regression models are two well-known approaches in survival analysis to modeling time to event (TTE) data. In this article, we develop joint modeling of longitudinal count (LC) and TTE data and consider extensions with fixed effects and parametric random effects in our proposed joint models. The LC response is inflated in two points k and l (k < l) and we use some members of (k, l)-inflated power series distribution (PSD) as the distribution of this response. Also, for modeling of TTE process, the PHs model of Cox and the AFT model, based on a flexible hazard function, are separately proposed. One of the goals of the present paper is to evaluate and compare the performance of joint models of (k, l)-inflated LC and TTE data under two mentioned approaches via extensive simulations. The estimation is through the penalized likelihood method, and our concentration is on efficient computation and effective parameter selection. To assist efficient computation, the joint likelihoods of the observations and the latent variables of the random effects are used instead of the marginal likelihood of the observations. 
Finally, a real AIDS data example is presented to illustrate the potential applications of our joint models.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 2","pages":"695 - 719"},"PeriodicalIF":0.0,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-27 | DOI: 10.1007/s40745-024-00539-y
Udochukwu Victor Echebiri, Nosakhare Liberty Osawe, Chukwuemeka Thomas Onyia
A mathematical approach to developing new distributions is reviewed. The method, which combines integration with the concept of a normalizing constant, allows new parameter(s) to be interjected into an existing distribution to form new models, called Omega-type probability models. A probability distribution is proposed from a root model, the Lindley distribution, and properties such as the series representations of the density and cumulative distribution functions, the shapes of the density, hazard, and survival functions, moments and related measures, the quantile function, order statistics, parameter estimation, and interval estimates are studied. Amid the usual hazard and survival shapes, a constant (uniform) trend was realized for the survival function, which suggests the possibility of modeling systems that may not terminate over a given period of time. Three estimation methods are used: the Cramér-von Mises estimator, the maximum product of spacings estimator, and the maximum likelihood estimator. The modified unimodal shape of the proposed distribution is a special feature among the improvements made within the Lindley family of distributions. Finally, two real-life datasets are fitted to the new distribution to demonstrate its economic importance.
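The normalizing-constant idea can be sketched numerically: multiply the root density by a weight function and renormalize so the result integrates to one. The weight w(x) = x below is a hypothetical choice for illustration; the paper's actual Omega-type construction may use a different interjection.

```python
import numpy as np
from scipy.integrate import quad

def lindley_pdf(x, theta):
    # Lindley density: theta^2/(theta+1) * (1+x) * exp(-theta*x), x > 0.
    return theta ** 2 / (theta + 1) * (1 + x) * np.exp(-theta * x)

def weighted_density(theta, weight):
    """Interject a weight w(x) into the root model and renormalize:
    g(x) = w(x) f(x) / C, where C = integral of w(x) f(x) over (0, inf)
    is the normalizing constant."""
    C, _ = quad(lambda x: weight(x) * lindley_pdf(x, theta), 0, np.inf)
    return lambda x: weight(x) * lindley_pdf(x, theta) / C
```

In a published derivation C would be computed in closed form; the numerical quadrature here just makes the role of the normalizing constant concrete.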
{"title":"Omega ({{omega}})—Type Probability Models: A Parametric Modification of Probability Distributions","authors":"Udochukwu Victor Echebiri, Nosakhare Liberty Osawe, Chukwuemeka Thomas Onyia","doi":"10.1007/s40745-024-00539-y","DOIUrl":"10.1007/s40745-024-00539-y","url":null,"abstract":"<div><p>A mathematical approach to developing new distributions is reviewed. The method which composes of integration and the concept of a normalizing constant, allows for primitive interjection of new parameter(s) in an existing distribution to form new model(s), called <i>Omega-Type</i> probability models. A probability distribution is proposed from a root model, Lindley distribution, and some properties, such as the series representation of the density and cumulative distribution functions, shape of the density, hazard and survival functions, moments and related measures, quantile function, order statistics, parameter estimation and interval estimate, were studied. Amidst the usual hazard and survival shapes, a constant or uniform trend was realized for the survival function, which projects the possibility of modeling systems that may not terminate over a given period of time. Three different methods of estimation, namely, the Cramer‒von Mises estimator, maximum product of the spacing estimator and maximum likelihood estimator, were used. The modified unimodal shape of the proposed distribution is added as a special feature in the improvements made among the Lindley family of distributions. 
Finally, two real-life datasets were fitted to the new distribution to demonstrate its economic importance.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 3","pages":"855 - 876"},"PeriodicalIF":0.0,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145170703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-25 | DOI: 10.1007/s40745-024-00546-z
Jun Li, Chong Xie, Sizheng Wu, Yawei Ren
This paper tackles the challenges of low recognition accuracy and occlusion when detecting long-range, diminutive targets such as UAVs. We introduce a detection framework named UAV-YOLOv5, which combines the strengths of Swin Transformer V2 and YOLOv5. First, we introduce Focal-EIoU together with a refined K-means algorithm tailored to generate anchor boxes better suited to the dataset, improving detection performance. Second, the convolutional and pooling layers in the network with stride greater than 1 are replaced to prevent information loss during feature extraction. Third, the Swin Transformer V2 module is introduced in the neck to improve model accuracy, and the BiFormer module is introduced so the model can capture global and local feature information simultaneously. In addition, BiFPN replaces the original FPN structure so the network acquires richer semantic information and fuses features across scales more effectively. Finally, a small-target detection head is appended to the architecture, improving the model's ability to detect small targets with heightened precision. Experiments on the comprehensive dataset verify the effectiveness of UAV-YOLOv5, which achieves an average accuracy of 87%. Compared with YOLOv5, UAV-YOLOv5 improves mAP by 8.5%, verifying its high-precision optoelectronic detection of long-range, small-target UAVs.
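Anchor generation with K-means is a standard YOLO recipe: cluster the (width, height) pairs of the training boxes using 1 − IoU as the distance, so clusters group boxes of similar shape rather than similar raw size. The sketch below shows that common recipe as an assumption; it is not claimed to be the paper's exact refinement.

```python
import numpy as np

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs with 1 - IoU as the distance and
    return k anchors sorted by area, the usual YOLO anchor recipe."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # IoU between every box and every anchor, using widths/heights only
        inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
                 np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
        union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
                (anchors[:, 0] * anchors[:, 1])[None, :] - inter
        assign = np.argmax(inter / union, axis=1)  # highest-IoU cluster
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else anchors[j] for j in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
```

Matching anchors to the shapes actually present in the data gives the regression head smaller offsets to learn, which is why the abstract ties anchor refinement to detection performance.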
{"title":"UAV-YOLOv5: A Swin-Transformer-Enabled Small Object Detection Model for Long-Range UAV Images","authors":"Jun Li, Chong Xie, Sizheng Wu, Yawei Ren","doi":"10.1007/s40745-024-00546-z","DOIUrl":"10.1007/s40745-024-00546-z","url":null,"abstract":"<div><p>This paper tackle the challenges associated with low recognition accuracy and the detection of occlusions when identifying long-range and diminutive targets (such as UAVs). We introduce a sophisticated detection framework named UAV-YOLOv5, which amalgamates the strengths of Swin Transformer V2 and YOLOv5. Firstly, we introduce Focal-EIOU, a refinement of the K-means algorithm tailored to generate anchor boxes better suited for the current dataset, thereby improving detection performance. Second, the convolutional and pooling layers in the network with step size greater than 1 are replaced to prevent information loss during feature extraction. Then, the Swin Transformer V2 module is introduced in the Neck to improve the accuracy of the model, and the BiFormer module is introduced to improve the ability of the model to acquire global and local feature information at the same time. In addition, BiFPN is introduced to replace the original FPN structure so that the network can acquire richer semantic information and fuse features across scales more effectively. Lastly, a small target detection head is appended to the existing architecture, augmenting the model’s proficiency in detecting smaller targets with heightened precision. Furthermore, various experiments are conducted on the comprehensive dataset to verify the effectiveness of UAV-YOLOv5, achieving an average accuracy of 87%. 
Compared with YOLOv5, the mAP of UAV-YOLOv5 is improved by 8.5%, which verifies that it has high-precision long-range small-target UAV optoelectronic detection capability.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 4","pages":"1109 - 1138"},"PeriodicalIF":0.0,"publicationDate":"2024-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142413758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Search and recommendation are two essential features of any e-commerce website for finding and purchasing a specific product, and visual search is a promising, quick alternative to text-based search. The objective of this research is therefore to propose a conceptual framework for a visual search and recommendation system for grocery products using ensemble learning with CNN models. Traditional deep learning and ensemble learning techniques were implemented on a publicly available data set and a self-made data set containing 3174 and 3162 images, respectively. Various combinations of the suitable models identified in prior research were used to find the best-fitted model for both the search and the recommendation functionality. All models were evaluated using suitable performance metrics, and the ensemble learning approach performed better. The best visual-search results were obtained by combining VGG16 and MobileNet, with a classification accuracy of 99.8%; for product recommendation, the combination of MobileNet and ResNet50 outperformed the other techniques.
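Once a CNN backbone (e.g. MobileNet or VGG16) has turned each product image into an embedding vector, visual search reduces to nearest-neighbor retrieval. A minimal sketch of that retrieval step, assuming the embeddings are already computed, is:

```python
import numpy as np

def top_k_similar(query_emb, catalog_embs, k=5):
    """Rank catalog items by cosine similarity to the query embedding
    and return the indices and scores of the k closest matches."""
    q = query_emb / np.linalg.norm(query_emb)
    c = catalog_embs / np.linalg.norm(catalog_embs, axis=1, keepdims=True)
    sims = c @ q                      # cosine similarity per catalog item
    order = np.argsort(-sims)[:k]     # highest similarity first
    return order, sims[order]
```

The same routine serves both functionalities: the top match answers a visual search query, while the next few matches form a "similar products" recommendation list.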
{"title":"A Deep Convolutional Neural Network-Based Approach for Visual Search & Recommendation of Grocery Products","authors":"Nawreen Anan Khandaker, Amrin Rahman, Amrin Akter Pinky, Tasmiah Tamzid Anannya","doi":"10.1007/s40745-024-00540-5","DOIUrl":"10.1007/s40745-024-00540-5","url":null,"abstract":"<div><p>Search and recommendation are two essential features of any e-commerce website for finding and purchasing a specific product. Visual Search is a promising and quick method in comparison to a textual-based search method. Hence, the objective of this research is to propose a conceptual framework for developing a visual search and recommendation system for grocery products using Ensemble Learning with CNN models. Traditional Deep learning and Ensemble Learning techniques were implemented with a publicly available and a self-made data set containing 3174 and 3162 images respectively. Various combinations of the suitable models found from research findings were used to find the best-fitted model for both the search and recommendation functionalities. All the models were evaluated using suitable performance metrics and the Ensemble Learning approach performed better. 
The best-performed results for visual searching are obtained by incorporating VGG16 and MobileNet with an accuracy of 99.8% for classification and in the case of product recommendation, the combination of MobileNET and ResNET50 performs better than other techniques.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 3","pages":"877 - 897"},"PeriodicalIF":0.0,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141104071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-23 | DOI: 10.1007/s40745-024-00545-0
Jun Li, YiFei Hai, SongJia Yin
In the past decade, deep learning has greatly advanced industrial production intelligence by virtue of its powerful learning capability. At the same time, it has brought security challenges to industrial production information networks in two main areas: production safety and network information security. The former focuses on ensuring the safety of personnel behavior in the production environment, covering two categories: detection of dangerous targets and identification of dangerous behaviors. The latter concerns the safety of industrial information systems, especially networks. In recent years, deep learning-based detection techniques have made great strides on both problems. This paper therefore presents an exhaustive study of deep learning-based detection methods for industrial production safety analysis and information network security. It proposes a comprehensive taxonomy for classifying production environments and production network information, classifies and clusters prevalent industrial security challenges, and places special emphasis on the role of deep learning in identifying insecure behavior and detecting information security risks. We provide an in-depth analysis of the advantages, limitations, and suitable application scenarios of these two approaches. In addition, the paper offers insights into contemporary challenges and future trends in this field and concludes with a discussion of prospects for future research.
{"title":"A Survey of Artificial Intelligence for Industrial Detection","authors":"Jun Li, YiFei Hai, SongJia Yin","doi":"10.1007/s40745-024-00545-0","DOIUrl":"10.1007/s40745-024-00545-0","url":null,"abstract":"<div><p>In the past decade, deep learning has greatly increased the complexity of industrial production intelligence by virtue of its powerful learning capability. At the same time, it has also brought security challenges to the field of industrial production information networks, mainly in two aspects: production safety and network information security. The former is mainly focused on ensuring the safety of personnel behavior in the production environment, including two different categories: detection of dangerous targets and identification of dangerous behaviors. The latter focuses on the safety of industrial information systems, especially networks. In recent years, deep learning-based detection techniques have made great strides in addressing these dual problems. Therefore, this paper presents an exhaustive study on the development of deep learning-based detection methods for industrial production safety analysis and information network security problem detection. The paper presents a comprehensive taxonomy for classifying production environments and production network information, classifying and clustering prevalent industrial security challenges, with a special emphasis on the role of deep learning in insecure behavior identification and information security risk detection. We provide an in-depth analysis of the advantages, limitations, and suitable application scenarios of these two approaches. 
In addition, the paper provides insights into contemporary challenges and future trends in this field and concludes with a discussion of prospects for future research.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 2","pages":"799 - 827"},"PeriodicalIF":0.0,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141103821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-21DOI: 10.1007/s40745-024-00533-4
Elias Mazrooei Rad, Mahdi Azarnoosh, Majid Ghoshuni, Mohammad Mahdi Khalilzadeh
In this article, a new method for diagnosing Alzheimer’s disease at the mild stage is presented, based on combining features of the EEG signal and MRI images. The brain signal is recorded in four modes (closed eyes, open eyes, reminder, and stimulation) from the three channels Pz, Cz, and Fz of 90 participants in three groups: healthy subjects, mild Alzheimer’s disease (AD) patients, and severe AD patients. In addition, MRI images are acquired at a field strength of at least 3 Tesla with a slice thickness of 3 mm so that senile plaques and neurofibrillary tangles can be examined. Appropriate image segmentation, masking, and sharpening filters are used for preprocessing. Then, features suited to the nonlinear and chaotic nature of the brain, such as the Lyapunov exponent, correlation dimension, and entropy, are extracted from the brain signals. Results: These features are combined with properties of the brain MRI images, including medial temporal lobe atrophy (MTA), cerebrospinal fluid (CSF), gray matter (GM), asymmetry index (IA), and white matter (WM), to diagnose the disease. Two classifiers, a support vector machine and an Elman neural network, are then applied to the optimal combined features selected by analysis of variance. Results showed that, among the three channels and the four evaluation modes, the accuracy of the Pz channel in the stimulation mode was higher than the others. Conclusions: The Elman neural network is used because of the nonlinear properties studied and the nonlinear dynamics of the EEG signal. It is also important to note that, by analyzing the medical images, we can determine the most effective channel for recording brain signals. 3D segmentation of MRI images further helps researchers diagnose Alzheimer’s disease and obtain important information. 
With the combination of brain signal features and medical image features, the Elman neural network achieves an accuracy of 94.4%, versus 92.2% without combining the signal and image features. Because of the nonlinear dynamics of the brain signal, nonlinear classifiers are more appropriate than other classification methods. The support vector machine with an RBF kernel achieves an accuracy of 75.5% with the combined signal and image features, and 76.8% without the combination.
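The entropy feature mentioned among the nonlinear EEG characteristics above can be illustrated with sample entropy, a standard measure of signal irregularity. The following is a minimal numpy-only sketch, with synthetic signals standing in for the channel recordings; the function name, parameters (m = 2, tolerance 0.2 × std), and test signals are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A/B), where B and A count pairs of length-m and
    length-(m+1) templates whose Chebyshev distance is within r."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    r = r_factor * np.std(x)

    def match_count(mm):
        # Use N - m templates for both lengths so the counts are comparable.
        tmpl = np.lib.stride_tricks.sliding_window_view(x, mm)[: N - m]
        d = np.max(np.abs(tmpl[:, None, :] - tmpl[None, :, :]), axis=-1)
        return np.sum(d <= r) - len(tmpl)  # exclude self-matches

    B, A = match_count(m), match_count(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 400)
regular = np.sin(t)                    # highly predictable signal -> low SampEn
irregular = rng.standard_normal(400)   # white noise -> high SampEn
print(sample_entropy(regular), sample_entropy(irregular))
```

In a pipeline like the one described, such entropy values (together with the Lyapunov exponent and correlation dimension) would form the EEG part of the feature vector fed to the classifiers.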
{"title":"Combining Nonlinear Features of EEG and MRI to Diagnose Alzheimer’s Disease","authors":"Elias Mazrooei Rad, Mahdi Azarnoosh, Majid Ghoshuni, Mohammad Mahdi Khalilzadeh","doi":"10.1007/s40745-024-00533-4","DOIUrl":"10.1007/s40745-024-00533-4","url":null,"abstract":"<div><p>In this article, a new method for the diagnosis of Alzheimer’s disease in the mild stage is presented according to combining the characteristics of EEG signal and MRI images. The brain signal is recorded in four modes of closed eyes, open eyes, reminder, and stimulation from three channels Pz, Cz, and Fz of 90 participants in three groups of healthy subjects, mild, and severe Alzheimer’s disease (AD) patients. In addition, MRI images are taken with at least 3 Tesla and the thickness of 3 mm so that the senile plaques and neurofibrillary tangles can be examined. Proper image segmentation, mask, and sharp filters are used for preprocessing. Then proper features of brain signals are extracted according to the nonlinear and chaotic nature of the brain such as Lyapunov exponent, correlation dimension, and entropy. Results: These features combined with brain MRI images properties including Medial Temporal Lobe Atrophy (MTA), Cerebral Spinal Fluid (CSF), Gray Matter (GM), Index Asymmetry (IA) and White Matter (WM) to diagnose the disease. Then two classifiers, the support vector machine, and Elman neural network are used with the optimal combined features extracted by analysis of variance. Results showed that between the three brain signals, and between the four modes of evaluation, the accuracy of the Pz channel and excitation mode was more than the others. Conclusions: Finally, by using neural network dynamics because of the nonlinear properties studied and due to the nonlinear dynamics of the EEG signal, the Elman neural network is used. However, it is important to note that, by analyzing medical images, we can determine the most effective channel for recording brain signals. 
3D segmentation of MRI images further helps researchers diagnose Alzheimer’s disease and obtain important information. The accuracy of the results in the Elman neural network with the combination of brain signal features and medical images is 94.4% and in the case without combining the signal and image features, the accuracy of the results is 92.2%. The use of nonlinear classifiers is more appropriate than other classification methods due to the nonlinear dynamics of the brain signal. The accuracy of the results in the support vector machine with RBF kernel with the combination of brain signal features and medical images is 75.5% and in the case without combining the signal and image features, the accuracy of the results is 76.8%.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 1","pages":"95 - 116"},"PeriodicalIF":0.0,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141115548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}