Annals of Data Science: Latest Articles


Transmuted Shifted Lindley Distribution: Characterizations, Classical and Bayesian Estimation with Applications

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-16 DOI: 10.1007/s40745-024-00562-z

A. Chakraborty, S. Rana, S. I. Maiti

Citations: 0

Apple Leaf Disease Detection Using Transfer Learning

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-13 DOI: 10.1007/s40745-024-00555-y

Ozair Ahmad Wani, Umer Zahoor, Syed Zubair Ahmad Shah, Rijwan Khan

Automated detection of plant diseases is crucial because it simplifies the monitoring of large farms and identifies diseases at an early stage, mitigating further plant degradation. Besides the decline in plant health, reduced production severely impacts the country’s economy. Traditional disease identification methods, which rely on human experts, are slow and impractical for large farms. Our proposed model combines pre-trained ResNet18, AlexNet, GoogLeNet, and VGG16 networks to classify images of apple tree leaves into categories such as healthy, black rot, cedar apple rust, and apple scab. Various image enhancement techniques were employed to improve the model’s accuracy. Ultimately, our model achieved an accuracy of 97.25% on the validation dataset, demonstrating excellent performance across various metrics. This suggests its potential for efficient and accurate plant health monitoring in the agricultural sector.

Citations: 0

A Review of Anonymization Algorithms and Methods in Big Data

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-13 DOI: 10.1007/s40745-024-00557-w

Elham Shamsinejad, Touraj Banirostam, Mir Mohsen Pedram, Amir Masoud Rahmani

In the era of big data, as the volume and complexity of data increase, the main challenge is how to use big data while preserving the privacy of users. This study was conducted with the aim of finding a solution to this challenge. We examined various data anonymization methods, including differential privacy, advanced encryption, and strong access controls. We also examined the operation, advantages, disadvantages, and use of these methods, the challenges of adapting them to big data, and possible solutions. Our results show that traditional data anonymization methods lack scalability, leading to privacy breaches and data loss. When faced with large volumes of data, these methods may not be able to fully process the data. They may also be ineffective against re-identification attacks, linkage attacks, and inference attacks. We introduced emerging methods that provide improved privacy with minimal data loss and that scale to big data. Finally, we examined directions for future research and raised important questions that can help improve existing algorithms, develop new methods, and better manage the complexity and scale of unstructured data.
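As an illustration of one method the abstract names, differential privacy, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is not taken from the paper: the function name `private_count`, the choice of a counting query (sensitivity 1), and the sample values are assumptions made for this example.

```python
import random


def private_count(true_count, epsilon, rng):
    """Return true_count plus Laplace noise calibrated for epsilon-DP.

    Assumes a counting query, whose sensitivity is 1, so the Laplace
    scale is 1 / epsilon. (Hypothetical helper for illustration.)
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean
    # `scale` follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(private_count(100, epsilon=1.0, rng=rng))
```

A smaller `epsilon` gives stronger privacy but noisier answers, which is one face of the privacy versus data-loss trade-off the abstract discusses.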

Volume 12, Issue 1, Pages 253-279

Citations: 0

Representing a Model for the Anonymization of Big Data Stream Using In-Memory Processing

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-13 DOI: 10.1007/s40745-024-00556-x

Elham Shamsinejad, Touraj Banirostam, Mir Mohsen Pedram, Amir Masoud Rahmani

In light of the escalating privacy risks in the big data era, this paper introduces an innovative model for the anonymization of big data streams, leveraging in-memory processing within the Spark framework. The approach is founded on the principle of K-anonymity and propels the field forward by critically evaluating various anonymization methods and algorithms, benchmarking their performance with respect to time and space complexities. A distinctive formula for optimized cluster determination in the K-means algorithm is presented, along with a novel tuple expiration time strategy for the efficient purging of clusters. The integration of these components into Spark’s RDD and MLlib modules results in a significant decrease in execution time and data loss rates, even with increasing data volumes. The paper’s notable contributions are its methodological advancements that offer a robust, scalable solution for data anonymization, safeguarding user privacy without sacrificing data utility or processing efficiency.
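The k-anonymity principle that the abstract builds on can be made concrete with a small check: a table is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The sketch below is illustrative only and is not the paper's Spark-based model; the function name `is_k_anonymous`, the column names, and the sample records are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    # Count how many records fall into each equivalence class, where a class
    # is defined by the tuple of quasi-identifier values.
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already-generalized records (age ranges, masked ZIP codes).
records = [
    {"age": "30-39", "zip": "021**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "021**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "946**", "diagnosis": "flu"},
]

if __name__ == "__main__":
    # The third record is alone in its equivalence class, so k=2 fails.
    print(is_k_anonymous(records, ["age", "zip"], k=2))
```

In a streaming setting such as the one the abstract describes, a check like this would apply per cluster or window rather than to a whole static table.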

Citations: 0

Analyzing Insurance Data with an Alpha Power Transformed Exponential Poisson Model

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-10 DOI: 10.1007/s40745-024-00554-z

M. Meraou, M. Z. Raqab, Fatmah B. Almathkour

Citations: 0

Unlocking Online Insights: LSTM Exploration and Transfer Learning Prospects

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-08 DOI: 10.1007/s40745-024-00551-2

Muhammad Tahir, Sufyan Ali, Ayesha Sohail, Ying Zhang, Xiaohua Jin

Machine learning algorithms can improve time series analysis compared with traditional methods such as moving averages or autoregressive approaches. This advance has helped to unlock several challenging problems, since machine learning not only forecasts the overall trend of the data but also tracks historical changes in the factors influencing that trend. These predictions play a pivotal role in almost all areas of research where observations are time dependent, from finance to public health, environmental, and climate change challenges. A key challenge in these domains is the large number of attributes and predictors, since managing and manipulating data with many attributes is itself a significant obstacle to forecasting. Recursive Long Short-Term Memory models make it possible to address these challenges. The application of such models is crucial, and their efficacy is further amplified by transfer learning. This research gives a detailed and comprehensive description of such models. Practical application is illustrated through an example, emphasizing that these models hold great promise when transferred to complex and large datasets using transfer learning.
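For context, the moving-average approach that the abstract names as a traditional baseline can be sketched in a few lines. This is an illustrative sketch rather than code from the paper; the function name `moving_average_forecast` and the sample series are hypothetical.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


if __name__ == "__main__":
    # Hypothetical daily observations.
    observations = [1, 2, 3, 4, 5]
    print(moving_average_forecast(observations, window=2))  # mean of 4 and 5
```

A baseline like this uses only the target series itself, which is why models that can also take many attributes and predictors as input, such as the LSTM models discussed in the abstract, are of interest.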


Volume 11, Issue 4, Pages 1421-1434. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s40745-024-00551-2.pdf

Citations: 0

Drinkers Voice Recognition Intelligent System: An Ensemble Stacking Machine Learning Approach

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-07 DOI: 10.1007/s40745-024-00559-8

P. Terlapu

Citations: 0

A New Kernel Density Estimation-Based Entropic Isometric Feature Mapping for Unsupervised Metric Learning

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-06 DOI: 10.1007/s40745-024-00548-x

Alaor Cervati Neto, A. Levada, Michel Ferreira Cardia Haddad

Citations: 0

Power Evaluation of Some Tests for Inverse Rayleigh Distribution

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-07-05 DOI: 10.1007/s40745-024-00536-1

Vahideh Ahrari, P. Hasanalipour

Citations: 0

A Comprehensive Survey of Image Generation Models Based on Deep Learning

Q1 Decision Sciences

Annals of Data Science

Pub Date : 2024-06-20 DOI: 10.1007/s40745-024-00544-1

Jun Li, Chenyang Zhang, Wei Zhu, Yawei Ren

In recent years, generative artificial intelligence has developed rapidly. In the image domain, image generation models based on deep learning have made remarkable achievements. Early frameworks for image generation were dominated by generative adversarial networks (GANs) and variational autoencoders (VAEs). Today, large-scale generative models based on diffusion models have become mainstream, and the quality of their generated images has improved significantly. We review the research and development of image generation models and delve into the significant progress made in the field in recent years. First, we revisit the development of traditional image generation models such as GANs and VAEs, emphasizing their contributions and challenges. We also introduce diffusion models, which have received much attention in image generation due to their unique generative process and excellent generative performance. Next, we discuss large vision models, with SAM as the focal point. We also pay special attention to large-scale generative models such as Stable Diffusion, which have demonstrated unprecedented capabilities in high-quality image generation tasks. Finally, we explore target models and their respective fine-tuning methods for domain-oriented image generation tasks, predict future directions in image generation, and propose potential research focuses and challenges.

Volume 12, Issue 1, Pages 141-170

Citations: 0
