Gavino Faa, Massimo Castagnola, Luca Didaci, Fernando Coghe, M. Scartozzi, Luca Saba, Matteo Fraschini
The introduction of machine learning in digital pathology has deeply impacted the field, especially with the advent of whole slide image (WSI) analysis. In this review, we tried to elucidate the role of machine learning algorithms in diagnostic precision, efficiency, and the reproducibility of the results. First, we discuss some of the most used tools, including QuPath, HistoQC, and HistomicsTK, and provide an updated overview of machine learning approaches and their application in pathology. Later, we report how these tools may simplify the automation of WSI analyses, also reducing manual workload and inter-observer variability. A novel aspect of this review is its focus on open-source tools, presented in a way that may help the adoption process for pathologists. Furthermore, we highlight the major benefits of these technologies, with the aim of making this review a practical guide for clinicians seeking to implement machine learning-based solutions in their specific workflows. Moreover, this review also emphasizes some crucial limitations related to data quality and the interpretability of the models, giving insight into future directions for research. Overall, this work tries to bridge the gap between the more recent technological progress in computer science and traditional clinical practice, supporting a broader, yet smooth, adoption of machine learning approaches in digital pathology.
"The Quest for the Application of Artificial Intelligence to Whole Slide Imaging: Unique Prospective from New Advanced Tools". Algorithms, 2024-06-10. doi:10.3390/a17060254
Digital systems are nowadays ubiquitous and often exhibit an extremely high level of complexity. Guaranteeing the correct behavior of such systems has become an ever more pressing need for manufacturers. The correctness of digital systems can be addressed by resorting to formal verification techniques, such as model checking. Currently, it is usually impossible to determine a priori the best algorithm to use for a given verification task and, thus, portfolio approaches have become the de facto standard in model checking verification suites. This paper describes the most relevant algorithms and techniques at the foundations of bit-level SAT-based model checking.
"Hardware Model Checking Algorithms and Techniques". G. Cabodi, P. Camurati, M. Palena, P. Pasini. Algorithms, 2024-06-09. doi:10.3390/a17060253
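One core technique in SAT-based verification is bounded model checking: unroll the transition relation up to a depth bound and search for a violation of a safety property. As a hedged illustration of the underlying idea only, a toy explicit-state search stands in here for the propositional SAT encoding the paper discusses, and the counter system and "bad state" property are invented:

```python
def bounded_check(init_states, transition, is_bad, bound):
    """Explicit-state bounded reachability: return (depth, state) of the
    first bad state reachable within `bound` steps, or None if safe."""
    frontier, seen = set(init_states), set(init_states)
    for depth in range(bound + 1):
        for s in frontier:
            if is_bad(s):
                return depth, s
        nxt = set()
        for s in frontier:
            for t in transition(s):
                if t not in seen:
                    seen.add(t)
                    nxt.add(t)
        frontier = nxt
        if not frontier:          # fixpoint reached before the bound
            break
    return None

# Toy 3-bit counter that wraps around; the "bad" state is value 6.
def transition(s):
    return [(s + 1) % 8]

result = bounded_check({0}, transition, lambda s: s == 6, bound=10)
print(result)  # (6, 6): the bad state is first reached after 6 steps
```

A real SAT-based checker would instead encode the k-step unrolling as one propositional formula and hand it to a SAT solver, but the depth-by-depth structure is the same.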
A Distributed Elevator Fault Diagnosis System (DEFDS) is developed to tackle frequent malfunctions stemming from the widespread distribution and aging of elevator systems. Due to the complexity of elevator fault data and the subtlety of fault characteristics, traditional methods such as visual inspections and basic operational tests fall short in detecting early signs of mechanical wear and electrical issues. These conventional techniques often fail to recognize subtle fault characteristics, necessitating more advanced diagnostic tools. In response, this paper introduces a Principal Component Analysis–Long Short-Term Memory (PCA-LSTM) method for fault diagnosis. The distributed system decentralizes the fault diagnosis process to individual elevator units, utilizing PCA’s feature selection capabilities in high-dimensional spaces to extract and reduce the dimensionality of fault features. Subsequently, the LSTM model is employed for fault prediction. Elevator models within the system exchange data to refine and optimize a global prediction model. The efficacy of this approach is substantiated through empirical validation with actual data, achieving an accuracy rate of 90% and thereby confirming the method’s effectiveness in facilitating distributed elevator fault diagnosis.
"Research on Distributed Fault Diagnosis Model of Elevator Based on PCA-LSTM". Chengming Chen, Xuejun Ren, Guoqing Cheng. Algorithms, 2024-06-07. doi:10.3390/a17060250
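The first stage of the PCA-LSTM pipeline, reducing high-dimensional fault features with PCA, can be sketched in a few lines. This is a hedged illustration on synthetic sensor data (the channel counts and latent structure are invented, and the LSTM prediction stage that would consume the reduced features is omitted):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components.
    Returns (reduced data, fraction of variance retained)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T                 # scores in component space
    explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
    return Z, explained

# Synthetic "fault features": 200 time steps, 12 correlated sensor channels
# driven by 3 latent factors plus a little noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = base @ rng.normal(size=(3, 12)) + 0.05 * rng.normal(size=(200, 12))

Z, var = pca_reduce(X, n_components=3)
print(Z.shape, round(var, 3))   # (200, 3); nearly all variance retained
```

In the paper's setting, the rows of `Z` would form the input sequence for the LSTM fault predictor.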
Albina V. Lebedeva, Margarita I. Samburova, Vyacheslav V. Razin, Nikolay V. Gromov, S. A. Gerasimova, T. Levanova, L. Smirnov, Alexander N. Pisarchik
The increasing growth in knowledge about the functioning of the nervous system of mammals and humans, as well as the significant neuromorphic technology developments in recent decades, has led to the emergence of a large number of brain–computer interfaces and neuroprosthetics for regenerative medicine tasks. Neurotechnologies have traditionally been developed for therapeutic purposes to help or replace motor, sensory or cognitive abilities damaged by injury or disease. They also have significant potential for memory enhancement. However, there are still no fully developed neurotechnologies and neural interfaces capable of restoring or expanding cognitive functions, in particular memory, in mammals or humans. In this regard, the search for new technologies in the field of the restoration of cognitive functions is an urgent task of modern neurophysiology, neurotechnology and artificial intelligence. The hippocampus is an important brain structure connected to memory and information processing in the brain. The aim of this paper is to propose an approach based on deep neural networks for the prediction of hippocampal signals in the CA1 region based on received biological input in the CA3 region. We compare the results of prediction for two widely used deep architectures: reservoir computing (RC) and long short-term memory (LSTM) networks. The proposed study can be viewed as a first step in the complex task of the development of a neurohybrid chip, which allows one to restore memory functions in the damaged rodent hippocampus.
"Prediction of Hippocampal Signals in Mice Using a Deep Learning Approach for Neurohybrid Technology Applications". Algorithms, 2024-06-07. doi:10.3390/a17060252
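Of the two compared architectures, reservoir computing is compact enough to sketch directly: a fixed random recurrent network is driven by the input signal, and only a linear readout is trained. This is a hedged illustration with synthetic signals standing in for CA3 input and CA1 target; all sizes, signals, and hyperparameters are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins: a "CA3" input drives a delayed, distorted "CA1" target.
T = 500
u = np.sin(np.linspace(0, 20 * np.pi, T))        # input signal
y = np.roll(np.tanh(0.8 * u), 5)                 # target signal

# Random reservoir, scaled toward the echo state property (spectral radius < 1).
N = 100
W = rng.normal(size=(N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, size=N)

# Drive the reservoir and collect its states.
states = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])
    states[t] = x

# Ridge-regression readout trained on steps 50-399, tested on the rest.
warm, train = 50, 400
S, Y = states[warm:train], y[warm:train]
ridge = 1e-6
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(N), S.T @ Y)

pred = states[train:] @ W_out
mse = float(np.mean((pred - y[train:]) ** 2))
print(round(mse, 4))   # small held-out error on this toy signal
```

An LSTM, by contrast, trains all recurrent weights by backpropagation through time, which is the trade-off the paper compares.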
With the global pursuit of renewable energy and carbon neutrality, hydrogen-based microgrids have become an important area of research, as proper design and operation are essential to achieving optimal performance from hybrid systems. This paper proposes a distributed control strategy based on multiagent self-triggered model predictive control (ST-MPC), with the aim of achieving demand-side control of hydrogen-based microgrid systems. The architecture considers a hybrid energy storage system with renewable energy as the main power source, supplemented by fuel cells based on electrolytic hydrogen. Its primary objective is to address the supply and demand balance problem of the microgrid while extending the service life of the hydrogen-based energy storage equipment through demand-side control. To accomplish this, model predictive controllers are implemented within a self-triggered framework that dynamically adjusts the counting period. The simulation results demonstrate that the ST-MPC architecture significantly reduces the frequency of control action changes while maintaining an acceptable level of set-point tracking. These findings highlight the viability of the proposed solution for microgrids equipped with multiple types of electrochemical storage, which contributes to improved sustainability and efficiency in renewable-based microgrid systems.
"Distributed Control of Hydrogen-Based Microgrids for the Demand Side: A Multiagent Self-Triggered MPC-Based Strategy". Tingzhe Pan, Jue Hou, Xin Jin, Zhenfan Yu, Wei Zhou, Zhijun Wang. Algorithms, 2024-06-07. doi:10.3390/a17060251
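The self-triggering idea, recomputing the control law only when needed instead of at every sampling instant, can be shown with a toy scalar loop. This is a hedged sketch: a proportional law on a first-order plant stands in for the paper's MPC, and the error-threshold trigger, gains, and setpoint are all illustrative:

```python
def self_triggered_run(setpoint=1.0, steps=60, threshold=0.05):
    """Scalar plant x+ = x + 0.5*u; recompute u only when the tracking
    error exceeds `threshold`, otherwise hold the previous input."""
    x, u, updates = 0.0, 0.0, 0
    for _ in range(steps):
        error = setpoint - x
        if abs(error) > threshold:       # trigger: recompute the control
            u = 0.8 * error              # simple proportional law
            updates += 1
        x = x + 0.5 * u                  # plant update with the held input
    return x, updates

x_final, updates = self_triggered_run()
print(round(x_final, 3), updates)   # hovers near the setpoint with far fewer
                                    # control updates than simulation steps
```

The reduction in `updates` relative to `steps` is the same effect the abstract reports: fewer control action changes at an acceptable tracking level, which spares actuators and storage equipment.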
Antimicrobial resistance, particularly the emergence of resistant strains in fungal pathogens, has become a pressing global health concern. Antifungal peptides (AFPs) have shown great potential as a promising alternative therapeutic strategy due to their inherent antimicrobial properties and potential application in combating fungal infections. However, the identification of antifungal peptides using experimental approaches is time-consuming and costly. Hence, there is a demand for fast and accurate computational approaches to identifying AFPs. This paper introduces a novel multi-view feature learning (MVFL) model, called AFP-MVFL, for accurate AFP identification. By integrating the sequential and physicochemical properties of amino acids and employing a multi-view approach, the AFP-MVFL model significantly enhances prediction accuracy. It achieves 97.9%, 98.4%, 0.98, and 0.96 in terms of accuracy, precision, F1 score, and Matthews correlation coefficient (MCC), respectively, outperforming previous studies found in the literature.
"New Multi-View Feature Learning Method for Accurate Antifungal Peptide Detection". S. M. Ferdous, S. Mugdha, Iman Dehzangi. Algorithms, 2024-06-06. doi:10.3390/a17060247
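The idea of combining feature views can be made concrete with a small sketch. This is a hedged illustration, not the authors' AFP-MVFL feature set: a sequential view (amino-acid composition) and a physicochemical view (summary statistics of the standard Kyte-Doolittle hydropathy index) are concatenated into one vector, and the example peptide is arbitrary:

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (one physicochemical property per residue).
KD = {"A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8, "G": -0.4,
      "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8, "M": 1.9, "N": -3.5,
      "P": -1.6, "Q": -3.5, "R": -4.5, "S": -0.8, "T": -0.7, "V": 4.2,
      "W": -0.9, "Y": -1.3}

def composition_view(seq):
    """Sequential view: amino-acid composition (20 relative frequencies)."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def physicochemical_view(seq):
    """Physicochemical view: mean and range of residue hydropathy."""
    values = [KD[a] for a in seq]
    return [sum(values) / len(values), max(values) - min(values)]

def multi_view_features(seq):
    """Concatenate both views into one vector (20 + 2 = 22 dimensions)."""
    return composition_view(seq) + physicochemical_view(seq)

features = multi_view_features("GIGKFLHSAKKFGKAFVGEIMNS")  # example peptide
print(len(features))  # 22
```

A multi-view model would learn from each view separately (or jointly with view-specific encoders) rather than from a flat concatenation, but the feature-construction step looks like this.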
Mohamed A. Sharaf, Heba Helal, Nazar Zaki, Wadha Alketbi, Latifa Alkaabi, Sara Alshamsi, Fatmah Alhefeiti
Analyzing crowdfunding data has been the focus of many research efforts, where analysts typically explore this data to identify the main factors and characteristics of the lending process as well as to discover unique patterns and anomalies in loan distributions. However, the manual exploration and visualization of such data is clearly an ad hoc, time-consuming, and labor-intensive process. Hence, in this work, we propose LoanVis, which is an automated solution for discovering and recommending those valuable and insightful visualizations. LoanVis is a data-driven system that utilizes objective metrics to quantify the “interestingness” of a visualization and employs such metrics in the recommendation process. We demonstrate the effectiveness of LoanVis in analyzing and exploring different aspects of the Kiva crowdfunding dataset.
"Automated Recommendation of Aggregate Visualizations for Crowdfunding Data". Algorithms, 2024-06-06. doi:10.3390/a17060244
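One common family of objective "interestingness" metrics in visualization recommendation scores an aggregate view by how strongly a subset's distribution deviates from the dataset-wide distribution. A minimal deviation-based sketch under that assumption follows; the metric choice, field names, and toy records are illustrative, not the actual LoanVis internals:

```python
from math import sqrt

def interestingness(subset_rows, all_rows, group_key, value_key):
    """Deviation-based score: Euclidean distance between the subset's and the
    overall normalized aggregate (sum of value_key grouped by group_key)."""
    def normalized_aggregate(rows, keys):
        totals = {k: 0.0 for k in keys}
        for r in rows:
            totals[r[group_key]] += r[value_key]
        s = sum(totals.values()) or 1.0
        return [totals[k] / s for k in keys]

    keys = sorted({r[group_key] for r in all_rows})
    p = normalized_aggregate(subset_rows, keys)
    q = normalized_aggregate(all_rows, keys)
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Toy loan records: one country's sector mix vs. the whole dataset.
loans = [
    {"country": "KE", "sector": "Agriculture", "amount": 500},
    {"country": "KE", "sector": "Retail", "amount": 100},
    {"country": "PH", "sector": "Agriculture", "amount": 300},
    {"country": "PH", "sector": "Retail", "amount": 700},
]
ke = [r for r in loans if r["country"] == "KE"]
score = interestingness(ke, loans, "sector", "amount")
print(round(score, 3))  # higher score = the subset's mix deviates more
```

A recommender would compute such scores across many candidate (subset, grouping, aggregate) combinations and surface the top-ranked views.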
I. Volkau, Sergei Krasovskii, Abdul Mujeeb, Helen Balinsky
The manuscript presents a novel non-gradient and non-iterative method for mapping two 3D objects by matching extrema. This innovative approach utilizes the amplification of extrema through the summation of dependent random values, accompanied by a comprehensive explanation of the statistical background. The method further incorporates structural patterns based on spherical harmonic functions to calculate the rotation matrix, enabling the juxtaposition of the objects. Without utilizing gradients and iterations to improve the solution step by step, the proposed method generates a limited number of candidates, and the mapping (if it exists) is necessarily among the candidates. For instance, this method holds potential for object analysis and identification in additive manufacturing for 3D printing and protein matching.
"A Non-Gradient and Non-Iterative Method for Mapping 3D Mesh Objects Based on a Summation of Dependent Random Values". Algorithms, 2024-06-06. doi:10.3390/a17060248
Sean Pascoe, A. Farmery, Rachel Nichols, Sarah Lothian, Kamal Azmi
A key component of multi-criteria decision analysis is the estimation of criteria weights, reflecting the preference strength of different stakeholder groups related to different objectives. One common method is the Analytic Hierarchy Process (AHP). A key challenge with the AHP is the potential for inconsistency in responses, resulting in potentially unreliable preference weights. In small groups, interactions between analysts and respondents can compensate for this through reassessment of inconsistent responses. In many cases, however, stakeholders may be geographically dispersed, with online surveys being a more cost-effective means to elicit these preferences, making renegotiating with inconsistent respondents impossible. Further, the potentially large number of bivariate comparisons required using the AHP may adversely affect response rates. In this study, we test a new “modified” AHP (MAHP). The MAHP was designed to retain the key desirable features of the AHP but be more amenable to online surveys, reduce the problem of inconsistencies, and require substantially fewer comparisons. The MAHP is tested using three groups of university students through an online survey platform, along with a “traditional” AHP approach. The results indicate that the MAHP can provide statistically equivalent outcomes to the AHP but without problems arising due to inconsistencies.
"A Modified Analytic Hierarchy Process Suitable for Online Survey Preference Elicitation". Algorithms, 2024-06-06. doi:10.3390/a17060245
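The pairwise-comparison machinery underlying both the AHP and the MAHP can be sketched concretely. A minimal illustration, assuming an invented 3x3 judgment matrix on Saaty's 1-9 scale, the standard row geometric-mean approximation of the priority weights, and Saaty's consistency ratio (this is the generic AHP calculation, not the authors' survey instrument):

```python
from math import prod

def ahp_weights(A):
    """Approximate AHP priority weights by normalized row geometric means."""
    n = len(A)
    gm = [prod(row) ** (1.0 / n) for row in A]
    total = sum(gm)
    return [g / total for g in gm]

def consistency_ratio(A, weights):
    """CR = CI / RI, with lambda_max estimated from the rows of A @ w."""
    n = len(A)
    Aw = [sum(A[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lam = sum(Aw[i] / weights[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1)
    ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]   # Saaty's random index
    return ci / ri

A = [[1,   3,   5],      # pairwise judgments on the 1-9 scale
     [1/3, 1,   2],
     [1/5, 1/2, 1]]
w = ahp_weights(A)
print([round(x, 3) for x in w], round(consistency_ratio(A, w), 3))
```

A CR above roughly 0.1 conventionally flags an inconsistent respondent, which is exactly the case that is hard to renegotiate in an online survey and that the MAHP is designed to avoid.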
J. Gussenbauer, M. Templ, Siro Fritzmann, A. Kowarik
Synthetic data generation methods are used to transform the original data into privacy-compliant synthetic copies (twin data). With our proposed approach, synthetic data can be simulated at the same size as the input data or at any size, and in the case of finite populations, even the entire population can be simulated. The proposed XGBoost-based method is compared with known model-based approaches for generating synthetic data using a complex survey data set. The XGBoost method shows strong performance, especially with synthetic categorical variables, and outperforms the other tested methods. Furthermore, the structure and relationships between variables are well preserved. The tuning of the parameters is performed automatically by a modified k-fold cross-validation. If exact population margins are known, e.g., cross-tabulated population counts by age class, gender, and region, the synthetic data must be calibrated to those known population margins. For this purpose, we have implemented a simulated annealing algorithm that is able to use multiple population margins simultaneously to post-calibrate a synthetic population. The algorithm is thus able to calibrate simulated population data containing cluster and individual information, e.g., about persons in households, at both the person and household level. Furthermore, the algorithm is implemented efficiently, so that adjusting populations of many millions of persons is possible.
"Simulation of Calibrated Complex Synthetic Population Data with XGBoost". Algorithms, 2024-06-06. doi:10.3390/a17060249
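The post-calibration step can be illustrated with a toy version of the simulated annealing idea: select a subset of synthetic records so that its category counts approach a known margin, using swap moves. This is a hedged sketch under simplified assumptions (a single margin, invented age classes, a simple absolute-deviation objective); the paper's algorithm handles multiple margins and household structure simultaneously:

```python
import math
import random

def anneal_calibration(pool, target_counts, size, steps=5000, seed=7):
    """Select `size` records from `pool` whose category counts approach
    `target_counts`, via simulated annealing with swap moves."""
    rng = random.Random(seed)

    def objective(idx):
        counts = {}
        for i in idx:
            counts[pool[i]] = counts.get(pool[i], 0) + 1
        return sum(abs(counts.get(c, 0) - t) for c, t in target_counts.items())

    chosen = list(range(size))          # start from an arbitrary subset
    cur = objective(chosen)
    temp = 2.0
    for _ in range(steps):
        temp *= 0.999                   # geometric cooling schedule
        j = rng.randrange(size)         # position to swap out ...
        k = rng.randrange(len(pool))    # ... candidate record to swap in
        if k in chosen:
            continue
        old, chosen[j] = chosen[j], k
        new = objective(chosen)
        # Accept improvements always, worse moves with Boltzmann probability.
        if new <= cur or rng.random() < math.exp((cur - new) / temp):
            cur = new
        else:
            chosen[j] = old
    return chosen, cur

# Toy pool of age classes; calibrate a sample of 100 to known margins.
rng = random.Random(0)
pool = [rng.choice(["0-17", "18-64", "65+"]) for _ in range(1000)]
target = {"0-17": 20, "18-64": 60, "65+": 20}
chosen, err = anneal_calibration(pool, target, size=100)
print(len(chosen), err)   # 100 selected records; small remaining margin error
```

Accepting occasional worsening moves lets the search escape local minima, which matters once several margins must be satisfied at the same time.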