Pub Date: 2025-12-13. DOI: 10.1016/j.array.2025.100646
Pengpeng Li, Xicheng Chen, Haojia Wang, Tao Xu, Yang Li, Wei Ye, Jia Chen, Fang Li, Ning Yao, Yazhou Wu
The high heterogeneity of Hepatocellular Carcinoma (HCC) limits prognostic prediction accuracy, and existing deep learning models often struggle to capture the complex interactions within multi-omics data. To address this limitation, this study proposes a novel Stacked Supervised Auto-Encoder (SSAE) framework. It integrates miRNA, mRNA, and DNA methylation data by embedding a Cox proportional hazards model into the hidden layers of sequentially stacked modules. Using the TCGA-LIHC dataset, we systematically evaluated Early Integration (EI-SSAE) and Late Integration (LI-SSAE) strategies while comparing them against traditional machine learning methods. The results demonstrate that the Late Integration strategy significantly outperforms both Early Integration and unsupervised variants. It achieved a high Concordance Index (CI) of 0.969 compared to 0.663 for EI-SSAE. Additionally, the LI-SSAE model exhibited superior calibration accuracy with a Brier score of 0.077 and successfully stratified patients into distinct risk groups with a P-value of 1.11e-08. Bioinformatic analysis further identified critical biomarkers such as FBXW10 and TRIP13. This study confirms that the LI-SSAE model effectively enhances prognostic prediction accuracy and offers a robust tool for clinical assessment and personalized treatment in HCC.
Title: Research on a multi-omics prognostic model of liver cancer based on stacked supervised deep learning. Published in Array, Volume 29, Article 100646.
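The Concordance Index (CI) reported above scores how often, across comparable patient pairs, the model assigns the higher risk to the patient who dies earlier. A minimal pure-Python sketch of the standard definition (ties counted as 0.5; illustrative only, not the paper's implementation):

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs (i, j) in which the patient with the
    shorter observed survival time also has the higher predicted risk.
    times: observed survival times; events: 1 if death observed, 0 if
    censored; risks: predicted risk scores (higher = worse prognosis)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable only if i's event was observed and occurred first
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# perfectly ordered risks -> CI = 1.0
print(concordance_index([2, 4, 6], [1, 1, 1], [0.9, 0.5, 0.1]))
```

A CI of 0.5 corresponds to random ordering, so the 0.969 reported for LI-SSAE indicates near-perfect pairwise risk ranking on the evaluated cohort.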
Pub Date: 2025-12-13. DOI: 10.1016/j.array.2025.100644
Hafeez Ur Rehman Siddiqui, Muhammad Amjad Raza, Adil Ali Saleem, Josep Alemany-Iturriaga, Fernando Maniega Legarda, Isabel de la Torre Díez
Cricket stroke analysis plays a critical role in performance evaluation, strategic decision-making, and coaching. Traditional manual techniques are limited in their ability to capture the fine-grained timing and biomechanical nuances of batting movements. This study investigates four advanced neural network architectures—long short-term memory (LSTM), bidirectional LSTM (BiLSTM), Transformer, and a BERT-inspired model—for short-term prediction of cricket stroke dynamics. Using joint-level pose sequences extracted with MediaPipe, the models forecast the next five frames of motion, corresponding to approximately 0.16 s of future movement that is biomechanically relevant for adjustments in bat angle, head stability, and lower-body alignment. Preprocessing involved keypoint scaling, sliding-window generation, and sequence normalization. The models were evaluated across eight stroke types: straight drive, sweep, pull, on drive, flick, cut, cover drive, and back-foot punch. Performance was assessed using standard error metrics and prediction accuracy. The Transformer consistently delivered the best results for most strokes, particularly sweep, flick, and cover drive, achieving low prediction error and strong temporal–spatial alignment. The BERT-inspired model performed competitively on straight drive and on drive, while the BiLSTM model excelled on back-foot punch. Although the LSTM provided reasonable predictions, it was generally outperformed by the more advanced architectures. These findings demonstrate the suitability of Transformer-based models for capturing the complex spatial–temporal patterns inherent in cricket batting mechanics. The proposed framework offers practical implications for real-time coaching, player assessment, and sports science by advancing automated prediction of athletic performance.
Title: Advancing cricket biomechanics: Neural network based motion prediction for stroke analysis. Published in Array, Volume 29, Article 100644.
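The sliding-window generation step listed above pairs a short history of pose frames with the frames to be forecast. A minimal sketch; the 4-frame history length is invented for illustration, while the 5-frame horizon mirrors the paper's setup:

```python
def make_windows(frames, in_len, out_len):
    """Split a sequence of per-frame keypoint vectors into
    (input window, future target) pairs for short-term forecasting."""
    pairs = []
    for start in range(len(frames) - in_len - out_len + 1):
        x = frames[start:start + in_len]                      # model input
        y = frames[start + in_len:start + in_len + out_len]   # frames to predict
        pairs.append((x, y))
    return pairs

# toy 1-D "keypoints": 10 frames, 4-frame history, 5-frame horizon
frames = list(range(10))
pairs = make_windows(frames, in_len=4, out_len=5)
print(len(pairs))    # number of (history, target) pairs
print(pairs[0])
```

In the actual pipeline each element of `frames` would be a scaled, normalized vector of MediaPipe joint coordinates rather than a scalar.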
The increasing integration of smart city technologies has underscored the need for intelligent transportation systems that prioritize road safety. This study presents a neuro-cognitive artificial intelligence (AI)-driven system designed to detect driver drowsiness in real time and proposes a framework for future accident detection. The system combines computer vision and machine learning to monitor driver alertness and issue timely warnings. Facial expressions and eye movements are analyzed using a hybrid architecture that integrates convolutional neural networks (CNNs) for spatial feature extraction and gated recurrent units (GRUs) for temporal modeling. When signs of fatigue are detected—such as sustained low eye aspect ratio (EAR)—the system triggers visual and auditory alerts to re-engage the driver. Operating within a smart city infrastructure, the system is designed to communicate with traffic management platforms and emergency services for enhanced coordination. While the drowsiness detection module has been fully implemented and evaluated, the accident detection component remains a proposed feature for future development. This research contributes to the advancement of proactive road safety solutions and lays the groundwork for scalable, multimodal AI systems in intelligent transportation networks.
Title: Neuro-cognitive AI-driven system for preventing road accidents through driver drowsiness alerts. Authors: Chitaranjan Mahapatra, Shuvendra Kumar Tripathy, Mrunmayee Tripathy. Pub Date: 2025-12-12. DOI: 10.1016/j.array.2025.100631. Published in Array, Volume 29, Article 100631.
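The eye aspect ratio (EAR) mentioned above is commonly computed from six eye landmarks as the ratio of vertical eye openings to horizontal eye width. A sketch of that standard formula; the 0.2 drowsiness threshold is illustrative, not taken from the paper:

```python
import math

def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    """EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|): vertical openings over
    horizontal width. The value drops toward 0 as the eye closes."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

# wide-open toy eye: width 4, both vertical openings 2 -> EAR = (2+2)/(2*4) = 0.5
ear = eye_aspect_ratio((0, 0), (1, 1), (3, 1), (4, 0), (3, -1), (1, -1))
print(ear)
drowsy = ear < 0.2   # threshold is illustrative and tuned per deployment
```

A "sustained low EAR" alert, as described above, would additionally require the value to stay below the threshold for several consecutive frames.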
Pub Date: 2025-12-11. DOI: 10.1016/j.array.2025.100639
Safa Ghannam
Accurate forecasting of national CO2 emissions is critical for evidence-based climate policy and for meeting commitments such as Australia's 2050 net-zero target and the United Nations Sustainable Development Goal 13 (Climate Action). This study implements and evaluates thirteen forecasting approaches, including statistical models (ARIMA), machine learning methods (random forest, XGBoost, SVR), kernel methods (GPR), hybrid approaches (ELM, ISSA-ELM), deep learning networks (MLP, LSTM, GRU, RNN), and two ensemble models (stacking regressor and enhanced stacking regressor), using annual Australian data from 1982 to 2022 within a reproducible pipeline. Thirty random seeds ensured robustness for stochastic learners. Ensemble tree methods delivered the most accurate and stable predictions: Random Forest achieved mean cross-validation R2 ≈ 0.989 ± 0.003 and RMSE ≈ 0.018 ± 0.002, and generalized well to unseen 2016–2022 data (R2 ≈ 0.96; RMSE ≈ 2.43 Mt CO2). Pairwise significance testing confirmed that Random Forest and stacking significantly outperformed most individual learners (p < 0.01). SHAP analysis identified energy productivity, total GHG excluding land-use change, total energy consumption, and population as dominant drivers. Scenario experiments show that deterministic adjustments yield only modest 2050 reductions (−0.49% to −2.68%), with population shifts treated as exogenous sensitivities, underscoring the need for system-level action to achieve net-zero. Limitations include reliance on annual data and exclusion of policy and trade factors. Future work could extend this framework through causal inference and hybrid physics-informed machine learning. Building on global advances in emissions forecasting, this study contributes a localized, interpretable comparative framework tailored to Australia's emissions profile, addressing a notable gap in national-level forecasting research.
This transparent and reproducible approach provides evidence-based guidance for model selection and supports policy-relevant discussions on national CO2 forecasting.
Title: An explainable comparative study of statistical, machine learning, deep learning, and hybrid models for CO2 emissions forecasting in Australia. Published in Array, Volume 29, Article 100639.
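The RMSE and R2 figures above can be computed for any model directly from its predictions. A stdlib-only sketch of both metrics; the toy emissions values are invented:

```python
def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target variable."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [400.0, 410.0, 395.0, 420.0]   # toy annual emissions (Mt CO2)
y_pred = [402.0, 408.0, 396.0, 418.0]
print(round(rmse(y_true, y_pred), 3))
print(round(r_squared(y_true, y_pred), 3))
```

Reporting both matters: RMSE is scale-dependent (here Mt CO2), while R2 is dimensionless, which is why the paper quotes RMSE separately for cross-validation and the held-out 2016–2022 window.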
This study introduces a multi-semester framework for Vocational Software Development Education (VSDE) that incorporates Agile practices—specifically Scrum, Lean Startup, and Extreme Programming—into a Project-Based Learning (PBL) curriculum. Conducted at an Indonesian polytechnic with 51 students across six courses, the framework was evaluated using a mixed-methods case study that combined qualitative data (such as interviews and reflections) with quantitative indicators (including Net Promoter Score (NPS), Minimum Viable Product (MVP) delivery, and sprint completion rates). The findings demonstrated a systematic improvement, with the median NPS increasing from 20.0 in Sprint 1 to 56.9 in Sprint 5, and six out of seven teams showing progress (Friedman χ2(4) = 11.38, p = 0.023). Survey results indicated gains of 1.5–2.0 points in Agile competencies, while student reflections highlighted a greater sense of ownership and more adaptive delivery processes. Overall, the results indicate that the iterative adoption of Agile practices can enhance both product quality and professional readiness. This study presents a replicable model and provides practical guidance for curriculum design, stakeholder engagement, and scaling Agile pedagogy across diverse institutional contexts.
Title: Bridging the skills gap through Agile methodologies in Vocational Software Development Education. Authors: Umi Sa'adah, Umi Laili Yuhana, Siti Rochimah, Maulidan Bagus Afridian Rasyid. Pub Date: 2025-12-11. DOI: 10.1016/j.array.2025.100614. Published in Array, Volume 29, Article 100614.
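The Net Promoter Score used as a quantitative indicator above is conventionally the percentage of promoters (ratings 9–10 on a 0–10 scale) minus the percentage of detractors (ratings 0–6). A minimal sketch with invented ratings:

```python
def net_promoter_score(ratings):
    """NPS = %promoters (9-10) - %detractors (0-6) on a 0-10 scale.
    Passives (7-8) count toward the denominator but neither group."""
    n = len(ratings)
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / n

# 5 promoters, 3 passives, 2 detractors out of 10 respondents -> NPS = 30.0
print(net_promoter_score([10, 9, 9, 10, 9, 8, 7, 8, 5, 3]))
```

NPS therefore ranges from −100 to +100, which puts the reported rise from 20.0 to 56.9 across sprints in context.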
Pub Date: 2025-12-11. DOI: 10.1016/j.array.2025.100641
Mohammad Mehdi Tahmouresi, Javad Behnamian
Supply chains operating in dynamic, high-risk environments need mechanisms that preserve economic efficiency while strengthening resilience to disruptions. This study develops an integrated, data-driven framework for four-echelon supply chains (suppliers, manufacturers, distributors, retailers) that explicitly models route disruptions, transportation capacity limits, product perishability, and production and storage constraints for finished goods and raw materials, together with uncertain customer demand. By considering these factors simultaneously, the model shows that production and storage limits critically influence system performance and that demand uncertainty increases operational complexity and cost. To this end, the study formulates a bi-objective, linearized Mixed-Integer Programming (MIP) model that minimizes (i) overall operational cost and (ii) network inflexibility, measured by the number of critical points and allocation counts, thereby capturing trade-offs among efficiency, risk mitigation, and flexibility. To address practical Just-In-Time (JIT) shortcomings under uncertainty, the model allows multi-sourcing (distributors can source from multiple manufacturers; manufacturers can procure from multiple suppliers), enhancing robustness relative to conventional configurations. Uncertainty is treated with a data-driven Distributionally Robust Optimization (DRO) approach. The model is solved with exact CPLEX routines via the augmented ε-constraint method for moderate-sized instances, and with the Non-dominated Sorting Genetic Algorithm II (NSGA-II) for large instances. Performance is benchmarked against a Multi-Objective Particle Swarm Optimization (MOPSO) comparator. Computational experiments—conducted on datasets supplied by Khoshgovar Company (soft-drink production and distribution)—demonstrate that NSGA-II yields superior Pareto fronts (proximity, dispersion, objective attainment) while enabling tractable solutions at practical scales. 
Results further indicate that risk-averse strategies, although costlier in the short term, materially improve long-term resilience by lowering disruption impact and systemic exposure. The integrated framework advances theory by bridging resilience, JIT, and robust data-driven planning, and offers actionable managerial guidance for industries handling perishable goods (e.g., food and cold-chain pharmaceuticals), including strategies to balance cost and flexibility under uncertainty.
Title: Data-driven risk mitigation and flexibility enhancement in perishable supply chain networks. Published in Array, Volume 29, Article 100641.
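The Pareto fronts compared above consist of solutions that no other candidate dominates in both objectives. A minimal sketch of non-dominated filtering for the bi-objective (cost, inflexibility) setting, with invented candidate points:

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly
    better in at least one (both objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the solutions not dominated by any other candidate."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# (cost, inflexibility) candidates; (3, 4) is dominated by (2, 3)
candidates = [(1, 5), (2, 3), (3, 4), (4, 1)]
print(pareto_front(candidates))
```

NSGA-II applies this dominance test repeatedly (fast non-dominated sorting plus crowding distance) to evolve a whole front rather than a single compromise solution, which is what the proximity and dispersion benchmarks above measure.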
Pub Date: 2025-12-11. DOI: 10.1016/j.array.2025.100572
Mumrez Khan, Zhixiao Wang, Javed Ali Khan, Nek Dil Khan
The Apple App Store (AAS) allows users to provide feedback on applications, offering developers insights into improving software performance. Researchers have utilized this feedback for software evolution activities, including features, issues, and nonfunctional requirements. However, end-user feedback has not been explored to identify accessibility-related challenges. This study proposes an automated approach to detect and classify accessibility issues by analyzing end-user reviews in the AAS. We crawled 178,667 user reviews from 85 apps across 18 categories to represent a diverse sample. We developed a coding guideline to identify common accessibility issues, including Navigation and Interaction Problems (NAV), Input and Control Issues (INPUT), Compatibility with Assistive Technologies (CAT), Audio and Visual Accessibility Issues (AUDIOVISUAL), and UI Accessibility Issues (UI). We manually annotated reviews using coding guidelines and content analysis to create a labeled dataset for training and evaluating deep learning (DL) algorithms to detect accessibility issues in user comments and classify them into categories. The experiments showed that fine-tuned DL classifiers achieved high accuracy in detecting accessibility issues and classifying them into specific types. For binary classification, the CNN classifier achieved 93% precision, while LSTM, BiLSTM, GRU, and BiGRU achieved accuracies from 76% to 87%. In fine-grained classification, CNN performed best with 97% accuracy, followed by BiGRU and BiLSTM at 96%. The BiLSTM and LSTM models demonstrated strong performance, with accuracies of 96% and 95%. These results show the potential of automated methods to improve identification of accessibility challenges, helping developers address these issues effectively and enhance user experience.
Title: Exploring and identifying fine-grained accessibility issues in app store using fine-tuned deep learning. Published in Array, Volume 29, Article 100572.
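For orientation only, the five-category scheme above can be illustrated with a toy keyword matcher; the study itself uses fine-tuned deep learning classifiers, and these keyword lists are invented for illustration:

```python
# Toy keyword baseline illustrating the five accessibility categories from
# the coding guideline. The paper's approach is a trained DL classifier,
# not keyword matching; keywords here are hypothetical examples.
CATEGORY_KEYWORDS = {
    "NAV": ["navigate", "swipe", "gesture", "scroll"],
    "INPUT": ["keyboard", "typing", "button", "tap"],
    "CAT": ["voiceover", "screen reader", "talkback"],
    "AUDIOVISUAL": ["caption", "contrast", "volume", "font size"],
    "UI": ["layout", "overlap", "truncated", "tiny"],
}

def classify_review(text):
    """Return every accessibility category whose keywords appear in the review."""
    text = text.lower()
    hits = [cat for cat, words in CATEGORY_KEYWORDS.items()
            if any(w in text for w in words)]
    return hits or ["NONE"]

print(classify_review("VoiceOver skips the login button and I cannot tap it"))
```

A review can hit multiple categories at once, which is one reason the paper separates coarse binary detection from fine-grained classification.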
Pub Date: 2025-12-08. DOI: 10.1016/j.array.2025.100637
Ahmeed Yinusa, Misa Faezipour
Deep learning has significantly advanced automated medical imaging, particularly in lung cancer detection, yet vulnerability to adversarial manipulation continues to limit clinical reliability. This study investigates the impact of a 30% random-uniform label-flipping poisoning attack on Convolutional Neural Network (CNN) models trained using the IQ-OTH/NCCD dataset comprising 1190 CT images. A multi-strategy defense pipeline is proposed, integrating defensive distillation, Isolation Forest data sanitization, and a noise-tolerant loss function to enhance robustness against training data corruption. To ensure a valid evaluation framework, the Synthetic Minority Oversampling Technique (SMOTE) was applied only to the clean training subset before any poisoning was introduced, preventing the propagation of corrupted labels and establishing a balanced and uncontaminated foundation for teacher training. A high-accuracy teacher model trained on this clean SMOTE-balanced dataset produces temperature-scaled soft labels that guide the student model. The student is then trained on a sanitized dataset filtered to remove anomalous or inconsistent samples, using Symmetric Cross-Entropy loss to reduce sensitivity to mislabeled data. Experimental results show that the pipeline maintains strong performance, achieving 99% accuracy on clean data and 95 to 96% accuracy under poisoning, while preserving stable precision, recall, and confidence calibration across all classes. These findings demonstrate that the proposed strategy effectively mitigates label-flipping poisoning, offering a reproducible path toward secure and trustworthy AI systems for medical imaging applications.
{"title":"Enhancing the robustness of CNN-based lung cancer detection models against label-flipping poison attacks using defensive distillation","authors":"Ahmeed Yinusa , Misa Faezipour","doi":"10.1016/j.array.2025.100637","DOIUrl":"10.1016/j.array.2025.100637","url":null,"abstract":"<div><div>Deep learning has significantly advanced automated medical imaging, particularly in lung cancer detection, yet vulnerability to adversarial manipulation continues to limit clinical reliability. This study investigates the impact of a 30% random-uniform label-flipping poisoning attack on Convolutional Neural Network (CNN) models trained using the IQ-OTH/NCCD dataset comprising 1190 CT images. A multi-strategy defense pipeline is proposed, integrating defensive distillation, Isolation Forest data sanitization, and a noise-tolerant loss function to enhance robustness against training data corruption. To ensure a valid evaluation framework, Synthetic Minority Oversampling Technique (SMOTE) was applied only to the clean training subset before any poisoning was introduced, preventing the propagation of corrupted labels and establishing a balanced and uncontaminated foundation for teacher training. A high-accuracy teacher model trained on this clean SMOTE-balanced dataset produces temperature-scaled soft labels that guide the Student model. The Student is then trained on a sanitized dataset filtered to remove anomalous or inconsistent samples, using Symmetric Cross-Entropy loss to reduce sensitivity to mislabeled data. Experimental results show that the pipeline maintains strong performance, achieving 99% accuracy on clean data and 95 to 96% accuracy under poisoning, while preserving stable precision, recall, and confidence calibration across all classes. 
These findings demonstrate that the proposed strategy effectively mitigates label-flipping poisoning, offering a reproducible path toward secure and trustworthy AI systems for medical imaging applications.</div></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"29 ","pages":"Article 100637"},"PeriodicalIF":4.5,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145735330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
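The abstract above names two noise-robustness components: temperature-scaled soft labels for distillation and a Symmetric Cross-Entropy (SCE) loss. A minimal NumPy sketch of both is given below; the weights `alpha`, `beta`, the clamp value `a`, and the temperature default are illustrative choices, not the authors' configuration.

```python
import numpy as np

def symmetric_cross_entropy(probs, labels, alpha=0.1, beta=1.0, a=-4.0):
    """Symmetric Cross-Entropy: standard CE plus a reverse-CE term that
    bounds the penalty on potentially mislabeled samples.
    probs:  (n, k) predicted class probabilities
    labels: (n,)   integer class labels (possibly noisy)
    a:      finite value substituted for log(0) in the reverse term."""
    n, k = probs.shape
    one_hot = np.eye(k)[labels]
    eps = 1e-12
    ce = -np.mean(np.sum(one_hot * np.log(probs + eps), axis=1))
    # Reverse CE swaps the roles of prediction and target; log(0) on the
    # zero entries of the one-hot target is clamped to `a`.
    log_t = np.where(one_hot > 0, 0.0, a)
    rce = -np.mean(np.sum(probs * log_t, axis=1))
    return alpha * ce + beta * rce

def soften(logits, T=4.0):
    """Temperature-scaled softmax: higher T flattens the distribution,
    exposing the teacher's inter-class similarity structure."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)
```

In a distillation setup of this kind, the student would be fit against `soften(teacher_logits, T)` while `symmetric_cross_entropy` handles the (possibly poisoned) hard labels.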
In recent years, the art landscape has undergone a considerable transformation with the emergence of AI-powered generative art tools, challenging traditional notions of artistic authenticity and ownership. The exponential growth of generative artwork sharing on social media platforms has created an urgent need to protect artists' intellectual property from impersonation, forgery, and style appropriation. This study introduces an innovative, lightweight detection framework that efficiently distinguishes AI-generated art from human-created artwork by analyzing spatial domain features using tree-based ensembles. The study focuses on two prominent generative image architectures, StyleGAN2-ADA and Stable Diffusion, to explore the method's effectiveness across various classes of probabilistic deep generative models while incorporating JPEG compression considerations to reflect real-world social media conditions. The framework was evaluated across a diverse dataset of 10,000 images, achieving a detection accuracy of 94.43 % for StyleGAN2-ADA and 97.97 % for Stable Diffusion outputs on average across varying quality factors (QF). A key limitation observed is the lack of cross-architecture generalization: classifiers trained on one generative model do not reliably detect outputs from others, highlighting the need for architecture-agnostic detection strategies for real-world deployment. These results demonstrate performance comparable to or better than existing deep learning solutions while requiring significantly fewer computational resources and less training data. The proposed approach represents a significant step towards digital art authentication, offering a practical solution for real-time detection of AI-generated artwork in social media environments. Future work will focus on expanding the framework's capabilities to address emerging generative models and developing and integrating tools for automatic art authentication across various social media platforms.
{"title":"JPEG-compression agnostic identification of generative art using explainable spatial domain features","authors":"Vrinda Kohli , Janmey Shukla , Harish Sharma , Narendra Khatri","doi":"10.1016/j.array.2025.100635","DOIUrl":"10.1016/j.array.2025.100635","url":null,"abstract":"<div><div>In recent years, the art landscape has undergone a considerable transformation with the emergence of AI-powered generative art tools, challenging traditional notions of artistic authenticity and ownership. The exponential growth of generative artwork sharing on social media platforms has created an urgent need to protect artists' intellectual property from impersonation, forgery, and style appropriation. This study introduces an innovative, lightweight detection framework that efficiently distinguishes AI-generated art from human-created artwork by analyzing spatial domain features using tree-based ensembles. The study focuses on two prominent generative image architectures, StyleGAN2-ADA and Stable Diffusion, to explore the method's effectiveness across various classes of probabilistic deep generative models while incorporating JPEG compression considerations to reflect real-world social media conditions. The framework was evaluated across a diverse dataset of 10,000 images, achieving a detection accuracy of 94.43 % for StyleGAN2-ADA and 97.97 % for Stable Diffusion outputs on average across varying quality factors (QF). A key limitation observed is the lack of cross-architecture generalization: classifiers trained on one generative model do not reliably detect outputs from others, highlighting the need for architecture-agnostic detection strategies for real-world deployment. These results demonstrate performance comparable to or better than existing deep learning solutions while requiring significantly fewer computational resources and less training data. 
The proposed approach represents a significant step towards digital art authentication, offering a practical solution for real-time detection of AI-generated artwork in social media environments. Future work will focus on expanding the framework's capabilities to address emerging generative models and developing and integrating tools for automatic art authentication across various social media platforms.</div></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"29 ","pages":"Article 100635"},"PeriodicalIF":4.5,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145788384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
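The pipeline above rests on hand-crafted spatial-domain features fed to a tree ensemble. The toy extractor below illustrates what such features can look like; the specific statistics (intensity moments, gradient energy, Laplacian energy) are assumptions for illustration, not the paper's published feature set.

```python
import numpy as np

def spatial_features(img):
    """Toy spatial-domain feature vector for a grayscale image in [0, 1].
    Returns [mean, std, gradient-magnitude mean, gradient-magnitude std,
    mean absolute Laplacian]."""
    img = img.astype(float)
    gy, gx = np.gradient(img)
    grad_mag = np.hypot(gx, gy)
    # 4-neighbour Laplacian (circular padding via roll) highlights the
    # high-frequency residue that generators and JPEG re-compression
    # tend to alter relative to camera or scanner output.
    lap = (np.roll(img, 1, axis=0) + np.roll(img, -1, axis=0)
           + np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1) - 4 * img)
    return np.array([img.mean(), img.std(),
                     grad_mag.mean(), grad_mag.std(),
                     np.abs(lap).mean()])
```

A tree-based ensemble (e.g., a random forest or gradient-boosted trees) would then be trained on such vectors computed from JPEG-compressed images at several quality factors, which is what keeps the detector lightweight relative to an end-to-end CNN.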
Pub Date : 2025-12-08DOI: 10.1016/j.array.2025.100638
Shaik Farooq , M. Harshith , Aneeshsingh Bhatkhande , N. Kumaresan , B. Karthikeyan , A. Rammohan
The accurate and real-time prediction of state of charge (SoC) in a battery pack is crucial for the safe and efficient operation of electric vehicles. Traditional estimation methods often suffer from reduced accuracy under sensor errors, battery aging, and dynamic load conditions. This study presents a real-time adaptive neural network (ANN)-based SoC prediction model integrated within a digital twin (DT) framework, designed and validated using MATLAB/Simulink. The proposed algorithm continuously updates its parameters using real-time current, SoC, and voltage data of the battery pack, enabling adaptive learning under varying load and ambient conditions. Compared with traditional methods such as the Extended Kalman Filter and particle-filter-based estimators, the proposed algorithm reduces the prediction error by 18–22 % and shortens the response time by 30 %. The simulation results confirm a strong correlation between the predicted and actual SoC values (R = 0.9999), with a maximum deviation of ±1.5 %. The proposed algorithm demonstrates robust convergence, improved generalization through Bayesian regularization, and high stability during real-time adaptation. This adaptive DT-integrated ANN framework enhances BMS reliability, supports predictive maintenance, and provides a scalable and intelligent solution for next-generation electric mobility applications.
{"title":"Real-time adaptive neural network-based state of charge prediction of battery pack in a digital twin framework","authors":"Shaik Farooq , M. Harshith , Aneeshsingh Bhatkhande , N. Kumaresan , B. Karthikeyan , A. Rammohan","doi":"10.1016/j.array.2025.100638","DOIUrl":"10.1016/j.array.2025.100638","url":null,"abstract":"<div><div>The accurate and real-time prediction of state of charge (SoC) in a battery pack is crucial for the safe and efficient operation of electric vehicles. Traditional estimation methods often suffer from reduced accuracy under sensor errors, battery aging, and dynamic load conditions. This study presents a real-time adaptive neural network (ANN)-based SoC prediction model integrated within a digital twin (DT) framework, designed and validated using MATLAB/Simulink. The proposed algorithm continuously updates its parameters using real-time current, SoC, and voltage data of the battery pack, enabling adaptive learning under varying load and ambient conditions. Compared with traditional methods such as the Extended Kalman Filter and particle-filter-based estimators, the proposed algorithm reduces the prediction error by 18–22 % and shortens the response time by 30 %. The simulation results confirm a strong correlation between the predicted and actual SoC values (R = 0.9999), with a maximum deviation of ±1.5 %. The proposed algorithm demonstrates robust convergence, improved generalization through Bayesian regularization, and high stability during real-time adaptation. 
This adaptive DT-integrated ANN framework enhances BMS reliability, supports predictive maintenance, and provides a scalable and intelligent solution for next-generation electric mobility applications.</div></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"29 ","pages":"Article 100638"},"PeriodicalIF":4.5,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145735327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
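The core idea in the abstract above, an ANN whose weights are updated continuously from streaming measurements, can be sketched in a few lines. The network size, learning rate, and the synthetic linear SoC target below are illustrative assumptions; the authors' model is built and validated in MATLAB/Simulink and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

class OnlineSoCNet:
    """One-hidden-layer tanh network with per-sample SGD updates,
    mimicking online adaptation to streaming current/voltage data."""
    def __init__(self, n_in=2, n_hidden=8, lr=0.05):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, n_hidden)
        self.b2 = 0.0
        self.lr = lr

    def predict(self, x):
        self._h = np.tanh(x @ self.W1 + self.b1)  # cache for backprop
        return self._h @ self.W2 + self.b2

    def update(self, x, soc_true):
        """One SGD step on the squared error for a single sample;
        returns the absolute prediction error before the update."""
        err = self.predict(x) - soc_true
        dh = err * self.W2 * (1.0 - self._h ** 2)  # tanh derivative
        self.W2 -= self.lr * err * self._h
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(x, dh)
        self.b1 -= self.lr * dh
        return abs(err)
```

Streaming a synthetic target such as `soc = 0.5 + 0.3*v - 0.2*i` through `update` sample by sample shows the error shrinking as the network adapts, the behavior a DT framework would exploit to track drifting battery characteristics.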