White Blood Cell Classification: Convolutional Neural Network (CNN) and Vision Transformer (ViT) under Medical Microscope
Mohamad Abou Ali, F. Dornaika, Ignacio Arganda-Carreras
Deep learning (DL) has made significant advances in computer vision with the advent of vision transformers (ViTs). Unlike convolutional neural networks (CNNs), ViTs use self-attention to extract both local and global features from image data, and then apply residual connections before feeding these features into a fully connected multilayer perceptron (MLP) head. In hospitals, hematologists prepare peripheral blood smears (PBSs) and read them under a medical microscope to detect abnormalities in blood counts such as leukemia. However, this task is time-consuming and prone to human error. This study investigated transfer learning of the Google ViT and ImageNet-pretrained CNNs to automate the reading of PBSs. The study used two online PBS datasets, PBC and BCCD, and transformed them into balanced datasets to investigate the influence of data amount and noise immunity on both network families. The PBC results showed that the Google ViT is an excellent DL solution under data scarcity. The BCCD results showed that the Google ViT is superior to the ImageNet CNNs in dealing with unclean, noisy image data, because it extracts both global and local features and uses residual connections, despite the additional time and computational overhead.
{"title":"White Blood Cell Classification: Convolutional Neural Network (CNN) and Vision Transformer (ViT) under Medical Microscope","authors":"Mohamad Abou Ali, F. Dornaika, Ignacio Arganda-Carreras","doi":"10.3390/a16110525","DOIUrl":"https://doi.org/10.3390/a16110525","url":null,"abstract":"Deep learning (DL) has made significant advances in computer vision with the advent of vision transformers (ViTs). Unlike convolutional neural networks (CNNs), ViTs use self-attention to extract both local and global features from image data, and then apply residual connections to feed these features directly into a fully networked multilayer perceptron head. In hospitals, hematologists prepare peripheral blood smears (PBSs) and read them under a medical microscope to detect abnormalities in blood counts such as leukemia. However, this task is time-consuming and prone to human error. This study investigated the transfer learning process of the Google ViT and ImageNet CNNs to automate the reading of PBSs. The study used two online PBS datasets, PBC and BCCD, and transferred them into balanced datasets to investigate the influence of data amount and noise immunity on both neural networks. The PBC results showed that the Google ViT is an excellent DL neural solution for data scarcity. The BCCD results showed that the Google ViT is superior to ImageNet CNNs in dealing with unclean, noisy image data because it is able to extract both global and local features and use residual connections, despite the additional time and computational overhead.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"43 11","pages":""},"PeriodicalIF":2.3,"publicationDate":"2023-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139272793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Multiorgan Segmentation in Pelvic Region with Convolutional Neural Networks on 0.35 T MR-Linac Images
Emmanouil Koutoulakis, Louis Marage, Emmanouil Markodimitrakis, L. Aubignac, Catherine Jenny, I. Bessières, Alain Lalande
MR-Linac is a recent device combining a linear accelerator with an MRI scanner. The improved soft-tissue contrast of MR images is used for optimal delineation of tumors or organs at risk (OARs) and precise treatment delivery. Automatic segmentation of OARs can alleviate this time-consuming process for radiation oncologists and improve the accuracy of radiation delivery by providing faster, more consistent, and more accurate delineation of target structures and OARs, while also reducing inter-observer variability. In this work, state-of-the-art deep learning techniques were evaluated based on 2D and 2.5D training strategies to develop a comprehensive tool for the accurate segmentation of pelvic OARs dedicated to the 0.35 T MR-Linac. In total, 103 cases with 0.35 T MR images of the pelvic region were investigated. Experts contoured the bladder, rectum, and femoral heads as OARs and the prostate as the target volume. For training the neural networks, 85 patients were randomly selected, and 18 were used for testing. Multiple U-Net-based architectures were considered, and the best model was compared using both 2D and 2.5D training strategies. The models were evaluated using two metrics: the Dice similarity coefficient (DSC) and the Hausdorff distance (HD). In the 2D training strategy, Residual Attention U-Net (ResAttU-Net) had the highest scores among the deep neural networks considered. Due to the additional contextual information, the configured 2.5D ResAttU-Net performed better: the overall DSC values were 0.88 ± 0.09 and 0.86 ± 0.10, and the overall HD values were 1.78 ± 3.02 mm and 5.90 ± 7.58 mm, for the 2.5D and 2D ResAttU-Net, respectively. The 2.5D ResAttU-Net thus provides accurate segmentation of OARs without increasing the computational cost. The developed end-to-end pipeline will be merged with the treatment planning system for in-time automatic segmentation.
{"title":"Automatic Multiorgan Segmentation in Pelvic Region with Convolutional Neural Networks on 0.35 T MR-Linac Images","authors":"Emmanouil Koutoulakis, Louis Marage, Emmanouil Markodimitrakis, L. Aubignac, Catherine Jenny, I. Bessières, Alain Lalande","doi":"10.3390/a16110521","DOIUrl":"https://doi.org/10.3390/a16110521","url":null,"abstract":"MR-Linac is a recent device combining a linear accelerator with an MRI scanner. The improved soft tissue contrast of MR images is used for optimum delineation of tumors or organs at risk (OARs) and precise treatment delivery. Automatic segmentation of OARs can contribute to alleviating the time-consuming process for radiation oncologists and improving the accuracy of radiation delivery by providing faster, more consistent, and more accurate delineation of target structures and organs at risk. It can also help reduce inter-observer variability and improve the consistency of contouring while reducing the time required for treatment planning. In this work, state-of-the-art deep learning techniques were evaluated based on 2D and 2.5D training strategies to develop a comprehensive tool for the accurate segmentation of pelvic OARs dedicated to 0.35 T MR-Linac. In total, 103 cases with 0.35 T MR images of the pelvic region were investigated. Experts considered and contoured the bladder, rectum, and femoral heads as OARs and the prostate as the target volume. For the training of the neural network, 85 patients were randomly selected, and 18 were used for testing. Multiple U-Net-based architectures were considered, and the best model was compared using both 2D and 2.5D training strategies. The evaluation of the models was performed based on two metrics: the Dice similarity coefficient (DSC) and the Hausdorff distance (HD). In the 2D training strategy, Residual Attention U-Net (ResAttU-Net) had the highest scores among the other deep neural networks. Due to the additional contextual information, the configured 2.5D ResAttU-Net performed better. The overall DSC were 0.88 ± 0.09 and 0.86 ± 0.10, and the overall HD was 1.78 ± 3.02 mm and 5.90 ± 7.58 mm for 2.5D and 2D ResAttU-Net, respectively. The 2.5D ResAttU-Net provides accurate segmentation of OARs without affecting the computational cost. The developed end-to-end pipeline will be merged with the treatment planning system for in-time automatic segmentation.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"3 2","pages":""},"PeriodicalIF":2.3,"publicationDate":"2023-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139274570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improved Object Detection Method Utilizing YOLOv7-Tiny for Unmanned Aerial Vehicle Photographic Imagery
Linhua Zhang, Ning Xiong, Xinghao Pan, Xiaodong Yue, Peng Wu, Caiping Guo
In unmanned aerial vehicle photographs, object detection algorithms encounter challenges in improving both speed and accuracy for objects of different sizes, primarily due to complex backgrounds and small objects. This study introduces the PDWT-YOLO algorithm, based on the YOLOv7-tiny model, to improve the effectiveness of object detection across all object sizes. The proposed method enhances the detection of small objects by incorporating a dedicated small-object detection layer, and reduces the conflict between classification and regression tasks by replacing the YOLOv7-tiny model’s detection head (IDetect) with a decoupled head. Moreover, network convergence is accelerated and regression accuracy is improved by replacing the Complete Intersection over Union (CIoU) loss function with a Wise Intersection over Union (WIoU) focusing mechanism. To assess the proposed model’s effectiveness, it was trained and tested on the VisDrone-2019 dataset, which comprises images captured by various drones across diverse scenarios, weather conditions, and lighting conditions. The experiments show that mAP@0.5:0.95 and mAP@0.5 increased by 5% and 6.7%, respectively, at an acceptable running speed compared with the original YOLOv7-tiny model. Furthermore, the method also shows improvements on other datasets, confirming that PDWT-YOLO is effective for multiscale object detection.
{"title":"Improved Object Detection Method Utilizing YOLOv7-Tiny for Unmanned Aerial Vehicle Photographic Imagery","authors":"Linhua Zhang, Ning Xiong, Xinghao Pan, Xiaodong Yue, Peng Wu, Caiping Guo","doi":"10.3390/a16110520","DOIUrl":"https://doi.org/10.3390/a16110520","url":null,"abstract":"In unmanned aerial vehicle photographs, object detection algorithms encounter challenges in enhancing both speed and accuracy for objects of different sizes, primarily due to complex backgrounds and small objects. This study introduces the PDWT-YOLO algorithm, based on the YOLOv7-tiny model, to improve the effectiveness of object detection across all sizes. The proposed method enhances the detection of small objects by incorporating a dedicated small-object detection layer, while reducing the conflict between classification and regression tasks through the replacement of the YOLOv7-tiny model’s detection head (IDetect) with a decoupled head. Moreover, network convergence is accelerated, and regression accuracy is improved by replacing the Complete Intersection over Union (CIoU) loss function with a Wise Intersection over Union (WIoU) focusing mechanism in the loss function. To assess the proposed model’s effectiveness, it was trained and tested on the VisDrone-2019 dataset comprising images captured by various drones across diverse scenarios, weather conditions, and lighting conditions. The experiments show that mAP@0.5:0.95 and mAP@0.5 increased by 5% and 6.7%, respectively, with acceptable running speed compared with the original YOLOv7-tiny model. Furthermore, this method shows improvement over other datasets, confirming that PDWT-YOLO is effective for multiscale object detection.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"5 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134992028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Two Kadane Algorithms for the Maximum Sum Subarray Problem
Joseph B. Kadane
The maximum sum subarray problem is to find a contiguous subarray with the largest sum. The history of algorithms to address this problem is recounted, culminating in what is known as Kadane’s algorithm. However, that algorithm is not the algorithm Kadane intended. Nonetheless, the algorithm known as Kadane’s has found many uses, some of which are recounted here. The algorithm Kadane intended is reported here, and compared to the algorithm attributed to Kadane. They are both linear in time, employ just a few words of memory, and use a dynamic programming structure. The results proved here show that these two algorithms differ only in the case of an input consisting of only negative numbers. In that case, the algorithm Kadane intended is more informative than the algorithm attributed to him.
{"title":"Two Kadane Algorithms for the Maximum Sum Subarray Problem","authors":"Joseph B. Kadane","doi":"10.3390/a16110519","DOIUrl":"https://doi.org/10.3390/a16110519","url":null,"abstract":"The maximum sum subarray problem is to find a contiguous subarray with the largest sum. The history of algorithms to address this problem is recounted, culminating in what is known as Kadane’s algorithm. However, that algorithm is not the algorithm Kadane intended. Nonetheless, the algorithm known as Kadane’s has found many uses, some of which are recounted here. The algorithm Kadane intended is reported here, and compared to the algorithm attributed to Kadane. They are both linear in time, employ just a few words of memory, and use a dynamic programming structure. The results proved here show that these two algorithms differ only in the case of an input consisting of only negative numbers. In that case, the algorithm Kadane intended is more informative than the algorithm attributed to him.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"21 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134993745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance and Applicability of Post-Quantum Digital Signature Algorithms in Resource-Constrained Environments
Marin Vidaković, Kruno Miličević
The continuous development of quantum computing necessitates the development of quantum-resistant cryptographic algorithms. In response to this demand, the National Institute of Standards and Technology selected the standardized algorithms Crystals-Dilithium, Falcon, and Sphincs+ for digital signatures. This paper provides a comparative evaluation of these algorithms across key metrics. The results indicate varying strengths and weaknesses for each algorithm: Dilithium offers advantages in low-power scenarios, Falcon excels in signature verification speed, and Sphincs+ provides robust security at the cost of computational efficiency. These findings underscore the importance of context-specific deployments in resource-constrained technological applications such as IoT, smart cards, blockchain, and vehicle-to-vehicle communication.
{"title":"Performance and Applicability of Post-Quantum Digital Signature Algorithms in Resource-Constrained Environments","authors":"Marin Vidaković, Kruno Miličević","doi":"10.3390/a16110518","DOIUrl":"https://doi.org/10.3390/a16110518","url":null,"abstract":"The continuous development of quantum computing necessitates the development of quantum-resistant cryptographic algorithms. In response to this demand, the National Institute of Standards and Technology selected standardized algorithms including Crystals-Dilithium, Falcon, and Sphincs+ for digital signatures. This paper provides a comparative evaluation of these algorithms across key metrics. The results indicate varying strengths and weaknesses for each algorithm, underscoring the importance of context-specific deployments. Our findings indicate that Dilithium offers advantages in low-power scenarios, Falcon excels in signature verification speed, and Sphincs+ provides robust security at the cost of computational efficiency. These results underscore the importance of context-specific deployments in specific and resource-constrained technological applications, like IoT, smart cards, blockchain, and vehicle-to-vehicle communication.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"33 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136346950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparison of Machine Learning Classifiers for the Detection of Breast Cancer in an Electrical Impedance Tomography Setup
Jöran Rixen, Nico Blass, Simon Lyra, Steffen Leonhardt
Breast cancer is the leading cause of cancer-related death among women. Early detection is crucial, as it substantially increases the survival rate. Although classical X-ray mammography is an established screening technique, many eligible women avoid it due to concerns about pain from breast compression. Electrical Impedance Tomography (EIT) is a technique that aims to visualize the conductivity distribution within the human body. As cancer has a greater conductivity than the surrounding fatty tissue, it provides contrast for image reconstruction. However, the interpretation of EIT images is still hard due to their low spatial resolution; moreover, EIT is a highly non-linear inverse problem and tends to produce reconstruction artifacts, which can be misinterpreted as, e.g., tumors. To aid in the interpretation of breast cancer EIT images, in this paper we compare three different classification models for the detection of breast cancer. We found that random forests and support vector machines performed best for this task.
{"title":"Comparison of Machine Learning Classifiers for the Detection of Breast Cancer in an Electrical Impedance Tomography Setup","authors":"Jöran Rixen, Nico Blass, Simon Lyra, Steffen Leonhardt","doi":"10.3390/a16110517","DOIUrl":"https://doi.org/10.3390/a16110517","url":null,"abstract":"Breast cancer is the leading cause of cancer-related death among women. Early prediction is crucial as it severely increases the survival rate. Although classical X-ray mammography is an established technique for screening, many eligible women do not consider this due to concerns about pain from breast compression. Electrical Impedance Tomography (EIT) is a technique that aims to visualize the conductivity distribution within the human body. As cancer has a greater conductivity than surrounding fatty tissue, it provides a contrast for image reconstruction. However, the interpretation of EIT images is still hard, due to the low spatial resolution. In this paper, we investigated three different classification models for the detection of breast cancer. This is important as EIT is a highly non-linear inverse problem and tends to produce reconstruction artifacts, which can be misinterpreted as, e.g., tumors. To aid in the interpretation of breast cancer EIT images, we compare three different classification models for breast cancer. We found that random forests and support vector machines performed best for this task.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"37 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136346935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using Graph Neural Networks for Social Recommendations
Dharahas Tallapally, John Wang, Katerina Potika, Magdalini Eirinaki
Recommender systems have revolutionized the way users discover and engage with content. Moving beyond the collaborative filtering approach, most modern recommender systems leverage additional sources of information, such as context and social network data. Such data can be modeled using graphs, and recent advances in Graph Neural Networks (GNNs) have led to the prominence of a new family of graph-based recommender system algorithms. In this work, we propose the RelationalNet algorithm, which models not only user–item and user–user relationships but also item–item relationships with graphs, and uses them as input to the recommendation process. The rationale for utilizing item–item interactions is to enrich the item embeddings by leveraging the similarities between items. By using GNNs, RelationalNet incorporates social influence and similar-item influence into the recommendation process and captures more accurate user interests, especially where traditional methods fall short due to data sparsity. Such models improve the accuracy and effectiveness of recommendation systems by leveraging social connections and item interactions. Results demonstrate that RelationalNet outperforms current state-of-the-art social recommendation algorithms.
{"title":"Using Graph Neural Networks for Social Recommendations","authors":"Dharahas Tallapally, John Wang, Katerina Potika, Magdalini Eirinaki","doi":"10.3390/a16110515","DOIUrl":"https://doi.org/10.3390/a16110515","url":null,"abstract":"Recommender systems have revolutionized the way users discover and engage with content. Moving beyond the collaborative filtering approach, most modern recommender systems leverage additional sources of information, such as context and social network data. Such data can be modeled using graphs, and the recent advances in Graph Neural Networks have led to the prominence of a new family of graph-based recommender system algorithms. In this work, we propose the RelationalNet algorithm, which not only models user–item, and user–user relationships but also item–item relationships with graphs and uses them as input to the recommendation process. The rationale for utilizing item–item interactions is to enrich the item embeddings by leveraging the similarities between items. By using Graph Neural Networks (GNNs), RelationalNet incorporates social influence and similar item influence into the recommendation process and captures more accurate user interests, especially when traditional methods fall short due to data sparsity. Such models improve the accuracy and effectiveness of recommendation systems by leveraging social connections and item interactions. Results demonstrate that RelationalNet outperforms current state-of-the-art social recommendation algorithms.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":" 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135188167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on a Classification Method for Strip Steel Surface Defects Based on Knowledge Distillation and a Self-Adaptive Residual Shrinkage Network
Xinbo Huang, Zhiwei Song, Chao Ji, Ye Zhang, Luya Yang
Different types of surface defects occur during the production of strip steel, and to ensure production quality it is essential to classify these defects. Our research indicates two main problems in existing strip steel surface defect classification methods: (1) they cannot handle the unbalanced, few-shot data encountered in practice, and (2) they cannot meet the requirement of online real-time classification. To solve these problems, a relational knowledge distillation self-adaptive residual shrinkage network (RKD-SARSN) is presented in this work. First, a data enhancement strategy based on CycleGAN defective-sample migration is designed. Second, the self-adaptive residual shrinkage network (SARSN) serves as the backbone network for feature extraction, and an adaptive loss function based on accuracy and geometric mean (Gmean) is proposed to address the problem of unbalanced samples. Finally, a relational knowledge distillation model (RKD) is proposed, and a GUI operation interface is designed by combining image processing technology: SARSN is used as the teacher model, its generalization performance is transferred to the lightweight ResNet34 network, and the latter is conveniently deployed as the student model. The results show that the proposed method improves the deployment efficiency of the model and ensures the real-time performance of the classification algorithm. It is superior to other mainstream algorithms for the classification of fine-grained images with unbalanced data.
{"title":"Research on a Classification Method for Strip Steel Surface Defects Based on Knowledge Distillation and a Self-Adaptive Residual Shrinkage Network","authors":"Xinbo Huang, Zhiwei Song, Chao Ji, Ye Zhang, Luya Yang","doi":"10.3390/a16110516","DOIUrl":"https://doi.org/10.3390/a16110516","url":null,"abstract":"Different types of surface defects will occur during the production of strip steel. To ensure production quality, it is essential to classify these defects. Our research indicates that two main problems exist in the existing strip steel surface defect classification methods: (1) they cannot solve the problem of unbalanced data using few-shot in reality, (2) they cannot meet the requirement of online real-time classification. To solve the aforementioned problems, a relational knowledge distillation self-adaptive residual shrinkage network (RKD-SARSN) is presented in this work. First, the data enhancement strategy of Cycle GAN defective sample migration is designed. Second, the self-adaptive residual shrinkage network (SARSN) is intended as the backbone network for feature extraction. An adaptive loss function based on accuracy and geometric mean (Gmean) is proposed to solve the problem of unbalanced samples. Finally, a relational knowledge distillation model (RKD) is proposed, and the functions of GUI operation interface encapsulation are designed by combining image processing technology. SARSN is used as a teacher model, its generalization performance is transferred to the lightweight network ResNet34, and it is conveniently deployed as a student model. The results show that the proposed method can improve the deployment efficiency of the model and ensure the real-time performance of the classification algorithms. It is superior to other mainstream algorithms for fine-grained images with unbalanced data classification.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"2 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135186031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Trustworthy Digital Representations of Analog Information—An Application-Guided Analysis of a Fundamental Theoretical Problem in Digital Twinning
Holger Boche, Yannik N. Böck, Ullrich J. Mönich, Frank H. P. Fitzek
This article compares two methods of algorithmically processing bandlimited time-continuous signals in light of the general problem of finding “suitable” representations of analog information on digital hardware. We argue that this problem, albeit abstract, is fundamental in digital twinning, a signal-processing paradigm the upcoming 6G communication-technology standard relies on heavily. Using computable analysis, we formalize a general framework of machine-readable descriptions for representing analytic objects on Turing machines. Subsequently, we apply this framework to sampling and interpolation theory, providing a thoroughly formalized method for digitally processing the information carried by bandlimited analog signals. We investigate discrete-time descriptions, which form the implicit quasi-standard in digital signal processing, and establish continuous-time descriptions that take the signal’s continuous-time behavior into account. Motivated by an exemplary application of digital twinning, we analyze a textbook model of digital communication systems accordingly. We show that technologically fundamental properties, such as a signal’s (Banach-space) norm, can be computed from continuous-time, but not from discrete-time descriptions of the signal. Given the high trustworthiness requirements within 6G, e.g., employed software must satisfy assessment criteria in a provable manner, we conclude that the problem of “trustworthy” digital representations of analog information is indeed essential to near-future information technology.
{"title":"Trustworthy Digital Representations of Analog Information—An Application-Guided Analysis of a Fundamental Theoretical Problem in Digital Twinning","authors":"Holger Boche, Yannik N. Böck, Ullrich J. Mönich, Frank H. P. Fitzek","doi":"10.3390/a16110514","DOIUrl":"https://doi.org/10.3390/a16110514","url":null,"abstract":"This article compares two methods of algorithmically processing bandlimited time-continuous signals in light of the general problem of finding “suitable” representations of analog information on digital hardware. Albeit abstract, we argue that this problem is fundamental in digital twinning, a signal-processing paradigm the upcoming 6G communication-technology standard relies on heavily. Using computable analysis, we formalize a general framework of machine-readable descriptions for representing analytic objects on Turing machines. Subsequently, we apply this framework to sampling and interpolation theory, providing a thoroughly formalized method for digitally processing the information carried by bandlimited analog signals. We investigate discrete-time descriptions, which form the implicit quasi-standard in digital signal processing, and establish continuous-time descriptions that take the signal’s continuous-time behavior into account. Motivated by an exemplary application of digital twinning, we analyze a textbook model of digital communication systems accordingly. We show that technologically fundamental properties, such as a signal’s (Banach-space) norm, can be computed from continuous-time, but not from discrete-time descriptions of the signal. Given the high trustworthiness requirements within 6G, e.g., employed software must satisfy assessment criteria in a provable manner, we conclude that the problem of “trustworthy” digital representations of analog information is indeed essential to near-future information technology.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":" 22","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135244385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A System to Support Readers in Automatically Acquiring Complete Summarized Information on an Event from Different Sources
Pietro Dell’Oglio, Alessandro Bondielli, Francesco Marcelloni
Today, most newspapers utilize social media to disseminate news. On the one hand, this results in an overload of related articles for social media users. On the other hand, since social media tends to form echo chambers around their users, different opinions and information may be hidden. Enabling users to access different information (possibly outside of their echo chambers, without the burden of reading entire articles, often containing redundant information) may be a step forward in allowing them to form their own opinions. To address this challenge, we propose a system that integrates Transformer neural models and text summarization models along with decision rules. Given a reference article already read by the user, our system first collects articles related to the same topic from a configurable number of different sources. Then, it identifies and summarizes the information that differs from the reference article and outputs the summary to the user. The core of the system is the sentence classification algorithm, which classifies sentences in the collected articles into three classes based on similarity with the reference article: sentences classified as dissimilar are summarized by using a pre-trained abstractive summarization model. We evaluated the proposed system in two steps. First, we assessed its effectiveness in identifying content differences between the reference article and the related articles by using human judgments obtained through crowdsourcing as ground truth. We obtained an average F1 score of 0.772 against average F1 scores of 0.797 and 0.676 achieved by two state-of-the-art approaches based, respectively, on model tuning and prompt tuning, which require an appropriate tuning phase and, therefore, greater computational effort. Second, we asked a sample of people to evaluate how well the summary generated by the system represents the information that is not present in the article read by the user. The results are extremely encouraging. Finally, we present a use case.
{"title":"A System to Support Readers in Automatically Acquiring Complete Summarized Information on an Event from Different Sources","authors":"Pietro Dell’Oglio, Alessandro Bondielli, Francesco Marcelloni","doi":"10.3390/a16110513","DOIUrl":"https://doi.org/10.3390/a16110513","url":null,"abstract":"Today, most newspapers utilize social media to disseminate news. On the one hand, this results in an overload of related articles for social media users. On the other hand, since social media tends to form echo chambers around their users, different opinions and information may be hidden. Enabling users to access different information (possibly outside of their echo chambers, without the burden of reading entire articles, often containing redundant information) may be a step forward in allowing them to form their own opinions. To address this challenge, we propose a system that integrates Transformer neural models and text summarization models along with decision rules. Given a reference article already read by the user, our system first collects articles related to the same topic from a configurable number of different sources. Then, it identifies and summarizes the information that differs from the reference article and outputs the summary to the user. The core of the system is the sentence classification algorithm, which classifies sentences in the collected articles into three classes based on similarity with the reference article: sentences classified as dissimilar are summarized by using a pre-trained abstractive summarization model. We evaluated the proposed system in two steps. First, we assessed its effectiveness in identifying content differences between the reference article and the related articles by using human judgments obtained through crowdsourcing as ground truth. We obtained an average F1 score of 0.772 against average F1 scores of 0.797 and 0.676 achieved by two state-of-the-art approaches based, respectively, on model tuning and prompt tuning, which require an appropriate tuning phase and, therefore, greater computational effort. Second, we asked a sample of people to evaluate how well the summary generated by the system represents the information that is not present in the article read by the user. The results are extremely encouraging. Finally, we present a use case.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"106 s415","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135342397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}