
Expert Systems with Applications: Latest Articles

Research on the generation and evaluation of bridge defect datasets for underwater environments utilizing CycleGAN networks
IF 7.5 | Zone 1 (Computer Science) | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-11-06 | DOI: 10.1016/j.eswa.2024.125576
Surface cracks in underwater structures severely compromise structural reliability and reduce strength, so timely monitoring of these cracks is important. Deep learning algorithms have recently been used for large-scale data analysis and prediction. However, deep supervised learning requires training on large datasets, which is time-consuming and difficult to apply to underwater structures. To address these issues, this study proposes an improved cycle-consistent generative adversarial algorithm, built on an enhanced CycleGAN, for the timely detection of surface cracks in underwater structures. The proposed algorithm uses image processing techniques, including DeblurGAN and the dark channel prior, to obtain a high-quality dataset from underwater structures. It also introduces a novel cross-domain VGG-cosine similarity assessment to evaluate precisely how well the algorithm retains crack information. Performance is evaluated both qualitatively and quantitatively: the qualitative results are obtained directly from the visual results generated by the algorithm, whereas the quantitative results are obtained from metrics including PSNR, SSIM, and FID. Experimental results indicate that the proposed algorithm outperforms the original CycleGAN, decreasing FID by 20% and increasing PSNR and SSIM by 2.37% and 3.33%, respectively. Both the quantitative and the qualitative results show significant advantages in generating surface crack images.
Citations: 0
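Two of the measures named in the CycleGAN abstract above, PSNR and cosine similarity between feature vectors, have standard definitions that can be sketched in a few lines. The following is a minimal illustration only: the function names are hypothetical, the images are flattened lists of pixel values, and the plain feature vectors stand in for VGG activations. How the paper actually builds its cross-domain VGG-cosine assessment is not described in the abstract.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (assumed stand-ins for VGG features)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

A higher PSNR means the generated image is closer to the reference pixel by pixel, while a cosine similarity near 1 means the two feature vectors point in nearly the same direction regardless of their magnitude.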
Hybrid genetic algorithm with Wiener process for multi-scale colored balanced traveling salesman problem
IF 7.5 | Zone 1 (Computer Science) | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-11-05 | DOI: 10.1016/j.eswa.2024.125610
The colored traveling salesman problem (CTSP) can be applied to multi-machine engineering systems (MES) in industry. The colored balanced traveling salesman problem (CBTSP) is a variant of CTSP that can model optimization problems with partially overlapping workspaces, such as planning optimization (for example, process planning, assembly planning, and production scheduling). Traditional algorithms have been used to solve CBTSP, but they are limited in both solution quality and solving speed, and the problem scale they can handle is restricted. Moreover, they lack theoretical support from mathematical physics. To address these issues, this paper proposes a novel hybrid genetic algorithm (NHGA) based on the Wiener process (Itô process) and generating neighborhood solutions (GNS) to solve the multi-scale CBTSP. NHGA first uses dual-chromosome coding to construct CBTSP solutions, which are then updated by the crossover operator, the mutation operator, and GNS. The crossover length of the crossover operator and the number of cities used by the mutation operator are controlled by an activity intensity based on the Itô process, while the city-keeping probability of GNS can be learned or obtained from the Wiener process. Experiments show that NHGA improves on state-of-the-art algorithms for multi-scale CBTSP in terms of solution quality.
Citations: 0
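The NHGA abstract above ties operator parameters to a Wiener process. As a rough sketch of the underlying idea, a standard Wiener process can be simulated by summing independent Gaussian increments with variance dt. The mapping from the process value to a crossover length shown here is purely hypothetical and is not taken from the paper; it only illustrates how a stochastic "activity intensity" could drive an operator parameter.

```python
import math
import random


def wiener_path(n_steps, dt, rng):
    """Return W(0), W(dt), ..., W(n_steps * dt) for a standard Wiener process."""
    path = [0.0]
    for _ in range(n_steps):
        # Each increment is Gaussian with mean 0 and standard deviation sqrt(dt).
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def crossover_length(w_value, n_cities):
    """Hypothetical mapping (an assumption, not the paper's rule):
    a larger |W| gives a longer segment, clamped to [1, n_cities - 1]."""
    return max(1, min(n_cities - 1, 1 + int(abs(w_value))))


rng = random.Random(0)  # seeded so the sketch is reproducible
path = wiener_path(100, 0.01, rng)
lengths = [crossover_length(w, 20) for w in path]
```

Passing the random generator in explicitly keeps the simulation reproducible, which matters when comparing runs of a stochastic search.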
GCENet: A geometric correspondence estimation network for tracking and loop detection in visual–inertial SLAM
IF 7.5 | Zone 1 (Computer Science) | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-11-04 | DOI: 10.1016/j.eswa.2024.125659
Establishing robust and effective data correlation has been one of the core problems in vision-based SLAM (Simultaneous Localization and Mapping). In this paper, we propose a geometric correspondence estimation network, GCENet, tailored for visual tracking and loop detection in visual–inertial SLAM. GCENet considers both local and global correlation in frames, enabling deep feature matching in scenarios involving noticeable displacement. Building upon this, we introduce a tightly coupled visual–inertial state estimation system. To address challenges in extreme environments, such as strong illumination and weak texture, where manual feature matching tends to fail, a compensatory deep optical flow tracker is incorporated into our system. In such cases, our approach uses GCENet for dense optical flow tracking, replacing manual pipelines for visual tracking. Furthermore, a deep loop detector based on GCENet is constructed, which uses the estimated flow to represent scene similarity. Spatial consistency discrimination on candidate loops is conducted with GCENet to establish long-term data association, effectively suppressing false negatives and false positives in loop closure. Dedicated experiments are conducted on the EuRoC drone, TUM-4Seasons, and private robot datasets to evaluate the proposed method. The results demonstrate that our system exhibits superior robustness and accuracy in extreme environments compared to state-of-the-art methods.
Citations: 0
CE-DCVSI: Multimodal relational extraction based on collaborative enhancement of dual-channel visual semantic information
IF 7.5 | Zone 1 (Computer Science) | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-11-04 | DOI: 10.1016/j.eswa.2024.125608
In multimodal relation extraction (MRE), the visual information implied by images usually contains details that are difficult to describe in text. Integrating textual and visual information is the mainstream way to enhance the understanding and extraction of relations between entities. However, existing MRE methods neglect the semantic gap caused by data heterogeneity. In addition, some approaches map the relations between target objects in image scene graphs to text, but the large number of invalid visual relations introduces noise. To alleviate these problems, we propose a novel multimodal relation extraction method based on cooperative enhancement of dual-channel visual semantic information (CE-DCVSI). Specifically, to mitigate the semantic gap between modalities, we realize fine-grained semantic alignment between entities and target objects through multimodal heterogeneous graphs, aligning feature representations of different modalities into the same semantic space using the heterogeneous graph Transformer, thus promoting the consistency and accuracy of feature representations. To eliminate the effect of useless visual relations, we perform multi-scale feature fusion between different levels of visual information and textual representations to increase the complementarity between features, improving the comprehensiveness and robustness of the multimodal representation. Finally, we use the information bottleneck principle to filter invalid information out of the multimodal representation and mitigate the negative impact of irrelevant noise. Experiments demonstrate that the method achieves an F1 score of 86.08% on the publicly available MRE dataset, outperforming the other baseline methods.
Citations: 0
Identification of gene regulatory networks associated with breast cancer patient survival using an interpretable deep neural network model
IF 7.5 | Zone 1 (Computer Science) | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-11-04 | DOI: 10.1016/j.eswa.2024.125632
Artificial neural networks have recently gained significant attention in biomedical research. However, their utility in survival analysis still faces many challenges. In addition to designing models for high accuracy, it is essential to optimize models that provide biologically meaningful insights. With these considerations in mind, we developed a deep neural network model, MaskedNet, to identify genes and pathways whose expression at the time of diagnosis is associated with overall survival. MaskedNet was trained using TCGA breast cancer transcriptome and clinical data, and the model’s final output was the predicted logarithm of the hazard ratio for death. The trained model was interpreted using SHapley Additive exPlanations (SHAP), a technique grounded in robust mathematical principles that assigns importance scores to input features. Compared to traditional Cox proportional hazards regression, MaskedNet had higher accuracy, as measured by Harrell’s C-index. We also found that aggregating outputs from several model runs identified multiple genes and pathways associated with overall survival, including IFNG and PIK3CA genes, along with their related pathways. To further elucidate the role of the IFNG gene, tumors were partitioned into two groups based on low and high IFNG SHAP values, respectively. Tumors with lower IFNG SHAP values exhibited higher IFNG expression and better overall survival, which were linked to more abundant presence of M1 macrophages and activated CD4+ and CD8+ T cells in the tumor microenvironment. The association of the IFNG pathway with overall survival was validated in the trastuzumab arm of the NCCTG-N9831 trial, an independent breast cancer study.
Citations: 0
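The MaskedNet abstract above reports accuracy as Harrell's C-index. A minimal pure-Python sketch of that statistic follows, assuming each patient has a follow-up time, an event flag (1 = death observed, 0 = censored), and a predicted risk such as a log hazard ratio. The function name is hypothetical, and pairs with tied times are simply skipped, which is a simplification rather than the paper's exact procedure.

```python
def harrell_c_index(times, events, risks):
    """Fraction of comparable patient pairs whose predicted risks are ordered correctly.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter time than patient j. It is concordant when the model
    assigns patient i the higher risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the index is a natural way to compare a neural model with Cox regression on the same cohort.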
Learning face super-resolution through identity features and distilling facial prior knowledge
IF 7.5 | Zone 1 (Computer Science) | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-11-04 | DOI: 10.1016/j.eswa.2024.125625
Deep learning techniques in electronic surveillance have shown impressive performance for super-resolution (SR) of captured low-quality face images. Most of these techniques adopt facial priors to improve the feature details in the resulting super-resolved images. However, estimating facial priors from captured low-quality images is often inaccurate in real-life situations because the images are tiny, noisy, and blurry, so fusing such priors degrades the performance of these models. This work therefore presents a teacher–student-based face SR framework that efficiently preserves personal facial structure information in the super-resolved faces. In the proposed framework, the teacher network exploits a facial heatmap-based ground-truth prior to learn the facial structure, which is then used by the student network. The student network is trained with an identity feature loss to maintain identity and facial structure information in the reconstructed high-resolution (HR) face images. The framework is evaluated experimentally on two standard datasets, CelebA-HQ and LFW. The experimental results show that the proposed technique outperforms existing methods on the face SR task.
Citations: 0
Vehicle trajectory extraction with interacting multiple model for low-channel roadside LiDAR
IF 7.5 | Zone 1 (Computer Science) | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-11-03 | DOI: 10.1016/j.eswa.2024.125662
High-precision, consistent vehicle trajectories capture microscopic traffic parameters, mesoscopic traffic flow characteristics, and macroscopic traffic flow features, and they are the cornerstone of innovation in data-driven traffic management and control applications. However, occlusion and trajectory interruption remain challenging for multi-vehicle tracking with low-channel roadside LiDAR in complex traffic environments. To address this challenge, a novel framework for vehicle trajectory extraction using low-channel roadside LiDAR is proposed. First, the geometric features of each cluster and its L-shape bounding box are used to address the over-segmentation in vehicle detection that arises from occlusion and point cloud sparsity. Then, objects in adjacent point cloud frames are associated using an improved Hungarian algorithm with an adaptive distance threshold, which solves the mismatching caused by objects entering and exiting a new point cloud frame. Finally, an improved interacting multiple model that considers vehicle driving patterns is deployed to predict the location of missing vehicles and connect the interrupted trajectories. Experimental results show that the proposed methods achieve a vehicle detection accuracy of 98.76% and a data association precision of 97.40%. The mean absolute error (MAE) and mean square error (MSE) of the vehicle position estimate are 0.2252 m and 0.0729 m², respectively. The trajectory extraction precision outperforms most state-of-the-art algorithms.
Citations: 0
A new look of dispatching for multi-objective interbay AMHS in semiconductor wafer manufacturing: A T–S fuzzy-based learning approach
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-11-02 DOI: 10.1016/j.eswa.2024.125615
Semiconductor wafer fabrication systems (SWFS) are among the most intricate discrete processing environments globally. Since the costs associated with automated material handling systems (AMHS) within fabs account for 20%–50% of manufacturing expenses, it is crucial to enhance the efficiency of material handling in semiconductor production lines. However, optimizing AMHS is difficult due to the complexities inherent in large-scale, nonlinear, dynamic, and stochastic production settings, as well as differing objectives and goals. To overcome these challenges, this paper presents a novel fuzzy-based learning algorithm to enhance the multi-objective dispatching model, which incorporates both transportation and production aspects for interbay AMHS in wafer fabrication manufacturing, aligning it more closely with real-world conditions. Furthermore, we formulate a new constrained nonlinear dispatching problem. To tackle the inherent nonlinearity, a Takagi-Sugeno (T–S) fuzzy modeling approach is developed, which transforms nonlinear terms into a fuzzy linear dispatching model and optimizes the weight in multi-objective problems to obtain the optimal solution. The effectiveness and superiority of the proposed approach are demonstrated through extensive simulations and comparative analysis with existing methods. As a result, the proposed method significantly improves transport efficiency, increases wafer throughput, and reduces processing cycle times.
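How a T–S fuzzy model turns a nonlinear term into a blend of linear ones can be shown on a toy case. The sketch below uses the standard sector-nonlinearity construction on the term x², which is an assumption chosen for illustration; the abstract does not state the paper's actual rules, membership functions, or dispatching variables, and `ts_fuzzy_square` is a hypothetical name.

```python
def ts_fuzzy_square(x, z_min=0.0, z_max=4.0):
    """Represent x * x on [z_min, z_max] as a two-rule T-S fuzzy model.

    Rule 1: IF x is "high" THEN y = z_max * x
    Rule 2: IF x is "low"  THEN y = z_min * x
    The memberships are normalized (they sum to 1), so the blended
    output is a convex combination of the two linear local models.
    """
    if not z_min <= x <= z_max:
        raise ValueError("x is outside the sector the model was built for")
    h_high = (x - z_min) / (z_max - z_min)  # membership of rule 1
    h_low = 1.0 - h_high                    # membership of rule 2
    return h_high * (z_max * x) + h_low * (z_min * x)
```

Because `h_high * z_max + h_low * z_min` equals `x` inside the sector, the blend reproduces x² exactly there, which is why each local model can be handled with linear tools.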
Citations: 0
Development of A deep Learning-based algorithm for High-Pitch helical computed tomography imaging
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-11-02 DOI: 10.1016/j.eswa.2024.125663
High-pitch X-ray helical computed tomography (HCT) imaging has recently drawn considerable attention in biomedical fields because it shortens the scanning time and thus lowers the radiation dose that the objects being imaged may receive. However, the reconstruction quality of high-pitch CT scans is compromised by incomplete data, which limits its applications. To address this issue, this paper presents a novel deep learning (DL)-based algorithm, ViT-U, for high-pitch X-ray propagation-based imaging HCT (PBI-HCT) reconstruction. ViT-U consists of two key modules, a vision transformer (ViT) and a convolutional neural network (U-Net): ViT addresses the missing information in the data domain, and U-Net enhances the post-processing in the reconstruction domain. For verification, we designed and conducted simulations and experiments with both low-density biomaterial samples and biological tissue samples to exemplify biomedical applications, and then examined the ViT-U performance at pitches of 3, 3.5, 4, and 4.5 to compare radiation dose and reconstruction quality. Our results showed that high-pitch PBI-HCT allowed for a dose reduction of 72% to 93%. Importantly, ViT-U effectively removed the missing-wedge artifacts, thus enhancing the reconstruction quality of high-pitch PBI-HCT imaging. Our results also showed that ViT-U achieves high-quality reconstruction from high-pitch images with a helical pitch of up to 4, which allows a substantial reduction of radiation dose. Taken together, our DL-based ViT-U algorithm not only enables high-speed imaging with low radiation dose but also maintains high reconstruction quality, thereby offering significant potential for biomedical imaging applications.
Citations: 0
Informer-FDR: A short-term vehicle speed prediction model in car-following scenario based on traffic environment
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-11-02 DOI: 10.1016/j.eswa.2024.125655
Drivers’ car-following behaviors on urban roads are influenced by various factors, including pedestrians, cyclists, adjacent vehicles, and roadside parking. However, few models consider the influence of these factors on drivers’ speed choices during car-following, which limits the human-like driving capability of advanced driver assistance systems (ADAS). This paper proposes a vehicle speed prediction model for car-following scenarios that accounts for the influence of the traffic environment. The vehicle speed is predicted using Informer-FDR (Informer with fusion features, dilated causal convolution, and residual connection), which adopts an improved encoder-decoder structure based on the Informer model. Fusing traffic environment characteristics with vehicle dynamics parameters reflects both the dynamic interaction between drivers and the traffic environment and potential traffic conflicts, which enhances the model’s understanding of the complex driving environment. Moreover, the ProbSparse self-attention mechanism reduces the computational complexity, which helps to address the difficulty of deploying Transformer-class models on on-board platforms. A total of 3,980 car-following cases were extracted from naturalistic driving data (NDD), and the vehicle dynamics parameters and traffic environment characteristics of each case were obtained through a target detection and ranging algorithm. The optimal feature set was mined using a combined feature selection method. A dilated causal convolution and an average pooling layer are introduced to expand the receptive field of the model, enhance global feature extraction, and ensure the causality of temporal predictions. Furthermore, a residual connection was added to the encoder to transfer cross-layer information directly to deeper layers. On the test set, Informer-FDR has the lowest MAE (0.583), MSE (2.942), and RMSE (1.715), and the highest speed prediction accuracy (97.76%), spacing gap accuracy (94.27%), and acceleration accuracy (95.35%), outperforming the baseline models. The ablation study confirms the importance of the improved distilling layer module, the residual connection module, and the fusion features for predictive performance. Additionally, the road-type experiment reveals performance differences of the model across road types, emphasizing the importance of incorporating the traffic environment on urban roads.
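The role of the dilated causal convolution can be sketched in a few lines of plain Python. This is a generic single-channel illustration, not the Informer-FDR layer itself; the function names, the kernel values, and the receptive-field helper are assumptions for illustration.

```python
def dilated_causal_conv1d(x, weights, dilation=1):
    """y[t] = sum_k weights[k] * x[t - k * dilation], with zeros before t = 0.

    Only current and past samples are read, so the output at time t
    cannot depend on any future input (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input steps seen by a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


speeds = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dilated_causal_conv1d(speeds, [1.0, 1.0], dilation=2))
print(receptive_field(3, [1, 2, 4]))
```

Stacking layers with growing dilations widens the receptive field quickly without adding parameters per layer, which is the usual motivation for this design in sequence models.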
Citations: 0