In damage‐level classification, deep learning models are more likely to focus on regions unrelated to the classification target because of the complexities inherent in real data, such as the diversity of damage types (e.g., cracks, efflorescence, and corrosion). This causes performance degradation. To solve this problem, it is necessary to handle data complexity and uncertainty. This study proposes a multimodal deep learning model that can focus on damaged regions by using text data related to the damage in the image, such as materials and components. Furthermore, by adjusting the effect of the attention maps on damage‐level classification according to the confidence calculated when estimating these maps, the proposed method achieves accurate damage‐level classification. Our contribution is an end‐to‐end multimodal attention mechanism that simultaneously considers text data, image data, and the confidence of the attention map. Experiments using real images validate the effectiveness of the proposed method.
{"title":"Damage‐level classification considering both correlation between image and text data and confidence of attention map","authors":"Keisuke Maeda, Naoki Ogawa, Takahiro Ogawa, Miki Haseyama","doi":"10.1111/mice.13366","DOIUrl":"https://doi.org/10.1111/mice.13366","url":null,"abstract":"In damage‐level classification, deep learning. models are more likely to focus on regions unrelated to classification targets because of the complexities inherent in real data, such as the diversity of damages (e.g., crack, efflorescence, and corrosion). This causes performance degradation. To solve this problem, it is necessary to handle data complexity and uncertainty. This study proposes a multimodal deep learning model that can focus on damaged regions using text data related to damage in images, such as materials and components. Furthermore, by adjusting the effect of attention maps on damage‐level classification performance based on the confidence calculated when estimating these maps, the proposed method realizes an accurate damage‐level classification. Our contribution is the development of a model with an end‐to‐end multimodal attention mechanism that can simultaneously consider both text and image data and the confidence of the attention map. Finally, experiments using real images validate the effectiveness of the proposed method.","PeriodicalId":156,"journal":{"name":"Computer-Aided Civil and Infrastructure Engineering","volume":"13 1","pages":""},"PeriodicalIF":11.775,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142597433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Structural response estimation based on deep learning can suffer from reduced estimation performance owing to discrepancies between the training and test data as the noise level in the test data increases. This study proposes a short‐time Fourier transform‐based long short‐term memory (STFT‐LSTM) model to improve estimation performance in the presence of noise and to ensure estimation robustness. The model places an STFT layer before the LSTM layer, so that the input is transformed into the time‐frequency domain and the LSTM learns from this representation. The robustness of the proposed model was validated using a three‐degree‐of‐freedom numerical model at various signal‐to‐noise ratio levels, including impulse and periodic noise. Experimental validation assessed the estimation robustness under impact load and against environmental noise in the acquired acceleration response.
{"title":"Noise‐robust structural response estimation method using short‐time Fourier transform and long short‐term memory","authors":"Da Yo Yun, Hyo Seon Park","doi":"10.1111/mice.13370","DOIUrl":"https://doi.org/10.1111/mice.13370","url":null,"abstract":"Structural response estimation based on deep learning can suffer from reduced estimation performance owing to discrepancies between the training and test data as the noise level in the test data increases. This study proposes a short‐time Fourier transform‐based long short‐term memory (STFT‐LSTM) model to improve estimation performance in the presence of noise and ensure estimation robustness. This model enables robust estimations in the presence of noise by positioning an STFT layer before feeding the data into the LSTM layer. The output transformed into the time‐frequency domain by the STFT layer is learned by the LSTM model. The robustness of the proposed model was validated using a numerical model with three degrees of freedom at various signal‐to‐noise ratio levels, and its robustness against impulse and periodic noise was verified. Experimental validation assessed the estimation robustness under impact load and verified the robustness against environmental noise in the acquired acceleration response.","PeriodicalId":156,"journal":{"name":"Computer-Aided Civil and Infrastructure Engineering","volume":"35 1","pages":""},"PeriodicalIF":11.775,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142597435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The cover image is based on the article "A generative adversarial network approach for removing motion blur in the automatic detection of pavement cracks" by Yu Zhang and Lin Zhang, https://doi.org/10.1111/mice.13231. Image credit: Lin Zhang.