2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC): Latest Papers

An Integrated Approach to Near-duplicate Image Detection
Heesung Yang, Hyeyoung Park
Near-duplicate image detection is the task of finding clusters of images that human viewers would consider the same picture. This matters in image recommendation systems, which must avoid redundancy among the candidate images they retrieve. In the big-data era, with image data overflowing, it is also increasingly important for saving storage resources. In this paper, we propose a robust model that detects various types of near-duplicate images by integrating four different detection modules built on multiple image feature extractors, such as Gabor filters and deep networks. The four modules are then integrated to conduct a multivariate log-likelihood ratio test for detecting duplication. Computational experiments confirm that our method reaches state-of-the-art performance.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067005 · Published: 2023-02-20
Citations: 0
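The abstract above names a multivariate log-likelihood ratio test over the outputs of four modules but gives no formula. The following is only a minimal sketch of such a test, assuming that each module yields one similarity score per image pair and that the scores are independent Gaussians under the "duplicate" and "non-duplicate" hypotheses. The independence assumption, the function names, and the parameters `DUP` and `NON_DUP` are illustrative and are not taken from the paper.

```python
import math

def gaussian_log_pdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def log_likelihood_ratio(scores, dup_params, non_dup_params):
    """Sum of per-module log-likelihood ratios (assumes independent modules)."""
    llr = 0.0
    for x, (m1, s1), (m0, s0) in zip(scores, dup_params, non_dup_params):
        llr += gaussian_log_pdf(x, m1, s1) - gaussian_log_pdf(x, m0, s0)
    return llr

def is_near_duplicate(scores, dup_params, non_dup_params, threshold=0.0):
    """Declare a pair near-duplicate when the ratio exceeds the threshold."""
    return log_likelihood_ratio(scores, dup_params, non_dup_params) > threshold

# Hypothetical (mean, std) of each module's score under the two hypotheses.
DUP = [(0.9, 0.1)] * 4
NON_DUP = [(0.3, 0.2)] * 4

print(is_near_duplicate([0.90, 0.85, 0.95, 0.90], DUP, NON_DUP))
print(is_near_duplicate([0.20, 0.30, 0.25, 0.35], DUP, NON_DUP))
```

In practice the per-hypothesis parameters would be estimated from labeled image pairs, and a full covariance matrix could replace the independence assumption.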
Design of poultry farm disease detection system based on K-Nearest Neighbor Algorithm
Seung-Jae Kim, H. Yoe, Meong-hun Lee
Every year, poultry farms suffer great damage from avian influenza outbreaks, which lower the egg production rate and strongly affect market prices. Countries around the world are working to prevent the spread of avian influenza, yet it still breaks out every year and causes large-scale deaths of chickens. In this paper, we therefore propose a disease detection system based on the K-Nearest Neighbor algorithm to prevent the large-scale spread of avian influenza. As the reference data for the system's decision, we use decreases in feed intake and in egg-laying rate, which are the main symptoms of an avian influenza outbreak. If the data analysis suggests avian influenza, a push message is sent to the farmer's cell phone; the farmer then checks the information on the suspected area through the application linked to the system and transmits it to the server of the national livestock quarantine system. With this disease detection system, we expect to prevent the spread of avian influenza to surrounding areas and neighboring farms in advance and to help prevent damage to farms.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067067 · Published: 2023-02-20
Citations: 0
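The decision step described in this abstract, a K-Nearest Neighbor classifier over the decrease in feed intake and the decrease in egg-laying rate, can be sketched as follows. This is a hedged illustration only: the training points in `TRAIN`, the labels, the choice of `k=3`, and the function name `knn_predict` are all made up here and do not come from the paper.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to the query.

    train: list of ((feed_intake_drop, egg_rate_drop), label) pairs.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical daily observations: fractional drops relative to a baseline.
TRAIN = [
    ((0.02, 0.01), "normal"),
    ((0.05, 0.03), "normal"),
    ((0.03, 0.04), "normal"),
    ((0.30, 0.25), "suspected"),
    ((0.40, 0.35), "suspected"),
    ((0.35, 0.20), "suspected"),
]

print(knn_predict(TRAIN, (0.33, 0.28)))
print(knn_predict(TRAIN, (0.04, 0.02)))
```

A "suspected" result would be the trigger for the push message to the farmer that the abstract describes.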
Interpretable Anomaly Detection for Lung Sounds Using Topology
Ryosuke Wakamoto, Shingo Mabu
In the medical field, research on computer-aided diagnosis using machine learning has been actively conducted. While machine learning can achieve high accuracy when a large amount of data is collected, its low interpretability is an important obstacle to practical use in the medical field, where missing a disease may lead to fatal results. In this paper, we propose an anomaly detection method for diagnosing lung sounds that takes interpretability into account. Furthermore, the proposed method incorporates the context information contained in the sound data into the machine learning-based anomaly detection method, improving detection performance while maintaining the interpretability of the detection results.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067072 · Published: 2023-02-20
Citations: 0
Radar Signal Abnormal Point Classification based on Camera-Radar Sensor Fusion
Hyojeong Seo, Dong Seog Han
For safe driving, it is essential to receive reliable information from recognition sensors. In this paper, we present a deep learning model that classifies incoming radar signals as normal or abnormal. An abnormal signal is defined as noise from the radar, or any signal received while the radar has failed or is malfunctioning. It is difficult to determine whether reflected signals are normal based on radar data alone. Therefore, the camera and radar sensors are used together, because the radar cross section (RCS) distribution varies with the angle and distance of the object. The proposed model uses data received from the camera and radar sensors to determine whether object signals are normal. The model shows an accuracy of 96.24%. With the results of this study, the reliability of radar signals can be determined in an actual driving environment, thereby helping to ensure the safety of vehicles and pedestrians.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067112 · Published: 2023-02-20
Citations: 0
Simultaneously Transmitting and Reflecting-Reconfigurable Intelligent Surfaces with Hardware Impairment and Phase Error
Waqas Khalid, M. A. U. Rehman, Trinh Van Chien, Heejung Yu
Simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) both transmit and reflect signals. The combination of STAR-RIS and non-orthogonal multiple access (NOMA) provides higher performance gains. In this paper, we evaluate NOMA downlink transmission with STAR-RIS under phase error and transceiver hardware impairment. We exploit the statistical properties of the effective channel power and evaluate the ergodic rate behaviors for ideal and non-ideal STAR-RIS-NOMA systems. The numerical results confirm the accuracy of the analysis and demonstrate the selection of the system parameters.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067009 · Published: 2023-02-20
Citations: 1
Generalized Spatio-Temporal Adaptive Normalization Framework
Neeraj Kumar, A. Narang
In this paper, we propose the Generalized Spatio-Temporal Adaptive Normalization (GSTAN) framework for generative adversarial and deep learning inference architectures. By leveraging temporal feature maps based on higher-order derivatives along with spatial feature maps, our normalization approach leads to (a) efficient generation of high-quality videos with better details and enhanced temporal coherence, and (b) higher-accuracy inference on multiple tasks. To evaluate model generalization, we performed experiments on multiple tasks, including video-to-video generation, video segmentation, and activity recognition (classifying the activity in a given input video into one of 101 activity classes). Detailed experimental analysis over a variety of datasets, including CityScape, UCF101, and CK+, demonstrates the superior performance of GSTAN and also shows the impact of its various configurations, including parallel GSTAN and sequential GSTAN.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067068 · Published: 2023-02-20
Citations: 0
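The abstract mentions temporal feature maps based on higher-order derivatives but does not say how they are computed. One common way to approximate temporal derivatives of a frame sequence is repeated finite differencing, sketched below on flattened frames. This is an assumption made for illustration, not the GSTAN definition, and the function names are hypothetical.

```python
def temporal_difference(frames):
    """First-order finite difference between consecutive frames."""
    return [
        [cur - prev for prev, cur in zip(prev_frame, cur_frame)]
        for prev_frame, cur_frame in zip(frames, frames[1:])
    ]

def temporal_derivatives(frames, order):
    """Return the difference sequences of order 1..order.

    Each differencing step shortens the sequence by one frame.
    """
    results = []
    current = frames
    for _ in range(order):
        current = temporal_difference(current)
        results.append(current)
    return results

# Three tiny two-pixel "frames" (made-up values).
frames = [[0, 1], [1, 3], [4, 6]]
first, second = temporal_derivatives(frames, order=2)
print(first)   # [[1, 2], [3, 3]]
print(second)  # [[2, 1]]
```

How such maps would feed the normalization parameters, in parallel or in sequence with the spatial maps, is not specified in the abstract and is left out of the sketch.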
Cost-Effective Peak Shaving Strategy Based on Clustering and XGBoost Algorithm
Sol Lim, Rahma Gantassi, Yonghoon Choi
In a cost-effective peak shaving strategy, clustering and machine learning algorithms can be used to set the optimal peak shaving time window for each load. The Energy Storage System (ESS) charge amount is determined from load prediction data produced by a machine learning model, and the peak shaving time window is adjusted flexibly according to the load pattern of each cluster. Load prediction makes it possible to prevent the ESS from being overcharged or undercharged. In addition, rather than applying peak shaving uniformly at the on-peak time, the time window is adjusted flexibly for each power usage pattern, from which efficient operation of the power grid can be expected. The effectiveness of the proposed system model is to be demonstrated by comparing electricity costs with and without it.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067091 · Published: 2023-02-20
Citations: 0
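To make the link between load prediction and the ESS charge amount concrete, here is a minimal sketch. It assumes an hourly predicted load, a fixed peak threshold, and a given peak shaving time zone (a window of hours); all three inputs, the numbers, and the name `plan_peak_shaving` are hypothetical. The abstract's clustering step and XGBoost predictor are not reproduced, and battery efficiency and capacity limits are ignored.

```python
def plan_peak_shaving(predicted_load, threshold, window):
    """Plan ESS discharge so load inside the window does not exceed the threshold.

    predicted_load: predicted demand per hour (e.g. in kW).
    window: (start, end) hour indices, end exclusive.
    Returns (required_charge, shaved_load).
    """
    start, end = window
    discharge = [0.0] * len(predicted_load)
    for hour in range(start, end):
        excess = predicted_load[hour] - threshold
        if excess > 0:
            discharge[hour] = excess
    # Charging exactly this amount avoids both over- and undercharging.
    required_charge = sum(discharge)
    shaved_load = [load - d for load, d in zip(predicted_load, discharge)]
    return required_charge, shaved_load

charge, shaved = plan_peak_shaving([50, 80, 120, 110, 60], threshold=100, window=(1, 4))
print(charge)  # 30.0
print(shaved)  # [50.0, 80.0, 100.0, 100.0, 60.0]
```

Under this reading, each load cluster would get its own `window`, and the predicted load would come from the trained model rather than a hard-coded list.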
Captioning Remote Sensing Images Using Transformer Architecture
Wrucha Nanal, M. Hajiarbabi
Image Captioning aims to have machines describe images, combining the fields of Computer Vision (CV) and Natural Language Processing (NLP). The current state of the art for image captioning uses the Attention-based Encoder-Decoder model. The Attention-based model uses an ‘Attention mechanism’ that focuses on a particular section of the image to generate its corresponding caption word. The NLP side of this model uses Long Short-Term Memory (LSTM) for word generation. Attention-based models do not emphasize the relative arrangement of words in a caption, thereby ignoring the context of the sentence. Inspired by the versatility of Transformers in NLP, this work tries to utilise their architectural features for the Image Captioning use case. This work also makes use of a pretrained Bidirectional Encoder Representations from Transformers (BERT) model, which generates a contextually rich embedding of a caption. The Multi-Head Attention of the Transformer establishes a strong correlation between the image and the contextually aware caption. The experiment is performed on the Remote Sensing Image Captioning Dataset. The results of the model are evaluated using NLP evaluation metrics such as Bilingual Evaluation Understudy 1–4 (BLEU), Metric for Evaluation of Translation with Explicit ORdering (METEOR) and Recall-Oriented Understudy for Gisting Evaluation (ROUGE). The proposed model shows better results for a few of the metrics.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067039 · Published: 2023-02-20
Citations: 1
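Of the metrics listed in this abstract, BLEU is the simplest to illustrate. The sketch below computes BLEU-1 only (clipped unigram precision times the brevity penalty) against a single reference caption, with whitespace tokenization. The paper reports BLEU 1-4 and presumably uses a standard toolkit, so treat this as a teaching example with a hypothetical function name rather than the authors' evaluation code.

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """BLEU-1 for one candidate caption against one reference caption."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    cand_counts = Counter(cand)
    # Clip each candidate word count by its count in the reference.
    clipped = sum(min(count, ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(cand)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * precision

print(bleu1("a plane on the runway", "a plane on the runway"))  # 1.0
print(bleu1("the the the", "the cat sat"))
```

The second call shows why clipping matters: repeating a matching word three times earns credit only once, giving a precision of one third.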
PCR Radar-Based Counting System for Packaged Objects
Joon-Il Cho, Soorim Yang, Jae-Hoon Kim
Counting packaged objects is a step that a fulfillment center must complete before shipping them to the customer. With the development of computer vision and deep learning, many studies use cameras to recognize objects and determine their quantity, but objects in a sealed package cannot be identified because they are occluded. Among nondestructive inspection methods that rely on penetrating wavelengths, radar is the most suitable for counting objects: it is harmless to the human body, requires no contact with the objects, and uses low-power radio frequency. In this paper, we propose a system that achieves 99.33% counting accuracy for packaged objects by removing background noise from the radio-frequency signal measured by a 60 GHz pulsed coherent radar.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10066968 · Published: 2023-02-20
Citations: 0
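The abstract credits the counting accuracy to removing background noise from the radar measurement but does not describe the procedure. A simple and widely used approach is to average several sweeps of the empty scene and subtract that profile from each new sweep, as sketched below. Whether the paper does this is not stated; the function names and amplitude values are made up for illustration.

```python
def mean_profile(sweeps):
    """Average amplitude per range bin over several empty-scene sweeps."""
    n = len(sweeps)
    return [sum(bin_values) / n for bin_values in zip(*sweeps)]

def remove_background(sweep, background):
    """Subtract the background profile, clamping negative residuals to zero."""
    return [max(s - b, 0.0) for s, b in zip(sweep, background)]

# Two made-up empty-scene sweeps with three range bins each.
background = mean_profile([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
print(background)                                      # [2.0, 2.0, 2.0]
print(remove_background([2.0, 5.0, 1.0], background))  # [0.0, 3.0, 0.0]
```

The residual profile, here a single peak in the middle bin, is what a downstream counting model would consume.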
Multi Task Learning: A Survey and Future Directions
Taeho Lee, Junhee Seok
Multi-task learning (MTL) is essential in modern recommendation systems, and it is correspondingly difficult. In today's e-commerce advertising market, it is necessary to predict not only the probability that a user clicks, but also the probability of conversion and purchase. Predicting multiple tasks together makes it possible to increase the accuracy of each task and to optimize advertisements for the various goals of advertisers. Traditional conversion rate (CVR) prediction models have difficulty learning because the number of conversions is very small compared to the total number of impressions. This is called the data sparsity (DS) problem. Another problem is that CVR models are trained on samples of clicked impressions but make inferences on samples of all impressions. This is called the sample selection bias (SSB) problem. This paper summarizes the various solutions to the sample selection bias and data sparsity problems, their current limitations, and future directions.
DOI: https://doi.org/10.1109/ICAIIC57133.2023.10067098 · Published: 2023-02-20
Citations: 2
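The two problems named in this abstract can be seen directly in the funnel counts. The sketch below uses the standard definitions CTR = clicks / impressions, CVR = conversions / clicks, and CTCVR = conversions / impressions, so that CTCVR = CTR × CVR. The counts are invented for illustration, the function name is hypothetical, and the abstract does not say which of the surveyed solutions build on this identity.

```python
def funnel_rates(impressions, clicks, conversions):
    """Return (CTR, CVR, CTCVR) from raw funnel counts."""
    ctr = clicks / impressions          # defined over all impressions
    cvr = conversions / clicks          # defined over clicked impressions only
    ctcvr = conversions / impressions   # defined over all impressions
    return ctr, cvr, ctcvr

# Hypothetical log: 10,000 impressions, 200 clicks, 4 conversions.
ctr, cvr, ctcvr = funnel_rates(10_000, 200, 4)
print(ctr, cvr, ctcvr)
```

Data sparsity shows up as only 4 positive labels for the conversion task. Sample selection bias shows up in the denominators: CVR is learned from the 200 clicked samples, yet the model is asked to score all 10,000 impressions.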