Pub Date: 2024-02-04 DOI: 10.23919/ICACT60172.2024.10471991
Woo-Hyeon Kim, Joo-Chang Kim
This paper proposes an AI-based video metadata extension model to overcome the limitations of video search and recommendation systems in the multimedia industry. Current video search and recommendation rely on pre-added metadata such as filenames, keywords, tags, and genres, so the content of a video cannot be predicted directly when that metadata is missing. These platforms also analyze users' previous search and viewing histories to infer their interests and serve personalized videos, which may not reflect the actual content and may raise privacy concerns. In addition, recommendation systems suffer from the cold start problem, which is the lack of initial data on which to base recommendations, as well as a bubble effect. Therefore, this study proposes a search and recommendation system that expands video metadata using techniques such as shot boundary detection, speech recognition, and text mining. The proposed method selects the main objects required by the recommendation system based on object frequency and extracts the corresponding objects from the video frame by frame. In addition, we extract the speech from the video separately, convert it to text to obtain a script, and apply text mining techniques to quantify the extracted script. We then synchronize the object frequency and the transcript to create a single set of contextual data, and group and index videos and clips based on that contextual data. Finally, we use shot boundary detection to segment videos based on their content. To ensure that the generated contextual data is appropriate for the video, the proposed model compares the extracted script with the video's subtitle data to check and calibrate its accuracy. The model can then be fine-tuned by tuning its hyperparameters with cross-validation to improve performance. These models can be incorporated into a variety of content discovery and recommendation platforms. By using expanded metadata to return results close to a search query and to recommend videos with similar content, the model addresses problems with traditional search, recommendation, and censorship schemes and allows users to explore more similar videos and clips.
Search and Recommendation Systems with Metadata Extensions. 2024 26th International Conference on Advanced Communications Technology (ICACT), pp. 38-42.
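The shot boundary detection step named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which detector is used, so the histogram-difference rule below, the `threshold` value, and the function names `histogram` and `shot_boundaries` are all assumptions chosen for illustration; frames are modeled as flat lists of grayscale values rather than decoded video.

```python
from typing import List, Sequence


def histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized intensity histogram of a flat grayscale frame (values 0-255)."""
    counts = [0] * bins
    for px in frame:
        counts[min(px * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def shot_boundaries(frames: Sequence[Sequence[int]],
                    threshold: float = 0.5,
                    bins: int = 8) -> List[int]:
    """Return every index i at which frame i starts a new shot.

    Hypothetical rule: a cut is declared when the histogram distance
    between consecutive frames exceeds `threshold`.
    """
    if not frames:
        return []
    boundaries = []
    prev = histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = histogram(frames[i], bins)
        # Half the L1 distance between two normalized histograms lies in [0, 1].
        dist = 0.5 * sum(abs(a - b) for a, b in zip(prev, cur))
        if dist > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries


# Synthetic clip: three dark frames followed by two bright frames.
dark = [10] * 16
bright = [240] * 16
frames = [dark, dark, dark, bright, bright]
print(shot_boundaries(frames))  # [3]
```

A production system would more likely compare color histograms or learned features over decoded frames, but the segmentation logic has the same shape: score consecutive frames and cut where the score jumps.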
Pub Date: 2024-02-04 DOI: 10.23919/icact60172.2024.10471936
The 26th International Conference on Advanced Communications Technology. 2024 26th International Conference on Advanced Communications Technology (ICACT).
Pub Date: 2024-02-04 DOI: 10.23919/ICACT60172.2024.10472013
Muhammad Yaseen, Maisam Ali, Sikander Ali, Ali Hussain, Ali Athar, Hee-Cheol Kim
Cervical spine bone detection plays a crucial role in various medical applications, such as diagnosis, surgical planning, and treatment assessment. Traditional methods for cervical spine bone detection often rely on manual identification and segmentation, which are time-consuming and prone to errors. In recent years, deep learning approaches have shown great potential for automating the detection process with high accuracy. In this paper, we propose a deep learning-based approach for detecting cervical spine bones. Our approach employs the YOLOv5 architecture, an object detection system known for its efficiency and precision. The model is trained to recognize and locate bone structures using computed tomography (CT) scan images of the cervical spine as inputs. We conduct extensive evaluations of the trained models on the cervical spine dataset. Our model achieves a mean average precision (mAP) of 93% at an IoU threshold of 0.5 (mAP@0.5) and 83% averaged over IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95), which demonstrates the effectiveness of our approach in accurately detecting and localizing cervical spine bones. These results have significant implications for medical applications: with accurate and reliable bone detection, medical professionals can enhance diagnosis, surgical planning, and treatment assessment. The achieved mAP scores showcase the performance and potential of the proposed method, contributing to the advancement of bone detection techniques in cervical spine imaging and facilitating collaboration between the medical imaging and deep learning communities.
Deep Learning Based Cervical Spine Bones Detection: A Case Study Using YOLO. 2024 26th International Conference on Advanced Communications Technology (ICACT), pp. 01-05.
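The two mAP figures in the abstract above differ only in how strictly a predicted box must overlap the ground truth. A minimal sketch of that matching criterion follows; it assumes axis-aligned boxes given as (x1, y1, x2, y2) and the common convention of ten IoU thresholds from 0.5 to 0.95 in steps of 0.05. The helper names `iou` and `thresholds_passed` are illustrative and do not come from the paper or from YOLOv5, and the sketch does not compute mAP itself, which also requires ranking detections by confidence and integrating precision over recall.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# Thresholds 0.50, 0.55, ..., 0.95 used when averaging over 0.5:0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which `pred` would count as a true positive for `truth`."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


truth = (0.0, 0.0, 10.0, 10.0)
pred = (0.0, 0.0, 9.0, 8.0)
print(iou(pred, truth))                     # 0.72
print(len(thresholds_passed(pred, truth)))  # 5
```

A detection with IoU 0.72 counts as correct at the 0.5 threshold but fails the stricter half of the threshold range, which is why a score averaged over 0.5:0.95 is typically lower than the score at 0.5.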
The reconfigurable intelligent surface (RIS) is considered one of the key enabling technologies for future 6G wireless communication because it realizes an intelligent radio environment. An RIS is used as a reflective array to change the transmission and coverage of radio frequency (RF) signals. In this paper, we propose a deep reinforcement learning (DRL) based RIS beamforming design for practical scenarios in which the RIS may have hardware loss, and we present the soft actor-critic (SAC)-exploration algorithm to solve the beamforming design. The algorithm reduces the prediction error by introducing a perturbation signal that influences the action prediction. Simulation results show that the proposed SAC-exploration algorithm improves significantly on the typical SAC algorithm, which verifies its effectiveness.
Deep Reinforcement Learning Based Beamforming in RIS-Assisted MIMO System Under Hardware Loss. Yuan Sun, Zhiquan Bai, Jinqiu Zhao, Dejie Ma, Zhaoxia Xian, Kyungsup Kwak. DOI: 10.23919/ICACT60172.2024.10472006. 2024 26th International Conference on Advanced Communications Technology (ICACT), pp. 01-06.
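To make the quantities in the RIS abstract above concrete, the following sketch evaluates the received power of a reflected link for a given set of RIS phase shifts. It is not the paper's SAC-exploration algorithm. It assumes a simplified single-antenna link rather than MIMO, models hardware loss as a reflection amplitude below 1, and uses Gaussian noise on the phases as a stand-in for a perturbed action. All names (`ris_gain`, `aligned_phases`, `perturb`) are hypothetical.

```python
import cmath
import math
import random
from typing import List, Sequence


def ris_gain(h: Sequence[complex], g: Sequence[complex],
             phases: Sequence[float], amplitude: float = 1.0) -> float:
    """Received power |sum_n h_n * amplitude * exp(j*theta_n) * g_n|^2.

    `amplitude` < 1 is an assumed model of hardware loss at each element.
    """
    total = sum(hn * amplitude * cmath.exp(1j * th) * gn
                for hn, gn, th in zip(h, g, phases))
    return abs(total) ** 2


def aligned_phases(h: Sequence[complex], g: Sequence[complex]) -> List[float]:
    """Phase shifts that make every reflected path add coherently."""
    return [-cmath.phase(hn * gn) for hn, gn in zip(h, g)]


def perturb(phases: Sequence[float], sigma: float,
            rng: random.Random) -> List[float]:
    """Add Gaussian noise to each phase and wrap the result into [0, 2*pi)."""
    return [(th + rng.gauss(0.0, sigma)) % (2 * math.pi) for th in phases]


# Four unit-magnitude channel coefficients on each hop (synthetic values).
h = [cmath.exp(1j * k) for k in range(4)]
g = [cmath.exp(-0.5j * k) for k in range(4)]
best = aligned_phases(h, g)

print(ris_gain(h, g, best))                 # about 16.0 for a lossless surface
print(ris_gain(h, g, best, amplitude=0.8))  # about 10.24 with the assumed loss
print(ris_gain(h, g, perturb(best, 0.3, random.Random(0))))
```

In a DRL formulation, the phase vector plays the role of the action and a quantity like this gain (or a rate derived from it) plays the role of the reward; the closed-form alignment used here is only available in this idealized single-link case.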
Pub Date: 2024-02-04 DOI: 10.23919/ICACT60172.2024.10471975
Birahim Babou, Khalifa Sylla, M. Sow, S. Ouya
Digital universities have been developed in several countries, particularly on the African continent, to meet the need for massification in the higher education sector. However, the lack of physical space is a major drawback that hinders learners' success and increases the drop-out rate compared with a conventional university. In these digital universities, learners use distance learning platforms to complete their training, and mastery of the fundamental modules is essential for that training to be effective. Given how frequently learners use messaging applications, integrating Artificial Intelligence (AI) could facilitate access to educational content and enhance the learning experience. In this article, we propose a model for integrating a chatbot that enables learners to access training modules, increase their knowledge, and master core modules through formative skills assessments. The proposed model is based on Machine Learning (ML) with the Rasa open-source framework and the Moodle Learning Management System (LMS) platform.
Integration of a Chatbot to Facilitate Access to Educational Content in Digital Universities. 2024 26th International Conference on Advanced Communications Technology (ICACT), pp. 311-314.
Pub Date: 2024-02-04 DOI: 10.23919/ICACT60172.2024.10472008
Hyejin S. Kim, Hyonyoung Han, Jiyon Son
Additive manufacturing is gaining attention in fields such as medical applications, aerospace, defense, and complex manufacturing industries because of its advantages, including reduced logistical constraints and the ability to produce customized products. However, the materials used in additive manufacturing are generally expensive and highly sensitive to changes in external conditions. For these reasons, it is crucial from a productivity standpoint to monitor the additive manufacturing process closely, detect anomalies early, and decide whether to continue the layering process. In this paper, we develop an algorithm that takes camera footage as input to determine the quality of the additive manufacturing output, and we achieve an accuracy of 99.65%. Additionally, to simulate rare abnormal conditions, we use computer graphics to define nine different abnormal states and generate data for them.
Anomaly Detection During Additive Processes for DLP 3D Printing. 2024 26th International Conference on Advanced Communications Technology (ICACT), pp. 01-03.
Pub Date: 2024-02-04 DOI: 10.23919/ICACT60172.2024.10471931
W. Choi, Sang Ju Lee, Jong Oh Kim, S. Choi
We propose a data plane acceleration technology to deliver data from the network to the host system in a high-performance computing environment. In the fourth industrial revolution, server systems are developing into high-performance computing systems through convergence with major technologies such as IoT, cloud, AI, and self-driving cars. This convergence of various technologies with IT requires servers to process diverse flows and large amounts of data. When packets are transferred from the network interface card to the host server, packet processing in kernel space incurs a large overhead. Additionally, for fast packet processing on the host server, packets must be processed according to core affinity. Therefore, we propose a load-balancing data transmission method that distributes packets to 48 cores, based on the Tile-Gx72 network processor, and transfers data from the network interface card to the host CPU by kernel bypass in a multicore high-performance server system. The performance of the 48-core load-balancing data transmission system based on the Tile-Gx72 network processor is confirmed through implementation.
Multicore Packet Distribution Method Using Multicore Network Interface Card Based on Tile-gx72 Network Processor. 2024 26th International Conference on Advanced Communications Technology (ICACT), pp. 350-353.
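The core-affinity requirement in the abstract above, where packets of one flow should be handled by the same core while load is spread over 48 cores, can be sketched with a flow hash. The abstract does not describe the paper's actual distribution method, so the CRC32-of-5-tuple rule, the function name `core_for_flow`, and the synthetic addresses below are assumptions for illustration only.

```python
import zlib
from collections import Counter

NUM_CORES = 48  # number of worker cores named in the abstract


def core_for_flow(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                  proto: str, num_cores: int = NUM_CORES) -> int:
    """Map a flow's 5-tuple to a core index.

    Hashing the 5-tuple keeps every packet of a flow on one core
    (core affinity) while spreading different flows across cores.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % num_cores


# The same flow always lands on the same core.
first = core_for_flow("10.0.0.1", "10.0.0.2", 40000, 443, "tcp")
again = core_for_flow("10.0.0.1", "10.0.0.2", 40000, 443, "tcp")
print(first == again)  # True

# Spread 1000 synthetic flows and count how many land on each core.
load = Counter(
    core_for_flow("10.0.0.1", "10.0.0.2", 40000 + i, 443, "tcp")
    for i in range(1000)
)
print(len(load), "cores used")
```

Hardware dispatchers typically compute a comparable hash on the network processor itself so that the host never has to inspect a packet before it reaches its target core.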