Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205539
Title: Detection and Classification of Road Damage Using Camera with GLCM and SVM
Authors: Sartika, Z. Zainuddin, Amil, Ahmad Ilham
Road damage is a common issue in large cities, caused by factors such as heavy traffic, rainfall, and inadequate road maintenance. Detecting road damage, such as potholes, cracks, distortion, fatness, and polished aggregate, is crucial to ensure the safety and comfort of road users. This study proposes a method that uses the Gray Level Co-Occurrence Matrix (GLCM) and Support Vector Machine (SVM) algorithms to detect road damage. The proposed method processes road images with the GLCM algorithm to extract texture features such as dissimilarity, correlation, contrast, energy, and angular second moment. GLCM is an effective approach for extracting texture information, producing a matrix that describes the relationship between image pixels. The extracted features are then fed as input to an SVM model trained to classify road images into several categories: potholes, cracks, distortion, fatness, and polished aggregate. SVM is a machine learning method that classifies data into predetermined categories based on the extracted features. The test results show that the proposed method detects road damage with high accuracy, with F1 scores of 0.95 for potholes, 0.89 for cracks, 0.80 for distortion, 0.89 for fatness, and 0.95 for polished aggregate, and an overall accuracy of 80%. By improving the dataset and reducing the number of damage categories, the accuracy could likely be increased to around 90%. This approach can serve as a tool for continuously monitoring road conditions and assisting road authorities in making timely decisions about road improvements.
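To make the pipeline above concrete, here is a minimal sketch (not the authors' implementation) that extracts the five GLCM texture features with scikit-image and feeds them to an SVM from scikit-learn; the GLCM distances, angles, and SVM kernel are assumptions.

```python
# Hedged sketch of a GLCM texture-feature + SVM pipeline (not the paper's implementation).
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # scikit-image >= 0.19 naming
from sklearn.svm import SVC

FEATURES = ["dissimilarity", "correlation", "contrast", "energy", "ASM"]

def glcm_features(gray_image: np.ndarray) -> np.ndarray:
    """Extract the five GLCM texture features from an 8-bit grayscale road image."""
    glcm = graycomatrix(gray_image, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    # Average each property over the four angles to get one value per feature.
    return np.array([graycoprops(glcm, prop).mean() for prop in FEATURES])

def train_damage_classifier(images, labels):
    """Fit an SVM on GLCM features; kernel and C are illustrative, not from the paper."""
    X = np.stack([glcm_features(img) for img in images])
    clf = SVC(kernel="rbf", C=1.0)
    clf.fit(X, labels)
    return clf
```

At inference time, glcm_features would be applied to a new road image and the fitted classifier would predict one of the five damage categories.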
{"title":"Detection and Classification of Road Damage Using Camera with GLCM and SVM","authors":"st Sartika, Z. Zainuddin, rd Amil, Ahmad Ilham","doi":"10.1109/IAICT59002.2023.10205539","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205539","url":null,"abstract":"Road damage is a common issue in large cities, caused by factors such as heavy traffic, rainfall, and inadequate road maintenance. Detecting road damage, such as potholes, cracks, distortion, fatness, and polished aggregate, is crucial to ensure the safety and comfort of road users. This study proposes a method that uses the Gray Level Co-Occurrence Matrix (GLCM) and Support Vector Machine (SVM) algorithms to detect road damage. The proposed method involves processing road images using the GLCM algorithm to extract texture features, such as dissimilarity, correlation, contrast, energy, and Angular Second Moment. GLCM is an effective approach for extracting texture information and generating a matrix that illustrates the relationship between image pixels. These extracted features are then fed as input to the SVM model. The SVM model is trained to classify road images into several categories, including potholes, cracks, distortion, fatness, and polished aggregate. SVM is a machine learning method that can classify data into predetermined categories based on the extracted features. The test results show that the proposed method can detect road damage with high accuracy, as indicated by the F1 score for potholes of 0.95, cracks of 0.89, distortion of 0.8, fatness of 0.89, and polished aggregate of 0.95, with an overall accuracy of 80%. By improving the dataset and reducing the number of existing damage categories, it is likely that the accuracy of the method can be increased to around 90%. This approach can serve as a tool for continuously monitoring road conditions and assisting road authorities in making decisions regarding timely road improvements.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126871304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205647
Title: Federated Non-Intrusive Load Monitoring for Smart Homes Utilizing Attention-Based Aggregation
Authors: Shamisa Kaspour, A. Yassine
Non-Intrusive Load Monitoring (NILM) within a Federated Learning (FL) framework has become a growing area of study for providing secure energy disaggregation in smart homes. This study deploys an attention-based aggregation (FedAtt) approach in FL to emphasize agents’ behavioral differences when consuming energy from various appliances. The proposed technique minimizes the weighted distance between the parameters of the local models and the global model so that each local model’s characteristics are better represented. In this paper, we examine two different models for NILM, Short Sequence-to-Point (SS2P) and Variational Auto-Encoder (VAE), in order to evaluate the effectiveness of FedAtt. The framework was evaluated on the UK-DALE and REFIT datasets, and the results were compared against centralized versions of the models as well as FedAvg. Our findings show that FedAtt produces results comparable to the centralized models and FedAvg while improving the stability of FL under different levels of noise added to the local parameters.
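A rough numpy sketch of attention-weighted aggregation in the spirit of FedAtt follows (not the authors' code; the distance measure, softmax weighting, and epsilon step size are illustrative assumptions): client parameters closer to the global model receive higher attention weight when forming the new global model.

```python
# Hedged sketch of attention-weighted federated aggregation (FedAtt-style), not the authors' code.
import numpy as np

def fedatt_aggregate(global_w: dict, client_ws: list, epsilon: float = 1.0) -> dict:
    """Per layer, weight client updates by softmax(-distance to the global parameters)."""
    new_global = {}
    for name, g in global_w.items():
        dists = np.array([np.linalg.norm(cw[name] - g) for cw in client_ws])
        att = np.exp(-dists) / np.exp(-dists).sum()          # attention scores over clients
        update = sum(a * (cw[name] - g) for a, cw in zip(att, client_ws))
        new_global[name] = g + epsilon * update              # epsilon is an assumed step size
    return new_global

# Toy example: two clients and a single "layer" of three parameters.
g = {"w": np.zeros(3)}
clients = [{"w": np.array([1.0, 0.0, 0.0])}, {"w": np.array([0.0, 2.0, 0.0])}]
print(fedatt_aggregate(g, clients))
```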
{"title":"Federated Non-Intrusive Load Monitoring for Smart Homes Utilizing Attention-Based Aggregation","authors":"Shamisa Kaspour, A. Yassine","doi":"10.1109/IAICT59002.2023.10205647","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205647","url":null,"abstract":"Nowadays, Non-Intrusive Load Monitoring (NILM) with Federated Learning (FL) framework has become a growing study towards providing a secure energy disaggregation system in smart homes. This study aims at deploying an attention-based aggregation (FedAtt) approach in FL to emphasize agents’ behavioral differences when consuming energy from various appliances. The goal of the proposed technique is to minimize the weighted distance between the parameters of the local model and the global model to better represent each local model’s characteristics. In this paper, we examine two different models for NILM: Short Sequence-to-Point (SS2P) and Variational Auto-Encoder (VAE). Our goal is to evaluate the effectiveness of FedAtt. The evaluation of the framework was carried out using the UK-DALE and REFIT datasets. The obtained results were then compared against centralized approaches of the models as well as FedAvg. Our findings show that FedAtt generates comparable results to the centralized model and FedAvg while improving the stability of FL at different values of added noise to local parameters.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128856024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205911
Title: A Deep Learning-based Microsection Measurement Framework for Print Circuit Boards
Authors: Chia-Yu Lin, Chieh-Ling Li, Yu-Chiao Kuo, Yun-Chieh Cheng, C. Jian, Hsiang-Ting Huang, Mitchel M. Hsu
Microsectioning is a destructive testing procedure used in the printed circuit board (PCB) fabrication industry to evaluate the quality of PCBs. During cross-section analysis, operators measure PCB component widths manually, which can lead to inconsistencies and makes it challenging to establish standardized procedures. To address this issue, we propose a Deep Learning-based Microsection Measurement (DL-MM) framework for PCB microsection samples. The framework comprises four modules: the target detection module, the image preprocessing module, the labeling model, and the coordinate adaptation module. The target detection module extracts the area of interest to be measured, which reduces the influence of surrounding noise and improves measurement accuracy. In the image preprocessing module, the target-area image is normalized, labeled with coordinates, and resized to different sizes depending on the class. The labeling model uses a convolutional neural network (CNN) trained separately for each class to predict its measurement points, since the number of coordinates varies for each class. The final module, the coordinate adaptation module, uses the predicted coordinates to draw a straight measurement line on the output image for improved readability. We evaluate the proposed framework on two types of microsections, and the experimental results show that the measurements’ root-mean-square error (RMSE) is only 2.1 pixels. Our proposed framework offers a more efficient, faster, and more cost-effective alternative to the traditional manual measurement method.
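As a hedged illustration of the final measurement step (not the DL-MM code; the point layout and names are illustrative), the sketch below draws a predicted measurement line with OpenCV and computes the pixel RMSE against ground-truth coordinates, the metric reported above.

```python
# Hedged sketch: drawing a predicted measurement line and computing pixel RMSE.
import numpy as np
import cv2

def pixel_rmse(pred_pts: np.ndarray, gt_pts: np.ndarray) -> float:
    """Root-mean-square error, in pixels, between predicted and ground-truth coordinates."""
    return float(np.sqrt(np.mean(np.sum((pred_pts - gt_pts) ** 2, axis=1))))

def draw_measurement(image: np.ndarray, pred_pts: np.ndarray) -> np.ndarray:
    """Draw a straight line between the two predicted endpoints for visual inspection."""
    out = image.copy()
    p1 = tuple(int(v) for v in pred_pts[0])
    p2 = tuple(int(v) for v in pred_pts[1])
    cv2.line(out, p1, p2, color=(0, 255, 0), thickness=2)
    return out

# Toy example: predicted vs. ground-truth endpoints on a blank canvas.
pred = np.array([[10.0, 50.0], [200.0, 52.0]])
gt = np.array([[12.0, 50.0], [198.0, 54.0]])
canvas = np.zeros((256, 256, 3), dtype=np.uint8)
print("RMSE (px):", pixel_rmse(pred, gt))
vis = draw_measurement(canvas, pred)
```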
{"title":"A Deep Learning-based Microsection Measurement Framework for Print Circuit Boards","authors":"Chia-Yu Lin, Chieh-Ling Li, Yu-Chiao Kuo, Yun-Chieh Cheng, C. Jian, Hsiang-Ting Huang, Mitchel M. Hsu","doi":"10.1109/IAICT59002.2023.10205911","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205911","url":null,"abstract":"Microsectioning is a destructive testing procedure used in the printed circuit board (PCB) fabrication industry to evaluate the quality of PCBs. During cross-section analysis, operators measure PCB component widths manually, which can lead to inconsistencies and make it challenging to establish standardized procedures. We propose a Deep Learning-based Microsection Measurement (DL-MM) Framework for PCB microsection samples to address this issue. The framework comprises four modules: the target detection module, the image preprocessing module, the labeling model, and the coordinate adaptation module. The target detection module is responsible for extracting the area of interest to be measured, which reduces the influence of surrounding noise and improves measurement accuracy. In the image preprocessing module, the target area image is normalized, labeled with coordinates, and resized to different sizes based on the class. The labeling model utilizes a convolutional neural network (CNN) model trained separately for each class to predict its punctuation, as the number of coordinates varies for each class. The final module is the coordinate adaptation module, which utilizes the predicted coordinates to draw a straight line on the expected image for improved readability. In addition, we evaluate the proposed framework on two types of microsections, and the experimental results show that the measurements’ root-mean-square error (RMSE) is only 2.1 pixels. Our proposed framework offers a more efficient, faster, and cost-effective alternative to the traditional manual measurement method.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131032726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205901
Title: Enhanced Visual Cryptosystem Using BLAKE2 Hash Algorithm
Authors: A. Bhuvaneshwari, P. Kaythry
Many physical devices in everyday life have been connected to the global web since its inception, and the need for network security grows as the number of connected items increases. However, current security measures make progress difficult. To address this, we propose a basic security mechanism based on the BLAKE2 hash algorithm. Our proposed method aims to improve the transfer of sensitive image data between nodes; the key issue is transmitting data across multiple nodes invisibly without being hacked. The proposed system’s primary objective is to keep the image secure and safe from third parties. This is accomplished by combining encryption and decryption into a lightweight image transport technique. We describe a technique for generating secret cryptographic keys from image pixels using the BLAKE2 cryptographic hash, making the keys image-content adaptive. The scheme includes three encryption processes: DC coefficient encryption, AC coefficient encryption, and a novel orthogonal transformation. The encrypted image is safely sent to another node over the network using an upgraded visual cryptosystem, and the decrypted image is successfully recovered at the receiver node.
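For the key-generation idea, a minimal sketch using Python's hashlib.blake2b is shown below; it only derives an image-content-adaptive key from the pixel bytes and does not reproduce the paper's DC/AC coefficient encryption or orthogonal transform. The salt parameter and digest size are assumptions.

```python
# Hedged sketch: deriving an image-content-adaptive secret key with BLAKE2b.
import hashlib
import numpy as np

def derive_image_key(pixels: np.ndarray, salt: bytes = b"", digest_size: int = 32) -> bytes:
    """Hash the raw pixel bytes with BLAKE2b to obtain a content-adaptive key."""
    h = hashlib.blake2b(digest_size=digest_size, salt=salt[:16])  # BLAKE2b salt is at most 16 bytes
    h.update(pixels.tobytes())
    return h.digest()

# Toy example: an 8x8 grayscale "image" and a per-node salt (both illustrative).
img = np.arange(64, dtype=np.uint8).reshape(8, 8)
key = derive_image_key(img, salt=b"node-42")
print(key.hex())
```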
{"title":"Enhanced Visual Cryptosystem Using BLAKE2 Hash Algorithm","authors":"A. Bhuvaneshwari, P. Kaythry","doi":"10.1109/IAICT59002.2023.10205901","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205901","url":null,"abstract":"Many physical devices in everyday life have been connected to the global web since its inception. The network’s security improves as the number of items connected to it grows. However, current security measures make progress difficult. As a result, based on the BLAKE 2 hash algorithm, we propose a basic security mechanism. Our proposed method aims to improve the transfer of sensitive image data between nodes. The key issue is transmitting data across multiple nodes invisibly without being hacked. The proposed system’s primary objective is to maintain the picture secure and safe from third parties. It is accomplished by combining encryption and decryption into a lightweight image transport technique. It describes a technique for generating secret cryptographic keys from image pixels using the BLAKE 2 cryptographic hash that is image content adaptive. This scheme includes three encryption processes: DC coefficient encryption, AC coefficient encryption, and novel orthogonal transformation. The encrypted image is safely sent to another node over the network using an upgraded visual cryptosystem, and the decrypted image is successfully obtained at the receiver node.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131159303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205596
Title: Temporal-Spatial Time Series Self-Attention 2D & 3D Human Motion Forecasting
Authors: Andi Prademon Yunus, Kento Morita, Nobu C. Shirai, Tetsushi Wakabayashi
The ability to forecast human motion is crucial for increasing awareness of moving objects in the environment. To address this challenge, this study focuses on human motion forecasting based on annotated 2D and 3D data and on the model’s usability with data obtained from pose estimation. This research presents the Temporal-Spatial Time Series Self-Attention method for human motion forecasting. The approach is evaluated on the Human 3.6M, 3DPW, and AMASS datasets using standard evaluation protocols. Our method performed well on the 2D ground-truth and pose-estimation data compared to the other time series methods, although it did not yet outperform previous research on 3D input data. Nevertheless, based on the quantitative and qualitative assessments, our approach demonstrated strong performance in predicting human motion over short- and long-term horizons.
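A minimal PyTorch sketch of temporal self-attention over a pose sequence is given below as an illustration of the general mechanism; the paper's full temporal-spatial architecture is not reproduced, and the joint count, embedding size, and single attention layer are assumptions.

```python
# Hedged sketch: temporal self-attention over a sequence of 2D joint coordinates.
import torch
import torch.nn as nn

class TemporalSelfAttention(nn.Module):
    def __init__(self, num_joints: int = 17, coord_dim: int = 2, embed_dim: int = 64, heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(num_joints * coord_dim, embed_dim)   # flatten each frame's pose
        self.attn = nn.MultiheadAttention(embed_dim, heads, batch_first=True)
        self.head = nn.Linear(embed_dim, num_joints * coord_dim)    # predict the next-frame pose

    def forward(self, poses: torch.Tensor) -> torch.Tensor:
        # poses: (batch, frames, joints * coord_dim) of observed motion
        x = self.embed(poses)
        x, _ = self.attn(x, x, x)          # each frame attends to every observed frame
        return self.head(x[:, -1])         # forecast one step ahead

model = TemporalSelfAttention()
past = torch.randn(8, 25, 17 * 2)          # 8 sequences, 25 frames, 17 joints in 2D
print(model(past).shape)                   # torch.Size([8, 34])
```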
{"title":"Temporal-Spatial Time Series Self-Attention 2D & 3D Human Motion Forecasting","authors":"Andi Prademon Yunus, Kento Morita, Nobu C. Shirai, Tetsushi Wakabayashi","doi":"10.1109/IAICT59002.2023.10205596","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205596","url":null,"abstract":"The ability to forecast human motion is crucial in increasing awareness of moving objects in the environment. To address this challenge, this study focuses on human motion forecasting based on annotated 2D and 3D data and the model’s usability on data obtained from pose estimation. This research presents the Temporal-Spatial Time Series Self-Attention method for human motion forecasting. The approach is evaluated using the Human 3.6M, 3DPW, and AMASS datasets based on standard evaluation protocols. Our method performed well in the 2D ground truth and pose estimation data compared to the other time series method. Our method did not yet outperform previous research in 3D input data. However, based on the quantitative and qualitative assessments, our approach demonstrated excellent performance in predicting human motion for short- and long-term objectives.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"160 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114044770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205854
Title: IAICT 2023 Cover Page
Authors: M. Nasrun
Abstracting is permitted with credit to the source. Libraries are permitted to photocopy, beyond the limit of U.S. copyright law, for private use of patrons, those articles in this volume that carry a code at the bottom of the first page, provided the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For reprint or republication permission, email the IEEE Copyrights Manager at pubs-permissions@ieee.org.
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205769
Title: Semantic Textual Similarity in Requirement Specification and Use Case Description based on Sentence Transformer Model
Authors: Meizan Arthur Alfianto, Y. Priyadi, K. A. Laksitowening
The compatibility between the Use Case Description (UCD) and the Functional Requirements (FR) is essential for successful software development. Nevertheless, discrepancies may occur if the UCD does not precisely reflect the intended functionalities specified in the FR. This paper uses a Sentence Transformer model to evaluate the alignment between the UCD and FR, both written in natural language. The study aims to identify potential discrepancies and ambiguities in the UCD and suggest modifications that improve its correspondence with the FR. The Sentence Transformer model quantifies the degree of alignment between the UCD and FR by analyzing semantic similarity. According to the findings, modifications to the UCD, such as refining terminology, elucidating definitions, and correcting writing errors, can substantially increase semantic similarity with the FR. A Pearson correlation coefficient of 0.70 indicates a positive linear correlation between the predicted and ground-truth semantic similarity. A Spearman rank correlation coefficient of 0.715 suggests a positive monotonic relationship, with the two text types maintaining their semantic-similarity ranking. The low mean squared error (MSE) of 0.024 demonstrates the model’s predictive accuracy for semantic similarity.
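As a hedged sketch of this style of evaluation (the sentence pairs, ground-truth scores, and the all-MiniLM-L6-v2 checkpoint are illustrative assumptions, not taken from the paper), the sentence-transformers library can score UCD/FR pairs, and the three reported metrics can be computed with scipy and scikit-learn:

```python
# Hedged sketch: scoring UCD/FR sentence pairs and computing the three reported metrics.
from sentence_transformers import SentenceTransformer, util
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics import mean_squared_error

model = SentenceTransformer("all-MiniLM-L6-v2")      # illustrative checkpoint, not named in the paper

ucd = ["The user submits the login form.",
       "The admin exports the monthly report.",
       "The customer adds an item to the cart."]
fr = ["The system shall allow users to authenticate with a username and password.",
      "The system shall generate monthly reports in PDF format.",
      "The system shall let customers manage a shopping cart."]
ground_truth = [0.70, 0.80, 0.75]                    # toy annotated similarity scores

emb_ucd = model.encode(ucd, convert_to_tensor=True)
emb_fr = model.encode(fr, convert_to_tensor=True)
predicted = util.cos_sim(emb_ucd, emb_fr).diagonal().cpu().numpy()  # one score per UCD/FR pair

print("Pearson:", pearsonr(predicted, ground_truth)[0])
print("Spearman:", spearmanr(predicted, ground_truth).correlation)
print("MSE:", mean_squared_error(ground_truth, predicted))
```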
{"title":"Semantic Textual Similarity in Requirement Specification and Use Case Description based on Sentence Transformer Model","authors":"Meizan Arthur Alfianto, Y. Priyadi, K. A. Laksitowening","doi":"10.1109/IAICT59002.2023.10205769","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205769","url":null,"abstract":"The compatibility between the Use Case Description (UCD) and the Functional Requirements (FR) is essential for the successful development of software. Nevertheless, discrepancies may occur if the UCD does not precisely reflect the intended functionalities specified in the FR. This paper uses a Sentence Transformer Model to evaluate the alignment between the UCD and FR, both written in natural language. The study aims to identify potential discrepancies and ambiguities in the UCD and suggest modifications to better their correspondence with the FR. The Sentence Transformer Model quantifies the degree of alignment between the UCD and FR by analyzing semantic similarity. According to the findings, modifications to the UCD, such as refining terminology, elucidating definitions, and correcting writing errors, can substantially increase semantic similarity with the FR. The Pearson correlation coefficient of 0.70 indicates the correlation between the predicted and the ground truth of semantic similarity is linearly positive. The Spearman rank correlation coefficient value of 0.715 suggests a positive monotonic relationship, with the two text types maintaining their rank of semantic similarity. The low mean squared error (MSE) value of 0.024 demonstrates the model’s predictive accuracy for semantic similarity.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130262940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205705
Title: Two Fold Cluster Head Selection in Wireless Sensor Networks
Authors: Tabinda Ashraf, M. Iqbal, Steven S. W. Lee, Jen-Yi Pan
Energy-efficient routing protocols are a key requirement of today’s wireless sensor networks. Various protocols have been developed to create energy-efficient wireless sensor networks, but there are still shortcomings in this area. During cluster formation, some nodes are left out; these are referred to as lone nodes, which communicate directly with the Base Station (BS) and consume a significant portion of the energy. To overcome this issue, this study proposes a Two-Fold Cluster Head Selection (TFCHS) routing algorithm that reduces the number of lone nodes and extends the network’s lifetime. The proposed algorithm is based on the LEACH-B (LEACH-Balanced) and Residual Energy (ResEn) protocols. In TFCHS, lone-node cluster formation is achieved by identifying the location of lone nodes and comparing their distance to a threshold distance. Cluster Heads (CHs) are then selected and broadcast TDMA slots to their member nodes in the steady-state phase, during which nodes send their data to the CH. The CHs process the received data and forward it to the BS. The proposed work performed better in terms of average aggregation energy, lone nodes, consumed energy, network lifetime, and effective packets.
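A small hedged sketch of the lone-node handling idea follows (the threshold value, coordinates, and nearest-CH rule are illustrative assumptions rather than the paper's exact procedure): a lone node joins the nearest cluster head when it lies within the threshold distance and otherwise keeps transmitting directly to the base station.

```python
# Hedged sketch: attaching lone nodes to the nearest cluster head within a distance threshold.
import numpy as np

def attach_lone_nodes(lone_nodes: np.ndarray, cluster_heads: np.ndarray, d_threshold: float):
    """Return, for each lone node, the index of the CH it joins, or -1 for direct-to-BS."""
    assignments = []
    for node in lone_nodes:
        dists = np.linalg.norm(cluster_heads - node, axis=1)   # distance to every CH
        nearest = int(np.argmin(dists))
        assignments.append(nearest if dists[nearest] <= d_threshold else -1)
    return assignments

# Toy 100 m x 100 m field: three lone nodes, two cluster heads, 30 m threshold (all illustrative).
lone = np.array([[10.0, 10.0], [90.0, 90.0], [50.0, 95.0]])
chs = np.array([[20.0, 15.0], [55.0, 60.0]])
print(attach_lone_nodes(lone, chs, d_threshold=30.0))          # -> [0, -1, -1]
```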
{"title":"Two Fold Cluster Head Selection in Wireless Sensor Networks","authors":"Tabinda Ashraf, M. Iqbal, Steven S. W. Lee, Jen-Yi Pan","doi":"10.1109/IAICT59002.2023.10205705","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205705","url":null,"abstract":"Energy efficient routing protocol is the requirement of today’s wireless sensor networks. Various protocols have been developed in order to create an energy efficient wireless sensor networks, but there are still some shortcomings in this area. During cluster formation, some nodes are left alone and referred to as lone nodes, which directly communicate with the Base Station (BS) and consume a significant portion of the energy. To overcome this issue, this study proposes a Two-Fold Cluster Head Selection (TFCHS) routing algorithm that reduces the number of lone nodes and enhances the network’s lifetime. The proposed algorithm is based on the LEACH-B (LEACH-Balanced) and Residual Energy (ResEn) protocols. In TFCHS, lone node cluster formation is achieved by identifying the location of lone nodes and comparing their distance to a threshold distance. Cluster Heads (CHs) are then selected, and they broadcast TDMA slots to their member nodes in a steady phase, where nodes send their data to the CH. The CHs process the received data and send it to the BS. The proposed work performed better in terms of average aggregation energy, lone nodes, consumed energy, network lifetime, and effective packets.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121395420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205899
Title: Survey on Computer Vision Techniques for Internet-of-Things Devices
Authors: I. Kaur, Adwaita Janardhan Jadhav
Deep neural networks (DNNs) are state-of-the-art techniques for solving most computer vision problems, but they require billions of parameters and operations to achieve state-of-the-art results. This requirement makes DNNs extremely compute-, memory-, and energy-hungry, and consequently difficult to deploy on small battery-powered Internet-of-Things (IoT) devices with limited computing resources. Deploying DNNs on IoT devices such as traffic cameras can improve public safety by enabling applications such as automatic accident detection and emergency response. In this paper, we survey recent advances in low-power and energy-efficient DNN implementations that improve the deployability of DNNs without significantly sacrificing accuracy. In general, these techniques reduce the memory requirements, the number of arithmetic operations, or both, and can be divided into three major categories: (1) neural network compression, (2) network architecture search and design, and (3) compiler and graph optimizations. We survey low-power techniques for both convolutional and transformer DNNs and summarize their advantages, disadvantages, and open research problems.
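As one concrete example from the first surveyed category (neural network compression), the hedged sketch below applies PyTorch's dynamic post-training quantization to a toy fully connected model; the model is a stand-in, not a network discussed in the survey.

```python
# Hedged sketch: post-training dynamic quantization, one example of the compression category.
import io
import torch
import torch.nn as nn

# A toy classifier standing in for a vision model's fully connected layers.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Quantize the Linear weights to int8; activations are quantized dynamically at runtime.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)                  # torch.Size([1, 10])

def state_dict_bytes(m: nn.Module) -> int:
    """Serialized parameter size, to show the memory reduction from quantization."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

print(state_dict_bytes(model), "->", state_dict_bytes(quantized))
```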
{"title":"Survey on Computer Vision Techniques for Internet-of-Things Devices","authors":"I. Kaur, Adwaita Janardhan Jadhav","doi":"10.1109/IAICT59002.2023.10205899","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205899","url":null,"abstract":"Deep neural networks (DNNs) are state-of-the-art techniques for solving most computer vision problems. DNNs require billions of parameters and operations to achieve state-of-the-art results. This requirement makes DNNs extremely compute, memory, and energy-hungry, and consequently difficult to deploy on small battery-powered Internet-of-Things (IoT) devices with limited computing resources. Deployment of DNNs on Internet-of-Things devices, such as traffic cameras, can improve public safety by enabling applications such as automatic accident detection and emergency response. Through this paper, we survey the recent advances in low-power and energy-efficient DNN implementations that improve the deployability of DNNs without significantly sacrificing accuracy. In general, these techniques either reduce the memory requirements, the number of arithmetic operations, or both. The techniques can be divided into three major categories: (1) neural network compression, (2) network architecture search and design, and (3) compiler and graph optimizations. In this paper, we survey both low-power techniques for both convolutional and transformer DNNs, and summarize the advantages, disadvantages, and open research problems.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116249404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-13 | DOI: 10.1109/IAICT59002.2023.10205826
Title: Classification of Cervical Cell Images into Healthy or Cancer Using Convolution Neural Network and Linear Discriminant Analysis
Authors: Mohammad Sholik, C. Fatichah, B. Amaliah
Cervical cancer accounts for a large share of cancer deaths in women, representing nearly 12% of all cancers and posing a high risk of death for women worldwide. If precancerous lesions are found early, the disease can be cured. Pap smear screening is known for its reliability and effectiveness in detecting cervical cell abnormalities early, but manual image analysis carries a risk of error. Deep learning approaches in medicine and healthcare can power decision support systems that remove bias from observations. This paper presents a framework that combines deep learning with feature dimensionality reduction. The framework captures deep features from a convolutional neural network (CNN) model and employs linear discriminant analysis (LDA) for feature reduction to lower computational cost. The feature space derived from the CNN model is large and contains redundant features, which the reduction step eliminates. The LDA-reduced features are used to train three classifiers, namely SVM, MLP, and K-NN, to generate the final predictions. The proposed framework was evaluated on three openly accessible datasets, the Herlev, Mendeley, and SIPaKMeD datasets, achieving classification accuracies of 95.65% (SVM and MLP), 100% (MLP), and 97.54% (K-NN), respectively.
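A hedged scikit-learn sketch of the reduce-then-classify stage described above follows; the random features stand in for CNN activations, and the LDA component count and classifier settings are illustrative assumptions rather than the paper's configuration.

```python
# Hedged sketch: LDA reduction over precomputed CNN features, then SVM / MLP / K-NN classifiers.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# X stands in for CNN features (e.g., a backbone's penultimate layer); y for cell-class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 512))
y = rng.integers(0, 5, size=300)                     # five illustrative cervical-cell classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# LDA projects to at most (n_classes - 1) dimensions, discarding redundant CNN features.
lda = LinearDiscriminantAnalysis(n_components=min(4, len(np.unique(y_tr)) - 1)).fit(X_tr, y_tr)
X_tr_r, X_te_r = lda.transform(X_tr), lda.transform(X_te)

for name, clf in [("SVM", SVC()), ("MLP", MLPClassifier(max_iter=500)), ("K-NN", KNeighborsClassifier())]:
    clf.fit(X_tr_r, y_tr)
    print(name, "accuracy:", accuracy_score(y_te, clf.predict(X_te_r)))
```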
{"title":"Classification of Cervical Cell Images into Healthy or Cancer Using Convolution Neural Network and Linear Discriminant Analysis","authors":"Mohammad Sholik, C. Fatichah, B. Amaliah","doi":"10.1109/IAICT59002.2023.10205826","DOIUrl":"https://doi.org/10.1109/IAICT59002.2023.10205826","url":null,"abstract":"Cancer of the cervix is the disease that accounts for the majority of deaths in women. This disease accounts for nearly 12% of all cancers and has a high risk of death for women worldwide. If precancerous lesions are found early, the disease can be cured. Pap smear screening is known for its reliability and effectiveness in detecting cervical cell abnormalities early, but there is a risk of errors in manual image analysis. Using deep learning approaches in the domains of medicine and healthcare can be used for decision support systems to remove bias from observations. This paper presents a framework that utilizes deep learning and techniques to reduce the dimensions of features. The suggested framework captures deep features from a convolutional neural network (CNN) model and employs a feature reduction approach using linear discriminant analysis (LDA) to ensure computational cost reduction. The feature dimension derived from the CNN model produces a huge feature space that requires a feature reduction to eliminate redundant features. The features that have been reduced by linear discriminant analysis are used for the training of three classifiers, namely SVM, MLP, and K-NN, to generate final predictions. The evaluation of the proposed framework involved the utilization of three datasets that are openly accessible: the Herlev dataset, the Mendeley dataset, and the SIPaKMeD dataset, which achieved classification accuracies of 95.65% (SVM and MLP), 100% (MLP), and 97.54 (K-NN), respectively.","PeriodicalId":339796,"journal":{"name":"2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"33 7-8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116476095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}