Diagnosis of COVID-19 cases from viral pneumonia and normal ones based on transfer learning approach: Xception-GRU
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147183
Shahla Najaflou, Fatemeh Sadat Lesani
The World Health Organization (WHO) has found it difficult to characterize the spread of critical symptoms of COVID-19 because the virus behaves so differently from patient to patient. Many people experience symptoms only when the disease reaches an acute stage, while others show no symptoms at all. Lung scan images are one way to distinguish COVID-19 from similar diseases such as pneumonia; however, because the coronavirus is novel and its pulmonary complications resemble those of other diseases, physicians can misdiagnose it. In this paper, we use 13,967 lung scan images to distinguish COVID-19 cases from viral pneumonia and normal ones. We propose an Xception-based transfer learning approach that extracts deep features from each image using depthwise separable convolutions. We extend the Xception architecture with a Gated Recurrent Unit (GRU) and a fully connected layer, and fine-tune the model to learn a more abstract representation of the features for classification. The results show the effectiveness of the proposed hybrid method in detecting COVID-19 cases among normal and viral pneumonia cases, with an accuracy of 95.71% and a precision of 94.24%, improving on state-of-the-art results.
{"title":"Diagnosis of COVID-19 cases from viral pneumonia and normal ones based on transfer learning approach: Xception-GRU","authors":"Shahla Najaflou, Fatemeh Sadat Lesani","doi":"10.1109/IPRIA59240.2023.10147183","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147183","url":null,"abstract":"The World Health Organization (WHO) considered it difficult to describe the information about the spread of critical symptoms of the Coronavirus due to the different behaviors of the COVID −19 virus. Most people only experience symptoms when the symptoms of the Coronavirus reach an acute stage, and others do not experience any symptoms at all. Lung scan images are one of the ways to distinguish COVID-19 from other similar diseases, such as pneumonia. The emerging novel of the coronavirus and the similarity of pulmonary complications cause the doctor to misdiagnose. In this paper, we utilize 13967 samples of lung scan images to diagnose COVID-19 cases from viral pneumonia and normal ones. This paper proposes an Xception based transfer learning approach to extract the deep features of each image based on depthwise separable convolutions. We extend the Xception architecture by adding a Gated Recurrent Unit (GRU) and a fully connected layer and fine-tune the model to adjust a more abstract representation of features to classify them. The obtained results show the effectiveness of our proposed hybrid method in detecting cases of COVID-19 from normal and viral pneumonia with an accuracy and precision of 95.71% and 94.24%, respectively, which improves the state-of-the-art results.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126560309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Application of Explainable Convolutional Neural Networks on the Differential Diagnosis of Covid_19 and Pneumonia using Chest Radiograph
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147169
Fereshteh Zandi, H. Ebrahimpour-Komleh, Hassan Homayoun
Covid_19 is one of the deadliest inflammatory, chronic, and acute diseases of the human respiratory system, caused by a virus called corona that attacks the respiratory organs; the virus spreads rapidly and has affected many people around the world. Diagnosing the disease from X-ray images requires careful evaluation by a specialist, but the number of Covid_19 patients exceeds hospital capacity, and caring for so many people is tedious work that can reduce a doctor's diagnostic accuracy. Moreover, the absence of a specialist doctor can lead to misdiagnosis and incorrect prescribing. In this article, we provide an approach that automatically accelerates the diagnosis process and reduces the workload of specialists; besides assisting physicians, it allows patients to be diagnosed and treated in hospitals that have no specialist physician. We use a pre-trained UNet to segment the lung regions, which removes noise and irrelevant parts of the X-ray image; we then feed the resulting images to a convolutional neural network designed to classify Covid_19 versus Pneumonia; finally, we use the Grad-CAM, Vanilla Gradient, and SmoothGrad techniques to validate the designed model. According to the evaluation metrics, our proposed approach achieved the highest accuracy in distinguishing Covid_19 from Pneumonia.
{"title":"Application of Explainable Convolutional Neural Networks on the Differential Diagnosis of Covid_19 and Pneumonia using Chest Radiograph","authors":"Fereshteh Zandi, H. Ebrahimpour-Komleh, Hassan Homayoun","doi":"10.1109/IPRIA59240.2023.10147169","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147169","url":null,"abstract":"The Covid_19 disease is one of the deadliest inflammatories and chronic and acute diseases of the human respiratory system, which is the result of the inhibition of a virus called corona in the respiratory organs since the spread of this virus is rapid and has affected many people in the world. a specialist needs to be carefully evaluated to diagnose the disease based on X-ray images because the number of patients with Covid_19 exceeds the capacity of hospitals and taking care of a large number of people is tedious work that can reduce the accuracy of the doctor in diagnosing the disease. In addition, in such cases, the absence of a specialist doctor can lead to misdiagnosis and incorrect prescribing. In this article, we intend to provide an approach to accelerate the diagnosis process and reduce the workload of specialists automatically, which in addition to helping physicians in hospitals that do not have a specialist physician, also allows patients to be diagnosed and treated. we use pre-trained UNet to extract the lung balloons, which eliminates the extra noise and parts in the X-ray image and then we give the generated images to a convolutional neural network model designed to diagnose and classify Covid_19 disease from Pneumonia, and finally, we use Grad-CAM and Vanilla Gradient and Smooth Grad techniques to validate the designed model. according to the results, our proposed approach using evaluation metrics was able to achieve the highest degree of accuracy in distinguishing Covid_19 disease from Pneumonia.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126414468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Empirical Mode Decomposition Based Morphological Profile For Hyperspectral Image Classification
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147181
Kosar Amiri, M. Imani, H. Ghassemian
An empirical mode decomposition (EMD) based morphological profile (MP), called EMDMP, is proposed for hyperspectral image classification in this work. The EMD algorithm decomposes the nonlinear spectral feature vector into intrinsic components and a residual term. To extract the main spatial characteristics and shape structures, closing operators are applied to the intrinsic components; in contrast, to extract details and more abstract contextual features, opening operators are applied to the residual component. Finally, a multi-resolution morphological profile is formed by concatenating the closing profile of the intrinsic components with the opening profile of the residual component. EMDMP achieves 96.54% overall accuracy, compared to 95.15% for a convolutional neural network (CNN), on the Indian Pines dataset with 10% training samples. On the University of Pavia dataset with 1% training samples, EMDMP reaches 97.66% overall accuracy, compared to 95.90% for the CNN.
{"title":"Empirical Mode Decomposition Based Morphological Profile For Hyperspectral Image Classification","authors":"Kosar Amiri, M. Imani, H. Ghassemian","doi":"10.1109/IPRIA59240.2023.10147181","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147181","url":null,"abstract":"The empirical mode decomposition (EMD) based morphological profile (MP), called as EMDMP, is proposed for hyperspectral image classification in this work. The EMD algorithm can well decompose the nonlinear spectral feature vector to intrinsic components and the residual term. To extract the main spatial characteristics and shape structures, the closing operators are applied to the intrinsic components. In contrast, to extract details and more abstract contextual features, the opening operators are applied to the residual component. Finally, a multi-resolution morphological profile is provided with concatenation of the intrinsic components-based closing profile and residual component based opening profile. EMDMP achieves 96.54% overall accuracy compared to 95.15% obtained by convolutional neural network (CNN) on Indian dataset with 10% training samples. In University of Pavia with 1% training samples, EMDMP results in 97.66% overall accuracy compared to 95.90% obtained by CNN.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121951111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quality Assessment of Screen Content Videos
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147176
Hossein Motamednia, Pooryaa Cheraaqee, Azadeh Mansouri, Ahmad Mahmoudi-Aznaveh
Perceptual quality assessment has always been challenging due to the difficulty of modeling the non-linear human visual system. Given the diversity of multimedia content, conventional methods designed for traditional media no longer seem adequate. One of these emerging media is screen content images/videos (SCIs/SCVs). Containing text and computer-generated graphics, SCVs cannot be sufficiently described by features designed for natural scenes. Therefore, recent research has tried to devise objective quality assessment metrics specifically for screen content. Recently, a dataset was proposed for quality assessment of screen content videos. Since screen content is full of structures that spread in the cardinal directions, we were motivated to employ the horizontal and vertical subbands of the wavelet transform to characterize these types of visual content. The features were incorporated in a full-reference method that showed promising results on the publicly available dataset for SCV quality assessment. The method can be accessed via https://github.com/motamedNia/QASCV.
{"title":"Quality Assessment of Screen Content Videos","authors":"Hossein Motamednia, Pooryaa Cheraaqee, Azadeh Mansouri, Ahmad Mahmoudi-Aznaveh","doi":"10.1109/IPRIA59240.2023.10147176","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147176","url":null,"abstract":"Perceptual quality assessment has always been challenging due to the difficulty in modeling the no-linear human visual system. With the diversity in the contents of multimedia signals, the conventional methods for traditional media seems no longer satisfying. One of these emerging media, is the screen content images/videos (SCINs), Containing texts and computer generated graphics, SCVs cannot be sufficiently expressed with features designed for natural sceneries. Therefore, new researches tried to devise objective quality assessment metrics, specificly for screen contents. Recently, a dataset was proposed for quality assessment of screen content videos. Since screen contents are full of structures that spread in cardinal directions, we were motivated to employ the horizontal and vertical subbands of the wavelet transform to characterize these types of visual contents. The features were incorporated in a full-reference method that showed promising results on the publicly available dataset for SCV quality assessment. The method can bo accessed via: https://github.com/motamedNia/QASCV.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114339680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Few-shot Learning with Prompting Methods
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147172
Morteza Bahrami, Muharram Mansoorizadeh, Hassan Khotanlou
Today, labeled data is essential in natural language processing; however, obtaining an adequate amount of it is challenging. For many tasks it is difficult to collect the required training data. In machine translation, for example, we need a large amount of data in the target language for acceptable performance, and we may not be able to collect it. Hence, we need few-shot learning. Recently, a method called prompting has emerged, in which a text input is converted, using a certain format, into text with a new structure that contains a blank slot. Given the prompted text, a pre-trained language model fills the slot with the best word. Prompting can help with few-shot learning, and even with cases where there is no data at all, i.e., zero-shot learning. Recent works combine large language models such as GPT-2 and GPT-3 with prompting to perform tasks such as machine translation without any labeled training data, but models with such massive numbers of parameters require powerful hardware. Pattern-Exploiting Training (PET) and iterative Pattern-Exploiting Training (iPET) were therefore introduced; they perform few-shot learning using prompting with smaller pre-trained language models such as BERT and RoBERTa. For example, on the Yahoo text classification dataset, iPET with RoBERTa reaches 70% accuracy using only ten labeled examples. This paper reviews research on few-shot learning with this new paradigm in natural language processing, which we dub prompt-based learning or, in short, prompting.
{"title":"Few-shot Learning with Prompting Methods","authors":"Morteza Bahrami, Muharram Mansoorizadeh, Hassan Khotanlou","doi":"10.1109/IPRIA59240.2023.10147172","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147172","url":null,"abstract":"Today, in natural language processing, labeled data is important, however, getting adequate amount of data is a challenging step. There are many tasks for which it is difficult to obtain the required training data. For example, in machine translation, we need to prepare a lot of data in the target language, so that the work performance is acceptable. We may not be able to collect useful data in the target language. Hence, we need to use few-shot learning. Recently, a method called prompting has evolved, in which text inputs are converted into text with a new structure using a certain format, which has a blank space. Given the prompted text, a pre-trained language model replaces the space with the best word. Prompting can help us in the field of few-shot learning; even in cases where there is no data, i.e. zero-shot learning. Recent works use large language models such as GPT-2 and GPT-3, with the prompting method, performed tasks such as machine translation. These efforts do not use any labeled training data. But these types of models with a massive number of parameters require powerful hardware. Pattern-Exploiting Training (PET) and iterative Pattern-Exploiting Training (iPET) were introduced, which perform few-shot learning using prompting and smaller pre-trained language models such as Bert and Roberta. For example, for the Yahoo text classification dataset, using iPET and Roberta and ten labeled datasets, 70% accuracy has been reached. This paper reviews research works in few-shot learning with a new paradigm in natural language processing, which we dub prompt-based learning or in short, prompting.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123735309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Converge intra-class and Diverge inter-class features for CNN-based Event Detection in football videos
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147187
Amirhosein Zanganeh, E. Sharifi, M. Jampour
Football event detection in videos is very challenging, and the Penalty and Free-kick events, which share common visual elements, are especially severe and critical cases. The elements common to the two events cause common, uninformative features to be extracted, so the error in recognizing and separating these two events is higher than for other events. In this paper, we present a new method for filtering the input data so that intra-class features converge and inter-class features diverge, increasing classification accuracy. For this purpose, using the IAUFD dataset, we evaluate the images of the Penalty and Free-kick classes with a structural similarity criterion. Based on the mean and standard deviation of each class, inappropriate images are discarded. This filtering excludes the shared, uninformative features from the learning process. The results show that a deep neural network trained on the filtered images distinguishes the two events more accurately than the same network trained on all the images.
{"title":"Converge intra-class and Diverge inter-class features for CNN-based Event Detection in football videos","authors":"Amirhosein Zanganeh, E. Sharifi, M. Jampour","doi":"10.1109/IPRIA59240.2023.10147187","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147187","url":null,"abstract":"Football event detection in videos is very challenging, but challenges on the Penalty and the Free-kick, which have common visual elements, are severe and critical. The existence of common elements between two events causes the extraction of common and ineffective features in recognizing these two events. As a result, the error of recognizing and separating these two events is more than other events. In this paper, we present a new method for filtering the input data to converge the intra-class features and diverge the inter-class features to increase the classification accuracy. For this purpose, using the IAUFD Dataset, we have evaluated images for the Penalty and the Free-kick classes with the criterion of structural similarity. Based on the results, inappropriate images have been ignored according to the average value and standard deviation of each class of data. This filtration leads to ignore of ineffective and common features in the learning process. The results of the proposed method indicate an improvement in the accuracy of distinguishing between two Penalty and Free-kick events using a deep neural network and filtered training images compared to the deep neural network using all training images.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130259376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deit Model for Iranian Traffic Sign Recognition in Advanced Driver Assistance Systems
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147174
Marjan Shahchera, Hossein Ebrahimpour-Komleh
Accurate traffic sign detection is critical for self-driving cars and for driver assistance while the car is moving, so building a high-accuracy system for interpretation and immediate decision-making is both challenging and necessary. In this research, applying the recent Data-efficient image Transformer (DeiT) approach, we design a system that recognizes Iranian traffic signs. We trained our model on two collections of traffic sign images (GTSRB and PTSD) and, under optimal conditions, reached accuracies of 99.5% and 98.8%, respectively.
{"title":"Deit Model for Iranian Traffic Sign Recognition in Advanced Driver Assistance Systems","authors":"Marjan. Shahchera, Hossein Ebrahimpour-komleh","doi":"10.1109/IPRIA59240.2023.10147174","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147174","url":null,"abstract":"Due to the important relationship between the impact of accurate detection of traffic signs in self-driving cars and driver assistance during car movement, it is very challenging and necessary to create a high-accuracy system for interpretation and immediate decision-making. In this research, by applying the new vision transformer Deit approach, a system is designed that can recognize Iranian traffic signs. We trained our model with two collections of traffic sign images (GTSRB and PTSD) that reached higher accuracy levels of 99.5% and 98.8%, respectively, in optimal conditions.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124983849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contextual Information Classification of Remotely Sensed Images
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147193
Alirza Dori, H. Ghassemian, M. Imani
This work proposes a multidisciplinary contextual information extraction and decision fusion approach for increasing classification accuracy. It improves image classification by integrating the results of various classifiers. The proposed method is implemented in three steps: 1) contextual feature extraction using four different methods: a) the Gray-Level Co-occurrence Matrix, b) Gabor filters, c) Laplacian of Gaussian filters, and d) Gaussian derivative functions; 2) classification of the contextual features using four different classification rules (ML, Tree, KNN, and SVM), using only 2% of the data to train the classifiers; and 3) finally, decision fusion using six decision fusion rules. Experimental results on real remotely sensed images are presented.
{"title":"Contextual Information Classification of Remotely Sensed Images","authors":"Alirza Dori, H. Ghassemian, M. Imani","doi":"10.1109/IPRIA59240.2023.10147193","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147193","url":null,"abstract":"This work proposes a multidisciplinary contextual information extraction and decision fusion approach for increasing the classification accuracy. It improves the image classification with integrating the results of various classifiers. The proposed method is implemented in three-steps: 1) contextual feature extraction using four different feature extractors methods: a) Gray Level Cooccurrence Matrix, b) Gabor filters, c) Laplacian Gaussian filters and d) Gaussian Derivatives Functions; 2) classification of contextual features using four different classification rules (ML, Tree, KNN and SVM) by using only 2% of data for training the classifiers; and 3) finally, decision fusion using six decision fusion rules. The experimental results on real remotely sensed images have been presented.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"371 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132776062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
3D Image Annotation using Deep Learning and View-based Image Features
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147190
Mohammadiman Hosseinnia, A. Behrad
Assigning word labels to images using machine learning algorithms is called automatic image annotation. It is used in various applications, including media, medical, industrial, and archaeological fields. Several methods have been proposed for automatic image annotation, but most focus on 2D images. In this article, we propose a new approach to 3D image annotation using deep learning and view-based image features. The most challenging issue in automatic annotation of 3D images is extracting suitable features for image representation: 3D images are generally represented as polygon meshes, which are not suitable for deep learning. To counter this problem, we represent a 3D image as several view-based images captured from different viewpoints. This process converts the 3D image into a multi-channel 2D image that can be classified using image-based deep classification networks. We evaluated various classification networks for 3D image annotation, and the best architecture achieved an F1 score of 0.97.
{"title":"3D Image Annotation using Deep Learning and View-based Image Features","authors":"Mohammadiman Hosseinnia, A. Behrad","doi":"10.1109/IPRIA59240.2023.10147190","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147190","url":null,"abstract":"The act of assigning word labels to images using machine learning algorithms is called automatic image annotation. Automatic annotation of image is used in various applications like media, medical, industrial and archaeological fields. Several methods have been proposed for automatic annotation of images, but most of them are focused on 2D images. In this article, we propose a new approach for 3D image annotation using deep learning and view-based image features. The most challenging issue in the automatic annotation of 3D images is to extract suitable features for image representation. 3D images are generally presented in the form of polygon meshes that are not suitable for deep learning. To counter the problem, we represent 3D images as several view-based images that are captured from different views. This process converts a 3D image into a multi-channel 2D image that can be classified using image-based deep classification networks. We utilized various classification networks for 3D image annotation, and the results showed the F1 score of 0.97 for the best architecture.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115139565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bilingual COVID-19 Fake News Detection Based on LDA Topic Modeling and BERT Transformer
Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147179
Pouria Omrani, Zahra Ebrahimian, Ramin Toosi, M. Akhaee
The spread of fake news has become more prevalent with the popularity of social media and the variety of news circulating on it, so it is crucial to distinguish real news from fake. During the COVID-19 pandemic, numerous tweets, posts, and news items about the illness appeared in social and electronic media worldwide. This research presents a bilingual model that combines Latent Dirichlet Allocation (LDA) topic modeling with the BERT transformer to detect COVID-19 fake news in both Persian and English. First, datasets are prepared in Persian and English; the proposed method is then used to detect COVID-19 fake news on them. Finally, the model is evaluated with metrics such as accuracy, precision, recall, and F1-score. With this approach we achieve 92.18% accuracy, which shows that adding topic information to the pre-trained contextual representations given by the BERT network significantly improves performance on domain-specific instances. The results also show that our proposed approach outperforms previous state-of-the-art methods.
{"title":"Bilingual COVID-19 Fake News Detection Based on LDA Topic Modeling and BERT Transformer","authors":"Pouria Omrani, Zahra Ebrahimian, Ramin Toosi, M. Akhaee","doi":"10.1109/IPRIA59240.2023.10147179","DOIUrl":"https://doi.org/10.1109/IPRIA59240.2023.10147179","url":null,"abstract":"The spread of fake news has become more prevalent given the popularity of social media and the various news that circulates on it. As a result, it is crucial to discern between real and fake news. During the COVID-19 pandemic, there have been numerous tweets, posts, and news about this illness in social media and electronic media worldwide. This research presents a bilingual model combining Latent Dirichlet Allocation (LDA) topic modeling and the BERT transformer to detect COVID-19 fake news in both Persian and English. First, the dataset is prepared in Persian and English, and then the proposed method is used to detect COVID-19 fake news on the prepared dataset. Finally, the proposed model is evaluated using various metrics such as accuracy, precision, recall, and the f1-score. As a result of this approach, we achieve 92.18% accuracy, which shows that adding topic information to the pre-trained contextual representations given by the BERT network, significantly improves the solving of instances that are domain-specific. Also, the results show that our proposed approach outperforms previous state-of-the-art methods.","PeriodicalId":109390,"journal":{"name":"2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128522935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}