2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA): Latest Publications

PSO-based procedure to find number of clusters and better initial centroids for K-means algorithm: Image segmentation as case study

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147171

M. Zarei, A. Nickfarjam

In this paper, we propose a combination of the K-means algorithm and the Particle Swarm Optimization (PSO) method. The K-means algorithm is used for data clustering. On one hand, the number of clusters (K) in K-means must be set by an expert or found by trial and error. On the other hand, both the initial centroids and the number of clusters (K) influence the quality of the resulting grouping. Therefore, the proposed procedure uses PSO, with the Structural Similarity Index (SSIM) as the fitness function, to find the best value of K and better initial cluster centers. Because K varies, the number of initial centroids to be generated varies as well, so the length of the particles in PSO may differ from one iteration to the next. Both the proposed approach and standard K-means are evaluated on an image segmentation problem, and the experimental results show the superiority of the proposed approach.
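The fitness evaluation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a greyscale image, uses a simplified single-window SSIM instead of the usual windowed SSIM, and the names `global_ssim`, `kmeans_1d`, and `fitness` are hypothetical. A particle is simply an array of initial centroids, so its length is K. The PSO update loop itself is omitted.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """SSIM computed over the whole image as one window (a simplification)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def kmeans_1d(pixels, centroids, iters=10):
    """Plain Lloyd iterations on grey levels, starting from the given centroids."""
    centroids = np.asarray(centroids, dtype=np.float64).copy()
    for _ in range(iters):
        labels = np.abs(pixels[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(len(centroids)):
            if np.any(labels == k):  # leave empty clusters where they are
                centroids[k] = pixels[labels == k].mean()
    labels = np.abs(pixels[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, labels

def fitness(image, particle):
    """Score one particle: SSIM between the image and its K-means reconstruction.

    `particle` holds the initial centroids, so len(particle) is K (assumed encoding).
    """
    pixels = image.reshape(-1).astype(np.float64)
    centroids, labels = kmeans_1d(pixels, particle)
    segmented = centroids[labels].reshape(image.shape)
    return global_ssim(image, segmented)
```

Because particles of different lengths are scored by the same function, a swarm can hold candidates with different K side by side. How the paper updates velocities across particles of unequal length is not stated in the abstract.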

Citations: 0

Automatic summarization of Instagram social network posts by combining semantic and statistical approaches

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147186

Zainab Tabanmehr, Ehsan Akhtarkavan

The growing volume of data and text documents on the Internet, such as articles, web pages, books, and social network posts, creates a fundamental challenge in many areas of text processing known as “automatic text summarization”. Manually processing and summarizing large volumes of text is difficult, expensive, time-consuming, and in practice infeasible for human users. Text summarization systems are divided into extractive and abstractive categories. In extractive summarization, the final summary is assembled from the important sentences of the document without any change. This method can repeat sentences and can cause problems with pronoun references. In abstractive summarization, by contrast, the final summary is generated from the meaning of the sentences and words of the same document or of other documents. Many existing works have used extractive or abstractive methods to summarize collections of web documents, each with advantages and disadvantages in terms of the similarity or size of the results. In this research, by developing a crawler, extracting popular text posts from the Instagram social network, applying suitable pre-processing, and combining a set of extractive and abstractive algorithms, the researchers show how to use each abstractive algorithm and how to use extraction as a supplement to increase the accuracy of another algorithm. Observations on 820 popular text posts from the Instagram social network show that the proposed system achieves an accuracy of 80%.

Citations: 0

The Effect of Variance-Based Patch Selection on No-Reference Image Quality Assessment

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147195

S. F. Hosseini-Benvidi, Azadeh Mansouri

The objective of the No-Reference Image Quality Assessment (NR-IQA) is to evaluate the perceived image quality subjectively. Since there is no reference image, this is a challenging and unresolved issue. Convolutional neural networks (CNNs) have gained popularity in recent years and have outperformed many traditional techniques in the field of image processing. In order to overcome overfitting, a large percentage of deep learning based IQA methods work with tiny image patches and assess the quality of the entire image based on the average scores of patches. Patch extraction is one of the most crucial elements of CNN-based methods in quality assessment problems. Assuming that visual perception in humans is well suited to extract structural details from a scene, we analyzed the effect of feeding informative and structural patches to the quality framework. In this paper, a method for structural patch extraction is presented, which is based on the variance values of each patch. The obtained results show that the presented method has an acceptable improvement compared to the random patch selection. The proposed model has also performed well in cross-dataset experiments on common distortions, indicating the model's high generalizability. Additionally, the test was run on the flipped images, and the outcomes are satisfactory.
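The variance-based selection described in this abstract can be illustrated with a short sketch. It assumes a greyscale image, non-overlapping square patches, and a simple "keep the top N by variance" rule. The function name, patch size, and selection rule are illustrative assumptions, since the abstract does not give the exact criterion.

```python
import numpy as np

def select_patches_by_variance(image, patch_size=32, num_patches=8):
    """Split an image into non-overlapping patches and keep the highest-variance ones."""
    h, w = image.shape
    patches, coords = [], []
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
            coords.append((top, left))
    variances = np.array([p.var() for p in patches])
    # Indices of the patches sorted by decreasing variance, truncated to num_patches.
    order = np.argsort(variances)[::-1][:num_patches]
    return [patches[i] for i in order], [coords[i] for i in order]
```

Flat regions such as sky or walls have near-zero variance and are dropped first, which matches the abstract's motivation of feeding structural patches to the quality model rather than random ones.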

Citations: 0

Classification of Skin Cancer With Using Color-ILQP and MEETG

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147194

Laleh Armi, Hossein Ebrahimpour-komleh

Skin cancer is one of the most common forms of cancer in the world, and its incidence has grown dramatically over the past decades. Malignant melanoma is the deadliest type of skin cancer. Melanocytic nevi are benign, whereas melanoma is malignant. Most skin cancers are treatable in the early stages, so rapid diagnosis at an early stage is very important for a cure, and its importance is increasing day by day. Today, artificial intelligence can play an important role in medical image diagnosis. The aim of this paper is to present an automatic diagnosis system that can be deployed to help dermatologists identify melanoma, which may facilitate early detection and hence substantially reduce the mortality of this dangerous malignancy. We used image processing tools to diagnose melanoma skin cancer. In this paper, the improved local quinary pattern (ILQP) is used as the texture feature extraction method, and a mixture of ELM-based experts with a trainable gating network (MEETG) is used for skin cancer classification. Our proposed method achieved classification accuracies of 97.05% and 86.61% on the f and d datasets, respectively.

Citations: 0

Machine Learning Techniques During the COVID-19 Pandemic: A Bibliometric Analysis

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147175

Meysam Alavi, Arefeh Valiollahi, M. Kargari

The coronavirus (COVID-19) pandemic has encouraged researchers to publish significant scientific research in this field in reputable international citation databases. It is important to constantly identify and assess scientific outputs in order to learn more about the situation. One of the methods used for evaluating scientific research activities is scientometrics, which has many applications in describing, explaining and predicting the scientific status of researchers and research centers in various national and international fields. It also provides efficient methods for monitoring and ranking organizations, researchers, journals and countries. In recent years, various scientometric techniques, including co-word analysis, co-authorship networks and scientific networks, have been of great help in discovering the direction of researchers' output in the scientific domain and its hidden and overt dimensions. One of the most popular research areas since the COVID-19 epidemic started has been the use of artificial intelligence, and especially machine learning techniques, in the prediction, diagnosis and treatment of this disease. In this regard, 2659 documents published in the PubMed citation database since the start of the COVID-19 epidemic have been reviewed. The findings show that the United States, China, India and England are the countries that have cooperated the most with other countries. In addition, the results show that deep learning and CNNs were widely used in the researchers' studies.

Citations: 1

Challenges in natural language processing and natural language understanding by considering both technical and natural domains

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147185

Pouya Ardehkhani, Amir Vahedi, Hossein Aghababa

As deep learning has become more sophisticated, the use of AI in industry, academia, and other sectors has increased significantly. NLP is a part of the deep learning paradigm that offers different types of systems mainly related to human language understanding, meaning, and interpretation. Nowadays, NLP is used in several applications, including sentiment analysis, text categorization, and translation. These new uses have brought new challenges. This paper discusses the challenges of developing or creating an NLP model and the problems that arise in NLU. Moreover, the paper illustrates issues in both technical and natural domains that should be considered when deploying or creating NLP models or NLU systems.

Citations: 0

Deep perceptual similarity and Quality Assessment

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147170

Alireza Khatami, Ahmad Mahmoudi-Aznaveh

Measuring the perceptual similarity between two images is a long-standing problem. This assessment should mimic human judgments. Given the complexity of the human visual system, modeling human perception is challenging. On the other hand, recent approaches to low-level vision tasks, mostly based on supervised deep learning, require an appropriate loss for the backward pass. Per-pixel losses, such as MSE and MAE, between the output of the network and the ground-truth images were among the first choices. More complicated, commonly used similarity measures, in which the error is computed in a hand-designed feature space, are also employed. Furthermore, Deep Perceptual Similarity (DPS) metrics, where the similarity is measured in a deep feature space, have shown promising results. The features can be taken from a pre-trained model or from a model optimized for the task at hand. Recently, many studies have been conducted to thoroughly investigate DPS. In this research, we provide an in-depth analysis of the pros and cons of DPS in full-reference quality assessment. In addition, to compare different similarity measures, we propose a metric that aggregates various desired factors. Based on our experiments, we conclude that perceptual similarity is not directly related to classification accuracy. We also find that the outliers mostly contain high-frequency elements. The code and complete results can be found at: https://github.com/Alireza-Khatami/PerceptualQuality
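The difference between a per-pixel loss and a feature-space similarity can be made concrete with a small sketch. MSE and MAE below follow their standard definitions. `feature_distance` is a hypothetical helper that takes any feature extractor as a callable. In a DPS metric that extractor would be a deep network, which is not reproduced here.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def mae(a, b):
    """Mean absolute error between two arrays of the same shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean(np.abs(a - b)))

def feature_distance(a, b, extract):
    """Compare two images in a feature space instead of pixel space.

    `extract` is any callable mapping an image to a feature array (an assumption
    of this sketch, standing in for hand-designed or deep features).
    """
    return mse(extract(a), extract(b))
```

Swapping `extract` changes what the measure is sensitive to while the comparison itself stays the same, which is the design axis the abstract contrasts (pixel space, hand-designed features, deep features).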

Citations: 0

Audio-Visual Emotion Recognition Using K-Means Clustering and Spatio-Temporal CNN

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147192

Masoumeh Sharafi, M. Yazdchi, J. Rasti

Emotion recognition is a challenging task due to the emotional gap between subjective feeling and low-level audio-visual characteristics. Thus, the development of a feasible approach for high-performance emotion recognition might enhance human-computer interaction. Deep learning methods have enhanced the performance of emotion recognition systems in comparison to other current methods. In this paper, a multimodal network combining a deep convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network is proposed, which fuses the audio and visual cues in a deep model. The spatial and temporal features extracted from video frames are fused with short-time Fourier transform (STFT) features extracted from audio signals. Finally, a Softmax classifier is used to classify inputs into seven groups: anger, disgust, fear, happiness, sadness, surprise, and neutral. The proposed model is evaluated on the Surrey Audio-Visual Expressed Emotion (SAVEE) database and reaches an accuracy of 95.48%. Our experimental study shows that the suggested method is more effective than existing algorithms for emotion recognition on this dataset.
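The audio branch relies on a short-time Fourier transform of the signal. A minimal NumPy sketch of an STFT magnitude spectrogram is shown below. The frame length, hop size, and Hann window are illustrative assumptions, as the abstract does not state the parameters used.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: windowed frames followed by a real FFT per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Result has shape (n_frames, frame_len // 2 + 1): time along rows, frequency along columns.
    return np.abs(np.fft.rfft(frames, axis=1))
```

The resulting time-frequency array can be treated like an image, which is what makes it convenient to fuse with features from a CNN-based visual branch.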

Citations: 0

Feature Extraction and Classification of Respiratory Sound and Lung Diseases

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147191

Seyed Amir Latifi, H. Ghassemian, M. Imani

Bacteria, viruses, and fungi can cause respiratory infections. It is usually possible to detect respiratory diseases early by listening to the lung sounds with a stethoscope. In practice, lung sound analysis is a time-consuming and difficult task that depends on medical skill and recognition experience. Recent advances in automatic respiratory sound recognition and classification have attracted growing attention. The worldwide outbreak of COVID-19 and the high number of patients have placed a great deal of pressure on medical professionals. A smart algorithm is therefore needed to detect lung infections faster and more accurately by automatically processing lung sounds. This paper proposes two new lung sound feature extraction methods: the maximum entropy Gabor filter bank (MAGFB) and the maximum entropy Mel filter bank (MAMFB). Classification is performed by a deep convolutional neural network (DCNN), using 50% of the data to train the classifier. The filter banks are substituted for the convolutional layers. Experiments were conducted on the ICBHI 2017 Challenge dataset (with eight classes). The proposed methods perform better than well-known methods such as MFCC and the wavelet transform, and the performance of the second method is particularly notable. On the ICBHI 2017 Challenge dataset, the overall accuracies of MFCC, Wavelet, MAGFB and MAMFB were 87%, 86%, 90% and 93%, respectively.

Citations: 0

Hippocampus segmentation in MR brain images using learned fuzzy mask and U-Net

2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA)

Pub Date : 2023-02-14 DOI: 10.1109/IPRIA59240.2023.10147188

Alireza Sadeghi, Hassan Khutanlou

The hippocampus is an important part of the human brain that is damaged in diseases such as Alzheimer's, schizophrenia, and epilepsy. This paper presents a new hippocampus segmentation method that is applicable to the early diagnosis of these diseases. The method uses a two-stage model to detect the hippocampus region in brain MR images. In the first stage, a U-Net neural network roughly locates the hippocampus, and a fuzzy function then creates a fuzzy mask around the detected area. In the second stage, this mask is applied to the brain images and a U-Net neural network segments the masked images, producing the final prediction of the hippocampus location. The main idea and advantage of the method is the use of a pre-trained fuzzy mask, which improves segmentation quality. The proposed method was trained and tested on the HARP dataset, which contains 135 T1-weighted MRI volumes, and reached a Dice score of 0.95 in the best case.
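The masking step and the Dice metric mentioned in the abstract above can be illustrated with a small sketch. The abstract does not state which fuzzy function the authors use, so the Gaussian falloff here is an assumption, as are the function names and the `sigma` parameter. The coarse mask stands in for the first U-Net's output; no network is implemented.

```python
import math

def fuzzy_mask(coarse, sigma=2.0):
    """Soft mask from a binary coarse mask (2D list of 0/1).

    Weight is 1.0 on detected pixels and decays as a Gaussian of the
    distance to the nearest detected pixel (assumed fuzzy function).
    """
    fg = [(r, c) for r, row in enumerate(coarse)
          for c, v in enumerate(row) if v]
    mask = []
    for r, row in enumerate(coarse):
        out = []
        for c in range(len(row)):
            if not fg:
                out.append(0.0)  # nothing detected: mask is all zeros
                continue
            d2 = min((r - fr) ** 2 + (c - fc) ** 2 for fr, fc in fg)
            out.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        mask.append(out)
    return mask

def apply_mask(image, mask):
    """Element-wise product of an image and a soft mask."""
    return [[p * m for p, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]

def dice(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary 2D masks."""
    inter = sum(1 for prow, trow in zip(pred, target)
                for p, t in zip(prow, trow) if p and t)
    total = (sum(1 for row in pred for p in row if p)
             + sum(1 for row in target for t in row if t))
    return 1.0 if total == 0 else 2.0 * inter / total

# Toy example: one detected pixel in the centre of a 3x3 slice.
coarse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
soft = fuzzy_mask(coarse, sigma=1.0)
masked = apply_mask([[10.0] * 3 for _ in range(3)], soft)
```

In the described pipeline, the masked image would be the input to the second U-Net, and `dice` would compare its thresholded output against the ground-truth segmentation.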

Citations: 0
