Multi-Scale Deep Residual Shrinkage Network for Atrial Fibrillation Recognition
Pub Date: 2022-09-17 | DOI: 10.1142/s1469026822500158
Dayin Shi, Zhiyong Wu, Longbo Zhang, Benjia Hu, Ke Meng
In this paper, a novel multi-scale deep residual shrinkage network (MS-DRSN) is proposed for signal denoising and atrial fibrillation (AF) recognition. Signal denoising is performed by a multi-scale threshold denoising module (MS-TDM), which consists of two parts: threshold acquisition and threshold denoising. Thresholds are obtained automatically through a global attention module constructed by the neural network. For threshold denoising, the garrote threshold function is chosen, which combines the advantages of soft and hard thresholding. Multi-scale features are extracted by the global attention module together with a local attention module; these features are then denoised using the acquired thresholds and threshold function, and the AF recognition task is finally completed in a Softmax layer after stacking multiple MS-TDMs. An adaptive synthetic sampling (ADASYN) algorithm is also used to oversample the dataset and balance the data categories by generating new samples, which improves the accuracy of AF recognition and alleviates overfitting of the neural network. The method was validated experimentally on the PhysioNet 2017 dataset, achieving an accuracy of 0.894 and an F1 score of 0.881, better than current machine learning and deep learning models.
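As an aside for readers, here is a minimal sketch of the garrote threshold rule the abstract names, next to the soft and hard rules whose advantages it combines; the paper's network-learned thresholds are replaced by a fixed stand-in `lam`, and the ADASYN call shown in the trailing comment is the standard imbalanced-learn API, not the authors' code:

```python
import numpy as np

def garrote_threshold(x, lam):
    """Non-negative garrote rule: continuous at the threshold like soft
    thresholding, with vanishing bias for large coefficients like hard."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    keep = np.abs(x) > lam                     # coefficients that survive
    out[keep] = x[keep] - lam ** 2 / x[keep]
    return out

def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def hard_threshold(x, lam):
    return np.where(np.abs(x) > lam, x, 0.0)

coeffs = np.array([-3.0, -0.5, 0.2, 1.0, 4.0])
print(garrote_threshold(coeffs, lam=1.0))      # approx [-2.667, 0, 0, 0, 3.75]

# Class balancing with ADASYN (imbalanced-learn):
# from imblearn.over_sampling import ADASYN
# X_res, y_res = ADASYN().fit_resample(X, y)
```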
{"title":"Multi-Scale Deep Residual Shrinkage Network for Atrial Fibrillation Recognition","authors":"Dayin Shi, Zhiyong Wu, Longbo Zhang, Benjia Hu, Ke Meng","doi":"10.1142/s1469026822500158","DOIUrl":"https://doi.org/10.1142/s1469026822500158","url":null,"abstract":"In this paper, a novel multi-scale deep residual shrinkage network (MS-DRSN) is proposed for signal denoising and atrial fibrillation (AF) recognition. Signal denoising is done by multi-scale threshold denoising module (MS-TDM), which consists of two parts: threshold acquisition and threshold denoising. The thresholds are automatically obtained through the global attention module constructed by the neural network. Threshold denoising chooses Garrote as the threshold function, which combines the advantages of soft and hard thresholding. The multi-scale features consist of global attention module and local attention module, and then the multi-scale features are denoised using the acquired thresholds and threshold functions, and the AF recognition task is finally completed in the Softmax layer after the superposition of multiple MS-TDMs. An adaptive synthetic sampling (ADASYN) algorithm is also used to oversample the dataset and achieve data category balancing by generating new samples, which improves the accuracy of AF recognition and alleviates the overfitting of the neural network. This method was experimented and validated on the PhysioNet2017 dataset. The experimental results show that the approach achieves an accuracy of 0.894 and an [Formula: see text] score of 0.881, which is better than current machine learning and deep learning models.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132306396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modeling of Drying Kinetics of Banana (Musa spp., Musaceae) Slices with the Method of Image Processing and Artificial Neural Networks
Pub Date: 2022-08-13 | DOI: 10.1142/s1469026822500171
S. Ozden, F. Kılıç
In this study, thin banana slices dried on 316 stainless-steel shelves are modeled in an oven driven by a serially controlled, concentric blower-resistor couple. Changes in the banana slices (area and color) during the drying process were recorded by a camera. In addition, weight was measured with a load cell under the shelf, and energy consumption was measured with an electricity meter attached to the energy input. The main aim of the study is to drive the drying process of banana slices from the data obtained by the camera. The collected data were modeled with a powerful technique, artificial neural networks (ANNs), showing that the drying process can indeed be modeled from the camera data. Energy-consumption data were added to raise the ANN's performance and strengthen the model; this improved performance, though it obviously increases cost. An automatic drying system can thus be installed that learns by itself using only a camera, without any other sensors. The modeling achieved a goodness of fit of 99% using the change in the banana slices and the number of pixels, and the developed model performed adequately in predicting changes in moisture content. The food drying process can therefore be controlled with a digital camera.
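A hedged sketch of the kind of ANN regression the abstract describes, predicting moisture content from camera pixel counts plus energy consumption; the feature columns and the synthetic data below are illustrative assumptions, not the authors' dataset:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.uniform(2_000, 10_000, n),   # pixel count of the slice (area proxy)
    rng.uniform(0.2, 0.9, n),        # normalized mean color
    rng.uniform(0, 800, n),          # cumulative energy consumption (Wh)
])
# Stand-in target: moisture content falling with shrinkage and energy use.
y = 0.8 * X[:, 0] / 10_000 - 0.0003 * X[:, 2] + rng.normal(0, 0.02, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0)
model.fit(X_tr, y_tr)
print("R^2 (goodness of fit):", r2_score(y_te, model.predict(X_te)))
```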
{"title":"Modeling of Drying Kinetics of Banana (Musa spp., Musaceae) Slices with the Method of Image Processing and Artificial Neural Networks","authors":"S. Ozden, F. Kılıç","doi":"10.1142/s1469026822500171","DOIUrl":"https://doi.org/10.1142/s1469026822500171","url":null,"abstract":"In this study, modeling of thin banana slices dried on 316 stainless steel shelves is carried out in an oven working with serial controlled and concentric blower-resistor couple. Changes occurred in banana slices (area and color) during drying process have been recorded by a camera. Additionally, weight has been measured with a load cell which is under the shelf and energy consumption has been measured with electricity consumption meter which is tied to energy input. The main aim of the study is to conduct the drying process of banana slices according to the data obtained from camera. Besides, obtained data have been tested with a powerful modeling technique like Artificial Neural Networks (ANN), and it has been seen that drying process could be modeled according to the data obtained from camera. Energy consumption data have been added in order to increase the performance of ANN and strengthen the modeling. Thus, an automatic drying system that can learn by itself using only a camera without any other sensors will be installed. This has been caused an increase in performance. However, it is obvious that it increases cost. According to the results of modeling process, 99% of “goodness of fit” has been obtained by using the change in banana slices and the number of pixels. It has been found that the developed model performed adequately in predicting the changes of the moisture content. Thus, it has been available to control the food drying process with a digital camera.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129347849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hybrid Nature-Inspired Algorithm for Feature Selection in Alzheimer Detection Using Brain MRI Images
Pub Date: 2022-08-03 | DOI: 10.1142/s146902682250016x
Parul Agarwal, Anirban Dutta, Tarushi Agrawal, Nikhil Mehra, S. Mehta
Alzheimer's disease (AD) is an irreversible neurological disorder that impairs a person's memory and thinking ability. Its symptoms are not apparent at an early stage, so patients are often deprived of early medication. Alzheimer's, a common form of dementia, is difficult to diagnose, and a proper detection system is therefore needed. Various studies have addressed the accurate classification of patients with or without AD, yet prediction accuracy remains a challenge that depends on the type of data used for diagnosis; timely identification of true positives and false negatives is critical. This work focuses on extracting optimal features using nature-inspired algorithms to enhance the accuracy of classification models. Two hybrid nature-inspired algorithms are proposed: particle swarm optimization with a genetic algorithm (PSO_GA) and the whale optimization algorithm with a genetic algorithm (WOA_GA). Their performance is evaluated against various existing algorithms in terms of accuracy and run time. The experimental results reveal a trade-off between time and accuracy: PSO_GA achieves the best accuracy but takes longer than WOA and WOA_GA, while WOA_GA gives better accuracy than the majority of the compared algorithms when used with support vector machine (SVM) and AdaSVM classifiers.
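To make the hybrid idea concrete, here is a toy sketch in the spirit of PSO_GA: a binary PSO over feature masks with a GA-style crossover/mutation step, scored by SVM cross-validation. The dataset, operators and constants are illustrative stand-ins, not the paper's method or MRI features:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer   # stand-in for MRI features
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_feat, n_pop, n_iter = X.shape[1], 12, 10

def fitness(mask):
    mask = mask.astype(bool)
    return cross_val_score(SVC(), X[:, mask], y, cv=3).mean() if mask.any() else 0.0

pos = (rng.random((n_pop, n_feat)) < 0.5).astype(int)   # binary feature masks
vel = rng.normal(0.0, 1.0, (n_pop, n_feat))
fit = np.array([fitness(p) for p in pos])
pbest, pbest_fit = pos.copy(), fit.copy()
gbest = pos[fit.argmax()].copy()

for _ in range(n_iter):
    # PSO step: velocity update, then a sigmoid transfer back to binary masks.
    r1, r2 = rng.random((2, n_pop, n_feat))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = (rng.random((n_pop, n_feat)) < 1.0 / (1.0 + np.exp(-vel))).astype(int)
    # GA step: uniform crossover with the global best, plus bit-flip mutation.
    pos = np.where(rng.random((n_pop, n_feat)) < 0.2, gbest, pos)
    pos ^= (rng.random((n_pop, n_feat)) < 0.01).astype(int)
    fit = np.array([fitness(p) for p in pos])
    better = fit > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[pbest_fit.argmax()].copy()

print(f"best CV accuracy {pbest_fit.max():.3f} with {gbest.sum()} features")
```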
{"title":"Hybrid Nature-Inspired Algorithm for Feature Selection in Alzheimer Detection Using Brain MRI Images","authors":"Parul Agarwal, Anirban Dutta, Tarushi Agrawal, Nikhil Mehra, S. Mehta","doi":"10.1142/s146902682250016x","DOIUrl":"https://doi.org/10.1142/s146902682250016x","url":null,"abstract":"Alzheimer is an irreversible neurological disorder. It impairs the memory and thinking ability of a person. Its symptoms are not known at an early stage due to which a person is deprived of receiving medication at an early stage. Dementia, a general form of Alzheimer, is difficult to diagnose and hence a proper system for detection of Alzheimer is needed. Various studies have been done for accurate classification of patients with or without Alzheimer’s disease (AD). However, accuracy of prediction is still a challenge depending on the type of data used for diagnosis. Timely identification of true positives and false negatives are critical to the diagnosis. This work focuses on extraction of optimal features using nature-inspired algorithms to enhance the accuracy of classification models. This work proposes two hybrid nature-inspired algorithms — particle swarm optimization with genetic algorithm (PSO_GA) and whale optimization algorithm with genetic algorithm, (WOA_GA) to improve prediction accuracy. The performance of proposed algorithms is evaluated with respect to various existing algorithms on the basis of accuracy and time taken. Experimental results depict that there is trade-off in time and accuracy. Results revealed that the best accuracy is achieved by PSO_GA while it takes higher time than WOA and WOA_GA. Overall WOA_GA gives better performance accuracy when compared to a majority of the compared algorithms using support vector machine (SVM) and AdaSVM classifiers.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"224 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133617831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Hierarchical Processing and Completion Mechanism of Foreground Information for Person Re-Identification
Pub Date: 2022-07-07 | DOI: 10.1142/s1469026822500080
Jiajian Huang, Shih-Ping Wang
Person re-identification (Re-ID) arises in many applications such as video surveillance and intelligent security. Background clutter and distribution drift are two issues that cross-domain person Re-ID faces. In this research, we address the background-clutter problem by combining semantic segmentation with human attribute recognition. To overcome the distribution-drift problem, we employ maximum mean discrepancy (MMD) as a metric of distribution difference, with processing methods based on feature properties. The experimental results show that our strategy yields the best results.
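For reference, a compact numpy sketch of the (biased) squared-MMD estimator with an RBF kernel, the distribution-difference metric the abstract refers to; the feature arrays and `gamma` below are placeholders:

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Squared maximum mean discrepancy between samples X and Y
    under an RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, (200, 16))   # e.g. source-domain features
target = rng.normal(0.5, 1.0, (200, 16))   # shifted target-domain features
print(mmd_rbf(source, target))             # larger value => bigger drift
```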
{"title":"A Hierarchical Processing and Completion Mechanism of Foreground Information for Person Re-Identification","authors":"Jiajian Huang, Shih-Ping Wang","doi":"10.1142/s1469026822500080","DOIUrl":"https://doi.org/10.1142/s1469026822500080","url":null,"abstract":"Person re-identification (Re-ID) arises in many applications such as video surveillance and intelligent security. Background clutter and distribution drift are two issues that cross-domain person Re-ID faces. In this research, we propose that the background clutter problem be solved by combining semantic segmentation technology with human attribute identification technology. To overcome the distribution drift problem, we propose employing MMD as a metric for distribution differences and processing methods based on feature properties. The results of the experiments reveal that our strategy yielded the best results.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129837277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vegetation Evolution: An Optimization Algorithm Inspired by the Life Cycle of Plants
Pub Date: 2022-06-27 | DOI: 10.1142/s1469026822500109
Jun Yu
Different types of plants in nature use their own survival mechanisms to adapt to various living environments. Inspired by this observation, a new population-based vegetation evolution (VEGE) algorithm is proposed that solves optimization problems by simulating the alternating growth and maturity periods of plants. In the growth period, individuals explore their local areas and grow in promising directions, while in the maturity period individuals generate many seed individuals and spread them as widely as possible. The main contribution of VEGE is to balance exploitation and exploration from a novel perspective: performing the two periods in alternation to switch between two different search capabilities. To evaluate the performance of VEGE, we compare it with three well-known algorithms from the evolutionary computation community (differential evolution, particle swarm optimization, and the enhanced fireworks algorithm) on 28 benchmark functions in 2 dimensions (2D), 10D, and 30D, with 30 trial runs each. The experimental results show that VEGE is efficient and promising in terms of faster convergence and higher accuracy. We further analyze how the components of VEGE affect its performance, and some open topics are also given.
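A toy sketch of the alternating growth/maturity cycle described above, minimizing a simple sphere function; the step sizes, seed counts and survivor selection here are illustrative guesses, not the paper's exact operators:

```python
import numpy as np

def sphere(x):                       # toy objective to minimize
    return (x ** 2).sum()

rng = np.random.default_rng(0)
dim, n_pop, cycles = 10, 20, 50
pop = rng.uniform(-5, 5, (n_pop, dim))

for _ in range(cycles):
    # Growth period: each individual probes its neighborhood and keeps
    # a small step only if it improves fitness (local exploitation).
    for i in range(n_pop):
        step = rng.normal(0, 0.1, dim)
        if sphere(pop[i] + step) < sphere(pop[i]):
            pop[i] += step
    # Maturity period: individuals scatter seeds widely (global exploration);
    # the best n_pop of parents plus seeds survive to the next cycle.
    seeds = pop[rng.integers(0, n_pop, 3 * n_pop)] \
            + rng.normal(0, 1.0, (3 * n_pop, dim))
    allpop = np.vstack([pop, seeds])
    fitness = np.array([sphere(x) for x in allpop])
    pop = allpop[np.argsort(fitness)[:n_pop]]

print("best fitness:", sphere(pop[0]))
```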
{"title":"Vegetation Evolution: An Optimization Algorithm Inspired by the Life Cycle of Plants","authors":"Jun Yu","doi":"10.1142/s1469026822500109","DOIUrl":"https://doi.org/10.1142/s1469026822500109","url":null,"abstract":"In this paper, we have observed that different types of plants in nature can use their own survival mechanisms to adapt to various living environments. A new population-based vegetation evolution (VEGE) algorithm is proposed to solve optimization problems by interactively simulating the growth and maturity periods of plants. In the growth period, individuals explore their local areas and grow in potential directions, while individuals generate many seed individuals and spread them as widely as possible in the maturity period. The main contribution of our proposed VEGE is to balance exploitation and exploration from a novel perspective, which is to perform these two periods in alternation to switch between two different search capabilities. To evaluate the performance of the proposed VEGE, we compare it with three well-known algorithms in the evolutionary computation community: differential evolution, particle swarm optimization, and enhanced fireworks algorithm — and run them on 28 benchmark functions with 2-dimensions (2D), 10D, and 30D with 30 trial runs. The experimental results show that VEGE is efficient and promising in terms of faster convergence speed and higher accuracy. In addition, we further analyze the effects of the composition of VEGE on performance, and some open topics are also given.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125402535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparison of Nitrogen Dioxide Predictions During a Pandemic and Non-pandemic Scenario in the City of Madrid using a Convolutional LSTM Network
Pub Date: 2022-06-21 | DOI: 10.1142/s1469026822500146
Ditsuhi Iskandaryan, Francisco Ramos, S. Trilles
Machine learning methods, combined with a geospatial dimension, can perform predictive analyses of air quality with high accuracy. However, air pollution is influenced by many external factors, one of which recently has been the restrictions applied to curb the relentless advance of COVID-19. Such sudden changes in air-quality levels can negatively affect current forecasting models. This work compares air-pollution forecasts during a pandemic and a non-pandemic period under the same conditions. The ConvLSTM algorithm was applied to predict the concentration of nitrogen dioxide using data from the air-quality and meteorological stations in Madrid. The proposed model was applied to two scenarios, pandemic (January-June 2020) and non-pandemic (January-June 2019), each with sub-scenarios based on time granularity (1 h, 12 h, 24 h and 48 h) and combinations of features. Root mean square error was taken as the evaluation metric; the results showed that the proposed method outperformed a reference model and that the feature-selection technique significantly improved overall accuracy.
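A minimal Keras sketch of a ConvLSTM forecaster of the general shape the abstract implies: a sequence of gridded pollution/weather frames in, a next-step NO2 grid out. The grid size, channel count and layer widths are assumptions, not the paper's architecture:

```python
import tensorflow as tf

# Input: sequences of gridded observations over the city,
# shaped (time_steps, rows, cols, channels); all sizes are assumed.
time_steps, rows, cols, channels = 12, 16, 16, 4   # e.g. 12 hourly frames

model = tf.keras.Sequential([
    tf.keras.layers.ConvLSTM2D(32, kernel_size=3, padding="same",
                               return_sequences=False,
                               input_shape=(time_steps, rows, cols, channels)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(1, kernel_size=1),      # next-step NO2 grid
])
# RMSE was the paper's metric; train on MSE and track its square root.
model.compile(optimizer="adam", loss="mse",
              metrics=[tf.keras.metrics.RootMeanSquaredError()])
model.summary()
```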
{"title":"Comparison of Nitrogen Dioxide Predictions During a Pandemic and Non-pandemic Scenario in the City of Madrid using a Convolutional LSTM Network","authors":"Ditsuhi Iskandaryan, Francisco Ramos, S. Trilles","doi":"10.1142/s1469026822500146","DOIUrl":"https://doi.org/10.1142/s1469026822500146","url":null,"abstract":"Traditionally, machine learning technologies with the methods and capabilities available, combined with a geospatial dimension, can perform predictive analyzes of air quality with greater accuracy. However, air pollution is influenced by many external factors, one of which has recently been caused by the restrictions applied to curb the relentless advance of COVID-19. These sudden changes in air quality levels can negatively influence current forecasting models. This work compares air pollution forecasts during a pandemic and non-pandemic period under the same conditions. The ConvLSTM algorithm was applied to predict the concentration of nitrogen dioxide using data from the air quality and meteorological stations in Madrid. The proposed model was applied for two scenarios: pandemic (January–June 2020) and non-pandemic (January–June 2019), each with sub-scenarios based on time granularity (1-h, 12-h, 24-h and 48-h) and combination of features. The Root Mean Square Error was taken as the estimation metric, and the results showed that the proposed method outperformed a reference model, and the feature selection technique significantly improved the overall accuracy.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114888494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computer-Aided Heart Disease Diagnosis Using Recursive Rule Extraction Algorithms from Neural Networks
Pub Date: 2022-06-21 | DOI: 10.1142/s1469026822500110
Manomita Chakraborty, S. K. Biswas
The mortality rate due to fatal heart disease (HD), or cardiovascular disease (CVD), has increased drastically across the world in recent decades. HD is a hazardous condition that is treatable if detected early, yet in most cases it is not diagnosed until it becomes severe. It is therefore necessary to develop an effective system that can accurately diagnose HD and provide a concise description of its underlying causes, i.e., risk factors (RFs), so that HD can be controlled by managing the primary RFs. Researchers have applied various machine learning algorithms to HD diagnosis, among which neural networks (NNs) have attracted wide attention for their high performance. The main obstacle with an NN, however, is its black-box nature: it cannot explain its decisions. Rule extraction algorithms are an effective remedy for this pitfall, as they can extract explainable decision rules from NNs with high prediction accuracy, and many neural rule extraction algorithms have been applied successfully to medical diagnosis problems. This study assesses the performance of rule extraction algorithms for HD diagnosis, particularly those that construct rules recursively from NNs. Because they subdivide a rule's subspace until accuracy improves, recursive algorithms are known for delivering interpretable decisions with high accuracy. The results demonstrate the efficacy of recursive rule extraction in HD diagnosis: along with significant data ranges for the primary RFs, a maximum accuracy of 82.59% is attained.
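To make the recursive idea concrete, here is a toy surrogate sketch (not the specific algorithms the paper evaluates, such as Re-RX): fit a one-split tree to the network's predictions, and recurse into each half-space while the rule's fidelity to the network there is still low. The dataset is a stand-in for heart-disease records:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer   # stand-in for HD records
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# 1) Train the black-box network.
X, y = load_breast_cancer(return_X_y=True)
nn = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                   random_state=0).fit(X, y)

# 2) Recursively fit shallow surrogate trees to the network's predictions,
#    subdividing a rule's subspace whenever fidelity is still too low.
def extract_rules(X_sub, depth=0, min_fidelity=0.95, max_depth=3):
    y_nn = nn.predict(X_sub)                      # labels from the network
    if len(np.unique(y_nn)) == 1 or depth >= max_depth:
        return
    tree = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X_sub, y_nn)
    if tree.tree_.node_count < 3:                 # no usable split found
        return
    indent = "  " * depth
    print(indent + export_text(tree).replace("\n", "\n" + indent))
    if tree.score(X_sub, y_nn) < min_fidelity:    # refine each half-space
        split = X_sub[:, tree.tree_.feature[0]] <= tree.tree_.threshold[0]
        extract_rules(X_sub[split], depth + 1, min_fidelity, max_depth)
        extract_rules(X_sub[~split], depth + 1, min_fidelity, max_depth)

extract_rules(X)
```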
{"title":"Computer-Aided Heart Disease Diagnosis Using Recursive Rule Extraction Algorithms from Neural Networks","authors":"Manomita Chakraborty, S. K. Biswas","doi":"10.1142/s1469026822500110","DOIUrl":"https://doi.org/10.1142/s1469026822500110","url":null,"abstract":"Mortality rate due to fatal heart disease (HD) or cardiovascular disease (CVD) has increased drastically over the world in recent decades. HD is a very hazardous problem prevailing among people which is treatable if detected early. But in most of the cases, the disease is not diagnosed until it becomes severe. Hence, it is requisite to develop an effective system which can accurately diagnosis HD and provide a concise description for the underlying causes [risk factors (RFs)] of the disease, so that in future HD can be controlled only by managing the primary RFs. Recently, researchers are using various machine learning algorithms for HD diagnosis, and neural network (NN) is one among them which has attracted tons of people because of its high performance. But the main obstacle with a NN is its black-box nature, i.e., its incapability in explaining the decisions. So, as a solution to this pitfall, the rule extraction algorithms can be very effective as they can extract explainable decision rules from NNs with high prediction accuracies. Many neural-based rule extraction algorithms have been applied successfully in various medical diagnosis problems. This study assesses the performance of rule extraction algorithms for HD diagnosis, particularly those that construct rules recursively from NNs. Because they subdivide a rule’s subspace until the accuracy improves, recursive algorithms are known for delivering interpretable decisions with high accuracy. The recursive rule extraction algorithms’ efficacy in HD diagnosis is demonstrated by the results. Along with the significant data ranges for the primary RFs, a maximum accuracy of 82.59% is attained.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114301708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-Class Document Image Classification using Deep Visual and Textual Features
Pub Date: 2022-06-21 | DOI: 10.1142/s1469026822500134
Semih Sevim, Ekin Ekinci, S. İ. Omurca, Eren Berk Edinç, S. Eken, Türkücan Erdem, A. Sayar
The digitalization era has brought digital documents with it, and classifying document images has become as important a need as classifying classical text documents. Document images, in which text documents are stored as images, contain both textual and visual features, so both feature types can be used in classification. This study therefore aims to classify document images using both textual and visual features and to determine which feature type is more successful. In the text-based approach, each document class is labeled with its associated keywords, and classification is based on whether a document contains the related keywords. For visual-based classification, we use four deep learning models, namely a CNN, NASNet-Large, InceptionV3, and EfficientNetB3. The experimental study is carried out on document images obtained from applicants to Kocaeli University. EfficientNetB3 proves the best of all the models, with an F-score of 0.8987.
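As illustration only, a sketch of the two branches under stated assumptions: a keyword-match text classifier (the keyword lists are hypothetical) and an EfficientNetB3 visual classifier with a fresh softmax head; the class count and input size are guesses:

```python
import tensorflow as tf

# Text branch: assign the class whose (hypothetical) keyword list
# best matches the document's OCR text.
def keyword_classify(text, class_keywords):
    scores = {c: sum(kw in text.lower() for kw in kws)
              for c, kws in class_keywords.items()}
    return max(scores, key=scores.get)

# Visual branch: EfficientNetB3 backbone fine-tuned on page images.
num_classes = 5                                  # assumed number of classes
base = tf.keras.applications.EfficientNetB3(
    include_top=False, weights="imagenet", input_shape=(300, 300, 3))
base.trainable = False                           # freeze ImageNet weights first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```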
{"title":"Multi-Class Document Image Classification using Deep Visual and Textual Features","authors":"Semih Sevim, Ekin Ekinci, S. İ. Omurca, Eren Berk Edinç, S. Eken, Türkücan Erdem, A. Sayar","doi":"10.1142/s1469026822500134","DOIUrl":"https://doi.org/10.1142/s1469026822500134","url":null,"abstract":"The digitalization era has brought digital documents with it, and the classification of document images has become an important need as in classical text documents. Document images, in which text documents are stored as images, contain both text and visual features, unlike images. Therefore, it is possible to use both text and visual features while classifying such data. Considering this situation, in this study, it is aimed to classify document images by using both text and visual features and to determine which feature type is more successful in classification. In the text-based approach, each document/class is labeled with the keywords associated with that document/class and the classification is realized according to whether the document contains the related key-words or not. For visual-based classification, we use four deep learning models namely CNN, NASNet-Large, InceptionV3, and EfficientNetB3. Experimental study is carried out on document images obtained from applicants of the Kocaeli University. As a result, it is seen ii that EfficientNetB3 is the most superior among all with 0.8987 F-score.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130601952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Efficient Syllable-Based Speech Segmentation Model Using Fuzzy and Threshold-Based Boundary Detection
Pub Date: 2022-06-01 | DOI: 10.1142/s1469026822500079
Ruchika Kumari, A. Dev, Ashwani Kumar
To develop a high-quality text-to-speech (TTS) system, appropriate segmentation of continuous speech into syllabic units plays a vital role. The main objective of this work is an automatic syllable-based segmentation technique for continuous Hindi speech, in which the parameters involved in the segmentation process are optimized to segment the speech into syllables. In addition, the proposed iterative splitting process with optimal parameters minimizes deletion errors, so the optimized iterative incorporation can discard more insertions than frequent non-iterative incorporation without merging syllables. The mixture of optimized iterative and iterative incorporation provides the best accuracy with the fewest insertion and deletion errors. Segmentation outputs for different text signals are compared for the proposed approach and other techniques, namely GA, PSO and SOM; the proposed approach attains a high average accuracy of 97.5%, above that of GA, PSO and SOM, and also gives better segmentation accuracy than other state-of-the-art methods. The resulting syllable-segmented database is suitable for Hindi speech technology systems in the travel domain.
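A minimal threshold-only sketch of energy-valley boundary detection (the paper combines thresholding with fuzzy logic and optimized parameters, which are not reproduced here); the frame sizes and relative threshold are assumptions:

```python
import numpy as np

def syllable_boundaries(signal, sr, frame_ms=25, hop_ms=10, rel_threshold=0.15):
    """Threshold-based boundary detection on the short-time energy contour:
    frames whose energy falls below a fraction of the peak energy are
    treated as inter-syllable valleys."""
    frame, hop = int(sr * frame_ms / 1e3), int(sr * hop_ms / 1e3)
    energy = np.array([np.sum(signal[i:i + frame] ** 2)
                       for i in range(0, len(signal) - frame, hop)])
    low = energy < rel_threshold * energy.max()
    # A boundary is placed where the contour enters a low-energy valley.
    starts = np.flatnonzero(low[1:] & ~low[:-1]) + 1
    return starts * hop / sr          # boundary times in seconds

# Synthetic "speech": three energy bursts separated by near silence.
sr = 16_000
t = np.linspace(0, 1.5, int(1.5 * sr))
envelope = (np.exp(-((t - 0.25) ** 2) / 0.005) +
            np.exp(-((t - 0.75) ** 2) / 0.005) +
            np.exp(-((t - 1.25) ** 2) / 0.005))
x = envelope * np.sin(2 * np.pi * 200 * t)
print(syllable_boundaries(x, sr))
```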
{"title":"An Efficient Syllable-Based Speech Segmentation Model Using Fuzzy and Threshold-Based Boundary Detection","authors":"Ruchika Kumari, A. Dev, Ashwani Kumar","doi":"10.1142/s1469026822500079","DOIUrl":"https://doi.org/10.1142/s1469026822500079","url":null,"abstract":"To develop a high-quality TTS system, an appropriate segmentation of continuous speech into the syllabic units plays a vital role. The significant objective of this research work involves the implementation of an automatic syllable-based speech segmentation technique for continuous speech of the Hindi language. Here, the parameters involved in the segmentation process are optimized to segment the speech syllables. In addition to this, the proposed iterative splitting process containing the optimum parameters minimizes the deletion errors. Thus, the optimized iterative incorporation can discard more insertions without merging the frequent non-iterative incorporation. The mixture of optimized iterative and iterative incorporation provides the best accuracy with the least insertion and deletion errors. The segmentation output based on different text signals for the proposed approach and other techniques namely GA, PSO and SOM is accurately segmented. The average accuracy obtained for the proposed approach is high with 97.5% than GA, PSO and SOM. The performance of the proposed algorithm is also analyzed and gives better-segmented accuracy when compared with other state-of-the-art methods. Here, the syllable-based segmented database is suitable for the speech technology system for Hindi in the travel domain.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116856386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Facial Expression Recognition Using Convolution Neural Network Fusion and Texture Descriptors Representation
Pub Date: 2022-03-01 | DOI: 10.1142/s146902682250002x
Chebah Ouafa, M. Laskri
Facial expression recognition is an interesting research direction in pattern recognition and computer vision, increasingly used in artificial intelligence, human-computer interaction and security monitoring. In recent years, the convolutional neural network (CNN), as a deep learning technique, and multiple-classifier combination methods have been applied to classify facial expressions accurately. In this paper, we propose a multimodal classification approach based on local texture descriptor representations and a combination of CNNs to recognize facial expressions. First, to reduce the influence of redundant information, a preprocessing stage performs face detection, face image cropping and computation of the Local Binary Pattern (LBP), Local Gradient Code (LGC), Local Directional Pattern (LDP) and Gradient Direction Pattern (GDP) texture descriptors. Second, we construct a cascade CNN architecture using the multimodal data of each descriptor (CNNLBP, CNNLGC, CNNGDP and CNNLDP) to extract facial features and classify emotions. Finally, we apply aggregation techniques (sum and product rules) to combine the four multimodal outputs and obtain the final decision of our system. Experimental results on the CK+ and JAFFE databases show that the proposed multimodal classification system achieves superior recognition performance compared to existing studies, with classification accuracies of 97.93% and 94.45%, respectively.
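A short sketch of the pipeline's three stages under stated assumptions: an LBP descriptor map (the LGC/LDP/GDP maps would be computed analogously), one small CNN per descriptor modality, and decision-level fusion by the sum or product rule. The input size, class count and layer widths are placeholders, not the paper's architecture:

```python
import numpy as np
from skimage.feature import local_binary_pattern
import tensorflow as tf

# Texture descriptor stage: an LBP map per (grayscale) face image.
def lbp_map(gray_img, P=8, R=1):
    return local_binary_pattern(gray_img, P, R, method="uniform")

def make_branch_cnn(input_shape=(48, 48, 1), num_classes=7):
    """One small CNN per descriptor modality (CNN_LBP, CNN_LGC, ...)."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu",
                               input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

# Decision-level fusion over the per-modality softmax outputs.
def fuse(probs_per_modality, rule="sum"):
    P = np.stack(probs_per_modality)   # (n_modalities, n_samples, n_classes)
    fused = P.mean(axis=0) if rule == "sum" else P.prod(axis=0)
    return fused.argmax(axis=1)
```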
{"title":"Facial Expression Recognition Using Convolution Neural Network Fusion and Texture Descriptors Representation","authors":"Chebah Ouafa, M. Laskri","doi":"10.1142/s146902682250002x","DOIUrl":"https://doi.org/10.1142/s146902682250002x","url":null,"abstract":"Facial expression recognition is an interesting research direction of pattern recognition and computer vision. It has been increasingly used in artificial intelligence, human–computer interaction and security monitoring. In recent years, Convolution Neural Network (CNN) as a deep learning technique and multiple classifier combination method has been applied to gain accurate results in classifying face expressions. In this paper, we propose a multimodal classification approach based on a local texture descriptor representation and a combination of CNN to recognize facial expression. Initially, in order to reduce the influence of redundant information, the preprocessing stage is performed using face detection, face image cropping and texture descriptors of Local Binary Pattern (LBP), Local Gradient Code (LGC), Local Directional Pattern (LDP) and Gradient Direction Pattern (GDP) calculation. Second, we construct a cascade CNN architecture using the multimodal data of each descriptor (CNNLBP, CNNLGC, CNNGDP and CNNLDP) to extract facial features and classify emotions. Finally, we apply aggregation techniques (sum and product rule) for each modality to combine the four multimodal outputs and thus obtain the final decision of our system. Experimental results using CK[Formula: see text] and JAFFE database show that the proposed multimodal classification system achieves superior recognition performance compared to the existing studies with classification accuracy of 97, 93% and 94, 45%, respectively.","PeriodicalId":422521,"journal":{"name":"Int. J. Comput. Intell. Appl.","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134398484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}