Capillaroscopy is a safe, convenient, and non-invasive way to examine a subject's microcirculation and assess pathological changes. Using a microscope, doctors view the capillaries through the nailfold epidermis. Nailfold anatomy is ideal for evaluating the microcirculation and detecting various diseases caused by vascular damage. Rheumatologists evaluate systemic diseases that involve vascular damage by analyzing the red blood cells within the capillaries. Capillary morphology may sometimes serve as an early indicator, while the severity of damage to the capillary architecture may indicate internal organ involvement. Thus, in a capillaroscopic assessment, the doctor examines changes in the morphological and functional aspects of the capillaries. These comprise capillary diameter, visibility, distribution, length, microhemorrhages, blood flow, and density. This paper proposes a novel object detection algorithm based on deep learning architectures for detecting and locating capillary loops in the nailfold region. Characteristic features are extracted from the capillaries through image processing algorithms, and an attempt is then made to differentiate between images of diseased subjects and healthy controls.
Title: Deep learning based object detection in nailfold capillary images
Authors: Suma Kuncha Venkatapathiah, Sethu Selvi Selvan, P. Nanda, Manisha Shetty, Vikas Mallikarjuna Swamy, Kushagra Awasthi
Journal: IAES International Journal of Artificial Intelligence
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp931-942
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp984-994
D. Kadhim, M. Saleh, S. Abou-Loukh
A fundamental issue for a downlink massive multiple-input multiple-output (MIMO) energy-efficiency strategy is that the minimum mean squared error (MMSE) implementation degrades the performance of the scheme. Some improvements are therefore added to this precoding scheme; the resulting solution is called the proposed improved MMSE precoder (PIMP). The energy efficiency (EE) study also aims to drastically lower radiated power while maintaining high throughput and minimizing interference. We further examine the tradeoff between spectral efficiency (SE) and EE: the two coincide at the beginning, but their interests later conflict and diverge, so that EE gradually decreases while SE continues to increase logarithmically. The results show that, for a single-cell massive multi-user MIMO (MU-MIMO) downlink model, the PIMP scheme is the appropriate choice for achieving higher precoding performance. Furthermore, both maximum ratio transmission (MRT) and PIMP are suitable for improving the EE and SE of massive MIMO. The main contribution of this work is the finding that the highest EE and SE are obtained with PIMP, which performs appreciably better than MRT at larger ratios of the number of antennas to the number of users.
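The SE/EE tradeoff described in this abstract can be illustrated with a toy single-link model. This is only a sketch under stated assumptions, not the paper's system model or the PIMP precoder: SE is taken as log2(1 + gain * p_tx), EE as SE divided by radiated power plus a fixed circuit power, and the values of `gain` and `p_circuit` are hypothetical.

```python
import math

def spectral_efficiency(p_tx, gain=10.0):
    """SE in bit/s/Hz for transmit power p_tx (W); gain is a hypothetical SNR per watt."""
    return math.log2(1.0 + gain * p_tx)

def energy_efficiency(p_tx, p_circuit=1.0, gain=10.0):
    """EE in bit/J/Hz: SE divided by total consumed power (radiated + fixed circuit power)."""
    return spectral_efficiency(p_tx, gain) / (p_tx + p_circuit)

# Sweep the radiated power from 0.05 W to 10 W.
powers = [0.05 * k for k in range(1, 201)]
se = [spectral_efficiency(p) for p in powers]
ee = [energy_efficiency(p) for p in powers]

# Index of the power level at which EE is largest.
best = max(range(len(powers)), key=lambda i: ee[i])
print(f"EE peaks at p_tx = {powers[best]:.2f} W; SE there = {se[best]:.2f} bit/s/Hz")
print(f"At p_tx = {powers[-1]:.2f} W: SE = {se[-1]:.2f}, EE = {ee[-1]:.2f}")
```

In this toy model SE and EE rise together at low power, then EE falls once the extra radiated power costs more than the logarithmic SE gain returns, which is the qualitative behaviour the abstract reports.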
Title: Evaluation of massive multiple-input multiple-output communication performance under a proposed improved minimum mean squared error precoding
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp560-567
Oscar Ivan Vargas Mora, Daiam Camilo Parrado Nieto, Jairo David Cuero Ortega, Javier Eduardo Martinez Baquero, Robinson Jimenez Moreno
This document presents the development of a machine learning model as a tool to improve the chemical dosing procedure at the Ariari Regional Aqueduct (ARA). The supervised learning model starts from color, turbidity, and pH data at the water inlet to the aqueduct, together with the dosing results for type A aluminum sulfate and calcium oxide (lime) obtained through jar tests. The model was implemented and improved through continuous training, which allowed an optimal dosage of aluminum sulfate and lime that produces an outlet pH below 7.5 and an outlet turbidity below 8 nephelometric turbidity units (NTU). These outlet water parameters meet the criteria of the Ministry of Social Protection in Colombia. In addition, a virtual jar test was created to reduce the time required to obtain chemical dosing values to less than a minute, whereas a laboratory test takes approximately half an hour to display results.
Title: Neural network-based pH and coagulation adjustment system in water treatment
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp714-730
S. Fuada, T. Adiono, Hans Kasan
In terms of accuracy, one of the best-known techniques for human detection is the histogram of oriented gradients (HOG) method. Unfortunately, HOG feature calculation is complex and computationally intensive. In this research, we therefore aim for a resource-efficient, low-power HOG hardware architecture that maintains a high frame rate for real-time processing. This paper introduces a hardware architecture for human detection in 2D images using a simplified HOG algorithm. To increase the frame rate, we simplify the HOG computation while maintaining detection quality. In the hardware architecture, we design a cell-based processing method instead of a window-based method. Moreover, 64 parallel and pipelined architectures are used to increase the processing speed. Our pipeline architecture can significantly reduce memory bandwidth and avoids any use of external memory. An Altera field-programmable gate array (FPGA) E2-115 was employed to evaluate the design. The evaluation results show that our design achieves up to 86.51 frames per second (fps) at a relatively low operating frequency (27 MHz). It consumes 48,360 logic elements (LEs) and 4,363 registers. The performance test results reveal that the proposed solution exhibits a trade-off between fps, clock frequency, register usage, and fps-to-clock ratio.
Title: A high frame-rate of cell-based histogram-oriented gradients human detector architecture implemented in field programmable gate arrays
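For readers unfamiliar with the cell-based step that this abstract refers to, the following is a minimal software sketch of a per-cell orientation histogram as it appears in the textbook HOG formulation. It is not the paper's simplified hardware algorithm: the 9 unsigned bins, the centered-difference gradient, and the absence of vote interpolation and block normalization are all assumptions made to keep the example short.

```python
import math

def cell_histogram(cell, bins=9):
    """Orientation histogram for one cell (a list of rows of pixel intensities).

    Uses centered differences on interior pixels, unsigned angles in [0, 180),
    and magnitude-weighted votes into a single bin (no interpolation).
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against an angle that rounds up to 180.0.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# 8x8 cell whose intensity rises left to right: a purely horizontal gradient.
ramp = [[10 * x for x in range(8)] for _ in range(8)]
print(cell_histogram(ramp))
```

Because each cell is processed once and its histogram can be reused by every detection window that overlaps it, a cell-based organization avoids recomputing gradients per window, which is presumably one reason the authors prefer it to window-based processing.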
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp593-601
Tetyana Honcharenko, Roman Akselrod, Andrii Shpakov, Oleksandr Khomenko
This study addresses the problem of determining the professional adaptive capabilities of construction management staff using artificial intelligence systems. A fully connected feed-forward neural network (FCF-FNN) architecture is proposed, and empirical modeling is performed to create a data set. The model of the artificial intelligence system makes it possible to evaluate the processes in an FCF-FNN during multi-value classification of professional areas. A method has been developed for training the machine learning model; it reflects the internal connections between the components of an artificial intelligence system that allow it to “learn” from training data. To train the neural network, a data set of 35 input parameters and 29 output parameters was used; the set contains 936 rows. The data were split for neural network training in proportions of 10% and 90%, respectively. The results of this study can be used to further improve the knowledge and skills necessary for successful professional realization.
Title: Information system based on multi-value classification of fully connected neural network for construction management
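To make the 35-input, 29-output shape concrete, here is a minimal forward-pass sketch of a fully connected feed-forward network. Only the input and output sizes come from the abstract. The single hidden layer of 16 units, the ReLU and sigmoid activations, the random untrained weights, and the reading of "multi-value classification" as independent per-output scores are all assumptions for illustration.

```python
import math
import random

N_INPUTS, N_OUTPUTS = 35, 29   # sizes stated in the abstract
N_HIDDEN = 16                  # hypothetical; the abstract gives no hidden-layer sizes

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, activation):
    """Apply one fully connected layer followed by an element-wise activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
hidden_layer = make_layer(N_INPUTS, N_HIDDEN, rng)
output_layer = make_layer(N_HIDDEN, N_OUTPUTS, rng)

def predict(x):
    """Return one score in (0, 1) per output parameter."""
    return dense(dense(x, hidden_layer, relu), output_layer, sigmoid)

sample = [rng.random() for _ in range(N_INPUTS)]   # made-up input row
scores = predict(sample)
selected = [i for i, s in enumerate(scores) if s >= 0.5]
print(len(scores), "scores; outputs at or above 0.5:", selected)
```

With sigmoid outputs each of the 29 values is scored independently, so several professional areas can be selected for one input row. A softmax output would instead force a single choice, and the abstract does not say which the authors use.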
Word embedding has become the most popular method of lexical description in a given context in the natural language processing domain, especially through the word to vector (Word2Vec) and global vectors (GloVe) implementations. Since GloVe is a pre-trained model that provides access to word mapping vectors on many dimensionalities, a large number of applications rely on its prowess, especially in the field of sentiment analysis. However, in the literature, we found that in many cases, GloVe is implemented with arbitrary dimensionalities (often 300d) regardless of the length of the text to be analyzed. In this work, we conducted a study that identifies the effect of the dimensionality of word embedding mapping vectors on short and long texts in a sentiment analysis context. The results suggest that as the dimensionality of the vectors increases, the performance metrics of the model also increase for long texts. In contrast, for short texts, we recorded a threshold at which dimensionality does not matter.
Title: Effect of word embedding vector dimensionality on sentiment analysis through short and long texts
Authors: Mohamed Chiny, Marouane Chihab, Abdelkarim Ait Lahcen, Omar Bencharef, Younes Chihab
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp823-830
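As a rough illustration of where the embedding dimensionality enters a sentiment pipeline, the sketch below parses vectors in the plain-text layout used by the distributed GloVe files (one word per line followed by its space-separated components) and averages them into a fixed-size text vector. The three-dimensional vectors are made-up toy values, and mean pooling is an assumed, commonly used baseline rather than the model evaluated in the paper.

```python
def parse_glove_lines(lines):
    """Build a word -> vector table from 'word v1 v2 ... vd' text lines."""
    table = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        table[parts[0]] = [float(v) for v in parts[1:]]
    return table

def text_vector(tokens, table, dim):
    """Mean of the vectors of known tokens; a zero vector if none are known."""
    known = [table[t] for t in tokens if t in table]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Toy 3-dimensional vectors (hypothetical values, not real GloVe entries).
toy_lines = [
    "good 0.5 0.1 0.0",
    "movie 0.1 0.3 0.2",
    "bad -0.5 0.1 0.0",
]
table = parse_glove_lines(toy_lines)
vec = text_vector("a good movie".split(), table, dim=3)
print(vec)
```

Whatever the text length, the pooled vector has exactly `dim` components, so a short text contributes only a few word vectors to each component. That is one plausible reason why extra dimensions might help long texts more than short ones, although the abstract itself does not give this explanation.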
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp686-695
Abbas H. Issa, Sarab A. Mahmood, Abdulrahim T. Humod, Nihad M. Ameen
The dissolved oxygen concentration in the wastewater treatment process (WWTP) must remain within a specific range while the plant operates. A proportional-integral-derivative (PID) controller augmented with a nonlinear element (a sigmoid function) is proposed to ensure stability and reduce uncertainties in the wastewater direct reuse/recycling model. The gains of the nonlinear controller (PID controller with sigmoid function) for uncertain wastewater treatment processes are tuned using the particle swarm optimization (PSO) technique. According to simulation results, the proposed method is robust to model mismatch, reduces treatment time compared to traditional PID controllers tuned by PSO, is easy to apply, and performs well.
Title: Robustness enhancement study of augmented positive identification controller by a sigmoid function
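One way to picture a PID controller with a sigmoid element is to pass the ordinary PID output through a smooth, sigmoid-shaped saturation before it reaches the process. The sketch below does this for a first-order toy plant. The placement of the nonlinearity, the plant model, and the hand-picked gains are all assumptions: the abstract does not specify the controller structure, and the authors tune their gains with PSO, which is not reproduced here.

```python
import math

def sigmoid_bound(u, u_max):
    """Smoothly limit u to (-u_max, u_max); tanh is a rescaled logistic sigmoid."""
    return u_max * math.tanh(u / u_max)

def simulate(setpoint=2.0, kp=2.0, ki=1.0, kd=0.1, u_max=5.0,
             tau=1.0, dt=0.01, steps=3000):
    """Euler simulation of the toy plant dy/dt = (-y + u) / tau under bounded PID control."""
    y, integral = 0.0, 0.0
    prev_error = setpoint - y          # avoids a derivative kick on the first step
    history = []
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = sigmoid_bound(kp * error + ki * integral + kd * derivative, u_max)
        y += dt * (-y + u) / tau       # plant update
        prev_error = error
        history.append((y, u))
    return history

history = simulate()
final_y, final_u = history[-1]
print(f"final output = {final_y:.3f}, final control = {final_u:.3f}")
```

The smooth bound keeps the control signal inside the actuator range without the abrupt clipping of a hard limiter, while the integral term still removes the steady-state error in this toy setup.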
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp884-891
Harsha Raju, Veena Kalludi Narasimhaiah
With the advancement of digital image processing in agriculture and crop cultivation, imaging techniques are being adopted to acquire real-time plant health status. Of all the parts of a plant, the leaf is the most direct indicator of its health, so applying image processing approaches to leaves can yield informative assessments of plant health. At present, various approaches, e.g., feature extraction, segmentation, identification, and classification, are evolving with increasing dependence on machine learning, and studies show many contributions towards this challenge. However, the optimal approach has not yet been conclusively identified. Hence, this paper highlights the explicit strengths and weaknesses of existing image processing techniques for identifying disease conditions from input images of plant leaves. The study also highlights open research problems that must be addressed to reach conclusive remarks about effectiveness.
Title: Insights on assessing image processing approaches towards health status of plant leaf using machine learning
Pub Date: 2023-06-01; DOI: 10.11591/ijai.v12.i2.pp514-521
A. A. Manjunath, Manjunath Sudhakar Nayak, Santhanam Nishith, Satish Nitin Pandit, Shreyas Sunkad, Pratiba Deenadhayalan, Shobha Gangadhara
Manually processing invoices that arrive as scanned photocopies is time-consuming, so there is a need to automate data extraction from invoices that share a similar format. In this paper, we investigate and analyse various image processing and text extraction techniques to improve the results of the optical character recognition (OCR) engine that extracts the text from the invoice. This paper also proposes the design and implementation of a web-enabled invoice processing system (IPS). The IPS consists of an annotation tool and an extraction tool. The annotation tool is used to mark the fields of interest to be extracted from the invoice. The extraction tool uses algorithms from the open source computer vision library (OpenCV) to detect text. The proposed system was tested on more than 25 types of invoices, with average accuracy between 85% and 95%. Finally, for ease of use, a web application was developed that presents the results in a structured format. The entire system is designed to provide flexibility and to automate the extraction of details of interest from invoices.
Title: Automated invoice data extraction using image processing
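The split between an annotation tool and an extraction tool can be sketched as a template of named regions plus a routine that assigns recognized words to those regions. The example below is a hypothetical illustration in plain Python: the field names, box coordinates, and word list are made up, the `(x, y, width, height)` box layout is an assumption, and no OCR or OpenCV call is shown because the abstract does not describe the system's actual interfaces.

```python
def center(box):
    """Center point of an (x, y, width, height) box."""
    x, y, w, h = box
    return x + w / 2.0, y + h / 2.0

def contains(region, point):
    """True if the point lies inside the (x, y, width, height) region."""
    x, y, w, h = region
    px, py = point
    return x <= px <= x + w and y <= py <= y + h

def extract_fields(template, words):
    """Assign each recognized word to the first annotated field containing its center."""
    fields = {name: [] for name in template}
    # Read words top to bottom, then left to right.
    for text, box in sorted(words, key=lambda item: (item[1][1], item[1][0])):
        for name, region in template.items():
            if contains(region, center(box)):
                fields[name].append(text)
                break
    return {name: " ".join(parts) for name, parts in fields.items()}

# Regions a user might mark once per invoice layout (made-up coordinates).
template = {"invoice_no": (400, 20, 200, 40), "total": (400, 700, 200, 40)}
# Word boxes as an OCR engine might report them (made-up values).
words = [
    ("1,250.00", (460, 708, 90, 24)),
    ("INV-0042", (410, 28, 120, 24)),
    ("USD", (405, 708, 45, 24)),
    ("Thank", (50, 900, 60, 24)),
]
print(extract_fields(template, words))
```

Annotating a layout once and reusing it fits the abstract's focus on invoices with a similar format. Real OCR output rarely aligns words on exactly the same y coordinate, so a production version would need to group words into lines with some tolerance before sorting.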