Suspicious activities detection using spatial–temporal features based on vision transformer and recurrent neural network
Pub Date: 2024-05-29 | DOI: 10.1007/s12652-024-04818-7
Saba Hameed, Javaria Amin, Muhammad Almas Anjum, Muhammad Sharif
Demand for surveillance applications is growing because of the need for safety and security against anomalous events. An anomaly in a video is an event that exhibits unusual behavior. Recognizing such events manually takes time; computerized methods can reduce this time and make prediction efficient. However, accurate anomaly detection remains a challenge owing to complex backgrounds, illumination variations, and occlusion. To handle these challenges, a vision transformer convolutional recurrent neural network, named the ViT-CNN-RCNN model, is proposed for classifying suspicious activities from frames and videos. The pre-trained ViT-base-patch16-224-in21k model takes 224 × 224 × 3 video frames as input and splits each frame into 16 × 16 patches. ViT-base-patch16-224-in21k comprises a patch embedding layer, a ViT encoder, a ViT transformer layer with 11 blocks, layer normalization, and a ViT pooler. The ViT model is trained with selected learning parameters (20 training epochs, batch size 10) to categorize the input frames into thirteen classes, including robbery, fighting, shooting, stealing, shoplifting, arrest, arson, abuse, explosion, road accident, burglary, and vandalism. The CNN-RNN sequential model, designed to process sequential data, contains an input layer, a GRU layer, a second GRU layer (GRU-1), and a dense layer. It is trained with optimal hyperparameters (a frame size of 32, 30 training epochs, batch size 16) to classify videos into the corresponding class labels. The proposed model is evaluated on the UNI-Crime and UCF-Crime datasets. Experimental results show that the proposed approach performs better than recently published works.
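A minimal sketch (not the authors' code) of the frame-classification stage: applying a pre-trained ViT-base-patch16-224-in21k backbone with a 13-class head to a single 224 × 224 × 3 frame. The use of the HuggingFace transformers API and the file name are assumptions for illustration.

```python
# Sketch: classify one video frame with a pre-trained ViT (patch size 16).
# Assumes the HuggingFace `transformers` API; checkpoint name as in the abstract.
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=13,  # the thirteen suspicious-activity classes from the abstract
)

frame = Image.open("frame.jpg").convert("RGB")         # hypothetical video frame
inputs = processor(images=frame, return_tensors="pt")  # resize/normalize to 224x224
with torch.no_grad():
    logits = model(**inputs).logits                    # shape: (1, 13)
print(logits.argmax(-1).item())                        # predicted class index
```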
{"title":"Suspicious activities detection using spatial–temporal features based on vision transformer and recurrent neural network","authors":"Saba Hameed, Javaria Amin, Muhammad Almas Anjum, Muhammad Sharif","doi":"10.1007/s12652-024-04818-7","DOIUrl":"https://doi.org/10.1007/s12652-024-04818-7","url":null,"abstract":"<p>Nowadays there is growing demand for surveillance applications due to the safety and security from anomalous events. An anomaly in the video is referred to as an event that has some unusual behavior. Although time is required for the recognition of these anomalous events, computerized methods might help to decrease it and perform efficient prediction. However, accurate anomaly detection is still a challenge due to complex background, illumination, variations, and occlusion. To handle these challenges a method is proposed for a vision transformer convolutional recurrent neural network named ViT-CNN-RCNN model for the classification of suspicious activities based on frames and videos. The proposed pre-trained ViT-base-patch16-224-in21k model contains 224 × 224 × 3 video frames as input and converts into a 16 × 16 patch size. The ViT-base-patch16-224-in21k has a patch embedding layer, ViT encoder, and ViT transformer layer having 11 blocks, layer-norm, and ViT pooler. The ViT model is trained on selected learning parameters such as 20 training epochs, and 10 batch-size to categorize the input frames into thirteen different classes such as robbery, fighting, shooting, stealing, shoplifting, Arrest, Arson, Abuse, exploiting, Road Accident, Burglary, and Vandalism. The CNN-RNN sequential model is designed to process sequential data, that contains an input layer, GRU layer, GRU-1 Layer and Dense Layer. This model is trained on optimal hyperparameters such as 32 video frame sizes, 30 training epochs, and 16 batch-size for classification into corresponding class labels. The proposed model is evaluated on UNI-crime and UCF-crime datasets. The experimental outcomes conclude that the proposed approach better performed as compared to recently published works.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"95 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141198178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on badminton take-off recognition method based on improved deep learning
Pub Date: 2024-05-28 | DOI: 10.1007/s12652-024-04809-8
Lu Lianju, Zhang Haiying
Because badminton take-offs are fast, a single action-recognition method cannot identify the action quickly and accurately. Therefore, a new badminton take-off recognition method based on improved deep learning is proposed to capture take-offs accurately. Badminton videos are collected, and images of the athletes' activity areas are obtained by tracking moving targets in competition footage. The static characteristics of players' take-off actions are extracted from these activity-area images using 3D ConvNets. From the human joint points in the player's target-tracking image, a human skeleton sequence is constructed using a 2D coordinate pseudo-image and a 2D skeleton data design algorithm, and the dynamic characteristics of the take-off action are extracted from the skeleton sequence with a Long Short-Term Memory (LSTM) network. The static and dynamic features are fused by weighted summation, and the fused features are fed into a convolutional neural network (CNN) to complete take-off recognition. The CNN pooling layer is improved with adaptive pooling, and batch normalization is combined to accelerate network convergence and further optimize the recognition results. Experiments show that the human skeleton model accurately matches human movements and assists in extracting action features. The improved CNN substantially improves the accuracy of take-off recognition; on real images, it accurately identifies human movements and judges whether a take-off action occurs.
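A minimal sketch of the weighted-summation fusion step: a static (3D-ConvNet) vector and a dynamic (LSTM) vector combined into one descriptor before classification. The 256-dimensional features and the weight alpha are illustrative assumptions, not values from the paper.

```python
# Sketch: weighted-summation fusion of static and dynamic features.
import numpy as np

def fuse_features(static_feat: np.ndarray,
                  dynamic_feat: np.ndarray,
                  alpha: float = 0.5) -> np.ndarray:
    """Weighted sum of two equal-length feature vectors."""
    assert static_feat.shape == dynamic_feat.shape
    return alpha * static_feat + (1.0 - alpha) * dynamic_feat

static_feat = np.random.rand(256)   # stand-in for 3D-ConvNet output
dynamic_feat = np.random.rand(256)  # stand-in for LSTM output over the skeleton sequence
fused = fuse_features(static_feat, dynamic_feat, alpha=0.6)  # fed to the CNN classifier
```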
{"title":"Research on badminton take-off recognition method based on improved deep learning","authors":"Lu Lianju, Zhang Haiying","doi":"10.1007/s12652-024-04809-8","DOIUrl":"https://doi.org/10.1007/s12652-024-04809-8","url":null,"abstract":"<p>Because of the fast take-off speed of badminton, a single action recognition method can’t quickly and accurately identify the action. Therefore, a new badminton take-off recognition method based on improved deep learning is proposed to capture badminton take-off accurately. Collect badminton sports videos and get images of athletes’ activity areas by tracking the moving targets in badminton competition videos. The static characteristics of badminton players’ take-off actions are extracted from the athletes’ activity areas’ images using 3D ConvNets. According to the human joint points in the badminton player’s target tracking image, the human skeleton sequence is constructed by using a 2D coordinate pseudo-image and 2D skeleton data design algorithm, and the dynamic characteristics of badminton take-off action are extracted from the human skeleton sequence by using LSTM (Long-term and Short-term Memory Network). After the static and dynamic features are fused by weighted summation, badminton take-off feature fusion results are input into a convolutional neural network (CNN) to complete badminton take-off recognition. The CNN pool layer is improved by adaptive pooling, and the network convergence is accelerated by combining batch normalization to further optimize the recognition results of badminton take-off. Experiments show that the human skeleton model can accurately match human movements and assist in extracting action features. The improved CNN has greatly improved the accuracy of recognition of take-off actions. When recognizing real images, it can accurately identify human movements and judge whether there is a take-off action.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141169612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cryptography-based location privacy protection in the Internet of Vehicles
Pub Date: 2024-05-21 | DOI: 10.1007/s12652-024-04752-8
George Routis, George Katsouris, Ioanna Roussaki
{"title":"Cryptography-based location privacy protection in the Internet of Vehicles","authors":"George Routis, George Katsouris, Ioanna Roussaki","doi":"10.1007/s12652-024-04752-8","DOIUrl":"https://doi.org/10.1007/s12652-024-04752-8","url":null,"abstract":"","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"52 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141116601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Density peaks clustering algorithm based on multi-cluster merge and its application in the extraction of typical load patterns of users
Pub Date: 2024-05-18 | DOI: 10.1007/s12652-024-04808-9
Jia Zhao, Zhanfeng Yao, Liujun Qiu, Tanghuai Fan, Ivan Lee
The density peaks clustering (DPC) algorithm is simple in principle, efficient in operation, and achieves good clustering results on various types of datasets. However, it still has defects: (1) because of limitations in how the local density and relative distance of samples are defined, the algorithm has difficulty finding correct density peaks; (2) its allocation strategy has poor robustness and tends to cause further errors. To address these shortcomings, we propose a density peaks clustering algorithm based on multi-cluster merge (DPC-MM). To ease the difficulty of selecting density peaks in DPC, a new method of calculating the relative distance of samples is defined, making the density peaks found more accurate. A multi-cluster merge allocation strategy is proposed to alleviate or avoid the problems caused by allocation errors. Experimental results show that the DPC-MM algorithm efficiently clusters datasets of any shape and scale. Applied to the extraction of users' typical load patterns, DPC-MM clusters user loads more accurately, and the extracted patterns better reflect users' electricity consumption habits.
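For context, a minimal sketch of the two quantities classic DPC is built on: local density rho (cutoff kernel) and relative distance delta. DPC-MM redefines delta, so this shows only the baseline formulation the paper improves on; the cutoff distance d_c and the random data are illustrative.

```python
# Sketch: baseline DPC quantities rho (local density) and delta (relative distance).
import numpy as np
from scipy.spatial.distance import cdist

def dpc_rho_delta(X: np.ndarray, d_c: float):
    D = cdist(X, X)                    # pairwise distance matrix
    rho = (D < d_c).sum(axis=1) - 1    # neighbours within the cutoff, minus self
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = np.where(rho > rho[i])[0]
        # distance to the nearest point of higher density,
        # or the maximum distance for the global density peak
        delta[i] = D[i, higher].min() if higher.size else D[i].max()
    return rho, delta                  # density peaks have large rho AND large delta

X = np.random.rand(200, 2)
rho, delta = dpc_rho_delta(X, d_c=0.1)
```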
{"title":"Density peaks clustering algorithm based on multi-cluster merge and its application in the extraction of typical load patterns of users","authors":"Jia Zhao, Zhanfeng Yao, Liujun Qiu, Tanghuai Fan, Ivan Lee","doi":"10.1007/s12652-024-04808-9","DOIUrl":"https://doi.org/10.1007/s12652-024-04808-9","url":null,"abstract":"<p>The density peaks clustering (DPC) algorithm is simple in principle, efficient in operation, and has good clustering effects on various types of datasets. However, this algorithm still has some defects: (1) due to the definition limitations of local density and relative distance of samples, it is difficult for the algorithm to find correct density peaks; (2) the allocation strategy of the algorithm has poor robustness and is prone to cause other problems. In response to solve the above shortcomings, we proposed a density peaks clustering algorithm based on multi-cluster merge (DPC-MM). In view of the difficulty in selecting density peaks of the DPC algorithm, a new method of calculating relative distance of samples was defined to make the density peaks found more accurate. The allocation strategy of multi-cluster merge was proposed to alleviate or avoid problems caused by allocation errors. Experimental results revealed that the DPC-MM algorithm can efficiently perform clustering on datasets of any shape and scale. The DPC-MM algorithm was applied in extraction of typical load patterns of users, and can more accurately perform clustering on user loads. The extraction results can better reflect electricity consumption habits of users.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"46 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141062788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MusicEmo: transformer-based intelligent approach towards music emotion generation and recognition
Pub Date: 2024-05-17 | DOI: 10.1007/s12652-024-04811-0
Ying Xin
{"title":"MusicEmo: transformer-based intelligent approach towards music emotion generation and recognition","authors":"Ying Xin","doi":"10.1007/s12652-024-04811-0","DOIUrl":"https://doi.org/10.1007/s12652-024-04811-0","url":null,"abstract":"","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"55 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140964929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The empirical study of tweet classification system for disaster response using shallow and deep learning models
Pub Date: 2024-05-10 | DOI: 10.1007/s12652-024-04807-w
Kholoud Maswadi, Ali Alhazmi, Faisal Alshanketi, Christopher Ifeanyi Eke
Disaster-related tweets posted during an emergency carry varied information (people hurt or killed, people lost or found, infrastructure and utilities destroyed) that can assist governmental and humanitarian organizations in prioritizing their aid and rescue efforts. Given the massive volume of such tweets, it is crucial to build a model that categorizes them into distinct types so that rescue and relief efforts can be better organized and lives saved. In this study, Twitter data from the 2013 Queensland flood and the 2015 Nepal earthquake are classified as disaster or non-disaster using three classes of models. The first model uses a lexical feature based on Term Frequency-Inverse Document Frequency (TF-IDF); classification is performed with five algorithms, including DT, LR, SVM, and RF, and ensemble voting produces the final outcome. The second model uses shallow classifiers in conjunction with several features, including lexical (TF-IDF), hashtag, POS, and GloVe embeddings. The third set of models uses deep learning algorithms, including LSTM, Bi-LSTM, and GRU, with BERT (Bidirectional Encoder Representations from Transformers) constructing semantic word embeddings to learn context. Key evaluation metrics (accuracy, F1 score, recall, and precision) are used to measure and compare the three sets of models on two publicly available Twitter datasets. In a comprehensive empirical evaluation across disaster types, the DT algorithm achieved the highest accuracy, followed by the Bi-LSTM model, reaching 96.46% and 96.40% respectively on the Queensland flood dataset; the DT algorithm also attained 78.3% accuracy on the Nepal earthquake dataset with the majority-voting ensemble. This research thus contributes an investigation of how deep and shallow learning models can be integrated effectively in a tweet classification system for disaster response. Examining how the two approaches work together offers insights into how best to exploit their complementary advantages to increase the robustness and accuracy of locating relevant data in disaster crises.
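A minimal sketch of the first model family: TF-IDF lexical features with a majority ("hard") voting ensemble over DT, LR, SVM, and RF, using scikit-learn. The toy two-tweet corpus and default hyperparameters are illustrative, not the study's configuration.

```python
# Sketch: TF-IDF features + hard-voting ensemble for disaster/non-disaster tweets.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

voter = VotingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier()),
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC()),
        ("rf", RandomForestClassifier()),
    ],
    voting="hard",  # each classifier casts one vote; the majority wins
)
clf = make_pipeline(TfidfVectorizer(), voter)

tweets = ["flood waters rising near the bridge", "great coffee this morning"]
labels = [1, 0]  # 1 = disaster, 0 = non-disaster
clf.fit(tweets, labels)
print(clf.predict(["earthquake damage reported downtown"]))
```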
{"title":"The empirical study of tweet classification system for disaster response using shallow and deep learning models","authors":"Kholoud Maswadi, Ali Alhazmi, Faisal Alshanketi, Christopher Ifeanyi Eke","doi":"10.1007/s12652-024-04807-w","DOIUrl":"https://doi.org/10.1007/s12652-024-04807-w","url":null,"abstract":"<p>Disaster-based tweets during an emergency consist of a variety of information on people who have been hurt or killed, people who are lost or discovered, infrastructure and utilities destroyed; this information can assist governmental and humanitarian organizations in prioritizing their aid and rescue efforts. It is crucial to build a model that can categorize these tweets into distinct types due to their massive volume so as to better organize rescue and relief effort and save lives. In this study, Twitter data of 2013 Queensland flood and 2015 Nepal earthquake has been classified as disaster or non-disaster by employing three classes of models. The first model is performed using the lexical feature based on Term Frequency-Inverse Document Frequency (TF-IDF). The classification was performed using five classification algorithms such as DT, LR, SVM, RF, while Ensemble Voting was used to produce the outcome of the models. The second model uses shallow classifiers in conjunction with several features, including lexical (TF-IDF), hashtag, POS, and GloVe embedding. The third set of the model utilized deep learning algorithms including LSTM, LSTM, and GRU, using BERT (Bidirectional Encoder Representations from Transformers) for constructing semantic word embedding to learn the context. The key performance evaluation metrics such as accuracy, F1 score, recall, and precision were employed to measure and compare the three sets of models for disaster response classification on two publicly available Twitter datasets. By performing a comprehensive empirical evaluation of the tweet classification technique across different disaster kinds, the predictive performance shows that the best accuracy was achieved with DT algorithm which attained the highest performance accuracy followed by Bi-LSTM models for disaster response classification by attaining the best accuracy of 96.46% and 96.40% on the Queensland flood dataset; DT algorithm also attained 78.3% accuracy on the Nepal earthquake dataset based on the majority-voting ensemble respectively. Thus, this research contributes by investigating the integration of deep and shallow learning models effectively in a tweet classification system designed for disaster response. Examining the ways that these two methods work seamlessly offers insights into how to best utilize their complimentary advantages to increase the robustness and accuracy of locating suitable data in disaster crisis.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140931634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design of sports goods marketing strategy simulation system based on multi agent technology
Pub Date: 2024-05-09 | DOI: 10.1007/s12652-024-04798-8
Fei Liu
Unlike other marketing strategies, sporting goods marketing strategies are affected by the fragility and randomness of marketing data and are therefore subject to more constraints. To improve the economic benefits of sporting goods enterprises, a sporting goods marketing strategy simulation system based on multi-agent technology is proposed. The power supply is a rectifier built around the LMZ10503, ADP2164, and ADP1755 circuits; the hardware design is completed by combining a marketing strategy acquisition card module with a sporting goods marketing strategy simulator module. In the software design, the logic degree of each marketing strategy simulation node is optimized according to its evaluation results. Multi-agent technology is used to model the marketing strategy, and the simulation is realized by generating marketing strategy simulation signals. Test results show that the system successfully simulates sporting goods marketing strategies: with the system in use, sales volume and profit margin increased to 900,000 units and 90% respectively. These results validate the system's potential for optimizing marketing strategies and improving economic benefits, and they offer a reference for the sporting goods industry. Wider adoption of the system is expected to help enterprises develop more accurate and scientific marketing strategies, achieve higher sales and profit margins, and thereby promote sustainable development and competitive advantage.
{"title":"Design of sports goods marketing strategy simulation system based on multi agent technology","authors":"Fei Liu","doi":"10.1007/s12652-024-04798-8","DOIUrl":"https://doi.org/10.1007/s12652-024-04798-8","url":null,"abstract":"<p>Unlike other marketing strategies, sporting goods marketing strategies are affected by the fragility and randomness of marketing data, resulting in more restrictive factors. To improve the economic benefits of sporting goods enterprises, the design of sporting goods marketing strategy simulation system based on multi-agent technology was proposed. Based on the LMZ10503 circuit, ADP2164 circuit, and ADP1755 circuit, the rectifier is the power supply. The hardware design of the system is completed by combining the design of the marketing strategy acquisition card module and the sporting goods marketing strategy simulator module. In the software design of the system, according to the evaluation results of the logic degree of the sports marketing strategy simulation node, the logic degree of the marketing strategy simulation node is optimized. The multi-agent technology is used to implement the multi-agent modeling of the marketing strategy, and the simulation of the sports marketing strategy is realized by generating the marketing strategy simulation signal. The test results show that the sports equipment market marketing strategy simulation system based on multi-agent technology has successfully simulated the marketing strategy of sports equipment and achieved exciting results. Through the application of this system, sales volume and profit margin have increased to 900,000 units and 90% respectively. These results validate the potential of the system in optimizing marketing strategies and improving economic benefits, and provide strong reference and guidance for the sports equipment industry. Further promotion and application of this system is expected to help enterprises develop more accurate and scientific marketing strategies, achieve higher sales volume and profit margins, and thus promote the sustainable development and competitive advantage of the enterprise.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140931418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep kernelized dimensionality reducer for multi-modality heterogeneous data
Pub Date: 2024-05-09 | DOI: 10.1007/s12652-024-04804-z
Arifa Shikalgar, Shefali Sonavane
Data mining applications use high-dimensional datasets, but a large number of dimensions causes the well-known 'curse of dimensionality', which degrades the accuracy of machine learning classifiers because many unimportant and unnecessary dimensions are included in the dataset. Many approaches exist for handling such datasets, but their accuracy suffers as a result. To deal with high-dimensional data, a hybrid Deep Kernelized Stacked De-noising Autoencoder based on feature learning (DKSDA) is therefore proposed. Thanks to its layered structure, the DKSDA can manage vast amounts of heterogeneous data and performs knowledge-based reduction by taking many attributes into account. It examines all modalities, including hidden potential ones, using two fine-tuning stages: random noise is added to the input feature vectors, and a stack of de-noising autoencoders is generated. This SDA processing decreases the prediction error caused by leaving concealed structure among the modalities unanalyzed. In addition, to handle very large datasets, a Spatial Pyramid Pooling (SPP) layer is introduced into the Convolutional Neural Network (CNN) structure, reducing or removing sections other than the key characteristics by applying structural knowledge through a kernel function. Recent experiments show that the proposed DKSDA achieves an average accuracy of about 97.57% with a dimensionality reduction of 12%; pre-training reduces dimensionality while improving classification accuracy and lowering processing complexity.
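A minimal sketch of one de-noising autoencoder block of the kind an SDA stacks: noise corrupts the input and the network learns to reconstruct the clean vector. PyTorch, the layer sizes, and the noise level are illustrative assumptions; the paper's kernelized and SPP extensions are not shown.

```python
# Sketch: a single de-noising autoencoder block with one training step.
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, dim_in: int = 784, dim_hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(dim_hidden, dim_in), nn.Sigmoid())

    def forward(self, x: torch.Tensor, noise_std: float = 0.2) -> torch.Tensor:
        x_noisy = x + noise_std * torch.randn_like(x)  # corrupt the input
        return self.decoder(self.encoder(x_noisy))

model = DenoisingAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(32, 784)                     # a toy mini-batch
loss = nn.functional.mse_loss(model(x), x)  # reconstruct the *clean* input
loss.backward()
opt.step()
```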
{"title":"Deep kernelized dimensionality reducer for multi-modality heterogeneous data","authors":"Arifa Shikalgar, Shefali Sonavane","doi":"10.1007/s12652-024-04804-z","DOIUrl":"https://doi.org/10.1007/s12652-024-04804-z","url":null,"abstract":"<p>Data mining applications use high-dimensional datasets, but still, a large number of extents causes the well-known ‘Curse of Dimensionality,' which leads to worse accuracy of machine learning classifiers due to the fact that most unimportant and unnecessary dimensions are included in the dataset. Many approaches are employed to handle critical dimension datasets, but their accuracy suffers as a result. As a consequence, to deal with high-dimensional datasets, a hybrid Deep Kernelized Stacked De-Noising Auto encoder based on feature learning was proposed (DKSDA). Because of the layered property, the DKSDA can manage vast amounts of heterogeneous data and performs knowledge-based reduction by taking into account many qualities. It will examine all the multimodalities and all hidden potential modalities using two fine-tuning stages, the input has random noise along with feature vectors, and a stack of de-noising auto-encoders is generated. This SDA processing decreases the prediction error caused by the lack of analysis of concealed objects among the multimodalities. In addition, to handle a huge set of data, a new layer of Spatial Pyramid Pooling (SPP) is introduced along with the structure of Convolutional Neural Network (CNN) by decreasing or removing the remaining sections other than the key characteristic with structural knowledge using kernel function. The recent studies revealed that the DKSDA proposed has an average accuracy of about 97.57% with a dimensionality reduction of 12%. By enhancing the classification accuracy and processing complexity, pre-training reduces dimensionality.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140931424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A privacy and compliance in regulated anonymous payment system based on blockchain
Pub Date: 2024-05-09 | DOI: 10.1007/s12652-024-04801-2
Issameldeen Elfadul, Lijun Wu, Rashad Elhabob, Ahmed Elkhalil
Decentralized Anonymous Payment (DAP) systems, commonly known as cryptocurrencies, stand out as some of the most innovative and successful applications of the blockchain. These systems have garnered significant attention in the financial industry due to their highly secure and reliable features. Regrettably, DAP systems can be exploited to fund illegal activities such as drug dealing and terrorism, so governments are increasingly worried about their illicit use, which poses a critical threat to security. This paper proposes Privacy and Compliance in Regulated Anonymous Payment System Based on Blockchain (PCRAP), which provides government supervision and enforces regulations over transactions without sacrificing the essential idea of the blockchain, that is, without surrendering transaction privacy or participant anonymity. The key characteristic of the proposed scheme is its use of a ring signature and stealth addresses to ensure the anonymity of both the sender and the receiver of a transaction. Moreover, a Merkle tree is used to guarantee government supervision and enforce regulations. The proposed scheme satisfies most stringent security requirements and complies with the standards of secure payment systems. Additionally, while the work supports government regulation and supervision, it guarantees unconditional anonymity for users. Furthermore, the performance analysis demonstrates that the scheme remains applicable and effective even while achieving complete anonymity.
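A minimal sketch of the Merkle-tree component used for supervision: a regulator holding only the root can later verify that a transaction record belongs to the committed set. This is a SHA-256 toy using hashlib; the paper's ring-signature and stealth-address constructions are not reproduced here.

```python
# Sketch: computing a Merkle root over transaction records.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

txs = [b"tx1", b"tx2", b"tx3"]  # illustrative transaction records
print(merkle_root(txs).hex())   # the root a regulator would retain
```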
{"title":"A privacy and compliance in regulated anonymous payment system based on blockchain","authors":"Issameldeen Elfadul, Lijun Wu, Rashad Elhabob, Ahmed Elkhalil","doi":"10.1007/s12652-024-04801-2","DOIUrl":"https://doi.org/10.1007/s12652-024-04801-2","url":null,"abstract":"<p>Decentralized Anonymous Payment Systems (DAP), often known as cryptocurrencies, stand out as some of the most innovative and successful applications on the blockchain. These systems have garnered significant attention in the financial industry due to their highly secure and reliable features. Regrettably, the DAP system can be exploited to fund illegal activities such as drug dealing and terrorism. Therefore, governments are increasingly worried about the illicit use of DAP systems, which poses a critical threat to their security. This paper proposes Privacy and Compliance in Regulated Anonymous Payment System Based on Blockchain (PCRAP), which provides government supervision and enforces regulations over transactions without sacrificing the essential idea of the blockchain, that is, without surrendering transaction privacy or anonymity of the participants. The key characteristic of the proposed scheme is using a ring signature and stealth address to ensure the anonymity of both the sender and receiver of the transaction. Moreover, a Merkle Tree is used to guarantee government supervision and enforce regulations. Our proposed scheme satisfies most of the stringent security requirements and complies with the standards of secure payment systems. Additionally, while our work supports government regulations and supervision, it guarantees unconditional anonymity for users. Furthermore, the performance analysis demonstrates that our suggested scheme still remains applicable and effective even when achieving complete anonymity.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140931633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A top-down character segmentation approach for Assamese and Telugu handwritten documents
Pub Date: 2024-05-07 | DOI: 10.1007/s12652-024-04805-y
Prarthana Dutta, Naresh Babu Muppalaneni
Digitization offers a solution to the challenges of managing and retrieving paper-based documents. However, paper documents must first be converted into a format digital machines can comprehend, since machines primarily understand alphanumeric text. This transformation is achieved through Optical Character Recognition (OCR), a technology that converts scanned document images into a machine-processable format. This work proposes a novel multi-stage top-down character segmentation approach. The approach begins by isolating lines from handwritten documents and then uses those lines to segment words and characters. To further enhance character segmentation, a Raster Scanning object detection technique isolates individual characters within words; the final character segmentation integrates the results of vertical projection and raster scanning. Recognizing the significance of advancing the digitization of handwritten documents, we focus on the regional languages of Assam and Andhra Pradesh because of their historical and cultural importance in India's linguistic diversity. Since suitable datasets were not available, we collected handwritten-text datasets in Assamese and Telugu. The approach achieved average segmentation accuracies of 93.61%, 85.96%, and 88.74% for lines, words, and characters respectively across both languages. The motivation for a top-down approach is two-fold: first, it enhances the accuracy of character recognition; second, the segmented lines and words hold potential for future use in language and script identification.
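A minimal sketch of the vertical-projection cue, one of the two signals the approach combines: columns with no ink are treated as gaps between characters. OpenCV with Otsu binarization; the image path and the zero-ink gap criterion are illustrative assumptions.

```python
# Sketch: character segmentation of a word image by vertical projection.
import cv2

img = cv2.imread("word.png", cv2.IMREAD_GRAYSCALE)  # hypothetical word image
_, binary = cv2.threshold(img, 0, 1, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

profile = binary.sum(axis=0)        # ink pixels per column
cuts, in_char, start = [], False, 0
for x, col in enumerate(profile):
    if col > 0 and not in_char:     # a character starts
        in_char, start = True, x
    elif col == 0 and in_char:      # a character ends at a blank column
        in_char = False
        cuts.append((start, x))
if in_char:                         # image ends inside a character
    cuts.append((start, len(profile)))
characters = [binary[:, a:b] for a, b in cuts]
```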
{"title":"A top-down character segmentation approach for Assamese and Telugu handwritten documents","authors":"Prarthana Dutta, Naresh Babu Muppalaneni","doi":"10.1007/s12652-024-04805-y","DOIUrl":"https://doi.org/10.1007/s12652-024-04805-y","url":null,"abstract":"<p>Digitization offers a solution to the challenges associated with managing and retrieving paper-based documents. However, these paper-based documents must be converted into a format that digital machines can comprehend, as they primarily understand alphanumeric text. This transformation is achieved through Optical Character Recognition (OCR), a technology that converts scanned image documents into a format that machines can process. A novel top-down character segmentation approach has been proposed in this work, involving multiple stages. Our approach began by isolating lines from handwritten documents and using these lines to segment words and characters. To further enhance the character segmentation, a <i>Raster Scanning</i> object detection technique is employed to isolate individual characters within words. Thus, the character segmentation results are integrated from the results of the vertical projection and raster scanning. Recognizing the significance of advancing digitization of handwritten documents, we have chosen to focus on the regional languages of Assam and Andhra Pradesh due to their historical and cultural importance in India’s linguistic diversity. So, we have collected datasets of handwritten texts in Assamese and Telugu languages due to their unavailability in the desired form. Our approach achieved an average segmentation accuracy of 93.61%, 85.96%, and 88.74% for lines, words, and characters for both languages. The key motivation behind opting for a top-down approach is two-fold: firstly, it enhances the accuracy of character recognition, and secondly, it holds the potential for future use in language/script identification through the utilization of segmented lines and words.</p>","PeriodicalId":14959,"journal":{"name":"Journal of Ambient Intelligence and Humanized Computing","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}