OCR systems have been widely used in many fields, such as office automation, file management, and online education. However, because of their high demand for computing resources, they mostly run on desktop or server platforms. In recent years, the performance of mobile devices has steadily improved, and they are increasingly used in people's life and work. In this paper, we design an OCR system for mobile devices that makes better use of the device's performance, improves the stability of mobile OCR tasks, and reduces dependence on network state. We achieve this by applying various strategies to slim and enhance the model used on the server; the total size of the final model is only 20M.
{"title":"An OCR System : Towards Mobile Device","authors":"Peng Yang, Junfeng Zhang, Jiangfeng Xu, Yumin Li","doi":"10.1145/3556677.3556685","DOIUrl":"https://doi.org/10.1145/3556677.3556685","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"123995011","RegionNum":0,"Total":0}
Sleep is one of the most critical functions of the human body, yet many disorders disrupt this physiological process. These conditions can be diagnosed by observing the pattern and length of the sleep stages that a patient enters; however, this requires a specialist to manually score the patient's EEG patterns. Manual scoring is time-consuming and inaccessible, whereas accurate, automated scoring of sleep stages by artificial intelligence would help medical professionals quickly offer diagnoses and treatments. In this paper, we propose a bagged trees model that uses wavelet decomposition for feature extraction and random undersampling to handle the inherent data imbalance. We achieve 85.1% and 87.1% accuracy on 5-fold cross-validation and the test set, respectively. The accuracy is consistent across all stages, indicating that the model may be more suitable for real-world applications than other models with nominally higher accuracies.
{"title":"An Undersampled Model for Automated Sleep Stage Scoring Using EEG Data: Utilization of DWT, bagged trees, and random undersampling to achieve more consistent accuracy on the sleepstage problem","authors":"Zachary I. Li, James Yang, Jianguo Liu","doi":"10.1145/3556677.3556696","DOIUrl":"https://doi.org/10.1145/3556677.3556696","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"123007025","RegionNum":0,"Total":0}
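The random undersampling step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common form of the technique, in which every class is cut down to the size of the rarest class; the abstract does not state the exact ratio used. The function name `random_undersample`, the stage labels, and the class counts are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many items as the rarest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Size of the rarest class: every class is reduced to this count.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [l for _, l in balanced]


# Toy, made-up class counts (assumption: not the paper's data).
labels = ["W"] * 50 + ["N2"] * 30 + ["REM"] * 10
samples = list(range(len(labels)))  # stand-ins for per-epoch feature vectors

balanced_samples, balanced_labels = random_undersample(samples, labels)
print(Counter(balanced_labels))
```

In a real pipeline the balanced set would only be used for training; the test set would keep its natural class distribution so that reported accuracy reflects real-world conditions.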
Recognising weather conditions from still images is challenging because of the diversity of weather and the lack of distinct characteristics in many weather conditions. Some researchers have used the K-nearest neighbor method to recognise a specific extract of a weather condition in order to test the efficiency of the recognition task. Other works that attempted to resolve this problem viewed weather recognition as a single identifier task. To enhance the accuracy of recognising weather conditions, this research uses the convolutional layers of the Resnet-15 model to extract the essential features of an image, and then uses the fully connected layers and a softmax classifier to recognise and classify the images. A small dataset of images from diverse scenes, called dataset-2, is used, and the Resnet-15 model is trained and tested on it. In the experiments, the proposed approach correctly recognises the weather conditions of the images, with better accuracy and speed and a reduced model size.
{"title":"Weather Recognition Based on Still Images Using Deep Learning Neural Network with Resnet-15","authors":"Peace Uloma Egbueze, Z. Wang","doi":"10.1145/3556677.3556688","DOIUrl":"https://doi.org/10.1145/3556677.3556688","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"115928060","RegionNum":0,"Total":0}
Oracle bone inscriptions (OBIs) are the earliest Chinese writing system. However, deciphering OBIs is a very challenging task because of the lack of data and the time- and resource-consuming manual classification process. In this paper, I apply deep learning to the problem of OBI recognition. I propose and successfully implement a method for merging incompatible OBI classification datasets, which significantly raises the training and testing accuracy of the neural networks tested. Another major contribution of this paper is the inclusion of a residual module in the AlexNet convolutional neural network, which achieves an accuracy of 89.51% after hyperparameter optimization on the merged dataset, about 1% better than the classical AlexNet under the same conditions, meeting expectations.
{"title":"Automated Recognition of Oracle Bone Inscriptions Using Deep Learning and Data Augmentation","authors":"Zhao Lyu","doi":"10.1145/3556677.3556700","DOIUrl":"https://doi.org/10.1145/3556677.3556700","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"114944761","RegionNum":0,"Total":0}
In this paper, we describe a way of using translation memory (TM) to improve the translation quality and stability of neural machine translation (NMT) systems, especially when the sentences to be translated are highly similar to sentences stored in the TM. The sentences to be translated and the sentences stored in the TM may differ in only a few phrases. Our TM comprises not only paired sentences (i.e., a sentence in the source language paired with its translation in the target language) but also paired phrases. Translation quality is improved by using good phrase translations for the differing phrases, and the NMT system is used to assist phrase translation. We tested our TM on 3,000 English-Chinese paired sentences randomly picked from recent annual reports published and submitted to the Hong Kong Stock Exchange. Our TM translations achieved a significant BLEU improvement over our NMT translations for highly similar sentences.
{"title":"Using Translation Memory to Improve Neural Machine Translations","authors":"Wu Zhang, Tung Yeung Lam, Mee Yee Chan","doi":"10.1145/3556677.3556691","DOIUrl":"https://doi.org/10.1145/3556677.3556691","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"126966101","RegionNum":0,"Total":0}
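The idea of finding a highly similar TM sentence and isolating the few phrases that differ can be sketched as follows. This is only an illustration under stated assumptions: it uses token-level matching from Python's standard `difflib` module as the similarity measure, which the abstract does not specify, and the helper names, example sentences, and placeholder targets (`<zh-1>`, `<zh-2>`) are hypothetical.

```python
from difflib import SequenceMatcher


def best_tm_match(source, tm_pairs):
    """Return (similarity, tm_source, tm_target) for the TM entry closest to `source`."""
    src_tokens = source.split()
    best = None
    for tm_source, tm_target in tm_pairs:
        ratio = SequenceMatcher(None, src_tokens, tm_source.split()).ratio()
        if best is None or ratio > best[0]:
            best = (ratio, tm_source, tm_target)
    return best


def differing_phrases(source, tm_source):
    """List (tm_phrase, new_phrase) pairs where the new sentence departs from the TM sentence."""
    a, b = tm_source.split(), source.split()
    diffs = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs


# Hypothetical TM entries; the targets are placeholders, not real translations.
tm_pairs = [
    ("The group reported a net loss in 2021", "<zh-1>"),
    ("The board declared a final dividend", "<zh-2>"),
]
source = "The group reported a net profit in 2021"

similarity, tm_source, tm_target = best_tm_match(source, tm_pairs)
print(similarity, tm_source)
print(differing_phrases(source, tm_source))
```

In a full system, each differing phrase would then be looked up in the paired-phrase part of the TM, or passed to the NMT system, and substituted into the stored target sentence.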
This paper investigates the water-entry impact load of the trans-medium aircraft (TMA) during the media-cross procedure. A generalized regression neural network (GRNN) is adopted to describe the characteristics of the water-entry impact load, which is represented by the acceleration variable. The training data for the water-entry impact load, with velocities of 0, 2 m/s, 4 m/s, 6 m/s, and 8 m/s, angles of 90°, 80°, 70°, 60°, and 50°, and attitudes of 90°, 80°, 70°, 60°, and 50°, are generated by the finite element method based on the coupled Eulerian-Lagrangian (CEL) algorithm. The results show that the GRNN approximates the impact load of the TMA well, with a root mean square error (RMSE) of 19.005. The deep learning algorithm for characterizing water-entry impact load can provide a good reference for the structural load evaluation of the TMA.
{"title":"Presentation of water-entry impact load for TMA during media-cross procedure based on GRNN","authors":"Dong Hao, J. Yu","doi":"10.1145/3556677.3556680","DOIUrl":"https://doi.org/10.1145/3556677.3556680","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"134310346","RegionNum":0,"Total":0}
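For readers unfamiliar with the GRNN, its prediction is a Gaussian-kernel-weighted average of the training targets, controlled by a single smoothing parameter sigma. The sketch below shows that standard formulation together with the RMSE metric. The training points, targets, and sigma value are made up for illustration and are not the paper's CEL simulation data; the input normalisation is also an assumption.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """GRNN output: Gaussian-kernel-weighted average of the training targets."""
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Query is too far from every sample for this sigma; fall back to the mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def rmse(predictions, targets):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets))


# Hypothetical normalised inputs (velocity, angle) and made-up peak accelerations.
train_x = [(0.0, 0.5), (0.5, 0.7), (1.0, 0.9)]
train_y = [1.0, 4.0, 9.0]

at_sample = grnn_predict((0.5, 0.7), train_x, train_y, sigma=0.05)
between = grnn_predict((0.75, 0.8), train_x, train_y, sigma=0.3)
print(at_sample, between)
```

A small sigma makes the network reproduce the training targets almost exactly, while a larger sigma smooths between neighbouring samples, so sigma is the main quantity to tune against a held-out RMSE.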
In this paper, a video moving object tracking method based on SVM and the Meanshift tracking algorithm is proposed. The location of the tracked object is selected in the initial image of the sports video, and the feature vectors of the object and of the background around it are obtained. These feature vectors are used to train an SVM binary classifier, which then classifies the next video image into object and background to obtain a confidence map. The Meanshift tracking algorithm is used to find the current center position of the tracked object within the confidence map, and the centers of the object frame and background frame are moved to that position. The object frame is then zoomed at a 10% scale, and the best-fitting frame is selected to adapt to changes in object size. The method then checks whether the last frame of the video has been reached; if not, a new SVM classifier is trained using the current object and background pixels to track the next frame, until the moving object has been tracked through the entire video sequence. The experimental results show that the proposed method can track moving objects in video accurately and in real time.
{"title":"Moving Object Tracking Method Based on SVM and Meanshift Tracking Algorithm","authors":"Fan Zhang","doi":"10.1145/3556677.3556701","DOIUrl":"https://doi.org/10.1145/3556677.3556701","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":" 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"132076135","RegionNum":0,"Total":0}
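The mean shift step on a confidence map can be sketched in a few lines. The sketch assumes a square search window and a synthetic Gaussian confidence map that stands in for the SVM classifier output; the function name `mean_shift`, the window size, and the stopping tolerance are illustrative choices rather than details from the paper. The SVM training and the frame-scaling step are not shown.

```python
import math


def mean_shift(conf_map, center, half_size, max_iter=20, tol=0.5):
    """Move a square window to the confidence-weighted centroid until it stops moving."""
    rows, cols = len(conf_map), len(conf_map[0])
    cy, cx = center
    for _ in range(max_iter):
        iy, ix = int(round(cy)), int(round(cx))
        # Clip the window to the image borders.
        r0, r1 = max(0, iy - half_size), min(rows, iy + half_size + 1)
        c0, c1 = max(0, ix - half_size), min(cols, ix + half_size + 1)
        total = sum_y = sum_x = 0.0
        for r in range(r0, r1):
            for c in range(c0, c1):
                w = conf_map[r][c]
                total += w
                sum_y += w * r
                sum_x += w * c
        if total == 0.0:
            break  # no object evidence inside the window
        new_cy, new_cx = sum_y / total, sum_x / total
        shift = math.hypot(new_cy - cy, new_cx - cx)
        cy, cx = new_cy, new_cx
        if shift < tol:
            break  # the window has settled
    return cy, cx


# Synthetic confidence map (assumption): a Gaussian blob centred on the object at (9, 9).
SIZE, PEAK = 15, (9, 9)
conf_map = [
    [math.exp(-((r - PEAK[0]) ** 2 + (c - PEAK[1]) ** 2) / 8.0) for c in range(SIZE)]
    for r in range(SIZE)
]

# Start from the object's position in the previous frame (hypothetical).
cy, cx = mean_shift(conf_map, center=(6, 6), half_size=4)
print(round(cy, 2), round(cx, 2))
```

In the tracking loop described above, the returned center would become the new position of the object and background frames before the classifier is retrained for the next image.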
In this paper, we propose a method for evaluating SHAP values as they change over time. SHAP values are based on Shapley theory and have been widely used to interpret machine-learning-based regression results, so the SHAP approach plays an important role in machine-learning regression analysis. We apply the SHAP approach to time series analysis, which is effective when the target values fluctuate but the explanatory variables vary little over a long time, as with the behavioral characteristics of a company. We use data from the automobile manufacturing industry just after the outbreak of COVID-19. After the worst plunge in stock prices, many automakers' stock prices recovered and began to grow rapidly again. We ran regressions with the recovery rate as the target variable to find the factors that were important for the recovery. The regression method we used is XGBoost. As a result, we found that the explanatory variable “sales growth ratio” was the most important factor for the stock recovery. In addition, the important factors for individual companies could be evaluated in detail as time series data using the SHAP sequences. This SHAP-based time series analysis method is applicable to various fields.
{"title":"Time Series Analysis of SHAP Values by Automobile Manufacturers Recovery Rates","authors":"Y. Shirota, Kotaro Kuno, H. Yoshiura","doi":"10.1145/3556677.3556697","DOIUrl":"https://doi.org/10.1145/3556677.3556697","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"126641526","RegionNum":0,"Total":0}
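To make the notion of a SHAP sequence concrete, the sketch below computes exact Shapley values by enumerating feature subsets and repeats the computation for one model per time period. This brute-force form is only practical for a handful of features; it is not the tree-based algorithm that would normally be used with XGBoost. The linear per-period models, the feature values, and the all-zero baseline are hypothetical, as is the assumption that a separate regression is fitted for each period.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline value.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def make_linear_model(weights):
    """Stand-in for a fitted regression model (assumption: not XGBoost)."""
    return lambda z: sum(w * v for w, v in zip(weights, z))


# One hypothetical model per period; the company's features stay fixed over time.
period_models = [make_linear_model((2.0, 3.0)), make_linear_model((1.0, 4.0))]
x = [1.0, 2.0]          # e.g. two explanatory variables of one company
baseline = [0.0, 0.0]   # reference input the contributions are measured against

shap_sequence = [shapley_values(m, x, baseline) for m in period_models]
print(shap_sequence)
```

Reading the sequence column by column shows how the contribution of each explanatory variable to one company's predicted recovery rate changes from period to period.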
Social media has become a significant news source as the modern world develops. Compared with traditional news media such as newspapers and television, social media platforms such as Twitter, Facebook, and Weibo let people consume and share news much faster. These platforms are not regulated, which leads to massive amounts of fake news being produced online and causes severe negative impacts on politics, economics, and social well-being. Thus, detecting fake news on social media is extremely important but technically challenging. This paper proposes a hybrid fake news detection model called CSIBERT, which extracts text features of news events using a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and introduces other social context features via the Capture, Score, and Integrate (CSI) framework. Our proposed model outperforms existing models with an accuracy of 97.1%. In addition, the CSIBERT model achieves decent performance even with a small number of labeled samples on the Weibo fake news detection tasks, demonstrating its ability to address the label shortage problem in fake news detection.
{"title":"Detecting Fake News on Social Media by CSIBERT","authors":"Yawen Deng, Sheng-Wen Wang","doi":"10.1145/3556677.3556698","DOIUrl":"https://doi.org/10.1145/3556677.3556698","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"114408478","RegionNum":0,"Total":0}
With the rapid development of the Internet, voice recognition has become one of the core technologies of the information era. Bird monitoring through sound recognition can be used as an effective indicator of wetland environmental quality. In this paper, we use Python to classify birds based on Mel-frequency cepstral coefficient (MFCC) features using K-Nearest Neighbor, support vector machine, and multi-layer perceptron classifiers. Further, we compare these algorithms and propose a novel classifier based on them. The experimental results show that the new classifier combines the fast prediction speed of the Multi-Layer Perceptron with the high accuracy and strong noise immunity of the K-Nearest Neighbor.
{"title":"Bird sound recognition based on novel classifier","authors":"Guowei Lei, Qiang Shu, Ruixing Cai, Wenliang Liao","doi":"10.1145/3556677.3556681","DOIUrl":"https://doi.org/10.1145/3556677.3556681","PeriodicalId":350340,"journal":{"name":"Proceedings of the 2022 6th International Conference on Deep Learning Technologies","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"126350983","RegionNum":0,"Total":0}