Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469546
Jincheng Li
Chemical compound toxicity prediction is a challenging learning problem because the number of active chemicals obtained from toxicity assays is far smaller than the number of inactive chemicals, i.e. the data are imbalanced. Neural networks trained on such imbalanced tasks tend to misclassify minority-class samples as majority-class samples. In this paper, we propose a novel learning method that combines multi-task deep neural network learning with over-sampling to handle both the imbalanced data and the lack of training data in toxicity prediction. Over-sampling is a re-sampling method that tackles the class imbalance problem by replicating minority-class samples. For each toxicity prediction task, we apply over-sampling to the training set to generate synthetic samples of the minority class and balance the training data. We then train the multi-task deep neural network on the tasks with balanced training sets. Multi-task learning shares common information among tasks, and the balanced data sets contain more training data; both benefit multi-task deep neural network learning. Experimental results on the Tox21 toxicity prediction data set show that our method significantly relieves the imbalanced data problem of multi-task deep neural network learning and outperforms both the multi-task deep neural network without over-sampling and many other computational approaches, such as support vector machines and random forests.
Title: Imbalanced Toxicity Prediction Using Multi-Task Learning and Over-Sampling. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
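The over-sampling step described in the abstract above can be illustrated with a minimal sketch. It assumes the simplest variant, random replication of minority-class samples until the classes are the same size; the abstract does not say which over-sampling algorithm the paper uses, and the function name `random_oversample` and the toy data are hypothetical. In a multi-task setting the function would presumably be applied to each task's training set separately.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Replicate minority-class samples until every class matches the largest class."""
    rng = random.Random(seed)
    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw (with replacement) as many extra copies as the class is missing.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy task (assumed data): 6 inactive compounds (label 0) versus 2 active ones (label 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have 6 samples
```

Over-sampling should only touch the training split; replicating samples before the train/test split would leak copies of test samples into training.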
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469557
Lingfei Zhang, Chunfang Li
Media from many countries now publish large numbers of tweets reporting on international hot topics, and the rapid spread of news events on Twitter has become increasingly common. For hotspot mining of news events, topic division and sentiment analysis are two indispensable factors. In this paper, we use topic segmentation and sentiment analysis to mine hotspots in social media news, using tweets from US and Chinese media about Huawei-related news in 2019. First, we apply LDA to the media tweets to divide them into topics and obtain the related topic words. We then devise improved methods for effective sentiment analysis of media tweets and of influencer comments, respectively. Finally, we draw some conclusions about news hotspot mining in social media tweets.
Title: Research on Hotspot Mining Method of Twitter News Report Based on LDA and Sentiment Analysis. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469558
Xinqi Fan, Rizwan Qureshi, A. Shahid, Jianfeng Cao, Luoxiao Yang, H. Yan
Facial expression recognition has been applied widely in human-machine interaction, security and business applications. Its aim is to classify human expressions from face images. In this work, we propose a novel neural network-based pipeline for facial expression recognition, the Hybrid Separable Convolutional Inception Residual Network, which uses transfer learning with the Inception residual network and depthwise separable convolution. Specifically, our method uses a multi-task convolutional neural network for face detection, then modifies the last two blocks of the original Inception residual network with depthwise separable convolution to reduce the computation cost, and finally uses transfer learning to take advantage of transferable weights from a large face recognition dataset. Experimental results on three databases (the Radboud Faces Database, the Compounded Facial Expression of Emotions Database, and the Real-world Affective Face Database) show superior performance compared with existing studies. Moreover, the proposed method is computationally efficient and has approximately 25% fewer trainable parameters than the original Inception residual network.
Title: Hybrid Separable Convolutional Inception Residual Network for Human Facial Expression Recognition. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
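To see why depthwise separable convolution lowers the parameter count, the following sketch compares the weights of one standard convolution layer with those of its depthwise separable replacement. The layer sizes (3x3 kernel, 256 input and output channels) are assumed purely for illustration and are not taken from the paper; bias terms are ignored. The per-layer saving is much larger than the roughly 25% reported for the whole network, which is consistent with only the last two blocks being modified.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: one k x k x c_in filter per output channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution that mixes channels."""
    return k * k * c_in + c_in * c_out


# Assumed layer sizes, for illustration only.
k, c_in, c_out = 3, 256, 256
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(sep / std, 3))  # 589824 67840 0.115
```

The ratio equals 1/c_out + 1/k^2, so for 3x3 kernels and wide layers the separable version needs a little more than one ninth of the weights.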
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469529
Si-Ting Zou, Qing Sun, Dangdang Du
This paper explores a method for establishing a dynamic model of an inverted pendulum based on system identification. The maximum likelihood (ML) method in the frequency domain is innovatively applied to identify the model parameters of the inverted pendulum system (IPS). The frequency-domain ML method is used to solve the output error (OE) model with a transient term for the system transfer function between the output state variables and the input variable. Finally, the parameters identified with the frequency-domain ML method are compared with those obtained from the time-domain weighted least squares method. An experiment on a single inverted pendulum system is designed in which only measured data are available and a random time-varying control signal serves as the excitation signal. The numerical results show that the frequency-domain ML method is effective for identifying the inverted pendulum system.
Title: Identification of Inverted Pendulum System Using Frequency Domain Maximum Likelihood Estimation. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469551
F. Wu, Zhenjie Shi, Zhaowei Dong, C. Pang, Bailing Zhang
Sentiment analysis, also known as opinion mining, is an important area of research for analyzing people’s opinions. In online e-commerce marketplaces such as Taobao, customers can comment on products, brands and services using text and numerical ratings. Such product reviews are valuable for improving product quality because they influence consumers’ purchase decisions. In this paper, we introduce a novel model, SenBERT-CNN, to analyze customer reviews. To capture more sentiment information in sentences, the SenBERT-CNN model combines a pre-trained Bidirectional Encoder Representations from Transformers (BERT) network with a Convolutional Neural Network (CNN). Specifically, we use the BERT structure to express sentence semantics as a text vector, and then extract deep features of the sentence through a CNN. The effectiveness of the proposed method is validated on a collected set of mobile phone product reviews from the e-commerce website JD.com.
Title: Sentiment Analysis of Online Product Reviews Based On SenBERT-CNN. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469540
Nan Jiang, Junwei Jia, Dongmei Shao
This paper compares and analyzes the training performance of the Convolutional Recurrent Neural Network (CRNN) and the Convolutional Neural Network (CNN) in speech emotion recognition. CNNs do not extract temporal information, and general temporal models are insufficient to represent spatial information; to address both problems, the CRNN is applied to speech emotion recognition. Taking Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Frequency Cepstral Coefficients (GFCC) as the input features of the model, the recognition performance of the CRNN and the CNN is compared and analyzed. The results show that the CRNN achieves higher accuracy with both features, which effectively improves the computing power of the speech emotion model and provides a theoretical basis and an optimization direction for improving the accuracy of speech emotion recognition.
Title: Comparative Study of Speech Emotion Recognition Based On CNN and CRNN. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469044
Omar Shindi, Qi Yu, D. Dong, Jiangjun Tang
This paper investigates quantum control problems using tabular Q-learning. A modified tabular Q-learning algorithm based on a dynamic greedy method is proposed, and it succeeds in finding control sequences that drive a two-qubit system to a given target state with high fidelity. The modified algorithm also shows improved performance over traditional Q-learning when solving quantum control problems on continuous state spaces. Moreover, the modified tabular Q-learning algorithm is compared with stochastic gradient descent and the Krotov algorithm for solving quantum control problems. Simulation results on a two-qubit system demonstrate the effectiveness of the proposed algorithm.
Title: A Modified Q-Learning Algorithm for Control of Two-Qubit Systems. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
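As background for the abstract above, here is a minimal sketch of tabular Q-learning with an exploration rate that decays over episodes. The abstract does not define the paper's "dynamic greedy method", so the decaying epsilon-greedy schedule is only one plausible reading and is an assumption. The five-state chain environment, the function names and all hyperparameters are hypothetical and merely stand in for the two-qubit control problem.

```python
import random


def epsilon_at(episode, eps_start=1.0, eps_end=0.05, decay=0.99):
    """Exploration rate that shrinks geometrically with the episode index (assumed schedule)."""
    return max(eps_end, eps_start * decay ** episode)


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update toward reward + gamma * max_a' Q(s', a')."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def step(s, a):
    """Toy chain: action 1 moves right, action 0 moves left; reaching state 4 pays 1 and ends."""
    s_next = min(4, s + 1) if a == 1 else max(0, s - 1)
    return s_next, (1.0 if s_next == 4 else 0.0), s_next == 4


rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(5)]
for episode in range(300):
    s = 0
    eps = epsilon_at(episode)
    for _ in range(50):  # cap on episode length
        a = choose_action(q[s], eps, rng)
        s_next, r, done = step(s, a)
        q_update(q, s, a, r, s_next)
        s = s_next
        if done:
            break

# Greedy action in each non-terminal state after training.
greedy = [max(range(2), key=lambda a: q[s][a]) for s in range(4)]
print(greedy)
```

In a control setting, a state would correspond to a discretised system state, an action to a control value for one time step, and the reward to a fidelity-based score; how the paper makes those choices is not stated in the abstract.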
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469579
Shuangshuang Kong, Hui Wang, Kaijun Wang
Small data analytics tackles data analysis challenges, such as overfitting, that arise when the data set is small. There are different approaches to small data analytics, including knowledge-based learning, but most of them require experience to use. In this paper we consider another approach, the lattice machine. The lattice machine is a learning algorithm based on conservative generalisation: it "learns" by generalising data in a consistent, conservative and parsimonious way. A lattice machine model built from a data set is a set of hyper tuples that tightly "wrap around" clusters of data, each hyper tuple being a conservative generalisation of the underlying cluster. A key feature of the lattice machine, and indeed of any learning algorithm based on conservative generalisation, is high precision and low recall, which limits its use in applications that need high recall, such as disease (e.g. COVID-19) screening. It is thus necessary to improve the lattice machine’s recall whilst retaining its high precision. In this paper, we present a study on how to achieve this for the lattice machine.
Title: Conservative Generalisation for Small Data Analytics – An Extended Lattice Machine Approach. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
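The idea of a hyper tuple that tightly "wraps around" a cluster can be sketched for purely numeric data. This illustration assumes that a hyper tuple holds one closed interval per attribute and that the generalisation of a cluster is the smallest such box covering it; the function names `join` and `covers` and the sample points are hypothetical, and the abstract does not describe the paper's extended method.

```python
def join(points):
    """Smallest axis-aligned box (one closed interval per attribute) covering all points."""
    dims = range(len(points[0]))
    return [(min(p[d] for p in points), max(p[d] for p in points)) for d in dims]


def covers(hyper_tuple, point):
    """A point is covered only if every attribute value lies inside its interval."""
    return all(lo <= v <= hi for (lo, hi), v in zip(hyper_tuple, point))


# Assumed cluster of training points that share one class label.
positives = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.2)]
box = join(positives)
print(box)                      # [(1.0, 2.0), (2.0, 2.5)]
print(covers(box, (1.2, 2.1)))  # True: inside the box
print(covers(box, (2.4, 2.3)))  # False: just outside, so not claimed for this class
```

Because the box never extends beyond the observed data, points it covers are very likely to belong to the class (high precision), while genuine class members that fall slightly outside are missed (low recall), which is the trade-off the abstract sets out to improve.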
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469590
Chan-liang Wu, Chunfang Li, Wenjuan Jiang
As the two peaks of Chinese culture, the poems of the Tang and Song Dynasties attract numerous scholars to devote themselves to research. The "China Biographical Database Project" (CBDB) is a relational database of Chinese historical figures established by Harvard University, and "Chinese poetry: the most complete database of ancient Chinese poetry" is an open source project on GitHub. This paper uses these databases, takes the poetry of the Tang and Song Dynasties as its main body, and combines related video from the cultural variety show "Chinese Poetry Conference" with Chinese textbooks for primary school to build the "Knowledge Graph of Tang and Song Dynasties" system, which integrates video and teaching, and figures and works, across media. The system provides comprehensive search of figures, poems and related videos, online generation of the knowledge graph, and a chronological display of the top 100 figures grouped by the emperors of past dynasties. Users can interact with the knowledge graph by clicking, typing, dragging and so on to carry out exploratory visual analysis.
Title: The Realization of Cross-Media Knowledge Graph of Tang and Song Poetry. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).
Pub Date: 2020-12-02 DOI: 10.1109/ICMLC51923.2020.9469536
Zhi-Ming Deng, Minyong Shi, Chunfang Li
Traditional methods for digitizing electronic textbooks are limited by the layout of text and illustrations, and their data processing results are poor. To adapt to complex and changeable data formats, this paper proposes an adaptive data partitioning technique. We divide all the text and illustrations in a textbook into independent data blocks, locate and crop them, and use OCR to recognize the content of each region so that the processing targets are clearer. Experiments on junior middle school history textbooks were evaluated in terms of data recognition rate. The experimental results show that the proposed method is effective for digitizing electronic textbooks.
Title: Digitalization of Electronic Textbook Based on OPENCV. Venue: 2020 International Conference on Machine Learning and Cybernetics (ICMLC).