Yao Wang, Kun Li, Xiang Zhang, Jinhai Wang, Ran Wei
Visual stimulators play an important role in the steady-state visually evoked potential (SSVEP) brain-computer interface (BCI). Traditional displays limit the application of SSVEP-BCI. Augmented reality (AR), as a new kind of visual stimulator, offers more flexibility in practical SSVEP-BCI applications. In this study, the stimulus interface was presented on a liquid crystal display (LCD) and on HoloLens, respectively. The feasibility experiment compared the influence on EEG signal acquisition when HoloLens was on and off, and the stability experiment compared the flicker of HoloLens with that of the LCD. The feasibility and stability of HoloLens in SSVEP-BCI were assessed by SSVEP classification accuracy. First, the accuracy with HoloLens was consistent with that of the traditional display during EEG acquisition, showing that wearing HoloLens does not affect EEG signal acquisition. Second, comparing AR-induced SSVEP with display-induced SSVEP, the classification accuracy in the AR environment was 44.27%, 83.85%, 93.23%, 98.44%, and 98.44% for data lengths of 0.5 s, 1.0 s, 1.5 s, 2.0 s, and 2.5 s, respectively; the corresponding accuracies for the display were 73.44%, 95.31%, 98.44%, 99.48%, and 99.48%. There was no difference in accuracy beyond 2 s, indicating that HoloLens can replace the traditional display in SSVEP-BCI applications.
{"title":"Research on the Application of Augmented Reality in SSVEP-BCI","authors":"Yao Wang, Kun Li, Xiang Zhang, Jinhai Wang, Ran Wei","doi":"10.1145/3404555.3404587","DOIUrl":"https://doi.org/10.1145/3404555.3404587","url":null,"abstract":"Visual stimulators play an important role in the steady-state visually evoked potential (SSVEP) brain computer interface (BCI). Traditional displays limit the application of SSVEP-BCI. Augmented reality, as a new pattern of visual stimulator, can be more flexible in the practical applications of SSVEP-BCI. In this study, the stimulus interface was presented by liquid crystal display and HoloLens, respectively. The feasibility experiment was to compared the influence on the acquisition of EEG signals when HoloLens was on and off. The stability experiment compared the flicker of HoloLens with LCD. The feasibility and stability of HoloLens in SSVEP-BCI was proved by the accuracy of SSVEP. First, the accuracy of HoloLens's is consistent with the accuracy of traditional display during the acquisition of EEG signals. It was proved that the application of HoloLens will not affect the acquisition of EEG signals. Second, compared the ARinduced SSVEP with traditional display-induced SSVEP, the accuracy of EEG signal classification in AR environment was 44.27%, 83.85%, 93.23%, 98.44% and 98.44% respectively when the data length was 0.5 s, 1.0 s, 1.5 s, 2.0 s and 2.5 s. The corresponding accuracy rate of the display was 73.44%, 95.31%, 98.44%, 99.48% and 99.48%. There was no difference in accuracy values after 2 seconds. HoloLens can completely replace the traditional display in the application of SSVEP-BCI.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"171 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121343866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The rapidly growing computation and memory demands have become a bottleneck for the application of convolutional neural networks (CNNs). Model compression is an efficient way to accelerate CNNs. However, commonly designed architectures are not suitable for compressed models and waste large amounts of computational resources on zero operands. In this work, we propose a flexible CNN inference accelerator on FPGA that exploits the uniform sparsity introduced by pattern pruning to achieve high performance. Our accelerator architecture exploits different input and output parallelism for sparse computation to maximize the utilization of the computing arrays. A dynamically adjustable mechanism is designed to deal with the unbalanced workload. Moreover, a novel data buffering structure with slightly rearranged sequences is applied to address the challenge of access conflicts. Experiments show that our accelerator achieves 316.4 GOP/s to 343.5 GOP/s for VGG-16 and ResNet-50.
{"title":"A High Energy-Efficiency Inference Accelerator Exploiting Sparse CNNs","authors":"Ning Li","doi":"10.1145/3404555.3404626","DOIUrl":"https://doi.org/10.1145/3404555.3404626","url":null,"abstract":"The significantly growing computation and memory demands have become a bottleneck for the application of convolutional neural networks (CNNs). Model compression is an efficient method to accelerate CNNs. However, the commonly designed architectures are not suitable for compressed models and waste large computational resources on zero operands. In this work, we propose a flexible CNNs inference accelerator on FPGA utilizing uniform sparsity introduced by pattern pruning to achieve high performance. Our accelerator architecture exploits different input & output parallelism for sparse computation to maximize the utilization of computing arrays. A dynamically adjustable mechanism is designed to deal with the unbalanced workload. What's more, a novel data buffering structure with slightly rearranged sequences is applied to address the challenge of access conflict. The experiments show that our accelerator can achieve 316.4 GOP/s ~ 343.5 GOP/s for VGG-16 and ResNet-50.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125948892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mapping- and masking-based deep learning methods are both essential approaches to speech dereverberation at present; they typically enhance the amplitude of the reverberant speech while leaving the reverberant phase unprocessed. The reverberant phase and the enhanced amplitude are then used to synthesize the target speech. However, because overlapping frames interfere with each other during the superposition process (overlap-and-add), the final synthesized speech signal deviates from the ideal value. In this paper, we propose an amplitude consistent enhancement (ACE) method to solve this problem. When training the deep neural networks (DNNs) with ACE, we use the difference between the amplitudes of the synthesized and clean speech as the loss function. We also propose adding an adjustment layer to improve the regression accuracy of the DNN. Speech dereverberation experiments show that the proposed method improves PESQ and SNR by 5% and 15%, respectively, compared with the traditional signal approximation method.
{"title":"Amplitude Consistent Enhancement for Speech Dereverberation","authors":"Chunlei Liu, Longbiao Wang, J. Dang","doi":"10.1145/3404555.3404618","DOIUrl":"https://doi.org/10.1145/3404555.3404618","url":null,"abstract":"The mapping and masking methods based on deep learning are both essential methods for speech dereverberation at present, which typically enhance the amplitude of the reverberant speech while letting the reverberant phase unprocessed. The reverberant phase and enhanced amplitude are used to synthesize the target speech. However, because the overlapping frames interfere with each other during the superposition process (overlap-and-add), the final synthesized speech signal will deviate from the ideal value. In this paper, we propose an amplitude consistent enhancement method (ACE) to solve this problem. With ACE to train the deep neural networks (DNNs), we use the difference between amplitudes of the synthesized and clean speech as the loss function. Also, we propose a method of adding an adjustment layer to improve the regression accuracy of DNN. The speech dereverberation experiments show that the proposed method has improved the PESQ and SNR by 5% and 15% compared with the traditional signal approximation method.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"1991 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128227370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Relation extraction is a necessary step in obtaining information from electronic medical records. Deep learning methods for relation extraction are primarily based on word2vec and convolutional or recurrent neural networks. However, word vectors generated by word2vec are static and cannot reflect the different meanings of polysemous words in different contexts, and the feature extraction ability of RNNs (recurrent neural networks) is limited. Meanwhile, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained language model has achieved excellent results in many natural language processing tasks. In this paper, we propose a medical relation extraction model based on BERT. We combine the whole-sentence information obtained from the pre-trained language model with the corresponding information of two medical entities to complete the relation extraction task. The experimental data were obtained from Chinese electronic medical records provided by a hospital in Beijing. Experimental results show that our model reaches an accuracy of 67.37%, precision of 69.54%, recall of 67.38%, and F1-score of 68.44%, higher than three other methods. Because named entity recognition is the premise of relation extraction, we will combine this model with named entity recognition in future work.
{"title":"Research on Relation Extraction Method of Chinese Electronic Medical Records Based on BERT","authors":"Shengxin Gao, Jinlian Du, Xiao Zhang","doi":"10.1145/3404555.3404635","DOIUrl":"https://doi.org/10.1145/3404555.3404635","url":null,"abstract":"Relation extraction is a necessary step in obtaining information from electronic medical records. The deep learning methods for relation extraction are primarily based on word2vec and convolutional or recurrent neural network. However, word vectors generated by word2vec are static and cannot well reflect the different meanings of polysemy in different contexts and the feature extraction ability of RNN (Recurrent Neural Network) is not good enough. At the same time, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained language model has achieved excellent results in many natural language processing tasks. In this paper, we propose a medical relation extraction model based on BERT. We combine the information of the whole sentence obtained from the pre-train language model with the corresponding information of two medical entities to complete relation extraction task. The experimental data were obtained from the Chinese electronic medical records provided by a hospital in Beijing. Experimental results on electronic medical records show that our model's accuracy, precision, recall, and F1-score reach 67.37%, 69.54%, 67.38%, 68.44%, which are higher than other three methods. Because named entity recognition task is the premise of relation extraction, we will combine the model with named entity recognition in the future work.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128450219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning from multi-modal data is very common in current data mining and knowledge management applications. However, the information imbalance between modalities poses challenges for many multi-modal learning tasks, such as cross-modal retrieval, image captioning, and image synthesis. Understanding the cross-modal information gap is an important foundation for designing models and choosing evaluation criteria for those applications. For text and image data in particular, existing studies have proposed abstractness to measure the information imbalance. They evaluate the abstractness disparity by training a classifier on manually annotated multi-modal sample pairs. However, these methods ignore the impact of the intra-modal relationship on the inter-modal abstractness; moreover, the annotation process is labor-intensive and its quality cannot be guaranteed. To evaluate the text-image relationship more comprehensively and reduce the cost of evaluation, we propose the relative abstractness index (RAI) to measure the abstractness between multi-modal items, which scores a sample according to its certainty in differentiating items of the other modality. We also propose a cycled generation model to compute RAI values between images and text. In contrast to existing work, the proposed index better describes the image-text information disparity, and its computation needs no annotated training samples.
{"title":"Generalization or Instantiation?: Estimating the Relative Abstractness between Images and Text","authors":"Qibin Zheng, Xiaoguang Ren, Yi Liu, Wei Qin","doi":"10.1145/3404555.3404610","DOIUrl":"https://doi.org/10.1145/3404555.3404610","url":null,"abstract":"Learning from multi-modal data is very often in current data mining and knowledge management applications. However, the information imbalance between modalities brings challenges for many multi-modal learning tasks, such as cross-modal retrieval, image captioning, and image synthesis. Understanding the cross-modal information gap is an important foundation for designing models and choosing the evaluating criteria of those applications. Especially for text and image data, existing researches have proposed the abstractness to measure the information imbalance. They evaluate the abstractness disparity by training a classifier using the manually annotated multi-modal sample pairs. However, these methods ignore the impact of the intra-modal relationship on the inter-modal abstractness; besides, the annotating process is very labor-intensive, and the quality cannot be guaranteed. In order to evaluate the text-image relationship more comprehensively and reduce the cost of evaluating, we propose the relative abstractness index (RAI) to measure the abstractness between multi-modal items, which measures the abstractness of a sample according to its certainty of differentiating the items of another modality. Besides, we proposed a cycled generating model to compute the RAI values between images and text. In contrast to existing works, the proposed index can better describe the image-text information disparity, and its computing process needs no annotated training samples.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114365847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xueyong Hu, Bei Wang, Lei Zhao, Yang Yang, Aiyu Hu, Ge Pan, Baoxian Zhou
Smart grids are becoming more complex with the growth of big data, and technical documents and institutional standards are constantly updated. As a result, it is difficult for workers in different positions to obtain the information and data they need. This paper addresses this problem by combining deep learning algorithms with an existing knowledge map to build a user intent prediction model. By extracting user characteristics and using a dynamic matching algorithm, intent prediction is achieved. In this way, the required standards and requirements can be found faster and more directly during work, which effectively improves employees' working efficiency and reduces the difficulty of learning and training.
{"title":"Research on Search Intent Prediction for Big Data of National Grid System Standards","authors":"Xueyong Hu, Bei Wang, Lei Zhao, Yang Yang, Aiyu Hu, Ge Pan, Baoxian Zhou","doi":"10.1145/3404555.3404642","DOIUrl":"https://doi.org/10.1145/3404555.3404642","url":null,"abstract":"Smart grids are becoming more complex due to the development of big data., and technical documents and institutional standards are constantly updated. As a result, It is difficult for workers in different positions to obtain the required information and data. This thesis is oriented towards this problem, and combined with deep learning algorithms to build a user intent prediction model based on the existing knowledge map. By extracting user characteristics and using a dynamic matching algorithm, the purpose of intent prediction is achieved. In this way, the required standards and requirements can be found faster and more directly in the work process, which effectively improves the working efficiency of employees and reduces the difficulty of learning and training.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128824141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper considers an Energy Harvesting Wireless Sensor Network (EH-WSN) whose nodes have a dual alternative battery system. We propose a stateless distributed reinforcement learning based routing algorithm, named QLRA, in which each node learns the best next hop(s) to forward its data based on the battery and data information of its neighbors. We study how the number of sources and the path exploration probability impact the performance of QLRA. Numerical results show that, after learning, QLRA achieves minimal end-to-end delays in all tested scenarios, about 18% lower than the average end-to-end delay of a competing routing algorithm.
{"title":"Reinforcement Learning Based Routing in EH-WSNs with Dual Alternative Batteries","authors":"T. Zhao, Luyao Wang, Kwan-Wu Chin","doi":"10.1145/3404555.3404569","DOIUrl":"https://doi.org/10.1145/3404555.3404569","url":null,"abstract":"This paper considers an Energy Harvesting Wireless Sensor Network (EH-WSN) where nodes have a dual alternative battery system. We propose a stateless distributed reinforcement learning based routing algorithm, named QLRA, where each node learns the best next hop(s) to forward its data based on the battery and data information of its neighbors. We study how the number of sources and path exploration probability impacts the performance of QLRA. Numerical results show that after learning, QLRA is able to achieve minimal end-to-end delays in all tested scenarios, which is about 18% lower than the average end-to-end delay of a competing routing algorithm.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128218745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, machine learning has been widely used in malware analysis and has achieved unprecedented success. However, deep learning models are highly vulnerable to adversarial examples, which makes machine learning-based malware analysis methods vulnerable to malware makers. Exploring attack algorithms can promote the development of more effective malware analysis methods as well as defense algorithms. Different machine learning models use different malware features as their classification basis, and accordingly there are different attack methods against them. For malware visualization methods, no corresponding effective adversarial attack has yet appeared. Most existing adversarial examples against malware visualization are generated at the feature level and do not consider whether the generated examples can be executed and still perform their original functions. In this paper, we explore how to modify an Android executable file without affecting its original functions and turn it into an adversarial example. We propose an executable adversarial example attack strategy against machine learning-based malware visualization analysis. Experimental results show that the executable adversarial examples we generate run normally on Android devices without affecting their original functions and can fool the malware family classifier with a 93% success rate. We also explore possible defense methods and hope to contribute to building a more robust malware classification method.
{"title":"From Image to Code: Executable Adversarial Examples of Android Applications","authors":"Shangyu Gu, Shaoyin Cheng, Weiming Zhang","doi":"10.1145/3404555.3404574","DOIUrl":"https://doi.org/10.1145/3404555.3404574","url":null,"abstract":"Recent years, Machine Learning has been widely used in malware analysis and achieved unprecedented success. However, deep learning models are found to be highly vulnerable to adversarial examples, which leads to the machine learning-based malware analysis methods vulnerable to malware makers. Exploring the attack algorithm can not only promote the generation of more effective malware analysis methods, but also can promote the development of the defense algorithm. Different machine learning models use different malware features as their classification basis, and accordingly there will be different attack methods against them. For malware visualization method, corresponding effective adversarial attack has not yet appeared. Most existing malware adversarial examples for malware visualization are generated at the feature level, and do not consider whether the generated adversarial examples can be executed and complete their original functions. In this paper, we explored how to modify an Android executable file without affecting its original functions and made it become an adversarial example. We proposed an executable adversarial examples attack strategy for machine learning-based malware visualization analysis. Experimental result shows that the executable adversarial examples we generated can be normally run on Android devices without affecting its original functions, and can confuse the malware family classifier with 93% success rate. We explored possible defense methods and hope to contribute to building a more robust malware classification method.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133410636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic patent classification based on the TRIZ inventive principles is essential for patent management and industrial analysis. However, acquiring labels for deep learning methods is extraordinarily difficult and costly. This paper proposes a new two-stage semi-supervised learning framework called TRIZ-ESSL (Enhanced Semi-Supervised Learning for TRIZ). TRIZ-ESSL makes full use of both labeled and unlabeled data to improve prediction performance. It takes advantage of semi-supervised sequence learning and a mixed objective function, a combination of cross-entropy, entropy minimization, adversarial, and virtual adversarial loss functions. First, TRIZ-ESSL uses unlabeled data to train a recurrent language model. Second, it initializes the weights of the LSTM-based model with the pre-trained recurrent language model and then trains the text classification model with the mixed objective function on both the labeled and unlabeled sets. On three TRIZ-based classification tasks, TRIZ-ESSL outperforms currently popular semi-supervised training methods and BERT in terms of accuracy.
{"title":"A Semi-Supervised Learning Framework for TRIZ-Based Chinese Patent Classification","authors":"Lixiao Huang, Jiasi Yu, Yongjun Hu, Huiyou Chang","doi":"10.1145/3404555.3404600","DOIUrl":"https://doi.org/10.1145/3404555.3404600","url":null,"abstract":"Automatic patent classification based on the TRIZ inventive principles is essential for patent management and industrial analysis. However, acquiring labels for deep learning methods is extraordinarily difficult and costly. This paper proposes a new two-stage semi-supervised learning framework called TRIZ-ESSL, which stands for Enhanced Semi-Supervised Learning for TRIZ. TRIZ-ESSL makes full use of both labeled and unlabeled data to improve the prediction performance. TRIZ-ESSL takes the advantages of semi-supervised sequence learning and mixed objective function, a combination of cross-entropy, entropy minimization, adversarial and virtual adversarial loss functions. Firstly, TRIZ-ESSL uses unlabeled data to train a recurrent language model. Secondly, TRIZ-ESSL initializes the weights of the LSTM-based model with the pre-trained recurrent language model and then trains the text classification model using mixed objective function on both labeled and unlabeled sets. On 3 TRIZ-based classification tasks, TRIZ-ESSL outperforms the current popular semi-supervised training methods and Bert in terms of accuracy score.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131135791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, the global divorce rate has remained high. Which couples will divorce and what factors lead to divorce are important problems worth studying. In this paper, we apply three machine learning algorithms (Support Vector Machine (SVM), Random Forest (RF), and Natural Gradient Boosting (NGBoost)) to a divorce prediction dataset. The dataset consists of 170 samples, each of which contains 54 questions about the couple's emotional status. We use the scores of the 54 questions as the features of each sample. Compared with SVM and RF, NGBoost shows superior performance, achieving an accuracy of 0.9833, precision of 0.9769, and F1-score of 0.9828. In addition, we report the most important features in the RF and NGBoost models to identify the most important factors that lead to divorce.
{"title":"Is Your Marriage Reliable?: Divorce Analysis with Machine Learning Algorithms","authors":"Jue Kong, Tianrui Chai","doi":"10.1145/3404555.3404559","DOIUrl":"https://doi.org/10.1145/3404555.3404559","url":null,"abstract":"In recent years, global divorce rate is still high. What kind of couple will divorce and what factors lead to divorce are important problems that worth studying. In this paper, we apply three machine learning algorithms (Support Vector Machine (SVM), Random forest (RF) and Natural Gradient Boosting (NGBoost)) on a divorce prediction dataset. The dataset consists of 170 samples, each of which contains 54 questions about the couple's emotional status. We regard the scores of 54 questions as the features of each sample to apply our machine learning algorithms. Compared with SVM and RF, NGBoost has superior performance as NGBoost can achieve 0.9833 accuracy, 0.9769 precision and 0.9828 F1 score. In addition, we also show the most important features in the model of RF and NGBoost to find the most important factors which lead to divorce.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"38 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134226198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}