Discovering the Causal Network of Terms from the Text Corpus
Yue Wang
2015 IEEE 12th International Conference on e-Business Engineering, 2015-10-23
DOI: 10.1109/ICEBE.2015.11
Because "bag of words" approaches discard the links between words, they are poorly suited to exploring the causal relations between terms described in text corpora. To discover networked causal knowledge from a corpus, we (1) propose the pv-swapping algorithm and the PV-parse-tree, which adjust the term order within sentences based on the observed relationship between grammatical voice and causal relation, and provide NLP approaches that transform the corpus into sets of ordered term sequences (OTS) while preserving the semantic order of the phrases in each sentence; (2) formalize the problem of extracting a causal network from the corpus and show that it is NP-hard; and (3) propose the algorithms NE-IC and heuristic-majority-vote to extract the causal network of terms from the OTS. We report experiments on real data sets. The results show that our methods are both effective and efficient at discovering the causal network of terms, and that the causal networks produced by heuristic-majority-vote contain fewer conflicting causal relations or cycles than those produced by NE-IC. Finally, we report experiments on several causal knowledge discovery tasks based on the resulting causal networks to illustrate their applications.
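The abstract does not spell out how heuristic-majority-vote works, so the following is only an illustrative sketch of one plausible reading: orient each pair of terms by the order in which they appear most often across the ordered term sequences, then check the resulting directed graph for cycles. The function names `majority_vote_edges` and `has_cycle`, the tie-handling rule, and the toy sequences are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def majority_vote_edges(sequences):
    """Orient each term pair by the order seen most often (hypothetical rule).

    Returns a dict mapping (earlier_term, later_term) to the vote margin.
    Pairs whose two orders are tied are left out.
    """
    votes = Counter()
    for seq in sequences:
        # Keep only the first occurrence of each term, preserving order.
        terms = list(dict.fromkeys(seq))
        for a, b in combinations(terms, 2):
            votes[(a, b)] += 1  # a precedes b in this sequence

    edges = {}
    for (a, b), count in votes.items():
        reverse = votes.get((b, a), 0)
        if count > reverse:
            edges[(a, b)] = count - reverse
    return edges


def has_cycle(edges):
    """Depth-first search for a directed cycle among the chosen edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])

    state = {}  # missing = unvisited, 1 = on the current path, 2 = finished

    def visit(node):
        state[node] = 1
        for nxt in graph[node]:
            if state.get(nxt) == 1:
                return True
            if state.get(nxt) is None and visit(nxt):
                return True
        state[node] = 2
        return False

    return any(state.get(n) is None and visit(n) for n in graph)


# Toy ordered term sequences (invented for this example).
ots = [
    ["smoking", "tar", "cancer"],
    ["smoking", "cancer"],
    ["cancer", "smoking"],
]
edges = majority_vote_edges(ots)
```

In this toy input the single sequence that places "cancer" before "smoking" is outvoted by the two that place it after, so only the majority direction is kept. Pairwise voting alone does not guarantee an acyclic result, which is why a separate cycle check is included.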