Modelling Implicit Content Networks to Track Information Propagation Across Media Sources to Analyze News Events
Anirudh Joshi, R. Sinnott
2018 IEEE 14th International Conference on e-Science (e-Science), pp. 475-485, published 2018-10-01
DOI: 10.1109/eScience.2018.00136 (https://doi.org/10.1109/eScience.2018.00136)
With the rise of the Internet as the premier news source for billions of people around the world, the propagation of news media online now influences many critical decisions made by society every day. Fake news is now a mainstream concern. In the context of news propagation, recent work in media analysis largely focuses on extracting clusters, news events, or stories, or on tracking links or conserved sentences between sources at an aggregate level. However, these approaches give end users limited insight for analysis and context. To tackle this, we present an approach that models the implicit content networks, at a semantic level, inherent within news event clusters as seen by users on a daily basis, through the generation of semantic content indexes. The approach is based on an end-to-end unsupervised machine learning system, trained on real-life news data, that is combined with algorithms to generate useful contextual views of the sources and the inter-relationships of news events. We illustrate how the approach tracks conserved semantic context using a combination of machine learning techniques, including document vectors, k-nearest neighbors, and hierarchical agglomerative clustering. We demonstrate the system by training semantic vector models on real-world data taken from the Signal News dataset. We quantitatively evaluate its performance against existing state-of-the-art systems to demonstrate the end-to-end capability. We then qualitatively demonstrate the usefulness of a news-event-centered semantic content index graph for end-user applications. This is evaluated with respect to the goal of generating rich contextual interconnections and providing differential background on how news media sources report, parrot, and position information on ostensibly identical news events.
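The three techniques named in the abstract (document vectors, k-nearest neighbors, and hierarchical agglomerative clustering) can be illustrated with a minimal standard-library sketch. This is not the authors' system. The bag-of-words `embed` function is a toy stand-in for a learned document vector model, the four example headlines and source names are invented, and the helper names (`knn_edges`, `agglomerate`), the average-linkage rule, and the 0.5 threshold are assumptions chosen only to keep the example small.

```python
import math
from collections import Counter
from itertools import combinations


def embed(text):
    """Toy stand-in for a learned document vector: a bag-of-words count map."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_edges(vectors, k):
    """Link each document to its k most similar other documents."""
    edges = {}
    for i, vi in enumerate(vectors):
        scored = [(cosine(vi, vj), j) for j, vj in enumerate(vectors) if j != i]
        scored.sort(key=lambda s: (-s[0], s[1]))  # highest similarity first
        edges[i] = [j for _, j in scored[:k]]
    return edges


def agglomerate(vectors, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two most similar clusters and stops once the
    best remaining pair falls below the similarity threshold.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best_sim, a, b = max(
            (linkage(clusters[a], clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        if best_sim < threshold:
            break
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # safe: combinations guarantees a < b
    return sorted(clusters)


# Hypothetical articles from hypothetical sources (not from the Signal News dataset).
docs = [
    ("source_a", "central bank raises interest rates to curb inflation"),
    ("source_b", "central bank raises interest rates amid inflation fears"),
    ("source_c", "local team wins championship final after penalty shootout"),
    ("source_d", "team wins championship final in dramatic penalty shootout"),
]

vectors = [embed(text) for _, text in docs]
clusters = agglomerate(vectors, threshold=0.5)  # candidate news event clusters
edges = knn_edges(vectors, k=1)                 # nearest-neighbor content links

for cluster in clusters:
    print("event:", [docs[i][0] for i in cluster])
for i, neighbors in edges.items():
    print(docs[i][0], "->", [docs[j][0] for j in neighbors])
```

In this sketch the clusters play the role of news events and the nearest-neighbor links play the role of implicit content connections between sources covering the same event. A real pipeline would replace `embed` with a trained semantic vector model.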