
Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries: Latest Papers

Session details: Panels
Pub Date : 2015-06-21 DOI: 10.1145/3260519
A. Rauber, Hideo Joho
Citations: 0
Unified Relevance Feedback for Multi-Application User Interest Modeling
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756914
S. Jayarathna, Atish Patra, F. Shipman
A user often interacts with multiple applications while working on a task. User models can be developed separately for each application, but there is no easy way to build a more complete user model from the user's activity distributed across them. To address this issue, this research studies the importance of combining various implicit and explicit relevance feedback indicators in a multi-application environment. It allows the different applications a user employs for different purposes to contribute user activity and its context, so that they mutually support users with unified relevance feedback. Using data collected by the web browser, Microsoft Word and Microsoft PowerPoint, combinations of implicit relevance feedback with semi-explicit relevance feedback were analyzed and compared with explicit user ratings. Our results are two-fold. First, we demonstrate the aggregation of implicit and semi-explicit user interest data across multiple everyday applications using our Interest Profile Manager (IPM) framework. Second, our experimental results show that combining implicit feedback with semi-explicit feedback for page-level user interest estimation resulted in a significant improvement over the content-based models.
Citations: 11
What does Twitter Measure?: Influence of Diverse User Groups in Altmetrics
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756913
Simon Barthel, S. Tönnies, B. Köhncke, Patrick Siehndel, Wolf-Tilo Balke
The most important goal for digital libraries is to ensure a high-quality search experience for all kinds of users. To attain this goal, it is necessary to have as much relevant metadata as possible at hand to assess the quality of publications. Recently, a new group of metrics has appeared that has the potential to raise the quality of publication metadata to the next level: the altmetrics. These metrics try to reflect the impact of publications within the social web. However, it is still unclear whether and how altmetrics should be used to assess the quality of a publication, and how altmetrics relate to classical bibliographic metrics (e.g., citations). To gain more insight into what kinds of concepts altmetrics reflect, we conducted an in-depth analysis of a real-world dataset crawled from the Public Library of Science (PLOS). In particular, we analyzed whether the common approach of regarding the users of the social web as one homogeneous group is sensible, or whether users need to be divided into diverse groups in order to obtain meaningful results.
Citations: 21
Result List Actions in Fiction Search
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756911
P. Vakkari, J. Pöntinen
This study examines how users browse search results to find interesting novels in four search scenarios. In particular, it evaluates whether search result page (SERP) browsing patterns and effectiveness differ between an enriched catalog for finding fiction and a traditional public library catalog. The data were collected from 30 participants through eye-tracking and questionnaires. The results indicate that the enriched catalog helped users identify potentially clickable items on the results list sooner and more effectively than the traditional public library catalog did. This is likely due to the more informative metadata in the enriched catalog, such as snippets of content description in the result list items. The discussion includes a theoretical and empirical comparison of findings from studies on fiction and non-fiction searching.
Citations: 5
Multi-Emotion Estimation in Narratives from Crowdsourced Annotations
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756910
Lei Duan, S. Oyama, Haruhiko Sato, M. Kurihara
Emotion annotations are important metadata for narrative texts in digital libraries. Such annotations are necessary for automatic text-to-speech conversion of narratives and affective education support and can be used as training data for machine learning algorithms to train automatic emotion detectors. However, obtaining high-quality emotion annotations is a challenging problem because it is usually expensive and time-consuming due to the subjectivity of emotion. Moreover, due to the multiplicity of "emotion", emotion annotations more naturally fit the paradigm of multi-label classification than that of multi-class classification since one instance (such as a sentence) may evoke a combination of multiple emotion categories. We thus investigated ways to obtain a set of high-quality emotion annotations ({instance, multi-emotion} paired data) from variable-quality crowdsourced annotations. A common quality control strategy for crowdsourced labeling tasks is to aggregate the responses provided by multiple annotators to produce a reliable annotation. Given that the categories of "emotion" have characteristics different from those of other kinds of labels, we propose incorporating domain-specific information of emotional consistencies across instances and contextual cues among emotion categories into the aggregation process. Experimental results demonstrate that, from a limited number of crowdsourced annotations, the proposed models enable gold standards to be more effectively estimated than the majority vote and the original domain-independent model.
Citations: 4
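The abstract of "Multi-Emotion Estimation in Narratives from Crowdsourced Annotations" names the majority vote as a baseline for aggregating crowdsourced labels. The following is a minimal Python sketch of what such a baseline might look like in the multi-label setting. It is not the authors' proposed model and not taken from the paper; the function name `majority_vote`, the `threshold` parameter, and the example emotion labels are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(annotations, threshold=0.5):
    """Aggregate multi-label annotations for one instance (hypothetical helper).

    annotations: list of label sets, one set per annotator.
    A label is kept when strictly more than `threshold` of the
    annotators assigned it to the instance.
    """
    if not annotations:
        return set()
    # Count each label at most once per annotator.
    counts = Counter(label for labels in annotations for label in set(labels))
    n = len(annotations)
    return {label for label, c in counts.items() if c / n > threshold}


# Three annotators label the same sentence with (illustrative) emotions.
votes = [{"joy", "surprise"}, {"joy"}, {"sadness", "joy"}]
print(majority_vote(votes))
```

Each emotion is voted on independently, which is what makes this a multi-label rather than a multi-class aggregation: an instance can end up with several emotions or with none.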
WikiMirs 3.0: A Hybrid MIR System Based on the Context, Structure and Importance of Formulae in a Document
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756918
Yuehan Wang, Liangcai Gao, Simeng Wang, Zhi Tang, Xiaozhong Liu, Ke Yuan
Mathematical information is increasingly available on websites and in repositories such as arXiv, Wikipedia and a growing number of digital libraries. Mathematical formulae are highly structured and usually presented in layout formats such as PDF, LaTeX and Presentation MathML. The differences in presentation between text and formulae challenge traditional text-based indexing and retrieval methods. To address this challenge, this paper proposes an upgraded Mathematical Information Retrieval (MIR) system, WikiMirs 3.0, based on the context, structure and importance of formulae in a document. In WikiMirs 3.0, users can easily "cut" formulae and contexts from PDF documents as well as type in queries. Furthermore, a novel hybrid indexing and matching model is proposed to support both exact and fuzzy matching. The hybrid model takes both the context and the structure information of formulae into consideration. In addition, the concept of formula importance within a document is introduced into the model for more reasonable ranking. Experimental results, compared with two classical MIR systems, demonstrate that the proposed system with the novel model provides higher accuracy and better ranking results on Wikipedia.
Citations: 24
Reconstruction of the US First Website
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756954
A. Alsum
The idea of the Web started in 1989 with a proposal from Sir Tim Berners-Lee. The first US website was developed at SLAC in 1991. This early version of the Web and its subsequent updates until 1998 have been preserved by the SLAC archive and history office for many years. In this paper, we discuss the strategy and techniques used to reconstruct this early website and make it available through the Stanford Web Archive Portal.
Citations: 3
Time will Tell: Temporal Linking of News Stories
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756919
Thomas Bögel, Michael Gertz
Readers of news articles are typically faced with the problem of getting a good understanding of a complex story covered in an article. However, as news articles mainly focus on current or recent events, they often do not provide sufficient information about the history of an event or topic, leaving the user alone in discovering and exploring other news articles that might be related to a given article. This is a time-consuming and non-trivial task, and the only help provided by some news outlets is a list of related articles or a few links within the article itself. What further complicates this task is that many of today's news stories cover a wide range of topics and events even within a single article, thus leaving the realm of traditional approaches that track a single topic or event over time. In this paper, we present a framework to link news articles based on temporal expressions that occur in the articles, following the idea "if an article refers to something in the past, then there should be an article about that something". Our approach aims to recover the chronology of one or more events and topics covered in an article, leading to an information network of articles that can be explored in a thematic and, in particular, chronological fashion. For this, we propose a measure for the relatedness of articles that is primarily based on temporal expressions in articles but also exploits other information such as persons mentioned and keywords. We provide a comprehensive evaluation that demonstrates the functionality of our framework using a multi-source corpus of recent German news articles.
Citations: 11
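The abstract of "Time will Tell: Temporal Linking of News Stories" describes a relatedness measure that relies primarily on temporal expressions and also uses mentioned persons and keywords. The abstract does not give the actual formula, so the Python sketch below is only one hypothetical way to combine such signals: a weighted sum of Jaccard overlaps. The `Article` class, the weights, and the assumption that dates have already been extracted and normalized to ISO strings are all illustrative, not the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical article record with pre-extracted features."""
    dates: set = field(default_factory=set)      # normalized dates, e.g. "2015-01-07"
    persons: set = field(default_factory=set)    # names of persons mentioned
    keywords: set = field(default_factory=set)   # salient keywords


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relatedness(x, y, w_time=0.6, w_person=0.2, w_keyword=0.2):
    """Weighted overlap score; the weights are assumed, with time dominating."""
    return (w_time * jaccard(x.dates, y.dates)
            + w_person * jaccard(x.persons, y.persons)
            + w_keyword * jaccard(x.keywords, y.keywords))


# Two articles sharing one of three distinct referenced dates.
a = Article(dates={"2014-12-01", "2015-01-07"})
b = Article(dates={"2015-01-07", "2015-02-15"})
print(relatedness(a, b))
```

Giving the temporal component the largest weight mirrors the abstract's statement that the measure is "primarily based on temporal expressions"; in practice the weights would have to be tuned on a labeled corpus.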
Building Complex Research Collections in Digital Libraries: A Survey of Ontology Implications
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756944
Terhi Nurmikko-Fuller, Kevin R. Page, P. Willcox, Jacob Jett, Christopher R. Maden, Timothy W. Cole, Colleen Fallaw, Megan Senseney, J. S. Downie
Bibliographic metadata standards are a longstanding mechanism for Digital Libraries to manage records and express relationships between them. As digital scholarship, particularly in the humanities, incorporates and manipulates these records in an increasingly direct manner, existing systems are proving insufficient for providing the underlying addressability and relational expressivity required to construct and interact with complex research collections. In this paper we describe motivations for these "worksets" and the technical requirements they raise. We survey the coverage of existing bibliographic ontologies in the context of meeting these scholarly needs, and finally provide an illustrated discussion of potential extensions that might fully realize a solution.
Citations: 14
Digital Data Curation Essentials for Data Scientists and Data Curators and Librarians
Pub Date : 2015-06-21 DOI: 10.1145/2756406.2756928
H. Tibbo, C. Hank
This paper gives a detailed description of a full-day digital data curation tutorial held at JCDL'15.
Citations: 2