We demonstrate a prototype that uses open-source software to put a full-text searchable copy of Wikipedia on a Raspberry Pi, giving nearby devices access to the content via Wi-Fi or Bluetooth without requiring internet connectivity. This short paper articulates the advantages of such a form factor and provides an evaluation of browsing and search capabilities. We believe that personal digital libraries on lightweight mobile computing devices represent an interesting research direction to pursue.
{"title":"The Sum of All Human Knowledge in Your Pocket: Full-Text Searchable Wikipedia on a Raspberry Pi","authors":"Jimmy J. Lin","doi":"10.1145/2756406.2756938","DOIUrl":"https://doi.org/10.1145/2756406.2756938","url":null,"abstract":"We demonstrate a prototype that takes advantage of open-source software to put a full-text searchable copy of Wikipedia on a Raspberry Pi, providing nearby devices access to content via wifi or bluetooth without requiring internet connectivity. This short paper articulates the advantages of such a form factor and provides an evaluation of browsing and search capabilities. We believe that personal digital libraries on lightweight mobile computing devices represent an interesting research direction to pursue.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"10 43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124733682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Petr Knoth, Kris Jack, Lucas Anastasiou, Nuno Freire, Nancy Pontika, Drahomira Herrmannova
{"title":"WOSP2015: 4th International Workshop on Mining Scientific Publications","authors":"Petr Knoth, Kris Jack, Lucas Anastasiou, Nuno Freire, Nancy Pontika, Drahomira Herrmannova","doi":"10.1145/2756406.2756932","DOIUrl":"https://doi.org/10.1145/2756406.2756932","url":null,"abstract":"","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125690783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A survey measured users' perceived self-efficacy about interactively retrieving digital video, both overall and according to different factors potentially related to user confidence preceding an actual video search session. A total of 270 surveys with quantifiable responses were collected and analyzed. T-tests and correlation tests produced significant findings about users' levels of perceived self-efficacy, including associations with topic familiarity, the type or nature of the information need, and system context. The findings give researchers a better understanding of users' confidence and preconceptions prior to interactive information retrieval (IIR) sessions for video, providing valuable insight into users' attitudes that can be used to promote initial and continued use of interactive tools like digital libraries.
{"title":"User and Topical Factors in Perceived Self-Efficacy of Video Digital Libraries","authors":"Dan E. Albertson, Boryung Ju","doi":"10.1145/2756406.2756950","DOIUrl":"https://doi.org/10.1145/2756406.2756950","url":null,"abstract":"A survey measured users' perceived self-efficacy about interactively retrieving digital video, both overall and according to different factors potentially related to user confidence preceding an actual video search session. A total of 270 surveys, with quantifiable responses, were collected and analyzed. T-tests and correlation tests produced significant findings about users' levels of perceived self-efficacy, including associations with topic familiarly, type or nature of the information need, and system context. Findings give researchers a better understanding of users' confidence and preconceptions prior to interactive information retrieval (IIR) sessions for video, providing valuable insight about users' attitudes which can be used to promote initial and continued use of interactive tools like digital libraries.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128262599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents 5e^{x+y}, a system that can extract, index, and query mathematical content expressed as mathematical expressions, complementing the CERN Document Server (CDS). We present the most important aspects of its design and our approach to modeling the relevant features of mathematical content, and we provide a demonstration of its search capabilities.
{"title":"5e{x+y}: Searching over Mathematical Content in Digital Libraries","authors":"A. Oviedo, Nikos Kasioumis, K. Aberer","doi":"10.1145/2756406.2756953","DOIUrl":"https://doi.org/10.1145/2756406.2756953","url":null,"abstract":"This paper presents 5ex+y, a system that is able to extract, index and query mathematical content expressed as mathematical expressions, complementing the CERN Document Server (CDS). We present the most important aspects of its design, our approach to model the relevant features of the mathematical content, and provide a demonstration of its searching capabilities.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124623473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The organisation of personal data is receiving increasing research attention due to the challenges that are faced in gathering, enriching, searching and visualising this data. Given the increasing quantities of personal data being gathered by individuals, the concept of a lifelong digital library of rich multimedia and sensory content for every individual is becoming a reality. This panel brought together researchers from different parts of the information retrieval and digital libraries community to debate the opportunities and challenges for researchers in this new and challenging area.
{"title":"Lifelong Digital Libraries","authors":"C. Gurrin, F. Hopfgartner","doi":"10.1145/2756406.2756974","DOIUrl":"https://doi.org/10.1145/2756406.2756974","url":null,"abstract":"The organisation of personal data is receiving increasing research attention due to the challenges that are faced in gathering, enriching, searching and visualising this data. Given the increasing quantities of personal data being gathered by individuals, the concept of a lifelong digital library of rich multimedia and sensory content for every individual is becoming a reality. This panel brought together researchers from different parts of the information retrieval and digital libraries community to debate the opportunities and challenges for researchers in this new and challenging area.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120950088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Poster & Demo Session","authors":"Kyumin Lee, Martin Klein","doi":"10.1145/3260518","DOIUrl":"https://doi.org/10.1145/3260518","url":null,"abstract":"","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126227659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuo Yang, Yansong Feng, Lei Zou, Aixia Jia, Dongyan Zhao
Taxonomy is a useful and ubiquitous way to organize knowledge. As online education attracts more and more attention, organizing lecture notes and exercises from different online sources in a more structured form has become an effective way to guide users to course materials. However, manually annotating large corpora to build a detailed taxonomy is expensive and time consuming. In this paper, we propose a taxonomy induction framework with limited human involvement. We also show that the constructed taxonomy can be used to improve recommendations of lecture notes and exercises.
{"title":"Taxonomy Induction and Taxonomy-based Recommendations for Online Courses","authors":"Shuo Yang, Yansong Feng, Lei Zou, Aixia Jia, Dongyan Zhao","doi":"10.1145/2756406.2756968","DOIUrl":"https://doi.org/10.1145/2756406.2756968","url":null,"abstract":"Taxonomy is a useful and ubiquitous way to organize knowledge. As online education attracting more and more attention, organizing lecture notes or exercises, from different online sources, in a more structured form has become an effective way to navigate users to better access course materials. However, it is expensive and time consuming to manually annotate large amounts of corpora to build a detailed taxonomy. In this paper, we propose a taxonomy induction framework with limited human involvement. We also show that the constructed taxonomy can be used to improve lecture notes/exercises recommendations.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134232961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the field of academic document search, citations are often used for measuring implicit relationships between documents. Recently, some studies have attempted to extend co-citation searching. However, these studies mainly compare traditional and extended co-citation search methods; the combination effects of word-based and extended co-citation search algorithms have not yet been sufficiently evaluated. This paper empirically evaluates the search performance of the combination search using a test collection comprising about 152,000 documents and the metric 'precision at k'. The experimental results indicate that the combination search outperforms two baseline methods: a word-based search, and a combination of word-based and traditional co-citation search algorithms.
{"title":"Combination Effects of Word-based and Extended Co-citation Search Algorithms","authors":"Masaki Eto","doi":"10.1145/2756406.2756957","DOIUrl":"https://doi.org/10.1145/2756406.2756957","url":null,"abstract":"In the field of academic document search, citations are often used for measuring implicit relationships between documents. Recently, some studies have attempted to extend co-citation searching. However, these studies mainly focus on comparisons of traditional co-citation and extended co-citation search methods; combination effects of word-based and extended co-citation search algorithms have not yet been sufficiently evaluated. This paper empirically evaluates the search performance of the combination search by using a test collection comprising about 152,000 documents and a metric 'precision at k.' The experimental results indicate that the combination search outperforms two baseline methods: a word-based search and a combination search of word-based and traditional co-citation search algorithms.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131721196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
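The 'precision at k' metric named in the abstract above can be illustrated with a minimal sketch. This is not the paper's evaluation code; the function name `precision_at_k` and the document identifiers are hypothetical, and the sketch assumes binary relevance judgments and the common convention of dividing by k even when fewer than k results are returned.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Return the fraction of the top-k ranked documents that are relevant.

    Assumption: relevance is binary, and the denominator is always k,
    so a ranking shorter than k is penalised for the missing slots.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Hypothetical ranking returned by a search method, and the judged-relevant set.
ranking = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d9"}
p_at_2 = precision_at_k(ranking, relevant, 2)
```

Comparing two search methods with this metric amounts to computing it over the same queries and the same k for each method, then averaging across queries.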
{"title":"Session details: Session 8 - Temporality","authors":"E. Rasmussen","doi":"10.1145/3260516","DOIUrl":"https://doi.org/10.1145/3260516","url":null,"abstract":"","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132147301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The citation of resources is a fundamental part of scholarly discourse. Due to the popularity of the web, there is an increasing trend for scholarly articles to reference web resources (e.g. software, data). However, due to the dynamic nature of the web, the referenced links may become inaccessible ('rotten') sometime after publication, returning a "404 Not Found" HTTP error. In this paper we first present some preliminary findings of a study of the persistence and availability of web resources referenced from papers in a large-scale scholarly repository. We reaffirm previous research that link rot is a serious problem in the scholarly world and that current web archives do not always preserve all rotten links. Therefore, a more pro-active archival solution needs to be developed to further preserve web content referenced in scholarly articles. To this end, we propose to apply machine learning techniques to train a link rot predictor for use by an archival framework to prioritise pro-active archiving of links that are more likely to be rotten. We demonstrate that we can obtain a fairly high link rot prediction AUC (0.72) with only a small set of features. By simulation, we also show that our prediction framework is more effective than current web archives for preserving links that are likely to be rotten. This work has a potential impact for the scholarly world where publishers can utilise this framework to prioritise the archiving of links for digital preservation, especially when there is a large quantity of links to be archived.
{"title":"No More 404s: Predicting Referenced Link Rot in Scholarly Articles for Pro-Active Archiving","authors":"Ke Zhou, Claire Grover, Martin Klein, R. Tobin","doi":"10.1145/2756406.2756940","DOIUrl":"https://doi.org/10.1145/2756406.2756940","url":null,"abstract":"The citation of resources is a fundamental part of scholarly discourse. Due to the popularity of the web, there is an increasing trend for scholarly articles to reference web resources (e.g. software, data). However, due to the dynamic nature of the web, the referenced links may become inaccessible ('rotten') sometime after publication, returning a \"404 Not Found\" HTTP error. In this paper we first present some preliminary findings of a study of the persistence and availability of web resources referenced from papers in a large-scale scholarly repository. We reaffirm previous research that link rot is a serious problem in the scholarly world and that current web archives do not always preserve all rotten links. Therefore, a more pro-active archival solution needs to be developed to further preserve web content referenced in scholarly articles. To this end, we propose to apply machine learning techniques to train a link rot predictor for use by an archival framework to prioritise pro-active archiving of links that are more likely to be rotten. We demonstrate that we can obtain a fairly high link rot prediction AUC (0.72) with only a small set of features. By simulation, we also show that our prediction framework is more effective than current web archives for preserving links that are likely to be rotten. This work has a potential impact for the scholarly world where publishers can utilise this framework to prioritise the archiving of links for digital preservation, especially when there is a large quantity of links to be archived.","PeriodicalId":256118,"journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115666091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
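To make the AUC figure quoted in the link rot abstract above easier to interpret, the following sketch computes AUC directly from its pairwise definition: the probability that a randomly chosen rotten link receives a higher predicted score than a randomly chosen live link. It is an illustration only, not the authors' predictor; the function name `roc_auc`, the labels, and the scores are all hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Compute ROC AUC by comparing every positive/negative score pair.

    labels: 1 for a rotten link, 0 for a live link (assumed encoding).
    scores: predicted likelihood of rot; higher means more likely rotten.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for four links: two rotten (1) and two live (0).
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

Under this definition a score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking. The quadratic pairwise loop is chosen for clarity; a rank-based computation would be preferable for large link collections.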