P. Bogen, Katherine Skinner, Piotr Adamczyk, Unmil Karadkar
Cultural heritage content is increasingly both created digitally and digitized. Preserving this content has been a much-discussed and debated question in the Digital Libraries and Digital Humanities communities. Many concerns have been raised around the organizational challenges. Centralized preservation is often praised for unified access and consistency, but at the same time it is criticized for its reliance on the continued interest of a smaller number of maintainers. Alternatively, decentralized preservation leads to better longevity, but often at a cost of consistency or ease of access. Beyond this question, there are many other organizational issues, such as the role of states and commercial entities in preservation and concerns about ownership, privacy, and acceptable use of materials. This panel will discuss these issues with the goal of finding a balance between these often conflicting approaches.
{"title":"Organizational Strategies for Cultural Heritage Preservation","authors":"P. Bogen, Katherine Skinner, Piotr Adamczyk, Unmil Karadkar","doi":"10.1145/2756406.2756975","DOIUrl":"https://doi.org/10.1145/2756406.2756975","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
The goal of the RMap Project is to create a prototype service that can capture and preserve maps of relationships amongst the increasingly distributed components (article, data, software, workflow objects, multimedia, etc.) that comprise the new model for scholarly publication. The demonstration will provide a tour of some of the features of the initial web service prototype. This will include examples of Distributed Scholarly Complex Objects (DiSCOs) and associated provenance data in RMap, as well as some of the options that users might have for interacting with the framework.
{"title":"The RMap Project: Capturing and Preserving Associations amongst Multi-Part Distributed Publications","authors":"Karen L. Hanson, T. DiLauro, M. Donoghue","doi":"10.1145/2756406.2756952","DOIUrl":"https://doi.org/10.1145/2756406.2756952","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
A. Knyazeva, O. Kolobov, Fjodor E. Tatarsky, I. Turchanovsky
This paper considers the process of merging two or more library catalogues. Rather than taking a simple union of the different resources, one must detect duplicate records and merge them into a single database. We have developed Cflib, a toolbox for duplicate detection and merging. It is based on standard principles of record linkage and has a fairly simple architecture.
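As a rough illustration of duplicate detection and merging in the record-linkage style, the following Python sketch may help. It is not Cflib's code or API: the record fields (`title`, `year`), the function names, the string-similarity measure, and the 0.9 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase the text and replace punctuation with single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_duplicates(records, threshold=0.9):
    """Return index pairs whose years agree and whose titles match closely."""
    pairs = []
    for i, j in combinations(range(len(records)), 2):
        r, s = records[i], records[j]
        if r["year"] == s["year"] and similarity(r["title"], s["title"]) >= threshold:
            pairs.append((i, j))
    return pairs


def merge_catalogues(records, threshold=0.9):
    """Cluster duplicate pairs and keep the first record of each cluster."""
    parent = list(range(len(records)))

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in find_duplicates(records, threshold):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Point the larger root at the smaller one so the root is the earliest record.
            parent[max(ri, rj)] = min(ri, rj)
    return [rec for idx, rec in enumerate(records) if find(idx) == idx]
```

Comparing every pair of records is quadratic in catalogue size. A practical tool would likely restrict comparisons to candidate blocks first, for example records sharing a year or a title prefix.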
{"title":"An Instrument for Merging of Bibliographic Databases","authors":"A. Knyazeva, O. Kolobov, Fjodor E. Tatarsky, I. Turchanovsky","doi":"10.1145/2756406.2756973","DOIUrl":"https://doi.org/10.1145/2756406.2756973","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
Building evaluation datasets for information retrieval is a time-consuming and exhausting activity. To evaluate research over novel corpora, researchers are increasingly turning to crowdsourcing to efficiently distribute the evaluation dataset creation among many workers. However, there has been little investigation into the effect of instrument design on data quality in crowdsourced evaluation datasets. We pursue this question through a case study of music similarity judgments in a music digital library evaluation, where we find that even with trusted graders, song pairs are not consistently rated the same. We find that much of this low intra-coder consistency can be attributed to the task design and judge effects, and we conclude with recommendations for achieving reliable evaluation judgments for music similarity and other normative judgment tasks.
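One simple way to quantify intra-coder consistency is the share of repeated judgments, by the same grader on the same song pair, that receive an identical rating. The Python sketch below computes that share. It is an assumed measure for illustration, not necessarily the statistic used in the paper, and the `(grader, item, rating)` tuple layout is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def intra_coder_consistency(judgments):
    """Fraction of repeated judgments (same grader, same item) that agree.

    judgments: iterable of (grader, item, rating) tuples.
    Returns None when no grader rated any item more than once.
    """
    by_key = defaultdict(list)
    for grader, item, rating in judgments:
        by_key[(grader, item)].append(rating)

    agree = total = 0
    for ratings in by_key.values():
        # Compare every pair of repeated ratings for this grader and item.
        for a, b in combinations(ratings, 2):
            total += 1
            agree += (a == b)
    return agree / total if total else None
```

Exact-match agreement is strict for ordinal rating scales. A tolerance-based or chance-corrected measure may be more appropriate, depending on the task design.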
{"title":"Improving Consistency of Crowdsourced Multimedia Similarity for Evaluation","authors":"Peter Organisciak, J. S. Downie","doi":"10.1145/2756406.2756942","DOIUrl":"https://doi.org/10.1145/2756406.2756942","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
Classifying publication venues as top-tier or non-top-tier is quite subjective and can be debatable at times. In this paper, we propose ConfAssist, a novel assisting framework for conference categorization that aims to address the limitations of existing systems and portals for venue classification. We identify various features related to the stability of conferences that might help us separate a top-tier conference from the rest. While there are many clear cases where expert agreement on whether a conference is top-tier can be achieved almost immediately, there are equally many cases that can result in conflict even among the experts. ConfAssist tries to serve as an aid in such cases by increasing the confidence of the experts in their decision. A human judgment survey was conducted with 28 domain experts. The results were quite impressive, with 91.6% classification accuracy.
{"title":"ConfAssist: A Conflict Resolution Framework for Assisting the Categorization of Computer Science Conferences","authors":"Mayank Singh, Tanmoy Chakraborty, Animesh Mukherjee, Pawan Goyal","doi":"10.1145/2756406.2756963","DOIUrl":"https://doi.org/10.1145/2756406.2756963","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
Digital reading is a topic of rising interest in digital libraries, particularly in terms of optimizing the reading experience. However, there is relatively little data on the patterns of digital reading, including issues of where and what users read, and how they organize, plan and conduct their reading sessions. This paper reports the first data on mobile reading, combining insights from three different studies of users, including diary studies, interviews and ethnomethodological work. The data reveals that reading often depends on highly developed and rehearsed practices, especially when the reading is related to study or research. From this, we are able to identify a number of opportunities for further digital library research to better support the needs of users.
{"title":"Where My Books Go: Choice and Place in Digital Reading","authors":"G. Buchanan, Dana Mckay, J. Levitt","doi":"10.1145/2756406.2756917","DOIUrl":"https://doi.org/10.1145/2756406.2756917","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
{"title":"Session details: Session 7 - Non-text Collections","authors":"G. Henry","doi":"10.1145/3260515","DOIUrl":"https://doi.org/10.1145/3260515","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
Emília A. de Souza, Anderson A. Ferreira, Marcos André Gonçalves
Historically, supervised methods have been the most effective for author name disambiguation tasks. Here, we propose a specific way to combine supervised techniques with user feedback. Although we use supervised techniques, the only user effort is to provide feedback on results, since the initial training data is automatically generated. Our experiments show gains of up to 20% in disambiguation performance against representative baselines.
{"title":"Combining Classifiers and User Feedback for Disambiguating Author Names","authors":"Emília A. de Souza, Anderson A. Ferreira, Marcos André Gonçalves","doi":"10.1145/2756406.2756964","DOIUrl":"https://doi.org/10.1145/2756406.2756964","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
F. Shipman, R. Gutierrez-Osuna, T. Shipman, C. D. D. Monteiro, Virendra Karappa
The Internet provides access to content in almost all languages through a combination of crawling, indexing, and ranking capabilities. The ability to locate content on almost any topic has become expected for most users. But it is not the case for those whose primary language is a sign language. Members of this community communicate via the Internet, but they pass around links to videos via email and social media. In this paper, we describe the need for, the architecture of, and initial software components of a distributed digital library of sign language content, called SLaDL. Our initial efforts have been to develop a model of collection development that enables community involvement without assuming it. This goal necessitated the development of video processing techniques that automatically detect sign language content in video.
{"title":"Towards a Distributed Digital Library for Sign Language Content","authors":"F. Shipman, R. Gutierrez-Osuna, T. Shipman, C. D. D. Monteiro, Virendra Karappa","doi":"10.1145/2756406.2756945","DOIUrl":"https://doi.org/10.1145/2756406.2756945","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}
Faceted browsing has become ubiquitous in modern digital libraries and online search engines, yet the process is still difficult to model abstractly in a manner that supports the development of interoperable and reusable interfaces. Existing efforts in facet modeling are based upon set theory, formal concept analysis, and light-weight ontologies, but in many regards they are implementations of faceted browsing rather than a specification of the basic, underlying structures and interactions. We propose category theory as a theoretical foundation for faceted browsing and demonstrate how the interactive process can be mathematically abstracted in a way that naturally supports interoperability and reuse.
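A loose illustration of the compositional flavor of such a model, and not the paper's formalism, treats each facet selection as a function on item sets. These functions compose associatively and have an identity, which makes them a monoid, that is, a category with a single object. All names in this Python sketch (`facet_filter`, `compose`, and the facet dictionaries in the usage note) are hypothetical.

```python
from functools import reduce


def facet_filter(facet, value):
    """Build a filter keeping the items whose facet values include `value`.

    facet: dict mapping each item to a collection of values for one facet.
    """
    def apply(items):
        return frozenset(i for i in items if value in facet.get(i, ()))
    return apply


def compose(*filters):
    """Compose filters left to right; with no filters this is the identity."""
    def apply(items):
        return reduce(lambda acc, f: f(acc), filters, frozenset(items))
    return apply


# The identity selection: no facet value chosen, so every item is kept.
identity = compose()
```

For example, with `topic = {"a": {"music"}, "b": {"music", "video"}}` and `year = {"a": {2015}, "b": {2014}}`, the call `compose(facet_filter(topic, "music"), facet_filter(year, 2015))({"a", "b"})` narrows the set to item `"a"`, and applying the two filters in the opposite order gives the same result.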
{"title":"Modeling Faceted Browsing with Category Theory to Support Interoperability and Reuse","authors":"Daniel R. Harris","doi":"10.1145/2756406.2756972","DOIUrl":"https://doi.org/10.1145/2756406.2756972","journal":{"name":"Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries"},"publicationDate":"2015-06-21"}