Digital game-based learning (DGBL) has become increasingly popular. With elements such as narratives, rewards, quests, and interactivity, DGBL can actively engage learners, stimulating desired learning outcomes. In an effort to increase its appeal, affective embodied agents (EAs) have been incorporated as learning companions or instructors in DGBL. However, claims about the efficacy of using affective EAs in DGBL have scarcely been subjected to empirical analysis. Therefore, this study aims to investigate the influence of affective EAs on students' learning outcome, motivation, perceived usefulness, and behavioral intention in an information literacy (IL) game. Eighty tertiary students were recruited and randomly assigned in a pre- and post-test between-subjects experiment with two conditions: affective-EA and no-EA. Results showed that participants benefited from interacting with the affective EA in the IL game in terms of attention, confidence, satisfaction, and intention to learn IL knowledge and to recommend. However, there were no significant differences in learning outcome, relevance, or intention to play the game. Contributions and limitations of this study are also discussed at the end.
{"title":"Experimental evaluation of affective embodied agents in an information literacy game","authors":"Yanru Guo, D. Goh","doi":"10.1145/2910896.2910897","DOIUrl":"https://doi.org/10.1145/2910896.2910897","url":null,"abstract":"Digital game-based learning (DGBL) has become increasingly popular. With elements such as narratives, rewards, quests, and interactivity, DGBL can actively engage learners, stimulating desired learning outcomes. In an effort to increase its appeal, affective embodied agents (EAs) have been incorporated as learning companions or instructors in DGBL. However, claims about the efficacy of using affective EAs in DGBL have scarcely been subjected to empirical analysis. Therefore, this study aims to investigate the influence of affective EAs on students' learning outcome, motivation, perceived usefulness, and behavioral intention in an information literacy (IL) game. Eighty tertiary students were recruited and randomly assigned in a pre- and post-test between-subjects experiment with two conditions: affective-EA and no-EA. Results showed that participants benefited from interacting with the affective EA in the IL game in terms of attention, confidence, satisfaction, and intention to learn IL knowledge and to recommend. However, there were no significant differences in learning outcome, relevance, or intention to play the game. 
Contributions and limitations of this study are also discussed at the end.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131317850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonidas Papachristopoulos, Michalis Sfakakis, Nikos Kleidis, G. Tsakonas, C. Papatheodorou
The multidimensional nature of the digital library evaluation domain and the volume of scientific output published in the field hinder and disorient researchers who are considering focusing on this domain. These communities need guidance in order to exploit the considerable amount of data and the diversity of methods effectively, as well as to identify new research goals and plan future work. This poster investigates the core topics of the digital library evaluation field and their impact by applying topic modeling and network analysis to a corpus of JCDL, ECDL/TPDL, and ICADL conference proceedings from the period 2001-2013.
{"title":"Exploiting network analysis to investigate topic dynamics in the digital library evaluation domain","authors":"Leonidas Papachristopoulos, Michalis Sfakakis, Nikos Kleidis, G. Tsakonas, C. Papatheodorou","doi":"10.1145/2910896.2925464","DOIUrl":"https://doi.org/10.1145/2910896.2925464","url":null,"abstract":"The multidimensional nature of digital libraries evaluation domain and the amount of scientific production published on the field hinders and disorientates the interested researchers who contemplate to focus on the specific domain. These communities need guidance in order to exploit the considerable amount of data and the diversity of methods effectively as well as to identify new research goals and develop their plans for future works. This poster investigates the core topics of the digital library evaluation field and their impact by applying topic modeling and network analysis on a corpus of the JCDL, ECDL/TDPL and ICADL conferences proceedings in the period 2001-2013.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126836073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Memento protocol makes it easy to build a uniform lookup service to aggregate the holdings of web archives. However, there is a lack of tools to utilize this capability in archiving applications and research projects. We created MemGator, an open source, easy to use, portable, concurrent, cross-platform, and self-documented Memento aggregator CLI and server tool written in Go. MemGator implements all the basic features of a Memento aggregator (e.g., TimeMap and TimeGate) and gives the ability to customize various options, including which archives are aggregated. It is used heavily by tools and services such as Mink, WAIL, and OldWeb.today, as well as by archiving research projects, and has proved to be reliable even in conditions of extreme load.
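The two aggregator features named above can be illustrated with a minimal Python sketch. It is not MemGator's implementation (MemGator is written in Go and queries archives over HTTP); it only assumes that each archive has already returned a list of mementos sorted by capture time. The archive hosts, the `merge_timemaps` and `closest_memento` helpers, and the sample data are all hypothetical.

```python
from datetime import datetime, timezone
import heapq


def utc(year, month, day):
    """Small helper for building timezone-aware sample datetimes."""
    return datetime(year, month, day, tzinfo=timezone.utc)


def merge_timemaps(timemaps):
    """TimeMap-style aggregation: merge per-archive memento lists
    (each already sorted by datetime) into one chronologically ordered list."""
    return list(heapq.merge(*timemaps, key=lambda memento: memento[0]))


def closest_memento(mementos, target):
    """TimeGate-style lookup: return the memento captured nearest to target,
    or None when no archive holds a capture."""
    if not mementos:
        return None
    return min(mementos, key=lambda memento: abs(memento[0] - target))


# Hypothetical holdings of two archives: (capture datetime, memento URI).
ARCHIVE_A = [
    (utc(2010, 1, 1), "https://archive-a.example/20100101/http://example.com/"),
    (utc(2015, 6, 1), "https://archive-a.example/20150601/http://example.com/"),
]
ARCHIVE_B = [
    (utc(2012, 3, 15), "https://archive-b.example/20120315/http://example.com/"),
]

merged = merge_timemaps([ARCHIVE_A, ARCHIVE_B])
best = closest_memento(merged, utc(2012, 1, 1))
```

A real aggregator would fetch the per-archive lists concurrently and tolerate slow or failing archives; the sketch leaves that out to keep the merge and lookup logic visible.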
{"title":"MemGator — A portable concurrent memento aggregator: Cross-platform CLI and server binaries in Go","authors":"Sawood Alam, Michael L. Nelson","doi":"10.1145/2910896.2925452","DOIUrl":"https://doi.org/10.1145/2910896.2925452","url":null,"abstract":"The Memento protocol makes it easy to build a uniform lookup service to aggregate the holdings of web archives. However, there is a lack of tools to utilize this capability in archiving applications and research projects. We created MemGator, an open source, easy to use, portable, concurrent, cross-platform, and self-documented Memento aggregator CLI and server tool written in Go. MemGator implements all the basic features of a Memento aggregator (e.g., TimeMap and TimeGate) and gives the ability to customize various options including which archives are aggregated. It is being used heavily by tools and services such as Mink, WAIL, OldWeb. today, and archiving research projects and has proved to be reliable even in conditions of extreme load.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"622 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127524617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We investigate methods for collecting data to form an archive on the debate within Twitter surrounding the UK's inclusion in the EU. We use three strategies: gathering data using hashtags, extracting data from the random stream, and collecting from users known to be discussing the debate. We explore the various biases in the resulting datasets.
{"title":"Avoiding the Drunkard's search: Investigating collection strategies for building a Twitter dataset","authors":"Claire Llewellyn, Laura Cram, A. Favero","doi":"10.1145/2910896.2925433","DOIUrl":"https://doi.org/10.1145/2910896.2925433","url":null,"abstract":"We investigate methods for collecting data to form an archive on the debate within Twitter surrounding the UK's inclusion in the EU. We use three strategies, gathering data using hashtags, extracting data from the random stream and collecting from users known to be discussing the debate. We explore the various bias in the resulting datasets.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133483105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Inventor name disambiguation is the task of distinguishing each unique inventor from all other inventor records in a patent database. This task is essential for processing person name queries in order to get information related to a specific inventor, e.g. a list of all that inventor's patents. We adapt earlier work on author name disambiguation to inventor name disambiguation. A random forest classifier is trained to classify whether each pair of inventor records refers to the same person. The DBSCAN algorithm is used for inventor record clustering, and its distance function is derived using the random forest classifier. For scalability, blocking functions are used to reduce the complexity of record matching and enable parallelization, since each block can be processed simultaneously. Tested on the USPTO patent database, 12 million inventor records were disambiguated in 6.5 hours. Evaluation on the labeled datasets from the USPTO PatentsView competition shows that our algorithm outperforms all algorithms submitted to the competition.
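The block-then-cluster pipeline described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the blocking key (last name plus first initial), the record fields, and the thresholds are hypothetical, and `pair_distance` is a hand-written placeholder standing in for the distance that the paper derives from a random forest classifier.

```python
from collections import defaultdict


def block_key(record):
    """Hypothetical blocking function: last name plus first initial."""
    return (record["last"].lower(), record["first"][:1].lower())


def pair_distance(a, b):
    """Placeholder distance: fraction of compared fields that differ.
    In the paper this distance is derived from a trained random forest."""
    fields = ("first", "city", "assignee")
    return sum(a[f].lower() != b[f].lower() for f in fields) / len(fields)


def dbscan(points, distance, eps, min_pts):
    """Minimal DBSCAN: returns one cluster label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if distance(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


def disambiguate(records, eps=0.4, min_pts=1):
    """Cluster records block by block; blocks are independent of each other,
    which is what makes parallel processing possible."""
    blocks = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)
    clusters = []
    for members in blocks.values():
        labels = dbscan(members, pair_distance, eps, min_pts)
        grouped = defaultdict(list)
        for record, label in zip(members, labels):
            if label == -1:
                clusters.append([record["id"]])  # noise: its own inventor
            else:
                grouped[label].append(record["id"])
        clusters.extend(sorted(ids) for ids in grouped.values())
    return sorted(clusters)


# Hypothetical inventor records, for illustration only.
RECORDS = [
    {"id": 0, "first": "John", "last": "Smith", "city": "Austin", "assignee": "Acme"},
    {"id": 1, "first": "John", "last": "Smith", "city": "Austin", "assignee": "Acme Corp"},
    {"id": 2, "first": "Jane", "last": "Smith", "city": "Boston", "assignee": "Initech"},
    {"id": 3, "first": "Ana", "last": "Lopez", "city": "Miami", "assignee": "Globex"},
]

clusters = disambiguate(RECORDS)
```

Because pairwise comparisons happen only inside a block, the quadratic cost applies to block sizes rather than to the full set of records.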
{"title":"Inventor name disambiguation for a patent database using a random forest and DBSCAN","authors":"Kunho Kim, Madian Khabsa, C. Lee Giles","doi":"10.1145/2910896.2925465","DOIUrl":"https://doi.org/10.1145/2910896.2925465","url":null,"abstract":"Inventor name disambiguation is the task that distinguishes each unique inventor from all other inventor records in a patent database. This task is essential for processing person name queries in order to get information related to a specific inventor, e.g. a list of all that inventor's patents. Using earlier work on author name disambiguation, we apply it to inventor name disambiguation. A random forest classifier is trained to classify whether each pair of inventor records is the same person. The DBSCAN algorithm is use for inventor record clustering, and its distance function is derived using the random forest classifier. For scalability, blocking functions are used to reduce the complexity of record matching and enable parallelization since each block can be run simultaneously. Tested on the USPTO patent database, 12 million inventor records were disambiguated in 6.5 hours. Evaluation on the labeled datasets from USPTO PatentsView competition shows our algorithm outperforms all algorithms submitted to the competition.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"198 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122353373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul D. Clough, Paula Goodale, M. Agosti, S. Lawless
The workshop aims to bring together researchers and practitioners to review and discuss ways of providing effective access to large-scale collections of cultural heritage content. The scale, variety and availability of cultural heritage content, combined with the variety of user groups with respect to background knowledge, specialist experience and needs, pose a challenge for existing access methods. In particular, we consider going beyond keyword search in large-scale cultural heritage digital libraries, in support of exploration and discovery. Our purpose for the workshop is to consider the opportunities and challenges presented by new and existing technologies, as well as the needs and experiences of diverse user communities. Our goal is to assess the current state of the art, to identify opportunities and establish future research priorities, informed by the combined knowledge and experience of academics and practitioners.
{"title":"ACHS'16: First international workshop on accessing cultural heritage at scale","authors":"Paul D. Clough, Paula Goodale, M. Agosti, S. Lawless","doi":"10.1145/2910896.2926733","DOIUrl":"https://doi.org/10.1145/2910896.2926733","url":null,"abstract":"The workshop aims to bring together researchers and practitioners to review and discuss ways of providing effective access to large-scale collections of cultural heritage content. The scale, variety and availability of cultural heritage content, combined with the variety of user groups with respect to background knowledge, specialist experience and needs is challenging in the context of existing access methods. In particular, we consider going beyond keyword search in large-scale cultural heritage digital libraries, in support of exploration and discovery. Our purpose for the workshop is to consider the opportunities and challenges presented by new and existing technologies, as well as the needs and experiences of diverse user communities. Our goal is to assess the current state-of the-art, to identify opportunities and establish future research priorities, informed by the combined knowledge and experience of academics and practitioners.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116263337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper proposes a novel method named ScholarRank to evaluate the scientific impact of rising stars. ScholarRank integrates the merits of both statistical indicators and influence calculation algorithms in heterogeneous academic networks. The method considers three factors: the citation counts of authors, the mutual influence among coauthors, and the mutual reinforcement process among different entities in heterogeneous academic networks. Through experiments on real datasets, we demonstrate that ScholarRank can efficiently select more top-ranking rising stars than other methods.
{"title":"Who are the rising stars in academia?","authors":"Jun Zhang, Zhaolong Ning, Xiaomei Bai, Wei Wang, Shuo Yu, Feng Xia","doi":"10.1145/2910896.2925436","DOIUrl":"https://doi.org/10.1145/2910896.2925436","url":null,"abstract":"This paper proposes a novel method named ScholarRank to evaluate the scientific impact of rising stars. Our proposed ScholarRank integrates the merits of both statistical indicators and influence calculation algorithms in heterogeneous academic networks. The ScholarRank method considers three factors, which are the citation counts of authors, the mutual influence among coauthors and the mutual reinforce process among different entities in heterogeneous academic networks. Through experiments on real datasets, we demonstrate that our ScholarRank can efficiently select more top ranking rising stars than other methods.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"179 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121926019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Figures and tables are key sources of information in many scholarly documents. However, current academic search engines do not make use of figures and tables when semantically parsing documents or presenting document summaries to users. To facilitate these applications we develop an algorithm that extracts figures, tables, and captions from documents called “PDFFigures 2.0.” Our proposed approach analyzes the structure of individual pages by detecting captions, graphical elements, and chunks of body text, and then locates figures and tables by reasoning about the empty regions within that text. To evaluate our work, we introduce a new dataset of computer science papers, along with ground truth labels for the locations of the figures, tables, and captions within them. Our algorithm achieves impressive results (94% precision at 90% recall) on this dataset surpassing previous state of the art. Further, we show how our framework was used to extract figures from a corpus of over one million papers, and how the resulting extractions were integrated into the user interface of a smart academic search engine, Semantic Scholar (www.semanticscholar.org). Finally, we present results of exploratory data analysis completed on the extracted figures as well as an extension of our method for the task of section title extraction. We release our dataset and code on our project webpage for enabling future research (http://pdffigures2.allenai.org).
{"title":"PDFFigures 2.0: Mining figures from research papers","authors":"Christopher Clark, S. Divvala","doi":"10.1145/2910896.2910904","DOIUrl":"https://doi.org/10.1145/2910896.2910904","url":null,"abstract":"Figures and tables are key sources of information in many scholarly documents. However, current academic search engines do not make use of figures and tables when semantically parsing documents or presenting document summaries to users. To facilitate these applications we develop an algorithm that extracts figures, tables, and captions from documents called “PDFFigures 2.0.” Our proposed approach analyzes the structure of individual pages by detecting captions, graphical elements, and chunks of body text, and then locates figures and tables by reasoning about the empty regions within that text. To evaluate our work, we introduce a new dataset of computer science papers, along with ground truth labels for the locations of the figures, tables, and captions within them. Our algorithm achieves impressive results (94% precision at 90% recall) on this dataset surpassing previous state of the art. Further, we show how our framework was used to extract figures from a corpus of over one million papers, and how the resulting extractions were integrated into the user interface of a smart academic search engine, Semantic Scholar (www.semanticscholar.org). Finally, we present results of exploratory data analysis completed on the extracted figures as well as an extension of our method for the task of section title extraction. 
We release our dataset and code on our project webpage for enabling future research (http://pdffigures2.allenai.org).","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130554301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we utilize a set of controlled experiments to benchmark the cost associated with the cloud execution of typical repository functions such as ingestion, fixity checking, and heavy data processing. We focus on the repository service pattern where content is explicitly stored away from where it is processed. We measured the processing speed and unit cost of each scenario using a large sensor dataset and Amazon Web Services (AWS). The initial results reveal three distinct cost patterns: 1) spend more to buy up to proportionally faster services; 2) more money does not necessarily buy better performance; and 3) spend less, but faster. Further investigations into these performance and cost patterns will help repositories to form a more effective operation strategy.
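Of the repository functions benchmarked above, fixity checking is the simplest to illustrate. The minimal Python sketch below recomputes a file's checksum and compares it with a stored value. The abstract does not say which digest algorithm or tooling the authors used, so SHA-256 and the helper names `sha256_of` and `fixity_ok` are assumptions made for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in chunks so large objects never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, expected_hex):
    """Return True when the file's current digest matches the recorded one."""
    return sha256_of(path) == expected_hex.lower()
```

When content is stored away from where it is processed, as in the service pattern studied here, each check also has to transfer the object to the compute node, so data movement can contribute to the cost alongside hashing time.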
{"title":"Evaluating cost of cloud execution in a data repository","authors":"Zhiwu Xie, Yinlin Chen, J. Speer, T. Walters","doi":"10.1145/2910896.2925454","DOIUrl":"https://doi.org/10.1145/2910896.2925454","url":null,"abstract":"In this paper, we utilize a set of controlled experiments to benchmark the cost associated with the cloud execution of typical repository functions such as ingestion, fixity checking, and heavy data processing. We focus on the repository service pattern where content is explicitly stored away from where it is processed. We measured the processing speed and unit cost of each scenario using a large sensor dataset and Amazon Web Services (AWS). The initial results reveal three distinct cost patterns: 1) spend more to buy up to proportionally faster services; 2) more money does not necessarily buy better performance; and 3) spend less, but faster. Further investigations into these performance and cost patterns will help repositories to form a more effective operation strategy.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124620644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the growing popularity of academic social media, academic blogs have become a form of user-generated academic information that can be annotated with social tags for information retrieval and organization. To improve existing social tagging systems so that they satisfy users' needs, users' tagging behavior needs to be understood. However, there is no research characterizing user tagging behavior for academic resources. In this paper, taking the tags of academic blogs as the research object, the authors analyze users' tagging behavior based on the characteristics of the tags (tag-based features) and those related to blog content (content-based features). These characteristics can be applied to academic tagging systems to promote the organization and propagation of academic knowledge.
{"title":"Characterizing users tagging behavior in academic blogs","authors":"Lei Li, Chengzhi Zhang","doi":"10.1145/2910896.2925438","DOIUrl":"https://doi.org/10.1145/2910896.2925438","url":null,"abstract":"Along with popular of academic social media, academic blogs are one of the user generated academic information that can be annotated using social tags for user's information retrieval and organization. In order to improve the existing social tagging system to satisfy the users' needs, users' tagging behavior need to be understood. However, there is no researches on characterizing user tagging behaviors of academic resources. In this paper, using the tag of academic blog as the research object, the author analyze user's tagging behaviors based on the characteristics of tags (tags-based features) and those related to blog contents (content-based features). These characteristics can be used to the academic tagging system to promote organization and propagation of academic knowledge.","PeriodicalId":109613,"journal":{"name":"2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133816796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}