The elements of a surrogate serve as clues to relevance. They may be seen as operationalized relevance criteria by which users judge the relevance of a search result according to their information need. In addition to short textual summaries, today's academic search systems integrate additional data into their search results presentation, for example, the number of citations or the number of downloads. This kind of data can be described as popularity data, which search engines also incorporate as factors in their ranking algorithms. Past research shows that diverse criteria and factors are involved in relevance judgements from the user perspective. However, previous empirical studies on relevance criteria and clues examined surrogates that did not include popularity data. The goal of my doctoral research is to gain significant knowledge on the criteria by which users in an academic search situation make relevance judgements based on surrogates that include popularity data. This paper describes the current state of the experimental research design and method of data collection.
{"title":"Investigating the Effects of Popularity Data on Predictive Relevance Judgments in Academic Search Systems","authors":"Christiane Behnert","doi":"10.1145/3295750.3298978","DOIUrl":"https://doi.org/10.1145/3295750.3298978","url":null,"abstract":"The elements of a surrogate serve as clues to relevance. They may be seen as operationalized relevance criteria by which users judge the relevance of a search result according to their information need. In addition to short textual summaries, today's academic search systems integrate additional data into their search results presentation, for example, the number of citations or the number of downloads. This kind of data can be described as popularity data, serving as factors also incorporated in search engines' ranking algorithms. Past research shows that there are diverse criteria and factors involved in relevance judgements from the user perspective. However, previous empirical studies on relevance criteria and clues examined surrogates that did not include popularity data. The goal of my doctoral research is to gain significant knowledge on the criteria by which users in an academic search situation make relevance judgements based on surrogates that include popularity data. This paper describes the current state of the experimental research design and method of data collection.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121447419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joni O. Salminen, Hind Almerekhi, A. Kamel, Soon-Gyo Jung, B. Jansen
Analyzing 5,665 crowd ratings on 1,133 social media comments, we find that individuals tend to agree on the extremes of a hate rating scale more than in the middle when evaluating the hatefulness of online comments. The agreement is higher for less hateful comments and lowest on moderately hateful comments. The results have implications for researchers developing machine learning models for online hate processing, as the extreme classes are likely to require fewer annotations for reaching statistical stability. Our findings suggest that the models developed in this domain should consider the distributions of hate ratings rather than average hate scores.
{"title":"Online Hate Ratings Vary by Extremes: A Statistical Analysis","authors":"Joni O. Salminen, Hind Almerekhi, A. Kamel, Soon-Gyo Jung, B. Jansen","doi":"10.1145/3295750.3298954","DOIUrl":"https://doi.org/10.1145/3295750.3298954","url":null,"abstract":"Analyzing 5,665 crowd ratings on 1,133 social media comments, we find that individuals tend to agree on the extremes of a hate rating scale more than in the middle when evaluating the hatefulness of online comments. The agreement is higher for less hateful comments and lowest on moderately hateful comments. The results have implications for researchers developing machine learning models for online hate processing, as the extreme classes are likely to require fewer annotations for reaching statistical stability. Our findings suggest that the models developed in this domain should consider the distributions of hate ratings rather than average hate scores.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129385624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
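The closing point of the abstract above, that models should consider the distribution of hate ratings rather than the average score, can be illustrated with a minimal Python sketch. The ratings, the 1-4 scale, and the `summarize` helper are hypothetical and chosen only for illustration; they are not taken from the study's data or method.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical crowd ratings on an assumed 1-4 hatefulness scale
# (illustrative values only, not data from the paper).
ratings = {
    "comment_a": [2, 3, 2, 3, 2],  # raters cluster around the middle
    "comment_b": [1, 1, 4, 4, 2],  # raters split between the extremes
}


def summarize(scores):
    """Return the mean, the full rating distribution, and the spread."""
    return {
        "mean": mean(scores),
        "distribution": dict(sorted(Counter(scores).items())),
        "spread": pstdev(scores),  # population standard deviation
    }


for comment_id, scores in ratings.items():
    print(comment_id, summarize(scores))
```

Both hypothetical comments have the same mean rating, yet their distributions and spreads differ, which is the information an average alone would discard.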
Improving the informativeness of online content can help organisations to reach a wider range of audiences and ensure that their information is accessible to as many people as possible. Whilst many studies focus on the technical aspects of system design, this research aims to identify the key characteristics of quality within informative websites, and provide practitioners with a technique to generate improved content through an action research approach.
{"title":"Assessing Online Content Quality through User Surveys and Web Analytics","authors":"J. Muirhead","doi":"10.1145/3295750.3298977","DOIUrl":"https://doi.org/10.1145/3295750.3298977","url":null,"abstract":"Improving the informativeness of online content can help organisations to reach a wider range of audiences and ensure that their information is accessible to as many people as possible. Whilst many studies focus on the technical aspects of system design, this research aims to identify the key characteristics of quality within informative websites, and provide practitioners with a technique to generate improved content through an action research approach.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125591761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In legal due diligence, lawyers identify a variety of topic instances in a company's contracts that may pose risk during a transaction. In this paper, we present a study of 9 lawyers conducting a simulated review of 50 contracts for five topics. We find that lawyers agree on the general location of relevant material at a higher rate than in other assessor agreement studies, but they do not entirely agree on the extent of the relevant material. Additionally, we do not find strong differences between lawyers who have differing levels of due diligence expertise. If we train machine learning models to identify these topics based on each user's judgments, the resulting models agree with each other at levels similar to those of the lawyers who trained them. This indicates that these models are learning the types of behaviour exhibited by their trainers, even if they are doing so imperfectly. Accordingly, we argue that additional work is necessary to improve the assessment process to ensure that all parties agree on identified material.
{"title":"Variations in Assessor Agreement in Due Diligence","authors":"Adam Roegiest, Anne McNulty","doi":"10.1145/3295750.3298945","DOIUrl":"https://doi.org/10.1145/3295750.3298945","url":null,"abstract":"In legal due diligence, lawyers identify a variety of topic instances in a company's contracts that may pose risk during a transaction. In this paper, we present a study of 9 lawyers conducting a simulated review of 50 contracts for five topics. We find that lawyers agree on the general location of relevant material at a higher rate than in other assessor agreement studies, but they do not entirely agree on the extent of the relevant material. Additionally, we do not find strong differences between lawyers who have differing levels of due diligence expertise. If we train machine learning models to identify these topics based on each user's judgments, the resulting models exhibit similar levels of agreement between each other as to the lawyers that trained them. This indicates that these models are learning the types of behaviour exhibited by their trainers, even if they are doing so imperfectly. Accordingly, we argue that additional work is necessary to improve the assessment process to ensure that all parties agree on identified material.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128876280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
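The distinction the abstract above draws between agreeing on the location of relevant material and agreeing on its extent can be sketched in Python. This is an assumption-laden illustration: the character-offset spans, the `overlaps` check, and the Jaccard-style `extent_agreement` measure are hypothetical stand-ins, not the agreement measures used in the paper.

```python
def overlaps(span_a, span_b):
    """Location agreement: do two half-open [start, end) spans intersect at all?"""
    return span_a[0] < span_b[1] and span_b[0] < span_a[1]


def extent_agreement(span_a, span_b):
    """Extent agreement: shared length divided by combined length (Jaccard)."""
    shared = max(0, min(span_a[1], span_b[1]) - max(span_a[0], span_b[0]))
    combined = (span_a[1] - span_a[0]) + (span_b[1] - span_b[0]) - shared
    return shared / combined if combined else 0.0


# Hypothetical character offsets highlighted by two reviewers in one contract.
lawyer_1 = (100, 180)
lawyer_2 = (120, 260)

print(overlaps(lawyer_1, lawyer_2))          # same general location?
print(extent_agreement(lawyer_1, lawyer_2))  # how much of the extent is shared?
```

In this made-up case the two reviewers point at the same region, so they agree on location, while the shared portion is well under half of the combined highlighted text.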
Ian A. Knight, Max L. Wilson, D. Brailsford, Natasa Milic-Frayling
Systematic reviews are a comprehensive and parameterised form of literature review, found in most disciplines, that involve exhaustive analyses and rigorous interpretation of prior literature. Performing systematic reviews, however, can involve repetitive and laborious work in order to reach reliable standards. Strict guidelines and availability of published reviews make the task amenable to computerised assistance and automation using text mining, information extraction, and machine learning techniques. However, it is unclear which aspects of this Work Task are best suited for such support. This paper describes a three-month ethnographic study and Cognitive Work Analysis of the systematic reviews performed by a medical research group. Our findings show that the IR aspects of systematic reviews involve many tasks at two separate levels: 1) taxonomic organisation of documents and sub-document elements in relation to topic queries and domain-specific resources, and 2) extraction methods for structured summaries from the classified resources. This provides the basis for future work designing search tools with localised optimization and subtask automation to support specific phases of the process.
{"title":"Enslaved to the Trapped Data: A Cognitive Work Analysis of Medical Systematic Reviews","authors":"Ian A. Knight, Max L. Wilson, D. Brailsford, Natasa Milic-Frayling","doi":"10.1145/3295750.3298937","DOIUrl":"https://doi.org/10.1145/3295750.3298937","url":null,"abstract":"Systematic reviews are a comprehensive and parameterised form of literature review, found in most disciplines, that involve exhaustive analyses and rigorous interpretation of prior literature. Performing systematic reviews, however, can involve repetitive and laborious work in order to reach reliable standards. Strict guidelines and availability of published reviews make the task amenable to computerised assistance and automation using text mining, information extraction, and machine learning techniques. However, it is unclear which aspects of this Work Task are best suited for such support. This paper describes a three-month ethnographic study and Cognitive Work Analysis of the systematic reviews performed by a medical research group. Our findings show that the IR aspects of systematic reviews involve many tasks at two separate levels: 1) taxonomic organisation of documents and sub-document elements in relation to topic queries and domain-specific resources, and 2) extraction methods for structured summaries from the classified resources. This provides the basis for future work designing search tools with localised optimization and subtask automation to support specific phases of the process.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131088263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-03-08 DOI: 10.1057/978-1-352-00112-9_9
Rebekah Willson
{"title":"Analysing Qualitative Data: You Asked Them, Now What to Do With What They Said","authors":"Rebekah Willson","doi":"10.1057/978-1-352-00112-9_9","DOIUrl":"https://doi.org/10.1057/978-1-352-00112-9_9","url":null,"abstract":"","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133517079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joni O. Salminen, Sercan Sengün, Soon-Gyo Jung, B. Jansen
Increased access to data and computational techniques enables innovations in the space of automated customer analytics, for example, automatic persona generation. Automatic persona generation is the process of creating data-driven representations from user or customer statistics. Even though automatic persona generation is technically possible and provides advantages compared to manual persona creation regarding the speed and freshness of the personas, it is not clear (a) what information to include in the persona profiles and (b) how to display that information. To inquire into these aspects of the information design of personas, we conducted a user study with 38 participants. In the findings, we report several challenges relating to the design of automatically generated persona profiles, including usability issues, perceptual issues, and issues relating to information content. Our research has implications for the information design of data-driven personas.
{"title":"Design Issues in Automatically Generated Persona Profiles: A Qualitative Analysis from 38 Think-Aloud Transcripts","authors":"Joni O. Salminen, Sercan Sengün, Soon-Gyo Jung, B. Jansen","doi":"10.1145/3295750.3298942","DOIUrl":"https://doi.org/10.1145/3295750.3298942","url":null,"abstract":"Increased access to data and computational techniques enable innovations in the space of automated customer analytics, for example, automatic persona generation. Automatic persona generation is the process of creating data-driven representations from user or customer statistics. Even though automatic persona generation is technically possible and provides advantages compared to manual persona creation regarding the speed and freshness of the personas, it is not clear (a) what information to include in the persona profiles and (b) how to display that information. To query into these aspects relating information design of personas, we conducted a user study with 38 participants. In the findings, we report several challenges relating to the design of automatically generated persona profiles, including usability issues, perceptual issues, and issues relating to information content. Our research has implications for the information design of data-driven personas.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114991729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Corrado Grappiolo, E. V. Gerwen, J. Verhoosel, L. Somers
The booming popularity of data science is also affecting high-tech industries. However, since these usually have different core competencies (building cyber-physical systems rather than, e.g., machine learning or data mining algorithms), delving into data science may be more cumbersome than expected for domain experts such as system engineers or architects. To help domain experts delve into data science, we designed the Semantic Snake Charmer (SSC), a domain knowledge-based search engine for Jupyter Notebooks. SSC is composed of three modules: (1) a human-machine cooperative module to identify internal documentation which contains the most relevant domain knowledge, (2) a natural language processing module capable of transforming relevant documentation into several semantic graph types, and (3) a reinforcement-learning-based search engine which learns, given user feedback, the best mapping between input queries and the semantic graph type to rely on. We believe SSC can be a fundamental asset in easing the adoption of data science in industrial domains.
{"title":"The Semantic Snake Charmer Search Engine: A Tool to Facilitate Data Science in High-tech Industry Domains","authors":"Corrado Grappiolo, E. V. Gerwen, J. Verhoosel, L. Somers","doi":"10.1145/3295750.3298915","DOIUrl":"https://doi.org/10.1145/3295750.3298915","url":null,"abstract":"The booming popularity of data science is also affecting high-tech industries. However, since these usually have different core competencies --- building cyber-physical systems rather than e.g. machine learning or data mining algorithms --- delving into data science by domain experts such as system engineers or architects might be more cumbersome than expected. In order to help domain experts to delve into data science we designed the Semantic Snake Charmer (SSC), a domain knowledge-based search engine for Jupyter Notebooks. SSC is composed of three modules: (1) a human-machine cooperative module to identify internal documentation which contains the most relevant domain knowledge, (2) a natural language processing module capable of transforming relevant documentation into several semantic graph types, (3) a reinforcement-learning based search engine which learns, given user feedback, the best mapping between input queries and semantic graph type to rely on. We believe SSC can be a fundamental asset to allow the easy landing of data science in industrial domains.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128283350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xenia Zürn, Mendel Broekhuijsen, Doménique van Gennip, Saskia Bakker, Annemarie F. Zijlema, E. V. D. Hoven
Personal photo collections have grown due to digital photography and the introduction of smartphones, and photo collections have become harder to manage. Deleting photos appears to be difficult and the task of curation is often perceived as not enjoyable. The lack of curation can make it harder to retrieve photos when people need them for various reasons, such as individual reminiscing, shared remembering or self-presentation. In this study we investigate how we can stimulate people to organise their photo collections on their smartphones. Ten participants evaluated and qualitatively compared four applications with different characteristics regarding voting on and deleting photos. We found that voting on photos is easier and more enjoyable in comparison to deleting photos, that participants showed reminiscence while organising, that deleting can be frustrating, that participants have different preferences for sorting and viewing photos and that voting could make deleting and retrieving easier.
{"title":"Stimulating Photo Curation on Smartphones","authors":"Xenia Zürn, Mendel Broekhuijsen, Doménique van Gennip, Saskia Bakker, Annemarie F. Zijlema, E. V. D. Hoven","doi":"10.1145/3295750.3298947","DOIUrl":"https://doi.org/10.1145/3295750.3298947","url":null,"abstract":"Personal photo collections have grown due to digital photography and the introduction of smartphones, and photo collections have become harder to manage. Deleting photos appears to be difficult and the task of curation is often perceived as not enjoyable. The lack of curation can make it harder to retrieve photos when people need them for various reasons, such as individual reminiscing, shared remembering or self-presentation. In this study we investigate how we can stimulate people to organise their photo collections on their smartphones. Ten participants evaluated and qualitatively compared four applications with different characteristics regarding voting on and deleting photos. We found that voting on photos is easier and more enjoyable in comparison to deleting photos, that participants showed reminiscence while organising, that deleting can be frustrating, that participants have different preferences for sorting and viewing photos and that voting could make deleting and retrieving easier.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134141534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Johanne R. Trippas, Damiano Spina, Falk Scholer, Ahmed Hassan Awadallah, P. Bailey, Paul N. Bennett, Ryen W. White, J. Liono, Yongli Ren, Flora D. Salim, M. Sanderson
Intelligent assistants can serve many purposes, including entertainment (e.g. playing music), home automation, and task management (e.g. timers, reminders). The role of these assistants is evolving to also support people engaged in work tasks, in workplaces and beyond. To design truly useful intelligent assistants for work, it is important to better understand the work tasks that people are performing. Based on a survey of 401 respondents' daily tasks and activities in a work setting, we present a classification of work-related tasks, and analyze their key characteristics, including the frequency of their self-reported tasks, the environment in which they undertake the tasks, and which, if any, electronic devices are used. We also investigate the cyber, physical, and social aspects of tasks. Finally, we reflect on how intelligent assistants could influence and help people in a work environment to complete their tasks, and synthesize our findings to provide insight on the future of intelligent assistants in support of amplifying personal productivity.
{"title":"Learning About Work Tasks to Inform Intelligent Assistant Design","authors":"Johanne R. Trippas, Damiano Spina, Falk Scholer, Ahmed Hassan Awadallah, P. Bailey, Paul N. Bennett, Ryen W. White, J. Liono, Yongli Ren, Flora D. Salim, M. Sanderson","doi":"10.1145/3295750.3298934","DOIUrl":"https://doi.org/10.1145/3295750.3298934","url":null,"abstract":"Intelligent assistants can serve many purposes, including entertainment (e.g. playing music), home automation, and task management (e.g. timers, reminders). The role of these assistants is evolving to also support people engaged in work tasks, in workplaces and beyond. To design truly useful intelligent assistants for work, it is important to better understand the work tasks that people are performing. Based on a survey of 401 respondents' daily tasks and activities in a work setting, we present a classification of work-related tasks, and analyze their key characteristics, including the frequency of their self-reported tasks, the environment in which they undertake the tasks, and which, if any, electronic devices are used. We also investigate the cyber, physical, and social aspects of tasks. Finally, we reflect on how intelligent assistants could influence and help people in a work environment to complete their tasks, and synthesize our findings to provide insight on the future of intelligent assistants in support of amplifying personal productivity.","PeriodicalId":187771,"journal":{"name":"Proceedings of the 2019 Conference on Human Information Interaction and Retrieval","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125506213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}