Culinary preferences contribute significantly to our sense of self [2]. While gender, race, sexuality and ethnicity describe our "major identity", preferences in music, style and food define our "minor identity". However, we find that only certain parts of them can be explained by gender-specific differences in food consumption behavior, while other parts can be better explained by the media portrayal of food consumption.
"Men eat on Mars, Women on Venus?: An Empirical Study of Food-Images". Claudia Wagner, L. Aiello. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786505
A. Samoilenko, F. Karimi, Jérôme Kunegis, Daniel Edler, M. Strohmaier
The Internet is highly multilingual, and its content is created, shared, debated and shaped within many different language-speaking communities. These communities do not exist in isolation, but communicate and influence each other's interests, just as in the offline world. Quantifying this influence is, however, a non-trivial task, as these communities are usually spread across multiple heterogeneous platforms. In this work, we set out to measure the influence of languages on each other by observing concept overlap between the 110 largest Wikipedia language editions. We describe experiments to test whether language overlap in concept coverage is a random process, and find that edition size is a strong predictor of higher concept overlap, with English–German being the most frequently co-occurring pair (45%). Both small and large editions co-occur more frequently than expected with editions of similar size, but co-occurrences across groups are below what is expected by chance. Additionally, by applying network analysis, we find that the hierarchy of language interconnections differs depending on the locality of topics: for interlingually popular topics, the dominance of English, German and French is pronounced, while for topics with a local reach, geographical and cultural proximity as well as common heritage better explain co-occurrence.
"Linguistic influence patterns within the global network of Wikipedia language editions". A. Samoilenko, F. Karimi, Jérôme Kunegis, Daniel Edler, M. Strohmaier. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786497
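The comparison of observed co-occurrence against chance described in the abstract above can be illustrated with a small sketch. This is not the authors' method: the null model used here (editions covering concepts independently of each other) and the toy data are assumptions made purely for illustration, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_cooccurrence(concepts):
    """Count, for each pair of editions, the concepts covered by both."""
    counts = Counter()
    for editions in concepts:
        for pair in combinations(sorted(editions), 2):
            counts[pair] += 1
    return counts


def expected_under_independence(concepts):
    """Expected co-occurrence if editions covered concepts independently.

    Assumed null model: E[a, b] = n * p(a) * p(b) = coverage(a) * coverage(b) / n.
    """
    n = len(concepts)
    coverage = Counter(e for editions in concepts for e in editions)
    return {
        (a, b): coverage[a] * coverage[b] / n
        for a, b in combinations(sorted(coverage), 2)
    }


# Toy data (hypothetical): each set lists the editions that cover one concept.
concepts = [{"en", "de"}, {"en", "de", "fr"}, {"en"}, {"de", "fr"}]

observed = pair_cooccurrence(concepts)
expected = expected_under_independence(concepts)
for pair in sorted(expected):
    print(pair, "observed:", observed[pair], "expected:", expected[pair])
```

A pair whose observed count exceeds its expected count co-occurs more often than this simple null model predicts; a real analysis would also need a significance test and a null model that accounts for edition size.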
Gathering training and evaluation data for open domain tasks, such as general question answering, is challenging. Typically, ground truth data is provided by human expert annotators; however, in an open domain, experts are difficult to define. Moreover, the overall process of annotating examples can be lengthy and expensive. Naturally, crowdsourcing has become a mainstream approach for filling this gap, i.e. gathering human interpretation data. However, as in traditional expert annotation tasks, most of those methods use majority voting to measure the quality of the annotations and thus aim at identifying a single right answer for each example, despite the fact that many annotation tasks can have multiple interpretations, which results in multiple correct answers to the same question. We present CrowdTruth, a crowdsourcing-based approach for efficiently gathering ground truth data, in which disagreement-based metrics are used to harness the multitude of human interpretations and measure the quality of the resulting ground truth. We exemplify our approach in two semantic interpretation use cases for answering questions.
"Crowdsourcing ground truth for Question Answering using CrowdTruth". Benjamin Timmermans, Lora Aroyo, Chris Welty. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786492
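The contrast between majority voting and a disagreement-aware view of annotations, as described in the abstract above, can be sketched as follows. This is not the CrowdTruth metrics themselves: the scoring used here (the share of annotators choosing each label, with a hypothetical `threshold` parameter) is an assumption chosen only to show how a single-answer rule discards plausible alternative interpretations.

```python
from collections import Counter


def majority_vote(labels):
    """Return the single most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def label_distribution(labels):
    """Return the share of annotators that chose each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def plausible_answers(labels, threshold=0.3):
    """Keep every label chosen by at least `threshold` of the annotators.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    dist = label_distribution(labels)
    return sorted(label for label, share in dist.items() if share >= threshold)


# Hypothetical annotations from six crowd workers for one question.
annotations = ["place", "place", "person", "person", "place", "date"]

print(majority_vote(annotations))       # single answer
print(label_distribution(annotations))  # full spread of interpretations
print(plausible_answers(annotations))   # every sufficiently supported answer
```

With majority voting, the second interpretation chosen by a third of the annotators is lost; keeping the distribution preserves it as a candidate correct answer.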
In this paper, we present SNAP:DRGN, a pilot project intended to support Ancient World Linked Open Data through the creation of persistent identifiers for person and person-like entities. We introduce the linked data landscape as it exists with respect to the digitized Classical world and SNAP:DRGN's place within it.
"Prosopography is Greek for Facebook: The SNAP:DRGN Project". K. F. Lawrence, G. Bodard. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786496
Daniel Zoller, Stephan Doerfel, R. Jäschke, Gerd Stumme, A. Hotho
Scholarly success is traditionally measured in terms of citations to publications. With the advent of publication management and digital libraries on the web, scholarly usage data has become a target of investigation, and new impact metrics computed on such usage data, so-called altmetrics, have been proposed. In scholarly social bookmarking systems, scientists collect and manage publication metadata and thus reveal their interest in these publications. In this work, we investigate connections between usage metrics and citations, and find posts, exports, and page views of publications to be correlated with citations.
"On Publication Usage in a Social Bookmarking System". Daniel Zoller, Stephan Doerfel, R. Jäschke, Gerd Stumme, A. Hotho. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786927
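One way to check whether a usage metric and citation counts move together, as the abstract above reports, is a rank correlation. The abstract does not say which coefficient the authors used; Spearman's rho is an assumption here, and the numbers below are made-up illustrative values, not data from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-publication counts (not from the paper).
posts = [1, 4, 2, 8, 5]
citations = [0, 3, 1, 10, 2]

print(round(spearman(posts, citations), 3))
```

Rank correlation is a common choice for count data like this because citation and usage counts tend to be heavily skewed, which makes a linear coefficient sensitive to a few highly cited publications.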
Many of today's businesses use social media for advertising. There is a widespread belief that this is a great opportunity, even though return on investment is difficult to quantify. To fill this gap, we conduct a cross-platform analysis of Facebook, Twitter and Foursquare. Arguments for and against different characteristics of social media advertisements are addressed. The paper finds a correlation between posts and tweets and Foursquare check-ins. Results show that posts or tweets containing pictures have a higher return on investment than those without, and that Foursquare check-ins increase when the text of a post or tweet raises curiosity or attracts individuals or groups.
"Tweet if you will: the real question is, who do you influence?". J. Schacht, M. Hall, M. Chorley. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786923
Introducing Social Machines as web-enabled entities integrating social energies and computational powers into a socio-technical system (whether purposeful or not) where social dynamics animate communities, this paper proposes a theoretical framework in which to observe them. Attempting to strike a balance between the roles of humans and non-humans, and aware of the difficulties that this heterogeneity presents, we propose to approach the question of capturing the social dynamics of a social machine through prosopography. Prosopography is a method, used in particular by historians, that allows the systematic study of a collection of biographies, be they of persons, artefacts, infrastructures or groups thereof. Systematization is achieved by designing an appropriate questionnaire to gather homogeneous data across the biographies. Our questionnaire design relies on the identification of five archetypal elements in biographical narratives. Illustrating our method with three examples, we demonstrate how our archetypal narratives have the potential to describe at least some aspects of the social dynamics in social machines.
"Archetypal Narratives in Social Machines: Approaching Sociality through Prosopography". S. Tarte, P. Willcox, H. Glaser, D. D. Roure. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786471
The social news site reddit.com is composed of thousands of independent user-created subreddits where people use the site's submission and voting features in a variety of ways. This paper offers a brief overview of the different types of subreddit and how user activity is distributed among them.
"Reddit.com: A census of subreddits". Richard A. Mills. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786491
D. Maynard, M. Greenwood, Ian Roberts, George Windsor, Kalina Bontcheva
This paper describes an open source framework for analysing large-volume social media content, which comprises semantic annotation, Linked Open Data, semantic search, dynamic result aggregation, and information visualisation. In particular, exploratory search and sense-making are supported through information visualisation interfaces, such as co-occurrence matrices, term clouds, treemaps, and choropleths. There is also an interactive semantic search interface (Prospector), where users can save, refine, and analyse the results of semantic search queries over time. These functionalities are presented in more detail in the context of analysing tweets from UK politicians and party candidates in the run-up to the 2015 UK general election.
"Real-time Social Media Analytics through Semantic Annotation and Linked Open Data". D. Maynard, M. Greenwood, Ian Roberts, George Windsor, Kalina Bontcheva. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786500
As video on the Web becomes a more interactive medium, as opposed to broadcast only, there is an opportunity to incorporate the interactive features of video annotation into Web video interfaces. Existing studies into collaborative video annotation provide a rather interactionally decontextualized view of collaboration: there exists only minimal understanding of the situated practice of collaborative video annotation, as it may be applied to the design of Web interfaces. At the same time, studies of situated practice in other research areas such as Computer-Supported Collaborative Work have provided substantive improvements in Web interface design to support collaboration. Therefore, we propose there is an opportunity to use an understanding of the situated practice of collaborative video annotation to design a Web video annotation interface. A method that is commonly used for these studies is ethnomethodology, which examines in detail the observable-reportable characteristics of practice of social activity as accomplished by the participants in the activity. We discuss three important issues that need to be addressed so that an ethnomethodologically-informed approach can be applied to the development of a Web video annotation interface: establishing a site for data elicitation, generalization, and the paradox of technomethodology. Having addressed each issue in turn, we then use a fragment of data to illustrate an ethnomethodologically-informed approach to surfacing insights into collaboration, as well as implications for Web video annotation interface design, which would be difficult if not impossible to surface with other approaches not informed by ethnomethodology.
"An Ethnomethodologically-Informed Approach to Interface Design to Support Collective Web Practice Around Video". Anna Zawilska, Steven Albury. Proceedings of the ... ACM Web Science Conference, 2015-06-28. https://doi.org/10.1145/2786451.2786480