Effective approaches to retrieving and using expertise in social media
Reyyan Yeniterzi
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval, 2013-07-28
DOI: 10.1145/2484028.2484230 (https://doi.org/10.1145/2484028.2484230)
Citations: 3
Abstract
Expert retrieval has been widely studied, especially since the introduction of the Expert Finding task in TREC's Enterprise Track in 2005 [3]. This track provided two test collections, crawled from two organizations' public-facing websites and internal emails, which led to the development of many state-of-the-art expert retrieval algorithms [1]. Until recently, these datasets were considered good representatives of the information resources available within an enterprise. However, the recent growth of social media has also influenced the work environment, and social media has become a common communication and collaboration tool within organizations. According to a recent survey by the McKinsey Global Institute [2], 29% of companies use at least one social media tool to match their employees to tasks, and 26% assess their employees' performance using social media. This shows that intra-organizational social media has become an important resource for identifying expertise within organizations.

In recent years, in addition to intra-organizational social media, public social media tools such as Twitter, Facebook, and LinkedIn have also become common environments for searching for expertise. These tools give their users an opportunity to show their specific skills to the world, which motivates recruiters to look for talented job candidates on social media, and writers and reporters to find experts to consult on the specific topics they are working on. With these motivations in mind, in this work we propose to develop expert retrieval algorithms for intra-organizational and public social media tools.

Social media datasets have both challenges and advantages. In terms of challenges, they do not always focus on one specific domain; instead, a single social media tool may contain discussions on technical topics, hobbies, and news concurrently. They may also contain spam posts or advertisements. Compared to well-edited enterprise documents, they are much more informal in language. Furthermore, depending on the social media platform, they may limit the number of characters in a post. Despite these challenges, they also bring some unique authority signals, such as votes, comments, and follower/following information, which can be useful in estimating expertise. Furthermore, compared to previously used enterprise documents, social media provides clear associations between documents and candidates through authorship information. In this work, we propose to develop expert retrieval approaches that handle these challenges while making use of these advantages.

Expert retrieval is a very useful application by itself; furthermore, it can be a step towards improving other social media applications. Social media differs from other web-based tools mainly because it depends on its users. In social media, users are not just content consumers; they are also the primary, and sometimes the only, content creators. Therefore, the quality of any user-generated content in social media depends on its creator. In this thesis, we propose to use the expertise of users to improve existing applications, so that they can estimate the relevance of a piece of content based not just on the content itself but also on the expertise of its creator. By using the expertise of the content creator, we also hope to boost content that is more reliable. We propose to apply this user expertise information to improve ad-hoc search and question answering applications in social media. In this work, previous TREC enterprise datasets, available intra-organizational social media datasets, and public social media datasets will be used to test the proposed algorithms.
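
The idea of scoring content by both its text and its creator's expertise can be illustrated with a small sketch. This is not the method proposed in the thesis, which the abstract does not specify. It is one simple possibility under stated assumptions: each post already has a content relevance score in [0, 1] from some text retrieval model, votes are the only authority signal used, expertise is a vote-weighted sum over an author's posts, and the two scores are mixed linearly. The names `Post`, `author_expertise`, `rerank`, and the `weight` parameter are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    author: str            # authorship gives a direct post-to-candidate link
    content_score: float   # assumed: relevance in [0, 1] from any text retrieval model
    votes: int             # assumed: the only authority signal used in this sketch


def author_expertise(posts):
    """Estimate expertise per author for the current query.

    Each post contributes its content score weighted by log-damped votes.
    Scores are divided by the largest value so they fall in [0, 1].
    This weighting is an illustrative assumption, not the thesis's method.
    """
    raw = {}
    for p in posts:
        signal = math.log1p(max(p.votes, 0))
        raw[p.author] = raw.get(p.author, 0.0) + p.content_score * signal
    top = max(raw.values(), default=0.0)
    return {a: (s / top if top > 0 else 0.0) for a, s in raw.items()}


def rerank(posts, weight=0.3):
    """Sort posts by a linear mix of content relevance and author expertise."""
    expertise = author_expertise(posts)

    def final_score(p):
        return (1 - weight) * p.content_score + weight * expertise[p.author]

    return sorted(posts, key=final_score, reverse=True)


if __name__ == "__main__":
    sample = [
        Post("a", "alice", 0.60, 50),
        Post("b", "bob", 0.62, 0),
        Post("c", "alice", 0.50, 10),
    ]
    for post in rerank(sample):
        print(post.post_id, post.author)
```

With `weight=0.0` the ordering falls back to content relevance alone, so the parameter controls how much the creator's estimated expertise is allowed to influence the ranking. Other signals named in the abstract, such as comments or follower counts, could be folded into the same per-post signal in a similar way.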