ACM SIGMOD Workshop on Databases and Social Networks: latest papers

How people describe themselves on Twitter
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484708
Konstantinos Semertzidis, E. Pitoura, Panayiotis Tsaparas
Twitter, being both a micro-blogging service and a social network, has become one of the primary means of communicating and disseminating information online. As such, a significant amount of research has been devoted to analyzing the Twitter graph, the tweets, and the behavior of its users. In this work, we undertake a study of the user profile bios on Twitter. The goal of our study is two-fold: first, to understand what Twitter users choose to expose about themselves in their profile bio, and second, to investigate whether it is possible to exploit the information in the user bio for tasks such as predicting connections between Twitter users.
Citations: 21
Interesting event detection through hall of fame rankings
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484704
F. Alvanaki, Evica Milchevski, S. Michel, A. Stupar
Everything is relative. Cars are compared by gas per mile, websites by page rank, students based on GPA, scientists by number of publications, and celebrities by beauty or wealth. In this paper, we study the characteristics of such entity rankings based on a set of rankings obtained from a popular Web portal. The obtained insights are integrated in our approach, coined Pantheon. Pantheon maintains sets of top-k rankings and reports identified changes in a way that appeals to users, using a novel combination of different characteristics like competitiveness, information entropy, and scale of change. Entity rankings are assembled by combining entity type attributes with data-driven categorical constraints and sorting criteria on numeric attributes. We report on the results of an experimental evaluation using real-world data obtained from a basketball statistics website.
Citations: 7
Scalable, continuous tracking of tag co-occurrences between short sets using (almost) disjoint tag partitions
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484705
F. Alvanaki, S. Michel
In this work we consider the continuous computation of set correlations over a stream of set-valued attributes, such as Tweets and their hashtags, social annotations of blog posts obtained through RSS, or updates to set-valued attributes of databases. In order to compute tag correlations in a distributed fashion, all necessary information has to be present at the computing node(s). Our approach makes use of a partitioning scheme based on set covers for efficient and replication-lean information flow. We report on the results of a preliminary performance evaluation using Tweets obtained through Twitter's streaming API.
Citations: 4
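To make the notion of tag co-occurrence in the abstract above concrete, here is a minimal single-node sketch in Python. It only counts how often two tags appear in the same set; it does not attempt the set-cover partitioning or the distributed flow the paper describes. The function name `count_cooccurrences` and the sample stream are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(tag_sets):
    """Count how often each unordered tag pair appears together in one set."""
    pair_counts = Counter()
    for tags in tag_sets:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(tags)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical stream of hashtag sets, one per message.
stream = [
    {"sigmod", "databases"},
    {"sigmod", "databases", "nyc"},
    {"nyc", "weather"},
]
counts = count_cooccurrences(stream)
print(counts.most_common(3))
```

In a distributed setting, every pair that must be counted has to meet at some node, which is the constraint the abstract's partitioning scheme addresses.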
STK-anonymity: k-anonymity of social networks containing both structural and textual information
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484707
Yifan Hao, H. Cao, K. Bhattarai, S. Misra
We study the problem of anonymizing social networks to prevent individual identifications which use both structural (node degrees) and textual (edge labels) information in social networks. We introduce the concept of Structural and Textual (ST)-equivalence of individuals at two levels (strict and loose), and formally define the problem as Structure and Text aware K-anonymity of social networks (STK-Anonymity). In an STK-anonymized network, each individual is ST-equivalent to at least K-1 other nodes. The major challenge in achieving STK-Anonymity comes from the correlation of edge labels, which causes the propagation of edge anonymization. To address the challenge, we present a two-phase approach. In particular, a set-enumeration tree based approach and three pruning strategies are introduced in the second phase to avoid the propagation problem during anonymization. Experimental results on both real and synthetic datasets are presented to show the effectiveness and efficiency of our approaches.
Citations: 1
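The condition in the STK-anonymity abstract, that each individual is equivalent to at least K-1 other nodes, can be illustrated with a small Python check. The abstract does not spell out the strict and loose ST-equivalence definitions, so the signature used here (node degree plus the sorted labels of incident edges) is an assumption made purely for illustration, and `signature` and `is_k_anonymous` are hypothetical names.

```python
from collections import Counter


def signature(node, edges):
    """Assumed signature: (degree, sorted labels of incident edges)."""
    labels = sorted(label for u, v, label in edges if node in (u, v))
    return (len(labels), tuple(labels))


def is_k_anonymous(nodes, edges, k):
    """True if every node shares its signature with at least k-1 others."""
    group_sizes = Counter(signature(n, edges) for n in nodes)
    return all(size >= k for size in group_sizes.values())


# Hypothetical labeled graph: edges are (node, node, label) triples.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", "friend"), ("c", "d", "coworker")]
print(is_k_anonymous(nodes, edges, 2))
print(is_k_anonymous(nodes, edges, 3))
```

This only verifies the property on a given graph; producing an anonymized graph is the harder problem the paper's two-phase approach targets.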
curso: protect yourself from curse of attribute inference: a social network privacy-analyzer
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484706
Eunsu Ryu, Yao Rong, Jie Li, Ashwin Machanavajjhala
While social networking platforms allow users to control how their private information is shared, recent research has shown that a user's sensitive attribute can be inferred based on friendship links and group memberships, even when the attribute value is not shared with anyone else. Thus, existing access control mechanisms are unable to protect against such privacy breaches. Our research goal is to develop tools that help a user Alice be aware of privacy breaches via attribute inference. In this paper, we specifically focus on two problems: (a) whether Alice's sensitive attribute can be inferred based on public information in Alice's neighborhood, and (b) whether making Alice's sensitive attribute public leads to the disclosure of sensitive information of another user Bob in Alice's neighborhood. We propose three algorithms to detect the aforementioned privacy breaches. We limit our scope to the one-hop neighbors of Alice -- information that is visible to an app that can be executed on behalf of Alice. Our results indicate that analyzing local networks is sufficient to extract a significant amount of information about most users.
Citations: 24
Implementing link-prediction for social networks in a database system
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484710
Sara Cohen, Netanel Cohen-Tzemach
Storing and querying large social networks is a challenging problem, due both to the scale of the data and to intricate querying requirements. One common type of query over a social network is link prediction, which is used to suggest new friends for existing nodes in the network. There is no gold standard metric for predicting new links. However, past work has been effective at identifying a number of metrics that work well for this problem. These metrics differ vastly from one another in their computational complexity, e.g., they may consider a small neighborhood of a node for which new links should be predicted, or they may perform random walks over the entire social network graph. This paper considers the problem of implementing metrics for link prediction in a social network over different types of database systems. We consider the use of a relational database, a key-value store and a graph database. We show that the type of database system affects the ease with which link prediction may be performed. Our results are empirically validated by extensive experimentation over real social networks of varying sizes.
Citations: 8
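The link-prediction abstract above mentions metrics that look only at a small neighborhood of a node. Two widely used examples of that kind are the common-neighbors count and the Jaccard coefficient; the abstract does not say which metrics the paper implements, so the Python sketch below is an illustration under that assumption, using an in-memory adjacency map rather than any of the database systems the paper compares. All names and the sample graph are hypothetical.

```python
def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of u or v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def suggest_friends(adj, u, score=common_neighbors):
    """Rank non-neighbors of u by the chosen score, highest first."""
    candidates = set(adj) - adj[u] - {u}
    ranked = sorted(candidates, key=lambda v: (-score(adj, u, v), v))
    return [v for v in ranked if score(adj, u, v) > 0]


# Hypothetical undirected friendship graph as node -> set of neighbors.
adj = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat", "eve"},
    "eve": {"dan"},
}
print(suggest_friends(adj, "ann"))
print(jaccard(adj, "ann", "dan"))
```

Both scores need only the two-hop neighborhood of the query node, which is why they are cheap compared with random-walk metrics that touch the whole graph.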
Event identification for local areas using social media streaming data
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484703
Andreas Weiler, M. Scholl, Franz Wanner, Christian Rohrdantz
Unprecedented success and active usage of social media services result in massive amounts of user-generated data. An increasing interest in the contained information from social media data leads to more and more sophisticated analysis and visualization applications. Because of the fast pace and distribution of news in social media data it is an appropriate source to identify events in the data and directly display their occurrence to analysts or other users. This paper presents a method for event identification in local areas using the Twitter data stream. We implement and use a combined log-likelihood ratio approach for the geographic and time dimension of real-life Twitter data in predefined areas of the world to detect events occurring in the message contents. We present a case study with two interesting scenarios to show the usefulness of our approach.
Citations: 39
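The event-identification abstract above relies on a log-likelihood ratio. As a rough illustration, the Python sketch below computes the standard G² statistic on a 2×2 contingency table, comparing how often a term occurs in a target slice of messages (for example one area or one time window) against a reference slice. How the paper combines the geographic and time dimensions is not described in the abstract, so this shows only the basic statistic; the function and parameter names are assumptions.

```python
import math


def log_likelihood_ratio(a, b, c, d):
    """G^2 for a term seen a times among c target tokens
    and b times among d reference tokens."""
    observed = [[a, c - a], [b, d - b]]
    total = c + d
    row_sums = [c, d]
    col_sums = [a + b, total - (a + b)]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# Hypothetical counts: the term is three times as frequent in the target slice.
print(log_likelihood_ratio(30, 10, 100, 100))
# Same relative frequency in both slices gives a score of zero.
print(log_likelihood_ratio(10, 10, 100, 100))
```

A larger score means the term's frequency in the target slice deviates more strongly from the reference, which is what makes it a candidate signal for an event.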
Cache augmented database management systems
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484709
Shahram Ghandeharizadeh, Jason Yap
Cache Augmented Database Management Systems, CADBMSs, enhance the velocity of simple operations that read and write a small amount of data from big data. They are most suitable for applications with workloads that exhibit a high read-to-write ratio, e.g., interactive social networking actions. This study surveys the state of the art in CADBMSs and presents physical data independence as the next step in their evolution. We detail the requirements of this evolution, technological trends and software practices, and our research efforts in this area.
Citations: 34
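One common way to pair a cache with a database for read-heavy workloads like those in the abstract above is the cache-aside pattern with invalidation on write. The Python sketch below illustrates that pattern only; the abstract does not state which designs the survey covers, and plain dictionaries stand in for both the DBMS and the key-value cache. The class name `CacheAugmentedStore` and the `db_reads` counter are hypothetical.

```python
class CacheAugmentedStore:
    """Cache-aside reads with invalidate-on-write (illustrative only)."""

    def __init__(self, database):
        self.database = database  # stand-in for the DBMS (a dict here)
        self.cache = {}           # stand-in for a key-value cache
        self.db_reads = 0         # counts reads that reached the database

    def read(self, key):
        if key in self.cache:
            return self.cache[key]       # cache hit: no database access
        self.db_reads += 1
        value = self.database[key]       # cache miss: query the database
        self.cache[key] = value          # populate the cache for next time
        return value

    def write(self, key, value):
        self.database[key] = value       # update the database first
        self.cache.pop(key, None)        # then invalidate the stale entry


store = CacheAugmentedStore({"profile:1": "Alice"})
store.read("profile:1")
store.read("profile:1")
print(store.db_reads)
```

With a high read-to-write ratio most reads are served from the cache, so the database sees roughly one read per key per write.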
The predictive value of young and old links in a social network
Pub Date : 2013-06-22 DOI: 10.1145/2484702.2484711
Hung-Hsuan Chen, David J. Miller, C. Lee Giles
Recent studies show that vertex similarity measures are good at predicting link formation over the near term, but are less effective in predicting over the long term. This indicates that, generally, as links age, their degree of influence diminishes. However, few papers have systematically studied this phenomenon. In this paper, we apply a supervised learning approach to study age as a factor for link formation. Experiments on several real-world datasets show that younger links are more informative than older ones in predicting the formation of new links. Since older links become less useful, it might be appropriate to remove them when studying network evolution. Several previously observed network properties and network evolution phenomena, such as "the number of edges grows super-linearly in the number of nodes" and "the diameter is decreasing as the network grows", may need to be reconsidered under a dynamic network model where old, inactive links are removed.
Citations: 7
Distance matters: an exploratory analysis of the linguistic features of Flickr photo tag metadata in relation to impression management
Pub Date : 2012-05-20 DOI: 10.1145/2304536.2304538
Syed Ishtiaque Ahmed, Shion Guha
Tags are words that users add to shared multimedia content as metadata to facilitate better categorization and improved sharing experiences. With the burgeoning growth of shared images and videos over online social networks, a huge number of tags is being populated every day in public or shared databases. While one major reason for tagging a photo or a video incorporates the functional needs for the organization of that shared object, people also use tags as a medium of communication for conveying their emotions to their family, friends, and other contacts. The diversity in the linguistic features of these tags demonstrates some interesting patterns that reflect different facets of human nature in managing their online impression to their social peers. This paper investigates how some linguistic features of tags associated with the Flickr photos change with the distance between the user's home location and the location where the photo is taken. In our exploratory analysis "affective" and "relativ" words and their multiplicative interaction show correlations with this distance. These initial findings help us to have a better understanding of online social phenomena related to the expression of emotions and sharing information. At the same time, this might have some indirect implications for understanding impression management in online communities.
Citations: 6