Proceedings of The Web Conference 2020: Latest Articles

When Recommender Systems Meet Fleet Management: Practical Study in Online Driver Repositioning System
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380287
Zhe Xu, Chang Men, Pengbo Li, Bicheng Jin, Ge Li, Yue Yang, Chunyang Liu, Ben Wang, X. Qie
E-hailing platforms have become an important component of public transportation in recent years. Supply (online drivers) and demand (passenger requests) are intrinsically imbalanced because of patterns in human behavior, especially at particular times and locations such as peak hours and train stations. Hence, balancing supply and demand is one of the key problems in satisfying passengers and drivers and increasing social welfare. As an intuitive and effective approach to this problem, driver repositioning has been employed by some real-world e-hailing platforms. In this paper, we describe a novel framework for a driver repositioning system that meets various practical requirements, including robust driver experience satisfaction and multi-driver collaboration. We introduce an effective and user-friendly driver interaction design called the “driver repositioning task”. A novel modularized algorithm is developed to generate the repositioning tasks in real time. To our knowledge, this is the first industry-level application of driver repositioning. We evaluate the proposed method in real-world experiments, achieving a 2% improvement in driver income. Our framework has been fully deployed in the online system of DiDi Chuxing and serves millions of drivers on a daily basis.
Citations: 14
Negative Purchase Intent Identification in Twitter
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380040
Samed Atouati, Xiao Lu, Mauro Sozio
Social network users often express their discontent with a company's product or service on social media. Such reactions are more pronounced in the aftermath of a corporate scandal, such as a corruption scandal or food poisoning at a chain restaurant. In our work, we focus on identifying negative purchase intent in a tweet, i.e., a user's intent not to purchase any product or consume any service from a company. We develop a binary classifier for this task, which is a generalization of logistic regression that leverages the locality of purchase intent in posts from Twitter. We conduct an extensive experimental evaluation against state-of-the-art approaches on a large collection of tweets, showing the effectiveness of our approach in terms of F1 score. We also provide some preliminary results on which kinds of corporate scandals might affect the purchase intent of customers the most.
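For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of how an F1 score is computed for a binary classifier. The labels, the predictions, and the function name `f1_score_binary` are illustrative assumptions, not data or code from the paper, and the sketch says nothing about the paper's own classifier.

```python
def f1_score_binary(y_true, y_pred):
    """Return F1, the harmonic mean of precision and recall for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive prediction: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = tweet expresses negative purchase intent, 0 = it does not.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

In this toy example there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and so is F1.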
Citations: 7
Examining Protest as An Intervention to Reduce Online Prejudice: A Case Study of Prejudice Against Immigrants
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380307
Kai Wei, Y. Lin, Muheng Yan
There has been growing concern about online users using social media to incite prejudice and hatred against other individuals or groups. While there has been research on developing automated techniques to identify acts of online prejudice and hate speech, how to effectively counter online prejudice remains a societal challenge. Social protests, on the other hand, have frequently been used as an intervention for countering prejudice. However, research to date has not examined the relationship between protests and online prejudice. Using large-scale panel data collected from Twitter, we examine the changes in users’ tweeting behaviors relating to prejudice against immigrants following recent protests in the U.S. on immigration-related topics. This is the first empirical study examining the effect of protests on reducing online prejudice. Our results show that there were both negative and positive changes in the measured prejudice after a protest, suggesting that protest might have a mixed effect on reducing prejudice. We further identify users who are likely to change (or resist change) after a protest. This work contributes to the understanding of online prejudice and of the effect of interventions against it. The findings of this research have implications for designing targeted interventions.
Citations: 8
Designing Fairly Fair Classifiers Via Economic Fairness Notions
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380228
Safwan Hossain, Andjela Mladenovic, Nisarg Shah
The past decade has witnessed rapid growth in research on fairness in machine learning. In contrast, fairness has been formally studied for almost a century in microeconomics in the context of resource allocation, during which many general-purpose notions of fairness have been proposed. This paper explores the applicability of two such notions, envy-freeness and equitability, in machine learning. We propose novel relaxations of these fairness notions that apply to groups rather than individuals and are compelling in a broad range of settings. Our approach provides a unifying framework by incorporating several recently proposed fairness definitions as special cases. We provide generalization bounds for our approach, and theoretically and experimentally evaluate the tradeoff between loss minimization and our fairness guarantees.
Citations: 24
Private Data Manipulation in Optimal Sponsored Search Auction
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380023
Xiaotie Deng, Tao Lin, Tao Xiao
In this paper, we revisit the sponsored search auction as a repeated auction. We view it as a task in which the seller learns and exploits the buyers' private data distribution. We model this game between the seller and the buyers as a Private Data Manipulation (PDM) game: the seller first announces an auction whose allocation and payment rules are based on the value distributions submitted by the buyers. The seller’s expected revenue depends on the design of the protocol and on the game played among the buyers in choosing which (fake) value distributions to submit. Under the PDM game, we re-evaluate the theory, methodology, and techniques of sponsored search auctions that have been most intensively studied in Internet economics.
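To illustrate why a submitted value distribution matters when the auction rules depend on it, here is a deliberately simplified single-buyer sketch. It is not the paper's PDM model. It assumes one buyer whose true value is uniform on [0, 1], a seller who posts the Myerson monopoly reserve computed from the submitted uniform distribution, and a buyer who then buys whenever the true value is at least the posted price. All function names are hypothetical.

```python
def monopoly_reserve_uniform(upper):
    """Myerson reserve for values uniform on [0, upper].

    The virtual value is r - (1 - F(r)) / f(r) = 2r - upper, which is zero at r = upper / 2.
    """
    return upper / 2.0


def seller_revenue(reserve, true_upper=1.0):
    """Expected revenue from posting `reserve` to a buyer with value uniform on [0, true_upper]."""
    prob_sale = max(0.0, 1.0 - reserve / true_upper)
    return reserve * prob_sale


def buyer_utility(reserve, true_upper=1.0):
    """Expected surplus E[max(v - reserve, 0)] for v uniform on [0, true_upper]."""
    if reserve >= true_upper:
        return 0.0
    return (true_upper - reserve) ** 2 / (2.0 * true_upper)


# Truthful report: uniform on [0, 1]. Assumed fake report: uniform on [0, 0.5].
honest_reserve = monopoly_reserve_uniform(1.0)
fake_reserve = monopoly_reserve_uniform(0.5)

for label, r in [("honest", honest_reserve), ("fake", fake_reserve)]:
    print(label, "reserve:", r, "revenue:", seller_revenue(r), "buyer utility:", buyer_utility(r))
```

Under these assumptions, understating the distribution lowers the reserve, which reduces the seller's expected revenue and raises the buyer's expected surplus. That is the kind of incentive a game over submitted distributions has to account for.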
Citations: 7
Architectures for Autonomy: Towards an Equitable Web of Data in the Age of AI
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3382668
Sir Nigel Shadbolt
Today, the Web connects over half the world's population, many of whom use it to stay connected to a multiplicity of vital digital public and private services, impacting every aspect of their lives. Access to the Web and the underlying Internet is seen as essential for all, even a fundamental human right [7]. However, many contend that the power structure on large swaths of the Web has become inverted; they argue that instead of being run for and by users, it has been made to serve the platforms themselves, and the powerful actors that sponsor such platforms to run targeted advertising on their behalf. In such an ad-driven platform ecosystem, users, including their beliefs, data, and attention, have become traded commodities [13]. There is concern that the emergence of powerful data analytics and AI techniques threatens to further entrench the power of these same platforms, by putting the control of powerful and valuable new capabilities in their hands rather than in the hands of the users who produce the data [10]. The fear is that this is giving rise to data and AI monopolies [2,6]. Individuals have no long-term control or agency over their personal data or many of the decisions made using it. This may be one reason we are witnessing a so-called Renaissance of Ethics: a plethora of initiatives and activities that call out the range of threats to individual autonomy, self-determination and privacy, the lack of transparency and accountability, and concerns around bias and fairness, equity and access in our data-driven ecosystem. This keynote will argue that, as the remaining half of the world's population comes online, we need digital infrastructures that will promote a plurality of methods of data sovereignty and governance instead of imposing a 'single policy fits all' platform governance model, which has strained and undermined the ability of governments to protect and support their citizens' digital rights. This is an opportunity to re-imagine and re-architect elements of the Web, data, algorithms and institutions so as to ensure a more equitable distribution of these new digital potentialities. Based on our existing research, we have been developing methods and technologies pertaining to the following core principles: informational self-determination and autonomy, balanced and equitable access to AI and data, accountability and redress of AI/algorithmic decisions, and new models of ethical participation and contribution. The technology that underpins the modern Web has seen exponential rates of change that have continuously improved the capabilities of the processors, memory and communications upon which it depends. This has enabled huge amounts of data to be linked and stored, as well as providing for increasing use of AI. A variety of projects will be described where we sought to unlock the potential of this increasingly powerful infrastructure [1, 4, 5, 9]. The lessons learnt through various efforts to develop the Semantic Web [8] and the insights gained through the release of open data at scale [11] will be reviewed. We will review our attempts to understand how the large-scale mixing of humans, algorithms and data gives rise to social machines, whose emergent properties lead to behaviour and problem solving that no single element could achieve [12]. Understanding these emergent properties of the Web was one of the motivations behind establishing Web Science [3]. We will briefly review the prospects for Web Science. The importance of data as infrastructure will be emphasised, enabling wide-ranging innovation, accountability, and trustworthy, reproducible science. Recent work that aims to promote a fair and balanced Web environment, one that upholds privacy and enables better reciprocity, will be outlined, along with developments in technical and institutional architectures that can support an ethical Web of data.
Citations: 0
De-Kodi: Understanding the Kodi Ecosystem
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380194
Marc Anthony Warrior, Yunming Xiao, Matteo Varvello, A. Kuzmanovic
Free and open source media centers are currently experiencing a boom in popularity because of the convenience and flexibility they offer users seeking to remotely consume digital content. This newfound fame is matched by increasing notoriety, owing to their potential to serve as hubs for illegal content, and by a presumably ever-increasing network footprint. It is fair to say that a complex ecosystem has developed around Kodi, composed of millions of users, thousands of “add-ons” (Kodi extensions from third-party developers), and content providers. Motivated by these observations, this paper conducts the first analysis of the Kodi ecosystem. Our approach is to build “crawling” software around Kodi that can automatically install an add-on, explore its menu, and locate (video) content. This is challenging for many reasons. First, Kodi largely relies on visual information and user input, which intrinsically complicates automation. Second, no central aggregators for Kodi add-ons exist. Third, the potential sheer size of this ecosystem requires a highly scalable crawling solution. We address these challenges with de-Kodi, a full-fledged crawling system capable of discovering and crawling large cross-sections of Kodi’s decentralized ecosystem. With de-Kodi, we discovered and tested over 9,000 distinct Kodi add-ons. Our results demonstrate that de-Kodi, which we make available to the general public, is an essential asset in studying one of the largest multimedia platforms in the world. Our work further serves as the first ever transparent and repeatable analysis of the Kodi ecosystem at large.
Citations: 4
Learning the Structure of Auto-Encoding Recommenders
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380135
Farhan Khawar, Leonard K. M. Poon, N. Zhang
Autoencoder recommenders have recently shown state-of-the-art performance in the recommendation task due to their ability to model non-linear item relationships effectively. However, existing autoencoder recommenders use fully-connected neural network layers and do not employ structure learning. This can lead to inefficient training, especially when the data is sparse, as is common in collaborative filtering, which in turn results in lower generalization ability and reduced performance. In this paper, we introduce structure learning for autoencoder recommenders by taking advantage of the inherent item groups present in the collaborative filtering domain. Due to the nature of items in general, we know that certain items are more related to each other than to other items. Based on this, we propose a method that first learns groups of related items and then uses this information to determine the connectivity structure of an auto-encoding neural network. This results in a network that is sparsely connected. This sparse structure can be viewed as a prior that guides the network training. Empirically, we demonstrate that the proposed structure learning enables the autoencoder to converge to a local optimum with a much smaller spectral norm and generalization error bound than the fully-connected network. The resultant sparse network considerably outperforms state-of-the-art methods such as Mult-VAE and Mult-DAE on multiple benchmark datasets, even when the same number of parameters and FLOPs are used. It also has better cold-start performance.
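The idea of letting item groups determine a sparse connectivity structure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the item groups are hard-coded rather than learned, each group is given its own block of hidden units, and the names `group_mask` and `encode` are hypothetical.

```python
import math
import random


def group_mask(item_groups, hidden_per_group):
    """Build a binary mask (n_items x n_hidden).

    Item i is connected only to the hidden units reserved for its own group.
    """
    n_groups = max(item_groups) + 1
    n_hidden = n_groups * hidden_per_group
    mask = [[0.0] * n_hidden for _ in item_groups]
    for item, g in enumerate(item_groups):
        for h in range(g * hidden_per_group, (g + 1) * hidden_per_group):
            mask[item][h] = 1.0
    return mask


def encode(x, weights, mask):
    """One sparsely connected encoder layer: tanh(x @ (weights * mask))."""
    n_hidden = len(mask[0])
    return [
        math.tanh(sum(x[i] * weights[i][h] * mask[i][h] for i in range(len(x))))
        for h in range(n_hidden)
    ]


random.seed(0)
item_groups = [0, 0, 1, 1, 1, 2]  # assumed group id for each of six items
mask = group_mask(item_groups, hidden_per_group=2)
weights = [[random.gauss(0.0, 1.0) for _ in row] for row in mask]

# A user who interacted with item 0 (group 0) and item 3 (group 1).
x = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
hidden = encode(x, weights, mask)
print(hidden)
```

Because the user has no interactions in group 2, the two hidden units reserved for that group receive no input and stay at zero. The mask keeps only 12 of the 36 possible connections, which is the kind of sparsity the abstract describes as a prior on the network.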
Citations: 14
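As a rough illustration of the idea in the "Learning the Structure of Auto-Encoding Recommenders" abstract (item groups decide which connections of the autoencoder exist), the following is a minimal sketch and not the authors' implementation. It assumes the item groups are already known and hard-codes them, whereas the paper learns them from data. The function names, the single hidden layer, and the toy weights are all hypothetical.

```python
import math
import random


def build_group_mask(item_groups, n_items):
    """Return an n_groups x n_items 0/1 mask.

    Hidden unit g is connected only to the items listed in item_groups[g].
    """
    mask = [[0.0] * n_items for _ in item_groups]
    for g, items in enumerate(item_groups):
        for i in items:
            mask[g][i] = 1.0
    return mask


def forward(x, w_enc, w_dec, mask):
    """One forward pass of a sparsely connected autoencoder.

    w_enc has shape n_groups x n_items, w_dec has shape n_items x n_groups.
    Weights whose mask entry is 0 contribute nothing to the output.
    """
    n_groups, n_items = len(mask), len(x)
    hidden = [
        math.tanh(sum(w_enc[g][i] * mask[g][i] * x[i] for i in range(n_items)))
        for g in range(n_groups)
    ]
    return [
        sum(w_dec[i][g] * mask[g][i] * hidden[g] for g in range(n_groups))
        for i in range(n_items)
    ]


# Hypothetical toy setup: 6 items split into two disjoint groups.
item_groups = [[0, 1, 2], [3, 4, 5]]
mask = build_group_mask(item_groups, n_items=6)

random.seed(0)
w_enc = [[random.gauss(0, 1) for _ in range(6)] for _ in range(2)]
w_dec = [[random.gauss(0, 1) for _ in range(2)] for _ in range(6)]

# A user who interacted with items 0 and 2 only.
x = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
scores = forward(x, w_enc, w_dec, mask)
```

In this toy setup the groups are disjoint, so a user with interactions only in the first group gets a score of zero for every item in the second group. The mask acts as the structural prior: it fixes which weights may be non-zero before any training happens.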
Client Insourcing: Bringing Ops In-House for Seamless Re-engineering of Full-Stack JavaScript Applications
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380105
Kijin An, E. Tilevich
Modern web applications are distributed across a browser-based client and a cloud-based server. Distribution provides access to remote resources, accessed over the web and shared by clients. Much of the complexity of inspecting and evolving web applications lies in their distributed nature. Also, the majority of mature program analysis and transformation tools work only with centralized software. Inspired by business process re-engineering, in which remote operations can be insourced back in-house to be restructured and outsourced anew, we bring an analogous approach to the re-engineering of web applications. Our target domain is full-stack JavaScript applications, which implement both the client and server code in this language. Our approach is enabled by Client Insourcing, a novel automatic refactoring that creates a semantically equivalent centralized version of a distributed application. This centralized version is then inspected, modified, and redistributed to meet new requirements. After describing the design and implementation of Client Insourcing, we demonstrate its utility and value in addressing changes in security, reliability, and performance requirements. By reducing the complexity of the non-trivial program inspection and evolution tasks performed to meet these requirements, our approach can become a helpful aid in the re-engineering of web applications in this domain.
Citations: 5
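The Client Insourcing abstract describes turning a distributed application into a semantically equivalent centralized one. The sketch below is a hand-written Python analogue of that idea, not the paper's tool, which is an automatic refactoring for JavaScript. The network boundary is simulated by a JSON round trip, and every name (`compute_total`, `handle_request`, `checkout_distributed`, `checkout_centralized`) is hypothetical.

```python
import json


# "Server" side: business logic reachable only through a request handler.
def compute_total(prices, tax_rate):
    """Sum the prices and apply a tax rate, rounded to two decimals."""
    return round(sum(prices) * (1 + tax_rate), 2)


def handle_request(body):
    """Stand-in for an HTTP route: decode JSON, run the logic, encode the reply."""
    args = json.loads(body)
    total = compute_total(args["prices"], args["tax_rate"])
    return json.dumps({"total": total})


# Distributed client: marshals arguments across the simulated boundary.
def checkout_distributed(prices, tax_rate, send=handle_request):
    reply = send(json.dumps({"prices": prices, "tax_rate": tax_rate}))
    return json.loads(reply)["total"]


# Insourced client: the same logic as a direct local call.
def checkout_centralized(prices, tax_rate):
    return compute_total(prices, tax_rate)


prices = [19.99, 5.01]
same_result = checkout_distributed(prices, 0.1) == checkout_centralized(prices, 0.1)
```

Once the remote call is a plain function call, ordinary single-process tools (debuggers, static analyzers, profilers) can follow the whole control flow, which is the motivation the abstract gives for centralizing before modifying and redistributing.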
I’ve Got Your Packages: Harvesting Customers’ Delivery Order Information using Package Tracking Number Enumeration Attacks
Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380062
Simon S. Woo, Hanbin Jang, Woojung Ji, Hyoungshick Kim
A package tracking number (PTN) is widely used to monitor and track a shipment. Through the lenses of security and privacy, however, a package tracking number can reveal certain personal information, leading to security and privacy breaches. In this work, we examine the privacy issues associated with the online package tracking systems used by the three most popular package delivery service providers in the world (FedEx, DHL, and UPS) and find that those websites inadvertently leak users’ personal data through a PTN. Moreover, we discovered that PTNs are highly structured and predictable. Therefore, customers’ personal data can be collected at scale via PTN enumeration attacks. We analyzed more than one million package tracking records obtained from FedEx, DHL, and UPS, and showed that within 5 attempts, an attacker can efficiently guess more than 90% of PTNs for FedEx and DHL, and close to 50% of PTNs for UPS. In addition, we present two practical attack scenarios: 1) inferring business transaction information and 2) uniquely identifying recipients. Also, we found that more than 109 recipients can be uniquely identified with fewer than 10 comparisons by linking the PTN information with the online people search service Whitepages.
Citations: 2