
Companion Proceedings of the Web Conference 2021: Latest Articles

Visualising Scientific Topic Evolution
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3451371
P. Deligiannis, Thanasis Vergoulis, Serafeim Chatzopoulos, Christos Tryfonopoulos
The automatic extraction of topics is a standard technique for summarizing text corpora from various domains (e.g., news articles, transport or logistics reports, scientific publications) and has several applications. Since topics are, in many cases, subject to continuous change, there is a need to monitor the evolution of a set of topics of interest as the corresponding corpora are updated. The evolution of scientific topics, in particular, is of great interest to researchers, policy makers, fund managers, and other professionals and engineers in the research and academic community. In this work, we demonstrate a prototype that provides intuitive visualisations of the evolution of scientific topics, offering insights into topic transformation, merging, and splitting in recent years. Although the prototype works on top of a scientific text corpus, its implementation is generic and can easily be applied to texts from other domains as well.
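The notions of topic merging and splitting can be made concrete with a small sketch. The abstract does not say how the prototype links topics across time, so everything below is an assumption for illustration: topics are represented as keyword lists, topics from two consecutive periods are linked when the Jaccard overlap of their keywords reaches a hypothetical threshold, and the function names are invented here.

```python
def jaccard(a, b):
    """Jaccard overlap between two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_topics(earlier, later, threshold=0.3):
    """Return (earlier_id, later_id) pairs whose keyword overlap meets the threshold.

    `earlier` and `later` map a topic id to its keywords; the threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    return [
        (e_id, l_id)
        for e_id, e_words in earlier.items()
        for l_id, l_words in later.items()
        if jaccard(e_words, l_words) >= threshold
    ]


def splits_and_merges(links):
    """Group links into splits (one earlier topic, several later ones) and merges."""
    successors, predecessors = {}, {}
    for e_id, l_id in links:
        successors.setdefault(e_id, []).append(l_id)
        predecessors.setdefault(l_id, []).append(e_id)
    splits = {e: ls for e, ls in successors.items() if len(ls) > 1}
    merges = {l: es for l, es in predecessors.items() if len(es) > 1}
    return splits, merges
```

With this representation, an earlier topic linked to two later topics is reported as a split, and a later topic linked to two earlier topics as a merge; a one-to-one link would correspond to a plain transformation.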
Citations: 1
Data-Driven Solutions in Smart Cities: The case of Covid-19
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3453469
Nenad N. Petrović, Vlado Dimovski, Judita Peterlin, M. Meško, Vasja Roblek
This paper aims to give a systemic view of data-driven mobile applications in urban data management processes, which is essential for a sustainable smart city ecosystem that needs diversification among stakeholders and data sources. Developing data-driven applications requires sustainable data-driven smart solutions based on an urban data platform that will enable citizen wellbeing in the smart city. In this paper, we present five case-study mobile applications developed using AppSheet and Google Apps Script technologies to prevent the spread of COVID-19 and provide support to (potentially) infected citizens. Several aspects relevant to the coronavirus pandemic are considered: quick COVID-19 patient assessment based on user-provided symptoms integrated with contact tracing; volunteer help during quarantine; UAV-based COVID-19 outdoor safety surveillance; test scheduling; and an AR-based pharmacy shop assistant.
Citations: 13
Rewarding Research Data Management
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3451367
Joachim Schöpfel, Otmane Azeroual
In the context of open science, good research data management (RDM), including data sharing and data reuse, has become a major goal of research policy. However, studies and monitors reveal that open science practices are not yet widely mainstream. Rewards and incentives have been suggested as a solution, to facilitate and accelerate the development of open and transparent RDM. Based on relevant literature, our paper provides a critical analysis of three main issues: what should be rewarded and incentivized, who should be rewarded, and what kind of rewards and incentives should be used? Concluding the analysis, we ask if it is really necessary and appropriate to consider RDM as an individual (behavioral) issue, as the main challenges are elsewhere, not personal, but technological, institutional and financial.
Citations: 3
Auditing Source Diversity Bias in Video Search Results Using Virtual Agents
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3452306
Aleksandra Urman, M. Makhortykh, R. Ulloa
We audit the presence of domain-level source diversity bias in video search results. Using a virtual agent-based approach, we compare the outputs of four Western search engines and one non-Western search engine for English and Russian queries. Our findings highlight that source diversity varies substantially depending on the language, with English queries returning more diverse outputs. We also find a disproportionately high presence of a single platform, YouTube, in top search outputs for all Western search engines except Google. At the same time, we observe that YouTube's major competitors, such as Vimeo or Dailymotion, do not appear in the sampled Google video search results. This finding suggests that Google might be downgrading the results from the main competitors of Google-owned YouTube and highlights the necessity for further studies focusing on the presence of own-content bias in Google's search results.
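As a rough illustration of what a domain-level diversity measurement could look like, here is a minimal sketch. The abstract does not name the metric the authors used, so the two measures below (share of the most frequent domain and Shannon entropy over domains) are assumptions, as are the function names.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def domain_counts(urls):
    """Count results per host, treating 'www.example.com' and 'example.com' alike."""
    hosts = []
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        hosts.append(host)
    return Counter(hosts)


def top_share(urls):
    """Fraction of results that come from the single most frequent domain."""
    counts = domain_counts(urls)
    return max(counts.values()) / sum(counts.values())


def shannon_entropy(urls):
    """Entropy in bits of the domain distribution; higher means more diverse sources."""
    counts = domain_counts(urls)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumed measures, a result list dominated by one platform would show a high top share and a low entropy, while a list spread evenly over many domains would show the opposite.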
Citations: 12
Grounding Language in Visual and Conversational Contexts
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3451898
R. Fernández
Most language use is driven by specific communicative goals in interactive setups, where often visual perception goes hand in hand with language processing. I will discuss some recent projects by my research group related to modelling language generation in socially and visually grounded contexts, arguing that such models can help us to better understand the cognitive processes underpinning these abilities in humans and contribute to more human-like conversational agents.
Citations: 0
aiai at the FinSim-2 task: Finance Domain Terms Automatic Classification Via Word Ontology and Embedding
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3451388
Ke Tian, Hua Chen
This paper describes the method that we submitted to the FinSim-2 task on learning similarities for the financial domain. This task aims to automatically classify financial domain terms into the most relevant hypernym (or top-level) concept in an external ontology. This paper shows the results of experiments using CatBoost, Attention-LSTM, BERT, and RoBERTa to develop an automatic finance domain classifier via word ontology and embedding. The experimental results demonstrate that each model can be an effective method for tackling the FinSim-2 task.
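To make the classification task concrete, the sketch below assigns a term to its nearest hypernym by cosine similarity between embedding vectors. This is a simple baseline swapped in for illustration, not the CatBoost, Attention-LSTM, BERT, or RoBERTa models the abstract reports; the function names and the idea of passing precomputed vectors in a dictionary are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_hypernym(term_vec, hypernym_vecs):
    """Return the hypernym label whose vector is most similar to the term vector.

    `hypernym_vecs` maps a label to its embedding; where the embeddings come
    from is left open here.
    """
    return max(hypernym_vecs, key=lambda label: cosine(term_vec, hypernym_vecs[label]))
```

A learned classifier such as those named in the abstract would replace the `max` over similarities with a model trained on labelled term and hypernym pairs.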
Citations: 4
Semantic Search via Entity-Types: The SEMANNOREX Framework
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3458607
Amit Kumar, Govind, M. Spaniol
Capturing and exploiting the semantics of content is a key success factor for Web search. To this end, it is crucial to extract, ideally automatically, the core semantics of the data being processed and to link this information with some formal representation, such as an ontology. By intertwining both, search becomes semantic while simultaneously allowing end-users structured access to the data via the underlying ontology. Connecting both, we introduce the SEMANNOREX framework in order to provide semantically enriched access to a news corpus from Websites and Wikinews.
Citations: 3
Insurance Assistant: An Intelligent Platform for Video Insurance Assessment
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3458600
Shuang Peng, Minghui Yang, Fudong Wang, Xiangyang Li, Zujie Wen, Lei Liu
In the insurance industry, the assessor's role is essential and requires significant effort conversing with the claimant. This is a highly professional process that involves many complex operations to produce a final insurance report. To save costs, the previously offline insurance assessment procedure is gradually being moved online. However, junior assessors often lack practical experience and find it hard to handle such an online procedure quickly, yet this is important because the insurance company decides how much compensation the claimant should receive based on the assessor's feedback. In this paper, we present an insurance assistant that applies NLP technologies to help junior insurance assessors do their job better. The insurance assistant recommends appropriate inquiring policies and auto-completes the case report during the insurance assessment procedure. Here, we demonstrate the system via a short video.
Citations: 0
PROV4ITDaTa: Transparent and direct transfer of personal data to personal stores
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3458608
Gertjan De Mulder, B. Meester, Pieter Heyvaert, Ruben Taelman, Anastasia Dimou, R. Verborgh
Data is scattered across service providers and heterogeneously structured in various formats. Due to a lack of interoperability, data portability is hindered, and thus user control is inhibited. An interoperable data portability solution for transferring personal data is needed. We demo PROV4ITDaTa: a Web application that allows users to transfer personal data in an interoperable format to their personal data store. PROV4ITDaTa leverages the open-source solutions RML.io, Comunica, and Solid: (i) the RML.io toolset to describe how to access data from service providers and generate interoperable datasets; (ii) Comunica to query these and more flexibly generate enriched datasets; and (iii) Solid Pods to store the generated data as Linked Data in personal data stores. As opposed to other (hard-coded) solutions, PROV4ITDaTa is fully transparent: each component of the pipeline is fully configurable and automatically generates detailed provenance trails. Furthermore, transforming the personal data into RDF allows for an interoperable solution. By maximizing the use of open-source tools and open standards, PROV4ITDaTa facilitates the shift towards a data ecosystem wherein users have control of their data, and providers can focus on their service instead of trying to adhere to interoperability requirements.
Citations: 4
Reliability of Large Scale GPU Clusters for Deep Learning Workloads
Pub Date : 2021-04-19 DOI: 10.1145/3442442.3452056
Junjie Qian, Taeyoon Kim, Myeongjae Jeon
Recent advances in deep learning technologies have made GPU clusters popular as training platforms. In this paper, we study reliability issues, focusing on training job failures, by analyzing logs collected from running deep learning workloads on a large-scale GPU cluster in production. These failures are largely grouped into two categories, infrastructure and user, based on their sources, and reveal diverse reasons for the failures. With insights obtained from the failure analysis, we suggest several different ways to improve the stability of shared GPU clusters designed for DL training and optimize user experience by reducing failure occurrences.
Citations: 2