
Proceedings of the 12th International Conference on Management of Digital EcoSystems: Latest Articles

A Meta Learning Approach for Automating Model Selection in Big Data Environments using Microservice and Container Virtualization Technologies
Shadi Shahoud, Hatem Khalloof, Moritz Winter, Clemens Düpmeier, V. Hagenmeyer
For a given machine learning task, several learning algorithms and their configurations are often tested by trial and error until an adequate solution is found. This wastes the human effort spent constructing multiple models, requires a data analytics expert, and is time-consuming, since the literature proposes a wide variety of learning algorithms and non-expert users do not know which one will perform well. Meta learning addresses these problems and supports non-expert users by recommending a promising learning algorithm based on meta features computed from a given dataset. This paper introduces a new generic microservice-based framework for realizing meta learning in Big Data environments. The framework uses a Big Data software stack, container virtualization, modern web technologies and a microservice architecture to provide a fully manageable and highly scalable solution. For demonstration and evaluation purposes, time series model selection is considered. The performance and usability of the framework are evaluated on state-of-the-art machine learning algorithms for time series forecasting: the proposed framework is shown to perform very well in assigning an adequate forecasting model to the chosen time series datasets. Moreover, recommending the most appropriate forecasting model incurs an acceptably low overhead, demonstrating that the framework offers an efficient approach to model selection in the context of Big Data.
DOI: https://doi.org/10.1145/3415958.3433072 | Published: 2020-11-02
Citations: 5
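The recommendation step described in the abstract above (meta features computed from a dataset, then a suggested algorithm) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two meta features, the 1-nearest-neighbour lookup, the model labels and the tiny knowledge base are hypothetical and are not taken from the paper's framework.

```python
import math
from statistics import mean, pstdev

def meta_features(series):
    """Return a small, illustrative meta-feature vector for a time series."""
    mu = mean(series)
    sigma = pstdev(series)
    # Lag-1 autocorrelation: how strongly each value follows its predecessor.
    denom = sum((x - mu) ** 2 for x in series)
    lag1 = (sum((a - mu) * (b - mu) for a, b in zip(series, series[1:])) / denom
            if denom else 0.0)
    # Coefficient of variation, guarded against a zero mean.
    cv = sigma / abs(mu) if mu else 0.0
    return (lag1, cv)

def recommend(series, knowledge_base):
    """Recommend the model label of the most similar known dataset (1-NN)."""
    target = meta_features(series)
    best = min(knowledge_base, key=lambda entry: math.dist(entry[0], target))
    return best[1]

# Hypothetical knowledge base: (meta features, best model found offline).
KNOWLEDGE_BASE = [
    (meta_features([1, 2, 3, 4, 5, 6, 7, 8]), "linear_trend_model"),
    (meta_features([5, 1, 5, 1, 5, 1, 5, 1]), "seasonal_model"),
]
```

A call such as `recommend([10, 11, 12, 13, 14, 15, 16, 17], KNOWLEDGE_BASE)` picks the label of the stored dataset whose meta features lie closest to those of the new series, so no candidate model has to be trained at recommendation time.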
Towards a Roadmap for the Internet of Things Software Systems Engineering
R. Motta, K. Oliveira, G. Travassos
New technologies and approaches have led to the Internet of Things (IoT) paradigm, which enables the engineering of more autonomous and smarter software system solutions. Because of its multidisciplinary nature, IoT involves several knowledge areas, such as software and connectivity, that should be combined coherently and uniformly, and it should bring together different voices and expertise to deal with IoT in a multi-faceted way. In this context, what should be considered when specifying, designing, and implementing IoT software systems? This work raises this discussion by defining a roadmap of what should be considered when engineering IoT software systems with respect to all their facets. The roadmap is based on evidence acquired from the technical literature combined with a qualitative study.
DOI: https://doi.org/10.1145/3415958.3433100 | Published: 2020-11-02
Citations: 1
LEOnto
Anis Tissaoui, S. Sassi, R. Chbeir
The Latent Dirichlet Allocation (LDA) model [18] was originally developed and used for document modeling and topic extraction in Information Retrieval. Designing high-quality domain ontologies requires effective and usable methodologies that facilitate the building process. In this paper, we propose a new approach for semi-automatic ontology enrichment from a textual corpus based on the LDA model. In our approach, LDA provides efficient dimension reduction and captures semantic relationships between words and topics and between topics and documents as probability distributions, with minimal human intervention. We conducted several experiments with different model parameters, and domain experts evaluated the corresponding behavior of the enrichment technique. We also compared the results of our method with two existing learning methods using the same dataset. The study showed that our method outperforms the other methods in terms of recall and precision.
DOI: https://doi.org/10.1145/3415958.3433076 | Published: 2020-11-02
Citations: 1
A Partitioning GPU-based Algorithm for Processing the k Nearest-Neighbor Query
Polychronis Velentzas, M. Vassilakopoulos, A. Corral
The k Nearest-Neighbor (k-NN) query is a common spatial query that appears in several big data applications. GPU devices typically have far more processing cores than CPUs and faster device memory than the main memory accessed by CPUs, and thus provide higher computing power. We propose and implement a new GPU-based partitioning algorithm for the k-NN query using the CUDA runtime API. Thanks to partitioning, this algorithm avoids calculating distances for the whole dataset. Using synthetic and real datasets, we present an extensive experimental performance comparison against six existing algorithms, all of which calculate distances for the whole in-memory dataset. The comparison shows that the new algorithm outperforms these six algorithms in all the experiments conducted.
DOI: https://doi.org/10.1145/3415958.3433071 | Published: 2020-11-02
Citations: 4
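The abstract above states that partitioning lets the algorithm avoid computing distances for the whole dataset. The sketch below illustrates that pruning idea only, on the CPU and in plain Python. It is not the paper's CUDA algorithm: the strip-based partitioning along the x axis, the function names and the `examined` counter are all assumptions chosen for illustration.

```python
import heapq
import math

def build_strips(points, strip_size):
    """Sort points by x and cut them into consecutive strips (partitions)."""
    ordered = sorted(points)
    return [ordered[i:i + strip_size] for i in range(0, len(ordered), strip_size)]

def knn(strips, query, k):
    """Return the k nearest points to `query`, skipping strips that cannot help."""
    qx = query[0]

    def gap(strip):
        # Lower bound on the distance from the query to any point in the strip.
        lo, hi = strip[0][0], strip[-1][0]
        return 0.0 if lo <= qx <= hi else min(abs(qx - lo), abs(qx - hi))

    heap = []      # max-heap via negated distances, holds at most k entries
    examined = 0   # number of exact distance computations performed
    for strip in sorted(strips, key=gap):
        if len(heap) == k and gap(strip) > -heap[0][0]:
            break  # every remaining strip is at least this far away
        for p in strip:
            d = math.dist(p, query)
            examined += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    return sorted((-nd, p) for nd, p in heap), examined
```

With 100 collinear points `(0, 0) ... (99, 0)` cut into strips of 10 and the query `(42.2, 0)`, the search for `k = 3` computes 10 exact distances instead of 100, because the strip containing the query already yields three neighbours closer than any other strip's lower bound.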
A Secure Tracing Method in Fog Computing Network for the IoT Devices
A. Alamer, Sultan Basudan, P. Hung
This paper proposes a Secure and Privacy-preserving Tracing (SPT) mechanism for the Fog Computing (FC) network. The SPT mechanism employs a Counting Bloom Filter (CBF) organized as a tree framework (CBF-tree) to model a secure tracing system in the FC network. With SPT, a fog node can trace a particular Internet of Things (IoT) device securely: it can trace IoT devices in order to provide their requested services without revealing private data such as the devices' identities or locations. Analysis shows that the SPT mechanism is both efficient and resilient against tracing attacks. Simulation results show that the proposed mechanism is beneficial to the FC network.
DOI: https://doi.org/10.1145/3415958.3433074 | Published: 2020-11-02
Citations: 6
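For readers unfamiliar with the building block named in the abstract above, here is a minimal sketch of a plain Counting Bloom Filter. It shows only the generic data structure (membership with deletions); the CBF-tree arrangement and the privacy mechanism of the paper are not reproduced, and the class name, sizes and salted SHA-256 hashing scheme are assumptions.

```python
import hashlib

class CountingBloomFilter:
    """Bloom filter with integer counters so that items can also be removed."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _indexes(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all(self.counters[idx] > 0 for idx in self._indexes(item))

    def remove(self, item):
        # Only decrement if the item appears present, so counters stay >= 0.
        if self.might_contain(item):
            for idx in self._indexes(item):
                self.counters[idx] -= 1
```

The filter stores counters rather than the item itself, which is why a structure of this kind can answer "have I seen this device?" without keeping the identifier in clear text. Removing an item that was never added but happens to be a false positive would corrupt the counters, a known limitation of the basic structure.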
Selection of Information Streams in Social Sensing: an Interdependence- and Cost-aware Ranking Method
G. Gianini, C. Mio, F. Viola, Jianyi Lin, Nawaf I. Almoosa
In this work we address the problem of critical source selection in social sensing. We propose an approach to ranking information streams that is aware of the interdependence among streams (redundancy and synergies), of the cost of individual streams, and of the cost of integrating multiple streams. The method is based on the Power Index concept from Coalitional Game Theory and relies on a polynomial-time estimate of the characteristics of stream sets. Unlike other works that use a power index, the method takes into account that the problem has a non-trivial cost structure.
DOI: https://doi.org/10.1145/3415958.3433099 | Published: 2020-11-02
Citations: 1
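To make the power-index idea in the abstract above concrete, the sketch below computes the Shapley value, one well-known power index, for a toy set of streams. The abstract does not say which power index the authors use, and it relies on a polynomial-time estimate, whereas this example enumerates all orderings exactly. The streams, topic coverage, costs and the `net_value` function are hypothetical, and integration costs are left out.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley value: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}

# Hypothetical streams: the set of topics each one covers, and its cost.
COVERAGE = {"A": {1, 2}, "B": {2, 3}, "C": {1, 2}}
COST = {"A": 0.5, "B": 0.5, "C": 1.0}

def net_value(coalition):
    """Number of distinct topics covered minus the summed stream costs."""
    covered = set().union(*(COVERAGE[s] for s in coalition))
    return len(covered) - sum(COST[s] for s in coalition)

ranking = sorted(COVERAGE, key=shapley_values(list(COVERAGE), net_value).get,
                 reverse=True)
```

In this toy game the ranking is B, A, C: stream B is the only source of topic 3, while A and C are redundant with each other and C costs more, so C ends up with a negative score. Exact enumeration grows factorially with the number of streams, which is why an estimate is needed in practice.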
A Retrospective on ASPires: An Advanced System for the Prevention and Early Detection of Forest Fires
P. Peinl
In this paper, we describe the design and a prototypical implementation of an open system (ASPires) for the prevention and early detection of forest fires. Forest fires cause huge and constantly increasing material damage and immaterial costs to humans, the environment and property. Among other measures, the use of new sensor and mobile communication technologies, the use of drones, data storage and analysis in a cloud, and a direct connection with the authorities reduce reaction time and thereby damage. In addition, biodiversity is sustained in remote areas with rare and endemic species of flora and fauna. The system has been tested and proven in 3 national parks in South East Europe.
DOI: https://doi.org/10.1145/3415958.3433039 | Published: 2020-11-02
Citations: 4
A new conceptual framework for enhancing legal information retrieval at the Brazilian Superior Court of Justice
Thiago Gomes, M. Ladeira
Effective retrieval of jurisprudence (case law) is imperative for the consistency and predictability of any legal system. In this work, we propose and empirically evaluate a framework for jurisprudence retrieval at the Brazilian Superior Court of Justice that eases the task of retrieving other decisions with the same legal opinion. The experimental results show that our approach, based on text similarity, performs better than the Court's legacy system, which is based on Boolean queries. Building complex Boolean queries is a very specialized task, and we aim to offer a tool that accepts free text as queries without any operators. With the legacy system as the baseline, we compare the traditional TF-IDF retrieval model, the BM25 probabilistic model and the Word2Vec model. Our results indicate that the Word2Vec Skip-Gram model trained on a specialized legal corpus and BM25 yield similar performance, and both surpass the legacy system. Combining the BM25 model with embedding models improved performance by up to 19%.
DOI: https://doi.org/10.1145/3415958.3433087 | Published: 2020-11-02
Citations: 8
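The BM25 probabilistic model mentioned in the abstract above ranks documents for a free-text query without Boolean operators. The sketch below is a generic textbook-style Okapi BM25 scorer, not the authors' system: the parameter defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant and the whitespace tokenization in the usage note are assumptions.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative for common terms.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

For example, scoring the query `"tax appeal".split()` against three short tokenized decisions ranks a document containing both terms above one containing only "appeal", and a document with neither term scores zero.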
A Novel Framework for Event Interpretation in a Heterogeneous Information System
Nabila Guennouni, C. Sallaberry, Sébastien Laborie, R. Chbeir
Over the last decade, the number of research and development projects on sensor network technology has grown exponentially. Event detection, which allows the environment to be monitored, is one of these research fields. To interpret these events, combining sensor network data with document corpus data is essential, since document corpora provide significant amounts of important and valuable information (e.g., technical data sheets, maintenance reports, customer sheets). However, most information systems in connected environments do not support the interconnection of sensor network and document corpus data; users therefore have to look for an explanation themselves through multiple queries on both data sources, which is tedious, time-consuming and requires a huge compilation effort. In this paper, we show that recent research on 5W1H question answering ("What? Who? Where? When? Why? How?") offers an interesting way to facilitate tunnelling through heterogeneous data sources (sensor networks and document corpora) and the identification of relevant data for explaining an event. Consequently, we propose ISEE (an Information System for Event Explanation), a framework for event interpretation based on (i) the semantic representation of a heterogeneous information system, (ii) the cross-analysis of both sensor network and document corpus data and (iii) 5W1H question-answering techniques.
DOI: https://doi.org/10.1145/3415958.3433073 | Published: 2020-11-02
Citations: 1
A Methodology for Non-Functional Property Evaluation of Machine Learning Models
M. Anisetti, C. Ardagna, E. Damiani, Paolo G. Panero
The pervasive diffusion of Machine Learning (ML) in many critical domains and application scenarios has revolutionized the implementation and operation of modern IT systems. The behavior of modern systems often depends on the behavior of ML models, which are treated as black boxes, making automated decisions based on inference unpredictable. In this context, there is an increasing need to verify the non-functional properties of ML models, such as fairness and privacy, with the aim of providing certified ML-based applications and services. In this paper, we propose a methodology based on Multi-Armed Bandit for evaluating non-functional properties of ML models. Our methodology adopts Thompson sampling, Monte Carlo simulation, and Value Remaining. An experimental evaluation in a real-world scenario demonstrates the applicability of our approach to evaluating the fairness of different ML models.
DOI: https://doi.org/10.1145/3415958.3433101 | Published: 2020-11-02
Citations: 6
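Of the three techniques named in the abstract above, Thompson sampling is the easiest to show in a few lines. The sketch below is a generic Beta-Bernoulli bandit, not the authors' methodology: each arm stands for a hypothetical candidate model, a "success" stands for one sampled check that the model satisfies the property under test (for instance a fairness check), and the hidden success rates are simulated. The Monte Carlo and Value Remaining steps are not included.

```python
import random

def thompson_sampling(success_rates, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling against simulated arms.

    success_rates holds the hidden per-arm probabilities used only to
    simulate outcomes; the sampler itself never reads them directly.
    """
    rng = random.Random(seed)
    n = len(success_rates)
    wins = [0] * n
    losses = [0] * n
    for _ in range(rounds):
        # Draw a plausible success rate for each arm from its Beta posterior.
        draws = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n)]
        arm = draws.index(max(draws))
        # Simulated outcome of evaluating the chosen arm once.
        if rng.random() < success_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses
```

Because arms are chosen in proportion to how likely they are to be the best, most evaluations are spent on the strongest candidate while weaker ones are still sampled occasionally. The posterior mean `(wins + 1) / (wins + losses + 2)` then gives an estimate of how often each model satisfies the property.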