
Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD): Latest Publications

DODAS: How to effectively exploit heterogeneous clouds for scientific computations
D. Spiga, M. Antonacci, T. Boccali, D. Ciangottini, A. Costantini, G. Donvito, C. Duma, M. Duranti, V. Formato, L. Gaido, D. Salomoni, M. Tracolli, D. Michelotto
Dynamic On Demand Analysis Service (DODAS) is a Platform as a Service tool built by combining several solutions and products developed by the INDIGO-DataCloud H2020 project. DODAS allows users to instantiate on-demand, container-based clusters. Both an HTCondor batch system and a platform for Big Data analysis based on Spark, Hadoop, etc. can be deployed on any cloud-based infrastructure with almost zero effort. DODAS acts as a cloud enabler designed for scientists seeking to easily exploit distributed and heterogeneous clouds to process data. Aiming to reduce the learning curve as well as the operational cost of managing community-specific services running on distributed clouds, DODAS completely automates the process of provisioning, creating, managing and accessing a pool of heterogeneous computing and storage resources. DODAS was selected as one of the Thematic Services that will provide multi-disciplinary solutions in the EOSC-hub project, an integration and management system for the European Open Science Cloud starting in January 2018. The main goals of this contribution are to provide a comprehensive overview of the overall technical implementation of DODAS and to illustrate two distinct real-world usage examples: the integration within the CMS Workload Management System and the extension of the AMS computing model.
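To make the idea of on-demand, container-based cluster provisioning more concrete, here is a minimal and purely hypothetical sketch: every class name, cloud name, command, and hostname in it is invented for illustration and does not reflect the actual DODAS or INDIGO-DataCloud interfaces, which rely on dedicated PaaS components. It only shows the general pattern of spreading the required number of HTCondor workers across several clouds and joining them to a pool.

```python
# Hypothetical sketch of an on-demand, container-based cluster scale-up loop
# in the spirit of DODAS; all names, commands and hosts are illustrative only.
from dataclasses import dataclass


@dataclass
class WorkerNode:
    cloud: str          # which cloud provider the instance lives on
    instance_id: str    # provider-specific identifier (made up here)
    condor_joined: bool = False


def provision_workers(cloud: str, n: int) -> list[WorkerNode]:
    """Pretend to start n container-ready instances on the given cloud."""
    return [WorkerNode(cloud, f"{cloud}-vm-{i}") for i in range(n)]


def join_htcondor_pool(node: WorkerNode, central_manager: str) -> None:
    """Pretend to launch an HTCondor execute container pointing at the pool."""
    print(f"docker run htcondor-worker --central-manager {central_manager} "
          f"on {node.instance_id}")
    node.condor_joined = True


def scale_up(pending_jobs: int, jobs_per_node: int, clouds: list[str]) -> list[WorkerNode]:
    """Spread the required number of workers across the available clouds."""
    needed = -(-pending_jobs // jobs_per_node)   # ceiling division
    nodes: list[WorkerNode] = []
    for i, cloud in enumerate(clouds):
        share = needed // len(clouds) + (1 if i < needed % len(clouds) else 0)
        nodes += provision_workers(cloud, share)
    for node in nodes:
        join_htcondor_pool(node, central_manager="cm.example.org")  # placeholder host
    return nodes


if __name__ == "__main__":
    workers = scale_up(pending_jobs=250, jobs_per_node=8, clouds=["cloud-a", "cloud-b"])
    print(f"{len(workers)} workers joined the pool")
```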
DOI: https://doi.org/10.22323/1.327.0024 (published 2018-12-12)
Citations: 8
Skill-based Occupation Recommendation System
A. Ochirbat, T. Shih
Many adolescents decide on their occupations, jobs, or majors without proper, professional advice from school services. For instance, adolescents do not have adequate information about occupations and jobs, which occupations can be reached through which majors, and what kind of education and training are needed for particular jobs. On the other hand, adolescents' choices of major are influenced by society and their families. They receive occupational information about common jobs from their environment, but they lack information about professional occupations. Furthermore, the choice of major has become increasingly complex due to the existence of multiple human skills, meaning that each person has abilities in certain areas that can be applied to multiple jobs or occupations. For these reasons, students need an automatic counselling system that reflects their values. To this end, an occupation recommendation system covering a variety of IT and soft skills is implemented. The main goal of this research is to build an occupation recommendation system (ORS) using data mining and natural language processing (NLP) methods on open educational resource (OER) and skill datasets, in order to help adolescents. The system can provide a variety of academic programs, related online courses (e.g., MOOCs), required skills, abilities, knowledge and job tasks, and currently announced jobs as well as relevant occupational descriptions. The system can assist adolescents in major selection and career planning. Furthermore, the system incorporates a set of search results recommended using similarity measures and hybrid recommendation techniques. These methods serve as a basis for recommending occupations that match the interests and competencies of adolescents.
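The similarity-and-hybridization step mentioned above can be pictured with a small, self-contained sketch (a hedged illustration, not the authors' implementation): occupations are represented as skill vectors, scored against a student's skills with cosine similarity, and blended with a simple popularity term as a toy hybrid recommender. All skill names, occupations, and weights below are made up.

```python
# Minimal sketch of skill-based occupation matching with a hybrid score.
# The skill vocabulary, occupations, and weights are invented for illustration.
import math

OCCUPATIONS = {
    "data analyst":     {"skills": {"python", "sql", "statistics"},      "popularity": 0.8},
    "web developer":    {"skills": {"javascript", "html", "css", "sql"}, "popularity": 0.9},
    "network engineer": {"skills": {"networking", "linux", "security"},  "popularity": 0.6},
}

VOCAB = sorted({s for occ in OCCUPATIONS.values() for s in occ["skills"]})


def to_vector(skills: set[str]) -> list[int]:
    """One-hot encode a skill set over the shared vocabulary."""
    return [1 if s in skills else 0 for s in VOCAB]


def cosine(a: list[int], b: list[int]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recommend(student_skills: set[str], weight_similarity: float = 0.7):
    """Rank occupations by a weighted mix of skill similarity and popularity."""
    sv = to_vector(student_skills)
    scored = []
    for name, occ in OCCUPATIONS.items():
        sim = cosine(sv, to_vector(occ["skills"]))
        score = weight_similarity * sim + (1 - weight_similarity) * occ["popularity"]
        scored.append((score, name))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, name in recommend({"python", "sql", "linux"}):
        print(f"{name}: {score:.2f}")
```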
DOI: https://doi.org/10.22323/1.327.0008 (published 2018-12-12)
Citations: 0
Optical Interconnects for Cloud Computing Data Centers: Recent Advances and Future Challenges
Muhammad Imran, Saqib Haleem
It is widely argued that optical communication and networking technologies will play a significant role in future data centers. Although optical technologies have made significant advancements over the last few years towards providing very high data transmission rates as well as increased flexibility and efficiency, additional effort is needed to investigate suitable architectures and technologies for optical networks within (intra) and between (inter) data centers. This paper presents a brief overview of optical networks for data centers. Furthermore, the paper provides a qualitative categorization of the proposed schemes based on the type of optical switches. Finally, future research directions and opportunities for optical interconnects in data centers are discussed.
DOI: https://doi.org/10.22323/1.327.0017 (published 2018-12-12)
Citations: 7
Studies on Job Queue Health and Problem Recovery
Xiaowei Jiang, Jiaheng Zou, Jingyan Shi, R. Du, Qingbao Hu, Zhenyu Sun, Hongnan Tan
In a batch system, the job queue is in charge of a set of jobs. Job health is the issue of greatest concern to users and administrators. A job's state can be queuing, running, completed, error, held, etc., and it reflects the job's health. Generally, jobs move from one state to another. However, if a job stays in one state for too long, there may be a problem, such as a worker node failure or network blocking. In a large-scale computing cluster, problems cannot be avoided, which means that a number of jobs will be blocked in one state and cannot be completed in the expected time, delaying the progress of the computing task. For that situation, this paper studies the causes of unhealthy job states, problem handling, and job queue stability. We aim to improve job health and thereby improve the job success rate and speed up users' task progress. The causes of unhealthy jobs can be found in job attributes, queue information and logs, which can be analyzed in detail to find better solutions. Depending on who performs the recovery, the solutions fall into two categories. In the first category, recovery is performed by the administrator: most problems are solved automatically through integration with the monitoring system, and when a problem is solved the corresponding job is rescheduled in time, without involving users. In the second category, users are automatically informed to handle unhealthy jobs themselves; based on the results of the analysis, helpful suggestions may be recommended to users for quick recovery. Based on the foregoing methods, a job queue health system has been designed and implemented at IHEP. We define a series of standards to pick out unhealthy jobs, and various factors relevant to unhealthy jobs are collected and analyzed together. Where unhealthy jobs can be recovered on the administrator side, automatic recovery functions recover them; where unhealthy jobs must be recovered on the user side, alarms are sent to users via email, WeChat, etc. The running status of the job queue health system indicates that it is able to improve job queue health in most situations.
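The detection-and-routing logic described above can be sketched in a few lines (a hypothetical illustration; the thresholds, states, and causes are invented and are not the IHEP standards): a job that has remained in one state past a threshold is flagged as unhealthy and then routed either to automatic rescheduling or to a user alert depending on the suspected cause.

```python
# Hypothetical sketch of unhealthy-job detection and recovery routing;
# thresholds, states, and causes are illustrative, not the IHEP standards.
import time
from dataclasses import dataclass

# Maximum time (seconds) a job is allowed to remain in each state.
STATE_THRESHOLDS = {"queuing": 6 * 3600, "running": 48 * 3600, "held": 1800}

# Causes that the administrator side can fix without involving the user.
ADMIN_RECOVERABLE = {"worker_node_failure", "network_blocking"}


@dataclass
class Job:
    job_id: str
    state: str
    state_since: float      # epoch seconds when the job entered this state
    suspected_cause: str    # filled in by analyzing attributes, queue info, logs


def is_unhealthy(job: Job, now: float) -> bool:
    limit = STATE_THRESHOLDS.get(job.state)
    return limit is not None and now - job.state_since > limit


def handle(job: Job) -> str:
    if job.suspected_cause in ADMIN_RECOVERABLE:
        return f"reschedule {job.job_id} automatically ({job.suspected_cause})"
    return f"alert user of {job.job_id} via email/WeChat ({job.suspected_cause})"


if __name__ == "__main__":
    now = time.time()
    jobs = [
        Job("job-001", "held", now - 7200, "worker_node_failure"),
        Job("job-002", "queuing", now - 3600, "user_input_missing"),
    ]
    for job in jobs:
        if is_unhealthy(job, now):
            print(handle(job))
```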
DOI: https://doi.org/10.22323/1.327.0018 (published 2018-12-12)
Citations: 0
Building a large scale Intrusion Detection System using Big Data technologies
Pablo Panero, L. Valsan, Vincent Brillault, Ioan Cristian Schuszter
Computer security threats have always been a major concern and continue to increase in frequency and complexity. The nature and techniques of the attacks evolve rapidly over time, making their detection more difficult. Therefore, the means and tools used to deal with them need to evolve at the same pace, if not faster. In this paper, the implementation of an Intrusion Detection System (IDS) at both the Network (NIDS) and Host (HIDS) level, used at CERN, is presented. The system currently processes approximately one TB of data per day in real time, with the final goal of coping with at least 5 TB per day. To accomplish this goal, an infrastructure to collect data from sources such as system logs, web server logs and the NIDS logs was first developed, making use of technologies such as Apache Flume and Apache Kafka. Once the data is collected, it needs to be processed in search of malicious activity: the data is consumed by Apache Spark jobs which compare it in real time with known signatures of malicious activity. These are known as Indicators of Compromise (IoC); they are published by many security experts and centralized in a local Malware Information Sharing Platform (MISP) instance. Nonetheless, detecting an intrusion is not enough: there is a need to understand what happened and why. To gain knowledge of the context of a detected intrusion, the data is also enriched in real time as it passes through the pipeline; for example, DNS resolution and IP geolocation are applied to it. A system generic enough to process any kind of data in JSON format enriches the data to obtain additional context about what is happening and finally looks for indicators of compromise to detect possible intrusions, making use of the latest technologies in the Big Data ecosystem.
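The matching-and-enrichment stage can be illustrated with a drastically simplified, self-contained sketch (hypothetical data, indicators, and helper functions only; the production pipeline relies on Kafka, Spark and MISP rather than the toy structures below): each JSON event is enriched with extra context and then checked against a set of indicators of compromise.

```python
# Toy sketch of IoC matching with simple enrichment on a stream of JSON events.
# The IoC list, events, and lookup table are made up; a real deployment would
# consume from Kafka, enrich via DNS/GeoIP services, and match in Spark.
import json

IOCS = {"malicious.example.net", "203.0.113.66"}            # hypothetical indicators

GEO_TABLE = {"203.0.113.66": "XX", "198.51.100.7": "CH"}    # fake GeoIP lookup


def enrich(event: dict) -> dict:
    """Attach extra context (here only a fake IP geolocation) to the event."""
    event["src_country"] = GEO_TABLE.get(event.get("src_ip", ""), "unknown")
    return event


def match_iocs(event: dict) -> bool:
    """Flag the event if any of its fields equals a known indicator."""
    return any(str(v) in IOCS for v in event.values())


def process(raw_events: list[str]) -> list[dict]:
    alerts = []
    for raw in raw_events:
        event = enrich(json.loads(raw))
        if match_iocs(event):
            alerts.append(event)
    return alerts


if __name__ == "__main__":
    stream = [
        '{"src_ip": "198.51.100.7", "dest_host": "www.example.org"}',
        '{"src_ip": "203.0.113.66", "dest_host": "malicious.example.net"}',
    ]
    for alert in process(stream):
        print("ALERT:", alert)
```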
DOI: https://doi.org/10.22323/1.327.0014 (published 2018-12-12)
Citations: 2
Harvesting dispersed computational resources with Openstack: a Cloud infrastructure for the Computational Science community
M. Mariotti, L. Storchi, D. Spiga, G. Vitillaro, M. Tracolli, D. Ciangottini, Manuel Ciangottini, V. Formato, M. Duranti, M. Mergé, P. D'Angeli, R. Primavera, Antonio Guerra, L. Fanò, B. Bertucci
Harvesting dispersed computational resources is nowadays an important and strategic topic, especially in an environment such as computational science, where computing needs constantly increase. On the other hand, managing dispersed resources may be neither an easy task nor cost-effective. We successfully explored the use of OpenStack middleware to achieve this objective; our main goal is not only resource harvesting but also to provide a modern paradigm for computing and data-usage access. In the present work we illustrate a real example of how to build a geographically distributed cloud to share and manage computing and storage resources owned by heterogeneous cooperating entities.
DOI: https://doi.org/10.22323/1.327.0009 (published 2018-12-12)
Citations: 0
Construction of real-time monitoring system for Grid services based on log analysis at the Tokyo Tier-2 center
T. Kishimoto, T. Mashimo, N. Matsui, Tomoaki Nakamura, H. Sakamoto
The Tokyo Tier-2 center, which is located in the International Center for Elementary Particle Physics at the University of Tokyo, provides computing resources for the ATLAS experiment in the Worldwide LHC Computing Grid. Logs produced by the Grid services provide useful information for determining whether the services are working properly. Therefore, a new real-time monitoring system based on log analysis has been constructed using the ELK stack. This paper reports the configuration of the new monitoring system at the Tokyo Tier-2 center and discusses the improvements in stability and flexibility of site operation brought by introducing the new monitoring system.
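As a hedged illustration of the underlying log-analysis step (the log format, fields, and example line below are invented, not the Tokyo Tier-2 configuration), a Grid service log line can be parsed into a structured document that an ELK pipeline would then index in Elasticsearch for real-time dashboards and alerting.

```python
# Toy sketch: turn a Grid-service log line into a structured document suitable
# for indexing in Elasticsearch; the log format and fields are hypothetical.
import json
import re

# e.g. "2018-03-20 14:02:11 ERROR gridftp Transfer to se01 failed: timeout"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>\w+) (?P<service>\w+) (?P<message>.*)"
)


def parse_line(line: str):
    """Return a dict of structured fields, or None if the line does not match."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    line = "2018-03-20 14:02:11 ERROR gridftp Transfer to se01 failed: timeout"
    doc = parse_line(line)
    if doc:
        # In a real pipeline this document would be shipped to Elasticsearch
        # (e.g. via Filebeat/Logstash) instead of being printed.
        print(json.dumps(doc, indent=2))
```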
DOI: https://doi.org/10.22323/1.327.0019 (published 2018-12-12)
Citations: 0
Progress on Machine and Deep Learning applications in CMS Computing
D. Bonacorsi, V. Kuznetsov, L. Giommi, T. Diotalevi, J. Vlimant, D. Abercrombie, C. Contreras, A. Repecka, Ž. Matonis, K. Kančys
Machine and Deep Learning techniques are being used in various areas of CMS operations at the LHC collider, such as data taking, monitoring, processing and physics analysis. A review of a few selected use cases, with a focus on CMS software and computing, shows the progress in the field, highlights the most recent developments, and gives an outlook on future applications in LHC Run III and towards the High-Luminosity LHC phase.
DOI: https://doi.org/10.22323/1.327.0022 (published 2018-12-12)
Citations: 1
Harnessing the Power of Threat Intelligence in Grids and Clouds: WLCG SOC Working Group
D. Crooks, L. Valsan, Kashif Mohammad, M. Cărăbaş, S. McKee, J. Trinder
The modern security landscape affecting Grid and Cloud sites is evolving to include possible threats from a range of avenues, including social engineering as well as more direct approaches. An effective strategy to defend against these risks must include cooperation between security teams in different contexts. It is essential that sites have the ability to share threat intelligence data with confidence, as well as being able to act on this data in a timely and effective manner. As reported at ISGC 2017, the Worldwide LHC Computing Grid (WLCG) Security Operations Centres Working Group (WG) has been working with sites across the WLCG to develop a model for a Security Operations Centre reference design. This work includes not only the technical aspect of developing a security stack appropriate for sites of different sizes and topologies, but also the more social aspect of sharing data between groups of different kinds. In particular, since many Grid and Cloud sites operate as part of larger university or other facility networks, collaboration between Grid and campus/facility security teams is an important aspect of maintaining overall security. We discuss recent work on sharing threat intelligence, particularly involving the WLCG MISP instance hosted at CERN. In addition, we examine strategies for the use of this intelligence and consider recent progress in the deployment and integration of the Bro Intrusion Detection System (IDS) at contributing sites. An important part of this work is a report on the first WLCG SOC WG Workshop/Hackathon, planned at the time of writing for December 2017. This workshop provides an opportunity to assist participating sites in the deployment of these security tools and gives attendees the opportunity to share experiences and consider site policies as a result. The workshop is expected to play a substantial role in shaping the future goals of the working group, as well as future workshops.
DOI: https://doi.org/10.22323/1.327.0012 (published 2018-12-12)
Citations: 3
Explore the massive Volunteer Computing resources for HEP computation
Wenjing Wu, D. Cameron
It has been over a decade since the HEP community first started to explore the possibility of using the massively available Volunteer Computing resource for its computation. The first project, LHC@home, only tried to run a platform-portable FORTRAN program for the SixTrack application in the traditional BOINC way. With the development and advancement of a few key technologies, such as virtualization and the BOINC middleware commonly used to harness volunteer computers, it not only became possible to run the heavily platform-dependent HEP software on heterogeneous volunteer computers, but the utilization also yielded very good performance. With these technology advancements and the potential of harvesting a large amount of free computing resources to fill the gap between increasing computing requirements and flat available resources, more and more HEP experiments endeavor to integrate the Volunteer Computing resource into the Grid Computing systems on which their workflows were designed. Resource integration and credentials are the two common challenges in this endeavor. To address them, each experiment has come up with its own solution; some are lightweight and were put into production very quickly, while others require heavier adaptation and implementation of gateway services due to the complexity of their Grid Computing platforms and workflow design. Among all these efforts, the ATLAS experiment is the most successful example, harnessing several tens of millions of CPU hours each year from its Volunteer Computing project ATLAS@home. In this paper, we look back at the key phases of exploring Volunteer Computing in HEP, and compare and discuss the different solutions that experiments have come up with to harness and integrate the Volunteer Computing resource. Finally, based on production experience and successful outcomes, we envision the future challenges in sustaining, expanding and more efficiently utilizing the Volunteer Computing resource. Furthermore, we envision common efforts being put together to address all these current and future challenges and to achieve full exploitation of the Volunteer Computing resource for the whole HEP computing community.
DOI: https://doi.org/10.22323/1.327.0027 (published 2018-12-12)
Citations: 0