
Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD): Latest Publications

What Goes Up, Must Go Down: A Case Study From RAL on Shrinking an Existing Storage Service
R. Appleyard, G. Patargias
Much attention is paid to the process of deploying new storage services into production and the challenges therein. Far less is paid to what happens when a storage service is approaching the end of its useful life. The challenges of rationalising and de-scoping a service that, while relatively old, is still critical to production work for both the UK WLCG Tier-1 and local facilities are not to be underestimated. RAL has been running a disk and tape storage service based on CASTOR (CERN Advanced STORage) for over 10 years. CASTOR must cope with both the throughput requirements of supplying data to a large batch farm and the data integrity requirements of a long-term tape archive. A new storage service, called ‘Echo’, is now being deployed to replace the disk-only element of CASTOR, but we intend to continue supporting the CASTOR system for tape into the medium term. This, in turn, implies a downsizing and redesign of the CASTOR service in order to improve manageability and cost effectiveness. We give an outline of both Echo and CASTOR as background. This paper discusses the project to downsize CASTOR and improve its manageability when running both at a considerably smaller scale (we intend to go from around 140 storage nodes to around 20) and with considerably less available staff effort. This transformation must be achieved while, at the same time, running the service in 24/7 production and supporting the transition to the newer storage element. To achieve this goal, we intend to move to a virtualised infrastructure to underpin the remaining management nodes, to improve resilience by allowing management functions to be performed by many different nodes concurrently (‘cattle’ as opposed to ‘pets’), and to streamline the system by condensing the existing four CASTOR ‘stagers’ (databases that record the state of the disk pools) into a single one that supports all users.
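A purely illustrative aside on the stager consolidation described above: the short Python sketch below shows one way requests from several user communities could be routed to a single shared stager endpoint instead of four per-experiment ones. The VO names and connection strings are invented placeholders, not details taken from the paper.

# Hypothetical sketch: route every user community to one consolidated
# CASTOR stager endpoint instead of a per-experiment instance.
# The VO names and DSN strings below are illustrative placeholders only.

LEGACY_STAGERS = {
    "atlas": "db://stager-atlas/castor",
    "cms": "db://stager-cms/castor",
    "lhcb": "db://stager-lhcb/castor",
    "gen": "db://stager-gen/castor",
}

CONSOLIDATED_STAGER = "db://stager-shared/castor"


def stager_for(vo: str, consolidated: bool = True) -> str:
    """Return the stager database endpoint that a VO's requests should use."""
    if consolidated:
        # After the downsizing, every VO shares a single stager database.
        return CONSOLIDATED_STAGER
    # Pre-consolidation behaviour: one stager per experiment.
    return LEGACY_STAGERS[vo]


if __name__ == "__main__":
    for vo in LEGACY_STAGERS:
        print(f"{vo:5s} -> {stager_for(vo)}")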
{"title":"What Goes Up, Must Go Down: A Case Study From RAL on Shrinking an Existing Storage Service","authors":"R. Appleyard, G. Patargias","doi":"10.22323/1.327.0026","DOIUrl":"https://doi.org/10.22323/1.327.0026","url":null,"abstract":"Much attention is paid to the process of how new storage services are deployed into production that the challenges therein. Far less is paid to what happens when a storage service is approaching the end of its useful life. The challenges in rationalising and de-scoping a service that, while relatively old, is still critical to production work for both the UK WLCG Tier 1 and local facilities are not to be underestimated. \u0000 \u0000RAL has been running a disk and tape storage service based on CASTOR (Cern Advanced STORage) for over 10 years. CASTOR must cope with both the throughput requirements of supplying data to a large batch farm and the data integrity requirements needed by a long-term tape archive. A new storage service, called ‘Echo’ is now being deployed to replace the disk-only element of CASTOR, but we intend to continue supporting the CASTOR system for tape into the medium term. This, in turn, implies a downsizing and redesign of the CASTOR service in order to improve manageability and cost effectiveness. We will give an outline of both Echo and CASTOR as background. \u0000 \u0000This paper will discuss the project to downsize CASTOR and improve its manageability when running both at a considerably smaller scale (we intend to go from around 140 storage nodes to around 20), and with a considerably lower amount of available staff effort. This transformation must be achieved while, at the same time, running the service in 24/7 production and supporting the transition to the newer storage element. To achieve this goal, we intend to transition to a virtualised infrastructure to underpin the remaining management nodes and improve resilience by allowing management functions to be performed by many different nodes concurrently (‘cattle’ as opposed to ‘pets’), and also intend to streamline the system by condensing the existing 4 CASTOR ‘stagers’ (databases that record the state of the disk pools) into a single one that supports all users.","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"139 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121310896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unified Account Management for High Performance Computing as a Service with Microservice Architecture
Rongqiang Cao, Shasha Lu, Xiaoning Wang, Haili Xiao, Xue-bin Chi
In recent years, High Performance Computing (HPC) has developed rapidly in China. At the Chinese Academy of Sciences (CAS) level, the Scientific Computing Grid (ScGrid), started in 2006, is a general-purpose computing platform that provides a problem-solving environment for computing users through grid and cloud computing technologies; in 2011 ScGrid became Supercomputing Cloud, an important part of the Chinese Science Cloud. At the national level, China National Grid (CNGrid) has integrated massive HPC resources from several national supercomputing centers and other large, geographically distributed centers, and has been providing efficient computing services for users in diverse disciplines and research areas. Over more than 10 years, CNGrid and ScGrid have integrated tens of geographically distributed HPC resources across China, comprising the six National Supercomputer Centers of Tianjin, Jinan, Changsha, Shenzhen, Guangzhou and Wuxi, as well as dozens of teraflops-scale HPC resources belonging to universities and institutes. In total, the computing capability in CNGrid is more than 200 PF and the storage capacity is more than 160 PB. Having worked in the operation and management center of CNGrid and ScGrid for many years, we have noticed that users prefer to manage their jobs on different supercomputers and clusters through a global account from different remote clients such as science gateways, desktop applications and even scripts, and that they do not like applying for a separate account on each supercomputer and logging in to each machine in its own specific way. We therefore describe Unified Account Management as a Service (UAMS), which lets each user access and use all HPC resources through a single global account. We addressed and solved the challenges of mapping a global account to many local accounts, and provided unified account registration, management and authentication for different collaborative web gateways, command-line toolkits and other desktop applications. UAMS was designed in accordance with the core principles of simplicity, compatibility and reusability. In the architecture design, we focused on a loosely coupled style to obtain good scalability and to allow internal modules to be updated transparently. In the implementation, we applied widely accepted practice in defining the RESTful API and divided it into several isolated microservices according to usage and scenario. For security, all sensitive data transferred over the wide-area network is protected by HTTPS with transport-layer security outside CNGrid and by secure communication channels provided by OpenSSH inside CNGrid; in addition, all parameters submitted to the RESTful web services are strictly checked for format and variable type. By providing these frequently important but always challenging capabilities as a service, UAMS allows users to use tens of HPC resources and clients with only one account, and makes it easy for developers to implement HPC-related clients and services with the benefit of a large user base and single sign-on. On this basis, representative clients combining different authentication schemes are introduced. Finally, UAMS was analysed and tested, and the results show that it supports millisecond-level authentication and scales well. In the future, we plan to implement a federated account service so that local HPC accounts can log in to the national HPC environment in the same way as global accounts and access all HPC resources in CNGrid.
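The abstract describes mapping a global account onto many local accounts and strictly validating request parameters. The sketch below illustrates that general idea in Python under stated assumptions: the class, field names and username policy are hypothetical and are not taken from the UAMS implementation.

# Illustrative sketch (not the UAMS implementation): map one global
# account to per-resource local accounts, with strict parameter checking
# of the kind the abstract describes. All names and fields are invented.
import re
from dataclasses import dataclass, field

USERNAME_RE = re.compile(r"^[a-z][a-z0-9_]{2,31}$")


@dataclass
class GlobalAccount:
    username: str
    # resource name -> local account name on that supercomputer
    local_accounts: dict[str, str] = field(default_factory=dict)


def validate_username(name: str) -> str:
    """Reject requests whose parameters do not match the expected format."""
    if not USERNAME_RE.match(name):
        raise ValueError(f"invalid username: {name!r}")
    return name


def map_to_local(account: GlobalAccount, resource: str) -> str:
    """Resolve the local account to use for a job on a given HPC resource."""
    try:
        return account.local_accounts[resource]
    except KeyError:
        raise LookupError(f"{account.username} has no mapping on {resource}")


if __name__ == "__main__":
    alice = GlobalAccount(
        username=validate_username("alice"),
        local_accounts={"era": "cn_alice01", "tianhe2": "u_alice"},
    )
    print(map_to_local(alice, "era"))       # -> cn_alice01
    print(map_to_local(alice, "tianhe2"))   # -> u_alice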
{"title":"Unified Account Management for High Performance Computing as a Service with Microservice Architecture","authors":"Rongqiang Cao, Shasha Lu, Xiaoning Wang, Haili Xiao, Xue-bin Chi","doi":"10.22323/1.327.0020","DOIUrl":"https://doi.org/10.22323/1.327.0020","url":null,"abstract":"In recent years, High Performance Computing (HPC) has developed rapidly in China. From Chinese Academy of Sciences (CAS) level, Scientific Computing Grid (ScGrid), is a general-purpose computing platform started from 2006 in CAS, which provided a problem solving environment for computing users through grid computing and cloud computing technologies. Then ScGrid becomes Supercomputing Cloud, an important port of Chinese Science Cloud from 2011. From national level, China National Grid (CNGrid) has integrated massive HPC resources from several national supercomputing centers and other large centers distributed geographically, and been providing efficient computing services for users in diverse disciplines and research areas. During more than 10 years, CNGrid and ScGrid has integrated tens of HPC resources distributed geographically across China, comprising 6 National Supercomputer Centers of Tianjin, Jinan, Changsha, and Shenzhen, Guangzhou, Wuxi, and also dozens of teraflops-scale HPC resources belong to universities and institutes. In total, the computing capability is more than 200PF and the storage capacity is more than 160PB in CNGrid. \u0000 As worked in the operation and management center of CNGrid and ScGrid for many years, we notice that users prefer to manage their jobs at different supercomputers and clusters via a global account on different remote clients such as science gateways, desktop applications and even scripts. And they don’ t like to apply for an account to each supercomputer and login into the supercomputer in specific way. \u0000 Therefore, we described Unified Account Management as a Service (UAMS) to access and use all HPC resources via a global account for each user in this paper. We addressed and solved challenges for mapping a global account to many local accounts, and provided unified account registration, management and authentication for different collaborative web gateways, command toolkits and other desktop applications. UAMS was designed in accordance with the core rules of simplicity, compatibility and reusability. In architecture design, we focused on loosely-coupled style to acquire good scalability and update internal modules transparently. In implementation, we applied widely accepted knowledge for the definitions of the RESTful API and divided them into several isolated microservices according to their usages and scenarios. For security, all sensitive data transferred in wide-network is protected by HTTPS with transport layer security outside of CNGrid and secure communication channels provided by OpenSSH inside of CNGrid. In addition, all parameters submitted to RESTful web services are strictly checked in format and variable type. 
\u0000 By providing these frequently important but always challenging capabilities as a service, UAMS allows users to use tens of HPC resources and clients via only an account, and makes it easy for developers to implement clients and services related HPC with advantages of numerous users and s","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122163747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Extending WLCG Tier-2 Resources using HPC and Cloud Solutions
J. Chudoba, M. Svatos
Available computing resources limit the data simulation and processing of the LHC experiments. WLCG Tier centers connected via the Grid provide the majority of computing and storage capacities, which allow relatively fast and precise analyses of data. Requirements on the number of simulated events must often be reduced to fit the installed capacities. Projections of the requirements for future LHC runs show a significant shortage of standard Grid resources if a flat budget is assumed. Several activities are exploring other sources of computing power for LHC projects. The most significant are big HPC centers (supercomputers) and Cloud resources provided by both commercial and academic institutions. The Tier-2 center hosted by the Institute of Physics (FZU) in Prague provides resources for the ALICE and ATLAS collaborations on behalf of all involved Czech institutions. The financial resources provided by funding agencies and the resources provided by the IoP do not allow us to buy enough servers to meet the demands of the experiments. We extend our storage resources with two distant sites funded from additional sources. Xrootd servers at the Institute of Nuclear Physics in Rez near Prague store files for the ALICE experiment. The CESNET data storage group operates a dCache instance with a tape backend for the ATLAS (and Pierre Auger Observatory) collaboration. Relatively large computing capacities can be used at the national supercomputing center IT4I in Ostrava. Within the ATLAS collaboration, we explore two different solutions to overcome technical problems arising from the different computing environment on the supercomputer. The main difference is that individual worker nodes have no external network connection and cannot directly download input or upload output data. One solution is already used for HPC centers in the USA, but until now it has required significant adjustments to the procedures used for standard ATLAS production. The other solution is based on an ARC CE hosted by the Tier-2 center at the IoP and resubmission of jobs remotely via ssh.
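To illustrate the "resubmission of jobs remotely via ssh" idea mentioned above, here is a minimal sketch that stages input to an HPC login node, submits a batch script there and fetches the output afterwards. The hostnames, paths and the choice of sbatch as the scheduler command are assumptions for illustration only, not details of the IoP setup.

# Minimal sketch of submitting over ssh: stage input to the HPC login
# node, submit a batch script there, and fetch the output later.
# Host, paths and the sbatch command are illustrative assumptions.
import subprocess

LOGIN_NODE = "user@login.hpc.example.org"   # hypothetical login host
REMOTE_DIR = "/scratch/user/job42"          # hypothetical work directory


def run(cmd: list[str]) -> str:
    """Run a local command and return its stdout, raising on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def stage_in(local_input: str) -> None:
    run(["ssh", LOGIN_NODE, f"mkdir -p {REMOTE_DIR}"])
    run(["scp", local_input, f"{LOGIN_NODE}:{REMOTE_DIR}/"])


def submit(batch_script: str) -> str:
    """Submit a job script that is already present on the remote side."""
    out = run(["ssh", LOGIN_NODE, f"cd {REMOTE_DIR} && sbatch {batch_script}"])
    return out.strip()          # e.g. "Submitted batch job 12345"


def stage_out(remote_output: str, local_dest: str) -> None:
    run(["scp", f"{LOGIN_NODE}:{REMOTE_DIR}/{remote_output}", local_dest])


if __name__ == "__main__":
    stage_in("input.tar.gz")
    print(submit("run_sim.sh"))
    # ...poll the scheduler, then:
    stage_out("output.tar.gz", ".")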
{"title":"Extending WLCG Tier-2 Resources using HPC and Cloud Solutions","authors":"J. Chudoba, M. Svatos","doi":"10.22323/1.327.0025","DOIUrl":"https://doi.org/10.22323/1.327.0025","url":null,"abstract":"Available computing resources limit data simulation and processing of LHC experiments. WLCG Tier centers connected via Grid provide majority of computing and storage capacities, which allow relatively fast and precise analyses of data. Requirements on the number of simulated events must be often reduced to meet installed capacities. Projection of requirements for future LHC runs shows a significant shortage of standard Grid resources if a flat budget is assumed. There are several activities exploring other sources of computing power for LHC projects. The most significant are big HPC centers (supercomputers) and Cloud resources provided both by commercial and academic institutions. The Tier-2 center hosted by the Institute of Physics (FZU) in Prague provides resources for ALICE and ATLAS collaborations on behalf of all involved Czech institutions. Financial resources provided by funding agencies and resources provided by IoP do not allow to buy enough servers to meet demands of experiments. We extend storage resources by two distant sites with additional finance sources. Xrootd servers in the Institute of Nuclear Physics in Rez near Prague store files for the ALICE experiment. CESNET data storage group operates dCache instance with a tape backend for ATLAS (and Pierre Auger Observatory) collaboration. Relatively big computing capacities could be used in the national supercomputing center IT4I in Ostrava. Within the ATLAS collaboration, we explore two different solutions to overcome technical problems arising from different computing environment on the supercomputer. The main difference is that individual worker nodes do not have an external network connection and cannot directly download input and upload output data. One solution is already used for HPC centers in the USA, but until now requires significant adjustments of procedures used for standard ATLAS production. Another solution is based on ARC CE hosted by the Tier-2 center at IoP and resubmission of jobs remotely via ssh.","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130283316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Explore New Computing Environment for LHAASO Offline Data Analysis
Qiulan Huang, Gongxing Sun, Qiao Yin, Zhanchen Wei, Qiang Li
This paper explores a way to build a new computing environment based on Hadoop that lets Large High Altitude Air Shower Observatory (LHAASO) jobs run on it transparently. In particular, we discuss a new mechanism that allows LHAASO software to access data in HDFS randomly. This new feature allows Map/Reduce tasks to perform random reads and writes on the local file system instead of using the Hadoop data-streaming interface, which makes running HEP jobs on Hadoop possible. We also develop MapReduce patterns for LHAASO jobs such as Corsika simulation, ARGO detector simulation (Geant4), KM2A simulation and Medea++ reconstruction, and provide a user-friendly interface. In addition, we provide real-time cluster monitoring covering cluster health and the numbers of running, finished and killed jobs, and an accounting system is included. This work has been in production for LHAASO offline data analysis since September 2016, delivering about 20,000 CPU hours per month. The results show that the efficiency of IO-intensive jobs can be improved by about 46%. Finally, we describe our ongoing work on a data migration tool to serve data movement between HDFS and other storage systems.
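The mechanism described above lets tasks work on a node-local copy of the data rather than a Hadoop stream. The following sketch conveys the gist by copying an HDFS file to local disk with the standard hadoop command-line client and then seeking into it; the file path and fixed record layout are illustrative assumptions and do not reproduce the paper's actual Map/Reduce integration.

# Illustrative sketch only: fetch a file from HDFS to node-local disk so
# legacy code can seek() into it, rather than reading a Hadoop stream.
# The HDFS path and record layout below are invented for the example.
import os
import subprocess
import tempfile

HDFS_PATH = "/lhaaso/sim/corsika/run001.dat"    # hypothetical input file
RECORD_SIZE = 4096                              # hypothetical fixed record size


def fetch_to_local(hdfs_path: str) -> str:
    """Copy an HDFS file to a local temporary directory and return its path."""
    local_dir = tempfile.mkdtemp(prefix="lhaaso_")
    local_path = os.path.join(local_dir, os.path.basename(hdfs_path))
    subprocess.run(["hadoop", "fs", "-copyToLocal", hdfs_path, local_path],
                   check=True)
    return local_path


def read_record(local_path: str, index: int) -> bytes:
    """Randomly access one fixed-size record by seeking in the local copy."""
    with open(local_path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE)


if __name__ == "__main__":
    path = fetch_to_local(HDFS_PATH)
    print(len(read_record(path, 10)))   # bytes in the 11th record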
{"title":"Explore New Computing Environment for LHAASO Offline Data Analysis","authors":"Qiulan Huang, Gongxing Sun, Qiao Yin, Zhanchen Wei, Qiang Li","doi":"10.22323/1.327.0021","DOIUrl":"https://doi.org/10.22323/1.327.0021","url":null,"abstract":"This paper explores a way to build a new computing environment based on Hadoop to make the Large High Altitude Air Shower Observatory(LHAASO) jobs run on it transparently. Particularly, we discuss a new mechanism to support LHAASO software to random access data in HDFS. This new feature allows the Map/Reduce tasks to random read/write data on the local file system instead of using Hadoop data streaming interface. This makes HEP jobs run on Hadoop possible. We also develop MapReduce patterns for LHAASO jobs such as Corsika simulation, ARGO detector simulation (Geant4), KM2A simulation and Medea++ reconstruction. And user-friendly interface is provided. In addition, we provide the real-time cluster monitoring in terms of cluster healthy, number of running jobs, finished jobs and killed jobs. Also the accounting system is included. This work has been in production for LHAASO offline data analysis to gain about 20,000 CPU hours per month since September, 2016. The results show the efficiency of IO intensive job can be improved about 46%. Finally, we describe our ongoing work of data migration tool to serve the data move between HDFS and other storage systems.","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127145039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Smart Policy Driven Data Management and Data Federations
P. Fuhrmann, M. Antonacci, G. Donvito, O. Keeble, P. Millar
The core activity within the newly created H2020 eXtreme DataCloud project will be the policy-driven orchestration of federated data management for data-intensive sciences such as High Energy Physics, Astronomy, Photon and Life Science. Well-known experts in this field will work on combining already established data management and orchestration tools to provide a highly scalable solution supporting the entire European scientific landscape. The work will cover "Data Life Cycle Management" as well as smart data placement based on metadata, including storage availability, network bandwidth and data access patterns. Mechanisms will be put in place to trigger computational resources based on data ingestion and data movements. This paper presents the first architecture of this endeavor.
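As a toy illustration of policy-driven data placement based on metadata such as storage availability, network bandwidth and access patterns, the sketch below scores candidate endpoints and picks where to put replicas. The fields, weights and site names are invented and do not represent the eXtreme DataCloud architecture.

# Toy sketch in the spirit of the abstract: score candidate storage
# endpoints on availability, bandwidth and recent access pattern, then
# pick the best ones. All fields, weights and names are invented.
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    free_tb: float         # available storage
    bandwidth_gbps: float  # network bandwidth to the data source
    recent_reads: int      # how often this dataset was read from here


def score(ep: Endpoint) -> float:
    """Higher is better: weight free space, bandwidth and data 'hotness'."""
    return 0.5 * ep.free_tb + 2.0 * ep.bandwidth_gbps + 0.1 * ep.recent_reads


def choose_placement(endpoints: list[Endpoint], copies: int = 2) -> list[str]:
    """Select the best-scoring endpoints to hold replicas of a dataset."""
    ranked = sorted(endpoints, key=score, reverse=True)
    return [ep.name for ep in ranked[:copies]]


if __name__ == "__main__":
    sites = [
        Endpoint("site-a-dcache", free_tb=800, bandwidth_gbps=40, recent_reads=120),
        Endpoint("site-b-storm", free_tb=300, bandwidth_gbps=100, recent_reads=15),
        Endpoint("site-c-eos", free_tb=1500, bandwidth_gbps=10, recent_reads=4),
    ]
    print(choose_placement(sites))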
{"title":"Smart Policy Driven Data Management and Data Federations","authors":"P. Fuhrmann, M. Antonacci, G. Donvito, O. Keeble, P. Millar","doi":"10.22323/1.327.0001","DOIUrl":"https://doi.org/10.22323/1.327.0001","url":null,"abstract":"The core activity within the newly created H2020 project eXtreme DataCloud project will be the policy-driven orchestration of federated data management for data intensive sciences like High Energy Physics, Astronomy, Photon and Life Science. Well-known experts in this field will work on combining already established data management and orchestration tools to provide a highly scalable solution supporting the entire European Scientific Landscape. The work will cover \"Data Life Cycle Management\" as well as smart data placement on meta data, including storage availability, network bandwidth and data access patterns. Mechanisms will be put in place to trigger computational resources based on data ingestion and data movements. This paper presents the first architecture of this endeavor.","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130474818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
WLCG Tier-2 site at NCP, Status Update and Future Direction
Saqib Haleem, Fawad Saeed, A. Rehman, Muhammad Imran
The National Centre for Physics (NCP) in Pakistan maintains a computing infrastructure for the scientific community. A major portion of the computing and storage resources is reserved for the CMS experiment through the WLCG infrastructure, and a small portion of the computing resources is reserved for scientific experiments outside experimental high-energy physics (EHEP). For efficient utilization of resources, many scientific organizations have migrated their resources into infrastructure-as-a-service (IaaS) facilities. The NCP took such an initiative last year and has migrated most of its resources into an IaaS facility. An HTCondor-based batch system has been deployed to allow the local experimental high-energy physics community to perform their analysis tasks. Recently we deployed an HTCondor compute element (CE) as a gateway for CMS jobs. On the network side, our Tier-2 site is fully accessible and operational over IPv6. Moreover, we recently deployed a perfSONAR node to actively monitor throughput and latency issues between NCP and other WLCG sites. This paper discusses the status of the NCP Tier-2 site, its current challenges and future directions.
{"title":"WLCG Tier-2 site at NCP, Status Update and Future Direction","authors":"Saqib Haleem, Fawad Saeed, A. Rehman, Muhammad Imran","doi":"10.22323/1.327.0015","DOIUrl":"https://doi.org/10.22323/1.327.0015","url":null,"abstract":"The National Centre for Physics (NCP) in Pakistan maintains a computing infrastructure for the scientific community. A major portion of the computing and storage resources are reserved for the CMS experiment through the WLCG infrastructure, and a small portion of the computing resources are reserved for other non experimental high-energy physics (EHEP) scientific experiments. For efficient utilization of resources, many scientific organizations have migrated their resources into infrastructure-as-a-service (IaaS) facilities. The NCP has also taken such an initiative last year, and has migrated most of their resources into an IaaS facility. An HT-condor based batch system has been deployed for the local experimental high energy physics community to allow them to perform their analysis task. Recently we deployed an HT-Condor compute element (CE) as a gateway for the CMS jobs. On the network side, our Tier-2 site is completely accessible and operational on IPv6. Moreover, we recently deployed a Perfsonar node to actively monitor the throughput and latency issues between NCP and other WLCG sites. This paper discusses the status of NCP Tier-2 site, its current challenges and future directions.","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126501719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Authorship recognition and disambiguation of scientific papers using a neural networks approach
S. Schifano, Tommaso Sgarbanti, L. Tomassetti
Authorship recognition and author name disambiguation are major issues affecting the quality and reliability of bibliographic records retrieved from digital libraries such as Web of Science, Scopus, Google Scholar and many others. So far, these problems have been addressed using methods mainly based on text pattern recognition for specific datasets, with a high degree of error. In this paper, we propose a different approach that uses neural networks to learn features automatically for authorship recognition and the disambiguation of author names. The network learns for each author the set of co-writers, and from this information recovers the authorship of papers. In addition, the network can be trained taking into account other features, such as author affiliations, keywords, projects and research areas. The network has been developed using the TensorFlow framework and runs on recent Nvidia GPUs and multi-core Intel CPUs. Test datasets have been selected from records of the Scopus digital library for several groups of authors working in the fields of computer science, environmental science and physics. The proposed method achieves accuracies above 99% in authorship recognition and is able to effectively disambiguate homonyms. We have taken into account several network parameters, such as training-set and batch size, number of layers and hidden units, weight initialization and back-propagation algorithms, and analyzed their impact on the accuracy of the results. This approach can be easily extended to any dataset and any provider of bibliographic records.
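A minimal sketch of the general approach, assuming a multi-hot co-author vector as input and the author identity as the output class. The layer sizes, optimizer and random stand-in data are placeholders and do not reproduce the paper's actual network or features.

# Minimal Keras sketch: multi-hot co-author vector in, author class out.
# Vocabulary sizes, layers and training data are illustrative assumptions.
import numpy as np
import tensorflow as tf

NUM_COAUTHORS = 5000   # size of the co-author vocabulary (assumed)
NUM_AUTHORS = 300      # number of authors to disambiguate (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_COAUTHORS,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_AUTHORS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data: each row marks which co-authors appear on a paper.
x = (np.random.rand(1024, NUM_COAUTHORS) < 0.001).astype("float32")
y = np.random.randint(0, NUM_AUTHORS, size=(1024,))

model.fit(x, y, batch_size=64, epochs=2, validation_split=0.1)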
{"title":"Authorship recognition and disambiguation of scientific papers using a neural networks approach","authors":"S. Schifano, Tommaso Sgarbanti, L. Tomassetti","doi":"10.22323/1.327.0007","DOIUrl":"https://doi.org/10.22323/1.327.0007","url":null,"abstract":"Authorship recognition and author names disambiguation are main issues affecting the quality and reliability of bibliographic records retrieved from digital libraries, such as Web of Science, Scopus, Google Scholar and many others. So far, these problems have been faced using methods mainly based on text-pattern-recognition for specific datasets, with high-level degree of errors. \u0000 \u0000In this paper, we propose a different approach using neural networks to learn features automatically for solving authorship recognition and disambiguation of author names. The network learns for each author the set of co-writers, and from this information recovers authorship of papers. In addition, the network can be trained taking into account other features, such as author affiliations, keywords, projects and research areas. \u0000 \u0000The network has been developed using the TensorFlow framework, and run on recent Nvidia GPUs and multi-core Intel CPUs. Test datasets have been selected from records of Scopus digital library, for several groups of authors working in the fields of computer science, environmental science and physics. The proposed methods achieves accuracies above 99% in authorship recognition, and is able to effectively disambiguate homonyms. \u0000 \u0000We have taken into account several network parameters, such as training-set and batch size, number of levels and hidden units, weights initialization, back-propagation algorithms, and analyzed also their impact on accuracy of results. This approach can be easily extended to any dataset and any bibliographic records provider.","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127079288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Study of Credential Integration Model in Academic Research Federation Supporting a Wide Variety of Services
E. Sakane, Takeshi Nishimura, K. Aida, Motonori Nakamura
This paper investigates the situation in which users must use a separate credential for each desired service, and clarifies the problems in that situation and the issues addressed by the concept of "identity federation". Japan has the GakuNin, an academic access management federation, and the HPCI, a distributed high-performance computing infrastructure. For the provision of HPCI resources, the HPCI cannot simply behave as a service provider in the GakuNin. Consequently, when performing academic research, HPCI users belonging to academic institutions in particular are compelled to manage both GakuNin and HPCI credentials. In this paper, based on the situation in Japan described above, we discuss a credential integration model that allows a wide variety of services to be used more efficiently. We first characterize services in an academic federation from the point of view of authorization and examine the problem that users must use a separate credential issued by each identity provider. We then discuss the issues involved in integrating a user's credentials and consider a model that solves them.
{"title":"A Study of Credential Integration Model in Academic Research Federation Supporting a Wide Variety of Services","authors":"E. Sakane, Takeshi Nishimura, K. Aida, Motonori Nakamura","doi":"10.22323/1.327.0016","DOIUrl":"https://doi.org/10.22323/1.327.0016","url":null,"abstract":"This paper investigates the situation where users must utilize each credential according to the desired services, and clarifies the problems in the situation and the issues addressed by the concept of ``identity federation''. Japan has the GakuNin, which is an academic access management federation, and the HPCI, which is a distributed high performance computing infrastructure. For the provision of the HPCI resources, the HPCI cannot simply behave as a service provider in the GakuNin. Consequently, in performing academic research, especially HPCI users belonging to academic institutions are compelled to manage both the GakuNin and the HPCI credentials. In this paper, based on the situation in Japan mentioned above, we discuss a credential integration model in order to more efficiently utilize a wide variety of services. We first characterize services in an academic federation from the point of view of authorization and investigate the problem that users must utilize each credential issued by different identity providers. Thus, we discuss the issues to integrate user's credentials, and consider a model that solves the issues.","PeriodicalId":135658,"journal":{"name":"Proceedings of International Symposium on Grids and Clouds 2018 in conjunction with Frontiers in Computational Drug Discovery — PoS(ISGC 2018 & FCDD)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123484089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1