ARIM-mdx Data System: Towards a Nationwide Data Platform for Materials Science

Masatoshi Hanai, Ryo Ishikawa, Mitsuaki Kawamura, Masato Ohnishi, Norio Takenaka, Kou Nakamura, Daiju Matsumura, Seiji Fujikawa, Hiroki Sakamoto, Yukinori Ochiai, Tetsuo Okane, Shin-Ichiro Kuroki, Atsuo Yamada, Toyotaro Suzumura, Junichiro Shiomi, Kenjiro Taura, Yoshio Mita, Naoya Shibata, Yuichi Ikuhara
arXiv - CS - Distributed, Parallel, and Cluster Computing, 2024-09-08. DOI: arxiv-2409.06734 (https://doi.org/arxiv-2409.06734)

Abstract

In modern materials science, effective high-volume data management across leading-edge experimental facilities and world-class supercomputers is indispensable for cutting-edge research. Such facilities and supercomputers are typically used by a wide range of researchers across different fields and organizations in academia and industry. However, existing integrated systems that handle data from these resources have focused primarily on smaller-scale cross-institutional or single-domain operations. As a result, they often lack the scalability, efficiency, agility, and interdisciplinarity needed to handle substantial volumes of data from varied researchers. In this paper, we introduce the ARIM-mdx data system, a nationwide data platform for materials science in Japan. The platform involves eight universities and institutes across Japan through the governmental materials science project. Currently in its trial phase, the ARIM-mdx data system is used by over 800 researchers from around 140 organizations in academia and industry, and it is intended to expand its reach gradually. The system employs a hybrid architecture, combining a peta-scale dedicated storage system for security and stability with a high-performance academic cloud for efficiency and scalability. Through direct network connections between the two, the system achieves a 4.7x latency reduction compared to a conventional approach, resulting in near real-time interactive data analysis. It also uses specialized IoT devices for secure data transfer from equipment computers and connects to multiple supercomputers via an academic ultra-fast network, achieving 4x faster data transfer than over the public Internet. As a pioneering nationwide data platform, the ARIM-mdx data system has the potential to contribute to the creation of new research communities and to accelerate innovation.