ARIM-mdx Data System: Towards a Nationwide Data Platform for Materials Science
Masatoshi Hanai, Ryo Ishikawa, Mitsuaki Kawamura, Masato Ohnishi, Norio Takenaka, Kou Nakamura, Daiju Matsumura, Seiji Fujikawa, Hiroki Sakamoto, Yukinori Ochiai, Tetsuo Okane, Shin-Ichiro Kuroki, Atsuo Yamada, Toyotaro Suzumura, Junichiro Shiomi, Kenjiro Taura, Yoshio Mita, Naoya Shibata, Yuichi Ikuhara
arXiv - CS - Distributed, Parallel, and Cluster Computing, 2024-09-08, doi: arxiv-2409.06734 (https://doi.org/arxiv-2409.06734)
In modern materials science, effective and high-volume data management across
leading-edge experimental facilities and world-class supercomputers is
indispensable for cutting-edge research. Such facilities and supercomputers are
typically utilized by a wide range of researchers across different fields and
organizations in academia and industry. However, existing integrated systems
that handle data from these resources have primarily focused on
smaller-scale cross-institutional or single-domain operations. As a result,
they often lack the scalability, efficiency, agility, and interdisciplinarity
needed for handling substantial volumes of data from various researchers.

In this paper, we introduce the ARIM-mdx data system, a nationwide data platform
for materials science in Japan. The platform involves 8 universities and
institutes all over Japan through the governmental materials science project.
Currently in its trial phase, the ARIM-mdx data system is used by over 800
researchers from around 140 organizations in academia and industry, and its
reach is intended to expand gradually. The system employs a hybrid
architecture, combining a peta-scale dedicated storage system for security and
stability with a high-performance academic cloud for efficiency and
scalability. Through direct network connections between them, the system
achieves a 4.7x latency reduction compared to a conventional approach, enabling
near real-time interactive data analysis. It also utilizes specialized IoT
devices for secure data transfer from equipment computers and connects to
multiple supercomputers via an academic ultra-fast network, achieving 4x faster
data transfer compared to the public Internet. The ARIM-mdx data system, as a
pioneering nationwide data platform, has the potential to contribute to the
creation of new research communities and to accelerate innovation.