Integrated workflows and interfaces for data-driven semi-empirical electronic structure calculations
Pavel Stishenko, Adam McSloy, Berk Onat, Ben Hourahine, Reinhard J. Maurer, James R. Kermode, Andrew Logsdail
arXiv - PHYS - Other Condensed Matter, published 2024-03-22
https://doi.org/arxiv-2403.15625
Abstract
Modern software engineering of electronic structure codes has seen a paradigm
shift from monolithic workflows towards object-based modularity. Object-based
design allows for greater flexibility in the application of electronic
structure calculations, with particular benefits when integrated with
approaches for data-driven analysis. Here, we discuss different approaches to
create "deep" modular interfaces that connect big-data workflows and electronic
structure codes, and explore the diversity of use cases that they can enable.
We present two such interface approaches for the semi-empirical electronic
structure package, DFTB+. In one case, DFTB+ is applied as a library and
provides data to an external workflow; and in another, DFTB+ receives data via
external bindings and processes the information subsequently within an internal
workflow. We provide a general framework to enable data exchange workflows for
embedding new machine-learning-based Hamiltonians within DFTB+, or to enable
deep integration of DFTB+ in multiscale embedding workflows. These modular
interfaces demonstrate opportunities in emergent software and workflows to
accelerate scientific discovery by harnessing existing software capabilities.
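
The two interface directions described above can be illustrated with a minimal
sketch. It is a toy model under stated assumptions, not the DFTB+ API: the
class `ToyEngine`, its methods `register_hopping` and `levels`, and the helper
`external_scan` are hypothetical names introduced only for illustration, and
the "engine" is a two-site tight-binding model with a single hopping term.

```python
import math
from typing import Callable, List, Tuple


class ToyEngine:
    """Hypothetical stand-in for an electronic structure engine (NOT DFTB+).

    Models two sites with on-site energy 0 and one hopping element t(r),
    so the Hamiltonian is [[0, t], [t, 0]] with eigenvalues -|t| and +|t|.
    """

    def __init__(self) -> None:
        # Built-in hopping model, standing in for tabulated parameters.
        self._hopping: Callable[[float], float] = lambda r: -math.exp(-r)

    def register_hopping(self, model: Callable[[float], float]) -> None:
        """Engine receives data: an external model replaces the built-in one
        and is then evaluated inside the engine's own workflow."""
        self._hopping = model

    def levels(self, distance: float) -> Tuple[float, float]:
        """Return the two eigenvalues for a given site separation."""
        t = self._hopping(distance)
        return (-abs(t), abs(t))


def external_scan(engine: ToyEngine, distances: List[float]) -> List[Tuple[float, float]]:
    """Engine provides data: an external workflow drives the engine as a
    library and collects (distance, level splitting) pairs."""
    results = []
    for r in distances:
        low, high = engine.levels(r)
        results.append((r, high - low))
    return results


engine = ToyEngine()

# Direction 1: the external workflow pulls results from the engine.
gaps_default = external_scan(engine, [1.0, 2.0])

# Direction 2: an external model (here a fixed function standing in for a
# machine-learned Hamiltonian) is pushed into the engine, which then uses it.
engine.register_hopping(lambda r: -0.5 / r)
gaps_external = external_scan(engine, [1.0, 2.0])

print(gaps_default)
print(gaps_external)
```

In this sketch the same `external_scan` loop works unchanged after the hopping
model is swapped, which is the practical benefit of keeping the interface
between workflow and engine narrow.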