Samuel J. Pitman, Alicia K. Evans, Robbie T. Ireland, Felix Lempriere, Laura K. McKemmish
arXiv - PHYS - Chemical Physics, 2024-09-06, arxiv-2409.03964
Benchmarking Basis Sets for Density Functional Theory Thermochemistry Calculations: Why unpolarised basis sets and the polarised 6-311G family should be avoided
Basis sets are a crucial but often overlooked choice when setting up
quantum chemistry calculations. The choice of basis set can be critical in
determining both the accuracy and the calculation time of a quantum chemistry
calculation. Clear recommendations based on thorough benchmarking are
essential, but are not currently readily available. This study investigates the
relative quality of basis sets for general properties by benchmarking basis set
performance for a diverse set of 136 reactions (from the diet-150-GMTKN55
dataset). In our analysis, we find the distributions of errors are often
significantly non-Gaussian, meaning that the joint consideration of median
errors, mean absolute errors and outlier statistics is helpful to provide a
holistic understanding of basis set performance. Our direct comparison of
performance between most modern basis sets provides quantitative evidence for
basis set recommendations that broadly align with the established understanding
of basis set experts and are evident in the design of modern basis sets. For
example, while zeta is a good measure of quality, it is not the only
determining factor for an accurate calculation: unpolarised double- and
triple-zeta basis sets (like 6-31G and 6-311G) perform very poorly.
Appropriate use of polarisation functions (e.g. 6-31G*) is essential to obtain
the accuracy offered by double- or triple-zeta basis sets. In our study, the
best-performing double- and triple-zeta basis sets are 6-31++G** and
pcseg-2, respectively. The polarised 6-311G basis set family is poorly
parameterised, so it performs more like a double-zeta than a triple-zeta
basis set. All versions of the 6-311G basis set family should
be avoided entirely for valence chemistry calculations moving forward.
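
The abstract's point about non-Gaussian error distributions can be illustrated with a minimal sketch that reports the median error, the mean absolute error, and outliers together. The function name `summarise_errors`, the use of Tukey's 1.5 x IQR rule as the outlier criterion, and the sample numbers are all assumptions made for illustration; the abstract does not state which outlier statistic the authors use, and the values below are not taken from the study.

```python
from statistics import mean, median, quantiles


def summarise_errors(errors, k=1.5):
    """Summarise signed reaction-energy errors for one basis set.

    Assumption: outliers are flagged with Tukey's rule (beyond k * IQR
    from the quartiles). The paper may use a different criterion.
    """
    # Mean absolute error: sensitive to a few large deviations.
    mae = mean(abs(e) for e in errors)

    # Median signed error: robust to heavy tails.
    med = median(errors)

    # Quartiles via linear interpolation over the observed data.
    q1, _, q3 = quantiles(errors, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [e for e in errors if e < lower or e > upper]

    return {
        "median_error": med,
        "mae": mae,
        "n_outliers": len(outliers),
        "outliers": outliers,
    }


# Hypothetical signed errors (assumed to be in kcal/mol), not data from the study.
hypothetical_errors = [-0.4, 0.1, 0.3, -0.2, 0.5, 0.0, 12.0, -0.1]
print(summarise_errors(hypothetical_errors))
```

In this made-up sample a single large error dominates the mean absolute error while the median stays close to zero, which is why the three statistics are more informative when read together than any one of them alone.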