{"title":"利用新数据和集合模型改进可持续发展目标的自动标注。","authors":"Dirk U Wulff, Dominik S Meier, Rui Mata","doi":"10.1007/s11625-024-01516-3","DOIUrl":null,"url":null,"abstract":"<p><p>A number of labeling systems based on text have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of prominent SDG labeling systems using a variety of text sources and show that these differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to specific SDGs relative to others), and are susceptible to the type and amount of text analyzed. We then show that an ensemble model that pools SDG labeling systems alleviates some of these limitations, exceeding the performance of the individual SDG labeling systems considered. We conclude that researchers and policymakers should care about the choice of the SDG labeling system and that ensemble methods should be favored when drawing conclusions about the absolute and relative prevalence of work on the SDGs based on automated methods.</p>","PeriodicalId":49457,"journal":{"name":"Sustainability Science","volume":null,"pages":null},"PeriodicalIF":5.1000,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11366727/pdf/","citationCount":"0","resultStr":"{\"title\":\"Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals.\",\"authors\":\"Dirk U Wulff, Dominik S Meier, Rui Mata\",\"doi\":\"10.1007/s11625-024-01516-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>A number of labeling systems based on text have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of prominent SDG labeling systems using a variety of text sources and show that these differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to specific SDGs relative to others), and are susceptible to the type and amount of text analyzed. We then show that an ensemble model that pools SDG labeling systems alleviates some of these limitations, exceeding the performance of the individual SDG labeling systems considered. We conclude that researchers and policymakers should care about the choice of the SDG labeling system and that ensemble methods should be favored when drawing conclusions about the absolute and relative prevalence of work on the SDGs based on automated methods.</p>\",\"PeriodicalId\":49457,\"journal\":{\"name\":\"Sustainability Science\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.1000,\"publicationDate\":\"2024-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11366727/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Sustainability Science\",\"FirstCategoryId\":\"93\",\"ListUrlMain\":\"https://doi.org/10.1007/s11625-024-01516-3\",\"RegionNum\":2,\"RegionCategory\":\"环境科学与生态学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/7/24 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"ENVIRONMENTAL SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Sustainability Science","FirstCategoryId":"93","ListUrlMain":"https://doi.org/10.1007/s11625-024-01516-3","RegionNum":2,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/7/24 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"ENVIRONMENTAL SCIENCES","Score":null,"Total":0}
Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals.
A number of labeling systems based on text have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of prominent SDG labeling systems using a variety of text sources and show that these differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to specific SDGs relative to others), and are susceptible to the type and amount of text analyzed. We then show that an ensemble model that pools SDG labeling systems alleviates some of these limitations, exceeding the performance of the individual SDG labeling systems considered. We conclude that researchers and policymakers should care about the choice of the SDG labeling system and that ensemble methods should be favored when drawing conclusions about the absolute and relative prevalence of work on the SDGs based on automated methods.
期刊介绍:
The journal Sustainability Science offers insights into interactions within and between nature and the rest of human society, and the complex mechanisms that sustain both. The journal promotes science based predictions and impact assessments of global change, and seeks ways to ensure that such knowledge can be understood by society and be used to strengthen the resilience of global natural systems (such as ecosystems, ocean and atmospheric systems, nutrient cycles), social systems (economies, governments, industry) and human systems at the individual level (lifestyles, health, security, and human values).