Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals.

IF 5.1 | Region 2 (Environmental Science and Ecology) | JCR Q1 (ENVIRONMENTAL SCIENCES) | Sustainability Science | Pub Date: 2024-01-01 | Epub Date: 2024-07-24 | DOI: 10.1007/s11625-024-01516-3

Dirk U Wulff, Dominik S Meier, Rui Mata

Sustainability Science, volume 19 (5), pages 1773-1787. Publication type: Journal Article.

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11366727/pdf/

DOI link: https://doi.org/10.1007/s11625-024-01516-3

Citations: 0


Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals.

A number of labeling systems based on text have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of prominent SDG labeling systems using a variety of text sources and show that these differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to specific SDGs relative to others), and are susceptible to the type and amount of text analyzed. We then show that an ensemble model that pools SDG labeling systems alleviates some of these limitations, exceeding the performance of the individual SDG labeling systems considered. We conclude that researchers and policymakers should care about the choice of the SDG labeling system and that ensemble methods should be favored when drawing conclusions about the absolute and relative prevalence of work on the SDGs based on automated methods.
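The abstract defines sensitivity as the true-positive rate and specificity as the true-negative rate, and says that an ensemble pools several labeling systems. The sketch below illustrates those two metrics on made-up toy labels for a single SDG. It uses a simple majority vote as the pooling rule, which is an assumption made here for illustration: the abstract does not say how the authors' ensemble combines systems. The names `system_a`, `system_b`, `system_c`, and `majority_vote` are hypothetical, and the data are not results from the paper.

```python
from typing import List, Sequence


def sensitivity(truth: Sequence[bool], pred: Sequence[bool]) -> float:
    """True-positive rate: share of truly relevant texts that were labeled."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(truth: Sequence[bool], pred: Sequence[bool]) -> float:
    """True-negative rate: share of irrelevant texts that were left unlabeled."""
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    return tn / (tn + fp) if (tn + fp) else float("nan")


def majority_vote(predictions: Sequence[Sequence[bool]]) -> List[bool]:
    """Assumed pooling rule: label a text if more than half the systems do."""
    n_systems = len(predictions)
    return [sum(votes) * 2 > n_systems for votes in zip(*predictions)]


# Toy data (invented): does each of six texts relate to one given SDG?
truth = [True, True, True, False, False, False]

# Three hypothetical labeling systems, each making different mistakes.
system_a = [True, True, False, True, False, False]
system_b = [True, False, True, False, False, True]
system_c = [False, True, True, False, True, False]

ensemble = majority_vote([system_a, system_b, system_c])

for name, pred in [("A", system_a), ("B", system_b),
                   ("C", system_c), ("ensemble", ensemble)]:
    print(f"{name}: sensitivity={sensitivity(truth, pred):.2f}, "
          f"specificity={specificity(truth, pred):.2f}")
```

In this toy case the three systems err on different texts, so the vote cancels their individual mistakes. Real labeling systems can share biases, in which case pooling would help less.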


Source journal

Sustainability Science (Environmental Sciences)

CiteScore: 11.30

Self-citation rate: 10.00%

Articles published: 174

Review time: 3 months

About the journal: The journal Sustainability Science offers insights into interactions within and between nature and the rest of human society, and the complex mechanisms that sustain both. The journal promotes science-based predictions and impact assessments of global change, and seeks ways to ensure that such knowledge can be understood by society and be used to strengthen the resilience of global natural systems (such as ecosystems, ocean and atmospheric systems, nutrient cycles), social systems (economies, governments, industry) and human systems at the individual level (lifestyles, health, security, and human values).