CaML: Carbon Footprinting of Household Products with Zero-Shot Semantic Text Similarity

Bharathan Balaji, Venkata Sai Gargeya Vunnava, G. Guest, Jared Kramer
{"title":"CaML: Carbon Footprinting of Household Products with Zero-Shot Semantic Text Similarity","authors":"Bharathan Balaji, Venkata Sai Gargeya Vunnava, G. Guest, Jared Kramer","doi":"10.1145/3543507.3583882","DOIUrl":null,"url":null,"abstract":"Products contribute to carbon emissions in each phase of their life cycle, from manufacturing to disposal. Estimating the embodied carbon in products is a key step towards understanding their impact, and undertaking mitigation actions. Precise carbon attribution is challenging at scale, requiring both domain expertise and granular supply chain data. As a first-order approximation, standard reports use Economic Input-Output based Life Cycle Assessment (EIO-LCA) which estimates carbon emissions per dollar at an industry sector level using transactions between different parts of the economy. EIO-LCA models map products to an industry sector, and uses the corresponding carbon per dollar estimates to calculate the embodied carbon footprint of a product. An LCA expert needs to map each product to one of upwards of 1000 potential industry sectors. To reduce the annotation burden, the standard practice is to group products by categories, and map categories to their corresponding industry sector. We present CaML, an algorithm to automate EIO-LCA using semantic text similarity matching by leveraging the text descriptions of the product and the industry sector. CaML uses a pre-trained sentence transformer model to rank the top-5 matches, and asks a human to check if any of them are a good match. We annotated 40K products with non-experts. Our results reveal that pre-defined product categories are heterogeneous with respect to EIO-LCA industry sectors, and lead to a large mean absolute percentage error (MAPE) of 51% in kgCO2e/$. CaML outperforms the previous manually intensive method, yielding a MAPE of 22% with no domain labels (zero-shot). We compared annotations of a small sample of 210 products with LCA experts, and find that CaML accuracy is comparable to that of annotations by non-experts.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Web Conference 2023","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3543507.3583882","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Products contribute to carbon emissions in each phase of their life cycle, from manufacturing to disposal. Estimating the embodied carbon in products is a key step towards understanding their impact, and undertaking mitigation actions. Precise carbon attribution is challenging at scale, requiring both domain expertise and granular supply chain data. As a first-order approximation, standard reports use Economic Input-Output based Life Cycle Assessment (EIO-LCA) which estimates carbon emissions per dollar at an industry sector level using transactions between different parts of the economy. EIO-LCA models map products to an industry sector, and uses the corresponding carbon per dollar estimates to calculate the embodied carbon footprint of a product. An LCA expert needs to map each product to one of upwards of 1000 potential industry sectors. To reduce the annotation burden, the standard practice is to group products by categories, and map categories to their corresponding industry sector. We present CaML, an algorithm to automate EIO-LCA using semantic text similarity matching by leveraging the text descriptions of the product and the industry sector. CaML uses a pre-trained sentence transformer model to rank the top-5 matches, and asks a human to check if any of them are a good match. We annotated 40K products with non-experts. Our results reveal that pre-defined product categories are heterogeneous with respect to EIO-LCA industry sectors, and lead to a large mean absolute percentage error (MAPE) of 51% in kgCO2e/$. CaML outperforms the previous manually intensive method, yielding a MAPE of 22% with no domain labels (zero-shot). We compared annotations of a small sample of 210 products with LCA experts, and find that CaML accuracy is comparable to that of annotations by non-experts.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
语义文本相似度为零的家居产品碳足迹
从制造到处理,产品在其生命周期的每个阶段都会产生碳排放。估计产品中的隐含碳是了解其影响并采取缓解行动的关键步骤。精确的碳归属在规模上具有挑战性,需要领域专业知识和颗粒供应链数据。作为一阶近似,标准报告使用基于经济投入产出的生命周期评估(EIO-LCA),该评估使用经济不同部分之间的交易来估计工业部门层面每美元的碳排放量。EIO-LCA模型将产品映射到工业部门,并使用相应的每美元碳估计来计算产品的隐含碳足迹。LCA专家需要将每个产品映射到多达1000个潜在行业部门中的一个。为了减少注释负担,标准做法是按类别对产品进行分组,并将类别映射到相应的行业部门。我们提出了CaML,一种通过利用产品和行业部门的文本描述来使用语义文本相似性匹配来自动化EIO-LCA的算法。CaML使用预先训练的句子转换模型对前5个匹配进行排名,并要求人工检查其中是否有任何匹配。我们用非专家对40K产品进行了注释。我们的研究结果表明,就EIO-LCA行业部门而言,预定义的产品类别是异质的,导致kgCO2e/$的平均绝对百分比误差(MAPE)很大,为51%。CaML优于以前的手动密集型方法,在没有域标签(零射击)的情况下产生22%的MAPE。我们将210个小样本产品的注释与LCA专家进行了比较,发现CaML的准确性与非专家的注释相当。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
CurvDrop: A Ricci Curvature Based Approach to Prevent Graph Neural Networks from Over-Smoothing and Over-Squashing Learning to Simulate Crowd Trajectories with Graph Networks Word Sense Disambiguation by Refining Target Word Embedding Curriculum Graph Poisoning Optimizing Guided Traversal for Fast Learned Sparse Retrieval
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1