Chenyang Yang, Yining Hong, Grace A. Lewis, Tongshuang Wu, Christian Kästner
arXiv - CS - Software Engineering, published 2024-09-14, https://doi.org/arxiv-2409.09261
What Is Wrong with My Model? Identifying Systematic Problems with Semantic Data Slicing
Machine learning models make mistakes, yet sometimes it is difficult to
identify the systematic problems behind the mistakes. Practitioners engage in
various activities, including error analysis, testing, auditing, and
red-teaming, to form hypotheses of what can go (or has gone) wrong with their
models. To validate these hypotheses, practitioners employ data slicing to
identify relevant examples. However, traditional data slicing is limited by
available features and programmatic slicing functions. In this work, we propose
SemSlicer, a framework that supports semantic data slicing, which identifies a
semantically coherent slice, without the need for existing features. SemSlicer
uses Large Language Models to annotate datasets and generate slices from any
user-defined slicing criteria. We show that SemSlicer generates accurate slices
at low cost, allows flexible trade-offs between different design dimensions,
reliably identifies under-performing data slices, and helps practitioners
identify useful data slices that reflect systematic problems.
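
To make the idea of semantic data slicing concrete, the following is a minimal sketch and not SemSlicer's actual interface. It assumes a hypothetical annotator function that answers whether an example matches a natural-language slicing criterion; in practice that role would be played by an LLM prompt, but here a keyword stub (`stub_annotator`) stands in so the sketch runs offline. The dataset, the criterion, and all function names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    text: str
    label: int        # ground-truth label
    prediction: int   # output of the model under test


# Hypothetical stand-in for an LLM annotator: given a slicing criterion
# written in natural language and an example text, decide whether the
# example belongs to the slice. A real setup would prompt an LLM here;
# this keyword lookup only keeps the sketch self-contained.
def stub_annotator(criterion: str, text: str) -> bool:
    keywords = {"mentions a medical condition": ["diabetes", "asthma", "flu"]}
    return any(k in text.lower() for k in keywords.get(criterion, []))


def semantic_slice(
    data: List[Example],
    criterion: str,
    annotator: Callable[[str, str], bool],
) -> List[Example]:
    """Return the examples that the annotator marks as matching the criterion."""
    return [ex for ex in data if annotator(criterion, ex.text)]


def accuracy(examples: List[Example]) -> float:
    """Fraction of examples where the model prediction equals the label."""
    if not examples:
        return float("nan")
    return sum(ex.label == ex.prediction for ex in examples) / len(examples)


# Toy data (fabricated for illustration only, not from the paper).
DATA = [
    Example("My asthma got worse this winter", 1, 0),
    Example("Great pizza place downtown", 0, 0),
    Example("The flu kept me home all week", 1, 0),
    Example("Loved the new movie", 0, 0),
    Example("Managing diabetes with diet", 1, 1),
    Example("Traffic was terrible today", 0, 0),
]

CRITERION = "mentions a medical condition"
SLICE = semantic_slice(DATA, CRITERION, stub_annotator)
overall_acc = accuracy(DATA)
slice_acc = accuracy(SLICE)

print(f"overall accuracy: {overall_acc:.2f}")
print(f"slice accuracy ({CRITERION!r}, n={len(SLICE)}): {slice_acc:.2f}")
if slice_acc < overall_acc:
    print("slice under-performs the overall dataset")
```

Note that no feature column for "medical condition" exists in the toy data: the slice is defined entirely by the criterion text and the annotator, which is the contrast with feature-based or programmatic slicing that the abstract draws. Comparing slice accuracy against overall accuracy is one simple way to check a hypothesis about where the model under-performs.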