Alberto Bacchelli, Nicolas Bettenburg, Latifa Guerrouj, S. Haiduc
{"title":"第三届非结构化数据挖掘研讨会","authors":"Alberto Bacchelli, Nicolas Bettenburg, Latifa Guerrouj, S. Haiduc","doi":"10.1109/WCRE.2013.6671333","DOIUrl":null,"url":null,"abstract":"Software development knowledge resides in the source code and in a number of other artefacts produced during the development process. To extract such a knowledge, past software engineering research has extensively focused on mining the source code, i.e., the final product of the development effort. Currently, we witness an emerging trend where researchers strive to exploit the information captured in artifacts such as emails and bug reports, free-form text requirements and specifications, comments and identifiers. Being often expressed in natural language, and not having a well-defined structure, the information stored in these artifacts is defined as unstructured data. Although research communities in Information Retrieval, Data Mining and Natural Language Processing have devised techniques to deal with unstructured data, these techniques are usually limited in scope (i.e., designed for English language text found in newspaper articles) and intended for use in specific scenarios, thus failing to achieve their full potential in a software development context. The workshop on Mining Unstructured Data (MUD) aims to provide a common venue for researchers and practitioners across software engineering, information retrieval and data mining research domains, to share new approaches and emerging results in mining unstructured data.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"3rd workshop on Mining Unstructured Data\",\"authors\":\"Alberto Bacchelli, Nicolas Bettenburg, Latifa Guerrouj, S. Haiduc\",\"doi\":\"10.1109/WCRE.2013.6671333\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Software development knowledge resides in the source code and in a number of other artefacts produced during the development process. To extract such a knowledge, past software engineering research has extensively focused on mining the source code, i.e., the final product of the development effort. Currently, we witness an emerging trend where researchers strive to exploit the information captured in artifacts such as emails and bug reports, free-form text requirements and specifications, comments and identifiers. Being often expressed in natural language, and not having a well-defined structure, the information stored in these artifacts is defined as unstructured data. Although research communities in Information Retrieval, Data Mining and Natural Language Processing have devised techniques to deal with unstructured data, these techniques are usually limited in scope (i.e., designed for English language text found in newspaper articles) and intended for use in specific scenarios, thus failing to achieve their full potential in a software development context. The workshop on Mining Unstructured Data (MUD) aims to provide a common venue for researchers and practitioners across software engineering, information retrieval and data mining research domains, to share new approaches and emerging results in mining unstructured data.\",\"PeriodicalId\":275092,\"journal\":{\"name\":\"2013 20th Working Conference on Reverse Engineering (WCRE)\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 20th Working Conference on Reverse Engineering (WCRE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WCRE.2013.6671333\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 20th Working Conference on Reverse Engineering (WCRE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WCRE.2013.6671333","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Software development knowledge resides in the source code and in a number of other artefacts produced during the development process. To extract such a knowledge, past software engineering research has extensively focused on mining the source code, i.e., the final product of the development effort. Currently, we witness an emerging trend where researchers strive to exploit the information captured in artifacts such as emails and bug reports, free-form text requirements and specifications, comments and identifiers. Being often expressed in natural language, and not having a well-defined structure, the information stored in these artifacts is defined as unstructured data. Although research communities in Information Retrieval, Data Mining and Natural Language Processing have devised techniques to deal with unstructured data, these techniques are usually limited in scope (i.e., designed for English language text found in newspaper articles) and intended for use in specific scenarios, thus failing to achieve their full potential in a software development context. The workshop on Mining Unstructured Data (MUD) aims to provide a common venue for researchers and practitioners across software engineering, information retrieval and data mining research domains, to share new approaches and emerging results in mining unstructured data.