{"title":"An annotation scheme for references to research artefacts in scientific publications","authors":"David Schindler, Kristina Yordanova, Frank Krüger","doi":"10.1109/PERCOMW.2019.8730730","DOIUrl":null,"url":null,"abstract":"The extraction of mentions of research artefacts from scientific papers is a necessary precursor for multiple applications ranging from simple search for literature based on particular research artefacts to semantic analyses of the investigations described in the literature. Techniques of natural language processing like named entity and relation extraction allow to establish detailed knowledge about such artefacts. The application of supervised classifiers relies on annotated datasets in order to provide a basis for training and evaluation. In this work, we present an annotation scheme for research artefacts in scientific literature which not only distinguishes between different types of artefacts like datasets, software and materials but also allows for the annotation of more detailed information such as amount or concentration of materials. Furthermore, we present first preliminary results in terms of inter-rater reliability.","PeriodicalId":437017,"journal":{"name":"2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PERCOMW.2019.8730730","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
The extraction of mentions of research artefacts from scientific papers is a necessary precursor for multiple applications, ranging from simple literature search based on particular research artefacts to semantic analyses of the investigations described in the literature. Natural language processing techniques such as named entity recognition and relation extraction make it possible to establish detailed knowledge about such artefacts. Supervised classifiers, however, rely on annotated datasets as a basis for training and evaluation. In this work, we present an annotation scheme for research artefacts in scientific literature which not only distinguishes between different types of artefacts, such as datasets, software and materials, but also allows for the annotation of more detailed information, such as the amount or concentration of materials. Furthermore, we present preliminary results on inter-rater reliability.
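To make the scheme concrete, the following is a minimal Python sketch of how annotated artefact mentions might be represented. The type labels follow the artefact categories named in the abstract (datasets, software, materials); the field names, example sentence, span offsets, and attribute values are illustrative assumptions, not the paper's actual data model.

```python
# Hypothetical span-based representation of an artefact annotation.
# The artefact_type labels mirror the categories in the abstract;
# everything else here is an invented example for illustration.
from dataclasses import dataclass, field

@dataclass
class ArtefactMention:
    start: int                 # character offset where the mention begins
    end: int                   # character offset where the mention ends (exclusive)
    artefact_type: str         # e.g. "DATASET", "SOFTWARE", "MATERIAL"
    attributes: dict = field(default_factory=dict)  # e.g. amount, concentration, version

sentence = "Cells were treated with 5 mM glucose and analysed with SPSS 25."
mentions = [
    ArtefactMention(29, 36, "MATERIAL", {"concentration": "5 mM"}),
    ArtefactMention(55, 59, "SOFTWARE", {"version": "25"}),
]

# Sanity check: the offsets recover the annotated surface strings.
assert sentence[mentions[0].start:mentions[0].end] == "glucose"
assert sentence[mentions[1].start:mentions[1].end] == "SPSS"
```

A span-plus-attributes design like this is one natural fit for the abstract's requirement that detail such as concentration be attached to a material mention rather than annotated as a separate entity.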
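The abstract reports inter-rater reliability but does not specify which coefficient is used; Cohen's kappa is one common choice for two annotators. Below is a hedged sketch of computing it over token-level labels with scikit-learn's `cohen_kappa_score`; the label sequences are invented for illustration and do not come from the paper's data.

```python
# Sketch: agreement between two annotators on token-level artefact labels,
# measured with Cohen's kappa. The sequences below are fabricated examples.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["O", "DATASET", "DATASET", "O", "SOFTWARE", "O"]
annotator_b = ["O", "DATASET", "O",       "O", "SOFTWARE", "O"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

Unlike raw percentage agreement, kappa corrects for the agreement expected by chance, which matters here because the "O" (no artefact) label typically dominates token-level annotation tasks.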