Using Unlabeled Data for US Supreme Court Case Classification
George Sanchez
2020 International Conference on Data Mining Workshops (ICDMW), published 2020-11-01
DOI: 10.1109/ICDMW51313.2020.00116
The Supreme Court Database, provided by the Washington University (in St. Louis) School of Law, is an essential legal research tool. It is organized and categorized into Issue Areas so that legal researchers can easily find on-point cases for an area of law. This paper uses a semi-supervised learning approach to automatically categorize the Supreme Court's opinions into Issue Areas. We apply an inductive cluster-then-label method that employs a fast Hierarchical Navigable Small World (HNSW) graph index over a non-metric space, populated with Universal Sentence Encoder (USE) embeddings. After obtaining labels from the semi-supervised approach, we evaluate several classifiers on the resulting data, which achieve the following weighted-average F1 scores: SVM with max-norm features, 0.75; RNN, 0.78; and BERT, 0.68.
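The abstract does not spell out the labeling procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: each unlabeled opinion takes the majority label of its k most similar labeled opinions, and the result is scored with a support-weighted F1. Brute-force cosine search stands in for the HNSW index, two-dimensional toy vectors stand in for USE embeddings, and the function names (`assign_labels`, `weighted_f1`), the value of `k`, and the example labels are illustrative choices, not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_labels(labeled, unlabeled, k=3):
    """Give each unlabeled vector the majority label of its k most similar
    labeled vectors. Assumption: exhaustive search replaces an HNSW index."""
    assigned = []
    for vec in unlabeled:
        ranked = sorted(
            labeled,
            key=lambda item: cosine_similarity(vec, item[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[:k])
        assigned.append(votes.most_common(1)[0][0])
    return assigned


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        # n > 0 for every class in y_true, so the denominator is never zero.
        total += n * (2 * tp) / (2 * tp + fp + fn)
    return total / len(y_true)


# Toy stand-ins for embedded opinions; labels are illustrative only.
labeled = [
    ([1.0, 0.1], "Criminal Procedure"),
    ([0.9, 0.2], "Criminal Procedure"),
    ([0.1, 1.0], "Civil Rights"),
    ([0.2, 0.9], "Civil Rights"),
]
unlabeled = [[0.95, 0.15], [0.15, 0.95]]

predicted = assign_labels(labeled, unlabeled, k=3)
print(predicted)
print(weighted_f1(["Criminal Procedure", "Civil Rights"], predicted))
```

In a realistic setting, the exhaustive `sorted` call would be replaced by an approximate nearest-neighbor query, which is where an HNSW index pays off on a large collection of opinions.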