Taking Risks with Confidence
R. Benham, Ben Carterette, Alistair Moffat, J. Culpepper
https://doi.org/10.1145/3372124.3372125
Risk-based evaluation is a failure analysis tool that can be combined with traditional effectiveness metrics to ensure that the improvements observed are consistent across topics when comparing systems. Here we explore the stability of confidence intervals in inference-based risk measurement, extending previous work to five different commonly used inference testing techniques. Using the Robust04 and TREC Core 2017 NYT corpora, we show that risk inferences using parametric methods appear to disagree with their non-parametric counterparts, warranting further investigation. Additionally, we explore how the number of topics being evaluated affects confidence interval stability, and find that more than 50 topics appear to be required before risk-sensitive comparison results are consistent across different inference testing frameworks.
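The contrast between parametric and non-parametric inference that the paper examines can be illustrated with a small sketch. The snippet below is an illustration on synthetic per-topic scores, not the authors' code; the URisk-style loss weighting and the alpha = 2 setting are assumptions. It compares a Student-t interval with a percentile-bootstrap interval over per-topic risk contributions.

```python
# Compare a parametric and a non-parametric confidence interval for a
# risk-sensitive delta between two systems, on synthetic per-topic scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
baseline = rng.uniform(0.2, 0.8, size=50)        # per-topic AP of the baseline
system = baseline + rng.normal(0.02, 0.05, 50)   # per-topic AP of the new system

def urisk_contributions(sys_scores, base_scores, alpha=2.0):
    """Per-topic URisk terms: wins count once, losses are penalised (1+alpha)-fold."""
    delta = sys_scores - base_scores
    return np.where(delta >= 0, delta, (1.0 + alpha) * delta)

contrib = urisk_contributions(system, baseline)

# Parametric CI: Student-t interval on the mean per-topic contribution.
t_lo, t_hi = stats.t.interval(0.95, len(contrib) - 1,
                              loc=contrib.mean(), scale=stats.sem(contrib))

# Non-parametric CI: percentile bootstrap over resampled topics.
boots = [rng.choice(contrib, size=len(contrib), replace=True).mean()
         for _ in range(10_000)]
b_lo, b_hi = np.percentile(boots, [2.5, 97.5])

print(f"t interval:         [{t_lo:.4f}, {t_hi:.4f}]")
print(f"bootstrap interval: [{b_lo:.4f}, {b_hi:.4f}]")
```

When the two intervals disagree about whether zero is excluded, the parametric and non-parametric frameworks reach different risk conclusions; this is the kind of instability the paper investigates.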
{"title":"Taking Risks with Confidence","authors":"R. Benham, Ben Carterette, Alistair Moffat, J. Culpepper","doi":"10.1145/3372124.3372125","DOIUrl":"https://doi.org/10.1145/3372124.3372125","url":null,"abstract":"Risk-based evaluation is a failure analysis tool that can be combined with traditional effectiveness metrics to ensure that the improvements observed are consistent across topics when comparing systems. Here we explore the stability of confidence intervals in inference-based risk measurement, extending previous work to five different commonly used inference testing techniques. Using the Robust04 and TREC Core 2017 NYT corpora, we show that risk inferences using parametric methods appear to disagree with their non-parametric counterparts, warranting further investigation. Additionally, we explore how the number of topics being evaluated affects confidence interval stability, and find that more than 50 topics appear to be required before risk-sensitive comparison results are consistent across different inference testing frameworks.","PeriodicalId":145556,"journal":{"name":"Proceedings of the 24th Australasian Document Computing Symposium","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126001104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards Automatically Classifying Case Law Citation Treatment Using Neural Networks
Daniel Locke, G. Zuccon
https://doi.org/10.1145/3372124.3372128
In common law legal systems, judges decide issues between parties (legal decisions, or case law) by reference to previous decisions that consider similar factual situations. Accordingly, these decisions typically feature rich citation networks, i.e., a new decision frequently cites previous relevant decisions (citations). These citations may, in varying degrees, express that a cited decision is applicable, not applicable, or no longer current law. Such a treatment label is important to a lawyer's process of determining whether a case is proper law. These labels serve as a matter of convenience in citation indices, enabling lawyers to prioritise which decisions to examine in order to understand the current state of the law. They also prove useful in other areas, such as prioritising cases for manual summarisation (where not all cases can be summarised), automatic summarisation, or, potentially, as a ranking feature in case law retrieval. While a lawyer can determine the treatment of a cited case by reading a decision, this is time consuming and can increase legal costs. Currently, not all newly decided cases feature these treatment labels, and older cases typically do not. Given the large number of new legal decisions each year, manual annotation of such treatment is not feasible. In this paper, we explore the effectiveness of neural network architectures for identifying case law citation treatment and importance (whether a case is important to a lawyer's reasoning process). We find that these tasks are very difficult and that various methods for text classification perform poorly. For this reason, we address the task of citation importance more comprehensively, while limiting our examination of citation treatment to modelling the problem and highlighting its intrinsic difficulty. We make a test dataset available at github.com/ielab/caselaw-citations to stimulate further research on this challenging problem. We also contribute a range of word embeddings learned over a large amount of processed case law text.
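As an illustration of the kind of text classification the paper reports on, the sketch below trains a simple TF-IDF plus logistic-regression baseline over hand-made citation-context snippets. The snippets, the label set (applied / distinguished / overruled), and the model choice are all assumptions for illustration, not the paper's data or architectures.

```python
# A toy text-classification baseline for citation treatment: TF-IDF features
# over the sentence surrounding a citation, fed to a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical citation-context snippets and treatment labels.
contexts = [
    "The principle in Smith v Jones was applied to the present facts.",
    "Brown v State was distinguished on the basis of the contract terms.",
    "The reasoning in Doe v Roe has since been overruled.",
    "Following Green v Black, the appeal must be dismissed.",
]
labels = ["applied", "distinguished", "overruled", "applied"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(contexts, labels)
print(clf.predict(["The court applied the test set out in White v Grey."]))
```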
{"title":"Towards Automatically Classifying Case Law Citation Treatment Using Neural Networks","authors":"Daniel Locke, G. Zuccon","doi":"10.1145/3372124.3372128","DOIUrl":"https://doi.org/10.1145/3372124.3372128","url":null,"abstract":"In common law legal systems, judges decide issues between parties (legal decision or case law) by reference to previous decisions that consider similar factual situations. Accordingly, these decisions typically feature rich citation networks, i.e., a new decision frequently cites previous relevant decisions (citation). These citations may, in varying degrees, express that a cited decision is applicable, not-applicable, or no longer current law. Such treatment label is important to a lawyer's process of determining whether a case is proper law. These labels serve as a matter of convenience in citation indices enabling lawyers to prioritise decisions to examine to understand the current state of the law. They also prove useful in other areas such as prioritisation for manual summarisation of cases, where not all cases can be summarised, and automatic summarisation, or, potentially, as a ranking feature in case law retrieval. While a lawyer can determine the treatment of a cited case by reading a decision, this is time consuming and can increase legal costs. Currently, not all newly decided cases feature these treatment labels. Further, older cases typically do not. Given the large amount of new legal decisions decided each year, manual annotation of such treatment is not feasible. In this paper, we explore the effectiveness of neural network architectures for identifying case law citation treatment and importance (whether a case is important to a lawyer's reasoning process). We find that these tasks are very difficult and various methods for text classification perform poorly. We address more comprehensively the task of citation importance for this reason while limiting our examination of the task of citation treatment to the modelling of the problem and the highlight of the intrinsic difficulty of the task. We make a test dataset available at github.com/ielab/caselaw-citations to stimulate further research that tackles this challenging problem. We also contribute a range of word embeddings learned over a large amount of processed case law text.","PeriodicalId":145556,"journal":{"name":"Proceedings of the 24th Australasian Document Computing Symposium","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129365875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 24th Australasian Document Computing Symposium","authors":"Robert A. Allen, L. Azzopardi","doi":"10.1145/3372124","DOIUrl":"https://doi.org/10.1145/3372124","url":null,"abstract":"","PeriodicalId":145556,"journal":{"name":"Proceedings of the 24th Australasian Document Computing Symposium","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116967903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Differences in language use: Insights from job and talent search
Bahar Salehi, B. Kazimipour, Timothy Baldwin
https://doi.org/10.1145/3372124.3372127
Search platforms can have more than one type of user, e.g., those who provide and those who consume content. As an example, in a job/talent search platform, content providers are: (1) job seekers who provide CVs, and (2) hirers who provide job advertisements; content consumers, on the other hand, are: (3) job seekers searching for specific jobs, and (4) hirers/recruiters searching for candidates to fill particular positions. As a result, there are four types of users, each with potentially different patterns of language use. In this paper, we compare the language used by different groups of users in job/talent search, by way of word embeddings pre-trained over documents associated with distinct types of users. In doing so, we investigate whether there are systematic shifts/mismatches in vocabulary or in the use of the same term, and consider the implications for an integrated search solution. Our experiments unearth significant differences in language use, but also show strong agreement between the results of our intrinsic and extrinsic comparisons of word embeddings.
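A minimal sketch of the intrinsic side of such a comparison, assuming toy tokenised corpora for two user groups (the documents, the probe term "python", and the word2vec settings are all illustrative): train one embedding model per group and compare the nearest neighbours of a shared term.

```python
# Train separate word2vec models per user group and inspect how the nearest
# neighbours of a shared term differ between the two groups.
from gensim.models import Word2Vec

# Hypothetical tokenised documents from two user groups.
job_ads = [["senior", "engineer", "competitive", "salary", "python"],
           ["developer", "role", "benefits", "python", "team"]]
cvs = [["experienced", "python", "engineer", "seeking", "role"],
       ["developer", "skilled", "python", "projects", "delivered"]]

ads_model = Word2Vec(job_ads, vector_size=50, min_count=1, seed=1)
cvs_model = Word2Vec(cvs, vector_size=50, min_count=1, seed=1)

term = "python"
print("job-ad neighbours:", ads_model.wv.most_similar(term, topn=3))
print("CV neighbours:    ", cvs_model.wv.most_similar(term, topn=3))
```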
{"title":"Differences in language use: Insights from job and talent search","authors":"Bahar Salehi, B. Kazimipour, Timothy Baldwin","doi":"10.1145/3372124.3372127","DOIUrl":"https://doi.org/10.1145/3372124.3372127","url":null,"abstract":"Search platforms can have more than one type of user, e.g., those who provide and those who consume content. As an example, in a job/talent search platform, content providers are: (1) job seekers who provide CVs, and (2) hirers who provide job advertisements; content consumers, on the other hand, are: (3) job seekers searching for specific jobs, and (4) hirers/recruiters searching for candidates to fill particular positions. As a result, there are four types of users, each with potentially different patterns of language use. In this paper, we compare the language used by different groups of users in job/talent search, by way of word embeddings pre-trained over documents associated with distinct types of users. In doing so, we investigate whether there are systematic shifts/ mismatches in vocabulary or the use of the same term, and consider the implications for an integrated search solution. Our experiments unearth significant differences in language use, but also that there is a strong agreement between the results of our intrinsic and extrinsic comparisons of word embeddings.","PeriodicalId":145556,"journal":{"name":"Proceedings of the 24th Australasian Document Computing Symposium","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114948221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Character Profiling in Low-Resource Language Documents
Tak-sum Wong, J. Lee
https://doi.org/10.1145/3372124.3372129
This paper focuses on automatic character profiling --- connecting "who", "what" and "when" --- in literary documents. This task is especially challenging for low-resource languages, since off-the-shelf tools for named entity recognition, syntactic parsing and other natural language processing tasks are rarely available. We investigate the impact of human annotation on automatic profiling. Based on a Medieval Chinese corpus, experimental results show that even a relatively small amount of word segmentation, part-of-speech and dependency annotation can improve accuracy in named entity recognition and in identifying character-verb associations, but not character-toponym associations.
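To make the character-verb association task concrete, here is a minimal sketch over a hand-made dependency annotation (the token tuples, character list, and relation set are assumptions, not the paper's corpus or pipeline): a character is associated with a verb whenever it stands in an nsubj or obj relation to it.

```python
# Harvest character-verb associations from dependency-annotated tokens.
from collections import defaultdict

# Hypothetical tokens in CoNLL-U-like form: (id, form, upos, head, deprel).
sentence = [
    (1, "Zhang", "PROPN", 2, "nsubj"),
    (2, "defeated", "VERB", 0, "root"),
    (3, "Li", "PROPN", 2, "obj"),
]
characters = {"Zhang", "Li"}

forms = {tid: form for tid, form, _, _, _ in sentence}
assoc = defaultdict(set)
for tid, form, upos, head, deprel in sentence:
    # Link a character to its governing verb via the subject/object relation.
    if form in characters and deprel in {"nsubj", "obj"} and head in forms:
        assoc[form].add((forms[head], deprel))

print(dict(assoc))  # {'Zhang': {('defeated', 'nsubj')}, 'Li': {('defeated', 'obj')}}
```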
{"title":"Character Profiling in Low-Resource Language Documents","authors":"Tak-sum Wong, J. Lee","doi":"10.1145/3372124.3372129","DOIUrl":"https://doi.org/10.1145/3372124.3372129","url":null,"abstract":"This paper focuses on automatic character profiling --- connecting \"who\", \"what\" and \"when\" --- in literary documents. This task is especially challenging for low-resource languages, since off-the-shelf tools for named entity recognition, syntactic parsing and other natural language processing tasks are rarely available. We investigate the impact of human annotation on automatic profiling. Based on a Medieval Chinese corpus, experimental results show that even a relatively small amount of word segmentation, part-of-speech and dependency annotation can improve accuracy in named entity recognition and in identifying character-verb associations, but not character-toponym associations.","PeriodicalId":145556,"journal":{"name":"Proceedings of the 24th Australasian Document Computing Symposium","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131069164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning Image Information for eCommerce Queries
U. Porwal
https://doi.org/10.1145/3372124.3372126
Computing the similarity between a query and a document is fundamental in any information retrieval system. In search engines, computing query-document similarity is an essential step in both the retrieval and ranking stages. In eBay search, a document is an item, and query-item similarity can be computed by comparing different facets of the query-item pair. Query text can be compared with the text of the item title. Likewise, a category constraint applied to the query can be compared with the listing category of the item. However, images are a signal that is usually present in items but absent from the query. Images are one of the most intuitive signals users rely on to determine an item's relevance to a query, so including this signal when estimating query-item similarity is likely to improve the relevance of the search engine. We propose a novel way of deriving image information for queries: rather than generating explicit image features or an image for a query, we learn image information for queries from item images. We use canonical correlation analysis (CCA) to learn a new subspace in which projecting the original data yields new query and item representations, and we hypothesize that the new query representation will also carry image information about the query. We estimate query-item similarity using a vector space model and report the performance of the proposed method on eBay's search data. We show an 11.89% relevance improvement over the baseline using Area Under the Receiver Operating Characteristic curve (AUROC) as the evaluation metric, and a 3.1% improvement using Area Under the Precision-Recall Curve (AUPRC).
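A minimal sketch of the CCA step described above, using random stand-in features rather than eBay's data (the feature dimensions, component count, and scoring by cosine similarity in the shared subspace are assumptions):

```python
# Use CCA to learn a shared subspace for query-side text features and
# item-side image features, then score query-item pairs by cosine similarity.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
query_text = rng.normal(size=(500, 100))   # stand-in query text features
item_image = rng.normal(size=(500, 256))   # stand-in image features of paired items

cca = CCA(n_components=32, max_iter=1000)
cca.fit(query_text, item_image)

# Project both views into the shared subspace; the query projection now
# carries whatever image-correlated structure CCA found in its text features.
q_proj, i_proj = cca.transform(query_text, item_image)
scores = cosine_similarity(q_proj[:1], i_proj)   # score one query against all items
print(scores.shape, scores[0, 0])
```

Because CCA maximises the correlation between the two views, the query-side projection absorbs image-correlated structure even though no image is ever computed for the query itself.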
{"title":"Learning Image Information for eCommerce Queries","authors":"U. Porwal","doi":"10.1145/3372124.3372126","DOIUrl":"https://doi.org/10.1145/3372124.3372126","url":null,"abstract":"Computing similarity between a query and a document is fundamental in any information retrieval system. In search engines, computing query-document similarity is an essential step in both retrieval and ranking stages. In eBay search, document is an item and the query-item similarity can be computed by comparing different facets of the query-item pair. Query text can be compared with the text of the item title. Likewise, a category constraint applied on the query can be compared with the listing category of the item. However, images are one signal that are usually present in the items but are not present in the query. Images are one of the most intuitive signals used by users to determine the relevance of the item given a query. Including this signal in estimating similarity between the query-item pair is likely to improve the relevance of the search engine. We propose a novel way of deriving image information for queries. We attempt to learn image information for queries from item images instead of generating explicit image features or an image for queries. We use canonical correlation analysis (CCA) to learn a new subspace where projecting the original data will give us a new query and item representation. We hypothesize that this new query representation will also have image information about the query. We estimate the query-item similarity using a vector space model and report the performance of the proposed method on eBay's search data. We show 11.89% relevance improvement over the baseline using Area Under the Receiver Operating Characteristic curve (AUROC) as the evaluation metric. We also show 3.1% relevance improvement over the baseline with Area Under the Precision Recall Curve (AUPRC).","PeriodicalId":145556,"journal":{"name":"Proceedings of the 24th Australasian Document Computing Symposium","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114639046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}