Jonathan Li, Rohan Bhambhoria, Samuel Dahan, Xiaodan Zhu
Experimenting with Legal AI Solutions: The Case of Question-Answering for Access to Justice
arXiv - CS - Computation and Language, 2024-09-12. DOI: arxiv-2409.07713
Citations: 0
Abstract
Generative AI models, such as the GPT and Llama series, have significant
potential to assist laypeople in answering legal questions. However, little
prior work focuses on the data sourcing, inference, and evaluation of these
models in the context of laypersons. To this end, we propose a human-centric
legal NLP pipeline, covering data sourcing, inference, and evaluation. We
introduce and release a dataset, LegalQA, with real and specific legal
questions spanning from employment law to criminal law, corresponding answers
written by legal experts, and citations for each answer. We develop an
automatic evaluation protocol for this dataset, then show that
retrieval-augmented generation from only 850 citations in the train set can
match or outperform internet-wide retrieval, despite drawing on nine orders of
magnitude less data. Finally, we propose future directions for open-source
efforts, which currently lag behind closed-source models.