Authorship Attribution

IF 12.9; Zone 2, Computer Science; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Foundations and Trends in Information Retrieval; Pub Date: 2008-03-06; DOI: 10.1561/1500000005

P. Juola

Journal Article, pages 233-334

Citations: 962

Abstract

Authorship attribution, the science of inferring characteristics of the author from the characteristics of documents written by that author, is a problem with a long history and a wide range of application. Recent work in "non-traditional" authorship attribution demonstrates the practicality of automatically analyzing documents based on authorial style, but the state of the art is confusing. Analyses are difficult to apply, little is known about type or rate of errors, and few "best practices" are available. In part because of this confusion, the field has perhaps had less uptake and general acceptance than is its due.

This review surveys the history and present state of the discipline, presenting some comparative results when available. It shows, first, that the discipline is quite successful, even in difficult cases involving small documents in unfamiliar and less studied languages; it further analyzes the types of analysis and features used and tries to determine characteristics of well-performing systems, finally formulating these in a set of recommendations for best practices.

Source journal

Foundations and Trends in Information Retrieval (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 39.10

Self-citation rate: 0.00%

About the journal: The surge in research across all domains over the past decade has produced a plethora of new publications and exponential growth in published research. Navigating this extensive literature and staying current has become a time-consuming challenge. While electronic publishing provides instant access to more articles than ever, discerning the ones essential for a comprehensive understanding of a topic remains difficult. Foundations and Trends® in Information Retrieval (FnTIR) addresses this problem by publishing high-quality survey and tutorial monographs in the field. Each issue features a 50-100 page monograph authored by research leaders, covering tutorial subjects, research retrospectives, and survey papers that provide state-of-the-art reviews within the scope of the journal.

Latest articles from the journal

From Foundations to GPT in Text Classification: A Comprehensive Survey on Current Approaches and Future Trends
Search as Learning
Understanding and Mitigating Gender Bias in Information Retrieval Systems
Mathematical Information Retrieval: Search and Question Answering
Information Discovery in E-commerce