Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools.

Cell Reports Physical Science, Volume 4, Issue 6 | Impact factor: 7.9 | Zone 2 (multidisciplinary journals) | JCR Q1: Chemistry, Multidisciplinary | Published: 2023-06-21 | DOI: 10.1016/j.xcrp.2023.101426

Heather Desaire, Aleesa E Chua, Madeline Isom, Romana Jarosova, David Hua


Citations: 9

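Abstract

ChatGPT has enabled access to artificial intelligence (AI)-generated writing for the masses, initiating a culture shift in the way people work, learn, and write. The need to discriminate human writing from AI is now both critical and urgent. Addressing this need, we report a method for discriminating text generated by ChatGPT from (human) academic scientists, relying on prevalent and accessible supervised classification methods. The approach uses new features for discriminating (these) humans from AI; as examples, scientists write long paragraphs and have a penchant for equivocal language, frequently using words like "but," "however," and "although." With a set of 20 features, we built a model that assigns the author, as human or AI, at over 99% accuracy. This strategy could be further adapted and developed by others with basic skills in supervised classification, enabling access to many highly accurate and targeted models for detecting AI usage in academic writing and beyond.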
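To make the feature idea concrete, here is a minimal Python sketch of how paragraph length and equivocal-word counts might be turned into numbers. It is an illustration under stated assumptions, not the authors' implementation: the function name `paragraph_features`, the feature names, and the sample text are hypothetical, and only the three words named in the abstract are counted. The full 20-feature set and the classifier used in the paper are not specified here.

```python
import re

# The three equivocal words named in the abstract. The paper's full
# 20-feature set is not listed in this document, so this is a subset.
EQUIVOCAL_WORDS = ("but", "however", "although")


def paragraph_features(paragraph: str) -> dict:
    """Return a few illustrative stylometric features for one paragraph.

    Hypothetical helper: the feature names below are assumptions
    chosen for this sketch.
    """
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[a-z']+", paragraph.lower())
    # Rough sentence split on ., !, ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", paragraph) if s.strip()]
    features = {
        "n_words": len(words),          # proxy for paragraph length
        "n_sentences": len(sentences),
    }
    for word in EQUIVOCAL_WORDS:
        features[f"count_{word}"] = words.count(word)
    return features


sample = (
    "The signal was strong, but the noise floor was high. "
    "However, the trend held across all runs, although two replicates drifted."
)
feats = paragraph_features(sample)
print(feats)
```

Feature dictionaries like this, computed for labeled human and ChatGPT paragraphs, could then be passed as rows to any off-the-shelf supervised classifier.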




Source journal

Cell Reports Physical Science, Energy - Energy (all)

CiteScore: 11.40
Self-citation rate: 2.20%
Articles published: 388
Review time: 62 days

About the journal: Cell Reports Physical Science, a premium open-access journal from Cell Press, features high-quality, cutting-edge research spanning the physical sciences. It serves as an open forum fostering collaboration among physical scientists while championing open science principles. Published works must represent significant advances in fundamental insight or technological applications within fields such as chemistry, physics, materials science, energy science, engineering, and related interdisciplinary studies. In addition to longer articles, the journal considers impactful short-form reports and short reviews covering recent literature in emerging fields. Continually adapting to the evolving open science landscape, the journal reviews its policies to align with community consensus and best practices.