Where there's a will there's a way: ChatGPT is used more for science in countries where it is prohibited

Honglin Bao, Mengyi Sun, Misha Teplitskiy
arXiv - CS - Digital Libraries, published 2024-06-17
DOI: arxiv-2406.11583 (https://doi.org/arxiv-2406.11583)
Citations: 0

Abstract

Regulating AI has emerged as a key societal challenge, but which methods of regulation are effective is unclear. Here, we measure the effectiveness of restricting AI services geographically using the case of ChatGPT and science. OpenAI prohibits access to ChatGPT from several countries including China and Russia. If the restrictions are effective, there should be minimal use of ChatGPT in prohibited countries. Drawing on the finding that early versions of ChatGPT overrepresented distinctive words like "delve," we developed a simple ensemble classifier by training it on abstracts before and after ChatGPT "polishing". Testing on held-out abstracts and those where authors self-declared to have used AI for writing shows that our classifier substantially outperforms off-the-shelf LLM detectors like GPTZero and ZeroGPT. Applying the classifier to preprints from Arxiv, BioRxiv, and MedRxiv reveals that ChatGPT was used in approximately 12.6% of preprints by August 2023 and use was 7.7% higher in countries without legal access. Crucially, these patterns appeared before the first major legal LLM became widely available in China, the largest restricted-country preprint producer. ChatGPT use was associated with higher views and downloads, but not citations or journal placement. Overall, restricting ChatGPT geographically has proven ineffective in science and possibly other domains, likely due to widespread workarounds.
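
The abstract describes training a detector on abstracts before and after ChatGPT "polishing", motivated by overrepresented words such as "delve". The following is a minimal sketch of that general idea only, not the authors' ensemble classifier: it scores text with smoothed per-word log-odds learned from paired original and polished texts. The function names, the smoothing constant, the threshold, and the four toy sentences are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_log_odds(original_texts, polished_texts, smoothing=1.0):
    """Return per-word log-odds of appearing in polished vs. original text.

    Positive values mark words that are relatively more frequent after
    polishing. Add-one style smoothing keeps unseen words finite.
    """
    c_orig = Counter(w for t in original_texts for w in tokenize(t))
    c_pol = Counter(w for t in polished_texts for w in tokenize(t))
    vocab = set(c_orig) | set(c_pol)
    n_orig = sum(c_orig.values()) + smoothing * len(vocab)
    n_pol = sum(c_pol.values()) + smoothing * len(vocab)
    return {
        w: math.log((c_pol[w] + smoothing) / n_pol)
        - math.log((c_orig[w] + smoothing) / n_orig)
        for w in vocab
    }


def score(text, log_odds):
    """Average log-odds over tokens; words outside the vocabulary count as 0."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(log_odds.get(w, 0.0) for w in tokens) / len(tokens)


def looks_polished(text, log_odds, threshold=0.0):
    """Flag text whose average log-odds exceeds a (hypothetical) threshold."""
    return score(text, log_odds) > threshold


# Toy paired data, invented for this sketch (not from the paper).
original = [
    "We study the effect of heat on cell growth.",
    "This paper looks at how birds use tools.",
]
polished = [
    "We delve into the intricate effect of heat on cell growth.",
    "This paper delves into how birds use tools.",
]

log_odds = train_log_odds(original, polished)
flag_a = looks_polished("We delve into the intricate dynamics.", log_odds)
flag_b = looks_polished("We study the dynamics.", log_odds)
```

On this toy data `flag_a` is true and `flag_b` is false, because "delve", "into", and "intricate" occur only in the polished set. A real detector would need many paired abstracts, a threshold tuned on held-out data, and validation against texts with known AI use, as the abstract reports the authors did.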