Comprehensibility and Automation: Plain Language in the Era of Digitalization

István Üveges
Journal: TalTech Journal of European Studies (JCR category: LAW, Q2)
Volume/issue: 12 1, pages 64-86
Published: 2022-12-01
Publication type: Journal Article
DOI: 10.2478/bjes-2022-0012 (https://doi.org/10.2478/bjes-2022-0012)

Abstract

This article briefly presents a pilot machine-learning experiment on classifying official texts addressed to lay readers, using a support vector machine as a baseline alongside fastText models. The experiment used a hand-crafted corpus created by experts of the National Tax and Customs Administration of Hungary under the office’s Public Accessibility Programme. The corpus contained sentences that the experts had paraphrased or completely rewritten to make them more readable for lay people, as well as their original counterparts. The aim was to distinguish automatically between these two classes using supervised machine-learning algorithms. If successful, such a model could flag the potentially problematic points of a text for the experts who work on making the texts of official bodies more comprehensible to the average reader, and the process of rephrasing such texts could therefore be sped up drastically. Such rephrasing, which considers above all the needs of the average reader, can improve the overall comprehensibility of official (mostly legal) texts and thereby support access to justice, the transparency of governmental organizations and, most importantly, the rule of law in a given country.
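
The two-class setup described in the abstract (original official sentences versus their plain-language rewrites) can be illustrated with a minimal sketch of a linear support vector machine baseline. This is not the author's implementation: the abstract does not state the features, hyperparameters, or tooling used, so everything below is an assumption. The training sentences are invented English toy examples rather than corpus data, the function names (`featurize`, `train_linear_svm`, `classify`) are hypothetical, and the Pegasos-style subgradient training was chosen only so that the example runs on the Python standard library. The fastText models mentioned in the abstract are not sketched here.

```python
from collections import Counter

# Toy training data, invented for illustration only (not taken from the corpus).
# Label +1 = original official wording, label -1 = plain-language rewrite.
TRAIN = [
    ("pursuant to the aforementioned provision the taxpayer shall remit the amount", 1),
    ("you must pay this sum", -1),
    ("the declaration shall be submitted by the obligor within the prescribed deadline", 1),
    ("please send us your form before 20 may", -1),
]


def featurize(sentence):
    """Turn a sentence into bag-of-words counts over lowercase tokens."""
    return Counter(sentence.lower().split())


def score(weights, feats):
    """Linear decision value: dot product of weights and feature counts."""
    return sum(weights.get(tok, 0.0) * cnt for tok, cnt in feats.items())


def train_linear_svm(data, epochs=50, lam=0.01):
    """Fit a linear SVM (hinge loss, L2 penalty) with Pegasos-style updates."""
    examples = [(featurize(text), label) for text, label in data]
    weights = {}
    step = 0
    for _ in range(epochs):
        for feats, label in examples:
            step += 1
            eta = 1.0 / (lam * step)  # decaying learning rate
            margin = label * score(weights, feats)
            # Regularization: shrink every weight toward zero.
            for tok in weights:
                weights[tok] *= 1.0 - eta * lam
            # Hinge-loss subgradient: update only if the margin is violated.
            if margin < 1.0:
                for tok, cnt in feats.items():
                    weights[tok] = weights.get(tok, 0.0) + eta * label * cnt
    return weights


def classify(weights, sentence):
    """Return 'original' for a positive decision value, else 'rewritten'."""
    return "original" if score(weights, featurize(sentence)) > 0 else "rewritten"


weights = train_linear_svm(TRAIN)
print(classify(weights, "the taxpayer shall remit the amount within the deadline"))
print(classify(weights, "you must send us this form"))
```

On this toy data the first sentence is labelled `original` and the second `rewritten`, because the two classes use disjoint vocabularies. Real official texts and their rewrites share most of their vocabulary, so a usable model would need a far larger corpus and richer features than this sketch assumes.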