HelaNER: A Novel Approach for Nested Named Entity Boundary Detection

Y. Priyadarshana, L. Ranathunga, C. Amalraj, I. Perera
Published in: IEEE EUROCON 2021 - 19th International Conference on Smart Technologies
Publication date: 2021-07-06
DOI: 10.1109/EUROCON52738.2021.9535565 (https://doi.org/10.1109/EUROCON52738.2021.9535565)
Citations: 1

Abstract

Named entity recognition (NER) is the task of identifying text spans and assigning them to specific types. Named entity (NE) boundary detection is an emerging research area within NER. Flat NE boundary detection can be considered mature, whereas only limited work has been conducted on nested NE boundary detection. Nested NE boundary detection is important for information extraction, information retrieval, event extraction, sentiment analysis, and related tasks. Meanwhile, the spread of unhealthy religious statements through social media has become a burden on the wellbeing of society. The prime objective of this research is to implement a novel system for nested NE boundary detection in the Sinhala language, focusing on unhealthy religious statements in social media. A literature survey was conducted to analyze existing NE type and boundary detection approaches and systems. In addition, the linguistic structures and patterns relevant to Sinhala hate speech detection were identified. A corpus of more than 100,000 Sinhala hate speech items was extracted, preprocessed, and annotated by an expert panel. A deep neural approach was then applied to capture the complexity indexes, matrices, and other related elements of the corpus. Next, a novel approach called "boundary bubbles" was applied, covering word representation, head word detection, entity mention nugget identification, and region classification for NE boundary detection. Experiments reveal that the proposed approach achieves state-of-the-art performance over the existing baselines.
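To make the distinction between flat and nested entities concrete, the following minimal sketch represents entity mentions as token spans and reports which spans lie inside others. It is not the authors' "boundary bubbles" method, which the abstract does not specify in detail. The names `Span` and `nested_pairs`, the English example sentence, and the ORG/LOC labels are all assumptions made for illustration only.

```python
from typing import List, NamedTuple, Tuple


class Span(NamedTuple):
    """A hypothetical entity mention over a token sequence."""
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)
    label: str  # illustrative entity type, e.g. "ORG" or "LOC"


def nested_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return (outer, inner) pairs where inner lies fully within a longer outer span."""
    pairs = []
    for outer in spans:
        for inner in spans:
            if outer is inner:
                continue
            contains = outer.start <= inner.start and inner.end <= outer.end
            strictly_longer = (outer.end - outer.start) > (inner.end - inner.start)
            if contains and strictly_longer:
                pairs.append((outer, inner))
    return pairs


# Illustrative English example (assumed, not taken from the Sinhala corpus).
tokens = ["The", "Bank", "of", "China", "opened", "today"]
spans = [Span(1, 4, "ORG"), Span(3, 4, "LOC")]

for outer, inner in nested_pairs(spans):
    outer_text = " ".join(tokens[outer.start:outer.end])
    inner_text = " ".join(tokens[inner.start:inner.end])
    print(f"{outer.label} '{outer_text}' contains {inner.label} '{inner_text}'")
```

A flat NER scheme would have to pick one of these two spans, because each token receives a single label. A nested scheme keeps both, which is why boundary detection has to allow overlapping regions.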