CNN-BLSTM-CRF Network for Semantic Labeling of Students' Online Handwritten Assignments

Amirali Darvishzadeh, T. Stahovich, Amir H. Feghahati, Negin Entezari, Shaghayegh Gharghabi, Reed Kanemaru, C. Shelton
2019 International Conference on Document Analysis and Recognition (ICDAR), published 2019-09-01
DOI: 10.1109/ICDAR.2019.00169 (https://doi.org/10.1109/ICDAR.2019.00169)
Citations: 2

Abstract

Automatic semantic labeling of strokes in online handwritten documents is a crucial task for many applications, such as diagram interpretation, text recognition, and search. We formulate this task as a stroke classification problem in which each stroke is classified as a cross-out, free body diagram, or text. Separating free body diagrams from text in this work differs from the traditional text/non-text separation problem because both classes contain text and graphics. The text class includes textual notes, mathematical symbols and equations, and graphics such as arrows that connect other elements. The free body diagram class contains graphics as well as the alphanumeric characters and symbols that mark or explain the graphical objects. In this work, we present a novel deep neural network model for classifying strokes in online handwritten documents. The network takes two input sequences: the first contains the trajectories of the pen strokes, and the second contains features of the strokes. Each sequence is fed to its own CNN-BLSTM channel, which extracts features and encodes relationships between nearby strokes. The outputs of the two channels are concatenated and passed to a CRF layer that predicts the best sequence of labels for the given input sequences. We evaluated our model on a dataset of 1,060 pages written by 132 students in an undergraduate statics course. Our model achieved an overall classification accuracy of 94.70% on this dataset.
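To illustrate what "predicts the best sequence of labels" can mean in practice, the sketch below runs Viterbi decoding, the standard decoding procedure for a linear-chain CRF, over per-stroke scores. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say that the CRF is linear-chain or how it is decoded, and the `emissions` and `transitions` values are invented stand-ins for the concatenated CNN-BLSTM outputs and the learned transition scores.

```python
# Assumption: a linear-chain CRF over the stroke sequence, decoded with Viterbi.
# All scores below are made up for illustration; they are not from the paper.
LABELS = ["cross-out", "free body diagram", "text"]


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring sequence of label indices.

    emissions[t][k]:   score of label k for stroke t (hypothetical network output).
    transitions[j][k]: score of label j being followed by label k.
    """
    if not emissions:
        return []
    n_labels = len(transitions)
    # best[k] = score of the best path that ends in label k at the current stroke
    best = list(emissions[0])
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = [], []
        for k in range(n_labels):
            # Best previous label for arriving at label k
            prev = max(range(n_labels), key=lambda j: best[j] + transitions[j][k])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][k] + scores[k])
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label
    path = [max(range(n_labels), key=lambda k: best[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Four strokes; the third is ambiguous and slightly favors "text" on its own.
emissions = [
    [0.1, 2.0, 0.3],
    [0.2, 1.5, 0.4],
    [0.1, 0.9, 1.0],
    [0.0, 1.8, 0.2],
]
# Hypothetical transitions: staying in the same class is rewarded.
transitions = [[1.0 if j == k else 0.0 for k in range(3)] for j in range(3)]

independent = [max(range(3), key=lambda k: row[k]) for row in emissions]
decoded = viterbi_decode(emissions, transitions)
print([LABELS[i] for i in independent])
print([LABELS[i] for i in decoded])
```

With these invented scores, labeling each stroke independently assigns the third stroke to "text", while sequence decoding keeps all four strokes in "free body diagram" because the transition scores favor runs of the same class. This is the kind of context between nearby strokes that a sequence-level output layer can exploit.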