Determining the relationships between characters is an important step in analyzing fictional works. Knowing character relationships can be useful when summarizing a work and may also help to determine authorship. In this paper, scores are generated for pairs of characters in fictional works, which can be used to classify whether or not two characters have a relationship. An SVM is used to predict relationships between characters. Character pairs farther from the decision boundary often had stronger relationships than those closer to the boundary. The relative rank of the relationships may have additional literary and authorship-related uses.
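A minimal sketch of the classification-and-ranking idea described above, assuming scikit-learn's SVC: the classifier predicts whether a character pair is related, and the signed distance returned by `decision_function` is used to rank relationship strength. The pair features (co-occurrence counts, name distance) and all values are hypothetical placeholders, not the paper's actual feature set.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical per-pair features: [same-paragraph co-occurrences, same-chapter co-occurrences, mean name distance]
X_train = np.array([[12, 30, 1.0], [0, 2, 8.0], [7, 15, 2.5], [1, 1, 9.0]])
y_train = np.array([1, 0, 1, 0])             # 1 = related pair, 0 = unrelated pair

clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

X_new = np.array([[9, 20, 1.5], [2, 3, 6.0]])
labels = clf.predict(X_new)                  # has a relationship or not
margins = clf.decision_function(X_new)       # signed distance from the decision boundary

# Pairs with a larger positive margin are ranked as stronger relationships,
# mirroring the observation that pairs far from the boundary tend to be closer.
ranking = np.argsort(-margins)
print(labels, margins, ranking)
```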
{"title":"Character Relationship Mapping in Major Fictional Works Using Text Analysis Methods","authors":"Sam Wolyn, S. Simske","doi":"10.1145/3573128.3609345","DOIUrl":"https://doi.org/10.1145/3573128.3609345","url":null,"abstract":"Determining the relationships between characters is an important step in analyzing fictional works. Knowing character relationships can be useful when summarizing a work and may also help to determine authorship. In this paper, scores are generated for pairs of characters in fictional works, which can be used for classification tasks if characters have a relationship or not. An SVM is used to predict relationships between characters. Characters farther from the decision boundary often had stronger relationships than those closer to the boundary. The relative rank of the relationships may have additional literary and authorship related purposes.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121273674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dense retrieval approaches are challenging the prevalence of inverted index-based sparse representation approaches for information retrieval systems. Different families have arisen: single representations for each query or passage (such as ANCE or DPR), or multiple representations (usually one per token), as exemplified by the ColBERT model. While ColBERT is effective, it requires significant storage space for each token's embedding. In this work, we aim to prune the embeddings of tokens that are not important for effectiveness. We show that, by adapting standard uniform and document-centric static pruning methods to embedding-based indexes, but retaining their focus on low-IDF tokens, we can attain large improvements in space efficiency while maintaining high effectiveness. In experiments conducted on the MSMARCO passage ranking task, removing all embeddings corresponding to the 100 most frequent BERT tokens reduces the index size by 45%, with limited impact on effectiveness (e.g. no statistically significant degradation of nDCG@10 or MAP on the TREC 2020 query set). Similarly, on TREC Covid, we observed a 1.3% reduction in nDCG@10 for a 38% reduction in total index size.
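A minimal sketch of the pruning idea, under the assumption that each passage is stored as a list of BERT token ids plus one embedding per token (as in a ColBERT-style index): embeddings whose token id falls among the k most frequent tokens in the collection are simply dropped before indexing. Function and variable names are illustrative, not the authors' code.

```python
from collections import Counter
import numpy as np

def most_frequent_tokens(all_token_ids, k=100):
    """Return the token ids of the k most frequent tokens across the collection."""
    counts = Counter(t for doc in all_token_ids for t in doc)
    return {tok for tok, _ in counts.most_common(k)}

def prune_passage(token_ids, embeddings, pruned_set):
    """Keep only the embeddings whose token is not in the high-frequency pruned set."""
    keep = [i for i, t in enumerate(token_ids) if t not in pruned_set]
    return [token_ids[i] for i in keep], embeddings[keep]

# Toy example: two passages, 4-dimensional embeddings, prune the 2 most frequent tokens.
collection_ids = [[101, 2023, 2003, 102], [101, 2023, 4248, 102]]
pruned_set = most_frequent_tokens(collection_ids, k=2)
ids, embs = prune_passage(collection_ids[0], np.random.rand(4, 4), pruned_set)
print(ids, embs.shape)   # fewer stored embeddings per passage -> smaller index
```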
{"title":"Static Pruning for Multi-Representation Dense Retrieval","authors":"A. Acquavia, C. Macdonald, N. Tonellotto","doi":"10.1145/3573128.3604896","DOIUrl":"https://doi.org/10.1145/3573128.3604896","url":null,"abstract":"Dense retrieval approaches are challenging the prevalence of inverted index-based sparse representation approaches for information retrieval systems. Different families have arisen: single representations for each query or passage (such as ANCE or DPR), or multiple representations (usually one per token) as exemplified by the ColBERT model. While ColBERT is effective, it requires significant storage space for each token's embedding. In this work, we aim to prune the embeddings for tokens that are not important for effectiveness. Indeed, we show that, by adapting standard uniform and document-centric static pruning methods to embedding-based indexes, but retaining their focus on low-IDF tokens, we can attain large improvements in space efficiency while maintaining high effectiveness. Indeed, on experiments conducted on the MSMARCO passage ranking task, by removing all embeddings corresponding to the 100 most frequent BERT tokens, the index size is reduced by 45%, with limited impact on effectiveness (e.g. no statistically significant degradation of NDCG@10 or MAP on the TREC 2020 queryset). Similarly, on TREC Covid, we observed a 1.3% reduction in nDCG@10 for a 38% reduction in total index size.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129105906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the problem of automatically inferring the (LaTeX) document class used to write a scientific article from its PDF representation. Applications include improving the performance of information extraction techniques that rely on the style used in each document class, or determining the publisher of a given scientific article. We introduce two approaches: a simple classifier based on hand-coded document style features, and a CNN-based classifier that takes as input the bitmap representation of the first page of the PDF article. We experiment on a dataset of around 100k articles from arXiv, where labels come from the source LaTeX document associated with each article. Results show that the CNN approach significantly outperforms the one based on simple document style features, reaching over 90% average F1-score on the task of distinguishing among several dozen of the most common document classes.
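A minimal sketch of the CNN-based variant, assuming pdf2image (Poppler) for rendering the first page and PyTorch for the classifier. The network depth, image size, number of classes, and the file name "article.pdf" are all illustrative assumptions; the paper's actual architecture and training setup may differ.

```python
import torch
import torch.nn as nn
from pdf2image import convert_from_path
from torchvision import transforms

class FirstPageCNN(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):                       # x: (B, 1, H, W) grayscale first-page bitmap
        return self.classifier(self.features(x).flatten(1))

prep = transforms.Compose([
    transforms.Grayscale(), transforms.Resize((512, 384)), transforms.ToTensor(),
])

page = convert_from_path("article.pdf", first_page=1, last_page=1)[0]  # PIL image of page 1
x = prep(page).unsqueeze(0)                     # (1, 1, 512, 384)
logits = FirstPageCNN(n_classes=40)(x)          # scores over candidate document classes
print(logits.argmax(dim=1))
```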
{"title":"Automatically Inferring the Document Class of a Scientific Article","authors":"Antoine Gauquier, P. Senellart","doi":"10.1145/3573128.3604894","DOIUrl":"https://doi.org/10.1145/3573128.3604894","url":null,"abstract":"We consider the problem of automatically inferring the (LATEX) document class used to write a scientific article from its PDF representation. Applications include improving the performance of information extraction techniques that rely on the style used in each document class, or determining the publisher of a given scientific article. We introduce two approaches: a simple classifier based on hand-coded document style features, as well as a CNN-based classifier taking as input the bitmap representation of the first page of the PDF article. We experiment on a dataset of around 100k articles from arXiv, where labels come from the source LATEX document associated to each article. Results show the CNN approach significantly outperforms that based on simple document style features, reaching over 90% average F1-score on a task to distinguish among several dozens of the most common document classes.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127174098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study focuses on the importance of well-designed online matching systems for job seekers and employers. We treat resumes and job descriptions as documents: we calculate their similarity to determine the suitability of applicants, and rank a set of resumes based on their similarity to a specific job description. We employ Siamese Neural Networks, comprised of identical sub-network components, to evaluate the semantic similarity between documents. Our novel architecture integrates various neural network architectures, where each sub-network incorporates multiple layers, such as CNN, LSTM and attention layers, to capture sequential, local and global patterns within the data. The LSTM and CNN components are applied concurrently and merged together; the resulting output is then fed into a multi-head attention layer. These layers extract features and capture document representations, and the extracted features are then combined to form a unified representation of the document. We leverage pre-trained language models to obtain embeddings for each document, which serve as a lower-dimensional representation of our input data. The model is trained on a private dataset of 268,549 real resumes and 4,198 job descriptions from twelve industry sectors, resulting in a ranked list of matched resumes. We performed a comparative analysis involving our model, Siamese CNN (S-CNNs), Siamese LSTM with Manhattan distance, and a BERT-based sentence transformer model. By combining the power of language models with the novel Siamese architecture, this approach leverages both strengths to improve document ranking accuracy and enhance the matching process between job descriptions and resumes. Our experimental results demonstrate that our model outperforms the other models.
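A minimal sketch of one Siamese branch under the assumptions stated in the abstract: pre-trained token embeddings feed parallel CNN and LSTM components, their outputs are merged and passed through multi-head attention, and a pooled vector represents the document; the same branch (shared weights) encodes both resume and job description. Dimensions, pooling, and the final cosine similarity are illustrative choices, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Branch(nn.Module):
    def __init__(self, emb_dim=384, hidden=128, heads=4):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1)        # local patterns
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)                  # sequential patterns
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=heads, batch_first=True)

    def forward(self, x):                         # x: (B, T, emb_dim) pre-trained token embeddings
        c = F.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)  # (B, T, hidden)
        l, _ = self.lstm(x)                                       # (B, T, hidden)
        merged = torch.cat([c, l], dim=-1)                        # CNN and LSTM applied concurrently, merged
        a, _ = self.attn(merged, merged, merged)                  # multi-head attention over merged features
        return a.mean(dim=1)                                      # unified document representation

branch = Branch()
resume, job = torch.randn(2, 200, 384), torch.randn(2, 180, 384)
similarity = F.cosine_similarity(branch(resume), branch(job))     # shared weights = Siamese setup
print(similarity)
```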
{"title":"AI-powered Resume-Job matching: A document ranking approach using deep neural networks","authors":"Sima Rezaeipourfarsangi, E. Milios","doi":"10.1145/3573128.3609347","DOIUrl":"https://doi.org/10.1145/3573128.3609347","url":null,"abstract":"This study focuses on the importance of well-designed online matching systems for job seekers and employers. We treat resumes and job descriptions as documents. Then, calculate their similarity to determine the suitability of applicants, and rank a set of resumes based on their similarity to a specific job description. We employ Siamese Neural Networks, comprised of identical sub-network components, to evaluate the semantic similarity between documents. Our novel architecture integrates various neural network architectures, where each sub-network incorporates multiple layers such as CNN, LSTM and attention layers to capture sequential, local and global patterns within the data. The LSTM and CNN components are applied concurrently and merged together. The resulting output is then fed into a multi-head attention layer. These layers extract features and capture document representations. The extracted features are then combined to form a unified representation of the document. We leverage pre-trained language models to obtain embeddings for each document, which serve as a lower-dimensional representation of our input data. The model is trained on a private dataset of 268,549 real resumes and 4,198 job descriptions from twelve industry sectors, resulting in a ranked list of matched resumes. We performed a comparative analysis involving our model, Siamese CNN (S-CNNs), Siamese LSTM with Manhattan distance, and a BERT-based sentence transformer model. By combining the power of language models and the novel Siamese architecture, this approach leverages both strengths to improve document ranking accuracy and enhance the matching process between job descriptions and resumes. Our experimental results demonstrate that our model outperforms other models in terms of performance.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124775272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Lins, Gabriel de F. Pe Silva, Gustavo P. Chaves, Ricardo da Silva Barboza, R. Bernardino, S. Simske
Document image binarization is a fundamental step in many document processes. No binarization algorithm performs well on all types of document images, as the different kinds of digitization devices, and the physical noise present in the document or acquired during digitization, alter their performance. In addition, processing time is an important factor that may restrict an algorithm's applicability. This competition on binarizing photographed documents assessed the quality, time, space, and performance of five new algorithms and sixty-four "classical" and alternative algorithms. The evaluation dataset is composed of laser- and deskjet-printed documents, photographed using six widely used mobile devices with the strobe flash on and off, under two different angles and places of capture.
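For illustration only, here are two widely used "classical" binarization algorithms of the kind such a competition assesses (global Otsu thresholding and local adaptive thresholding), using OpenCV. The file name is a placeholder, and these are not the competition's entries or its evaluation code; adaptive methods are generally more robust to the uneven illumination typical of photographed pages.

```python
import cv2

gray = cv2.imread("photographed_page.jpg", cv2.IMREAD_GRAYSCALE)

# Global thresholding: a single threshold chosen by Otsu's method for the whole page.
_, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Local adaptive thresholding: per-pixel threshold from the neighbourhood mean
# (block size 35, constant 15), better suited to uneven lighting.
adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 35, 15)

cv2.imwrite("otsu.png", otsu)
cv2.imwrite("adaptive.png", adaptive)
```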
{"title":"Quality, Space and Time Competition on Binarizing Photographed Document Images","authors":"R. Lins, Gabriel de F. Pe Silva, Gustavo P. Chaves, Ricardo da Silva Barboza, R. Bernardino, S. Simske","doi":"10.1145/3573128.3604903","DOIUrl":"https://doi.org/10.1145/3573128.3604903","url":null,"abstract":"Document image binarization is a fundamental step in many document processes. No binarization algorithm performs well on all types of document images, as the different kinds of digitalization devices and the physical noises present in the document and acquired in the digitalization process alter their performance. Besides that, the processing time is also an important factor that may restrict its applicability. This competition on binarizing photographed documents assessed the quality, time, space, and performance of five new algorithms and sixty-four \"classical\" and alternative algorithms. The evaluation dataset is composed of laser and deskjet printed documents, photographed using six widely-used mobile devices with the strobe flash on and off, under two different angles and places of capture.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128525780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haytame Fallah, Emmanuel Bruno, P. Bellot, Elisabeth Murisasco
In this paper, we introduce a new approach to improve deep learning-based architectures for multi-label document classification. Dependencies between labels are an essential factor in the multi-label context, and our proposed strategy takes advantage of the knowledge extracted from label co-occurrences. The proposed method consists of adding a regularization term to the loss function used for training the model, incorporating the label similarities given by the label co-occurrences so as to encourage the model to jointly predict labels that are likely to co-occur and to avoid predicting labels that rarely appear together. This allows the neural model to better capture label dependencies. Our approach was evaluated on three datasets: the standard AAPD dataset, a corpus of scientific abstracts; Reuters-21578, a collection of news articles; and a newly proposed multi-label dataset called arXiv-ACM. Our method demonstrates improved performance, setting a new state of the art on all three datasets.
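A minimal sketch of the general idea, assuming a label-similarity matrix S (L x L, derived from training-set co-occurrence counts) is available: the regularizer nudges the predicted probabilities of frequently co-occurring labels towards each other. The exact form of the penalty, the similarity matrix, and the weighting are assumptions; the paper's formulation may differ.

```python
import torch
import torch.nn as nn

def cooccurrence_regularized_loss(logits, targets, S, lam=0.1):
    """Binary cross-entropy plus a penalty weighted by label co-occurrence similarity."""
    bce = nn.functional.binary_cross_entropy_with_logits(logits, targets)
    p = torch.sigmoid(logits)                   # (B, L) predicted label probabilities
    # Pairwise squared differences between label probabilities, weighted by S:
    # labels that frequently co-occur are encouraged to receive similar scores.
    diff = p.unsqueeze(2) - p.unsqueeze(1)      # (B, L, L)
    reg = (S.unsqueeze(0) * diff.pow(2)).mean()
    return bce + lam * reg

B, L = 8, 54
S = torch.rand(L, L)
S = (S + S.t()) / 2                             # placeholder symmetric similarity matrix
loss = cooccurrence_regularized_loss(torch.randn(B, L),
                                     torch.randint(0, 2, (B, L)).float(), S)
print(loss.item())
```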
{"title":"Exploiting Label Dependencies for Multi-Label Document Classification Using Transformers","authors":"Haytame Fallah, Emmanuel Bruno, P. Bellot, Elisabeth Murisasco","doi":"10.1145/3573128.3609356","DOIUrl":"https://doi.org/10.1145/3573128.3609356","url":null,"abstract":"We introduce in this paper a new approach to improve deep learning-based architectures for multi-label document classification. Dependencies between labels are an essential factor in the multi-label context. Our proposed strategy takes advantage of the knowledge extracted from label co-occurrences. The proposed method consists in adding a regularization term to the loss function used for training the model, in a way that incorporates the label similarities given by the label co-occurrences to encourage the model to jointly predict labels that are likely to co-occur, and and not consider labels that are rarely present with each other. This allows the neural model to better capture label dependencies. Our approach was evaluated on three datasets: the standard AAPD dataset, a corpus of scientific abstracts and Reuters-21578, a collection of news articles, and a newly proposed multi-label dataset called arXiv-ACM. Our method demonstrates improved performance, setting a new state-of-the-art on all three datasets.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128600283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine learning-based methods for PDF malware detection have grown in popularity because of their high levels of accuracy. However, many well-known ML-based detectors require a large number of specimen features to be collected before making a decision, which can be time-consuming. In this study, we present a novel, distance-based method for detecting PDF malware. Notably, our approach needs significantly less training data than traditional machine learning or neural network models. We evaluated our method using the Contagio dataset and report that it can detect 90.50% of malware samples with only 20 benign PDF files used for model training. To show statistical significance, we report results with a 95% confidence interval (CI). We evaluated our model's performance across multiple metrics, including Accuracy, F1 score, Precision, and Recall, alongside the False Positive Rate, False Negative Rate, True Positive Rate, and True Negative Rate. This paper highlights the feasibility of using distance-based methods for PDF malware detection, even with limited training data, thereby offering a promising direction for future research.
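A minimal sketch of a distance-based detector in the spirit of the abstract: fit a centroid and per-feature scale from a small set of benign PDF feature vectors and flag files whose normalized distance exceeds a threshold. The feature count, threshold, and random data are placeholders; they are not the paper's feature set, distance measure, or model.

```python
import numpy as np

def fit_benign_profile(benign_features):
    """benign_features: (n, d) array of features extracted from a few benign PDFs."""
    mu = benign_features.mean(axis=0)
    sigma = benign_features.std(axis=0) + 1e-8   # avoid division by zero
    return mu, sigma

def is_malicious(x, mu, sigma, threshold=3.0):
    """Flag a sample whose normalized distance from the benign centroid is large."""
    dist = np.linalg.norm((x - mu) / sigma)
    return dist > threshold

benign = np.random.rand(20, 16)                  # 20 benign PDFs, 16 hypothetical features each
mu, sigma = fit_benign_profile(benign)
print(is_malicious(np.random.rand(16) * 10, mu, sigma))
```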
{"title":"A PDF Malware Detection Method Using Extremely Small Training Sample Size","authors":"Ran Liu, Cynthia Matuszek, Charles Nicholas","doi":"10.1145/3573128.3609352","DOIUrl":"https://doi.org/10.1145/3573128.3609352","url":null,"abstract":"Machine learning-based methods for PDF malware detection have grown in popularity because of their high levels of accuracy. However, many well-known ML-based detectors require a large number of specimen features to be collected before making a decision, which can be time-consuming. In this study, we present a novel, distance-based method for detecting PDF malware. Notably, our approach needs significantly less training data compared to traditional machine learning or neural network models. We evaluated our method using the Contagio dataset and reported that it can detect 90.50% of malware samples with only 20 benign PDF files used for model training. To show the statistical significance, we reported results with a 95% confidence interval (CI). We evaluated our model's performance across multiple metrics including Accuracy, F1 score, Precision, and Recall, alongside False Positive Rate, False Negative Rates, True Positive Rate and True Negative Rates. This paper highlights the feasibility of using distance-based methods for PDF malware detection, even with limited training data, thereby offering a promising direction for future research.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"199 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127210136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hadia Showkat Kawoosa, Muhammad Suhaib Kanroo, P. Goyal
Chart Data Extraction (CDE) is a complex task in document analysis that involves extracting data from charts to facilitate accessibility for various applications, such as document mining, medical diagnosis, and accessibility for the visually impaired. CDE is challenging due to the intricate structure and specific semantics of charts, which include elements such as the title, axes, legend, and plot elements. Existing solutions for CDE have not yet satisfactorily addressed these issues. In this paper, we focus on two critical subtasks in CDE, Legend Analysis and Axis Analysis, and present a lightweight YOLO-based method for detection together with domain-specific heuristic algorithms (Axis Matching and Legend Matching) for matching. We evaluate the efficacy of our proposed method, LYLAA, on a real-world dataset, the ICPR2022 UB PMC dataset, and observe promising results compared to the competing teams in the ICPR2022 CHART-Infographics competition. Our findings showcase the potential of our proposed method in the CDE process.
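An illustrative heuristic in the spirit of "Legend Matching": associate each detected legend marker box with the nearest text-label box that lies to its right and overlaps it vertically. The (x0, y0, x1, y1) box format and the rule itself are assumptions used only to make the matching step concrete; the paper's actual heuristics may differ.

```python
def vertical_overlap(a, b):
    """Overlap in the y-direction between two boxes (x0, y0, x1, y1)."""
    return max(0.0, min(a[3], b[3]) - max(a[1], b[1]))

def match_legend(markers, labels):
    """Return {marker index: label index} using a right-of / nearest-gap heuristic."""
    matches = {}
    for i, m in enumerate(markers):
        candidates = [(j, l[0] - m[2]) for j, l in enumerate(labels)
                      if l[0] >= m[2] and vertical_overlap(m, l) > 0]
        if candidates:
            matches[i] = min(candidates, key=lambda c: c[1])[0]
    return matches

markers = [(10, 10, 20, 20), (10, 40, 20, 50)]   # detected legend markers
labels = [(25, 12, 80, 22), (25, 42, 90, 52)]    # detected legend text labels
print(match_legend(markers, labels))             # {0: 0, 1: 1}
```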
{"title":"LYLAA: A Lightweight YOLO based Legend and Axis Analysis method for CHART-Infographics","authors":"Hadia Showkat Kawoosa, Muhammad Suhaib Kanroo, P. Goyal","doi":"10.1145/3573128.3609355","DOIUrl":"https://doi.org/10.1145/3573128.3609355","url":null,"abstract":"Chart Data Extraction (CDE) is a complex task in document analysis that involves extracting data from charts to facilitate accessibility for various applications, such as document mining, medical diagnosis, and accessibility for the visually impaired. CDE is challenging due to the intricate structure and specific semantics of charts, which include elements such as title, axis, legend, and plot elements. The existing solutions for CDE have not yet satisfactorily addressed these issues. In this paper, we focus on two critical subtasks in CDE, Legend Analysis and Axis Analysis, and present a lightweight YOLO-based method for detection and domain-specific heuristic algorithms (Axis Matching and Legend Matching), for matching. We evaluate the efficacy of our proposed method, LYLAA, on a real-world dataset, the ICPR2022 UB PMC dataset, and observe promising results compared to the competing teams in the ICPR2022 CHART-Infographics competition. Our findings showcase the potential of our proposed method in the CDE process.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123467161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The process of extracting relevant data from historical handwritten documents can be time-consuming and challenging. In Ireland, from 1864 to 1922, government records regarding births, deaths, and marriages were documented by local registrars using printed tabular structures. Leveraging this systematic approach, we employ a neural network capable of segmenting scanned versions of these record documents. We seek to isolate the corner points with the goal of extracting the vital tabular elements and transforming them into consistently structured standalone images. By achieving uniformity in the segmented images, we enable more accurate row and column segmentation, enhancing our ability to isolate and classify individual cell contents effectively. This process must accommodate varying image qualities, different tabular orientations and sizes resulting from diverse scanning procedures, as well as faded and damaged ink lines that naturally occur over time.
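A minimal sketch of the rectification step that detected corner points enable, using OpenCV: the four table corners are mapped to a fixed-size rectangle so that every record image has a consistent structure for later row, column, and cell segmentation. The file name, corner coordinates, and output size are placeholders; the paper's own pipeline is not reproduced here.

```python
import cv2
import numpy as np

def rectify_table(image, corners, out_w=2000, out_h=1400):
    """corners: four (x, y) points ordered top-left, top-right, bottom-right, bottom-left."""
    src = np.array(corners, dtype=np.float32)
    dst = np.array([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]], dtype=np.float32)
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, M, (out_w, out_h))

page = cv2.imread("register_scan.jpg")
corners = [(132, 210), (1890, 180), (1920, 1460), (110, 1495)]   # e.g. predicted by the segmentation network
table = rectify_table(page, corners)
cv2.imwrite("table_rectified.png", table)                        # consistently structured standalone image
```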
{"title":"Tabular Corner Detection in Historical Irish Records","authors":"Enda O'Shea","doi":"10.1145/3573128.3609349","DOIUrl":"https://doi.org/10.1145/3573128.3609349","url":null,"abstract":"The process of extracting relevant data from historical handwritten documents can be time-consuming and challenging. In Ireland, from 1864 to 1922, government records regarding births, deaths, and marriages were documented by local registrars using printed tabular structures. Leveraging this systematic approach, we employ a neural network capable of segmenting scanned versions of these record documents. We sought to isolate the corner points with the goal of extracting the vital tabular elements and transforming them into consistently structured standalone images. By achieving uniformity in the segmented images, we enable more accurate row and column segmentation, enhancing our ability to isolate and classify individual cell contents effectively. This process must accommodate varying image qualities, different tabular orientations and sizes resulting from diverse scanning procedures, as well as faded and damaged ink lines that naturally occur over time.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"215 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128169739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrea Gemelli, S. Marinai, Emanuele Vivoli, T. Zappaterra
Early identification of dysgraphia in children is crucial for timely intervention and support. Traditional methods, such as the Brave Handwriting Kinder (BHK) test, which relies on manual scoring of handwritten sentences, are both time-consuming and subjective, posing challenges for accurate and efficient diagnosis. In this paper, we propose an approach for dysgraphia detection that leverages smart pens and deep learning techniques to automatically extract visual features from children's handwriting samples. To validate the solution, samples of children's handwriting were gathered and several interviews with domain experts were conducted. The approach has been compared with an algorithmic version of the BHK test and with interviews of several elementary school teachers.
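A minimal sketch, under stated assumptions, of the kind of visual feature extraction the abstract describes: a pretrained CNN backbone turns an image of a handwritten sample into a feature vector that feeds a binary (dysgraphia vs. typical handwriting) head. The backbone, head, and file name are illustrative assumptions, not the authors' architecture or data.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()                      # keep the 512-d visual feature vector
head = nn.Linear(512, 2)                         # dysgraphia vs. typical handwriting

prep = transforms.Compose([
    transforms.Resize((224, 224)), transforms.ToTensor(),
])

img = prep(Image.open("handwriting_sample.png").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    logits = head(backbone(img))
print(logits.softmax(dim=1))                     # class probabilities for this sample
```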
{"title":"Deep-learning for dysgraphia detection in children handwritings","authors":"Andrea Gemelli, S. Marinai, Emanuele Vivoli, T. Zappaterra","doi":"10.1145/3573128.3609351","DOIUrl":"https://doi.org/10.1145/3573128.3609351","url":null,"abstract":"Early identification of dysgraphia in children is crucial for timely intervention and support. Traditional methods, such as the Brave Handwriting Kinder (BHK) test, which relies on manual scoring of handwritten sentences, are both time-consuming and subjective posing challenges in accurate and efficient diagnosis. In this paper, an approach for dysgraphia detection by leveraging smart pens and deep learning techniques is proposed, automatically extracting visual features from children's handwriting samples. To validate the solution, samples of children handwritings have been gathered and several interviews with domain experts have been conducted. The approach has been compared with an algorithmic version of the BHK test and with several elementary school teachers' interviews.","PeriodicalId":310776,"journal":{"name":"Proceedings of the ACM Symposium on Document Engineering 2023","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115137924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}