Title Segmentation in Arabic Document Pages
Bouressace Hassina
Computer Science Research Notes. DOI: 10.24132/csrn.2019.2902.2.6 (https://doi.org/10.24132/csrn.2019.2902.2.6)

Recent studies on text line segmentation have not focused on title segmentation in documents with complex structure,
where the titles may occupy the upper rows of each article on a document page. Many methods cannot correctly distinguish
between titles and body text, especially when a page contains more than one title. In this paper, we discuss this problem
and then present a straightforward and robust title segmentation approach. The proposed method was tested on
images from PATD (Printed Arabic Text Database) and achieved good results.