2008 The Eighth IAPR International Workshop on Document Analysis Systems: Latest Articles

Document Image Retrieval to Support Reading Mokkans
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.32
Akihito Kitadai, Jun Takakura, Masatoshi Ishikawa, M. Nakagawa, Hajime Baba, Akihiro Watanabe
This paper presents the design and implementation of a document image retrieval system to support reading mokkans. A mokkan is a wooden tablet with text written by brush in India ink. Despite the archaeological and historical value of the mokkans excavated from ancient ruins, many have not yet been decoded because the character patterns on them are lost or badly damaged. Character recognition for damaged patterns is useful for decoding such mokkans. Furthermore, if the recognition results show not only the character codes but also the images of the character patterns and of the whole mokkans, recognition becomes a useful form of document retrieval that helps complement the lost or unreadable parts of the mokkans. In the implementation, we built a public database of historical mokkans with their photographs, and a character recognition module, running on our support system, to search the database. The evaluation by archaeologists is in progress.
Citations: 8
The HCI Paradigm of HyperPrinting
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.11
T. Kieninger, A. Dengel
Today, printing and reverse printing (scanning, OCR, logical labeling, etc.) technologies have become quite mature and thus allow an easy transition of documents between the physical and electronic worlds. However, no technology today supports the lossless interpretation of paper-based user interaction with direct effects on the electronic representation of the document. The HyperPrinting environment tries to fill this gap and thus accommodates the personal preferences of a majority of office workers: not only managers and knowledge workers prefer to read longer documents, articles, or news on paper rather than on a computer monitor or handheld computer. With the help of HyperPrinting, users can annotate, send notes, or initiate tasks; it thus offers a completely new paradigm for the use and treatment of paper documents. As a side effect, the use of HyperPrinting builds up a document repository that is searchable not only by full text but also by meta-information, which in turn depends on the selected user scenario.
Citations: 1
Multi-Oriented English Text Line Extraction Using Background and Foreground Information
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.83
P. Roy, U. Pal, J. Lladós, F. Kimura
In graphical documents (maps, engineering drawings), artistic documents, etc., there are many printed materials in which text lines are not parallel to each other but are multi-oriented and curved. For OCR of such documents, individual text lines must first be extracted. Extracting individual text lines from multi-oriented and/or curved text documents is a difficult problem. In this paper, we propose a novel method to extract individual text lines from such document pages, based on the foreground and background information of the text characters. To handle background information, the water reservoir concept is used. In the proposed scheme, individual components are first detected and grouped into 3-character clusters using their inter-component distance, size, and positional information. Applying graph concepts, the initial 3-character clusters are merged into larger cluster groups. Using inter-character background information, the orientations of the extreme characters of a larger cluster are determined and, based on these orientations, two candidate regions are formed from the cluster. Finally, with the help of these candidate regions, individual lines are extracted. The experiments gave encouraging results.
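The cluster-merging step can be illustrated with a small sketch. The abstract does not say how the graph is built, so this example assumes (hypothetically) that two 3-character clusters belong to the same group whenever they share a component, and it merges them with a union-find structure. The function name `merge_clusters` and the integer component ids are illustrative only and are not taken from the paper.

```python
def merge_clusters(triples):
    """Merge 3-character clusters that share a component id.

    triples: iterable of (a, b, c) component ids forming one initial cluster.
    Returns a sorted list of sorted component-id lists, one per merged group.
    Assumption: sharing a component is the merge criterion (not from the paper).
    """
    parent = {}

    def find(x):
        # Follow parent links to the group representative, halving the path.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for a, b, c in triples:
        union(a, b)
        union(b, c)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())
```

For example, the clusters `(1, 2, 3)` and `(3, 4, 5)` share component 3 and would end up in one group, while `(7, 8, 9)` would stay separate.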
Citations: 19
State: A Multimodal Assisted Text-Transcription System for Ancient Documents
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.28
Albert Gordo, D. Llorens, A. Marzal, F. Prat, J. M. Vilar
We present a complete assisted transcription system for ancient documents: State. The system consists of two applications: a pen-based, interactive application to assist humans in transcribing ancient documents and a recognition engine which offers automatic transcriptions via a web service. The interaction model and the recognition algorithm employed in the current version of State are presented. Some preliminary experiments show the productivity gains obtained with the system when transcribing a document and the error rate of the current recognition engine.
Citations: 11
Unsupervised Decomposition of Color Document Images by Projecting Colors to a Spherical Surface
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.37
Yuan He, Jun Sun, S. Naoi, Y. Fujii, K. Fujimoto
A decomposition method for color document images is proposed in this paper. A two dimensional feature surface is constructed from the input color image, and then a novel and unsupervised method based on contour lines is proposed to segment the surface. In detail, colors of the image pixels are firstly projected to a spherical surface whose center is the background color. The projection is used to transform the observed colors to the corresponding 'ideal' foreground colors. Then the spherical surface is segmented into several non-overlapped regions, and each region corresponds to an individual layer of the input color document image. Finally, the image pixels are projected to the spherical surface and classified to the corresponding layers.
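The projection step can be sketched as follows. This is a minimal illustration, assuming RGB colors and a unit sphere centered at a known background color; the paper's actual color space, sphere radius, and contour-line segmentation are not given in the abstract and are not reproduced here. The function name `project_to_sphere` and the `eps` threshold are hypothetical.

```python
import numpy as np

def project_to_sphere(pixels, background, eps=1e-6):
    """Project colors onto the unit sphere centered at `background`.

    pixels: (N, 3) array-like of RGB values; background: length-3 RGB value.
    Returns (directions, distances). A pixel closer than `eps` to the
    background has no well-defined direction and gets a zero vector.
    """
    pixels = np.asarray(pixels, dtype=float)
    offsets = pixels - np.asarray(background, dtype=float)
    distances = np.linalg.norm(offsets, axis=1)
    directions = np.zeros_like(offsets)
    mask = distances > eps
    # Dividing by the distance keeps only the direction away from the background.
    directions[mask] = offsets[mask] / distances[mask, None]
    return directions, distances
```

Under these assumptions, a pure red pixel `(255, 0, 0)` and a faded red `(255, 127.5, 127.5)` on a white background project to the same point on the sphere, which is one way to read the abstract's mapping of observed colors to an "ideal" foreground color.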
Citations: 2
A Two-Step Dewarping of Camera Document Images
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.40
N. Stamatopoulos, B. Gatos, I. Pratikakis, S. Perantonis
Dewarping of camera document images has attracted a lot of interest over the last few years, since warping not only reduces document readability but also affects the accuracy of an OCR application. In this paper, a two-step approach for efficient dewarping of camera document images is presented. In the first step, a coarse dewarping is accomplished with the help of a transformation model which maps the projection of a curved surface to a 2D rectangular area. The projection of the curved surface is delimited by the two curved lines which fit the top and bottom text lines, along with the two straight lines which fit the left and right text boundaries. In the second step, fine dewarping is achieved based on word detection. All words are pose-normalized, guided by the lower and upper word baselines. Experimental results on several camera document images demonstrate the robustness and effectiveness of the proposed technique.
Citations: 2
Symbol Descriptor Based on Shape Context and Vector Model of Information Retrieval
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.58
T.-O. Nguyen, S. Tabbone, O. R. Terrades
In this paper we present an adaptive method for graphic symbol representation based on shape contexts. The proposed descriptor is invariant under classical geometric transforms (rotation, scale) and is based on interest points. To reduce the complexity of matching a symbol to a large set of candidates, we use the popular vector model for information retrieval. In this way, we build a visual vocabulary on the set of shape descriptors, and each symbol is retrieved through its visual words. Experimental results on complex and occluded symbols show that the approach is very promising.
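The vector-model retrieval idea can be sketched as follows. This is a generic TF-IDF and cosine-similarity ranking over visual-word ids, assuming each symbol has already been quantized into a list of integer visual words; the shape-context descriptor, the vocabulary construction, and the exact weighting used in the paper are not specified in the abstract, so the smoothed IDF here is an assumption. All function names are hypothetical.

```python
import math
from collections import Counter

def weight(words, idf):
    """Turn a list of visual-word ids into a sparse TF-IDF vector (dict)."""
    tf = Counter(w for w in words if w in idf)
    return {w: c * idf[w] for w, c in tf.items()}

def build_index(symbols):
    """symbols: list of visual-word lists. Returns (idf, vectors)."""
    n = len(symbols)
    df = Counter(w for words in symbols for w in set(words))
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}
    return idf, [weight(words, idf) for words in symbols]

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query_words, idf, vectors):
    """Return (score, symbol_index) pairs, best match first."""
    q = weight(query_words, idf)
    return sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)
```

Because each symbol is reduced to a sparse vector, a query is compared against candidates with dot products instead of point-to-point descriptor matching, which is the complexity reduction the abstract refers to.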
Citations: 27
The Convergence of Iterated Classification
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.52
Chang An, H. Baird
We report an improved methodology for training a sequence of classifiers for document image content extraction, that is, the location and segmentation of regions containing handwriting, machine-printed text, photographs, blank space, etc. The resulting segmentation is pixel-accurate, and so accommodates a wide range of zone shapes (not merely rectangles). We have systematically explored the best scale (spatial extent) of features. We have found that the methodology is sensitive to ground-truthing policy, and especially to precision of ground-truth boundaries. Experiments on a diverse test set of 83 document images show that tighter ground-truth reduces per-pixel classification errors by 45% (from 38.9% to 21.4%). Strong evidence, from both experiments and simulation, suggests that iterated classification converges region boundaries to the ground-truth (i.e. they don't drift). Experiments show that four-stage iterated classifiers reduce the error rates by 24%. We also present an analysis of special cases suggesting reasons why boundaries converge to the ground-truth.
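The 45% figure in the abstract is a relative reduction, not a difference in percentage points. A short check of the arithmetic, using only the two error rates quoted above (the helper name `relative_reduction` is illustrative):

```python
def relative_reduction(before, after):
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before

# Per-pixel error rates quoted in the abstract, in percent.
reduction = relative_reduction(38.9, 21.4)
print(f"{reduction:.1%}")  # absolute drop is 17.5 points; relative drop is about 45%
```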
Citations: 7
Word and Symbol Spotting Using Spatial Organization of Local Descriptors
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.24
Marçal Rusiñol, J. Lladós
In this paper we present a method to spot both text and graphical symbols in a collection of images of wiring diagrams. Word spotting and symbol spotting methods tend to use the most discriminative features to describe the objects to be located. As a result, textual and symbolic information cannot be handled at the same time. We propose a spotting architecture able to index both words and symbols, inspired by off-the-shelf object recognition architectures. Keypoints are extracted from a document image and a local descriptor is computed at each of these points of interest. The spatial organization of these descriptors validates the hypothesis that an object (text or symbol) is found at a certain location and under a certain pose.
Citations: 35
On the Reading of Tables of Contents
Pub Date : 2008-09-16 DOI: 10.1109/DAS.2008.87
Prateek Sarkar, E. Saund
This paper presents a framework for understanding tables of contents (TOC) of books, journals, and magazines. We propose a universal logical structure representation in terms of a hierarchy of entries, each of which may contain a descriptor and a locator. We enumerate graphical and perceptual cues that guide the parsing of tables of contents in terms of this formalism. We make initial suggestions about the form of evaluation metrics for comparing ground-truthed tables of contents with the output of recognition algorithms. Typical and atypical tables of contents are used throughout to illustrate significant phenomena that must be dealt with in principled ways in any general TOC interpretation scheme. Finally, we discuss implications of our observations for the design of recognition algorithms.
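The entry hierarchy described above can be sketched as a small data structure. This is only an illustration of "a hierarchy of entries, each of which may contain a descriptor and a locator"; the class name `TocEntry`, its field types, and the `flatten` helper are assumptions and not the paper's formalism.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

@dataclass
class TocEntry:
    """One TOC entry; descriptor and locator are both optional."""
    descriptor: Optional[str] = None   # e.g. a chapter or article title
    locator: Optional[str] = None      # e.g. a page number, kept as text
    children: List["TocEntry"] = field(default_factory=list)

def flatten(entry: TocEntry, depth: int = 0) -> Iterator[Tuple[int, Optional[str], Optional[str]]]:
    """Yield (depth, descriptor, locator) in reading order."""
    yield depth, entry.descriptor, entry.locator
    for child in entry.children:
        yield from flatten(child, depth + 1)

# A section heading with no locator, containing two located entries.
toc = TocEntry("Part I", None, [TocEntry("Introduction", "1"), TocEntry("Methods", "17")])
for depth, descriptor, locator in flatten(toc):
    print("  " * depth + f"{descriptor} ... {locator}")
```

Keeping both fields optional lets the same structure hold headings without page numbers as well as entries whose descriptor could not be read.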
Citations: 8