Proceedings of the ACM Symposium on Document Engineering. ACM Symposium on Document Engineering: latest articles

Displaying chemical structural formulae in ePub format
S. Marinai, Stefano Quiriconi
We describe a tool designed to improve the display of chemical structural formulae on e-book readers. Small formulae are converted to a vector representation before being enlarged, which avoids the pixelation that zoomed raster images suffer from. Large formulae, conversely, are split into sub-images by cutting the image at suitable locations, chosen to reduce the parts of the formula that are broken. In both cases the formulae are embedded in an ePub document that lets users browse the chemical structure on most reading devices.
DOI: https://doi.org/10.1145/2361354.2361382 | Pages: 125-128 | Published: 2012-09-04
Citations: 0
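The splitting step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how cut locations are chosen, so the example assumes a binarized image (1 = ink, 0 = background) and a single vertical cut, and simply picks the column near a desired position that crosses the least ink. The name `best_cut_column` and its parameters are hypothetical.

```python
from typing import List


def best_cut_column(image: List[List[int]], target: int, window: int) -> int:
    """Pick a vertical cut column close to `target` that breaks the least ink.

    Assumptions (not from the paper): `image` is a list of equal-length rows
    of 0/1 values, and candidate columns lie within `window` of `target`.
    Ties are resolved in favour of the column closest to `target`.
    """
    width = len(image[0])
    lo = max(1, target - window)          # never cut at the left edge
    hi = min(width - 1, target + window)  # stay inside the image

    def ink(col: int) -> int:
        # Number of foreground pixels the cut would pass through.
        return sum(row[col] for row in image)

    return min(range(lo, hi + 1), key=lambda c: (ink(c), abs(c - target)))


# A tiny 3x6 "formula": column 2 is empty, so it is the safest place to cut.
image = [
    [1, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 1],
]
print(best_cut_column(image, target=3, window=1))  # → 2
```

A wider `window` trades evenly sized sub-images for cleaner cuts; a real tool would also need horizontal cuts and a rule for when no low-ink column exists.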
Optimal guillotine layout
G. Gange, K. Marriott, Peter James Stuckey
Guillotine-based page layout is a method for document layout commonly used by newspapers and magazines, in which each region of the page either contains a single article or is recursively split vertically or horizontally. Surprisingly, there appears to be little research into algorithms for automatic guillotine-based document layout. In this paper we give efficient algorithms to find optimal solutions to guillotine layout problems of two forms. In fixed-cut layout, the structure of the guillotining is given and we only have to determine the best configuration for each individual article to give the optimal total configuration. In free layout, we also have to search for the optimal structure. We give bottom-up and top-down dynamic programming algorithms to solve these problems, and propose a novel interaction model for documents on electronic media. Experiments show that our algorithms are effective for realistic layout problems.
DOI: https://doi.org/10.1145/2361354.2361359 | Pages: 13-22 | Published: 2012-09-04
Citations: 7
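The fixed-cut problem described in the abstract above lends itself to a small bottom-up dynamic programming sketch. The code below is an illustration under stated assumptions, not the paper's algorithm: each article is assumed to offer a finite list of (width, height) configurations on an integer grid, the objective is assumed to be minimum total height within a given page width, and the names `Leaf`, `Cut`, and `min_heights` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

INF = float("inf")


@dataclass
class Leaf:
    # Candidate (width, height) configurations for one article (assumed input).
    configs: List[Tuple[int, int]]


@dataclass
class Cut:
    kind: str  # "V": children sit side by side; "H": children are stacked
    left: "Node"
    right: "Node"


Node = Union[Leaf, Cut]


def min_heights(node: Node, page_width: int) -> List[float]:
    """Return a table h where h[w] is the least height achievable in width w."""
    if isinstance(node, Leaf):
        return [
            min((ht for wd, ht in node.configs if wd <= w), default=INF)
            for w in range(page_width + 1)
        ]
    a = min_heights(node.left, page_width)
    b = min_heights(node.right, page_width)
    if node.kind == "H":
        # Stacked children share the full width; their heights add up.
        return [a[w] + b[w] for w in range(page_width + 1)]
    # Side-by-side children split the width; the taller one sets the height.
    return [
        min(max(a[k], b[w - k]) for k in range(w + 1))
        for w in range(page_width + 1)
    ]


article_a = Leaf([(2, 6), (3, 4), (6, 2)])
article_b = Leaf([(2, 3), (3, 2)])
print(min_heights(Cut("V", article_a, article_b), 6)[6])  # → 4
```

Each subtree is solved once and its table reused by the parent, which is what makes the bottom-up pass cheap when the cut structure is fixed. Free layout would additionally have to search over the tree shapes themselves.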
Learning how to trade off aesthetic criteria in layout
P. Moulder, K. Marriott
Typesetting software is often faced with conflicting aesthetic goals. For example, choosing where to break lines in text might involve aiming to minimize hyphenation, variation in word spacing, and consecutive lines starting with the same word. Typically, automatic layout is modelled as an optimization problem in which the goal is to minimize a complex objective function that combines various penalty functions, each of which corresponds to a particular bad feature. Determining how to combine these penalty functions is difficult and time-consuming, and becomes harder each time another penalty is added. Here we present a machine-learning approach to this problem, and test it in the context of line-breaking. Our approach repeatedly asks an expert typographer which of a pair of layouts is better, and refines its estimate of how best to weight the penalties in a linear combination accordingly. It chooses layout pair queries by a heuristic that maximizes the amount that can be learnt from them, so as to reduce the number of comparisons the typographer must make.
DOI: https://doi.org/10.1145/2361354.2361361 | Pages: 33-36 | Published: 2012-09-04
Citations: 2
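The idea of refining penalty weights from pairwise judgements, as outlined in the abstract above, can be sketched with a simple perceptron-style update. This is a swapped-in stand-in technique, not the authors' learning method, and it omits their query-selection heuristic entirely. Layouts are assumed to be represented by vectors of penalty values, and `cost` and `learn_weights` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Penalties = Sequence[float]


def cost(weights: Sequence[float], penalties: Penalties) -> float:
    """Linear combination of penalties: lower cost means a better layout."""
    return sum(w * p for w, p in zip(weights, penalties))


def learn_weights(
    comparisons: List[Tuple[Penalties, Penalties]],
    n_penalties: int,
    rate: float = 0.1,
    epochs: int = 100,
) -> List[float]:
    """Fit non-negative weights from (better, worse) penalty-vector pairs.

    Assumption: each pair records an expert's answer to "which is better?".
    """
    weights = [1.0] * n_penalties
    for _ in range(epochs):
        changed = False
        for better, worse in comparisons:
            if cost(weights, better) >= cost(weights, worse):
                # The current weights disagree with the expert: shift weight
                # towards penalties that are larger in the rejected layout.
                weights = [
                    max(0.0, w + rate * (pw - pb))
                    for w, pb, pw in zip(weights, better, worse)
                ]
                changed = True
        if not changed:
            break
    return weights


# Penalty vectors are (hyphenation, spacing variation); values are made up.
judgements = [
    ((0, 3), (1, 0)),  # expert accepts loose spacing to avoid a hyphen
    ((0, 1), (0, 2)),  # with equal hyphenation, tighter spacing wins
]
w = learn_weights(judgements, n_penalties=2)
print(all(cost(w, b) < cost(w, x) for b, x in judgements))  # → True
```

The update only converges when some non-negative weighting agrees with every judgement; inconsistent expert answers would need a tolerance or a probabilistic model instead.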
Evaluation of BILBO reference parsing in digital humanities via a comparison of different tools
Young-Min Kim, P. Bellot, J. Tavernier, Elodie Faath, Marin Dacos
Automatic bibliographic reference annotation involves tokenizing references and identifying their fields. Recent methods tackle this problem with machine learning techniques such as Conditional Random Fields. However, state-of-the-art methods are typically trained and evaluated on well-structured data in a simple format, such as the bibliography at the end of a scientific article. This is one reason why parsing references that depart from a regular format does not work well. In our previous work, we established a standard for tokenization and feature selection on less formulaic data such as notes. In this paper, we evaluate our system, BILBO, against other popular online reference parsing tools on new data from a completely different source. BILBO is built on our own corpora, extracted and annotated from real-world data: digital humanities articles from OpenEdition's Revues.org site (90% in French). The robustness of the BILBO system allows language-independent tagging. We expect this first evaluation attempt to motivate the development of other efficient techniques for scattered and less formulaic bibliographic references.
DOI: https://doi.org/10.1145/2361354.2361400 | Pages: 209-212 | Published: 2012-09-04
Citations: 7
Effective radical segmentation of offline handwritten Chinese characters towards constructing personal handwritten fonts
Zhanghui Chen, Baoyao Zhou
Effective radical segmentation of handwritten Chinese characters can greatly facilitate subsequent character processing tasks, such as Chinese handwriting recognition/identification and the generation of Chinese handwritten fonts. In this paper, a popular snake model is enhanced by considering the guided image force and optimized by a Genetic Algorithm, so that it achieves a significant improvement in both accuracy and efficiency when applied to segmenting the radicals in handwritten Chinese characters. The proposed radical segmentation approach consists of three stages: constructing guide information, Genetic Algorithm optimization, and post-embellishment. Test results show that the proposed approach can effectively decompose overlapping and connected radicals from handwritten Chinese characters with various layout structures. The segmentation accuracy reaches 94.91% for complicated samples with overlapped and connected radicals, and the segmentation speed is 0.05 seconds per character. To demonstrate the advantages of the approach, radicals extracted from the user's input samples are reused to construct a personal Chinese handwritten font library. Experiments show that the constructed characters preserve the user's handwriting style well and perform well enough in practice. In this way, the user only needs to write a small number of samples to obtain his or her own handwritten font library. This method greatly reduces the cost of existing solutions and makes it much easier for people to use computers to write letters/e-mails, diaries/blogs, even magazines/books in their own handwriting.
DOI: https://doi.org/10.1145/2361354.2361379 | Pages: 107-116 | Published: 2012-09-04
Citations: 4
The Glozz platform: a corpus annotation and mining tool
Antoine Widlöcher, Yann Mathet
Corpus linguistics and Natural Language Processing make it necessary to produce and share reference annotations to which linguistic and computational models can be compared. Creating such resources requires a formal framework supporting description of heterogeneous linguistic objects and structures, appropriate representation formats, and adequate manual annotation tools, making it possible to locate, identify and describe linguistic phenomena in textual documents. The Glozz platform addresses all these needs, and provides a highly versatile corpus annotation tool with advanced visualization, querying and evaluation possibilities.
DOI: https://doi.org/10.1145/2361354.2361394 | Pages: 171-180 | Published: 2012-09-04
Citations: 59
500 year documentation
F. Marchese, Maninder Pal Kaur Shergill
Museum visitors today can regularly view 500-year-old art by Renaissance masters. Will visitors to museums 500 years in the future be able to see the work of digital artists from the early 21st century? This paper considers the real problem of conserving interactive digital artwork for museum installation in the far distant future by exploring the requirements for creating documentation that will support an artwork's adaptation to future technology. In effect, this documentation must survive as long as the artwork itself -- effectively, in perpetuity. We propose using software engineering methodologies to design this documentation.
DOI: https://doi.org/10.1145/2361354.2361391 | Pages: 157-160 | Published: 2012-09-04
Citations: 1
Scientific table type classification in digital library
Seongchan Kim, Keejun Han, Soon Young Kim, Ying Liu
Tables are ubiquitous in digital libraries and on the Web, where they serve various data delivery and document formatting goals. For example, tables are widely used to present experimental results or statistical data in a condensed fashion in scientific documents. Identifying and organizing tables of different types is a necessary task for better table understanding and for data sharing and reuse. This paper makes three contributions: 1) we propose an Introduction, Methods, Results, and Discussion (IMRAD)-based functional classification of tables for scientific documents; 2) we introduce a fine-grained table taxonomy based on extensive observation and investigation of tables in digital libraries; and 3) we investigate table characteristics and classify tables automatically based on the defined taxonomy. Preliminary experimental results show that our table taxonomy with salient features can significantly improve scientific table classification performance.
DOI: https://doi.org/10.1145/2361354.2361384 | Pages: 133-136 | Published: 2012-09-04
Citations: 6
A section title authoring tool for clinical guidelines
M. Truran, G. Georg, M. Cavazza, Dong Zhou
Professional users of medical information often report difficulties when attempting to locate specific information in lengthy documents. Sometimes these difficulties can be attributed to poorly specified section titles which fail to advertise relevant content. In this paper we describe preliminary work on a software plug-in for a document engineering environment that will assist authors when they formulate section-level headings. We describe two different algorithms which can be used to generate section titles. We compare the performance of these algorithms and correlate our experimental results with an evaluation of title quality performed by domain experts.
DOI: https://doi.org/10.1145/2361354.2361364 | Pages: 41-44 | Published: 2012-09-04
Citations: 0
Content and document based approach for digital productivity applications
Thierry Delprat
In today's world most of the data produced and consumed by employees is content. In this talk we will present our approach to create and deploy content and document based applications to improve business processes and user experience.
DOI: https://doi.org/10.1145/2361354.2361372 | Pages: 83-84 | Published: 2012-09-04
Citations: 0