Layout Analysis of Historic Architectural Program Documents
A. Oliaee, Andrew R. Tripp
Proceedings of the ACM Symposium on Document Engineering 2023. Published 2023-08-22.
DOI: 10.1145/3573128.3609339 (https://doi.org/10.1145/3573128.3609339)
Citations: 0
Abstract
In this paper, we introduce and make publicly available the CRS Visual Dataset, a new dataset consisting of 7,029 pages of human-annotated and validated scanned archival documents from the field of 20th-century architectural programming; and ArcLayNet, a fine-tuned machine learning model based on the YOLOv6-S object detection architecture. Architectural programming is an essential professional service in the Architecture, Engineering, Construction, and Operations (AECO) Industry, and the documents it produces are powerful instruments of this service. The documents in this dataset are the product of a creative process; they exhibit a variety of sizes, orientations, arrangements, and modes of content, and are underrepresented in current datasets. This paper describes the dataset and narrates an iterative process of quality control in which several deficiencies were identified and addressed to improve the performance of the model. In this process, our key performance indicators, mAP@0.5 and mAP@0.5:0.95, both improved by approximately 10%.
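The two indicators named above are both built on intersection over union (IoU) between a predicted and an annotated bounding box. The following is a minimal sketch of that building block, not the authors' evaluation code. It assumes boxes given as `(x1, y1, x2, y2)` corner coordinates and the common COCO-style reading of mAP@0.5:0.95 as an average over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, which the abstract does not spell out. The names `iou`, `IOU_THRESHOLDS`, and `thresholds_passed` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes yield no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Thresholds at which this prediction would count as a match."""
    score = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if score >= t]
```

Under this reading, mAP@0.5 accepts a detection once its IoU with an annotation reaches 0.5, while mAP@0.5:0.95 averages the same precision measure over all ten thresholds and so rewards tighter localization of layout regions.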