Pub Date: 2023-07-17 DOI: https://dl.acm.org/doi/10.1145/3609484
Tong Wei, Yuqi Chen
Ding is a significant type of Chinese bronze that holds key cultural value. Traditional humanists have primarily focused on dating and classifying Ding. However, in the context of Digital Humanities, the research perspective of humanities scholars is gradually shifting towards data-driven research, with linked data emerging as a popular topic. A well-defined, standard ontology representing the complete domain knowledge is essential for linked Ding data. Unfortunately, most existing ontologies either cannot represent fine-grained knowledge of Ding or are restricted to representing only partial knowledge of bronze Ding. To address this, we propose a fine-grained Ding ontology and present it in detail, covering Chinese bronze Ding of the Shang and Zhou dynasties (from 1600 BC to 256 BC). We evaluate its effectiveness with OOPS! and OntoMetrics and by answering competency questions in SPARQL. The methodology used to build the Ding ontology follows the ISO principles (ISO 1087 and ISO 704). The objective of this paper is to develop an open ontology of Ding during the Shang and Zhou dynasties, which can serve as a valuable resource for bilingual terminology dictionaries. The Ding ontology is published at: http://www.dhontology.com/ChineseCulture/data/bronze.owl
Title: A Ding Ontology of Chinese Bronze. Journal: ACM Journal on Computing and Cultural Heritage.
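The Ding abstract mentions evaluating the ontology by answering competency questions in SPARQL. As a rough illustration of what answering such a question involves, the following sketch matches triple patterns over a tiny in-memory graph, which is roughly what a SPARQL basic graph pattern does. It is plain Python rather than SPARQL, and every identifier in it (`ding_001`, `hasLegCount`, `datedTo`, `PeriodA`) is a hypothetical placeholder, not a term taken from the published ontology.

```python
# Hypothetical triples; the class and property names below are placeholders
# and are NOT taken from the published Ding ontology.
triples = [
    ("ding_001", "rdf:type", "Ding"),
    ("ding_001", "hasLegCount", 3),
    ("ding_001", "datedTo", "PeriodA"),
    ("ding_002", "rdf:type", "Ding"),
    ("ding_002", "hasLegCount", 4),
    ("ding_002", "datedTo", "PeriodA"),
]


def match(pattern, data):
    """Return the triples that fit a (subject, predicate, object) pattern.

    A None in the pattern behaves like a SPARQL variable and matches anything.
    """
    return [t for t in data if all(q is None or q == v for q, v in zip(pattern, t))]


def three_legged_ding(data):
    """Competency question: which Ding individuals have exactly three legs?"""
    dings = {s for s, _, _ in match((None, "rdf:type", "Ding"), data)}
    tripods = {s for s, _, _ in match((None, "hasLegCount", 3), data)}
    # Joining the two patterns on the shared subject mirrors a two-pattern query.
    return sorted(dings & tripods)
```

Against the real ontology, the same question would be posed as a SPARQL query to a triple store loaded with the published OWL file; the sketch only shows the pattern-matching idea.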
Pub Date: 2023-07-15 DOI: https://dl.acm.org/doi/10.1145/3606699
Dominique Piché, Ludovic Font, Amal Zouaq, Michel Gagnon
The cultural world offers a staggering amount of rich and varied metadata on cultural heritage, accumulated by governmental, academic and commercial players. However, the variety of institutions involved means that the data is stored in just as many complex and often incompatible models and standards, which limits its availability to, and exploration by, the general public.
The adoption of Linked Open Data technologies allows a strong interlinking of these various databases as well as external connections with existing knowledge bases. However, because these databases often contain references to the same entities, the delicate issue of entity alignment becomes the central challenge, especially when unique global identifiers are absent or scarce.
To tackle this issue, we explored two approaches: one based on a set of heuristic rules and one based on masked language models (MLMs). We compare these two approaches, as well as different variations of MLMs, including some models trained on a different language, and various levels of data cleaning and labeling. Our results show that heuristics are a solid approach, but that MLM-based entity alignment performs better, is robust to the data format, and does not require any data preprocessing, which the heuristic approach did require in our experiments.
Title: Comparing Heuristic Rules and Masked Language Models for Entity Alignment in the Literature Domain. Journal: ACM Journal on Computing and Cultural Heritage.
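To make the idea of heuristic entity alignment concrete, here is a minimal sketch of one possible rule for matching person records when no shared global identifier exists. It is not the rule set used by the authors, which the abstract does not describe. The record fields (`name`, `birth`), the 0.9 threshold, and the sample names are all assumptions chosen for illustration. It also shows why this kind of approach depends on preprocessing: the names must be normalized before they can be compared.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and sort the name tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    # Sorting tokens makes "Surname, Given" and "Given Surname" compare equal.
    return " ".join(sorted(cleaned.split()))


def same_entity(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Hypothetical rule: birth years must not conflict and names must be similar."""
    if a.get("birth") and b.get("birth") and a["birth"] != b["birth"]:
        return False
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold


# Invented sample records from two imaginary catalogues.
record_a = {"name": "Élise Dupré", "birth": 1950}
record_b = {"name": "Dupre, Elise", "birth": 1950}
record_c = {"name": "Dupre, Elise", "birth": 1962}
```

With these sample records, `same_entity(record_a, record_b)` holds because the normalized names are identical, while `record_c` is rejected by the conflicting birth year.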
Pub Date: 2023-07-06 DOI: https://doi.org/10.1145/3606706
D. Mikhaylova, Daniele Metilli
Historical architectural archives enjoy attention from diverse audiences, acting as a primary source of information for architects, historians, public authorities, and ordinary citizens alike. In Italy, interest in architectural archives has grown slowly but steadily over the last 20 years. However, architectural archives do not generally follow the trend, common among museums and galleries, of publishing digitized materials and providing standard metadata for individual records. The information available online usually includes only an archival finding aid, rather than metadata about the individual records or fully digital versions of them. While cataloguing standards for archival descriptions of architectural records have existed at least since the 1980s, the rise of Linked Open Data as a framework for publishing cultural heritage data has allowed archivists to enhance these archival descriptions with richer contextual information and links to external knowledge bases. In this paper we present the ITDT ontology, an extension of the Records in Contexts Ontology that facilitates the representation of architectural records and of the context of architectural projects, their processes, and the participating entities. We discuss the application of the ontology to the project files of Italian architect and engineer Dino Tamburini (1924–2011), and the creation of a digital archive offering multiple perspectives on the records.
Title: Extending RiC-O to model historical architectural archives: The ITDT ontology. Journal: ACM Journal on Computing and Cultural Heritage.
Pub Date: 2023-07-06 DOI: https://dl.acm.org/doi/10.1145/3606702
Alessandro Adamou, Davide Picca, Yumeng Hou, Paula Loreto Granados-García
Investigating the intangible nature of a cultural domain can take multiple forms, addressing for example the aesthetic, epistemic and social dimensions of its phenomenology. The context of Southern Chinese martial arts is of particular significance as it carries immaterial components of all these aspects: the technical and stylistic framework of a martial art system; the imagery associated with movements; and the transmission of knowledge orally, practically or through influence, are but examples of intangible characteristics that can and should be captured, not unlike cultural artifacts. The latter case, that of formalizing cultural influence through its various forms of evidence, is emblematic as well as largely untrodden ground. A previous attempt at detecting cultural influence computationally was made in the context of Roman archaeology, though that early effort was tightly bound to its domain model; also, there has hardly been any prior dedicated effort to model the martial arts domain through ontologies.
In this paper, we present the realization of the full cycle of a computational approach to investigating cultural contact in Southern Chinese martial arts. The entire approach is predicated upon the use of standards and techniques of the Semantic Web and formal knowledge. Starting from a modular domain ontology, which models martial arts independently of the goal of capturing cultural influence, we perform knowledge extraction from archival material from the Hong Kong Martial Arts Living Archive and generate a dataset of the results modeled after said ontology. Then, we combine the resulting knowledge base with a rule model that represents ways to infer knowledge of potential contact between cultures based on the evidence present in the knowledge base. The results offer an insight into how an inference-based computational model can be applied to detect interesting facts even in the as-yet underexplored domain of intangible cultural heritage. The implemented workflow shows that the full-cycle employment of semantic technologies can offer the ground truth required for largely different approaches, such as statistical and machine learning ones, to operate.
Title: The Facets of Intangible Heritage in Southern Chinese Martial Arts: Applying a Knowledge-Driven Cultural Contact Detection Approach. Journal: ACM Journal on Computing and Cultural Heritage.
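The combination of a knowledge base with a rule model can be pictured with a very small sketch. The single rule below (a technique shared by styles from different regions suggests potential contact) and all names in it (`StyleA`, `RegionX`, `usesTechnique`, `potentialContactWith`) are invented for illustration. They are not the authors' ontology terms or rules, and a real implementation would rely on Semantic Web rule tooling rather than plain Python sets.

```python
from itertools import combinations

# Hypothetical evidence triples: (subject, predicate, object).
facts = {
    ("StyleA", "originatesFrom", "RegionX"),
    ("StyleB", "originatesFrom", "RegionY"),
    ("StyleC", "originatesFrom", "RegionX"),
    ("StyleA", "usesTechnique", "TechniqueT"),
    ("StyleB", "usesTechnique", "TechniqueT"),
    ("StyleC", "usesTechnique", "TechniqueU"),
}


def infer_potential_contact(kb):
    """Apply one illustrative rule and return the newly inferred triples."""
    origin = {s: o for s, p, o in kb if p == "originatesFrom"}
    techniques = {}
    for s, p, o in kb:
        if p == "usesTechnique":
            techniques.setdefault(s, set()).add(o)
    inferred = set()
    # Compare every unordered pair of styles whose origin is known.
    for a, b in combinations(sorted(origin), 2):
        shared = techniques.get(a, set()) & techniques.get(b, set())
        if shared and origin[a] != origin[b]:
            inferred.add((a, "potentialContactWith", b))
    return inferred
```

The inferred triples are kept separate from the input facts, so asserted evidence and derived hypotheses about contact remain distinguishable.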
Pub Date: 2023-07-06 DOI: https://dl.acm.org/doi/10.1145/3606703
Francesca Murano, Valeria Quochi, Angelo Mario Del Grosso, Luca Rigobianco, Mariarosaria Zinzi
This paper discusses the challenges addressed in the digital scholarly encoding of the fragmentary texts of the languages of Ancient Italy according to the TEI/EpiDoc Guidelines in XML format. It describes the solutions and customisations that have been adopted for dealing with the peculiarities of our epigraphical documentation and with the formalisation of epigraphical information deemed interesting for data retrieval from a historical-linguistic perspective. The creation of a digital corpus consisting of new critical editions of selected inscriptions is being carried out in the context of the project “Languages and Cultures of Ancient Italy. Historical Linguistics and Digital Models”, which aims to investigate the languages of Ancient Italy by combining the traditional methods of historical linguistics with the methods and technology of the digital humanities and computational lexicography. More specifically, the purpose of the project is to create a set of interrelated digital language resources comprising: 1) a digital corpus of text editions; 2) a computational lexicon compliant with Semantic Web requirements; 3) a relevant bibliographic reference dataset encoded according to the FRBRoo/LRMoo specifications. Additionally, selected textual data and scientific interpretations will be encoded using CIDOC CRM and its extensions, namely CRMtex and CRMinf. The present contribution thus tackles one of the main aspects of the project, and proposes significant innovations in the encoding of critical editions for epigraphic texts of fragmentary languages, which will hopefully foster future interoperability and integration with other external datasets, a paramount concern of the project.
Title: Describing Inscriptions of Ancient Italy. The ItAnt Project and Its Information Encoding Process. Journal: ACM Journal on Computing and Cultural Heritage.
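For readers unfamiliar with TEI/EpiDoc markup of fragmentary texts, the sketch below builds one encoded line with Python's standard `xml.etree.ElementTree`. It uses only general EpiDoc elements (`ab`, `lb`, `supplied`, `gap`) and says nothing about the project's own customisations, which the abstract does not detail. The function name `encode_line` and the placeholder text are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def encode_line(n: int, preserved: str, restored: str, lost_chars: int) -> str:
    """Encode one inscription line: preserved text, an editorial restoration,
    and a gap of known length where the text is lost."""
    ab = ET.Element("ab")
    lb = ET.SubElement(ab, "lb", {"n": str(n)})
    lb.tail = preserved  # text that follows the line-beginning marker
    supplied = ET.SubElement(ab, "supplied", {"reason": "lost"})
    supplied.text = restored  # text restored by the editor
    ET.SubElement(
        ab, "gap", {"reason": "lost", "quantity": str(lost_chars), "unit": "character"}
    )
    return ET.tostring(ab, encoding="unicode")


# Placeholder strings stand in for real transliterated text.
xml_line = encode_line(1, "abc", "de", 3)
```

Keeping preserved text, restorations, and gaps in separate elements is what lets a later query distinguish attested forms from editorial reconstructions.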
A. Adamou, Davide Picca, Yumeng Hou, Paula Loreto Granados-García
Investigating the intangible nature of a cultural domain can take multiple forms, addressing for example the aesthetic, epistemic and social dimensions of its phenomenology. The context of Southern Chinese martial arts is of particular significance as it carries immaterial components of all these aspects: the technical and stylistic framework of a martial art system; the imagery associated to movements; and the transmission of knowledge orally, practically or through influence, are but examples of intangible characteristics that can and should be captured, not unlike cultural artifacts. The latter case– the one of formalizing cultural influence through its various forms of evidence– is emblematic as well as largely untrodden ground. A previous attempt at detecting cultural influence computationally was made in the context of Roman archaeology, though the binding of that early effort with the domain model was tight; also, there has hardly been any prior dedicated effort to model the martial arts domain through ontologies. In this paper, we present the realization of the full cycle of a computational approach to investigating cultural contact in Southern Chinese martial arts. The entire approach is predicated upon the usage of standards and techniques of the Semantic Web and formal knowledge. Starting from a modular domain ontology, which models martial arts independently of the goal of capturing cultural influence, we perform knowledge extraction from archival material from the Hong Kong Martial Arts Living Archive and generate a dataset of the results modeled after said ontology. Then, we combine the resulting knowledge base with a rule model that represents ways to infer knowledge of potential contact between cultures based on the evidence present in the knowledge base. The results offer an insight into how an inference-based computational model can be applied to detect interesting facts even in the as-yet underexplored domain of intangible cultural heritage. 
The implemented workflow shows that the full-cycle employment of semantic technologies can offer the ground truth required for largely different approaches, such as statistical and machine learning ones, to operate.
{"title":"The Facets of Intangible Heritage in Southern Chinese Martial Arts: Applying a Knowledge-Driven Cultural Contact Detection Approach","authors":"A. Adamou, Davide Picca, Yumeng Hou, Paula Loreto Granados-García","doi":"10.1145/3606702","DOIUrl":"https://doi.org/10.1145/3606702","url":null,"abstract":"Investigating the intangible nature of a cultural domain can take multiple forms, addressing for example the aesthetic, epistemic and social dimensions of its phenomenology. The context of Southern Chinese martial arts is of particular significance as it carries immaterial components of all these aspects: the technical and stylistic framework of a martial art system; the imagery associated to movements; and the transmission of knowledge orally, practically or through influence, are but examples of intangible characteristics that can and should be captured, not unlike cultural artifacts. The latter case– the one of formalizing cultural influence through its various forms of evidence– is emblematic as well as largely untrodden ground. A previous attempt at detecting cultural influence computationally was made in the context of Roman archaeology, though the binding of that early effort with the domain model was tight; also, there has hardly been any prior dedicated effort to model the martial arts domain through ontologies. In this paper, we present the realization of the full cycle of a computational approach to investigating cultural contact in Southern Chinese martial arts. The entire approach is predicated upon the usage of standards and techniques of the Semantic Web and formal knowledge. Starting from a modular domain ontology, which models martial arts independently of the goal of capturing cultural influence, we perform knowledge extraction from archival material from the Hong Kong Martial Arts Living Archive and generate a dataset of the results modeled after said ontology. 
Then, we combine the resulting knowledge base with a rule model that represents ways to infer knowledge of potential contact between cultures based on the evidence present in the knowledge base. The results offer an insight into how an inference-based computational model can be applied to detect interesting facts even in the as-yet underexplored domain of intangible cultural heritage. The implemented workflow shows that the full-cycle employment of semantic technologies can offer the ground truth required for largely different approaches, such as statistical and machine learning ones, to operate.","PeriodicalId":54310,"journal":{"name":"ACM Journal on Computing and Cultural Heritage","volume":"35 1","pages":""},"PeriodicalIF":2.4,"publicationDate":"2023-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91006289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Murano, Valeria Quochi, A. D. Grosso, Luca Rigobianco, M. Zinzi
This paper discusses the challenges addressed in the digital scholarly encoding of the fragmentary texts of the languages of Ancient Italy according to the TEI/EpiDoc Guidelines in XML format. It describes the solutions and customisations that have been adopted for dealing with the peculiarities of our epigraphical documentation and with the formalisation of epigraphical information deemed interesting for data retrieval in a historical linguistic perspective. The making of a digital corpus consisting of new critical editions of selected inscriptions is a work carried out in the context of the project ”Languages and Cultures of Ancient Italy. Historical Linguistics and Digital Models”, which aims to investigate the languages of Ancient Italy by combining the traditional methods, proper to historical linguistics, with methods and technology proper to the digital humanities and computational lexicography. More specifically, the purpose of the project is to create a set of interrelated digital language resources which comprise: 1) a digital corpus of texts editions; 2) a computational lexicon compliant with the Web Semantic requirements; 3) a relevant bibliographic reference dataset encoded according to the FRBRoo/LRMoo specifications. Additionally, selected textual data and scientific interpretations will be encoded using CIDOC CRM and its extensions, namely CRMtex and CRMinf. The present contribution thus tackles one of the main aspects of the project, and proposes significant innovations in the encoding of critical editions for epigraphic texts of fragmentary languages, which will hopefully foster future interoperability and integration with other external datasets, a paramount concern of the project.
Title: Describing Inscriptions of Ancient Italy. The ItAnt Project and Its Information Encoding Process
Journal: ACM Journal on Computing and Cultural Heritage
Pub Date : 2023-07-06 DOI: https://doi.org/10.1145/3606703
Pub Date : 2023-07-06 DOI: https://dl.acm.org/doi/10.1145/3606706
Daria Mikhaylova, Daniele Metilli
Historical architectural archives enjoy attention from diverse audiences, acting as a primary source of information for architects, historians, public authorities, and ordinary citizens alike. In Italy, interest in architectural archives has grown slowly but steadily over the last 20 years. However, architectural archives do not generally follow the trend, common among museums and galleries, of publishing digitized materials and providing standard metadata for individual records. The information available online usually consists only of an archival finding aid, rather than metadata about the individual records or fully digital versions of the records. While cataloguing standards for archival descriptions of architectural records have existed at least since the 1980s, the rise of Linked Open Data as a framework for publishing cultural heritage data has allowed archivists to enhance these archival descriptions with richer contextual information and links to external knowledge bases. In this paper we present the ITDT ontology, an extension of the Records in Contexts Ontology that facilitates the representation of architectural records and of the context related to architectural projects, their processes, and participating entities. We discuss the application of the ontology to the project files of Italian architect and engineer Dino Tamburini (1924–2011), and the creation of a digital archive offering multiple perspectives on the records.
Title: Extending RiC-O to model historical architectural archives: The ITDT ontology
Journal: ACM Journal on Computing and Cultural Heritage
Pub Date : 2023-07-05 DOI: https://dl.acm.org/doi/10.1145/3606704
Kenzo Milleville, Alec Van den Broeck, Nastasia Vanderperren, Rony Vissers, Matthias Priem, Nico Van de Weghe, Steven Verstockt
The digitization of image archives across the globe has opened up vast collections of libraries, museums, and cultural heritage institutions. These collections provide valuable historical information to the public and researchers. Many image collections have little metadata describing who or what is depicted in a structured format, making it difficult to search for specific persons. This work presents a facial recognition pipeline to enrich these collections by recognizing the persons in each image. A reference dataset of over 6000 known persons was constructed and facial recognition was performed on a dataset of over 150 thousand images. Detected faces were matched with the known faces using a similarity score on the face embeddings. We developed an interactive labeling tool to efficiently validate the face recognition predictions. A total of 182 thousand detected faces were labeled with this tool. Using a minimum similarity score of 0.5, the face recognition model achieved a precision of 0.936 and identified over 62 thousand persons from the image archives. We show how clustering can be used to identify new persons that were not included in the reference dataset. Furthermore, we highlight the potential of facial recognition to enhance the accessibility of the collections and offer new insights.
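The matching step described above, comparing detected faces with known faces by a similarity score and accepting matches at or above 0.5, can be sketched as follows. This is an illustrative sketch only: the abstract does not say which similarity measure or embedding model was used, so cosine similarity is an assumption here, and the names `cosine_similarity`, `match_face`, `precision` and the two-dimensional toy embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_face(embedding, reference, min_score=0.5):
    """Return (name, score) of the best reference match, or (None, score)
    when the best score falls below the minimum similarity."""
    best_name, best_score = None, -1.0
    for name, ref_embedding in reference.items():
        score = cosine_similarity(embedding, ref_embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= min_score:
        return best_name, best_score
    return None, best_score


def precision(true_positives, false_positives):
    """Share of accepted predictions that were validated as correct."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


# Toy reference set with hypothetical 2-D embeddings.
reference = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}
name, score = match_face([0.9, 0.1], reference)
```

Raising `min_score` would generally trade recall for precision; validated labels, such as those collected with a labeling tool, are what make the `precision` figure computable.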
Title: Enriching Image Archives via Facial Recognition
Journal: ACM Journal on Computing and Cultural Heritage