We investigate the creation of a corpus of 17th-century French literature. We present the main options among the available standards, the training data we created, and the efficiency of the models produced for OCR, spelling normalisation and lemmatisation, always using open-source solutions. We also present our encoding choices and the overall logic of a corpus designed as a virtuous circle, one that automatically improves the tools used to build it.
{"title":"CORPUS17","authors":"Simon Gabay, Alexandre Bartz, Yohann Deguin","doi":"10.1145/3423603.3424002","DOIUrl":"https://doi.org/10.1145/3423603.3424002","url":null,"abstract":"We investigate the creation of a 17th c. French literary corpus. We present the main options regarding available standards, the training data we created and the efficiency of the models produced for OCR, spelling normalisation and lemmatisation - always with open-source solutions. We also present our encoding choices and the global logic of a corpus designed as a virtuous circle, enhancing automatically the tools that are used for its construction.","PeriodicalId":387247,"journal":{"name":"Proceedings of the 2nd International Conference on Digital Tools & Uses Congress","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128822018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A data warehouse is a structure that stores large amounts of data. This data is exploited as effectively as possible to improve the efficiency of decision-making. The huge volume of data makes answering queries complex and time-consuming, so materialized views are used to reduce query processing time. Since materializing all views is not possible because of space and maintenance constraints, materialized view selection has become one of the crucial decisions in designing a data warehouse for optimal efficiency. In this paper, the authors propose a quantum-evolutionary-based algorithm named QEAM to solve the materialized view selection (MVS) problem under a storage space constraint. The experimental results show the efficiency of the proposed algorithm compared with well-known algorithms used to solve the MVS problem under the same constraint.
{"title":"Using quantum evolutionary based algorithm to solve materialized view selection problem","authors":"Raouf Mayata, A. Boukra","doi":"10.1145/3423603.3424051","DOIUrl":"https://doi.org/10.1145/3423603.3424051","url":null,"abstract":"A Data warehouse is a structure that stores big amount of data. This data is exploited in the best possible ways in order to improve the efficiency of decision-making. The huge volume of data makes answering queries complex and time-consuming. Therefore, materialized views are used in order to reduce the query processing time. Since materializing all views is not possible, due to space and maintenance constraints, materialized view selection became one of the crucial decisions in designing a data warehouse for optimal efficiency. In this paper, the authors propose a Quantum Evolutionary based algorithm named QEAM to solve the materialized view selection (MVS) problem with storage space constraint. The experimental results show the efficiency of the proposed algorithm compared to well-known algorithms used to solve MVS problem with storage space constraint.","PeriodicalId":387247,"journal":{"name":"Proceedings of the 2nd International Conference on Digital Tools & Uses Congress","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129577138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 2nd International Conference on Digital Tools & Uses Congress","authors":"","doi":"10.1145/3423603","DOIUrl":"https://doi.org/10.1145/3423603","url":null,"abstract":"","PeriodicalId":387247,"journal":{"name":"Proceedings of the 2nd International Conference on Digital Tools & Uses Congress","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116793610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}