A mixed inventory structure for German concatenative synthesis
Pub Date : 1900-01-01 DOI: 10.22028/D291-25294
T. Portele, F. Höfer, W. Hess
Speech Synthesis Workshop
In speech synthesis by unit concatenation, a central issue is the definition of the unit inventory. Diphone and demisyllable inventories are widely used, but both unit types have their drawbacks. This chapter describes a mixed inventory structure that is syllable-oriented but does not demand a definite decision about the position of a syllable boundary. The inventory was defined using the results of a comprehensive investigation of coarticulatory phenomena at syllable boundaries, together with a machine-readable pronunciation dictionary. An evaluation comparing the mixed inventory with a demisyllable and a diphone inventory confirms that speech generated with the mixed inventory is superior with regard to general acceptance. A segmental intelligibility test shows the high intelligibility of the synthetic speech.
Text-to-speech synthesis with dynamic control of source parameters
Pub Date : 1900-01-01 DOI: 10.1007/978-1-4612-1894-4_3
Luís C. Oliveira
Semantic High Level Querying in Sensor Networks
Pub Date : 1900-01-01 DOI: 10.5220/0003116600720084
I. Giordani, D. Toscani, F. Archetti, M. Cislaghi
In Proceedings of the International Workshop on Semantic Sensor Web (SSW-2010), pages 72-84. ISBN: 978-989-8425-33-1. Copyright 2010 SCITEPRESS (Science and Technology Publications, Lda.)
The rapid development and deployment of sensor technology within the general framework of the Internet of Things poses significant opportunities and challenges. A sensor is not a pure data source but an entity (Semantic Sensor Web) with associated metadata, and it is a building block of a “worldwide distributed” real-time database to be processed through real-time queries. Important challenges are to achieve interoperability in connectivity and processing capabilities (queries) and to apply “intelligence” and processing capabilities as close as possible to the source of data. This paper presents the extension of a general architecture for data integration, to which we add capabilities for processing complex queries, and discusses how they can be adapted to, and used by, an application in the Semantic Sensor Web, presenting a pilot study in the environment and health domains.
1 Background and Motivation
The rapid development and deployment of sensor technology involves many different types of sensors, both remote and in situ, with such diverse capabilities as range, modality, and manoeuvrability. It is possible today to use networks of multiple sensors to detect and identify objects of interest up close or from a great distance. Connected Objects, or the Internet of Things, is expected to be a significant new market and to encompass a large variety of technologies and services in different domains. Transport, environmental management, health, agriculture, domestic appliances, building automation and energy efficiency will benefit from real-time reality mining and personal decision support capabilities provided by the growing information shadow (i.e. data traces) of people, goods and objects, supplied by the huge amount of data available from the emerging sensor Web [1].
Possible exploitation areas of Internet-connected objects include vertical applications that connect to and communicate with objects tailored for specific subdomains; service enablement to cope with fragmented connectivity, device standards, application information protocols, etc.; device management; and extended connectivity tailored for object communication with regard to business model, service level, billing, etc. Important challenges are to achieve interoperability in connectivity and processing capabilities (queries, etc.) and to distribute “intelligence” and processing capabilities as close as possible to the source of data (the sensor or mobile device), in order to avoid massive data flows and bottlenecks on the connectivity side. The sensor is not a pure data source, but an entity (Semantic Sensor Web) with associated domain metadata, capable of auto
Automatic speech segmentation for concatenative inventory selection
Pub Date : 1900-01-01 DOI: 10.1007/978-1-4612-1894-4_24
A. Ljolje, Julia Hirschberg, J. V. Santen
Parametric control of prosodic variables by symbolic input in TTS synthesis
Pub Date : 1900-01-01 DOI: 10.1007/978-1-4612-1894-4_37
K. Kohler
Automatic Feature Selection for Operational Scenarios of Satellite Image Understanding using Measures of Mutual Information
Pub Date : 1900-01-01 DOI: 10.5220/0003143101090117
Dragos Bratasanu, I. Nedelcu, M. Datcu
Earth Observation processing tools need to be tailored to the new products offered by imaging sensors with sub-meter spatial resolution. New methods should provide image analysts with the essential automatic support to discover relevant information and identify significant elements in the image. We advocate an automatic technique to select the optimum number of features used in classification, object detection and analysis of optical satellite images. Using measures of mutual information between the target classes and the available features, we investigate the criteria of maximum-relevance and maximum-relevance-minimum-redundancy for automatic feature selection at very low cost. Following a comprehensive set of experiments on multiple sensors, applications and classifiers, the results demonstrate the possible operational use of the method in future scenarios of human-machine interaction in support of Earth Observation technologies.
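The maximum-relevance and maximum-relevance-minimum-redundancy criteria named in the abstract by Bratasanu et al. can be illustrated with a small sketch. This is not the authors' implementation. It assumes discrete feature values, estimates mutual information from empirical counts, and uses the common greedy form of the criterion (relevance minus mean redundancy with the features already chosen). The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def max_relevance_select(features, labels, k):
    """Maximum relevance: keep the k features with the highest MI with the labels."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(features, key=lambda name: relevance[name], reverse=True)[:k]


def mrmr_select(features, labels, k):
    """Greedy mRMR: at each step pick the feature that maximises
    relevance minus mean mutual information with the features already selected."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): "a_copy" duplicates "a"; "b" is weakly informative
# about the labels but nearly independent of "a".
labels = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 0],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 0],
    "b": [0, 1, 0, 1, 0, 1, 1, 1],
}

print(max_relevance_select(features, labels, 2))  # ['a', 'a_copy']
print(mrmr_select(features, labels, 2))           # ['a', 'b']
```

On this toy data, ranking by relevance alone keeps the duplicated feature, while the redundancy term makes the greedy mRMR step prefer the less relevant but complementary feature.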
High-quality message-to-speech generation in a practical application
Pub Date : 1900-01-01 DOI: 10.1007/978-1-4612-1894-4_46
J. D. Pijper
Biomechanical and physiologically based speech modeling
Pub Date : 1900-01-01 DOI: 10.1007/978-1-4612-1894-4_17
R. Wilhelms-Tricarico, J. Perkell
Data-driven extraction of intonation contour classes
Pub Date : 1900-01-01 DOI: 10.5282/UBM/EPUB.13158
U. Reichel
In this paper we introduce the first steps towards a new data-driven method for the extraction of intonation events that does not require any prior prosodic labelling. Given data segmented at the syllable constituent level, it derives local and global contour classes by stylisation and subsequent clustering of the stylisation parameter vectors. Local contour classes correspond to pitch movements connected to one or several syllables and determine the local f0 shape. Global classes are connected to intonation phrases and determine the f0 register. Local classes are initially derived for syllabic segments, which are then concatenated incrementally by means of statistical language modelling of co-occurrence patterns. Due to its generality, the method is in principle language-independent and potentially capable of dealing with aspects of prosody other than intonation.
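The "stylisation followed by clustering of the stylisation parameter vectors" step described in the abstract by Reichel can be sketched as follows. The abstract does not specify the stylisation function or the clustering algorithm, so both choices here are assumptions made purely for illustration: each syllable's f0 contour is reduced to a (mean, slope) vector by a least-squares line fit over normalised time, and the vectors are grouped with a plain k-means that is initialised from the first k vectors. All names and the toy contours are hypothetical.

```python
def stylise_linear(f0):
    """Fit a line to an f0 contour over normalised time [0, 1].
    Returns the parameter vector (mean f0, slope). Needs at least 2 samples."""
    n = len(f0)
    t = [i / (n - 1) for i in range(n)]
    t_mean, f_mean = sum(t) / n, sum(f0) / n
    cov = sum((ti - t_mean) * (fi - f_mean) for ti, fi in zip(t, f0))
    var = sum((ti - t_mean) ** 2 for ti in t)
    return (f_mean, cov / var)


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; centroids start at the first k vectors (an assumption
    that keeps the sketch deterministic)."""
    centroids = list(vectors[:k])
    assignment = []
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids


# Toy syllable contours (hypothetical, e.g. in semitones): rise, fall, rise, fall.
contours = [[0, 1, 2, 3], [3, 2, 1, 0], [0, 2, 4, 6], [6, 4, 2, 0]]
vectors = [stylise_linear(c) for c in contours]
classes, centroids = kmeans(vectors, k=2)
print(classes)
```

In this toy case the two rising contours end up in one local contour class and the two falling contours in the other. The incremental concatenation of syllabic classes by statistical language modelling mentioned in the abstract is not covered by this sketch.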