Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268245
G. Kiss, K. Vicsi
In this paper, read and spontaneous speech are compared with regard to automatic depression detection by speech processing. First, statistical analysis was carried out to select the acoustic features that differ significantly between healthy and depressed subjects in each of the two types of speech, separately for both genders. Second, statistical examination and classification experiments were carried out to compare the values of the selected features across the two types of speech. We sought to determine which type of speech yields better automatic depression detection results. As expected, tempo-related features such as articulation rate, speech rate, and pause lengths are useful for spontaneous speech, while formant trajectories can be used only for read speech, because their values are mainly influenced by the linguistic content of the speech. Despite the significant differences in feature values between read and spontaneous speech, there were no major differences in detection accuracy: 83% detection accuracy was achieved with read speech samples, and 86% with spontaneous speech samples.
{"title":"Comparison of read and spontaneous speech in case of automatic detection of depression","authors":"G. Kiss, K. Vicsi","doi":"10.1109/COGINFOCOM.2017.8268245","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268245","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
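The tempo-related features named in the abstract above (speech rate, articulation rate, pause length) can be illustrated with a minimal sketch. It assumes the commonly used definitions: speech rate counts linguistic units over the total duration including pauses, while articulation rate counts them over the time spent speaking. The `Segment` type, the `tempo_features` function and the choice of unit (phonemes or syllables) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One stretch of a recording (hypothetical input format)."""
    start: float        # seconds
    end: float          # seconds
    is_pause: bool
    n_units: int = 0    # phonemes or syllables in the segment (assumed unit)


def tempo_features(segments: List[Segment]) -> Dict[str, float]:
    """Compute speech rate, articulation rate and mean pause length."""
    total_time = sum(s.end - s.start for s in segments)
    pauses = [s.end - s.start for s in segments if s.is_pause]
    speaking_time = total_time - sum(pauses)
    units = sum(s.n_units for s in segments if not s.is_pause)
    return {
        # units per second over the whole recording, pauses included
        "speech_rate": units / total_time if total_time > 0 else 0.0,
        # units per second over speaking time only, pauses excluded
        "articulation_rate": units / speaking_time if speaking_time > 0 else 0.0,
        # average duration of a pause in seconds
        "mean_pause_length": sum(pauses) / len(pauses) if pauses else 0.0,
    }


example = [
    Segment(0.0, 2.0, is_pause=False, n_units=10),
    Segment(2.0, 3.0, is_pause=True),
    Segment(3.0, 5.0, is_pause=False, n_units=10),
]
features = tempo_features(example)
```

For the three example segments, 20 units are spoken in 5 seconds overall and in 4 seconds of actual speech, so the two rates differ only through the pause. This is one reason such features are sensitive to the pausing behaviour of spontaneous speech.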
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268246
A. Moro, György Szaszák
Speech-based human-machine interfaces use automatic speech recognition to implement speech-to-text conversion. Unfortunately, little effort has so far been devoted to adding punctuation marks to the recognized word chain after speech recognition. The lack of punctuation impairs human readability and makes interpretation hard. This paper presents an effort to restore punctuation marks while keeping the latency of this post-processing step low. The approach exploits prosodic structure and proposes a sequential modelling paradigm based on recurrent neural networks. Results show satisfactory punctuation restoration, especially considering that sentence boundaries are reliably detected. Even if the predicted punctuation sequence is not error-free with respect to writing standards, human perception is expected to “repair” these errors more easily than when no punctuation is given at all and the reader is left confused about the basic segmentation of the word chain.
{"title":"A prosody inspired RNN approach for punctuation of machine produced speech transcripts to improve human readability","authors":"A. Moro, György Szaszák","doi":"10.1109/COGINFOCOM.2017.8268246","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268246","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268236
I. Fazekas, Attila Perecsényi, B. Porvázsnyik
In this paper we introduce a new network evolution model. The basic feature of the model is the cooperation (interaction) of N nodes. In each step, m new nodes are born, where m is a discrete random variable with values 0, 1, 2, …, N − 1. The m new nodes then interact with N − m old nodes, so that together they form a complete graph on N vertices. The old nodes can be chosen either uniformly or by the preferential attachment rule. We analyze certain properties of the model by computer simulation. Power-law degree and weight distributions and clustering coefficients are studied.
{"title":"Numerical analysis of a network evolution model","authors":"I. Fazekas, Attila Perecsényi, B. Porvázsnyik","doi":"10.1109/COGINFOCOM.2017.8268236","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268236","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
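The evolution step described in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' simulation code. The initial configuration (one complete graph on N nodes), the distribution `m_probs` of m, and the form of the preferential attachment rule (old nodes drawn without replacement with probability proportional to the number of interactions they have taken part in) are all assumptions, since the abstract does not specify them. The names `evolve`, `weighted_sample` and `degrees` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def weighted_sample(rng, items, weights, k):
    """Draw k distinct items; each draw is proportional to the remaining weights."""
    items, weights = list(items), list(weights)
    chosen = []
    for _ in range(k):
        i = rng.choices(range(len(items)), weights=weights)[0]
        chosen.append(items.pop(i))
        weights.pop(i)
    return chosen


def evolve(steps, N, m_probs, preferential=True, seed=0):
    """Simulate the N-interaction model; m_probs[j] is the assumed P(m = j)."""
    if len(m_probs) != N:
        raise ValueError("m_probs must have N entries (m = 0, ..., N - 1)")
    rng = random.Random(seed)
    node_weight = Counter()   # number of interactions each node took part in
    edge_weight = Counter()   # number of interactions each edge took part in

    def interact(group):
        # The group forms a complete graph: every node and edge gains weight 1.
        for u in group:
            node_weight[u] += 1
        for u, v in combinations(sorted(group), 2):
            edge_weight[(u, v)] += 1

    interact(range(N))        # assumed initial configuration
    n_nodes = N
    for _ in range(steps):
        m = rng.choices(range(N), weights=m_probs)[0]    # number of new nodes
        old_nodes = list(range(n_nodes))
        if preferential:
            weights = [node_weight[u] for u in old_nodes]
            old = weighted_sample(rng, old_nodes, weights, N - m)
        else:
            old = rng.sample(old_nodes, N - m)           # uniform choice
        new = list(range(n_nodes, n_nodes + m))
        n_nodes += m
        interact(old + new)
    return n_nodes, node_weight, edge_weight


def degrees(edge_weight):
    """Degree of each node = number of distinct neighbours."""
    deg = Counter()
    for u, v in edge_weight:
        deg[u] += 1
        deg[v] += 1
    return deg
```

A call such as `evolve(steps=10000, N=4, m_probs=[0.1, 0.4, 0.3, 0.2])` returns the node and edge weights, from which empirical degree and weight distributions can be tabulated, for instance with `Counter(degrees(edge_weight).values())`.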
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268267
A. Adamkó, Abel Garai, István Péntek
The past decade brought a significant breakthrough in healthcare interoperability. Among other things, patient information has been thoroughly digitized and shared between the organizational units involved. Healthcare data is stored in electronic health records (EHR). These records have mostly remained on stand-alone servers. However, cloud technology has also penetrated the healthcare industry. As people move and travel more frequently, they are more likely to receive treatment in foreign healthcare institutions, and therefore leave their electronic medical footprint in different countries. Industrial cloud providers offer regional, international or global solutions. This trend eliminates the former technological barriers to cross-border data exchange. This article summarizes the results of the team's research and focuses on the findings of the latest stage of the three-year exploratory program. From a technical point of view, this research phase focuses on three objectives: capturing raw bio-sensory data from the dedicated e-Health device, aggregation and evaluation of the data flow by the hub software, and collection into the EHR. This research also simulates the proposed adaptive, event-based and interoperable healthcare ecosystem. Owing to the elasticity of the cloud architecture, the findings of this pilot project can later be extended to an international scale, and the hub software can be scaled up to serve complex healthcare ecosystems as well.
{"title":"Interaction-dependent e-health hub-software adaptation to cloud-based electronic health records","authors":"A. Adamkó, Abel Garai, István Péntek","doi":"10.1109/COGINFOCOM.2017.8268267","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268267","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268273
Kinga Biró, G. Molnár, Dalma Pap, Zoltán Szűts
Pedagogy is in dire need of a shift, since students are undermotivated. Virtual and augmented reality offer great solutions, given that the majority of high school students have heard of them. Augmented-reality-based applications such as Pokémon Go 3D have shown that smartphone applications are capable of exciting and moving people; they can therefore be used effectively in education.
{"title":"The effects of virtual and augmented learning environments on the learning process in secondary school","authors":"Kinga Biró, G. Molnár, Dalma Pap, Zoltán Szűts","doi":"10.1109/COGINFOCOM.2017.8268273","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268273","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268293
P. Várlaki, P. Baranyi
Through a comparative hermeneutical macro- and microanalysis, the paper examines identical and very similar representational and meaning systems in such great medieval mystical writings as the Book Bahir, the Targum to the Song of Songs and the Royal Mirror of St Stephen of Hungary. The basis of the comparison is the hidden (“highest”) presence of the spirit of the Great Monarch. In Part II of the paper, the ‘identified’ mystical patterns are compared with the representational and meaning systems in the great medieval artworks related to Constantine Porphyrogenitus.
{"title":"Cognitive and spiritual revolution of the tenth century — Constantine Porphyrogenitus and his hidden world: Part I. The Great Monarch's hidden world in the great medieval mystical writings","authors":"P. Várlaki, P. Baranyi","doi":"10.1109/COGINFOCOM.2017.8268293","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268293","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268241
Mohamed Amine Korteby, Zoltán Gál
Delay Tolerant Networking (DTN) allows communication in challenging and harsh environments where traditional networking fails and new routing and application protocols are required. In such networks, keeping track of and summarizing network behavior is a constant task, because resources are scarce: nodes have limited energy and limited memory for buffering messages in transit. It is therefore important to exploit these resources efficiently. Several studies propose that accompanying visualization with sonification can ease some of the challenges of constant visual monitoring. In this paper, we simulate three different routing protocols and four movement models to analyze the energy consumption, buffer occupancy and interconnection time of the nodes under forty-eight different scenarios. Furthermore, we propose a cognitive sonification controller system to assist network administrators in their network management tasks.
{"title":"Perception of delay tolerant network behavior with cognitive sonification controller","authors":"Mohamed Amine Korteby, Zoltán Gál","doi":"10.1109/COGINFOCOM.2017.8268241","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268241","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268252
M. Gnjatović, Jovica Tasevski, D. Mišković, S. Savic, B. Borovac, A. Mikov, R. Krasnik
This paper reports on a pilot corpus of child-robot interaction in therapeutic settings. The corpus comprises recordings of the interactions between twenty-one children and the conversational humanoid robot MARKO, in the kinesitherapeutic room at the Clinic of Paediatric Rehabilitation in Novi Sad, Serbia. The subject group included both healthy children and children with cerebral palsy and similar movement disorders. Approximately 156 minutes of session time were recorded. All dialogues were transcribed, and nonverbal acts were annotated. The initial evaluation of the corpus indicates that children respond positively to MARKO, engage in interaction with MARKO, follow verbal instructions given by MARKO, and experience increased motivation for therapy.
{"title":"Pilot corpus of child-robot interaction in therapeutic settings","authors":"M. Gnjatović, Jovica Tasevski, D. Mišković, S. Savic, B. Borovac, A. Mikov, R. Krasnik","doi":"10.1109/COGINFOCOM.2017.8268252","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268252","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268239
Dávid Sik
In this paper we introduce a new multi-leveled e-learning environment called Sysbook. It is an open-access platform, available on the internet to any user. Its main topics cover the field of systems and control, with some mathematical and even philosophical aspects. The purpose of Sysbook is to present systems and control at different levels, addressing readers of different backgrounds and interests. These interfaces are extended with case studies from different fields and a student area where users can also contribute.
{"title":"Introduction of a multi-leveled E-learning environment with community contribution","authors":"Dávid Sik","doi":"10.1109/COGINFOCOM.2017.8268239","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268239","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}
Pub Date: 2017-09-01 DOI: 10.1109/COGINFOCOM.2017.8268227
Máté Ákos Tündik, Balázs Tarján, György Szaszák
Closed captioning is a common method for improving the accessibility of TV programs for people who are hearing impaired or hard of hearing, and it is an application relevant to cognitive infocommunications. However, live captions provided by automatic speech recognition systems usually lack punctuation, making them hard to follow. In this paper, Maximum Entropy (MaxEnt) and Recurrent Neural Network (RNN) based punctuation restoration models are compared on two closed captioning tasks in real-time and off-line setups. We present the first results in restoring punctuation for Hungarian broadcast speech, where the RNN significantly outperforms our MaxEnt baseline system. Our approach is also evaluated on TED talks within the IWSLT English dataset, yielding results comparable to state-of-the-art systems.
{"title":"A bilingual comparison of MaxEnt- and RNN-based punctuation restoration in speech transcripts","authors":"Máté Ákos Tündik, Balázs Tarján, György Szaszák","doi":"10.1109/COGINFOCOM.2017.8268227","DOIUrl":"https://doi.org/10.1109/COGINFOCOM.2017.8268227","journal":{"name":"2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)"},"publicationDate":"2017-09-01"}