Two difficult problems pertain to the selection of an evacuation route suited to each individual's situation. One is constant supervision of a living space and detection of dangerous situations; the other is detection of residents' respective positions. In this paper, we describe devices and systems for indoor emergency evacuation that use a mobile terminal such as a cell phone or PDA for indoor navigation. The navigation system functions autonomously: the user's device receives wireless beacon signals from the surrounding environment and can thereby detect the user's position on the mobile terminal independently; no server-side computation is necessary. Moreover, the system is applicable to an indoor emergency evacuation system using various sensors. The emergency system navigates people to safety in the wake of a disaster. Such a system must detect all residents' positions and then inform them individually of routes to safety so that everyone can be evacuated to a safe space urgently. We have developed a feasible indoor navigation system that solves various problems arising in the application layer.
Y. Inoue, A. Sashima, T. Ikeda, K. Kurumatani, "Indoor Emergency Evacuation Service on Autonomous Navigation System using Mobile Phone," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.49.
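The abstract above describes client-side positioning from received beacon signals with no server-side computation. The paper does not specify its positioning algorithm, so the sketch below uses a simple RSSI-weighted centroid over a locally stored beacon map; all names and the weighting scheme are illustrative assumptions, not the authors' method.

```python
def estimate_position(readings, beacon_map):
    """Estimate (x, y) on the device from beacon RSSI readings.

    readings:   {beacon_id: rssi_dbm} as received by the terminal
    beacon_map: {beacon_id: (x, y)} known beacon coordinates stored locally
    Returns a weighted centroid, or None if no known beacon was heard.
    """
    total_w = 0.0
    x = y = 0.0
    for bid, rssi in readings.items():
        if bid not in beacon_map:
            continue  # ignore beacons missing from the local map
        w = 10 ** (rssi / 10.0)  # dBm -> linear power, used as a weight
        bx, by = beacon_map[bid]
        x += w * bx
        y += w * by
        total_w += w
    if total_w == 0:
        return None
    return (x / total_w, y / total_w)
```

Because everything needed (the beacon map and the arithmetic) lives on the terminal, this kind of estimate requires no server round-trip, matching the autonomy the abstract emphasizes.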
In this study, we develop an embodied virtual communication system with a speech-driven nodding response model for the analysis by synthesis of embodied communication. Using the proposed system in embodied virtual communication, we perform experiments and carry out sensory evaluation and voice-motion analysis to demonstrate the effects of nodding responses on a talker's avatar, called VirtualActor. The results show that superimposed nodding responses in a virtual space promote communication.
Yoshihiro Sejima, Tomio Watanabe, Michiya Yamamoto, "Analysis by Synthesis of Embodied Communication via VirtualActor with a Nodding Response Model," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.71.
In this paper, we propose a geometry-based, image-based rendering method for a sparse multicamera system using multiple local ray-space representations and 3D model generation. In such a system, the quality of the walk-through viewing experience is impaired by inaccurate camera parameters and 3D models. To eliminate these effects, we enhance the reconstructed local ray-spaces using a stereo matching approach. As a result, free-viewpoint rendering quality is improved significantly for walk-through image synthesis. Experimental results for several multiview sequences verify the effectiveness of our approach.
M. P. Tehrani, A. Ishikawa, S. Sakazawa, A. Koike, "Enhanced Multiple Local Ray-spaces Method for Walk-through View Synthesis," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.12.
This paper uses the Kyoto tour guide dialogue corpus and its annotations to construct a dialogue management system by employing a statistical approach. We defined dialogue act (DA) tags to express a user's intention. Two kinds of tag sets can be used to annotate the corpus. One denotes a communicative function (speech act), and the other denotes the semantic content of an utterance. We have annotated the speech act tags in our corpus using several annotators. We evaluate the annotation results by measuring the agreement ratios between the annotators.
Kiyonori Ohtake, Teruhisa Misu, Chiori Hori, H. Kashioka, Satoshi Nakamura, "Dialogue Act Annotation for Statistically Managed Spoken Dialogue Systems," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.52.
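The evaluation step in this abstract, measuring agreement ratios between annotators, can be sketched concretely. The paper does not give its exact formula, so the following assumes the simplest definition: the fraction of utterances labeled identically, averaged over all annotator pairs.

```python
from itertools import combinations

def agreement_ratio(tags_a, tags_b):
    """Fraction of utterances that two annotators labeled with the same tag."""
    assert len(tags_a) == len(tags_b), "annotators must label the same utterances"
    matches = sum(a == b for a, b in zip(tags_a, tags_b))
    return matches / len(tags_a)

def mean_pairwise_agreement(annotations):
    """Average agreement over all pairs of annotators.

    annotations: list of tag sequences, one per annotator.
    """
    pairs = list(combinations(annotations, 2))
    return sum(agreement_ratio(a, b) for a, b in pairs) / len(pairs)
```

In practice, chance-corrected measures such as Cohen's kappa are often reported alongside raw agreement; the raw ratio above is the minimal version of the idea.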
The technologies of speech recognition and speech synthesis are expected to be utilized to improve human interfaces. However, extracting and analyzing the emotional information contained in the human voice is a difficult problem for both speech recognition and speech synthesis. This paper describes an experimental system that analyzes and synthesizes a baby's emotional voice using perturbation parameters of the pitch frequency, in order to improve the human interface. The system is composed of three parts: the first records and analyzes the voice and indicates the perturbation of pitch frequencies in real time; the second evaluates four perturbation parameters of the pitch frequency for the baby's voice; finally, the baby's voice is synthesized using a two-mass model of vocal cord vibration. Experiments with several babies' voices show that the proposed method is useful for analyzing and synthesizing a baby's emotional voice.
Chikahiro Araki, Shin-ichiro Hashimukai, Satoshi Motomani, Mikio Mori, S. Taniguchi, Shozo Kato, Yasuhiro Ogoshi, "An Experimental System to Analyze or Synthesize Baby's Emotional Voice using the Varidation of Pitch Frequencies," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.57.
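The abstract mentions four perturbation parameters of the pitch frequency but does not name them; a classic example of such a parameter is local jitter, the cycle-to-cycle variation of the pitch period. The sketch below computes it as an illustration of what "pitch perturbation" measures, not as the paper's specific parameter set.

```python
def jitter_percent(periods):
    """Local jitter: mean absolute difference between consecutive pitch
    periods, normalized by the mean period, expressed in percent.

    periods: pitch periods (e.g. in milliseconds) of successive glottal cycles.
    """
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(periods[i] - periods[i - 1]) for i in range(1, len(periods))]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_abs_diff / mean_period
```

A perfectly steady voice gives 0% jitter; emotionally colored or strained voices typically show larger perturbation values, which is what makes such parameters useful for emotional-voice analysis.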
Parallel phase-shifting digital holography is a technique capable of instantaneous three-dimensional measurement; it simultaneously obtains all the holograms required for phase-shifting digital holography. In this technique, the number of phase shifts of the reference wave is inversely proportional to the number of pixels per hologram. The authors quantitatively evaluated the quality of the images reconstructed by three variants of the technique, differing in the number of phase shifts. The techniques were numerically simulated, and the mean square errors between the object image and the reconstructed images were calculated. It was found that higher image quality is attained with a smaller number of phase shifts of the reference wave. This is because the smaller the number of phase shifts, the shorter the sampling interval of spatial frequency per hologram, and the more detailed the object information that can be reconstructed.
T. Tahara, Y. Awatsuji, K. Nishio, S. Ura, T. Kubota, O. Matoba, "Quantitative Evaluation of Reconstructed Images of Parallel Phase-Shifting Digital Holographies," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.26.
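The quality metric used in this evaluation, the mean square error between the object image and a reconstruction, is straightforward to state in code. This is a generic MSE sketch over grayscale images represented as nested lists; the paper's simulations would apply the same formula to its numerically reconstructed holographic images.

```python
def mse(image_a, image_b):
    """Mean square error between two equally sized grayscale images.

    Each image is a list of rows, each row a list of pixel values.
    Lower MSE means the reconstruction is closer to the object image.
    """
    n = 0
    total = 0.0
    for row_a, row_b in zip(image_a, image_b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            n += 1
    if n == 0:
        raise ValueError("images must be non-empty")
    return total / n
```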
In this research we present a system that suggests valuable complementary information relevant to a user's topic of interest, in the form of keywords. For this purpose we have automatically constructed a Web search directory called TORISHIKI-KAI from a large collection of Web documents, using state-of-the-art knowledge acquisition methods. TORISHIKI-KAI maps out the usage context of the terms input by the user, and classifies topically related search terms according to semantic categories such as potential troubles, methods, or tools, in order to help the user find potentially valuable "unknown unknowns".
Kentaro Torisawa, Stijn De Saeger, Yasunori Kakizawa, Jun'ichi Kazama, M. Murata, Daisuke Noguchi, Asuka Sumida, "TORISHIKI-KAI, An Autogenerated Web Search Directory," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.70.
In sign language, hand positions and movements represent the meanings of words. Hence, we have been developing sign language recognition methods that use both hand positions and movements. In previous studies, however, each feature had the same weight when calculating the probability for recognition. In this study, we propose a sign language recognition method that uses a multi-stream HMM technique to show the importance of position and movement information for sign language recognition. We conducted recognition experiments using 21,960 sign language word data. As a result, 75.6% recognition accuracy was obtained with the appropriate weights (position:movement = 0.2:0.8), while 70.6% was obtained with equal weights. From this result, we conclude that hand movement is more important for sign language recognition than hand position. In addition, we conducted experiments to determine the optimal number of states and mixtures; the best accuracy was obtained with 15 states and two mixtures for each word HMM.
Masaru Maebatake, Iori Suzuki, M. Nishida, Y. Horiuchi, S. Kuroiwa, "Sign Language Recognition Based on Position and Movement Using Multi-Stream HMM," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.56.
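The core of the multi-stream HMM idea above is that each stream's log-likelihood is scaled by a stream weight before the scores are combined. The sketch below shows only that weighted combination and a word-level argmax over precomputed per-stream scores; in the real system this weighting is applied per state, per frame, inside Viterbi decoding, and the score values here are invented for illustration.

```python
def combined_log_likelihood(logl_position, logl_movement,
                            w_position=0.2, w_movement=0.8):
    """Multi-stream HMM score: weighted sum of per-stream log-likelihoods.

    Defaults use the 0.2:0.8 position:movement weighting reported
    to give the best accuracy in the abstract above.
    """
    return w_position * logl_position + w_movement * logl_movement

def recognize(word_scores, w_position=0.2, w_movement=0.8):
    """Pick the word whose combined stream score is highest.

    word_scores: {word: (logl_position, logl_movement)}.
    """
    return max(word_scores,
               key=lambda w: combined_log_likelihood(*word_scores[w],
                                                     w_position, w_movement))
```

With the movement stream weighted at 0.8, a word that matches the movement model well wins even if its position score is mediocre, which is exactly the behavior behind the paper's finding that movement matters more than position.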
The authors propose a hand posture estimation system that works in real time and with high accuracy, for robot hand control and human interfaces driven by hand motions, without any sensors attached to the user. The method quickly searches for the most similar image in a large, previously sorted image database that covers the complicated shapes and self-occlusions of the human hand. Because the system needs no special peripheral equipment such as a range sensor or PC cluster, and runs on a notebook PC with a single high-speed camera, the user can make a dexterous robot hand behave as he or she does, and can use the system as an information input device by moving his or her hand and fingers.
Takanobu Tanimoto, K. Hoshino, "Real Time Posture Estimation of Human Hand for Robot Hand Interface," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.30.
One of the main causes of errors in statistical machine translation is the erroneous phrase pairs that find their way into the phrase table. These phrases are the result of poor word-to-word alignments during the training of the translation model. These word alignment errors in turn cause errors during the phrase extraction phase, and the resulting erroneous bilingual phrase pairs are then used during the decoding process and appear in the output of the machine translation system. Machine translation training data is never perfect: bilingual sentence pairs are often incorrectly aligned, or the pairs are poor translations of each other due to human error. Even when sentence pairs in the corpus are good translations of each other, the translations may not be literal enough to admit the sort of phrase-by-phrase translation necessary to make good training data for a phrase-based statistical machine translation (SMT) system. This is because such SMT systems operate on the assumption that the source can be transformed into the target simply by translating phrase-by-phrase with reordering. In the real world, many perfectly correct translations are not of this form, and these sentences, even though they are correct translations, make poor training data for training the translation models of a phrase-based SMT system. This paper presents a technique in which preliminary machine translation systems are built with the sole purpose of indicating those sentence pairs in the training corpus that the systems are able to generate using their models, the hypothesis being that these sentence pairs are likely to make good training data for an SMT system of the same type. These sentences are then used to bootstrap a second SMT system, and those sentences identified as good training data are given additional weight during the training process for building the translation models.
Using this technique we were able to improve the performance of a Japanese-to-English SMT system by 1.2-1.5 BLEU points on unseen evaluation data.
A. Finch, E. Sumita, "Using Statistical Machine Translation to Grade Training Data," 2008 Second International Symposium on Universal Communication, doi:10.1109/ISUC.2008.20.
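The grading step described in this abstract, keeping every pair but up-weighting those the preliminary system can regenerate, can be sketched in a few lines. The paper does not state its exact weighting mechanism, so this sketch uses corpus duplication as one simple, commonly used way to give sentence pairs extra weight during model training; the predicate `can_generate` stands in for the preliminary SMT system's check.

```python
def weight_training_data(corpus, can_generate, extra_weight=2):
    """Up-weight training pairs that a preliminary system can reproduce.

    corpus:       list of (source, target) sentence pairs
    can_generate: predicate returning True if the preliminary MT system
                  can generate the pair with its own models
    extra_weight: how many copies a 'good' pair contributes
    Returns a new corpus in which good pairs appear multiple times,
    so they count more when translation models are estimated from it.
    """
    weighted = []
    for pair in corpus:
        copies = extra_weight if can_generate(pair) else 1
        weighted.extend([pair] * copies)
    return weighted
```

Nothing is discarded: poorly graded pairs still appear once, so rare but valid phenomena are not lost; the regenerable, phrase-decomposable pairs simply dominate the counts.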