Study on the models of the collaborative learning systems and proposal to the standardization activities
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237802
A. Koga
Collaborative learning is attracting wide interest as a new learning method for the global age. In collaborative learning, learners acquire skills such as problem finding, problem solving, self-regulation, and communication through discussion of shared problems. The interactions between learners play an important role. To make it easier to design a learning environment in which desirable interactions occur, a description system that expresses the learning process explicitly is needed. We show that such a description system enables us to develop tools for designing learning processes that involve interactions between learners, and tools that execute the designed process symbolically to verify its feasibility. Finally, we describe how the description system can serve as the basis of the collaborative activity record format standard currently under development in the collaborative learning technology WG of ISO/IEC JTC1/SC36.
{"title":"Study on the models of the collaborative learning systems and proposal to the standardization activities","authors":"A. Koga","doi":"10.1109/ICME.2001.1237802","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237802","url":null,"abstract":"As a new learning method in global age, the collaborative learning gathers many interests. In collaborative learning, through the discussion among the learners about shared problems, they acquire the skills such as problem finding skills, problem solving skills, self-regulate skills and communication skills. In collaborative learning, the interactions between learners play an important role. To make it easier to design a learning environment where desirable interactions occur, it is necessary to have a description system to express the learning process explicitly. We describe that the description system enables us to develop the tools to design the learning process involving the interactions between learners, and to develop the tools that execute the designed process symbolically to show the feasibility of the process. Finally, we describe that the description system can be used as the basis of the collaborative activity record format standard that is currently being progressed in the collaborative learning technology WG in ISO/IEC JTC1/SC36.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129276683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Trends of learning technology standard
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237803
K. Nakabayashi
In order to promote computer-based education and training, it is crucial to establish interoperability of learning content, learner information, and learning system components. In the US, Europe, and Asia, government, industry, and academia are directing attention and effort toward this goal. Several learning technology standardization initiatives are developing specifications that cover a broad field, including platforms, multimedia data, learning content, learner information, and competency definitions. This paper discusses the need for learning technology standards, summarizes the efforts of each initiative, and describes the future direction of the standardization effort.
{"title":"Trends of learning technology standard","authors":"K. Nakabayashi","doi":"10.1109/ICME.2001.1237803","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237803","url":null,"abstract":"In order to promote computer-based education and training, it is crucial to establish interoperability of learning contents, learner information, and learning system components. In the US, Europe and Asia, government, industry and academia are paying attention and making effort toward this direction. Several learning technology standardization initiatives are developing specifications covering quite large field such as platform, multimedia data, learning contents, learner information, and competency definitions. This paper discusses the needs of learning technology standards, summarizes the efforts in each initiative, and describes the future direction of standardization effort.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127077634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A beat-pattern based error concealment scheme for music delivery with burst packet loss
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237658
Ye-Kui Wang
Error concealment is an important method for mitigating the degradation of audio quality when compressed audio packets are lost on error-prone channels such as the mobile Internet and digital audio broadcasting. This paper presents a novel error concealment scheme that exploits the beat and rhythmic pattern of music signals. Preliminary simulations show significantly improved subjective sound quality compared with conventional methods in the case of burst packet losses. The new scheme is proposed as a complement to prior art and can be adopted by essentially all existing perceptual audio decoders, such as an MP3 decoder for streaming music.
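The abstract gives no implementation details, but the core idea can be sketched: when a burst of packets is lost, substitute audio from exactly one beat period earlier, so the replacement follows the music's rhythmic pattern. The Python sketch below is illustrative only; the envelope-autocorrelation beat estimator and all names are our assumptions, not the paper's method.

```python
import numpy as np

def estimate_beat_period(samples: np.ndarray, sr: int,
                         min_bpm: float = 60, max_bpm: float = 180) -> int:
    """Estimate the beat period (in samples) via autocorrelation of the
    signal envelope. A crude stand-in for the paper's beat analysis."""
    env = np.abs(samples) - np.abs(samples).mean()
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]
    lo = int(sr * 60 / max_bpm)   # shortest plausible beat period
    hi = int(sr * 60 / min_bpm)   # longest plausible beat period
    return lo + int(np.argmax(ac[lo:hi]))

def conceal_burst_loss(samples: np.ndarray, lost_start: int,
                       lost_len: int, beat_period: int) -> np.ndarray:
    """Fill a lost region by copying audio from one beat period earlier
    (assumes lost_len <= beat_period so the source region is intact)."""
    out = samples.copy()
    src = lost_start - beat_period
    if src >= 0:
        out[lost_start:lost_start + lost_len] = samples[src:src + lost_len]
    return out
```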
{"title":"A beat-pattern based error concealment scheme for music delivery with burst packet loss","authors":"Ye-Kui Wang","doi":"10.1109/ICME.2001.1237658","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237658","url":null,"abstract":"Error concealment is an important method to mitigate the degradation of the audio quality when compressed audio packets are lost in error prone channels, such as mobile Internet and digital audio broadcasting. This paper presents a novel error concealment scheme, which exploits the beat and rhythmic pattern of music signals. Preliminary simulations show significantly improved subjective sound quality in comparison with conventional methods in the case of burst packet losses. The new scheme is proposed as a complement to prior arts. It can be adopted to essentially all existing perceptual audio decoders such as an MP3 decoder for streaming music.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"9 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113964997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Content-based music retrieval using linear scaling and branch-and-bound tree search
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237713
J. Jang, Hong-Ru Lee, M. Kao
This paper presents the use of linear scaling and tree search in a content-based music retrieval system that takes a user's acoustic input (an 8-second clip of singing or humming) via a microphone and retrieves the intended song from over 3000 candidate songs in the database. The system, known as Super MBox, demonstrates the feasibility of real-time content-based music retrieval with a high recognition rate. Super MBox first converts the user's acoustic input into a pitch vector. A fast comparison engine using linear scaling and tree search then computes the similarity scores. We have tested Super MBox and found the top-20 recognition rate to be about 73% on about 1000 test clips from people with mediocre singing skills.
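As a rough illustration of the linear-scaling comparison described above, the sketch below stretches the query pitch vector to several tempo scalings and keeps the best match against each candidate's pitch contour. The scale set, the L1 distance, and the key-offset normalization are assumptions for illustration; the paper's branch-and-bound tree search for pruning candidates is omitted here.

```python
import numpy as np

def linear_scaling_distance(query: np.ndarray, song: np.ndarray,
                            scales=(0.5, 0.75, 1.0, 1.25, 1.5)) -> float:
    """Compare a sung/hummed pitch vector against a song's pitch contour,
    trying several tempo scalings and keeping the best (smallest) distance."""
    best = np.inf
    for s in scales:
        n = int(len(query) * s)
        if n == 0 or n > len(song):
            continue
        # Stretch the query to length n by linear interpolation.
        stretched = np.interp(np.linspace(0, len(query) - 1, n),
                              np.arange(len(query)), query)
        # Mean absolute pitch difference after removing the key offset.
        seg = song[:n]
        d = np.mean(np.abs((stretched - stretched.mean()) - (seg - seg.mean())))
        best = min(best, d)
    return best

def retrieve(query, database):
    """Rank candidate songs by linear-scaling distance (smaller = better)."""
    return sorted(database, key=lambda s: linear_scaling_distance(query, s["pitch"]))
```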
{"title":"Content-based music retrieval using linear scaling and branch-and-bound tree search","authors":"J. Jang, Hong-Ru Lee, M. Kao","doi":"10.1109/ICME.2001.1237713","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237713","url":null,"abstract":"paper presents the use of linear scaling and tree search in a content-based music retrieval system that can take a user's acoustic input (8-second clip of singing or humming) via a microphone and then retrieve the intended song from over 3000 candidate songs in the database. The system, known as Super MBox, demonstrates the feasibility of real-time content-based music retrieval with a high recognition rate. Super MBox first takes the user's acoustic input from a microphone and converts it into a pitch vector. Then a fast comparison engine using linear scaling and tree search is employed to compute the similarity scores. We have tested Super MBox and found the top-20 recognition rate is about 73% with about 1000 clips of test inputs from people with mediocre singing skills.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131273987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recovery of motion vectors by detecting homogeneous movements for H.263 video communications
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237648
Sungchan Park, NamRye Son, Junghyun Kim, Gueesang Lee
In this paper, a new approach for recovering lost or erroneous motion vectors (MVs) by classifying the movements of neighboring blocks by their homogeneity is proposed. The MVs of the neighboring blocks are classified according to their direction, a representative value is determined for each class, and the candidate MV with the minimum distortion is selected. Experimental results show that the proposed algorithm outperforms existing methods in many cases.
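The recovery procedure can be sketched as follows; the 30-degree angular grouping threshold and the pluggable distortion measure (e.g. a boundary-matching error) are illustrative assumptions rather than the paper's exact parameters.

```python
import numpy as np

def recover_motion_vector(neighbor_mvs, distortion):
    """Recover a lost MV: group neighboring MVs by direction, take a
    representative (mean) MV per group, and return the candidate whose
    distortion is smallest.

    neighbor_mvs: list of (dx, dy) from correctly received neighbor blocks.
    distortion:   callable mapping a candidate (dx, dy) to a cost.
    """
    groups = []  # each group holds MVs pointing in a similar direction
    for mv in neighbor_mvs:
        placed = False
        for g in groups:
            ref = np.mean(g, axis=0)
            # Same group if the angle between mv and the group mean < 30 deg.
            cos = np.dot(mv, ref) / (np.linalg.norm(mv) * np.linalg.norm(ref) + 1e-9)
            if cos > np.cos(np.radians(30)):
                g.append(mv)
                placed = True
                break
        if not placed:
            groups.append([mv])
    candidates = [tuple(np.mean(g, axis=0)) for g in groups]
    return min(candidates, key=distortion)
```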
{"title":"Recovery of motion vectors by detecting homogeneous movements for H.263 video communications","authors":"Sungchan Park, NamRye Son, Junghyun Kim, Gueesang Lee","doi":"10.1109/ICME.2001.1237648","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237648","url":null,"abstract":"In this paper, a new approach for the recovery of lost or erroneous motion vector(MV)s by classifying the movements of neighboring blocks by their homogeneity is proposed. MVs of the neighboring blocks are classified according to the direction of MVs and a representative value for each class is determined to obtain the candidate MV with the minimum distortion is selected. Experimental results show that the proposed algorithm exhibits better performance in many cases than existing methods.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122987376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toy interface for multimodal interaction and communication
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237805
K. Mase
A toy interface is a real-world-oriented interface that uses modeled objects with “toy”-like shapes and attributes as the interface between the real world and cyberspace. Toy interfaces can be categorized into three types: the doll type, the miniascape type, and the brick type. We investigate various toy interfaces and present the detailed design of a doll-type interface prototype for multimodal interaction and communication.
{"title":"Toy interface for multimodal interaction and communication","authors":"K. Mase","doi":"10.1109/ICME.2001.1237805","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237805","url":null,"abstract":"Toy Interface is a real-world oriented interface that uses modeled objects with “toy”-like shapes and attributes as the interface between the real world and cyberspace. Toy-interface can be categorized into one of three types: the doll type, miniascape type and brick type. We investigate various toy interfaces and present the design detail of a doll-type interface prototype for the purpose of multi-modal interaction and communication.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124157400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wayfinding and navigation in haptic virtual environments
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237781
S. Semwal
Cognitive maps are mental models of the relative locations and attributes of phenomena in spatial environments. The ability to form cognitive maps is one of the innate gifts of nature, and its absence can have a crippling effect, for example on the visually impaired, for whom the sense of touch becomes the primary means of forming cognitive maps. Once formed, cognitive maps provide a precise mapping of the physical world so that a visually impaired individual can navigate successfully with minimal assistance. However, traditional mobility training is time consuming, and it is very difficult for the blind to express or revisit the cognitive maps formed after a training session is over. The proposed haptic environment allows visually impaired individuals to express cognitive maps as 3D surface maps, with two PHANToM force-feedback devices guiding them. The 3D representation can be fine-tuned by the caregiver and then felt again by the visually impaired user to form precise cognitive maps. In addition to voice commentary, a library of pre-existing shapes familiar to the blind provides orientation and proprioceptive haptic cues during navigation. A graphical display of the cognitive maps provides feedback to the caregiver or trainer. Because the haptic environment can easily be stored and retrieved, the MoVE system also encourages navigation practice by the blind at their own convenience and with family members.
{"title":"Wayfinding and navigation in haptic virtual environments","authors":"S. Semwal","doi":"10.1109/ICME.2001.1237781","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237781","url":null,"abstract":"Cognitive maps are mental models of the relative locations and attribute phenomena of spatial environments. The ability to form cognitive maps is one of the innate gifts of nature. An absence of this ability can have crippling effect, for example, on the visually impaired. The sense of touch becomes the primary source of forming cognitive maps for the visually impaired. Once formed, cognitive maps provide precise mapping of the physical world so that a visually impaired individual can successfully navigate with minimal assistance. However, traditional mobility training is time consuming, and it is very difficult for the blind to express or revisit the cognitive maps formed after a training session is over. The proposed haptic environment will allow the visually impaired individual to express cognitive maps as 3D surface maps, with two PHANToM force-feedback devices guiding them. The 3D representation can be finetuned by the care-giver, and then felt again by the visually impaired in order to form precise cognitive maps. In addition to voice commentary, a library of pre-existing shapes familiar to the blind will provide orientation and proprioceptive haptic-cues during navigation. A graphical display of cognitive maps will provide feedback to the care-giver or trainer. As the haptic environment can be easily stored and retrieved, the MoVE system will also encourage navigation by the blind at their own convenience, and with family members.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128990199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimedia materials for teaching signal processing
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237885
X. Huang, G. Woolsey
Rapidly advancing capabilities in PC-based multimedia technology are providing new opportunities for the delivery of educational material. Multimedia technology is being introduced at all levels of the Electronics and Communications degrees at the University of New England (UNE). In this paper, attention is drawn to the use of multimedia technology through the example of a fourth-year education package on signal processing. We have used this multimedia education package for teaching and learning during formal class periods and to encourage students to use the technology in their own personal study and projects in order to develop their generic engineering skills. The success of the venture has encouraged us to extend the technology to other selected units in the UNE engineering programs.
{"title":"Multimedia materials for teaching signal processing","authors":"X. Huang, G. Woolsey","doi":"10.1109/ICME.2001.1237885","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237885","url":null,"abstract":"Rapidly advancing capabilities in PC-based multimedia technology are providing new opportunities for delivery of educational material. Multimedia technology is being introduced at all levels of the degrees in Electronics and Communications at the University of New England (UNE). In this paper attention is drawn the use of multimedia technology through the example of a fourth-year education package on signal processing. We have used this multimedia education package for teaching and learning during formal class periods and to encourage students to use the technology in their own personal study and projects in order to increase their engineering generic skills. The success of the venture has encouraged us to extend the technology to other selected units in the UNE engineering programs.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129399092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semantic based retrieval model for digital audio and video
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237924
S. Nepal, Uma Srinivasan, G. Reynolds
Recent content-based retrieval systems such as QBIC [7] and VisualSEEk [8] use low-level audio-visual features such as color, pan, zoom, and loudness for retrieval. However, users prefer to retrieve videos using high-level semantics based on their perception, such as "bright color" and "very loud sound". This creates a gap between what users would like and what systems can deliver. This paper attempts to bridge this gap by mapping users' perception of semantic concepts to low-level feature values. It proposes a model for providing high-level semantics for an audio feature that measures loudness. We first perform a pilot user study to capture users' perception of loudness on a collection of audio clips of sound effects and map it to five different semantic terms. We then describe how the loudness measure in MPEG-1 Layer II audio files can be mapped to user-perceived loudness. Finally, we devise a fuzzy technique for retrieving audio/video clips from the collection using these semantic terms.
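A minimal sketch of the kind of fuzzy mapping described above: triangular membership functions turn a normalized loudness measure into degrees of membership in semantic terms. The five term names, breakpoints, and retrieval cutoff below are hypothetical, not the values obtained from the paper's user study.

```python
def triangular(x, a, b, c):
    """Triangular fuzzy membership: 0 at a and c, peaking at 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Hypothetical semantic terms over a loudness measure normalized to [0, 1].
TERMS = {
    "very quiet": (-0.25, 0.0, 0.25),
    "quiet":      (0.0,   0.25, 0.5),
    "moderate":   (0.25,  0.5,  0.75),
    "loud":       (0.5,   0.75, 1.0),
    "very loud":  (0.75,  1.0,  1.25),
}

def memberships(loudness: float) -> dict:
    """Degree to which a clip's loudness matches each semantic term."""
    return {term: triangular(loudness, *abc) for term, abc in TERMS.items()}

def retrieve(clips, term, threshold=0.5):
    """Return clips whose membership in the requested term exceeds a cutoff."""
    return [c for c in clips if memberships(c["loudness"])[term] >= threshold]
```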
{"title":"Semantic based retrieval model for digital audio and video","authors":"S. Nepal, Uma Srinivasan, G. Reynolds","doi":"10.1109/ICME.2001.1237924","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237924","url":null,"abstract":"Recent content-based retrieval systems such as QBIC [7] and VisualSEEk [8] use low-level audio-visual features such as color, pan, zoom, and loudness for retrieval. However, users prefer to retrieve videos using high-level semantics based on their perception such as \"bright color\" and \"very loud sound\". This results in a gap between what users would like and what systems can generate. This paper is an attempt to bridge this gap by mapping users’ perception (of semantic concepts) to lowlevel feature values. This paper proposes a model for providing high-level semantics for an audio feature that determines loudness. We first perform a pilot user study to capture the user perception of loudness level on a collection of audio clips of sound effects, and map them to five different semantic terms. We then describe how the loudness measure in MPEG-1 layer II audio files can be mapped to user perceived loudness. We then devise a fuzzy technique for retrieving audio/video clips from the collections using those semantic terms.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115725735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic caption localization in videos using salient points
Pub Date: 2001-08-22 | DOI: 10.1109/ICME.2001.1237657
M. Bertini, C. Colombo, A. Bimbo
Broadcasters are showing interest in building digital archives of their assets for the reuse of archive material in TV programs, on-line availability, and archiving. This requires tools for video indexing and retrieval by content that exploit high-level video information such as that contained in superimposed text captions. In this paper we present a method to automatically detect and localize captions in digital video using the temporal and spatial local properties of salient points in video frames. Results of experiments on both high-resolution DV sequences and standard VHS videos are presented and discussed.
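The abstract does not spell out the detector, but the general idea can be sketched under stated assumptions: salient points that persist across frames and cluster in a horizontal band are treated as caption evidence, since caption text stays fixed while the scene behind it moves. OpenCV's goodFeaturesToTrack stands in for the paper's salient-point detector, and all thresholds are illustrative.

```python
import cv2
import numpy as np

def _corners(gray):
    """Generic salient-point detector (stand-in for the paper's detector)."""
    pts = cv2.goodFeaturesToTrack(gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=5)
    return pts.reshape(-1, 2) if pts is not None else np.empty((0, 2))

def stable_points(frames, max_drift=1.0):
    """Keep corners that stay (nearly) fixed across consecutive frames."""
    stable = _corners(frames[0])
    for f in frames[1:]:
        pts = _corners(f)
        if len(pts) == 0:
            return np.empty((0, 2))
        stable = np.array([p for p in stable
                           if np.min(np.linalg.norm(pts - p, axis=1)) <= max_drift])
    return stable

def caption_bands(frame_height, points, band=16, min_points=20):
    """Flag horizontal bands containing many stable salient points,
    a simple proxy for superimposed-caption regions."""
    hist = np.zeros(frame_height // band + 1, dtype=int)
    for _, y in points:
        hist[int(y) // band] += 1
    return [i * band for i, n in enumerate(hist) if n >= min_points]
```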
{"title":"Automatic caption localization in videos using salient points","authors":"M. Bertini, C. Colombo, A. Bimbo","doi":"10.1109/ICME.2001.1237657","DOIUrl":"https://doi.org/10.1109/ICME.2001.1237657","url":null,"abstract":"Broadcasters are demonstrating interest in building digital archives of their assets for reuse of archive materials for TV programs, on-line availability, and archiving. This requires tools for video indexing and retrieval by content exploiting high-level video information such as that contained in super-imposed text captions. In this paper we present a method to automatically detect and localize captions in digital video using temporal and spatial local properties of salient points in video frames. Results of experiments on both high-resolutionDV sequences and standard VHS videos are presented and discussed.","PeriodicalId":405589,"journal":{"name":"IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125552656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}