Learning ordered word representations with γ-decay dropout
Aiting Liu, Chao Xing, Yang Feng, Dong Wang
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820839
Learning distributed word representations (word embeddings) has gained much popularity recently. Current learning approaches usually treat all dimensions of an embedding as homogeneous, which leads to non-structured representations whose dimensions are neither interpretable nor comparable. This paper proposes a method to generate ordered word embeddings in which the significance of the dimensions is in descending order. This ordering may benefit a wide range of applications such as fast search and vector tailoring. Our method employs a γ-decay dropout algorithm that makes the lower dimensions more likely to be updated during learning than the higher dimensions, so that the lower dimensions encode more information. Experimental results on the WordSimilarity-353, MEN3000, SCWS, and SimLex-999 tasks show that, compared to non-ordered counterparts, the proposed method produces more meaningful ordered embeddings and achieves better performance.
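As a rough sketch of the γ-decay idea (a hypothetical reconstruction; the paper's exact dropout schedule and γ value are not reproduced here), the keep probability of dimension i can decay geometrically, so earlier dimensions survive dropout, and hence get updated, more often:

```python
import numpy as np

def gamma_decay_keep_probs(dim, gamma=0.95):
    """Keep probability decays geometrically with dimension index,
    so lower (earlier) dimensions are retained, and updated, more often."""
    return gamma ** np.arange(dim)

def gamma_decay_dropout_mask(dim, gamma=0.95, rng=None):
    """Sample a binary mask: dimension i survives with probability gamma**i."""
    rng = rng or np.random.default_rng()
    return (rng.random(dim) < gamma_decay_keep_probs(dim, gamma)).astype(float)

# In training, the mask would gate the gradient update of an embedding vector:
# masked_grad = gamma_decay_dropout_mask(len(grad), gamma) * grad
```

Dimension 0 is always kept (γ⁰ = 1), and the expected update frequency falls off monotonically, which is what makes the learned dimensions ordered by significance.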
Intra block copy hash reduction for HEVC screen content coding
Che-Wei Kuo, H. Hang, Chun-Liang Chien
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820766
To meet a wide range of needs for video applications such as remote desktop, video conferencing, distance education, and cloud gaming, the ISO/ITU Joint Collaborative Team on Video Coding (JCT-VC) committee has been specifying the Screen Content Coding (SCC) standard as one of the extensions of High Efficiency Video Coding (HEVC). In this paper, the hash search method of the standard-adopted Intra Block Copy (IBC) coding tool for SCC is investigated. We collect the data coded using the current hash table, examine its efficiency, and explore possible ways for further improvement. A low-complexity scheme for selecting effective hash nodes and a modified hash key generation method are presented. Experimental results show that, when integrated into the SCM-3.0 test model, the proposed method reduces hash table memory usage by 37% on average (up to 70%) while preserving similar BD-rate savings and encoding complexity.
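The role of the hash table in IBC search can be sketched as follows. The `block_hash` key below is a toy illustration, not the CRC-style key actually used in the SCM reference software, and capping the per-key candidate list (not shown) is one simple way the memory/quality trade-off described above could be realized:

```python
import numpy as np

def block_hash(block, bits=16):
    """Toy hash key for a pixel block, mixing coarse DC and gradient
    statistics. Illustrative only; SCM uses CRC-style keys over block rows."""
    dc = int(block.mean())
    gx = int(np.abs(np.diff(block, axis=1)).mean())
    gy = int(np.abs(np.diff(block, axis=0)).mean())
    return (dc * 31 + gx * 17 + gy) % (1 << bits)

def build_hash_table(frame, bsize=8):
    """Index every bsize x bsize block position in a frame by its hash key.
    IBC search then only compares the current block against the candidate
    positions stored under its own key."""
    table = {}
    h, w = frame.shape
    for y in range(0, h - bsize + 1):
        for x in range(0, w - bsize + 1):
            key = block_hash(frame[y:y + bsize, x:x + bsize])
            table.setdefault(key, []).append((y, x))
    return table
```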
Ensemble based speaker verification using adapted score fusion in noisy reverberant environments
Ryosuke Nakanishi, Sayaka Shiota, H. Kiya
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820794
This paper proposes ensemble-based automatic speaker verification (ASV) using adapted score fusion in noisy reverberant environments. It is well known that background noise and reverberation degrade the performance of ASV systems. Various techniques have been reported to improve robustness against noise and reverberation, and ensemble-based methods are among the effective ones in noisy environments. An ensemble-based method combines several weak learners to achieve higher performance than a single learner. However, since the performance depends on the fusion weights, an adequate weight estimation method is required. The proposed weight estimation method is based on supervised adaptation and an evolutionary update algorithm. The recently published QUT-NOISE-SRE protocol is used to simulate noise and reverberation on clean speech in our experiments. The experimental results report the characteristics of the QUT-NOISE-SRE protocol and confirm the effectiveness of the proposed method in noisy reverberant environments.
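A minimal sketch of adapted score fusion, assuming per-trial scores from several subsystems and labeled adaptation data. The `evolve_weights` hill-climber is a generic (1+1) evolutionary strategy, not necessarily the paper's exact update rule:

```python
import numpy as np

def fuse_scores(scores, weights):
    """Weighted score-level fusion of several weak verifiers.
    scores: (n_systems, n_trials); weights: (n_systems,), normalized here."""
    w = np.asarray(weights, dtype=float)
    return (w / w.sum()) @ np.asarray(scores, dtype=float)

def evolve_weights(scores, labels, loss_fn, n_iter=200, sigma=0.1, rng=None):
    """Toy (1+1) evolutionary search for fusion weights on labeled
    adaptation data: perturb, keep the candidate only if the loss improves."""
    rng = rng or np.random.default_rng(0)
    n = scores.shape[0]
    best = np.full(n, 1.0 / n)
    best_loss = loss_fn(fuse_scores(scores, best), labels)
    for _ in range(n_iter):
        cand = np.clip(best + sigma * rng.standard_normal(n), 1e-6, None)
        l = loss_fn(fuse_scores(scores, cand), labels)
        if l < best_loss:
            best, best_loss = cand / cand.sum(), l
    return best
```

With an informative subsystem and a noise-only subsystem, the search drives the weight of the informative one up, which is the intended behavior of adapted fusion.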
Beamforming networks using spatial covariance features for far-field speech recognition
Xiong Xiao, Shinji Watanabe, Chng Eng Siong, Haizhou Li
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820724
Recently, a deep beamforming (BF) network was proposed to predict BF weights from phase-carrying features such as generalized cross correlation (GCC). The BF network is trained jointly with the acoustic model to minimize the automatic speech recognition (ASR) cost function. In this paper, we propose to replace GCC with features derived from the input signals' spatial covariance matrices (SCM), which contain the phase information of individual frequency bands. Experimental results on the AMI meeting transcription task show that the BF network using SCM features significantly reduces the word error rate to 44.1% from the 47.9% obtained with a conventional ASR pipeline using delay-and-sum BF. Compared with GCC features, we also observe a small but consistent absolute gain of 0.6%. The use of SCM features also facilitates the implementation of more advanced BF methods within a deep learning framework, such as minimum variance distortionless response (MVDR) BF, which requires the speech and noise SCMs.
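Computing per-frequency SCMs from a multichannel STFT is standard; a minimal numpy sketch (the array layout is an assumption):

```python
import numpy as np

def spatial_covariance(stft):
    """Per-frequency spatial covariance matrices from a multichannel STFT.
    stft: complex array of shape (channels, frames, freq_bins).
    Returns (freq_bins, channels, channels) Hermitian matrices; the phase of
    the off-diagonal entries encodes inter-channel time differences, which is
    the directional cue the BF network consumes."""
    c, t, f = stft.shape
    x = stft.transpose(2, 0, 1)                      # (freq, chan, frames)
    return x @ x.conj().transpose(0, 2, 1) / t       # average over frames
```

For MVDR-style extensions, the same routine would be applied separately to speech-dominant and noise-dominant frames to obtain the speech and noise SCMs.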
SMT-based lexicon expansion for broadcast transcription
Manon Ichiki, Aiko Hagiwara, Hitoshi Ito, K. Onoe, Shoei Sato, A. Kobayashi
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820682
We describe a method of lexicon expansion to tackle the variations of spontaneous speech. These variations of utterances are found widely in programs such as conversational talk shows and are typically observed as unintelligible utterances with a high speech rate. Unlike the read speech in news programs, they often severely degrade automatic speech recognition (ASR) performance. We therefore treat these variations as new variants of the original entries in the ASR lexicon. The new entries are generated with an SMT approach, in which translation models are trained on a corpus that maps the phoneme sequences in the lexicon to the sequences obtained by phoneme recognition. We also introduce a method by which unreliable entries are removed from the lexicon. Our SMT-based approach achieved a 0.1% WER reduction over a variety of broadcast programs.
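The expand-then-prune step could be sketched as below, assuming a hypothetical `translate` interface that returns candidate phoneme sequences with translation probabilities (the threshold and interface are illustrative, not the paper's):

```python
def expand_lexicon(lexicon, translate, min_prob=0.3):
    """Add pronunciation variants proposed by a phoneme-translation model,
    dropping unreliable candidates below a probability threshold.
    lexicon: word -> iterable of phoneme tuples.
    translate(phonemes) -> list of (variant_phonemes, prob)  [assumed API]."""
    expanded = {w: set(prons) for w, prons in lexicon.items()}
    for word, prons in lexicon.items():
        for pron in prons:
            for variant, prob in translate(pron):
                if prob >= min_prob and variant != pron:
                    expanded[word].add(variant)
    return expanded
```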
Robust scalable video multicast using triangular network coding in LTE/LTE-Advanced
Phuc Chau, Yong-woo Lee, T. Bui, Jitae Shin, J. Jeong
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820711
Recent studies have shown that inter-layer network coding is a promising approach to provide unequal error protection for scalable video multicast under channel heterogeneity. Selecting the optimal transmission distribution at the eNB increases system performance at the cost of time and computational complexity. In this paper, we propose an optimal transmission strategy for scalable video multicast using triangular network coding at the application layer in LTE/LTE-Advanced networks. The proposed strategy comprises two optimization phases: space reduction and performance maximization. The first phase reduces the number of search steps over the dictionary of possible transmission distributions using a proposed performance-predictive algorithm. The second phase maximizes not only the average number of successfully decoded layers among receivers but also the number of receivers that successfully decode the video base layer. We evaluate the proposed strategy through various simulations, measuring the average number of successfully decoded layers among receivers in a multicast group, throughput, and video quality. The simulation results show that our scheme outperforms other recent studies and adapts well to variable video streaming rates under tight time constraints.
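Under a simplified erasure model (an assumption; the paper's channel and decoding model may differ), the objective of the second phase, the expected number of decodable layers for one receiver, can be computed while respecting the base-layer dependency of scalable video:

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): probability that at least k of n
    coded packets arrive over a link with delivery probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def expected_layers(n_sent, k_need, p):
    """Expected number of consecutively decodable layers for one receiver:
    layer l decodes iff it receives k_need[l] of its n_sent[l] coded packets
    AND every lower layer decoded (scalable-video dependency)."""
    e, p_prefix = 0.0, 1.0
    for n, k in zip(n_sent, k_need):
        p_prefix *= binom_tail(n, k, p)
        e += p_prefix
    return e
```

Evaluating this metric over every candidate distribution is what makes the search space large, hence the paper's first-phase space reduction.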
Prior information directed stage-wise measurement matrix design for block compressed image sensing
Mei Zhao, Anhong Wang, Zhiwei Xing, Peihao Li
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820678
A key issue in compressed sensing (CS) is the design of the measurement matrix. The traditional measurement matrix is not optimal because it is non-adaptive and does not discriminate between different signal components. In this paper, a prior-information-directed stage-wise measurement matrix is proposed for block compressed image sensing, leading to the st-BCS method. In the first stage, the measurement matrix takes measurements only of the important low-frequency components, directed by prior structure information; it is then updated stage by stage according to prior information fed back from the decoder side. Experimental results show that st-BCS achieves a significant performance improvement over the state-of-the-art BCS scheme that uses a non-adaptive random matrix.
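A first-stage matrix that measures only low-frequency content can be sketched as rows drawn from the lowest-frequency 2D-DCT basis vectors (an illustrative construction; the paper's actual matrix design and frequency ordering may differ):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k is the k-th frequency basis vector."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= np.sqrt(1.0 / n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

def low_freq_measurement_matrix(bsize, m):
    """First-stage measurement matrix for a bsize x bsize image block:
    the m lowest-frequency 2D-DCT basis images (ordered by i + j),
    flattened into rows, so y = Phi @ block.ravel() captures the most
    important low-frequency components first."""
    C = dct_matrix(bsize)
    idx = sorted(((i, j) for i in range(bsize) for j in range(bsize)),
                 key=lambda ij: (ij[0] + ij[1], ij[0]))[:m]
    return np.stack([np.outer(C[i], C[j]).ravel() for i, j in idx])
```

Later stages would append or swap rows according to the structure information fed back from the decoder.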
A modified FSLMS algorithm for nonlinear ANC
Lei Luo, Jinwei Sun, Boyan Huang, Xiangbin Jiang
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820859
Analysis of the functional link artificial neural network (FLANN) structure based on the filtered-s least mean square (FSLMS) algorithm, which is widely used in nonlinear active noise control (NANC) systems, reveals that the controller coefficients of the nonlinear parts are mutually coupled. This coupling unbalances the coefficient updates and limits the performance of the FSLMS algorithm. To solve this problem, a modified FSLMS (MFSLMS) algorithm is proposed in this paper; it greatly weakens these couplings by adding a corrective filter before the trigonometric expansion. Compared with the conventional FSLMS algorithm and its other improved versions, the MFSLMS algorithm not only performs better at noise cancellation but also has lower computational complexity. Extensive simulations demonstrate the effectiveness of the proposed algorithm.
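The trigonometric functional expansion at the heart of FLANN/FSLMS can be sketched as follows; the expansion order and the corrective pre-filter that MFSLMS inserts before this step are left out as paper-specific details:

```python
import numpy as np

def flann_expand(x, order=2):
    """FLANN trigonometric functional expansion of reference samples:
    each sample x -> [x, sin(pi x), cos(pi x), sin(2 pi x), cos(2 pi x), ...].
    The controller output is a linear combination of these features, and
    FSLMS adapts the combination weights by LMS on the filtered features."""
    feats = [np.asarray(x, dtype=float)]
    for p in range(1, order + 1):
        feats += [np.sin(p * np.pi * feats[0]), np.cos(p * np.pi * feats[0])]
    return np.stack(feats, axis=-1)
```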
Pricing based resource allocation scheme for video multicast service in LTE networks
Chung-Nan Lee, Han-Ting Lai
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820910
Efficient use of the limited wireless bandwidth is of paramount importance in radio access networks. One way to use the bandwidth efficiently is to adopt a pricing model that improves the number of video layers users receive. In this paper, users are divided into three classes; users in different classes pay different prices and enjoy different QoS. Under the pricing model, more video layers are allocated to users who pay a higher price. We propose a Pricing Class based Resource Allocation Scheme (PCRAS) that considers the price of each multicast group and uses the channel quality indicator to compute a resource allocation priority for dispatching resource blocks efficiently. Experimental results show that the proposed scheme increases the number of video layers users receive and that users in different classes receive different levels of service. Compared with other existing scheduling schemes, the proposed scheme improves users' video experience.
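A toy version of priority-based resource block dispatch, assuming a linear price-plus-CQI priority (the weights `alpha` and `beta` and the greedy policy are illustrative, not PCRAS's exact metric):

```python
def schedule_rbs(groups, n_rbs, alpha=1.0, beta=1.0):
    """Greedy resource-block dispatch: multicast groups sorted by a
    price-and-CQI priority are served first, higher-priority groups
    receiving their demanded RBs until the pool runs out.
    groups: list of dicts with 'price_class', 'cqi', 'demand' (RBs wanted).
    Returns {group_index: allocated_rbs}."""
    def priority(i):
        g = groups[i]
        return alpha * g['price_class'] + beta * g['cqi']
    alloc, left = {}, n_rbs
    for i in sorted(range(len(groups)), key=priority, reverse=True):
        give = min(groups[i]['demand'], left)
        alloc[i] = give
        left -= give
    return alloc
```

Higher-paying classes naturally end up with more video layers because their groups are served before the RB pool is exhausted.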
Iterative closest points method based on photometric weight for 3D object reconstruction
Dong-Won Shin, Yo-Sung Ho
Pub Date: 2016-12-01 | DOI: 10.1109/APSIPA.2016.7820707
Interest in 3D object reconstruction, which digitizes the shape and color of objects from the real world, is growing. 3D object reconstruction consists of several steps: image acquisition, image refinement, point cloud generation, iterative closest points (ICP), bundle adjustment, and model surface representation. Among them, the ICP step is critical for computing an accurate initial value for the optimization in the subsequent bundle adjustment step. The existing ICP method suffers from an object drift problem caused by trajectory error accumulating over time. In this paper, we achieve more accurate registration between point clouds by using SIFT features and a photometric weighting on them. We found that the proposed method decreases the absolute trajectory error and reduces the object drift problem in the reconstructed 3D object model.
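A single weighted rigid-alignment step, the core of one weighted ICP iteration, can be written via the weighted Kabsch solution; the photometric weights themselves, derived from SIFT matches in the paper, are taken as given inputs here:

```python
import numpy as np

def weighted_rigid_transform(src, dst, w):
    """Weighted least-squares rigid alignment (Kabsch with per-point weights):
    finds R, t minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2.
    src, dst: (n, 3) matched points; w: (n,) non-negative weights."""
    w = np.asarray(w, dtype=float) / np.sum(w)
    cs, cd = w @ src, w @ dst                        # weighted centroids
    H = (src - cs).T @ np.diag(w) @ (dst - cd)       # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

Down-weighting photometrically inconsistent correspondences in `w` is what suppresses the drift that uniform-weight ICP accumulates.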