Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734775
B. C. Haris, G. Pradhan, A. Misra, S. Shukla, R. Sinha, S. Prasanna
In this paper, we present our initial study with a recently collected speech database for developing robust speaker recognition systems in the Indian context. The database contains speech data collected from 200 speakers across different sensors, languages, speaking styles, and environments. The data were recorded over five sensors in parallel, in English and multiple Indian languages, in reading and conversational speaking styles, and in office and uncontrolled environments such as laboratories, hostel rooms, and corridors. The database is evaluated using an adapted Gaussian mixture model based speaker verification system following the NIST 2003 speaker recognition evaluation protocol, and gives performance comparable to that obtained on NIST data sets. Our initial study of mismatch between training and test conditions finds that mismatch in sensor, speaking style, and environment results in significant performance degradation relative to the matched case, whereas the degradation for language mismatch is relatively smaller.
{"title":"Multi-variability speech database for robust speaker recognition","authors":"B. C. Haris, G. Pradhan, A. Misra, S. Shukla, R. Sinha, S. Prasanna","doi":"10.1109/NCC.2011.5734775","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734775","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"125991153"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734737
Ashwin Bellur, K. Narayan, K. Raghava Krishnan, H. Murthy
This paper describes ways to improve prosody modeling in syllable-based concatenative speech synthesis systems for two Indian languages, Hindi and Tamil, within the unit selection paradigm. The syllable is a larger unit than the diphone and contains most of the coarticulation information. Although syllable-based synthesis is quite intelligible compared to diphone-based systems, naturalness, especially in terms of prosody, requires additional processing. Since the synthesizer is built using a cluster unit framework, a hybrid approach that combines rule-based and statistical models is proposed to better model the prosody of syllable-like units. It is further observed that prediction of phrase boundaries is crucial, particularly because Indian languages are replete with polysyllabic words. CART-based phrase modeling for Hindi and Tamil is discussed. Perceptual experiments show a significant improvement in the mean opinion score (MOS) for both the Hindi and Tamil synthesizers. Index Terms: speech synthesis, unit selection, cluster unit synthesis, phrase boundaries
{"title":"Prosody modeling for syllable-based concatenative speech synthesis of Hindi and Tamil","authors":"Ashwin Bellur, K. Narayan, K. Raghava Krishnan, H. Murthy","doi":"10.1109/NCC.2011.5734737","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734737","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"127375839"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734780
Vyass Ramakrishnan, Karthik Shetty, Kumar Pawan, C. Seelamantula
We address the problem of speech enhancement in real-world noisy scenarios. We propose to solve the problem in two stages: a generalized spectral subtraction technique, followed by a sequence of perceptually motivated post-processing algorithms. The role of the post-processing algorithms is to compensate for the effects of noise and to suppress any artifacts created by the first-stage processing. The key post-processing mechanisms are aimed at suppressing musical noise, enhancing the formant structure of voiced speech, and denoising the linear-prediction residual. The parameter values are chosen by experimentally evaluating the enhancement performance as a function of the parameters. We used the Carnegie Mellon University Arctic database for our experiments and considered three real-world noise types: fan noise, car noise, and motorbike noise. The enhancement performance was evaluated through listening experiments with 12 subjects. For positive signal-to-noise ratios (SNRs), the listeners reported a clear improvement in perceived quality over the noisy signal, with an average increase of 0.5 in the mean opinion score (MOS). For negative SNRs, however, the improvement was found to be marginal.
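The first stage described above can be illustrated with a minimal sketch of generalized spectral subtraction on a single magnitude-spectrum frame. This is a common textbook form with an over-subtraction factor and a spectral floor, not necessarily the variant used in the paper; the function name `spectral_subtract` and the parameter values `alpha`, `beta`, and `gamma` are assumptions chosen for illustration.

```python
import numpy as np

def spectral_subtract(noisy_mag, noise_mag, alpha=2.0, beta=0.01, gamma=2.0):
    """Generalized spectral subtraction on one magnitude-spectrum frame.

    alpha: over-subtraction factor (illustrative value)
    beta:  spectral floor as a fraction of the noise estimate (illustrative)
    gamma: exponent; 2 gives power subtraction, 1 gives magnitude subtraction
    """
    noisy_p = noisy_mag ** gamma
    noise_p = noise_mag ** gamma
    cleaned = noisy_p - alpha * noise_p
    # Bins that fall below the floor are clamped to it, which limits the
    # isolated spectral peaks that are perceived as musical noise.
    cleaned = np.maximum(cleaned, beta * noise_p)
    return cleaned ** (1.0 / gamma)

# Toy frame with three frequency bins and a flat noise estimate.
noisy = np.array([1.0, 0.5, 2.0])
noise = np.array([0.5, 0.5, 0.5])
enhanced = spectral_subtract(noisy, noise)
```

In a full system the enhanced magnitudes would be recombined with the noisy phase and resynthesized frame by frame; the post-processing stages described in the abstract would then operate on that output.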
{"title":"Efficient post-processing techniques for speech enhancement","authors":"Vyass Ramakrishnan, Karthik Shetty, Kumar Pawan, C. Seelamantula","doi":"10.1109/NCC.2011.5734780","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734780","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"117196413"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734709
A. Oredope, V. Pham, B. Evans
The Third Generation Partnership Project (3GPP) has proposed Long Term Evolution and System Architecture Evolution (LTE/SAE) as the next stage of technologies aimed at overcoming the limitations of existing 2G and 3G networks while driving mobile networks towards 4G standardisation. Although a new architecture has been proposed for LTE/SAE, some technologies from the existing 2G and 3G frameworks are reused to allow for smooth transition and backward compatibility. One such technology is the service delivery framework known as the IP Multimedia Subsystem (IMS). In this paper, we investigate and report our early-trial findings on the integration of the IMS with the LTE/SAE architecture, looking specifically at key parameters such as end-to-end call quality, QoS parameters, IP connectivity, and session management. We first critically analyse key 3GPP and non-3GPP approaches to deploying telephony services over LTE/SAE and make recommendations based on a wide range of existing research in the field. To demonstrate these recommendations, a high-level prototype of the Evolved Packet Core (EPC) was modeled and developed. The EPC model was then integrated with the FOKUS Open IMS Core and deployed on our testbed, where various Quality of Service (QoS) related experimental tests were carried out. Our results show that even though an EPC-controlled IMS session generates more packets for initial signaling, it will in the long run lead to a reduction in bandwidth consumption by the client due to the fragmentation of SIP messages. This in turn makes additional bandwidth available for integrating further services, such as presence and high-resolution video, with existing voice services.
{"title":"Deploying IP Multimedia Subsystem (IMS) services in future mobile networks","authors":"A. Oredope, V. Pham, B. Evans","doi":"10.1109/NCC.2011.5734709","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734709","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"122668395"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734751
Ravi Ranjan, Subrat Kar
Prolonging network lifetime is the most important design objective in wireless sensor networks (WSNs). Because the total energy available in a WSN is limited, making the best use of it is a central research question. In this paper we provide a method for determining the optimal number of cluster heads for homogeneous sensor networks deployed in different scenarios, using a reasonable energy consumption model. In the first scenario, nodes are thrown randomly, which can be modeled using a two-dimensional homogeneous spatial Poisson point process. In the second scenario, nodes are placed deterministically along a grid. For both scenarios we calculate the average energy spent in the network in each round under the LEACH protocol, for both single-hop and multi-hop communication between cluster head and sink (base station), as a function of the probability that a node becomes a cluster head. We then find the optimal probability of becoming a cluster head, and hence the optimal number of cluster heads, that minimizes the average energy spent in the network in each round. Simulation results show that this optimal probability depends not only on the total number of nodes but also on the network area A, the packet length L, and the processing energy of the nodes.
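To make the role of the cluster-head probability concrete, the following sketch shows the standard LEACH election rule, in which a node that has not served as cluster head in the current cycle elects itself in round r with threshold T = p / (1 - p (r mod 1/p)). The rule comes from the original LEACH protocol description, not from this abstract; the function names and the value p = 0.05 are illustrative assumptions, and the paper's energy model for choosing the optimal p is not reproduced here.

```python
import random

def leach_threshold(p, r):
    """LEACH self-election threshold for round r, given cluster-head probability p."""
    period = round(1 / p)  # rounds per cycle; assumes 1/p is close to an integer
    return p / (1 - p * (r % period))

def elect_cluster_heads(node_ids, eligible, p, r, rng):
    """Nodes still eligible in this cycle elect themselves with probability T."""
    t = leach_threshold(p, r)
    return [n for n in node_ids if n in eligible and rng.random() < t]

# Hypothetical network of 100 nodes with p = 0.05, i.e. about 5 heads per round.
rng = random.Random(0)
nodes = range(100)
heads = elect_cluster_heads(nodes, set(nodes), p=0.05, r=0, rng=rng)
```

With N nodes the expected number of cluster heads per round is N·p, which is why finding the optimal p is equivalent to finding the optimal number of cluster heads.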
{"title":"A novel approach for finding optimal number of cluster head in wireless sensor network","authors":"Ravi Ranjan, Subrat Kar","doi":"10.1109/NCC.2011.5734751","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734751","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"120844695"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734772
Atif Iqbal, A. Namboodiri
Biometric identification often involves explicit comparison of a probe template against each template stored in a database. This approach becomes extremely time-consuming as the size of the database increases. Filtering approaches use a lightweight comparison to reduce the database to a smaller set of candidates for explicit comparison. However, most existing filtering schemes use specific features that are hand-crafted for the biometric trait at each stage of the filtering. In this work, we show that a cascade of simple linear projections onto random lines can achieve significant levels of filtering. Each stage of filtering consists of projecting the probe onto a specific line and removing the database samples that fall outside a window around the probe. The approach provides a way to generate filters automatically and avoids the need to develop specific features for different biometric traits. The method also offers a variety of parameters, such as the projection lines, the number and order of projections, and the window sizes, for customizing the filtering process to a specific application. Experimental results show that using an ensemble of projections reduces the search space by 60% without increasing the false negative identification rate.
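The filtering stage described above can be sketched in a few lines. This is a minimal illustration on synthetic Gaussian vectors, assuming fixed-length feature vectors, unit-norm random directions, and a single hand-picked window size; the function name `cascade_filter` and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def cascade_filter(database, probe, directions, windows):
    """Return indices of database rows that survive every projection stage."""
    candidates = np.arange(len(database))
    for direction, window in zip(directions, windows):
        # Project the probe and the remaining candidates onto this line.
        probe_pos = probe @ direction
        cand_pos = database[candidates] @ direction
        # Discard candidates that fall outside the window around the probe.
        keep = np.abs(cand_pos - probe_pos) <= window
        candidates = candidates[keep]
    return candidates

rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 16))              # synthetic templates
probe = database[42] + 0.01 * rng.normal(size=16)   # noisy copy of template 42
directions = rng.normal(size=(5, 16))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
windows = [0.5] * 5                                 # illustrative window size
survivors = cascade_filter(database, probe, directions, windows)
```

Only the surviving indices would be passed on to the expensive explicit matcher. Narrower windows prune more of the database but raise the risk of discarding the true match, which is the trade-off the window-size parameter controls.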
{"title":"Cascaded filtering for biometric identification using random projections","authors":"Atif Iqbal, A. Namboodiri","doi":"10.1109/NCC.2011.5734772","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734772","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"125006703"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734782
Praveen Chandrahas, Deepti Kumar, R. Karthik, T. Gonsalves, A. Jhunjhunwala, Gaurav Raina
The rapid proliferation of mobile phones now provides an opportunity to harness their potential towards financial transactions. One form of such financial transactions is mobile payments. Our focus, in this paper, will be on certain design considerations for a mobile payment architecture.
{"title":"Some design considerations for a mobile payment architecture","authors":"Praveen Chandrahas, Deepti Kumar, R. Karthik, T. Gonsalves, A. Jhunjhunwala, Gaurav Raina","doi":"10.1109/NCC.2011.5734782","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734782","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"125502081"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734787
S. Venkatramanan, Anurag Kumar
We provide new analytical results concerning the spread of information or influence under the linear threshold social network model introduced by Kempe et al. in [1], in the information dissemination context. The seeder starts by providing the message to a set of initial nodes and is interested in maximizing the number of nodes that will ultimately receive the message. A node's decision to forward the message depends on the set of nodes from which it has received the message. Under the linear threshold model, a node forwards the message when the total influence of the nodes from which it has received the message reaches its own threshold of influence. We derive analytical expressions for the expected number of nodes that ultimately receive the message, as a function of the initial set of nodes, for a generic network. We show that the problem can be recast in the framework of Markov chains. We then use the analytical expressions to gain insights into information dissemination in some simple network topologies, such as the star, ring, and mesh, and on acyclic graphs. We also derive the optimal initial set in the above networks, and hint at general heuristics for picking a good initial set.
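The linear threshold dynamics can be illustrated with a small Monte Carlo sketch. It estimates the expected spread by simulation rather than through the analytical expressions the paper derives, and it assumes the usual formulation in which each node draws its threshold uniformly at random and the influence weights into a node sum to at most 1. The function names, the star example, and its weights are hypothetical.

```python
import random

def lt_spread(in_weights, seeds, rng):
    """One run of the linear threshold process; returns the final active set.

    in_weights[v] maps each in-neighbour u of v to the influence of u on v.
    """
    # Thresholds are drawn from (0, 1] so that zero influence never activates.
    thresholds = {v: 1.0 - rng.random() for v in in_weights}
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_weights.items():
            if v in active:
                continue
            influence = sum(w for u, w in nbrs.items() if u in active)
            if influence >= thresholds[v]:
                active.add(v)
                changed = True
    return active

def expected_spread(in_weights, seeds, runs=2000, seed=0):
    """Monte Carlo estimate of the expected number of nodes reached."""
    rng = random.Random(seed)
    total = sum(len(lt_spread(in_weights, seeds, rng)) for _ in range(runs))
    return total / runs

# Star: hub 0 fully influences each leaf; each leaf has weight 1/4 on the hub.
star = {0: {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25},
        1: {0: 1.0}, 2: {0: 1.0}, 3: {0: 1.0}, 4: {0: 1.0}}
spread_from_hub = expected_spread(star, {0})
spread_from_leaf = expected_spread(star, {1})
```

In this toy star, seeding the hub reaches all five nodes in every run, while seeding a single leaf activates the hub only when its threshold is at most 1/4, so the expected spread is 1 + (1/4)·4 = 2. Comparisons of this kind are what make the choice of initial set matter.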
{"title":"Information dissemination in socially aware networks under the linear threshold model","authors":"S. Venkatramanan, Anurag Kumar","doi":"10.1109/NCC.2011.5734787","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734787","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"129179383"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734715
R. Chaudhary, K. V. Srivastava, A. Biswas
The main objective of the present study is to improve the bandwidth of the dielectric resonator antenna (DRA). A new four-element multilayer cylindrical dielectric resonator antenna (MCDRA) array above a ground plane is proposed. The MCDRA is easy to design; the HE11δ mode is excited in each MCDRA element by a centrally placed dielectric resonator in which the TM01δ mode is excited. The effects of design parameters such as the permittivity of the materials, the probe height, and the arrangement of the dielectric layers are investigated, and the excited modes (i.e. TM01δ and HE11δ) are confirmed by simulation. The simulations are performed with Ansoft's HFSS package. The proposed MCDRA offers an impedance bandwidth of ∼47% for return loss below −10 dB, covering 4.06 to 6.07 GHz with a resonance frequency of 4.3 GHz. It has a monopole-like radiation pattern that is stable across the passband, with a gain of 4.73 dB.
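As a quick check on the quoted bandwidth figure, the arithmetic can be worked out as follows. The abstract does not say which reference frequency the percentage is taken against; normalizing by the stated 4.3 GHz resonance frequency is an assumption that happens to reproduce the ∼47% value, whereas normalizing by the band centre gives a smaller number.

```latex
\[
\mathrm{BW}_{f_r} = \frac{f_H - f_L}{f_r}\times 100\%
                  = \frac{6.07 - 4.06}{4.3}\times 100\% \approx 46.7\%
\]
\[
\mathrm{BW}_{f_c} = \frac{f_H - f_L}{(f_H + f_L)/2}\times 100\%
                  = \frac{2.01}{5.065}\times 100\% \approx 39.7\%
\]
```

Here f_L = 4.06 GHz and f_H = 6.07 GHz are the band edges from the abstract, f_r = 4.3 GHz is the stated resonance frequency, and f_c is the arithmetic band centre, which is introduced here only for comparison.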
{"title":"Four element multilayer cylindrical dielectric resonator antenna excited by a coaxial probe for wideband applications","authors":"R. Chaudhary, K. V. Srivastava, A. Biswas","doi":"10.1109/NCC.2011.5734715","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734715","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"126630962"}
Pub Date: 2011-03-17 DOI: 10.1109/NCC.2011.5734719
Arpit Mathur, R. Hegde
A new spectral ratio method is proposed in this paper for detecting whispered segments within a normally phonated speech stream. The method is based on computing the ratio of the linear prediction (LP) spectrum to the minimum variance distortionless response (MVDR) spectrum. Both the linear prediction method and the LP residual method are, by themselves, found to be inadequate for modelling medium to high frequencies in the speech signal. In contrast, the MVDR method is robust in modelling the spectrum at all frequencies. The proposed spectral ratio method uses this difference in spectral estimation to separate whispered segments, which have fewer harmonics and more noise, from normally phonated segments of speech. A comparative analysis of the proposed method against other methods, such as the LP residual and spectral flatness methods, is described. Whisper detection experiments are conducted on the CHAINS database. The proposed method shows reasonable improvements in terms of the ROC curves and the whisper diarization error rate.
{"title":"Significance of the LP-MVDR spectral ratio method in Whisper Detection","authors":"Arpit Mathur, R. Hegde","doi":"10.1109/NCC.2011.5734719","DOIUrl":"https://doi.org/10.1109/NCC.2011.5734719","journal":{"name":"2011 National Conference on Communications (NCC)"},"publicationDate":"2011-03-17","platform":"Semanticscholar","paperid":"127423248"}