Automatic Video Text Localization and Recognition
A. Saracoglu, A. Alatan
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659917
Text embedded in digital media is an important tool for indexing and managing large-scale video databases. In this work, the localization performance of overlay text is analyzed using different feature extraction methods combined with different classifiers. In addition, improving the text recognition rate by using multiple hypotheses obtained from multilevel segmentation together with a statistical language model is investigated.
{"title":"Automatic Video Text Localization and Recognition","authors":"A. Saracoglu, A. Alatan","doi":"10.1109/SIU.2006.1659917","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659917","url":null,"abstract":"For the indexing and management of large scale video databases an important tool would be the text in the digital media. In this work, the localization performances of the overlay texts using different feature extraction methods with different classifiers are analyzed. Besides that in order to improve the text recognition rate by using multiple hypothesis obtained from multilevel segmentation and using statistical language model are investigated","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115442465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gabor Factor Analysis for 2D+3D Facial Landmark Localization
A. Salah
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659894
We propose a coarse-to-fine method for facial landmark localization that relies on unsupervised modeling of landmark features obtained through different Gabor filter channels. The input to the system is a registered near-frontal 2D and 3D face image pair with background clutter. The system aims at fully automatic detection of seven facial landmarks: the nose tip and the eye and mouth corners. A structural analysis subsystem is employed to detect incorrect landmarks and correct them. We compare our local features with two widely used Gabor-jet-based methods and illustrate their superior performance.
{"title":"Gabor Factor Analysis for 2D+3D Facial Landmark Localization","authors":"A. Salah","doi":"10.1109/SIU.2006.1659894","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659894","url":null,"abstract":"We propose a coarse-to-fme method for facial landmark localization that relies on unsupervised modeling of landmark features obtained through different Gabor filter channels. The input to the system is a registered near-frontal 2D and 3D face image pair, with background clutter. The system aims at complete automatic detection of seven facial landmarks; the nose tip, eye and mouth corners, respectively. A structural analysis subsystem is employed to detect incorrect landmarks and to correct them. We compare our local features with two widely used Gabor jet based methods, and illustrate their superior performance","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"266 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124335115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design of Hybrid Fiber Raman and EDFAs Operated by Optimal Performance
C. Berkdemir, S. Ozsoy
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659897
Optimal-performance calculations of the gain and pump power of hybrid amplifiers consisting of erbium-doped fiber amplifiers (EDFAs) and fiber Raman amplifiers, used to increase the transmission capacity of communication systems, are performed. For a hybrid amplifier system formed from a single-mode EDFA of 18 m length and two fiber Raman amplifiers of 6.5 and 7 km lengths, it is shown that a gain of 3-4 dB can be obtained for long-wavelength applications.
{"title":"Design of Hybrid Fiber Raman and EDFAs Operated by Optimal Performance","authors":"C. Berkdemir, S. Ozsoy","doi":"10.1109/SIU.2006.1659897","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659897","url":null,"abstract":"Optimal performance calculations in point of the gain and the pump power of hybrid amplifiers consisting of erbium-doped fiber amplifiers (EDFAs) and fiber Raman amplifiers for increasing the transmission capacity in communication systems are preformed. In the hybrid amplifier system which is formed from a single-mode EDFA, which has 18 m length, and two fiber Raman amplifiers, which have 6,5 and 7 km lengths, it is shown that the gain of 3-4 dB can be obtained for the long-wavelength applications","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114881402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Tool for Creating Calibrated Images
U. Yilmaz, O. Hellwich
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659836
A synthetic imaging tool that converts three-dimensional model descriptions given in OpenGL into images is described. Camera parameters are also extracted and attached to the images. The conversion of OpenGL matrices to calibration parameters and of calibration parameters to OpenGL matrices is explained in detail. Radial distortion is also modeled so that the images become more realistic. The libraries created in the scope of this work are made publicly available on the Internet at www.cv.tu-berlin.de/~ulas/rarf. When time or photographic equipment is lacking, the tools presented in this study can be valuable for researchers who want to test their surface modeling and calibration algorithms.
{"title":"A Tool for Creating Calibrated Images","authors":"U. Yilmaz, O. Hellwich","doi":"10.1109/SIU.2006.1659836","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659836","url":null,"abstract":"A synthetic imaging tool by which three-dimensional model descriptions given in OpenGL are converted into images, is described. Camera parameters are also extracted and attached to the images. Conversion of OpenGL matrices to calibration parameters and conversion of calibration parameters to OpenGL matrices are explained in detail. Radial distortion is also modeled, so that images become more realistic. The libraries, created in the scope of this work, are made publicly available over Internet under <www.cv.tu-berlin.de/~ulas/rarf>. In the lack of time or photographing equipment, the tools presented in this study would be vital for the researchers who want to test their surface modeling and calibration algorithms.","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"2017 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114697744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language Modelling Approaches for Turkish Large Vocabulary Continuous Speech Recognition Based on Lattice Rescoring
E. Arisoy, M. Saraçlar
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659773
In this paper, we investigate several language modelling approaches for large vocabulary continuous speech recognition (LVCSR) of Turkish. The agglutinative nature of Turkish makes it a challenging language for speech recognition, since it is impossible to include all possible words in the recognition lexicon. Therefore, instead of using words as recognition units, we use a data-driven sub-word approach based on morphs. This method was previously applied to Finnish, Estonian and Turkish, and promising recognition results were achieved compared to word recognition units. On our database, we obtained word error rates (WER) of 38.8% for the baseline word-based model and 33.9% for the baseline morph-based model. In addition, we tried several new methods: recognition lattice outputs of each model were rescored with root-based and root-class-based models for the word-based case, and with a first-morph-based model for the morph-based case. The word-root composition approach achieves a 0.5% improvement in recognition performance; however, the other two approaches fail due to non-robust estimates relative to the baseline models.
{"title":"Language Modelling Approaches for Turkish Large Vocabulary Continuous Speech Recognition Based on Lattice Rescoring","authors":"E. Arisoy, M. Saraçlar","doi":"10.1109/SIU.2006.1659773","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659773","url":null,"abstract":"In this paper, we have tried some language modelling approaches for large vocabulary continuous speech recognition (LVCSR) of Turkish. The agglutinative nature of Turkish makes Turkish a challenging language in terms of speech recognition since it is impossible to include all possible words in the recognition lexicon. Therefore, instead of using words as recognition units, we use a data-driven sub-word approach called morphs. This method was previously applied to Finnish, Estonian and Turkish and promising recognition results were achieved compared to words as recognition units. In our database, we obtained word error rates (WER) of 38.8% for the baseline word-based model and 33.9% for the baseline morph-based model. In addition, we tried some new methods. Recognition lattice outputs of each model were rescored with the root-based and root-class-based models for the word-based case and first-morph-based model for the morph-based case. The word-root composition approach achieves a 0.5% increase in the recognition performance. However, other two approaches fail due to the non-robust estimates over the baseline models","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"81 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124018144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Three-Band Modeling Using Prominent Edges for Face Alignment
F. Kahraman, B. Kurt, M. Gokmen
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659716
A fundamental difficulty in face recognition systems is successful alignment of the human face in the input image. In recent years, model-based approaches have received particular attention, the most powerful of which is the active appearance model (AAM), which constructs a relation between shape and texture. Face alignment methods are required to work well even in the presence of illumination changes and affine transformations. The classical AAM extracts texture and shape information from the training images in the RGB color space, so it can only handle images with the same or a similar color distribution to the training set; it cannot align images obtained under lighting conditions different from those of the training images, even if the same person exists in the training database. In this study, we propose to use features that are less sensitive to illumination changes instead of directly using RGB colors. The proposed AAM is called the three-band AAM; the bands are hue, hill, and luminance. Prominent edge detection constitutes the most important part of the model. Experimental studies show that prominent edges depend much less on illumination changes than the original color space, and that three-band AAM based face alignment outperforms the classical AAM in terms of alignment precision.
{"title":"Three-Band Modeling Using Prominent Edges for Face Alignment","authors":"F. Kahraman, B. Kurt, M. Gokmen","doi":"10.1109/SIU.2006.1659716","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659716","url":null,"abstract":"Fundamental difficulty in face recognition systems is mainly related to successful human face alignment from the input image. In recent years, model based approaches get attention among others. Most powerful method among model-based approaches is known as active appearance model. The method accomplishes this by constructing a relation between shape and texture. Face alignment methods are required to work well even in the presence of illumination and affine transformation. Classical AAM extracts texture and shape information from the training image by using RGB color space. Classical AAM can only handle images having the same or similar color distribution to the images in the training set. Classical AAM cannot align images obtained under different lightning conditions from the training images even if the same person exists in the training database. In this study, we propose to use features which are shown to be less sensitive to illumination changes instead of directly using RGB colors. The proposed AAM is called three-band AAM. The bands are hue, hill, and luminance. Prominent edge detection constitutes the most important part of the model. Experimental studies show that prominent edges do not depend on illumination changes much when compared the original color space, and the three-band AAM based face alignment outperforms the classical AAM in terms of alignment precision","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124019824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reduction of Sensory Inaccuracy in Nonlinear Systems using Particle Filters
H. Bayram, A. Ertuzun, H. Bozma
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659715
In signal processing and control applications, on-line state estimation plays an important role in the stability of the system. In cases where the state and/or measurement functions are highly nonlinear and/or the noise is not Gaussian, conventional filters such as the extended Kalman filter do not provide satisfactory results. In this paper, particle filters and their application to a nonlinear problem are examined.
{"title":"Reduction of Sensory Inaccuracy in Nonlinear Systems using Particle Filters","authors":"H. Bayram, A. Ertuzun, H. Bozma","doi":"10.1109/SIU.2006.1659715","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659715","url":null,"abstract":"In signal processing and control applications, on-line state estimation plays important role in stability of the system. In cases where state and/or measurement functions are highly nonlinear and/or the noise is not Gaussian, conventional filters such as extended Kalman filters do not provide satisfactory results. In this paper, particle filters and its application to a nonlinear problem are examined","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126104334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Illumination Subspaces based Robust Face Recognition
D. Kern, H. K. Ekenel, R. Stiefelhagen
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659916
In this paper, a face recognition system based on illumination subspaces is presented. First, the dominant illumination directions are learned using a clustering algorithm; three main illumination directions are observed: frontal illumination, illumination from the left, and illumination from the right. After determining the dominant illumination direction classes, the face space is divided into these classes in order to separate the variations caused by illumination from the variations caused by different identities. An illumination-subspaces based face recognition approach is then used to benefit from the additional knowledge of the illumination direction. The proposed approach is tested on images from the illumination and lighting subsets of the CMU PIE database. The experimental results show that by utilizing knowledge of the illumination direction and using illumination-subspaces based face recognition, performance is significantly improved.
{"title":"Illumination Subspaces based Robust Face Recognition","authors":"D. Kern, H. K. Ekenel, R. Stiefelhagen, Aydinlanmadan Kaynaklanan","doi":"10.1109/SIU.2006.1659916","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659916","url":null,"abstract":"In this paper a face recognition system that is based on illumination subspaces is presented. In this system, first, the dominant illumination directions are learned using a clustering algorithm. Three main illumination directions are observed: Ones that have frontal illumination, illumination from left and right sides. After determining the dominant illumination direction classes, the face space is divided into these classes to separate the variations caused by illumination from the variations caused by different identities. Then illumination subspaces based face recognition approach is used to benefit from the additional knowledge of the illumination direction. The proposed approach is tested on the images from the illumination and lighting subsets of the CMU PIE database. The experimental results show that by utilizing knowledge of illumination direction and using illumination subspaces based face recognition, the performance is significantly improved","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128448426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Acoustic Echo Cancellation with Adaptive Filtering
Ahmet Refit Kavsaoglu, Yuksel Ozbay
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659783
Echo is formed when the microphone picks up sound from the loudspeaker, and in voice transmission systems this echo severely degrades speech quality and intelligibility. The picked-up sound is sent to the loudspeaker again, and as this process repeats, the amplitude of the echo increases. This study aims to enhance speech intelligibility by cancelling out the echo. The data-transfer software required for real-time processing of voice signals and the adaptive filtering algorithm software for acoustic echo cancellation have been developed.
{"title":"Acoustic Echo Cancellation with Adaptive Filtering","authors":"Yanki Iptali, Ahmet Refit, Kavsaoglu Yuksel Ozbay","doi":"10.1109/SIU.2006.1659783","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659783","url":null,"abstract":"Because echo is formed through picking up sounds from the loud speaker by speaker microphone, echo noise in voice transmission systems severely degrades the speech quality and the speech intelligibility of speech signal. These picked sounds are sent to loudspeaker again. As a result of repetition of this process, the amplitude of the echo increases. In the study, it is aimed to enhance the intelligibility of speech by canceling out the echo noise. In this study, the data transfer software which is necessary for real time processing of voice signals and the adaptive filtering algorithm software for the application of acoustic echo cancellation have been developed","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128548468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detection and Analysis of Quick Phase Eye Movements in Nystagmus (VNG)
A. Aydemir, A. Uneri
Pub Date: 2006-04-17 · DOI: 10.1109/SIU.2006.1659819
Horizontal, vertical and torsional nystagmus are measured using various methods. Videonystagmography (VNG), one of these methods, is based on recording the quick (fast) phase movements of the eye that occur during nystagmus. In this research, an infrared camera was employed to measure the three components of nystagmus in different patients using image-processing algorithms.
{"title":"Detection and Analysis of Quick Phase Eye Movements in Nystagmus (VNG)","authors":"A. Aydemir, A. Uneri","doi":"10.1109/SIU.2006.1659819","DOIUrl":"https://doi.org/10.1109/SIU.2006.1659819","url":null,"abstract":"Measuring horizontal, vertical and torsional nystagmus are performed using various methods. Videonystagmography (VNG), being one of these methods, is based on recording the high phase movements of the eye, which occurs during nystagmus. For this research, an infrared camera was employed to measure the three components of nystagmus on different patients, using image-processing algorithms","PeriodicalId":415037,"journal":{"name":"2006 IEEE 14th Signal Processing and Communications Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130140230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}