Active Appearance Model Based Contour Extraction for MRI Images of Human Tongue

Zhi-cheng Liu, Qilong Sun, J. Wei
DOI: 10.2991/MASTA-19.2019.24
Published in: Proceedings of the 2019 International Conference on Modeling, Analysis, Simulation Technologies and Applications (MASTA 2019), Advances in Intelligent Systems Research, volume 168. Publication date: 2019-07-01.
Copyright © 2019, the Authors. Published by Atlantis Press. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Abstract

In this article, we present results on the automatic extraction of speech articulator contours from Magnetic Resonance Imaging (MRI) movies using the Active Appearance Model (AAM). An AAM-based framework is proposed to handle the highly nonlinear deformation of the articulators during speech, and it shows an advantage in tracking articulator shapes in noisy MRI images. Vocal tract contours were extracted from MRI movies of Chinese subjects. The framework was evaluated by comparing manually labeled contours with automatically extracted ones. The average error is around 2.1 pixels.

Introduction

Speech is one of the most important functions of human communication. However, the mechanism of speech production is far from fully understood. The morphological and dynamic aspects of the speech organs are essential for understanding speech dynamics, so advanced imaging and image processing technologies are important for this research field. MRI can produce high-resolution images of the human articulators. This makes MRI one of the most promising tools currently available for speech research, and it has been widely used to study speech production [1-3].

A number of MRI databases of the human speech organs are available for various purposes. Using such databases, however, requires that the desired speech organs first be extracted from the images. A large variety of algorithms have been developed over the last few decades to address this problem [4-6]. They fall mainly into two categories: data-driven approaches, such as snake-like methods, and model-driven approaches, which use prior knowledge to complete the task. Both categories have their own pros and cons. A data-driven approach needs an initial shape for every image before extraction, so it cannot be fully automatic. A model-based approach must be trained on a training set that has been labeled manually beforehand.

The AAM is a model-based approach that has shown great promise for automatically tracking objects in images. Because an MRI speech database contains a large number of images recording articulatory movements, it is worthwhile to label a small training set in order to extract the shapes from the remaining images automatically. The AAM, developed by Cootes et al. [7-10], is a statistical point distribution model (PDM) and has demonstrated its capability for image segmentation [11]. It automatically learns the parameters of the PDM from sets of corresponding landmarks and incorporates both shape and boundary gray-level information. An AAM describes the image appearance and shape of the object of interest through a statistical shape-appearance model obtained from a training set. When applied to image interpretation or segmentation, the AAM tunes the model parameters to minimize the difference between the image synthesized from the model and an unseen image.

The AAM has demonstrated high robustness in cardiac MRI segmentation and in facial feature extraction. Articulators such as the tongue, soft palate, and lips, however, are far more deformable than the face or the heart. In this research we adopt the AAM as a means of extracting tongue and palate contours from MRI image sequences, as well as the contours of the upper and lower lips in profile view.
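
To make the point distribution model and the contour comparison more concrete, the following is a minimal sketch and not the authors' implementation. It covers only the shape part of an AAM (the gray-level appearance model and the iterative parameter search are omitted). It assumes the landmark contours have already been aligned to a common frame, uses synthetic arc-shaped contours in place of real tongue landmarks, and takes the mean point-to-point distance in pixels as the error measure, which is one plausible reading of the reported average error. All function names and parameter values are hypothetical.

```python
import numpy as np


def build_shape_model(shapes, var_kept=0.95):
    """Fit a linear point distribution model by PCA.

    shapes: array (n_samples, n_points, 2) of landmark contours that are
    assumed to be pre-aligned (e.g. by Procrustes analysis).
    Returns the mean shape vector, the retained modes P, and their variances.
    """
    X = shapes.reshape(len(shapes), -1)            # flatten each contour
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    eigvals = s ** 2 / (len(X) - 1)                # variance of each mode
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(ratio, var_kept) + 1)  # fewest modes reaching var_kept
    return mean, Vt[:k].T, eigvals[:k]


def project(shape, mean, P):
    """Shape parameters b of a contour under the model x ~ mean + P b."""
    return P.T @ (shape.ravel() - mean)


def reconstruct(b, mean, P):
    """Contour (n_points, 2) synthesized from shape parameters b."""
    return (mean + P @ b).reshape(-1, 2)


def mean_point_error(auto, manual):
    """Mean Euclidean distance between corresponding landmarks, in pixels."""
    return float(np.linalg.norm(auto - manual, axis=1).mean())


# Synthetic stand-in for labeled contours: an arc with one deformation mode.
rng = np.random.default_rng(0)
t = np.linspace(0, np.pi, 20)
base = np.stack([50 + 30 * np.cos(t), 60 - 25 * np.sin(t)], axis=1)
mode = np.stack([np.zeros_like(t), -np.sin(t)], axis=1)
weights = rng.normal(0, 5, size=30)
train = (base + weights[:, None, None] * mode
         + rng.normal(0, 0.2, size=(30, 20, 2)))

mean, P, lam = build_shape_model(train)

# Treat an unseen contour as the "manual" label and its model fit as "auto".
manual = base + 4.0 * mode
auto = reconstruct(project(manual, mean, P), mean, P)
err = mean_point_error(auto, manual)
print(f"modes kept: {P.shape[1]}, mean point error: {err:.3f} px")
```

In a full AAM the shape parameters would not be obtained by projecting a known contour; they would be tuned iteratively so that the image synthesized from the model matches the unseen MRI frame, and the resulting contour would then be compared against the manual label in the same way.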