Audio-Visual Kinship Verification in the Wild

Xiaoting Wu, Eric Granger, T. Kinnunen, Xiaoyi Feng, A. Hadid
{"title":"Audio-Visual Kinship Verification in the Wild","authors":"Xiaoting Wu, Eric Granger, T. Kinnunen, Xiaoyi Feng, A. Hadid","doi":"10.1109/ICB45273.2019.8987241","DOIUrl":null,"url":null,"abstract":"Kinship verification is a challenging problem, where recognition systems are trained to establish a kin relation between two individuals based on facial images or videos. However, due to variations in capture conditions (background, pose, expression, illumination and occlusion), state-of-the-art systems currently provide a low level of accuracy. As in many visual recognition and affective computing applications, kinship verification may benefit from a combination of discriminant information extracted from both video and audio signals. In this paper, we investigate for the first time the fusion audio-visual information from both face and voice modalities to improve kinship verification accuracy. First, we propose a new multi-modal kinship dataset called TALking KINship (TALKIN), that is comprised of several pairs of video sequences with subjects talking. State-of-the-art conventional and deep learning models are assessed and compared for kinship verification using this dataset. Finally, we propose a deep Siamese network for multi-modal fusion of kinship relations. Experiments with the TALKIN dataset indicate that the proposed Siamese network provides a significantly higher level of accuracy over baseline uni-modal and multi-modal fusion techniques for kinship verification. Results also indicate that audio (vocal) information is complementary and useful for kinship verification problem.","PeriodicalId":430846,"journal":{"name":"2019 International Conference on Biometrics (ICB)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Biometrics (ICB)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICB45273.2019.8987241","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

Kinship verification is a challenging problem, in which recognition systems are trained to establish a kin relation between two individuals based on facial images or videos. However, due to variations in capture conditions (background, pose, expression, illumination and occlusion), state-of-the-art systems currently provide a low level of accuracy. As in many visual recognition and affective computing applications, kinship verification may benefit from a combination of discriminant information extracted from both video and audio signals. In this paper, we investigate for the first time the fusion of audio-visual information from both face and voice modalities to improve kinship verification accuracy. First, we propose a new multi-modal kinship dataset called TALking KINship (TALKIN), comprising several pairs of video sequences with subjects talking. State-of-the-art conventional and deep learning models are assessed and compared for kinship verification using this dataset. Finally, we propose a deep Siamese network for multi-modal audio-visual fusion in kinship verification. Experiments with the TALKIN dataset indicate that the proposed Siamese network provides a significantly higher level of accuracy than baseline uni-modal and multi-modal fusion techniques for kinship verification. Results also indicate that audio (vocal) information is complementary and useful for the kinship verification problem.
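The abstract does not detail the fusion architecture, so the following is only a minimal illustrative sketch of a Siamese audio-visual fusion network for kinship verification, not the authors' implementation. It assumes pre-extracted face and voice feature vectors as inputs, feature-level fusion by concatenation, and arbitrary layer sizes; all names and dimensions are hypothetical.

```python
# Illustrative sketch (NOT the paper's model): a Siamese network that fuses
# face and voice embeddings of two subjects and scores their kin relation.
# Input dimensions (512 for face, 256 for voice) are assumptions.
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Maps a single-modality feature vector into a shared embedding space."""
    def __init__(self, in_dim: int, emb_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class AudioVisualSiamese(nn.Module):
    """Siamese fusion network: shared encoders process both subjects,
    and a classifier scores the fused pair representation."""
    def __init__(self, face_dim: int = 512, voice_dim: int = 256, emb_dim: int = 128):
        super().__init__()
        # Shared (Siamese) encoders: the same weights embed both subjects.
        self.face_enc = ModalityEncoder(face_dim, emb_dim)
        self.voice_enc = ModalityEncoder(voice_dim, emb_dim)
        # Classifier on the concatenated pair representation (feature-level fusion).
        self.classifier = nn.Sequential(
            nn.Linear(4 * emb_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def embed(self, face: torch.Tensor, voice: torch.Tensor) -> torch.Tensor:
        # Concatenate the face and voice embeddings of one subject.
        return torch.cat([self.face_enc(face), self.voice_enc(voice)], dim=1)

    def forward(self, face1, voice1, face2, voice2) -> torch.Tensor:
        e1 = self.embed(face1, voice1)
        e2 = self.embed(face2, voice2)
        pair = torch.cat([e1, e2], dim=1)          # joint pair representation
        return torch.sigmoid(self.classifier(pair)).squeeze(1)  # kin probability


if __name__ == "__main__":
    model = AudioVisualSiamese()
    face1, face2 = torch.randn(4, 512), torch.randn(4, 512)
    voice1, voice2 = torch.randn(4, 256), torch.randn(4, 256)
    print(model(face1, voice1, face2, voice2))      # 4 kin-probability scores
```

In this sketch, fusion happens by concatenating the modality embeddings before the classifier; other choices (score-level fusion of separate face and voice verifiers, or a contrastive/metric loss on the pair embeddings) are equally plausible readings of the abstract.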