
2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI): Latest Publications

Lifelog Semantic Annotation using deep visual features and metadata-derived descriptors
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500247
Bahjat Safadi, P. Mulhem, G. Quénot, J. Chevallet
This paper describes a method for querying lifelog data from visual content and from metadata associated with the recorded images. Our approach mainly relies on mapping the query terms to visual concepts computed on the lifelog images according to two separate learning schemes based on the use of deep visual features. A post-processing step is then performed if the topic is related to time, location, or activity information associated with the images. This work was evaluated in the context of the Lifelog Semantic Access sub-task of NTCIR-12 (2016). The results obtained are promising for a first participation in such a task, with an event-based MAP above 29% and an event-based nDCG value close to 39%.
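As a rough illustration of the approach described above (mapping query terms to visual concept scores and post-filtering on time and location metadata), here is a minimal Python sketch. It is not the authors' implementation; the concept names, scores, and metadata fields are hypothetical.

```python
# Hedged sketch: rank lifelog images by deep-model concept scores matching the
# query terms, then post-filter on metadata (time, location). All data below is
# invented for illustration.
from datetime import datetime

# Hypothetical per-image annotations: concept -> detection score from a deep visual model
images = {
    "img_001.jpg": {"concepts": {"food": 0.91, "screen": 0.12},
                    "time": datetime(2016, 3, 1, 12, 30), "location": "restaurant"},
    "img_002.jpg": {"concepts": {"food": 0.05, "screen": 0.88},
                    "time": datetime(2016, 3, 1, 9, 10), "location": "office"},
}

def search(query_concepts, hour_range=None, location=None):
    """Rank images by summed concept scores, then filter on metadata."""
    results = []
    for name, info in images.items():
        if hour_range and not (hour_range[0] <= info["time"].hour <= hour_range[1]):
            continue
        if location and info["location"] != location:
            continue
        score = sum(info["concepts"].get(c, 0.0) for c in query_concepts)
        results.append((name, score))
    return sorted(results, key=lambda r: r[1], reverse=True)

print(search(["food"], hour_range=(11, 14)))  # e.g. "eating lunch" moments
```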
Citations: 0
A Demo of multimodal medical retrieval
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500263
Ranveer Joyseeree, Roger Schaer, H. Müller
Providing personalized medical care based on a patient's specific characteristics (diagnostic-image content, age, sex, weight, and so on) is an important aspect of modern medicine. This paper describes tools that aim to facilitate this process by providing clinicians with information regarding the diagnosis and treatment of past patients with similar characteristics. The additional information thus provided can help make better-informed decisions with regard to the diagnosis and treatment planning of new patients. Two existing tools, Shambala and Shangri-La, can be combined for use within a clinical environment. Deployment inside healthcare facilities can become possible via the MD-Paedigree project.
Citations: 0
Experimenting with musically motivated convolutional neural networks
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500246
Jordi Pons, T. Lidy, Xavier Serra
A common criticism of deep learning relates to the difficulty in understanding the underlying relationships that the neural networks are learning, thus behaving like a black box. In this article we explore various architectural choices of relevance for music signal classification tasks in order to start understanding what the chosen networks are learning. We first discuss how convolutional filters with different shapes can fit specific musical concepts, and based on that we propose several musically motivated architectures. These architectures are then assessed by measuring the accuracy of the deep learning model in the prediction of various music classes using a known dataset of audio recordings of ballroom music. The classes in this dataset have a strong correlation with tempo, which allows assessing whether the proposed architectures are learning frequency and/or time dependencies. Additionally, a black-box model is proposed as a baseline for comparison. With these experiments we have been able to understand what some deep learning based algorithms can learn from a particular set of data.
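The core idea of musically motivated filter shapes can be pictured with a short PyTorch sketch: kernels that are wide in time target temporal patterns such as rhythm and tempo, while kernels that are tall in frequency target timbral patterns. This is a toy under stated assumptions (kernel sizes, channel counts, and pooling are placeholders), not the architecture from the paper.

```python
# Hedged sketch of musically motivated convolutional filter shapes (PyTorch).
import torch
import torch.nn as nn

class MusicMotivatedCNN(nn.Module):
    def __init__(self, n_classes=8):
        super().__init__()
        # Input: (batch, 1, n_mel_bands, n_frames), e.g. a log-mel spectrogram
        self.temporal = nn.Conv2d(1, 16, kernel_size=(1, 60))  # 1 band x 60 frames: rhythm/tempo
        self.timbral = nn.Conv2d(1, 16, kernel_size=(32, 1))   # 32 bands x 1 frame: timbre
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):
        # Global max-pool each branch so the differently shaped feature maps can be concatenated
        t = torch.amax(torch.relu(self.temporal(x)), dim=(2, 3))
        f = torch.amax(torch.relu(self.timbral(x)), dim=(2, 3))
        return self.fc(torch.cat([t, f], dim=1))

model = MusicMotivatedCNN()
spec = torch.randn(4, 1, 40, 200)   # 4 clips, 40 mel bands, 200 frames
print(model(spec).shape)            # torch.Size([4, 8])
```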
Citations: 130
Real-time multilevel sequencing of cataract surgery videos
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500245
K. Charrière, G. Quellec, M. Lamard, D. Martiano, G. Cazuguel, G. Coatrieux, B. Cochener
Data recorded and stored during video-monitored surgeries are a relevant source of information for surgeons, especially during their training period. But today, this data is virtually unexploited. In this paper, we propose to reuse videos recorded during cataract surgeries to automatically analyze the surgical process under a real-time constraint, with the aim of assisting the surgeon during the surgery. We propose to automatically recognize, in real time, what the surgeon is doing: what surgical phase or, more precisely, what surgical step he or she is performing. This recognition relies on the inference of a multilevel statistical model which uses 1) the conditional relations between levels of description (steps and phases) and 2) the temporal relations among steps and among phases. The model accepts two types of inputs: 1) the presence of surgical instruments, manually provided by the surgeons, or 2) motion in videos, automatically analyzed through the CBVR paradigm. A dataset of 30 cataract surgery videos was collected at Brest University Hospital. The system was evaluated in terms of mean area under the ROC curve. Promising results were obtained using either motion analysis (Az = 0.759) or the presence of surgical instruments (Az = 0.983).
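Purely as an illustration of inferring steps from instrument presence and mapping steps to a coarser phase level, here is a rule-based Python simplification; it is not the authors' multilevel statistical model, and the instrument, step, and phase names are hypothetical.

```python
# Hedged sketch: map visible instruments to a surgical step, map steps to phases,
# and smooth labels over time. A stand-in for the statistical inference in the paper.
from collections import Counter

STEP_FROM_INSTRUMENTS = {
    frozenset({"knife"}): "incision",
    frozenset({"phaco_handpiece"}): "phacoemulsification",
    frozenset({"injector"}): "implantation",
}
PHASE_FROM_STEP = {"incision": "opening",
                   "phacoemulsification": "lens_removal",
                   "implantation": "closing"}

def smooth(labels, window=3):
    """Majority vote over a sliding window to suppress spurious single-frame labels."""
    half = window // 2
    return [Counter(labels[max(0, i - half): i + half + 1]).most_common(1)[0][0]
            for i in range(len(labels))]

frames = [{"knife"}, {"knife"}, {"phaco_handpiece"}, {"knife"},
          {"phaco_handpiece"}, {"phaco_handpiece"}, {"injector"}]
steps = smooth([STEP_FROM_INSTRUMENTS.get(frozenset(f), "idle") for f in frames])
phases = [PHASE_FROM_STEP.get(s, "unknown") for s in steps]
print(list(zip(steps, phases)))
```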
Citations: 8
Static and dynamic autopsy of deep networks
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500267
Titouan Lorieul, Antoine Ghorra, B. Mérialdo
Although deep learning has been a major breakthrough in recent years, Deep Neural Networks (DNNs) are still the subject of intense research, and many issues remain on how to use them efficiently. In particular, training a Deep Network remains a difficult process which requires extensive computation, and for which very precise care has to be taken to avoid overfitting, a high risk because of the extremely large number of parameters. The purpose of our work is to perform an autopsy of pre-trained Deep Networks, with the objective of collecting information about the values of the various parameters and their possible relations and correlations. The motivation is that some of these observations could later be used as a priori knowledge to facilitate the training of new networks, by guiding the exploration of the parameter space into more probable areas. In this paper, we first present a static analysis of the AlexNet Deep Network by computing various statistics on the existing parameter values. Then, we perform a dynamic analysis by measuring the effect of certain modifications of those values on the performance of the network. For example, we show that quantizing the values of the parameters to a small adequate set of values leads to similar performance as the original network. These results suggest that pursuing such studies could lead to the design of improved training procedures for Deep Networks.
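The quantization experiment mentioned at the end can be pictured with a small NumPy sketch on a toy weight matrix: map every weight to the nearest of a few levels and compare the layer's output before and after. The number of levels and the toy layer are assumptions; the paper performs this kind of analysis on AlexNet parameters.

```python
# Hedged sketch: quantize weights to a small set of values and measure the effect.
import numpy as np

def quantize_to_levels(weights, n_levels=8):
    """Map each weight to the nearest of n_levels values spread between min and max."""
    levels = np.linspace(weights.min(), weights.max(), n_levels)
    idx = np.abs(weights[..., None] - levels).argmin(axis=-1)
    return levels[idx]

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(256, 4096))   # toy fully connected layer
x = rng.normal(size=(4096,))
wq = quantize_to_levels(w, n_levels=8)

print("unique weight values after quantization:", np.unique(wq).size)
print("relative change of the layer output:",
      np.linalg.norm(w @ x - wq @ x) / np.linalg.norm(w @ x))
```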
Citations: 3
Crowdsourcing as self-fulfilling prophecy: Influence of discarding workers in subjective assessment tasks
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500256
M. Riegler, V. Reddy, M. Larson, Ragnhild Eg, P. Halvorsen, C. Griwodz
Crowdsourcing has established itself as a powerful tool for multimedia researchers, and is commonly used to collect human input for various purposes. It is also a fairly widespread practice to control the contributions of users based on the quality of their input. This paper points to the fact that applying this practice in subjective assessment tasks may lead to an undesired, negative outcome. We present a crowdsourcing experiment and a discussion of the ways in which control in crowdsourcing studies can lead to a phenomenon akin to a self-fulfilling prophecy. This paper is intended to trigger discussion and lead to more deeply reflective crowdsourcing practices in the multimedia context.
Citations: 15
Prediction of visual attention with Deep CNN for studies of neurodegenerative diseases
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500243
S. Chaabouni, F. Tison, J. Benois-Pineau, C. Amar
As part of the automatic study of visual attention in populations affected by neurodegenerative diseases, and in order to predict whether new gaze recordings reveal a complaint of these diseases, we need to design an automatic model that predicts salient areas in video. Past research showed that people suffering from dementia are not reactive with regard to degradations in still images. In this paper we study the reaction of healthy normal control subjects to degraded areas in videos. Furthermore, with the goal of building an automatic prediction model for salient areas in intentionally degraded videos, we design a deep learning architecture and measure its performance when predicting salient regions on completely unseen data. The obtained results are interesting regarding the reaction of normal control subjects to degraded areas in video.
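A dense saliency predictor of this general kind can be sketched, under heavy assumptions, as a small fully convolutional network that outputs one saliency value per pixel of a frame; this is not the architecture designed in the paper.

```python
# Hedged sketch (PyTorch): a minimal fully convolutional saliency-map predictor.
import torch
import torch.nn as nn

class SaliencyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(32, 1, 1)   # one saliency value per pixel

    def forward(self, frame):
        return torch.sigmoid(self.head(self.features(frame)))

frame = torch.rand(1, 3, 120, 160)        # one RGB video frame
print(SaliencyNet()(frame).shape)         # torch.Size([1, 1, 120, 160])
```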
Citations: 6
Temporal segmentation of laparoscopic videos into surgical phases
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500249
Manfred Jürgen Primus, Klaus Schöffmann, L. Böszörményi
Videos of laparoscopic surgeries need to be segmented temporally into phases so that surgeons can use the recordings efficiently in their everyday work. In this paper we investigate the performance of an automatic phase segmentation method based on instrument detection and recognition. Contrary to known methods that dynamically align phases to an annotated dataset, our method is not limited to standardized or unvarying endoscopic procedures. Phases of laparoscopic procedures show a high correlation with the presence of one or a group of certain instruments. Therefore, the first step of our procedure is the definition of a set of rules that describe these correlations. The next step is the spatial detection of instruments using a color-based segmentation method and a rule-based interpretation of image moments for the refinement of the detections. Finally, the detected regions are recognized with SVM classifiers and ORB features. The evaluation shows that the proposed technique finds phases in laparoscopic videos of cholecystectomies reliably.
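The ingredients of this pipeline (color-based segmentation, a rule on image moments, ORB features, SVM classification) can be pictured with a short OpenCV/scikit-learn sketch. The thresholds, the eccentricity rule, and the pooling of ORB descriptors into a fixed-length vector are assumptions, not the authors' exact method.

```python
# Hedged sketch of the pipeline's building blocks (OpenCV + scikit-learn).
import cv2
import numpy as np
from sklearn.svm import SVC

def candidate_mask(bgr):
    """Color-based segmentation: instruments are mostly grey, i.e. low saturation."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    return cv2.inRange(hsv, (0, 0, 60), (180, 60, 255))

def is_elongated(mask, min_area=500.0):
    """Rule on image moments: keep large, elongated blobs (instrument shafts)."""
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] < min_area:
        return False
    cov = np.array([[m["mu20"], m["mu11"]], [m["mu11"], m["mu02"]]]) / m["m00"]
    eigvals = np.linalg.eigvalsh(cov)
    return eigvals[1] / max(eigvals[0], 1e-6) > 4.0   # high eccentricity

def orb_feature(gray, n_keypoints=100):
    """Fixed-length region descriptor: mean of ORB descriptors (pooling is an assumption)."""
    orb = cv2.ORB_create(nfeatures=n_keypoints)
    _, desc = orb.detectAndCompute(gray, None)
    if desc is None:
        return np.zeros(32, dtype=np.float32)
    return desc.mean(axis=0).astype(np.float32)

# Toy training data standing in for features of labelled region crops
rng = np.random.default_rng(0)
X = rng.random((20, 32), dtype=np.float32)
y = rng.integers(0, 2, 20)                 # hypothetical classes: 0 = grasper, 1 = scissors
clf = SVC(kernel="rbf").fit(X, y)

frame = (rng.random((240, 320, 3)) * 255).astype(np.uint8)   # stand-in video frame
mask = candidate_mask(frame)
if is_elongated(mask):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    print("predicted instrument class:", clf.predict([orb_feature(gray)])[0])
else:
    print("no instrument-like region in this frame")
```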
Citations: 27
Interactive exploration of healthcare queries
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500275
A. Bampoulidis, M. Lupu, João Palotti, S. Metallidis, J. Brassey, A. Hanbury
Healthcare-related queries are a treasure trove of information about the information needs of domain users, be they patients or doctors. However, unlike general queries, in order to make the most of the information therein, such queries have to be processed within a medical terminology annotation pipeline. We show how this has been done in the context of the KConnect project and demonstrate an interactive query log exploration interface that allows data analysts and search engineers to better understand their users and design a better search experience.
Citations: 2
A dataset of multimedia material about classical music: PHENICX-SMM
Pub Date : 2016-06-15 DOI: 10.1109/CBMI.2016.7500240
M. Schedl, D. Hauger, M. Tkalcic, M. Melenhorst, Cynthia C. S. Liem
We present a freely available dataset of multimedia material that can be used to build enriched browsing and retrieval systems for music. It is one result of the EU-FP7-funded project "Performances as Highly Enriched aNd Interactive Concert experiences" (PHENICX), which aims at enhancing the listener experience when enjoying classical music. The presented PHENICX-SMM dataset includes in total more than 50,000 multimedia items (text, image, audio) about composers, performers, pieces, and instruments. In addition to presenting the dataset, we detail one possible use case, that of building a personalized music information system that suggests certain types and quantities of multimedia material based on the personality traits and musical experience of its users. We evaluate the system via a user study and show that people generally prefer the personalized results over non-personalized ones.
Citations: 3