Active Output Selection Strategies for Multiple Learning Regression Models
A. Prochaska, J. Pillas, B. Bäker
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-11-29 | DOI: 10.5220/0010181501500157
Active learning shows promise for decreasing test-bench time in model-based drivability calibration. This paper presents a new strategy for active output selection that suits the needs of calibration tasks. The strategy actively learns multiple outputs in the same input space, choosing the output model with the highest cross-validation error as the leading one. The presented method is applied to three toy examples with realistic noise levels and to a benchmark dataset, and the results are analyzed and compared to other existing strategies. In a best-case scenario, the presented strategy reduces the number of measured points by up to 30% compared to a sequential space-filling design while outperforming other existing active learning strategies. The results are promising but also show that the algorithm needs further improvement to be robust in noisy environments. Further research will focus on improving the algorithm and applying it to a real-world example.
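The leading-output rule described above (learn several outputs over one input space, and let the output whose model currently has the highest cross-validation error lead the point selection) can be sketched in plain NumPy. The least-squares models, the k-fold split, and the function names below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def kfold_cv_mse(X, y, k=5):
    """k-fold cross-validated MSE of an ordinary least-squares fit."""
    idx = np.arange(len(y))
    folds = np.array_split(idx, k)
    errs = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        # least-squares fit on the training folds (with a bias column)
        A = np.c_[X[train], np.ones(len(train))]
        w, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.c_[X[fold], np.ones(len(fold))] @ w
        errs.append(np.mean((pred - y[fold]) ** 2))
    return float(np.mean(errs))

def leading_output(X, Y, k=5):
    """Index of the output whose model currently has the highest CV error."""
    return int(np.argmax([kfold_cv_mse(X, Y[:, j], k) for j in range(Y.shape[1])]))
```

An output that the current model class explains poorly (here, a nonlinear, noisy target under a linear fit) gets the highest CV error and is therefore selected as leading.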
Upgraded W-Net with Attention Gates and its Application in Unsupervised 3D Liver Segmentation
Dhanunjaya Mitta, S. Chatterjee, O. Speck, A. Nürnberger
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-11-20 | DOI: 10.5220/0010221504880494
Segmentation of biomedical images can help radiologists make better diagnoses and take decisions faster by supporting the detection of abnormalities such as tumors. Manual or semi-automated segmentation, however, is a time-consuming task. Most deep-learning-based automated segmentation methods are supervised and rely on manually segmented ground truth. A possible solution is an unsupervised deep-learning approach to automated segmentation, which this work addresses. We modify a W-Net architecture so that it can be applied to 3D volumes and, to suppress noise in the segmentation, add attention gates to the skip connections. The loss for the segmentation output is calculated using soft N-Cuts and for the reconstruction output using SSIM; Conditional Random Fields are used as a post-processing step to fine-tune the results. The proposed method shows promising results, with a Dice coefficient of 0.88 for liver segmentation compared against manual segmentation.
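The reconstruction branch above is trained with SSIM. The standard global SSIM formula (means, variances, and covariance with the usual stabilizing constants) can be written directly in NumPy; the paper's training loss is presumably a windowed, differentiable variant inside the deep-learning framework, so this is only the textbook form:

```python
import numpy as np

def ssim(x, y, L=1.0, k1=0.01, k2=0.03):
    """Global SSIM between two images with dynamic range L."""
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))
```

SSIM is 1 only for identical images and drops as structure diverges, which is why 1 - SSIM works as a reconstruction loss.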
Interpreting convolutional networks trained on textual data
Reza Marzban, C. Crick
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-10-20 | DOI: 10.5220/0010205901960203
There have been many advances in artificial intelligence due to the emergence of deep learning; in almost all sub-fields, artificial neural networks have reached or exceeded human-level performance. However, most of these models are not interpretable, so it is hard to trust their decisions, especially in life-and-death scenarios. In recent years there has been a movement toward explainable artificial intelligence, but most work to date has concentrated on image-processing models, as it is easier for humans to perceive visual patterns; there has been little work in other fields such as natural language processing. In this paper, we train a convolutional model on textual data and analyze the global logic of the model by studying its filter values. We then find the words in our corpus that are most important to the model's logic and remove the rest (95%). New models trained on just the 5% most important words achieve the same performance as the original model while reducing training time by more than half. Approaches such as this will help us understand NLP models, explain their decisions according to their word choices, and improve them by finding blind spots and biases.
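A toy version of ranking vocabulary words by convolutional-filter response and keeping only a top fraction might look like this. The one-word-wide filters, the one-hot embeddings in the usage example, and the max-response scoring rule are hypothetical simplifications for illustration, not the authors' exact procedure:

```python
import numpy as np

def top_words_by_filter_response(embeddings, filters, keep_frac=0.05):
    """Score each vocabulary word by its strongest absolute filter
    activation and return indices of the top keep_frac words.
    embeddings: (vocab, dim), filters: (n_filters, dim)."""
    responses = np.abs(embeddings @ filters.T)   # (vocab, n_filters)
    scores = responses.max(axis=1)               # strongest response per word
    n_keep = max(1, int(len(scores) * keep_frac))
    return np.argsort(scores)[::-1][:n_keep]
```

With the surviving word indices, one would rebuild the corpus from only those words and retrain, as the paper describes.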
MetaBox+: A new Region Based Active Learning Method for Semantic Segmentation using Priority Maps
P. Colling, L. Roese-Koerner, H. Gottschalk, M. Rottmann
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-10-05 | DOI: 10.5220/0010227500510062
We present a novel region-based active learning method for semantic image segmentation, called MetaBox+. For acquisition, we train a meta regression model to estimate the segment-wise Intersection over Union (IoU) of each predicted segment of unlabeled images, which can be understood as an estimate of segment-wise prediction quality. Queried regions are chosen to minimize two competing targets: low predicted IoU values (segmentation quality) and low estimated annotation costs. To estimate the latter, we propose a simple but practical method for annotation cost estimation. We compare our method to entropy-based methods, where the entropy is taken as the uncertainty of the prediction; the comparison and analysis of the results provide insights into annotation costs as well as the robustness and variance of the methods. Numerical experiments conducted with two different networks on the Cityscapes dataset clearly demonstrate a reduction of annotation effort compared to random acquisition. Notably, MetaBox+ achieves 95% of the mean Intersection over Union (mIoU) obtained when training with the full dataset, with only 10.47% and 32.01% annotation effort for the two networks, respectively.
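A minimal sketch of the two-target acquisition step, assuming a simple scalar priority of (1 - predicted IoU) / estimated cost and a greedy budgeted selection; the paper's actual priority maps and region handling are more elaborate than this:

```python
import numpy as np

def query_regions(pred_iou, est_cost, budget):
    """Rank candidate regions by (1 - predicted IoU) / estimated cost and
    greedily select until the annotation budget is exhausted."""
    priority = (1.0 - pred_iou) / est_cost   # poor quality and cheap => high priority
    order = np.argsort(priority)[::-1]       # best candidates first
    chosen, spent = [], 0.0
    for i in order:
        if spent + est_cost[i] <= budget:
            chosen.append(int(i))
            spent += est_cost[i]
    return chosen
```

Regions with low estimated quality and low annotation cost are queried first, which is the trade-off the abstract describes.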
Automated Detection of COVID-19 from CT Scans Using Convolutional Neural Networks
R. Lokwani, A. Gaikwad, V. Kulkarni, Aniruddha Pant, A. Kharat
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-06-23 | DOI: 10.5220/0010293605650570
COVID-19 is an infectious disease that causes respiratory problems similar to those caused by SARS-CoV (2003). Currently, swab samples are used for its diagnosis; the most common testing method, RT-PCR, has high specificity but variable sensitivity. AI-based detection has the capability to overcome this drawback. In this paper, we propose a method that uses chest CT scans to diagnose patients with COVID-19 pneumonia. We train our model on a set of open-source images, available as individual CT slices, and on full CT scans from a private Indian hospital. We build a 2D segmentation model using the U-Net architecture, which produces its output by marking out the region of infection. Our model achieves a sensitivity of 96.428% (95% CI: 88%-100%) and a specificity of 88.39% (95% CI: 82%-94%). Additionally, we derive a rule for converting our slice-level predictions to scan-level predictions, which helps us reduce false positives.
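One plausible form of the slice-to-scan conversion is to call a scan positive only when enough adjacent slices are positive, which suppresses isolated false-positive slices. The abstract does not spell out the authors' rule, so the `min_consecutive` threshold and run-length logic here are assumptions:

```python
def scan_positive(slice_preds, min_consecutive=3):
    """Aggregate per-slice binary predictions to a scan-level decision:
    positive only if at least `min_consecutive` adjacent slices are positive."""
    run = 0
    for p in slice_preds:
        run = run + 1 if p else 0   # extend or reset the current positive run
        if run >= min_consecutive:
            return True
    return False
```

Scattered single positive slices no longer flip the scan-level label, which is exactly how such a rule reduces false positives.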
Pub Date : 2020-03-18DOI: 10.5220/0008973405100517
T. Manojlović, Dino Ilic, D. Miletic, Ivan Štajduhar
: The data stored in a Picture Archiving and Communication System (PACS) of a clinical centre normally consists of medical images recorded from patients using select imaging techniques, and stored metadata information concerning the details on the conducted diagnostic procedures - the latter being commonly stored using the Digital Imaging and Communications in Medicine (DICOM) standard. In this work, we explore the possibility of utilising DICOM tags for automatic annotation of PACS databases, using K -medoids clustering. We gather and analyse DICOM data of medical radiology images available as a part of the RadiologyNet database, which was built in 2017, and originates from the Clinical Hospital Centre Rijeka, Croatia. Following data preprocessing, we used K -medoids clustering for multiple values of K , and we chose the most appropriate number of clusters based on the silhouette score . Next, for evaluating the clustering performance with regard to the visual similarity of images, we trained an autoencoder from a non-overlapping set of images. That way, we estimated the visual similarity of pixel data clustered by DICOM tags. Paired t-test ( p < 0 . 001) suggests a significant difference between the mean distance from cluster centres of images clustered by DICOM tags, and randomly-permuted cluster labels.
{"title":"Using DICOM Tags for Clustering Medical Radiology Images into Visually Similar Groups","authors":"T. Manojlović, Dino Ilic, D. Miletic, Ivan Štajduhar","doi":"10.5220/0008973405100517","DOIUrl":"https://doi.org/10.5220/0008973405100517","url":null,"abstract":": The data stored in a Picture Archiving and Communication System (PACS) of a clinical centre normally consists of medical images recorded from patients using select imaging techniques, and stored metadata information concerning the details on the conducted diagnostic procedures - the latter being commonly stored using the Digital Imaging and Communications in Medicine (DICOM) standard. In this work, we explore the possibility of utilising DICOM tags for automatic annotation of PACS databases, using K -medoids clustering. We gather and analyse DICOM data of medical radiology images available as a part of the RadiologyNet database, which was built in 2017, and originates from the Clinical Hospital Centre Rijeka, Croatia. Following data preprocessing, we used K -medoids clustering for multiple values of K , and we chose the most appropriate number of clusters based on the silhouette score . Next, for evaluating the clustering performance with regard to the visual similarity of images, we trained an autoencoder from a non-overlapping set of images. That way, we estimated the visual similarity of pixel data clustered by DICOM tags. Paired t-test ( p < 0 . 
001) suggests a significant difference between the mean distance from cluster centres of images clustered by DICOM tags, and randomly-permuted cluster labels.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130806222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
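The clustering pipeline above (K-medoids on a precomputed distance matrix, with K chosen by silhouette score) can be sketched as follows. This plain alternating K-medoids is a simplification of PAM-style algorithms, and the distance matrix in the usage example is synthetic:

```python
import numpy as np

def k_medoids(D, k, iters=100, seed=0):
    """Plain alternating K-medoids on a precomputed distance matrix D."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(D[:, medoids], axis=1)
        new = []
        for c in range(k):
            members = np.flatnonzero(labels == c)
            # new medoid = member minimizing total distance within the cluster
            within = D[np.ix_(members, members)].sum(axis=1)
            new.append(members[np.argmin(within)])
        new = np.array(new)
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids, np.argmin(D[:, medoids], axis=1)

def silhouette(D, labels):
    """Mean silhouette score of a labelling, from the same distance matrix."""
    labels = np.asarray(labels)
    n = len(labels)
    scores = []
    for i in range(n):
        same = (labels == labels[i]) & (np.arange(n) != i)
        a = D[i, same].mean() if same.any() else 0.0          # cohesion
        b = min(D[i, labels == c].mean()                      # separation
                for c in set(labels.tolist()) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))
```

Model selection then amounts to running `k_medoids` for several K and keeping the K with the highest `silhouette` value, as the paper describes.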
Structure Preserving Encoding of Non-euclidean Similarity Data
Maximilian Münch, Christoph Raab, Michael Biehl, Frank-Michael Schleif
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-03-13 | DOI: 10.5220/0008955100430051
Domain-specific proximity measures, like divergence measures in signal processing or alignment scores in bioinformatics, often lead to non-metric, indefinite similarities or dissimilarities. However, many classical learning algorithms, such as kernel machines, assume metric properties and struggle with such metric violations; the classical support vector machine, for example, is no longer guaranteed to converge to an optimum. One possible direction for solving the indefiniteness problem is to transform the non-metric (dis-)similarity data into positive (semi-)definite matrices. For this purpose, many approaches have been proposed that adapt the eigenspectrum of the given data such that positive definiteness is ensured. Unfortunately, most of these approaches modify the eigenspectrum so strongly that valuable information is removed or noise is added to the data. In particular, the shift operation has attracted a lot of interest in the past few years despite its frequently recurring disadvantages. In this work, we propose a modified advanced shift correction method that preserves the eigenspectrum structure of the data by means of a low-rank approximated nullspace correction. We compare our advanced shift to classical eigenvalue corrections, namely eigenvalue clipping, flipping, squaring, and shifting, on several benchmark datasets, and we analyze the impact of the low-rank approximation on the data's eigenspectrum.
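The classical eigenvalue corrections the paper compares against (clip, flip, square, shift) all follow one pattern: eigendecompose the symmetric similarity matrix, modify the spectrum, and reconstruct. A compact NumPy version of those baselines (not of the paper's advanced shift correction):

```python
import numpy as np

def correct_eigenspectrum(S, mode="clip"):
    """Classical eigenvalue corrections for an indefinite symmetric
    similarity matrix S: make the spectrum non-negative."""
    w, V = np.linalg.eigh(S)
    if mode == "clip":
        w = np.maximum(w, 0.0)        # drop negative directions entirely
    elif mode == "flip":
        w = np.abs(w)                 # mirror negative eigenvalues
    elif mode == "square":
        w = w ** 2                    # equivalent to S @ S for symmetric S
    elif mode == "shift":
        w = w - min(w.min(), 0.0)     # translate the whole spectrum to >= 0
    else:
        raise ValueError(mode)
    return (V * w) @ V.T              # reassemble V diag(w) V^T
```

Note how the plain shift adds a multiple of the identity to the whole matrix, altering every eigenvalue at once; that global distortion is the recurring disadvantage the paper's low-rank nullspace correction is designed to avoid.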
Deep Learning Approach to Diabetic Retinopathy Detection
B. Tymchenko, Philip Marchenko, D. Spodarets
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-03-03 | DOI: 10.5220/0008970805010509
Diabetic retinopathy is one of the most threatening complications of diabetes and leads to permanent blindness if left untreated. One of the essential challenges is early detection, which is very important for treatment success. Unfortunately, exact identification of the diabetic retinopathy stage is notoriously tricky and requires expert human interpretation of fundus images; simplifying the detection step is crucial and can help millions of people. Convolutional neural networks (CNNs) have been applied successfully in many adjacent subjects, and for the diagnosis of diabetic retinopathy itself. However, the high cost of large labeled datasets, as well as inconsistency between different doctors, impedes the performance of these methods. In this paper, we propose an automatic deep-learning-based method for detecting the stage of diabetic retinopathy from a single photograph of the human fundus. Additionally, we propose a multistage approach to transfer learning, which makes use of similar datasets with different labeling. The presented method can be used for screening and early detection of diabetic retinopathy, with sensitivity and specificity of 0.99, and ranks 54th of 2943 competing methods (quadratic weighted kappa score of 0.925466) on the APTOS 2019 Blindness Detection dataset (13,000 images).
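The ranking metric quoted above, quadratic weighted kappa, compares the observed confusion matrix against the one expected under independent marginals, weighting each cell by the squared distance between predicted and true stage. A direct NumPy implementation of the standard definition:

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Quadratic weighted kappa over integer class labels 0..n_classes-1."""
    O = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        O[t, p] += 1                                   # observed confusion matrix
    i, j = np.indices((n_classes, n_classes))
    W = ((i - j) ** 2) / ((n_classes - 1) ** 2)        # quadratic disagreement weights
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()  # expected under independence
    return 1.0 - (W * O).sum() / (W * E).sum()
```

Because the weights grow quadratically with the distance between stages, confusing stage 0 with stage 4 is penalized far more heavily than confusing adjacent stages, which suits an ordinal grading task like retinopathy staging.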
Pub Date : 2020-02-01DOI: 10.5220/0009106605480554
G. Boccignone, C. de’Sperati, M. Granato, G. Grossi, R. Lanzarotti, Nicoletta Noceti, F. Odone
: The physical and mental health in elderly population is an emergent issue which in recent years has become an urgent socio-economic phenomenon. Computer scientists, together with physicians and caregivers have devoted a great research effort to conceive and devise assistive technologies, aiming at safeguarding elder health, while a marginal consideration has been devoted to their emotional domain. In this manuscript we outline the research plan and the objectives of a current project called Stairway to elders: bridging space, time and emotions in their social environment for wellbeing” . Through a set of sensors, which include cameras and physiological sensors, we aim at developing computational methods for understanding the affective state and socialization attitude of older people in ecological conditions. A valuable by-product of the project will be the collection of a multi-modal dataset to be used for model design, and that will be made available to the research community. The outcomes of the project should support the design of an environment which automatically (or semi-automatically) adapts its conditions to the affective state of older people, with a consequent improvement of their life quality.
{"title":"Stairway to Elders: Bridging Space, Time and Emotions in Their Social Environment for Wellbeing","authors":"G. Boccignone, C. de’Sperati, M. Granato, G. Grossi, R. Lanzarotti, Nicoletta Noceti, F. Odone","doi":"10.5220/0009106605480554","DOIUrl":"https://doi.org/10.5220/0009106605480554","url":null,"abstract":": The physical and mental health in elderly population is an emergent issue which in recent years has become an urgent socio-economic phenomenon. Computer scientists, together with physicians and caregivers have devoted a great research effort to conceive and devise assistive technologies, aiming at safeguarding elder health, while a marginal consideration has been devoted to their emotional domain. In this manuscript we outline the research plan and the objectives of a current project called Stairway to elders: bridging space, time and emotions in their social environment for wellbeing” . Through a set of sensors, which include cameras and physiological sensors, we aim at developing computational methods for understanding the affective state and socialization attitude of older people in ecological conditions. A valuable by-product of the project will be the collection of a multi-modal dataset to be used for model design, and that will be made available to the research community. 
The outcomes of the project should support the design of an environment which automatically (or semi-automatically) adapts its conditions to the affective state of older people, with a consequent improvement of their life quality.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115191561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Network of Steel: Neural Font Style Transfer from Heavy Metal to Corporate Logos
Aram Ter-Sarkisov
International Conference on Pattern Recognition Applications and Methods | Pub Date: 2020-01-10 | DOI: 10.5220/0009343906210629
We introduce a method for transferring style from the logos of heavy metal bands onto corporate logos using a VGG16 network. We establish the contribution of different layers and loss coefficients to the learning of style, the minimization of artefacts, and the maintenance of readability of corporate logos, and we find layers and loss coefficients that produce a good trade-off between heavy metal style and corporate logo readability. This is a first step towards both sparse font style transfer and corporate logo decoration using generative networks. Heavy metal and corporate logos are artistically very different in the way they emphasize emotions and readability; training a model to fuse the two is therefore an interesting problem.
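VGG16-based style transfer in the Gatys tradition matches Gram matrices of feature maps at the chosen layers, with per-layer loss coefficients weighting the contributions. The abstract does not confirm this paper uses exactly this loss, but the standard per-layer style loss it builds on looks like:

```python
import numpy as np

def gram(F):
    """Gram matrix of a (channels, height*width) feature map, size-normalized."""
    return F @ F.T / F.shape[1]

def style_loss(F_gen, F_style):
    """Mean squared difference between Gram matrices of generated and
    style feature maps, the usual per-layer style term."""
    G, A = gram(F_gen), gram(F_style)
    return float(np.mean((G - A) ** 2))
```

The total style objective is then a weighted sum of `style_loss` over selected VGG16 layers; the paper's contribution analysis amounts to studying which layers and weights preserve logo readability.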