2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA): Latest Papers

Robust Collaborative Fraudulent Transaction Detection using Federated Learning

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00064

Delton Myalil, M. Rajan, Manoj M. Apte, S. Lodha

Fraudulent transaction detection is a difficult problem for an individual bank, since the number of fraudulent transactions in a single bank’s records is far smaller than the number of regular transactions it processes each day. This extreme data imbalance makes training a classifier difficult. In addition, the model cannot learn from types of fraudulent transactions that a single bank’s database lacks. Collaboration between banks is the only way to achieve a generalized model, but banks will not share their data with each other due to competition and regulatory restrictions. Federated Learning can be leveraged to solve this problem. However, in a cross-silo setting like this, the data held by different banks differs in distribution, so the participants’ datasets are non-IID. Moreover, we assume that a minority of the banks could be malicious and will try to disrupt the federated learning process. Hence the problem is to perform federated learning in a non-IID setting with active adversaries, which is a new research area in fraud detection. We perform non-IID partitioning of the transaction dataset to simulate 10 banks, or silos. Then, as a benchmark, we perform federated averaging with a subset of the banks set as malicious. Furthermore, we propose a novel algorithm, Epsilon Cluster Selection, a filter-based aggregation technique that recognizes malicious nodes and prevents them from contributing to the global model being trained. We apply this algorithm to the same setting with malicious banks and compare the results.
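
The federated averaging benchmark mentioned in the abstract can be illustrated with a minimal sketch. It assumes each bank sends a flat list of model parameters and its local sample count. The function name `federated_average` and the example values are hypothetical, and the sketch does not implement Epsilon Cluster Selection, whose details the abstract does not give.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one parameter shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Three hypothetical banks; the third sends an extreme (possibly malicious) update.
banks = [[0.10, 0.20], [0.12, 0.18], [5.00, -5.00]]
sizes = [100, 100, 100]
global_weights = federated_average(banks, sizes)
```

In this toy run the single extreme update pulls the first averaged parameter well above the values sent by the other two banks, which is the kind of disruption a filter-based aggregation step is meant to prevent.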

Pages: 373-378

Citations: 3

Decoder Transformer for Temporally-Embedded Health Outcome Predictions

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00235

O. Boursalie, Reza Samavi, T. Doyle

Deep learning models are increasingly being used to predict patients’ diagnoses by analyzing electronic health records. Medical records represent observations of a patient’s health over time. A commonly used approach to analyzing health records is to encode them as a sequence of ordered diagnoses (diagnostic-level encoding). Transformer models then analyze the sequence of diagnoses to learn disease patterns. However, the elapsed time between medical visits is not considered when transformers are used to analyze health records. In this paper, we present DTTHRE: Decoder Transformer for Temporally-Embedded Health Records Encoding, which predicts patients’ diagnoses by analyzing their medical histories. In DTTHRE, instead of diagnostic-level encoding, we propose an encoding representation for health records called THRE: Temporally-Embedded Health Records Encoding. THRE encodes patient histories as a sequence of medical events such as age, sex, and diagnostic embedding while incorporating the elapsed time between visits. We evaluate a proof-of-concept DTTHRE on a real-world medical dataset and compare our model’s performance to an existing diagnostic transformer model in the literature. DTTHRE predicted patients’ final diagnoses on the medical dataset with better predictive performance (78.54 ± 0.22%) than the existing model from the literature (40.51 ± 0.13%).

Pages: 1461-1467

Citations: 0

BuiltNet: Graph based Spatio-Temporal Indoor Thermal Variation Detection

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00270

Naima Khan, Nirmalya Roy

Monitoring thermal conditions with thermal cameras is a potential non-intrusive way to supervise the structural well-being of buildings. Thermal variation can indicate various kinds of structural damage or construction deficiencies, including air leakage through the inside and outside surfaces of buildings. Frequent monitoring with thermal images can track the thermal characteristics of different places in built environments, which helps to prevent damage beforehand. Previous studies of thermal conditions in buildings using thermal images are limited to specific regions with constrained environmental settings. In this work, we propose BuiltNet, an automated, scalable framework for analyzing spatial and temporal temperature variation over various building elements (e.g., walls, windows, and doors) using longitudinal thermal images. We collected thermal images from a residential apartment home for 10 minutes in consecutive 4-5 hours on different days. The spatial and temporal relations among different spots in a region, taken from sequential thermal images of that region, are represented by a graph. We propose an unsupervised deep clustering algorithm based on a graph neural network that considers both spatial and temporal features from longitudinal thermal images. Our analysis of the spatial and temporal features of regions in the collected thermal images (from both day and night under different weather conditions) identifies the thermal variation and characterizes the spatiotemporal dynamics over different places in the built environment.

Pages: 1696-1703

Citations: 2

MLCHECK – Property-Driven Testing of Machine Learning Classifiers

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00123

Arnab Sharma, Caglar Demir, A. N. Ngomo, H. Wehrheim

An increasing amount of software with machine learning components is being deployed. This raises the question of quality assurance for such components: how can we validate whether machine-learned software fulfills its specified requirements? Current testing and verification approaches either focus on a single requirement (e.g., fairness) or specialize in a single type of machine learning model (e.g., neural networks). We propose property-driven testing of machine learning models. Our approach, MLCHECK, encompasses (1) a language for property specification, and (2) a technique for systematic test case generation. The specification language is comparable to property-based testing languages. The test case generation employs an elaborate verification method for a systematic, property-dependent construction of test suites, without additional user-supplied generator functions. We evaluate MLCHECK using requirements and data sets from three different application areas (software discrimination, learning on knowledge graphs, and security). Our evaluation shows that, in addition to its generality, MLCHECK can outperform specialised testing approaches while having a comparable runtime.
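
To make the idea of testing a classifier against a property concrete, here is a minimal sketch of a non-discrimination check: flipping a protected feature should not change the prediction. It uses plain random sampling, which is a deliberate simplification and not MLCHECK's verification-based test generation. The names `find_counterexample`, `fair_model`, and `biased_model` are hypothetical.

```python
import random
from typing import Callable, List, Optional, Tuple

Classifier = Callable[[List[float]], int]


def find_counterexample(model: Classifier, protected_index: int,
                        n_features: int, trials: int = 1000,
                        seed: int = 0) -> Optional[Tuple[List[float], List[float]]]:
    """Search for two inputs that differ only in the protected feature
    but receive different predictions. Returns None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.random() for _ in range(n_features)]
        x_flipped = list(x)
        # Mirror the protected feature within [0, 1]; all else stays equal.
        x_flipped[protected_index] = 1.0 - x[protected_index]
        if model(x) != model(x_flipped):
            return x, x_flipped
    return None


def fair_model(x: List[float]) -> int:
    # Ignores feature 1 (the protected feature in this toy setup).
    return int(x[0] > 0.5)


def biased_model(x: List[float]) -> int:
    # Lets the protected feature influence the decision.
    return int(x[0] + 0.5 * x[1] > 0.8)


fair_result = find_counterexample(fair_model, protected_index=1, n_features=2)
biased_result = find_counterexample(biased_model, protected_index=1, n_features=2)
```

Random sampling can only find violations, never prove their absence, which is one reason a systematic, property-dependent generator is attractive.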

Pages: 738-745

Citations: 6

Identification and validation of a radiomic signature for predicting survival outcomes in non-small-cell lung cancer treated with radiation therapy

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00095

Jin Li, Yixin Liu, Jingquan Wu

Radiomics is a novel tool that extracts quantitative features from medical imaging and combines key features into an image-based radiomic signature for cancer diagnostics. We aimed to develop a quantitative radiomic signature for predicting survival outcomes in non-small-cell lung cancer (NSCLC) patients treated with radiation therapy. Based on computed tomography (CT) imaging of NSCLC, we applied a forward selection procedure to establish a radiomic signature in a cohort of 107 NSCLC patients treated with radiation therapy, and validated it in a dataset of 88 patients. The radiomic signature was significantly associated with NSCLC patients’ survival time. In the testing dataset, the predicted high-risk patients had significantly shorter overall survival than the predicted low-risk patients (log-rank P = 0.0004, HR = 2.75, 95% CI: 1.58–4.80, C-index = 0.64). Further, the proposed radiomic nomogram, which combines the radiomic signature with clinicopathological factors, improved prognostic performance. The CT-based radiomic signature performed well at noninvasively identifying patients with NSCLC who should receive postoperative radiation therapy. These results provide a more precise reference for the accurate diagnosis and treatment of NSCLC in clinical practice.
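
The C-index reported in the abstract measures how often a model ranks patients' risks in the same order as their observed survival. The sketch below computes Harrell's concordance index on toy data. The abstract does not say which estimator or software the authors used, so this is an assumption, and the function name and sample values are hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C: fraction of comparable patient pairs in which the
    patient who died earlier was assigned the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # (not censored) strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: survival time in months, event flag (1 = death observed,
# 0 = censored), and a hypothetical signature-based risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.64 indicates moderate discrimination.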

Pages: 570-574

Citations: 0

An Unsupervised Learning Methodology for Increasing Human Productivity via VR Training

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00210

Sérgio Viademonte, B. Gomes, A. Siravenha, W. Gomes, Caio Rodrigues, R. A. Tourinho

In recent years the mining industry has witnessed a steady drop in productivity. This decline has been driven by a number of factors, such as an inefficient workforce. Among the reasons for workforce concerns are inexperienced workers and inadequate training protocols for increasing task-specific human abilities. In this study, we propose an unsupervised machine learning (ML) methodology for increasing human productivity in the mining industry via Virtual Reality (VR) training sessions. Our results show an increase in average productivity for operators who are below the desired production level, which can potentially lead to significant profit margins as well as a safer working environment.

Citations: 0

Automated Machine Learning Strategies to Damage Identification of Neurofibromatosis Mutations

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00217

A. Orjuela-Cañón, Juan Carlos Figueroa–García, Roman Neruda

Machine learning tools have been employed to solve problems in bioinformatics. However, tuning the parameters of these models can add difficulty, depending on the specific classification technique used. In this work, three automated machine learning strategies were applied to protein sequence data to determine the type of mutation in Neurofibromatosis. Results show that the parameters of the machine learning models were found automatically. In addition, these tools were useful for determining relations between the amino acids in the protein sequence.

Citations: 1

Deep Learning for Range Localization via Over-Water Electromagnetic Signals

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00247

Evan Witz, M. Barger, R. Paffenroth

Neural networks are widely applied in domains such as image processing, natural language processing, and time series forecasting. However, neural networks have seen less use in problems arising in the physical sciences. This is unfortunate, since the physical domain has a wealth of problems that can benefit from the application of neural networks. These problems hold substantial significance for many areas, such as manufacturing and materials science. In this paper we demonstrate that knowledge of the physical systems of interest can be combined with effective data preprocessing and neural network training to achieve prediction effectiveness that is greater than the sum of its parts. In particular, we study the challenging problem of range estimation from measurements of the electromagnetic scattering of radio waves reflected off the surface of the ocean and the atmosphere. Our key finding is that good performance can only be achieved by combining physical principles with careful data preprocessing and network training.

Citations: 0

Sensor-Based Obsessive-Compulsive Disorder Detection With Personalised Federated Learning

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00058

Kristina Kirsten, Bjarne Pfitzner, Lando Löper, B. Arnrich

The mental illness Obsessive-Compulsive Disorder (OCD) is characterised by obsessive thoughts and compulsive actions. The latter can occur as repetitive activities performed to ensure that severe fears do not come true. The disease is usually diagnosed very late because of a lack of knowledge and the patient's shame. Nevertheless, early detection can significantly increase the success of therapy. With the development of new wearable sensors, it is possible to recognise human activities. Accordingly, wearables can also be used to identify recurring activities that indicate OCD. Through this form of automatic detection system, a diagnosis can be made earlier and thus therapy can be started sooner. Since compulsive behaviour is very individual and varies from patient to patient, this paper deals with personalised federated machine learning models. We first adapt the publicly available OPPORTUNITY dataset to simulate OCD behaviour. Secondly, we evaluate two existing personalised federated learning algorithms against baseline approaches. Finally, we propose a hybrid approach that merges the two evaluated algorithms and reaches a mean area under the precision-recall curve (AUPRC) of 0.954 across clients.
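
The mean AUPRC across clients reported above can be illustrated with a small sketch. It uses average precision, one common estimator of the area under the precision-recall curve. Whether the authors used this estimator is not stated, so treat it as an assumption. The function names and the two toy clients are hypothetical.

```python
from typing import List, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values measured at the
    rank of each positive example, with examples sorted by score."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("at least one positive label is required")
    # Rank examples from the highest score to the lowest.
    # Tied scores keep their input order, which is a simplification.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / positives


def mean_auprc(clients: List[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Unweighted mean of per-client average precision."""
    return sum(average_precision(y, s) for y, s in clients) / len(clients)


# Two hypothetical clients: (labels, predicted scores); 1 = compulsive activity.
client_a = ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
client_b = ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
overall = mean_auprc([client_a, client_b])
```

Averaging per client rather than pooling all predictions gives each patient equal weight, which suits a personalised setting where behaviour varies between individuals.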

Pages: 333-339

Citations: 2

Graph Convolutional Networks for Categorizing Online Harassment on Twitter

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Pub Date : 2021-12-01 DOI: 10.1109/ICMLA52953.2021.00156

M. Saeidi, E. Milios, N. Zeh

Twitter is one of the social media platforms on which people express themselves freely. Harassment is one consequence of such platforms and is hard to prevent. Text categorization is a task that aims to address this problem. Several studies have applied classical machine learning methods and recent deep neural networks to categorize text. However, few studies have explored graph convolutional neural networks alongside classical approaches for categorizing harassment tweets. In this work, we first propose using graph convolutional networks (GCN) for tweet categorization. Second, we explore this categorization task using classical machine learning approaches and compare the results with the GCN model. Third, we show the effectiveness of the GCN model on this problem through a further evaluation on datasets with fewer samples. In addition, we use different embedding approaches to find the best representation of the dataset for each model and report the best embedding approach for this problem.
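
As background for the GCN model mentioned above, the sketch below implements one graph convolution layer in the widely used form H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), where D is the degree matrix of A + I. The abstract does not state which GCN variant or graph construction the authors used, so this layer and the two-node toy graph are assumptions for illustration only.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adjacency: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One graph convolution: normalise the adjacency matrix, mix each
    node's features with its neighbours', project, then apply ReLU."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_tilde = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    degree = [sum(row) for row in a_tilde]
    # Symmetric normalisation by the square roots of the node degrees.
    a_hat = [[a_tilde[i][j] / math.sqrt(degree[i] * degree[j])
              for j in range(n)] for i in range(n)]
    propagated = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in propagated]


# Toy graph: two connected nodes with one feature each and an identity weight.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
hidden = gcn_layer(adjacency, features, weights)
```

In this toy case both nodes end up with the average of the two input features, which shows how a layer smooths information across connected nodes.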

Citations: 1
