
Proceedings of the ... International Conference on Machine Learning and Applications. International Conference on Machine Learning and Applications: Latest Articles

A Memory-Enhanced Framework for Financial Fraud Detection
Kunlin Yang
DOI: https://doi.org/10.1109/ICMLA.2018.00140 | Pages: 871-874 | Published: 2018-01-01
Citations: 0
Automatic Scoring of a Nonword Repetition Test
Meysam Asgari, Jan Van Santen, Katina Papadakis

In this study, we explore the feasibility of speech-based techniques to automatically evaluate a nonword repetition (NWR) test. NWR tests, a useful marker for detecting language impairment, require repetition of pronounceable nonwords, such as "D OY F", presented aurally by an examiner or via a recording. Our proposed method first uses automatic speech recognition (ASR) to transcribe verbal responses. It then applies machine learning techniques to the ASR output to predict gold-standard scores provided by speech and language pathologists. Our experimental results for a sample of 101 children (42 with autism spectrum disorders, or ASD; 18 with specific language impairment, or SLI; and 41 typically developed, or TD) show that the proposed approach is successful in predicting scores on this test, with averaged product-moment correlations of 0.74 and a mean absolute error of 0.06 (on an observed score range from 0.34 to 0.97) between observed and predicted ratings.
DOI: https://doi.org/10.1109/icmla.2017.0-143 | Pages: 304-308 | Published: 2017-12-01
Citations: 0
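The two agreement measures quoted in the nonword repetition abstract above, product-moment correlation and mean absolute error, can be sketched in a few lines. This is only an illustration of the metrics: the function names and the score lists are made up for the example and are not data or code from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean_absolute_error(xs, ys):
    """Average absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Made-up scores for illustration only (assumption, not study data).
observed = [0.40, 0.55, 0.70, 0.85, 0.95]
predicted = [0.45, 0.50, 0.75, 0.80, 0.90]

r = pearson_r(observed, predicted)
mae = mean_absolute_error(observed, predicted)
```

Reporting both numbers is useful because correlation ignores a constant offset between predicted and observed scores, while the mean absolute error does not.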
Metabolic Profiling of 1H NMR Spectra in Chronic Kidney Disease with Local Predictive Modeling
M. Luck, A. Yartseva, G. Bertho, E. Thervet, P. Beaune, N. Pallet, C. Damon
Metabolic profiling, the study of changes in metabolite concentrations in an organism induced by biological differences within subpopulations, has to deal with a very large amount of complex data. It therefore requires powerful data processing and machine learning methods. To overcome over-fitting, a common concern in metabolic profiling where the number of features is often much larger than the number of observations, many predictive analyses combine dimension reduction techniques with multivariate predictive linear modeling. Moreover, they build a global model that identifies biomarkers predictive of the output of interest given their overall trend variations. However, this fails to capture local biological phenomena underlying subgroups of subjects. More recently, local exploration methods based on decision-tree approaches have been applied in metabolomics, but they only explore random parts of the feature space. In this study, we used a supervised rule-mining algorithm that locally and exhaustively explores the feature space to predict chronic kidney disease (CKD) stages based on proton nuclear magnetic resonance (1H NMR) data. From the discriminant subregions obtained with this exploration, we extracted local features and learned an L2-regularized logistic regression (L2LR) classifier. We compared the resulting local predictive model with a standard one, which combines classical univariate supervised feature selection techniques with an L2LR, and with a model mixing both global and local features. Results show that the local predictive model achieves better predictive performance than the mixed and global models. Additionally, it gives key insights into biological variations specific to subgroups of subjects.
DOI: https://doi.org/10.1109/ICMLA.2015.155 | Pages: 176-181 | Published: 2015-12-01
Citations: 6
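The L2-regularized logistic regression classifier named in the metabolic profiling abstract above can be sketched as plain batch gradient descent. This is a minimal illustration under stated assumptions: the function names, hyperparameters (`l2`, `lr`, `epochs`), and the toy two-feature data are all hypothetical, and the rule-mining step that produces the local features in the paper is not reproduced here.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_l2_logistic(X, y, l2=0.1, lr=0.5, epochs=500):
    """Minimize mean log-loss + (l2 / 2) * ||w||^2; the bias is not penalized."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Gradient step: data term averaged over samples, plus the L2 penalty.
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return class 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Toy, made-up data (assumption): two features, two well-separated classes.
X = [[0.0, 0.2], [0.1, 0.0], [0.2, 0.1], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_l2_logistic(X, y)
```

The penalty shrinks the weights toward zero, which is what limits over-fitting when features far outnumber observations; a larger `l2` gives a smaller weight norm.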
ExpertBayes: Automatically refining manually built Bayesian networks.
Ezilda Almeida, Pedro Ferreira, Tiago Vinhoza, Inês Dutra, Jingwei Li, Yirong Wu, Elizabeth Burnside

Bayesian network structures are usually built using only the data, starting from an empty network or from a naïve Bayes structure. Very often, in domains such as medicine, prior knowledge of the structure is already available. This structure can be automatically or manually refined in search of better-performing models. In this work, we take Bayesian networks built by specialists and show that minor perturbations to the original network can yield better classifiers at a very small computational cost, while maintaining most of the intended meaning of the original model.
DOI: https://doi.org/10.1109/ICMLA.2014.64 | Pages: 362-366 | Published: 2014-12-01
Citations: 8
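One way to picture "minor perturbations" of an expert-built network, as described in the ExpertBayes abstract above, is to enumerate every structure that differs from it by a single edge while staying acyclic. The sketch below assumes that reading (add, remove, or reverse one edge); the abstract does not specify the exact operators or the scoring used, and the node names and functions are hypothetical.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    return visited == len(nodes)

def single_edge_perturbations(nodes, edges):
    """Yield (operation, edge, new_edge_set) for each valid one-edge change."""
    edges = frozenset(edges)
    for edge in sorted(edges):
        # Removing an edge can never create a cycle.
        yield "remove", edge, edges - {edge}
        flipped = (edge[1], edge[0])
        candidate = (edges - {edge}) | {flipped}
        if is_acyclic(nodes, candidate):
            yield "reverse", edge, candidate
    for parent, child in permutations(nodes, 2):
        if (parent, child) in edges or (child, parent) in edges:
            continue
        candidate = edges | {(parent, child)}
        if is_acyclic(nodes, candidate):
            yield "add", (parent, child), candidate

# Hypothetical three-node expert network (assumption, not from the paper).
nodes = ["Age", "Mass", "Diagnosis"]
expert_edges = {("Age", "Diagnosis"), ("Mass", "Diagnosis")}
neighbours = list(single_edge_perturbations(nodes, expert_edges))
```

Each candidate in `neighbours` would then be scored as a classifier on held-out data, keeping the original network unless a neighbour does better; that scoring step is left out here.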
Phrase Based Topic Modeling for Semantic Information Processing in Biomedicine.
Zhiguo Yu, Todd R Johnson, Ramakanth Kavuluru

Given that unstructured data is increasing exponentially every day, extracting and understanding the information, themes, and relationships from large collections of documents is increasingly important to researchers in many disciplines, including biomedicine. Latent Dirichlet Allocation (LDA) is an unsupervised topic modeling technique based on the "bag-of-words" assumption that has been applied extensively to unveil hidden semantic themes within large sets of textual documents. Recently, it was extended using the "bag-of-n-grams" paradigm to account for word order. In this paper, we present an alternative phrase-based LDA model that moves from a bag-of-words or n-grams paradigm to a "bag-of-key-phrases" setting by applying a key phrase extraction technique, the C-value method, to further explore latent themes. We evaluate our approach using a phrase intrusion user study and demonstrate that our model can help LDA generate better and more interpretable topics than those generated using the bag-of-n-grams approach. Given that topic models are essentially statistical tools, an important problem in topic modeling is that of visualizing and interacting with the models to understand and extract new information from a collection. To evaluate our phrase-based modeling approach in this context, we incorporate it in an open source interactive topic browser. Qualitative evaluations of this browser with biomedical experts demonstrate that our approach can help biomedical researchers gain a better and faster understanding of their document collections.
DOI: https://doi.org/10.1109/ICMLA.2013.89 | Pages: 440-445 | Published: 2013-12-01
Citations: 12
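The C-value method mentioned in the phrase-based topic modeling abstract above scores a candidate phrase by its frequency, discounted when the phrase mostly occurs nested inside longer candidates. The sketch below is a simplified version that works directly on raw candidate frequencies; the full method also includes a linguistic filter for candidate extraction, which is omitted. The function names and the example counts are made up for illustration.

```python
import math

def contains(longer, shorter):
    """True if the word tuple `shorter` occurs contiguously inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))

def c_values(freq):
    """Map each candidate phrase (tuple of words) to a simplified C-value.

    Non-nested phrase a:  log2(|a|) * f(a)
    Nested phrase a:      log2(|a|) * (f(a) - mean of f(b) over longer b containing a)
    """
    scores = {}
    for phrase, f in freq.items():
        nesting = [f_b for other, f_b in freq.items() if contains(other, phrase)]
        if nesting:
            f = f - sum(nesting) / len(nesting)
        scores[phrase] = math.log2(len(phrase)) * f
    return scores

# Made-up candidate counts (assumption, not from the paper's corpus).
freq = {
    ("basal", "cell", "carcinoma"): 5,
    ("cell", "carcinoma"): 8,
    ("basal", "cell"): 6,
}
scores = c_values(freq)
```

In this toy example the three-word phrase outranks both two-word fragments even though they are more frequent, because most of their occurrences are explained by the longer phrase. Note that `log2(1)` is zero, so single words always score zero and the measure only ranks multi-word candidates.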
Face Recognition Challenge: Object Recognition Approaches for Human/Avatar Classification
T. Yamasaki, Tsuhan Chen
Recently, a novel "completely automated public Turing test to tell computers and humans apart (CAPTCHA)" system has been proposed, in which users are asked to separate natural faces of humans from artificial faces of virtual world avatars. The system is based on the assumption that computers cannot separate them, while it is an easy task for humans. Conventional digital forensics approaches to distinguishing natural images from computer graphics images are mostly based on statistical analysis of the images, such as noise in CMOS image sensors or Bayer matrix estimation. In contrast, this paper uses approaches based on face recognition and object classification. The experiments show that our approaches work surprisingly well and yield more than 99% accuracy. Our object classification based approach can also tell us how likely the input images are to be regarded as human or avatar faces.
DOI: https://doi.org/10.1109/ICMLA.2012.188 | Pages: 574-579 | Published: 2012-12-12
Citations: 5
Combining Gene Expression Profiles and Protein-Protein Interactions for Identifying Functional Modules
Dingding Wang, M. Ogihara, Erliang Zeng, Tao Li
Identifying functional modules from protein-protein interaction networks is an important and challenging task. This paper presents a new approach called PPIBM, which is designed to integrate gene expression data analysis and clustering of protein-protein interactions. The proposed approach relies on a Bayesian model whose basis is the set of protein-protein interactions given as part of the input. The proposed method is evaluated with standard measures, and its performance is compared with state-of-the-art network analysis methods. Experimental results on both real-world data and synthetic data demonstrate the effectiveness of the proposed approach.
DOI: https://doi.org/10.1109/ICMLA.2012.28 | Pages: 114-119 | Published: 2012-12-12
Citations: 3
Feature Extraction and Selection in Ground Penetrating Radar with Experimental Data Set of Inclusions in Concrete Blocks
F. Queiroz, D. Vieira, X. L. Travassos, M. F. Pantoja
Ground Penetrating Radar (GPR) systems have been successfully used to assess the condition of concrete structures. Moreover, inclusions in concrete can be discriminated by simple models based on traces obtained by GPR. In this work, concrete blocks with different inclusions were probed in controlled conditions. Some features were extracted from the A-scans of this experimental data set. To get efficient models, raw data were submitted to feature selection and space reduction methods. Without complex data pre-processing, models with good accuracy, better explainability, and a lower computational burden were obtained.
DOI: https://doi.org/10.1109/ICMLA.2012.139 | Pages: 48-53 | Published: 2012-12-12
Citations: 3
Clinical report classification using Natural Language Processing and Topic Modeling.
Efsun Sarioglu, Hyeong-Ah Choi, Kabir Yadav

A large amount of electronic clinical data encompasses important information in free text format. To help guide medical decision-making, text needs to be efficiently processed and coded. In this research, we investigate techniques to improve classification of Emergency Department computed tomography (CT) reports. The proposed system uses Natural Language Processing (NLP) to generate structured output from the reports and then machine learning techniques to code for the presence of clinically important injuries for traumatic orbital fracture victims. Topic modeling of the corpora is also utilized as an alternative representation of the patient reports. Our results show that both NLP and topic modeling improve raw text classification results. Within NLP features, filtering the codes using modifiers produces the best performance. Topic modeling shows mixed results. Topic vectors provide good dimensionality reduction and give classification results comparable to those obtained with NLP features. However, binary topic classification fails to improve upon raw text classification.
DOI: https://doi.org/10.1109/icmla.2012.173 | Pages: 204-209 | Published: 2012-12-01
Citations: 20
Improving a Gold Standard: Treating Human Relevance Judgments of MEDLINE Document Pairs.
Won Kim, W John Wilbur

Given prior human judgments of the condition of an object, it is possible to use these judgments to make a maximum likelihood estimate of what future human judgments of the condition of that object will be. However, if one has a reasonably large collection of similar objects and the prior judgments of a number of judges regarding the condition of each object in the collection, then it is possible to make predictions of future human judgments for the whole collection that are superior to the simple maximum likelihood estimate for each object in isolation. This is possible because the multiple judgments over the collection allow an analysis to determine the relative value of a judge as compared with the other judges in the group, and this value can be used to augment or diminish a particular judge's influence in predicting future judgments. Here we study and compare five different methods for making such improved predictions and show that each is superior to simple maximum likelihood estimates.
DOI: https://doi.org/10.1109/ICMLA.2010.79 | Pages: 491-498 | Published: 2011-02-04
Citations: 2
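The contrast drawn in the gold-standard abstract above, a per-object estimate in isolation versus one that weights judges by their relative value, can be illustrated with a toy example. The weighting scheme below (agreement with the majority of the other judges) is a hypothetical stand-in chosen for simplicity; it is not one of the five methods compared in the paper, whose details the abstract does not give. The judgment matrix is made up.

```python
def mle_estimates(judgments):
    """Per-object estimate in isolation: the fraction of positive votes."""
    return [sum(votes) / len(votes) for votes in judgments]

def judge_weights(judgments):
    """Weight each judge by how often they agree with the majority of the others."""
    n_judges = len(judgments[0])
    weights = []
    for j in range(n_judges):
        agree = 0
        for votes in judgments:
            others = votes[:j] + votes[j + 1:]
            majority = 1 if 2 * sum(others) > len(others) else 0
            agree += votes[j] == majority
        weights.append(agree / len(judgments))
    return weights

def weighted_estimates(judgments, weights):
    """Per-object estimate where each vote counts in proportion to its judge's weight."""
    total = sum(weights)
    return [sum(w * v for w, v in zip(weights, votes)) / total for votes in judgments]

# Rows are objects, columns are judges; binary relevance votes (made-up data).
judgments = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
weights = judge_weights(judgments)
baseline = mle_estimates(judgments)
adjusted = weighted_estimates(judgments, weights)
```

Here the fourth judge disagrees with the others on two of four objects, so their votes are down-weighted and the adjusted estimates move toward the consensus of the remaining judges.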