Daniel E Russ, Kwan-Yuet Ho, Calvin A Johnson, Melissa C Friesen
Mapping job titles to standardized occupation classification (SOC) codes is an important step in evaluating changes in health risks over time as measured in inspection databases. However, manual SOC coding is cost prohibitive for very large studies. Computer-based SOC coding systems can improve the efficiency of incorporating occupational risk factors into large-scale epidemiological studies. We present a novel method of mapping verbatim job titles to SOC codes using a large, publicly available table of prior knowledge that includes detailed descriptions of the tasks and activities relevant to each SOC code, together with their synonyms. Job titles are compared to our knowledge base to find the closest matching SOC code. A soft Jaccard index is used to measure the similarity between a previously unseen job title and the knowledge base. Additional information such as standardized industrial codes can be incorporated to improve the SOC code determination by providing additional context to break ties in matches.
{"title":"Computer-Based Coding of Occupation Codes for Epidemiological Analyses.","authors":"Daniel E Russ, Kwan-Yuet Ho, Calvin A Johnson, Melissa C Friesen","doi":"10.1109/CBMS.2014.79","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"2014","pages":"347-350"},"publicationDate":"2014-05-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4161468/pdf/nihms623084.pdf"}
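The matching step described in the abstract above (compare a job title to a knowledge base with a soft Jaccard index and take the closest SOC code) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the token similarity (`difflib.SequenceMatcher`), the 0.8 threshold, the greedy one-to-one matching, and the two-entry `KNOWLEDGE_BASE` with its placeholder codes are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; an assumed stand-in for the paper's measure."""
    return SequenceMatcher(None, a, b).ratio()


def soft_jaccard(title_a: str, title_b: str, threshold: float = 0.8) -> float:
    """Jaccard index in which tokens may match approximately."""
    tokens_a = set(title_a.lower().split())
    tokens_b = set(title_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    # Score every cross pair, then greedily keep the best one-to-one matches.
    pairs = sorted(
        ((token_similarity(a, b), a, b) for a in tokens_a for b in tokens_b),
        reverse=True,
    )
    used_a, used_b, soft_intersection = set(), set(), 0.0
    for sim, a, b in pairs:
        if sim < threshold:
            break  # pairs are sorted, so nothing further can qualify
        if a not in used_a and b not in used_b:
            used_a.add(a)
            used_b.add(b)
            soft_intersection += sim
    union = len(tokens_a) + len(tokens_b) - soft_intersection
    return soft_intersection / union


def best_soc_code(job_title: str, knowledge_base: dict) -> tuple:
    """Return (code, score) for the knowledge-base entry closest to job_title."""
    return max(
        ((code, soft_jaccard(job_title, text)) for code, text in knowledge_base.items()),
        key=lambda item: item[1],
    )


# Hypothetical knowledge base; codes and phrases are placeholders, not real SOC data.
KNOWLEDGE_BASE = {
    "CODE-A": "registered nurse",
    "CODE-B": "truck driver",
}
```

With this sketch, a misspelled title such as "truk drivers" still lands on the "truck driver" entry because each token clears the similarity threshold. Tie-breaking with industry codes, which the abstract mentions, is not modelled here.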
Pub Date : 2013-06-20 DOI: 10.1109/CBMS.2013.6627755
R. B. Rao
Consider the following healthcare trends: (1) There is a tremendous increase in the amount of patient, life sciences and process data in electronic form, fueled by advances in healthcare IT technology, and health reform legislation. (2) The amount of medical information (e.g., evidence-based knowledge) and published knowledge is said to be doubling every few years. (3) There is an explosion in the number of available therapies and diagnostic options for patient care, often enabling precise targeting of therapy to disease conditions. In this talk we will discuss these trends and some of the reasons why, despite these advances, healthcare is facing a crisis: namely, there is a steady unsustainable increase in medical costs without a corresponding improvement of patient outcomes. We believe that analysis of clinical, life sciences and medical process data can play a key role in tackling these fundamental challenges. Two technology advances, in particular, can play a key role: cloud computing and mobility will make it possible to analyze vast amounts of data and quickly deliver useful information to clinicians, consumers and researchers at the point where it can have the most impact. Some of this is already happening today, with medical records being analyzed to reduce fraud, waste and abuse, improve patient outcomes, and to improve compliance with standards of care and policy guidelines. We conclude the talk with a glimpse of a future where medical systems could be continually analyzed for optimizing healthcare costs and outcomes.
{"title":"The role of medical data analytics in reducing health fraud and improving clinical and financial outcomes","authors":"R. B. Rao","doi":"10.1109/CBMS.2013.6627755","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"14 6","pages":"3"},"publicationDate":"2013-06-20","publicationTypes":"Journal Article","openAccessPdf":"https://sci-hub-pdf.com/10.1109/CBMS.2013.6627755"}
Pub Date : 2013-06-20 DOI: 10.1109/CBMS.2013.6627754
P. Larrañaga, C. Bielza
Summary form only given. In this keynote lecture we will show how Bayesian networks can address important neuroscience problems. These problems include: (a) neuroanatomy issues, like modeling and simulation of dendritic trees and classifying neuron types based on morphological features; (b) neurodegenerative diseases, like predicting health-related quality of life in Parkinson's disease, classification of dementia stages in Parkinson's disease and searching for genetic biomarkers in Alzheimer's disease.
{"title":"Bayesian networks to answer challenging neuroscience questions","authors":"P. Larrañaga, C. Bielza","doi":"10.1109/CBMS.2013.6627754","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"42 1","pages":"2"},"publicationDate":"2013-06-20","publicationTypes":"Journal Article"}
Pub Date : 2013-06-20 DOI: 10.1109/CBMS.2013.6627753
J. Wyatt
In my experience, presentations at AIME, IEEE or EFMI conferences often describe work by academic engineers using patients as a source of data to explore new modelling methods, and few demonstrate convincing solutions to real-world clinical problems. One reason for this is that many doctors make themselves inaccessible, so engineers find it hard to engage them in projects. Since healthcare and medical work are very complex, it takes years of exposure to clinicians and healthcare settings for an engineer to understand real-world patient management problems in sufficient detail to help solve them. This means that sometimes an engineer might believe they have solved the problem, while to a clinician they have only explored an irrelevant simplification of it. Another explanation is that some engineering academics have had their fingers burned by clinicians who expect the engineer to carry out an everyday system development task with no research payload. Such engineers will become suspicious of engaging too closely with doctors. Cynics might be less fair, observing that since medical research is well funded, there is a tendency for engineers to apply any novel engineering method to a simplified health data set, as this is more likely to attract funding than applying their method to, say, linguistics data. However, I believe there is a deeper explanation of why so few bioengineering projects seem to bear clinically digestible fruit: there are fundamental differences in motivation, research focus and research methods between engineering and healthcare research domains, and in the kind of problems they address. For example, the engineering approaches used in the Virtual Physiological Human programme mainly involve data mining and modelling, while clinicians emphasise using psychological, social or other theories to understand and formalise a complex problem first, then use empirical testing to find out whether a theory-based solution works - the evidence-based approach. 
It is clearly unhelpful for engineers to criticise doctors as being poor collaborators in multidisciplinary projects, just as it is for doctors to criticise engineers. So, the aim of this talk is to move beyond name calling to explore common ground constructively and to provoke useful reflection and discussion, both within and across these disciplines. This talk will therefore explore some of the similarities and differences between engineering and healthcare as research disciplines, their respective approaches to problem solving and attempt to build bridges between these two very different worlds. In conclusion, unless we describe the features of this uneasy stand-off between engineers and clinicians, confront it head on and provoke debate, it looks set to continue. This will reduce productivity on both sides and limit the enormous scientific, economic and social benefits that novel, clinically appropriate and collaboratively engineered systems can generate.
{"title":"Why don't engineers and clinicians talk the same language - And what to do about it?","authors":"J. Wyatt","doi":"10.1109/CBMS.2013.6627753","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"11 1","pages":"1"},"publicationDate":"2013-06-20","publicationTypes":"Journal Article"}
AudioSense integrates mobile phones and web technology to measure hearing aid performance in real-time and in-situ. Measuring the performance of hearing aids in the real world poses significant challenges as it depends on the patient's listening context. AudioSense uses Ecological Momentary Assessment methods, delivered as electronic surveys, both to evaluate the perceived hearing aid performance and to characterize the listening environment. AudioSense further characterizes a patient's listening context by recording their GPS location and sound samples. By creating a time-synchronized record of listening performance and listening contexts, AudioSense will allow researchers to understand the relationship between listening context and hearing aid performance. Performance evaluation shows that AudioSense is reliable, energy-efficient, and can estimate Signal-to-Noise Ratio (SNR) levels from captured audio samples.
{"title":"AudioSense: Enabling Real-time Evaluation of Hearing Aid Technology In-Situ.","authors":"Syed Shabih Hasan, Farley Lai, Octav Chipara, Yu-Hsiang Wu","doi":"10.1109/CBMS.2013.6627783","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"2013","pages":"167-172"},"publicationDate":"2013-01-01","publicationTypes":"Journal Article","openAccessPdf":"https://sci-hub-pdf.com/10.1109/CBMS.2013.6627783"}
Pub Date : 2012-01-01 DOI: 10.1109/CBMS.2012.6266298
Chen Shi, Brian C Becker, Cameron N Riviere
This paper describes an inexpensive pico-projector-based augmented reality (AR) display for a surgical microscope. The system is designed for use with Micron, an active handheld surgical tool that cancels the surgeon's hand tremor to improve microsurgical accuracy. Using the AR display, virtual cues can be injected into the microscope view to track the movement of the tip of Micron, show the desired position, and indicate the position error. Cues can be used to maintain high performance by helping the surgeon to avoid drifting out of the workspace of the instrument. Also, boundary information such as the view range of the cameras that record surgical procedures can be displayed to show surgeons the operating area. Furthermore, numerical, textual, or graphical information can be displayed, showing such things as tool tip depth in the workspace and the on/off status of Micron's tremor-canceling function.
{"title":"Inexpensive Monocular Pico-Projector-based Augmented Reality Display for Surgical Microscope.","authors":"Chen Shi, Brian C Becker, Cameron N Riviere","doi":"10.1109/CBMS.2012.6266298","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"2012","pages":"1-6"},"publicationDate":"2012-01-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4175741/pdf/nihms368430.pdf"}
Pub Date : 2011-06-27 DOI: 10.1109/CBMS.2011.5999022
T. Solomonides
The 24th International Symposium on Computer-Based Medical Systems, CBMS 2011, took place at the University of the West of England, Bristol, UK, on 27th to 30th June 2011. As a special feature, instead of the traditional (since 2005) special track on “healthgrids”, i.e. grid computing for biomedicine and healthcare, latterly encompassing cloud computing also, the conference HealthGrid 2011 colocated with CBMS to the benefit of both. This was the culmination of a hope that those of us working at UWE had entertained since 2008. The invitation to CBMS was first made in Jyvaskyla in 2008, became a formal proposal in Albuquerque in 2009 and was confirmed in Perth in 2010. As for HealthGrid, it seemed an opportunity not to be missed to colocate with CBMS in Bristol, only the second time the conference has been awarded to a British city (after Oxford in 2005).
{"title":"Proceedings of the 24th International Symposium on Computer-Based Medical Systems - CBMS 2011 Bristol, UK","authors":"T. Solomonides","doi":"10.1109/CBMS.2011.5999022","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"96 1","pages":"1-4"},"publicationDate":"2011-06-27","publicationTypes":"Journal Article"}
Classification methods are widely used in computer-based medical systems. Often, the accuracy of a classifier can be improved using a classifier ensemble, the combination of several classifiers. Two classifier ensembles and their results on several medical data sets will be presented: Rotation Forest (Rodriguez, Kuncheva and Alonso) and Random Oracles (Kuncheva and Rodriguez). Rotation Forest is a method for generating classifier ensembles based on feature extraction. To create the training data for a base classifier, the feature set is randomly split into K subsets (K is a parameter of the algorithm) and Principal Component Analysis (PCA) is applied to each subset. All principal components are retained in order to preserve the variability information in the data. Thus, K axis rotations take place to form the new features for a base classifier. The idea of the rotation approach is to encourage individual accuracy and diversity within the ensemble simultaneously. Diversity is promoted through the feature extraction for each base classifier. Decision trees were chosen here because they are sensitive to rotation of the feature axes, hence the name "forest." Accuracy is sought by keeping all principal components and also using the whole data set to train each base classifier. Comparisons with various standard ensemble methods (Bagging, AdaBoost, and Random Forest) will be reported. Diversity-error diagrams reveal that Rotation Forest ensembles construct individual classifiers which are more accurate than those in AdaBoost and Random Forest and more diverse than those in Bagging, sometimes more accurate as well. A random oracle classifier is a mini-ensemble formed by a pair of classifiers and a fixed, randomly created oracle that selects between them. The random oracle can be thought of as a random discriminant function which splits the data into two subsets with no regard for any class labels or cluster structure. 
Two random oracles have been considered: linear and spherical. A random oracle classifier can be used as the base classifier of any ensemble method. It is argued that this approach encourages extra diversity in the ensemble while allowing for high accuracy of the individual ensemble members. Experiments with several data sets from UCI and 11 ensemble models will be reported. Each ensemble model will be examined with and without the oracle. The results will show that all ensemble methods benefited from the new approach, most markedly so random subspace and bagging. A further experiment with seven real medical data sets will demonstrate the validity of these findings outside the UCI data collection. When using Naive Bayes Classifiers as base classifiers, the experiments show that ensembles based solely upon the spherical oracle (and no other ensemble heuristic) outrank Bagging, Wagging, Random Subspaces, AdaBoost.M1, MultiBoost and Decorate. Moreover, all these ensemble methods are better with either of the two random oracles than their standard versions without the oracle.
{"title":"Rotation Forest and Random Oracles: Two Classifier Ensemble Methods","authors":"Juan José Rodríguez Diez","doi":"10.1109/CBMS.2007.94","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"178 1","pages":"3"},"publicationDate":"2007-06-20","publicationTypes":"Journal Article"}
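The random oracle classifier described in the abstract above (a pair of classifiers plus a fixed random function that routes each sample to one of them, ignoring class labels) might look roughly like the sketch below for the linear case. Several things here are assumptions, not details from the abstract: the hyperplane is taken to be the perpendicular bisector of two randomly chosen training points, and a tiny nearest-centroid model stands in for the base classifier purely to keep the example self-contained.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


class NearestCentroid:
    """Tiny stand-in base classifier (an assumption made for this sketch)."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict_one(self, row):
        return min(self.centroids, key=lambda c: squared_distance(row, self.centroids[c]))


class RandomLinearOracleClassifier:
    """Two base classifiers plus a fixed random hyperplane that routes between them."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def fit(self, X, y):
        # Two random training points define the hyperplane as the perpendicular
        # bisector of the segment joining them; class labels play no part here.
        self.anchor_a, self.anchor_b = self.rng.sample(X, 2)
        sides = [self._side(row) for row in X]
        self.models = []
        for side in (0, 1):
            rows = [r for r, s in zip(X, sides) if s == side]
            labels = [l for l, s in zip(y, sides) if s == side]
            self.models.append(NearestCentroid().fit(rows, labels))
        return self

    def _side(self, row):
        # Being at least as close to anchor_a as to anchor_b is the same as
        # lying on anchor_a's side of the bisecting hyperplane.
        d_a = squared_distance(row, self.anchor_a)
        d_b = squared_distance(row, self.anchor_b)
        return 0 if d_a <= d_b else 1

    def predict_one(self, row):
        # The oracle is fixed after fitting, so routing is deterministic.
        return self.models[self._side(row)].predict_one(row)
```

An ensemble method such as Bagging would then train many of these, each with its own random hyperplane, in place of its usual base classifier. A spherical oracle could follow the same pattern with a random centre and radius instead of two anchor points.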
Pub Date : 2004-01-01 DOI: 10.1109/CBMS.2004.1311683
S. Henrard
NIST has devised preliminary elements (technical “hooks”) of a convenient logging method for Web-based electronic health record (EHR) dialogues. These can identify fields, record the time spent at each (by whomever), and log the sequence of visits. The next step will be to refine this promising start and begin building upon it a more polished and user-friendly system. We present our results to gain impressions from users of the worth of simple, open tools for tuning and improving e-record flows and their correspondence with practice workflows.
{"title":"Preliminary Instrumentation for the Efficient Use of Web-Based Electronic Health Records","authors":"S. Henrard","doi":"10.1109/CBMS.2004.1311683","journal":{"name":"Proceedings. IEEE International Symposium on Computer-Based Medical Systems","volume":"53 1","pages":"10-14"},"publicationDate":"2004-01-01","publicationTypes":"Journal Article"}
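The bookkeeping behind the logging hooks described in the abstract above (identify fields, record time spent at each, log the sequence of visits) can be illustrated with a small sketch. It is not the NIST instrumentation: the class name `FieldVisitLog`, its methods, and the example field names are hypothetical, and a real Web-based EHR would presumably feed it from browser focus and blur events.

```python
from collections import defaultdict


class FieldVisitLog:
    """Records which form fields were visited, in order, and for how long."""

    def __init__(self):
        self.visits = []         # completed visits as (field_id, seconds_spent)
        self._open_visit = None  # (field_id, entered_at) for the field in focus

    def enter(self, field_id, timestamp):
        """Call when a field gains focus; closes any visit still open."""
        self.leave(timestamp)
        self._open_visit = (field_id, timestamp)

    def leave(self, timestamp):
        """Call when the field in focus loses it."""
        if self._open_visit is not None:
            field_id, entered_at = self._open_visit
            self.visits.append((field_id, timestamp - entered_at))
            self._open_visit = None

    def sequence(self):
        """Field identifiers in the order they were visited."""
        return [field_id for field_id, _ in self.visits]

    def time_per_field(self):
        """Total seconds spent in each field across all visits."""
        totals = defaultdict(float)
        for field_id, seconds in self.visits:
            totals[field_id] += seconds
        return dict(totals)
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the log easy to replay and test.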