
CEUR workshop proceedings: latest articles

Fast optimization of weighted sparse decision trees for use in optimal treatment regimes and optimal policy design
Pub Date : 2022-10-01 DOI: 10.48550/arXiv.2210.06825
Ali Behrouz, Mathias Lécuyer, C. Rudin, M. Seltzer
Sparse decision trees are one of the most common forms of interpretable models. While recent advances have produced algorithms that fully optimize sparse decision trees for prediction, that work does not address policy design, because the algorithms cannot handle weighted data samples. Specifically, they rely on the discreteness of the loss function, which means that real-valued weights cannot be directly used. For example, none of the existing techniques produce policies that incorporate inverse propensity weighting on individual data points. We present three algorithms for efficient sparse weighted decision tree optimization. The first approach directly optimizes the weighted loss function; however, it tends to be computationally inefficient for large datasets. Our second approach, which scales more efficiently, transforms weights to integer values and uses data duplication to transform the weighted decision tree optimization problem into an unweighted (but larger) counterpart. Our third algorithm, which scales to much larger datasets, uses a randomized procedure that samples each data point with a probability proportional to its weight. We present theoretical bounds on the error of the two fast methods and show experimentally that these methods can be two orders of magnitude faster than the direct optimization of the weighted loss, without losing significant accuracy.
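The second and third approaches described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `scale` multiplier, and the rounding rule are assumptions chosen for illustration, and the paper's exact weight-to-integer mapping may differ. The sketch only shows how a weighted dataset can be turned into an unweighted one, either by duplicating points or by sampling them in proportion to their weights.

```python
import random
from collections import Counter
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def duplicate_by_weight(samples: Sequence[T], weights: Sequence[float], scale: float) -> List[T]:
    """Build an unweighted dataset by repeating each sample round(weight * scale) times.

    `scale` is a hypothetical parameter: larger values keep more of the
    weight resolution but produce a larger dataset.
    """
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    expanded: List[T] = []
    for sample, weight in zip(samples, weights):
        copies = int(round(weight * scale))  # rounding is where approximation error enters
        expanded.extend([sample] * copies)
    return expanded


def sample_by_weight(samples: Sequence[T], weights: Sequence[float], k: int, seed: int = 0) -> List[T]:
    """Draw k samples with replacement, each with probability proportional to its weight."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.choices(samples, weights=weights, k=k)


if __name__ == "__main__":
    points = ["a", "b", "c"]
    point_weights = [0.5, 1.5, 3.0]
    print(Counter(duplicate_by_weight(points, point_weights, scale=2)))
    print(Counter(sample_by_weight(points, point_weights, k=1000)))
```

Either output can then be passed to an optimizer that only accepts unweighted data. Duplication is deterministic but grows the dataset, while sampling keeps the size fixed at `k` at the cost of randomness.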
Source: CEUR workshop proceedings, vol. 3318 1. DOI link: https://doi.org/10.48550/arXiv.2210.06825
Citations: 2
Performance Summary Display Ontology: Feedback intervention content, delivery, and interpreted information.
Pub Date : 2022-09-01
Zach Landis-Lewis, Cooper Stansbury, John Rincón, Colin Gross

Feedback loops are vital for decision-making and behavior change in health systems, but not all feedback is of equal value. Clinical performance feedback to healthcare professionals and teams has potential for large effects on clinical practice, but evidence suggests that low-value performance feedback is widespread. A primary barrier to understanding the value of feedback loops in health systems may be a lack of a well-defined model and shared semantics for the information that they carry. An ontology for audit and feedback research may be used to address these issues by standardizing feedback intervention metadata. Research describing feedback interventions recognizes differences between the content of the feedback and its delivery process. However, terms describing feedback intervention content are inconsistent, and appear to vary considerably between audit and feedback frameworks, which can result in confusion around what is being delivered in a performance summary. Our objective was to develop an ontology of a performance summary in a clinical performance feedback intervention for the purposes of standardizing metadata. We developed the Performance Summary Display Ontology (PSDO) iteratively by 1) identifying terms for classes from behavior change theories relating to feedback interventions and cognitive theories of visualization, 2) searching for relevant existing ontologies and classes, and 3) using the terms to specify information content and visual displays in published examples of dashboard displays and feedback reports. PSDO is a lightweight application ontology that specifies performance information content and its representations for the purpose of feedback intervention research and evaluation. 
PSDO contains 3 primary domains: 1) Performance information content, based on constructs from behavior change theories, 2) Marks and their qualities, based on constructs from visualization theories, and 3) roles that link marks, information content, and other emergent characteristics, as interpreted information. PSDO may enable standardization of metadata for the study of feedback interventions.

Source: CEUR workshop proceedings, vol. 3805, pp. L1-L10. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11825144/pdf/
Citations: 0
A community effort for COVID-19 Ontology Harmonization.
Pub Date : 2022-01-01 Epub Date: 2022-01-28
Asiyah Yu Lin, Yuki Yamagata, William D Duncan, Leigh C Carmody, Tatsuya Kushida, Hiroshi Masuya, John Beverley, Biswanath Dutta, Michael DeBellis, Zoë May Pendlington, Paola Roncaglia, Yongqun He

Ontologies have emerged to become critical to support data and knowledge representation, standardization, integration, and analysis. The SARS-CoV-2 pandemic led to the rapid proliferation of COVID-19 data, as well as the development of many COVID-19 ontologies. In the interest of supporting data interoperability, we initiated a community-based effort to harmonize COVID-19 ontologies. Our effort involves the collaborative discussion among developers of seven COVID-19 related ontologies, and the merging of four ontologies. This effort demonstrates the feasibility of harmonizing these ontologies in an interoperable framework to support integrative representation and analysis of COVID-19 related data and knowledge.

Source: CEUR workshop proceedings, vol. 3073, pp. 122-127. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10262777/pdf/
Citations: 0
An Open-Publishing Response to the COVID-19 Infodemic.
Pub Date : 2021-09-01
Halie M Rando, Simina M Boca, Lucy D'Agostino McGowan, Daniel S Himmelstein, Michael P Robson, Vincent Rubinetti, Ryan Velazquez, Casey S Greene, Anthony Gitter

The COVID-19 pandemic catalyzed the rapid dissemination of papers and preprints investigating the disease and its associated virus, SARS-CoV-2. The multifaceted nature of COVID-19 demands a multidisciplinary approach, but the urgency of the crisis combined with the need for social distancing measures present unique challenges to collaborative science. We applied a massive online open publishing approach to this problem using Manubot. Through GitHub, collaborators summarized and critiqued COVID-19 literature, creating a review manuscript. Manubot automatically compiled citation information for referenced preprints, journal publications, websites, and clinical trials. Continuous integration workflows retrieved up-to-date data from online sources nightly, regenerating some of the manuscript's figures and statistics. Manubot rendered the manuscript into PDF, HTML, LaTeX, and DOCX outputs, immediately updating the version available online upon the integration of new content. Through this effort, we organized over 50 scientists from a range of backgrounds who evaluated over 1,500 sources and developed seven literature reviews. While many efforts from the computational community have focused on mining COVID-19 literature, our project illustrates the power of open publishing to organize both technical and non-technical scientists to aggregate and disseminate information in response to an evolving crisis.

Source: CEUR workshop proceedings, vol. 2976, pp. 29-38. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9093051/pdf/
Citations: 0
Coverage of the Coronavirus Pandemic through Entropy Measures
Pub Date : 2021-03-23 DOI: 10.31812/123456789/4427
V. Soloviev, A. Bielinskyi, N. Kharadzjan
The rapidly evolving coronavirus pandemic has had a devastating effect on the entire world and its economy as a whole. Further instability related to COVID-19 will negatively affect not only companies and financial markets, but also traders and investors who are interested in protecting their investments, minimizing risks, and making decisions such as how to manage their resources, how much to consume and save, and when to buy or sell stocks; these decisions depend on expectations of when the next critical change will occur. To help people with these decisions, we demonstrate the possibility of constructing indicators of critical and crash phenomena using Bitcoin market crashes as an example, and then demonstrate their efficiency on the crash related to the coronavirus pandemic. For this purpose, methods from the theory of complex systems are used. Since the theory of complex systems offers an extensive toolkit for exploring nonlinear complex systems, we examine the application of the concept of entropy in finance and use it to construct 6 effective entropy measures: Shannon entropy, approximate entropy, permutation entropy, and 3 recurrence-based entropies. We provide computational results showing that these indicators could have been used to identify the beginning of the crash and predict the future course of events associated with the current pandemic.
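Two of the measures named in this abstract, Shannon entropy and permutation entropy, can be sketched in a few lines of Python using their standard textbook definitions. The abstract gives no formulas or parameters, so the function names, the embedding `order`, the `window` length, and the rolling-window indicator are assumptions made for illustration and do not reproduce the authors' code.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def shannon_entropy(symbols: Sequence[Hashable]) -> float:
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    n = len(symbols)
    counts = Counter(symbols)
    # p * log2(1/p) summed over observed symbols; written this way to avoid returning -0.0
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def permutation_entropy(series: Sequence[float], order: int = 3, normalize: bool = True) -> float:
    """Entropy of the ordinal patterns of length `order` found in `series`."""
    if len(series) < order:
        raise ValueError("series is shorter than the embedding order")
    patterns = [
        # indices of the window sorted by value give the ordinal pattern
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(len(series) - order + 1)
    ]
    h = shannon_entropy(patterns)
    # normalizing by log2(order!) maps the result into [0, 1]
    return h / math.log2(math.factorial(order)) if normalize else h


def rolling_permutation_entropy(series: Sequence[float], window: int, order: int = 3) -> List[float]:
    """Permutation entropy over a sliding window, usable as a simple time-varying indicator."""
    return [
        permutation_entropy(series[start:start + window], order)
        for start in range(len(series) - window + 1)
    ]


if __name__ == "__main__":
    prices = [4, 7, 9, 10, 6, 11, 3]  # toy values, not market data
    print(permutation_entropy(prices, order=3, normalize=False))
    print(rolling_permutation_entropy(prices, window=5))
```

A low value means the series is locally ordered and predictable, and a value near 1 (when normalized) means the ordinal patterns are close to uniformly distributed. How changes in such a value relate to market crashes is the subject of the paper, not of this sketch.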
Source: CEUR workshop proceedings, vol. 115 1. DOI link: https://doi.org/10.31812/123456789/4427
Citations: 14
Challenges in Realism-Based Ontology Design: a Case Study on Creating an Ontology for Motivational Learning Theories.
Pub Date : 2021-01-01
Irshad Ally, Werner Ceusters

Objective: to identify, on the basis of a use case, the major types of problems that novices in realism-based ontology design face when attempting to construct an ontology intended to explain differences and commonalities between competing scientific theories.

Methodology: an ontology student was tasked (1) to extract manually from a paper about five distinct motivational learning theories the scientific terms used to explain the theories, (2) to map these terms where possible to type-terms from existing realism-based ontologies or create new ones otherwise, (3) to indicate for new type-terms their immediate subsumer, and (4) to document at every step issues that were encountered.

Results: whereas term extraction and type-term assignment were handled satisfactorily, correct classification with respect to the BFO was a major challenge. Root causes identified included ambiguous and underspecified term use in the theories, the ontological status of psychological constructs, a lack of high-quality ontologies for the behavioral sciences, and insufficient 'deep' understanding of some BFO entities, in part because documentation of them suitable for learners is lacking. The issues the student encountered were often described too sparsely for the instructor to identify the problem without analyzing the source paper itself.

Conclusion: whereas behavioral scientists need to make an effort to make their theories comparable, realism-based ontologies can help them do so only when ontology developers and educators put more effort into making those ontologies more accessible without violating the principles.

Source: CEUR workshop proceedings, vol. 3073, pp. 63-69. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11164408/pdf/
Citations: 0
CTO: a Community-Based Clinical Trial Ontology and its Applications in PubChemRDF and SCAIView.
Pub Date : 2020-09-01
Asiyah Yu Lin, Stephan Gebel, Qingliang Leon Li, Sumit Madan, Johannes Darms, Evan Bolton, Barry Smith, Martin Hofmann-Apitius, Yongqun Oliver He, Alpha Tom Kodamullil

Driven by the use cases of PubChemRDF and SCAIView, we have developed a first community-based clinical trial ontology (CTO) by following the OBO Foundry principles. CTO uses the Basic Formal Ontology (BFO) as the top level ontology and reuses many terms from existing ontologies. CTO has also defined many clinical trial-specific terms. The general CTO design pattern is based on the PICO framework together with two applications. First, the PubChemRDF use case demonstrates how a drug Gleevec is linked to multiple clinical trials investigating Gleevec's related chemical compounds. Second, the SCAIView text mining engine shows how the use of CTO terms in its search algorithm can identify publications referring to COVID-19-related clinical trials. Future opportunities and challenges are discussed.

Source: CEUR workshop proceedings. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9389640/pdf/
Citations: 0
The OhioT1DM Dataset for Blood Glucose Level Prediction: Update 2020.
Pub Date : 2020-09-01
Cindy Marling, Razvan Bunescu

This paper documents the OhioT1DM Dataset, which was developed to promote and facilitate research in blood glucose level prediction. It contains eight weeks' worth of continuous glucose monitoring, insulin, physiological sensor, and self-reported life-event data for each of 12 people with type 1 diabetes. An associated graphical software tool allows researchers to visualize the integrated data. The paper details the contents and format of the dataset and tells interested researchers how to obtain it. The OhioT1DM Dataset was first released in 2018 for the first Blood Glucose Level Prediction (BGLP) Challenge. At that time, the dataset was half its current size, containing data for only six people with type 1 diabetes. Data for an additional six people is being released in 2020 for the second BGLP Challenge. This paper subsumes and supersedes the paper which documented the original dataset.

Source: CEUR workshop proceedings, vol. 2675, pp. 71-74. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7881904/pdf/nihms-1668254.pdf
Citations: 0
Foundations for a Realism-Based Ontology of Protein Aggregates.
Pub Date : 2020-09-01 Epub Date: 2021-02-02
Lauren Wishnie, Alexander P Cox, Alexander D Diehl, Werner Ceusters

The objective of this paper is to propose formal definitions for the terms 'protein aggregate' and 'protein-containing complex' such that the descriptions and usages of these terms in biomedical literature are unified and that those portions of reality are correctly represented. To this end, we surveyed the literature to assess the need for a distinction between these entities, then compared the features of usages and definitions found in the literature to the definitions for those terms found in Bioportal ontologies. Based on the results of this comparison, we propose updated definitions for the terms 'protein aggregate' and 'protein-containing complex'. Thus far, we propose the following distinguishing factors: first, that one important difference lies in whether an entity is disposed to change type in response to certain structural alterations, such as dissociation of a continuant part, and second that an important difference lies in the ability of the entity to realize its function after such an event occurs. These distinctions are reflected in the proposed definitions.

Source: CEUR workshop proceedings, vol. 2807, pp. K1-K10 (published 2020-09-01). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8547170/pdf/nihms-1648743.pdf
Citations: 0
An Ontology-Powered Dialogue Engine For Patient Communication of Vaccines.
Pub Date : 2019-10-01
Muhammad Amith, Rebecca Lin, Licong Cui, Dennis Wang, Anna Zhu, Grace Xiong, Hua Xu, Kirk Roberts, Cui Tao

In this study, we introduce an ontology-driven software engine to provide dialogue interaction functionality for a conversational agent for HPV vaccine counseling. Currently, HPV vaccination rates are low, which puts unprotected individuals at risk of being infected with HPV, a virus that leads to life-threatening cancers. In addition, we developed a question answering subsystem to support the dialogue engine. In this paper, we discuss our design and development of an ontology-driven dialogue engine that uses the Patient Health Information Dialogue Ontology, an ontology that we previously developed, and a question answering subsystem based on various previous methods to supplement the dialogue engine's interaction with the user. Our next step is to test the functional ability of the ontology-driven software components and deploy the engine in a live environment to be integrated with a speech interface.

Source: CEUR workshop proceedings, vol. 2427, pp. 24-30 (published 2019-10-01). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7376741/pdf/
Citations: 0