Automated Decision Support for Collaborative, Interactive Classification
Randolph M. Jones, Robert Bixler, Robert P. Marinier, Lilia V. Moshkina
Traditional classification approaches are straightforward: collect data, apply classification algorithms, then generate classification results. However, such approaches depend on data being amply available, which is not always the case. This paper describes an approach to maximizing the utility of collected data through intelligent guidance of the data collection process. We present the development and evaluation of a knowledge-based decision-support system, the Logical Reasoner (LR), which guides data collection by unmanned ground and air assets to improve behavior classification. The LR is a component of a Human Directed and Controlled AI system (or “Human-AI” system) aimed at semi-autonomous classification of potential threat and non-threat individuals in a complex urban setting. The setting provides little to no pre-existing data; thus, the system collects, analyzes, and evaluates real-time human behavior data to determine whether the observed behavior is indicative of threat intent. The LR’s purpose is to produce contextual knowledge that supports productive decisions about where, when, and how to guide the vehicles in the data collection process. It builds a situational-awareness picture from the observed spatial relationships, activities, and interim classifications, then uses heuristics to generate new information-gathering goals and to recommend which actions the vehicles should take to better achieve these goals. The system uses these recommendations to collaboratively help the operator direct the autonomous assets to individuals or places in the environment so as to maximize the effectiveness of evidence collection. The LR is based on the Soar Cognitive Architecture, which excels at supporting Human-AI collaboration. The described DoD-sponsored system has been developed and extensively tested for over three years, in simulation and in the field (with role-players). Results of these experiments have demonstrated that the LR's decision support contributes to automated data collection and to the overall classification accuracy of the Human-AI team. This paper describes the development and evaluation of the LR based on multiple test events.
The research reported in this document was performed under Defense Advanced Research Projects Agency (DARPA) contract #HR001120C0180, Urban Reconnaissance through Supervised Autonomy (URSA). The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. Many thanks to Robert Marinier and Kris Kearns for their assistance in the preparation of this manuscript, as well as to the entire ISOLATE R&D team.
Distribution Statement “A” (Approved for Public Release, Distribution Unlimited)
{"title":"Automated Decision Support for Collaborative, Interactive Classification","authors":"Randolph M. Jones, Robert Bixler, Robert P. Marinier, Lilia V. Moshkina","doi":"10.54941/ahfe1003269","DOIUrl":"https://doi.org/10.54941/ahfe1003269","url":null,"abstract":"Traditional classification approaches are straightforward: collect data, apply classification algorithms, then generate classification results. However, such approaches depend on data being amply available, which is not always the case. This paper describes an approach to maximize the utility of collected data through intelligent guidance of the data collection process. We present the development and evaluation of a knowledge-based decision-support system: the Logical Reasoner (LR), which guides data collection by unmanned ground and air assets to improve behavior classification. The LR is a component of a Human Directed and Controlled AI system (or “Human-AI” system) aimed at semi-autonomous classification of potential threat and non-threat individuals in a complex urban setting. The setting provides little to no pre-existing data; thus, the system collects, analyzes, and evaluates real-time human behavior data to determine whether the observed behavior is indicative of threat intent. The LR’s purpose is to produce contextual knowledge to help make productive decisions about where, when, and how to guide the vehicles in the data collection process. It builds a situational-awareness picture from the observed spatial relationships, activities, and interim classifications, then uses heuristics to generate new information-gathering goals, as well as to recommend which actions the vehicles should take to better achieve these goals. The system uses these recommendations to collaboratively help the operator direct the autonomous assets to individuals or places in the environment to maximize the effectiveness of evidence collection. LR is based on the Soar Cognitive Architecture which excels in supporting Human-AI collaboration. The described DoD-sponsored system has been developed and extensively tested for over three years, in simulation and in the field (with role-players). Results of these experiments have demonstrated that the LR decision support contributes to automated data collection and overall classification accuracy by the Human-AI team. This paper describes the development and evaluation of the LR based on multiple test events.The research reported in this document was performed under Defense Advanced Research Projects Agency (DARPA) contract #HR001120C0180, Urban Reconnaissance through Supervised Autonomy (URSA). The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. 
Many thanks to Robert Marinier and Kris Kearns for their assistance in the preparation of this manuscript, as well as the entire ISOLATE R&D team.Distribution Statement “A” (Approved for Public Release, Distribution Unlimited)","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114895713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
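The abstract above describes the LR's heuristic generation of information-gathering goals only at a high level; the actual rules are implemented in Soar and are not published here. As a purely hypothetical illustration of that idea, the following Python sketch derives prioritized goals from interim classifications; every name and threshold in it is an assumption, not the LR's method:

```python
# Illustrative sketch only: Track, generate_goals, and all thresholds are
# hypothetical, showing the general shape of heuristic goal generation from
# an evolving situational-awareness picture.
from dataclasses import dataclass

@dataclass
class Track:
    track_id: str
    threat_score: float           # interim classification, 0..1
    classifier_confidence: float  # confidence in that classification, 0..1
    seconds_since_observed: float

def generate_goals(tracks, low_conf=0.5, stale_after=120.0):
    """Propose information-gathering goals, most urgent first."""
    goals = []
    for t in tracks:
        # Heuristic 1: ambiguous classifications deserve more evidence.
        if t.classifier_confidence < low_conf:
            goals.append((1.0 - t.classifier_confidence,
                          f"observe {t.track_id} to raise confidence"))
        # Heuristic 2: re-acquire tracks that have gone stale.
        if t.seconds_since_observed > stale_after:
            goals.append((0.5, f"re-acquire {t.track_id} "
                               f"(last seen {t.seconds_since_observed:.0f}s ago)"))
        # Heuristic 3: keep eyes on high interim threat scores.
        if t.threat_score > 0.7:
            goals.append((t.threat_score,
                          f"maintain persistent watch on {t.track_id}"))
    return [g for _, g in sorted(goals, reverse=True)]

print(generate_goals([Track("p-12", 0.8, 0.4, 30.0),
                      Track("p-07", 0.2, 0.9, 300.0)]))
```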
Improving Common Ground in Human-Machine Teaming: Dimensions, Gaps, and Priorities
Robert Wray, James R. Kirk, J. Folsom-Kovarik
“Common ground” is the knowledge, facts, beliefs, etc. that are shared between participants in some joint activity. Much of human conversation concerns “grounding,” or ensuring that some assertion is actually shared between participants. Even for highly trained tasks, such as teammates executing a military mission, each participant devotes attention to contributing new assertions, making adjustments based on the statements of others, offering repairs to resolve potential discrepancies in the common ground, and so forth. In conversational interactions between humans and machines (or “agents”), this activity of building and maintaining common ground is typically one-sided and fixed. It is one-sided because the human must do almost all the work of creating substantive common ground in the interaction. It is fixed because the agent does not adapt its understanding to what the human knows, prefers, and expects; instead, the human must adapt to the agent. These limitations create burdensome cognitive demand, result in frustration and distrust of automation, and make the notion of an agent “teammate” seem an ambition far from reachable in today’s state of the art. We seek to enable agents to partner more fully in building and maintaining common ground, and to enable them to adapt their understanding of a joint activity. While “common ground” is often called out as a gap in human-machine teaming, there is no extant, detailed analysis of the components of common ground together with a mapping of these components to specific classes of functions (what specific agent capabilities are required to achieve common ground?) and deficits (what kinds of errors may arise when the functions are insufficient for a particular component of the common ground?). In this paper, we provide such an analysis, focusing on the requirements for human-machine teaming in a military context where interactions are task-oriented and generally well-trained. Drawing on the literature of human communication, we identify the components of information included in common ground along three main axes: the temporal dimension of common ground, personal common ground, and communal common ground. The analysis further subdivides these distinctions, differentiating between aspects of the common ground such as personal history between participants, norms and the expectations based on those norms, and the extent to which actions taken by participants in a human-machine interaction context are “public” events or not. Within each dimension, we also provide examples of specific issues that may arise from a lack of common ground along that dimension. The analysis thus defines, at a more granular level than existing analyses, how specific categories of deficits in shared knowledge or processing differences manifest as misalignment in shared understanding. The paper both identifies specific challenges and prioritizes them according to acuteness of need; in other words, not all of the identified gaps are equally pressing.
{"title":"Improving Common Ground in Human-Machine Teaming: Dimensions, Gaps, and Priorities","authors":"Robert Wray, James R. Kirk, J. Folsom-Kovarik","doi":"10.54941/ahfe1001463","DOIUrl":"https://doi.org/10.54941/ahfe1001463","url":null,"abstract":"“Common ground” is the knowledge, facts, beliefs, etc. that are shared between participants in some joint activity. Much of human conversation concerns “grounding,” or ensuring that some assertion is actually shared between participants. Even for highly trained tasks, such teammates executing a military mission, each participant devotes attention to contributing new assertions, making adjustments based on the statements of others, offering potential repairs to resolve potential discrepancies in the common ground and so forth.In conversational interactions between humans and machines (or “agents”), this activity to build and to maintain a common ground is typically one-sided and fixed. It is one-sided because the human must do almost all the work of creating substantive common ground in the interaction. It is fixed because the agent does not adapt its understanding to what the human knows, prefers, and expects. Instead, the human must adapt to the agent. These limitations create burdensome cognitive demand, result in frustration and distrust in automation, and make the notion of an agent “teammate” seem an ambition far from reachable in today’s state-of-art. We are seeking to enable agents to more fully partner in building and maintaining common ground as well as to enable them to adapt their understanding of a joint activity. While “common ground” is often called out as a gap in human-machine teaming, there is not an extant, detailed analysis of the components of common ground and a mapping of these components to specific classes of functions (what specific agent capabilities is required to achieve common ground?) and deficits (what kinds of errors may arise when the functions are insufficient for a particular component of the common ground?). In this paper, we provide such an analysis, focusing on the requirements for human-machine teaming in a military context where interactions are task-oriented and generally well-trained.Drawing on the literature of human communication, we identify the components of information included in common ground. We identify three main axes: the temporal dimension of common ground and personal and communal common ground. The analysis further subdivides these distinctions, differentiating between aspects of the common ground such as personal history between participants, norms and expectations based on those norms, and the extent to which actions taken by participants in a human-machine interaction context are “public” events or not. Within each dimension, we also provide examples of specific issues that may arise due to problems due to lack of common ground related to a specific dimension. The analysis thus defines, at a more granular level than existing analyses, how specific categories of deficits in shared knowledge or processing differences manifests in misalignment in shared understanding. The paper both identifies specific challenges and prioritizes them according to acuteness of need. 
In other words, not all of","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129558305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Artificial Intelligence in the aviation decision-making process: The transition from extended Minimum Crew Operations to Single Pilot Operations (SiPO)
Dimitrios Ziakkas, Anastasios Plioutsias, K. Pechlivanis
Innovation, management of change, and the implementation of human factors in flight operations characterize the aviation industry. The International Air Transport Association (IATA) Technology Roadmap (IATA, 2019) and the European Aviation Safety Agency (EASA) Artificial Intelligence (A.I.) roadmap propose an outline and assessment of ongoing technology prospects that are changing the aviation environment through the implementation of A.I. and the introduction of extended Minimum Crew Operations (eMCO) and Single Pilot Operations (SiPO). Changes in workload will affect human performance and the decision-making process. The research adopted the widely accepted definition of A.I. as “any technology that appears to emulate the performance of a human” (EASA, 2020). A review of the existing literature on Direct Voice Input (DVI) applications structured the A.I. aviation decision-making research themes in cockpit design and users' perception and experience. Interviews with Subject Matter Experts (human factors analysts, A.I. analysts, airline managers, examiners, instructors, qualified pilots, and pilots under training) and questionnaires (disseminated to a group of professional pilots and pilots under training) examined A.I. implementation in cockpit design and operations. The results were analyzed to evaluate the suitability of, and significant differences between, eMCO and SiPO from a decision-making perspective.
Keywords: Artificial Intelligence (A.I.), Extended Minimum Crew Operations (eMCO), Single Pilot Operations (SiPO), cockpit design, ergonomics, decision making
{"title":"Artificial Intelligence in aviation decision making process.The transition from extended Minimum Crew Operations to Single Pilot Operations (SiPO)","authors":"Dimitrios Ziakkas, Anastasios Plioutsias, K. Pechlivanis","doi":"10.54941/ahfe1001452","DOIUrl":"https://doi.org/10.54941/ahfe1001452","url":null,"abstract":"Innovation, management of change, and human factors implementation in-flight operations portray the aviation industry. The International Air Transportation Authority (IATA) Technology Roadmap (IATA, 2019) and European Aviation Safety Agency (EASA) Artificial Intelligence (A.I.) roadmap propose an outline and assessment of ongoing technology prospects, which change the aviation environment with the implementation of A.I. and introduction of extended Minimum Crew Operations (eMCO) and Single Pilot Operations (SiPO). Changes in the workload will affect human performance and the decision-making process. The research accepted the universally established definition in the A.I. approach of “any technology that appears to emulate the performance of a human” (EASA, 2020). A review of the existing literature on Direct Voice Inputs (DVI) applications structured A.I. aviation decision-making research themes in cockpit design and users’ perception - experience. Interviews with Subject Matter Experts (Human Factors analysts, A.I. analysts, airline managers, examiners, instructors, qualified pilots, pilots under training) and questionnaires (disseminated to a group of professional pilots and pilots under training) examined A.I. implementation in cockpit design and operations. Results were analyzed and evaluated the suitability and significant differences of e-MCO and SiPO under the decision-making aspect.Keywords: Artificial Intelligence (A.I.), Extended Minimum Crew Operations (e-MCO), Single Pilot Operations (SiPO), cockpit design, ergonomics, decision making.","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128770263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis of citizens' sentiment towards the Philippine administration's interventions against COVID-19
Matthew John Sino Cruz, M. D. De Leon
The COVID-19 pandemic affected the world. The World Health Organization (WHO) issued guidelines the public must follow to prevent the spread of the disease, including social distancing, the wearing of face masks, and regular washing of hands. These guidelines served as the basis for the policies formulated by countries affected by the pandemic. In the Philippines, the government implemented different initiatives, following the WHO guidelines, aimed at mitigating the effects of the pandemic in the country. The initiatives formulated by the administration include international and domestic travel restrictions, community quarantine, suspension of face-to-face classes and work arrangements, and a phased reopening of the Philippine economy, to name a few. These initiatives elicited varying reactions from citizens, who expressed them on social media platforms such as Twitter and Facebook. The reactions expressed on these platforms were used to analyze citizens' sentiment towards the initiatives implemented by the government during the pandemic. In this study, a hybrid Bidirectional Recurrent Neural Network-Long Short-Term Memory-Support Vector Machine (BRNN-LSTM-SVM) sentiment classifier was used to determine the sentiments of the Philippine public towards the initiatives of the Philippine government to mitigate the effects of the COVID-19 pandemic. The dataset was collected and extracted from Facebook and Twitter using APIs and www.exportcomments.com, covering March 2020 to August 2020. Twenty-five percent of the dataset was manually annotated by two human annotators, and the annotated portion was used to build a COVID-19 context-based sentiment lexicon, which was later used to determine the polarity of each document. Since the dataset contained unstructured and noisy data, preprocessing steps such as conversion to lowercase, removal of stopwords, removal of usernames and pure-digit tokens, and translation into English were performed. The preprocessed dataset was vectorized using GloVe word embeddings and used to train and test the proposed model. The performance of the hybrid BRNN-LSTM-SVM model was compared to BRNN-LSTM and SVM in experiments on the preprocessed dataset. The results show that the hybrid BRNN-LSTM-SVM model, which attained 95% accuracy on the Facebook dataset and 93% on the Twitter dataset, outperformed the Support Vector Machine (SVM) sentiment model, whose accuracy ranged only from 89% to 91% on both datasets. The results indicate that citizens harbor negative sentiments towards the government's initiatives to mitigate the effects of the COVID-19 pandemic. The results of the study may be used in reviewing the initiatives imposed during the pandemic to determine the issues that most concern the citizens.
{"title":"Analysis of citizen's sentiment towards Philippine administration's intervention against COVID-19","authors":"Matthew John Sino Cruz, M. D. De Leon","doi":"10.54941/ahfe1001446","DOIUrl":"https://doi.org/10.54941/ahfe1001446","url":null,"abstract":"The COVID-19 pandemic affected the world. The World Health Organization or WHO issued guidelines the public must follow to prevent the spread of the disease. This includes social distancing, the wearing of facemasks, and regular washing of hands. These guidelines served as the basis for formulating policies by countries affected by the pandemic. In the Philippines, the government implemented different initiatives, following the guidelines of WHO, that aimed to mitigate the effect of the pandemic in the country. Some of the initiatives formulated by the administration include international and domestic travel restrictions, community quarantine, suspension of face-to-face classes and work arrangements, and phased reopening of the Philippine economy to name a few. The initiatives implemented by the government during the surge of COVID-19 disease have resulted in varying reactions from the citizens. The citizens expressed their reactions to these initiatives using different social media platforms such as Twitter and Facebook. The reactions expressed using these social media platforms were used to analyze the sentiment of the citizens towards the initiatives implemented by the government during the pandemic. In this study, a Bidirectional Recurrent Neural Network-Long Short-term memory - Support Vector Machine (BRNN-LSTM-SVM) hybrid sentiment classifier model was used to determine the sentiments of the Philippine public toward the initiatives of the Philippine government to mitigate the effects of the COVID-19 pandemic. The dataset used was collected and extracted from Facebook and Twitter using API and www.exportcomments.com from March 2020 to August 2020. 25% of the dataset was manually annotated by two human annotators. The manually annotated dataset was used to build the COVID-19 context-based sentiment lexicon, which was later used to determine the polarity of each document. Since the dataset contained unstructured and noisy data, preprocessing activities such as conversion to lowercase characters, removal of stopwords, removal of usernames and pure digit texts, and translation to the English language were performed. The preprocessed dataset was vectorized using Glove word embedding and was used to train and test the performance of the proposed model. The performance of the Hybrid BRNN-LSTM-SVM model was compared to BRNN-LSTM and SVM by performing experiments using the preprocessed dataset. The results show that the Hybrid BRNN-LSTM-SVM model, which gained 95% accuracy for the Facebook dataset and 93% accuracy for the Twitter dataset, outperformed the Support Vector Machine (SVM) sentiment model whose accuracy only ranges from 89% to 91% for both datasets. The results indicate that the citizens harbor negative sentiments towards the initiatives of the government in mitigating the effect of the COVID-19 pandemic. 
The results of the study may be used in reviewing the initiatives imposed during the pandemic to determine the issues which concern the ","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"241 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124653833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
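A minimal sketch of the hybrid classifier idea described above, assuming a bidirectional LSTM encoder whose learned representation is handed to an SVM for the final decision; the layer sizes, the 100-dimensional GloVe choice, and all hyperparameters are assumptions rather than the paper's settings:

```python
# Hybrid BRNN-LSTM-SVM sketch: train a BiLSTM end-to-end, then reuse its
# penultimate representation as features for an SVM. Data here is random
# stand-in for tokenized, lexicon-labeled posts.
import numpy as np
from tensorflow import keras
from sklearn.svm import SVC

VOCAB, MAXLEN, EMB_DIM = 20000, 50, 100

inputs = keras.Input(shape=(MAXLEN,), dtype="int32")
x = keras.layers.Embedding(VOCAB, EMB_DIM)(inputs)        # GloVe weights would go here
x = keras.layers.Bidirectional(keras.layers.LSTM(64))(x)  # BRNN-LSTM encoder
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")

X = np.random.randint(0, VOCAB, size=(256, MAXLEN))  # stand-in tokenized posts
y = np.random.randint(0, 2, size=256)                # stand-in lexicon polarities
model.fit(X, y, epochs=1, batch_size=32, verbose=0)

# Hybrid step: feed the BiLSTM's learned representation to an SVM.
encoder = keras.Model(inputs, model.layers[-2].output)
svm = SVC(kernel="rbf").fit(encoder.predict(X, verbose=0), y)
```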
Detecting Potential Depressed Users in Twitter Using a Fine-tuned DistilBERT Model
Miguel Antonio Adarlo, M. D. De Leon
With the spread of Major Depressive Disorder, otherwise known simply as depression, around the world, various efforts have been made to combat it and to reach out to those suffering from it. Part of those efforts includes the use of technology, such as machine learning models, to screen a person for potential depression through various means, including social media narratives such as tweets from Twitter. Hence, this study aims to evaluate how well a pre-trained DistilBERT (a transformer model for natural language processing), fine-tuned on a set of tweets from depressed and non-depressed users, can detect Twitter users as potentially having depression. Two models were built using the same procedure of preprocessing, splitting, tokenizing, training, fine-tuning, and optimizing. Both the Base Model (trained on the CLPsych 2015 dataset) and the Mixed Model (trained on the CLPsych 2015 dataset plus half of a dataset of scraped tweets) could detect potential depression among Twitter users more than half of the time, achieving Area Under the Receiver Operating Characteristic curve (AUC) scores of 65% and 63%, respectively, when evaluated on the test dataset. The models performed comparably in identifying potentially depressed users: there was no significant difference in their AUC scores under a z-test at the 0.05 level of significance (p = 0.21). These results suggest that DistilBERT, when fine-tuned, may be used to detect potential depression among Twitter users.
{"title":"Detecting Potential Depressed Users in Twitter Using a Fine-tuned DistilBERT Model","authors":"Miguel Antonio Adarlo, M. D. De Leon","doi":"10.54941/ahfe1001458","DOIUrl":"https://doi.org/10.54941/ahfe1001458","url":null,"abstract":"With the spread of Major Depressive Disorder, otherwise known simply as depression, around the world, various efforts have been made to combat it and to potentially reach out to those suffering from it. Part of those efforts includes the use of technology, such as machine learning models, to screen a potential person for depression through various means, including social media narratives, such as tweets from Twitter. Hence, this study aims to evaluate how well a pre-trained DistilBERT, a transformer model for natural language processing that was fine-tuned on a set of tweets coming from depressed and non-depressed users, can detect potential users in Twitter as having depression. Two models were built using the same procedure of preprocessing, splitting, tokenizing, training, fine-tuning, and optimizing. Both the Base Model (trained on CLPsych 2015 Dataset) and the Mixed Model (trained on the CLPsych 2015 Dataset and a half of the dataset of scraped tweets) could detect potential users in Twitter for depression more than half of the time by demonstrating an Area under the Receiver Operating Curve (AUC) score of 65% and 63%, respectively, when evaluated using the test dataset. These models performed comparably in identifying potential depressed users in Twitter given that there was no significant difference in their AUC scores when subjected to a z-test at 95% confidence interval and 0.05 level of significance (p = 0.21). These results suggest DistilBERT, when fine-tuned, may be used to detect potential users in Twitter for depression.","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133532843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamically monitoring crowd-workers' reliability with interval-valued labels
Chenyi Hu, Makenzie Spurling
Crowdsourcing has rapidly become a computing paradigm in machine learning and artificial intelligence. In crowdsourcing, multiple labels are collected from crowd-workers on an instance, usually through the Internet. These labels are then aggregated into a single label intended to match the ground truth of the instance. Due to its open nature, human workers in crowdsourcing come with various levels of knowledge and socio-economic backgrounds, and effectively handling such human factors has been a focus in the study and applications of crowdsourcing. For example, Bi et al. studied the impacts of workers' dedication, expertise, judgment, and task difficulty (Bi et al., 2014). Qiu et al. offered methods for selecting workers based on behavior prediction (Qiu et al., 2016). Barbosa and Chen suggested rehumanizing crowdsourcing to deal with human biases (Barbosa and Chen, 2019). Checco et al. studied adversarial attacks on crowdsourcing for quality control (Checco et al., 2020). Many more related works are available in the literature. In contrast to commonly used binary-valued labels, interval-valued labels (IVLs) have been introduced very recently (Hu et al., 2021). Applying statistical and probabilistic properties of interval-valued datasets, Spurling et al. quantitatively defined a worker's reliability in four measures: correctness, confidence, stability, and predictability (Spurling et al., 2021). Calculating these measures, except correctness, does not require the ground truth of each instance but only the worker's IVLs. Applying these quantified reliability measures has significantly improved the overall quality of crowdsourcing (Spurling et al., 2022). However, in real-world applications, the reliability of a worker may vary from time to time rather than remaining constant, so it is necessary to monitor a worker's reliability dynamically. Because a worker j labels instances sequentially, we treat j's IVLs as an interval-valued time series. Assuming j's reliability depends only on the IVLs within a time window, we calculate j's reliability measures from the IVLs in the current window; moving the window forward with our proposed practical strategies, we can monitor j's reliability dynamically. Furthermore, the four reliability measures derived from IVLs are time-varying too; with regression analysis, we can separate each reliability measure into an explainable trend and possible errors. To validate our approaches, we use four real-world benchmark datasets in our computational experiments. The main findings are as follows. The reliability-weighted interval majority voting (WIMV) and weighted preferred matching probability (WPMP) schemes consistently outperform the base schemes, majority voting (MV), interval majority voting (IMV), and preferred matching probability (PMP), in terms of much higher accuracy, precision, recall, and F1-score. Through monitoring workers' reliability, our computational experiments have successfully identified possible attackers; by removing the identified attackers, we ensure the overall quality. We also studied the impact of the choice of window size. Dynamically monitoring workers' reliability is necessary, and the computational results demonstrate the potential of the proposed approaches. This research is partially supported by the National Science Foundation under grant NSF/OIA-1946391.
{"title":"Dynamically monitoring crowd-worker's reliability with interval-valued labels","authors":"Chenyi Hu, Makenzie Spurling","doi":"10.54941/ahfe1003270","DOIUrl":"https://doi.org/10.54941/ahfe1003270","url":null,"abstract":"Crowdsourcing has rapidly become a computing paradigm in machine learning and artificial intelligence. In crowdsourcing, multiple labels are collected from crowd-workers on an instance usually through the Internet. These labels are then aggregated as a single label to match the ground truth of the instance. Due to its open nature, human workers in crowdsourcing usually come with various levels of knowledge and socio-economic backgrounds. Effectively handling such human factors has been a focus in the study and applications of crowdsourcing. For example, Bi et al studied the impacts of worker's dedication, expertise, judgment, and task difficulty (Bi et al 2014). Qiu et al offered methods for selecting workers based on behavior prediction (Qiu et al 2016). Barbosa and Chen suggested rehumanizing crowdsourcing to deal with human biases (Barbosa 2019). Checco et al studied adversarial attacks on crowdsourcing for quality control (Checco et al 2020). There are many more related works available in literature. In contrast to commonly used binary-valued labels, interval-valued labels (IVLs) have been introduced very recently (Hu et al 2021). Applying statistical and probabilistic properties of interval-valued datasets, Spurling et al quantitatively defined worker's reliability in four measures: correctness, confidence, stability, and predictability (Spurling et al 2021). Calculating these measures, except correctness, does not require the ground truth of each instance but only worker’s IVLs. Applying these quantified reliability measures, people have significantly improved the overall quality of crowdsourcing (Spurling et al 2022). However, in real world applications, the reliability of a worker may vary from time to time rather than a constant. It is necessary to monitor worker’s reliability dynamically. Because a worker j labels instances sequentially, we treat j’s IVLs as an interval-valued time series in our approach. Assuming j’s reliability relies on the IVLs within a time window only, we calculate j’s reliability measures with the IVLs in the current time window. Moving the time window forward with our proposed practical strategies, we can monitor j’s reliability dynamically. Furthermore, the four reliability measures derived from IVLs are time varying too. With regression analysis, we can separate each reliability measure as an explainable trend and possible errors. To validate our approaches, we use four real world benchmark datasets in our computational experiments. Here are the main findings. The reliability weighted interval majority voting (WIMV) and weighted preferred matching probability (WPMP) schemes consistently overperform the base schemes in terms of much higher accuracy, precision, recall, and F1-score. Note: the base schemes are majority voting (MV), interval majority voting (IMV), and preferred matching probability (PMP). 
Through monitoring worker’s reliability, our computational experiments have successfully identified possible a","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115704897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
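The windowing mechanism described above can be sketched as follows; note that the two reliability measures computed here (confidence as interval narrowness, stability as steadiness of interval midpoints) are simplified stand-ins, not the formal definitions of Spurling et al. (2021):

```python
# Sliding-window monitoring over a worker's interval-valued labels (IVLs).
from collections import deque

class ReliabilityMonitor:
    def __init__(self, window_size=50):
        self.window = deque(maxlen=window_size)  # old IVLs drop out automatically

    def add_label(self, low, high):
        """Record worker j's next IVL [low, high] in sequence order."""
        self.window.append((low, high))

    def measures(self):
        n = len(self.window)
        widths = [h - l for l, h in self.window]
        mids = [(h + l) / 2 for l, h in self.window]
        mean_mid = sum(mids) / n
        return {
            # Narrow intervals read as confident labeling (stand-in formula).
            "confidence": 1.0 - sum(widths) / n,
            # Low dispersion of midpoints reads as stable labeling (stand-in).
            "stability": 1.0 - (sum((m - mean_mid) ** 2 for m in mids) / n) ** 0.5,
        }

mon = ReliabilityMonitor(window_size=3)
for ivl in [(0.6, 0.9), (0.7, 0.8), (0.1, 1.0), (0.65, 0.85)]:
    mon.add_label(*ivl)
print(mon.measures())  # computed over the 3 most recent IVLs only
```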
A machine learning approach for optimizing waiting times in a hand surgery operation center
A. Schuller, M. Braun, Peter Hahn
For patients scheduled for surgery, long waiting times are unpleasant. However, scheduling that is too patient-oriented can lead to friction losses in the operating room and waiting times for the medical personnel. We analyzed historical hand surgery data to improve forecasts of hand surgery durations, optimize operating room scheduling for physicians and patients, and reduce overall waiting times. Several models were evaluated for forecasting surgery durations. A quantile-based approach built on the distribution of surgery durations was tested in a scheduling simulation; it indicated possibilities for gradually balancing waiting times between patients and medical personnel. Within a field trial, a trained regression model was successfully deployed in a hand surgery operation center.
{"title":"A machine learning approach for optimizing waiting times in a hand surgery operation center","authors":"A. Schuller, M. Braun, Peter Hahn","doi":"10.54941/ahfe1003268","DOIUrl":"https://doi.org/10.54941/ahfe1003268","url":null,"abstract":"For patients scheduled for surgery, long waiting times are unpleasant. However, scheduling that is too patient-oriented can lead to friction losses in the operating room and waiting times for the medical personnel. We have conducted an analysis of historical hand surgery data to improve forecasting of hand surgery durations, optimize operation room scheduling for physicians and patients and reduce overall waiting times. Several models have been evaluated to forecast surgery durations. A quantile-based approach based on the distribution of surgery durations has been tested in a scheduling simulation. This approach has indicated possibilities to gradually balance waiting times between patients and medical personnel. Within a field trial, a trained regression model has been successfully deployed in a hand surgery operation center.","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114648080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Supradyadic Trust in Artificial Intelligence
Stephen L. Dorton
There is a considerable body of research on trust in Artificial Intelligence (AI). Trust has been viewed almost exclusively as a dyadic construct, where it is a function of various factors between the user and the agent, mediated by the context of the environment. A recent study found several cases of supradyadic trust interactions, where a user's trust in the AI is affected by how other people interact with the agent, above and beyond endorsements or reputation. An analysis of these supradyadic interactions is presented, along with a discussion of practical considerations for AI developers and an argument for more complex representations of trust in AI.
{"title":"Supradyadic Trust in Artificial Intelligence","authors":"Stephen L. Dorton","doi":"10.54941/ahfe1001451","DOIUrl":"https://doi.org/10.54941/ahfe1001451","url":null,"abstract":"There is a considerable body of research on trust in Artificial Intelligence (AI). Trust has been viewed almost exclusively as a dyadic construct, where it is a function of various factors between the user and the agent, mediated by the context of the environment. A recent study has found several cases of supradyadic trust interactions, where a user’s trust in the AI is affected by how other people interact with the agent, above and beyond endorsements or reputation. An analysis of these surpradyadic interactions is presented, along with a discussion of practical considerations for AI developers, and an argument for more complex representations of trust in AI.","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130715313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Labeling of Human Actions by Skeleton Clustering and Fuzzy Similarity
Chao-Lung Yang, Shang-Che Hsu, Simi Wang, Jing-Feng Nian
Human action recognition (HAR) is now applied in multiple fields, owing to the rapid growth of artificial intelligence and machine learning. Applying HAR to industrial production lines can help visualize and analyze the correlation between human operators and machine utilization to improve overall productivity. However, training a HAR model requires manually labeling certain actions in a large amount of collected video data, which is very costly. How to label a large amount of video automatically is an emerging practical problem in the HAR research domain. This research proposes an automatic labeling framework that integrates Dynamic Time Warping (DTW), human skeleton clustering, and fuzzy similarity to assign labels based on pre-defined human actions. First, a skeleton estimation method such as OpenPose is used to detect the joint key points of the human operator's skeleton. Then, the skeleton data is converted to spatial-temporal data for calculating the DTW distance between skeletons. Groups of human skeletons can be clustered based on the DTW distances among them. Within a group, undefined skeletons are compared with the pre-defined skeletons, which serve as references, and labels are assigned according to similarity against those references. The experimental dataset was created by simulating the human actions of manual drilling operations. Comparison with manually labeled data shows that the proposed labeling model achieves up to 95% in accuracy, precision, recall, and F1 while reducing labeling time by 40%.
{"title":"Automatic Labeling of Human Actions by Skeleton Clustering and Fuzzy Similarity","authors":"Chao-Lung Yang, Shang-Che Hsu, Simi Wang, Jing-Feng Nian","doi":"10.54941/ahfe1001457","DOIUrl":"https://doi.org/10.54941/ahfe1001457","url":null,"abstract":"Nowadays, human action recognition (HAR) has been applied in multiple fields with the rapid growth of artificial intelligence and machine learning. Applying HAR onto industrial production lines can help on visualizing and analyzing the correlation between human operators and machine utilization to improve overall productivity. However, to train HAR model, the manual labeling of certain actions in a large amount of the collected video data is required and very costly. How to label a large amount of video automatically is an emerging practical problem in HAR research domain. This research proposed an automatic labeling framework by integrating Dynamic Time Warping (DTW), human skeleton clustering, and Fuzzy similarity to assign the labels based on the pre-defined human actions. First, the skeleton estimation method such as OpenPose was used to jointly detect key points of the human operator’s skeleton. Then, the skeleton data was converted to spatial-temporal data for calculating the DTW distance between skeletons. The groups of human skeletons can be clustered based on DTW distance among skeletons. Within a group of skeletons, the undefined skeletons will be compared with the pre-defined skeletons, considered as the references, and the labels are assigned according to the similarity against the references. The experimental dataset was created by simulating the human actions of manual drilling operations. By comparing with the manual labeled data, the results show that all of accuracy, precision, recall, and F1 of the proposed labeling model can achieve up to 95% with 40% saving time.","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130820055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generating a Multimodal Dataset Using a Feature Extraction Toolkit for Wearable and Machine Learning: A pilot study
Edwin Marte Zorrilla, I. Villanueva, J. Husman, Matthew C. Graham
Studies of stress and student performance with multimodal sensor measurements have been a recent topic of discussion among education researchers. With the advances in computational hardware and the use of machine learning strategies, scholars can now deal with data of high dimensionality and predict new estimates for future research designs. In this paper, the process of generating and obtaining a multimodal dataset that includes physiological measurements (e.g., electrodermal activity, EDA) from wearable devices is presented. Through the use of a feature generation toolkit for wearable data, the time to extract, clean, and generate the data was reduced. A machine learning model was developed from an openly available multimodal dataset, and the results were compared against previous studies to evaluate the utility of these approaches and tools.
Keywords: Engineering Education, Physiological Sensing, Student Performance, Machine Learning, Multimodal, FLIRT, WESAD
{"title":"Generating a Multimodal Dataset Using a Feature Extraction Toolkit for Wearable and Machine Learning: A pilot study","authors":"Edwin Marte Zorrilla, I. Villanueva, J. Husman, Matthew C. Graham","doi":"10.54941/ahfe1001448","DOIUrl":"https://doi.org/10.54941/ahfe1001448","url":null,"abstract":"Studies for stress and student performance with multimodal sensor measurements have been a recent topic of discussion among research educators. With the advances in computational hardware and the use of Machine learning strategies, scholars can now deal with data of high dimensionality and provide a way to predict new estimates for future research designs. In this paper, the process to generate and obtain a multimodal dataset including physiological measurements (e.g., electrodermal activity- EDA) from wearable devices is presented. Through the use of a Feature Generation Toolkit for Wearable Data, the time to extract clean and generate the data was reduced. A machine learning model from an openly available multimodal dataset was developed and results were compared against previous studies to evaluate the utility of these approaches and tools. Keywords: Engineering Education, Physiological Sensing, Student Performance, Machine Learning, Multimodal, FLIRT, WESAD","PeriodicalId":405313,"journal":{"name":"Artificial Intelligence and Social Computing","volume":"164 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120913185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}