Estimating mental load in passive and active tasks from pupil and gaze changes using Bayesian surprise
E. Wolf, Manuel Martínez, Alina Roitberg, R. Stiefelhagen, B. Deml
DOI: 10.1145/3279810.3279852
Eye-based monitoring has been suggested as a means of measuring mental load non-intrusively. In most cases, however, the experiments have been conducted in settings where the user is mainly passive. This constraint does not reflect applications in which we want to identify the mental load of an active user, e.g. during surgery. The main objective of our work is to investigate the potential of an eye-tracking device for measuring mental load in realistic active situations. In our first experiments we calibrate our setup against a well-established passive paradigm. There, we confirm that our setup can reliably recover pupil width in real time and that we can observe the previously reported relationship between pupil width and cognitive load; however, we also observe very high variance between test subjects. In a follow-up active-task experiment, neither pupil width nor eye gaze showed significant predictive power for workflow disruptions. To address this, we present an approach for estimating the likelihood of workflow disruptions during active fine-motor tasks. Our method combines the eye-based data with Bayesian surprise theory and is able to successfully predict the user's struggle, with correlations of 35% and 75%, respectively.
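The abstract does not spell out how the surprise signal is computed; as a minimal sketch of the general idea, assuming a Gaussian belief over pupil width with known observation noise (all parameter values and names below are illustrative, not the authors' implementation), Bayesian surprise can be taken as the KL divergence between the belief before and after each new pupil sample:

    import numpy as np

    def kl_gaussian(mu_p, var_p, mu_q, var_q):
        """KL(posterior || prior) between two univariate Gaussians."""
        return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

    def bayesian_surprise(pupil_widths_mm, prior_mu=3.5, prior_var=0.25, obs_var=0.05):
        """For each pupil-width sample, update a Gaussian belief about the current
        width (conjugate update, known observation noise) and report how far the
        belief moved, i.e. the Bayesian surprise of that sample."""
        mu, var = prior_mu, prior_var
        surprises = []
        for x in pupil_widths_mm:
            post_var = 1.0 / (1.0 / var + 1.0 / obs_var)
            post_mu = post_var * (mu / var + x / obs_var)
            surprises.append(kl_gaussian(post_mu, post_var, mu, var))
            mu, var = post_mu, post_var
        return np.array(surprises)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        widths = np.concatenate([rng.normal(3.5, 0.1, 200),   # calm baseline
                                 rng.normal(4.2, 0.1, 50)])   # sudden dilation
        s = bayesian_surprise(widths)
        print("mean surprise before / after the change:", s[:200].mean(), s[200:].mean())

Spikes in such a signal would then be candidate markers for workflow disruptions; how they are thresholded and correlated with observed disruptions is specific to the study.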
{"title":"Estimating mental load in passive and active tasks from pupil and gaze changes using bayesian surprise","authors":"E. Wolf, Manuel Martínez, Alina Roitberg, R. Stiefelhagen, B. Deml","doi":"10.1145/3279810.3279852","DOIUrl":"https://doi.org/10.1145/3279810.3279852","url":null,"abstract":"Eye-based monitoring has been suggested as a means to measure mental load in a non-intrusive way. In most cases, the experiments have been conducted in a setting where the user has been mainly passive. This constraint does not reflect applications where we want to identify mental load of an active user, e.g. during surgery. The main objective of our work is to investigate the potential of an eye tracking device for measuring the mental load in realistic active situations. In our first experiments we calibrate our setup by using a well established passive setup. There, we confirm that our setup can recover reliably eye width in real time, and we can observe the previously reported relationship between pupil width and cognitive load, however, we also observe a very high variance between different test subjects. In a follow up active task experiment, neither pupil width nor eye gaze showed a significant predictive power over workflow disruptions. To address this, we present an approach for estimating the likelihood of workflow disruptions during active fine-motor tasks. Our method combines the eye-based data with the Bayesian Surprise theory and is able to successfully predict user's struggle with correlations of 35% and 75% respectively.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127323803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Overlooking
Nora Castner, Solveig Klepper, Lena Kopnarski, F. Hüttig, C. Keutel, K. Scheiter, Juliane Richter, Thérése F. Eder, Enkelejda Kasneci
DOI: 10.1145/3279810.3279845
The cognitive processes that underlie expert decision making in medical image interpretation are crucial to understanding what constitutes optimal performance. Often, if an anomaly goes undetected, the exact nature of the false negative is not fully understood. This work looks at 24 experts' performance (true positives and false negatives) during an anomaly detection task on 13 images and the corresponding gaze behavior. Using a drawing paradigm and an eye-tracking paradigm, we compared experts' target anomaly detection in orthopantomographs (OPTs) against their own gaze behavior. We found a relationship between the number of anomalies detected and the number of anomalies looked at. However, roughly 70% of the anomalies that were not explicitly marked in the drawing paradigm were nonetheless looked at. We therefore examined how often each anomaly was glanced at. When not explicitly marked, target anomalies were most often glanced at only once or twice; in contrast, when targets were marked, the number of glances was higher. Since this behavior was not consistent across images, we attribute these differences to image complexity.
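As an illustration of the kind of gaze analysis described here, the sketch below counts a "glance" as an uninterrupted run of fixations inside an anomaly's region of interest; the data layout, ROI format, and this definition of a glance are assumptions made for the example, not the authors' exact procedure.

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class Fixation:
        x: float
        y: float

    def in_roi(fix: Fixation, roi: Tuple[float, float, float, float]) -> bool:
        """roi = (x_min, y_min, x_max, y_max) bounding box of an anomaly."""
        x0, y0, x1, y1 = roi
        return x0 <= fix.x <= x1 and y0 <= fix.y <= y1

    def count_glances(fixations: List[Fixation], roi) -> int:
        """A glance = an uninterrupted run of consecutive fixations inside the ROI."""
        glances, inside = 0, False
        for fix in fixations:
            hit = in_roi(fix, roi)
            if hit and not inside:
                glances += 1
            inside = hit
        return glances

    if __name__ == "__main__":
        scanpath = [Fixation(10, 10), Fixation(55, 60), Fixation(58, 62),
                    Fixation(200, 40), Fixation(56, 61)]
        anomaly_roi = (50, 50, 70, 70)
        print(count_glances(scanpath, anomaly_roi))  # -> 2 glances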
{"title":"Overlooking","authors":"Nora Castner, Solveig Klepper, Lena Kopnarski, F. Hüttig, C. Keutel, K. Scheiter, Juliane Richter, Thérése F. Eder, Enkelejda Kasneci","doi":"10.1145/3279810.3279845","DOIUrl":"https://doi.org/10.1145/3279810.3279845","url":null,"abstract":"The cognitive processes that underly expert decision making in medical image interpretation are crucial to the understanding of what constitutes optimal performance. Often, if an anomaly goes undetected, the exact nature of the false negative is not fully understood. This work looks at 24 experts' performance (true positives and false negatives) during an anomaly detection task for 13 images and the corresponding gaze behavior. By using a drawing and an eye-tracking experimental paradigm, we compared expert target anomaly detection in orthopantomographs (OPTs) against their own gaze behavior. We found there was a relationship between the number of anomalies detected and the anomalies looked at. However, roughly 70% of anomalies that were not explicitly marked in the drawing paradigm were looked at. Therefore, we looked how often an anomaly was glanced at. We found that when not explicitly marked, target anomalies were more often glanced at once or twice. In contrast, when targets were marked, the number of glances was higher. Furthermore, since this behavior was not similar over all images, we attribute these differences to image complexity.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"161 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128346414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal approach for cognitive task performance prediction from body postures, facial expressions and EEG signal
Ashwin Ramesh Babu, Akilesh Rajavenkatanarayanan, J. Brady, F. Makedon
DOI: 10.1145/3279810.3279849
Recent developments in computer vision and the emergence of wearable sensors have opened opportunities for advanced techniques that enable multimodal user assessment and personalized training, which are important in educational, industrial-training, and rehabilitation applications. They have also paved the way for assistive robots that accurately assess human cognitive and physical skills. Assessment and training cannot be generalized, as the requirements vary for every person and every application; the ability of a system to adapt to the individual's needs and performance is essential for its effectiveness. In this paper, the focus is on task performance prediction, an important parameter for personalization. Several research works address how to predict task performance from physiological and behavioral data. In this work, we follow a multimodal approach in which the system collects information from different modalities to predict performance based on (a) the user's emotional state recognized from facial expressions (behavioral data), (b) the user's emotional state recognized from body postures (behavioral data), and (c) task performance derived from EEG signals (physiological data) while the person performs a robot-based cognitive task. This multimodal combination of physiological and behavioral data produces the highest accuracy, 87.5%, outperforming prediction from any single modality. In particular, the approach is useful for finding associations between facial expressions, body postures, and brain signals while a person performs a cognitive task.
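The abstract does not specify the fusion scheme in detail; the sketch below shows one generic late-fusion pattern (one classifier per modality, a meta-classifier over their probability outputs) on synthetic stand-in features, so every feature matrix, dimension, and model choice here is an assumption.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    n = 200
    y = rng.integers(0, 2, n)                      # task-performance label (good / poor)

    # Hypothetical per-modality features (stand-ins for real extractors).
    modalities = {
        "face": rng.normal(y[:, None], 1.0, (n, 8)),   # facial-expression features
        "body": rng.normal(y[:, None], 1.5, (n, 6)),   # body-posture features
        "eeg":  rng.normal(y[:, None], 2.0, (n, 16)),  # EEG band-power features
    }
    train, test = slice(0, 150), slice(150, n)

    def stacked_probs(fitted, split):
        """Stack each modality classifier's positive-class probability into one matrix."""
        return np.column_stack([clf.predict_proba(modalities[m][split])[:, 1]
                                for m, clf in fitted.items()])

    # One classifier per modality, then a meta-classifier over their outputs.
    fitted = {m: SVC(probability=True).fit(X[train], y[train]) for m, X in modalities.items()}
    fusion = LogisticRegression().fit(stacked_probs(fitted, train), y[train])
    print("fused accuracy:", fusion.score(stacked_probs(fitted, test), y[test]))

Training the meta-classifier on the same split as the base classifiers, as done here for brevity, slightly leaks information; in practice a held-out or cross-validated stacking split would be used.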
{"title":"Multimodal approach for cognitive task performance prediction from body postures, facial expressions and EEG signal","authors":"Ashwin Ramesh Babu, Akilesh Rajavenkatanarayanan, J. Brady, F. Makedon","doi":"10.1145/3279810.3279849","DOIUrl":"https://doi.org/10.1145/3279810.3279849","url":null,"abstract":"Recent developments in computer vision and the emergence of wearable sensors have opened opportunities for the development of advanced and sophisticated techniques to enable multi-modal user assessment and personalized training which is important in educational, industrial training and rehabilitation applications. They have also paved way for the use of assistive robots to accurately assess human cognitive and physical skills. Assessment and training cannot be generalized as the requirement varies for every person and for every application. The ability of the system to adapt to the individual's needs and performance is essential for its effectiveness. In this paper, the focus is on task performance prediction which is an important parameter to consider for personalization. Several research works focus on how to predict task performance based on physiological and behavioral data. In this work, we follow a multi-modal approach where the system collects information from different modalities to predict performance based on (a) User's emotional state recognized from facial expressions(Behavioral data), (b) User's emotional state from body postures(Behavioral data) (c) task performance from EEG signals (Physiological data) while the person performs a robot-based cognitive task. This multi-modal approach of combining physiological data and behavioral data produces the highest accuracy of 87.5 percent, which outperforms the accuracy of prediction extracted from any single modality. In particular, this approach is useful in finding associations between facial expressions, body postures and brain signals while a person performs a cognitive task.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124367570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rule-based learning for eye movement type detection
Wolfgang Fuhl, Nora Castner, Enkelejda Kasneci
DOI: 10.1145/3279810.3279844
Eye movements hold information about human perception, intention, and cognitive state. Various algorithms have been proposed to identify and distinguish eye movements, particularly fixations, saccades, and smooth pursuits. A major drawback of existing algorithms is that they rely on accurate and constant sampling rates and error-free recordings, and, because they are designed to detect specific eye movements, they impede straightforward adaptation to new movement types such as microsaccades. We propose a novel rule-based machine learning approach to create detectors from annotated or simulated data. It is capable of learning diverse types of eye movements as well as automatically detecting pupil detection errors in the raw gaze data. Additionally, our approach can work with any sampling rate, even a fluctuating one. Our approach learns several interdependent thresholds together with the preceding type classifications and combines them automatically into sets of detectors. We evaluated our approach against state-of-the-art algorithms on publicly available datasets. Our approach is integrated into the newest version of EyeTrace, which can be downloaded at http://www.ti.uni-tuebingen.de/Eyetrace.1751.0.html.
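The paper learns its interdependent thresholds from annotated or simulated data; the sketch below is only a fixed-threshold, velocity-based stand-in that illustrates the sample-wise labeling and the timestamp-based handling of uneven sampling rates (the threshold values, units, and the NaN convention for pupil-detection errors are assumptions).

    import numpy as np

    def classify_samples(t, x, y, sacc_vel=240.0, pursuit_vel=20.0):
        """Label each gaze sample as 'fixation', 'pursuit', or 'saccade' from its
        point-to-point velocity (deg/s). Velocities use the actual timestamps, so
        an irregular sampling rate is handled naturally."""
        t, x, y = map(np.asarray, (t, x, y))
        dt = np.diff(t)
        vel = np.hypot(np.diff(x), np.diff(y)) / np.maximum(dt, 1e-6)
        vel = np.concatenate([[0.0], vel])            # first sample gets zero velocity
        labels = np.where(vel >= sacc_vel, "saccade",
                 np.where(vel >= pursuit_vel, "pursuit", "fixation"))
        labels[np.isnan(x) | np.isnan(y)] = "error"   # pupil-detection failures
        return labels

    if __name__ == "__main__":
        t = [0.000, 0.004, 0.009, 0.013, 0.021]       # seconds, uneven spacing
        x = [1.0, 1.1, 5.0, 5.1, np.nan]              # degrees; NaN = lost pupil
        y = [1.0, 1.0, 1.2, 1.2, np.nan]
        print(classify_samples(t, x, y))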
{"title":"Rule-based learning for eye movement type detection","authors":"Wolfgang Fuhl, Nora Castner, Enkelejda Kasneci","doi":"10.1145/3279810.3279844","DOIUrl":"https://doi.org/10.1145/3279810.3279844","url":null,"abstract":"Eye movements hold information about human perception, intention, and cognitive state. Various algorithms have been proposed to identify and distinguish eye movements, particularly fixations, saccades, and smooth pursuits. A major drawback of existing algorithms is that they rely on accurate and constant sampling rates, error free recordings, and impend straightforward adaptation to new movements, such as microsaccades, since they are designed for certain eye movement detection. We propose a novel rule-based machine learning approach to create detectors on annotated or simulated data. It is capable of learning diverse types of eye movements as well as automatically detecting pupil detection errors in the raw gaze data. Additionally, our approach is capable of using any sampling rate, even with fluctuations. Our approach consists of learning several interdependent thresholds and previous type classifications and combines them into sets of detectors automatically. We evaluated our approach against the state-of-the-art algorithms on publicly available datasets. Our approach is integrated in the newest version of EyeTrace which can be downloaded at http://www.ti.uni-tuebingen.de/Eyetrace.1751.0.html.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116430181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Discovering digital representations for remembered episodes from lifelog data
Bernd Dudzik, J. Broekens, Mark Antonius Neerincx, J. Olenick, C. Chang, S. Kozlowski, H. Hung
DOI: 10.1145/3279810.3279850
Combining self-reports in which individuals reflect on their thoughts and feelings (Experience Samples) with sensor data collected via ubiquitous monitoring can provide researchers and applications with detailed insights into human behavior and psychology. However, meaningfully associating these two sources of data with each other is difficult: while it is natural for human beings to reflect on their experience in terms of remembered episodes, it is an open challenge to retrace this subjective organization in sensor data that references objective time. Lifelogging is a specific approach to the ubiquitous monitoring of individuals that can help overcome this recollection gap: it strives to create a comprehensive timeline of semantic annotations reflecting the impressions of the monitored person from his or her own subjective point of view. In this paper, we describe a novel approach for processing such lifelogs to situate remembered experiences in an objective timeline. It involves computationally modeling individuals' memory processes to estimate segments within a lifelog that act as plausible digital representations of their recollections. We report on an empirical investigation in which we use our approach to discover plausible representations for remembered social interactions between participants in a longitudinal study. In particular, we describe an exploration of the behavior displayed by our model of memory processes in this setting. Finally, we explore the representations discovered in this study and discuss insights that might be gained from them.
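The abstract keeps the matching step at a conceptual level; as one toy way to make it concrete, the sketch below ranks annotated lifelog segments by their temporal overlap with the time window of a self-reported episode. The segment annotations, the overlap score, and all names here are illustrative assumptions, not the authors' memory model.

    from datetime import datetime, timedelta

    def overlap_seconds(a_start, a_end, b_start, b_end):
        """Length of the intersection of two time intervals, in seconds."""
        latest_start = max(a_start, b_start)
        earliest_end = min(a_end, b_end)
        return max((earliest_end - latest_start).total_seconds(), 0.0)

    def rank_segments(report_start, report_end, segments):
        """Rank annotated lifelog segments as candidate representations of a
        remembered episode by their temporal overlap with the self-report."""
        scored = [(overlap_seconds(report_start, report_end, s, e), label)
                  for (s, e, label) in segments]
        return sorted(scored, reverse=True)

    if __name__ == "__main__":
        day = datetime(2017, 9, 14)
        segments = [(day + timedelta(hours=9),  day + timedelta(hours=10), "team meeting"),
                    (day + timedelta(hours=10), day + timedelta(hours=12), "desk work"),
                    (day + timedelta(hours=12), day + timedelta(hours=13), "lunch with J.")]
        # Participant remembers "a conversation late in the morning".
        print(rank_segments(day + timedelta(hours=11, minutes=30),
                            day + timedelta(hours=12, minutes=30), segments))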
{"title":"Discovering digital representations for remembered episodes from lifelog data","authors":"Bernd Dudzik, J. Broekens, Mark Antonius Neerincx, J. Olenick, C. Chang, S. Kozlowski, H. Hung","doi":"10.1145/3279810.3279850","DOIUrl":"https://doi.org/10.1145/3279810.3279850","url":null,"abstract":"Combining self-reports in which individuals reflect on their thoughts and feelings (Experience Samples) with sensor data collected via ubiquitous monitoring can provide researchers and applications with detailed insights about human behavior and psychology. However, meaningfully associating these two sources of data with each other is difficult: while it is natural for human beings to reflect on their experience in terms of remembered episodes, it is an open challenge to retrace this subjective organization in sensor data referencing objective time. Lifelogging is a specific approach to the ubiquitous monitoring of individuals that can contribute to overcoming this recollection gap. It strives to create a comprehensive timeline of semantic annotations that reflect the impressions of the monitored person from his or her own subjective point-of-view. In this paper, we describe a novel approach for processing such lifelogs to situate remembered experiences in an objective timeline. It involves the computational modeling of individuals' memory processes to estimate segments within a lifelog acting as plausible digital representations for their recollections. We report about an empirical investigation in which we use our approach to discover plausible representations for remembered social interactions between participants in a longitudinal study. In particular, we describe an exploration of the behavior displayed by our model for memory processes in this setting. Finally, we explore the representations discovered for this study and discuss insights that might be gained from them.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129155589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimer: validating multimodal, cognitive data in the city: towards a model of how the urban environment influences streetscape users
Arlene Ducao, Ilias Koen, Zhiqi Guo
DOI: 10.1145/3279810.3279853
Multimer is a new technology that aims to provide a data-driven understanding of how humans cognitively and physically experience spatial environments. By multimodally measuring biosensor data to model how the built environment and its uses influence cognitive processes, Multimer aims to help space professionals such as architects, workplace strategists, and urban planners make better design interventions. Multimer is perhaps the first spatial technology that collects biosensor data, such as brainwave and heart rate data, and analyzes it with both spatiotemporal and neurophysiological tools. The Multimer mobile app can record data from several kinds of commonly available, inexpensive, wearable sensors, including EEG, ECG, pedometer, accelerometer, and gyroscope modules. The app also records user-entered information via its user interface and micro-surveys, and combines all of this data with the user's geolocation using GPS, beacons, and other location tools. Multimer's study platform displays all of this data in real time at the individual and aggregate levels. Multimer also validates the data by comparing the collected sensor and sentiment data in spatiotemporal contexts, and it integrates the collected data with other data sets, such as citizen reports, traffic data, and city amenities, to provide actionable insights towards the evaluation and redesign of sites and spaces. This report presents preliminary results from the data validation process of a Multimer study of 101 subjects in New York City from August to October 2017. Ultimately, the aim of this study is to prototype a replicable, scalable model of how the built environment and the movement of traffic influence the neurophysiological state of pedestrians, cyclists, and drivers.
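As a small illustration of the kind of data joining the report describes (biosensor streams attached to geolocation and aggregated spatially), the pandas sketch below matches each sensor reading to its nearest GPS fix and averages readings per coarse grid cell; all column names, tolerances, and values are assumptions made for the example.

    import pandas as pd

    # Hypothetical streams: EEG-derived attention values and GPS fixes, each timestamped.
    eeg = pd.DataFrame({
        "timestamp": pd.date_range("2017-09-14 10:00:00", periods=6, freq="10s"),
        "attention": [0.42, 0.55, 0.61, 0.38, 0.47, 0.70],
    })
    gps = pd.DataFrame({
        "timestamp": pd.date_range("2017-09-14 10:00:02", periods=3, freq="20s"),
        "lat": [40.7301, 40.7303, 40.7306],
        "lon": [-73.9951, -73.9948, -73.9945],
    })

    # Attach the nearest GPS fix (within 15 s) to each sensor reading.
    joined = pd.merge_asof(eeg.sort_values("timestamp"), gps.sort_values("timestamp"),
                           on="timestamp", direction="nearest",
                           tolerance=pd.Timedelta("15s"))

    # Aggregate readings onto a coarse spatial grid (~100 m cells) for mapping.
    joined["cell"] = list(zip(joined["lat"].round(3), joined["lon"].round(3)))
    print(joined.groupby("cell")["attention"].agg(["mean", "count"]))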
{"title":"Multimer: validating multimodal, cognitive data in the city: towards a model of how the urban environment influences streetscape users","authors":"Arlene Ducao, Ilias Koen, Zhiqi Guo","doi":"10.1145/3279810.3279853","DOIUrl":"https://doi.org/10.1145/3279810.3279853","url":null,"abstract":"Multimer is a new technology that aims to provide a data-driven understanding of how humans cognitively and physically experience spatial environments. By multimodally measuring biosensor data to model how the built environment and its uses influence cognitive processes, Multimer aims to help space professionals like architects, workplace strategists, and urban planners make better design interventions. Multimer is perhaps the first spatial technology that collects biosensor data, like brainwave and heart rate data, and analyzes it with both spatiotemporal and neurophysiological tools. The Multimer mobile app can record data from several kinds of commonly available, inexpensive, wearable sensors, including EEG, ECG, pedometer, accelerometer, and gyroscope modules. The Multimer app also records user-entered information via its user interface and micro-surveys, then also combines all this data with a user's geo-location using GPS, beacons, and other location tools. Multimer's study platform displays all of this data in real-time at the individual and aggregate level. Multimer also validates the data by comparing the collected sensor and sentiment data in spatiotemporal contexts, and then it integrates the collected data with other data sets such as citizen reports, traffic data, and city amenities to provide actionable insights towards the evaluation and redesign of sites and spaces. This report presents preliminary results from the data validation process for a Multimer study of 101 subjects in New York City from August to October 2017. Ultimately, the aim of this study is to prototype a replicable, scalable model of how the built environment and the movement of traffic influence the neurophysiological state of pedestrians, cyclists, and drivers.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130392259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Investigating static and sequential models for intervention-free selection using multimodal data of EEG and eye tracking
Mazen Salous, F. Putze, Tanja Schultz, Jutta Hild, J. Beyerer
DOI: 10.1145/3279810.3279841
Multimodal data is increasingly used in cognitive prediction models to better analyze and predict different user cognitive processes. Classifiers based on such data, however, have different performance characteristics. In this paper, we discuss an intervention-free selection task using multimodal EEG and eye-tracking data in three different models. We show that a sequential model, an LSTM, is more sensitive but less precise than a static model, an SVM. Moreover, we introduce a confidence-based Competition-Fusion model that uses both the SVM and the LSTM. The fusion model further improves recall compared to either the SVM or the LSTM alone, without decreasing precision compared to the LSTM. Based on these results, we recommend the SVM for interactive applications that require minimal false positives (high precision), and we recommend the LSTM, and especially the Competition-Fusion model, for applications that handle intervention-free selection requests in an additional post-processing step and therefore require higher recall than precision.
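The exact fusion rule is not given in the abstract; a minimal sketch of one confidence-based competition scheme (per sample, the prediction of whichever model is more confident wins) could look like the following, with the two trained models stubbed out as arrays of positive-class probabilities and the distance-from-0.5 confidence measure chosen as an assumption.

    import numpy as np

    def competition_fusion(p_svm, p_lstm):
        """Per-sample fusion of two binary classifiers' positive-class probabilities:
        the prediction of whichever model is more confident (further from 0.5) wins."""
        p_svm, p_lstm = np.asarray(p_svm), np.asarray(p_lstm)
        conf_svm = np.abs(p_svm - 0.5)
        conf_lstm = np.abs(p_lstm - 0.5)
        fused = np.where(conf_svm >= conf_lstm, p_svm, p_lstm)
        return (fused >= 0.5).astype(int)

    if __name__ == "__main__":
        p_svm  = [0.10, 0.55, 0.48, 0.95]   # precise but conservative model
        p_lstm = [0.30, 0.90, 0.85, 0.60]   # sensitive but noisier model
        print(competition_fusion(p_svm, p_lstm))   # -> [0 1 1 1]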
{"title":"Investigating static and sequential models for intervention-free selection using multimodal data of EEG and eye tracking","authors":"Mazen Salous, F. Putze, Tanja Schultz, Jutta Hild, J. Beyerer","doi":"10.1145/3279810.3279841","DOIUrl":"https://doi.org/10.1145/3279810.3279841","url":null,"abstract":"Multimodal data is increasingly used in cognitive prediction models to better analyze and predict different user cognitive processes. Classifiers based on such data, however, have different performance characteristics. We discuss in this paper an intervention-free selection task using multimodal data of EEG and eye tracking in three different models. We show that a sequential model, LSTM, is more sensitive but less precise than a static model SVM. Moreover, we introduce a confidence-based Competition-Fusion model using both SVM and LSTM. The fusion model further improves the recall compared to either SVM or LSTM alone, without decreasing precision compared to LSTM. According to the results, we recommend SVM for interactive applications which require minimal false positives (high precision), and recommend LSTM and highly recommend Competition-Fusion Model for applications which handle intervention-free selection requests in an additional post-processing step, requiring higher recall than precision.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134009236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Workload-driven modulation of mixed-reality robot-human communication
Leanne M. Hirshfield, T. Williams, Natalie M. Sommer, Trevor Grant, Senem Velipasalar Gursoy
DOI: 10.1145/3279810.3279848
In this work we explore how augmented reality annotations can be used as a form of mixed-reality gesture, how neurophysiological measurements can inform the decision as to whether or not to use such gestures, and whether and how to adapt language when using them. We propose a preliminary investigation of how decisions regarding robot-to-human communication modality in mixed-reality environments might be made on the basis of humans' perceptual and cognitive states. Specifically, we propose to use brain data acquired with high-density functional near-infrared spectroscopy (fNIRS) to measure the neural correlates of cognitive and emotional states with particular relevance to adaptive human-robot interaction (HRI). We describe several states of interest that fNIRS is well suited to measure and that have direct implications for HRI adaptations, and we leverage a framework developed in our prior work to explore how different neurophysiological measures could inform the selection of different communication strategies. We then describe results from a feasibility experiment in which multilabel convolutional long short-term memory (ConvLSTM) networks were trained to classify the target mental states of 10 participants, and we discuss a research agenda for adaptive human-robot teams based on our findings.
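As a rough illustration of what a multilabel ConvLSTM classifier can look like in Keras, the sketch below maps windows of spatially arranged fNIRS channels to independent sigmoid outputs, one per target mental state; the input layout, layer sizes, and number of states are assumptions, not the architecture used in the experiment.

    import numpy as np
    import tensorflow as tf

    # Hypothetical fNIRS input: windows of 20 time steps over an 8x8 optode grid
    # with 2 chromophores (HbO, HbR) as channels; 3 target mental states, multilabel.
    T, ROWS, COLS, CH, N_STATES = 20, 8, 8, 2, 3

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(T, ROWS, COLS, CH)),
        tf.keras.layers.ConvLSTM2D(16, kernel_size=(3, 3), padding="same",
                                   return_sequences=False),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        # Sigmoid (not softmax): each mental state is an independent yes/no label.
        tf.keras.layers.Dense(N_STATES, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["binary_accuracy"])

    # Tiny random batch just to show the expected tensor shapes.
    X = np.random.rand(4, T, ROWS, COLS, CH).astype("float32")
    y = np.random.randint(0, 2, size=(4, N_STATES)).astype("float32")
    model.fit(X, y, epochs=1, verbose=0)
    print(model.predict(X, verbose=0).shape)   # -> (4, 3) per-state probabilities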
{"title":"Workload-driven modulation of mixed-reality robot-human communication","authors":"Leanne M. Hirshfield, T. Williams, Natalie M. Sommer, Trevor Grant, Senem Velipasalar Gursoy","doi":"10.1145/3279810.3279848","DOIUrl":"https://doi.org/10.1145/3279810.3279848","url":null,"abstract":"In this work we explore how Augmented Reality annotations can be used as a form of Mixed Reality gesture, how neurophysiological measurements can inform the decision as to whether or not to use such gestures, and whether and how to adapt language when using such gestures. In this paper, we propose a preliminary investigation of how decisions regarding robot-to-human communication modality in mixed reality environments might be made on the basis of humans' perceptual and cognitive states. Specifically, we propose to use brain data acquired with high-density functional near-infrared spectroscopy (fNIRS) to measure the neural correlates of cognitive and emotional states with particular relevance to adaptive human-robot interaction (HRI). In this paper we describe several states of interest that fNIRS is well suited to measure and that have direct implications to HRI adaptations and we leverage a framework developed in our prior work to explore how different neurophysiological measures could inform the selection of different communication strategies. We then describe results from a feasibility experiment where multilabel Convolutional Long Short Term Memory Networks were trained to classify the target mental states of 10 participants and we discuss a research agenda for adaptive human-robot teams based on our findings.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131124884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal approach to engagement and disengagement detection with highly imbalanced in-the-wild data
D. Fedotov, O. Perepelkina, E. Kazimirova, M. Konstantinova, W. Minker
DOI: 10.1145/3279810.3279842
Engagement/disengagement detection is a challenging task that arises in a range of human-human and human-computer interaction problems. Despite its importance, the problem is still far from solved, and a number of studies involving in-the-wild data have been conducted to date. Ambiguity in the definition of engaged/disengaged states makes such data hard to collect, annotate, and analyze. In this paper we describe different approaches to building engagement/disengagement models that work with highly imbalanced multimodal data from natural conversations. We set a baseline result of 0.695 (unweighted average recall) by direct classification. We then try to detect disengagement by means of engagement regression models, since the two have a strong negative correlation. To deal with imbalanced data, we apply class weighting and data augmentation techniques (SMOTE and mixup). We experiment with combinations of modalities in order to find the most informative ones, using features from both the audio (speech) and video (face, body, lips, eyes) channels. We transform the original features using Principal Component Analysis and experiment with several types of modality fusion. Finally, we combine these approaches and increase performance to 0.715 using four modalities (all channels except face). Audio and lip features appear to contribute the most, which may be tightly connected with speech.
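Class weighting, SMOTE, and mixup are standard techniques; a compact sketch of all three on placeholder feature matrices (the feature dimensions, class ratio, and mixup variant for tabular features are assumptions) might look like this.

    import numpy as np
    from sklearn.utils.class_weight import compute_class_weight
    from imblearn.over_sampling import SMOTE   # pip install imbalanced-learn

    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 40))                       # placeholder fused features
    y = (rng.random(500) < 0.08).astype(int)             # ~8% "disengaged" frames

    # (1) Class weighting: make the classifier pay more attention to the rare class.
    weights = compute_class_weight("balanced", classes=np.unique(y), y=y)
    print("class weights:", dict(zip(np.unique(y), weights)))

    # (2) SMOTE: synthesize minority-class samples by interpolating between neighbours.
    X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
    print("class counts after SMOTE:", np.bincount(y_res))

    # (3) mixup: blend random pairs of samples and labels with a Beta-distributed weight.
    def mixup(X, y, alpha=0.2, rng=rng):
        lam = rng.beta(alpha, alpha, size=len(X))[:, None]
        idx = rng.permutation(len(X))
        X_mix = lam * X + (1 - lam) * X[idx]
        y_mix = lam[:, 0] * y + (1 - lam[:, 0]) * y[idx]   # soft labels in [0, 1]
        return X_mix, y_mix

    X_mix, y_mix = mixup(X_res, y_res.astype(float))
    print("mixup batch:", X_mix.shape, y_mix[:5].round(2))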
{"title":"Multimodal approach to engagement and disengagement detection with highly imbalanced in-the-wild data","authors":"D. Fedotov, O. Perepelkina, E. Kazimirova, M. Konstantinova, W. Minker","doi":"10.1145/3279810.3279842","DOIUrl":"https://doi.org/10.1145/3279810.3279842","url":null,"abstract":"Engagement/disengagement detection is a challenging task emerging in a range of human-human and human-computer interaction problems. While being important, the issue is still far from being solved and a number of studies involving in-the-wild data have been conducted by now. Disambiguation in the definition of engaged/disengaged states makes it hard to collect, annotate and analyze such data. In this paper we describe different approaches to building engagement/disengagement models working with highly imbalanced multimodal data from natural conversations. We set a baseline result of 0.695 (unweighted average recall) by direct classification. Then we try to detect disengagement by means of engagement regression models, as they have strong negative correlation. To deal with imbalanced data we apply class weighting and data augmentation techniques (SMOTE and mixup). We experiment with combinations of modalities in order to find the most contributing ones. We use features from both audio (speech) and video (face, body, lips, eyes) channels. We transform original features using Principal Component Analysis and experiment with several types of modality fusion. Finally, we combine approaches and increase the performance up to 0.715 using four modalities (all channels except face). Audio and lips features appear to be the most contributing ones, which may be tightly connected with speech.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127521199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The role of emotion in problem solving: first results from observing chess
Thomas Guntz, J. Crowley, D. Vaufreydaz, R. Balzarini, Philippe Dessus
DOI: 10.1145/3279810.3279846
In this paper we present results from recent experiments suggesting that chess players associate emotions with game situations and reactively use these associations to guide search during planning and problem solving. We report on a pilot experiment with multimodal observation of human experts engaged in solving challenging chess problems. Our results confirm that cognitive processes have observable correlates in displays of emotion and fixation, and that these displays can be used to evaluate models of cognitive processes. They also revealed an unexpected observation: rapid changes in emotion as players attempt to solve challenging problems. We propose a cognitive model to explain our observations and describe initial results from a second experiment designed to test this model.
{"title":"The role of emotion in problem solving: first results from observing chess","authors":"Thomas Guntz, J. Crowley, D. Vaufreydaz, R. Balzarini, Philippe Dessus","doi":"10.1145/3279810.3279846","DOIUrl":"https://doi.org/10.1145/3279810.3279846","url":null,"abstract":"In this paper we present results from recent experiments that suggest that chess players associate emotions to game situations and reactively use these associations to guide search for planning and problem solving. We report on a pilot experiment with multi-modal observation of human experts engaged in solving challenging problems in Chess. Our results confirm that cognitive processes have observable correlates in displays of emotion and fixation, and that these displays can be used to evaluate models of cognitive processes. They also revealed an unexpected observation of rapid changes in emotion as players attempt to solve challenging problems. In this paper, we propose a cognitive model to explain our observations, and describe initial results from a second experiment designed to test this model.","PeriodicalId":326513,"journal":{"name":"Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126466606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}