
Latest publications from the IS&T International Symposium on Electronic Imaging

Self-supervised visual representation learning on food images
Pub Date : 2023-01-16 DOI: 10.2352/ei.2023.35.7.image-269
Andrew W. Peng, Jiangpeng He, Fengqing Zhu
Food image classification is the groundwork for image-based dietary assessment, which is the process of monitoring what kinds of food and how much energy are consumed using captured food or eating-scene images. Existing deep learning based methods learn the visual representation for food classification based on human annotation of each food image. However, most food images captured from real life are obtained without labels, requiring human annotation to train deep learning based methods. This approach is not feasible for real-world deployment due to high costs. To make use of the vast amount of unlabeled images, many existing works focus on unsupervised or self-supervised learning to learn the visual representation directly from unlabeled data. However, none of these existing works focuses on food images, which are more challenging than general objects due to their high inter-class similarity and intra-class variance. In this paper, we focus on two items: the comparison of existing models and the development of an effective self-supervised learning model for food image classification. Specifically, we first compare the performance of existing state-of-the-art self-supervised learning models, including SimSiam, SimCLR, SwAV, BYOL, MoCo, and the Rotation Pretext Task, on food images. The experiments are conducted on the Food-101 dataset, which contains 101 different classes of foods with 1,000 images in each class. Next, we analyze the unique features of each model and compare their performance on food images to identify the key factors in each model that can help improve the accuracy. Finally, we propose a new model for unsupervised visual representation learning on food images for the classification task.
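As a point of reference for the kind of self-supervised comparison described above, a minimal PyTorch sketch of a SimCLR-style contrastive (NT-Xent) objective on Food-101 might look as follows. The augmentation choices and temperature are illustrative assumptions, not the authors' actual setup, and the sketch assumes a recent torchvision that ships the Food101 dataset class.

```python
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms

# Two random augmented "views" per image are used in SimCLR-style training (illustrative choices).
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.ToTensor(),
])
train_set = datasets.Food101(root="data", split="train", download=True, transform=augment)

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss for two batches of embeddings z1, z2 of shape (N, D)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                        # (2N, D)
    sim = z @ z.t() / temperature                         # pairwise cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device), float("-inf"))
    # The positive for view i is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```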
{"title":"Self-supervised visual representation learning on food images","authors":"Andrew W. Peng, Jiangpeng He, Fengqing Zhu","doi":"10.2352/ei.2023.35.7.image-269","DOIUrl":"https://doi.org/10.2352/ei.2023.35.7.image-269","url":null,"abstract":"Food image classification is the groundwork for image-based dietary assessment, which is the process of monitoring what kinds of food and how much energy is consumed using captured food or eating scene images. Existing deep learning based methods learn the visual representation for food classification based on human annotation of each food image. However, most food images captured from real life are obtained without labels, requiring human annotation to train deep learning based methods. This approach is not feasible for real world deployment due to high costs. To make use of the vast amount of unlabeled images, many existing works focus on unsupervised or self-supervised learning to learn the visual representation directly from unlabeled data. However, none of these existing works focuses on food images, which is more challenging than general objects due to its high inter-class similarity and intra-class variance. In this paper, we focus on two items: the comparison of existing models and the development of an effective self-supervised learning model for food image classification. Specifically, we first compare the performance of existing state-of-the-art self-supervised learning models, including SimSiam, SimCLR, SwAV, BYOL, MoCo, and Rotation Pretext Task on food images. The experiments are conducted on the Food-101 dataset, which contains 101 different classes of foods with 1,000 images in each class. Next, we analyze the unique features of each model and compare their performance on food images to identify the key factors in each model that can help improve the accuracy. Finally, we propose a new model for unsupervised visual representation learning on food images for the classification task.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135693973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
iPhone12 imagery in scene-referred computer graphics pipelines
Pub Date : 2023-01-16 DOI: 10.2352/ei.2023.35.3.mobmu-350
Eberhard Hasche, Oliver Karaschewski, Reiner Creutzburg
With the release of the Apple iPhone 12 Pro in 2020, various features were integrated that make it attractive as a recording device for scene-related computer graphics pipelines. The captured Apple RAW images have a much higher dynamic range than the standard 8-bit images. Since a scene-based workflow naturally has an extended dynamic range (HDR), the Apple RAW recordings can be well integrated. Another feature is the Dolby Vision HDR recordings, which are primarily adapted to the respective display of the source device. However, these recordings can also be used in the CG workflow since at least the basic HLG transfer function is integrated. The iPhone 12 Pro's two laser scanners can produce complex 3D models and textures for the CG pipeline. On the one hand, there is a scanner on the back that is primarily intended for capturing the surroundings for AR purposes. On the other hand, there is another scanner on the front for facial recognition. In addition, external software can read out the scanning data for integration in 3D applications. To correctly integrate the iPhone 12 Pro Apple RAW data into a scene-related workflow, two command-line-based software solutions can be used, among others: dcraw and rawtoaces. Dcraw offers the possibility to export RAW images directly to ACES2065-1. Unfortunately, the modifiers for the four RAW color channels to address the different white points are unavailable. Experimental test series are performed under controlled studio conditions to retrieve these modifier values. Subsequently, these RAW-derived images are imported into computer graphics pipelines of various CG software applications (SideFX Houdini, The Foundry Nuke, Autodesk Maya) with the help of OpenColorIO (OCIO) and ACES. Finally, it will be determined whether they can improve the overall color quality. Dolby Vision content can be captured using the native Camera app on an iPhone 12. It captures HDR video using Dolby Vision Profile 8.4, which contains a cross-compatible HLG Rec.2020 base layer and Dolby Vision dynamic metadata. Only the HLG base layer is passed on when exporting the Dolby Vision iPhone video without the corresponding metadata. It is investigated whether the iPhone 12 videos transferred this way can increase the quality of the computer graphics pipeline. The 3D Scanner App software controls the two integrated laser scanners. In addition, the software provides a large number of export formats. Therefore, integrating the OBJ-3D data into industry-standard software like Maya and Houdini is unproblematic. Unfortunately, the models and the corresponding UV map are more or less machine-readable. So, manually improving the 3D geometry (filling holes, refining the geometry, setting up new topology) is cumbersome and time-consuming. It is investigated whether standard techniques like using the ZRemesher in ZBrush, applying Texture- and UV-Projection in Maya, and VEX-snippets in Houdini can assemble these models and textures for manual editing.
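A minimal sketch of the dcraw conversion step described above, wrapped in Python, might look like this. The specific flag for ACES output and the placeholder channel multipliers are assumptions (the real multipliers are the ones retrieved in the controlled studio test series), so the exact options should be checked against the dcraw build in use.

```python
import subprocess
from pathlib import Path

# Placeholder per-channel multipliers; the actual values would come from the
# experimental test series under controlled studio conditions.
CHANNEL_MULTIPLIERS = (1.0, 1.0, 1.0, 1.0)

def raw_to_aces_tiff(raw_file: Path) -> Path:
    """Convert an Apple RAW (DNG) file to a 16-bit linear TIFF via dcraw.

    Assumes a dcraw build with ACES output support (-o 6); -r supplies custom
    multipliers for the four RAW color channels.
    """
    cmd = [
        "dcraw",
        "-4",                                   # 16-bit linear output
        "-T",                                   # write a TIFF instead of a PPM
        "-o", "6",                              # output color space: ACES (build-dependent)
        "-r", *[str(m) for m in CHANNEL_MULTIPLIERS],
        str(raw_file),
    ]
    subprocess.run(cmd, check=True)
    return raw_file.with_suffix(".tiff")

# Example: raw_to_aces_tiff(Path("IMG_0001.DNG"))
```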
{"title":"iPhone12 imagery in scene-referred computer graphics pipelines","authors":"Eberhard Hasche, Oliver Karaschewski, Reiner Creutzburg","doi":"10.2352/ei.2023.35.3.mobmu-350","DOIUrl":"https://doi.org/10.2352/ei.2023.35.3.mobmu-350","url":null,"abstract":"With the release of the Apple iPhone 12 pro in 2020, various features were integrated that make it attractive as a recording device for scene-related computer graphics pipelines. The captured Apple RAW images have a much higher dynamic range than the standard 8-bit images. Since a scene-based workflow naturally has an extended dynamic range (HDR), the Apple RAW recordings can be well integrated. Another feature is the Dolby Vision HDR recordings, which are primarily adapted to the respective display of the source device. However, these recordings can also be used in the CG workflow since at least the basic HLG transfer function is integrated. The iPhone12pro's two Laser scanners can produce complex 3D models and textures for the CG pipeline. On the one hand, there is a scanner on the back that is primarily intended for capturing the surroundings for AR purposes. On the other hand, there is another scanner on the front for facial recognition. In addition, external software can read out the scanning data for integration in 3D applications. To correctly integrate the iPhone12pro Apple RAW data into a scene-related workflow, two command-line-based software solutions can be used, among others: dcraw and rawtoaces. Dcraw offers the possibility to export RAW images directly to ACES2065-1. Unfortunately, the modifiers for the four RAW color channels to address the different white points are unavailable. Experimental test series are performed under controlled studio conditions to retrieve these modifier values. Subsequently, these RAW-derived images are imported into computer graphics pipelines of various CG software applications (SideFx Houdini, The Foundry Nuke, Autodesk Maya) with the help of OpenColorIO (OCIO) and ACES. Finally, it will be determined if they can improve the overall color quality. Dolby Vision content can be captured using the native Camera app on an iPhone 12. It captures HDR video using Dolby Vision Profile 8.4, which contains a cross-compatible HLG Rec.2020 base layer and Dolby Vision dynamic metadata. Only the HLG base layer is passed on when exporting the Dolby Vision iPhone video without the corresponding metadata. It is investigated whether the iPhone12 videos transferred this way can increase the quality of the computer graphics pipeline. The 3D Scanner App software controls the two integrated Laser Scanners. In addition, the software provides a large number of export formats. Therefore, integrating the OBJ-3D data into industry-standard software like Maya and Houdini is unproblematic. Unfortunately, the models and the corresponding UV map are more or less machine-readable. So, manually improving the 3D geometry (filling holes, refining the geometry, setting up new topology) is cumbersome and time-consuming. 
It is investigated if standard techniques like using the ZRemesher in ZBrush, applying Texture- and UV-Projection in Maya, and VEX-snippets in Houdini can assemble these models and textures for manual editing.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"853 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135694709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mobile incident command dashboard (MIC-D)
Pub Date : 2023-01-16 DOI: 10.2352/ei.2023.35.3.mobmu-358
Yang Cai, Mel Siegel
Incident Command Dashboards (ICDs) play an essential role in Emergency Support Functions (ESF). They are centralized and handle a massive amount of live data. In this project, we explore a decentralized mobile incident command dashboard (MIC-D) with an improved mobile augmented reality (AR) user interface (UI) that can access and display multimodal live IoT data streams on phones, tablets, and inexpensive HUDs on first responders' helmets. The new platform is designed to work in the field and to share live data streams among team members. It also enables users to view the 3D LiDAR scan data of the location, live thermal video data, and vital sign data on the 3D map. We have built a virtual medical helicopter communication center and tested launchpad-fire and remote fire-extinguishing scenarios. We have also tested the wildfire prevention scenario "Cold Trailing" in the outdoor environment.
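The abstract does not spell out a wire format for the shared live streams, but a decentralized design like this implies some per-device message schema. Below is a hypothetical stdlib-only sketch; every field name and the choice of transport are assumptions for illustration, not the paper's design.

```python
import json
import time
from dataclasses import dataclass, asdict, field

@dataclass
class SensorUpdate:
    """One live-data message a MIC-D node might broadcast to teammates (hypothetical schema)."""
    device_id: str                  # e.g. "helmet-07" or "tablet-command-1"
    kind: str                       # "lidar", "thermal", "vitals", ...
    lat: float
    lon: float
    timestamp: float = field(default_factory=time.time)
    payload: dict = field(default_factory=dict)   # modality-specific content

def encode(update: SensorUpdate) -> bytes:
    """Serialize for any transport (UDP multicast, WebSocket, ...)."""
    return json.dumps(asdict(update)).encode("utf-8")

packet = encode(SensorUpdate("helmet-07", "vitals", 40.4433, -79.9436,
                             payload={"heart_rate": 92, "spo2": 0.97}))
```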
{"title":"Mobile incident command dashboard (MIC-D)","authors":"Yang Cai, Mel Siegel","doi":"10.2352/ei.2023.35.3.mobmu-358","DOIUrl":"https://doi.org/10.2352/ei.2023.35.3.mobmu-358","url":null,"abstract":"Incident Command Dashboard (ICD) plays an essential role in Emergency Support Functions (ESF). They are centralized with a massive amount of live data. In this project, we explore a decentralized mobile incident commanding dashboard (MIC-D) with an improved mobile augmented reality (AR) user interface (UI) that can access and display multimodal live IoT data streams in phones, tablets, and inexpensive HUDs on the first responder’s helmets. The new platform is designed to work in the field and to share live data streams among team members. It also enables users to view the 3D LiDAR scan data on the location, live thermal video data, and vital sign data on the 3D map. We have built a virtual medical helicopter communication center and tested the launchpad on fire and remote fire extinguishing scenarios. We have also tested the wildfire prevention scenario “Cold Trailing” in the outdoor environment.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135694710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Open-source Intelligence (OSINT) investigation in Facebook
Pub Date : 2023-01-16 DOI: 10.2352/ei.2023.35.3.mobmu-357
Pranesh Kumar Narasimhan, Chinmay Bhosale, Muhammad Hasban Pervez, Najiba Zainab Naqvi, Mert Ilhan Ecevit, Klaus Schwarz, Reiner Creutzburg
Open Source Intelligence (OSINT) has come a long way, is still developing, and many more investigations are yet to happen in the near future. The essential requirement for any OSINT investigation is valuable data from a good source. This paper discusses various tools and methodologies related to Facebook data collection and analyzes part of the collected data. By the end of the paper, the reader will have a clear and detailed picture of the techniques and tools available for scraping data from the Facebook platform, and of the types of investigations and analyses that the gathered data supports.
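As a hedged illustration of the collection-plus-analysis step the paper surveys, the sketch below uses the community facebook-scraper package to pull public posts and run a trivial word count. The page name is a placeholder, and whether this particular tool is among those the paper evaluates is not stated here.

```python
from collections import Counter
from facebook_scraper import get_posts   # third-party package: facebook-scraper

# Collect a couple of pages of public posts from a page (placeholder name).
posts = list(get_posts("SomePublicPage", pages=2))

# Trivial first-pass analysis: most frequent words across the collected posts.
word_counts = Counter(
    word.lower()
    for post in posts
    if post.get("text")
    for word in post["text"].split()
)
print(word_counts.most_common(10))
```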
开源智能(OSINT)已经走了很长一段路,它仍在发展想法,在不久的将来还会有很多调查。所有OSINT调查的主要基本要求是来自良好来源的有价值的数据。本文讨论了与Facebook数据收集相关的各种工具和方法,并分析了部分收集到的数据。在论文结束时,读者将深入而清晰地了解可用的技术,工具和描述工具,这些工具用于从Facebook平台中抓取数据,以及收集到的数据可以做的调查和分析类型。
{"title":"Open-source Intelligence (OSINT) investigation in Facebook","authors":"Pranesh Kumar Narasimhan, Chinmay Bhosale, Muhammad Hasban Pervez, Najiba Zainab Naqvi, Mert Ilhan Ecevit, Klaus Schwarz, Reiner Creutzburg","doi":"10.2352/ei.2023.35.3.mobmu-357","DOIUrl":"https://doi.org/10.2352/ei.2023.35.3.mobmu-357","url":null,"abstract":"Open Source Intelligence (OSINT) has come a long way, and it is still developing ideas, and lots of investigations are yet to happen in the near future. The main essential requirement for all the OSINT investigations is the information that is valuable data from a good source. This paper discusses various tools and methodologies related to Facebook data collection and analyzes part of the collected data. At the end of the paper, the reader will get a deep and clear insight into the available techniques, tools, and descriptions about tools that are present to scrape the data out of the Facebook platform and the types of investigations and analyses that the gathered data can do.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135694713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Autonomous Vehicles and Machines 2023 Conference Overview and Papers Program
Pub Date : 2023-01-16 DOI: 10.2352/ei.2023.35.16.avm-a16
Abstract Advancements in sensing, computing, image processing, and computer vision technologies are enabling unprecedented growth and interest in autonomous vehicles and intelligent machines, from self-driving cars to unmanned drones, to personal service robots. These new capabilities have the potential to fundamentally change the way people live, work, commute, and connect with each other, and will undoubtedly provoke entirely new applications and commercial opportunities for generations to come. The main focus of AVM is perception. This begins with sensing. While imaging continues to be an essential emphasis in all EI conferences, AVM also embraces other sensing modalities important to autonomous navigation, including radar, LiDAR, and time of flight. Realization of autonomous systems also includes purpose-built processors, e.g., ISPs, vision processors, DNN accelerators, as well as core image processing and computer vision algorithms, system design and architecture, simulation, and image/video quality. AVM topics are at the intersection of these multi-disciplinary areas. AVM is the Perception Conference that bridges the imaging and vision communities, connecting the dots for the entire software and hardware stack for perception, helping people design globally optimized algorithms, processors, and systems for intelligent “eyes” for vehicles and machines.
{"title":"Autonomous Vehicles and Machines 2023 Conference Overview and Papers Program","authors":"","doi":"10.2352/ei.2023.35.16.avm-a16","DOIUrl":"https://doi.org/10.2352/ei.2023.35.16.avm-a16","url":null,"abstract":"Abstract Advancements in sensing, computing, image processing, and computer vision technologies are enabling unprecedented growth and interest in autonomous vehicles and intelligent machines, from self-driving cars to unmanned drones, to personal service robots. These new capabilities have the potential to fundamentally change the way people live, work, commute, and connect with each other, and will undoubtedly provoke entirely new applications and commercial opportunities for generations to come. The main focus of AVM is perception. This begins with sensing. While imaging continues to be an essential emphasis in all EI conferences, AVM also embraces other sensing modalities important to autonomous navigation, including radar, LiDAR, and time of flight. Realization of autonomous systems also includes purpose-built processors, e.g., ISPs, vision processors, DNN accelerators, as well core image processing and computer vision algorithms, system design and architecture, simulation, and image/video quality. AVM topics are at the intersection of these multi-disciplinary areas. AVM is the Perception Conference that bridges the imaging and vision communities, connecting the dots for the entire software and hardware stack for perception, helping people design globally optimized algorithms, processors, and systems for intelligent “eyes” for vehicles and machines.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135695215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2023 Conference Overview and Papers Program
Pub Date : 2023-01-16 DOI: 10.2352/ei.2023.35.3.mobmu-a03
Abstract The goal of this conference is to provide an international forum for presenting recent research results on multimedia for mobile devices, and to bring together experts from both academia and industry for a fruitful exchange of ideas and discussion on future challenges. The authors are encouraged to submit work-in-progress papers as well as updates on previously reported systems. Outstanding papers may be recommended for publication in the Journal of Electronic Imaging or the Journal of Imaging Science and Technology.
{"title":"Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2023 Conference Overview and Papers Program","authors":"","doi":"10.2352/ei.2023.35.3.mobmu-a03","DOIUrl":"https://doi.org/10.2352/ei.2023.35.3.mobmu-a03","url":null,"abstract":"Abstract The goal of this conference is to provide an international forum for presenting recent research results on multimedia for mobile devices, and to bring together experts from both academia and industry for a fruitful exchange of ideas and discussion on future challenges. The authors are encouraged to submit work-in-progress papers as well as updates on previously reported systems. Outstanding papers may be recommended for the publication in the Journal Electronic Imaging or the Journal of Imaging Science and Technology.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135695219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spatial cognition training rapidly induces cortical plasticity in blind navigation: Transfer of training effect & Granger causal connectivity analysis.
Pub Date : 2023-01-01 DOI: 10.2352/EI.2023.35.10.HVEI-256
Lora T Likova, Zhangziyi Zhou, Michael Liang, Christopher W Tyler

How is the cortical navigation network reorganized by the Likova Cognitive-Kinesthetic Navigation Training? We measured Granger-causal connectivity of the frontal-hippocampal-insular-retrosplenial-V1 network of cortical areas before and after this one-week training in the blind. Primarily top-down influences were seen during two tasks of drawing-from-memory (drawing complex maps and drawing the shortest path between designated map locations), with the dominant role being congruent influences from the egocentric insular to the allocentric spatial retrosplenial cortex and the amodal-spatial sketchpad of V1, with concomitant influences of the frontal cortex on these areas. After training, and during planning-from-memory of the best on-demand path, the hippocampus played a much stronger role, with the V1 sketchpad feeding information forward to the retrosplenial region. The inverse causal influences among these regions generally followed a recursive feedback model of the opposite pattern to a subset of congruent influences. Thus, this navigational network reorganized its pattern of causal influences with task demands and the navigation training, which produced marked enhancement of the navigational skills.
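For readers unfamiliar with the underlying statistic, a generic pairwise Granger-causality test on two synthetic region time series can be run with statsmodels as sketched below. The ROI names, lag choice, and noise model are illustrative assumptions, not the study's actual fMRI analysis pipeline.

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
insula = rng.standard_normal(200)                                   # stand-in "insular" series
rsc = 0.6 * np.roll(insula, 1) + 0.4 * rng.standard_normal(200)     # lagged influence on "retrosplenial"

# Null hypothesis: the series in column 2 does NOT Granger-cause the series in column 1.
data = np.column_stack([rsc, insula])
results = grangercausalitytests(data, maxlag=3)
p_value = results[1][0]["ssr_ftest"][1]                             # F-test p-value at lag 1
print(f"insula -> rsc, lag 1: p = {p_value:.3g}")
```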

{"title":"Spatial cognition training rapidly induces cortical plasticity in blind navigation: Transfer of training effect & Granger causal connectivity analysis.","authors":"Lora T Likova,&nbsp;Zhangziyi Zhou,&nbsp;Michael Liang,&nbsp;Christopher W Tyler","doi":"10.2352/EI.2023.35.10.HVEI-256","DOIUrl":"https://doi.org/10.2352/EI.2023.35.10.HVEI-256","url":null,"abstract":"<p><p>How is the cortical navigation network reorganized by the Likova Cognitive-Kinesthetic Navigation Training? We measured Granger-causal connectivity of the frontal-hippocampal-insular-retrosplenial-V1 network of cortical areas before and after this one-week training in the blind. Primarily top-down influences were seen during two tasks of drawing-from-memory (drawing complex maps and drawing the shortest path between designated map locations), with the dominant role being congruent influences from the egocentric insular to the allocentric spatial retrosplenial cortex and the amodal-spatial sketchpad of V1, with concomitant influences of the frontal cortex on these areas. After training, and during planning-from-memory of the best on-demand path, the hippocampus played a much stronger role, with the V1 sketchpad feeding information forward to the retrosplenial region. The inverse causal influences among these regions generally followed a recursive feedback model of the opposite pattern to a subset of congruent influences. Thus, this navigational network reorganized its pattern of causal influences with task demands and the navigation training, which produced marked enhancement of the navigational skills.</p>","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"35 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10228514/pdf/nihms-1898995.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9553349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multipurpose Spatiomotor Capture System for Haptic and Visual Training and Testing in the Blind and Sighted
Pub Date : 2021-01-18 DOI: 10.2352/issn.2470-1173.2021.11.hvei-160
Lora T. Likova, Kristyo Mineff, C. Tyler
We describe the development of a multipurpose haptic stimulus delivery and spatiomotor recording system with tactile map-overlays for electronic processing. This innovative multipurpose spatiomotor capture system will serve a wide range of functions in the training and behavioral assessment of spatial memory and precise motor control for blindness rehabilitation, both for STEM learning and for navigation training and map reading. Capacitive coupling through the map-overlays to the touch-tablet screen below them allows precise recording i) of hand movements during haptic exploration of tactile raised-line images on one tablet and ii) of line-drawing trajectories on the other, for analysis of navigational errors, speed, time elapsed, etc. Thus, this system will provide, for the first time, in an integrated and automated manner, quantitative assessments of the whole 'perception-cognition-action' loop: from non-visual exploration strategies, spatial memory, precise spatiomotor control and coordination, drawing performance, and navigation capabilities, as well as of haptic and movement planning and control. The accuracy of memory encoding, in particular, can be assessed by the memory-drawing operation of the capture system. Importantly, this system allows for both remote and in-person operation. Although the focus is on visually impaired populations, the system is designed to equally serve training and assessments in the normally sighted as well.
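The trajectory analysis mentioned above (navigational errors, speed, time elapsed) reduces to simple geometry on the captured touch samples. A minimal sketch, under the assumption that each sample is an (x, y, t) tuple in tablet pixels and seconds:

```python
import math

def path_metrics(samples):
    """Summarize one captured exploration or drawing trajectory.

    `samples` is a list of (x, y, t) tuples; the units (pixels, seconds) are assumptions.
    """
    length = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0, _), (x1, y1, _) in zip(samples, samples[1:])
    )
    elapsed = samples[-1][2] - samples[0][2]
    return {
        "path_length": length,
        "time_elapsed": elapsed,
        "mean_speed": length / elapsed if elapsed > 0 else 0.0,
    }

print(path_metrics([(0, 0, 0.0), (30, 40, 1.2), (60, 80, 2.5)]))
```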
{"title":"Multipurpose Spatiomotor Capture System for Haptic and Visual Training and Testing in the Blind and Sighted","authors":"Lora T. Likova, Kristyo Mineff, C. Tyler","doi":"10.2352/issn.2470-1173.2021.11.hvei-160","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.11.hvei-160","url":null,"abstract":"We describe the development of a multipurpose haptic stimulus delivery and spatiomotor recording system with tactile map-overlays for electronic processing This innovative multipurpose spatiomotor capture system will serve a wide range of functions in the training and behavioral assessment of spatial memory and precise motor control for blindness rehabilitation, both for STEM learning and for navigation training and map reading. Capacitive coupling through the map-overlays to the touch-tablet screen below them allows precise recording i) of hand movements during haptic exploration of tactile raised-line images on one tablet and ii) of line-drawing trajectories on the other, for analysis of navigational errors, speed, time elapsed, etc. Thus, this system will provide for the first time in an integrated and automated manner quantitative assessments of the whole 'perception-cognition-action' loop - from non-visual exploration strategies, spatial memory, precise spatiomotor control and coordination, drawing performance, and navigation capabilities, as well as of haptic and movement planning and control. The accuracy of memory encoding, in particular, can be assessed by the memory-drawing operation of the capture system. Importantly, this system allows for both remote and in-person operation. Although the focus is on visually impaired populations, the system is designed to equally serve training and assessments in the normally sighted as well.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88929982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Controllable Medical Image Generation via Generative Adversarial Networks.
Pub Date : 2021-01-01 DOI: 10.2352/issn.2470-1173.2021.11.hvei-112
Zhihang Ren, Stella X Yu, David Whitney

Radiologists and pathologists frequently make highly consequential perceptual decisions. For example, visually searching for a tumor and recognizing whether it is malignant can have a life-changing impact on a patient. Unfortunately, all human perceivers, even radiologists, have perceptual biases. Because human perceivers (medical doctors) will, for the foreseeable future, be the final judges of whether a tumor is malignant, understanding and mitigating human perceptual biases is important. While there has been research on perceptual biases in medical image perception tasks, the stimuli used for these studies were highly artificial and often critiqued. Realistic stimuli have not been used because it has not been possible to generate or control them for psychophysical experiments. Here, we propose to use Generative Adversarial Networks (GANs) to create vivid and realistic medical image stimuli that can be used in psychophysical and computer vision studies of medical image perception. Our model can generate tumor-like stimuli with specified shapes and realistic textures in a controlled manner. Various experiments showed the authenticity of our GAN-generated stimuli and the controllability of our model.
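The paper's architecture is not reproduced here, but the core idea of conditioning a generator on a shape/texture code can be sketched in a few lines of PyTorch. The layer sizes, image resolution, and conditioning scheme below are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Toy conditional generator: noise vector + condition code -> 64x64 grayscale image."""
    def __init__(self, z_dim=100, cond_dim=10, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 512), nn.ReLU(inplace=True),
            nn.Linear(512, img_size * img_size), nn.Tanh(),
        )

    def forward(self, z, cond):
        # The condition code (e.g. an encoding of the desired lesion shape) is
        # concatenated with the noise vector before generation.
        x = torch.cat([z, cond], dim=1)
        return self.net(x).view(-1, 1, self.img_size, self.img_size)

G = ConditionalGenerator()
fake = G(torch.randn(4, 100), torch.rand(4, 10))   # four controllable synthetic stimuli
```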

{"title":"Controllable Medical Image Generation via Generative Adversarial Networks.","authors":"Zhihang Ren,&nbsp;Stella X Yu,&nbsp;David Whitney","doi":"10.2352/issn.2470-1173.2021.11.hvei-112","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.11.hvei-112","url":null,"abstract":"<p><p>Radiologists and pathologists frequently make highly consequential perceptual decisions. For example, visually searching for a tumor and recognizing whether it is malignant can have a life-changing impact on a patient. Unfortunately, all human perceivers-even radiologists-have perceptual biases. Because human perceivers (medical doctors) will, for the foreseeable future, be the final judges of whether a tumor is malignant, understanding and mitigating human perceptual biases is important. While there has been research on perceptual biases in medical image perception tasks, the stimuli used for these studies were highly artificial and often critiqued. Realistic stimuli have not been used because it has not been possible to generate or control them for psychophysical experiments. Here, we propose to use Generative Adversarial Networks (GAN) to create vivid and realistic medical image stimuli that can be used in psychophysical and computer vision studies of medical image perception. Our model can generate tumor-like stimuli with specified shapes and realistic textures in a controlled manner. Various experiments showed the authenticity of our GAN-generated stimuli and the controllability of our model.</p>","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"33 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9897627/pdf/nihms-1673431.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9229753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Deep Learning Approach for Dynamic Sparse Sampling for High-Throughput Mass Spectrometry Imaging.
Pub Date : 2021-01-01 Epub Date: 2021-01-18 DOI: 10.2352/issn.2470-1173.2021.15.coimg-290
David Helminiak, Hang Hu, Julia Laskin, Dong Hye Ye

A Supervised Learning Approach for Dynamic Sampling (SLADS) addresses traditional issues with the incorporation of stochastic processes into a compressed sensing method. Statistical features, extracted from a sample reconstruction, estimate entropy reduction with regression models in order to dynamically determine optimal sampling locations. This work introduces an enhanced SLADS method, in the form of a Deep Learning Approach for Dynamic Sampling (DLADS), showing reductions in sample acquisition times of roughly 70-80% for high-fidelity reconstructions relative to traditional rectilinear scanning. These improvements are demonstrated for dimensionally asymmetric, high-resolution molecular images of mouse uterine and kidney tissues, as obtained using Nanospray Desorption ElectroSpray Ionization (nano-DESI) Mass Spectrometry Imaging (MSI). The methodology for training set creation is adjusted to mitigate stretching artifacts generated when using prior SLADS approaches. Transitioning to DLADS removes the need for feature extraction, further advanced by the employment of convolutional layers to leverage inter-pixel spatial relationships. Additionally, DLADS demonstrates effective generalization despite dissimilar training and testing data. Overall, DLADS is shown to maximize potential experimental throughput for nano-DESI MSI.
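As a rough illustration of the acquisition step such a method drives, the sketch below greedily picks the next scan positions from a predicted estimated-reduction-in-distortion (ERD) map. The greedy batch rule and array shapes are simplifying assumptions, not the exact published selection procedure.

```python
import numpy as np

def next_positions(erd_map, measured_mask, batch_size=16):
    """Return (row, col) indices of the unmeasured pixels with the highest predicted ERD."""
    scores = np.where(measured_mask, -np.inf, erd_map)         # never revisit measured pixels
    best = np.argsort(scores, axis=None)[::-1][:batch_size]    # best-first over the flat array
    return np.column_stack(np.unravel_index(best, erd_map.shape))

erd = np.random.rand(64, 64)             # stand-in for the network's ERD prediction
mask = np.zeros((64, 64), dtype=bool)    # nothing measured yet
print(next_positions(erd, mask, batch_size=4))
```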

{"title":"Deep Learning Approach for Dynamic Sparse Sampling for High-Throughput Mass Spectrometry Imaging.","authors":"David Helminiak,&nbsp;Hang Hu,&nbsp;Julia Laskin,&nbsp;Dong Hye Ye","doi":"10.2352/issn.2470-1173.2021.15.coimg-290","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.15.coimg-290","url":null,"abstract":"<p><p>A Supervised Learning Approach for Dynamic Sampling (SLADS) addresses traditional issues with the incorporation of stochastic processes into a compressed sensing method. Statistical features, extracted from a sample reconstruction, estimate entropy reduction with regression models, in order to dynamically determine optimal sampling locations. This work introduces an enhanced SLADS method, in the form of a Deep Learning Approach for Dynamic Sampling (DLADS), showing reductions in sample acquisition times for high-fidelity reconstructions between ~ 70-80% over traditional rectilinear scanning. These improvements are demonstrated for dimensionally asymmetric, high-resolution molecular images of mouse uterine and kidney tissues, as obtained using Nanospray Desorption ElectroSpray Ionization (nano-DESI) Mass Spectrometry Imaging (MSI). The methodology for training set creation is adjusted to mitigate stretching artifacts generated when using prior SLADS approaches. Transitioning to DLADS removes the need for feature extraction, further advanced with the employment of convolutional layers to leverage inter-pixel spatial relationships. Additionally, DLADS demonstrates effective generalization, despite dissimilar training and testing data. Overall, DLADS is shown to maximize potential experimental throughput for nano-DESI MSI.</p>","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":"2021 Computational Imaging XIX","pages":"2901-2907"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8553253/pdf/nihms-1699290.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39580835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8