
IBM Journal of Research and Development: Latest Publications

A machine learning approach to scenario analysis and forecasting of mixed migration
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-23 | DOI: 10.1147/JRD.2019.2948824
R. Nair;B. S. Madsen;H. Lassen;S. Baduk;S. Nagarajan;L. H. Mogensen;R. Novack;R. Curzon;J. Paraszczak;S. Urbak
The development of MM4SIGHT, a machine learning system that enables annual forecasts of mixed-migration flows, is presented. Mixed migration refers to cross-border movements of people motivated by a multiplicity of factors, including refugees fleeing persecution and conflict, victims of trafficking, and people seeking better lives and opportunities. Such populations hold a range of legal statuses, some of which are not reflected in official government statistics. The system combines institutional estimates of migration with in-person monitoring surveys to establish a migration-volume baseline. The surveys reveal clusters of migratory drivers among populations on the move. Given macrolevel indicators that reflect the migratory drivers found in the surveys, we develop an ensemble model to determine the volume of migration between source and host country, along with uncertainty bounds. Using more than 80 macroindicators, we present results from a case study of migratory flows from Ethiopia to six countries. Our evaluations show error rates for annual forecasts to be within a few thousand persons per year for most destinations.
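Beyond "ensemble with uncertainty bounds," the abstract does not specify the model, so the following is only a minimal sketch of that general recipe: a bootstrap ensemble over macro-indicators whose prediction spread yields the bounds. The data are synthetic and the ridge regressor is an illustrative stand-in, not MM4SIGHT itself.

```python
# Minimal sketch of an ensemble forecast with uncertainty bounds.
# Model choice, indicator values, and data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: 20 years of 80 macro-indicators and observed annual
# migration volumes from one source country to one destination.
X = rng.normal(size=(20, 80))
y = 50_000 + X[:, :5].sum(axis=1) * 2_000 + rng.normal(0, 1_500, size=20)

def fit_ridge(X, y, lam=10.0):
    """Closed-form ridge regression; one member of the ensemble."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Bootstrap ensemble: each member is fit on a resampled set of years.
members = []
for _ in range(200):
    idx = rng.integers(0, len(y), size=len(y))
    members.append(fit_ridge(X[idx], y[idx]))

x_next = rng.normal(size=80)            # next year's indicator vector
preds = np.array([x_next @ w for w in members])

lo, hi = np.percentile(preds, [5, 95])  # spread of members gives the bounds
print(f"forecast: {preds.mean():,.0f} persons "
      f"(90% bounds: {lo:,.0f} to {hi:,.0f})")
```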
Citations: 5
Heterogeneous integration for artificial intelligence: Challenges and opportunities
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-16 | DOI: 10.1147/JRD.2019.2947373
S. Mukhopadhyay;Y. Long;B. Mudassar;C. S. Nair;B. H. DeProspo;H. M. Torun;M. Kathaperumal;V. Smet;D. Kim;S. Yalamanchili;M. Swaminathan
The recent progress in artificial intelligence (AI) and machine learning (ML) has enabled computing platforms to solve highly complex problems in computer vision, robotics, finance, security, and science. This algorithmic progress has motivated new research in hardware accelerators, which promise high energy efficiency compared to software solutions running on CPUs. However, as AI/ML models become more complex, growing memory demands, and hence the high energy and time cost of communication between logic and memory, pose a major challenge to energy efficiency. We review the potential of heterogeneous integration in addressing this challenge and present different approaches to leveraging heterogeneous integration for energy-efficient AI platforms. First, we discuss packaging technologies for efficient chip-to-chip communication. Second, we present a near-memory-processing architecture for AI acceleration that leverages 3D die stacking. Third, we present processing-in-memory architectures based on heterogeneous integration of CMOS and embedded nonvolatile memory. Finally, the article presents case studies that integrate the preceding concepts to advance AI/ML hardware platforms for different application domains.
Citations: 11
Fairness GAN: Generating datasets with fairness properties using a generative adversarial network
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-16 | DOI: 10.1147/JRD.2019.2945519
P. Sattigeri;S. C. Hoffman;V. Chenthamarakshan;K. R. Varshney
We introduce the Fairness GAN (generative adversarial network), an approach for generating a dataset that is plausibly similar to a given multimedia dataset, but is more fair with respect to protected attributes in decision making. We propose a novel auxiliary classifier GAN that strives for demographic parity or equality of opportunity and show empirical results on several datasets, including the CelebFaces Attributes (CelebA) dataset, the Quick, Draw! dataset, and a dataset of soccer player images and the offenses for which they were called. The proposed formulation is well suited to absorbing unlabeled data; we leverage this to augment the soccer dataset with the much larger CelebA dataset. The methodology tends to improve demographic parity and equality of opportunity while generating plausible images.
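As a concrete reference for the two criteria the Fairness GAN targets, the sketch below computes the demographic-parity gap and the equal-opportunity (true-positive-rate) gap on toy classifier outputs. The data and variable names are illustrative, not the paper's evaluation.

```python
# Sketch of the two fairness criteria named above, on synthetic decisions.
import numpy as np

rng = np.random.default_rng(1)
a = rng.integers(0, 2, size=1000)        # protected attribute (0/1)
y = rng.integers(0, 2, size=1000)        # ground-truth outcome
y_hat = rng.integers(0, 2, size=1000)    # classifier decision

# Demographic parity: P(y_hat = 1 | a = 0) == P(y_hat = 1 | a = 1).
dp_gap = abs(y_hat[a == 0].mean() - y_hat[a == 1].mean())

# Equality of opportunity: equal true-positive rates across groups.
tpr0 = y_hat[(a == 0) & (y == 1)].mean()
tpr1 = y_hat[(a == 1) & (y == 1)].mean()
eo_gap = abs(tpr0 - tpr1)

print(f"demographic parity gap: {dp_gap:.3f}, "
      f"equal-opportunity gap: {eo_gap:.3f}")
```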
Citations: 108
Deep analytics for workplace risk and disaster management
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-14 | DOI: 10.1147/JRD.2019.2945693
S. Dalal;D. Bassu
We discuss dynamic, real-time analysis based on multimodal data fusion for contextual risk identification, generating “risk maps” for the workplace that enable timely identification of hazards and associated risk mitigation. This includes new machine/deep-learning and analytics methods, and their applications, for handling unconventional data collected from pictures, videos, documents, mobile apps, sensors/Internet of Things, Occupational Safety and Health Administration (OSHA) rules, and Building Information Models (BIM). Specifically, we describe a number of advances and challenges in this field with applications of computer vision, natural language processing, and sensor data analysis. Applications include automated cause identification, damage prevention, and disaster recovery using current and historical claims data and other public data. The methods developed can be applied to any given situation with different groups of people, including first responders. Finally, we discuss some of the important nontechnical challenges related to business practicality, privacy, and industry regulations.
Citations: 1
Elderly care through unusual behavior detection: A disaster management approach using IoT and intelligence
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-14 | DOI: 10.1147/JRD.2019.2947018
P. Pandey;R. Litoriya
This article attempts to provide a minimal disaster management framework for the elderly who live alone. Elderly people are generally vulnerable to hazards and emergency situations. The proposed framework aims at developing an Internet of Things (IoT)-based intelligent, protective ecosystem for the elderly that calls for help in emergencies such as floods, earthquakes, home fires, volcanic eruptions, and storms. The disaster system uses a range of calamity sensors in conjunction with monitoring of the elderly subject's in-house activities. In the event of a mishap, disaster relief authorities, community members, and other stakeholders are informed instantly. All sensors are powered through an uninterrupted electricity supply system that continues to work even during power outages. The work carried out in this article is motivated by the need for holistic platforms that provide a low-cost, robust, and responsive disaster alert system for the elderly (DASE). Our objective is to overcome many of the shortcomings of existing systems and offer a reactive disaster-recovery technique. Additionally, the article accounts for numerous important factors, for instance, the elderly individual's physical and cognitive limitations, ergonomic requirements, and spending capacity.
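As an illustration of the kind of unusual-behavior rule such an ecosystem might run, the sketch below raises an alert when the gap since the last in-house activity event exceeds a baseline learned from past behavior. The threshold formula, single sensor stream, and notification hook are all hypothetical, not the paper's design.

```python
# Hypothetical inactivity-based alert rule, a minimal stand-in for
# unusual-behavior detection. A real deployment would fuse many calamity
# and activity sensors; here a single motion-event stream is used.
import statistics

# Timestamps (minutes) of recent in-house activity events, e.g., motion.
events = [0, 12, 25, 33, 47, 60, 71, 85]

gaps = [b - a for a, b in zip(events, events[1:])]
baseline = statistics.mean(gaps) + 3 * statistics.stdev(gaps)

def check(now, last_event, notify):
    """Alert stakeholders if inactivity exceeds the learned baseline."""
    if now - last_event > baseline:
        notify(f"no activity for {now - last_event:.0f} min "
               f"(threshold {baseline:.0f} min)")

check(now=140, last_event=events[-1], notify=print)  # fires an alert
```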
Citations: 26
Online optimization of first-responder routes in disaster response logistics
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-14 | DOI: 10.1147/JRD.2019.2947002
D. Shiri;F. S. Salman
After a disaster, first responders should reach critical locations in the disaster-affected region in the shortest time. However, road network edges can be damaged or blocked by debris. Since response time is crucial, relief operations may start before knowing which edges are blocked. A blocked edge is revealed online when it is visited at one of its end-nodes. Multiple first-responder teams, who can communicate the blockage information, gather initially at an origin node and are assigned to target destinations (nodes) in the disaster-affected area. We consider multiple teams assigned to one destination. The objective is to find an online travel plan such that at least one of the teams finds a route from the origin to the destination in minimum time. This problem is known as the online multi-agent Canadian traveler problem. We develop an effective online heuristic policy and test it on real city road networks as well as randomly generated networks leading to instances with multiple blockages. We compare the performance of the online strategy with the offline optimum and obtain an average competitive ratio of 1.164 over 70,100 instances with varying parameter values.
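The policy in the paper coordinates multiple communicating teams; as a single-agent illustration of the online mechanics (walk the current shortest path, learn that an edge is blocked only upon reaching one of its end-nodes, replan, and compare with the offline optimum), here is a minimal sketch on a toy graph. The graph and blockages are invented for illustration; this is not the authors' heuristic.

```python
# Single-agent online replanning on a graph with hidden blocked edges.
import heapq

def dijkstra(graph, src, dst, blocked):
    """Shortest path avoiding known-blocked edges; returns (path, cost)."""
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if frozenset((u, v)) in blocked:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if dst not in dist:
        return None, float("inf")
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1], dist[dst]

graph = {"O": {"A": 1, "B": 2}, "A": {"O": 1, "D": 1},
         "B": {"O": 2, "D": 2}, "D": {"A": 1, "B": 2}}
hidden = {frozenset(("A", "D"))}          # blockage unknown in advance

pos, known, cost = "O", set(), 0.0
while pos != "D":
    path, _ = dijkstra(graph, pos, "D", known)
    nxt = path[1]
    if frozenset((pos, nxt)) in hidden:   # revealed only at an end-node
        known.add(frozenset((pos, nxt)))
        continue                          # replan with the new information
    cost += graph[pos][nxt]
    pos = nxt

_, offline = dijkstra(graph, "O", "D", hidden)  # omniscient benchmark
print(f"online cost {cost}, offline optimum {offline}, "
      f"competitive ratio {cost / offline:.3f}")
```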
Citations: 9
Deep learning acceleration based on in-memory computing
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-14 | DOI: 10.1147/JRD.2019.2947008
E. Eleftheriou;M. Le Gallo;S. R. Nandakumar;C. Piveteau;I. Boybat;V. Joshi;R. Khaddam-Aljameh;M. Dazzi;I. Giannopoulos;G. Karunaratne;B. Kersting;M. Stanisavljevic;V. P. Jonnalagadda;N. Ioannou;K. Kourtis;P. A. Francese;A. Sebastian
Performing computations on conventional von Neumann computing systems results in a significant amount of data being moved back and forth between the physically separated memory and processing units. This costs time and energy, and constitutes an inherent performance bottleneck. In-memory computing is a novel non-von Neumann approach in which certain computational tasks are performed in the memory itself. This is enabled by the physical attributes and state dynamics of memory devices, in particular, resistance-based nonvolatile memory technology. Several computational tasks, such as logical operations, arithmetic operations, and even certain machine learning tasks, can be implemented in such a computational memory unit. In this article, we first introduce the general notion of in-memory computing and then focus on mixed-precision deep learning training with in-memory computing. The efficacy of this new approach is demonstrated by training a multilayer perceptron on the MNIST dataset to high accuracy. Moreover, we show how the precision of in-memory computing can be further improved through architectural and device-level innovations. Finally, we present system aspects, such as high-level system architecture, including core-to-core interconnect technologies, and high-level ideas and concepts of the software stack.
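A rough feel for the mixed-precision idea can be had in a few lines: matrix-vector products run on "analog" weights that carry read noise and coarse programming granularity, while updates accumulate in a high-precision digital buffer and are flushed to the devices only in whole conductance steps. The noise level and step size below are illustrative assumptions, not measured device values or the article's exact scheme.

```python
# Toy simulation of mixed-precision training with a computational memory.
import numpy as np

rng = np.random.default_rng(2)
eps = 0.05                                  # device conductance granularity
W_analog = rng.normal(size=(4, 8))          # weights stored in devices
chi = np.zeros_like(W_analog)               # high-precision update buffer

def analog_matvec(W, x, noise=0.02):
    """In-memory MVM: the result carries additive analog read noise."""
    return W @ x + rng.normal(0, noise, size=W.shape[0])

def apply_update(dW):
    """Accumulate updates digitally; program devices in multiples of eps."""
    global chi, W_analog
    chi += dW
    pulses = np.trunc(chi / eps)            # whole conductance steps
    W_analog += eps * pulses
    chi -= eps * pulses                     # sub-step remainder stays digital

x = rng.normal(size=8)
print("noisy analog MVM:", np.round(analog_matvec(W_analog, x), 3))

before = W_analog.copy()
apply_update(rng.normal(0, 0.03, size=W_analog.shape))
steps = (W_analog - before) / eps
print("devices programmed in integer steps:",
      np.allclose(steps, np.round(steps)))
```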
Citations: 18
Improving humanitarian needs assessments through natural language processing
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-14 | DOI: 10.1147/JRD.2019.2947014
T. Kreutzer;P. Vinck;P. N. Pham;A. An;L. Appel;E. DeLuca;G. Tang;M. Alzghool;K. Hachhethu;B. Morris;S. L. Walton-Ellery;J. Crowley;J. Orbinski
An effective response to humanitarian crises relies on detailed information about the needs of the affected population. Current assessment approaches often require interviewers to convert complex, open-ended responses into simplified quantitative data. More nuanced insights require the use of qualitative methods, but proper transcription and manual coding are hard to conduct rapidly and at scale during a crisis. Natural language processing (NLP), a type of artificial intelligence, may provide potentially important new opportunities to capture qualitative data from voice responses and analyze it for relevant content to better inform more effective and rapid humanitarian assistance operational decisions. This article provides an overview of how NLP can be used to transcribe, translate, and analyze large sets of qualitative responses with a view to improving the quality and effectiveness of humanitarian assistance. We describe the practical and ethical challenges of building on the diffusion of digital data collection platforms and introducing this new technology to the humanitarian context. Finally, we provide an overview of the principles that should be used to anticipate and mitigate risks.
Citations: 7
Neural network accelerator design with resistive crossbars: Opportunities and challenges
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-14 | DOI: 10.1147/JRD.2019.2947011
S. Jain;A. Ankit;I. Chakraborty;T. Gokmen;M. Rasch;W. Haensch;K. Roy;A. Raghunathan
Deep neural networks (DNNs) achieve best-known accuracies in many machine learning tasks involved in image, voice, and natural language processing and are being used in an ever-increasing range of applications. However, their algorithmic benefits are accompanied by extremely high computation and storage costs, sparking intense efforts in optimizing the design of computing platforms for DNNs. Today, graphics processing units (GPUs) and specialized digital CMOS accelerators represent the state-of-the-art in DNN hardware, with near-term efforts focusing on approximate computing through reduced precision. However, the ever-increasing complexities of DNNs and the data they process have fueled an active interest in alternative hardware fabrics that can deliver the next leap in efficiency. Resistive crossbars designed using emerging nonvolatile memory technologies have emerged as a promising candidate building block for future DNN hardware fabrics since they can natively execute massively parallel vector-matrix multiplications (the dominant compute kernel in DNNs) in the analog domain within the memory arrays. Leveraging in-memory computing and dense storage, resistive-crossbar-based systems cater to both the high computation and storage demands of complex DNNs and promise energy efficiency beyond current DNN accelerators by mitigating data transfer and memory bottlenecks. However, several design challenges need to be addressed to enable their adoption. For example, the overheads of peripheral circuits (analog-to-digital converters and digital-to-analog converters) and other components (scratchpad memories and on-chip interconnect) may significantly diminish the efficiency benefits at the system level. Additionally, the analog crossbar computations are intrinsically subject to noise due to a range of device- and circuit-level nonidealities, potentially leading to lower accuracy at the application level. In this article, we highlight the prospects for designing hardware accelerators for neural networks using resistive crossbars. We also underscore the key open challenges and some possible approaches to address them.
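The accuracy concern raised above is easy to reproduce numerically: the sketch below compares an ideal vector-matrix multiply against one perturbed by conductance programming noise and a low-resolution ADC. All device parameters (noise magnitude, ADC bits, full-scale range) are illustrative assumptions.

```python
# Ideal crossbar VMM versus one with programming noise and a coarse ADC.
import numpy as np

rng = np.random.default_rng(3)
W = rng.uniform(-1, 1, size=(16, 16))   # target weights
x = rng.uniform(0, 1, size=16)          # input voltages (nonnegative)

ideal = W @ x

# Programming noise: each device conductance is off by a small amount.
W_dev = W + rng.normal(0, 0.03, size=W.shape)

def adc(v, bits=6, v_max=8.0):
    """Quantize bit-line outputs to a signed `bits`-bit code."""
    levels = 2 ** (bits - 1)
    q = np.clip(np.round(v / v_max * levels), -levels, levels - 1)
    return q / levels * v_max

noisy = adc(W_dev @ x)
err = np.abs(noisy - ideal).mean() / np.abs(ideal).mean()
print(f"mean relative error from noise + 6-bit ADC: {err:.1%}")
```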
Citations: 13
BlueConnect: Decomposing all-reduce for deep learning on heterogeneous network hierarchy
IF 1.3 | CAS Tier 4, Computer Science | Q1 Computer Science | Pub Date: 2019-10-14 | DOI: 10.1147/JRD.2019.2947013
M. Cho;U. Finkler;M. Serrano;D. Kung;H. Hunter
As deep neural networks get more complex and input datasets get larger, it can take days or even weeks to train a deep neural network to the desired accuracy. Therefore, enabling distributed deep learning at a massive scale is critical since it offers the potential to reduce the training time from weeks to hours. In this article, we present BlueConnect, an efficient communication library for distributed deep learning that is highly optimized for popular GPU-based platforms. BlueConnect decomposes a single all-reduce operation into a large number of parallelizable reduce–scatter and all-gather operations to exploit the tradeoff between latency and bandwidth and adapt to a variety of network configurations. Therefore, each individual operation can be mapped to a different network fabric and take advantage of the best performing implementation for the corresponding fabric. According to our experimental results on two system configurations, BlueConnect can outperform the leading industrial communication library by a wide margin, and the BlueConnect-integrated Caffe2 can significantly reduce synchronization overhead by 87% on 192 GPUs for Resnet-50 training over prior schemes.
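The decomposition itself is easy to state in code. The sketch below simulates a ring reduce-scatter followed by a ring all-gather on in-memory arrays, the building block BlueConnect schedules across network fabrics, with the network replaced by sequential Python steps. It illustrates the decomposition only; it is not the library's GPU implementation.

```python
# All-reduce decomposed into ring reduce-scatter + ring all-gather.
import numpy as np

P = 4                                        # number of simulated "GPUs"
grads = [np.arange(8, dtype=float) + r for r in range(P)]
expected = sum(grads)                        # what all-reduce must produce
chunks = [np.array_split(g.copy(), P) for g in grads]

# Phase 1 -- ring reduce-scatter: after P-1 steps, rank r holds the fully
# reduced chunk (r+1) % P. Sends are snapshotted to mimic simultaneity.
for step in range(P - 1):
    sends = [chunks[r][(r - step) % P].copy() for r in range(P)]
    for r in range(P):
        idx = (r - step - 1) % P             # chunk arriving from rank r-1
        chunks[r][idx] = chunks[r][idx] + sends[(r - 1) % P]

# Phase 2 -- ring all-gather: circulate the reduced chunks until every
# rank holds all of them.
for step in range(P - 1):
    sends = [chunks[r][(r + 1 - step) % P].copy() for r in range(P)]
    for r in range(P):
        chunks[r][(r - step) % P] = sends[(r - 1) % P]

assert all(np.allclose(np.concatenate(c), expected) for c in chunks)
print("all ranks hold the reduced gradient:", np.concatenate(chunks[0]))
```

Because each step moves only 1/P of the data, the schedule can map individual reduce-scatter and all-gather stages onto different fabrics (NVLink within a node, Ethernet/InfiniBand across nodes), which is the latency-bandwidth tradeoff the abstract describes.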
Citations: 71