
Latest Publications in Foundations and Trends in Machine Learning

Message from the AI4S Workshop Chairs
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/mlhpcai4s51975.2020.00005
Citations: 0
A Benders Decomposition Approach to Correlation Clustering
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/MLHPCAI4S51975.2020.00009
Jovita Lukasik, M. Keuper, M. Singh, Julian Yarkony
We tackle the problem of graph partitioning for image segmentation using correlation clustering (CC), which we treat as an integer linear program (ILP). We reformulate optimization in the ILP so as to admit efficient optimization via Benders decomposition, a classic technique from operations research. Our Benders decomposition formulation has many subproblems, each associated with a node in the CC instance’s graph, which can be solved in parallel. Each Benders subproblem enforces the cycle inequalities corresponding to edges with negative (repulsive) weights attached to its corresponding node in the CC instance. We generate Magnanti-Wong Benders rows in addition to standard Benders rows to accelerate optimization. Our Benders decomposition approach provides a promising new avenue to accelerate optimization for CC, and, in contrast to previous cutting plane approaches, theoretically allows for massive parallelization.
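The correlation clustering objective the paper optimizes can be illustrated on a toy instance. The graph and weights below are made up, and the brute-force search is a stand-in for the Benders decomposition, not the paper's method: positive (attractive) edges pay their weight when cut, and negative (repulsive) edges pay the absolute value of their weight when their endpoints share a cluster.

```python
from itertools import product

# Hypothetical 4-node signed graph: positive weights attract, negative repel.
edges = {(0, 1): 2.0, (1, 2): -1.5, (2, 3): 2.0, (0, 3): -1.0}

def cc_cost(labels):
    """Correlation clustering cost of a node labeling."""
    cost = 0.0
    for (u, v), w in edges.items():
        cut = labels[u] != labels[v]
        if w > 0 and cut:
            cost += w        # attractive edge separated
        elif w < 0 and not cut:
            cost += -w       # repulsive edge merged
    return cost

# Enumerate all labelings of the 4 nodes and keep the cheapest partition.
best = min(product(range(4), repeat=4), key=cc_cost)
print(cc_cost(best))
```

On this instance the optimum groups nodes {0, 1} and {2, 3}, satisfying every edge at zero cost; the ILP and Benders machinery in the paper exist precisely because this enumeration is exponential in the number of nodes.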
Citations: 3
High-bypass Learning: Automated Detection of Tumor Cells That Significantly Impact Drug Response
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/MLHPCAI4S51975.2020.00012
J. Wozniak, H. Yoo, J. Mohd-Yusof, Bogdan Nicolae, Nicholson T. Collier, J. Ozik, T. Brettin, Rick L. Stevens
Machine learning in biomedicine is reliant on the availability of large, high-quality data sets. These corpora are used for training statistical or deep learning-based models that can be validated against other data sets and ultimately used to guide decisions. The quality of these data sets is an essential component of the quality of the models and their decisions. Thus, identifying and inspecting outlier data is critical for evaluating, curating, and using biomedical data sets. Many techniques are available to look for outlier data, but it is not clear how to evaluate the impact on highly complex deep learning methods. In this paper, we use deep learning ensembles and workflows to construct a system for automatically identifying data subsets that have a large impact on the trained models. These effects can be quantified and presented to the user for further inspection, which could improve data quality overall. We then present results from running this method on the near-exascale Summit supercomputer.
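The core idea, scoring data points by how much an ensemble's consensus disagrees with them, can be sketched on synthetic data. This is a toy stand-in (linear fits on bootstrap resamples, not the paper's deep learning ensembles on Summit), with one injected outlier that surfaces as the highest-residual point:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + rng.normal(0.0, 0.05, 50)  # clean linear data plus noise
y[10] += 3.0                             # inject an outlier at index 10

preds = []
for _ in range(20):
    idx = rng.integers(0, 50, 50)            # bootstrap resample
    coef = np.polyfit(x[idx], y[idx], 1)     # fit one ensemble member
    preds.append(np.polyval(coef, x))

# Disagreement between the ensemble consensus and each label flags the outlier.
residual = np.abs(np.mean(preds, axis=0) - y)
print(int(np.argmax(residual)))
```

The flagged index can then be presented for human inspection, mirroring the paper's workflow of quantifying per-subset impact before curating the data set.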
Citations: 6
Accelerating GPU-based Machine Learning in Python using MPI Library: A Case Study with MVAPICH2-GDR
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/MLHPCAI4S51975.2020.00010
S. M. Ghazimirsaeed, Quentin G. Anthony, A. Shafi, H. Subramoni, D. Panda
The growth of big data applications during the last decade has led to a surge in the deployment and popularity of machine learning (ML) libraries. On the other hand, the high performance offered by GPUs makes them well suited for ML problems. To take advantage of GPU performance for ML, NVIDIA has recently developed the cuML library. cuML is the GPU counterpart of Scikit-learn, and provides similar Pythonic interfaces to Scikit-learn while hiding the complexities of writing GPU compute kernels directly using CUDA. To support execution of ML workloads on Multi-Node Multi-GPU (MNMG) systems, the cuML library exploits the NVIDIA Collective Communications Library (NCCL) as a backend for collective communications between processes. Meanwhile, MPI is a de facto standard for communication in HPC systems. Among various MPI libraries, MVAPICH2-GDR is the pioneer in optimizing GPU communication. This paper explores various aspects and challenges of providing MPI-based communication support for GPU-accelerated cuML applications. More specifically, it proposes a Python API to take advantage of MPI-based communications for cuML applications. It also gives an in-depth analysis, characterization, and benchmarking of cuML algorithms such as K-Means, Nearest Neighbors, Random Forest, and tSVD. Moreover, it provides a comprehensive performance evaluation and profiling study of MPI-based versus NCCL-based communication for these algorithms. The evaluation results show that the proposed MPI-based communication approach achieves up to 1.6x, 1.25x, 1.25x, and 1.36x speedup for K-Means, Nearest Neighbors, Linear Regression, and tSVD, respectively, on up to 32 GPUs.
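The collective pattern being benchmarked, an Allreduce(SUM) followed by division by the communicator size so that every rank ends with the same global average, can be sketched in pure NumPy. Real code would use mpi4py or NCCL across GPUs; the four "ranks" below are simulated in a single process and the parameter values are made up:

```python
import numpy as np

size = 4  # simulated communicator size (number of ranks)
rng = np.random.default_rng(1)

# Each simulated rank r holds a slightly noisy local parameter vector,
# as if it had trained on its own shard of the data.
local = [np.full(3, float(r)) + rng.normal(0.0, 0.01, 3) for r in range(size)]

# Allreduce(SUM) across ranks, then divide by size: every rank would now
# hold this same globally averaged estimate.
global_avg = sum(local) / size
print(np.allclose(global_avg, 1.5, atol=0.05))
```

Whether this reduction runs over NCCL or an MPI library such as MVAPICH2-GDR is exactly the communication-backend choice the paper profiles.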
Citations: 0
Reinforcement Learning-Based Solution to Power Grid Planning and Operation Under Uncertainties
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/MLHPCAI4S51975.2020.00015
X. Shang, Ye Lin, Jing Zhang, Jingping Yang, Jianping Xu, Qin Lyu, R. Diao
With the ever-increasing stochastic and dynamic behavior observed in today’s bulk power systems, securely and economically planning future operational scenarios that meet all reliability standards under uncertainties becomes a challenging computational task, which typically involves searching feasible and suboptimal solutions in a high-dimensional space via massive numerical simulations. This paper presents a novel approach to achieving this goal by adopting the state-of-the-art reinforcement learning algorithm, Soft Actor-Critic (SAC). First, the optimization problem of finding feasible solutions under uncertainties is formulated as a Markov Decision Process (MDP). Second, a general and flexible framework is developed to train the SAC agent by adjusting generator active power outputs to search for feasible operating conditions. A software prototype is developed that verifies the effectiveness of the proposed approach via numerical studies conducted on the planning cases of the SGCC Zhejiang Electric Power Company.
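The MDP formulation in the abstract can be illustrated with a deliberately simplified environment. Everything here is hypothetical (the real state includes full grid conditions and reliability constraints): the state tracks the generation/demand mismatch, the action adjusts generator active power outputs, and the reward penalizes the remaining mismatch so that feasible operating points score highest.

```python
import numpy as np

demand = 100.0                      # hypothetical system demand (MW)
gen = np.array([40.0, 35.0])        # active power outputs of two generators

def step(gen, action):
    """One MDP transition: adjust outputs, observe mismatch, get reward."""
    gen = gen + action              # agent adjusts active power outputs
    mismatch = demand - gen.sum()   # next state: remaining imbalance
    reward = -abs(mismatch)         # best achievable reward is 0
    return gen, reward

gen, reward = step(gen, np.array([10.0, 15.0]))
print(reward == 0.0)
```

An SAC agent trained against such transitions learns actions that drive the mismatch (and any other constraint-violation penalties folded into the reward) toward zero.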
Citations: 1
Workshop Organization – AI4S 2020
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/mlhpcai4s51975.2020.00007
Citations: 0
EventGraD: Event-Triggered Communication in Parallel Stochastic Gradient Descent
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/MLHPCAI4S51975.2020.00008
Soumyadip Ghosh, V. Gupta
Communication in parallel systems consumes a significant amount of time and energy, which often turns out to be a bottleneck in distributed machine learning. In this paper, we present EventGraD, an algorithm with event-triggered communication in parallel stochastic gradient descent. The main idea of this algorithm is to relax the requirement of communicating at every epoch to communicating only in certain epochs when necessary. In particular, the parameters are communicated only when the change in their values exceeds a threshold. The threshold for a parameter is chosen adaptively based on the rate of change of the parameter. The adaptive threshold ensures that the algorithm can be applied to different models on different datasets without any change. We focus on data-parallel training of a popular convolutional neural network trained on the MNIST dataset and show that EventGraD can reduce the communication load by up to 70% while retaining the same level of accuracy.
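The triggering rule can be sketched in a few lines. A fixed threshold is used here for clarity, whereas EventGraD derives it adaptively from each parameter's rate of change, and the parameter trajectory is made up: a value is transmitted only when it has drifted from the last sent copy by more than the threshold, so small updates cost no communication.

```python
threshold = 0.1   # fixed for this sketch; adaptive in EventGraD
last_sent = 0.0   # last value the peers received
messages = 0      # communication events actually triggered

for param in [0.02, 0.05, 0.15, 0.16, 0.30]:  # parameter value per epoch
    if abs(param - last_sent) > threshold:
        last_sent = param   # event: communicate the new value
        messages += 1
    # otherwise peers keep using the stale copy in last_sent

print(messages)
```

Only two of the five epochs trigger a send in this trace, which is the mechanism behind the reported reduction in communication load.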
Citations: 3
Accelerate Distributed Stochastic Descent for Nonconvex Optimization with Momentum
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/MLHPCAI4S51975.2020.00011
Guojing Cong, Tianyi Liu
The momentum method has been used extensively in optimizers for deep learning. Recent studies show that distributed training through K-step averaging has many nice properties. We propose a momentum method for such model averaging approaches. At the individual learner level, traditional stochastic gradient descent is applied. At the meta-level (global learner level), a momentum term is applied, which we call block momentum. We analyze the convergence and scaling properties of such momentum methods. Our experimental results show that block momentum not only accelerates training, but also achieves better results.
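The two-level structure can be sketched on a toy quadratic loss f(w) = w²/2 (gradient w). All constants below are illustrative, not from the paper: each worker runs K plain SGD steps, the models are averaged, and the averaged update is accumulated with a meta-level momentum term, the "block momentum".

```python
lr, beta, K, workers = 0.1, 0.5, 5, 4   # illustrative hyperparameters
w_global, velocity = 10.0, 0.0          # global model and block momentum

for _ in range(20):                      # averaging rounds
    local_models = []
    for _ in range(workers):
        w = w_global
        for _ in range(K):
            w -= lr * w                  # K local SGD steps (gradient = w)
        local_models.append(w)
    # (workers are deterministic here, so the average equals any one of them)
    avg = sum(local_models) / workers    # K-step model averaging
    velocity = beta * velocity + (avg - w_global)  # block momentum
    w_global += velocity                 # apply accumulated update

print(abs(w_global) < 0.5)
```

The inner loop is ordinary per-learner SGD; only the outer, global update carries momentum, which is what distinguishes block momentum from applying momentum inside each worker.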
Citations: 0
Workshop Organization – MLHPC 2020
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/mlhpcai4s51975.2020.00006
Citations: 0
Message from the MLHPC Workshop Chairs
IF 32.8 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2020-11-01 DOI: 10.1109/mlhpcai4s51975.2020.00004
Citations: 0