Efficient Neural Architecture Search for Long Short-Term Memory Networks

Hamdi Abed, Bálint Gyires-Tóth
DOI: 10.1109/SAMI50585.2021.9378612
Venue: 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI)
Published: 2021-01-21
Citations: 1

Abstract

Automated machine learning (AutoML) is a technique that helps determine the optimal or near-optimal model for a specific dataset, and it has been an active research area in recent years. The automation of model design opens the door for non-experts to utilize machine learning models in a variety of scenarios, which is appealing both to a wide range of researchers and to cloud-service providers. Neural Architecture Search is a subfield of AutoML in which the optimal artificial neural network architecture is generally searched for with adaptive algorithms. This paper proposes a method for applying Efficient Neural Architecture Search (ENAS) to LSTM-like recurrent architectures, which use a gating mechanism and an inner memory. Using this method, the paper investigates whether the handcrafted Long Short-Term Memory (LSTM) cell is an optimal or near-optimal solution for sequence modelling on a given dataset, or whether other, automatically defined recurrent structures outperform it. The performance of the vanilla LSTM and of advanced recurrent architectures designed by random search and by reinforcement-learning-based ENAS is examined and compared. The proposed methods are evaluated on a text generation task using the Penn TreeBank dataset.
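To make concrete what the search operates over, the following is a minimal NumPy sketch of the standard LSTM cell update — the gating mechanism and inner memory (cell state) referred to above. It is an illustration of the well-known LSTM equations, not code from the paper; the dimensions and weight initialization are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of the standard LSTM cell.

    Four gate pre-activations are computed jointly from the current input x
    and the previous hidden state h_prev; the inner memory c (cell state)
    is updated multiplicatively through the input and forget gates.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b              # stacked pre-activations, shape (4H,)
    i = sigmoid(z[0 * H:1 * H])             # input gate
    f = sigmoid(z[1 * H:2 * H])             # forget gate
    o = sigmoid(z[2 * H:3 * H])             # output gate
    g = np.tanh(z[3 * H:4 * H])             # candidate memory
    c = f * c_prev + i * g                  # inner memory update
    h = o * np.tanh(c)                      # gated hidden output
    return h, c

# Hypothetical sizes, purely for illustration.
rng = np.random.default_rng(0)
D, H = 8, 16                                # input and hidden dimensions
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, U, b)
```

An architecture search in the spirit of ENAS would replace this fixed wiring of gates and activations with a learned or sampled cell topology, which is what the paper compares against this handcrafted baseline.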