
Latest articles in arXiv - CS - General Literature

Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers
Pub Date : 2022-09-26 DOI: arxiv-2209.12816
Nurullah Sevim, Ege Ozan Özyedek, Furkan Şahinuç, Aykut Koç
Transformer-based language models utilize the attention mechanism for substantial performance improvements in almost all natural language processing (NLP) tasks. Similar attention structures are also extensively studied in several other areas. Although the attention mechanism enhances model performance significantly, its quadratic complexity prevents efficient processing of long sequences. Recent works have focused on eliminating the disadvantages of computational inefficiency and showed that transformer-based models can still reach competitive results without the attention layer. A pioneering study proposed FNet, which replaces the attention layer with the Fourier Transform (FT) in the transformer encoder architecture. FNet achieves competitive performance with respect to the original transformer encoder model while accelerating the training process by removing the computational burden of the attention mechanism. However, the FNet model ignores essential properties of the FT from classical signal processing that can be leveraged to further increase model efficiency. We propose different methods to deploy the FT efficiently in transformer encoder models. Our proposed architectures have fewer model parameters, shorter training times, lower memory usage, and some additional performance improvements. We demonstrate these improvements through extensive experiments on common benchmarks.
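The attention-free mixing idea the abstract describes can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the names `dft` and `fourier_mix` are introduced here for illustration, and FNet-style models use optimized O(n log n) FFT kernels rather than this naive O(n^2) DFT. The key point is that token mixing becomes a fixed, parameter-free transform.

```python
import cmath

def dft(seq):
    """Naive discrete Fourier transform of a 1-D sequence of numbers
    (O(n^2); real implementations use an O(n log n) FFT)."""
    n = len(seq)
    return [sum(seq[k] * cmath.exp(-2j * cmath.pi * j * k / n)
                for k in range(n)) for j in range(n)]

def fourier_mix(tokens):
    """FNet-style token mixing for a single example: apply the DFT along
    the hidden dimension, then along the sequence dimension, and keep only
    the real part. `tokens` is a list of token embeddings (lists of floats).
    Note there are no learned parameters anywhere in this sublayer."""
    # DFT along the hidden dimension of each token.
    mixed = [dft(tok) for tok in tokens]
    # DFT along the sequence dimension for each hidden channel.
    seq_mixed = [dft([mixed[t][h] for t in range(len(mixed))])
                 for h in range(len(mixed[0]))]
    # Transpose back to (sequence, hidden) and discard imaginary parts.
    return [[seq_mixed[h][t].real for h in range(len(seq_mixed))]
            for t in range(len(tokens))]
```

In a full encoder block, the output of `fourier_mix` would feed a residual connection, layer norm, and feed-forward sublayer exactly as in a standard transformer; only the attention sublayer is swapped out.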
Citations: 0
Towards a Standardised Performance Evaluation Protocol for Cooperative MARL
Pub Date : 2022-09-21 DOI: arxiv-2209.10485
Rihab Gorsane, Omayma Mahjoub, Ruan de Kock, Roland Dubb, Siddarth Singh, Arnu Pretorius
Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving decentralised decision-making problems at scale. Research in the field has been growing steadily, with many breakthrough algorithms proposed in recent years. In this work, we take a closer look at this rapid development with a focus on evaluation methodologies employed across a large body of research in cooperative MARL. By conducting a detailed meta-analysis of prior work, spanning 75 papers accepted for publication from 2016 to 2022, we bring to light worrying trends that put into question the true rate of progress. We further consider these trends in a wider context and take inspiration from single-agent RL literature on similar issues, with recommendations that remain applicable to MARL. Combining these recommendations with novel insights from our analysis, we propose a standardised performance evaluation protocol for cooperative MARL. We argue that such a standard protocol, if widely adopted, would greatly improve the validity and credibility of future research, make replication and reproducibility easier, and improve the ability of the field to accurately gauge the rate of progress over time by enabling sound comparisons across different works. Finally, we release our meta-analysis data publicly on our project website for future research on evaluation: https://sites.google.com/view/marl-standard-protocol
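One concrete ingredient of standardised evaluation is reporting performance aggregated over many independent seeds with uncertainty estimates rather than a single best run. The sketch below illustrates that kind of reporting with a percentile bootstrap over per-seed returns; it is an illustration of the general practice, not the authors' specific tooling, and the score values are made up.

```python
import random
import statistics

def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean score
    across independent training runs (one score per seed)."""
    rng = random.Random(seed)  # fixed seed so the report is reproducible
    means = sorted(
        statistics.mean(rng.choices(scores, k=len(scores)))
        for _ in range(n_resamples))
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.mean(scores), (lo, hi)

# Hypothetical final evaluation returns for 10 independent seeds.
returns = [0.81, 0.78, 0.85, 0.76, 0.88, 0.79, 0.83, 0.80, 0.77, 0.84]
mean, (lo, hi) = bootstrap_ci(returns)
```

Reporting `mean` together with `(lo, hi)` makes comparisons between algorithms far sounder than quoting a single run, which is one of the failure modes a standard protocol is meant to rule out.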
Citations: 0
Challenges and Opportunities of Large Transnational Datasets: A Case Study on European Administrative Crop Data
Pub Date : 2022-09-19 DOI: arxiv-2210.07178
Maja Schneider, Christian Marchington, Marco Körner
Expansive, informative datasets are vital in providing foundations and possibilities for scientific research and development across many fields of study. Assembly of grand datasets, however, frequently poses difficulty for the author and stakeholders alike, with a variety of considerations required throughout the collaboration efforts and development lifecycle. In this work, we discuss and analyse the challenges and opportunities we faced throughout the creation of a transnational, European agricultural dataset containing reference labels of cultivated crops. Together, this forms a succinct framework of important elements one should consider when forging a dataset of their own.
Citations: 0
An Overview of Phishing Victimization: Human Factors, Training and the Role of Emotions
Pub Date : 2022-09-13 DOI: arxiv-2209.11197
Mousa Jari
Phishing is a form of cybercrime and a threat that allows criminals, phishers, to deceive end users in order to steal their confidential and sensitive information. Attackers usually attempt to manipulate the psychology and emotions of victims. The increasing threat of phishing has made its study worthwhile, and much research has been conducted into the issue. This paper explores the emotional factors that have been reported in previous studies to be significant in phishing victimization. In addition, we compare what security organizations and researchers have highlighted in terms of phishing types and categories, as well as training in tackling the problem, in a literature review which takes into account all major credible and published sources.
Citations: 0
SIND: A Drone Dataset at Signalized Intersection in China
Pub Date : 2022-09-06 DOI: arxiv-2209.02297
Yanchao Xu, Wenbo Shao, Jun Li, Kai Yang, Weida Wang, Hua Huang, Chen Lv, Hong Wang
Intersection is one of the most challenging scenarios for autonomous driving tasks. Due to the complexity and stochasticity, essential applications (e.g., behavior modeling, motion prediction, safety validation, etc.) at intersections rely heavily on data-driven techniques. Thus, there is an intense demand for trajectory datasets of traffic participants (TPs) in intersections. Currently, most intersections in urban areas are equipped with traffic lights. However, there is not yet a large-scale, high-quality, publicly available trajectory dataset for signalized intersections. Therefore, in this paper, a typical two-phase signalized intersection is selected in Tianjin, China. Besides, a pipeline is designed to construct a Signalized INtersection Dataset (SIND), which contains 7 hours of recording including over 13,000 TPs of 7 types. Then, the behaviors of traffic light violations in SIND are recorded. Furthermore, SIND is also compared with other similar works. The features of SIND can be summarized as follows: 1) SIND provides more comprehensive information, including traffic light states, motion parameters, a High Definition (HD) map, etc.; 2) the categories of TPs are diverse and characteristic, with the proportion of vulnerable road users (VRUs) up to 62.6%; and 3) multiple traffic light violations of non-motor vehicles are shown. We believe that SIND would be an effective supplement to existing datasets and can promote related research on autonomous driving. The dataset is available online via: https://github.com/SOTIF-AVLab/SinD
Citations: 0
Decentralized Infrastructure for (Neuro)science
Pub Date : 2022-09-01 DOI: arxiv-2209.07493
Jonny L. Saunders
The most pressing problems in science are neither empirical nor theoretical, but infrastructural. Scientific practice is defined by coproductive, mutually reinforcing infrastructural deficits and incentive systems that everywhere constrain and contort our art of curiosity in service of profit and prestige. Our infrastructural problems are not unique to science, but reflective of the broader logic of digital enclosure, where platformatized control of information production and extraction fuels some of the largest corporations in the world. I have taken lessons learned from decades of intertwined digital cultures within and beyond academia, like wikis, pirates, and librarians, in order to draft a path towards more liberatory infrastructures for both science and society. Based on a system of peer-to-peer linked data, I sketch interoperable systems for shared data, tools, and knowledge that map onto three domains of platform capture: storage, computation, and communication. The challenge of infrastructure is not solely technical, but also social and cultural, and so I attempt to ground a practical development blueprint in an ethics for organizing and maintaining it. I intend this draft as a rallying call for organization, to be revised with the input of collaborators and through the challenges posed by its implementation. I argue that a more liberatory future for science is neither utopian nor impractical -- the truly impractical choice is to continue to organize science as prestige fiefdoms resting on a pyramid scheme of underpaid labor, playing out the clock as every part of our work is swallowed whole by circling information conglomerates. It was arguably scientists looking for a better way to communicate that created something as radical as the internet in the first place, and I believe we can do it again.
Citations: 0
Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
Pub Date : 2022-08-21 DOI: arxiv-2208.09770
Pengcheng He, Baolin Peng, Liyang Lu, Song Wang, Jie Mei, Yang Liu, Ruochen Xu, Hany Hassan Awadalla, Yu Shi, Chenguang Zhu, Wayne Xiong, Michael Zeng, Jianfeng Gao, Xuedong Huang
This paper presents Z-Code++, a new pre-trained language model optimized for abstractive text summarization. The model extends the state-of-the-art encoder-decoder model using three techniques. First, we use a two-phase pre-training process to improve the model's performance on low-resource summarization tasks. The model is first pre-trained using text corpora for language understanding, and then is continually pre-trained on summarization corpora for grounded text generation. Second, we replace the self-attention layers in the encoder with disentangled attention layers, where each word is represented using two vectors that encode its content and position, respectively. Third, we use fusion-in-encoder, a simple yet effective method of encoding long sequences in a hierarchical manner. Z-Code++ creates a new state of the art on 9 out of 13 text summarization tasks across 5 languages. Our model is parameter-efficient in that it outperforms the 600x larger PaLM-540B on XSum, and the finetuned 200x larger GPT3-175B on SAMSum. In zero-shot and few-shot settings, our model substantially outperforms the competing models.
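The hierarchical idea behind fusion-in-encoder can be sketched structurally: encode a long input chunk by chunk with the quadratic local encoder, then let a global layer operate over the concatenated chunk representations. This is a sketch of the general scheme only; `local_encoder`, `global_fuser`, and `chunk_size=512` are placeholders introduced here, not the paper's actual components or hyperparameters.

```python
def fusion_in_encoder(tokens, local_encoder, global_fuser, chunk_size=512):
    """Hierarchical long-sequence encoding sketch: split the input into
    fixed-size chunks, encode each chunk independently (cost is
    O(num_chunks * chunk_size^2) instead of O(len(tokens)^2)), then fuse
    the concatenated chunk states with a single global layer."""
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens), chunk_size)]
    local = [local_encoder(c) for c in chunks]        # per-chunk encoding
    flat = [h for states in local for h in states]    # concatenate states
    return global_fuser(flat)                         # cross-chunk fusion
```

With identity functions for both stages, the pipeline simply reassembles the input, which makes the data flow easy to verify before plugging in real encoder layers.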
Citations: 0
Long-Term Mentoring for Computer Science Researchers
Pub Date : 2022-08-06 DOI: arxiv-2208.04738
Emily Ruppel, Sihang Liu, Elba Garza, Sukyoung Ryu, Alexandra Silva, Talia Ringer
Early in the pandemic, we -- leaders in the research areas of programming languages (PL) and computer architecture (CA) -- realized that we had a problem: the only way to form new lasting connections in the community was to already have lasting connections in the community. Both of our academic communities had wonderful short-term mentoring programs to address this problem, but it was clear that we needed long-term mentoring programs. Those of us in CA approached this scientifically, making an evidence-backed case for community-wide long-term mentoring. In the meantime, one of us in PL had impulsively launched an unofficial long-term mentoring program, founded on chaos and spreadsheets. In January 2021, the latter grew to an official cross-institutional long-term mentoring program called SIGPLAN-M; in January 2022, the former grew to Computer Architecture Long-term Mentoring (CALM). The impacts have been strong: SIGPLAN-M reaches 328 mentees and 234 mentors across 41 countries, and mentees have described it as "life changing" and "a career saver." And while CALM is in its pilot phase -- with 13 mentors and 21 mentees across 7 countries -- it has received very positive feedback. The leaders of SIGPLAN-M and CALM shared our designs, impacts, and challenges along the way. Now, we wish to share those with you. We hope this will kick-start a larger long-term mentoring effort across all of computer science.
Citations: 0
Mary Kenneth Keller: First US PhD in Computer Science
Pub Date : 2022-08-02 DOI: arxiv-2208.01765
Jennifer Head, Dianne P. O'Leary
The first two doctoral-level degrees in Computer Science in the US were awarded in June 1965. This paper discusses one of the degree recipients, Sister Mary Kenneth Keller, BVM.
Citations: 0
RangL: A Reinforcement Learning Competition Platform
Pub Date : 2022-07-28 DOI: arxiv-2208.00003
Viktor Zobernig, Richard A. Saldanha, Jinke He, Erica van der Sar, Jasper van Doorn, Jia-Chen Hua, Lachlan R. Mason, Aleksander Czechowski, Drago Indjic, Tomasz Kosmala, Alessandro Zocca, Sandjai Bhulai, Jorge Montalvo Arvizu, Claude Klöckl, John Moriarty
The RangL project hosted by The Alan Turing Institute aims to encourage the wider uptake of reinforcement learning by supporting competitions relating to real-world dynamic decision problems. This article describes the reusable code repository developed by the RangL team and deployed for the 2022 Pathways to Net Zero Challenge, supported by the UK Net Zero Technology Centre. The winning solutions to this particular Challenge seek to optimize the UK's energy transition policy to net zero carbon emissions by 2050. The RangL repository includes an OpenAI Gym reinforcement learning environment and code that supports both submission to, and evaluation in, a remote instance of the open source EvalAI platform, as well as all winning learning agent strategies. The repository is an illustrative example of RangL's capability to provide a reusable structure for future challenges.
Citations: 0
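The RangL abstract above centers on an OpenAI Gym reinforcement learning environment for a long-horizon decision problem. As a rough illustration of what that interface looks like, here is a minimal Gym-style environment sketch. Everything in it is invented for illustration: the class name, the toy "drive emissions to zero by a deadline" dynamics, and the reward are assumptions, not the actual Pathways to Net Zero environment, and it deliberately avoids importing the gym package so it stands alone.

```python
# A minimal environment exposing the OpenAI Gym reset/step interface,
# written without depending on the gym package. The scenario (a toy
# "cut emissions to zero by a deadline" problem) is invented for
# illustration and is NOT the actual Pathways to Net Zero environment.

class ToyEnergyTransitionEnv:
    """Each step the agent picks an investment level (0, 1, or 2);
    higher investment cuts emissions faster but costs more."""

    HORIZON = 28                  # one step per year, e.g. 2023..2050
    INITIAL_EMISSIONS = 100.0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.year = 0
        self.emissions = self.INITIAL_EMISSIONS
        return self._obs()

    def step(self, action):
        """Apply one action; return (obs, reward, done, info) as in Gym."""
        assert action in (0, 1, 2), "invalid action"
        cost = 2.0 * action                        # spending penalty
        self.emissions = max(0.0, self.emissions - 10.0 * action)
        self.year += 1
        done = self.year >= self.HORIZON
        reward = -(self.emissions + cost)          # penalise emissions + cost
        return self._obs(), reward, done, {}

    def _obs(self):
        return (self.year, self.emissions)


if __name__ == "__main__":
    env = ToyEnergyTransitionEnv()
    obs, done, total_reward = env.reset(), False, 0.0
    while not done:                                # always invest maximally
        obs, reward, done, _ = env.step(2)
        total_reward += reward
    print(env.year, env.emissions)                 # prints: 28 0.0
```

A competition platform like the one the abstract describes would run a learned policy against such an environment on the evaluation server and score it by the accumulated reward.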