The Story of GraphLab - From Scaling Machine Learning to Shaping Graph Systems Research (VLDB 2023 Test-of-Time Award Talk)
Authors: Joseph E. Gonzalez, Yucheng Low
Journal: Proceedings of the VLDB Endowment
Published: 2023-08-01
DOI: 10.14778/3611540.3611637 (https://doi.org/10.14778/3611540.3611637)
Citation count: 0
Abstract
The GraphLab project spanned almost a decade and had a profound academic and industrial impact on large-scale machine learning and graph processing systems. Numerous papers describe the innovations in GraphLab, including the original vertex-centric [8] and edge-centric [3] programming abstractions, high-performance asynchronous execution engines [9], out-of-core graph computation [6], tabular graph systems [4], and even new statistical inference algorithms [2] enabled by the GraphLab project. This work became the basis of multiple PhD theses [1, 5, 7]. The GraphLab open-source project saw broad academic and industrial adoption and ultimately led to the launch of Turi. In this talk, we tell the story of GraphLab: how it began and the key ideas behind it. We focus on the approach to achieving scalable asynchronous systems in machine learning. We explore the impact that GraphLab has had on the development of graph processing systems, graph databases, and AI/ML, and we share our insights and opinions on where we see these fields heading. In the process, we highlight some of the lessons we learned and provide guidance for future students.
About the journal:
The Proceedings of the VLDB (PVLDB) welcomes original research papers on a broad range of research topics related to all aspects of data management, where systems issues play a significant role, such as data management system technology and information management infrastructures, including their very large scale of experimentation, novel architectures, and demanding applications as well as their underpinning theory. The scope of a submission for PVLDB is also described by the subject areas given below. Moreover, the scope of PVLDB is restricted to scientific areas that are covered by the combined expertise on the submission’s topic of the journal’s editorial board. Finally, the submission’s contributions should build on work already published in data management outlets, e.g., PVLDB, VLDBJ, ACM SIGMOD, IEEE ICDE, EDBT, ACM TODS, IEEE TKDE, and go beyond a syntactic citation.