Intersections of Big Data and IoT in Academic Publications: A Topic Modeling Approach
Authors: Diana-Andreea Căuniac, Andreea-Alexandra Cîrnaru, Simona-Vasilica Oprea, Adela Bâra
Journal: Sensors, 25(3), published 2025-02-02 (Journal Article)
DOI: 10.3390/s25030906 (https://doi.org/10.3390/s25030906)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11820817/pdf/
Impact factor: 3.5; JCR: Q2 (Chemistry, Analytical)
Citations: 0
Abstract
As vast amounts of data are generated from sources such as social media, sensors and online transactions, the analysis of Big Data offers organizations the ability to derive insights and make informed decisions. Simultaneously, IoT connects physical devices, enabling real-time data collection and exchange that transforms interactions within smart homes, cities and industries. The intersection of these fields is essential, leading to innovations such as predictive maintenance, real-time traffic management and personalized solutions. Utilizing a dataset of 8159 publications sourced from the Web of Science database, our research employs Natural Language Processing (NLP) techniques and selective human validation to analyze abstracts, titles, keywords and other useful information, uncovering key themes and trends in both Big Data and IoT research. Six topics are extracted using Latent Dirichlet Allocation. In Topic 1, words like "system" and "energy" are among the most frequent, signaling that the topic revolves around data systems and IoT technologies, likely in the context of smart systems and energy-related applications. Topic 2 focuses on the application of technologies, as indicated by terms such as "technologies", "industry" and "research". It deals with how IoT and related technologies are transforming various industries. Topic 3 emphasizes terms like "learning" and "research", indicating a focus on machine learning and IoT applications. It is oriented toward research involving new methods and models in the IoT domain related to learning algorithms. Topic 4 highlights terms such as "smart", suggesting a focus on smart technologies and systems. Topic 5 touches upon the role of digital chains and supply systems, suggesting an industrial focus on digital transformation. Topic 6 focuses on technical aspects such as modeling, system performance and prediction algorithms. It delves into the efficiency of IoT networks, with terms like "accuracy", "power" and "performance" standing out.
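The abstract does not say which LDA implementation, inference method or hyperparameters were used, so the following is only a minimal sketch of how topics and their most frequent words could be extracted with Latent Dirichlet Allocation. It assumes collapsed Gibbs sampling for inference, and the corpus `TOY_DOCS`, the function names `lda_gibbs` and `top_words`, and the values of `alpha`, `beta` and `n_iter` are all hypothetical choices for illustration, not taken from the paper.

```python
import random

# Hypothetical toy corpus of pre-tokenized abstracts (not the paper's data).
TOY_DOCS = [
    "smart energy system sensor energy data".split(),
    "energy system smart grid data system".split(),
    "learning model accuracy prediction learning".split(),
    "model prediction performance accuracy learning".split(),
    "industry supply chain digital industry".split(),
    "digital supply chain technologies industry".split(),
]


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (assumed inference method).

    Returns (doc_topic, topic_word, vocab), where each row of doc_topic
    and topic_word is a probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    n_dk = [[0] * n_topics for _ in docs]             # document-topic counts
    n_kw = [[0] * n_words for _ in range(n_topics)]   # topic-word counts
    n_k = [0] * n_topics                              # tokens per topic
    z = []                                            # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(n_topics)
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                    / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed estimates of the two LDA distributions.
    doc_topic = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in n_dk[d]]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(c + beta) / (n_k[k] + n_words * beta) for c in n_kw[k]]
        for k in range(n_topics)
    ]
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most probable words of every topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


if __name__ == "__main__":
    doc_topic, topic_word, vocab = lda_gibbs(TOY_DOCS, n_topics=3)
    for k, words in enumerate(top_words(topic_word, vocab), start=1):
        print(f"Topic {k}: {', '.join(words)}")
```

Reading off the top words per topic, as `top_words` does, mirrors how the abstract labels each topic from its most frequent terms. On a corpus of thousands of abstracts an optimized library implementation would be the practical choice; the pure-Python sampler is only meant to make the mechanics visible.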
About the journal:
Sensors (ISSN 1424-8220) provides an advanced forum for the science and technology of sensors and biosensors. It publishes reviews (including comprehensive reviews of complete sensor products), regular research papers and short notes. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.