Xiao Haiyang, Yan Ruomei, Wu Yan, Guan Lixin, Li Mengshan
Knowledge Graph for Solubility Big Data: Construction and Applications
WIREs Data Mining and Knowledge Discovery, published 2024-11-01
DOI: 10.1002/widm.1570 (https://doi.org/10.1002/widm.1570)
Citation count: 0
Abstract
Dissolution refers to the process in which solvent and solute molecules attract and combine with each other. The extensive solubility data generated by dissolving various compounds under different conditions are distributed across structured or semi-structured formats in various media, such as text, web pages, tables, images, and databases. These data exhibit multi-source and unstructured features, consistent with the typical 5V characteristics of big data. A solubility big data technology system has emerged from the fusion of solubility data and big data technologies. However, the acquisition, fusion, storage, representation, and utilization of solubility big data face new challenges. Knowledge graphs, large-scale systems for representing and applying knowledge, can effectively describe entities, concepts, and relations across diverse domains. Constructing a solubility big data knowledge graph holds substantial value for the retrieval, analysis, utilization, and visualization of solubility knowledge. Offered as a modest starting point to invite further work, this paper focuses on the solubility big data knowledge graph. First, it summarizes the architecture of solubility knowledge graph construction. Second, it highlights key technologies for solubility big data, such as knowledge extraction, knowledge fusion, and knowledge reasoning, and summarizes the machine learning methods commonly used in knowledge graph construction. Furthermore, it explores application scenarios such as knowledge question answering and recommender systems for solubility big data. Finally, it presents a prospective view of the shortcomings, challenges, and future directions related to constructing a solubility big data knowledge graph. This article proposes the research direction of the solubility big data knowledge graph, which can provide technical references for constructing a solubility knowledge graph. At the same time, it serves as a comprehensive medium for describing data, resources, and their applications across diverse fields such as chemistry, materials, biology, energy, and medicine. It further aids knowledge retrieval and mining, analysis and utilization, and visualization across various disciplines.