The extensive growth of data quantity has posed many challenges to data analysis and retrieval. Noise and redundancy are typical examples of these challenges: they may reduce the reliability of analysis and retrieval results and increase storage and computing overhead. To address these problems, a two-stage data pre-processing framework for noise identification and data reduction, called ARIS, is proposed in this article. The first stage identifies and removes noise in three steps: first, the influence space (IS) is introduced to describe the data distribution; second, a ranking factor (RF) is defined to quantify how likely a point is to be noise, and noise is then defined in terms of RF; third, a clean dataset (CD) is obtained by removing the noise from the original dataset. The second stage learns representative data and realizes data reduction: CD is divided into multiple small regions by IS, and the reduced dataset is formed by collecting the representatives of each region. The performance of ARIS is verified by experiments on artificial and real datasets. Experimental results show that ARIS effectively weakens the impact of noise, reduces the amount of data, and significantly improves the accuracy of data analysis within a reasonable time cost.
{"title":"ARIS: A Noise Insensitive Data Pre-Processing Scheme for Data Reduction Using Influence Space","authors":"Jiang-hui Cai, Yuqing Yang, Haifeng Yang, Xu-jun Zhao, Jing Hao","doi":"10.1145/3522592","DOIUrl":"https://doi.org/10.1145/3522592","url":null,"abstract":"The extensive growth of data quantity has posed many challenges to data analysis and retrieval. Noise and redundancy are typical representatives of the above-mentioned challenges, which may reduce the reliability of analysis and retrieval results and increase storage and computing overhead. To solve the above problems, a two-stage data pre-processing framework for noise identification and data reduction, called ARIS, is proposed in this article. The first stage identifies and removes noises by the following steps: First, the influence space (IS) is introduced to elaborate data distribution. Second, a ranking factor (RF) is defined to describe the possibility that the points are regarded as noises, then, the definition of noise is given based on RF. Third, a clean dataset (CD) is obtained by removing noise from the original dataset. The second stage learns representative data and realizes data reduction. In this process, CD is divided into multiple small regions by IS. Then the reduced dataset is formed by collecting the representations of each region. The performance of ARIS is verified by experiments on artificial and real datasets. Experimental results show that ARIS effectively weakens the impact of noise and reduces the amount of data and significantly improves the accuracy of data analysis within a reasonable time cost range.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126486810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the rapid development of text mining, many studies observe that text generally contains a variety of implicit information, and it is important to develop techniques for extracting such information. Named Entity Recognition (NER), the first step of information extraction, mainly identifies names of persons, locations, and organizations in text. Although existing neural-based NER approaches achieve great success in many language domains, most of them ignore the nested nature of named entities. Recently, diverse studies have focused on the nested NER problem and yielded state-of-the-art performance. This survey provides a comprehensive review of existing approaches for nested NER from the perspectives of model architecture and model properties, which may help readers better understand the current research status and ideas. We first introduce the background of nested NER, especially the differences between nested NER and traditional (i.e., flat) NER. We then review existing nested NER approaches from 2002 to 2020 and classify them into five categories according to model architecture: early rule-based, layered-based, region-based, hypergraph-based, and transition-based approaches. We also explore in greater depth the impact of key properties unique to nested NER approaches from the model-property perspective, namely entity dependency, stage framework, error propagation, and tag scheme. Finally, we summarize the open challenges and point out a few possible future directions in this area. This survey should be useful for three kinds of readers: (i) newcomers who want to learn about NER, especially nested NER; (ii) researchers who want to clarify the relationship and relative advantages of flat NER and nested NER; and (iii) practitioners who need to determine which NER technique (i.e., nested or not) works best in their applications.
{"title":"Nested Named Entity Recognition: A Survey","authors":"Yu Wang, H. Tong, Ziye Zhu, Yun Li","doi":"10.1145/3522593","DOIUrl":"https://doi.org/10.1145/3522593","url":null,"abstract":"With the rapid development of text mining, many studies observe that text generally contains a variety of implicit information, and it is important to develop techniques for extracting such information. Named Entity Recognition (NER), the first step of information extraction, mainly identifies names of persons, locations, and organizations in text. Although existing neural-based NER approaches achieve great success in many language domains, most of them normally ignore the nested nature of named entities. Recently, diverse studies focus on the nested NER problem and yield state-of-the-art performance. This survey attempts to provide a comprehensive review on existing approaches for nested NER from the perspectives of the model architecture and the model property, which may help readers have a better understanding of the current research status and ideas. In this survey, we first introduce the background of nested NER, especially the differences between nested NER and traditional (i.e., flat) NER. We then review the existing nested NER approaches from 2002 to 2020 and mainly classify them into five categories according to the model architecture, including early rule-based, layered-based, region-based, hypergraph-based, and transition-based approaches. We also explore in greater depth the impact of key properties unique to nested NER approaches from the model property perspective, namely entity dependency, stage framework, error propagation, and tag scheme. Finally, we summarize the open challenges and point out a few possible future directions in this area. This survey would be useful for three kinds of readers: (i) Newcomers in the field who want to learn about NER, especially for nested NER. (ii) Researchers who want to clarify the relationship and advantages between flat NER and nested NER. (iii) Practitioners who just need to determine which NER technique (i.e., nested or not) works best in their applications.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125606682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Experimental results are often plotted as two-dimensional graphical plots (aka graphs) in scientific domains, depicting dependent versus independent variables to aid visual analysis of processes. Repeatedly performing laboratory experiments consumes significant time and resources, motivating the need for computational estimation. The goals are to estimate the graph obtained in an experiment given its input conditions, and to estimate the conditions that would lead to a desired graph. Existing estimation approaches often do not meet the accuracy and efficiency needs of targeted applications. We develop a computational estimation approach called AutoDomainMine that integrates clustering and classification over complex scientific data in a single framework, so as to automate the classical learning methods of scientists. Knowledge discovered thereby from a database of existing experiments serves as the basis for estimation. Challenges include preserving domain semantics in clustering, finding matching strategies in classification, striking a good balance between elaboration and conciseness when displaying estimation results based on the needs of targeted users, and deriving objective measures to capture subjective user interests. These and other challenges are addressed in this work. The AutoDomainMine approach is used to build a computational estimation system, rigorously evaluated with real data in Materials Science. Our evaluation confirms that AutoDomainMine provides the desired accuracy and efficiency in computational estimation. It is extendable to other science and engineering domains, as demonstrated by the adaptation of its sub-processes in fields such as Bioinformatics and Nanotechnology.
{"title":"Computational Estimation by Scientific Data Mining with Classical Methods to Automate Learning Strategies of Scientists","authors":"A. Varde","doi":"10.1145/3502736","DOIUrl":"https://doi.org/10.1145/3502736","url":null,"abstract":"Experimental results are often plotted as 2-dimensional graphical plots (aka graphs) in scientific domains depicting dependent versus independent variables to aid visual analysis of processes. Repeatedly performing laboratory experiments consumes significant time and resources, motivating the need for computational estimation. The goals are to estimate the graph obtained in an experiment given its input conditions, and to estimate the conditions that would lead to a desired graph. Existing estimation approaches often do not meet accuracy and efficiency needs of targeted applications. We develop a computational estimation approach called AutoDomainMine that integrates clustering and classification over complex scientific data in a framework so as to automate classical learning methods of scientists. Knowledge discovered thereby from a database of existing experiments serves as the basis for estimation. Challenges include preserving domain semantics in clustering, finding matching strategies in classification, striking a good balance between elaboration and conciseness while displaying estimation results based on needs of targeted users, and deriving objective measures to capture subjective user interests. These and other challenges are addressed in this work. The AutoDomainMine approach is used to build a computational estimation system, rigorously evaluated with real data in Materials Science. Our evaluation confirms that AutoDomainMine provides desired accuracy and efficiency in computational estimation. It is extendable to other science and engineering domains as proved by adaptation of its sub-processes within fields such as Bioinformatics and Nanotechnology.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121509903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Name ambiguity is a prevalent problem in scholarly publications due to the unprecedented growth of digital libraries and the number of researchers. In the absence of a unique identifier, an author is identified by their name, so the documents of an author may be mistakenly assigned because of this underlying ambiguity, which can lead to an improper assessment of the author. Various efforts have been made in the literature to solve the name disambiguation problem with supervised and unsupervised approaches. Unsupervised approaches to author name disambiguation are preferred because of the availability of a large amount of unlabeled data. Bibliographic data contain heterogeneous features, so representation learning-based techniques have recently been used to embed these heterogeneous features into a common space. The documents of a scholar are connected by multiple relations, and research has recently shifted from a single homogeneous relation to multi-dimensional (heterogeneous) relations for the latent representation of documents. Connections in such graphs are sparse, and higher-order links between documents provide additional clues. Therefore, we use multiple neighborhoods of different orders in each relation type of the heterogeneous graph to represent documents. However, neighborhoods of different orders in each relation type have different importance, which we also validate empirically. To properly utilize the different neighborhoods in each relation type and the importance of each relation type in the heterogeneous graph, we propose an attention-based, multi-dimensional, multi-hop neighborhood graph convolutional network for embedding that uses two levels of attention, namely (i) relation-level and (ii) neighborhood-level attention within each relation. The proposed approach obtains a significant improvement over existing state-of-the-art methods in terms of various evaluation metrics.
{"title":"Exploiting Higher Order Multi-dimensional Relationships with Self-attention for Author Name Disambiguation","authors":"K. Pooja, S. Mondal, Joydeep Chandra","doi":"10.1145/3502730","DOIUrl":"https://doi.org/10.1145/3502730","url":null,"abstract":"Name ambiguity is a prevalent problem in scholarly publications due to the unprecedented growth of digital libraries and number of researchers. An author is identified by their name in the absence of a unique identifier. The documents of an author are mistakenly assigned due to underlying ambiguity, which may lead to an improper assessment of the author. Various efforts have been made in the literature to solve the name disambiguation problem with supervised and unsupervised approaches. The unsupervised approaches for author name disambiguation are preferred due to the availability of a large amount of unlabeled data. Bibliographic data contain heterogeneous features, thus recently, representation learning-based techniques have been used in literature to embed heterogeneous features in common space. Documents of a scholar are connected by multiple relations. Recently, research has shifted from a single homogeneous relation to multi-dimensional (heterogeneous) relations for the latent representation of document. Connections in graphs are sparse, and higher order links between documents give an additional clue. Therefore, we have used multiple neighborhoods in different relation types in heterogeneous graph for representation of documents. However, different order neighborhood in each relation type has different importance which we have empirically validated also. Therefore, to properly utilize the different neighborhoods in relation type and importance of each relation type in the heterogeneous graph, we propose attention-based multi-dimensional multi-hop neighborhood-based graph convolution network for embedding that uses the two levels of an attention, namely, (i) relation level and (ii) neighborhood level, in each relation. A significant improvement over existing state-of-the-art methods in terms of various evaluation matrices has been obtained by the proposed approach.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132207812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online bipartite matching has attracted wide interest since it can successfully model the popular online car-hailing problem and the sharing economy. Existing works consider this problem under either the adversarial setting or the i.i.d. setting. The former is too pessimistic to improve performance in the general case; the latter is too optimistic to deal with the varying distribution of vertices. In this article, we initiate the study of the non-stationary online bipartite matching problem, which allows the distribution of vertices to vary with time and is therefore more practical. We divide the non-stationary online bipartite matching problem into two subproblems, the matching problem and the selecting problem, and solve them individually. Combining Batch algorithms and deep Q-learning networks, we first construct a candidate algorithm set to solve the matching problem. For the selecting problem, we use a classical online learning algorithm, Exp3, as the selector algorithm and derive a theoretical bound. We further propose CDUCB as a selector algorithm by integrating distribution change detection into UCB. Rigorous theoretical analysis demonstrates that the performance of our proposed algorithms is no worse than that of any candidate algorithm in terms of competitive ratio. Finally, extensive experiments show that our proposed algorithms achieve much higher performance on the non-stationary online bipartite matching problem compared to the state-of-the-art.
{"title":"Online Learning Bipartite Matching with Non-stationary Distributions","authors":"Weirong Chen, Jiaqi Zheng, Haoyu Yu, Guihai Chen, Yixing Chen, Dongsheng Li","doi":"10.1145/3502734","DOIUrl":"https://doi.org/10.1145/3502734","url":null,"abstract":"Online bipartite matching has attracted wide interest since it can successfully model the popular online car-hailing problem and sharing economy. Existing works consider this problem under either adversary setting or i.i.d. setting. The former is too pessimistic to improve the performance in the general case; the latter is too optimistic to deal with the varying distribution of vertices. In this article, we initiate the study of the non-stationary online bipartite matching problem, which allows the distribution of vertices to vary with time and is more practical. We divide the non-stationary online bipartite matching problem into two subproblems, the matching problem and the selecting problem, and solve them individually. Combining Batch algorithms and deep Q-learning networks, we first construct a candidate algorithm set to solve the matching problem. For the selecting problem, we use a classical online learning algorithm, Exp3, as a selector algorithm and derive a theoretical bound. We further propose CDUCB as a selector algorithm by integrating distribution change detection into UCB. Rigorous theoretical analysis demonstrates that the performance of our proposed algorithms is no worse than that of any candidate algorithms in terms of competitive ratio. Finally, extensive experiments show that our proposed algorithms have much higher performance for the non-stationary online bipartite matching problem comparing to the state-of-the-art.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127400672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data analysis involves the deployment of sophisticated approaches from data mining, information theory, and artificial intelligence in various fields, such as tourism and hospitality, to extract knowledge from gathered and preprocessed data. In tourism, pattern analysis or data analysis using classification is significant for finding patterns that represent new and potentially useful information or knowledge about the destination and other data. Several data mining techniques have been introduced for the classification of data or patterns. However, overfitting, low accuracy, local minima, and sensitivity to noise are drawbacks of some existing data mining classification methods. To overcome these challenges, a data mining strategy based on a support vector machine with red deer optimization (SVM-RDO) is proposed in this article. An extended Kalman filter (EKF) is utilized in the first phase, data cleaning, to remove noise and missing values from the input data. A manta ray foraging algorithm (MaFA) is used in the data selection phase, in which the significant data are selected for further processing to reduce computational complexity. The final phase is classification, in which SVM-RDO is used to extract useful patterns from the selected data. Python is used to implement and evaluate the proposed model. An experimental analysis is performed to show the efficacy of the proposed work. The experimental results show that the proposed SVM-RDO achieves better accuracy, precision, recall, and F1 score than the existing methods on the tourism dataset, demonstrating its effectiveness for pattern analysis.
{"title":"Intelligent Data Analysis using Optimized Support Vector Machine Based Data Mining Approach for Tourism Industry","authors":"Ms Promila Sharma, Uma Meena, Girish Sharma","doi":"10.1145/3494566","DOIUrl":"https://doi.org/10.1145/3494566","url":null,"abstract":"Data analysis involves the deployment of sophisticated approaches from data mining methods, information theory, and artificial intelligence in various fields like tourism, hospitality, and so on for the extraction of knowledge from the gathered and preprocessed data. In tourism, pattern analysis or data analysis using classification is significant for finding the patterns that represent new and potentially useful information or knowledge about the destination and other data. Several data mining techniques are introduced for the classification of data or patterns. However, overfitting, less accuracy, local minima, sensitive to noise are the drawbacks in some existing data mining classification methods. To overcome these challenges, Support vector machine with Red deer optimization (SVM-RDO) based data mining strategy is proposed in this article. Extended Kalman filter (EKF) is utilized in the first phase, i.e., data cleaning to remove the noise and missing values from the input data. Mantaray foraging algorithm (MaFA) is used in the data selection phase, in which the significant data are selected for the further process to reduce the computational complexity. The final phase is the classification, in which SVM-RDO is proposed to access the useful pattern from the selected data. PYTHON is the implementation tool used for the experiment of the proposed model. The experimental analysis is done to show the efficacy of the proposed work. From the experimental results, the proposed SVM-RDO achieved better accuracy, precision, recall, and F1 score than the existing methods for the tourism dataset. Thus, it is showed the effectiveness of the proposed SVM-RDO for pattern analysis.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126270374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In topic models, collections of documents are organized as mixtures over latent clusters called topics, where a topic is a distribution over the vocabulary. In large-scale applications, parametric or finite topic mixture models such as LDA (latent Dirichlet allocation) and its variants are very restrictive in performance due to their reduced hypothesis space. In this article, we address the problems of model selection and of sharing topics across multiple documents in standard parametric topic models. As an alternative, we propose a BNP (Bayesian nonparametric) topic model in which an HDP (hierarchical Dirichlet process) prior models document topic mixtures through their multinomials on the infinite simplex. We therefore propose an asymmetric BL (Beta-Liouville) distribution as a diffuse base measure for the corpus-level DP (Dirichlet process) over a measurable space. This step captures the highly heterogeneous structure in the set of all topics that describes the corpus probability measure. For consistency in posterior inference and predictive distributions, we efficiently characterize random probability measures whose limits are the global and local DPs, approximating the HDP via the stick-breaking formulation with GEM (Griffiths-Engen-McCloskey) random variables. Because the diffuse BL prior is conjugate to the count data distribution, we obtain an improved version of the standard HDP, which is usually based on a symmetric Dirichlet (Dir). In addition, to improve the coordinate ascent framework while taking advantage of its deterministic nature, our model implements an online optimization method based on stochastic, document-level variational inference with natural gradients, which accommodates fast topic learning when processing large collections of text documents. The high predictive likelihood per document, compared to the performance of its competitors, is also consistent with the robustness of our fully asymmetric BL-based HDP. While ensuring the predictive accuracy of the model using the probability of held-out documents, we also add a combination of metrics, such as topic coherence and topic diversity, to improve the quality and interpretability of the discovered topics, and we compare the performance of our model on these metrics against the standard symmetric LDA. We show that the performance of online HDP-LBLA (Latent BL Allocation) is the asymptote for parametric topic models. The accuracy of the results (improved predictive distributions of the held-out documents) is a product of the model's ability to efficiently characterize dependency between documents (topic correlation), as they can now easily share topics, resulting in a much more robust and realistic compression algorithm for information modeling.
{"title":"Stochastic Variational Optimization of a Hierarchical Dirichlet Process Latent Beta-Liouville Topic Model","authors":"Koffi Eddy Ihou, Manar Amayri, N. Bouguila","doi":"10.1145/3502727","DOIUrl":"https://doi.org/10.1145/3502727","url":null,"abstract":"In topic models, collections are organized as documents where they arise as mixtures over latent clusters called topics. A topic is a distribution over the vocabulary. In large-scale applications, parametric or finite topic mixture models such as LDA (latent Dirichlet allocation) and its variants are very restrictive in performance due to their reduced hypothesis space. In this article, we address the problem related to model selection and sharing ability of topics across multiple documents in standard parametric topic models. We propose as an alternative a BNP (Bayesian nonparametric) topic model where the HDP (hierarchical Dirichlet process) prior models documents topic mixtures through their multinomials on infinite simplex. We, therefore, propose asymmetric BL (Beta-Liouville) as a diffuse base measure at the corpus level DP (Dirichlet process) over a measurable space. This step illustrates the highly heterogeneous structure in the set of all topics that describes the corpus probability measure. For consistency in posterior inference and predictive distributions, we efficiently characterize random probability measures whose limits are the global and local DPs to approximate the HDP from the stick-breaking formulation with the GEM (Griffiths-Engen-McCloskey) random variables. Due to the diffuse measure with the BL prior as conjugate to the count data distribution, we obtain an improved version of the standard HDP that is usually based on symmetric Dirichlet (Dir). In addition, to improve coordinate ascent framework while taking advantage of its deterministic nature, our model implements an online optimization method based on stochastic, at document level, variational inference to accommodate fast topic learning when processing large collections of text documents with natural gradient. The high value in the predictive likelihood per document obtained when compared to the performance of its competitors is also consistent with the robustness of our fully asymmetric BL-based HDP. While insuring the predictive accuracy of the model using the probability of the held-out documents, we also added a combination of metrics such as the topic coherence and topic diversity to improve the quality and interpretability of the topics discovered. We also compared the performance of our model using these metrics against the standard symmetric LDA. We show that online HDP-LBLA (Latent BL Allocation)’s performance is the asymptote for parametric topic models. 
The accuracy in the results (improved predictive distributions of the held out) is a product of the model’s ability to efficiently characterize dependency between documents (topic correlation) as now they can easily share topics, resulting in a much robust and realistic compression algorithm for information modeling.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131734788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
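The stick-breaking (GEM) construction mentioned in the abstract, which builds the global DP weights, can be illustrated directly. In the sketch below, the concentration parameter, truncation level, and the Dirichlet-based document-level step are arbitrary choices for illustration; they are not the paper's asymmetric Beta-Liouville construction.

```python
import numpy as np

def gem_weights(alpha, truncation, rng):
    """Truncated GEM stick-breaking: beta_k ~ Beta(1, alpha), pi_k = beta_k * prod_{j<k}(1 - beta_j)."""
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    return betas * remaining

rng = np.random.default_rng(0)
global_weights = gem_weights(alpha=1.0, truncation=20, rng=rng)   # corpus-level topic weights (truncated)
# Document-level weights concentrate around the global ones: a rough sketch of the HDP's second level.
doc_weights = rng.dirichlet(5.0 * global_weights + 1e-6)
print(round(global_weights.sum(), 4), doc_weights[:5].round(3))
```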
A. Davvetas, I. Klampanos, Spiros Skiadopoulos, V. Karkaletsis
Unsupervised representation learning tends to produce generic and reusable latent representations. However, these representations can often miss high-level features or semantic information, since they only observe the implicit properties of the dataset. On the other hand, supervised learning frameworks learn task-oriented latent representations that may not generalise to other tasks or domains. In this article, we introduce evidence transfer, a deep learning method that incorporates the outcomes of external tasks into the unsupervised learning process of an autoencoder. External task outcomes, also referred to as categorical evidence, are represented by categorical variables and are either directly or indirectly related to the primary dataset; in the most straightforward case, they are the outcome of another task on the same dataset. Evidence transfer allows the manipulation of generic latent representations to include domain- or task-specific knowledge that will aid their effectiveness in downstream tasks. Evidence transfer is robust against evidence of low quality and effective when introduced with related, corresponding, or meaningful evidence.
{"title":"Evidence Transfer: Learning Improved Representations According to External Heterogeneous Task Outcomes","authors":"A. Davvetas, I. Klampanos, Spiros Skiadopoulos, V. Karkaletsis","doi":"10.1145/3502732","DOIUrl":"https://doi.org/10.1145/3502732","url":null,"abstract":"Unsupervised representation learning tends to produce generic and reusable latent representations. However, these representations can often miss high-level features or semantic information, since they only observe the implicit properties of the dataset. On the other hand, supervised learning frameworks learn task-oriented latent representations that may not generalise in other tasks or domains. In this article, we introduce evidence transfer, a deep learning method that incorporates the outcomes of external tasks in the unsupervised learning process of an autoencoder. External task outcomes also referred to as categorical evidence, are represented by categorical variables, and are either directly or indirectly related to the primary dataset—in the most straightforward case they are the outcome of another task on the same dataset. Evidence transfer allows the manipulation of generic latent representations in order to include domain or task-specific knowledge that will aid their effectiveness in downstream tasks. Evidence transfer is robust against evidence of low quality and effective when introduced with related, corresponding, or meaningful evidence.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126251878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhaolong Ling, Kui Yu, Lin Liu, Jiuyong Li, Yiwen Zhang, Xindong Wu
Learning a partial Bayesian network (BN) structure is an interesting and challenging problem. Global BN structure learning algorithms are computationally expensive when only one part of the BN structure is of interest, while local BN structure learning algorithms are not a favourable solution either, due to the issue of false edge orientation. To address this problem, this article first presents a detailed analysis of the false edge orientation issue in local BN structure learning algorithms and then proposes PSL, an efficient and accurate Partial BN Structure Learning algorithm. Specifically, PSL divides the V-structures in a Markov blanket (MB) into two types, Type-C V-structures and Type-NC V-structures; it then starts from the given node of interest and recursively finds both types of V-structures in the MB of the current node until all edges in the partial BN structure are oriented. To further improve the efficiency of PSL, the PSL-FS algorithm is designed by incorporating feature selection (FS) into PSL. Extensive experiments with six benchmark BNs validate the efficiency and accuracy of the proposed algorithms.
{"title":"PSL: An Algorithm for Partial Bayesian Network Structure Learning","authors":"Zhaolong Ling, Kui Yu, Lin Liu, Jiuyong Li, Yiwen Zhang, Xindong Wu","doi":"10.1145/3508071","DOIUrl":"https://doi.org/10.1145/3508071","url":null,"abstract":"Learning partial Bayesian network (BN) structure is an interesting and challenging problem. In this challenge, it is computationally expensive to use global BN structure learning algorithms, while only one part of a BN structure is interesting, local BN structure learning algorithms are not a favourable solution either due to the issue of false edge orientation. To address the problem, this article first presents a detailed analysis of the false edge orientation issue with local BN structure learning algorithms and then proposes PSL, an efficient and accurate Partial BN Structure Learning (PSL) algorithm. Specifically, PSL divides V-structures in a Markov blanket (MB) into two types: Type-C V-structures and Type-NC V-structures, then it starts from the given node of interest and recursively finds both types of V-structures in the MB of the current node until all edges in the partial BN structure are oriented. To further improve the efficiency of PSL, the PSL-FS algorithm is designed by incorporating Feature Selection (FS) into PSL. Extensive experiments with six benchmark BNs validate the efficiency and accuracy of the proposed algorithms.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122597226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feature selection is one of the core concepts in machine learning and hugely impacts a model's performance. In some real-world applications, features arrive in a stream, one by one over time, and the exact number of features cannot be known before learning. Online streaming feature selection aims to select optimal stream features at each timestamp on the fly. Without global information about the entire feature space, most existing methods select stream features based on individual feature information or pairwise comparisons of features. This article proposes a new online scalable streaming feature selection framework from the dynamic decision perspective, which is scalable in running time and in the number of selected features through dynamic threshold adjustment. Following the philosophy of "Thinking-in-Threes", we classify each newly arriving feature as selected, discarded, or delayed, aiming to minimize the overall decision risk. With the dynamic updating of global statistical information, we add selected features to the candidate feature subset, ignore discarded features, and cache delayed features in the undetermined feature subset to wait for more information. Meanwhile, we perform redundancy analysis for the candidate features and uncertainty analysis for the undetermined features. Extensive experiments on eleven real-world datasets demonstrate the efficiency and scalability of our new framework compared with state-of-the-art algorithms.
{"title":"Online Scalable Streaming Feature Selection via Dynamic Decision","authors":"Peng Zhou, Shu Zhao, Yuan-Ting Yan, X. Wu","doi":"10.1145/3502737","DOIUrl":"https://doi.org/10.1145/3502737","url":null,"abstract":"Feature selection is one of the core concepts in machine learning, which hugely impacts the model’s performance. For some real-world applications, features may exist in a stream mode that arrives one by one over time, while we cannot know the exact number of features before learning. Online streaming feature selection aims at selecting optimal stream features at each timestamp on the fly. Without the global information of the entire feature space, most of the existing methods select stream features in terms of individual feature information or the comparison of features in pairs. This article proposes a new online scalable streaming feature selection framework from the dynamic decision perspective that is scalable on running time and selected features by dynamic threshold adjustment. Regarding the philosophy of “Thinking-in-Threes”, we classify each new arrival feature as selecting, discarding, or delaying, aiming at minimizing the overall decision risks. With the dynamic updating of global statistical information, we add the selecting features into the candidate feature subset, ignore the discarding features, cache the delaying features into the undetermined feature subset, and wait for more information. Meanwhile, we perform the redundancy analysis for the candidate features and uncertainty analysis for the undetermined features. Extensive experiments on eleven real-world datasets demonstrate the efficiency and scalability of our new framework compared with state-of-the-art algorithms.","PeriodicalId":435653,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data (TKDD)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124599057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}