International journal of database theory and application: latest articles

Research on Quality Cost Model (QCM) based on Quality Improvement Procedure (QIP) in an Auto-factory
Pub Date : 2017-01-31 DOI: 10.14257/ijdta.2017.10.1.26
F. Zhou, Xu Wang, Shan Chen, Yandong He, Lina Zhou
With the implementation of continuous quality improvement procedures (QIP) in Chinese self-brand automotive firms, the quality-related costs need to be identified correspondingly. Quality cost math models can reveal the mathematical relationship between quality level and quality cost, which makes it possible to study the trade-off between these two conflicting objectives and contributes to reducing quality-related cost. The quality index during QIP is established via the Pearson coefficient within the warranty period, and the related quality cost is categorized into conformance and non-conformance cost on the basis of the PAF (prevention, appraisal, failure) classification. Four traditional quality cost math models are analyzed in this paper, and regression analysis based on curve fitting is carried out for a self-brand automotive firm during its QIP. The results verify that the four quality cost models (QCM) show excellent simulation performance and can uncover the optimal quality level and the target R/1000@3MIS, guiding correct operations during QIP. In addition, the most appropriate quality performance level index is aggregated and calculated with a subjective AHP method, which specifies the quality improvement target and the potential cost reduction value.
Pages : 285-298
Citations: 0
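The trade-off between quality level and quality cost described in the abstract above can be illustrated with a minimal sketch. It assumes a textbook-style total cost curve: a conformance cost a*q/(1-q) that grows with the quality level q in (0, 1), plus a non-conformance cost b*(1-q)/q that shrinks with it. This functional form, the coefficients `a` and `b`, and all function names are hypothetical and are not taken from the paper's four models.

```python
import math


def total_quality_cost(q, a, b):
    """Hypothetical total cost at quality level q, with 0 < q < 1."""
    conformance = a * q / (1.0 - q)        # prevention + appraisal, rises with q
    non_conformance = b * (1.0 - q) / q    # failure cost, falls with q
    return conformance + non_conformance


def optimal_quality_level(a, b):
    """Closed-form minimiser: setting a/(1-q)^2 = b/q^2 and solving for q."""
    return math.sqrt(b) / (math.sqrt(a) + math.sqrt(b))


def grid_search_optimum(a, b, steps=1000):
    """Numerical cross-check: evaluate the cost on a grid and take the minimum."""
    candidates = [k / steps for k in range(1, steps)]
    return min(candidates, key=lambda q: total_quality_cost(q, a, b))
```

In practice the coefficients would come from a regression fit to observed cost data; here they are free parameters chosen only to show where the optimum falls.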
Construction of Bi-modal Database for Barrier-free Teaching System
Pub Date : 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.01
Jiling Tang, P. Feng, Zhanlei Li
This paper analyzes the application of Chinese speech recognition technology in barrier-free education systems and studies the construction of a bi-modal database for a barrier-free teaching system. Based on a case study of the course “Foundation of Photoshop”, the paper creates a corpus, covering the acquisition of experimental data and the annotation of the corpus. We also analyze and design the organization of the data and build the essential dictionary and grammar network for the recognition system.
Pages : 1-10
Citations: 0
Research on Deployment Strategies of Combine Harvesters Based on Intelligent Big Data Platform
Pub Date : 2017-01-31 DOI: 10.14257/ijdta.2017.10.1.24
Fan Zhang, Yan Zhang, Haizhao Yuan, Chuanyu Sun, Yihang Li
Current agricultural machinery platforms provide only operational information about farmland and machinery, not effective decision-making services. Low utilization of agricultural machinery and low operating profits have emerged as major issues in the cross-regional operation of combine harvesters. The paper first introduces an intelligent big data platform for agricultural machinery, which not only serves as an information exchange platform for farmers and machine operators but, more importantly, provides decision-making services. The deployment problem of combine harvesters is then analyzed and a deployment model is established. An optimization deployment algorithm with global search strategies, proposed in this paper, is compared with the deployment algorithm with heuristic search strategies proposed in the authors' previous article in terms of deployment profit, cost, and distance. It is concluded that the two algorithms have different applicable conditions. The algorithm proposed in this paper can obtain a better solution with high efficiency and performance.
Pages : 259-270
Citations: 0
An Attack Model on Differential Privacy Preserving Methods for Correlated Time Series
Pub Date : 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.09
Xiong Wenjun, Xu Zhengquan, Hao Wang
Differential privacy has played a significant role in privacy preservation, and it performs well on independent series. However, in real-world applications, most data are released in the form of correlated time series. Although a few differential privacy methods have focused on correlated time series, they are not designed to protect against a specific attack model. Because of this drawback, the effectiveness of these methods cannot be verified and their privacy level cannot be measured. To address the problem, this paper presents an attack model based on the principle of filtering in signal processing theory. Since the noise designed by current methods is independent and its distribution differs from that of the original correlated series, a filter is designed as a unified attack model to remove the independent noise from the perturbed time series. Furthermore, the attack model can be used to measure the effective privacy level of these methods and to compare their performance. Experimental results show that the attack model degrades the privacy level and can work as a unified measurement.
Pages : 89-104
Citations: 1
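The filtering idea in the abstract above can be sketched as follows. The abstract does not say which filter the authors use, so this example substitutes a plain moving-average low-pass filter; the sine-wave series, the noise scale, the window size, and every function name are illustrative assumptions. The point is only that independent Laplace noise added to a smooth, correlated series can be partly averaged away, which brings the attacker's estimate closer to the original values.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def moving_average(series, window):
    """Centered moving average; windows are truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def mean_squared_error(xs, ys):
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run_filter_attack(n=500, scale=1.0, window=9, seed=0):
    """Return (error of the released series, error after the filtering attack)."""
    rng = random.Random(seed)
    # Assumed correlated series: a slow sine wave.
    original = [10.0 * math.sin(2.0 * math.pi * t / 100.0) for t in range(n)]
    perturbed = [x + laplace_noise(scale, rng) for x in original]
    recovered = moving_average(perturbed, window)
    return (mean_squared_error(original, perturbed),
            mean_squared_error(original, recovered))
```

A lower error after filtering means the released series protects the original values less than the nominal noise level suggests, which is the sense in which such an attack can serve as a measurement of effective privacy.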
Semi-supervised Text Classification Using SVM with Exponential Kernel
Pub Date : 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.08
Liyun Zhong
Kernel-based learning methods (kernel methods for short) in general, and the support vector machine (SVM) in particular, have been successfully applied to the task of text classification. This is mainly due to their relatively high classification accuracy in several application domains as well as their ability to handle high-dimensional and sparse data, which are the prohibitive characteristics of textual data representation. A significant challenge in text classification is to reduce the need for labeled training data while maintaining acceptable performance. This paper presents a semi-supervised technique using the exponential kernel for text classification. Specifically, the semantic similarities between terms are first determined from both labeled and unlabeled training data by means of a diffusion process on a graph defined by lexicon and co-occurrence information, and the exponential kernel is then constructed from the learned semantic similarity. Finally, the SVM classifier trains a model for each class during the training phase, and this model is then applied to all test examples in the test phase. The main feature of this approach is that it takes advantage of the exponential kernel to reveal the semantic similarities between terms in an unsupervised manner, which provides a kernel framework for semi-supervised learning. The proposed approach is demonstrated on several benchmark data sets for text classification, and the experimental results show that it can significantly improve classification performance.
Pages : 79-88
Citations: 1
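The diffusion step in the abstract above can be sketched in a few lines. This example assumes the exponential kernel over terms is the matrix exponential exp(beta * A) of a term-graph adjacency matrix A, approximated by a truncated power series, and that two documents are compared through that term kernel. The toy graph, the value of `beta`, and the function names are hypothetical; the abstract does not give the paper's exact construction.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exponential_kernel(adjacency, beta, terms=20):
    """Approximate exp(beta * A) with the series sum_k (beta * A)^k / k!."""
    n = len(adjacency)
    scaled = [[beta * v for v in row] for row in adjacency]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]  # holds (beta * A)^k / k!
    for k in range(1, terms + 1):
        power = [[v / k for v in row] for row in mat_mul(power, scaled)]
        result = [[result[i][j] + power[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def document_kernel(doc_a, doc_b, term_kernel):
    """Similarity of two term-count vectors under the term kernel."""
    n = len(doc_a)
    return sum(doc_a[i] * term_kernel[i][j] * doc_b[j]
               for i in range(n) for j in range(n))
```

With a three-term graph in which only the first two terms are linked, documents containing different but linked terms get a positive similarity, whereas a plain dot product would give zero. The resulting kernel values could then be passed to any SVM implementation that accepts a precomputed kernel.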
Network Data Mining Application in Earnings Management of Private Holding Enterprise: An Empirical Analysis Based on Multiple Regression Model
Pub Date : 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.25
Hong-Tao Liu
An efficient processing platform can effectively analyze massive data and provides strong support for data mining algorithms and data visualization. In this paper, the authors use a new approach that integrates the manner and the direction of earnings management to study the relationship between the ownership structure of China’s listed companies and earnings management. The results show that: the pressure on earnings quality in private holding companies is far greater than in state-owned holding companies; the level of negative earnings management in private holding companies is significantly lower than in state-owned holding companies; and there is a U-shaped relationship between ownership concentration and earnings management, with moderate ownership concentration favoring a lower level of earnings management. Accordingly, with respect to the nature of equity, promoting the development of mixed ownership is in line with national interests; a moderately concentrated ownership structure with a controlling stake is another sign of successful deepening of reform.
Pages : 271-284
Citations: 0
A Numerical Method for Suffix Array Index Compression
Pub Date : 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.19
Baomin Xu, Jie Huang, Yang Yang
Suffix arrays are versatile data structures that play a key role in numerous string processing applications; for example, they can be used to represent given DNA strings. However, the most serious drawback of suffix arrays is their size, namely their space usage. In this paper, we propose a new suffix array compression technique for this problem: a numerical method for suffix array index compression. With this method, the DNA base characters A, T, G, and C are translated to the corresponding integers 1, 2, 3, and 4. The experimental results show that the numerical method not only greatly reduces the memory footprint of the suffix array but also retains its fast search characteristics.
Pages : 207-212
Citations: 0
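The encoding step in the abstract above, together with an ordinary suffix array search, can be sketched as follows. Only the A, T, G, C to 1, 2, 3, 4 mapping comes from the abstract; the naive sort-based construction, the binary search, and the function names are assumptions made for illustration and do not reproduce the paper's compression scheme.

```python
BASE_TO_INT = {"A": 1, "T": 2, "G": 3, "C": 4}  # mapping stated in the abstract


def encode(dna):
    """Translate a DNA string into a list of small integers."""
    return [BASE_TO_INT[base] for base in dna]


def build_suffix_array(codes):
    """Naive construction: sort suffix start positions by the suffix itself."""
    return sorted(range(len(codes)), key=lambda i: codes[i:])


def find_occurrences(codes, suffix_array, pattern_codes):
    """Binary-search the suffix array; return sorted start positions."""
    m = len(pattern_codes)

    def prefix(rank):
        start = suffix_array[rank]
        return codes[start:start + m]

    lo, hi = 0, len(suffix_array)
    while lo < hi:                      # first suffix whose prefix >= pattern
        mid = (lo + hi) // 2
        if prefix(mid) < pattern_codes:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(suffix_array)
    while lo < hi:                      # first suffix whose prefix > pattern
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern_codes:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[first:lo])
```

Note that the numeric order A < T < G < C differs from alphabetical order, so the suffix array of the encoded text is sorted differently from one built on the raw characters; search still works as long as the pattern is encoded with the same mapping. Since only four symbols occur, each base fits in two bits, which hints at where space savings could come from.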
Assessing and Mining of Phylogenetic Trees
Pub Date : 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.07
Geetika Munjal, M. Hanmandlu, Sangeet Srivastva, D. Gaur
Assessing and mining phylogenetic trees is very useful for storing and querying phylogenetic databases, and finding an accurate phylogenetic tree for a set of species is very difficult. Assessing a phylogenetic tree also resolves the problem of conflicting phylogenies. This paper discusses methods for validating and mining phylogenetic trees. We propose a new way to compare two trees by assessing the importance of nodes in the tree. This new method is applied to phylogenetic trees, and the results are compared with symmetric distance, Maximum Agreement Subtree, and bootstrapped trees.
Pages : 67-78
Citations: 2
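The symmetric distance named in the abstract above as a comparison baseline can be sketched for rooted trees. This is not the paper's node-importance method, which the abstract does not specify. The nested-tuple tree representation and the function names are assumptions; the distance counts the clades (sets of leaves under an internal node) that appear in exactly one of the two trees.

```python
def clades(tree):
    """Return the set of clades of a rooted tree given as nested tuples.

    Leaves are strings; an internal node is a tuple of child subtrees.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def symmetric_distance(tree_a, tree_b):
    """Number of clades present in one tree but not in the other."""
    return len(clades(tree_a) ^ clades(tree_b))
```

For two trees over the same species the root clade is shared and cancels out, so the distance is zero exactly when both trees group the leaves identically, regardless of the order in which children are written.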
An Improved Sequential Pattern Algorithm Based on Data Mining
Pub Date : 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.03
Jin Zhao, Runtao Lv, Yu Li
This paper considers several interestingness measures, such as Lift, Conviction, Piatetsky-Shapiro, Cosine, and Jaccard, which have been proposed for mining association rules and classification rules but, unlike the traditional rule measures of support and confidence, have not been applied to mining sequential rules in sequence databases. We then propose an efficient algorithm that generates all relevant sequential rules with these interestingness measures from a prefix-tree that stores the whole set of sequential patterns, where each child node stores a sequential pattern and its corresponding support value. By traversing the prefix-tree, the algorithm can easily identify the components of a rule and calculate the measure values of the rule. The experimental results show that sequential rule mining with interestingness measures using the proposed prefix-tree-based algorithm was always much faster than using the existing modified Full algorithm. Especially when mining large sequence databases with low minimum support values, the number of sequential patterns generated was large and the proposed algorithm performed much better, because it only traverses the prefix-tree to immediately determine which sequences are the left- and right-hand sides of a rule, together with their support values, in order to compute the interestingness measure values of the rule from the sequential pattern set. In addition, the experimental results show that mining sequential rules with the confidence measure took the least time, because it did not need to revisit the prefix-tree to determine the support of Y (the antecedent of rules), while the other interestingness measures need to revisit the prefix-tree to determine the support values of the consequent of rules, or of both the antecedent and the consequent.
Pages : 23-36
Citations: 0
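The measures listed in the abstract above can all be computed from three support values, as the following sketch shows. It uses the standard association-rule definitions with relative supports in the range (0, 1]; for a sequential rule, `sup_xy` is assumed to be the support of the pattern in which the left-hand side is followed by the right-hand side. The function name and argument names are hypothetical, and the paper's own definitions may differ in detail.

```python
import math


def interestingness(sup_x, sup_y, sup_xy):
    """Measures for a rule X -> Y from relative supports.

    sup_x:  support of the left-hand side
    sup_y:  support of the right-hand side
    sup_xy: support of the whole rule pattern
    """
    confidence = sup_xy / sup_x
    if confidence < 1.0:
        conviction = (1.0 - sup_y) / (1.0 - confidence)
    else:
        conviction = math.inf  # the rule never fails
    return {
        "confidence": confidence,
        "lift": confidence / sup_y,
        "conviction": conviction,
        "piatetsky_shapiro": sup_xy - sup_x * sup_y,
        "cosine": sup_xy / math.sqrt(sup_x * sup_y),
        "jaccard": sup_xy / (sup_x + sup_y - sup_xy),
    }
```

Confidence is the only measure here that needs just two of the three supports, which fits the abstract's observation that it is the cheapest one to evaluate.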
Temporal and Spatial Association Rules Strong Mining Algorithm Based on Hierarchical Reasoning Parameters
Pub Date : 2017-01-31 DOI: 10.14257/ijdta.2017.10.1.06
Zhang Xuewu
Applying a traditional genetic algorithm to association rule mining commonly suffers from premature convergence and entrapment in local optima, so extracting useful strong association rules takes a long time. To overcome these drawbacks, this paper introduces an adaptive mutation rate and improves the operator selection method used during the genetic process. The improved association rule mining method is then used to analyze defect data from power transformation equipment. A comparison on this example shows that the improved genetic algorithm significantly reduces the computational complexity of rule discovery and improves the efficiency of association rule mining.
Pages: 57-66
Citations: 0