Pub Date: 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.13
Abdul Salam Shah, Masood Shah, M. Fayaz, F. Wahid, H. Khan, Asadullah Shah
Forensic applications are of great importance in the digital era for the investigation of different types of crimes. Forensic analysis includes Deoxyribonucleic Acid (DNA) tests, crime scene video and images, forged document analysis, computer-based data recovery, fingerprint identification, handwritten signature verification, and facial recognition. Signatures are divided into two types, i.e., genuine and forged. A forged signature can lead to huge financial losses and create other legal issues as well. The forensic investigation process for verifying genuine signatures and detecting forged signatures in law-related departments has been manual; it can be automated using digital image processing techniques and automated forensic signature verification applications. A signature represents a person's authority, so a forged signature may also be used in a crime. Research has been done to automate the forensic investigation process, but due to the internal variation of signatures, the automation of signature verification still remains a challenging problem for researchers. In this paper, we further extend the previous research carried out in [1-2] and propose a forensic signature verification model based on two classifiers, i.e., Multilayer Perceptron (MLP) and Random Forest, for the classification of genuine and forged signatures.
Title: Forensic Analysis of Offline Signatures Using Multilayer Perceptron and Random Forest
Journal: International Journal of Database Theory and Application, pp. 139-148
Pub Date: 2017-01-31 DOI: 10.14257/IJDTA.2017.10.1.22
B. Usharani
Semi-structured data is used to represent data over the Internet. This paper presents an implementation that converts XML documents to SQL tables, processes the relational data with SQL queries to obtain the desired result, stores the results back as XML, and finally displays the XML data in a web page.
Title: Mapping the Semi-Structured Data to the Structured Data for Inverted Index Compression
Journal: International Journal of Database Theory and Application, pp. 235-244
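The XML-to-SQL-to-XML round trip described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `<books>` document, the `book` table, and the helper names `xml_to_table` and `query_to_xml` are all hypothetical, and SQLite stands in for whatever relational engine the author used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document; element and attribute names are illustrative only.
XML_IN = """
<books>
  <book id="1"><title>Databases</title><price>30</price></book>
  <book id="2"><title>XML Basics</title><price>12</price></book>
  <book id="3"><title>SQL Queries</title><price>25</price></book>
</books>
""".strip()


def xml_to_table(xml_text, conn):
    """Shred each <book> element into one row of a relational table."""
    conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
    for book in ET.fromstring(xml_text).iter("book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?, ?)",
            (int(book.get("id")), book.findtext("title"), float(book.findtext("price"))),
        )


def query_to_xml(conn, min_price):
    """Run a SQL query and serialize the result rows back to XML."""
    root = ET.Element("result")
    rows = conn.execute(
        "SELECT id, title, price FROM book WHERE price >= ? ORDER BY id", (min_price,)
    )
    for book_id, title, price in rows:
        node = ET.SubElement(root, "book", id=str(book_id))
        ET.SubElement(node, "title").text = title
        ET.SubElement(node, "price").text = str(price)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
xml_to_table(XML_IN, conn)
xml_out = query_to_xml(conn, 20)
print(xml_out)
```

The final step in the abstract, displaying the XML in a web page, is left out here; the string in `xml_out` is what such a page would receive.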
Pub Date: 2016-12-31 DOI: 10.14257/ijdta.2016.9.12.22
Zhou Lianru, Jiao Xiongfei, L. Lijun, Liang Yuanqin
Medical resources are valuable at home and abroad, and children's medical resources in particular cannot meet demand. In response, the authors designed a mobile medical integration system built on a mobile Internet platform and a cloud computing platform. The system is divided into two parts, the mobile terminal and the cloud computing platform, which are used respectively by the guardian and the doctor. The health monitoring terminal designed for children is a wearable watch, while an app is developed for guardians and doctors. The cloud platform provides modules for data storage, message processing, functional applications, and other functions, forming a “cloud+client” service model. The system makes children's disease prevention, emergency treatment, and medical care more convenient and faster, protects the healthy growth of children, and provides a reference and a solution for the future development of medical care.
Title: Research On Mobile Medical Integration System for Children
Journal: International Journal of Database Theory and Application, pp. 241-252
Pub Date: 2016-12-31 DOI: 10.14257/IJDTA.2016.9.12.02
T. V. Saradhi, K. Subrahmanyam, VENKATESWARA RAO PEDDADA, Hye Jin Kim
Skyline queries are among the best tools for distributed multi-criteria decision making in web-based applications for user recommendations. However, as the number of data dimensions increases, the sizes of the dominance set and the skyline set also increase. Increasing dimensionality is a major problem with real-world databases. In skyline computation, the major cost depends on the dominance tests between high-dimensional objects and the order in which the objects are accessed. The space-filling Z-curve is a suitable way to address these challenges. In this work, we combine the Z-curve with an optimized skyline boundary detection algorithm for effective access and early pruning. We also propose an efficient hybrid index structure that takes advantage of sorting and partitioning approaches to improve storage and search efficiency. Experimental results show that the proposed approach is better than previous static skyline computation techniques in terms of searching and finding the skyline set.
Title: Applying Z-Curve Technique to Compute Skyline Set in Multi Criteria Decision Making System
Journal: International Journal of Database Theory and Application, pp. 9-22
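To make the role of access order concrete, here is a small sketch of a skyline scan in Z-order. It is not the paper's boundary detection algorithm or its hybrid index. It only shows the general property that a Z-curve (Morton code) ordering is monotone with respect to dominance, so a point can only be dominated by a point visited earlier. The function names are hypothetical, coordinates are assumed to be non-negative integers below `2**bits`, and smaller values are assumed to be better.

```python
def z_value(point, bits=8):
    """Interleave the bits of non-negative integer coordinates (Morton code)."""
    code = 0
    dims = len(point)
    for bit in range(bits):
        for d, coord in enumerate(point):
            code |= ((coord >> bit) & 1) << (bit * dims + d)
    return code


def dominates(p, q):
    """p dominates q if it is no worse everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points, bits=8):
    """Scan points in Z-order, testing each one only against the skyline so far."""
    result = []
    for p in sorted(points, key=lambda pt: z_value(pt, bits)):
        if not any(dominates(s, p) for s in result):
            result.append(p)
    return result


points = [(1, 9), (3, 3), (4, 4), (9, 1), (5, 2), (2, 8)]
print(skyline(points))
```

Because a dominating point always has a smaller Z-value, no point already in `result` ever has to be removed, which is the kind of early pruning that a sorted access order makes possible.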
Pub Date: 2016-12-31 DOI: 10.14257/IJDTA.2016.9.12.24
D. Kendal, Oded Koren, N. Perel
Corporations are changing their practices to data-driven big data initiatives, as big data analytics has given companies the ability to grow their businesses and become more competitive. As the importance of data analytics grew, so did the size of the data to analyze, thus demanding a more powerful data platform. This paper presents a case study of two high-level query languages built on top of Hadoop MapReduce: Pig and Hive. A query was written in each language, both producing identical output, and each query was run 30 times on two files of different sizes (120 runs in total), so that the comparison provides a statistically significant conclusion.
Title: Pig Vs. Hive Use Case Analysis
Journal: International Journal of Database Theory and Application, pp. 267-276
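The abstract does not say which statistical test was applied to the 30 runs per query and file. As one possible way to compare two sets of run times, the sketch below computes Welch's t statistic and its approximate degrees of freedom. The choice of Welch's test is an assumption, `welch_t` is a hypothetical helper, and the numbers in the usage lines are placeholders, not measurements from the paper.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard errors of each sample mean (sample variance / sample size).
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t, dof


# Placeholder values only, to show the call; real input would be 30 run times each.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, dof)
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value; the standard library has no t-distribution function, so that step is left out.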
Pub Date: 2016-12-31 DOI: 10.14257/IJDTA.2016.9.12.14
Ying Xu, Xuemei Zhang, Hong Zhang
In a networked environment, supply chain management has greatly shortened the product development cycle and reduced inventory. With the continuous development of information technology, the e-commerce logistics platform has become the main factor affecting the development of the logistics industry. In this paper, the authors study e-commerce platform performance and the green supply chain using data mining and SVM. The green supply chain considers environmental problems in every link of the supply chain and promotes the coordinated development of the economy and the environment. The results show that the most critical factor affecting consumer satisfaction with a B2C e-commerce platform is accurate, complete, and reliable logistics service.
Title: Research on the E-commerce Platform Performance and Green Supply Chain based on Data Mining and SVM
Journal: International Journal of Database Theory and Application, pp. 141-150
Pub Date: 2016-12-31 DOI: 10.14257/ijdta.2016.9.12.12
Lipeng Yang, Fuzhang Wang, Chunmei Fan
Invasive weed optimization (IWO) is a swarm optimization algorithm with both explorative and exploitative power, in which population diversity is obtained by allowing individuals with poor fitness to reproduce and mutate. The differential evolution (DE) optimization algorithm is a random parallel algorithm based on vector differences that moves individuals toward outstanding individuals and converges globally. The traditional k-means algorithm is prone to getting stuck at local optima and is sensitive to random initialization. Against this background, a novel optimization algorithm that hybridizes DE and IWO, denoted IWODE-KM, is employed to optimize the parameters of k-means and is further applied to Chinese text clustering. Experimental results show that the proposed method outperforms both of its ancestors.
Title: A Text Clustering Algorithm based on Weeds and Differential Optimization
Journal: International Journal of Database Theory and Application, pp. 121-130
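The DE half of the hybrid can be sketched as follows. This is the textbook DE/rand/1/bin generation applied to candidate k-means centroid sets, not the paper's IWODE-KM. Each individual is assumed to be a flat list holding `k` centroids back to back, fitness is assumed to be the within-cluster sum of squared errors, and the names `sse` and `de_step` are hypothetical.

```python
import random


def sse(vector, points, k):
    """Sum of squared distances from each point to its nearest centroid."""
    dim = len(vector) // k
    centroids = [vector[i * dim:(i + 1) * dim] for i in range(k)]
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(p, cen)) for cen in centroids)
        for p in points
    )


def de_step(population, points, k, f=0.5, cr=0.9, rng=random):
    """One DE/rand/1/bin generation over flat centroid vectors (needs >= 4 individuals)."""
    next_pop = []
    for i, target in enumerate(population):
        others = [ind for j, ind in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)
        forced = rng.randrange(len(target))  # at least one gene comes from the mutant
        trial = [
            a[g] + f * (b[g] - c[g]) if (g == forced or rng.random() < cr) else target[g]
            for g in range(len(target))
        ]
        # Greedy selection: keep the trial only if it lowers the clustering error.
        better = sse(trial, points, k) < sse(target, points, k)
        next_pop.append(trial if better else target)
    return next_pop
```

Because selection is greedy, the best error in the population never gets worse from one generation to the next. That is what makes a population-based search less dependent on a single random initialization than plain k-means.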
Pub Date: 2016-12-31 DOI: 10.14257/IJDTA.2016.9.12.08
Wang Can-wei
Because big data has correlated multi-dimensional characteristics, how to build effective processing mechanisms and algorithms is still an open problem. As a result, big data processing algorithms consume huge computing resources and time, which wastes energy. To address this problem, the present study proposes a big data processing algorithm that applies random matrix theory, which can effectively improve processing efficiency and thereby increase energy utilization. Results show that the proposed algorithm can effectively reduce the amount of calculation, thus saving the energy required for computation.
Title: Green Mining Algorithm for Big Data Based on Random Matrix
Journal: International Journal of Database Theory and Application, pp. 79-88
Pub Date: 2016-12-31 DOI: 10.14257/IJDTA.2016.9.12.28
Imran Khan, Kamen Ivanov, Qingshan Jiang
High-dimensional data with many features present a significant challenge to current clustering algorithms. Sparsity, noise, and correlation of features are common properties of high-dimensional data. Another essential aspect is that clusters in such data often exist in various subspaces. Ensemble clustering is emerging as a leading technique for improving the robustness, stability, and accuracy of high-dimensional data clusterings. In this paper, we propose FastMap projection for generating subspace component data sets from high-dimensional data. Using the component data sets, we create component clusterings and provide a new objective function that ensembles them by maximizing the average similarity between the component clusterings and the final clustering. Compared with the random sampling and random projection methods, the component clusterings produced by FastMap projection showed high average clustering accuracy without sacrificing clustering diversity in synthetic data analysis. We conducted a series of experiments on real-world data sets from the microarray, text, and image domains, employing three subspace component data generation methods, three consensus functions, and the proposed objective function for ensemble clustering. The experimental results consistently demonstrated that the FastMap projection method with the proposed objective function provided the best ensemble clustering results for all data sets.
Title: FastMap Projection for High-Dimensional Data: A Cluster Ensemble Approach
Journal: International Journal of Database Theory and Application, pp. 311-330
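For readers unfamiliar with FastMap, the sketch below computes a single FastMap coordinate in its standard form: two distant pivot objects are chosen, and every object is projected onto the line through them using the law of cosines. How the paper turns such projections into subspace component data sets is not described in the abstract. The helper names and the use of Euclidean distance are assumptions made for illustration.

```python
import math


def euclidean(p, q):
    """Plain Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def choose_pivots(points, dist):
    """Pivot heuristic: start from the first point, then jump to the farthest point twice."""
    b = max(points, key=lambda p: dist(points[0], p))
    a = max(points, key=lambda p: dist(b, p))
    return a, b


def fastmap_coordinate(points, dist=euclidean):
    """Project every point onto the line through two distant pivot objects."""
    a, b = choose_pivots(points, dist)
    d_ab = dist(a, b)
    if d_ab == 0:
        return [0.0] * len(points)  # all points coincide, nothing to project
    return [
        (dist(a, p) ** 2 + d_ab ** 2 - dist(b, p) ** 2) / (2 * d_ab)
        for p in points
    ]


print(fastmap_coordinate([(0, 0), (1, 0), (2, 0), (4, 0)]))
```

Further coordinates are obtained in the full algorithm by repeating the step on the residual distances that remain after removing this projection; that recursion is omitted here.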
Pub Date: 2016-12-31 DOI: 10.14257/IJDTA.2016.9.12.09
Chunxia Wang
A time series is a sequence of values of an indicator at different times, arranged in chronological order. The basic idea of multi-scale analysis is orthogonal transformation, such as the wavelet transform, which decomposes a signal for analysis at different scales. Time series analysis is carried out through a model-based method: the time-domain analysis of dynamic data fits a parametric model to the observed data and then uses this model to analyze the observed data and the system that produced them. The paper presents the design of a multi-scale data fusion algorithm based on time series analysis. Finally, the advantages of the new algorithm are discussed in terms of estimation accuracy, and simulation demonstrates its effectiveness.
Title: The Design of the Multi-Scale Data Fusion Algorithm Based on Time Series Analysis
Journal: International Journal of Database Theory and Application, pp. 89-100
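The idea of decomposing a signal at different scales can be shown with the simplest wavelet, the Haar wavelet. The abstract does not say which wavelet or fusion rule the paper uses, so this is only an illustration. The version below uses plain averages and differences rather than the orthonormal scaling by 1/sqrt(2), the function names are hypothetical, and the signal length is assumed to be divisible by 2 at every level.

```python
def haar_step(signal):
    """Split an even-length signal into pairwise averages and pairwise differences."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return the coarsest approximation and the detail coefficients, finest scale first."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose by undoing averages and differences, coarse to fine."""
    signal = list(approx)
    for detail in reversed(details):
        signal = [v for a, d in zip(signal, detail) for v in (a + d, a - d)]
    return signal


series = [4, 6, 10, 12, 8, 6, 5, 5]
coarse, details = haar_decompose(series, 2)
print(coarse, details)
```

Each level halves the time resolution, so `coarse` describes the slow trend of the series while `details` holds the fluctuations at each finer scale. A multi-scale fusion scheme could treat those two parts separately.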