Speech and music segregation from a single channel is a challenging task due to background interference and the intermingled signals of the voice and music channels. It is of immense importance because of its utility in a wide range of applications such as music information retrieval, singer identification, and lyrics recognition and alignment. This paper presents an effective method for speech and music segregation. Considering the repeating nature of music, we first detect the local repeating structures in the signal using a locally defined window for each segment. After detecting the repeating structures, we extract them and perform separation using a soft time-frequency mask. We then apply an ideal binary mask to enhance speech and music intelligibility. We evaluated the proposed method on mixtures set at -5 dB, 0 dB, and 5 dB from the Multimedia Information Retrieval 1000 clips (MIR-1K) dataset. Experimental results demonstrate that the proposed method outperforms the existing state-of-the-art methods in terms of Global Normalized Signal-to-Distortion Ratio (GNSDR) values.
{"title":"An Effective Framework for Speech and Music Segregation","authors":"Sidra Sajid, A. Javed, Aun Irtaza","doi":"10.34028/iajit/17/4/9","DOIUrl":"https://doi.org/10.34028/iajit/17/4/9","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115781892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
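The masking step named in the speech and music segregation abstract can be illustrated with a minimal sketch. It assumes magnitude spectrograms are already available; the arrays `V` (mixture) and `W` (repeating-background estimate) and all function names are hypothetical and not taken from the paper. A true ideal binary mask is computed from the clean sources, so the thresholding used here is only a stand-in.

```python
import numpy as np

def soft_mask(mixture_mag, repeating_mag, eps=1e-12):
    """Soft time-frequency mask in [0, 1] for the repeating (music) part."""
    # The repeating estimate cannot hold more energy than the mixture itself.
    repeating = np.minimum(repeating_mag, mixture_mag)
    return repeating / (mixture_mag + eps)

def binary_mask(soft, threshold=0.5):
    """Harden a soft mask: 1 where the repeating part dominates the bin."""
    return (soft >= threshold).astype(float)

def separate(mixture_mag, mask):
    """Split a magnitude spectrogram into (music, speech) estimates."""
    music = mask * mixture_mag
    speech = (1.0 - mask) * mixture_mag
    return music, speech

# Toy 2x3 magnitude spectrogram (frequency bins x time frames).
V = np.array([[4.0, 1.0, 2.0],
              [2.0, 2.0, 8.0]])
# Hypothetical repeating-background estimate for the same bins.
W = np.array([[3.0, 2.0, 0.5],
              [1.5, 2.0, 2.0]])

M = soft_mask(V, W)             # soft mask for the music component
music, speech = separate(V, M)  # the two estimates sum back to V
B = binary_mask(M)              # hardened version of the same mask
```

With either mask the two estimates always add up to the mixture magnitude, so no energy is created or lost by the split.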
Information technology has evolved to the point where all types of data are easily transferred over public channels, so ensuring data security is a necessary requirement. Scrambling data is one way to hide information from unauthorized users. Because an image is matrix content, it can be scrambled simply by adding a mask to the real content; a user who holds the appropriate mask can recover the image content simply by subtracting it. Chaotic functions have recently been used for image encryption. In this paper, an image scrambling algorithm based on three logistic chaotic functions is proposed. Each chaotic function, defined by its initial condition and parameter, generates a random signal. The set of initial conditions and parameters is the encryption key. The performance of this technique rests on two main reasons. First, applying masks to the image makes its content unintelligible. Second, using three successive encryption processes makes attacks difficult. This point reflects, on the one hand, a key length sufficient to resist brute-force attack and, on the other hand, the random aspect of the pixel distribution in the scrambled image: the randomness in a mask minimizes the correlations that exist between neighboring pixels. This makes the proposed approach resistant to known attacks and suitable for applications requiring secure data transfer, such as medical images exchanged between doctors.
{"title":"Generation of Chaotic Signal for Scrambling Matrix Content","authors":"N. Khlif, Ahmed Ghorbel, Walid Aydi, N. Masmoudi","doi":"10.34028/iajit/17/4/13","DOIUrl":"https://doi.org/10.34028/iajit/17/4/13","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114538867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
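The add-a-mask, subtract-a-mask idea in the image scrambling abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the byte quantization of the logistic map output, the modulo-256 arithmetic, the key values in `KEY`, and every function name are hypothetical, and the sketch makes no security claim.

```python
def logistic_mask(x0, r, n):
    """Generate n byte values from the logistic map x -> r * x * (1 - x)."""
    x = x0
    mask = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        # Quantize the chaotic value in (0, 1) to a byte (an assumption).
        mask.append(int(x * 256) % 256)
    return mask

def scramble(pixels, key):
    """Add one chaotic mask per (x0, r) pair, modulo 256."""
    out = list(pixels)
    for x0, r in key:
        mask = logistic_mask(x0, r, len(out))
        out = [(p + m) % 256 for p, m in zip(out, mask)]
    return out

def unscramble(pixels, key):
    """Subtract the same masks; only a holder of the key can rebuild them."""
    out = list(pixels)
    for x0, r in reversed(key):
        mask = logistic_mask(x0, r, len(out))
        out = [(p - m) % 256 for p, m in zip(out, mask)]
    return out

# Hypothetical key: three (initial condition, parameter) pairs.
KEY = [(0.3141, 3.99), (0.2718, 3.97), (0.5772, 3.95)]
image = [10, 10, 10, 11, 200, 201, 202, 255]  # flattened 8-bit pixels
cipher = scramble(image, KEY)
restored = unscramble(cipher, KEY)
```

Because each mask is regenerated deterministically from its initial condition and parameter, the receiver needs only those six numbers, which matches the abstract's description of the key.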
Computing semantic similarity between concepts is an important issue in natural language processing, artificial intelligence, information retrieval and knowledge management. Measuring concept similarity is fundamental to semantic computation. In this paper, we analyze typical semantic similarity measures and note that Wu and Palmer’s measure does not distinguish the similarities between a node and different nodes on the same level. We then combine the advantages of path-based and information content (IC)-based measures and propose a new hybrid method for measuring semantic similarity. Tests on a fragment of the WordNet hierarchical tree demonstrate that the proposed method accurately distinguishes the similarities between a node and different nodes on the same level and overcomes the shortcoming of Wu and Palmer’s measure.
{"title":"A New Hypred Improved Method for Measuring Concept Semantic Similarity in WordNet","authors":"Xiao-gang Zhang, Shouqian Sun, Ke-jun Zhang","doi":"10.34028/iajit/17/4/1","DOIUrl":"https://doi.org/10.34028/iajit/17/4/1","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"445 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123228282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
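The limitation of Wu and Palmer's measure noted in the semantic similarity abstract can be reproduced on a small example. The sketch below uses the standard Wu-Palmer formula, 2 * depth(lcs) / (depth(a) + depth(b)); the taxonomy in `PARENT` is a hypothetical toy tree rather than a real WordNet fragment, and the proposed hybrid measure is not reproduced because the abstract does not define it.

```python
# Hypothetical toy taxonomy (child -> parent); not a real WordNet fragment.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "oak": "plant",
    "pine": "plant",
}

def path_to_root(node):
    """List of nodes from the given node up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))

def lcs(a, b):
    """Least common subsumer: deepest ancestor shared by a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("nodes share no ancestor")

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))

same_level_1 = wu_palmer("dog", "oak")
same_level_2 = wu_palmer("dog", "pine")
siblings = wu_palmer("dog", "cat")
```

Here `oak` and `pine` sit on the same level and share the same least common subsumer with `dog`, so the measure assigns them identical scores. This is the kind of tie that a hybrid measure drawing on information content could break.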
Automatic classification of dynamic hand gestures is challenging because of the large diversity across gesture classes, low resolution, and the fact that gestures are performed with the fingers. Because of these challenges, many researchers focus on this area. Recently, deep neural networks have been used for implicit feature extraction, with a softmax layer for classification. In this paper, we propose a method based on a two-dimensional convolutional neural network that performs detection and classification of hand gestures simultaneously from multimodal Red, Green, Blue, Depth (RGBD) and optical flow data, and passes these features to a Long Short-Term Memory (LSTM) recurrent network for frame-to-frame probability generation, with a Connectionist Temporal Classification (CTC) network for loss calculation. We compute optical flow from the Red, Green, Blue (RGB) data to capture the motion information present in the video. The CTC model is used to efficiently evaluate all possible alignments of a hand gesture via dynamic programming and to check frame-to-frame consistency of the visual similarity of hand gestures in the unsegmented input stream. The CTC network finds the most probable sequence of frames for a gesture class. The frame with the highest probability value is selected from the CTC network by max decoding. The entire CTC network is trained end-to-end by calculating the CTC loss for gesture recognition. We use the challenging Vision for Intelligent Vehicles and Applications (VIVA) dataset for dynamic hand gesture recognition, captured with RGB and depth data. On the VIVA dataset, our proposed hand gesture recognition technique outperforms competing state-of-the-art algorithms and achieves an accuracy of 86%.
{"title":"Connectionist Temporal Classification Model for Dynamic Hand Gesture Recognition using RGB and Optical flow Data","authors":"S. Patel, R. Makwana","doi":"10.34028/iajit/17/4/8","DOIUrl":"https://doi.org/10.34028/iajit/17/4/8","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122676489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
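The max decoding mentioned in the hand gesture abstract is commonly implemented as CTC best-path decoding: take the most probable label in each frame, merge consecutive repeats, and drop the blank symbol. The sketch below assumes that reading. The blank index, the probability table `probs`, and the function name are hypothetical and do not come from the paper.

```python
BLANK = 0  # index reserved for the CTC blank symbol (an assumption here)

def ctc_greedy_decode(frame_probs, blank=BLANK):
    """Best-path decoding: argmax per frame, merge repeats, drop blanks."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded = []
    previous = None
    for label in best:
        # Emit a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded

# Hypothetical per-frame class probabilities: blank plus two gesture classes.
probs = [
    [0.80, 0.10, 0.10],
    [0.20, 0.70, 0.10],
    [0.10, 0.80, 0.10],
    [0.90, 0.05, 0.05],
    [0.10, 0.10, 0.80],
    [0.20, 0.10, 0.70],
]
labels = ctc_greedy_decode(probs)
```

The blank symbol is what lets an unsegmented stream contain the same gesture twice in a row: two runs of one class separated by a blank frame decode to two occurrences instead of being merged.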
Multi-Label Classification (MLC) is a general type of classification that has attracted many researchers in the last few years. Two common approaches are used to solve the MLC problem: Problem Transformation Methods (PTMs) and Algorithm Adaptation Methods (AAMs). This paper focuses on the first approach, since it is more general and applicable to any domain. Specifically, this paper has two objectives. The first is to propose a new multi-label ranking algorithm based on the positive pairwise correlations among labels. The second is to propose new, simple PTMs that are based on label correlations rather than on label frequency, as in conventional PTMs. Experiments showed that the proposed algorithm outperforms the existing methods and algorithms on all evaluation metrics used in the experiments. The proposed PTMs also show superior performance when compared with the existing PTMs.
{"title":"Multi Label Ranking Based on Positive Pairwise Correlations Among Labels","authors":"Raed Alazaidah, F. Ahmad, M. Mohsin","doi":"10.34028/iajit/17/4/2","DOIUrl":"https://doi.org/10.34028/iajit/17/4/2","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115464918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Outsourcing spatial databases to a third party is becoming common practice for a growing number of individuals and companies seeking to save the cost of managing and maintaining a database: a data owner delegates its spatial data management tasks to a third party and authorizes it to provide query services. However, the third party is not fully trusted. Thus, authentication information should be provided to the client for query authentication. In this paper, we introduce an efficient space authenticated data structure, called Verifiable Similarity Indexing tree (VSS-tree), to support authenticated spatial queries. We build the VSS-tree on the SS-tree, which uses bounding spheres rather than bounding rectangles as region shapes, and extend it with authentication information. Based on the VSS-tree, the third party finds the query results and builds their corresponding verification object. The client performs query authentication using the verification object and the published public key. Finally, we evaluate the performance and validity of our algorithms; the experimental results show that the VSS-tree efficiently supports spatial queries and performs better than the Merkle R-tree (MR-tree).
{"title":"Query Authentication of Outsourced Spatial Database","authors":"Jun Hong, Tao Wen, Quan Guo","doi":"10.34028/iajit/17/4/12","DOIUrl":"https://doi.org/10.34028/iajit/17/4/12","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"2 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126335503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MPKC-based Threshold Proxy Signcryption Scheme","authors":"Li Huixian, Gao Jin, Wan Lingyun, Pang Liaojun","doi":"10.34028/IAJIT/17/2/7","DOIUrl":"https://doi.org/10.34028/IAJIT/17/2/7","url":null,"abstract":"","PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124976774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Object tracking is a fundamental task in video surveillance, human-computer interaction and activity analysis. One of the common challenges in visual object tracking is illumination variation. A large number of tracking methods have been proposed in recent years, and the median flow tracker is one that can handle various challenges. The median flow tracker tracks an object using the Lucas-Kanade optical flow method, which is sensitive to illumination variation, and hence fails when sudden illumination changes occur between frames. In this paper, we propose an enhanced median flow tracker that achieves invariance to abruptly varying lighting conditions. In this approach, illumination variation is compensated by modifying the Discrete Cosine Transform (DCT) coefficients of an image in the logarithmic domain. Illumination variations are mainly reflected in the low-frequency coefficients of an image. Therefore, a fixed number of low-frequency DCT coefficients are discarded. Moreover, the DC (zero-frequency) coefficient is kept almost constant throughout the video, based on entropy difference, to minimize the impact of sudden lighting variations. In addition, each video frame is enhanced with a pixel transformation technique that improves the contrast of dull images based on the probability distribution of pixels. The proposed scheme can effectively handle both gradual and abrupt changes in the illumination of the object. Experiments conducted on videos with fast-changing illumination show that the proposed method improves the median flow tracker and achieves higher accuracy than state-of-the-art trackers.
{"title":"Enhanced Median Flow Tracker Based on Photometric Correction for Videos with Abrupt Changing Illumination","authors":"Asha Narayana, N. Venkata","doi":"10.34028/iajit/17/2/15","DOIUrl":"https://doi.org/10.34028/iajit/17/2/15","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"1113 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113982033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
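The log-domain DCT compensation described in the median flow tracker abstract can be sketched in a simplified form. The assumptions are substantial and should be read as such: the DC term is pinned to a fixed target instead of being controlled by entropy difference, the triangular low-frequency region `u + v < n_low` and its size are arbitrary choices, the contrast-enhancing pixel transformation is omitted, and all names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def normalize_illumination(image, n_low=3, target_dc=None):
    """Zero low-frequency log-DCT coefficients and pin the DC term.

    image: 2-D array of strictly positive intensities.
    n_low: coefficients with u + v < n_low are zeroed (assumed region).
    target_dc: DC value to impose; defaults to that of a flat image
               with intensity 128 (an assumption, not the paper's rule).
    """
    rows, cols = image.shape
    cr, cc = dct_matrix(rows), dct_matrix(cols)
    coeffs = cr @ np.log(image) @ cc.T        # 2-D DCT of the log image
    u = np.arange(rows)[:, None]
    v = np.arange(cols)[None, :]
    coeffs[(u + v) < n_low] = 0.0             # drop low-frequency terms
    if target_dc is None:
        target_dc = np.log(128.0) * np.sqrt(rows * cols)
    coeffs[0, 0] = target_dc                  # keep overall brightness fixed
    return np.exp(cr.T @ coeffs @ cc)         # inverse DCT, back from log

frame = np.random.default_rng(0).uniform(20.0, 200.0, size=(8, 8))
dim = normalize_illumination(frame)
bright = normalize_illumination(2.5 * frame)  # same scene, brighter light
```

A global brightness change multiplies every pixel by a constant, which in the log domain adds a constant and therefore alters only the DC coefficient. Pinning that coefficient makes the two outputs agree up to rounding error.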
Detecting denial of service attacks that solely target the router is a top security imperative in deploying IPv6 networks. State-of-the-art Denial of Service detection methods aim to leverage the advantages of flow statistical features and machine learning techniques. However, detection performance is highly affected by the quality of the feature selector and the reliability of the datasets of IPv6 flow information. This paper proposes a new neuro-fuzzy inference system to tackle the problem of classifying packets in IPv6 networks in the critical situation of a small supervised training dataset. The proposed system classifies IPv6 router alert option packets as denial of service or normal, using the strengths of neuro-fuzzy inference to boost classification accuracy. A mathematical analysis from the perspective of fuzzy set theory is provided to express the performance benefit of the proposed system. An empirical performance test is conducted on a comprehensive dataset of IPv6 packets produced in a supervised environment. The results show that the proposed system robustly outperforms some state-of-the-art systems.
{"title":"A Neuro-Fuzzy System to Detect IPv6 Router Alert Option DoS Packets","authors":"S. Abdullah","doi":"10.34028/IAJIT/17/1/3","DOIUrl":"https://doi.org/10.34028/IAJIT/17/1/3","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115039743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The existence of a large number of zombie enterprises affects economic development and hinders the transformation and upgrading of economic industries. To improve the accuracy of zombie enterprise identification, this paper takes multidimensional enterprise data as the original data set, divides it into a training set and a validation set, and gives the corresponding data pre-processing methods. Using 14 standardized features, an ensemble learning model for zombie enterprise classification and recognition is constructed and studied based on three pattern recognition algorithms. The optimal parameters are determined by cross-validation. The Gradient Boosting Decision Tree (GBDT), linear kernel Support Vector Machine (SVM) and Deep Neural Network (DNN) algorithms, with classification accuracies of 95%, 96% and 96%, respectively, are used as sub-models. The stacking method then combines the advantages of these sub-models into a more comprehensive strongly supervised model with a classification accuracy of 98%, which is used to analyze the fundamental information of 30885 enterprises. The study improves the accuracy of zombie enterprise identification to 98%, builds enterprise portraits on this basis, and finally visualizes the classification results through the platform, which provides an auxiliary means for zombie enterprise classification and identification.
{"title":"Design and Study of Zombie Enterprise Classification and Recognition Systems Based on Ensemble Learning","authors":"Shutong Pang, Zi Yang, Chengyou Cai, Zhimin Li","doi":"10.34028/iajit/20/5/3","DOIUrl":"https://doi.org/10.34028/iajit/20/5/3","url":null,"PeriodicalId":161392,"journal":{"name":"The International Arab Journal of Information Technology","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122201610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}