Pub Date : 2023-10-26  DOI: 10.1007/s44196-023-00350-2
Vu Hong Son Pham, Nghiep Trinh Nguyen Dang, Van Nam Nguyen
Abstract The sine cosine algorithm (SCA) is widely recognized for its efficacy in solving optimization problems, although it encounters challenges in striking a balance between exploration and exploitation. To overcome these limitations, a novel model, termed the novel sine cosine algorithm (nSCA), is introduced. In this advanced model, the roulette wheel selection (RWS) mechanism and opposition-based learning (OBL) techniques are integrated to augment its global optimization capabilities. nSCA performance was meticulously evaluated against state-of-the-art optimization algorithms, including the multi-verse optimizer (MVO), salp swarm algorithm (SSA), moth-flame optimization (MFO), grasshopper optimization algorithm (GOA), and whale optimization algorithm (WOA), in addition to the original SCA. This comparative analysis was conducted across 23 classical test functions and 29 CEC2017 benchmark functions, facilitating a comprehensive assessment. The utility of nSCA was further validated through its deployment in five distinct engineering optimization case studies, emphasizing its effectiveness and relevance for real-world optimization problems. Across all tests and practical applications, nSCA consistently outperformed its competitors, furnishing more effective solutions to both theoretical and applied optimization problems.
Title: Hybrid Sine Cosine Algorithm with Integrated Roulette Wheel Selection and Opposition-Based Learning for Engineering Optimization Problems
Journal: International Journal of Computational Intelligence Systems
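The two mechanisms the abstract combines with SCA can be sketched compactly. Below is a minimal NumPy illustration of opposition-based learning and roulette wheel selection on a toy minimization problem (the sphere function); the population size, bounds, and fitness-inversion scheme are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def opposition(population, lower, upper):
    """Opposition-based learning: mirror each candidate within the bounds."""
    return lower + upper - population

def roulette_wheel(fitness):
    """Pick an index with probability proportional to inverted fitness.

    For minimization, smaller (better) fitness gets a larger slice."""
    weights = fitness.max() - fitness + 1e-12  # avoid all-zero weights
    probs = weights / weights.sum()
    return rng.choice(len(fitness), p=probs)

# Toy usage on the sphere function f(x) = sum(x**2)
lower, upper = -5.0, 5.0
pop = rng.uniform(lower, upper, size=(6, 2))
opp = opposition(pop, lower, upper)       # opposite population
merged = np.vstack([pop, opp])            # evaluate both together
fit = (merged ** 2).sum(axis=1)
picked = roulette_wheel(fit)              # fitness-biased parent selection
print(merged[picked], fit[picked])
```

In a full nSCA loop these two steps would be interleaved with the standard sine/cosine position update; here they are shown in isolation.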
Pub Date : 2023-10-26  DOI: 10.1007/s44196-023-00346-y
Rui Zhong, Jun Yu, Chao Zhang, Masaharu Munetomo
Abstract This paper proposes a novel surrogate ensemble-assisted hyper-heuristic algorithm (SEA-HHA) to solve expensive optimization problems (EOPs). A representative HHA consists of two parts: a low-level and a high-level component. In the low-level component, we regard the surrogate-assisted technique as a type of search strategy and design four search strategy archives as low-level heuristics (LLHs): an exploration strategy archive, an exploitation strategy archive, a surrogate-assisted estimation archive, and a mutation strategy archive; each archive contains one or more search strategies. Once the surrogate-assisted estimation archive is activated to generate an offspring individual, SEA-HHA first selects the dataset for model construction according to one of three principles: All Data, Recent Data, and Neighbor, which correspond to global and local surrogate models. The dataset is then randomly divided into training and validation data, and the most accurate model among polynomial regression (PR), support vector regression (SVR), and Gaussian process regression (GPR), combined with the infill sampling criterion, is employed for solution estimation. In the high-level component, we design a random selection function based on pre-defined probabilities to manipulate the set of LLHs. In numerical experiments, we compare SEA-HHA with six optimization techniques on 5-D, 10-D, and 30-D CEC2013 benchmark functions and three engineering optimization problems with a budget of only 1000 fitness evaluations (FEs). The experimental and statistical results show that our proposed SEA-HHA has broad prospects for dealing with EOPs.
Title: Surrogate Ensemble-Assisted Hyper-Heuristic Algorithm for Expensive Optimization Problems
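The surrogate-selection step described above (random training/validation split, then keeping the most accurate model) can be sketched as follows. This hedged example uses plain polynomial regressions of several degrees as the candidate surrogates to stay dependency-free; the paper's ensemble also includes SVR and GPR, and the 1-D objective here is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def expensive_f(x):
    # Stand-in for a costly objective (hypothetical, 1-D for clarity).
    return np.sin(3 * x) + 0.5 * x ** 2

# Archive of already-evaluated points (a simplified "All Data" principle)
X = rng.uniform(-2, 2, 40)
y = expensive_f(X)

# Random split into training and validation data
idx = rng.permutation(len(X))
tr, va = idx[:30], idx[30:]

# Candidate surrogates: polynomial regressions of increasing degree
models = {d: np.polyfit(X[tr], y[tr], d) for d in (1, 2, 3, 5)}
val_err = {d: np.mean((np.polyval(c, X[va]) - y[va]) ** 2)
           for d, c in models.items()}

best = min(val_err, key=val_err.get)        # most accurate model wins
estimate = np.polyval(models[best], 0.7)    # cheap estimate, no true evaluation spent
print(best, estimate)
```

The point of the pattern is that each surrogate estimate replaces one expensive fitness evaluation, which is what keeps the total budget near 1000 FEs.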
Pub Date : 2023-10-23  DOI: 10.1007/s44196-023-00347-x
Yingsun Sun
Abstract With the continuous development of the market economy, the logistics industry is becoming increasingly professionalized, and the logistics distribution sector is growing rapidly. Cold chain supply chain distribution involves multiple distribution points, and route planning often fails to fully account for the distances and travel times between them, resulting in low distribution efficiency. A hierarchical algorithm model based on machine vision can address these problems to a certain extent. This paper takes two cold chain supply chain enterprises as the main research subjects and analyzes how to choose between two kinds of sensors for machine vision, COD and CCD, as well as how to determine the number of scheduled distribution vehicles. A simulation experiment was performed, and the results are summarized and discussed at the end of the article. According to the data sample, most respondents were satisfied with both enterprises' supply chain logistics and distribution vehicle scheduling, but 6 respondents (12% of the total) were dissatisfied with enterprise A, while 16 respondents (32% of the total) were dissatisfied with enterprise B. Thus, although both enterprises have many satisfied respondents, the number dissatisfied with enterprise B far exceeds that of enterprise A. At the same time, as research on supply chain logistics distribution vehicle scheduling continues, research on machine vision also faces new opportunities and challenges.
Title: A Hierarchical Algorithm Model for the Scheduling Problem of Cold Chain Logistics Distribution Vehicles Based on Machine Vision
Abstract Natural language processing (NLP) based on deep learning delivers strong performance for generative dialogue systems, and the transformer model has been a major boost for NLP since the advent of word vectors. In this paper, a Chinese generative dialogue system based on the transformer is designed. It uses only a multi-layer transformer decoder to build the system, with an incomplete-mask design to realize one-way language generation: questions can perceive context information in both directions, while reply sentences are generated autoregressively in one direction only. These design choices make one-way generation for dialogue tasks more logical and reasonable, and the system performs better than traditional dialogue system schemes. Considering the weakness of absolute position coding for long-distance information, we propose an improvement based on relative position coding in theory and verify it in subsequent experiments. In the transformer module, the calculation formula of self-attention is modified, and relative position information is added to replace the absolute position coding of the position embedding layer. The modified model performs well on BLEU, embedding average, and grammatical and semantic coherence, enhancing long-distance attention.
Title: Design of a Modified Transformer Architecture Based on Relative Position Coding
Authors: Wenfeng Zheng, Gu Gong, Jiawei Tian, Siyu Lu, Ruiyang Wang, Zhengtong Yin, Xiaolu Li, Lirong Yin
Pub Date : 2023-10-23  DOI: 10.1007/s44196-023-00345-z
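A minimal sketch of the core modification: relative position information added directly to the self-attention scores, together with a causal mask for one-way generation. The clipped relative-distance parameterization and all sizes are illustrative assumptions; the paper's incomplete mask additionally lets the question segment attend bidirectionally, which this simplified sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(2)
seq_len, d = 5, 8

Q = rng.normal(size=(seq_len, d))
K = rng.normal(size=(seq_len, d))
V = rng.normal(size=(seq_len, d))

# One learnable bias per clipped relative distance (hypothetical parameterization)
max_rel = 3
rel_bias = rng.normal(size=2 * max_rel + 1)
rel = np.clip(np.arange(seq_len)[:, None] - np.arange(seq_len)[None, :],
              -max_rel, max_rel)
# Relative position enters the attention formula itself, not the embedding layer
scores = Q @ K.T / np.sqrt(d) + rel_bias[rel + max_rel]

# Causal mask: position i may not attend to future positions j > i
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores = np.where(mask, -1e9, scores)

# Softmax over each row, then weighted sum of values
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ V
print(out.shape)
```

Because the bias depends only on the clipped distance i-j, the same parameters generalize to positions beyond those seen in training, which is the usual argument for relative over absolute coding.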
Abstract The success of managers' progress, development, and performance improvement lies in their attention to product variety and company effectiveness. Economies of scope (ES) examine the advantages of diversifying a company's production or services, comparing its costs with those of companies that produce the same products or services separately. Data Envelopment Analysis (DEA) is known as a suitable method for evaluating ES and cost effectiveness. DEA models are usually introduced with exact input and output costs, yet many companies and manufacturing industries across production and service sectors may lack accurate information on costs and outputs because of calculation errors, outdated information, and multiple repeated measurements. DEA estimates of ES and cost effectiveness are sensitive to such changes, and some parameters, such as cost and price, fluctuate. It is therefore necessary to focus on interval DEA. The main goals of this article are: (1) to develop new DEA models that measure the ES and cost effectiveness of decision-making units (DMUs) under data uncertainty; these models become non-linear and non-convex; hence, (2) to identify an appropriate range for the ES and cost effectiveness of DMUs from the optimistic and pessimistic viewpoints, allowing decision-makers to use the upper and lower limits, or a combination of them, depending on their viewpoint; (3) to apply the developed models to assess the ES and cost-effectiveness performance of 24 institutions, considering data uncertainties that may affect the quality and reliability of the results; and (4) to analyze the proposed models' features and evaluate the impact of interval data on cost effectiveness and ES. The application of the proposed models for determining ES and cost effectiveness shows that a company can exhibit economies of scope without necessarily being cost effective.
Title: Investigating the Economies of Scope and Cost Effectiveness in Manufacturing Companies with Interval Data
Authors: Elham Zaker Harofteh, Faranak Hosseinzadeh Saljooghi
Pub Date : 2023-10-18  DOI: 10.1007/s44196-023-00340-4
Pub Date : 2023-10-16  DOI: 10.1007/s44196-023-00344-0
Shitharth Selvarajan, Hariprasath Manoharan, Alaa O. Khadidos, Achyut Shankar, Adil O. Khadidos, Edeh Michael Onyema
Abstract In this study, unmanned flying machines are built with real-time monitoring in mid-course settings for obstacle avoidance. The majority of currently available methods are implemented as comprehensive monitoring systems, with significant success in monitored applications such as bridges and railways. The proposed model is therefore developed exclusively for specific monitoring settings, as opposed to the broad conditions assumed by current approaches. In the design model, the first steps limit the procedure to specific heights, and the input thrust provided for the take-off operation is kept to a minimum. Owing to the improved altitudes, the velocity and acceleration units are deliberately increased, making it possible to sidestep obstacles. In addition, Advanced Image Mapping Localization (AIML) is used to carry out the implementation process, identifying stable sites at the correct rotation angle. Moreover, Cyphal protocol integration improves the security of the data-gathering process when transmitting information gathered from sensing devices. The suggested system is put to the test across five different case studies, in which the designed unmanned aerial vehicle detects 25 obstacles on the narrow paths of the considered routes, whereas the existing approach identifies only 14 obstacles on the same routes.
Title: Obstacles Uncovering System for Slender Pathways Using Unmanned Aerial Vehicles with Automatic Image Localization Technique
Pub Date : 2023-10-16  DOI: 10.1007/s44196-023-00341-3
Luis Alfonso Pérez Martos, Ángel Miguel García-Vico, Pedro González, Cristóbal J. Carmona del Jesus
Abstract Clustering is a grouping technique that has long been used to relate data homogeneously. With the huge growth of complex datasets from different sources over the last decade, new paradigms have emerged. Multiclustering is a new concept within clustering that attempts to simultaneously generate multiple clusterings that differ from one another, making it possible to analyze and discover hidden patterns in a dataset that single clustering methods miss. This paper presents a hybrid methodology for multiclustering, called MultiCHCClust, based on an evolutionary approach combined with the concept of hyperrectangles. The algorithm is applied in a post-processing stage and improves the results obtained by a clustering algorithm with respect to the partitioning of the dataset and the optimization of the number of partitions, achieving a high degree of compactness and separation of the partitioned dataset, as observed in a complete experimental study.
Title: A Multiclustering Evolutionary Hyperrectangle-Based Algorithm
Pub Date : 2023-10-14  DOI: 10.1007/s44196-023-00325-3
Paolo Fosci, Giuseppe Psaila
Abstract Since the advent of JSON as a popular format for exchanging large amounts of data, a novel category of NoSQL database systems, named JSON document stores, has emerged for storing JSON data sets; these databases natively manage collections of JSON documents. To help analysts and data engineers query and integrate JSON data sets persistently saved in JSON document stores, the J-CO Framework has been developed (at the University of Bergamo, Italy): it is built around a novel query language, named J-CO-QL+, that provides sophisticated features, including soft-querying capabilities. However, J-CO-QL+ (like other languages for querying JSON data sets) is designed to be general purpose; consequently, it can be cumbersome for users to apply to specific data formats. This is the case for GeoJSON, a popular JSON data format designed to represent geographical information layers. This paper presents the latest evolution of GeoSoft, a novel high-level domain-specific language specifically designed to express complex queries, including soft queries, on GeoJSON documents. GeoSoft is inspired by the classical SQL language, so as to reduce the learning curve for potential users. GeoSoft queries are translated into J-CO-QL+ scripts for execution.
Title: Soft Querying Features in GeoJSON Documents: The GeoSoft Proposal
Pub Date : 2023-10-04DOI: 10.1007/s44196-023-00337-z
Boting Liu, Weili Guan, Changjin Yang, Zhijie Fang, Zhiheng Lu
Abstract Graph convolutional network (GCN) is an effective tool for feature clustering. However, in the text classification task, the traditional TextGCN (GCN for Text Classification) ignores the contextual word order of the text. In addition, TextGCN constructs the text graph only from contextual co-occurrence relationships, so it is difficult for word nodes to learn effective semantic representations. To address these issues, this paper proposes a text classification method that combines a Transformer with GCN. To improve the semantic accuracy of word-node features, we add part-of-speech (POS) tags to the word-document graph and build edges between words based on their POS. Between the GCN layers, a Transformer is used to extract the contextual and sequential information of the text. We conducted experiments on five representative datasets. The results show that our method effectively improves the accuracy of text classification and outperforms the comparison methods.
{"title":"Transformer and Graph Convolutional Network for Text Classification","authors":"Boting Liu, Weili Guan, Changjin Yang, Zhijie Fang, Zhiheng Lu","doi":"10.1007/s44196-023-00337-z","DOIUrl":"https://doi.org/10.1007/s44196-023-00337-z","url":null,"abstract":"Abstract Graph convolutional network (GCN) is an effective tool for feature clustering. However, in the text classification task, the traditional TextGCN (GCN for Text Classification) ignores the contextual word order of the text. In addition, TextGCN constructs the text graph only from contextual co-occurrence relationships, so it is difficult for word nodes to learn effective semantic representations. To address these issues, this paper proposes a text classification method that combines a Transformer with GCN. To improve the semantic accuracy of word-node features, we add part-of-speech (POS) tags to the word-document graph and build edges between words based on their POS. Between the GCN layers, a Transformer is used to extract the contextual and sequential information of the text. We conducted experiments on five representative datasets. The results show that our method effectively improves the accuracy of text classification and outperforms the comparison methods.","PeriodicalId":54967,"journal":{"name":"International Journal of Computational Intelligence Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135591657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
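As a rough illustration of the graph construction and propagation described in the abstract, the following is a minimal NumPy sketch of a word-document graph with POS-based word-word edges and one symmetrically normalized GCN propagation step. All node names, edges, POS tags, and features are invented for illustration; the paper's actual model uses learned embeddings and inserts a Transformer between GCN layers.

```python
import numpy as np

# Illustrative word and document nodes (not from the paper)
nodes = ["doc1", "doc2", "good", "movie", "bad"]
pos_tags = {"good": "ADJ", "movie": "NOUN", "bad": "ADJ"}

n = len(nodes)
A = np.eye(n)  # adjacency with self-loops

# Word-document edges (e.g., derived from occurrence/TF-IDF)
edges = [(0, 2), (0, 3), (1, 3), (1, 4)]

# POS-based word-word edges: connect word nodes sharing a POS tag
word_ids = [i for i, v in enumerate(nodes) if v in pos_tags]
for i in word_ids:
    for j in word_ids:
        if i < j and pos_tags[nodes[i]] == pos_tags[nodes[j]]:
            edges.append((i, j))

for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Symmetric normalization: A_hat = D^{-1/2} A D^{-1/2}
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

# One GCN propagation step: H = ReLU(A_hat X W)
rng = np.random.default_rng(0)
X = rng.normal(size=(n, 8))   # initial node features
W = rng.normal(size=(8, 4))   # layer weights
H = np.maximum(A_hat @ X @ W, 0.0)
print(H.shape)
```

In a full model, H would be fed through a Transformer encoder before the next GCN layer, so that sequential word-order information supplements the purely structural graph signal.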
Pub Date : 2023-10-04DOI: 10.1007/s44196-023-00342-2
Shixuan Li, Wenxuan Shi
Abstract Text-based factors have been widely regarded as promising features for financial applications. This study focuses on extracting both basic and semantic textual features to supplement the traditionally used financial indicators. The main aim is to improve financial distress prediction (FDP) for Chinese listed companies. A unique paradigm is proposed that combines financial and multi-type textual predictive factors, feature selection methods, classifiers, and time spans to achieve optimal FDP. Frequency-count, TF-IDF, TextRank, and word-embedding approaches are employed to extract frequency-based, keyword-based, sentiment, and readability indicators. The experimental results show that financial-domain sentiment lexicons, word-embedding-based readability analysis, and the basic textual features of the Management Discussion and Analysis section can be important elements of FDP. Moreover, the findings highlight that incorporating financial and textual features achieves optimal performance 4 or 5 years before the expected baseline year, and that the combined RF-GBDT model outperforms the other classifiers. This study makes an innovative contribution, since it extends multiple text-analysis methods in the financial text-mining field and provides new findings on how to produce early warning signs of financial risk. The approaches developed in this research can serve as a template for resolving other financial issues.
{"title":"Incorporating Multiple Textual Factors into Unbalanced Financial Distress Prediction: A Feature Selection Methods and Ensemble Classifiers Combined Approach","authors":"Shixuan Li, Wenxuan Shi","doi":"10.1007/s44196-023-00342-2","DOIUrl":"https://doi.org/10.1007/s44196-023-00342-2","url":null,"abstract":"Abstract Text-based factors have been widely regarded as promising features for financial applications. This study focuses on extracting both basic and semantic textual features to supplement the traditionally used financial indicators. The main aim is to improve financial distress prediction (FDP) for Chinese listed companies. A unique paradigm is proposed that combines financial and multi-type textual predictive factors, feature selection methods, classifiers, and time spans to achieve optimal FDP. Frequency-count, TF-IDF, TextRank, and word-embedding approaches are employed to extract frequency-based, keyword-based, sentiment, and readability indicators. The experimental results show that financial-domain sentiment lexicons, word-embedding-based readability analysis, and the basic textual features of the Management Discussion and Analysis section can be important elements of FDP. Moreover, the findings highlight that incorporating financial and textual features achieves optimal performance 4 or 5 years before the expected baseline year, and that the combined RF-GBDT model outperforms the other classifiers. This study makes an innovative contribution, since it extends multiple text-analysis methods in the financial text-mining field and provides new findings on how to produce early warning signs of financial risk. The approaches developed in this research can serve as a template for resolving other financial issues.","PeriodicalId":54967,"journal":{"name":"International Journal of Computational Intelligence Systems","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135644083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
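As a rough sketch of the feature-combination and ensemble idea in the abstract above, the following assumes scikit-learn and combines TF-IDF text features with a financial-indicator column, then soft-votes Random Forest and Gradient Boosting probabilities as a simple stand-in for the paper's RF-GBDT combination. The texts, labels, and ratio values are entirely synthetic.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Toy MD&A-style snippets and a single synthetic financial ratio (illustrative only)
texts = ["revenue grew strongly", "heavy losses and rising debt",
         "stable cash flow", "default risk and overdue loans"] * 10
y = np.array([0, 1, 0, 1] * 10)  # 1 = financially distressed
financial = np.tile(np.array([[0.8], [0.1], [0.7], [0.05]]), (10, 1))

# Textual features (TF-IDF) concatenated with the financial indicator
X_text = TfidfVectorizer().fit_transform(texts).toarray()
X = np.hstack([X_text, financial])

# Soft-vote of RF and GBDT class probabilities
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
gbdt = GradientBoostingClassifier(random_state=0).fit(X, y)
proba = (rf.predict_proba(X) + gbdt.predict_proba(X)) / 2
pred = proba.argmax(axis=1)
print("training accuracy:", (pred == y).mean())
```

In practice the class imbalance, feature selection step, and multi-year time spans described in the paper would be layered on top of this basic pipeline.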