MoRGH: movie recommender system using GNNs on heterogeneous graphs
Pub Date: 2024-08-12 | DOI: 10.1007/s10115-024-02196-2
Seyed Sina Ziaee, Hossein Rahmani, Mohammad Nazari
Nowadays, with the abundance of movies and TV shows and the competition between movie streaming companies and movie databases to attract more users, movie recommenders have become a major prerequisite for customer satisfaction. Most previously introduced methods used collaborative, content-based, and hybrid filtering techniques, while neural network-based approaches and matrix completion dominate the most recent movie recommender systems. The major drawbacks of previous systems are that they do not consider side information, such as plot synopses, and that they suffer from the cold-start problem. In this paper, we propose a novel inductive approach called MoRGH, which first constructs a graph of similar movies by considering the information available in movies’ plot synopses and genres. Second, we construct a heterogeneous graph that includes two types of nodes, movies and users, built from the MovieLens dataset and the similarity graph generated in the first stage: each edge between a user and a movie represents the user’s rating for that movie, and each edge between two movies represents the similarity between them. Third, MoRGH mitigates the drawbacks of previous methods by employing a GNN- and GAE-based model that combines collaborative and content-based approaches. This hybrid approach allows MoRGH to provide accurate and more personalized recommendations for each user, outperforming previous state-of-the-art models in terms of RMSE and demonstrating its ability to deliver enhanced recommendations compared to existing models.
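For readers unfamiliar with heterogeneous graph construction, the sketch below shows how a graph of the shape described (user and movie nodes, user-movie rating edges, movie-movie similarity edges) can be assembled. It is a minimal sketch assuming the PyTorch Geometric library; all sizes, indices, and feature dimensions are illustrative placeholders, not values from the paper.

```python
# Minimal heterogeneous graph of users and movies with PyTorch Geometric.
import torch
from torch_geometric.data import HeteroData

data = HeteroData()

# Node features: 100 users and 50 movies with random 16-d embeddings (placeholders).
data['user'].x = torch.randn(100, 16)
data['movie'].x = torch.randn(50, 16)

# User -> movie rating edges: edge_index holds (user_idx, movie_idx) pairs,
# edge_attr holds the rating each user gave that movie.
data['user', 'rates', 'movie'].edge_index = torch.tensor([[0, 1, 2],
                                                          [3, 3, 7]])
data['user', 'rates', 'movie'].edge_attr = torch.tensor([[4.0], [3.5], [5.0]])

# Movie -> movie similarity edges, as would be derived from plot-synopsis and
# genre similarity (hard-coded here); edge_attr holds the similarity score.
data['movie', 'similar_to', 'movie'].edge_index = torch.tensor([[3, 7],
                                                                [7, 3]])
data['movie', 'similar_to', 'movie'].edge_attr = torch.tensor([[0.82], [0.82]])

print(data)
```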
{"title":"MoRGH: movie recommender system using GNNs on heterogeneous graphs","authors":"Seyed Sina Ziaee, Hossein Rahmani, Mohammad Nazari","doi":"10.1007/s10115-024-02196-2","DOIUrl":"https://doi.org/10.1007/s10115-024-02196-2","url":null,"abstract":"<p>Nowadays, with the advent of movies and TV shows and the competition between different movie streamer companies and movie databases to attract more users, movie recommenders have become a major prerequisite for customer satisfaction. Most of the previously introduced methods used collaborative, content-based, and hybrid filtering techniques, where neural network-based approaches and matrix completion are the major approaches of most recent movie recommender systems. The major drawbacks of previous systems are not considering side information, such as plot synopsis and cold start problem, in the context of movie recommendations. In this paper, we propose a novel inductive approach called MoRGH which first constructs a graph of similar movies by considering the information available in movies’ plot synopsis and genres. Second, we construct a heterogeneous graph that includes two types of nodes: movies and users. This graph is built using the MovieLens dataset and the similarity graph generated in the first stage, where each edge between a user and a movie represents the user’s rating for that movie, and each edge between two movies represents the similarity between them. Third, MoRGH mitigates the drawbacks of previous methods by employing a GNN and GAE-based model that combines collaborative and content-based approaches. This hybrid approach allows MoRGH to provide accurate and more personalized recommendations for each user, outperforming previous state-of-the-art models in terms of RMSE scores. The achieved improvement in RMSE scores demonstrates MoRGH’s superior performance and its ability to deliver enhanced recommendations compared to existing models.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"118 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Disease outbreak prediction using natural language processing: a review
Pub Date: 2024-08-06 | DOI: 10.1007/s10115-024-02192-6
Avneet Singh Gautam, Zahid Raza
Research on disease outbreak prediction has received enormous interest owing to the COVID-19 pandemic, and natural language processing using user-generated text data has proven quite effective for this task. Disease outbreaks that occur frequently can be predicted with relative ease, but novel disease outbreaks are difficult to predict. This review summarizes the research concerning disease outbreaks and the use of datasets such as news headlines, tweets, and search engine queries with natural language processing techniques. Existing state-of-the-art systems are discussed analytically, together with their contributions and limitations, providing insight into the existing research in the domain of disease outbreak prediction. A total of 146 articles were reviewed in this study, and the results show that news and Twitter datasets are used most often to predict disease outbreaks. This review underlines the fact that numerous works in the literature are based on specific outbreak-related Internet-sourced text data, viz. news, tweets, and search engine queries. However, this becomes a limitation for any disease outbreak prediction system, as such a system can predict only specific disease outbreaks; this motivates the development of systems capable of disease outbreak prediction without any bias.
{"title":"Disease outbreak prediction using natural language processing: a review","authors":"Avneet Singh Gautam, Zahid Raza","doi":"10.1007/s10115-024-02192-6","DOIUrl":"https://doi.org/10.1007/s10115-024-02192-6","url":null,"abstract":"<p>Research on disease outbreak prediction has suddenly received an enormous interest owing to the COVID-19 pandemic. Natural language processing using user-generated text data has proven to be quite effective for the same. Disease outbreaks that occur frequently can be easily predicted, but novel disease outbreaks are difficult to predict. This review work attempts to summarize the research concerning disease outbreaks and the use of datasets such as news headlines, tweets, and search engine queries using natural language processing techniques. Existing state-of-the-art systems have been analytically discussed with their contributions and limitations. This work is an insight into the existing research in the domain of disease outbreak prediction. A total of 146 articles were reviewed in this study, and results show that news and Twitter datasets are being used most to predict disease outbreaks. This research underlines the fact that numerous works are available in the literature based on specific outbreak-related Internet-sourced text data, viz. news, tweets, and search engine queries. However, this becomes a limitation for any disease outbreak prediction system as it can predict only specific disease outbreaks and motivates the development of systems capable of disease outbreak prediction without any bias.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"43 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A multi-view mask contrastive learning graph convolutional neural network for age estimation
Pub Date: 2024-08-06 | DOI: 10.1007/s10115-024-02193-5
Yiping Zhang, Yuntao Shou, Tao Meng, Wei Ai, Keqin Li
The age estimation task aims to use facial features to predict a person's age and is widely used in public security, marketing, identification, and other fields. However, these features are mainly concentrated around facial keypoints, and existing CNN- and Transformer-based methods exhibit inflexibility and redundancy when modeling complex irregular structures. Therefore, this paper proposes a multi-view mask contrastive learning graph convolutional neural network (MMCL-GCN) for age estimation. Specifically, the overall MMCL-GCN network contains a feature extraction stage and an age estimation stage. In the feature extraction stage, we introduce a graph structure to represent face images as input and then design a multi-view mask contrastive learning (MMCL) mechanism to learn complex structural and semantic information about face images. The learning mechanism employs an asymmetric Siamese network architecture, which uses an online encoder-decoder structure to reconstruct the missing information from the original graph and a target encoder to learn latent representations for contrastive learning. Furthermore, to make the two learning mechanisms more compatible and complementary, we adopt two augmentation strategies and optimize the joint losses. In the age estimation stage, we design a multi-layer extreme learning machine (ML-IELM) with identity mapping to fully use the features extracted by the online encoder. A classifier and a regressor are then constructed on top of ML-IELM to identify the age group interval and accurately estimate the final age. Extensive experiments show that MMCL-GCN effectively reduces age estimation error on benchmark datasets such as Adience, MORPH-II, and LAP-2016.
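As a rough illustration of how a masked-reconstruction objective and a contrastive objective can be trained jointly across two branches, here is a minimal PyTorch sketch. The stand-in linear encoders, the InfoNCE formulation, and the loss weight `lam` are illustrative assumptions, not the paper's exact MMCL design or its GCN layers.

```python
# Joint masked-reconstruction + contrastive loss over an online and a target branch.
import torch
import torch.nn as nn
import torch.nn.functional as F

def joint_loss(x, mask, online_enc, decoder, target_enc, lam=0.5, tau=0.2):
    x_masked = x.clone()
    x_masked[mask] = 0.0                         # hide the masked node features

    z_online = online_enc(x_masked)              # online branch sees masked input
    x_rec = decoder(z_online)
    rec_loss = F.mse_loss(x_rec[mask], x[mask])  # reconstruct only what was hidden

    with torch.no_grad():                        # target branch gets no gradients
        z_target = target_enc(x)

    # InfoNCE: the same node across the two branches forms the positive pair.
    z1 = F.normalize(z_online, dim=-1)
    z2 = F.normalize(z_target, dim=-1)
    logits = z1 @ z2.t() / tau
    ctr_loss = F.cross_entropy(logits, torch.arange(x.size(0)))

    return rec_loss + lam * ctr_loss

# Toy usage with stand-in linear encoders on 32 "nodes" of 16-d features.
enc, dec, tgt = nn.Linear(16, 8), nn.Linear(8, 16), nn.Linear(16, 8)
x = torch.randn(32, 16)
mask = torch.zeros(32, dtype=torch.bool)
mask[:8] = True                                  # mask the first 8 nodes
print(joint_loss(x, mask, enc, dec, tgt))
```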
{"title":"A multi-view mask contrastive learning graph convolutional neural network for age estimation","authors":"Yiping Zhang, Yuntao Shou, Tao Meng, Wei Ai, Keqin Li","doi":"10.1007/s10115-024-02193-5","DOIUrl":"https://doi.org/10.1007/s10115-024-02193-5","url":null,"abstract":"<p>The age estimation task aims to use facial features to predict the age of people and is widely used in public security, marketing, identification, and other fields. However, the features are mainly concentrated in facial keypoints, and existing CNN and Transformer-based methods have inflexibility and redundancy for modeling complex irregular structures. Therefore, this paper proposes a multi-view mask contrastive learning graph convolutional neural network (MMCL-GCN) for age estimation. Specifically, the overall structure of the MMCL-GCN network contains a feature extraction stage and an age estimation stage. In the feature extraction stage, we introduce a graph structure to construct face images as input and then design a multi-view mask contrastive learning (MMCL) mechanism to learn complex structural and semantic information about face images. The learning mechanism employs an asymmetric Siamese network architecture, which utilizes an online encoder–decoder structure to reconstruct the missing information from the original graph and utilizes the target encoder to learn latent representations for contrastive learning. Furthermore, to promote the two learning mechanisms better compatible and complementary, we adopt two augmentation strategies and optimize the joint losses. In the age estimation stage, we design a multi-layer extreme learning machine (ML-IELM) with identity mapping to fully use the features extracted by the online encoder. Then, a classifier and a regressor were constructed based on ML-IELM, which were used to identify the age grouping interval and accurately estimate the final age. Extensive experiments show that MMCL-GCN can effectively reduce the error of age estimation on benchmark datasets such as Adience, MORPH-II, and LAP-2016.\u0000</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"2012 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Crop health assessment through hierarchical fuzzy rule-based status maps
Pub Date: 2024-08-05 | DOI: 10.1007/s10115-024-02180-w
Danilo Cavaliere, Sabrina Senatore, Vincenzo Loia
Precision agriculture is evolving toward a contemporary approach that involves multiple sensing techniques to monitor and enhance crop quality while minimizing losses and the waste of resources no longer considered inexhaustible, such as soil and water. To understand crop status, it is necessary to integrate data from heterogeneous sensors and to employ advanced sensing devices that can assess crop and water status. This study presents a smart monitoring approach in agriculture involving sensors that can be both stationary (such as soil moisture sensors) and mobile (such as sensor-equipped unmanned aerial vehicles). These sensors collect information from visual maps of crop production and water conditions to comprehensively understand the crop area and spot any potential vegetation problems. A modular fuzzy control scheme is designed to interpret spectral indices and vegetative parameters and, by applying fuzzy rules, return maps of vegetation status. The rules are applied incrementally, following a hierarchical design, to correlate lower-level data (e.g., temperature, vegetation indices) with higher-level data (e.g., vapor pressure deficit) and thus robustly determine the vegetation status and the main parameters that led to it. A case study involving satellite images of artichoke crops in Salerno, Italy, demonstrates the potential of incremental design and information integration in crop health monitoring. Subsequently, tests were conducted on vineyard regions of interest in Teano, Italy, to assess the efficacy of the framework in evaluating plant status and water stress. Comparing the outcomes of our maps with those of cutting-edge machine learning (ML) semantic segmentation reveals a promising level of accuracy: our approach is consistent with conventional ML methods and achieves an accuracy of over 90% across various seasons of the year.
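To make the rule mechanism concrete, the sketch below implements a toy fuzzy layer in plain Python: triangular memberships over NDVI and vapor pressure deficit, min for rule conjunction, and max for aggregation. The breakpoints and rules are illustrative assumptions, not the paper's calibrated rule base.

```python
# Toy hierarchical fuzzy rule layer over two vegetation parameters.
def tri(x, a, b, c):
    """Triangular membership with peak at b and feet at a and c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def crop_status(ndvi, vpd_kpa):
    ndvi_low  = tri(ndvi, -0.1, 0.2, 0.45)
    ndvi_high = tri(ndvi, 0.4, 0.75, 1.1)
    vpd_ok    = tri(vpd_kpa, 0.2, 0.9, 1.6)
    vpd_high  = tri(vpd_kpa, 1.2, 2.2, 3.5)

    # Rules: healthy = high NDVI AND moderate VPD; stressed = low NDVI
    # OR (high NDVI AND high VPD, i.e. water stress despite a green canopy).
    healthy  = min(ndvi_high, vpd_ok)
    stressed = max(ndvi_low, min(ndvi_high, vpd_high))
    return {'healthy': healthy, 'stressed': stressed}

print(crop_status(ndvi=0.7, vpd_kpa=2.4))   # mostly 'stressed' (water stress)
```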
{"title":"Crop health assessment through hierarchical fuzzy rule-based status maps","authors":"Danilo Cavaliere, Sabrina Senatore, Vincenzo Loia","doi":"10.1007/s10115-024-02180-w","DOIUrl":"https://doi.org/10.1007/s10115-024-02180-w","url":null,"abstract":"<p>Precision agriculture is evolving toward a contemporary approach that involves multiple sensing techniques to monitor and enhance crop quality while minimizing losses and waste of no longer considered inexhaustible resources, such as soil and water supplies. To understand crop status, it is necessary to integrate data from heterogeneous sensors and employ advanced sensing devices that can assess crop and water status. This study presents a smart monitoring approach in agriculture, involving sensors that can be both stationary (such as soil moisture sensors) and mobile (such as sensor-equipped unmanned aerial vehicles). These sensors collect information from visual maps of crop production and water conditions, to comprehensively understand the crop area and spot any potential vegetation problems. A modular fuzzy control scheme has been designed to interpret spectral indices and vegetative parameters and, by applying fuzzy rules, return status maps about vegetation status. The rules are applied incrementally per a hierarchical design to correlate lower-level data (e.g., temperature, vegetation indices) with higher-level data (e.g., vapor pressure deficit) to robustly determine the vegetation status and the main parameters that have led to it. A case study was conducted, involving the collection of satellite images from artichoke crops in Salerno, Italy, to demonstrate the potential of incremental design and information integration in crop health monitoring. Subsequently, tests were conducted on vineyard regions of interest in Teano, Italy, to assess the efficacy of the framework in the assessment of plant status and water stress. Indeed, comparing the outcomes of our maps with those of cutting-edge machine learning (ML) semantic segmentation has indeed revealed a promising level of accuracy. Specifically, classification performance was compared to the output of conventional ML methods, demonstrating that our approach is consistent and achieves an accuracy of over 90% throughout various seasons of the year.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"22 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An efficient approach for incremental erasable utility pattern mining from non-binary data
Pub Date: 2024-08-04 | DOI: 10.1007/s10115-024-02185-5
Yoonji Baek, Hanju Kim, Myungha Cho, Hyeonmo Kim, Chanhee Lee, Taewoong Ryu, Heonho Kim, Bay Vo, Vincent W. Gan, Philippe Fournier-Viger, Jerry Chun-Wei Lin, Witold Pedrycz, Unil Yun
Large amounts of real-life data are generated incrementally around the world, and efficiently processing this continuously accumulating data is a recent issue of interest. Mining and recognizing removable patterns in such data is a challenging task. Erasable pattern mining confronts this challenge by discovering removable patterns with low gain. In various real-world applications, data are stored in non-binary databases, which record item information in quantity form. Since the items in a database can each have different characteristics, such as quantities, considering their relative features makes the mined patterns more meaningful. For these reasons, we propose an erasable utility pattern mining algorithm for incremental non-binary databases. The suggested technique recognizes removable patterns by considering the relative utility of items and the profit of products in an incremental database, and it utilizes a list structure for efficiently extracting erasable utility patterns. Several experiments comparing the suggested algorithm with state-of-the-art techniques on real and synthetic datasets demonstrate the effectiveness of the proposed method.
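For context, the sketch below illustrates the basic erasable-pattern criterion that utility-based variants extend: an itemset is erasable when the total profit of the products containing any of its items stays within a threshold fraction of overall profit. The product table, threshold, and naive enumeration are illustrative assumptions; the paper's incremental list structure is not reproduced here.

```python
# Naive erasable-pattern mining over a toy product-profit database.
from itertools import combinations

# Each product: (set of component items, profit).
products = [({'a', 'b'}, 100), ({'b', 'c'}, 50), ({'c', 'd'}, 30), ({'d'}, 20)]
total_profit = sum(p for _, p in products)

def gain(itemset):
    """Profit lost if all items in `itemset` stop being stocked."""
    return sum(p for items, p in products if items & itemset)

def erasable_patterns(all_items, threshold=0.3, max_size=2):
    out = []
    for k in range(1, max_size + 1):
        for combo in combinations(sorted(all_items), k):
            if gain(set(combo)) <= threshold * total_profit:
                out.append(combo)
    return out

items = set().union(*(it for it, _ in products))
print(erasable_patterns(items))   # -> [('d',)] with these toy numbers
```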
{"title":"An efficient approach for incremental erasable utility pattern mining from non-binary data","authors":"Yoonji Baek, Hanju Kim, Myungha Cho, Hyeonmo Kim, Chanhee Lee, Taewoong Ryu, Heonho Kim, Bay Vo, Vincent W. Gan, Philippe Fournier-Viger, Jerry Chun-Wei Lin, Witold Pedrycz, Unil Yun","doi":"10.1007/s10115-024-02185-5","DOIUrl":"https://doi.org/10.1007/s10115-024-02185-5","url":null,"abstract":"<p>There are many real-life data incrementally generated around the world. One of the recent interesting issues is the efficient processing real-world data that is continuously accumulated. Mining and recognizing removable patterns in such data is a challenging task. Erasable pattern mining confronts this challenge by discovering removable patterns with low gain. In various real-world applications, data are stored in the form of non-binary databases. These databases store item information in a quantity form. Since items in the database can each have different characteristics, such as quantities, considering their relative features makes the mined patterns more meaningful. For these reasons, we propose an erasable utility pattern mining algorithm for incremental non-binary databases. The suggested technique can recognize removable patterns by considering the relative utility of items and the profit of products in an incremental database. The proposed algorithm utilizes a list structure for efficiently extracting erasable utility patterns. Several experiments have been conducted to compare the performance between the suggested algorithm and state-of-the-art techniques using real and synthetic datasets, and the results demonstrate the effectiveness of the proposed method.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"1 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An empirical study of a novel multimodal dataset for low-resource machine translation
Pub Date: 2024-07-29 | DOI: 10.1007/s10115-024-02087-6
Loitongbam Sanayai Meetei, Thoudam Doren Singh, Sivaji Bandyopadhyay
Cues from multiple modalities have been successfully applied in several fields of natural language processing, including machine translation (MT). However, the application of multimodal cues in low-resource MT (LRMT) is still an open research problem. The main challenge of LRMT is the lack of abundant parallel data, which makes it difficult to build MT systems that produce reasonable output. Multimodal cues can provide additional context and information that help to mitigate this challenge. To address it, we present a multimodal machine translation (MMT) dataset for a low-resource language pair, Manipuri–English, consisting of images, audio, and the corresponding parallel text. The text was collected from news articles in local daily newspapers and subsequently translated into the target language by native-speaker translators. An audio version of the Manipuri text, recorded by native speakers, is used in the experiments. The study also investigates whether correlated audio-visual cues enhance the performance of the machine translation system. Several experiments are conducted for a systematic evaluation of the effectiveness of utilizing multiple modalities. With the help of automatic metrics and human evaluation, a detailed analysis of MT systems trained with text-only and multimodal inputs is carried out. Experimental results attest that MT systems in low-resource settings can be significantly improved, by up to +2.7 BLEU, by incorporating correlated modalities. The human evaluation reveals that the type of correlated auxiliary modality affects adequacy and fluency in MMT systems. Our results emphasize the potential of using cues from auxiliary modalities to enhance machine translation systems, particularly in situations with limited resources.
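As a pointer for reproducing the automatic evaluation, a corpus-level BLEU score can be computed with the sacrebleu package as sketched below; the sentences are toy placeholders, not items from the Manipuri–English dataset.

```python
# Corpus-level BLEU with sacrebleu on toy hypothesis/reference pairs.
import sacrebleu

hypotheses = ["the market opens early in the morning"]
references = [["the market opens early every morning"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")   # a +2.7 gain means this score rises by 2.7
```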
{"title":"An empirical study of a novel multimodal dataset for low-resource machine translation","authors":"Loitongbam Sanayai Meetei, Thoudam Doren Singh, Sivaji Bandyopadhyay","doi":"10.1007/s10115-024-02087-6","DOIUrl":"https://doi.org/10.1007/s10115-024-02087-6","url":null,"abstract":"<p>Cues from multiple modalities have been successfully applied in several fields of natural language processing including machine translation (MT). However, the application of multimodal cues in low-resource MT (LRMT) is still an open research problem. The main challenge of LRMT is the lack of abundant parallel data which makes it difficult to build MT systems for a reasonable output. Using multimodal cues can provide additional context and information that can help to mitigate this challenge. To address this challenge, we present a multimodal machine translation (MMT) dataset of low-resource languages. The dataset consists of images, audio and corresponding parallel text for a low-resource language pair that is Manipuri–English. The text dataset is collected from the news articles of local daily newspapers and subsequently translated into the target language by translators of the native speakers. The audio version by native speakers for the Manipuri text is recorded for the experiments. The study also investigates whether the correlated audio-visual cues enhance the performance of the machine translation system. Several experiments are conducted for a systematic evaluation of the effectiveness utilizing multiple modalities. With the help of automatic metrics and human evaluation, a detailed analysis of the MT systems trained with text-only and multimodal inputs is carried out. Experimental results attest that the MT systems in low-resource settings could be significantly improved up to +2.7 BLEU score by incorporating correlated modalities. The human evaluation reveals that the type of correlated auxiliary modality affects the adequacy and fluency performance in the MMT systems. Our results emphasize the potential of using cues from auxiliary modalities to enhance machine translation systems, particularly in situations with limited resources.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"3 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141867047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wave Hedges distance-based feature fusion and hybrid optimization-enabled deep learning for cyber credit card fraud detection
Pub Date: 2024-07-24 | DOI: 10.1007/s10115-024-02177-5
Venkata Ratnam Ganji, Aparna Chaparala
With the emerging trend in e-commerce, an increasing number of people have adopted cashless payment methods, especially credit cards, for buying products online. However, this ever-rising usage of credit cards has also led to an increase in malicious users attempting to gain financial profit through fraudulent activities, resulting in huge losses to the card issuer as well as the customer. Credit Card Frauds (CCFs) are pervasive worldwide, so efficient methods are required to detect CCFs and minimize financial losses. This research presents an efficient CCF Detection (CCFD) approach based on deep learning. In this work, CCFD is performed on credit card features fused using the Wave Hedges distance, where the Wave Hedges coefficient utilized for fusion is estimated by a Deep Neuro-Fuzzy Network. Detection is then performed using the Zeiler and Fergus Network (ZFNet), whose trainable factors are adjusted with the Dwarf Mongoose–Shuffled Shepherd Political Optimization (DMSSPO) algorithm. The DMSSPO_ZFNet is analyzed in terms of accuracy, sensitivity, and specificity, and the experimental outcomes reveal values of 0.961, 0.961, and 0.951, respectively.
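The Wave Hedges distance mentioned above has a simple closed form, d(x, y) = sum over i of |x_i - y_i| / max(x_i, y_i), scoring the dissimilarity of two non-negative feature vectors. A minimal implementation follows, with toy vectors standing in for the fused credit card features.

```python
# Wave Hedges distance between two non-negative feature vectors.
def wave_hedges(x, y, eps=1e-12):
    # d(x, y) = sum_i |x_i - y_i| / max(x_i, y_i); eps guards max(0, 0).
    return sum(abs(a - b) / (max(a, b) + eps) for a, b in zip(x, y))

u = [0.2, 0.9, 0.4]
v = [0.1, 0.8, 0.4]
print(wave_hedges(u, v))   # ~0.611; equals 0.0 only for identical vectors
```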
{"title":"Wave Hedges distance-based feature fusion and hybrid optimization-enabled deep learning for cyber credit card fraud detection","authors":"Venkata Ratnam Ganji, Aparna Chaparala","doi":"10.1007/s10115-024-02177-5","DOIUrl":"https://doi.org/10.1007/s10115-024-02177-5","url":null,"abstract":"<p>With the emerging trend in e-commerce, an increasing number of people have adopted cashless payment methods, especially credit cards for buying products online. However, this ever-rising usage of credit cards has also led to an increase in the malicious users attempting to gain financial profits by committing fraudulent activities resulting in huge losses to the card issuer as well as the customer. Credit Card Frauds (CCFs) are pervasive worldwide, and so efficient methods are required to detect CCFs to minimize financial losses. This research presents an efficient CCF Detection (CCFD) approach based on Deep Learning. In this work, CCFD is performed based on the features obtained from the credit card fused based on Wave Hedge distance, and the Wave Hedge coefficient utilized for fusion is estimated using the Deep Neuro-Fuzzy Network. Further, detection is performed using the Zeiler and Fergus Network (ZFNet), whose trainable factors are adjusted using the Dwarf Mongoose–Shuffled Shepherd Political Optimization (DMSSPO) algorithm. Moreover, the DMSSPO_ZFNet is analyzed based on accuracy, sensitivity, and specificity, and the experimental outcomes reveal that the values attained are 0.961, 0.961, and 0.951.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"42 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141782922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Caption matters: a new perspective for knowledge-based visual question answering
Pub Date: 2024-07-22 | DOI: 10.1007/s10115-024-02166-8
Bin Feng, Shulan Ruan, Likang Wu, Huijie Liu, Kai Zhang, Kun Zhang, Qi Liu, Enhong Chen
Knowledge-based visual question answering (KB-VQA) requires answering questions about a given image with the assistance of external knowledge. Recently, researchers have generally tended to design different multimodal networks to extract visual and textual semantic features for KB-VQA. Despite significant progress, 'caption' information, a textual form of image semantics that can provide visually non-obvious cues for the reasoning process, is often ignored. In this paper, we introduce a novel framework, the Knowledge Based Caption Enhanced Net (KBCEN), designed to integrate caption information into the KB-VQA process. Specifically, for better knowledge reasoning, we utilize caption information comprehensively from both explicit and implicit perspectives. For the former, we explicitly link caption entities to a knowledge graph together with object tags and question entities. For the latter, a pre-trained multimodal BERT with natural implicit knowledge is leveraged to co-represent caption tokens, object regions, and question tokens. Moreover, we develop a mutual correlation module to discern intricate correlations between explicit and implicit representations, thereby facilitating knowledge integration and final prediction. We conduct extensive experiments on three publicly available datasets (OK-VQA v1.0, OK-VQA v1.1, and A-OKVQA). Both quantitative and qualitative results demonstrate the superiority and rationality of our proposed KBCEN.
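To illustrate the implicit co-representation step on its textual side, the sketch below encodes a question-caption pair with a pre-trained BERT via Hugging Face transformers. The object-region features that KBCEN's multimodal BERT also ingests are omitted, and the sample texts are invented; this is a sketch of the general co-encoding pattern, not the paper's model.

```python
# Co-encode a question and a caption as a BERT sentence pair.
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

question = "What sport can be played with this equipment?"
caption = "A man holding a tennis racket on a court."

# Encoded as: [CLS] question [SEP] caption [SEP]
inputs = tokenizer(question, caption, return_tensors="pt")
outputs = model(**inputs)
token_states = outputs.last_hidden_state   # one contextual vector per token
print(token_states.shape)                  # (1, seq_len, 768)
```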
{"title":"Caption matters: a new perspective for knowledge-based visual question answering","authors":"Bin Feng, Shulan Ruan, Likang Wu, Huijie Liu, Kai Zhang, Kun Zhang, Qi Liu, Enhong Chen","doi":"10.1007/s10115-024-02166-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02166-8","url":null,"abstract":"<p>Knowledge-based visual question answering (KB-VQA) requires to answer questions according to the given image with the assistance of external knowledge. Recently, researchers generally tend to design different multimodal networks to extract visual and text semantic features for KB-VQA. Despite the significant progress, ‘caption’ information, a textual form of image semantics, which can also provide visually non-obvious cues for the reasoning process, is often ignored. In this paper, we introduce a novel framework, the Knowledge Based Caption Enhanced Net (KBCEN), designed to integrate caption information into the KB-VQA process. Specifically, for better knowledge reasoning, we make utilization of caption information comprehensively from both explicit and implicit perspectives. For the former, we explicitly link caption entities to knowledge graph together with object tags and question entities. While for the latter, a pre-trained multimodal BERT with natural implicit knowledge is leveraged to co-represent caption tokens, object regions as well as question tokens. Moreover, we develop a mutual correlation module to discern intricate correlations between explicit and implicit representations, thereby facilitating knowledge integration and final prediction. We conduct extensive experiments on three publicly available datasets (i.e., OK-VQA v1.0, OK-VQA v1.1 and A-OKVQA). Both quantitative and qualitative results demonstrate the superiority and rationality of our proposed KBCEN.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"11 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141743155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An adaptive and late multifusion framework in contextual representation based on evidential deep learning and Dempster–Shafer theory
Pub Date: 2024-07-22 | DOI: 10.1007/s10115-024-02150-2
Doaa Mohey El-Din, Aboul Ella Hassanein, Ehab E. Hassanien
There is growing interest in multidisciplinary research on multimodal synthesis technology to stimulate diversity of modal interpretation in different application contexts. The real requirement for modality diversity across multiple contextual representation fields stems from the conflicting nature of data in multitarget sensors, which introduces obstacles including ambiguity, uncertainty, imbalance, and redundancy in multiobject classification. This paper proposes a new adaptive, late multimodal fusion framework using evidence-enhanced deep learning guided by Dempster–Shafer theory and a concatenation strategy to interpret multiple modalities and contextual representations, yielding a larger number of features for interpreting unstructured multimodality types based on late fusion. Furthermore, it is designed as a multifusion learning solution for modality- and context-based fusion, leading to improved decisions. It creates a fully automated selective deep neural network and constructs an adaptive fusion model for all modalities based on the input type. The proposed framework is implemented in five layers: a software-defined fusion layer, a preprocessing layer, a dynamic classification layer, an adaptive fusion layer, and an evaluation layer. The framework formalizes the modality/context-based problem into an adaptive multifusion framework at the late fusion level. Particle swarm optimization is used in multiple smart context systems to improve the final classification layer with optimal parameters, tracing 30 changes in the hyperparameters of the deep learning training models. Multiple experiments with multimodal inputs in multiple contexts show the behavior of the proposed multifusion framework. Experimental results on four challenging datasets, covering military, agricultural, COVID-19, and food health data, are impressive compared to other state-of-the-art multiple-fusion models. The main strengths of the proposed adaptive fusion framework are that it automatically classifies multiple objects with reduced features and resolves ambiguity and inconsistency in the fused data. In addition, it increases certainty and reduces redundancy while mitigating data imbalance. The multimodal, multicontext experiments using the proposed fusion framework achieve an accuracy of 98.45%.
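Since the framework leans on Dempster–Shafer theory, the sketch below implements the standard Dempster rule of combination for two mass functions over a small frame of discernment. The toy masses stand in for per-modality evidence and are not outputs of the paper's networks.

```python
# Dempster's rule of combination for two mass functions over a frame {A, B}.
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions keyed by frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc                 # mass assigned to conflict
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {a: m / (1.0 - conflict) for a, m in combined.items()}

A, B = frozenset({'A'}), frozenset({'B'})
theta = A | B                                   # frame of discernment
m_visual = {A: 0.6, B: 0.1, theta: 0.3}         # evidence from one modality
m_text   = {A: 0.5, B: 0.2, theta: 0.3}         # evidence from another
print(dempster_combine(m_visual, m_text))      # mass on A rises to ~0.76
```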
{"title":"An adaptive and late multifusion framework in contextual representation based on evidential deep learning and Dempster–Shafer theory","authors":"Doaa Mohey El-Din, Aboul Ella Hassanein, Ehab E. Hassanien","doi":"10.1007/s10115-024-02150-2","DOIUrl":"https://doi.org/10.1007/s10115-024-02150-2","url":null,"abstract":"<p>There is a growing interest in multidisciplinary research in multimodal synthesis technology to stimulate diversity of modal interpretation in different application contexts. The real requirement for modality diversity across multiple contextual representation fields is due to the conflicting nature of data in multitarget sensors, which introduces other obstacles including ambiguity, uncertainty, imbalance, and redundancy in multiobject classification. This paper proposes a new adaptive and late multimodal fusion framework using evidence-enhanced deep learning guided by Dempster–Shafer theory and concatenation strategy to interpret multiple modalities and contextual representations that achieves a bigger number of features for interpreting unstructured multimodality types based on late fusion. Furthermore, it is designed based on a multifusion learning solution to solve the modality and context-based fusion that leads to improving decisions. It creates a fully automated selective deep neural network and constructs an adaptive fusion model for all modalities based on the input type. The proposed framework is implemented based on five layers which are a software-defined fusion layer, a preprocessing layer, a dynamic classification layer, an adaptive fusion layer, and an evaluation layer. The framework is formalizing the modality/context-based problem into an adaptive multifusion framework based on a late fusion level. The particle swarm optimization was used in multiple smart context systems to improve the final classification layer with the best optimal parameters that tracing 30 changes in hyperparameters of deep learning training models. This paper applies multiple experimental with multimodalities inputs in multicontext to show the behaviors the proposed multifusion framework. Experimental results on four challenging datasets including military, agricultural, COIVD-19, and food health data provide impressive results compared to other state-of-the-art multiple fusion models. The main strengths of proposed adaptive fusion framework can classify multiobjects with reduced features automatically and solves the fused data ambiguity and inconsistent data. In addition, it can increase the certainty and reduce the redundancy data with improving the unbalancing data. The experimental results of multimodalities experiment in multicontext using the proposed multimodal fusion framework achieve 98.45% of accuracy.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"13 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141743298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal intelligent information retrieval and reliable storage scheme for cloud environment and E-learning big data analytics
Pub Date: 2024-07-22 | DOI: 10.1007/s10115-024-02152-0
Chandrasekar Venkatachalam, Shanmugavalli Venkatachalam
Currently, online learning systems in the education sector are widely used and have become a new trend, generating large amounts of educational data based on students' activities. Sophisticated data analysis techniques are required to improve online learning experiences, and big data technology makes it possible to add value to E-learning platforms through the efficient processing of large volumes of learning data. Over time, the E-learning management system's repository expands and becomes a rich source of learning materials. Subject matter experts may benefit from reusing previously created E-learning resources when creating online content, and students benefit from access to the documents pertinent to achieving their learning objectives effectively. An optimal intelligent information retrieval and reliable storage (OIIRS) scheme is proposed for E-learning using hybrid deep learning techniques. We assume that relevant E-learning documents are stored in the cloud and dynamically updated according to users' status. First, we present a highly robust and lightweight cipher, i.e., optimized CLEFIA, for securely storing data in local repositories, which improves the reliability of data loading. We develop an improved butterfly optimization algorithm that selects private keys to provide an optimal solution for CLEFIA. In addition, a hybrid deep learning method, a backward diagonal search-based deep recurrent neural network (BD-DRNN), is introduced for optimal intelligent information retrieval based on keywords rather than semantics. Here, feature extraction and key feature matching are performed by a modified Hungarian optimization (MHO) algorithm that improves searching accuracy. Finally, we test the proposed OIIRS scheme with different benchmark datasets and use simulation results to evaluate its performance.
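For orientation, the sketch below implements the standard butterfly optimization algorithm, which the paper's improved variant builds on, minimizing a toy objective. The coefficients are common textbook defaults, and the sphere objective is a placeholder for the paper's key-selection criterion, so this is a sketch of the base metaheuristic only.

```python
# Standard butterfly optimization algorithm (BOA) on a toy sphere objective.
import random

def boa(fitness, dim=8, n=20, iters=200, c=0.01, a=0.1, p=0.8):
    pop = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    best = min(pop, key=fitness)
    for _ in range(iters):
        for i, x in enumerate(pop):
            f = c * (fitness(x) + 1e-12) ** a          # fragrance of butterfly i
            r = random.random()
            if random.random() < p:                    # global phase: move toward best
                step = [(r * r * gb - xi) * f for gb, xi in zip(best, x)]
            else:                                      # local phase: two random peers
                xj, xk = random.sample(pop, 2)
                step = [(r * r * j - k) * f for j, k in zip(xj, xk)]
            pop[i] = [xi + s for xi, s in zip(x, step)]
        best = min(pop + [best], key=fitness)          # keep best-so-far
    return best

sphere = lambda x: sum(v * v for v in x)
print(sphere(boa(sphere)))    # objective value shrinks toward 0 over iterations
```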
{"title":"Optimal intelligent information retrieval and reliable storage scheme for cloud environment and E-learning big data analytics","authors":"Chandrasekar Venkatachalam, Shanmugavalli Venkatachalam","doi":"10.1007/s10115-024-02152-0","DOIUrl":"https://doi.org/10.1007/s10115-024-02152-0","url":null,"abstract":"<p>Currently, online learning systems in the education sector are widely used and have become a new trend, generating large amounts of educational data based on students’ activities. In order to improve online learning experiences, sophisticated data analysis techniques are required. Adding value to E-learning platforms through the efficient processing of big learning data is possible with Big Data. With time, the E-learning management system’s repository expands and becomes a rich source of learning materials. Subject matter experts may benefit from using E-learning resources to reuse previously created content when creating online content. In addition, it might be beneficial to the students by giving them access to the pertinent documents for achieving their learning objectives effectively. An improved intelligent information retrieval and reliable storage (OIIRS) scheme is proposed for E-learning using hybrid deep learning techniques. Assume that relevant E-learning documents are stored in cloud and dynamically updated according to users’ status. First, we present a highly robust and lightweight crypto, i.e., optimized CLEFIA, for securely storing data in local repositories that improve the reliability of data loading. We develop an improved butterfly optimization algorithm to provide an optimal solution for CLEFIA that selects private keys. In addition, a hybrid deep learning method, i.e., backward diagonal search-based deep recurrent neural network (BD-DRNN) is introduced for optimal intelligent information retrieval based on keywords rather than semantics. Here, feature extraction and key feature matching are performed by the modified Hungarian optimization (MHO) algorithm that improves searching accuracy. Finally, we test our proposed OIIRS scheme with different benchmark datasets and use simulation results to test the performance.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"25 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141783057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}