Pub Date: 2024-07-06 | DOI: 10.1007/s11280-024-01279-y
Hyper-relational knowledge graph neural network for next POI recommendation
Jixiao Zhang, Yongkang Li, Ruotong Zou, Jingyuan Zhang, Renhe Jiang, Zipei Fan, Xuan Song
With the advancement of mobile technology, Point of Interest (POI) recommendation systems in Location-Based Social Networks (LBSN) have brought numerous benefits to both users and companies. Many existing works employ Knowledge Graphs (KGs) to alleviate the data sparsity issue in LBSN. These approaches primarily focus on modeling the pair-wise relations in LBSN to enrich the semantics and thereby relieve the data sparsity issue. However, existing approaches seldom consider the hyper-relations in LBSN, such as the mobility relation (a 3-ary relation: user-POI-time), which makes it hard for models to exploit the semantics accurately. In addition, prior works overlook the rich structural information inherent in a KG, which consists of higher-order relations and can further alleviate the impact of data sparsity. To this end, we propose a Hyper-Relational Knowledge Graph Neural Network (HKGNN) model. In HKGNN, a Hyper-Relational Knowledge Graph (HKG) that models the LBSN data is constructed to maintain and exploit the rich semantics of hyper-relations. We then propose a Hypergraph Neural Network to utilize the structural information of the HKG in a cohesive way. In addition, a self-attention network is used to leverage sequential information and make personalized recommendations. Furthermore, side information, essential in reducing data sparsity by providing background knowledge of POIs, is not fully utilized in current methods. In light of this, we extend the current datasets with available side information to further lessen the impact of data sparsity. Results of experiments on four real-world LBSN datasets demonstrate the effectiveness of our approach compared to existing state-of-the-art methods. Our implementation is available at https://github.com/aeroplanepaper/HKG.
Pub Date: 2024-07-02 | DOI: 10.1007/s11280-024-01283-2
Bridging distribution gaps: invariant pattern discovery for dynamic graph learning
Yucheng Jin, Maoyi Wang, Yun Xiong, Zhizhou Ren, Cuiying Huo, Feng Zhu, Jiawei Zhang, Guangzhong Wang, Haoran Chen
Temporal graph networks (TGNs) have been proposed to facilitate learning on dynamic graphs, which are composed of interaction events among nodes. However, existing TGNs suffer from poor generalization under distribution shifts that occur over time. To improve generalization, it is vital to discover invariant patterns with stable predictive power across distributions. Invariant pattern discovery on dynamic graphs is non-trivial: TGNs compress the long-term history of interaction events into memory in an entangled way, which makes invariant patterns hard to isolate. Furthermore, TGNs process interaction events chronologically in batches to obtain up-to-date representations, and each batch, consisting of chronologically close events, lacks the diversity needed to identify invariance under distribution shifts. To tackle these challenges, we propose a novel method called Smile, which stands for Structural teMporal Invariant LEarning. Specifically, we first propose the disentangled graph memory network, which selectively extracts pattern information from long-term history through the disentangled memory gating and attention network. The interaction history approximator is further introduced to provide diverse interaction distributions efficiently. Smile guarantees prediction stability under diverse temporal-dynamic distributions by regularizing invariance under cross-time distribution interventions. Experimental results on real-world datasets demonstrate that Smile outperforms baselines, yielding substantial performance improvements.
Pub Date: 2024-07-02 | DOI: 10.1007/s11280-024-01282-3
Efficient base station deployment in specialized regions with splitting particle swarm optimization algorithm
Jiaying Shen, Donglin Zhu, Rui Li, Xingyun Zhu, Yuemai Zhang, Weijie Li, Changjun Zhou, Jun Zhang, Shi Cheng
Signal coverage quality and intensity distribution in complex environments pose a critical challenge, particularly in high-density personnel areas and specialized regions with intricate geographic features. Under the strain of communication congestion, the traditional two-dimensional base station model proves inadequate for this challenge. Addressing the intricacies of the scenario, this paper focuses on the conditionally constrained deployment of base stations in special areas. It introduces a Splitting Particle Swarm Optimization (SPSO) algorithm, which enhances the global optimization capability of the base algorithm by incorporating splitting and parameter adjustments. This refinement aims to meet the communication requirements of customers in complex scenarios. To better align with the real-world communication needs of base stations, simulation experiments are conducted in which the special region is either assigned fixed coordinates or generated at a random position. In these experiments, SPSO achieves maximum coverage rates of 99.24% and 99.00% with fewer target points, and 93.56% and 96.16% with more target points. These results validate the optimization capability of the SPSO algorithm, demonstrating its feasibility and effectiveness. Ablation experiments and comparisons with other algorithms further illustrate the advantages of SPSO.
Pub Date: 2024-07-01 | DOI: 10.1007/s11280-024-01284-1
Editorial on the Special Issue of the World Wide Web journal with selected papers from the 22nd International Conference on Web Information Systems Engineering (WISE)
Richard Chbeir, Helen Huang, Yannis Manolopoulos, Fabrizio Silvestri
Pub Date: 2024-06-28 | DOI: 10.1007/s11280-024-01276-1
When large language models meet personalization: perspectives of challenges and opportunities
Jin Chen, Zheng Liu, Xu Huang, Chenwang Wu, Qi Liu, Gangwei Jiang, Yuanhao Pu, Yuxuan Lei, Xiaolong Chen, Xingmei Wang, Kai Zheng, Defu Lian, Enhong Chen
The advent of large language models marks a revolutionary breakthrough in artificial intelligence. With the unprecedented scale of training data and model parameters, the capability of large language models has improved dramatically, leading to human-like performance in understanding, language synthesis, common-sense reasoning, and more. Such a major leap in general AI capacity will fundamentally change how personalization is conducted. For one thing, it will reform the way humans interact with personalization systems. Instead of being a passive medium of information filtering, like conventional recommender systems and search engines, large language models provide the foundation for active user engagement: users' requests can be proactively explored, and the information they require can be delivered in a natural, interactive, and explainable way. For another, it will considerably expand the scope of personalization, growing it from the sole function of collecting personalized information to the compound function of providing personalized services. By leveraging large language models as a general-purpose interface, personalization systems may compile users' requests into plans, call the functions of external tools (e.g., search engines, calculators, service APIs, etc.) to execute the plans, and integrate the tools' outputs to complete end-to-end personalization tasks. Today, large language models are still developing rapidly, whereas their application in personalization remains largely unexplored. We therefore consider this the right time to review the challenges in personalization and the opportunities to address them with large language models. In particular, we dedicate this perspective paper to discussing the following aspects: the development of and challenges for existing personalization systems, the newly emerged capabilities of large language models, and the potential ways of using large language models for personalization.
Pub Date: 2024-06-26 | DOI: 10.1007/s11280-024-01249-4
Federated learning for supervised cross-modal retrieval
Ang Li, Yawen Li, Yingxia Shao
In the last decade, the explosive surge in multi-modal data has propelled cross-modal retrieval to the forefront of information retrieval research. Exceptional cross-modal retrieval algorithms are crucial for meeting user requirements effectively and for supporting downstream tasks such as cross-modal recommendation and multi-modal content generation. Previous methods for cross-modal retrieval typically search for a single common subspace, neglecting the possibility of multiple common subspaces that may mutually reinforce each other in reality, which results in poor retrieval performance. To address this issue, we propose a Federated Supervised Cross-Modal Retrieval approach (FedSCMR), which leverages competition to learn the optimal common subspace and adaptively aggregates the common subspaces of multiple clients for dynamic global aggregation. To reduce the differences between modalities, FedSCMR minimizes the semantic discrimination and consistency in the common subspace, in addition to modeling semantic discrimination in the label space. Additionally, it minimizes modal discrimination and semantic invariance across common subspaces to strengthen cross-subspace constraints and promote learning of the optimal common subspace. In the aggregation stage for federated learning, we design an adaptive model aggregation scheme that dynamically and collaboratively evaluates each client's contribution based on data volume, data category, model loss, and mean average precision, to adaptively aggregate multi-party common subspaces. Experimental results on two publicly available datasets demonstrate that FedSCMR surpasses state-of-the-art cross-modal retrieval methods.
Pub Date: 2024-06-22 | DOI: 10.1007/s11280-024-01281-4
DPHM-Net: de-redundant multi-period hybrid modeling network for long-term series forecasting
Chengdong Zheng, Yuliang Shi, Wu Lee, Lin Cheng, Xinjun Wang, Zhongmin Yan, Fanyu Kong
Deep learning models have been widely applied to long-term forecasting and have achieved significant success; incorporating inductive biases such as periodicity to model multi-granularity representations of time series is a commonly employed design in forecasting methods. However, existing methods still face information redundancy when extracting the inductive bias and when learning multi-granularity features. Redundant information can impede the model's acquisition of a comprehensive temporal representation, thereby adversely impacting its predictive performance. To address these issues, we propose a De-redundant multi-Period Hybrid Modeling Network (DPHM-Net) that effectively eliminates redundant information from both the series inductive bias extraction mechanism and the multi-granularity series features in time series representation learning. In DPHM-Net, we propose an efficient time series representation learning process based on a period inductive bias and introduce the concept of de-redundancy among multiple time series into the representation learning process for a single time series. Additionally, we design a specialized gated unit to dynamically balance the elimination weights between series features and redundant semantic information. Extensive experiments on real-world datasets demonstrate the advanced performance and high efficiency of our method in long-term forecasting tasks against previous state-of-the-art methods.
Pub Date: 2024-06-20 | DOI: 10.1007/s11280-024-01262-7
Exploring highly concise and accurate text matching model with tiny weights
Yangchun Li, Danfeng Yan, Wei Jiang, Yuanqiang Cai, Zhihong Tian
In this paper, we propose a simple and general lightweight approach named AL-RE2 for text matching models, and conduct experiments on three well-studied benchmark datasets across the tasks of natural language inference and paraphrase identification. First, we explore the feasibility of dimensional compression of word embedding vectors using principal component analysis, and analyze how the information retained at different dimensions affects model accuracy. Balancing compression efficiency against information loss, we choose 128 dimensions to represent each word, which brings the model down to 1.6M parameters. Finally, we analyze in detail the feasibility of applying depthwise separable convolution instead of standard convolution to text matching. The experimental results show that our model's inference speed is at least 1.5 times faster and its parameter count 42.76% smaller compared to similarly performing models, while its accuracy on the SciTail dataset is state-of-the-art among all lightweight models.
Pub Date: 2024-06-11 | DOI: 10.1007/s11280-024-01277-0
Using knowledge graphs for audio retrieval: a case study on copyright infringement detection
Marco Montanaro, Antonio Maria Rinaldi, Cristiano Russo, Cristian Tommasino
Identifying cases of intellectual property violation in multimedia files poses significant challenges for the Internet infrastructure, especially when dealing with extensive document collections. Techniques used to tackle such issues typically fall into one of two groups: proactive and reactive approaches. This article introduces an approach combining both proactive and reactive solutions to remove illegal uploads from a platform without blocking legal uploads or modified versions of audio tracks, such as parodies, remixes, or other types of edits. To achieve this, we have developed a rule-based focused crawler specifically designed to detect copyright infringement in audio files, coupled with a visualization environment that maps the retrieved data onto a knowledge graph representing information extracted from the audio files. Our system automatically scans multimedia files uploaded to a public collection when a user submits a search query, performing an audio information retrieval task only on files deemed legal. We present experimental results from user queries on a large music collection: a subset of 25,000 songs and audio snippets obtained from the Free Music Archive library. Each returned audio track carries a Similarity Score, a metric we use to determine the quality of the adversarial searches executed by the system. We then discuss the effectiveness and efficiency of different settings of our proposed system.