Question recommendation is a task that sequentially recommends questions for students to enhance their learning efficiency. That is, given the learning history and learning target of a student, a question recommender is supposed to select the question that will bring the most improvement for the student. Previous methods typically model question recommendation as a sequential decision-making problem, estimating the student's learning state from the learning history, and feeding the learning state together with the learning target to a neural network that selects the recommended question from a question set. However, previous methods face two challenges: (1) the learning history is unavailable in the cold-start scenario, which leads the recommender to generate inappropriate recommendations; (2) the question set is very large, which makes it difficult for the recommender to select the best question precisely. To address these challenges, we propose hierarchical large language model for question recommendation (HierLLM), an LLM-based hierarchical structure. The LLM-based structure enables HierLLM to tackle the cold-start issue with the strong reasoning abilities of LLMs. The hierarchical structure takes advantage of the fact that the number of concepts is significantly smaller than the number of questions, narrowing the range of selectable questions by first identifying the relevant concept for the question to recommend, and then selecting the recommended question based on that concept. This hierarchical structure reduces the difficulty of the recommendation. To investigate the performance of HierLLM, we conduct extensive experiments, and the results demonstrate its outstanding performance.
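The hierarchical narrowing described above can be sketched in a few lines. The scoring functions and toy data below are hypothetical stand-ins for the paper's LLM-based scorers; only the two-stage structure (pick a concept first, then a question within that concept) reflects the abstract:

```python
# Minimal sketch of hierarchical narrowing: choose a concept first (small
# search space), then choose a question within that concept only. The
# scores are invented stand-ins for LLM-based scoring.
def recommend(questions_by_concept, score_concept, score_question):
    # Stage 1: identify the most relevant concept.
    concept = max(questions_by_concept, key=score_concept)
    # Stage 2: select the best question under that concept.
    return max(questions_by_concept[concept], key=score_question)

questions_by_concept = {
    "fractions": ["q1", "q2"],
    "algebra": ["q3", "q4", "q5"],
}
concept_scores = {"fractions": 0.2, "algebra": 0.9}
question_scores = {"q1": 0.1, "q2": 0.3, "q3": 0.5, "q4": 0.8, "q5": 0.4}
print(recommend(questions_by_concept, concept_scores.get, question_scores.get))  # q4
```

The second stage only ever scores the questions under the chosen concept, which is what reduces the effective size of the candidate set.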
"HierLLM: Hierarchical Large Language Model for Question Recommendation" by Yuxuan Liu, Haipeng Liu, Ting Long. arXiv - CS - Information Retrieval, 2024-09-10. https://doi.org/arxiv-2409.06177
A good medical ontology is expected to cover its domain completely and correctly. On the other hand, large ontologies are hard to build, hard to understand, and hard to maintain. Thus, adding new concepts (often multi-word concepts) to an existing ontology must be done judiciously. Only "good" concepts should be added; however, it is difficult to define what makes a concept good. In this research, we propose a metric to measure the goodness of a concept. We identified factors that appear to influence goodness judgments of medical experts and combined them into a single metric. These factors include concept name length (in words), concept occurrence frequency in the medical literature, and syntactic categories of component words. As an added factor, we used the simplicity of a term after mapping it into a specific foreign language. We performed Bayesian optimization of factor weights to achieve maximum agreement between the metric and three medical experts. The results showed that our metric had a 50.67% overall agreement with the experts, as measured by Krippendorff's alpha.
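The metric described above is a weighted combination of goodness factors. The factor values and weights below are invented purely for illustration; in the paper, the weights are tuned by Bayesian optimization against the judgments of three medical experts:

```python
# Hedged sketch: combine per-concept goodness factors into a single score
# via a weighted sum. All numbers here are illustrative, not the paper's.
def goodness(factors, weights):
    return sum(weights[name] * value for name, value in factors.items())

factors = {
    "name_length": 1 / 3,          # e.g. inverse of word count for a 3-word concept
    "literature_freq": 0.7,        # normalized occurrence frequency in the literature
    "syntactic_score": 0.9,        # score for component-word syntactic categories
    "translation_simplicity": 0.5, # simplicity after mapping into a foreign language
}
weights = {"name_length": 0.2, "literature_freq": 0.4,
           "syntactic_score": 0.3, "translation_simplicity": 0.1}
print(round(goodness(factors, weights), 3))  # 0.667
```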
"What makes a good concept anyway?" by Naren Khatwani, James Geller. arXiv - CS - Information Retrieval, 2024-09-10. https://doi.org/arxiv-2409.06150
Practitioners working on dense retrieval today face a bewildering number of choices. Beyond selecting the embedding model, another consequential choice is the actual implementation of nearest-neighbor vector search. While best practices recommend HNSW indexes, flat vector indexes with brute-force search represent another viable option, particularly for smaller corpora and for rapid prototyping. In this paper, we provide experimental results on the BEIR dataset using the open-source Lucene search library that explicate the tradeoffs between HNSW and flat indexes (including quantized variants) from the perspectives of indexing time, query evaluation performance, and retrieval quality. With additional comparisons between dense and sparse retrievers, our results provide guidance for today's search practitioner in understanding the design space of dense and sparse retrievers. To our knowledge, we are the first to provide operational advice supported by empirical experiments in this regard.
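The flat-index option discussed above is simply exact brute-force nearest-neighbor search. As a rough illustration of what that entails (toy vectors and document IDs, not Lucene's actual API), a minimal exact top-k scan by cosine similarity might look like:

```python
import heapq
import math

# What a "flat" index does at query time: score every vector, keep the top k.
# Exact but linear in corpus size; HNSW trades this exactness for sublinear
# query time. Vectors and IDs below are toy data.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def flat_search(query, corpus, k=2):
    # corpus: list of (doc_id, vector) pairs
    return heapq.nlargest(k, corpus, key=lambda item: cosine(query, item[1]))

corpus = [("doc1", [1.0, 0.0]), ("doc2", [0.7, 0.7]), ("doc3", [0.0, 1.0])]
hits = flat_search([1.0, 0.1], corpus)
print([doc_id for doc_id, _ in hits])  # ['doc1', 'doc2']
```

For small corpora and rapid prototyping, this exhaustive scan is often fast enough, which is the tradeoff the paper quantifies.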
"Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes?" by Jimmy Lin. arXiv - CS - Information Retrieval, 2024-09-10. https://doi.org/arxiv-2409.06464
Yongsu Ahn, Quinn K Wolter, Jonilyn Dick, Janet Dick, Yu-Ru Lin
Recommender systems have become integral to digital experiences, shaping user interactions and preferences across various platforms. Despite their widespread use, these systems often suffer from algorithmic biases that can lead to unfair and unsatisfactory user experiences. This study introduces an interactive tool designed to help users comprehend and explore the impacts of algorithmic harms in recommender systems. By leveraging visualizations, counterfactual explanations, and interactive modules, the tool allows users to investigate how biases such as miscalibration, stereotypes, and filter bubbles affect their recommendations. Informed by in-depth user interviews, this tool benefits both general users and researchers by increasing transparency and offering personalized impact assessments, ultimately fostering a better understanding of algorithmic biases and contributing to more equitable recommendation outcomes. This work provides valuable insights for future research and practical applications in mitigating bias and enhancing fairness in machine learning algorithms.
"Interactive Counterfactual Exploration of Algorithmic Harms in Recommender Systems" by Yongsu Ahn, Quinn K Wolter, Jonilyn Dick, Janet Dick, Yu-Ru Lin. arXiv - CS - Information Retrieval, 2024-09-10. https://doi.org/arxiv-2409.06916
Weicong Qin, Yi Xu, Weijie Yu, Chenglei Shen, Xiao Zhang, Ming He, Jianping Fan, Jun Xu
Sequence recommendation (SeqRec) aims to predict the next item a user will interact with by understanding user intentions and leveraging collaborative filtering information. Large language models (LLMs) have shown great promise in recommendation tasks through prompting, fixed reflection libraries, and fine-tuning techniques. However, these methods face challenges, including lack of supervision, inability to optimize reflection sources, inflexibility to diverse user needs, and high computational costs. Despite promising results, current studies primarily focus on reflections of users' explicit preferences (e.g., item titles) while neglecting implicit preferences (e.g., brands) and collaborative filtering information. This oversight hinders the capture of preference shifts and dynamic user behaviors. Additionally, existing approaches lack mechanisms for reflection evaluation and iteration, often leading to suboptimal recommendations. To address these issues, we propose the Mixture of REflectors (MoRE) framework, designed to model and learn dynamic user preferences in SeqRec. Specifically, MoRE introduces three reflectors for generating LLM-based reflections on explicit preferences, implicit preferences, and collaborative signals. Each reflector incorporates a self-improving strategy, termed refining-and-iteration, to evaluate and iteratively update reflections. Furthermore, a meta-reflector employs a contextual bandit algorithm to select the most suitable expert and corresponding reflections for each user's recommendation, effectively capturing dynamic preferences. Extensive experiments on three real-world datasets demonstrate that MoRE consistently outperforms state-of-the-art methods, requiring less training time and GPU memory compared to other LLM-based approaches in SeqRec.
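The meta-reflector's bandit-based selection can be illustrated with a much simpler, context-free stand-in. The sketch below uses epsilon-greedy rather than the paper's contextual bandit, with hypothetical reflector names and fixed toy rewards, just to show the select-then-update loop:

```python
import random

# Context-free epsilon-greedy stand-in for the meta-reflector's bandit:
# repeatedly select one of three reflectors, observe a reward, and update
# a running-mean value estimate per arm. Rewards here are fixed toy values.
class EpsilonGreedy:
    def __init__(self, arms, epsilon=0.2, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}
        self.rng = random.Random(seed)

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)          # explore
        return max(self.arms, key=self.values.get)     # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n  # running mean

bandit = EpsilonGreedy(["explicit", "implicit", "collaborative"])
rewards = {"explicit": 0.3, "implicit": 0.5, "collaborative": 0.8}
for _ in range(500):
    arm = bandit.select()
    bandit.update(arm, rewards[arm])
print(max(bandit.values, key=bandit.values.get))  # converges to "collaborative"
```

In MoRE the selection is additionally conditioned on per-user context, so different users can be routed to different reflectors.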
"Enhancing Sequential Recommendations through Multi-Perspective Reflections and Iteration" by Weicong Qin, Yi Xu, Weijie Yu, Chenglei Shen, Xiao Zhang, Ming He, Jianping Fan, Jun Xu. arXiv - CS - Information Retrieval, 2024-09-10. https://doi.org/arxiv-2409.06377
Qitao Qin, Yucong Luo, Mingyue Cheng, Qingyang Mao, Chenyi Lei
Federated recommendation (FedRec) preserves user privacy by enabling decentralized training of personalized models, but this architecture is inherently vulnerable to adversarial attacks. Significant research has been conducted on targeted attacks in FedRec systems, motivated by commercial and social influence considerations. However, much of this work has largely overlooked the differential robustness of recommendation models. Moreover, our empirical findings indicate that existing targeted attack methods achieve only limited effectiveness in Federated Sequential Recommendation (FSR) tasks. Driven by these observations, we focus on investigating targeted attacks in FSR and propose a novel dual-view attack framework, named DV-FSR. This attack method uniquely combines a sampling-based explicit strategy with a contrastive learning-based implicit gradient strategy to orchestrate a coordinated attack. Additionally, we introduce a specific defense mechanism tailored for targeted attacks in FSR, aiming to evaluate the mitigation effects of our proposed attack method. Extensive experiments validate the effectiveness of our proposed approach on representative sequential models.
"DV-FSR: A Dual-View Target Attack Framework for Federated Sequential Recommendation" by Qitao Qin, Yucong Luo, Mingyue Cheng, Qingyang Mao, Chenyi Lei. arXiv - CS - Information Retrieval, 2024-09-10. https://doi.org/arxiv-2409.07500
The scale-space method is a well-established framework that constructs a hierarchical representation of an input signal and facilitates coarse-to-fine visual reasoning. Considering the terrain elevation function as the input signal, the scale-space method can identify and track significant topographic features across different scales. The number of scales across which a feature persists, called its life span, indicates the importance of that feature. In this way, important topographic features of a landscape can be selected, which are useful for many applications, including cartography, nautical charting, and land-use planning. The scale-space methods developed for terrain data use gridded Digital Elevation Models (DEMs) to represent the terrain. However, gridded DEMs lack the flexibility to adapt to the irregular distribution of input data and the varied topological complexity of different regions. Instead, Triangulated Irregular Networks (TINs) can be directly generated from irregularly distributed point clouds and accurately preserve important features. In this work, we introduce a novel scale-space analysis pipeline for TINs, addressing the multiple challenges in extending grid-based scale-space methods to TINs. Our pipeline can efficiently identify and track topologically important features on TINs. Moreover, it is capable of analyzing terrains with irregular boundaries, which poses challenges for grid-based methods. Comprehensive experiments show that, compared to grid-based methods, our TIN-based pipeline is more efficient, accurate, and has better resolution robustness.
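The life-span idea can be illustrated with a toy 1-D example, far simpler than the paper's 2-D TIN pipeline: repeatedly smooth an elevation profile and count how many scales each initial peak survives; prominent peaks persist longer than noise. The data, the averaging kernel, and the by-index tracking are all simplifications for illustration:

```python
# Toy 1-D sketch of scale-space life spans: smooth an elevation profile
# across scales and count how long each initial peak stays a peak.
def smooth(sig):
    # [0.25, 0.5, 0.25] kernel with clamped boundaries, a crude stand-in
    # for Gaussian scale-space smoothing
    n = len(sig)
    return [0.25 * sig[max(i - 1, 0)] + 0.5 * sig[i] + 0.25 * sig[min(i + 1, n - 1)]
            for i in range(n)]

def peaks(sig):
    # indices that are strict local maxima
    return {i for i in range(1, len(sig) - 1)
            if sig[i] > sig[i - 1] and sig[i] > sig[i + 1]}

def life_spans(sig, n_scales):
    spans = dict.fromkeys(peaks(sig), 0)
    alive = set(spans)
    for _ in range(n_scales):
        sig = smooth(sig)
        alive &= peaks(sig)  # once a peak disappears, it stays dead
        for i in alive:
            spans[i] += 1
    return spans

elevation = [0, 5, 1, 2, 1, 9, 2, 1, 0]
spans = life_spans(elevation, 3)
print(max(spans, key=spans.get))  # 5: the sharpest peak persists longest
```

Real pipelines track features as they drift between scales rather than pinning them to a fixed index; that tracking is one of the challenges the paper addresses on TINs.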
"Critical Features Tracking on Triangulated Irregular Networks by a Scale-Space Method" by Haoan Feng, Yunting Song, Leila De Floriani. arXiv - CS - Information Retrieval, 2024-09-10. https://doi.org/arxiv-2409.06638
Tri Kurniawan Wijaya, Edoardo D'Amico, Gabor Fodor, Manuel V. Loureiro
Rs4rs is a web application designed to perform semantic search on recent papers from top conferences and journals related to Recommender Systems. Current scholarly search engine tools like Google Scholar, Semantic Scholar, and ResearchGate often yield broad results that fail to target the most relevant high-quality publications. Moreover, manually visiting individual conference and journal websites is a time-consuming process that primarily supports only syntactic searches. Rs4rs addresses these issues by providing a user-friendly platform where researchers can input their topic of interest and receive a list of recent, relevant papers from top Recommender Systems venues. Utilizing semantic search techniques, Rs4rs ensures that the search results are not only precise and relevant but also comprehensive, capturing papers regardless of variations in wording. This tool significantly enhances research efficiency and accuracy, thereby benefitting the research community and public by facilitating access to high-quality, pertinent academic resources in the field of Recommender Systems. Rs4rs is available at https://rs4rs.com.
"Rs4rs: Semantically Find Recent Publications from Top Recommendation System-Related Venues" by Tri Kurniawan Wijaya, Edoardo D'Amico, Gabor Fodor, Manuel V. Loureiro. arXiv - CS - Information Retrieval, 2024-09-09. https://doi.org/arxiv-2409.05570
Bowen Zheng, Junjie Zhang, Hongyu Lu, Yu Chen, Ming Chen, Wayne Xin Zhao, Ji-Rong Wen
Graph neural networks (GNNs) have been a powerful approach in collaborative filtering (CF) due to their ability to model high-order user-item relationships. Recently, to alleviate data sparsity and enhance representation learning, many efforts have been made to integrate contrastive learning (CL) with GNNs. Despite the promising improvements, the contrastive view generation based on structure and representation perturbations in existing methods potentially disrupts the collaborative information in contrastive views, resulting in limited effectiveness of positive alignment. To overcome this issue, we propose CoGCL, a novel framework that aims to enhance graph contrastive learning by constructing contrastive views with stronger collaborative information via discrete codes. The core idea is to map users and items into discrete codes rich in collaborative information for reliable and informative contrastive view generation. To this end, we initially introduce a multi-level vector quantizer in an end-to-end manner to quantize user and item representations into discrete codes. Based on these discrete codes, we enhance the collaborative information of contrastive views by considering neighborhood structure and semantic relevance respectively. For neighborhood structure, we propose virtual neighbor augmentation by treating discrete codes as virtual neighbors, which expands an observed user-item interaction into multiple edges involving discrete codes. Regarding semantic relevance, we identify similar users/items based on shared discrete codes and interaction targets to generate the semantically relevant view. Through these strategies, we construct contrastive views with stronger collaborative information and develop a triple-view graph contrastive learning approach. Extensive experiments on four public datasets demonstrate the effectiveness of our proposed approach.
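The core quantization step can be illustrated at its simplest: assign each embedding the index of its nearest codebook entry, which becomes its discrete code. The sketch below shows one level of nearest-neighbor assignment with toy 2-D vectors; the paper's quantizer is multi-level and trained end-to-end, which this does not capture:

```python
# One level of vector quantization: each embedding's discrete code is the
# index of its nearest codebook entry. Codebook and embeddings are toy data.
def quantize(vec, codebook):
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vec, codebook[k]))

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
user_embeddings = {"u1": [0.9, 0.1], "u2": [0.1, 0.8], "u3": [0.05, 0.05]}
codes = {u: quantize(v, codebook) for u, v in user_embeddings.items()}
print(codes)  # {'u1': 1, 'u2': 2, 'u3': 0}
```

Users or items mapped to the same code can then be treated as virtual neighbors or as semantically similar, which is how the discrete codes feed the contrastive view construction.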
{"title":"Enhancing Graph Contrastive Learning with Reliable and Informative Augmentation for Recommendation","authors":"Bowen Zheng, Junjie Zhang, Hongyu Lu, Yu Chen, Ming Chen, Wayne Xin Zhao, Ji-Rong Wen","doi":"arxiv-2409.05633","DOIUrl":"https://doi.org/arxiv-2409.05633","url":null,"abstract":"Graph neural network (GNN) has been a powerful approach in collaborative filtering (CF) due to its ability to model high-order user-item relationships. Recently, to alleviate the data sparsity and enhance representation learning, many efforts have been conducted to integrate contrastive learning (CL) with GNNs. Despite the promising improvements, the contrastive view generation based on structure and representation perturbations in existing methods potentially disrupts the collaborative information in contrastive views, resulting in limited effectiveness of positive alignment. To overcome this issue, we propose CoGCL, a novel framework that aims to enhance graph contrastive learning by constructing contrastive views with stronger collaborative information via discrete codes. The core idea is to map users and items into discrete codes rich in collaborative information for reliable and informative contrastive view generation. To this end, we initially introduce a multi-level vector quantizer in an end-to-end manner to quantize user and item representations into discrete codes. Based on these discrete codes, we enhance the collaborative information of contrastive views by considering neighborhood structure and semantic relevance respectively. For neighborhood structure, we propose virtual neighbor augmentation by treating discrete codes as virtual neighbors, which expands an observed user-item interaction into multiple edges involving discrete codes. Regarding semantic relevance, we identify similar users/items based on shared discrete codes and interaction targets to generate the semantically relevant view. Through these strategies, we construct contrastive views with stronger collaborative information and develop a triple-view graph contrastive learning approach. Extensive experiments on four public datasets demonstrate the effectiveness of our proposed approach.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"55 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142205353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
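The quantize-then-augment idea in the CoGCL abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: a simple nearest-neighbor lookup stands in for the end-to-end-trained multi-level vector quantizer, and all function and variable names are assumptions.

```python
import numpy as np

def quantize(reps, codebooks):
    """Multi-level quantization sketch: at each level, snap each
    representation to its nearest codebook vector and record the code
    index. (CoGCL trains the quantizer end-to-end; this lookup is only
    illustrative.)"""
    codes = []
    for cb in codebooks:  # one codebook per level
        # pairwise distances: (num_nodes, codebook_size)
        dists = np.linalg.norm(reps[:, None, :] - cb[None, :, :], axis=-1)
        codes.append(dists.argmin(axis=1))
    return np.stack(codes, axis=1)  # (num_nodes, num_levels)

def virtual_neighbor_edges(user, item, user_codes, item_codes):
    """Virtual neighbor augmentation sketch: expand one observed
    user-item interaction into multiple edges by connecting each
    endpoint to the other endpoint's discrete codes."""
    edges = [("user", user, "item", item)]  # the observed interaction
    for lvl, c in enumerate(item_codes[item]):
        edges.append(("user", user, f"item_code_l{lvl}", int(c)))
    for lvl, c in enumerate(user_codes[user]):
        edges.append(("item", item, f"user_code_l{lvl}", int(c)))
    return edges
```

With two quantization levels, a single observed interaction expands into five edges (the original plus four code edges), which is the "multiple edges involving discrete codes" the abstract describes.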
Arkadeep Acharya, Rudra Murthy, Vishwajeet Kumar, Jaydeep Sen
Despite significant progress in multilingual information retrieval, the lack of models capable of effectively supporting multiple languages, particularly low-resource languages such as the Indic languages, remains a critical challenge. This paper presents NLLB-E5: A Scalable Multilingual Retrieval Model. NLLB-E5 leverages the in-built multilingual capabilities of the NLLB encoder, originally built for translation tasks. It proposes a distillation approach from the multilingual retriever E5 to provide zero-shot retrieval across multiple languages, including all major Indic languages, without requiring multilingual training data. We evaluate the model on a comprehensive suite of existing benchmarks, including Hindi-BEIR, highlighting its robust performance across diverse languages and tasks. Our findings uncover task- and domain-specific challenges, providing valuable insights into retrieval performance, especially for low-resource languages. NLLB-E5 addresses the urgent need for an inclusive, scalable, and language-agnostic text retrieval model, advancing the field of multilingual information access and promoting digital inclusivity for millions of users globally.
{"title":"NLLB-E5: A Scalable Multilingual Retrieval Model","authors":"Arkadeep Acharya, Rudra Murthy, Vishwajeet Kumar, Jaydeep Sen","doi":"arxiv-2409.05401","DOIUrl":"https://doi.org/arxiv-2409.05401","url":null,"abstract":"Despite significant progress in multilingual information retrieval, the lack of models capable of effectively supporting multiple languages, particularly low-resource like Indic languages, remains a critical challenge. This paper presents NLLB-E5: A Scalable Multilingual Retrieval Model. NLLB-E5 leverages the in-built multilingual capabilities in the NLLB encoder for translation tasks. It proposes a distillation approach from multilingual retriever E5 to provide a zero-shot retrieval approach handling multiple languages, including all major Indic languages, without requiring multilingual training data. We evaluate the model on a comprehensive suite of existing benchmarks, including Hindi-BEIR, highlighting its robust performance across diverse languages and tasks. Our findings uncover task and domain-specific challenges, providing valuable insights into the retrieval performance, especially for low-resource languages. NLLB-E5 addresses the urgent need for an inclusive, scalable, and language-agnostic text retrieval model, advancing the field of multilingual information access and promoting digital inclusivity for millions of users globally.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"63 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142205393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
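The embedding-distillation idea behind NLLB-E5 can be sketched as a simple alignment objective: the student (NLLB encoder) is pushed toward the frozen E5 teacher's embedding space, so that the student's other languages inherit the retrieval geometry through the shared multilingual encoder. The plain MSE-on-normalized-embeddings form below and all names are illustrative assumptions, not the paper's exact training recipe.

```python
import numpy as np

def distill_loss(student_emb, teacher_emb):
    """Sketch of an embedding-distillation objective: mean squared
    distance between L2-normalized student and teacher sentence
    embeddings for the same (e.g. English) inputs. Minimizing this
    aligns the student's embedding space with the teacher's."""
    s = student_emb / np.linalg.norm(student_emb, axis=1, keepdims=True)
    t = teacher_emb / np.linalg.norm(teacher_emb, axis=1, keepdims=True)
    return float(np.mean(np.sum((s - t) ** 2, axis=1)))
```

Because the loss touches only the teacher's output embeddings, no multilingual training data is needed: distilling on one pivot language can transfer retrieval ability to every language the student encoder already covers, which is the zero-shot property the abstract claims.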