Scale-free networks are observed in a wide variety of complex systems, which has motivated extensive research on networked models of this type. In this work, we propose a family of growth tree networks \(\mathcal{T}_{t}\), constructed in an iterative manner, which turn out to be scale-free. Unlike most published scale-free tree models, our tree networks have a power-law exponent \(\gamma=1+\ln 5/\ln 2\) that is strictly larger than \(3\). At the same time, the "small-world" property is absent, mainly because the models \(\mathcal{T}_{t}\) have an ultra-large diameter \(D_{t}\) (i.e., \(D_{t}\sim|\mathcal{T}_{t}|^{\ln 3/\ln 5}\)) and a large average shortest path length \(\langle\mathcal{W}_{t}\rangle\) (namely, \(\langle\mathcal{W}_{t}\rangle\sim|\mathcal{T}_{t}|^{\ln 3/\ln 5}\)), where \(|\mathcal{T}_{t}|\) denotes the number of vertices. Next, we determine the Pearson correlation coefficient and verify that the networks \(\mathcal{T}_{t}\) display a disassortative mixing structure. In addition, we study random walks on the tree networks \(\mathcal{T}_{t}\) and derive an exact solution for the mean hitting time \(\langle\mathcal{H}_{t}\rangle\). The results show that \(\langle\mathcal{H}_{t}\rangle\), as a function of the vertex number \(|\mathcal{T}_{t}|\), follows a power law, i.e., \(\langle\mathcal{H}_{t}\rangle\sim|\mathcal{T}_{t}|^{1+\ln 3/\ln 5}\). We also carry out extensive simulations and demonstrate that the empirical results are in strong agreement with the theoretical ones. Lastly, we provide a guide for extending the proposed iterative construction to generate more general scale-free tree networks with large diameter.
{"title":"Structural properties on scale-free tree network with an ultra-large diameter","authors":"Fei Ma, Ping Wang","doi":"10.1145/3674146","DOIUrl":"https://doi.org/10.1145/3674146","url":null,"abstract":"<p>Scale-free networks are prevalently observed in a great variety of complex systems, which triggers various researches relevant to networked models of such type. In this work, we propose a family of growth tree networks (mathcal{T}_{t}), which turn out to be scale-free, in an iterative manner. As opposed to most of published tree models with scale-free feature, our tree networks have the power-law exponent (gamma=1+ln 5/ln 2) that is obviously larger than (3). At the same time, ”small-world” property can not be found particularly because models (mathcal{T}_{t}) have an ultra-large diameter (D_{t}) (i.e., (D_{t}sim|mathcal{T}_{t}|^{ln 3/ln 5})) and a greater average shortest path length (langlemathcal{W}_{t}rangle) (namely, (langlemathcal{W}_{t}ranglesim|mathcal{T}_{t}|^{ln 3/ln 5})) where (|mathcal{T}_{t}|) represents vertex number. Next, we determine Pearson correlation coefficient and verify that networks (mathcal{T}_{t}) display disassortative mixing structure. In addition, we study random walks on tree networks (mathcal{T}_{t}) and derive exact solution to mean hitting time (langlemathcal{H}_{t}rangle). The results suggest that the analytic formula for quantity (langlemathcal{H}_{t}rangle) as a function of vertex number (|mathcal{T}_{t}|) shows a power-law form, i.e., (langlemathcal{H}_{t}ranglesim|mathcal{T}_{t}|^{1+ln 3/ln 5}). Accordingly, we execute extensive experimental simulations, and demonstrate that empirical analysis is in strong agreement with theoretical results. Lastly, we provide a guide to extend the proposed iterative manner in order to generate more general scale-free tree networks with large diameter.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"345 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating individual treatment effects in networked observational data is a crucial and increasingly recognized problem. One major challenge is the violation of the Stable Unit Treatment Value Assumption (SUTVA), which posits that a unit's outcome is independent of other units' treatment assignments. In network data, however, a unit's outcome is influenced not only by its own treatment (the direct effect) but also by the treatments of others (the spillover effect), due to the presence of interference. Moreover, the interference from other units is typically heterogeneous (e.g., friends with similar interests exert a different influence than those with different interests). In this paper, we focus on estimating individual treatment effects (both direct and spillover effects) under heterogeneous interference in networks. To address this problem, we propose a novel Dual Weighting Regression (DWR) algorithm that simultaneously learns attention weights, to capture the heterogeneous interference from neighbors, and sample weights, to eliminate the complex confounding bias in networks. We formulate the learning process as a bi-level optimization problem. Theoretically, we give a generalization error bound for the expected estimation error of the individual treatment effects. Extensive experiments on four benchmark datasets demonstrate that the proposed DWR algorithm outperforms state-of-the-art methods in estimating individual treatment effects under heterogeneous network interference.
{"title":"Learning Individual Treatment Effects under Heterogeneous Interference in Networks","authors":"Ziyu Zhao, Yuqi Bai, Ruoxuan Xiong, Qingyu Cao, Chao Ma, Ning Jiang, Fei Wu, Kun Kuang","doi":"10.1145/3673761","DOIUrl":"https://doi.org/10.1145/3673761","url":null,"abstract":"<p>Estimating individual treatment effects in networked observational data is a crucial and increasingly recognized problem. One major challenge of this problem is violating the Stable Unit Treatment Value Assumption (SUTVA), which posits that a unit’s outcome is independent of others’ treatment assignments. However, in network data, a unit’s outcome is influenced not only by its treatment (i.e., direct effect) but also by the treatments of others (i.e., spillover effect) since the presence of interference. Moreover, the interference from other units is always heterogeneous (e.g., friends with similar interests have a different influence than those with different interests). In this paper, we focus on the problem of estimating individual treatment effects (including direct effect and spillover effect) under heterogeneous interference in networks. To address this problem, we propose a novel Dual Weighting Regression (DWR) algorithm by simultaneously learning attention weights to capture the heterogeneous interference from neighbors and sample weights to eliminate the complex confounding bias in networks. We formulate the learning process as a bi-level optimization problem. Theoretically, we give a generalization error bound for the expected estimation error of the individual treatment effects. Extensive experiments on four benchmark datasets demonstrate that the proposed DWR algorithm outperforms the state-of-the-art methods in estimating individual treatment effects under heterogeneous network interference.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"84 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recommender systems are influenced by many confounding factors (i.e., confounders), which result in various biases (e.g., popularity bias) and inaccurate estimates of user preference. Existing approaches try to eliminate these biases through inference with causal graphs. However, they assume that all confounding factors can be observed and that no hidden confounders exist. We argue that many confounding factors (e.g., season) may not be observable from user-item interaction data, resulting in inaccurate user preference estimates. In this paper, we propose a deconfounded recommender that accounts for unobservable confounders. Specifically, we propose a new causal graph with explicit and implicit feedback, which can better model user preference. Then, we construct a deconfounded estimator via the front-door adjustment, which eliminates the effect of unobserved confounders. Finally, we conduct a series of experiments on two real-world datasets, and the results show that our approach outperforms its counterparts in terms of recommendation accuracy.
{"title":"Deconfounding User Preference in Recommendation Systems through Implicit and Explicit Feedback","authors":"Yuliang Liang, Enneng Yang, Guibing Guo, Wei Cai, Linying Jiang, Xingwei Wang","doi":"10.1145/3673762","DOIUrl":"https://doi.org/10.1145/3673762","url":null,"abstract":"<p>Recommender systems are influenced by many confounding factors (i.e., confounders) which result in various biases (e.g., popularity biases) and inaccurate user preference. Existing approaches try to eliminate these biases by inference with causal graphs. However, they assume all confounding factors can be observed and no hidden confounders exist. We argue that many confounding factors (e.g., season) may not be observable from user-item interaction data, resulting inaccurate user preference. In this paper, we propose a deconfounded recommender considering unobservable confounders. Specifically, we propose a new causal graph with explicit and implicit feedback, which can better model user preference. Then, we realize a deconfounded estimator by the front-door adjustment, which is able to eliminate the effect of unobserved confounders. Finally, we conduct a series of experiments on two real-world datasets, and the results show that our approach performs better than other counterparts in terms of recommendation accuracy.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"124 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Topic inference for research proposals aims to obtain the most suitable disciplinary division from the discipline system defined by a funding agency; the agency then uses this division to find appropriate peer-review experts from its database. Automated topic inference can reduce human errors caused by manual topic filling, bridge the knowledge gap between funding agencies and project applicants, and improve system efficiency. Existing methods model this as a hierarchical multi-label classification problem, using generative models to iteratively infer the most appropriate topic information. However, these methods overlook the gap in scale between interdisciplinary research proposals and non-interdisciplinary ones, leading to an unjust phenomenon in which the automated inference system categorizes interdisciplinary proposals as non-interdisciplinary, causing unfairness during expert assignment. How can we address this data imbalance under a complex discipline system and thereby resolve this unfairness? In this paper, we implement a topic label inference system based on a Transformer encoder-decoder architecture. Furthermore, during training we use interpolation techniques to create a series of pseudo-interdisciplinary proposals from non-interdisciplinary ones, guided by non-parametric indicators such as cross-topic probabilities and topic occurrence probabilities. This approach aims to reduce the bias of the system during model training. Finally, we conduct extensive experiments on a real-world dataset to verify the effectiveness of the proposed method. The experimental results demonstrate that our training strategy can significantly mitigate the unfairness arising in the topic inference task. To improve the reproducibility of our research, we have released the accompanying code via Dropbox.
{"title":"Interdisciplinary Fairness in Imbalanced Research Proposal Topic Inference: A Hierarchical Transformer-based Method with Selective Interpolation","authors":"Meng Xiao, Min Wu, Ziyue Qiao, Yanjie Fu, Zhiyuan Ning, Yi Du, Yuanchun Zhou","doi":"10.1145/3671149","DOIUrl":"https://doi.org/10.1145/3671149","url":null,"abstract":"<p>The objective of topic inference in research proposals aims to obtain the most suitable disciplinary division from the discipline system defined by a funding agency. The agency will subsequently find appropriate peer review experts from their database based on this division. Automated topic inference can reduce human errors caused by manual topic filling, bridge the knowledge gap between funding agencies and project applicants, and improve system efficiency. Existing methods focus on modeling this as a hierarchical multi-label classification problem, using generative models to iteratively infer the most appropriate topic information. However, these methods overlook the gap in scale between interdisciplinary research proposals and non-interdisciplinary ones, leading to an unjust phenomenon where the automated inference system categorizes interdisciplinary proposals as non-interdisciplinary, causing unfairness during the expert assignment. How can we address this data imbalance issue under a complex discipline system and hence resolve this unfairness? In this paper, we implement a topic label inference system based on a Transformer encoder-decoder architecture. Furthermore, we utilize interpolation techniques to create a series of pseudo-interdisciplinary proposals from non-interdisciplinary ones during training based on non-parametric indicators such as cross-topic probabilities and topic occurrence probabilities. This approach aims to reduce the bias of the system during model training. Finally, we conduct extensive experiments on a real-world dataset to verify the effectiveness of the proposed method. The experimental results demonstrate that our training strategy can significantly mitigate the unfairness generated in the topic inference task. To improve the reproducibility of our research, we have released accompanying code by Dropbox.<sup>1</sup>.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"42 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141550260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software vulnerabilities, also known as flaws, bugs, or weaknesses, are common in modern information systems, putting critical data of organizations and individuals at cyber risk. Because remediation resources are scarce, initial risk assessment is becoming a necessary step for prioritizing vulnerabilities and making better decisions on remediation, mitigation, and patching. Datasets containing historical vulnerability information are crucial digital assets for enabling AI-based risk assessment. However, existing datasets focus on collecting information on individual vulnerabilities and simply store them in relational databases, disregarding their structural connections. This paper constructs a compact vulnerability knowledge graph, VulKG, containing over 276K nodes and 1M relationships that represent the connections between vulnerabilities, exploits, affected products, vendors, referenced domain names, and more. We provide a detailed analysis of VulKG modeling and construction, demonstrate VulKG-based querying and reasoning, and present a use case that applies VulKG to a vulnerability risk assessment task, namely co-exploitation behavior discovery. Experimental results demonstrate the value of graph connections in vulnerability risk assessment tasks. VulKG offers exciting opportunities for further research in areas related to vulnerability risk assessment. The data and code for this paper are available at https://github.com/happyResearcher/VulKG.git.
{"title":"A Compact Vulnerability Knowledge Graph for Risk Assessment","authors":"Jiao Yin, Wei Hong, Hua Wang, Jinli Cao, Yuan Miao, Yanchun Zhang","doi":"10.1145/3671005","DOIUrl":"https://doi.org/10.1145/3671005","url":null,"abstract":"<p>Software vulnerabilities, also known as flaws, bugs or weaknesses, are common in modern information systems, putting critical data of organizations and individuals at cyber risk. Due to the scarcity of resources, initial risk assessment is becoming a necessary step to prioritize vulnerabilities and make better decisions on remediation, mitigation, and patching. Datasets containing historical vulnerability information are crucial digital assets to enable AI-based risk assessments. However, existing datasets focus on collecting information on individual vulnerabilities while simply storing them in relational databases, disregarding their structural connections. This paper constructs a compact vulnerability knowledge graph, VulKG, containing over 276K nodes and 1M relationships to represent the connections between vulnerabilities, exploits, affected products, vendors, referred domain names, and more. We provide a detailed analysis of VulKG modeling and construction, demonstrating VulKG-based query and reasoning, and providing a use case of applying VulKG to a vulnerability risk assessment task, i.e., co-exploitation behavior discovery. Experimental results demonstrate the value of graph connections in vulnerability risk assessment tasks. VulKG offers exciting opportunities for more novel and significant research in areas related to vulnerability risk assessment. The data and codes of this paper are available at https://github.com/happyResearcher/VulKG.git.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"30 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141258961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As a critical task in large-scale commercial recommender systems, reranking rearranges the items in the initial ranking lists from the previous ranking stage to better meet users' demands. Foundational work on reranking has shown the potential to improve recommendation results by uncovering mutual influence among items. However, rather than considering only the context of the initial lists, as most existing methods do, an ideal reranking algorithm should consider the counterfactual context: the positions and alignment of the items in the reranked lists. In this work, we propose a novel pairwise reranking framework, Utility-oriented Reranking with Counterfactual Context (URCC), which efficiently maximizes the overall utility after reranking. Specifically, we first design a utility-oriented evaluator, which applies a Bi-LSTM and a graph attention mechanism to estimate the listwise utility via counterfactual-context modeling. Then, under the guidance of the evaluator, we propose a pairwise reranker model that finds the most suitable position for each item by swapping misplaced item pairs. Extensive experiments on two benchmark datasets and a proprietary real-world dataset demonstrate that URCC significantly outperforms state-of-the-art models in terms of both relevance-based and utility-based metrics.
{"title":"Utility-oriented Reranking with Counterfactual Context","authors":"Yunjia Xi, Weiwen Liu, Xinyi Dai, Ruiming Tang, Qing Liu, Weinan Zhang, Yong Yu","doi":"10.1145/3671004","DOIUrl":"https://doi.org/10.1145/3671004","url":null,"abstract":"<p>As a critical task for large-scale commercial recommender systems, reranking rearranges items in the initial ranking lists from the previous ranking stage to better meet users’ demands. Foundational work in reranking has shown the potential of improving recommendation results by uncovering mutual influence among items. However, rather than considering the context of initial lists as most existing methods do, an ideal reranking algorithm should consider the <i>counterfactual context</i> – the position and the alignment of the items in the <i>reranked lists</i>. In this work, we propose a novel pairwise reranking framework, Utility-oriented Reranking with Counterfactual Context (URCC), which maximizes the overall utility after reranking efficiently. Specifically, we first design a utility-oriented evaluator, which applies Bi-LSTM and graph attention mechanism to estimate the listwise utility via the <i>counterfactual context</i> modeling. Then, under the guidance of the evaluator, we propose a pairwise reranker model to find the most suitable position for each item by swapping misplaced item pairs. Extensive experiments on two benchmark datasets and a proprietary real-world dataset demonstrate that URCC significantly outperforms the state-of-the-art models in terms of both relevance-based metrics and utility-based metrics.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"26 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141258837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This survey paper presents a comprehensive and conceptual overview of anomaly detection using dynamic graphs. We focus on existing graph-based anomaly detection (AD) techniques and their applications to dynamic networks. The contributions of this survey paper include the following: i) a comparative study of existing surveys on anomaly detection; ii) a Dynamic Graph-based Anomaly Detection (DGAD) review framework in which approaches for detecting anomalies in dynamic graphs are grouped based on traditional machine-learning models, matrix transformations, probabilistic approaches, and deep-learning approaches; iii) a discussion of graphically representing both discrete and dynamic networks; and iv) a discussion of the advantages of graph-based techniques for capturing the relational structure and complex interactions in dynamic graph data. Finally, this work identifies the potential challenges and future directions for detecting anomalies in dynamic networks. This DGAD survey approach aims to provide a valuable resource for researchers and practitioners by summarizing the strengths and limitations of each approach, highlighting current research trends, and identifying open challenges. In doing so, it can guide future research efforts and promote advancements in anomaly detection in dynamic graphs.
{"title":"Anomaly Detection in Dynamic Graphs: A Comprehensive Survey","authors":"Ocheme Anthony Ekle, William Eberle","doi":"10.1145/3669906","DOIUrl":"https://doi.org/10.1145/3669906","url":null,"abstract":"<p>This survey paper presents a comprehensive and conceptual overview of anomaly detection using dynamic graphs. We focus on existing graph-based anomaly detection (AD) techniques and their applications to dynamic networks. The contributions of this survey paper include the following: i) a comparative study of existing surveys on anomaly detection; ii) a <b>D</b>ynamic <b>G</b>raph-based <b>A</b>nomaly <b>D</b>etection (<b>DGAD</b>) review framework in which approaches for detecting anomalies in dynamic graphs are grouped based on traditional machine-learning models, matrix transformations, probabilistic approaches, and deep-learning approaches; iii) a discussion of graphically representing both discrete and dynamic networks; and iv) a discussion of the advantages of graph-based techniques for capturing the relational structure and complex interactions in dynamic graph data. Finally, this work identifies the potential challenges and future directions for detecting anomalies in dynamic networks. This <b>DGAD</b> survey approach aims to provide a valuable resource for researchers and practitioners by summarizing the strengths and limitations of each approach, highlighting current research trends, and identifying open challenges. In doing so, it can guide future research efforts and promote advancements in anomaly detection in dynamic graphs.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"6 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141193177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the increasing penetration of machine learning applications in critical decision-making areas, calls for algorithmic fairness have become more prominent. Although various approaches improve algorithmic fairness by learning with fairness constraints, their performance does not generalize well to the test set. A fair algorithm with strong performance and better generalizability is therefore needed. This paper proposes a novel adaptive reweighing method that eliminates the impact of distribution shifts between training and test data on model generalizability. Most previous reweighing methods assign a unified weight to each (sub)group. Our method instead granularly models the distance from each sample's prediction to the decision boundary. The adaptive reweighing prioritizes samples closer to the decision boundary and assigns them higher weights to improve the generalizability of fair classifiers. Extensive experiments validate the generalizability of our adaptive priority reweighing method with respect to accuracy and fairness measures (i.e., equal opportunity, equalized odds, and demographic parity) on tabular benchmarks. We also highlight the performance of our method in improving the fairness of language and vision models. The code is available at https://github.com/che2198/APW.
{"title":"Boosting Fair Classifier Generalization through Adaptive Priority Reweighing","authors":"Zhihao Hu, Yiran Xu, Mengnan Du, Jindong Gu, Xinmei Tian, Fengxiang He","doi":"10.1145/3665895","DOIUrl":"https://doi.org/10.1145/3665895","url":null,"abstract":"<p>With the increasing penetration of machine learning applications in critical decision-making areas, calls for algorithmic fairness are more prominent. Although there have been various modalities to improve algorithmic fairness through learning with fairness constraints, their performance does not generalize well in the test set. A performance-promising fair algorithm with better generalizability is needed. This paper proposes a novel adaptive reweighing method to eliminate the impact of the distribution shifts between training and test data on model generalizability. Most previous reweighing methods propose to assign a unified weight for each (sub)group. Rather, our method granularly models the distance from the sample predictions to the decision boundary. Our adaptive reweighing method prioritizes samples closer to the decision boundary and assigns a higher weight to improve the generalizability of fair classifiers. Extensive experiments are performed to validate the generalizability of our adaptive priority reweighing method for accuracy and fairness measures (i.e., equal opportunity, equalized odds, and demographic parity) in tabular benchmarks. We also highlight the performance of our method in improving the fairness of language and vision models. The code is available at https://github.com/che2198/APW.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"21 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Linear discriminant analysis (LDA) is widely used for dimensionality reduction in supervised learning settings. The traditional LDA objective minimizes a ratio of squared Euclidean distances, which may not perform well on noisy datasets. Multiple robust LDA objectives have been proposed to address this problem, but their implementations have two major limitations. One is that their mean calculations use the squared \(\ell_{2}\)-norm distance to center the data, which is not valid when the objective depends on other distance functions. The other is that there is no generalized optimization algorithm for solving the different robust LDA objectives. In addition, most existing algorithms can only guarantee locally optimal, rather than globally optimal, solutions. In this paper, we review multiple robust loss functions and propose a new and generalized robust objective for LDA. Moreover, to better remove the mean value from the data, our objective centers the data in an optimal way through learning. As an important algorithmic contribution, we derive an efficient iterative algorithm to optimize the resulting non-smooth and non-convex objective function. We theoretically prove that our algorithm guarantees that both the objective and the solution sequences converge to globally optimal solutions at a sub-linear convergence rate. Comprehensive experimental evaluations demonstrate the effectiveness of the new method, which achieves significant improvements over competing methods.
{"title":"On Mean-Optimal Robust Linear Discriminant Analysis","authors":"Xiangyu Li, Hua Wang","doi":"10.1145/3665500","DOIUrl":"https://doi.org/10.1145/3665500","url":null,"abstract":"<p>Linear discriminant analysis (LDA) is widely used for dimensionality reduction under supervised learning settings. Traditional LDA objective aims to minimize the ratio of the squared Euclidean distances that may not perform optimally on noisy datasets. Multiple robust LDA objectives have been proposed to address this problem, but their implementations have two major limitations. One is that their mean calculations use the squared (ell_{2})-norm distance to center the data, which is not valid when the objective depends on other distance functions. The second problem is that there is no generalized optimization algorithm to solve different robust LDA objectives. In addition, most existing algorithms can only guarantee the solution to be locally optimal, rather than globally optimal. In this paper, we review multiple robust loss functions and propose a new and generalized robust objective for LDA. Besides, to better remove the mean value within data, our objective uses an optimal way to center the data through learning. As one important algorithmic contribution, we derive an efficient iterative algorithm to optimize the resulting non-smooth and non-convex objective function. We theoretically prove that our solution algorithm guarantees that both the objective and the solution sequences converge to globally optimal solutions at a sub-linear convergence rate. The results of comprehensive experimental evaluations demonstrate the effectiveness of our new method, achieving significant improvements compared to the other competing methods.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"56 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141153219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In multi-label classification on data streams, instances arriving in real time must be associated with multiple labels simultaneously. Various methods based on the k-Nearest Neighbors algorithm have been proposed for this task. However, these methods face limitations when dealing with imbalanced data streams, a problem that has received limited attention in existing work. To address this gap, this paper introduces the Imbalance-Robust Multi-Label Self-Adjusting kNN (IRMLSAkNN), designed to handle imbalanced multi-label data streams. IRMLSAkNN's strength lies in retaining relevant instances of imbalanced labels through a discarding mechanism that accounts for the per-label imbalance ratio. In addition, it evaluates subwindows with an imbalance-aware measure to discard older instances that are underperforming. We conducted statistical experiments on 32 benchmark data streams, evaluating IRMLSAkNN against eight multi-label classification algorithms using common accuracy-aware and imbalance-aware measures. The results demonstrate that IRMLSAkNN consistently outperforms these algorithms in terms of predictive capacity and time cost across various levels of imbalance.
{"title":"Imbalance-Robust Multi-Label Self-Adjusting kNN","authors":"Victor Gomes de Oliveira Martins Nicola, Karina Valdivia Delgado, Marcelo de Souza Lauretto","doi":"10.1145/3663575","DOIUrl":"https://doi.org/10.1145/3663575","url":null,"abstract":"<p>In the task of multi-label classification in data streams, instances arriving in real time need to be associated with multiple labels simultaneously. Various methods based on the k Nearest Neighbors algorithm have been proposed to address this task. However, these methods face limitations when dealing with imbalanced data streams, a problem that has received limited attention in existing works. To approach this gap, this paper introduces the Imbalance-Robust Multi-Label Self-Adjusting kNN (IRMLSAkNN), designed to tackle multi-label imbalanced data streams. IRMLSAkNN’s strength relies on maintaining relevant instances with imbalance labels by using a discarding mechanism that considers the imbalance ratio per label. On the other hand, it evaluates subwindows with an imbalance-aware measure to discard older instances that are lacking performance. We conducted statistical experiments on 32 benchmark data streams, evaluating IRMLSAkNN against eight multi-label classification algorithms using common accuracy-aware and imbalance-aware measures. The obtained results demonstrate that IRMLSAkNN consistently outperforms these algorithms in terms of predictive capacity and time cost across various levels of imbalance.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"65 1","pages":""},"PeriodicalIF":3.6,"publicationDate":"2024-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140929310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}