Although algorithms are increasingly used to guide real-world decision-making, their potential for propagating bias remains challenging to measure. A common approach for researchers and practitioners examining algorithms for unintended discriminatory biases is to assess group fairness, which compares outcomes across demographic features that are typically sensitive or protected, such as race, gender, or age. In practice, however, data representing these group attributes is often not collected, or may be unavailable due to policy, legal, or other constraints. As a result, practitioners often find themselves tasked with assessing fairness in the face of these missing features. In such cases, they can either forgo a bias audit, obtain the missing data directly, or impute it. Because obtaining additional data is often prohibitively expensive or raises privacy concerns, many practitioners attempt to impute missing data using proxies. Through a survey of the data used in the algorithmic fairness literature, which we make public to facilitate future research, we show that when available at all, most publicly available proxy sources take the form of summary tables, which contain only aggregate statistics about a population. Prior work has found that these proxies are not predictive enough on their own to accurately measure group fairness. Even proxy variables that are correlated with group attributes contain noise (i.e., they will predict attributes for a subset of the population effectively at random). Here, we outline a method for improving accuracy in measuring group fairness using summary tables. Specifically, we propose improving accuracy by focusing only on highly predictive values within proxy variables, and outline the conditions under which these proxies can estimate fairness disparities with high accuracy. We then show that a major disqualifying criterion—an association between the proxy and the outcome—can be controlled for using causal inference. Finally, we show that when proxy data is missing altogether, our approach is applicable to rule-based proxies constructed using subject-matter context applied to the original data alone. Crucially, we are able to extract information on group disparities from proxies that may have low discriminatory power at the population level. We illustrate our results through a variety of case studies with real and simulated data. In all, we present a viable method allowing the assessment of fairness in the face of missing data, with limited privacy implications and without needing to rely on complex, expensive, or proprietary data sources.
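The abstract above describes the approach only at a high level. As a rough, hypothetical illustration of one part of the idea (restricting a summary-table proxy to its highly predictive values before estimating a group disparity), the Python sketch below uses invented data, names, and thresholds; it is not the authors' method, and it omits the causal-inference adjustment the abstract mentions.

```python
# Minimal, hypothetical sketch of estimating a group-fairness gap from a
# summary-table proxy, keeping only "highly predictive" proxy values.
# Illustration of the general idea only, not the authors' method.
from collections import defaultdict

# Hypothetical summary table: P(group = "A" | proxy value), e.g. aggregate
# census-style statistics for a categorical proxy such as region.
summary_table = {"r1": 0.95, "r2": 0.90, "r3": 0.55, "r4": 0.08, "r5": 0.50}

# Hypothetical individual-level records: (proxy value, model decision in {0, 1}).
records = [("r1", 1), ("r1", 0), ("r2", 0), ("r3", 1),
           ("r4", 1), ("r4", 1), ("r5", 0), ("r2", 0)]

def estimated_parity_gap(records, summary_table, threshold=0.85):
    """Estimate the selection-rate gap between groups A and B using only
    proxy values that predict group membership with prob >= threshold."""
    outcomes = defaultdict(list)
    for proxy_value, decision in records:
        p_a = summary_table.get(proxy_value)
        if p_a is None:
            continue  # proxy value missing from the summary table
        if p_a >= threshold:        # highly predictive of group A
            outcomes["A"].append(decision)
        elif 1 - p_a >= threshold:  # highly predictive of group B
            outcomes["B"].append(decision)
        # otherwise: uninformative proxy value, excluded from the estimate
    rate = {g: sum(v) / len(v) for g, v in outcomes.items() if v}
    return rate.get("A", float("nan")) - rate.get("B", float("nan"))

print(estimated_parity_gap(records, summary_table))
```

Here the gap is computed only over records whose proxy value predicts group membership with at least 85% probability; both the threshold and the summary table are placeholders for this example.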
{"title":"Improving Group Fairness Assessments with Proxies","authors":"Emma Harvey, M. S. Lee, Jatinder Singh","doi":"10.1145/3677175","DOIUrl":"https://doi.org/10.1145/3677175","url":null,"abstract":"\u0000 Although algorithms are increasingly used to guide real-world decision-making, their potential for propagating bias remains challenging to measure. A common approach for researchers and practitioners examining algorithms for unintended discriminatory biases is to assess group fairness, which compares outcomes across typically sensitive or protected demographic features like race, gender, or age. In practice, however, data representing these group attributes is often not collected, or may be unavailable due to policy, legal, or other constraints. As a result, practitioners often find themselves tasked with assessing fairness in the face of these missing features. In such cases, they can either forgo a bias audit, obtain the missing data directly, or impute it. Because obtaining additional data is often prohibitively expensive or raises privacy concerns, many practitioners attempt to impute missing data using proxies. Through a survey of the data used in algorithmic fairness literature, which we make public to facilitate future research, we show that when available at all, most publicly available proxy sources are in the form of\u0000 summary tables\u0000 , which contain only aggregate statistics about a population. Prior work has found that these proxies are not predictive enough on their own to accurately measure group fairness. Even proxy variables that are correlated with group attributes also contain noise (i.e. will predict attributes for a subset of the population effectively at random).\u0000 \u0000 \u0000 Here, we outline a method for improving accuracy in measuring group fairness using summary tables. Specifically, we propose improving accuracy by focusing only on\u0000 highly predictive values\u0000 within proxy variables, and outline the conditions under which these proxies can estimate fairness disparities with high accuracy. We then show that a major disqualifying criterion—an association between the proxy and the outcome—can be controlled for using causal inference. Finally, we show that when proxy data is missing altogether, our approach is applicable to rule-based proxies constructed using subject-matter context applied to the original data alone. Crucially, we are able to extract information on group disparities from proxies that may have low discriminatory power at the population level. We illustrate our results through a variety of case studies with real and simulated data. In all, we present a viable method allowing the assessment of fairness in the face of missing data, with limited privacy implications and without needing to rely on complex, expensive, or proprietary data sources.\u0000","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":"6 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141807641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hilmy Hanif, Jorge Constantino, Marie-Theres Sekwenz, M. van Eeten, J. Ubacht, Ben Wagner, Yury Zhauniarovich
The AI Act represents a significant legislative effort by the European Union to govern the use of AI systems according to different risk-related classes, imposing different degrees of compliance obligations on users and providers of AI systems. However, it has been critiqued for being difficult for the general public to comprehend and for lacking effectiveness in classifying AI systems into the corresponding risk classes. To mitigate these shortcomings, we propose a Decision-Tree-based framework aimed at increasing legal compliance and classification clarity. Through a quantitative evaluation, we show that our framework is especially beneficial to individuals without a legal background, improving the accuracy and speed with which they classify AI systems according to the AI Act. The results of our qualitative study show that the framework is helpful to all participants, allowing them to justify intuitively made decisions and making the classification process clearer.
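The framework itself is not reproduced in the abstract. As a purely illustrative sketch of what a decision-tree-style walk toward an AI Act risk class can look like, the code below uses invented questions and an invented ordering; it does not reproduce the paper's decision tree or the Act's legal tests.

```python
# Hypothetical sketch of a decision-tree-style walk toward an AI Act risk
# class. The questions and their ordering are invented for illustration and
# do not reproduce the paper's framework or the Act's legal criteria.
def classify(answers: dict) -> str:
    """answers maps question ids to booleans supplied by the user."""
    if answers.get("prohibited_practice"):    # e.g. social scoring by public authorities
        return "unacceptable risk (prohibited)"
    if answers.get("annex_iii_use_case"):     # e.g. listed high-risk area such as recruitment
        return "high risk (conformity obligations apply)"
    if answers.get("interacts_with_people"):  # e.g. chatbots, deepfakes
        return "limited risk (transparency obligations apply)"
    return "minimal risk"

# Example walk through the tree for a hypothetical CV-screening tool.
print(classify({"prohibited_practice": False,
                "annex_iii_use_case": True,
                "interacts_with_people": True}))
```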
{"title":"Navigating the EU AI Act Maze using a Decision-Tree Approach","authors":"Hilmy Hanif, Jorge Constantino, Marie-Theres Sekwenz, M. van Eeten, J. Ubacht, Ben Wagner, Yury Zhauniarovich","doi":"10.1145/3677174","DOIUrl":"https://doi.org/10.1145/3677174","url":null,"abstract":"The AI Act represents a significant legislative effort by the European Union to govern the use of AI systems according to different risk-related classes, imposing different degrees of compliance obligations to users and providers of AI systems. However, it is often critiqued due to the lack of general public comprehension and effectiveness regarding the classification of AI systems to the corresponding risk classes. To mitigate these shortcomings, we propose a Decision-Tree-based framework aimed at increasing legal compliance and classification clarity. By performing a quantitative evaluation, we show that our framework is especially beneficial to individuals without a legal background, allowing them to enhance the accuracy and speed of AI system classification according to the AI Act. The qualitative study results show that the framework is helpful to all participants, allowing them to justify intuitively made decisions and making the classification process clearer.","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":"3 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141658642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Moving operations to the cloud has become a way of life for many educational institutions. Much of the information these institutions store in the cloud is protected by the Family Educational Rights and Privacy Act (FERPA), which was last amended in 2002, well before cloud computing became ubiquitous. The application of a 1974 law to 21st-century technology presents a plethora of legal and technical questions. In this article, we present an interdisciplinary analysis of these issues. We examine existing statutes and case law alongside contemporary research into cloud security, focusing on the impact of the latter on the former. We find that FERPA excludes information that students and faculty often believe is protected and that lower-court decisions have created further ambiguity. We additionally find that, given current technology, the statute is no longer sufficient to protect student data, and we present recommendations for revisions.
{"title":"This Is Going on Your Permanent Record: A Legal Analysis of Educational Data in the Cloud","authors":"Ben Cohen, Ashley Hu, Deisy Patino, Joel Coffman","doi":"10.1145/3675230","DOIUrl":"https://doi.org/10.1145/3675230","url":null,"abstract":"Moving operations to the cloud has become a way of life for many educational institutions. Much of the information these institutions store in the cloud is protected by Family Educational Rights and Privacy Act (FERPA), which was last amended in 2002, well before cloud computing became ubiquitous. The application of a 1974 law to 21st-century technology presents a plethora of legal and technical questions. In this article, we present an interdisciplinary analysis of these issues. We examine both existing statutes and case law and contemporary research into cloud security, focusing on the impact of the latter on the former. We find that FERPA excludes information that students and faculty often believe is protected and that lower-court decisions have created further ambiguity. We additionally find that given current technology, the statute is no longer sufficient to protect student data, and we present recommendations for revisions.","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":" 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141677081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joshua Krook, David M. Bossens, Peter D. Winter, John Downer, Shane Windsor
Drones, unmanned aircraft controlled remotely and equipped with cameras, have seen widespread deployment across military, industrial, and commercial domains. The commercial sector, in particular, has experienced rapid growth, outpacing regulatory developments due to substantial financial incentives. The UK construction sector exemplifies a case where the regulatory framework for drones remains unclear. This article investigates the state of UK legislation on commercial drone use in construction through a thematic analysis of peer-reviewed literature. Four main themes (opportunities, safety risks, privacy risks, and the regulatory context) were identified, along with twenty-one sub-themes such as noise and falling materials. The findings reveal a fragmented regulatory landscape combining byelaws, national laws, and EU regulations, creating uncertainty for businesses. Our study recommends the establishment of specific national guidelines for commercial drone use, addressing these uncertainties and building public trust, especially in anticipation of the integration of ‘autonomous’ drones. This research contributes to the responsible computing domain by uncovering regulatory gaps and issues in UK drone law, particularly within the often-overlooked context of the construction sector. The insights provided aim to inform future responsible computing practices and policy development in the evolving landscape of commercial drone technology.
{"title":"Mapping the complexity of legal challenges for trustworthy drones on construction sites in the United Kingdom","authors":"Joshua Krook, David M. Bossens, Peter D. Winter, John Downer, Shane Windsor","doi":"10.1145/3664617","DOIUrl":"https://doi.org/10.1145/3664617","url":null,"abstract":"Drones, unmanned aircraft controlled remotely and equipped with cameras, have seen widespread deployment across military, industrial, and commercial domains. The commercial sector, in particular, has experienced rapid growth, outpacing regulatory developments due to substantial financial incentives. The UK construction sector exemplifies a case where the regulatory framework for drones remains unclear. This article investigates the state of UK legislation on commercial drone use in construction through a thematic analysis of peer-reviewed literature. Four main themes, including opportunities, safety risks, privacy risks, and the regulatory context, were identified along with twenty-one sub-themes such as noise and falling materials. Findings reveal a fragmented regulatory landscape, combining byelaws, national laws, and EU regulations, creating business uncertainty. Our study recommends the establishment of specific national guidelines for commercial drone use, addressing uncertainties and building public trust, especially in anticipation of the integration of ‘autonomous’ drones. This research contributes to the responsible computing domain by uncovering regulatory gaps and issues in UK drone law, particularly within the often-overlooked context of the construction sector. The insights provided aim to inform future responsible computing practices and policy development in the evolving landscape of commercial drone technology.","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":"32 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140979441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bradley Butcher, Miri Zilka, Jiri Hron, Darren Cook, Adrian Weller
From science to law enforcement, many research questions are answerable only by poring over large volumes of unstructured text documents. While people can extract information from such documents with high accuracy, doing so is often too time-consuming to be practical. On the other hand, automated approaches produce nearly immediate results but are not reliable enough for applications where near-perfect precision is essential. Motivated by two use cases from criminal justice, we consider the benefits and drawbacks of various human-only, human-machine, and machine-only approaches. Finding no tool well suited to our use cases, we develop a human-in-the-loop method for fast but accurate extraction of structured data from unstructured text. The tool is based on automated extraction followed by human validation, and is particularly useful in cases where purely manual extraction is not practical. Testing on three criminal justice datasets, we find that the combination of computer speed and human understanding yields precision comparable to manual annotation while requiring only a fraction of the time, and significantly outperforms the precision of all fully automated baselines.
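The abstract describes the pipeline only as automated extraction followed by human validation. The sketch below is a minimal, hypothetical illustration of that pattern, not the authors' tool: an automatic extractor proposes candidates with a confidence score, and only low-confidence candidates are routed to a reviewer. The regular expression and the confidence rule are invented for the example.

```python
# Minimal, hypothetical sketch of "automated extraction followed by human
# validation": an automatic extractor proposes candidates, and only the
# candidates it is unsure about are routed to a person for review.
import re

def auto_extract(document: str):
    """Propose (value, confidence) pairs; here, dates via a toy regex."""
    candidates = []
    for match in re.finditer(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b", document):
        day = int(match.group(1))
        # Invented confidence rule: ambiguous day/month orderings score lower.
        confidence = 0.95 if day > 12 else 0.6
        candidates.append((match.group(0), confidence))
    return candidates

def human_validate(value: str) -> bool:
    """Stand-in for a manual review step (e.g. a reviewer UI)."""
    return input(f"Keep extracted value {value!r}? [y/n] ").strip().lower() == "y"

def extract_with_review(document: str, threshold: float = 0.9):
    accepted = []
    for value, confidence in auto_extract(document):
        if confidence >= threshold or human_validate(value):
            accepted.append(value)
    return accepted

print(extract_with_review("Sentenced on 14/03/2019; hearing on 03/04/2019."))
```

In this toy run, the unambiguous date is accepted automatically and the ambiguous one is sent to the reviewer, which is the division of labour the abstract describes.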
{"title":"Optimising Human-Machine Collaboration for Efficient High-Precision Information Extraction from Text Documents","authors":"Bradley Butcher, Miri Zilka, Jiri Hron, Darren Cook, Adrian Weller","doi":"10.1145/3652591","DOIUrl":"https://doi.org/10.1145/3652591","url":null,"abstract":"From science to law enforcement, many research questions are answerable only by poring over a large amount of unstructured text documents. While people can extract information from such documents with high accuracy, this is often too time-consuming to be practical. On the other hand, automated approaches produce nearly-immediate results, but are not reliable enough for applications where near-perfect precision is essential. Motivated by two use cases from criminal justice, we consider the benefits and drawbacks of various human-only, human-machine, and machine-only approaches. Finding no tool well suited for our use cases, we develop a human-in-the-loop method for fast but accurate extraction of structured data from unstructured text. The tool is based on automated extraction followed by human validation, and is particularly useful in cases where purely manual extraction is not practical. Testing on three criminal justice datasets, we find that the combination of the computer speed and human understanding yields precision comparable to manual annotation while requiring only a fraction of time, and significantly outperforms the precision of all fully automated baselines.","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":"86 24","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140377966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marc Cheong, Ehsan Abedin, Marinus Ferreira, Ritsaart Reimann, Shalom Chalson, Pamela Robinson, Joanne Byrne, Leah Ruppanner, Mark Alfano, Colin Klein
Generative artificial intelligence systems based on transformers, including both text generators like GPT-4 and image generators like DALL-E 3, have recently entered the popular consciousness. These tools, while impressive, are liable to reproduce, exacerbate, and reinforce extant human social biases, such as gender and racial biases. In this paper, we systematically review the extent to which DALL-E Mini suffers from this problem. In line with the Model Card published alongside DALL-E Mini by its creators, we find that the images it produces tend to represent dozens of different occupations as populated either solely by men (e.g., pilot, builder, plumber) or solely by women (e.g., hairdresser, receptionist, dietitian). In addition, the images DALL-E Mini produces tend to represent most occupations as populated primarily or solely by White people (e.g., farmer, painter, prison officer, software engineer) and very few as populated primarily by non-White people (e.g., pastor, rapper). These findings suggest that exciting new AI technologies should be critically scrutinized and perhaps regulated before they are unleashed on society.
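The counting procedure is not spelled out in the abstract. As a rough, hypothetical sketch of how such an audit can be tallied, the code below aggregates per-occupation labels (assumed to have been assigned to each generated image beforehand, e.g. by annotators) into representation shares; the occupations and labels shown are invented.

```python
# Hypothetical sketch of tallying representation in generated images.
# Assumes each image for an occupation prompt has already been labelled;
# the labels and occupations below are invented for illustration.
from collections import Counter

labelled_images = {
    "pilot":       ["man", "man", "man", "man"],
    "hairdresser": ["woman", "woman", "woman", "man"],
    "farmer":      ["man", "man", "woman", "man"],
}

def representation(labelled_images):
    """Return, per occupation, the share of images carrying each label."""
    shares = {}
    for occupation, labels in labelled_images.items():
        counts = Counter(labels)
        total = len(labels)
        shares[occupation] = {label: n / total for label, n in counts.items()}
    return shares

for occupation, share in representation(labelled_images).items():
    print(occupation, share)
```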
{"title":"Investigating gender and racial biases in DALL-E Mini Images","authors":"Marc Cheong, Ehsan Abedin, Marinus Ferreira, Ritsaart Reimann, Shalom Chalson, Pamela Robinson, Joanne Byrne, Leah Ruppanner, Mark Alfano, Colin Klein","doi":"10.1145/3649883","DOIUrl":"https://doi.org/10.1145/3649883","url":null,"abstract":"Generative artificial intelligence systems based on transformers, including both text-generators like GPT-4 and image generators like DALL-E 3, have recently entered the popular consciousness. These tools, while impressive, are liable to reproduce, exacerbate, and reinforce extant human social biases, such as gender and racial biases. In this paper, we systematically review the extent to which DALL-E Mini suffers from this problem. In line with the Model Card published alongside DALL-E Mini by its creators, we find that the images it produces tend to represent dozens of different occupations as populated either solely by men (e.g., pilot, builder, plumber) or solely by women (e.g., hairdresser, receptionist, dietitian). In addition, the images DALL-E Mini produces tend to represent most occupations as populated primarily or solely by White people (e.g., farmer, painter, prison officer, software engineer) and very few by non-White people (e.g., pastor, rapper). These findings suggest that exciting new AI technologies should be critically scrutinized and perhaps regulated before they are unleashed on society.","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":"37 24","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140086393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As AI-based systems become commonplace in our daily lives, they need to provide understandable information to their users about how they collect, process, and output information that concerns them. Such transparency practices have gained importance due to recent ethical guidelines and regulation, as well as research suggesting a positive relationship between the transparency of AI-based systems and users’ satisfaction. This paper provides a new tool for the design and study of transparency in AI-based systems that use personalization. The tool, called Transparency-Check, is based on a checklist of questions about transparency in four areas of a system: input (data collection), processing (algorithmic models), output (personalized recommendations), and user control (user feedback mechanisms to adjust elements of the system). Transparency-Check can be used by researchers, designers, and end users of computer systems. To demonstrate the usefulness of Transparency-Check from a researcher perspective, we collected the responses of 108 student participants who used the transparency checklist to rate five popular real-world systems (Amazon, Facebook, Netflix, Spotify, and YouTube). Based on users’ subjective evaluations, the systems showed low compliance with transparency standards, with some nuances across individual categories (specifically data collection, processing, and user control). We use these results to compile design recommendations for improving transparency in AI-based systems, such as integrating information about the system’s behavior during the user’s interactions with it.
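The checklist questions themselves are not listed in the abstract. The sketch below only illustrates the general shape of a four-area checklist with simple per-category scoring; the questions and the scoring rule are placeholders rather than the actual Transparency-Check instrument.

```python
# Hypothetical sketch of a four-area transparency checklist with simple
# per-category scoring. The questions and scoring rule are placeholders;
# the actual Transparency-Check instrument is defined in the paper.
checklist = {
    "input":        ["Does the system state what personal data it collects?"],
    "processing":   ["Does it explain how recommendations are computed?"],
    "output":       ["Does it explain why a specific item was recommended?"],
    "user control": ["Can users correct or delete the data used about them?"],
}

def score(checklist, responses):
    """responses[category] holds one boolean answer per checklist question."""
    return {category: sum(responses[category]) / len(questions)
            for category, questions in checklist.items()}

# Example: one hypothetical rater's answers for a system under review.
print(score(checklist, {
    "input":        [True],
    "processing":   [False],
    "output":       [True],
    "user control": [False],
}))
```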
{"title":"Transparency-Check: An Instrument for the Study and Design of Transparency in AI-based Personalization Systems","authors":"Laura Schelenz, Avi Segal, Oduma Adelio, K. Gal","doi":"10.1145/3636508","DOIUrl":"https://doi.org/10.1145/3636508","url":null,"abstract":"As AI-based systems become commonplace in our daily lives, they need to provide understandable information to their users about how they collect, process, and output information that concerns them. The importance of such transparency practices has gained significance due to recent ethical guidelines and regulation, as well as research suggesting a positive relationship between the transparency of AI-based systems and users’ satisfaction. This paper provides a new tool for the design and study of transparency in AI-based systems that use personalization. The tool, called Transparency-Check, is based on a checklist of questions about transparency in four areas of a system: input (data collection), processing (algorithmic models), output (personalized recommendations) and user control (user feedback mechanisms to adjust elements of the system). Transparency-Check can be used by researchers, designers, and end users of computer systems. To demonstrate the usefulness of Transparency-Check from a researcher perspective, we collected the responses of 108 student participants who used the transparency checklist to rate five popular real-world systems (Amazon, Facebook, Netflix, Spotify, and YouTube). Based on users’ subjective evaluations, the systems showed low compliance with transparency standards, with some nuances about individual categories (specifically data collection, processing, user control). We use these results to compile design recommendations for improving transparency in AI-based systems, such as integrating information about the system’s behavior during the user’s interactions with it.","PeriodicalId":486991,"journal":{"name":"ACM Journal on Responsible Computing","volume":"6 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138587542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}