Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1979806
C. Andersen, U. Huynh, Andrés Ochoa Toasa, C. Wells, M. Wong
Previous major infection outbreaks have shown the importance of timely information about local conditions in guiding support and health care interventions. During the COVID-19 pandemic, UNICEF developed the Community Rapid Assessment (CRA) method to address this need. The CRA uses cell phone technology and a questionnaire based on an advanced behavioral model. The success of this kind of instrument depends on a variety of statistical issues such as whether the samples are representative of the population, the detailed design of questions, the quality of responses, and the choice of methods for inferential analysis. The purpose of this article is to describe the CRA method and lessons learned from a preliminary inferential analysis.
{"title":"Lessons from Applying the Community Rapid Assessment Method to COVID-19 Protective Measures in Three Countries","authors":"C. Andersen, U. Huynh, Andrés Ochoa Toasa, C. Wells, M. Wong","doi":"10.1080/09332480.2021.1979806","DOIUrl":"https://doi.org/10.1080/09332480.2021.1979806","journal":{"name":"Chance (New York, N.Y.)","pages":"6 - 12"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1981052
Preetam Debasish Saha Roy, Sangeeta Jayadevan
In this article, we discuss the attempt to synthesize disparate sources of information. The non-profit organizations India Excellence Forum (IEF) and Statistics Without Borders (SWB) collaborated to develop a platform that would aid decision making for different stakeholders. The goal was to leverage pre-existing infectious disease models and COVID-19-related open data to provide relevant monitoring metrics at different levels of granularity, such as states, districts, cities, and wards.
{"title":"COVID Monitoring Framework for Indian Cities","authors":"Preetam Debasish Saha Roy, Sangeeta Jayadevan","doi":"10.1080/09332480.2021.1981052","DOIUrl":"https://doi.org/10.1080/09332480.2021.1981052","journal":{"name":"Chance (New York, N.Y.)","pages":"W73 - W81"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1979820
M. Czapski, S. Godfrey, Joshua Derenski, Isaac Khader
While the World Health Organization was able to officially declare the spread of COVID-19 a global pandemic late in Q1 2020, the most effective responses from both governmental and private organizations were by no means clear. Very little was known about what was then frequently referred to as the novel coronavirus, and medical professionals had few recommendations specific to this disease. Still, what was abundantly clear was that stay-at-home and lockdown orders were needed to bend the curve or slow transmission. As customers sheltered in place and businesses closed their doors, the impact on small businesses was expected to be devastating. With so many sources of potential aid available from U.S. governments and from private and philanthropic entities, C2CB, aided by SWB, focused on helping small businesses identify relevant aid resources. SWB, consulting with C2CB, built a multistage data pipeline using machine learning techniques to automatically curate a national list of small-business aid programs, presenting users with results so they can efficiently research and find relevant aid programs. While this project curates business relief grants, it is a proof of concept for a no-cost data pipeline using machine learning techniques with automated website relevancy classification.
{"title":"A Machine Learning Approach to Helping Small Businesses Find Pandemic Economic-Impact Relief","authors":"M. Czapski, S. Godfrey, Joshua Derenski, Isaac Khader","doi":"10.1080/09332480.2021.1979820","DOIUrl":"https://doi.org/10.1080/09332480.2021.1979820","journal":{"name":"Chance (New York, N.Y.)","pages":"61 - 68"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1979804
Amanda Peterson-Plunkett
{"title":"Editor’s Letter","authors":"Amanda Peterson-Plunkett","doi":"10.1080/09332480.2021.1979804","DOIUrl":"https://doi.org/10.1080/09332480.2021.1979804","journal":{"name":"Chance (New York, N.Y.)","pages":"3 - 3"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1979805
C. Augustin, M. Brems, Davina Durgana
{"title":"Special Issue on Statistics and Data Science for Good","authors":"C. Augustin, M. Brems, Davina Durgana","doi":"10.1080/09332480.2021.1979805","DOIUrl":"https://doi.org/10.1080/09332480.2021.1979805","journal":{"name":"Chance (New York, N.Y.)","pages":"4 - 5"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1979821
Maria Tackett, Kendra S. Burbank, Judith E. Canner, Mine Çetinkaya-Rundel
In this column, we describe two courses that focus on the role of statistics in understanding social issues. The first is an introductory statistics course developed by Dr. Kendra Burbank at the University of Chicago, where students learn statistical methods as they explore different aspects of the water crisis in Flint, Michigan. Then, we describe an intermediate-level service-learning course taught by Dr. Judith Canner and Dr. Alana Unfried at California State University, Monterey Bay, where students use statistics to consult with local nonprofits and learn how to take a data-driven approach to effect change. We conclude with resources for instructors interested in incorporating social issues in their courses.
{"title":"Teaching Courses Focused on Social Good","authors":"Maria Tackett, Kendra S. Burbank, Judith E. Canner, Mine Çetinkaya-Rundel","doi":"10.1080/09332480.2021.1979821","DOIUrl":"https://doi.org/10.1080/09332480.2021.1979821","journal":{"name":"Chance (New York, N.Y.)","pages":"69 - 72"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1981055
Benjamin Kinsella
Broadly understood as the Data for Good (D4G) movement, coordinated efforts between technologists, domain experts, and mission-driven organizations are using data science and AI applications to address some of the world’s most pressing challenges. One prominent mode of D4G engagement is skills-based volunteering, such as the work conducted by DataKind, a global nonprofit that pairs pro bono technologists with social sector organizations. This article examines D4G volunteering, reporting on findings from DataKind’s global volunteer survey that explores the community’s characteristics, motivations, and even the limitations that hinder long-term project engagement. As a global data collection and assessment effort of a self-identified D4G community, this study informs the broader community of practice and collaboration opportunities, which seek to advance a more equitable and ethical D4G ecosystem.
{"title":"Data Science for Social Good Volunteer Motivations and Limitations: An Exploratory Survey","authors":"Benjamin Kinsella","doi":"10.1080/09332480.2021.1981055","DOIUrl":"https://doi.org/10.1080/09332480.2021.1981055","journal":{"name":"Chance (New York, N.Y.)","pages":"W86 - W95"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-07-03 DOI: 10.1080/09332480.2021.1979810
Eric A. Vance, Kim Love
Data-driven decision making for sustainable development requires domain expertise to ask the right questions; high-quality, relevant data; appropriate, nuanced statistical analyses; and the power to make and implement a decision. Statistics enables and accelerates all of these aspects. We propose a new model for building statistics and data science capacity to engage in data-driven development. Statisticians and data scientists must be able to understand the data and projects they are working with on both a deep and broad level and be able to communicate the results of statistical methods and analytical work in ways that provide actionable evidence to those who can use it to positively impact society. Our model for building statistics and data science capacity is to create statistics and data science collaboration laboratories (“stat labs”) that work in the intersections of data-driven development by collaborating with data producers and data decision makers to transform evidence into action. We present lessons learned from the LISA 2020 Network, which has leveraged the collective experiences of more than 30 newly created stat labs in developing countries to build such statistics and data science capacity by focusing on the intersections of data-driven development.
{"title":"Building Statistics and Data Science Capacity for Development","authors":"Eric A. Vance, Kim Love","doi":"10.1080/09332480.2021.1979810","DOIUrl":"https://doi.org/10.1080/09332480.2021.1979810","journal":{"name":"Chance (New York, N.Y.)","pages":"38 - 46"},"publicationDate":"2021-07-03","publicationTypes":"Journal Article","isOpenAccess":false}
practical, and more effectively promoted, than their Bayesian counterparts. I am also not certain about the extent to which the replication crisis is due to the use of the p-value. Although it is certainly a contributing factor, I personally believe the root cause is that the data analyst is usually fully invested in the outcome, and has every incentive imaginable to obtain the most flattering result. All in all, I believe a book as belligerent and instantly controversial as Bernoulli’s Fallacy deserves an appendix in which some of the claims are debated with other statisticians (cf. Berger & Wolpert, 1988). My final misgiving is that the author occasionally gives Ed Jaynes a little too much credit. For instance, the notion that all probability statements are conditional on prior knowledge is found in Keynes (1921), and both Jeffreys and Lindley consistently conditioned on “H.” (for “history”) or “K” (for “knowledge”) before Jaynes. Similarly, the idea that Bayesian inference is a logic of partial beliefs predates Jaynes; it goes back at least to De Morgan (1847/2003), Ramsey (1926), and de Finetti (1974). Despite these minor misgivings, this book comes highly recommended. Bernoulli’s Fallacy elegantly connects the past to the present in an attempt to dismantle the reigning statistical orthodoxy. Buy this book, and give it to your students so they may learn about Bayesian inference and the history of statistics; give it to your colleagues working in the empirical sciences so they will understand that the frequentist emperor is scantily dressed; give it to your frequentist friends as a provocation. Or read it yourself, so you will be prompted to think more deeply about the foundations of statistical inference.
{"title":"A History of Data Visualization and Graphic Communication","authors":"Leland Wilkinson","doi":"10.4159/9780674259034","DOIUrl":"https://doi.org/10.4159/9780674259034","journal":{"name":"Chance (New York, N.Y.)","pages":"38 - 40"},"publicationDate":"2021-06-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2021-04-03 DOI: 10.1080/09332480.2021.1915032
C. Parkey
Accountability for misuse of data is a big question in using data science and machine learning (ML) to advance society. Are the data collectors, model builders, or users ultimately accountable? The benefits of data sharing are widely recognized by the scientific community, but headlines can also be seen in the news about models that are released with known bias or without any impact monitoring and reporting in place. Examples include “Florida scientist says she was fired for not manipulating COVID-19 Data” and “Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.” after a paper by Timnit Gebru that highlighted the risk of large language models was accepted. Organizations such as the World Health Organization (WHO) have pages of policies qualifying how data were collected, the limitations, and restrictions on use. At the same time, whistleblowers and researchers alike are pushing back, attempting to hold companies and states accountable for their misuse of data. While there is no clear answer, the question of accountability at multiple levels can be explored, as well as how to begin implementing systems of accountability now instead of waiting for regulations to provide guidance.
{"title":"Who is Accountable for Data Bias?","authors":"C. Parkey","doi":"10.1080/09332480.2021.1915032","DOIUrl":"https://doi.org/10.1080/09332480.2021.1915032","journal":{"name":"Chance (New York, N.Y.)","pages":"39 - 43"},"publicationDate":"2021-04-03","publicationTypes":"Journal Article","isOpenAccess":false}