
Journal of Statistics and Data Science Education: Latest Articles

Student-Developed Shiny Applications for Teaching Statistics
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-09-02 DOI: 10.1080/26939169.2021.1995545
Sabrina Luxin Wang, A. Y. Zhang, Samuel Messer, A. Wiesner, Dennis K. Pearl
Abstract This article describes a suite of student-created Shiny apps for teaching statistics and a field test of their short-term effectiveness. To date, more than 50 Shiny apps and a growing collection of associated lesson plans, designed to enrich the teaching of both introductory and upper division statistics courses, have been developed. The apps are available for free use and their open source code can be adapted as desired. We report on the experimental testing of four of these Shiny apps to examine short-term learning outcomes in an introductory statistical concepts course.
Journal of Statistics and Data Science Education, vol. 29, pp. 218-227.
Citations: 9
How to Get Away With Statistics: Gamification of Multivariate Statistics
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-09-02 DOI: 10.1080/26939169.2021.1997128
Jacopo Di Iorio, S. Vantini
Abstract In this article, we discuss our attempt to teach applied statistics techniques typically taught in advanced courses, such as clustering and principal component analysis, to a non-mathematical educated audience. Considering the negative attitude and inclination toward mathematical disciplines of our students we introduce them to our topics using four different games. The four games are all user-centric, score-based arcade experiences intended to be played under the supervision of an instructor. They are developed using the Shiny web-based application framework for R. In every activity students have to follow the instructions and to interact with plots to minimize a score with a statistical meaning. No other knowledge than elementary geometry and Euclidean distance is required to complete the tasks. Results from a student questionnaire give us some confidence that the experience has benefited students, not only in terms of their ability to understand and use the explained methods but also regarding their confidence and overall satisfaction with the course. This fact suggests that these or similar activities could greatly improve the diffusion of statistical thinking at different levels of education.
Journal of Statistics and Data Science Education, vol. 29, pp. 241-250.
Citations: 6
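The gamification abstract above says students interact with plots "to minimize a score with a statistical meaning" using only Euclidean distance. The abstract does not say which score the games use. As one plausible illustration for the clustering case, the sketch below computes the within-cluster sum of squared Euclidean distances, which is lower for tighter groupings. The function name `within_cluster_score` and the toy points are hypothetical and are not taken from the article.

```python
import math

def within_cluster_score(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    score = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        score += sum(math.dist(m, centroid) ** 2 for m in members)
    return score

# Toy points invented for illustration: two obvious groups along the x-axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = within_cluster_score(points, [0, 0, 1, 1])  # groups nearby points
bad = within_cluster_score(points, [0, 1, 0, 1])   # groups distant points
print(good, bad)
```

A player who regroups the points from the second assignment to the first would see the score drop, which is the kind of feedback loop the abstract describes.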
Trends in Teaching Advanced Placement Statistics: Results from a National Survey
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-09-02 DOI: 10.1080/26939169.2021.1965509
Hollylynne S. Lee, Taylor Harrison
Abstract This study provides a glimpse into the professional learning, beliefs, and practices of high school teachers of Advanced Placement (AP) Statistics. Data are from a survey of 445 AP Statistics teachers in late 2018. Results indicate many AP Statistics teachers have taken several statistics courses and engage in professional development related to statistics sponsored by the College Board (summer institutes, exam readings, and online community). They generally do not engage with resources developed by the American Statistical Association and the statistics education community. While AP statistics teachers structure class time with student–student interaction and use student-centered activities, they generally do not use statistics-specific technology tools and rarely engage students with datasets larger than 100 cases or with multiple variables. Teachers’ beliefs about teaching statistics do not always reflect their teaching practices. Personal time to improve, time with students (especially those on a blocked semester schedule), structure of curriculum and exam schedule, and lack of access to technology often prevent teachers from making changes to their practices. Findings call for targeted efforts to reach high school statistics teachers, engage them more in the statistics education community, and encourage curriculum and instructional approaches that more closely align with recommendations and trends in college-level introductory statistics.
Journal of Statistics and Data Science Education, vol. 29, pp. 317-327.
Citations: 4
Enhancement of the Command-Line Environment for use in the Introductory Statistics Course and Beyond
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-09-02 DOI: 10.1080/26939169.2021.1999871
D. Gerbing
ABSTRACT R and Python are commonly used software languages for data analytics. Using these languages as the course software for the introductory course gives students practical skills for applying statistical concepts to data analysis. However, the reliance upon the command line is perceived by the typical nontechnical introductory student as sufficiently esoteric that its use detracts from the teaching of statistical concepts and data analysis. An R package was developed based on the successive feedback of hundreds of introductory statistics students over multiple years to provide a set of functions that apply basic statistical principles with command-line R. The package offers gentler error checking and many visualizations and analytics, successfully serving as the course software for teaching and homework. This software includes pedagogical functions, data analytic functions for a variety of analyses, and the foundation for access to the entire R ecosystem and, by extension, any command-line environment.
Journal of Statistics and Data Science Education, vol. 29, pp. 251-266.
Citations: 5
Do Students Learn More from Erroneous Code? Exploring Student Performance and Satisfaction in an Error-Free Versus an Error-full SAS® Programming Environment
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-09-02 DOI: 10.1080/26939169.2021.1967229
H. Hoffman, Angelo F. Elmi
Abstract Teaching students statistical programming languages while simultaneously teaching them how to debug erroneous code is challenging. The traditional programming course focuses on error-free learning in class while students’ experiences outside of class typically involve error-full learning. While error-free teaching consists of focused lectures emphasizing correct coding, error-full teaching would follow such lectures with debugging sessions. We aimed to explore these two approaches by conducting a pilot study of 18 graduate students who voluntarily attended a SAS programming seminar held weekly from September 2018 through November 2018. Each seminar had a 10-min error-free lecture, 15-min programming assignment, 5-min break, 10-min error-full lecture, and 15-min programming assignment. We examined student performance and preference. While four students successfully completed both assignments and ten students did not successfully complete either assignment, one student successfully completed only the first assignment that directly followed the error-free lecture and three students successfully completed only the second assignment that directly followed the error-full lecture. Of the 15 students who responded, twelve (80%) preferred error-full to error-free learning. We will evaluate error-full learning on a larger scale in an introductory SAS course. Supplemental files are available online for this article.
Journal of Statistics and Data Science Education, vol. 29, pp. 228-240.
Citations: 1
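The counts reported in the SAS pilot-study abstract above can be tallied to compare completion after each lecture style. The sketch below uses only the numbers stated in the abstract. The dictionary keys and variable names are hypothetical labels chosen for illustration, and the derived rates are simple arithmetic, not results reported by the authors.

```python
# Outcome counts as stated in the abstract (18 students in total).
outcomes = {
    "both": 4,          # completed both assignments
    "neither": 10,      # completed neither assignment
    "first_only": 1,    # completed only the assignment after the error-free lecture
    "second_only": 3,   # completed only the assignment after the error-full lecture
}

total = sum(outcomes.values())

# Students who completed the assignment following each lecture style.
after_error_free = outcomes["both"] + outcomes["first_only"]
after_error_full = outcomes["both"] + outcomes["second_only"]

# Preference: 12 of the 15 respondents preferred error-full learning.
preferred_share = 12 / 15

print(f"total students: {total}")
print(f"completed after error-free lecture: {after_error_free}/{total}")
print(f"completed after error-full lecture: {after_error_full}/{total}")
print(f"preferred error-full: {preferred_share:.0%}")
```

With a sample this small the difference between the two completion counts is only suggestive, which is consistent with the authors describing the work as a pilot study.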
Note from the Editor
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-08-06 DOI: 10.1080/26939169.2021.1959224
J. Witmer
Journal of Statistics and Data Science Education, vol. 29, p. 155.
Citations: 0
A Letter to the Journal of Statistics and Data Science Education — A Call for Review of “OkCupid Data for Introductory Statistics and Data Science Courses” by Albert Y. Kim and Adriana Escobedo-Land
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-08-06 DOI: 10.1080/26939169.2021.1930812
Tiffany Xiao, Yifan Ma
As Big Data continues to rise in popularity, so does an increased need for protection against potential misuses of data. We are a group of undergraduate Statistical and Data Science major students from Smith College that are actively engaged in ethical discussions concerning the use of data in our society. It can be challenging to predict future trends and technologies in data science that could cause concerns. However, we believe that some essential protections and procedures should be in place to help prevent misuses of data. In particular, we are writing to you to address our concerns with the article “OkCupid Data for Introductory Statistics and Data Science Courses” by Albert Y. Kim and Adriana Escobedo-Land that was published in your journal (Kim and Escobedo-Land 2015). In light of ethical concerns surrounding the article, herein we describe the background of how the dataset was found to contain identifiable information. We communicated this to the authors, who correspondingly corrected the article. In our opinion, there is no doubt that the dataset presented in the article holds pedagogical value as well as research value. One aspect of the educational value of the dataset is the fact that the context of possible analysis could better drive students’ interests. The research value of the data lies within the self-reported nature of the dataset, which usually is the private property of corporations and could be hard to obtain for researchers in universities. Another context in which the pedagogical value of the dataset remains is where students could use this as a case study in discussions of the ethical implications of such data, even practicing anonymization skills with the data. However, we do believe that for the dataset to be used for pedagogical purposes, further anonymizations to the dataset were necessary. 
Some ways that datasets like this one could be better anonymized in the future include removing unimportant variables that have identification power disproportionate to their value to research. For example, in the case of the OkCupid dataset associated with the paper, the time the data was collected could be removed, since this fact is not particularly essential but can be used for identification. Other sources of concern for this dataset are the variables that reveal geographical and temporal information on individuals. Another method could
Journal of Statistics and Data Science Education, vol. 29, pp. 214-215.
Citations: 2
Interdisciplinary Approaches and Strategies from Research Reproducibility 2020: Educating for Reproducibility
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-07-01 DOI: 10.1080/26939169.2022.2104767
M. Rethlefsen, H. Norton, Sarah L. Meyer, Katherine A. MacWilkinson, Plato L. Smith II, Haoyang Ye
Abstract Research Reproducibility: Educating for Reproducibility, Pathways to Research Integrity was an interdisciplinary conference hosted virtually by the University of Florida in December 2020. This event brought together educators, researchers, students, policy makers, and industry representatives from across the globe to explore best practices, innovations, and new ideas for education around reproducibility and replicability. Emphasizing a broad view of rigor and reproducibility, the conference touched on many aspects of introducing learners to transparency, rigorous study design, data science, data management, replications, and more. Transdisciplinary themes emerged from the panels, keynote, and submitted papers and poster presentations. The identified themes included lifelong learning, cultivating bottom-up change, “sneaking in” learning, just-in-time learning, targeting learners by career stage, learning by doing, learning how to learn, establishing communities of practice, librarians as interdisciplinary leaders, teamwork skills, rewards and incentives, and implementing top-down change. For each of these themes, we share ideas, practices, and actions as discussed by the conference speakers and attendees.
Journal of Statistics and Data Science Education, vol. 30, pp. 219-227.
Citations: 1
Building a Multiple Linear Regression Model With LEGO Brick Data
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-06-25 DOI: 10.1080/26939169.2021.1946450
Anna D. Peterson, Laura E. Ziegler
Abstract We present an innovative activity that uses data about LEGO sets to help students self-discover multiple linear regressions. Students are guided to predict the price of a LEGO set posted on Amazon.com (Amazon price) using LEGO characteristics such as the number of pieces, the theme (i.e., product line), and the general size of the pieces. By starting with graphical displays and simple linear regression, students are able to develop additive multiple linear regression models as well as interaction models to accomplish the task. We provide examples of student responses to the activity and suggestions for teachers based on our experiences. Supplementary materials for this article are available online.
Journal: Journal of Statistics and Data Science Education Volume: 29 1 Pages: 297 - 303
Citations: 5
Might Temporal Logic Improve the Specification of Directed Acyclic Graphs (DAGs)?
IF 1.7 Q2 EDUCATION, SCIENTIFIC DISCIPLINES Pub Date : 2021-06-02 DOI: 10.1080/26939169.2021.1936311
G. Ellison
Abstract Temporality-driven covariate classification had limited impact on: the specification of directed acyclic graphs (DAGs) by 85 novice analysts (medical undergraduates); or the risk of bias in DAG-informed multivariable models designed to generate causal inference from observational data. Only 71 students (83.5%) managed to complete the “Temporality-driven Covariate Classification” task, and fewer still completed the “DAG Specification” task (77.6%) or both tasks in succession (68.2%). Most students who completed the first task misclassified at least one covariate (84.5%), and misclassification rates were even higher among students who specified a DAG (92.4%). Nonetheless, across the 512 and 517 covariates considered by each of these tasks, “confounders” were far less likely to be misclassified (11/252, 4.4% and 8/261, 3.1%) than “mediators” (70/123, 56.9% and 56/115, 48.7%) or “competing exposures” (93/137, 67.9% and 86/138, 62.3%), respectively. Since estimates of total causal effects are biased in multivariable models that: fail to adjust for “confounders”; or adjust for “mediators” (or “consequences of the outcome”) misclassified as “confounders” or “competing exposures,” a substantial proportion of any models informed by the present study’s DAGs would have generated biased estimates of total causal effects (50/66, 76.8%); and this would have only been slightly lower for models informed by temporality-driven covariate classification alone (47/71, 66.2%). Supplementary materials for this article are available online.
Journal: Journal of Statistics and Data Science Education Volume: 29 1 Pages: 202 - 213
Citations: 1