Is there a right way round?

H. MacGillivray

Teaching Statistics, 11 August 2021. DOI: 10.1111/test.12288

At the recent OZCOTS (Australian Conference on Teaching Statistics), https://anzsc2021.com.au/ozcots-conference/, Rob Gould's keynote, titled "Data Education in pre-College: promises and challenges", attracted a question from Matthew Parry, University of Otago, as to whether the scenario of "… here's a bunch of data, come up with questions …" is a reversal of much previous advocacy to source or collect data to investigate identified issues. Rob's reply, and his discussion in his 2021 paper "Towards data-scientific thinking" [1], include the comments that, whatever codification is used for the statistical investigation cycle, now often called the data cycle or the learning-from-data cycle, "… it is expected that investigators will 'skip around' to some extent", and that the order is not strict. This can be seen by examining a variety of statistical and data investigations in real and complex contexts, whether in research or applications. In References [1,3], both Rob Gould and Andee Rubin take "consider data" to include all aspects of the assembly of data, whether the data are assembled through sourcing, searching, collating or collecting, or are already available. They, and other authors, comment that the deluge of data means that students, and indeed investigators, increasingly consider or access data already collected. Technological advances also enable greater and more ready access to collected data, and support the wrangling necessary to handle such data. These in turn open up many possibilities for students to explore civic issues, including the critiquing of data, with the associated vital learning about data quality and the inherent dangers in uncritical algorithmic approaches. Rob also commented that students seem to find it difficult to identify what statistical questions can be posed for an existing dataset.

It is interesting to consider that today's data deluges require a return to greater emphasis on the questions of "what, when, how, why, who?" In previous eras, when instructors had no choice but to provide data and their context to students, these questions were of paramount importance in authentic statistical learning. For those in workplaces, not being able to find answers to such data-querying questions prevented the critiquing of reports, the building on previous data investigations, and the redoing of analyses. As access to technology increased, enabling students to explore and analyse data beyond the restrictions of simplistic pocket calculators, students were able to design, collect, observe or source their own data to investigate issues involving a number of variables of interest to them. This could also introduce another question of great practical importance in many disciplines and workplaces, namely: can we measure what we want to measure? Including the information on the "what, when, why, how" in their reporting of data investigations was, and is, excellent grounding for their future work, whether in industry, business or research. Hence we see that greater technological capabilities open greater possibilities in authentic student learning of data investigations, whether in accessing and using data collected by others or in collecting data themselves. The order of identifying issues and sourcing data may be reversed or, as is often the case, reiterated, but the core questions and reporting of "what, when, how, why, who?" and "can we measure what we want?" are as important as ever, along with critiquing and understanding issues of data quality. These apply to both statistics and data science, illustrating again their common crux in data investigations. However, there is one order in teaching statistics and data science which is not appropriate, namely: here is a tool, now find or obtain some data to use it on.

This order leads to forcing data into tools, to neglecting assumptions and their evaluation, and to the over-emphasis on a single question and a single answer which so dominates and inhibits early statistics teaching and contradicts statistical thinking. Many years ago as an undergraduate, I was both bemused and amused by how my medical-student friends could force every dataset and every question into a chi-square test, because their undergraduate program had at that stage included only a few weeks' introduction to statistics, in which they had seen only this tool and how to use it, but not where, when, on what or why. It is also the approach which bedevils users in other disciplines, with misuse of multiple simplistic procedures, especially the ubiquitous t-test, treatment of numerical codes as if they had numerical meaning, and misuse of assumptions or diagnostics. This type of approach arises from the theory-then-example mindset which can appear in any science, including mathematics and computer science. This is not to negate the importance of theory, which underpins, unifies, supports and validates methods, procedures and tools, and indeed provides the assumptions that are essential to emphasize in teaching statistics and data science. But I can hear the questioning: surely one has to introduce simple tools first and illustrate with simple …
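The remark above about numerical codes being treated as if they had numerical meaning can be made concrete with a minimal sketch; the coding scheme and survey responses below are invented purely for illustration:

```python
from collections import Counter
import statistics

# Hypothetical survey responses coded 1 = agree, 2 = neutral, 3 = disagree.
# The codes are category labels, not measurements on a numerical scale.
responses = [1, 1, 3, 2, 3, 3, 1]

# Arithmetically valid, but statistically meaningless: an "average opinion"
mean_code = statistics.mean(responses)
print(mean_code)  # 2.0, yet "neutral" is not a quantity halfway between agree and disagree

# The appropriate summary for a nominal variable is a frequency table.
print(Counter(responses))  # Counter({1: 3, 3: 3, 2: 1})
```

The software happily computes the mean either way, which is exactly the danger: nothing in the tool signals that the first summary answers no meaningful question about the data.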