{"title":"Comments From the Editor: How Big Are Your P Values?","authors":"B. Silvey","doi":"10.1177/87551233211043843","DOIUrl":null,"url":null,"abstract":"For researchers who conduct quantitative analyses that involve statistical software such as SPSS or R, nothing is more perilous than hitting the execute button and waiting for those p values to appear on the screen. In the case of p ≤ .05, there is usually great rejoicing and a satisfaction that achieving statistical significance means that your study may result in a publishable article. However, if the output shows p ≥ .05, the need for a stiff drink and psychotherapy develops suddenly. If only the number had been different! All of that hard work for nothing! Due to the distorted nature of this easy and lazy categorization that places research findings into those that matter and those that do not, it should come as no surprise that statisticians disagree on the importance and use of p values (Wasserstein & Lazar, 2016). Indeed, many researchers and data scientists have called for the retirement of statistical significance altogether (Amrhein et al., 2019). Although a complete description and discussion of the debate surrounding significance testing is beyond the scope of these comments, there are many resources available for those individuals who like to fall asleep early (c.f., Brereton, 2020; Kennedy-Shaffer, 2019; Vidgen & Yasseri, 2016). Even though the mission of Update is to present “findings of individual studies without research terminology or jargon” (Update, n.d.), we often include quantitative studies that have varying degrees of statistical mumbo jumbo. (I won’t make any excuses, though, other than maybe I should do a better job as Editor.) Because you have woken from the slumber induced by reading the exhaustive list of the strengths and weaknesses of null hypothesis significance testing cited previously, I thought it would be better to present some ways that researchers are attempting to move beyond the p value. Many of our readers, in addition to be excellent practitioners, endeavor to consume even more sophisticated quantitative research, so knowing more about what is happening within the social sciences and other music education research journals could prove beneficial. In a terrific article by Resnick (2017), the case for and against redefining statistical significance is debated. He claims that there are more nuanced ways to move science forward, and asserts that researchers should consider several things when reporting their data. One consideration is to include effect sizes. For those unaware of effect sizes, they are a quantitative measure of the magnitude of an experimental effect, and are reported alongside p values. Rather than only report whether there was a statistical difference, researchers should include effect sizes to contextualize the importance and practicality of their findings. Depending on the type of statistical test that was computed, you might find Cohen’s d, Hedges g, or partial eta squared (η2) hanging out behind that p value. In other words, just because something is statistically significant does not mean that it has any real importance. Researchers can also provide additional statistical information and a bit more wiggle room when reporting findings through the use of confidence intervals (CIs). These intervals are a “range of values around that statistic that are believed to contain, with a certain probability, the real value of that statistic” (Field, 2009, p. 783). 
Most researchers use a CI value of 95%. This is a neat way to indicate a range of values—from using a lower and an upper limit—that you are 95% confident will include your sample statistic. You might see something like this after the p value: p = .01, 95% CI [1.2, 2.5]. There are also matters that do not involve statistics that can help alleviate the anxiety induced by p values. The first consideration is that researchers should contextualize the results of a study by whether the findings are novel or have been replicated. If no one has ever studied this problem, the associated data, or conducted a similar methodology, it stands to reason that regardless of the statistical findings, the research community should exercise discretion in extrapolating the results. However, if the study is a replication of previous research or part of a continued line of study, we have greater reason to accept the underlying implications of those studies. Finally, there has been a push to make data from studies free and accessible online. This includes both quantitative and qualitative data. If researchers are willing to make their data accessible for 1043843 UPDXXX10.1177/87551233211043843Update: Applications of Research in Music EducationSilvey research-article2021","PeriodicalId":75281,"journal":{"name":"Update (Music Educators National Conference (U.S.))","volume":"40 1","pages":"3 - 4"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Update (Music Educators National Conference (U.S.))","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1177/87551233211043843","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
For researchers who conduct quantitative analyses that involve statistical software such as SPSS or R, nothing is more perilous than hitting the execute button and waiting for those p values to appear on the screen. In the case of p ≤ .05, there is usually great rejoicing and a satisfaction that achieving statistical significance means that your study may result in a publishable article. However, if the output shows p > .05, the need for a stiff drink and psychotherapy develops suddenly. If only the number had been different! All of that hard work for nothing!

Given the distorted nature of this easy and lazy categorization that places research findings into those that matter and those that do not, it should come as no surprise that statisticians disagree on the importance and use of p values (Wasserstein & Lazar, 2016). Indeed, many researchers and data scientists have called for the retirement of statistical significance altogether (Amrhein et al., 2019). Although a complete description and discussion of the debate surrounding significance testing is beyond the scope of these comments, there are many resources available for those individuals who like to fall asleep early (cf. Brereton, 2020; Kennedy-Shaffer, 2019; Vidgen & Yasseri, 2016).

Even though the mission of Update is to present “findings of individual studies without research terminology or jargon” (Update, n.d.), we often include quantitative studies that have varying degrees of statistical mumbo jumbo. (I won’t make any excuses, though, other than maybe I should do a better job as Editor.) Because you have woken from the slumber induced by reading the exhaustive list of the strengths and weaknesses of null hypothesis significance testing cited previously, I thought it would be better to present some ways that researchers are attempting to move beyond the p value. Many of our readers, in addition to being excellent practitioners, endeavor to consume ever more sophisticated quantitative research, so knowing more about what is happening within the social sciences and other music education research journals could prove beneficial.

In a terrific article by Resnick (2017), the case for and against redefining statistical significance is debated. He claims that there are more nuanced ways to move science forward, and asserts that researchers should consider several things when reporting their data. One consideration is to include effect sizes. For those unaware of effect sizes, they are a quantitative measure of the magnitude of an experimental effect, and are reported alongside p values. Rather than only reporting whether there was a statistical difference, researchers should include effect sizes to contextualize the importance and practicality of their findings. Depending on the type of statistical test that was computed, you might find Cohen’s d, Hedges’s g, or partial eta squared (η²) hanging out behind that p value. In other words, just because something is statistically significant does not mean that it has any real importance (see the first sketch below).

Researchers can also provide additional statistical information and a bit more wiggle room when reporting findings through the use of confidence intervals (CIs). These intervals are a “range of values around that statistic that are believed to contain, with a certain probability, the real value of that statistic” (Field, 2009, p. 783). Most researchers use a CI value of 95%. This is a neat way to indicate, using a lower and an upper limit, a range of values that you can be 95% confident contains the true value being estimated. You might see something like this after the p value: p = .01, 95% CI [1.2, 2.5].
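To make the effect-size suggestion concrete, here is a minimal sketch in R (one of the programs mentioned at the outset). The two groups and all of the numbers are simulated, hypothetical data invented purely for illustration, not drawn from any study; the sketch pairs a t-test’s p value with Cohen’s d computed from the pooled standard deviation:

```r
# Hypothetical data: two simulated groups of scores (illustration only)
set.seed(1)
group_a <- rnorm(30, mean = 52, sd = 10)
group_b <- rnorm(30, mean = 46, sd = 10)

# The t-test supplies the p value...
t_out <- t.test(group_a, group_b)

# ...while Cohen's d conveys the magnitude of the difference:
# d = (difference in means) / pooled standard deviation
n_a <- length(group_a)
n_b <- length(group_b)
pooled_sd <- sqrt(((n_a - 1) * var(group_a) + (n_b - 1) * var(group_b)) /
                    (n_a + n_b - 2))
cohens_d <- (mean(group_a) - mean(group_b)) / pooled_sd

cat(sprintf("p = %.3f, Cohen's d = %.2f\n", t_out$p.value, cohens_d))
```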
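A companion sketch for the confidence interval, under the same assumptions (simulated, hypothetical data, so the bracketed bounds will differ from the [1.2, 2.5] example above): R’s t.test() already reports a 95% CI for the difference in group means, ready to place beside the p value.

```r
# Hypothetical data again: two simulated groups (illustration only)
set.seed(1)
group_a <- rnorm(30, mean = 52, sd = 10)
group_b <- rnorm(30, mean = 46, sd = 10)

# t.test() returns the 95% CI for the difference in group means
t_out <- t.test(group_a, group_b, conf.level = 0.95)
ci <- t_out$conf.int

# Report the interval alongside the p value, in the style shown above
cat(sprintf("p = %.3f, 95%% CI [%.1f, %.1f]\n",
            t_out$p.value, ci[1], ci[2]))
```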
There are also matters that do not involve statistics that can help alleviate the anxiety induced by p values. The first consideration is that researchers should contextualize the results of a study by whether the findings are novel or have been replicated. If no one has ever studied this problem, examined the associated data, or employed a similar methodology, it stands to reason that, regardless of the statistical findings, the research community should exercise discretion in extrapolating the results. However, if the study is a replication of previous research or part of a continued line of study, we have greater reason to accept the underlying implications of those studies.

Finally, there has been a push to make data from studies free and accessible online. This includes both quantitative and qualitative data. If researchers are willing to make their data accessible for