Wider die „Sternchenkunde“!

Diskussionsbeitrag zur empirischen Sportwissenschaft

Against “p-hacking”

Contribution to the discussion on empirical sport science

  • Diskussionen (discussion contribution)
  • Journal: Sportwissenschaft

Zusammenfassung

Empirische Beobachtungsergebnisse werden mit statistischen Verfahren ausgewertet, um Zusammenhänge und Unterschiede zu prüfen und sachgerecht interpretieren zu können. Die statistisch-methodologische Auswertungsstrategie entspricht einer formalen Sprache, die zu lernen ist, um sich mit anderen Forscher(inne)n verständigen zu können, die aber, wie andere Gegenstandsgebiete auch, durch neue Erkenntnisse im Wandel begriffen ist und damit sich auch entsprechende Empfehlungen zur Verwendung optimaler statistischer Methoden im Laufe der Zeit wandeln. Dieser Gedanke wird in diesem kurzen Diskussionsbeitrag aufgegriffen, und es werden Empfehlungen zu fünf zentralen Bereichen gegeben: a) Gerichtetheit der Hypothesen, b) Konfidenzintervalle, c) Effektgrößen, d) Intervall-Effektgrößen und e) praktische Bedeutsamkeit.

Abstract

Empirical results are analyzed with statistical methods in order to test relationships and differences and to interpret them appropriately. The statistical and methodological analysis strategy, together with the way it is reported, amounts to a technical language that has to be learnt for successful communication between researchers, authors and reviewers. Like any other field, however, this language changes as new findings emerge, and the recommendations on which statistical methods are best used change with it. This brief discussion article takes up that line of thought and presents recommendations in five essential areas: (a) directionality of hypotheses, (b) confidence intervals, (c) effect sizes, (d) confidence intervals of effect sizes and (e) practical meaningfulness.
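
To make recommendations (c) and (d) concrete, the following is a minimal sketch and not the authors' procedure. It computes a standardized effect size (Cohen's d with a pooled standard deviation) and an approximate confidence interval around it. The function names, the jump-height data and the large-sample normal approximation for the interval are assumptions made for illustration; exact intervals for d are usually derived from the noncentral t distribution instead.

```python
import math
from statistics import NormalDist, mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def approx_ci_for_d(d, n_a, n_b, level=0.95):
    """Large-sample normal approximation (assumption), not the exact
    noncentral-t interval."""
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


# Hypothetical jump heights in cm, invented for illustration only.
training = [42.1, 44.3, 40.8, 45.0, 43.2, 41.7, 44.9, 42.6]
control = [40.2, 41.5, 39.8, 42.0, 40.9, 39.5, 41.8, 40.4]

d = cohens_d(training, control)
low, high = approx_ci_for_d(d, len(training), len(control))
print(f"d = {d:.2f}, approx. {95}% CI [{low:.2f}, {high:.2f}]")
```

Reporting the interval alongside the point estimate shows how precisely the effect was estimated, which a significance asterisk alone does not convey. Whether an effect of that size matters in practice, recommendation (e), remains a substantive judgment that the code cannot make.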

Author information

Corresponding author

Correspondence to Dirk Büsch.

Ethics declarations

Conflict of interest

D. Büsch and B. Strauß declare that they have no conflict of interest.

About this article

Cite this article

Büsch, D., Strauß, B. Wider die „Sternchenkunde“!. Sportwiss 46, 53–59 (2016). https://doi.org/10.1007/s12662-015-0376-x

  • DOI: https://doi.org/10.1007/s12662-015-0376-x
