Use and misuse of p-values; A review of NHST

In my final essay I present a case against null hypothesis significance testing (NHST) using p-values. I first present several – by no means all – drawbacks of and misinterpretations regarding NHST and its results. I then present an example of a potential problem with using p-values as a decision criterion. The essay ends with a very brief summary of several methods that can be presented alongside, or used as alternatives to, NHST: confidence intervals, effect sizes, and Bayesian statistics.

The goal of this overview is to make people aware of the controversies in this part of scientific practice, and perhaps to inspire a new view on how we should educate new scientists: not only teaching methods based on p-value significance testing, but also discussing its flaws and alternatives.
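One drawback worth making concrete is that a p-value says nothing about the size of an effect. The following simulation is my own hypothetical illustration (a one-sample z-test on simulated data, not an example from the essay): with a large enough sample, even a negligible true effect produces a highly "significant" p-value.

```python
import math
import random

random.seed(1)

# Hypothetical illustration: draw a huge sample from a population whose
# true mean is only 0.01 standard deviations above zero.
n = 1_000_000
xs = [random.gauss(0.01, 1.0) for _ in range(n)]

mean = sum(xs) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

z = mean / (sd / math.sqrt(n))                             # test statistic
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided p-value
d = mean / sd                                              # Cohen's d (effect size)

print(f"p = {p:.2e}, Cohen's d = {d:.3f}")
# The p-value is far below .05, yet the effect size is trivial:
# "significant" does not mean "important".
```

Reporting the effect size (or a confidence interval) alongside the p-value immediately exposes how small the effect actually is.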



Blog on replication; Discussion with EJ Wagenmakers & Michelle Nuijten

Hi everybody.

In this blog we review the major themes in the discussion of Thursday 25/9/2014.

Why (not) replicate?

  • A true finding should be replicable.
  • Replication is not a QRP detector: if you want to, you can make a finding "replicate". However, in the absence of QRPs (and fraud), replication is meaningful.

How can we stimulate replication?

  • Reward system: give extra credit to researchers who replicate findings.
  • Punishment system: penalize studies that cannot be replicated, for example by linking the original study to the failed replication (there was no consensus on this point).
  • Change research policies: replicate your study a couple of times before publication, to be sure that you publish a meaningful finding.
  • Let students do a replication during their internship instead of (mostly failing) novel research.

It is impossible to replicate all studies. Editors could determine which studies need to be replicated, and researchers should focus their replications on:

  • Studies on which policies are based.
  • High impact studies.
  • Implausible findings.

Lastly we discussed some changes that would benefit science and psychology:

  • Researchers need to change their mindset: they currently do research in isolation. We need building blocks that combine expertise and knowledge.
  • Theory can function as a safeguard for quality within science. However, there are too many useless theories: every finding can be explained by a different theory. We need better theories, for example theories based on mathematical models.
  • Supervisors and teachers need training in doing research the right way, i.e. preventing QRPs.
  • Every researcher needs to use Bayesian statistics (of course…).
  • Studies need to be pre-registered.
  • Avoid testing small samples: without preregistration, significant findings from small samples will systematically overestimate the true effect.
  • Every department needs a methodologist. The UvA is planning to set up a ‘methodenleer winkel’ (methodology shop) where staff can get methodological advice.
  • Attitude change: the large effects have mostly been discovered; from now on researchers will probably discover only small effects.
  • Combine research efforts: results are not anyone’s personal baby.
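The point above about small samples can be made concrete with a short simulation. This is my own hypothetical sketch (one-sample t-tests on simulated studies, not part of the discussion): when only "significant" small-sample studies are reported, the published effect sizes are systematically inflated, the so-called winner's curse.

```python
import math
import random

random.seed(7)

# Hypothetical illustration: many small studies of a modest true effect.
true_d, n, sims = 0.2, 20, 5000
t_crit = 2.093  # approximate two-sided .05 critical value for t with df = 19

all_d, sig_d = [], []
for _ in range(sims):
    xs = [random.gauss(true_d, 1.0) for _ in range(n)]
    m = sum(xs) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    d_hat = m / sd                                  # observed effect size
    all_d.append(d_hat)
    if abs(m / (sd / math.sqrt(n))) > t_crit:       # "significant" study
        sig_d.append(d_hat)

print(f"mean d, all studies:         {sum(all_d) / len(all_d):.2f}")
print(f"mean d, significant studies: {sum(sig_d) / len(sig_d):.2f}")
# Averaged over all studies the estimate is about right (~0.2), but the
# significant subset overestimates the effect substantially.
```

Preregistration (or publishing all results, significant or not) removes the selection step that causes this inflation.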

Finally, an optimistic note: we are moving in the right direction, creating a lot of awareness and good initiatives to improve replication and replicability in psychology.


Sarah, Noor, Lukas, Bianca & Bob

Good Rules Bad Rules

In the last couple of years, questionable research practices (QRPs) and fraudulent researchers have received more and more attention from the scientific world. In an attempt to halt the deterioration of the reputation of science, drastic actions are often proposed. One of the solutions presented to counteract QRPs is the implementation of more, and stricter, rules. The scientific world, however, will not necessarily benefit from increased regulation.

Undesirable (always) and unethical (in some cases) as these questionable practices are, we should be careful not to overreact and implement ‘solutions’ that encumber scientific research even further. The implementation of a large set of rules might work well against some QRPs, but it usually also leads to an increase in administrative burden, longer waiting times between application and approval, and so on – in one word: bureaucracy (see Kafka for some examples of bureaucracy at its finest). Science, in most fields, is changing rapidly, and scientists experiment with new practices and new ways of working. Large bureaucratic institutions, however, tend to move forward and adapt at a rather sluggish pace (if at all). Regulation, and the enforcement thereof, will not be able to keep pace with the fast advances that some fields make. One can imagine that dealing with rules that do not adapt to changing circumstances might slow down, or even prohibit, new research. The point I am trying to make is that, in our quest to improve scientific practice, we should not all blindly follow each other in screaming that we need rules and guidelines (when in fact we need integrity), but should keep in mind that many of the suggested solutions for misconduct have drawbacks of their own.

n.b. This blog is not meant to advise against all regulation; it is merely a warning that we should not, in our enthusiasm to improve, make matters worse.