In the latest issue of Nature there is a piece by Regina Nuzzo about how unreliable p-values are for interpreting statistical tests. She describes how researchers today use the p-value quite differently from how Fisher intended it when he first published it in the 1920s. Significance, sensu Nuzzo, was for Fisher merely a means of testing whether a result was worth pursuing further. Yet testing at the 0.05 significance level has become the gold standard in scientific work, often without any information about the prior plausibility of the hypothesis being tested or the strength of the effect being reported. Without that information, the p-value is not very informative and in many cases can be misleading. This piece can be related to a long-standing feud among statisticians over the pros and cons of frequentist statistics versus, for example, Bayesian inference techniques.
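To see why prior plausibility matters, here is a small sketch (with hypothetical but commonly quoted numbers: a significance threshold of 0.05 and statistical power of 0.8) of how often a "significant" result is actually a false positive, depending on how plausible the tested hypotheses are to begin with:

```python
def false_discovery_rate(prior, alpha=0.05, power=0.8):
    """Probability that a result significant at `alpha` is a false positive,
    given the prior probability that the tested hypothesis is true."""
    false_pos = alpha * (1 - prior)  # null is true, yet the test comes out significant
    true_pos = power * prior         # effect is real and the test detects it
    return false_pos / (false_pos + true_pos)

# The same p < 0.05 means very different things for plausible vs. long-shot hypotheses:
for prior in (0.5, 0.1, 0.01):
    print(f"prior = {prior:.2f} -> false positives among significant results: "
          f"{false_discovery_rate(prior):.0%}")
```

Under these assumptions, a significant result from a coin-flip hypothesis (prior 0.5) is a false positive only about 6% of the time, but for a long-shot hypothesis (prior 0.01) roughly 86% of "significant" findings are false alarms, even though the p-value threshold is identical.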
The take-home message from Nuzzo's piece, and from others on the subject, is that a p-value alone is not enough: one should always look for additional ways to test the rigor of one's results, and describe them in a way that makes them readily reproducible.