What are confidence intervals and p-values?

A confidence interval calculated for a measure of treatment effect shows the range within which the true treatment effect is likely to lie (by convention, a 95% interval is reported).

A p-value is calculated to assess whether differences between treatments are likely to have occurred simply through chance, or whether they are likely to represent a genuine effect.
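As a rough illustration of how the two quantities are obtained from the same data, the sketch below computes a 95% confidence interval and a two-sided p-value for a difference in event rates between two arms. The trial counts are hypothetical, the function name is invented for this example, and the normal (Wald) approximation is an assumption chosen for brevity rather than a recommended method.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_summary(events_treat, n_treat, events_ctrl, n_ctrl, level=0.95):
    """Wald (normal-approximation) CI and two-sided p-value for a risk difference."""
    p_t = events_treat / n_treat
    p_c = events_ctrl / n_ctrl
    diff = p_t - p_c
    # Standard error of the difference between two independent proportions.
    se = sqrt(p_t * (1 - p_t) / n_treat + p_c * (1 - p_c) / n_ctrl)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    # Two-sided p-value against the null hypothesis of no difference.
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, ci, p_value


# Hypothetical trial: 18/100 events on treatment, 30/100 on control.
diff, (lower, upper), p_value = risk_difference_summary(18, 100, 30, 100)
print(f"risk difference = {diff:.3f}, 95% CI {lower:.3f} to {upper:.3f}, p = {p_value:.3f}")
```

The interval reports how large or small the effect could plausibly be, while the p-value condenses the same evidence into a single number.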

Confidence intervals are preferable to p-values, as they tell us the range of possible effect sizes compatible with the data, and thus provide clinically relevant information.

P-values simply provide a cut-off beyond which we assert that the findings are ‘statistically significant’ (by convention, this is p<0.05).

A confidence interval that includes the value of no difference between treatments (zero for a difference, one for a ratio such as a relative risk or odds ratio) indicates that the treatments are not statistically significantly different.
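The link between the interval and the significance test can be checked numerically. In this minimal sketch, two hypothetical studies report the same estimated difference with different standard errors. The study labels, the numbers and the helper `ci_and_p` are all assumptions made for illustration, and a normal approximation is used throughout.

```python
from statistics import NormalDist


def ci_and_p(estimate, se, null_value=0.0, level=0.95):
    """Normal-approximation CI and two-sided p-value for an estimate and its SE."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower, upper = estimate - z_crit * se, estimate + z_crit * se
    p_value = 2 * (1 - NormalDist().cdf(abs(estimate - null_value) / se))
    return lower, upper, p_value


# Hypothetical studies: (estimated difference, standard error).
studies = {"small study": (5.0, 4.0), "large study": (5.0, 2.0)}

results = {}
for name, (estimate, se) in studies.items():
    lower, upper, p_value = ci_and_p(estimate, se)
    includes_null = lower <= 0.0 <= upper
    results[name] = (lower, upper, p_value, includes_null)
    print(f"{name}: 95% CI {lower:.2f} to {upper:.2f}, "
          f"p = {p_value:.3f}, includes zero: {includes_null}")
```

When the interval and the test are built from the same approximation, as here, the 95% interval contains zero exactly when p is above 0.05.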

Confidence intervals aid interpretation by putting upper and lower bounds on the likely size of any true effect.

Non-significance does not mean ‘no effect’. Small studies will often report non-significance even when there are important, real effects that a larger study would have detected.
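A small simulation can make this concrete. The sketch below assumes a real treatment effect of half a standard deviation, an outcome standard deviation known to be 1, and a simple z-test. All of these, along with the sample sizes and the function name, are illustrative assumptions rather than values from any actual trial.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def fraction_significant(n_per_arm, true_diff, n_trials=500, alpha=0.05, seed=1):
    """Share of simulated two-arm trials that reach p < alpha on a z-test."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(2 / n_per_arm)  # outcome SD assumed known and equal to 1
    hits = 0
    for _ in range(n_trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        treated = [rng.gauss(true_diff, 1.0) for _ in range(n_per_arm)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) > z_crit:
            hits += 1
    return hits / n_trials


# The same real effect (0.5 SD) studied with 20 versus 200 patients per arm.
power_small = fraction_significant(n_per_arm=20, true_diff=0.5)
power_large = fraction_significant(n_per_arm=200, true_diff=0.5)
print(f"small trials significant: {power_small:.0%}")
print(f"large trials significant: {power_large:.0%}")
```

Under these assumptions most of the small trials fail to reach significance although the effect is real in every one of them, whereas almost all of the large trials detect it.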

Statistical significance does not necessarily mean that the effect is real: when there is truly no difference between treatments, about one in 20 comparisons will still reach significance at p<0.05 by chance alone.
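The role of chance can be illustrated by simulating trials in which the two treatments are identical by construction, so that every significant result is spurious. The event rate, arm size, number of trials and the pooled z-test for two proportions are assumptions chosen for this sketch.

```python
import random
from math import sqrt
from statistics import NormalDist


def spurious_significance_rate(event_rate=0.3, n_per_arm=100, n_trials=2000,
                               alpha=0.05, seed=7):
    """Share of null trials (identical arms) that reach p < alpha by chance."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = 0
    for _ in range(n_trials):
        # Both arms share the same true event rate, so no real effect exists.
        events_a = sum(rng.random() < event_rate for _ in range(n_per_arm))
        events_b = sum(rng.random() < event_rate for _ in range(n_per_arm))
        pooled = (events_a + events_b) / (2 * n_per_arm)
        se = sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
        if se == 0:
            continue  # no variation at all, so the test cannot be significant
        z = (events_a - events_b) / n_per_arm / se
        if abs(z) > z_crit:
            significant += 1
    return significant / n_trials


rate = spurious_significance_rate()
print(f"null trials reaching p < 0.05: {rate:.1%}")
```

The proportion should come out close to 5%, which is the sense in which a threshold of 0.05 lets through roughly one chance finding for every 20 comparisons of treatments that do not differ.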

Statistically significant does not necessarily mean clinically important.

It is the size of the effect that determines the clinical importance, not the presence of statistical significance.
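The distinction can be sketched with a very large hypothetical trial that finds a small difference. Every number here is invented for illustration, including the threshold labelled `mcid` (a minimal clinically important difference), and the calculation uses a normal approximation.

```python
from math import sqrt
from statistics import NormalDist

# All values below are hypothetical.
n_per_arm = 5000      # patients in each arm
sd = 10.0             # outcome standard deviation (e.g. mmHg)
observed_diff = 1.0   # observed mean difference between arms
mcid = 5.0            # smallest difference assumed to matter clinically

se = sd * sqrt(2 / n_per_arm)
z_crit = NormalDist().inv_cdf(0.975)
lower, upper = observed_diff - z_crit * se, observed_diff + z_crit * se
p_value = 2 * (1 - NormalDist().cdf(abs(observed_diff) / se))

statistically_significant = p_value < 0.05
# The effect could matter clinically only if the interval reaches the threshold.
could_be_clinically_important = upper >= mcid

print(f"difference = {observed_diff:.1f}, 95% CI {lower:.2f} to {upper:.2f}, p = {p_value:.1e}")
print(f"statistically significant: {statistically_significant}")
print(f"could be clinically important: {could_be_clinically_important}")
```

Here the result is highly significant, yet the whole interval lies well below the assumed threshold, so the effect would be judged too small to matter in practice.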

While confidence intervals may be preferable to p-values, the latter provide complementary information, and both can be reported together.