Most people would agree that “p-hacking”, the art of getting a p-value just below 0.05, and publication bias both hurt the accumulation of knowledge. We end up with data hidden in subfolders of discarded laptops and estimated effects biased away from 0. However, their effects on meta-analyses are not additive. In fact, if we take publication bias as given, p-hacking might actually reduce the bias in naive meta-analyses.
Let’s say that there is no p-hacking but there is publication bias. In this situation, only statistically significant results are published. Each individual study provides an unbiased estimate of the population parameter, but when the published studies are aggregated (for now let’s consider their simple mean), the aggregate estimate will be biased away from 0, because only the larger estimates clear the significance bar.
Now let’s imagine a world where there is also p-hacking. There are many ways to p-hack, but let’s consider the case where one changes model specifications to increase the size of the parameter of interest.^{1} In this case, many results that are just large enough to be published appear in the literature. These estimates will have a mean closer to 0 than the mean of the other statistically significant results because they have been p-hacked to be just barely significant. Adding them to the literature therefore pulls the cumulative estimate of the population parameter towards 0, mitigating the bias from publication bias. Given publication bias, p-hacking actually reduces the bias in our estimates of the population parameter.
To demonstrate this further, I will use an example in R. First, let us set up a population where the true parameter is 1 but is realized with some error. Then I write a function `cumulative_est` that computes three cumulative estimates of the population parameter following 500 experiments: one with no bias, one with publication bias, and one with publication bias and p-hacking. First, it runs 500 experiments, each sampling from the population and estimating the mean and the p-value from a t-test of the null hypothesis that the parameter is 0. Then it takes the mean of these individually estimated parameter values to produce a cumulative estimate of the population parameter under each of the three conditions.
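As a rough illustration of a single experiment as described above, here is a minimal sketch in base R. The noise level (`sd = 1`), the population size, the per-study sample size (`n = 10`), and the helper name `run_experiment` are all assumptions made for illustration; none of them is specified in the text.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Assumed population: true parameter 1, realized with unit-variance noise.
population <- rnorm(1e5, mean = 1, sd = 1)

# Hypothetical helper: run one experiment by sampling n units, estimating the
# mean, and testing the null hypothesis that the mean is 0.
run_experiment <- function(population, n = 10) {
  x <- sample(population, n)
  tt <- t.test(x, mu = 0)
  c(estimate = mean(x),
    p_value  = tt$p.value,
    ci_lower = tt$conf.int[1])  # lower bound of the 95% confidence interval
}

# 500 experiments, one row per experiment.
experiments <- as.data.frame(t(replicate(500, run_experiment(population))))

# With no selection at all, the simple mean of the estimates is the
# unbiased cumulative estimate.
mean(experiments$estimate)
```

Keeping the lower confidence bound alongside each estimate is convenient because the p-hacking rule used later is defined in terms of that bound.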
We operationalize `publication_bias` as the proportion of nonsignificant parameter estimates that end up being published, `p_hacking_level` as the amount of bias researchers are willing or able to induce in their parameter estimate to get a statistically significant result, and `p_hacking_success` as the proportion of p-hacking attempts that work. I also assume that p-hackers stop p-hacking once they creep the lower bound of their confidence interval just north of 0, and thus their estimates are biased upwards by $0$ minus the lower bound of the original confidence interval.
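The p-hacking rule just described could be sketched as follows. This is one possible reading of the rule, not necessarily how it was originally coded: the helper name `p_hack`, its argument list, and the use of independent uniform draws to decide whether an attempt succeeds are assumptions.

```r
# Hypothetical helper: return the reported estimates after possible p-hacking.
# A nonsignificant estimate is hacked only if the shift needed to bring the
# lower confidence bound up to 0 is at most p_hacking_level, and the attempt
# succeeds with probability p_hacking_success.
p_hack <- function(estimate, p_value, ci_lower,
                   p_hacking_level, p_hacking_success, alpha = 0.05) {
  needed   <- 0 - ci_lower  # bias required: 0 minus the lower CI bound
  can_hack <- p_value >= alpha & needed > 0 & needed <= p_hacking_level
  works    <- runif(length(estimate)) < p_hacking_success
  hacked   <- can_hack & works
  list(estimate = ifelse(hacked, estimate + needed, estimate),
       hacked   = hacked)
}

# Three made-up results: one hackable, one too far from significance, and one
# that is already significant.
res <- p_hack(estimate = c(0.6, 0.3, 1.2),
              p_value  = c(0.08, 0.40, 0.01),
              ci_lower = c(-0.1, -0.5, 0.4),
              p_hacking_level = 0.2, p_hacking_success = 1)
res$estimate
```

Only the first result is shifted (by 0.1) in this example; the second would need a shift of 0.5, which exceeds `p_hacking_level`, and the third is left alone because it is already significant.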
Now I use Monte Carlo simulations to show how the cumulative estimate of the population parameter will be less biased under p-hacking and publication bias than under publication bias alone. In the first case, `nobias`, there is no publication bias and no p-hacking. In the second case, `pubbias`, only 1 out of every 40 insignificant results is published. In the third case, `pubbias_phack`, researchers can add up to 0.2 to their effect and, when they try, get published 19 out of 20 times. Note that they only bias their parameter estimate by just enough to pull the lower bound of the confidence interval above 0. In this case, we also allow 1 in 40 of the remaining insignificant, non-p-hacked results to get published.
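A self-contained sketch of the three scenarios might look like the following. It is a reconstruction under stated assumptions, not the original code: the noise level, population size, and per-study sample size (`n = 10`) are guesses, and the t-test is computed by hand in vectorized form (rather than by calling `t.test` in a loop) purely for speed. Because of these choices, the numbers it produces need not match the ones reported in this post.

```r
set.seed(1)  # arbitrary seed

# Assumed population: true parameter 1 with unit-variance noise.
population <- rnorm(1e5, mean = 1, sd = 1)

cumulative_est <- function(population, n_experiments = 500, n = 10,
                           publication_bias = 1 / 40,
                           p_hacking_level = 0.2,
                           p_hacking_success = 19 / 20,
                           alpha = 0.05) {
  # One column per experiment, n sampled units per column.
  x   <- matrix(sample(population, n * n_experiments), nrow = n)
  est <- colMeans(x)
  se  <- sqrt(colSums(sweep(x, 2, est)^2) / (n - 1)) / sqrt(n)

  # Two-sided one-sample t-test of H0: mean = 0, plus the lower CI bound.
  p     <- 2 * pt(-abs(est / se), df = n - 1)
  lower <- est - qt(1 - alpha / 2, df = n - 1) * se
  sig   <- p < alpha

  # Each nonsignificant result is published with probability publication_bias.
  lucky <- runif(n_experiments) < publication_bias

  # P-hacking: shift a nonsignificant estimate by just enough to bring the
  # lower CI bound to 0, if that shift is at most p_hacking_level.
  needed <- 0 - lower
  hacked <- !sig & needed > 0 & needed <= p_hacking_level &
    runif(n_experiments) < p_hacking_success
  est_hacked <- ifelse(hacked, est + needed, est)

  c(nobias        = mean(est),
    pubbias       = mean(est[sig | lucky]),
    pubbias_phack = mean(est_hacked[sig | hacked | lucky]))
}

# 1000 Monte Carlo replications, one row per replication.
sims <- t(replicate(1000, cumulative_est(population)))
round(colMeans(sims), 2)
```

Changing `publication_bias`, `p_hacking_level`, or `p_hacking_success` in the call to `cumulative_est` is the easiest way to see how sensitive the comparison is to each assumption.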
So, given publication bias, does p-hacking improve cumulative estimates of population parameters?
| True parameter | No bias | Pub. bias | Pub. bias + p-hacking |
| --- | --- | --- | --- |
| 1 | 1 | 1.11 | 1.06 |
Indeed it appears that it does! Let’s plot the 1000 simulated cumulative estimates of the population parameter.
Feel free to take the code and play around with the parameters.

^{1} In fact, if p-hacking is done by choosing one-sided tests instead of two-sided tests after seeing the data, the attenuation in bias will be even greater because the individual experiments will not produce estimates biased away from 0. Only their p-values will change!