PageSpeed Insights Scores Aren't As Stable As You'd Think
We use a number of different tools when profiling our sites' performance. PageSpeed Insights is one that gives you nice, high-level and easy-to-compare scores. Recently though, I've become aware that if you run tests back to back you get varying scores, sometimes by quite a lot.
How Much Is A Lot?
In some cases, a difference of as much as 50 points!
To demonstrate, I ran PageSpeed Insights on https://www.freshleafmedia.co.uk 17 times, one after another. Fun fact: if you run that many tests in a short period you start getting CAPTCHA challenges. The scores we got ranged from 49 to 98. If we discard the highest and lowest values to remove anomalous results, the range is 81 to 97, a difference of 16 (a standard deviation of 5!). Better, but still not great.
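If you want to do the same sums on your own runs, here is a minimal Python sketch. The function name `summarise` and the example scores are placeholders of mine, not the actual numbers from the 17 runs, and I'm assuming a sample standard deviation since there is more than one way to calculate it.

```python
import statistics


def summarise(scores):
    """Return the spread of a list of scores, with and without the extremes."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to trim the extremes")
    ordered = sorted(scores)
    # Drop the single lowest and single highest value to remove outliers.
    trimmed = ordered[1:-1]
    return {
        "range": ordered[-1] - ordered[0],
        "trimmed_min": trimmed[0],
        "trimmed_max": trimmed[-1],
        "trimmed_range": trimmed[-1] - trimmed[0],
        # Sample standard deviation needs at least two values.
        "trimmed_stdev": statistics.stdev(trimmed) if len(trimmed) > 1 else 0.0,
    }


# Placeholder values for illustration only, not real PageSpeed Insights results.
example_scores = [90, 85, 70, 88, 95]
print(summarise(example_scores))
```

Trimming only one value from each end is a fairly blunt way of dealing with outliers, but it matches the approach described above and is easy to reason about.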
Is It Just Me?
My first thought was that the difference was caused by fluctuations in something like server response time. It's normal for there to be changes in network conditions, server load etc. which could easily cause the score to change, and I think this explains some of the outliers we see. But looking closer at the reports, the response time for the bulk of the runs doesn't change much.
My next thought was to try other sites. I chose a site that is statically generated, so there is no PHP, no database and no third-party integrations, and it got weirder. At first the results seemed stable, but the more tests I ran, the more apparent it became that the score was still varying, only in 'chunks'. To explain: the first three tests all showed 87%, the next four showed 91% and the next five 83%???
PageSpeed Insights is useful and I still use it. The insights it gives about where your site is slow are invaluable. But if you want to use the scores to benchmark before and after a change, I would recommend running it a couple of times and taking the average.
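Running the test by hand several times gets tedious, so here is a rough sketch of how the averaging could be automated against the public PageSpeed Insights API (v5). The function names, the number of runs and the pause between them are my own choices. I'm also assuming unauthenticated requests are enough for a handful of runs; for heavier use the API accepts a `key` parameter.

```python
import json
import statistics
import time
import urllib.parse
import urllib.request

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def fetch_score(page_url, strategy="mobile"):
    """Run one PageSpeed Insights test and return the performance score (0-100)."""
    query = urllib.parse.urlencode(
        {"url": page_url, "strategy": strategy, "category": "performance"}
    )
    with urllib.request.urlopen(f"{PSI_ENDPOINT}?{query}", timeout=120) as response:
        report = json.load(response)
    # The API reports the score as a fraction between 0 and 1.
    fraction = report["lighthouseResult"]["categories"]["performance"]["score"]
    return round(fraction * 100)


def average_score(page_url, runs=5, pause_seconds=2.0, fetch=fetch_score):
    """Run several tests back to back and summarise the scores."""
    scores = []
    for run in range(runs):
        if run > 0:
            time.sleep(pause_seconds)  # small gap between runs (arbitrary choice)
        scores.append(fetch(page_url))
    return {
        "scores": scores,
        "mean": statistics.mean(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Example call (makes real network requests, so it is left commented out):
# print(average_score("https://www.freshleafmedia.co.uk", runs=5))
```

Comparing the mean before and after a change, and keeping an eye on the min and max, should give a fairer picture than any single run.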
I recently became aware that the PageSpeed Insights FAQs page lists this exact problem and loosely confirms my theories:
Why does the performance score change from run to run? I didn’t change anything on my page!
Variability in performance measurement is introduced via a number of channels with different levels of impact. Several common sources of metric variability are local network availability, client hardware availability, and client resource contention.