*If you want more updates, follow me on Twitter. I'll post new projections later today, I want to wait for the new Nanos numbers.*

Every poll published in Canada is supposed to come with margins of error. You know, the plus or minus 3%, 19 times out of 20 thingy. Over the last decade, the rise of online polls has created a new debate, since those polls don't use a random sample - they instead draw from panels of hundreds of thousands of people who agreed to be on a list - and therefore don't have the classic margins of error.

That's a debate for another day, however. What I want to do in this post is take a look at the actual, empirical accuracy of polls. Because, you see, the 3%, 19 times out of 20 is purely theoretical (and, as mentioned, doesn't even apply to online polls). It only represents the random variation occurring because of the random sampling. It doesn't account for other factors such as turnout, biased sampling, people changing their minds, selection bias, etc.

Think of it this way: if the only uncertainty really were due to random sampling, then polling aggregators like me would have almost perfect accuracy. Indeed, while one individual poll with 1,000 observations has a margin of error of roughly 3%, a polling average composed of 5-7 polls has a theoretical margin of error much smaller than that (closer to 1%). Yet, empirically speaking, I (and other aggregators) have not been that close.
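The shrinking margin of error from pooling polls is easy to verify with the standard formula for a proportion. A quick sketch (using the worst case p = 0.5, and treating a simple average of six 1,000-person polls as one 6,000-person sample, which is the naive best case):

```python
import math

def moe(n, p=0.5, z=1.96):
    """Theoretical 95% margin of error for a proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

single = moe(1000)      # one poll of 1,000 respondents
pooled = moe(6 * 1000)  # naive average of six such polls

print(f"single poll: +/-{single * 100:.1f} pts")       # +/-3.1 pts
print(f"six-poll average: +/-{pooled * 100:.1f} pts")  # +/-1.3 pts
```

That 1.3-point figure is the theoretical floor; the whole point of this post is that the empirical error of polling averages is far larger.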

Here are the metrics we'll use. The first one is the average absolute error. This one is the simplest. If a party was polled, on average, at 30% but ultimately got 32% of the vote, the absolute error is 2 points. Absolute means you take the absolute value, so it doesn't matter whether the polls were under- or overestimating the party. You do this for every party during an election, take the average and voilà.

The second metric is the MSE, or Mean Squared Error. This is the average of the squared deviations. It is a very commonly used measure of the precision of an estimator in statistics.

Finally, using the MSE, we can estimate actual, empirical margins of error by taking its square root (the RMSE, which puts us back in percentage points) and multiplying by 1.96. This measure has a nice interpretation and can be directly compared to the theoretical margins of error provided by the polls.
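The three metrics together fit in a few lines. Here is a sketch with made-up numbers (the polled and actual shares below are purely illustrative, not from any real election):

```python
import math

# Hypothetical final polling averages vs. actual results, in % (illustrative only)
polled = [30.0, 34.0, 19.0, 10.0, 5.0]
actual = [32.0, 33.0, 17.5, 11.0, 4.5]

errors = [p - a for p, a in zip(polled, actual)]

mae = sum(abs(e) for e in errors) / len(errors)  # average absolute error
mse = sum(e * e for e in errors) / len(errors)   # mean squared error
empirical_moe = 1.96 * math.sqrt(mse)            # 1.96 * RMSE, back in points

print(f"MAE: {mae:.2f} pts, RMSE: {math.sqrt(mse):.2f} pts, "
      f"empirical 95% MoE: {empirical_moe:.2f} pts")
# MAE: 1.20 pts, RMSE: 1.30 pts, empirical 95% MoE: 2.56 pts
```

Note that the square root is what keeps the units honest: multiplying the raw MSE (which is in squared percentage points) by 1.96 would not give a comparable margin of error.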

I have looked at the following elections over the last 12 years: the 2008, 2011 and 2015 federal elections; the 2008, 2012, 2014 and 2018 Quebec elections; the 2012, 2015 and 2019 Alberta elections; BC in 2013 and 2017; as well as Ontario in 2014 and 2018. In each case, I took an average of the polls published during the last week, limiting each polling firm to one poll only. No weighting or anything, just a simple average. Some might argue that I should give a bigger weight to polls closer to election day. Fair enough, but empirically it really doesn't make a big difference, as we showed in a research paper with David Coletto. I also only used the numbers for the major parties included in the polls, so the number of parties varies depending on the election (5 at the federal level, for instance).
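That averaging procedure (last-week polls, one per firm, unweighted) can be sketched like this. The firm names and numbers below are made up for illustration; the input is assumed to be sorted with each firm's most recent poll first:

```python
# Hypothetical last-week polls, most recent first: (firm, {party: share in %})
polls = [
    ("Firm A", {"LPC": 33.0, "CPC": 34.0, "NDP": 16.0}),
    ("Firm A", {"LPC": 32.0, "CPC": 35.0, "NDP": 17.0}),  # older poll, dropped
    ("Firm B", {"LPC": 31.0, "CPC": 36.0, "NDP": 15.0}),
]

# Keep only the first (most recent) poll from each firm
seen, kept = set(), []
for firm, shares in polls:
    if firm not in seen:
        seen.add(firm)
        kept.append(shares)

# Unweighted simple average per party
average = {party: sum(p[party] for p in kept) / len(kept) for party in kept[0]}
print(average)  # {'LPC': 32.0, 'CPC': 35.0, 'NDP': 15.5}
```

No recency weights, no firm ratings: deliberately the simplest possible aggregate, which makes its empirical error all the more telling.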

So here are the results:

On average, polls have been roughly 2 points off. That's not bad, you'll say, but remember that this is the polling AVERAGE. As mentioned above, this average should be much closer to the actual result if the only uncertainty were really due to random sampling. Being 2 points off for an individual poll? That's great. For an average composed of usually 5-10 polls? Not impressive. And this alone shows that the theoretical margins of error published in the media are pretty useless. This is why, personally, I don't care about the phone vs online debate - at the end of the day, what matters is the empirical accuracy. But feel free to have a strong opinion on this and go on Reddit or Twitter to express your deep knowledge of stats.

Maybe the most shocking stat is the corresponding, effective, empirical margin of error: 5.8%! And if you think I made a mistake or that Canadian polls are uniquely bad, you'd be wrong. The same margin of error in the US is close to 7%! In other words, we pretty much never have an election where we can be absolutely certain who will win based on the polls. Take the current federal election: it means the 95% confidence ranges for the top two parties are roughly 29% to 40%. Yes, that might seem absurd, but this is what the accuracy has been over the last 10 years.

Side note: French presidential elections seem more accurate than ours, on average, with an effective margin of error below 4%. And that includes the 2002 election, where pollsters had the wrong top two.

Don't believe me? You'll likely mention a list of elections where the polls were super close. Fair enough, but let me retort with a few giant mistakes. In Alberta 2012, polls had the Wildrose ahead by 7 points; they lost by 10 to the PC! In BC 2013, the BC Liberals got 44% of the vote while the polls predicted around 36%.

More recently, the CAQ in Quebec beat the Liberals by a margin of almost 13 points. What were the polls saying? CAQ ahead by roughly 4!

Even elections where polls got the correct winner can have weak accuracy. The recent Alberta election had polls putting the UCP at 48%, 10 points ahead of the NDP. The actual results? The UCP got close to 55% while the NDP was only at 33%. Even elections where people think the polls did well aren't that great. The 2018 Ontario election had the PC at 39%, a little less than 4 points ahead of the NDP. In the end, Doug Ford won with 40.5% versus 33.6%. Not as bad as the other examples, but still far from perfect accuracy, and a good example of how an average absolute error of 2 points can lead to outcomes very different from the polls and projections.

Here below you have those stats for every election I used:

We can clearly see the big misses of Alberta 2012 and BC 2013, as well as Quebec 2018 and Alberta 2019 (the last two are less remembered because the polls at least had the correct winner).

The good news for us? Federal elections have been more accurate than the average so far. The bad news? The polls over the last year or so have been very bad. Ontario 2018 is the best of the bunch and the polls still missed the margin of victory by 4 points!

So keep that in mind when you're looking at polls, even polling averages like on my site. Also, keep an open mind if you see a poll that 'clearly' looks like an outlier. You never know, this one poll might actually be right.

Oh, finally, when I looked at the performance of online versus phone polls, I found no significant difference between the two.