Keen readers may be aware that regression-based techniques, sometimes known as MRP, are the exciting new methodology in polling. Electoral Calculus has been a firm proponent of these methods as an improvement over both classic polling analysis and uniform national swing (UNS) seat predictions.
We have conducted a large amount of testing on previous election polls to verify that the method works in practice (testing details). We have also looked closely at the question of sample size to see how large a regression-based poll needs to be. Both the academic literature and our own testing suggest that samples of four thousand to six thousand people are sufficient for a reasonable-quality result, and that it is not necessary to have a massively large sample.
As it happens, two regression polls were published yesterday. In the morning, the campaign group RemainUnited published a regression poll conducted by Electoral Calculus with polling fieldwork by Savanta:ComRes in Britain and LucidTalk in Northern Ireland (press release). The British fieldwork took 3 days and interviewed around 6,000 people.
Later in the evening, YouGov published a large-scale regression poll where the fieldwork took 7 days and interviewed over 100,000 people (YouGov details). They did not include Northern Ireland.
The results of the two polls are shown in the table below.
|Party||2017 Votes||2017 Seats||Elect Calc|
Electoral Calculus, on behalf of Remain United, used polling fieldwork from (a) Savanta:ComRes surveying 6,073 people in GB from 6-8 Dec 2019, and (b) LucidTalk surveying 2,318 people in Northern Ireland from 27-30 Nov 2019.
YouGov did their own analysis and fieldwork and surveyed 105,612 people from 4-10 Dec 2019. YouGov did not make predictions for Northern Ireland.
(The Speaker has been classified as Labour for comparison purposes.)
While there is no guarantee of how accurate these predictions will prove to be, it is notable that the very large poll did not produce significantly different results. One comparison does not conclusively prove the case, but it supports our belief that large samples, which also carry large costs, are not necessary to get reasonable answers from regression-based polling.
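As a rough illustration of why extra sample size brings diminishing returns, the sketch below computes the textbook 95% margin of error for a national vote share under simple random sampling. This is an assumption made for illustration only: it is not the regression method used by either pollster, it ignores weighting and design effects, and it says nothing about seat-level accuracy. The function name `margin_of_error` is hypothetical; the sample sizes are the two quoted above.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Textbook 95% margin of error for a proportion p from a simple
    random sample of size n (illustrative assumption, not MRP)."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes quoted in the article
small_n = 6073      # Savanta:ComRes GB fieldwork
large_n = 105612    # YouGov fieldwork

small_moe = margin_of_error(small_n)
large_moe = margin_of_error(large_n)

print(f"n={small_n}: +/- {small_moe * 100:.2f} points")
print(f"n={large_n}: +/- {large_moe * 100:.2f} points")
print(f"sample is {large_n / small_n:.1f}x larger, "
      f"margin is {small_moe / large_moe:.1f}x smaller")
```

Because the margin shrinks only with the square root of the sample size, a sample roughly seventeen times larger narrows this simple national-level margin by only about a factor of four.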