Track Record: 2019 Errors

This page first posted 27 January 2020

The headline prediction for the December 2019 election was broadly accurate. The polls fairly accurately indicated a substantial Conservative lead over Labour, which our new regression-based model translated into a considerable Conservative majority. In the event the Conservative lead over Labour was slightly larger than the polls had shown, so the Conservative majority was a bit bigger than expected.

In numerical terms, the prediction and the outcome for GB seats were:

Party  | 2017 Votes | 2017 Seats | Pred Votes | Pred Seats | Actual Votes | Actual Seats | Vote Error | Seat Error
CON    | 43.5%      | 318        | 43.3%      | 351        | 44.7%        | 365          | −1.4%      | −14
LAB    | 41.0%      | 262        | 33.9%      | 224        | 33.0%        | 203          | +0.9%      | +21
LIB    | 7.6%       | 12         | 11.7%      | 13         | 11.8%        | 11           | −0.1%      | +2
Brexit | 0.0%       | 0          | 3.2%       | 0          | 2.1%         | 0            | +1.1%      | 0
Green  | 1.7%       | 1          | 2.7%       | 1          | 2.8%         | 1            | −0.1%      | 0
SNP    | 3.1%       | 35         | 3.6%       | 41         | 4.0%         | 48           | −0.9%      | −7
Plaid  | 0.5%       | 4          | 0.4%       | 2          | 0.5%         | 4            | −0.1%      | −2
UKIP   | 1.9%       | 0          | 0.0%       | 0          | 0.1%         | 0            | −0.1%      | 0

The final Electoral Calculus prediction is made up of two components: the poll of opinion polls and the seat predictor. Both parts performed well.

The poll-of-polls showed a Conservative lead over Labour of 9.4pc compared with an actual lead of 11.7pc. The difference between those, an error of 2.3pc, is low by historical standards. The equivalent lead errors were significantly higher in recent elections, with an error of 4.3pc in 2017 and a large error of 6.3pc in 2015.

Even with the poll error, the seat predictor performed very acceptably. The two major parties were within the margins of error, and the Conservative prediction of 351 seats was only 14 seats too low. Of all major UK seat predictors in 2019, the Electoral Calculus final prediction came closest to the actual result. Other predictors generally had the Conservative seat prediction too low, such as FocalData (337 seats), YouGov (339 seats) and Datapraxis (344 seats).

Of course, there is more to prediction accuracy than the headline seat count, since every seat is predicted individually which provides 650 opportunities to be right or wrong. Looking at the final Electoral Calculus seat-by-seat prediction, the winner was predicted correctly in 605 seats and incorrectly in 45 seats. That is a success rate of 93pc, which is the same as the high success rate of YouGov's final large-scale MRP in 2017. About a third of the incorrect seats were marginals which could have gone either way, leaving 31 seats which were mis-predicted.

We can also separate out the errors caused by polling and by the seat predictor. If we feed the actual national vote shares for GB and Scotland into the Electoral Calculus seat predictor we get the following result:

Party  | Outcome Votes | Predicted Seats | Outcome Seats | Seat Error
CON    | 44.7%         | 362             | 365           | −3
LAB    | 33.0%         | 204             | 203           | +1
LIB    | 11.8%         | 13              | 11            | +2
Brexit | 2.1%          | 0               | 0             | 0
Green  | 2.8%          | 1               | 1             | 0
SNP    | 4.0%          | 50              | 48            | +2
Plaid  | 0.5%          | 2               | 4             | −2
UKIP   | 0.1%          | 0               | 0             | 0

You can see this for yourself by running the seat predictor with the correct national and Scottish vote shares: run predictor.

This is quite an accurate performance, with all parties correctly predicted to within just three seats. As it happens, this is as accurate as the TV Exit poll.

Of the two causes of error, polling error and predictor error, the polling error was the larger component, even though it was itself fairly small. The error from the prediction model was smaller still.
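As a rough illustration of this decomposition, the sketch below uses the Conservative seat figures from the tables on this page (351 seats predicted from the polled vote shares, 362 from the actual vote shares, and 365 actually won); the variable names are illustrative only.

```python
# Sketch of splitting the total seat error into polling and model components,
# using the Conservative figures quoted on this page.

seats_from_polled_shares = 351   # final prediction (poll-of-polls fed into the model)
seats_from_actual_shares = 362   # prediction when the actual vote shares are fed in
seats_actually_won       = 365   # election result

model_error   = seats_from_actual_shares - seats_actually_won        # -3: model component
polling_error = seats_from_polled_shares - seats_from_actual_shares  # -11: polling component
total_error   = seats_from_polled_shares - seats_actually_won        # -14 = sum of the two

print(model_error, polling_error, total_error)
```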

We will now look at these and other issues in more detail. The particular topics studied are:
  1. Northern Ireland
  2. Model errors
  3. Seat by seat errors

1. Northern Ireland

For the first time, Electoral Calculus was able to make predictions for Northern Ireland, in collaboration with our polling partners at LucidTalk.

The table below shows the predictions which were made and compares them with the actual outcome.

Party    | 2017 Votes | 2017 Seats | Pred Votes | Pred Seats | Actual Votes | Actual Seats | Vote Error | Seat Error
DUP      | 36.0%      | 10         | 30%        | 10         | 31%          | 8            | −1%        | +2
SF       | 29.4%      | 7          | 25%        | 6          | 23%          | 7            | +2%        | −1
SDLP     | 11.7%      | 0          | 13%        | 2          | 15%          | 2            | −2%        | 0
UUP      | 10.3%      | 0          | 11%        | 0          | 12%          | 0            | −1%        | 0
Green    | 0.9%       | 0          | 0.1%       | 0          | 0.2%         | 0            | −0.1%      | 0
Alliance | 7.9%       | 0          | 16%        | 0          | 17%          | 1            | −1%        | −1
NI Other | 3.7%       | 1          | 5%         | 0          | 3%           | 0            | +2%        | 0

On the whole, the vote share predictions were fairly accurate. All parties were predicted to within 2pc of their actual vote share, which is within the margin of error (2.6pc) for a poll of 2,300 people. Our LucidTalk poll correctly predicted that the DUP and Sinn Fein would lose support compared with 2017 and that the Alliance would gain support. It also correctly predicted only small increases in support for the SDLP and UUP. In terms of seats, two seats were mis-predicted and sixteen seats were correctly predicted. Overall, the prediction was fairly accurate.

The two mis-predicted seats are described and explained below.

In Belfast North, we predicted a narrow DUP victory over Sinn Fein with a predicted majority of 3pc. In the event, Sinn Fein won the seat with a narrow majority of 4pc. Compared with our prediction, there is the suggestion of anti-DUP tactical voting by Alliance supporters, as the Alliance party polled 7pc lower than expected.

In Down North, we predicted a very narrow DUP victory over the Alliance with a predicted majority of just 0.2pc. The Alliance party actually won the seat with a majority of 7pc. This was a tricky seat to predict because the independent incumbent (Lady Sylvia Hermon) had stepped down and the UUP had not run in the seat recently. The UUP did better than we expected, and partially split the Unionist vote.

1.1 Northern Ireland methodology

This was the first time that Electoral Calculus made predictions for Northern Ireland and there was some interest from people in the province and elsewhere about how the predictions were made.

The first stage was a poll of 2,300 people in Northern Ireland conducted by LucidTalk and sponsored by Electoral Calculus and Remain United. LucidTalk performed their own analysis on the fieldwork data to calculate their estimates of the vote share for each party, as well as approximate probabilities for parties in each seat. At the planning stage, it was hoped to use regression-based techniques for these calculations. In the event, there was not enough time to perform a careful regression and classical polling methods were used instead.

You can see the LucidTalk tables for this poll on their website. The key figures are given in the spreadsheet of tables on the tab named 'Q2-WestminsterVote-ExcNVs' which has the following vote shares: DUP 30pc, SF 25pc, Alliance 16pc, SDLP 13pc, UUP 11pc, Other 5pc. These are the provincial vote share figures which drove the predictions.

LucidTalk also conducted their own accuracy analysis, focusing on a slightly different prediction, which was the vote share for the five large parties excluding smaller parties. They found, on that basis, that their vote share predictions were within 1pc of the actual result. (LucidTalk analysis).

The other key table was the LucidTalk NI Westminster Seat Predictor Model which is shown on page six of this LucidTalk document. For the record, those probabilities of victory in each seat for each major party are shown in the table below:

Seat                       | Predicted Winner | DUP | SF  | Alliance | SDLP | UUP
Antrim East                | DUP              | Max |     |          |      |
Antrim North               | DUP              | Max |     |          |      |
Antrim South               | DUP              | 54% |     | 26%      |      | 18%
Belfast East               | DUP              | 53% |     | 42%      |      |
Belfast North              | DUP              | 49% | 46% |          |      |
Belfast South              | SDLP             | 23% |     | 19%      | 55%  |
Belfast West               | SF               |     | Max |          |      |
Down North                 | DUP              | 47% |     | 44%      |      |
Down South                 | SF               |     | Max |          |      |
Fermanagh and South Tyrone | SF               |     | 67% |          |      | 28%
Foyle                      | SDLP             |     | 45% |          | 48%  |
Lagan Valley               | DUP              | Max |     |          |      |
Londonderry East           | DUP              | Max |     |          |      |
Newry and Armagh           | SF               |     | Max |          |      |
Strangford                 | DUP              | Max |     |          |      |
Tyrone West                | SF               |     | Max |          |      |
Ulster Mid                 | SF               |     | Max |          |      |
Upper Bann                 | DUP              | Max |     |          |      |

In this table, the probabilities indicate the probability of the party winning the seat, not the predicted vote share. The entry 'Max' indicates the party has a very high chance of winning the seat, interpreted by Electoral Calculus to be around 80pc or higher.

The second stage of the process was conducted by Electoral Calculus. In this stage, predicted vote shares for each seat were inferred. The starting point for the optimization was the predicted results in each seat using the Electoral Calculus model applied to the LucidTalk provincial vote shares. These predictions were then adjusted using a search-based optimization to find adjusted vote shares for each seat which satisfy these three constraints:

  1. The total party vote shares over the province should match those calculated by LucidTalk
  2. The predicted winner in each seat should match LucidTalk
  3. The seat win chances should (approximately) match those calculated by LucidTalk

The results of this process were the predicted seat results shown on the Electoral Calculus website. In terms of the division of labour between LucidTalk and Electoral Calculus, LucidTalk was responsible for the province-wide vote shares and for the predicted winner of each seat; and Electoral Calculus was responsible for the predicted vote shares and for our seat win chances (which were similar to, but different from, the LucidTalk values due to modelling and other constraints).
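The sketch below gives a rough flavour of how such a search-based adjustment can be set up, using a simple quadratic penalty and made-up figures for three hypothetical, equal-sized seats. It is not the actual Electoral Calculus code: the penalty weights are arbitrary, the roughly 5pc of 'Other' votes is ignored, and constraint 3 (matching win chances) is left out for brevity.

```python
# Illustrative sketch only of the second-stage adjustment: start from
# model-predicted seat shares and nudge them so that the provincial totals
# and the seat winners match the LucidTalk figures.

import numpy as np
from scipy.optimize import minimize

parties = ["DUP", "SF", "Alliance", "SDLP", "UUP"]
poll_totals = np.array([0.30, 0.25, 0.16, 0.13, 0.11])   # LucidTalk provincial shares

baseline = np.array([                      # model-predicted shares in three toy seats
    [0.45, 0.30, 0.10, 0.05, 0.10],
    [0.20, 0.40, 0.15, 0.20, 0.05],
    [0.25, 0.05, 0.30, 0.25, 0.15],
])
target_winner = [0, 1, 3]                  # LucidTalk predicted winner per seat (index)

def penalty(flat):
    shares = flat.reshape(baseline.shape)
    cost = np.sum((shares - baseline) ** 2)                          # stay near the baseline
    cost += 100 * np.sum((shares.mean(axis=0) - poll_totals) ** 2)   # constraint 1: totals
    for i, w in enumerate(target_winner):                            # constraint 2: winners
        margin = shares[i, w] - np.max(np.delete(shares[i], w))
        cost += 100 * max(0.0, 0.01 - margin) ** 2                   # demand a lead of 1pc or more
    return cost

res = minimize(penalty, baseline.ravel(), method="Nelder-Mead",
               options={"maxiter": 50000, "maxfev": 50000})
adjusted = res.x.reshape(baseline.shape)
print(np.round(adjusted, 3))
print("Provincial totals:", np.round(adjusted.mean(axis=0), 3))
```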

1.2 Northern Ireland summary

Overall the Northern Ireland prediction was fairly accurate. The province-wide vote shares were within the margin of error for all parties. All but two seats were correctly predicted, and the two which were mis-predicted were correctly identified as being very marginal.

Electoral Calculus is pleased to have partnered with our friends at LucidTalk on this project and to have benefitted from their well-informed judgements on Northern Irish politics.

2. Model errors

We can separate out the effects of polling error and model error. This is done by running the model using the actual national vote shares from the election and seeing how accurate the result is. This removes polling error because we are using the actual vote shares, and so the error that remains is model error.

This can be done both for the new regression-based model, which was the one used in the campaign, and also our older UNS-style strong transition model. This lets us compare between the two models to see if there is any noticeable difference.

The regression-based model used a regression-driven "baseline" prediction, which was then modified by a small UNS overlay to adjust for the difference between the regression's national vote shares and the target national vote shares. The regression baseline was based on two waves of campaign polling, each around 6,000 respondents in size.
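As a rough sketch of what the UNS overlay does (the real model is more sophisticated, and the two seats below are invented for illustration), each party's share in every seat is shifted by the gap between the target national share and the national share implied by the baseline. The national shares used here are the poll-of-polls and actual GB figures quoted above.

```python
# Minimal sketch of an additive UNS overlay, assuming each party's seat-level
# share is shifted by the national baseline-to-target gap. Seat figures are
# illustrative only.

baseline_national = {"CON": 43.3, "LAB": 33.9, "LIB": 11.7}   # baseline (polled) shares, pc
target_national   = {"CON": 44.7, "LAB": 33.0, "LIB": 11.8}   # target (actual) shares, pc

baseline_seats = {
    "Example Seat A": {"CON": 47.5, "LAB": 38.0, "LIB": 10.0},
    "Example Seat B": {"CON": 31.0, "LAB": 29.5, "LIB": 33.0},
}

def uns_overlay(seat_shares):
    """Shift each party's share in a seat by the national baseline-to-target gap."""
    return {p: round(v + target_national[p] - baseline_national[p], 1)
            for p, v in seat_shares.items()}

for name, shares in baseline_seats.items():
    adjusted = uns_overlay(shares)
    winner = max(adjusted, key=adjusted.get)
    print(name, adjusted, "->", winner)
```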

The table below shows the actual election result in terms of both vote share and seats won; the UNS-style prediction using the actual vote share as the model input; and the regression-based prediction. The predictions are given in terms of seats and the error (defined as predicted seats minus actual seats) is also shown.

Party  | Actual Votes | Actual Seats | UNS Prediction | UNS Error | Regression Prediction | Regression Error
CON    | 44.7%        | 365          | 351            | −14       | 362                   | −3
LAB    | 33.0%        | 203          | 210            | +7        | 204                   | +1
LIB    | 11.8%        | 11           | 17             | +6        | 13                    | +2
Brexit | 2.1%         | 0            | 0              | 0         | 0                     | 0
Green  | 2.8%         | 1            | 1              | 0         | 1                     | 0
SNP    | 4.0%         | 48           | 50             | +2        | 50                    | +2
Plaid  | 0.5%         | 4            | 3              | −1        | 2                     | −2
UKIP   | 0.1%         | 0            | 0              | 0         | 0                     | 0

The number of incorrectly predicted seats in total was 48 for the UNS model and 34 for the regression model.

Generally the regression model performed better than the UNS model. The regression model predicted the total seats for each party to within a narrow tolerance of just three seats, whereas the UNS model was 14 seats too low for the Conservatives. (If the UNS model had been run with the predicted vote shares rather than the actual vote shares, it would have predicted the Conservatives to win 344 seats, which is 21 seats too low.) The regression-based model was also more successful at predicting individual seats, with a seat error rate of 5.4pc (34 of the 632 GB seats) compared with 7.6pc (48 seats) for the UNS-style model.

Overall the regression-based model was a noticeable improvement on the older UNS approach and was a more accurate predictor.

3. Seat by Seat errors

Even if we use the exact national vote shares, we do not predict every seat correctly. Although the total for each party is correctly predicted to within three seats, there are still 34 seats which were mis-predicted. This compares well with previous years. There were 50 seats wrong in 2017 (after correcting for polling and methodology issues), 36 in 2015, and 63 in 2010.

It is encouraging that the new regression-based model is having some success in converting national polling into predictions at lower-level geographies.

Num | Seat Name                 | GE2017 | Prediction | GE2019 | Area          | Swing | Comment
1   | Moray                     | CON-09 | NAT-01     | CON-01 | Scotland      | 1%    | Marginal
2   | Blyth Valley              | LAB-19 | LAB-01     | CON-02 | North East    | 2%    | Marginal
3   | Dewsbury                  | LAB-06 | LAB-01     | CON-03 | Yorks/Humber  | 2%    | Marginal
4   | Dumfries and Galloway     | CON-11 | NAT-02     | CON-04 | Scotland      | 3%    | Pro-Union
5   | Delyn                     | LAB-11 | LAB-04     | CON-02 | Wales         | 3%    | Welsh anti-Labour
6   | Birmingham Northfield     | LAB-11 | LAB-03     | CON-04 | West Midlands | 4%    | Pro-Leave
7   | Banff and Buchan          | CON-09 | NAT-00     | CON-10 | Scotland      | 4%    | Pro-Union
8   | Wolverhampton South West  | LAB-05 | LAB-03     | CON-04 | West Midlands | 4%    | Pro-Leave
9   | Vale of Clwyd             | LAB-06 | LAB-04     | CON-05 | Wales         | 4%    | Welsh anti-Labour
10  | Colne Valley              | LAB-02 | LAB-00     | CON-08 | Yorks/Humber  | 4%    | Pro-Leave
11  | Lincoln                   | LAB-03 | LAB-02     | CON-07 | East Midlands | 5%    | Pro-Leave
12  | Stoke-on-Trent Central    | LAB-12 | LAB-07     | CON-02 | West Midlands | 5%    | Pro-Leave
13  | Ynys Mon                  | LAB-14 | LAB-05     | CON-05 | Wales         | 5%    | Welsh anti-Labour
14  | High Peak                 | LAB-04 | LAB-09     | CON-01 | East Midlands | 6%    | Pro-Leave
15  | Cheltenham                | CON-05 | LIB-02     | CON-02 | South West    | 7%    | Pro-Leave
16  | Norfolk North             | LIB-07 | LIB-01     | CON-28 | Anglia        | 15%   | Incumbent steps down
17  | Bedford                   | LAB-02 | CON-02     | LAB-00 | Anglia        | 1%    | Marginal
18  | Coventry South            | LAB-17 | CON-02     | LAB-01 | West Midlands | 2%    | Marginal
19  | Alyn and Deeside          | LAB-12 | CON-02     | LAB-00 | Wales         | 2%    | Marginal
20  | Coventry North West       | LAB-17 | CON-03     | LAB-00 | West Midlands | 3%    | Pro-Remain
21  | Hemsworth                 | LAB-22 | CON-01     | LAB-03 | Yorks/Humber  | 3%    | Anti-Con
22  | Sheffield Hallam          | LAB-04 | LIB-06     | LAB-01 | Yorks/Humber  | 4%    | Lingering Anti-Lib
23  | Wolverhampton South East  | LAB-23 | CON-02     | LAB-04 | West Midlands | 4%    | Pro-Remain
24  | Bradford South            | LAB-16 | CON-04     | LAB-06 | Yorks/Humber  | 5%    | Anti-Con
25  | Hartlepool                | LAB-18 | CON-04     | LAB-09 | North East    | 7%    | Strong Brexit candidate splits Leave vote
26  | Putney                    | CON-03 | CON-09     | LAB-09 | London        | 9%    | Surprise Labour win
27  | Dagenham and Rainham      | LAB-10 | CON-16     | LAB-01 | London        | 10%   | Leave-ish incumbent holds on
28  | Portsmouth South          | LAB-03 | CON-03     | LAB-11 | South East    | 12%   | Pro-Remain
29  | Fife North East           | NAT-00 | NAT-04     | LIB-03 | Scotland      | 8%    | Pro-Union
30  | St Albans                 | CON-11 | CON-07     | LIB-11 | Anglia        | 14%   | Pro-Remain surprise
31  | Ceredigion                | NAT-00 | CON-03     | NAT-16 | Wales         | 2%    | Underestimated PC strength
32  | Arfon                     | NAT-00 | LAB-04     | NAT-10 | Wales         | 3%    | Underestimated PC strength
33  | Dunbartonshire East       | LIB-10 | LIB-06     | NAT-00 | Scotland      | 4%    | Surprise decapitation
34  | Kirkcaldy and Cowdenbeath | LAB-01 | LAB-28     | NAT-03 | Scotland      | 11%   | Suspended SNP candidate wins anyway

[Note: this table uses the Slide-O-Meter notation "CON-03" to mean a Conservative majority of 3%. Majorities are rounded to the nearest integer percentage, so "CON-00" means a majority of less than 0.5%.]
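For illustration only, a hypothetical helper (not part of the Electoral Calculus code) can produce this notation from a percentage majority:

```python
# Format a majority in the Slide-O-Meter style described above,
# rounding to the nearest integer percentage.

def slide_o_meter(party: str, majority_pc: float) -> str:
    return f"{party}-{int(majority_pc + 0.5):02d}"   # round half up, zero-padded

print(slide_o_meter("CON", 3.2))   # CON-03
print(slide_o_meter("LAB", 0.4))   # LAB-00 (majority under 0.5pc shows as 00)
```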

There are a number of stories here. In outline they are:
  1. Genuine marginals which could have gone either way
  2. Pro-Union tactical voting against the SNP in Scotland
  3. An anti-Labour swing in Wales
  4. Pro-Leave swings against Labour in the Midlands and the North
  5. Pro-Remain swings in some urban and southern seats
  6. A handful of seat-specific surprises, such as Putney, Norfolk North, and Kirkcaldy and Cowdenbeath

You can also see the seat errors graphically. The graphic below is a "political triangle" which maps Conservative, Labour and Lib Dem support onto a triangle. Each party has its own corner, and a seat sits closer to a party's corner the stronger that party is there. A party should win a seat if it falls inside the kite shape indicated by the internal dotted lines. (See seat migration for more details.)

Each seat is represented by a line and a blob. The line starts from the predicted outcome for the seat (using the regression model and the actual national vote shares) and finishes at the actual outcome for the seat, indicated by the blob. The colour of the line indicates the predicted winner of the seat (blue for Conservative, red for Labour, orange for Lib Dem, green for Green, and yellow for SNP/Plaid). The colour of the blob indicates the actual winner of the seat.

You can hover over a blob to see the name of that seat. You can also press the button to toggle between viewing all seats and only those seats which were incorrectly predicted.
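For readers who want to reproduce the mapping, the sketch below shows one way to place a seat inside such a triangle using barycentric coordinates. The corner positions are arbitrary choices for illustration and may differ from the layout of the chart itself.

```python
# Sketch of a "political triangle" mapping using barycentric coordinates.

import numpy as np

CORNERS = {
    "CON": np.array([1.0, 0.0]),              # Conservative corner (bottom right)
    "LAB": np.array([0.0, 0.0]),              # Labour corner (bottom left)
    "LIB": np.array([0.5, np.sqrt(3) / 2]),   # Lib Dem corner (top)
}

def triangle_point(con, lab, lib):
    """Normalise the three-party shares and return the seat's 2D position."""
    total = con + lab + lib
    weights = {"CON": con / total, "LAB": lab / total, "LIB": lib / total}
    return sum(weights[p] * CORNERS[p] for p in CORNERS)

# Example: a seat where the three-party split is CON 45%, LAB 35%, LIB 15%.
print(triangle_point(0.45, 0.35, 0.15))
```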

Outside Scotland and Wales, most of the incorrect seats were around the Labour/Conservative marginal border. Some of the strange seats, such as Putney, Portsmouth South and Hartlepool, are clearly visible as outliers on the "incorrect seats" view.

One clear pattern is that many seats near the Conservatives' border with the Lib Dems moved more towards the Lib Dems than expected. This suggests that anti-Conservative tactical voting was taking place, though there was not quite enough of it to win many seats, other than St Albans.

In the Labour corner, there was another pattern of strong Labour seats being even stronger than expected. This was particularly noticeable in Labour cities, such as Liverpool, Manchester and parts of London. The Labour party platform and leadership appealed to voters in those areas, while repelling more moderate voters elsewhere.

In the Conservative corner, there were a few seats which were even more Conservative than predicted. These were often in Essex and other Leaver strongholds, plus some home counties seats.

Summary and Conclusions

The main points shown by this analysis are that the overall performance of the prediction was good, and that the performance of the new regression-based model itself was very good.