House bias
Over at Pollytics, the Possum is maintaining some combined polling charts to map the fortunes of Labor and the Coalition as we move inexorably to the 2010 Federal election. I suspect the Possum would argue that weighting and pooling a number of polls creates a virtual super poll, with a much larger sample size and consequently a smaller margin of error. At one level I agree with the Possum. It is not an unreasonable approach to take, and statistical meta-analyses (albeit more sophisticated ones) are widely used in medicine and education to combine the results of independent statistical studies. I used pooled polls on Oz Politics in the lead-up to the 2007 election.
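To put rough numbers on the "super poll" argument, here is a minimal sketch of the textbook margin-of-error arithmetic. It assumes simple random sampling and a worst-case proportion of 0.5; the sample sizes are hypothetical and are not taken from any actual poll.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes for three polls (illustrative only, not real data)
sample_sizes = [1000, 1200, 1400]
p = 0.5  # worst-case proportion, which gives the widest margin

individual = [margin_of_error(p, n) for n in sample_sizes]
pooled = margin_of_error(p, sum(sample_sizes))

for n, moe in zip(sample_sizes, individual):
    print(f"n={n}: margin of error about {100 * moe:.1f} points")
print(f"pooled n={sum(sample_sizes)}: margin of error about {100 * pooled:.1f} points")
```

On these assumptions the pooled margin is roughly half that of a single poll. Note that the formula only measures random sampling error.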
Nonetheless, there is a very big caution that goes with this approach. Systemic biases are not controlled when individual surveys are combined. As a result, the appearance of improved accuracy from pooling may be misleading. For example, if one pollster regularly and systemically over-states the Coalition’s two-party preferred vote, adding that pollster into a pooled poll detracts from the accuracy of the pooled poll.
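The caution can be sketched with a little arithmetic. The example below assumes three hypothetical pollsters, one of which carries a made-up house bias of 3 points, and treats the pooled bias as the sample-size-weighted average of the house biases. The figures are illustrative assumptions, not estimates for any real pollster.

```python
import math

def pooled_error_budget(house_biases, sample_sizes, p=0.5, z=1.96):
    """Return (pooled bias, pooled margin of error), both in percentage points.

    The pooled bias is the sample-size-weighted mean of the house biases;
    the margin of error is the usual simple-random-sampling figure.
    """
    total = sum(sample_sizes)
    pooled_bias = sum(b * n for b, n in zip(house_biases, sample_sizes)) / total
    pooled_moe = 100 * z * math.sqrt(p * (1 - p) / total)
    return pooled_bias, pooled_moe

# Hypothetical house biases (percentage points) and sample sizes per polling round
house_biases = [0.0, 0.0, 3.0]
sizes_per_round = [1000, 1000, 1000]

for rounds in (1, 4, 16):
    bias, moe = pooled_error_budget(house_biases * rounds, sizes_per_round * rounds)
    print(f"{rounds:>2} rounds: pooled bias {bias:+.2f} pts, margin of error ±{moe:.2f} pts")
```

Pooling more rounds shrinks the nominal margin of error, but the pooled bias stays where it was. Eventually the bias is larger than the margin of error that is supposed to bound the estimate.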
This brings us to a key question: Are there significant systemic biases that would invalidate the potential for improved accuracy from pooling opinion polls? In my view, the answer is unequivocal: Yes! And because there are systemic biases between pollsters, pooled opinion polling is problematic. But before we get to the meaning of problematic, let’s look at the evidence for systemic biases.
Simon Jackman studied the house bias of the polling companies in the lead-up to the 2004 election. His work is published as “Pooling the Polls Over an Election Campaign”, Australian Journal of Political Science, 2005, V40(4):499-517. There is a related slide presentation here. Jackman found that in the four months prior to the 2004 election, Morgan typically under-estimated Coalition support by 4.7 percentage points. In the lead-up to the 2004 election, Newspoll was estimated to have a 2.7 percentage point bias towards Labor. Galaxy and Nielsen were estimated to have negligible biases (in the sense that their bias parameters cannot be distinguished from zero).
If the polling companies did not suffer from systemic bias, you would expect their polls to track close together, and to be above one another as often as they were below. Yet, if we look at the next three graphs from the archives at Oz Politics, you can see that over time Newspoll consistently tracks to the political right of Morgan. The track for ACNielsen is a little more interesting to examine and consider; I won’t go into it here.
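The "above as often as below" expectation can be checked with a simple exact sign test. The sketch below is one way of doing it and is not the method Jackman used. The month counts are hypothetical, and the test assumes the monthly comparisons are independent.

```python
from math import comb

def sign_test_p_value(above, below):
    """Two-sided exact sign test.

    Returns the probability of a split at least this lopsided if each
    comparison were equally likely to fall either way.
    """
    n = above + below
    k = max(above, below)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical: pollster A sits to the right of pollster B in 18 of 20 months
p_value = sign_test_p_value(above=18, below=2)
print(f"p-value: {p_value:.4f}")
```

A very small p-value says a split that lopsided is unlikely to be chance alone. It shows that two pollsters differ from each other, not which of them is closer to the population.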



Over recent elections, Morgan appears to have had a systemic, left-leaning bias in comparison with the other pollsters, and in comparison with the final election results. In 2008, Morgan’s relative-left bias remains in respect of the other pollsters. However, with the departure of John Howard it is a little more difficult to assert with confidence that Morgan has a left-leaning bias compared to the general population. (For example, it is possible to hypothesise that Morgan’s bias stemmed from a social undesirability associated with revealing a voting preference for John Howard. If this hypothesis is true, something I can neither prove nor disprove, then the recent polling data may indicate a systemic right-leaning bias in Newspoll and ACNielsen over the first nine months of 2008.)
One last thing I need to clear up: the systemic house biases we are talking about are not intended by the pollsters. I believe every pollster is more interested in getting an accurate population prediction than spinning for one political party or another. The biases are the result of systemic factors. They come about from the myriad little differences in the way the pollsters do their job. There are many possible sources of systemic biases: the way in which the sample is selected; the use of phone interviews versus face-to-face interviews; the order in which questions are asked; the way in which questions are asked; the time of day and the day of the week when surveys are conducted; the post-interview weighting of responses from cohort slices; and so on. They are insidiously difficult to identify and root out from the survey process.
Conclusions: We cannot conclusively unravel which pollster is biased in respect of the general population as we head for the 2010 election. But we can conclude that significant systemic house biases exist between the pollsters, and that these biases are of a magnitude that would invalidate any improvements in accuracy that might otherwise come from pooled opinion polling.
Where does this leave us? I suspect it means that while the shape and trend direction of the pooled polls are interesting, the nominal prediction is nowhere near as reliable as the pooled margin of error would suggest. Furthermore, the pooled poll may well be less accurate than at least one of the individual polls contributing to the pool. Depending on your political persuasion, pooled polling may offer false succour or cause excessive despondency.
Personally, I find the smoothed trends for the individual pollsters more informative than a weighted combined poll. With an understanding of the systemic differences between the pollsters, and their performance at previous elections, I can make an informed judgment on the actual population parameter in respect of voting intentions.