HARPER DEFEATS CENSUS

Harper spokespeople argue that sending the voluntary census long form to a larger number of people will compensate for any loss of data quality due to the newly voluntary nature of the form. Milan Ilnyckyj explains the fallacy.

One of the biggest challenges in statistics is collecting a representative sample: finding a subset of the population that will do a good job of approximating the whole group. When a dataset contains a lot of sampling bias and is not reflective of the general population, it is essentially worthless as a guide. That cannot be fixed by using a larger sample size, nor can it be dealt with via fancy mathematics.

The classic example of sampling bias is the ‘Dewey Defeats Truman’ headline, from the Chicago Tribune in 1948. As the story is usually told, the newspaper got its prediction wrong because it sampled people with telephones, at a time when telephones were comparatively rare. Most of the people who had them were rich, and rich people were more supportive of Dewey. As a consequence, telephone polling provided bad information about the likely voting behaviour of the whole population.
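For readers who would rather see this than take it on faith, here is a rough simulation of the telephone story. Every number in it is invented for illustration (the share of voters with telephones, and the level of support in each group); none of it is 1948 polling data, and the function name `poll` is just a label chosen for this sketch. It compares a poll that only reaches telephone owners with one that samples everybody at random, at several sample sizes.

```python
import random

# All numbers below are made up for illustration; they are not 1948 data.
PHONE_SHARE = 0.30        # assumed share of voters who own a telephone
SUPPORT_PHONE = 0.65      # assumed support for Dewey among telephone owners
SUPPORT_NO_PHONE = 0.40   # assumed support for Dewey among everyone else

# True support in the whole population, weighted by group size.
TRUE_SUPPORT = PHONE_SHARE * SUPPORT_PHONE + (1 - PHONE_SHARE) * SUPPORT_NO_PHONE


def poll(n, phone_only, rng):
    """Return the share of n simulated respondents who back Dewey.

    phone_only=True mimics a poll that can only reach telephone owners;
    phone_only=False draws respondents at random from the whole population.
    """
    votes = 0
    for _ in range(n):
        has_phone = True if phone_only else (rng.random() < PHONE_SHARE)
        p = SUPPORT_PHONE if has_phone else SUPPORT_NO_PHONE
        votes += rng.random() < p
    return votes / n


if __name__ == "__main__":
    rng = random.Random(1948)
    print(f"true support: {TRUE_SUPPORT:.3f}")
    for n in (100, 10_000, 200_000):
        biased = poll(n, True, rng)
        fair = poll(n, False, rng)
        print(f"n={n}: phone-only poll {biased:.3f}, random poll {fair:.3f}")
```

Under these assumed numbers, the random poll should settle closer to the true figure as the sample grows, while the phone-only poll settles just as confidently on the wrong one. A bigger biased sample only gives a more precise estimate of the wrong group.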

While we are on the subject of the census fiasco: Jim Brown, guest host of CBC Radio’s The Current, was uncharacteristically ill-prepared this morning for his interview with Conservative sock-puppet Tim Powers. He let Powers float unchallenged from one specious talking point to another, even letting him equate the supposed intrusiveness of a standard census question about the number of bedrooms in a respondent’s house with Pierre Trudeau’s decision to repeal laws outlawing private homosexual acts. If you’re going to guest host a national show, you need a passing familiarity with recent Canadian history, and you need to bone up on the issues of the day. Brown is usually better than this.