Archive

Posts Tagged ‘Simpson’s Paradox’

A Tale of Two Graphs

May 3, 2016

For the last two years I’ve collected some amazingly bad graphs from election material, both leaflets that have come through my door and ones that other people have sent me (see this and this). My own MP provided so many gems that Colin Beveridge (@icecolbeveridge) started an Internet campaign to have people refer to these misleading election graphs as “Mulhollands”, after the man himself. This led to Colin and others tweeting him questions about his misleading graphs, and to one teacher, Adam Creen (@adamcreen), tweeting him corrected “Mulhollands” that his Y9 class had completed.

This was obviously effective: I have scoured every piece of campaign literature that has arrived since, and there has only been one chart across all of it, and that was correctly drawn! A success! Bizarrely, the success seems to have been widespread: I have had many other folks looking out for them, and I’ve not come across any hideously inaccurate graphs this year.

I did think that this election season would pass without any mention on this blog, but today I came across two interesting graphs from a neighbouring ward. Both are accurate, but they tell very different stories, and they reminded me a little of Simpson’s Paradox, without actually being directly related to it.

Exhibit A

image

This graph is from the Lib Dems in Horsforth ward for the Leeds City Council elections. The bars are not wholly accurate, but they are near enough not to irk me too much. It shows that of the 5 previous local elections in the area the Lib Dems have won 3 and the Conservatives have won 2. They are using this to sell the idea that only they or the Conservatives can win the seat. However...

Exhibit B

image

This is from a Labour party leaflet in the very same ward. It shows that in the last local election the Labour candidate came second to the Conservative candidate and that the Lib Dem finished in last place. The implication here is that Labour are more likely to beat the Conservatives, as they came second last time.

Both leaflets are presenting true facts selected to further their narrative, and both are presenting them accurately, although one could argue they are both a little misleading.

I’ve looked at the stats from the last few years. It seems that in general election years the Tories win by a fair way, but that in local election years it is tight between all three parties, although the Lib Dem vote has been steadily dropping. It could be an interesting ward for judging the national feeling when the results come in on Friday, as it is really a three-way marginal in non-GE years.

The anomaly that is the general election year is interesting. More people do vote nationally when there is a GE, but the massive swing to the Tories is fairly unusual, as they tend to be good at mobilising the vote. I do know that the Lib Dems in that constituency didn’t really campaign during the GE, when the seat was a Tory-Labour marginal, and that in a neighbouring Lib Dem-Labour marginal the Conservatives didn’t campaign, so perhaps this had an effect.

If you have found any terrible election graphs, please send me them!

False Variables and Simpson’s Paradox

February 5, 2014

Last weekend I attended a day of lectures as part of my MA course. The focus of the day was on barriers to learning and it was quite intensive. Part of the day involved looking at the statistics behind various things and seeing how they related to the development of children. The lecturer mentioned the idea that a false variable can skew one’s ideas, and can make it look like something is having an effect when in reality it is something else.

This idea of false variables is one that has been “following” me around recently. The first book I read this year was “The Simpsons and their Mathematical Secrets” by Simon Singh. In the book he discusses “Simpson’s Paradox”. The example he uses is the vote in the US Congress on the Civil Rights Act of 1964. In the north, 94% of Democrats voted for the act compared to 85% of Republicans. In the south, 7% of Democrats voted for it, and 0% of Republicans did. However, overall 80% of Republicans voted for the act, compared to 61% of Democrats. The reversal arises because a far larger share of the Democrats than of the Republicans held southern seats, where support was very low. This example is great for showing Simpson’s Paradox and really emphasises the fact that stats can be deceptive. The worrying thing is that these stats can be used to show either that a higher proportion of Democrats supported the bill in both the north and the south, or that a higher proportion of Republicans supported it overall. Both sides can legitimately make these claims, and hence really confuse the electorate. The fact of the matter is that the real variable was region: feelings towards the bill differed largely because of attitudes in the north versus attitudes in the south, rather than because of political allegiance.
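The reversal can be checked with a few lines of Python. This is only a sketch: the post quotes percentages, so the raw counts below are an assumption, taken from the commonly cited House roll-call tallies for the vote, and the names `votes`, `percent` and `overall` are purely illustrative.

```python
# (yes votes, total votes cast) by region and party.
# ASSUMPTION: these raw counts are the commonly cited House tallies;
# the post itself only gives the percentages.
votes = {
    "north": {"democrat": (145, 154), "republican": (138, 162)},
    "south": {"democrat": (7, 94), "republican": (0, 10)},
}


def percent(yes, total):
    """Share voting yes, rounded to the nearest whole percent."""
    return round(100 * yes / total)


def overall(party):
    """Pool the regions for one party and return its overall yes percentage."""
    yes = sum(votes[region][party][0] for region in votes)
    total = sum(votes[region][party][1] for region in votes)
    return percent(yes, total)


# Within each region the Democrats have the higher percentage...
for region, parties in votes.items():
    for party, (yes, total) in parties.items():
        print(region, party, percent(yes, total))

# ...but pooled across regions the Republicans do.
for party in ("democrat", "republican"):
    print("overall", party, overall(party))
```

The pooled figure is a weighted average of the regional ones, and the weights differ wildly between the parties: under these counts, 94 of the 248 Democrats voting were southern, against only 10 of the 172 Republicans.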

Simpson’s Paradox also appeared at school recently. A Teach Firster in our department was planning a lesson on probability and asked me if I knew “that thing where you have a higher probability of picking one colour in each bag of balls, but if you put them all into the same bag you get a higher probability of the other.” This produced a rather interesting discussion around Simpson’s Paradox. No one else in the department was familiar with it, and they all found it pretty interesting. We both then included it in our lessons. The question was about bins with coloured counters in them. It showed that you had a higher probability of picking a black counter from the blue bin than from the red bin in each of two cases, but if you pooled the counters from both cases, so that there was one combined blue bin and one combined red bin, the higher probability came from the red bin.
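For anyone wanting to build a similar question, here is a minimal sketch in Python. The counter counts are hypothetical (they are not the ones from our lesson), chosen so that the blue bin wins in each case but the red bin wins once the cases are pooled; `cases`, `p_black` and `pooled` are made-up names.

```python
from fractions import Fraction

# (black counters, total counters) in each bin.
# HYPOTHETICAL counts, chosen only to illustrate the reversal.
cases = {
    "case 1": {"blue": (5, 11), "red": (3, 7)},
    "case 2": {"blue": (6, 9), "red": (9, 14)},
}


def p_black(black, total):
    """Exact probability of drawing a black counter from one bin."""
    return Fraction(black, total)


def pooled(colour):
    """Tip both bins of one colour together and return P(black)."""
    black = sum(case[colour][0] for case in cases.values())
    total = sum(case[colour][1] for case in cases.values())
    return Fraction(black, total)


# Case by case, the blue bin gives the better chance of black.
for name, bins in cases.items():
    print(name, "blue:", p_black(*bins["blue"]), "red:", p_black(*bins["red"]))

# Pooled, the red bin gives the better chance.
print("pooled blue:", pooled("blue"), "pooled red:", pooled("red"))
```

Using `Fraction` keeps the comparisons exact, which is handy in class because pupils can check each inequality by cross-multiplying rather than trusting rounded decimals.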

The example of this false variable situation given in our lecture was that of breast feeding. The stats suggest that breast feeding is associated with better academic achievement for the pupil. But if you drill down into the stats you see that there is a far higher proportion of breast feeding mothers in the “middle class” than in the “working class”, and that academic achievement may be more down to socio-economic status than to the breast feeding itself. This could be due to a plethora of reasons, which may include: a higher level of education among the parents, enabling them to provide more support for learning at home; a higher household income, which may pay for private tuition if a child is falling behind; or even that more working class families are reliant on shift work, longer days and multiple jobs, leaving them less time to spend with their children to aid their development. This is clearly a complex issue, and it highlights the fact that when reading anything that includes statistics you have to ask yourself, “does the author have an agenda, and are they twisting the facts to suit it?”
