## 100% Chance

*“100% Chance of getting a safety car.”*

That line was repeated numerous times in the build-up to today’s Singapore Grand Prix and in the early stages of the race. It was repeated by various pundits and commentators and it has made my blood boil.

**THERE IS NOT A 100% CHANCE OF A SAFETY CAR**

That should be enough, to be honest. It’s not certain that it will happen; there is a chance that it might be avoided. The problem stems from the confusion between relative frequency and probability.

Relative frequency IS a good proxy for probability, but it isn’t always exactly the same thing. The relative frequency of a safety car being needed at Singapore was, and still is, 100%, because there has been one at every race ever held here. But the sample size is tiny (9 races), and that isn’t big enough to treat the relative frequency as the true probability. Please, Sky Sports, sort your maths out.
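To put rough numbers on that, here is a minimal sketch of two standard ways to hedge a 9-out-of-9 record. Both are my own illustrations rather than anything the pundits used: Laplace's rule of succession assumes a uniform prior on the unknown probability, and the lower bound assumes every race is an independent trial with the same chance of a safety car. The function names are made up for this example.

```python
def rule_of_succession(successes: int, trials: int) -> float:
    """Laplace's estimate of the chance of success on the next trial,
    assuming a uniform prior on the unknown probability."""
    return (successes + 1) / (trials + 2)


def lower_bound_all_successes(trials: int, confidence: float = 0.95) -> float:
    """Smallest probability p still consistent, at the given confidence,
    with `trials` successes in a row: solves p ** trials = 1 - confidence."""
    return (1 - confidence) ** (1 / trials)


races = 9             # races held at Singapore, from the post
safety_car_races = 9  # every one of them had a safety car

relative_frequency = safety_car_races / races
laplace_estimate = rule_of_succession(safety_car_races, races)
lower_bound = lower_bound_all_successes(races)

print(f"relative frequency: {relative_frequency:.3f}")
print(f"rule of succession: {laplace_estimate:.3f}")
print(f"95% lower bound:    {lower_bound:.3f}")
```

Under those assumptions the same nine races are compatible with a true probability well short of certainty, which is the whole point.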

## Stem and Leaf – is there a point?

Stem and leaf diagrams, or “Those leafy stem things”, as one of my former pupils used to call them, have long been an annoyance of mine. I’d never heard of them until I was brushing up on the GCSE syllabus ahead of my PGCE and when I did come across them I couldn’t see anything that they brought to the party that couldn’t better be shown using alternative methods.

You can imagine my feelings then as the KS3, 4 and now 5 curricula jettisoned them, meaning the end was in sight for the need to teach them. I let my feelings on this be known in my recent post around the new A level curriculum and this led to further discussion around them on Twitter. Then Jo Morgan (@mathsjem) wrote this fantastic piece which supports their place in a classroom and gives some great activities to use in teaching them.

It got me thinking: are my feelings unfounded? Should I be writing off stem and leaf diagrams? I’ve long been an advocate of maths for maths’ sake (see this defence of circle theorems for one example), so why is this feeling not the same for stem and leaf?

Perhaps it’s that it falls under the banner of “stats”, a very applied area of maths. This suggests that there should be an application associated with it. The use mentioned in Jo’s blog for bus and train timetables is the best example I’ve seen, but I think a normal timetable will be easier to read for the majority of people, as the majority of folk aren’t familiar with stem and leaf. Hannah (@missradders) suggested that they were used a lot in baseball, but I can’t see any reason that they would be better than a bar chart or a boxplot.

Colin Wright (@ColinTheMathmo) suggested during the Twitter discussion that they could be used to build understanding around data, even though they are no use for any real data sets, which would be far too big. Jo also uses this idea in her defence, saying they could provide a good introduction to the ideas of skew, quartiles and outliers. I can see this argument, but I still think there are better, more visual and less convoluted ways to introduce these to pupils, such as the aforementioned bar charts and box plots, along with scattergraphs and a host of other data presentation methods (but not pie charts, they’re just as bad, if not worse! But that’s a topic for another day!).
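For anyone who has never met one, here is a minimal sketch of how a stem and leaf diagram is built, using made-up test scores. The function name and the data are hypothetical, and it only handles non-negative whole numbers, with the last digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem and leaf diagram for non-negative integers.
    The stem is the value without its last digit; the leaf is the last digit."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    rows = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}".rstrip())
    return rows


scores = [23, 12, 38, 21, 56, 15, 30, 27, 34, 23]  # hypothetical test scores
rows = stem_and_leaf(scores)
print("\n".join(rows))
```

The shape of the rows is essentially a sideways bar chart that keeps every data point, which is the "build understanding" argument in a nutshell; the lone score in the fifties stands out as a possible outlier.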

*I really enjoyed Jo’s post; if you haven’t read it, I would advise you do. It made me think and look hard at my views. In the end though, I still see no need for stem and leaf diagrams and will be glad to see the back of them. If you have opinions either way I would love to hear them, especially if you have further real life uses!*

## What does it mean?

Today my year 11s were busy revising ahead of tomorrow’s mock exam and one of them started singing the averages song. You know the one:

*“Mean is average, mean is average, mode is most, mode is most, median’s the middle, median’s the middle, range high low, range high low.”*

This got me thinking about the words we use. I’ve always disliked this song as a mnemonic as it encourages people to think of the mean as the “average” when actually the mode and the median are also averages. The median in particular is a very useful one and we need pupils to understand the distinction. I have been very impressed in recent staff meetings to hear the principal, an English teacher by trade, use the term “national median” rather than “national average”!
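A quick sketch of why the distinction matters, using Python's built-in `statistics` module and a made-up, deliberately skewed set of salaries (the figures are hypothetical):

```python
from statistics import mean, median, mode

# Hypothetical salaries in thousands of pounds, with one very high earner.
salaries = [18, 20, 20, 22, 25, 27, 120]

averages = {
    "mean": mean(salaries),      # pulled upwards by the 120
    "median": median(salaries),  # the middle value
    "mode": mode(salaries),      # the most common value
}
value_range = max(salaries) - min(salaries)  # a measure of spread, not an average

for name, value in averages.items():
    print(f"{name}: {value}")
print(f"range: {value_range}")
```

Three "averages", three different answers from the same data, and only one of them is anywhere near what a typical person in the list earns.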

As I was thinking about this, though, I had the sudden realisation that I should also be feeling the same way about the term “mean”! Granted, at GCSE level we only talk about one mean, the arithmetic mean, but that doesn’t mean the geometric mean doesn’t exist. (Nor the root mean square nor harmonic mean for that matter! *Other means are available*)
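To make the point concrete, here is a small sketch comparing the four means on the same pair of numbers. The arithmetic, geometric and harmonic means come from Python's `statistics` module (`geometric_mean` needs Python 3.8 or later); `root_mean_square` is a helper I have written for this example, and the data is made up.

```python
import math
from statistics import geometric_mean, harmonic_mean, mean


def root_mean_square(values):
    """Square root of the arithmetic mean of the squares."""
    return math.sqrt(mean([v * v for v in values]))


data = [2, 8]  # made-up values

means = {
    "harmonic": harmonic_mean(data),    # 2 / (1/2 + 1/8)
    "geometric": geometric_mean(data),  # sqrt(2 * 8)
    "arithmetic": mean(data),           # (2 + 8) / 2
    "root mean square": root_mean_square(data),
}

for name, value in means.items():
    print(f"{name}: {value:.3f}")
```

For positive data the four always come out in that order, smallest to largest, and they only agree when every value is the same.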

This is a hypocrisy in the way we treat certain words. I’m not the only maths teacher who dislikes the way mean and average have become synonymous. But no one has ever mentioned that the word arithmetic is missing from the term every time we use it.

I worry that we may be setting students who go on to further study statistics up for confusion in the future by simply referring to the arithmetic mean as the mean.

Have you ever used the term arithmetic mean, or even geometric mean, with your students? Have you shared my worry? Or do you think I’m being overly pedantic and it doesn’t matter? I’d love to hear your opinion.

## Histograms

Bizarrely, I’ve never taught Histograms to a full class before. This seems strange as I’ve taught everything else, but it seems that due to schemes of work, class changes and shared classes I have never had the pleasure of teaching this topic to a whole class. I have taught it to small groups on revision days, but that is different to teaching it to a full class.

My Year 11s have now covered the entire course and we are hitting the topics they need. On the last two mocks they all did very badly on Histograms, and say that it has been a long time since they learned about them, so I have decided to cover it in the first week back. I had a look over my resources and I have very little. I thought about what I wanted them to know, and what they need to be able to reproduce in the exam. I find that there are not that many real life uses of Histograms. They can be used to look at distributions, but not much else. In fact, a friend once told me she had a background in using statistics before she went into teaching and could use that knowledge to find real world uses of everything, except for histograms.

I personally think they are quite a nice way of showing a distribution, and like the way that it links to normal and other distributions in higher level stats. But I don’t see much need for them to be on the GCSE syllabus, and I’m fairly sure that the vast majority of people won’t use them in the real world!
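For anyone revising alongside my Year 11s, the calculation that usually trips people up is frequency density: frequency divided by class width, so that the area of each bar, rather than its height, represents the frequency. A minimal sketch, with made-up class intervals and frequencies and a hypothetical function name:

```python
# Each entry is ((lower bound, upper bound), frequency); the data is made up.
grouped_data = [((0, 10), 5), ((10, 20), 12), ((20, 40), 16), ((40, 80), 8)]


def frequency_densities(grouped):
    """Frequency density = frequency / class width, for each class."""
    return [freq / (upper - lower) for (lower, upper), freq in grouped]


densities = frequency_densities(grouped_data)

for ((lower, upper), freq), density in zip(grouped_data, densities):
    width = upper - lower
    # Bar area (height * width) recovers the original frequency.
    print(f"{lower}-{upper}: frequency {freq}, width {width}, density {density}")
```

Note that the widest class has the lowest bar even though it does not have the lowest frequency, which is exactly why unequal class widths need frequency density rather than raw frequency on the vertical axis.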

I have uploaded my notebook, exported PowerPoint and exam questions sheet here, if anyone is interested – as always feedback will be gratefully received. I’m on the hunt for a bingo or a card sort on the subject, so do feel free to signpost me to a good one if you know of any! If I’ve had no luck by next week I will probably make one, so watch this space!

## False Variables and Simpson’s Paradox

Last weekend I attended a day of lectures as part of my MA course. The focus of the day was on barriers to learning and it was quite intensive. Part of the day involved looking at the statistics involved in various things and seeing how they related to the development of children. The lecturer mentioned the idea that a false variable can skew one’s ideas, and can make it look like something is having an effect when in reality it is something else.

This idea of false variables is one that has been “following” me around recently. The first book I read this year was “The Simpsons and Their Mathematical Secrets” by Simon Singh, in which he discusses “Simpson’s Paradox”. The example he uses relates to the US government vote on the Civil Rights Act of 1964. In the North, 94% of Democrats voted for the act, compared to 85% of Republicans. In the South, 7% of Democrats voted for it, and 0% of Republicans did. However, overall 80% of Republicans voted for the act, compared to 61% of Democrats. The reversal happens because far more of the Democrats represented the South, where support was low in both parties. This example is great for showing Simpson’s Paradox and really emphasises the fact that stats can be deceptive. The worrying thing is that these stats can be used to show either that a higher proportion of Democrats supported the bill in both the North and the South, or that a higher proportion of Republicans supported the bill overall. This means both sides can legitimately lay claim to being the stronger supporters, and hence really confuse the electorate. The fact of the matter is that the real variable was geography: feelings towards the bill differed largely due to attitudes in the North vs attitudes in the South, rather than political allegiance.
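The percentages above can be reproduced with a short sketch. The vote counts below are the commonly cited tallies for the House of Representatives vote; they are not given in this post, so treat them as an assumption on my part, and the function name is made up for the example.

```python
# (region, party) -> (votes in favour, total votes cast)
# Assumed counts: commonly cited House tallies, not taken from this post.
votes = {
    ("North", "Democrat"): (145, 154),
    ("North", "Republican"): (138, 162),
    ("South", "Democrat"): (7, 94),
    ("South", "Republican"): (0, 10),
}


def support_pct(party, region=None):
    """Percentage of a party voting in favour, in one region or overall."""
    in_favour = total = 0
    for (vote_region, vote_party), (yes, cast) in votes.items():
        if vote_party == party and (region is None or vote_region == region):
            in_favour += yes
            total += cast
    return round(100 * in_favour / total)


for region in ("North", "South", None):
    label = region or "Overall"
    print(label, {party: support_pct(party, region) for party in ("Democrat", "Republican")})
```

The Democrats come out ahead in each region separately and behind overall, because their total is dominated by the large Southern group where support was very low.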

Simpson’s paradox also appeared at school recently. A Teach Firster in our department was planning a lesson on probability and asked me if I knew “that thing where you have a higher probability of picking one colour in each bag of balls, but if you put them all into the same bag you get a higher probability of the other.” This produced a rather interesting discussion around Simpson’s Paradox. No one else in the department was familiar with it, and they all found it pretty interesting. We both then included it in our lessons. The question was around bins with coloured counters in them and showed that you had a higher probability of picking black counters from the blue bin in two cases, but if you combined the counters into the same bin, the higher probability came from the red bin.

The example of this false variable situation given in our lecture was that of breast feeding. The stats suggest that breast feeding equates to better academic achievement for the pupil. But if you drill down into the stats you see that there is a far higher proportion of breast feeding mothers in the “middle class” as opposed to the “working class”, and that academic achievement may be more down to socio-economic status than to the breast feeding itself. This could be due to a plethora of reasons, which may include: a higher level of education among the parents, enabling them to provide more support to learning at home; a higher income in the house, which may enable private tuition if a child is falling behind; or even that more working class families are reliant on shift work, longer days and multiple jobs, leaving them less time to spend with their children to aid their development. This is clearly a complex issue, and it highlights the fact that when reading anything that includes statistics you have to ask yourself, “does the author have an agenda, and are they twisting the facts to suit it?”

## Probability and Sex Ed

This week we had our third CT day of the year. (CT Days, or citizenship themed days, are collapsed timetable days where pupils do a range of topics linked to a theme.) I was with my coaching group and we had a great day on the topic of “personal wellbeing”.

The new year 11 (we move up year groups at spring bank) had a day on sexual education. Currently in maths they are learning about probability and one of my colleagues and I decided this was a perfect opportunity to merge the two.

We gathered some data on the probabilities of contracting STIs from an unprotected sexual encounter and they looked at the probabilities involved in contracting things after multiple encounters (here).
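The calculation behind the "multiple encounters" question is the usual complement trick: the chance of at least one transmission is one minus the chance of avoiding it every time. A minimal sketch, assuming each encounter is independent with the same risk; the per-encounter probability below is a made-up placeholder, not one of the figures we used in class.

```python
def prob_at_least_once(p_single: float, encounters: int) -> float:
    """P(at least one transmission) = 1 - P(no transmission in every encounter),
    assuming independent encounters with the same per-encounter probability."""
    return 1 - (1 - p_single) ** encounters


p_single = 0.1  # hypothetical per-encounter probability, for illustration only

for n in (1, 5, 10, 20):
    print(f"{n:>2} encounters: {prob_at_least_once(p_single, n):.3f}")
```

Even a risk that sounds small for one encounter climbs quickly as the number of encounters grows, which was the message we wanted the pupils to take away.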

We also looked at expected values: given the effectiveness of different types of contraception (from here), how many pregnancies a year would you expect if a couple who were always safe made love twice a week? The answer shocked the whole class. They were also amazed by the difference when I asked them to complete tree diagrams and work out the expected value if the couple used both condoms and the pill.
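The structure of that calculation can be sketched as follows. The failure probabilities here are hypothetical per-encounter placeholders, not the figures from the linked source, and the tree diagram step assumes the two methods fail independently of each other. The function names are my own.

```python
def expected_failures(p_fail: float, encounters: int) -> float:
    """Expected number of contraceptive failures: n * p."""
    return p_fail * encounters


def combined_failure(p_fail_a: float, p_fail_b: float) -> float:
    """The 'both fail' branch of the tree diagram, assuming independence."""
    return p_fail_a * p_fail_b


encounters_per_year = 2 * 52  # twice a week, as in the lesson

p_condom = 0.02  # hypothetical per-encounter failure probability
p_pill = 0.01    # hypothetical per-encounter failure probability

condom_only = expected_failures(p_condom, encounters_per_year)
both_methods = expected_failures(combined_failure(p_condom, p_pill), encounters_per_year)

print(f"condom only:  {condom_only:.4f} expected failures a year")
print(f"both methods: {both_methods:.4f} expected failures a year")
```

Multiplying along the branches is what makes the combined figure so much smaller than either method on its own.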

This was a much easier concept for them to relate to than picking sweets out of a bag as they could see that this was something that would affect everyone at some point in their lives. It also got across some messages that are important, especially as our school is located in an area with quite a lot of young parents.

## Statistical Deception

When teaching and talking about statistics I always emphasise the need to be careful what you believe and to always ask yourself “what agenda does the person presenting this data have?”

I’ve written before about how stats can legitimately be manipulated to serve different points of view, especially when there are false variables at work. But recently I’ve noticed a darker art in statistical manipulation, one that is, at its heart, lying.

We are less than six weeks away from local elections now, and it is becoming silly season for party political leaflets coming through our letterboxes. Now we all know that the political parties will present data in a way that makes them look better, they are trying to win your vote after all, but we would expect them not to lie, and for the data to be accurate and presented correctly. Unfortunately, however, this is not always the case:

**Exhibit A**

This popped up a number of times in my Twitter feed from a variety of sources. I believe it is from a Lib Dem leaflet in Manchester. As you can see, they have presented a bar chart with proportions labelled as percentages. The first screaming error is that the red bar and the orange bar are massively different heights, yet are both emblazoned with the label 39%. The second glaring error is that the percentages add up to more than 100%. The first implies that either the Lib Dems are deliberately trying to mislead voters into thinking they are in a stronger position in the ward than they are, or that they don’t realise that 39% is equal to 39%. I’m not sure which is worse?!

Here’s an Excel interpretation of what the graph *should* look like:

**Exhibit B**

This graph came through my door in Leeds North West parliamentary constituency. The first thing that caught my eye was that although the gap in votes between Lib Dems and Labour is almost the same as the gap between Labour and Conservative, the second gap between the bars was drawn almost 5 times as big as the first, which would imply almost five times as many fewer votes! An obvious fallacy. Either it’s a deliberate attempt to mislead, or they can’t draw a bar chart. If it’s the latter, do we want them in charge of our local authority budgets?! (Or the entire economy for that matter!!)
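A quick check for this kind of chart: in an honest bar chart the heights are proportional to the values, so the ratio between the gaps in the bars should match the ratio between the gaps in the votes. Here is a minimal sketch of that check. The vote counts and bar heights are made up to mimic the pattern described, not measured from the leaflet, and the function name is hypothetical.

```python
def gap_ratio(values):
    """For three values, the (second - third) gap divided by the
    (first - second) gap, after sorting from largest to smallest."""
    first, second, third = sorted(values, reverse=True)
    return (second - third) / (first - second)


votes = [20000, 15000, 10100]   # hypothetical vote counts
drawn_heights = [300, 250, 10]  # hypothetical bar heights in pixels

votes_ratio = gap_ratio(votes)
drawn_ratio = gap_ratio(drawn_heights)

print(f"gap ratio in the votes: {votes_ratio:.2f}")
print(f"gap ratio as drawn:     {drawn_ratio:.2f}")
print(f"exaggeration factor:    {drawn_ratio / votes_ratio:.1f}")
```

If the two ratios differ by much more than rounding, either the axis has been truncated or the bars simply don't match the numbers.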

Something else that struck me as deceiving, although this time mathematically correct at least, was the choice of data. This was a leaflet issued in the run up to a local election, and the data set used was from the last local election. Why, then, is the data for the parliamentary constituency rather than the council ward? The ward makes up around a quarter of the constituency, and the vote share in the ward is radically different to that of the constituency. The sitting councillor is Conservative and sits on a huge majority, and the Lib Dem candidate last time out came third. To issue a leaflet in the run up to a local election which implies the Conservatives can’t win in a ward where they have a large majority, and to back it up with local election data for a parliamentary constituency, is deliberately deceptive and misleading.

Here’s an Excel interpretation of what this one *should* look like:

**Exhibit C**

This one comes from “across the pond” and is another which went viral. This one seemed to appear constantly for a few days everywhere I looked. If you are still wondering what’s wrong with it, take a little look at those numbers down the left hand side…. See it? The y axis goes upwards to zero! Drew Barker (@twentythree) made this version which gives a much better picture as to what’s going on.

I can’t wait to see what my classes make of these!

NB: I haven’t “selected” these graphs as an attack on the Lib Dems, it’s just they are the only party who have sent me a leaflet with incorrect maths. I’ll gladly expose any of the parties if they do the same. I do collect these, so if you spot anything similar, do send it to me!
