Good data analysis

Teacher unions fear assessment data being released because they worry about league tables published by the media. And look, I agree that a league table which takes account of no control factors is not very helpful.

But what has excited me about the data being released is what some real data experts can do with it. (Talking of data experts, the Herald editorial moaned that "A high priesthood of data analysis bemoans news media interest", which has caused me to label Keith Ng as Cardinal Keith!) An example is Luis Apiolaza at Quantum Forest. First he did the standard average proportion of students meeting the reading standard at each decile.

So you look at that and think, wow, it is all about decile. But he then looked at the variance within each decile, not just the average.

The box shows the middle 50% of schools for each decile, and the lines extend up to 1.5 times that interquartile range beyond the box. What this shows is that while the lower decile schools may have a lower average, they have much more variability. This is good, because it dispels some of the myth that all low decile schools have few students meeting the national standard.
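For anyone wanting to try this sort of thing themselves, here is a rough sketch in Python of the numbers behind a box plot. To be clear, this is not Luis's code, and the proportions below are made up purely for illustration; they are not the real national standards data. The names `schools_by_decile` and `box_summary` are my own inventions.

```python
from statistics import mean, quantiles

# Hypothetical, made-up numbers for illustration only: the proportion of
# students meeting the reading standard at each school, grouped by decile.
schools_by_decile = {
    1: [0.35, 0.50, 0.62, 0.71, 0.88],
    5: [0.68, 0.72, 0.75, 0.79, 0.83],
    10: [0.84, 0.86, 0.88, 0.90, 0.92],
}


def box_summary(values):
    """Return the mean plus the numbers a standard box plot draws."""
    # Quartiles: the box runs from q1 to q3, i.e. the middle 50% of schools.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers reach the most extreme schools within 1.5 * IQR of the box.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "mean": mean(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


summaries = {d: box_summary(v) for d, v in schools_by_decile.items()}
for decile, s in summaries.items():
    print(f"decile {decile}: mean={s['mean']:.2f} IQR={s['iqr']:.2f}")
```

With made-up numbers like these, the low decile group has both a lower mean and a much wider box, which is the pattern described above. With the real data you would load one row per school instead of typing the values in.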

What I think this data may allow is a closer look at that huge variance in decile 1 to 3 schools, to identify the factors that put some schools higher than others. Eric Crampton has tweeted that he has started some analysis and that ethnicity is a big factor.

Of course this is all just a snapshot of data, and the lack of moderation means you should take care with it (but that does not make the data useless by any means). The real value will come over time, as we get trend information.

I suspect there are going to be a lot of sites doing their own analysis of the data.
