Data Democratisation and the Trouble with Rolling Averages
21/09/2017


I was watching a webinar the other day with various luminaries adding weight to a subtle sales pitch. Two of them were talking about future trends – one claiming data democratisation (Google it!) was spreading like wildfire and was to be encouraged, while the other spoke about it being a myth. I’m with the latter – and much more in favour of information democratisation. Letting every person and their dog have access to any data so they can interpret it any which way they like – in my view that’s a recipe for much activity, confusion and wasted effort…
Let’s take just one of many examples – the application of moving average charts to data. They are often used to look for trends by cutting down the amount of “noise” (uppy-downy movements over time) in the data. Investors, for example, use them to smooth out the ups and downs in trading to see where a certain stock is generally heading – and good luck to them! Those trained in statistics may use these kinds of charts to smooth out seasonality so they can see any signals where data is breaking out of that seasonal pattern.
So what is the effect when we look at some of these charts?
Here’s a 12-month moving average chart of some data (out of context to demonstrate the point):

So we’ve got a roughly constant set of results from July 2013 to June 2014 – nothing much going on.
But putting this in context we see a different picture – one that we might want to take action on:

So, in context we see there is a strong seasonal pattern, with high volumes in July and August and low volumes in February! The Moving Average Chart above would look exactly the same if the underlying data were flat rather than seasonal. So the Moving Average loses information about what is actually going on! It does not help in taking corrective action. The extended-SPC chart (showing the seasonal pattern) gives officers the opportunity to focus their efforts when incident volume is high in July and August in order to have maximum impact.
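To make the point concrete, here is a minimal Python sketch. The monthly counts are invented for illustration (they are not the data behind the charts above), and `moving_average` is a hypothetical helper written for this example. It compares a seasonal series, peaking in July and August and dipping in February, with a completely flat series that has the same yearly total.

```python
def moving_average(values, window=12):
    """Trailing moving average: the mean of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Invented monthly counts for one year, January to December:
# high in July and August, low in February. They sum to 1200.
seasonal_year = [90, 70, 85, 95, 100, 110, 140, 140, 105, 95, 90, 80]

seasonal = seasonal_year * 3   # three years of the same seasonal pattern
flat = [100] * 36              # three years with no pattern at all

ma_seasonal = moving_average(seasonal)
ma_flat = moving_average(flat)

# Every 12-month window contains each calendar month exactly once,
# so the seasonal peaks and troughs always cancel out.
print(ma_seasonal == ma_flat)          # True
print(max(seasonal) - min(seasonal))   # 70: the swing the moving average hides
```

Any series that repeats every 12 months gives a dead-flat 12-month moving average, so the chart cannot tell these two very different situations apart.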
Here’s another example:

So now we have a signal in February 2014 – something must be special about February 2014. “Find out what happened in February 2014” rattles down from the corridors of power.
But wait up! When we look at the data in context, we see a very different picture:

So actually, there’s nothing special about February 2014 at all! February is just part of a run of low results since June 2013 (when earlier corrective action was taken by the force to reduce shoplifting). The February 2014 signal in the Moving Average Chart above is actually the result of the previous high values for months up to February 2014 dropping out of the 12-month moving average calculation! So Moving Average Charts can be misleading when taken out of context – which they usually are!
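The delayed signal can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the two levels (100 and 70), the alert threshold of 80, and the helper names are all invented, and the threshold was picked so that the flag lands on February 2014 as in the story above. The raw series drops once, in June 2013, and then stays put.

```python
def moving_average(values, window=12):
    """Trailing moving average: the mean of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

def month_labels(start_year, start_month, count):
    """Build 'YYYY-MM' labels for `count` consecutive months."""
    labels = []
    for k in range(count):
        total = (start_month - 1) + k
        labels.append(f"{start_year + total // 12}-{total % 12 + 1:02d}")
    return labels

# Invented data: 24 months at 100 (Jun 2011 to May 2013),
# then 12 months at 70 (Jun 2013 to May 2014).
raw = [100] * 24 + [70] * 12
labels = month_labels(2011, 6, len(raw))

ma = moving_average(raw)
ma_labels = labels[11:]   # the first average covers the first 12 months

# The month in which the raw data actually changed level.
raw_change_month = next(
    labels[i] for i in range(1, len(raw)) if raw[i] != raw[i - 1]
)

# A hypothetical alert rule: flag the first month the moving average dips below 80.
threshold = 80
first_flag_month = next(
    label for label, value in zip(ma_labels, ma) if value < threshold
)

print(raw_change_month)   # 2013-06
print(first_flag_month)   # 2014-02
```

In this sketch the moving average slides down by 2.5 a month as each old high value drops out of the window, so the rule fires eight months after the only real change, in a month when nothing happened at all.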
Lessons Learned:
- Don’t use Moving Average Charts out of context
- Don’t use Moving Average Charts to look for signs that corrective action has had an impact, or for opportunities to apply corrective action
- Recognise that processing any set of raw data potentially removes information from that data.
- Apply Moving Average Charts only when appropriate.
Or alternatively, you could standardise on the one Dilbert chart above – it might be just as useful!
