We're firmly in the analytics economy
where data is considered the new oil. Like oil, data on its own is just a
natural resource and needs refining before it delivers its true value. In the
data world it's analytics that does the refining, delivering actionable insights and helping organisations become more data-driven and do things better. We see
the most successful customers really getting the most value when they also use
Earlier this month I had the honor of
chatting with Bruce Sinclair from IoT inc. on a podcast about the
importance of external data. It got me thinking that this is something we take
for granted here at Qlik, as our data analytics platform makes it insanely easy to
combine all your data.
External data has a lot of potential
in many use cases, not just within IoT. People are important
for bringing in knowledge of the business and subject matter expertise when
deciphering insights, but external data can bring context that just doesn’t exist
in your internal data.
An obvious example is weather: you would expect it to affect footfall in retail, but maybe not to drive seaside tram ticket sales on windy days. There are many types of external data that can fill
critical gaps and deliver significant impact. One UK ambulance service is using Ordnance Survey metadata to better understand not just the location of reported incidents but the surrounding details as well, so it can deliver the best possible patient care.
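To make the weather example a little more concrete, here is a minimal sketch of combining an internal dataset with an external one. Everything in it is an assumption for illustration: the figures are made up, and the names `ticket_sales`, `wind_speed_mph`, `WINDY_THRESHOLD_MPH` and `average_sales_by_wind` are hypothetical. It simply joins the two datasets on date and compares average sales on windy and calm days.

```python
from statistics import mean

# Hypothetical internal data: daily tram ticket sales keyed by ISO date.
ticket_sales = {
    "2018-07-01": 420,
    "2018-07-02": 610,
    "2018-07-03": 390,
    "2018-07-04": 655,
}

# Hypothetical external data: daily average wind speed (mph) from a weather source.
wind_speed_mph = {
    "2018-07-01": 8,
    "2018-07-02": 24,
    "2018-07-03": 6,
    "2018-07-04": 27,
    "2018-07-05": 12,  # no matching sales record, so it is never used
}

WINDY_THRESHOLD_MPH = 20  # assumed cut-off for calling a day "windy"


def average_sales_by_wind(sales, wind, threshold):
    """Join the two datasets on date and average sales for windy vs calm days."""
    windy, calm = [], []
    for day, tickets in sales.items():
        if day not in wind:
            continue  # skip days that have no external context
        (windy if wind[day] >= threshold else calm).append(tickets)
    return {
        "windy": mean(windy) if windy else None,
        "calm": mean(calm) if calm else None,
    }


summary = average_sales_by_wind(ticket_sales, wind_speed_mph, WINDY_THRESHOLD_MPH)
print(summary)
```

The internal data alone says nothing about why some days sell better than others. The join on date is what brings the external context in.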
Bruce makes an important point: the use
of external data is strategic, and top-down thinking is crucial when searching
for it and applying it. That means asking the right questions, like "what problem are
we trying to solve?", "what outcome am I trying to achieve?", and "what data do I need to create that outcome?"
So where do you get this external data?
Well, the web has an abundance of data, especially in the form of data markets. Some are
free, like Government Open Data, and some
charge a fee. There are also data
brokers who manage and aggregate collections of public data in
One challenge with using external data direct from the source is its quality. Each data source will have its own structures and issues, and each may have to be transformed or wrangled into useful information before your organisation can ingest it and start making sense of it.
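As a rough illustration of what that wrangling can look like, here is a minimal sketch that brings two differently structured sources into one common shape. The sources, field names and values (`source_a`, `source_b`, `TempC`, `temp_f` and so on) are all hypothetical, and real feeds will usually have more issues than date formats and units.

```python
from datetime import datetime

# Hypothetical raw records from two external sources with different structures.
source_a = [{"Date": "01/07/2018", "TempC": "18.5"}]
source_b = [{"observed_on": "2018-07-02", "temp_f": 68.0}]


def from_source_a(row):
    """Source A: day/month/year dates, temperature stored as text in Celsius."""
    return {
        "date": datetime.strptime(row["Date"], "%d/%m/%Y").date().isoformat(),
        "temp_c": float(row["TempC"]),
    }


def from_source_b(row):
    """Source B: ISO dates, temperature stored as a number in Fahrenheit."""
    return {
        "date": row["observed_on"],
        "temp_c": round((row["temp_f"] - 32) * 5 / 9, 1),
    }


# One list, one structure: ISO dates and Celsius throughout.
cleaned = [from_source_a(r) for r in source_a] + [from_source_b(r) for r in source_b]
print(cleaned)
```

Each source gets its own small conversion function, so adding another source means adding another function rather than reworking the existing ones.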
The reality is that data preparation is commonly estimated to account for about 80% of the work of data scientists. This is a lot of time and effort, but some data marketplaces, like our own, can help reduce that time significantly, which is an important consideration when evaluating data marketplaces.
It’s also important to consider regulations such as GDPR when looking at the content of external data. Ensure it’s from a trusted source and that the data is ethical, i.e. it contains no personal data if you don’t need it (buying personal data is one for a separate blog).
And of course, data literacy is important too: not just the ability to work with the data, but the ability to analyse it along with your internal data and to argue with it.
That is why we have a data literacy program, which includes several free, product-agnostic courses along with a framework for organisations to start bringing data literacy into their own environment and increasing people’s skill levels. We're not saying make everyone a data scientist, but it is important to improve people’s comfort levels with data and increase their ability to read data rather than leaving them standing on the outside.