Missing Data

Finding missing data in the context of a larger whole.

I’m a big fan of the 1980s Sherlock Holmes series featuring Jeremy Brett as Holmes. I revisit the series from time to time, and afterwards I usually spend a while with a heightened sense of my logical problem-solving and critical-thinking abilities (real or imagined). A hallmark of the Holmes character is his ability to look for things that others just don’t think of. Naturally he looks at what is present, but he also thinks of what is missing. He thinks of information in the context of a larger whole. The things that should be here but aren’t are harder to think of, because doing so requires knowing what a situation should consist of and having the presence of mind to look for everything on that list. It’s much easier to just observe what is before us.

This ability to know what should be there but isn’t, the instinct to consider information in the context of a larger whole, requires experience and knowledge. Barry Schwartz speaks to some of this in his TED talk about wisdom. If you have experienced something enough times and you have paid attention, you can see it coming in the future, know what to do next, and know what else to look for.

We see this (or don’t see it, as it were) in data analysis. The data we have is great, but what about missing data? What about additional information that could change our reading of the data we have and influence our course of action? The most skilled analysts draw not just on the data they see but on their experience of having seen similar data in the past, and on their sense of what might be missing from this situation. In user experience design we see this as well. There are requirements given for a project, but what if the requirements aren’t fully capturing what users need? What is missing from the list of things we are asked to include in our design?

So, without waiting for years of experience, is there a shortcut to knowing what to look for? Is there a way to hot-wire this learning process? The trick is in not taking things at face value. What you are presented with is a good start, but think outside of it, a process Edward de Bono calls lateral thinking. Try to look at a project or problem from different directions. In UX design this comes in the form of user interviews and creating personas. This helps designers account for the needs of different kinds of users and avoid designing a rigid solution that only works for one kind of person.

In data analysis it is useful to think of the qualitative aspects of the quantitative data, to consider the data in the context of the real world. Steven Levitt of Freakonomics gave an example of this at Qlik Qonnections. He talked about trying to find potential terrorists using banking data and how he couldn’t make any progress. Then, after months, he had a lateral thought: look for ATM withdrawals by known terrorist suspects AND look to see if anyone withdrew money almost immediately after them. If this happened more than once, it was so statistically unlikely to be chance that it meant the two individuals knew each other and were traveling together. In stepping outside the data for a moment, and thinking of how people in a group follow a behavioral pattern in accessing ATMs, he found something in the data that was otherwise overlooked.
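To make the idea concrete, here is a minimal Python sketch of that kind of check. It is not Levitt's actual method, which the talk did not spell out. The function name co_withdrawals, the record layout (account, ATM, timestamp), the five-minute window, and the "at least twice" threshold are all assumptions chosen for illustration, and the sample records are made up.

```python
from collections import Counter
from datetime import datetime, timedelta


def co_withdrawals(withdrawals, suspects, window=timedelta(minutes=5), min_count=2):
    """Count how often another account withdraws at the same ATM shortly
    after a known suspect, and keep pairs seen at least min_count times.

    withdrawals: iterable of (account_id, atm_id, timestamp) tuples
    (a hypothetical layout, not a real banking schema).
    """
    # Group withdrawals by ATM so only same-machine events are compared.
    by_atm = {}
    for account, atm, ts in withdrawals:
        by_atm.setdefault(atm, []).append((ts, account))

    counts = Counter()
    for events in by_atm.values():
        events.sort()  # chronological order at this ATM
        for i, (ts, account) in enumerate(events):
            if account not in suspects:
                continue
            followers = set()
            # Scan forward until we leave the time window.
            for later_ts, other in events[i + 1:]:
                if later_ts - ts > window:
                    break
                if other != account:
                    followers.add(other)
            # Count each follower once per suspect withdrawal.
            for other in followers:
                counts[(account, other)] += 1

    return {pair: n for pair, n in counts.items() if n >= min_count}


# Made-up sample data: B7 follows suspect S1 at two different ATMs.
records = [
    ("S1", "atm_a", datetime(2024, 1, 1, 9, 0)),
    ("B7", "atm_a", datetime(2024, 1, 1, 9, 2)),
    ("C3", "atm_a", datetime(2024, 1, 1, 9, 30)),
    ("S1", "atm_b", datetime(2024, 1, 5, 14, 0)),
    ("B7", "atm_b", datetime(2024, 1, 5, 14, 1)),
    ("C3", "atm_c", datetime(2024, 1, 5, 14, 1)),
]
pairs = co_withdrawals(records, suspects={"S1"})
```

A single coincidence at a busy ATM means little, which is why the sketch only reports pairs that recur. In practice the window and threshold would need tuning against how busy each machine is.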

In thinking laterally and considering data as part of a larger whole, we are better placed to find missing information, hidden patterns, and better ways to move forward.

Michael Anthony poses an interesting question on data collection: Do we have everything we need? In his latest post, he discusses what it takes to uncover missing data.

