Regulation poses another problem: it is likely that they are out of compliance with one or more of the regulations that govern them. Some regulators require them to keep seven years of data, and it’s in their interest to delete anything older than that.
This customer is not sure what they’ve got, but one thing is for sure: we need to help them make a plan for what to keep and what to throw away. We also need to help them find some technologies to de-duplicate their data, and to store what they do need in a way that compresses it rather than making it bigger.
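To give a feel for what de-duplication can look like at its simplest, here is a minimal sketch that groups files by a hash of their contents. It is an illustration only, not the tooling used on this project: the function names `file_digest` and `find_duplicates` are hypothetical, and it assumes the data sits in ordinary files under one directory.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under `root` by content digest; keep only groups with more than one file."""
    by_digest = defaultdict(list)
    for p in sorted(Path(root).rglob("*")):
        if p.is_file():
            by_digest[file_digest(p)].append(p)
    return {digest: paths for digest, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicates("/some/data/dir")` (a placeholder path) returns a dictionary whose values are lists of files with identical contents, which is a starting point for deciding which copies to keep.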
I encounter so many companies jumping on the big data bandwagon with the “let’s store everything” approach, and they claim it’s strategic. Sure, there are some systems that have tables with a lot of rows, like a general ledger or a transaction system in a bank, but not all systems do. It’s everyone’s responsibility to make sure that they only keep what they need, and actively delete what they don’t.
What these companies really need to do is define a data strategy. They should define policies for what to keep, for how long, what to keep in-memory for instant analysis, what to keep close by and what to file away in cheap storage for emergencies. You can only find something if you know where it is - which means that you can only delete something if you know where it is. And the more copies you store the harder it becomes.
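A policy like this can be written down very simply. The sketch below is one hypothetical way to express it: the `RetentionPolicy` class, the `classify` function, the tier names and the 90-day and two-year thresholds are all assumptions made for illustration; only the seven-year figure echoes the regulatory example mentioned earlier.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionPolicy:
    hot_days: int     # keep in-memory for instant analysis
    warm_days: int    # keep close by
    retain_days: int  # keep in cheap storage up to this age, then delete


def classify(record_date: date, today: date, policy: RetentionPolicy) -> str:
    """Return the storage tier for a record based on its age in days."""
    age = (today - record_date).days
    if age <= policy.hot_days:
        return "in-memory"
    if age <= policy.warm_days:
        return "nearline"
    if age <= policy.retain_days:
        return "archive"
    return "delete"


# Assumed thresholds: 90 days hot, 2 years nearby, 7 years retained in total.
policy = RetentionPolicy(hot_days=90, warm_days=730, retain_days=7 * 365)
```

With this in place, `classify(date(2015, 1, 1), date(2024, 1, 1), policy)` returns `"delete"`, because the record is more than seven years old. The real thresholds would come from the business and its regulators, not from the code.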
I’m helping my customer build a platform to audit their data, and I’m building Qlik apps to show them what they’ve got, what they need to keep and what they can throw away. It’s an innovative use of data discovery, but an essential one. Once this project is complete we will have just a few terabytes of organized data to deal with, and we’ll be able to build some apps to show them how their business is actually performing.
In almost all of the BI and Analytics projects I’ve seen over the last 20 years, the real insights have come from analyzing data from different departments and bringing it together. The value comes from searching for subtle associations between sales and marketing, or finance and risk, and using those insights to drive decision-making and change. It’s not how much you’ve got, it’s what you do with it that matters.
Being organized is the key to success and having a data strategy is a good place to start.
Next time a storage vendor tells you that big data is the answer to the world’s problems, keep all this in mind.
Maybe small data is the new big data.