This idea that there's more to data than the rows you see in a spreadsheet or the points on a chart is essential to keeping the potential of your data alive. Jer Thorp calls this the "data system" and suggests that we always think of data as a system and not simply an artefact: that we always view it as a series of processes and activities around collection, computation and representation. Each of these in turn filters the 'noise' and magnifies specific 'signals'. When we engage with data we have to be conscious of the effects of those processes: what policies were applied during collection, what was missed, what was removed, what unknown errors lurk behind the precision and 'truth' of those numbers? When presented with a formatted chart or a cleaned data set, it can be very hard to find ways of answering those questions.
The answers lie in lineage and history. It's well known that over time we humans have a tendency to change our stories, to fit the 'facts' to our preferred narratives. Journalists and researchers know that contemporary secondhand records can be more reliable than later firsthand recollections (you can thank Ebbinghaus' "forgetting curve" for that). Data has a little of that too: each transformation, cleaning or interpretation moves it further along a preferred narrative. When we keep the rawness, noise and mess of that first point of collection, we retain more of that history. As we find new ways of seeing the noise, through machine learning or other new techniques, we may well start finding fresh insights. It could be that we will be able to infer the policies that influenced the collection, or uncover hidden signals that reframe the entire data set, stretching its use beyond anything thought of when it was first collected. The more we maintain of the contemporary records, the more possibilities and potential are stored in that data system. The more we record, alongside the data, of the influences, approaches and policies that drove its collection, the more we will be able to understand what else it might be telling us. We could even start to hear beyond the hiss to reveal the participants and processes that formed it.