In my last post, I noted that over the past twenty years, increased hardware performance, coupled with innovations in data management and integration tools, has created an environment in which the spectrum of business intelligence users has expanded, which in turn has only added to the demand for information availability. This post looks at an alternative set of drivers for information availability, namely the potential of harnessing and analyzing massive data sets for business value creation.
The appeal of big data analytics stems from three specific perceptions of benefit:
The promise of big data analytics also reflects the “democratization of business intelligence” effect I suggested in a prior post: it offers a way to rapidly stand up an environment that, once the massive amounts of data are loaded, can support a wide spectrum of investigations and analyses in a manner that is both scalable and flexible.
Of course, that one qualification – once the massive amounts of data are loaded – remains the kicker, as it did for enterprise data warehousing and integrated pervasive business intelligence. Big data platforms are configured to take advantage of massive parallelism, and their incorporation of commodity computational and storage componentry enables elasticity and scalability within the platform environment. However, the challenges that often accompany this part of the process include:
In addition, Internet bandwidth tends to be insufficient to satisfy timely delivery of massive data sets, only adding to the complexity and bottlenecks. The data sets will need to be collected, collated, and then brought into the big data server before any of the analyses can begin.
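To see why bandwidth becomes the bottleneck, a quick back-of-envelope calculation helps. The figures below are purely illustrative assumptions (a 10 TB data set over a sustained 1 Gbps link), not measurements from any particular environment:

```python
def transfer_hours(terabytes, gigabits_per_sec):
    """Hours needed to move `terabytes` of data over a link sustaining
    `gigabits_per_sec`, ignoring protocol overhead and contention."""
    bits = terabytes * 1e12 * 8              # decimal terabytes -> bits
    seconds = bits / (gigabits_per_sec * 1e9)
    return seconds / 3600

# A hypothetical 10 TB data set over a sustained 1 Gbps connection:
print(round(transfer_hours(10, 1), 1))       # roughly 22.2 hours
```

Even under these generous assumptions (no contention, no retries, full sustained throughput), the transfer alone consumes the better part of a day before a single analysis can run, which is exactly the availability gap described above.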
In other words, even with the promise of high-performance computation, the analysis processes are still limited by the challenges of making information available in “right” time. Environments that have not considered approaches such as data replication for speeding data movement will remain constrained by the data access bottleneck.