And I tend to agree with him, from a certain point of view, that is. SQL will never go away completely, as it is required to pull data from the numerous data sources that people need for analysis. Most vendors, including Qlik, use SQL in this fashion.
However, there is a distinct difference between using SQL to extract data and relying on it as your fundamental architecture for modern analytics. SQL was originally designed for transactional systems, not interactive analysis. Query-based approaches confine users to restricted, linear exploration of partial subsets of data. That might be fine if you know the questions in advance, but what about the unanticipated questions or unexpected insights you need in real-world business situations? Every new question requires a new query, and that means going back to the data experts. The game of “ask, wait, answer” is on.
SQL-based approaches tend to struggle in three major ways. The first is around data preparation. Data structures, aggregations, and hierarchies built with SQL must be fully modeled in advance, which requires significant effort and limits flexibility. Sparse or incomplete data can cause problems. And when SQL joins are used to combine data from multiple sources, data loss or inaccurate results can occur. This is especially true when working with large numbers of disparate sources.
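Both join pitfalls can be seen in a small, hypothetical sketch (tables and names invented for illustration) using Python's built-in sqlite3 module: an inner join silently drops customers with no matching orders, and joining two independent child tables fans out rows so that an aggregate double-counts.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders  (customer_id INTEGER, amount REAL);
CREATE TABLE refunds (customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO orders  VALUES (1, 100), (1, 50), (2, 75);  -- Cid has no orders
INSERT INTO refunds VALUES (1, 10), (1, 5);             -- two refunds for Ann
""")

# Data loss: the inner join silently drops Cid, who has no orders.
buyers = sorted({r[0] for r in con.execute(
    "SELECT c.name FROM customers c JOIN orders o ON o.customer_id = c.id")})
print(buyers)  # ['Ann', 'Bob'] -- Cid is gone

# Inaccurate results: joining two child tables fans out the rows
# (2 orders x 2 refunds = 4 rows), so Ann's order total is doubled.
total = con.execute("""
    SELECT SUM(o.amount)
    FROM orders o JOIN refunds r ON r.customer_id = o.customer_id
    WHERE o.customer_id = 1
""").fetchone()[0]
print(total)  # 300.0, not the true 150.0
```

The fan-out problem is especially insidious because the query runs without error and the inflated number looks plausible.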
Second, SQL-based tools restrict interactivity and exploration. They limit people to a structured, linear experience on top of partial subsets of data, instead of offering freeform exploration and search across all your data. If users want to pivot their thinking to new questions or new ideas, they will likely need to build new queries.
Query-based visualizations are discrete, disconnected entities that do not inherently stay in context with one another. And you can only see the data included in the query criteria and result set – all other data is left behind. This means you will have blind spots and won’t be able to uncover insights such as the customers who did not buy or the products that did not sell – but don’t you want to know about those?
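To make the blind spot concrete, here is a minimal hypothetical sketch (again with invented tables) using sqlite3: the everyday sales query can never surface the customers who did not buy, because they are outside its result set. Seeing them requires anticipating that question and writing a separate anti-join query just for it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (customer_id INTEGER, product TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO sales VALUES (1, 'Widget'), (2, 'Gadget');
""")

# The typical sales query: its result set only contains customers
# who bought something. Cid is simply not in the picture.
buyers = [r[0] for r in con.execute(
    "SELECT DISTINCT c.name FROM customers c "
    "JOIN sales s ON s.customer_id = c.id ORDER BY c.name")]
print(buyers)      # ['Ann', 'Bob']

# Surfacing the non-buyers takes a second, purpose-built anti-join
# query -- you only get this insight if you thought to ask for it.
non_buyers = [r[0] for r in con.execute(
    "SELECT name FROM customers "
    "WHERE id NOT IN (SELECT customer_id FROM sales)")]
print(non_buyers)  # ['Cid']
```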
Finally, there’s performance. SQL and query-based tools are at the mercy of the underlying databases that support them, which are typically slow and inflexible. The problem is compounded with increasing numbers of users and big data or other types of complex sources. Query-based systems simply can’t provide instant response for high numbers of users, on large data sets, asking questions that were not anticipated in advance.
So what’s the answer? Well, it’s actually quite simple: use technology that was intended for the purpose it’s serving – in this case, modern analytics. I think Gartner’s Cindi Howson does a great job of defining this in a recent blog. I particularly like where Cindi says, “One of the key features of a modern analytics and BI platform is that it has a self-contained performance layer; this may be a columnar or in-memory engine.”
Well said. And cue the Qlik Associative Engine, a self-contained, high-performance, in-memory engine designed specifically for interactive, free-form exploration and analysis. It fully combines large numbers of data sources and indexes them to find the possible associations, without leaving any data behind. It offers powerful on-the-fly calculation and aggregation that instantly updates analytics and highlights associations in the data, exposing both related and unrelated values after each click. And it was built to scale to high numbers of users and to data both big and small, without sacrificing flexibility.
This means people are free to search, explore, and pivot based on what they see, without blind spots and without having to go back to experts and wait. This is why Qlik users consistently report previously unforeseen insights that had been missed by SQL- and query-based tools, driving tremendous value. This is The Associative Difference™.
Photo credit: therefromhere via Foter.com / CC BY-NC-SA