There has recently been an important debate on LinkedIn about relational databases, joins, and how they affect a user’s experience with analytics.
What struck me is that, despite ample evidence, some camps remain overconfidently committed to traditional relational databases, and that commitment has blinded them to other approaches. It reminded me of a famous quote from Muhammad Ali:
"Float like a butterfly, sting like a bee. His hands can't hit what his eyes can't see. Now you see me, now you don't."
Ali’s point – he was so fast that his competition could not even see him – is that what is hidden can make the difference. He not only talked the talk but also walked the walk, with a truly different approach (reliance on speed rather than brute power).
Relational Databases and the Cost of Speed
This question of what is hidden from the user is center stage in the debate about relational databases and analytics. As valuable as relational data models have been for the last 50 years, their weaknesses are well documented, with whole conferences designed to help developers understand their risks and work around them. Here is just one example: a one-hour session where BI expert Alberto Ferrari from SQLbits.com takes a deep dive into the “dangerous” world of bi-directional joins, null values and ambiguous relationships: https://www.sqlbi.com/tv/understanding-relationships-in-power-bi/
It gets really interesting at about the 23:45 mark with his solution called “welcome to hell.” Alberto dives into complex relational database topics such as:
Type of relationship
What’s really at stake here is speed and accuracy – all these workarounds and allowances create significant additional developer work and slow down both the time to analysis and the ability to scale fast analysis to many users. Now, many BI and analytics experts “joined” at the hip with traditional relational database technology dismiss these concerns. They quickly deflect with statements like “there is no way that you can fit petabytes of data in RAM.”
This is a false narrative based on a blind worship of relational databases as the only option. It’s an issue the initial Qlik team took head on in its very earliest days. The dangers that reside in traditional relational tables do not exist in a typical Qlik scenario, since the Qlik Associative Data Model delivers a “system of insight that is usable by all knowledge workers without compromising on integrity, governance, and security.”²
Why don’t we suffer from the same relationship challenges that others do?
We join the data automatically based on a common field name (effectively a full outer join).
All of the details from all of the tables reside in the data model (no pre-aggregation is necessary because aggregations are calculated on the fly).
BI developers have the option of locking down a relationship if necessary.
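To make the first two points concrete, here is a minimal Python sketch of the general idea: two tables are linked on whatever field name they share, every row from both sides is kept (a full outer join), and totals are computed at query time from the detail rows. This is only an illustration, not how the Qlik engine is implemented; the tables, the field names and the helper functions full_outer_join and total_paid are all hypothetical.

```python
from collections import defaultdict


def full_outer_join(left, right):
    """Join two lists of dict rows on every field name they share,
    keeping unmatched rows from both sides (missing values become None)."""
    left_fields = {f for row in left for f in row}
    right_fields = {f for row in right for f in row}
    shared = sorted(left_fields & right_fields)      # the common field names
    all_fields = sorted(left_fields | right_fields)

    # Index the right-hand rows by their values in the shared fields.
    right_index = defaultdict(list)
    for row in right:
        right_index[tuple(row.get(f) for f in shared)].append(row)

    joined, matched = [], set()
    for l in left:
        key = tuple(l.get(f) for f in shared)
        partners = right_index.get(key, [])
        if partners:
            matched.add(key)
            for r in partners:
                joined.append({f: l.get(f, r.get(f)) for f in all_fields})
        else:
            # Left row with no partner: keep it, fill the gaps with None.
            joined.append({f: l.get(f) for f in all_fields})

    # Right rows that never matched are kept as well.
    for key, rows in right_index.items():
        if key not in matched:
            for r in rows:
                joined.append({f: r.get(f) for f in all_fields})
    return joined


def total_paid(rows, **selection):
    """Sum 'paid' on the fly over the rows that match the current selection."""
    return sum(
        r["paid"] or 0
        for r in rows
        if all(r.get(field) == value for field, value in selection.items())
    )


# Hypothetical sample data: customer 3 has no claims, and one claim
# refers to a customer (9) that is missing from the customers table.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
    {"customer_id": 3, "name": "Cy"},
]
claims = [
    {"customer_id": 1, "paid": 100},
    {"customer_id": 1, "paid": 50},
    {"customer_id": 2, "paid": 70},
    {"customer_id": 9, "paid": 30},
]

model = full_outer_join(customers, claims)
print(total_paid(model))              # 250: the orphan claim still counts
print(total_paid(model, name="Ann"))  # 150
```

Because nothing is dropped at join time and nothing is pre-aggregated, the customer without claims and the claim without a customer both remain visible, and any total can be recomputed for any selection.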
Our platform offers many robust options for solving data challenges across the entire data, analytics and insights supply chain. We do not force you to do it a certain way; you are free to apply the combination of options and best practices that best suits your use case.
And just like Ali’s speedy hands, the Qlik data model is not something that is explicitly seen by business users (or even seasoned data, business intelligence and analytics experts), but it is a true differentiator of the platform – it is so fast that most cannot even comprehend what they just saw. The Qlik data model at its core has solved a challenge that we as an industry have been struggling with since Ali’s first title win against Sonny Liston in 1964.¹
Here is a video demonstrating a complex insurance data model that includes tables for customers, policies, products, sales organization and claims payments. I walk you through the build of a very complex analysis, the claims loss experience triangle, which sources information from several tables in the model.
So you say, what about search?
New competitors “can’t hit what the eyes can’t see” either. Google-like search is mature and out of the box with Qlik. There is no need to bring in ANOTHER technology when the data model is solved and re-usable and search is built in. It is the exposure of those detailed data relationships to the “built-in” search across all dimensions and combinations of dimensions that makes it so powerful and unique.
When analyzing Qlik, this is essential to understand – this is only possible because of the underlying data model within Qlik.
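As a rough illustration of what searching across all dimensions means, the Python sketch below matches a search term against every value of every field in every table and reports where each hit was found. It is a simplified stand-in, not Qlik’s search implementation; the tables dictionary, its field names and the search helper are hypothetical.

```python
def search(tables, term):
    """Return distinct (table, field, value) hits whose value contains the term,
    compared case-insensitively across every field of every table."""
    needle = term.lower()
    hits = []
    for table_name, rows in tables.items():
        for row in rows:
            for field, value in row.items():
                if needle in str(value).lower():
                    hit = (table_name, field, value)
                    if hit not in hits:  # report each distinct value once
                        hits.append(hit)
    return hits


# Hypothetical sample tables for illustration only.
tables = {
    "customers": [
        {"customer_id": 1, "name": "Ann", "city": "Boston"},
        {"customer_id": 2, "name": "Bob", "city": "Austin"},
    ],
    "products": [
        {"product_id": "P1", "product": "Auto"},
        {"product_id": "P2", "product": "Home"},
    ],
}

for table, field, value in search(tables, "au"):
    print(f"{table}.{field}: {value}")
# customers.city: Austin
# products.product: Auto
```

The user types one term and gets matches from any dimension, without having to say in advance which table or field to look in.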
Here is a nice video that includes a search demo:
Also, here is a list of features and advantages that Henric Cronström (VP of Products at Qlik) presented at our recent global customer event called Qonnections³:
Relationships in Qlik are automatically joined based on a common dimension with effectively a full outer join. Unlike what Alberto Ferrari warns against, the model developer does not have to be concerned with: 1) filter direction; 2) the type of relationship; 3) invalid relationships; 4) null values; or 5) ambiguous relationships. This is not "dangerous" or "frowned upon" or "reserved for the elite." Rather, it provides assurance that the numbers are correct and can be relied upon.
Deliver the knock-out punch for your organization by understanding the difference between a traditional query tool and the Qlik platform. It has been transformational for our Financial Services customers, and this transformation can be repeated in your organization.