Someone told me yesterday that “OLAP is dead”. “Everyone is choosing tabular/in-memory.” I know it’s not dead, though it may at least be sick. But did I underestimate how soon the tipping point would arrive, the bend in the hockey stick, where the vast majority of users will “sensibly choose” the tabular/in-memory option over OLAP?
I realize some, including me, think this topic has been beaten to death. Since OLAP is my bread and butter, my top skill (I should have become a cop like my dad wanted), of course I took it to heart, and I take such claims (including the “official” word from MSFT) with a grain of salt. But I also realize the person who told me this is very bright and knows the market. So I had to take a time-out today to revisit this issue as a reality check on my own professional strategy; a good thing to do every now and then.
When I first became aware of the “OLAP is dead” controversy a little over two years ago, I wasn’t too afraid, since 256 GB of RAM was still really high-end. Today, 2 TB is “really high-end” (a few Moore’s Law iterations later), well beyond the size of all but a few OLAP cubes I’ve dealt with (and that is before even considering in-memory compression!). And there were a couple of issues I had not fully digested at that time.
One of those issues was not fully appreciating the value and genius of the in-memory compression. At first, I was only thinking that RAM with no IO is simply faster. But the compression/decompression work that happens in the CPUs, which results in a whole lot more CPU utilization, isn’t really much of a cost, since those cores were under-utilized anyway. Another was the volatility of RAM. At the time, solid-state memory was still fringe, and my thought was that although volatility wouldn’t be much of an issue in the read-only BI world, it would be an issue in the OLTP world. Well, that doesn’t seem to be the case with Hekaton.
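To make the compression trade-off concrete, here is a minimal sketch of dictionary encoding, one common columnar compression technique. This post does not say which scheme the in-memory engine uses, so treat the choice of technique as my assumption; the function names and the sample column are hypothetical too. The point is only that a little extra CPU work (building and looking up small integer codes) buys a much smaller footprint for a repetitive column.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each value with a small integer code (illustrative only)."""
    dictionary: List[str] = []   # distinct values, in first-seen order
    index: Dict[str, int] = {}   # value -> its position in `dictionary`
    codes: List[int] = []        # one code per original row
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical sample column with many repeated values.
column = ["WA", "WA", "OR", "WA", "CA", "OR"]
dictionary, codes = dictionary_encode(column)
restored = dictionary_decode(dictionary, codes)
```

Here `dictionary` holds the three distinct values and `codes` holds six small integers; decoding is a cheap list lookup per row, which is the kind of work otherwise idle cores can absorb.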
After thinking for much of the night, here are two key questions I came up with that will determine whether OLAP (specifically SQL Server Analysis Services OLAP) will die:
- Is the real question whether hard drives (the kind we use today, with the spinning wheels and all those moving parts) will become obsolete? RAM and/or flash could neutralize all the advantages of disks (cheaper, bigger, non-volatile) relatively soon.
- Will OLAP become minor enough in terms of utilization and product pull-through that Microsoft will no longer support a dev team? I can quickly think of a few Microsoft products with a strong but relatively small following that just didn’t warrant an infrastructure and were dumped.
An important thing to keep in mind is that there are really two separate issues. One is the underlying structures, OLAP versus in-memory; the other is the model, tabular versus multi-dimensional. The first issue, the underlying structures, is a far more powerful argument for the death of OLAP. The underlying structure really will be seamless to the end-user, and it won’t require any guru-level people messing with all those IO-related options to implement it properly.
However, I still don’t completely buy the “tabular is easier to understand than multi-dimensional” argument. I buy it to the point that, yes, it is true, but I don’t think this is the way it should be. My feeling is that the multi-dimensional concepts encapsulated in MDX and OLAP are more along the lines of how we think than what is encapsulated with SQL and relational databases. What comes to mind is the many times I’ve engaged a customer with “thousands of reports” that were really variations of a couple dozen and were mostly replaced with a cube or two.
As a side note, one exercise I use to demonstrate the elegance of MDX is to think about the syntax Excel uses to handle multiple dimensions. Excel is multi-dimensional, but with just two dimensions. With a cap on dimensionality, it’s easy to use the A1 (column A, row 1) syntax. But what about three dimensions? A sheet (Sheet1!A1). Four dimensions? A different xlsx document. Five? A different directory. That’s not at all elegant. But MDX elegantly “scales” in the number of dimensions; it looks the same from zero through 128 dimensions.
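The same idea can be sketched in a few lines of Python. This is not MDX and not how Analysis Services works internally; the `Cube` class, its methods, and the sample data are hypothetical, made up purely for illustration. What it shows is the property described above: a cell is addressed by naming dimension members, and the call has the same shape whether you name zero, one, or many dimensions. Any dimension you leave out is simply summed over, loosely like falling back to an "All" member.

```python
from typing import Dict, FrozenSet, Tuple

# A coordinate is a set of (dimension, member) pairs, e.g. {("year", "2013")}.
Coordinate = FrozenSet[Tuple[str, str]]


class Cube:
    """Toy cube: cells keyed by named coordinates (illustrative only)."""

    def __init__(self) -> None:
        self._cells: Dict[Coordinate, float] = {}

    def set(self, value: float, **coords: str) -> None:
        # Store one leaf cell under its full set of coordinates.
        self._cells[frozenset(coords.items())] = value

    def total(self, **coords: str) -> float:
        # Sum every cell whose coordinates include all the requested pairs.
        wanted = set(coords.items())
        return sum(v for key, v in self._cells.items() if wanted <= key)


# Hypothetical sales figures across two dimensions.
cube = Cube()
cube.set(100.0, product="Bikes", year="2012")
cube.set(40.0, product="Helmets", year="2012")
cube.set(120.0, product="Bikes", year="2013")

grand_total = cube.total()                                # zero dimensions named
bikes_total = cube.total(product="Bikes")                 # one dimension named
bikes_2013 = cube.total(product="Bikes", year="2013")     # two dimensions named
```

Adding a third or fourth dimension (say `region="West"`) changes nothing about the syntax: it is just another keyword, rather than a new sheet, file, or directory.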
The tabular model reminds me of when I started my OLAP career in 1998 as a developer on the “OLAP Services” (SQL Server 7.0 version of Analysis Services) team at Microsoft. OLAP for SQL Server 7.0 was really just the core OLAP, no frills, just strong hierarchies and aggregations. It was very easy to understand, but users quickly hit walls with it. That reminds me of how VB was so easy to learn. One could learn to build pretty good applications quickly, but would run into problems venturing beyond the 80/20 point. Eventually .NET (C#/VB.NET) came along, still relatively easy to use (compared to C++), but still a quantum leap in complexity. For OLAP Services, that leap was SQL Server 2005 Analysis Services, with its weak hierarchies, many-to-many relationships, MDX Script, KPIs, etc.
I guess what I’m saying is this is a case of taking a step backwards to take two steps forward. The spotlight (tabular) isn’t currently on the high-end where I normally make my living. However, it doesn’t mean there isn’t a high-end. The high-end as we know it today (OLAP) will eventually die or at least severely morph, but requirements of yet unknown sorts on the low-end will push the complexity back up. How will Big Data affect the kinds of analysis that are done? Will 2 TB of RAM then be sufficient for the “masses”?
At the moment, I do believe that in terms of raw new BI implementations, tabular is giving a whooping to OLAP. It should, since the idea is to expand the accessibility of BI to a much broader audience. I’ve lived through the rise of Windows 3.1 and the dot-com crash. This is a minor disruption; it’s not as if I didn’t begin moving on years ago. In fact, skill-wise, I learned to always be moving on to some extent.
BTW, he also told me that “no one makes money on Big Data and that Predictive Analytics is limited to one or two people”. Those are in fact the two skills I’ve been shifting towards the past few years in light of the sickness of OLAP. While I really don’t know about the former claim (and would find it short-sighted to base strategy on that), I do have a couple of opinions on the latter:
- Bridging Predictive Analytics and Performance Management
- Why Isn’t Predictive Analytics a Big Thing?