Analysis Services is a difficult subject that at first glance can appear to be a relatively simple subject. After all, most people can learn within an hour or two how to create Pivot Tables in Excel and slice and dice through it like a pro. However, as one’s requirements become more sophisticated, one eventually runs into scalability walls. Developing advanced cubes that are processed from huge amounts of data relating to complex entities and business rules requires an understanding of how to balance the trade-offs such as query and process performance, business requirements, maintenance, and cost.
I’ve been delivering advanced Analysis Services workshops for about five years now. I delivered the first session in September 2005, when I was still in MCS, prior to the release of SQL Server 2005. At the time there was hardly any available training material let alone advanced training material, so I cranked for a month at nights and weekends prior to the session developing as much material as I could. But what’s different about advanced workshops is that the attendees are often interested in advanced topics because they are pushing the limits of the subject of the training. That means for the next few years, every session I taught was unique.
The result is that I never had time to fully develop the training material to a high level. The topics were always different and I spent a great deal of time preparing new material prior to the session or whiteboarding during the session for the most part. The good news (for me at least) is that after all that time I don’t hear too many new requests. So I finally can spend the time I used to require to create new material instead perfecting the material I already have. I’ve been working on that over the past few weeks and wanted to share a couple of the slides I created. My purpose for showing these samples is to demonstrate how my workshops focus on understanding the “why” of the techniques in designing optimized SSAS cubes.
Figure 1 depicts the paths of cause and effect leading to the creation of useful (or un-useful) aggregations. On the left are the three problems common problems. These problems are especially evident for dimensions that are long (lots of members) and wide (lots of attributes). A good example of such a dimension is a “Customers” dimension in a cube used to analyze customer behavior. I’ve seen such dimensions proposed that consists of tens of millions of customers and hundreds of customer attributes.
Figure 1 – Aggregation Design Catch-22.
Figure 2 graphically illustrates the constraints of time and disk space when developing an effective aggregation design. For a given cube, there are a large number of possible aggregations – which consists of a set of tabular data objects each consisting of a unique combination of dimension attributes. However, there is only so much time to process them and so much disk space available to store them. Given those constraints, we can only create a subset of the possible aggregations, therefore, what we do create must count (must support the query patterns of the users).
Of course, there is much more to the art of aggregation design that I haven’t covered here. But the purpose of this blog entry is to demonstrate a sample of how my SSAS training focuses on building a solid foundation of knowledge from which the plethora of “best practices” and property settings can make sense.
Please see SQL Server Analysis Services – Cube Development Offerings or contact me at easahara@hotmail.com for more information on my Analysis Serivices training offering.