A Getting-Started and Survival Guide for planning, designing and building Tabular Semantic Models with Microsoft SQL Server 2012 Analysis Services.
by Paul Turley
This post will be unique in that it will be a living document that will be updated and expanded over time. I will also post-as-I-go on the site about other things but this particular post will live for a while. I have a lot of good intentions – I know that about myself and I also know that the best way to get something done is to get it started – especially if I’m too busy with work and projects. If it’s important, the “completing” part can happen later. In the case of this post, I’ll take care of building it as I go, topic by topic. Heck, maybe it will never be “finished” but then are we ever really done with IT business solutions? I have been intending to get started on this topic for quite some time but in my very busy project schedule lately, didn’t have a concise message for a post – but I do have a lot to say about creating and using tabular models.
I’ve added some place-holder topic headers for some things that are on my mind. This list is inspired by a lot of the questions my consulting customers, students, IT staff members and business users have asked me on a regular basis. This will motivate me to come back and finish them and for you to come back and read them. I hope that you will post comments about your burning questions, issues and ideas for related topics to cover in this living post about tabular model design practices and recommendations.
SQL Server Analysis Services is a solid and mature platform that now serves as the foundation for two different implementations. Multidimensional models are especially suited for large volumes of dimensionally-structured data that have additive measure values that sum-up along related dimensional attributes & hierarchies.
By design, tabular architecture is more flexible than multidimensional in a number of scenarios. Tabular also works well with dimensional data structures but also works well in cases where the structure of the data doesn’t resemble a traditional star or snowflake of fact and dimension tables. When I started using PowerPivot and tabular SSAS projects, I insisted on transforming data into star schemas like I’ve always done before building a cube. In many cases, I still do because it’s easier to design a predictable model that performs well and is easy for users to navigate. A dimensional model has order and disciple however, the data is not always shaped this way and it can take a lot of effort to force it into that structure.
Tabular is fast for not only additive, hierarchal structured data but in many cases, it works well with normalized and flattened data as long as all the data fits into memory and the model is designed to support simple relationships and calculations that take advantage of the function engine and VertiPaq compression and query engine. It’s actually pretty easy to make tabular do silly, inefficient things but it’s also not very hard to make it work really well, either.
James Serra has done a nice job of summarizing the differences between the two choices and highlighted the strengths and comparative weaknesses of each in his April 4 blog post titled SQL Server 2012: Multidimensional vs Tabular. James points out that tabular models can be faster and easier to design and deploy, and that they concisely perform well without giving them a lot of extra attention for tuning and optimization. Honestly, there isn’t that much to maintain and a lot of the tricks we use to make cubes perform better (like measure group partitioning, aggregation design, strategic aggregation storage, usage-base optimization, proactive caching and cache-warming queries) are simply unnecessary. Most of these options don’t really exist in the tabular world. We do have partitions in tabular models but they’re really just for ease of design.
The fact is the multidimensional databases (which most casual SSAS users refer to as “cubes”) will be supported for years to come. The base architecture for SSAS OLAP/UDM/Multidimensional is about 13 years old since Microsoft originally acquired a product code base from Panorama and then went on to enhance and then rewrite the engine over the years as it has matured. In the view of many industry professionals, this is still the more complete and feature-rich product.
Both multi and tabular have some strengths and weaknesses today and one is not clearly superior to the other. In many cases, tabular performs better and models are more simple to design and use but the platform is lacking equivalent commands and advanced capabilities. In the near future, the tabular product may inherit all of the features of its predecessor and the choice may become more clear; or, perhaps a hybrid product will emerge.
No. …um, Yes. …well, sort of. Here’s the thing: The term “cube” has become a defacto term used by many to describe the general concept of a semantic model. Technically, the term “cube” defines a multidimensional structure that stores data in hierarchies of multi-level attributes and pre-calculated aggregate measure values at the intersect points between all those dimensions and at strategic points between many of the level members in-between. It’s a cool concept and an an even cooler technology but most people who aren’t close to this product don’t understand all that. Users just know that it works somehow but they’re often confused by some of the fine points… like the difference between hierarchies and levels. One has an All member and one doesn’t but they both have all the other members. It makes sense when you understand the architecture but it’s just weird behavior for those who don’t.
Since the tabular semantic model is actually Analysis Services with a single definition of object metadata, certain client tools will continue to treat the model as a cube, even though it technically isn’t. A tabular Analysis Services database contains some tables that serve the same purpose as measure groups in multidimensional semantic models. The rest of the tables are exposed as dimensions in the same way that cube dimensions exists in multidimensional. If a table in a tabular model includes both measures and attribute fields, in certain client tools like Excel, it will show up twice in the model; once as a measure group table and once as a dimension table.
(more to come)