With the introduction of SQL Server 2012, a whole new way of creating OLAP solutions was added to the business intelligence ecosystem of Microsoft. This new product is called Analysis Services Tabular and it is revolutionary in its way of storing and handling data. Tabular models are not a replacement for Analysis Services Multidimensional—the analysis services you know from the previous versions of SQL Server—but is, rather, a viable alternative. Each model has its own strengths and weaknesses; this article will introduce you to both models and guide you through the process of choosing the right model for your business intelligence projects.
Analysis Services Multidimensional
Analysis Services Multidimensional (SSAS MD) is a mature OLAP product that has been around for several versions of SQL Server (starting way back in SQL Server 7.0). As the name suggests, it relies heavily on the multidimensional modeling techniques of Ralph Kimball. It allows you to create a multidimensional model on top of a data source—typically an SQL Server database modeled using the Kimball methodology—consisting of dimensions and cubes. The cubes contain various measure groups, which are the implementations of fact tables. For a very short introduction to facts and dimensions with a small example, please refer to the first paragraphs of Four Methods for Implementing a Slowly Changing Dimension in Integration Services – Part I.
Using proprietary storage, indexes, and pre-calculated aggregations, SSAS can achieve great performance for analytic queries on large data sets. This technology is called MOLAP (multidimensional OLAP). Analysis Services also supports ROLAP (relation OLAP). With ROLAP, SSAS only provides a semantic model to the users, but queries are translated and sent directly to the source. This mode is ideal for real-time or low-latency reporting solutions, but can have performance issues for larger datasets. A hybrid combination of both storage modes is possible; it is called HOLAP.
Since SSAS MD has been around for over a decade, it provides us with a great range of features, from perspectives, KPIs, actions, translations, and write-back to partitioning, advanced security, and much more. It also supports more complex modeling, such as parent-child hierarchies, many-to-many relationships, and ragged hierarchies. Other important features include the data mining capabilities that come with SSAS MD.
A short introduction like this one doesn’t do justice to the wealth of possibilities that Analysis Services provides. For more information, I’d like to refer you to the following resources:
The language used to query Analysis Services Multidimensional is called MDX, which stands for MultiDimensional eXpressions. MDX provides a specialized syntax for querying and manipulating multidimensional models. Introduced by Microsoft, MDX has grown to be a standard language for OLAP models and is implemented by many different vendors.
Aside from querying SSAS, MDX can also be used to create calculations, establish security rules, and define calculated objects, such as named sets and calculated members. It is also used to define KPIs. MDX has many specialized analytical functions that simplify, for example, time intelligence and the handling of hierarchies.
MDX isn’t an easy language to learn, unfortunately, but in the right hands it can be very powerful. For more information about this language, please refer to MDX Query Fundamentals.
Analysis Services Tabular
With SQL Server 2012, Analysis Services Tabular was announced. It differs radically from its multidimensional sibling from an engine point of view, as all of the data is stored in memory. Due to amazing compression ratios achieved by storing the data in columnar format and a multi-threaded query processor, the in-memory analytics engine—called xVelocity—allows SSAS Tabular to achieve tremendous performance. In certain scenarios, such as, for example, distinct counts, SSAS Tabular can easily outperform SSAS Multidimensional.
It all began with the release of the free Power Pivot add-in for Excel 2010, which kick-started the self-service BI offering of Microsoft. This add-in leverages the same in-memory engine as SSAS Tabular but stores all of the data inside Excel. Due to the great compression results, millions of rows can be loaded into the Power Pivot model. With Power Pivot, users can load data from numerous sources and make mash-ups using the intuitive tabular model. There are no dimensions or facts, only tables that can contain descriptive attributes and measures as well. This makes it very easy for users used to relational modeling to quickly create data models.
Power Pivot is now integrated inside Excel 2013 and the xVelocity engine has found its way to a full-blown server: Analysis Services Tabular. Developing models in SSAS Tabular is remarkably similar to developing them in Power Pivot. There are just a few minor differences between them.
As with SSAS Multidimensional, the Tabular version also had different storage modes. The first one is Cached, where everything is stored in memory, and DirectQuery. In the latter, DAX queries (see the next paragraph) are translated to SQL and directly executed against SQL Server. Using this method, you can query in real time against very large data sources. SSAS has become an abstraction layer over the data warehouse.
The dual offering of two different models in Analysis Services is called the business intelligence semantic model. The vision is that each of the models can be used interchangeably and that the back end (the data warehouse or other sources) and the front end (reporting and dashboarding solutions) can plug in to the semantic model, regardless of the technology choice you have made.
For more information about tabular modeling, please refer to the following sources:
Tabular Modeling (SSAS Tabular) on TechNet
The excellent book, Microsoft SQL Server 2012 Analysis Services: The BISM Tabular Model by Marco Russo, Alberto Ferrari, and Chris Webb.
With a new data model comes also a new query language. Power Pivot introduced the language DAX, which stands for Data Analysis eXpressions. In their efforts to keep Power Pivot as simple as possible to use, the DAX language syntax resembles the Excel formula syntax a lot, which makes it easier for users familiar with Excel to pick up DAX. However, this comparison only lasts for so long and DAX can quickly become complex for more advanced modeling and calculations.
DAX also supports several time intelligence functions, such as year-to date functions, for example. It also supports, to a limited extent, parent-child hierarchies if they are modeled as flattened hierarchies. True modeling of parent-child hierarchies, as in SSAS Multidimensional, is currently not supported. In general, DAX has very limited capabilities for supporting hierarchies, in contrast with MDX. An interesting fact is that you can also query tabular models using MDX.
You can find an overview of the DAX language on TechNet.
Making the Decision
Both the models serve a common purpose: providing the users with a robust and user-friendly analytical environment that can be used for ad hoc analysis, dashboards, and/or corporate reporting. Each model also has its own strengths and weaknesses, something we’ll investigate further.
Both models compress the data while processing. The actual results can vary greatly, depending on the structure of the data, but for Multidimensional you can expect a result of one-third of the original data size, while Tabular can get to one-10th of the size. This is due to the columnar storage, which aids greatly in the compression.
Although Tabular has better compression results, all of the data has to fit into memory. Even more, you must have extra memory available for processing the model. Memory has become cheap over the years, but Tabular isn’t the right solution for storing terabytes of information. Multidimensional can cope more easily with large data set requirements, if your disks are big enough, of course.
Multidimensional supports relational data sources using OLE DB native or managed providers. Tabular, on the other hand, can import data from a wealth of different sources: aside from just about any relational database, it can also import data from SSAS Multidimensional, an SSRS report, OData Sources, and the Windows Azure Marketplace.
Storage and Data Access Modes
As mentioned earlier in the introductions, each model has different modes for storing and querying data. Multidimensional has MOLAP, ROLAP, and HOLAP, while Tabular has in-memory cached and DirectQuery. For most use cases, MOLAP and in-memory are sufficient and will achieve the greatest performance. However, both models take a hit on latency, as the models need to be reprocessed in order to calculate the latest data and aggregations. If you’re looking for more real-time solutions, ROLAP or DirectQuery might provide suitable alternatives.
ROLAP can have some serious performance issues, though, for larger datasets, as it doesn’t have a dedicated query engine. If you’re planning on using a multidimensional model, you can achieve pseudo real-time and acceptable performance by combining MOLAP with pro-active caching.
DirectQuery has a dedicated query engine, but it can only work with SQL Server as a relational data source. There are also some serious limitations: Calculated columns cannot be used, for instance, and only DAX queries can be issued. This means that an Excel PivotTable cannot be used to query Tabular in DirectQuery mode, as Excel uses MDX. For a more detailed look at the real-time capabilities, see James Serra’s blog post, Comparing DirectQuery and ROLAP for real-time access.
All of the reporting tools from the Microsoft Business Intelligence stack support both models. You can use reporting services, Excel, PerformancePoint, and Power View to create reports and dashboards on both Tabular and Multidimensional. Keep in mind, though, that using Power View on top of a multidimensional source is a feature added in SQL Server 2012 SP1 cumulative update 4, so make sure your instances are upgraded.
Both models rely on Windows authentication in order to authenticate users. This means you can assign domain users and groups to different roles. Using these rules, you can set up authorization using either MDX for Multidimensional or DMX for Tabular.
The Multidimensional model, however, has finer-grained options, such as visual totals and cell security. If tight security is an important requirement for your environment, Multidimensional might be a better choice.
Multidimensional models have been around for quite a while, but Tabular models are still in their first release. Therefore, they lacks a lot of features that are present in Multidimensional. As mentioned earlier, Tabular cannot easily model many-to-many relationships and parent-child relationships. It is possible, though, but with quite some elaborate DAX. Tabular only supports single-column relationships and you can have only one active relationship at a time between two tables, which makes implementing role-playing dimensions more difficult. You can work around most limitations, but it can get frustrating after a while.
Some features, however, are totally absent and cannot be imitated through DAX or some work-around. Tabular doesn’t support actions, custom assemblies, write-back, or translations. In multi-lingual global environments, missing translations can be a real showstopper.
It isn’t all bad news, though. Power Pivot solutions can easily be upgraded to Tabular models, making them an ideal prototyping tool.
For more information, check out the following resources:
The choice between Tabular or Multidimensional models in SQL Server Analysis Services 2012 on the element61 website
Tabular model: Not ready for prime time? by James Serra
SQL Server 2012 supports two different models for Analysis Services: Multidimensional and Tabular. Although there are quite some differences between those two, such as the modeling experience and the features they offer, for most BI projects both models will suffice. If you’re looking for more advanced modeling, such as complex many-to-many relationships, Multidimensional is the better choice. If you’re searching for blazing fast performance and your dataset isn’t too large, Tabular will suit your needs. I’d argue that for small and not too complex projects, Tabular will be a good choice because of the ease of development and the good performance, unless you absolutely need one of the features that are present only in Multidimensional. I do hope more features will be added to the next release of Tabular, though.