What are Dynamic Cubes?
Dynamic Cubes are a new feature released with IBM Cognos BI version 10.2. They extend to OLAP the Dynamic Query Mode (DQM) functionality that was introduced with Cognos 10.1. Dynamic Cubes are based on ROLAP technology, like Microsoft Analysis Services cubes. The primary benefits of Dynamic Cubes are application responsiveness for users and unprecedented scalability.
The introduction of a new OLAP construct could be interpreted as the end of PowerPlay cubes. While Transformer and PowerPlay will likely be deprecated in the coming years, there has been no announcement yet to that effect. There is a large user base and no migration path for PowerPlay cubes. Dynamic cubes should not be considered as the evolution of PowerPlay cubes, as each technology excels for different use cases.
Why use Dynamic Cubes? What are the alternatives?
With Dynamic Cubes, the IBM Cognos BI solution has four flavours of OLAP, each with a unique history and each addressing a different need. I have also included Microsoft SQL Server Analysis Services (MSAS) for discussion, as Cognos has always been an effective presentation layer for MSAS cubes.
Here are the five OLAP options:
Cognos PowerPlay is no longer sold by IBM, but it is still supported. The modeling tool is known as Transformer. There is no migration path for PowerPlay cubes, but the same functionality can be found in at least one, if not all, of the other OLAP options. The strength of PowerPlay is its ease of use: both for business users, browsing data on the web or in the Windows PowerPlay client; and for developers, using Transformer, the modeling tool. The cubes can be built from a star schema data warehouse or from disparate, semi-structured data sources. The weakness of PowerPlay is its limited scalability. As it is a 32-bit application, cubes are limited in size to 2 GB, and cubes greater than a few hundred megabytes can be unwieldy and difficult to manage.
PowerPlay Strengths:
- Proven and mature product
- Relative time calculations are built-in and easy to customize
- Transformer will leverage a well designed star schema data warehouse
- Cubes can also be built from disparate and semi-structured data sources (although it is more work)
- Transformer can use Cognos reports and Framework Manager Query Subjects as data sources
- Less intensive hardware requirement, as the cubes are built off-line
- Very responsive user experience when working with small amounts of data
PowerPlay Weaknesses:
- Scalability, as there is a 2 GB size limit for the cube file
- Scaling requires careful management using cube groups and drill-throughs
- Performance can be poor as the data volume increases
- 32-bit technology, which limits the ability to do caching and take advantage of modern server hardware
- Due to the size limit and performance limits, cubes usually only contain summarized data (if more detail is required, multiple cubes or drill through reports must be used, adding development and administrative complexity)
- A limit on the number of categories per dimension level
- Cube build times can be long
- Row level security is difficult
- No migration path for Transformer models and PowerPlay cubes
- PowerPlay is on the path to deprecation; no new features are expected
TM1 came out of the acquisition of Applix by Cognos, shortly before Cognos itself was acquired by IBM. TM1 is primarily a financial planning, budgeting and forecasting tool, based on an OLAP technology foundation. It does what-if analysis and has write-back functionality, which the other IBM Cognos tools do not. It is not a good general purpose BI tool, as it is designed specifically for financial analysis. Scalability is limited: as data and user volumes grow, performance degrades rapidly.
Dimensionally Modeled Relational, or DMR, is a hybrid approach to OLAP. Using this feature in Framework Manager requires a well designed star schema data warehouse or performance will be very poor. With DMR, the OLAP model is designed in Framework Manager. In the background, SQL is run against the database and the results are cached as necessary for local processing. This means that a very complex SQL query is generated and executed for every report that is created against the DMR model. For a small amount of data, this can work well, but it is not scalable. Additionally, the effort required to model the data makes this a poor choice for most situations. The exception would be if the data warehouse is a clean, well tuned star schema, and only a limited amount of OLAP functionality is required by a small number of users, making the investment in one of the other options a costly choice (in terms of both licenses and development effort).
Microsoft SQL Server Analysis Services (MSAS) is obviously not an IBM product, but I have included it here because Cognos BI works well as a presentation layer for MSAS cubes. It is a consideration for many IBM Cognos customers, as Analysis Services is included with the popular Microsoft SQL Server database. MSAS is a mature software product that supports OLAP features over large volumes of data. While Microsoft pushes Excel and SharePoint as front ends for its BI offerings, Cognos has long provided a slick alternative interface, first with PowerPlay, then later with ReportNet, Cognos 8 and now Cognos 10.
MSAS Strengths:
- Mature product
- Scalable
- Handles large dimensions with large amounts of detail level data inside the cube structure
- Very flexible calculations are possible with MDX coding
- Dynamic and granular security
MSAS Weaknesses:
- Requires Microsoft SQL Server
- Not all MSAS features are exposed through all BI front ends
- MDX coding is required for relative time and other features, which increases the complexity of cube development
Dynamic Cubes are the newest choice, and the subject of the remainder of this article. Dynamic cubes work like MSAS: the data remains in the database, ideally in a star schema data warehouse.
Why a star schema?
The short answer is that this is the data structure the Cognos software expects to use. Dynamic cubes are optimized to run this way. A snowflake schema will work, and even semi-structured data can be used, but the consequences are longer development time and slower performance. Small amounts of data let you get away with a suboptimal design, but anything greater than a few million records probably will not. In today’s world of ubiquitous data capture and big data, it is common to summarize and report on tens or hundreds of millions of records, so a scalable design is a must.
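To make the star schema idea concrete, here is a minimal sketch: a single fact table surrounded by denormalized dimension tables, joined on surrogate keys. The table and column names are hypothetical, and I am using SQLite via Python only so the rollup query can actually be run; the same shape applies to any warehouse database.

```python
import sqlite3

# Hypothetical minimal star schema: one fact table, two dimension tables,
# each dimension joined to the fact table by a single surrogate key.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INT, month INT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY,
                          product TEXT, category TEXT);
CREATE TABLE fact_sales  (date_key INT REFERENCES dim_date,
                          product_key INT REFERENCES dim_product,
                          quantity INT, revenue REAL);
""")
con.executemany("INSERT INTO dim_date VALUES (?,?,?)",
                [(20240101, 2024, 1), (20240201, 2024, 2)])
con.executemany("INSERT INTO dim_product VALUES (?,?,?)",
                [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
con.executemany("INSERT INTO fact_sales VALUES (?,?,?,?)",
                [(20240101, 1, 10, 100.0), (20240101, 2, 5, 75.0),
                 (20240201, 1, 8, 80.0)])

# An OLAP-style rollup (revenue by year and category) is a simple,
# predictable join from the fact table out to each dimension. This
# regularity is what OLAP engines are optimized to exploit.
rows = con.execute("""
SELECT d.year, p.category, SUM(f.revenue)
FROM fact_sales f
JOIN dim_date d    ON d.date_key    = f.date_key
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY d.year, p.category
""").fetchall()
print(rows)  # [(2024, 'Hardware', 255.0)]
```

A normalized or semi-structured source forces the engine to generate much more complicated joins to reach the same answer, which is where the development time and performance penalties come from.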
Prerequisites for Dynamic Cubes
A prerequisite for Dynamic Cubes is Dynamic Query Mode, or DQM, a feature introduced in Cognos 10. DQM is a sophisticated data cache that improves report performance by holding data and calculations in memory on the server. A dynamic cube is another layer on this data cache, putting into memory the aggregate calculations and intersections that make OLAP possible. In PowerPlay, the mdc file contains all of these aggregate calculations (compiled by Transformer), and the PowerPlay Enterprise Server can cache a limited amount of the data. For small cubes, performance was very fast, allowing users to browse the data with ease and to create crosstab reports, which are computationally very expensive in relational reporting. The dynamic cube architecture, with intelligent in-memory aggregates, provides even better performance (once the cache is loaded), without the data volume limitation of PowerPlay. It is even conceivable to load most of the data warehouse into memory, permitting the use of OLAP functionality across hundreds of millions of records.
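The idea of caching aggregate intersections in memory can be sketched in a few lines. This is purely illustrative of the concept, not of Cognos internals; the class, the keying scheme and the fact rows are all made up for the example.

```python
class AggregateCache:
    """Illustrative in-memory cache of aggregate values, keyed by the
    dimension members that define an intersection, e.g. (year, region)."""
    def __init__(self, compute_fn):
        self.compute_fn = compute_fn  # fallback to the source on a miss
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, intersection):
        if intersection in self.store:   # warm: served from memory
            self.hits += 1
        else:                            # cold: compute, then remember
            self.misses += 1
            self.store[intersection] = self.compute_fn(intersection)
        return self.store[intersection]

# Pretend the "data warehouse" is a list of fact rows: (year, region, revenue).
facts = [(2024, "East", 100.0), (2024, "West", 50.0), (2025, "East", 70.0)]
cache = AggregateCache(lambda key: sum(r for y, g, r in facts
                                       if (y, g) == key))

print(cache.get((2024, "East")))  # 100.0  (miss: rolled up from facts)
print(cache.get((2024, "East")))  # 100.0  (hit: served from memory)
print(cache.misses, cache.hits)   # 1 1
```

The first request pays the cost of the rollup; every subsequent request for the same intersection is a memory lookup, which is why a warmed cache feels so responsive and a cold one does not.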
Benefits of Dynamic Cubes
Dynamic cubes look and behave very much like Microsoft Analysis Services cubes. From a business user perspective, the most visible benefit is the presence of detail level data in the cube. Performance and user responsiveness are not sacrificed. IBM claims that a dimension can contain hundreds of millions of records. This means that analysts can go from summary to detail level data very quickly. Reporting on detail level data is not limited to list reports or simple charts: crosstabs and complex visualizations can be done quickly, as OLAP features like suppress zeros (a very expensive operation against relational data), count distinct and row over row calculations can be done on massive volumes of data.
More Benefits of Dynamic Cubes
- Data comes directly from the data warehouse database, so there is no build time for an OLAP cube file, nor a requirement to manage the publishing, moving and archiving of these files
- Scalable performance by using aggregate awareness (to pull aggregates directly from the database as opposed to calculating them on the fly)
- Can be used with Report Studio, Workspace and Workspace Advanced
- Leverages a star schema or snowflake data warehouse
- Support for dimension attributes
- Built-in relative time calculations (YTD, QTD, etc.) like in PowerPlay Transformer
- Calculations can be extended further with MDX, like the custom relative time calculations in Transformer
- With the Aggregate Advisor, tuning suggestions are provided for the database, based on user queries
- Large data volumes without sacrificing performance
- No separate license is required
- Support for attributes, for example, a product can have attributes such as color, weight, etc. (Transformer did not handle this very well: you could add another dimension or alternate drill path, but it could make for a confusing design)
- Eliminates the need for drill-through reports to provide detail level data
Potential Pitfalls
As compelling as the benefits of Dynamic Cubes are, there are some potential pitfalls:
- For optimal performance, the Cognos servers must be sized and tuned properly, and the data warehouse must be well structured, indexed and maintained.
- In order for caching to work optimally, adequate memory must be allocated.
- The data source must be a star schema dimensionally modeled database. I have always considered this to be a best practice with Transformer and MSAS, but it is even more important for building Dynamic Cubes.
- There is no possibility of leveraging work done in Transformer or in Framework Manager.
- The first users of the cube suffer poor performance until the cache is warmed. This can be mitigated by running reports after the ETL updates the data warehouse and before users are expected onto the system.
- Creating the aggregate tables is an iterative process using the aggregate advisor. This requires feedback from users so performance might be poor until the tuning is complete, which could take several weeks.
- Dynamic Cubes are a relatively new feature of Cognos 10. They came out of DB2 Cube Views, so the underlying technology is proven, but it is still a bit rough around the edges.
- Updating the source data must be balanced with refreshing the cache, meaning that real-time OLAP reporting will not be possible in most cases. However, by using virtual cubes, near real-time reporting can be implemented.
How do Dynamic Cubes work?
There are three components of Cognos Dynamic Cubes:
- IBM Cognos Cube Designer, which is the modeling tool, analogous to Transformer.
- The dynamic cube data sources in Cognos Administration, which contain details about the dynamic cube and some tuning parameters.
- The Aggregate Advisor, which is launched from the Dynamic Query Analyzer.
Business users access Dynamic Cubes through Workspace or Workspace Advanced, which is a combination of Query Studio, PowerPlay and Analysis Studio. Workspace Advanced can handle most analysis and reporting tasks. For more complex reports, Report Studio can be used.
The smart caching works on two levels, for both query planning and query execution. The power of the cache, which drives the high performing OLAP features, is aggregate awareness: the software knows where the intersections exist in the source data. Dynamic cubes use a combination of cached in-memory aggregates and in-database aggregate tables. These aggregate tables are created with the assistance of the Aggregate Advisor. This feature provides guidance on how to improve performance by creating these aggregate tables, so use of the Aggregate Advisor must be part of the initial cube deployment plan.
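Aggregate awareness boils down to a routing decision: if a pre-built aggregate table covers the grain of an incoming query, read from it; otherwise fall back to rolling up the fact table on the fly. Here is a sketch of that decision, with hypothetical table names and a deliberately simplified notion of "grain":

```python
# Hypothetical pre-built aggregate tables, keyed by the set of dimension
# levels they cover (their "grain"). An aggregate table can answer a query
# only if its grain includes every level the query asks for.
AGG_TABLES = {
    frozenset({"year"}): "agg_sales_by_year",
    frozenset({"year", "region"}): "agg_sales_by_year_region",
}

def route(query_levels):
    """Return the cheapest source that can answer a query over the given
    dimension levels: the narrowest covering aggregate, else the fact table."""
    candidates = [(grain, table) for grain, table in AGG_TABLES.items()
                  if set(query_levels) <= grain]
    if candidates:
        # Prefer the narrowest grain: fewer levels means fewer, more
        # summarized rows to scan.
        return min(candidates, key=lambda c: len(c[0]))[1]
    return "fact_sales"  # no covering aggregate: roll up on the fly

print(route(["year"]))             # agg_sales_by_year
print(route(["year", "region"]))   # agg_sales_by_year_region
print(route(["year", "product"]))  # fact_sales
```

The Aggregate Advisor's role, in these terms, is to look at the queries users actually run and recommend which entries belong in that table of aggregates, which is why it has to be part of the initial deployment plan rather than an afterthought.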
Dynamic Cube Design
Ideally, each cube should be designed around a single fact table, reflecting the star schema data warehouse. While snowflake schemas are supported, good results are difficult to achieve. The cube designer tool expects to see fact and dimension tables with primary and surrogate keys, and foreign keys on the fact table will save a lot of time. Too many table relationships, as found in highly normalized data, will hinder cube performance. The single fact table per cube is not usually a limitation, as multiple cubes can be created inside the same project, and virtual cubes can be created to combine cubes designed against different fact tables and different data sources. By the same token, dimensions can be shared across cubes, saving development time and providing a consistent user experience across different cubes. A nice feature is the ability to browse data from within the designer tool to do a quick visual check of the results. The cube designer has an auto-design feature, but the results are problematic, even against a star schema data warehouse.
The table relationships between fact and dimension tables are automatically detected by the cube designer. If they are missed, the joins can be added manually. This allows some data modeling to be done in the designer tool, but again, it is better to do the data warehouse modeling in the database to save time and to avoid performance problems.
When the cube is published, it appears in the Cognos Administration tool, under Query Service. The caching can be controlled via a query service admin task, which can be scheduled or run off a trigger. The default cache refresh is every 24 hours. Managing the cache needs to be carefully considered, as performance can be very poor the first time users hit a dynamic cube. A technique to avoid this problem is to schedule reports to run to populate (or “warm”) the cache. The default cache memory is 1 GB, which will need to be adjusted upwards as data volumes increase. The maximum cache memory is 2 terabytes. Yes, 2 terabytes of memory; I think such configurations will be commonplace within a couple of years.
Thoughts on choosing an OLAP Solution
Dynamic Cubes are a powerful addition to the Cognos BI toolkit, but they should not be the default choice for a BI project. Assuming that you have decided on building an OLAP component (another discussion), there are many considerations to take into account before deciding on Dynamic Cubes:
- O/S and software requirements
- database and data source support
- data volume and scalability requirements
- specific OLAP features
- security
- licensing and the vendor relationship
- availability and the skills of the BI administrators
- ease of use and flexibility (both very subjective)
Making a choice
I tend to agree with this discussion on the IBM Cognos OLAP options, which I summarize here:
Ease of use: PowerPlay and Transformer are the best for both end users and developers. Dynamic Cubes are a close second, and since they are new, we will likely see rapid improvements over the next couple of releases. I would like to see some integration with Framework Manager so that modeling work done there could be leveraged for Dynamic Cubes. TM1 and DMR have a steeper learning curve.
Flexibility: TM1 and MSAS (if it can be considered an option) offer the greatest flexibility. Transformer and DMR are the least flexible solutions, meaning that developers are more likely to have to compromise on the business requirements. Dynamic Cubes are a happy medium, and again, they will likely get even better over the next few releases.
Development speed: A cube can be whipped up quickly in Transformer, and if the prerequisite data modeling work has been done, in DMR. Dynamic Cubes are more work as existing Framework Manager packages cannot be leveraged, but if the data warehouse is a well designed star schema, it can be as fast as Transformer to develop a cube. Using virtual cubes and reusing dimensions across cubes make broadening the solution a quick task.
Overall conclusion: For small amounts of data, PowerPlay remains a good choice if it is already a part of your current BI infrastructure. MSAS is a solid choice for Microsoft SQL Server shops, and the Tabular Model is an attractive, if unproven, feature. DMR is complex to develop and not scalable, but it can work well for small amounts of data in a contained environment. TM1 is a finance application and not really suitable as a general purpose OLAP tool.
Dynamic Cubes are very promising, and they will probably displace the other options as the go-to IBM Cognos OLAP solution. I would expect them to be a little rough around the edges for the next couple of releases until the problems are worked out. They require a commitment to administration and tuning, especially during the initial deployment, which includes creating the aggregate tables, tuning the cache refresh and configuring the memory settings. Some kind of monitoring must be in place to ensure that adequate CPU, memory and bandwidth are available to support the solution as data volumes grow.
In all cases, getting the data into a dimensional star schema data warehouse will make the OLAP modeling and development an order of magnitude easier. Cognos software expects to see the data in that format, so it makes it easier to model the cube and enable OLAP features. Data quality fixes and business logic should be pushed upstream as much as possible into the data warehouse: this has always been true and remains so today.
Resources
Here are some useful resources I found for learning about Dynamic Cubes:
Webinars from Senturus Consulting:
Documentation from IBM: