
Big Data Notes 012: Olap (Molap, Rolap, Holap)

Online analytical processing – or OLAP, by its obligatory acronym – is an established method for business intelligence reporting. With its new incarnations, it is primed for working with big data too. Big Data Notes explains.

Molap, Rolap and Holap? Sounds like the characters out of something my kids would watch…

Well, unless your kids are signing in to webinars delivered by leading big data providers or perhaps MIT, this is unlikely to be the case. Online analytical processing – or OLAP – and its derivative forms are basically methods for rapidly analysing and dissecting aggregated sets of data from several angles at once.

So it’s a brand spanking new development to deal with big data then?

Not exactly. As Gartner states, ‘OLAP is, in truth, only a new name for a class of BI products, some of which have existed for decades’. However, it is well suited to big data, and while multidimensional Olap (Molap) has been around for a while as the original form, you could argue that Rolap and Holap have been developed in response to the evolving big data landscape.

So Mo, Ro or Ho?

Yes, let’s.

Molap stands for multidimensional OLAP and is essentially the original OLAP; it only acquired its prefix with the arrival of its cousins Ho and Ro. The ‘multidimensional’ bit refers to the fact that you pull data together from disparate sources into one structure, which we call a ‘data cube’, organised by dimensions such as time, product and region, rather than leaving it in the separate tables of a relational database. The benefit is that Molap tools do not have to join individual tables together at query time to analyse them against each other: the totals are calculated in advance when the cube is built, so queries come back much faster. The trade-off is that building and refreshing the cube takes time, and you can only ask the questions the cube was designed for.
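To make the cube idea a little more concrete, here is a minimal Python sketch. It is not how any particular Molap product stores its data; the sales figures, the three dimensions (region, product, quarter) and the `build_cube` helper are all invented for illustration. The point is simply that every roll-up is worked out once, up front, so answering a question later is a single lookup.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (region, product, quarter, sales)
FACTS = [
    ("North", "Widget", "Q1", 100),
    ("North", "Gadget", "Q1", 50),
    ("South", "Widget", "Q1", 70),
    ("South", "Widget", "Q2", 30),
]

ALL = "*"  # marker meaning "rolled up across this dimension"


def build_cube(facts):
    """Pre-aggregate sales for every combination of values and roll-ups."""
    cube = defaultdict(int)
    for *dims, sales in facts:
        # For each dimension, either keep the value or roll it up to ALL,
        # and add this row's sales to every resulting cell.
        for key in product(*[(d, ALL) for d in dims]):
            cube[key] += sales
    return dict(cube)


cube = build_cube(FACTS)

# Queries are now plain dictionary lookups, with no joins or scans.
widget_total = cube[(ALL, "Widget", ALL)]  # all regions, all quarters
north_q1 = cube[("North", ALL, "Q1")]      # all products in North for Q1
grand_total = cube[(ALL, ALL, ALL)]        # everything
```

Note the cost hiding in `build_cube`: each row is written into eight cells here, and that number doubles with every extra dimension, which is why real cubes need careful design.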

Rolap, meanwhile, has been developed to bring some of the benefits of the OLAP model to traditional relational databases, rather than data cubes. The benefits of this are two-fold. One is that, given the overwhelming popularity of relational databases, many people already have their data stored in this manner and want to be able to perform quick and complex analysis on it without having to redevelop their storage. The second is that the querying is open, rather than being limited to what was pre-configured for the cube, meaning any question can be asked. The process is to add new tables to the database which hold the aggregated information from previous queries. This cuts the latency of analysis, since not all of the tables have to be pulled together again each time to perform consecutive queries.
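The aggregate-table trick can be sketched with nothing more than Python's built-in sqlite3 module. This is an illustration under assumptions rather than the way any given Rolap tool works: the `sales` table, the `sales_by_region_quarter` summary table and the two helper functions are made-up names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base (fact) table in an ordinary relational database
    CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('North', 'Widget', 'Q1', 100),
        ('North', 'Gadget', 'Q1', 50),
        ('South', 'Widget', 'Q1', 70),
        ('South', 'Widget', 'Q2', 30);

    -- Aggregate table: keeps the result of a common roll-up for reuse
    CREATE TABLE sales_by_region_quarter AS
        SELECT region, quarter, SUM(amount) AS total
        FROM sales
        GROUP BY region, quarter;
""")


def region_total(region):
    """Answer a repeated question from the small aggregate table."""
    row = conn.execute(
        "SELECT SUM(total) FROM sales_by_region_quarter WHERE region = ?",
        (region,),
    ).fetchone()
    return row[0]


def product_total(product_name):
    """Any other question can still be put to the base table."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE product = ?",
        (product_name,),
    ).fetchone()
    return row[0]
```

Here `region_total("North")` never touches the raw rows, while `product_total("Widget")` shows the open querying: the summary table has no product column, so the question falls back to the original data.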

Finally, Holap is a hybrid form of Mo and Ro, designed to offer the benefits of each: the speed and variation-friendly attributes of a data cube and the legacy compatibility of relational. Data is divided between the two forms according to where it can be handled most efficiently. The analysis for a single query is then split, run separately on each store, and the partial results are joined back together centrally to provide the answer.
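As a rough sketch of that split-and-rejoin step, the Python below keeps closed quarters as pre-calculated totals (standing in for the cube) and the current quarter as raw rows in SQLite (standing in for the relational side). How a real Holap product divides its data is up to the product; the split by quarter, the `CUBE` dictionary and `total_for_region` are assumptions made purely for illustration.

```python
import sqlite3

# Cube side: hypothetical pre-aggregated totals for a closed quarter,
# keyed by (region, quarter).
CUBE = {
    ("North", "Q1"): 150,
    ("South", "Q1"): 70,
}

# Relational side: hypothetical raw rows for the quarter still in progress.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('South', 'Widget', 'Q2', 30),
        ('North', 'Gadget', 'Q2', 20);
""")


def total_for_region(region):
    """Split one question across both stores, then combine the answers."""
    # Part 1: look up the pre-calculated totals held in the cube.
    cube_part = sum(total for (r, _quarter), total in CUBE.items() if r == region)
    # Part 2: aggregate the detailed rows in the relational store.
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,),
    ).fetchone()
    # Join the two partial results centrally.
    return cube_part + row[0]
```

The caller of `total_for_region("North")` neither knows nor cares that part of the answer came from each side, which is the whole appeal of the hybrid.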

How do I know if it’s Olap?

When originally defining Olap in 1993, Edgar F Codd, the computer scientist behind the relational database model, came up with 12 rules, which can be used to evaluate the different products on the market. These are:

1. Multidimensional conceptual view
2. Transparency
3. Accessibility
4. Consistent reporting performance
5. Client/server architecture
6. Generic dimensionality
7. Dynamic sparse matrix handling
8. Multiuser support
9. Unrestricted cross-dimensional operations
10. Intuitive data manipulation
11. Flexible reporting
12. Unlimited dimensions and levels

Accept no imitations.

Where can I get it?

The usual suspects – Microsoft, Oracle, IBM, SAP etc – all offer proprietary systems based on the OLAP model which come with their database and analytics tools. Emerging BI and big data company Pentaho offers a free version – Mondrian – through its open source community editions range, as per its usual model.