Evaluating the Pentaho Open Source BI Platform
My previous article discussed why “Open Source Business Intelligence is a Good Idea”. In this article, I evaluate Pentaho BI, one of the most popular open source BI solutions (others being Jaspersoft and Actuate). Pentaho BI is offered in two flavors: the community (aka free) edition and the enterprise (aka commercial) edition. The community edition contains most of the functionality of the enterprise edition, barring some administrative and auditing features. Of course, the enterprise edition comes with professional support from Pentaho, whereas support for the community edition is available only through the forums and wikis.
The core of the Pentaho BI system is the BI server, which hosts all the BI components and provides the infrastructure for report scheduling, viewing, and user management. All the BI elements are organized into a logical hierarchy of browsable folders called “Solutions”. The latest release, Pentaho 2.0, offers a sleek AJAX/Web 2.0 based user interface.
My evaluation of Pentaho 2.0 is based on the following components:
Pre-Published reports
Pentaho provides the ability to create pixel-perfect reports using the Pentaho Reports Designer.
The reports designer provides all the functionality that comes with any standard report creation tool. For non-technical users who are not comfortable with report creation, it also offers an easy-to-use, wizard-based interface.
Once these reports are created, they can be easily deployed on the reporting server into one of the available solutions, where users can invoke them on demand, run them in the background, or schedule them for periodic execution.
Ad-hoc Reports
Pentaho BI allows business users to create reports ad hoc and share them with colleagues and peers without having to rely on the IT department. A simple web-based wizard guides business users through the steps to create and publish these reports.
Dashboards
Pentaho provides the facilities to create visual dashboards using JSP technology. Complex dashboards can be designed by connecting individual elements in a simple JSP page. Alternatively, one can create dashboards without writing a single line of code using the Community Dashboard Framework (CDF). However, I have not yet used the CDF.
Reporting Workflows
At the heart of the Pentaho BI server is the concept of a process flow (referred to as an Action Sequence). An action sequence is the smallest unit of work that the Pentaho server is capable of executing. Hence, every report on Pentaho (ad hoc or pre-published) is converted into an action sequence in order to be executed. An action sequence is an XML document with a .xaction extension. The concept is a little hard to grasp in the beginning. But once you get the hang of it, the notion of a BI workflow allows you to do some of the most powerful things that can be expected of any BI system. Tasks like report bursting and exception-based reporting become pretty simple to achieve. Pentaho has an Eclipse-based design studio that allows you to create complex action sequences with minimal effort (once you get over the learning curve).
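To make the workflow idea more concrete, here is a minimal conceptual sketch in Python of what bursting and exception-based reporting amount to when expressed as a chain of steps. This is not Pentaho code and does not use the .xaction format. The function names (burst, render, run_sequence), the sample rows, and the recipient addresses are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical data: in a real workflow these rows would come from a query step.
ROWS = [
    {"region": "East", "product": "Widgets", "sales": 1200},
    {"region": "West", "product": "Widgets", "sales": 300},
    {"region": "East", "product": "Gadgets", "sales": 150},
]

# Hypothetical mapping from burst key to recipient address.
RECIPIENTS = {
    "East": "east-manager@example.com",
    "West": "west-manager@example.com",
}


def burst(rows, key):
    """Split one result set into per-key slices (the bursting step)."""
    slices = defaultdict(list)
    for row in rows:
        slices[row[key]].append(row)
    return dict(slices)


def render(region, rows):
    """Render one slice as plain text; stands in for a real report renderer."""
    lines = [f"Sales report for {region}"]
    lines += [f"  {r['product']}: {r['sales']}" for r in rows]
    return "\n".join(lines)


def run_sequence(rows, recipients, threshold):
    """Chain the steps: burst, apply an exception rule, render, address."""
    outbox = []
    for region, region_rows in burst(rows, "region").items():
        total = sum(r["sales"] for r in region_rows)
        # Exception-based reporting: only report regions below the threshold.
        if total < threshold:
            outbox.append((recipients[region], render(region, region_rows)))
    return outbox
```

Calling run_sequence(ROWS, RECIPIENTS, 500) returns a single (address, report) pair for the West region, since only its total falls below the threshold. On the server, each of these steps would presumably be expressed as an action inside the sequence rather than as a Python function.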
Metadata Model
To simplify the complex relationships between the various data sources in a report, Pentaho provides a metadata model (similar to Universes in BusinessObjects). Ad hoc reports depend on such business-centric data models to shield the business user from complex data relationships. The metadata editor provides all the functionality needed to build these models.
Administration
Along with feature-rich functionality for end users, Pentaho BI also offers strong support for system administration. This is the key area where the enterprise edition offers more functionality than its free counterpart. The community edition supports user management, role management, data source management, service management, and public schedule management. The enterprise edition adds support for performance monitoring, audit monitoring, server clustering, and automated content cleaning.
To summarize, I must say I was impressed with the features offered by Pentaho. Its free community edition seems to offer more functionality than several commercial BI systems. It does have its share of bugs and problems, but nothing that you can’t live with. I have not yet evaluated its analytical reporting capabilities (Mondrian) or its data mining functionality (Weka). Pentaho also has a strong data integration system called Kettle. More on all of these topics very soon.








