Excel taking the BI crown in 2010?
2010 is well under way and many of you have posted your insights and predictions for the upcoming year. No need to add to those posts; the ones by Nenshad Bardoliwalla (with some excellent comments from Neil Raden) and James Kobielus pretty much sum it up, and are also linked to from many other places, as you can see on Topsy. If you combine those predictions/outlooks with this enormous list compiled by the Cutter Consortium you'll see a lot of topics that get mentioned repeatedly. Cloud, Agile and Social Media seem to be the hottest trends for the years ahead, and Open Source, Operational BI and Analytical Databases are also part of almost every list.
I tend to be a bit skeptical about these predictions. They seem to overlook the fact that since the final quarter of 2008 many people have been laid off, costs have been cut to the bare acceptable minimum and a lot of companies are just struggling to survive. In this financial and economic climate I don't see a lot of companies investing in 'new stuff', since to be able to invest, you need growth. And very few (large) companies are growing nowadays, let alone training or hiring the people needed to implement all these cool new technologies. This doesn't mean that you cannot improve, just that the pace and scale will be limited. Nevertheless, a marathon also consists of many single steps, and some of these steps can be made very easily. Since I don't have a crystal ball to make predictions, I'll give some tips for the year to come that can help you start your own marathon.
1. Excel
This might seem strange coming from an open source advocate like me, but I think we're likely to see an even wider adoption of Excel as a BI tool. And for the daring few who are using OpenOffice Calc, there's some good news too. Excel is arguably the most (mis-)used BI tool on the planet and, as much as they've tried, none of the BI megavendors has succeeded in offering a solution that has similar flexibility combined with better manageability: a solution that persuades users to abandon Excel not because they have to, but because they want to. So it looks like we're stuck with Excel, whether we like it or not. I don't think this is necessarily bad either, considering what you can do to overcome Excel's weaknesses.
First of course there's Microsoft's own PowerPivot offering (a.k.a. Gemini) which will turn Excel into an analytic powerhouse. Combined with SharePoint for collaboration, distribution and management, Microsoft will be able to extend their already strong position on the desktop, possibly replacing existing BI solutions or at least stopping wider deployment of these tools. There's a catch though: PowerPivot is an add-in for Excel 2010. Many companies are still on 2003 or only recently migrated to 2007. These companies are not likely to upgrade again to 2010 anytime soon. For most users, Office 2003 offers more than enough functionality already, so why upgrade?
Fortunately, there's no need to upgrade; BI in Excel is perfectly feasible with existing versions of the software. I would urge anyone to take a serious look at Jedox Palo, a complete suite of tools that still uses Excel (or OpenOffice Calc) as a front-end and analysis tool, while at the same time mitigating the weaknesses of Excel (data manageability, support/performance for large datasets, and security). Jedox released version 3.1 of their toolset last December. That's not all though; Jedox has now released almost every bit of code under the GPL (v2), meaning that the previously closed Worksheet server is now a standard part of the Community Edition (BI Suite) offering. And for the skeptics among you, it really is open source; just have a look at the Palo Subversion repositories. For a good overview and step-by-step instructions on how the different components work and fit together, take a look at this video. Also worth mentioning is the fact that Palo lets you run everything on Linux as well; there's a Linux installer for the BI server, and there's a Palo add-in for OpenOffice Calc that works on both Windows and Linux (though not always out of the box).
There are more alternatives that don't break the bank; Vizubi is a free Excel add-in that shares many of the PowerPivot characteristics but also works with Office 2003 and 2007. Another interesting option is the SimbaO2X driver from Simba Technologies, which lets you use any XML/A compliant OLAP server (e.g. Mondrian or SAP/BW) as a datasource for those Excel pivot tables. A similar solution is available from Jaspersoft. And if you're just looking for more options for collaboration or web based access to your Excel data, simply upload your worksheet to any of the online spreadsheet services like Zoho Sheet, Google Docs or EditGrid. The latter contains even more functions (500) than Excel (350)!
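To make the XML/A option a bit more tangible: under the hood, an XML/A client sends a SOAP request containing an MDX statement to the OLAP server. The Python sketch below only builds such an Execute request; it doesn't send anything. The function name, the MDX query and the 'FoodMart' catalog are made up for illustration (they assume something like Mondrian's sample data), and a real server may require extra properties such as DataSourceInfo, so treat this as a rough picture of the message rather than a working client.

```python
import xml.etree.ElementTree as ET

# Namespaces used by SOAP 1.1 and by XML for Analysis (XML/A).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"

# Readable prefixes in the serialized output (purely cosmetic).
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("xmla", XMLA_NS)


def qname(namespace, tag):
    """Return a namespace-qualified tag in ElementTree's {uri}tag notation."""
    return "{%s}%s" % (namespace, tag)


def build_execute_request(mdx, catalog):
    """Build an XML/A Execute request as a string (hypothetical helper).

    mdx     -- the MDX statement to run
    catalog -- the catalog name on the server (an assumption in this example)
    """
    envelope = ET.Element(qname(SOAP_NS, "Envelope"))
    body = ET.SubElement(envelope, qname(SOAP_NS, "Body"))
    execute = ET.SubElement(body, qname(XMLA_NS, "Execute"))

    # The query itself goes into Command/Statement.
    command = ET.SubElement(execute, qname(XMLA_NS, "Command"))
    ET.SubElement(command, qname(XMLA_NS, "Statement")).text = mdx

    # Request properties go into Properties/PropertyList.
    properties = ET.SubElement(execute, qname(XMLA_NS, "Properties"))
    prop_list = ET.SubElement(properties, qname(XMLA_NS, "PropertyList"))
    ET.SubElement(prop_list, qname(XMLA_NS, "Catalog")).text = catalog
    ET.SubElement(prop_list, qname(XMLA_NS, "Format")).text = "Multidimensional"

    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Example query against an assumed sample cube; adjust to your own server.
    example_mdx = "SELECT {[Measures].[Unit Sales]} ON COLUMNS FROM [Sales]"
    print(build_execute_request(example_mdx, "FoodMart"))
```

A driver like the ones mentioned above takes care of this plumbing for you, which is exactly why Excel can treat the OLAP server as just another pivot table datasource.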
2. Fast & cheap hardware
By far the cheapest hardware is the hardware you already own. Last week I was on the verge of trashing a 2002 Shuttle box because it was too slow to do anything useful with (it had Windows XP installed). Still, I decided to install Ubuntu 9.10 to see how it would behave. And now my daughter has a 'new' computer! It boots fast, is pretty responsive and the integrated video chip even lets you use the 'normal' setting for visual effects. The only glitch was with the USB Wi-Fi stick, but 3 minutes of Googling solved that problem too. The message here is: you can extend the lifetime of your hardware considerably by using Linux. Bad news for hardware vendors, good news for cost-aware customers.
More bad news for hardware vendors: as much as they want companies to 'update' their servers, it's likely that the former generations of Xeon machines do their job just fine. And when they don't, be careful not to overestimate requirements. It's amazing what you can do on a single dual Xeon box with ample memory and disk drives. Last year I was able to build a very fast machine for under $10K. It has 8 cores, 64 GB of RAM and 12 SSD drives connected to 2 fast Adaptec controllers. It beats most of the currently listed servers on the TPC-H benchmark and, best of all, it only has a single 650 Watt power unit, so the QphH/Watt ratio is extremely favorable.
Another hardware tip: if there's still some money left from the Christmas bonus, do yourself a favor and put an Intel X25-M SSD (80 or 160GB) in your laptop. Remember the feeling you got when you turned on your new 80386 computer for the first time after having worked on an 8086 or 80286 PC? That's what I mean...
3. Fast & cheap software
The same hardware message applies here too: the cheapest software is the stuff you already have. In the 'Excel' part I described how to better utilize existing Office software, but there's more to 'fast & cheap' than that. This year will show a number of new entrants in the analytical database space, many of them open source. Calpont has been working on a product for years, changing direction every now and then, but ultimately decided to go for a MySQL-compatible column store. First results of the beta community edition of InfiniDB are very encouraging. Another product that will make a splash in the next couple of months is Ingres/VectorWise. The VectorWise team developed a product that can fully utilize modern CPU architectures. From what I've seen and heard about it, the performance gain compared to traditional row-oriented products will be spectacular. Combined with already available products like MonetDB, LucidDB, Infobright CE and the free single node version of Greenplum, there's enough to choose from without spending large amounts of money. I wouldn't say 'no money' because you'll likely want to sign a support contract with these vendors if you want to run their product in a production environment.
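For those who haven't worked with a column store yet, here's a toy Python sketch of the basic idea behind the row-oriented versus column-oriented distinction. It is in no way how InfiniDB, VectorWise or any of the other products are implemented, and the order data and function names are invented for the example. It only shows why an analytical query that aggregates one column has far less data to touch when values are stored per column.

```python
# Made-up sample data: three orders with three attributes each.
ROWS = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APAC", "amount": 80.0},
    {"order_id": 3, "region": "EMEA", "amount": 200.0},
]


def to_columns(rows):
    """Pivot a list of row records into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(rows):
    """Row-oriented: every complete record is visited to read one field."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    """Column-oriented: only the 'amount' column is read; the rest is skipped."""
    return sum(columns["amount"])


if __name__ == "__main__":
    columns = to_columns(ROWS)
    print(total_amount_row_store(ROWS))
    print(total_amount_column_store(columns))
```

Both functions return the same total; the difference is in how much data has to be read to get there. With three columns that hardly matters, but with a fact table of a hundred columns and a few hundred million rows it does, and values of a single column also tend to compress much better when stored together.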
Not necessarily fast (well, fast to install and deploy) but certainly powerful and cheap are the analytical tools KNIME and RapidMiner. Both offer a full suite of data mining, statistical and visualization capabilities, and both also offer a comprehensive set of tutorials, wizards and (online) help to get you started. Did I already mention these are open source as well? Another option in this category of analytical tools is RExcel, which is yet another way of extending the power of Excel (or OpenOffice with ROOo).
And that brings us right back to the first part of this small list. 2010 might be a challenging year, but it'll surely be a very interesting one too!