Where are the open source analytical databases?
Over the last few months, several vendors have announced offerings that blend open source BI solutions with high-speed analytical databases. Though this might seem like good news at first, it also shows something else. Why would companies like Jaspersoft and Pentaho join forces with proprietary vendors like Vertica and ParAccel? Simple: there isn't a viable open source alternative available. Products like MonetDB, LucidDB or even Infobright cannot scale efficiently beyond the 1 TB range because they lack MPP (massively parallel processing) support.

An alternative like EnterpriseDB's Postgres Plus combined with their GridSQL option (yes, both of these are open source!) could scale out, but lacks the column store and in-memory operation of, for example, MonetDB. That might not be a big problem though: if you look at proprietary solutions such as Netezza, Kognitio, Dataupia or Greenplum, they all use some sort of 'brute force' strategy to achieve fast query performance. The interesting thing here is that the majority of these products have a modified version of PostgreSQL under the hood. Which leads to the following question: when are we going to see an analytical database that is scalable, column oriented, compresses data, uses memory where it can, can be loaded fast and is really open source?