This week has seen big moves, big changes in the big data landscape – and that’s big :-)
First, the industry took a dramatic turn – led by HortonWorks, Splunk, Pivotal and others with the creation of the Open Data Platform (ODP) initiative.
This isn’t a naming or marketing exercise. It’s a fundamental shift for HortonWorks (see what they say here) and Pivotal – and is joined by many big industry players like IBM, SAS, Splunk and others.
Most importantly, it’s a shift where Pivotal is materially contributing more code to open source. Specifically, Pivotal is contributing 3 big pieces of intellectual property to this initiative by open sourcing:
- SQL on Hadoop engine HAWQ (this is a complete SQL transactional DB layer, and a higher-performance and more complete alternative to other choices like Hive)
- GemFire (an ultra high transactional in-memory DB)
- GreenPlum (a Massively Parallel Processing analytical DB)
HortonWorks and Pivotal started to see this need for a “hardened enterprise open core” as they collaborated on Ambari, an industry-standard open way of managing Hadoop clusters (and one of the Hadoop-related Apache projects).
Today, BTW – along with the analytics elements on top of the data engines – the topic of “how do you manage Hadoop clusters” is where there is a lot of forking (and proprietary value) between the various distributions.
Together with the other open-source elements in Apache® Hadoop® (follow the link, and you’ll see why this ODP initiative is important – you have some tightly coupled parts, and some more loosely coupled projects) – these are packaged up together into the Pivotal HD3 distribution (as well as other distributions like HortonWorks).
What’s this all about? Read on!