Hortonworks buys better Hadoop data flow management

25.08.2015
Hadoop vendor Hortonworks, fresh off releasing a new version of its distribution, has acquired a company with a framework Hortonworks wants for handling how data moves into, out of, and next to Hadoop.

The company is Onyara, and the framework (of which Onyara is a commercial supporter) is the Apache NiFi project, a tool for graphically diagramming how data moves through a system.

Hortonworks sees NiFi as a way to create a new data platform for Hadoop, one that handles data gathered and acted on in real time from a panoply of devices, "smart" and otherwise. Originally a product of the NSA, NiFi was open-sourced under the agency's Technology Transfer Program, the same declassification effort that provided the SIMP cyber security tool.

Rather than trying to build the functionality into Hadoop directly, Hortonworks is creating a parallel product offering, Hortonworks DataFlow (not to be confused with Google's product of the same name). DataFlow will be sold to enterprises looking for a solution to handle data in motion as well as data at rest.

NiFi is also meant to play well with all the other stars of the Hadoop cast, like Spark (for real-time data processing) and Kafka (messaging). Plans are on the table for integrating NiFi-controlled flows into Hortonworks's existing Data Governance Initiative as well, so DataFlow-controlled data can be labeled and tagged even apart from Hadoop itself.

Adding NiFi to the Hortonworks mix complements Hortonworks's central mission, which is to provide Hadoop and related products without proprietary encumbrances. But all signs suggest that pure open source plays of any kind face increasingly tough sledding.

Hortonworks's recent financial news has been mixed, with net losses up despite a growing customer base and increasing quarterly revenue. DataFlow comes off as an attempt to give the company a new revenue stream by selling more to existing customers instead of adding new ones. With the size of the market for commercial Hadoop offerings in question, the former approach seems smarter.

(www.infoworld.com)

Serdar Yegulalp
