Use Apache Spark? This tool can help you tap machine learning

01.07.2016
Finding insight in oceans of data is one of enterprises' most pressing challenges, and increasingly AI is being brought in to help. Now, a new tool for Apache Spark aims to put machine learning within closer reach.

Announced on Friday, Sparkling Water 2.0 is a major update from H2O.ai designed to make it easier for companies using Spark to bring machine-learning algorithms into their analyses. It's essentially an API (application programming interface) that lets Spark users tap H2O's open-source artificial-intelligence platform instead of, or alongside, the algorithms included in Spark's own MLlib machine-learning library.

Among the highlights of the new software is the ability to run Spark and Scala code through H2O's Flow user interface. Sparkling Water 2.0 also brings a new visualization component to MLlib, letting users see the results of their algorithms in an easy-to-digest form.

The software supports the Apache Zeppelin notebook as well as Spark 2.0 and all previous versions. It offers production support for machine-learning pipelines. Model and data governance can be handled through H2O's Steam data-science hub.

Businesses increasingly need to tap a variety of machine-learning algorithms to solve the problems they face today, said Matt Aslett, a research director with 451 Research. “Sparkling Water is likely to be attractive to H2O and Spark users alike, enabling them to mix and match algorithms as required.”

Katherine Noyes