Dataiku DSS 3.1 Goes Spark-Native with Scala

Dataiku releases Data Science Studio (DSS) 3.1, now with 5 machine learning backends for visual data transformation (including H2O and MLlib) and support for Scala

Dataiku, the maker of the all-in-one predictive analytics software platform Dataiku Data Science Studio (DSS), has today announced the release of Dataiku DSS 3.1, which adds new external integrations, an improved user interface, and 5 visual machine learning engines, and now enables transformations in Apache Spark’s native language, Scala.

The blending of visual code-free and free-form code-based transformations is one of the main strengths of Dataiku DSS for the prototyping and production of data applications. In addition to Python, R, SQL, Hive, Impala, and Pig, Dataiku DSS 3.1 now enables Apache Spark users to write transformations and interactive notebooks in Scala, bringing the power of Spark's native and most performant language to the data teams using Dataiku DSS.

Compared with Python, Scala is considered the ‘engineering language’ for developing data science applications. One of the main advantages of using Scala in an integrated data science production environment is its agility, which allows for easy testing and refactoring of code as a data application is being built. Dataiku DSS users who use Spark now have the ability to write transformations and interactive notebooks in Scala when developing data solutions.

To learn more about using Scala in Dataiku DSS visit:

Dataiku DSS 3.1 also introduces new visual machine learning engines that allow users to create incredibly powerful predictive applications within a code-free interface. Users of all skill levels can now leverage HPE Vertica machine learning, H2O Sparkling Water, MLlib, Scikit-Learn, and XGBoost directly from within the visual analysis section of Dataiku DSS 3.1 to apply powerful machine learning algorithms to their data science projects without having to write a single line of code.

“With Dataiku DSS 3.1, we continue to bridge the gap between day to day analytic needs and the latest cutting edge data science technologies,” said Florian Douetteau, CEO and co-founder of Dataiku. “By adding additional machine learning engines and enabling development in Scala, we are bringing even more tools to the table. This allows our users to build the best and most dynamic data science applications - quickly. All of the new features in this release add to our goal of being a complete, end-to-end platform for the creation, development, and deployment of predictive analytics solutions for any organization.”

Additional features of DSS 3.1 include:

  • New external databases – Integration with IBM Netezza, SAP HANA, and Google BigQuery.
  • New DSS project home page – Project dependencies are now visible in the user interface. Projects also now have a status (Active, Sandbox, Archive) and can be tagged and filtered in various ways.
  • Fluid Navigation – A new, fluid way to navigate between items.
  • Better Integration with Tableau – Users can extend Dataiku DSS compatibility by creating custom export formats for datasets, including Tableau .tde files. This allows for better integration with Tableau and other tools.

Dataiku DSS 3.1 enables data teams of all skill levels to develop powerful data analytics solutions using the latest data science techniques and machine learning technologies. It can be used to quickly build predictive services that transform raw data into business-impacting services, including:

  • Churn Analytics
  • Fraud Detection
  • Graph Analytics
  • Data Management
  • Demand Forecasting
  • Spatial Analytics
  • Lifetime Value Optimization
  • Predictive Maintenance
  • Analytical CRM, and more.

To learn more about all of the features in Dataiku DSS 3.1 visit:

About Dataiku
Dataiku develops Dataiku Data Science Studio, the unique advanced analytics software solution that enables companies to build and deliver their own data products more efficiently. Thanks to a collaborative and team-based user interface for data scientists and beginner analysts, to a unified framework for both development and deployment of data projects, and to immediate access to all the features and tools required to design data products from scratch, users can easily apply machine learning and data science techniques to all types, sizes, and formats of raw data to build and deploy predictive data flows.

More than 80 customers in industries ranging from ecommerce and industrial factories to finance, insurance, healthcare, and pharmaceuticals use Dataiku DSS on a daily basis to collaboratively build predictive dataflows to detect fraud, reduce churn, optimize internal logistics, predict future maintenance issues, and more. Dataiku has offices in Paris, New York, and the Bay Area.

Dataiku raised $3.7 million this year from two investors to grow its sales and tech team and international development initiatives.