Dataiku Data Science Studio Review

Good data preparation tools and good integration with BigQuery


What is our primary use case?

Our primary uses for this solution are data preparation and data modeling.

We have a testing environment, a production environment, and two API nodes.

How has it helped my organization?

Using Dataiku has meant that we spend less time preparing and cleaning data and less time blending models together. Ultimately, it means that we can spend more time modeling.

What is most valuable?

The most valuable feature is the set of visual data preparation tools. 

The solution supports code in different languages, including Python and R. Whichever language you want to use, it works well.

This solution allows us to store data in BigQuery and retrieve data from it directly.

The documentation and tutorials are quite good.

What needs improvement?

From an administrative point of view, I would like to be able to communicate with the users who are logged into the system. For example, I would like to be able to send a broadcast message that says "I am shutting down the system."

I would like to see more organization and better cohesion within the tool.

In the next release of this solution, I would like to see deep learning better integrated into the tool, rather than offered simply as an extension or plugin.

I would like to have a better way to manage images and sound.

The error messages are not self-explanatory and can sometimes be difficult to understand.

For how long have I used the solution?

I have been using this solution for one year.

What do I think about the stability of the solution?

The stability is quite good.

What do I think about the scalability of the solution?

This is a scalable solution, and we integrate it with BigQuery for storing and retrieving our data. There are only two of us in the company who use this solution, although we would like to increase our usage.

How are customer service and technical support?

The technical support is quite good. We have had to open a few tickets and they replied in just a few minutes. They are quick and supportive.

Which solution did I use previously and why did I switch?

I still use several other solutions for data science including RapidMiner 9 and Weka. I have also been using Octave and MATLAB for modeling.

I use the Community Editions of these other products, so I have restrictions when it comes to things like the size of the dataset. When I need to be free of those restrictions, I use Dataiku Data Science Studio.

How was the initial setup?

The initial setup is very, very simple.

Deploying the entire platform takes one or two days.

What about the implementation team?

We handled the deployment ourselves. We can work totally independently.

What's my experience with pricing, setup cost, and licensing?

The annual licensing fees are approximately €20 ($22 USD) per key for the basic version and €40 ($44 USD) per key for the version with everything.

What other advice do I have?

At the moment, we haven't had any need to use containers or Spark because everything is included in BigQuery.

My advice for anybody who is implementing this solution is to start by finding somebody who can mentor you. Whether or not you have a mentor, the tutorials and documentation are quite good, so I would suggest going through all of the tutorial and academy material.

This solution does have a learning curve, although it is not steep.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner.