We performed a comparison between Databricks and H2O.ai based on real PeerSpot user reviews.
"I like cloud scalability and data access for any type of user."
"Databricks is a scalable solution. It is the largest advantage of the solution."
"I haven't heard about any major stability issues. At this time I feel like it's stable."
"The initial setup is pretty easy."
"It is fast, it's scalable, and it does the job it needs to do."
"The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production."
"It is a cost-effective solution."
"The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type and it was implemented and invented by Databricks."
"The ease of use in connecting to our cluster machines."
"AutoML helps in hands-free initial evaluations of efficiency/accuracy of ML algorithms."
"Fast training, memory-efficient DataFrame manipulation, well-documented, easy-to-use algorithms, ability to integrate with enterprise Java apps (through POJO/MOJO) are the main reasons why we switched from Spark to H2O."
"It is helpful, intuitive, and easy to use. The learning curve is not too steep."
"The most valuable features are the machine learning tools, the support for Jupyter Notebooks, and the collaboration that allows you to share it across people."
"One of the most interesting features of the product is their driverless component. The driverless component allows you to test several different algorithms along with navigating you through choosing the best algorithm."
"CI/CD needs additional leverage and support."
"Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity."
"There could be more support for automated machine learning in the database. I would like to see more ways to do analysis so that the reporting is more understandable."
"When I used the support, I had communication problems because of the language barrier with the agent. The accent was difficult to understand."
"There should be better integration with other platforms."
"Databricks is an analytics platform. It should offer more data science. It should have more features for data scientists to work with."
"The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks which primarily supports Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team."
"Anyone who doesn't know SQL may find the product difficult to work with."
"The model management features could be improved."
"On the topic of model training and model governance, this solution cannot handle ten or twelve models running at the same time."
"The interpretability module has room for improvement. Also, it needs to improve its ability to integrate with other systems, like SageMaker, and the overall integration capability."
"I would like to see more features related to deployment."
"Referring to bullet-3 as well, H2O DataFrame manipulation capabilities are too primitive."
"It needs a drag and drop GUI like KNIME, for easy access to and visibility of workflows."
"It lacks the data manipulation capabilities of R and Pandas DataFrames. We would kill for dplyr offloading H2O."
Databricks is ranked 1st in Data Science Platforms with 78 reviews while H2O.ai is ranked 19th in Data Science Platforms. Databricks is rated 8.2, while H2O.ai is rated 7.6. The top reviewer of Databricks writes "A nice interface with good features for turning off clusters to save on computing". On the other hand, the top reviewer of H2O.ai writes "It is helpful, intuitive, and easy to use. The learning curve is not too steep". Databricks is most compared with Amazon SageMaker, Informatica PowerCenter, Dataiku Data Science Studio, Microsoft Azure Machine Learning Studio and Dremio, whereas H2O.ai is most compared with Amazon SageMaker, Dataiku Data Science Studio, Microsoft Azure Machine Learning Studio, KNIME and IBM Watson Studio.
We monitor all Data Science Platforms reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.