We performed a comparison between Databricks and Pentaho Business Analytics based on real PeerSpot user reviews.
"Databricks covers end-to-end data analytics workflow in one platform, this is the best feature of the solution."
"The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production."
"It is fast, it's scalable, and it does the job it needs to do."
"The main feature of the solution is efficiency."
"The solution is built from Spark and has integration with MLflow, which is important for our use case."
"Databricks integrates well with other solutions."
"Automation with Databricks is very easy when using the API."
"The fast data loading process and data storage capabilities are great."
"The initial setup is pretty straightforward."
"Pentaho is an analytics platform that can be used when an organization has a lot of big data storage systems already installed and needs to manage and analyze that data. It has a specific use case for unstructured data, such as documents, and needs to be able to search and analyze it."
"Easy to use components to create the job."
"We were able to install it without any assistance from tech support."
"Pentaho Business Analytics' best features include the ease of developing data flows and the wide range of options to connect to databases, including those on the cloud."
"The most valuable feature of Pentaho is the Tableau report."
"I use the BI Server, CDE Dashboards, Saiku, and Kettle, because these tools are very good and highly experienced."
"The solution could improve by providing better automation capabilities. For example, working together with more of a DevOps approach, such as continuous integration."
"There is room for improvement in visualization."
"It should have more compatible and more advanced visualization and machine learning libraries."
"The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks which primarily supports Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team."
"The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps."
"The clusters or instances of Databricks could be much more stable. We've had issues with crashes."
"There are no direct connectors — they are very limited."
"Generative AI is catching up in areas like data governance and enterprise flavor. Hence, these are places where Databricks has to be faster."
"Another concern is that Pentaho is not customizable or interactive."
"We did not achieve the ROI. The work delivered to users had lesser value than the subscription cost."
"Deployment is not simple. It is not simple because we are dealing with a lot of data; we are dealing with a lot of storage. So, it's not a simple process."
"The repository should be improved."
"Version control would be a good addition."
"Logging capability is needed."
"Pentaho Business Analytics' user interface is outdated."
"Pentaho, at the general level, should greatly improve the easy construction of its dashboards and easy integration of information from different sources without technical user intervention."
Databricks is ranked 1st in Data Science Platforms with 78 reviews while Pentaho Business Analytics is ranked 21st in BI (Business Intelligence) Tools with 42 reviews. Databricks is rated 8.2, while Pentaho Business Analytics is rated 8.0. The top reviewer of Databricks writes "A nice interface with good features for turning off clusters to save on computing". On the other hand, the top reviewer of Pentaho Business Analytics writes "Flexible, easy to understand, and simple to set up". Databricks is most compared with Amazon SageMaker, Informatica PowerCenter, Dataiku Data Science Studio, Microsoft Azure Machine Learning Studio and Dremio, whereas Pentaho Business Analytics is most compared with Microsoft Power BI, Microsoft SQL Server Reporting Services, SAP Crystal Reports, Tableau and KNIME.
We monitor all Data Science Platforms reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.