We performed a comparison between Databricks and Spring Cloud Data Flow based on real PeerSpot user reviews.
Find out in this report how the two Streaming Analytics solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production."
"Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours."
"The processing capacity is tremendous in the database."
"In the manufacturing industry, Databricks can be beneficial to use because of machine learning. It is useful for tasks, such as product analysis or predictive maintenance."
"The initial setup is pretty easy."
"I like cloud scalability and data access for any type of user."
"The solution is an impressive tool for data migration and integration."
"The solution offers a free community version."
"There are a lot of options in Spring Cloud. It's flexible in terms of how we can use it. It's a full infrastructure."
"The most valuable features of Spring Cloud Data Flow are the simple programming model, integration, dependency Injection, and ability to do any injection. Additionally, auto-configuration is another important feature because we don't have to configure the database and or set up the boilerplate in the database in every project. The composability is good, we can create small workloads and compose them in any way we like."
"The most valuable feature is real-time streaming."
"The product is very user-friendly."
"The connectivity with various BI tools could be improved, specifically the performance and real time integration."
"It would be nice to have more guidance on integrations with ETLs and other data quality tools."
"I would like it if Databricks adopted an interface more like R Studio. When I create a data frame or a table, R Studio provides a preview of the data. In R Studio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data."
"The integration and query capabilities can be improved."
"The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks which primarily supports Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team."
"The solution has some scalability and integration limitations when consolidating legacy systems."
"The product could be improved by offering an expansion of their visualization capabilities, which currently assists in development in their notebook environment."
"There are no direct connectors — they are very limited."
"On the tool's online discussion forums, you may get stuck with an issue, making it an area where improvements are required."
"Spring Cloud Data Flow could improve the user interface. We can drag and drop in the application for the configuration and settings, and deploy it right from the UI, without having to run a CI/CD pipeline. However, that does not work with Kubernetes, it only works when we are working with jars as the Spring Cloud Data Flow applications."
"The configurations could be better. Some configurations are a little bit time-consuming in terms of trying to understand using the Spring Cloud documentation."
"Some of the features, like the monitoring tools, are not very mature and are still evolving."
Databricks is ranked 2nd in Streaming Analytics with 78 reviews while Spring Cloud Data Flow is ranked 9th in Streaming Analytics with 5 reviews. Databricks is rated 8.2, while Spring Cloud Data Flow is rated 8.0. The top reviewer of Databricks writes "A nice interface with good features for turning off clusters to save on computing". On the other hand, the top reviewer of Spring Cloud Data Flow writes "Provides ease of integration with other cloud platforms". Databricks is most compared with Amazon SageMaker, Informatica PowerCenter, Dataiku, Dremio and Microsoft Azure Machine Learning Studio, whereas Spring Cloud Data Flow is most compared with Apache Flink, Google Cloud Dataflow, Apache Spark Streaming, TIBCO BusinessWorks and Confluent.
We monitor all Streaming Analytics reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.