We performed a comparison between Apache Spark Streaming and Databricks based on real PeerSpot user reviews.
Find out in this report how the two Streaming Analytics solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"It's the fastest solution on the market with low latency data on data transformations."
"The solution is very stable and reliable."
"As an open-source solution, using it is basically free."
"Apache Spark Streaming is versatile. You can use it for competitive intelligence, gathering data from competitors, or for internal tasks like monitoring workflows."
"Apache Spark Streaming has features like checkpointing and Streaming API that are useful."
"Apache Spark Streaming was straightforward in terms of maintenance. It was actively developed, and migrating from an older to a newer version was quite simple."
"The solution is better than average and some of the valuable features include efficiency and stability."
"Apache Spark Streaming's most valuable feature is near real-time analytics. The developers can build APIs easily for a code-streaming pipeline. The solutions have an ecosystem of integration with other stock services."
"The capacity of use of the different types of coding is valuable. Databricks also has good performance because it is running in spark extra storage, meaning the performance and the capacity use different kinds of codes."
"The fast data loading process and data storage capabilities are great."
"I haven't heard about any major stability issues. At this time I feel like it's stable."
"Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy."
"The simplicity of development is the most valuable feature."
"There are good features for turning off clusters."
"Databricks' Lakehouse architecture has been most useful for us. The data governance has been absolutely efficient in between other kinds of solutions."
"Easy to use and requires minimal coding and customizations."
"The solution itself could be easier to use."
"The initial setup is quite complex."
"We would like to have the ability to do arbitrary stateful functions in Python."
"It was resource-intensive, even for small-scale applications."
"The service structure of Apache Spark Streaming can improve. There are a lot of issues with memory management and latency. There is no real-time analytics. We recommend it for the use cases where there is a five-second latency, but not for a millisecond, an IOT-based, or the detection anomaly-based. Flink as a service is much better."
"The cost and load-related optimizations are areas where the tool lacks and needs improvement."
"There could be an improvement in the area of the user configuration section, it should be less developer-focused and more business user-focused."
"In terms of improvement, the UI could be better."
"We'd like a more visual dashboard for analysis. It needs better UI."
"Databricks would have more collaborative features than it has. It should have some more customization for the jobs."
"There is room for improvement in the documentation of processes and how it works."
"In the future, I would like to see Data Lake support. That is something that I'm looking forward to."
"Databricks doesn't offer the use of Python scripts by itself and is not connected to GitHub repositories or anything similar. This is something that is missing. If they could integrate with Git tools it would be an advantage."
"The product should provide more advanced features in future releases."
"The integration features could be more interesting, more involved."
"The solution could improve by providing better automation capabilities. For example, working together with more of a DevOps approach, such as continuous integration."
Apache Spark Streaming is ranked 8th in Streaming Analytics with 8 reviews, while Databricks is ranked 2nd in Streaming Analytics with 78 reviews. Apache Spark Streaming is rated 8.0, while Databricks is rated 8.2. The top reviewer of Apache Spark Streaming writes "Easy integration, beneficial auto-scaling, and good open-sourced support community". On the other hand, the top reviewer of Databricks writes "A nice interface with good features for turning off clusters to save on computing". Apache Spark Streaming is most compared with Amazon Kinesis, Spring Cloud Data Flow, Azure Stream Analytics, Apache Pulsar, and Amazon MSK, whereas Databricks is most compared with Amazon SageMaker, Informatica PowerCenter, Dataiku, Microsoft Azure Machine Learning Studio, and Dremio. See our Apache Spark Streaming vs. Databricks report.
We monitor all Streaming Analytics reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.