We performed a comparison between Apache Spark and IBM InfoSphere BigInsights [EOL] based on real PeerSpot user reviews.
"It is useful for handling large amounts of data. It is very useful for scientific purposes."
"We use Spark to process data from different data sources."
"The most valuable feature is the Fault Tolerance and easy binding with other processes like Machine Learning, graph analytics."
"The most valuable feature of this solution is its capacity for processing large amounts of data."
"Now, when we're tackling sentiment analysis using NLP technologies, we deal with unstructured data—customer chats, feedback on promotions or demos, and even media like images, audio, and video files. For processing such data, we rely on PySpark. Beneath the surface, Spark functions as a compute engine with in-memory processing capabilities, enhancing performance through features like broadcasting and caching. It's become a crucial tool, widely adopted by 90% of companies for a decade or more."
"This solution provides a clear and convenient syntax for our analytical tasks."
"AI libraries are the most valuable. They provide extensibility and usability. Spark has a lot of connectors, which is a very important and useful feature for AI. You need to connect a lot of points for AI, and you have to get data from those systems. Connectors are very wide in Spark. With a Spark cluster, you can get fast results, especially for AI."
"Spark helps us reduce startup time for our customers and gives a very high ROI in the medium term."
"InfoSphere Streams was the one core product from the platform in which we were using. We were building a real-time response system and we built it on InfoSphere Streams."
"It requires overcoming a significant learning curve due to its robust and feature-rich nature."
"The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive."
"Its UI can be better. Maintaining the history server is a little cumbersome, and it should be improved. I had issues while looking at the historical tags, which sometimes created problems. You have to separately create a history server and run it. Such things can be made easier. Instead of separately installing the history server, it can be made a part of the whole setup so that whenever you set it up, it becomes available."
"Technical expertise from an engineer is required to deploy and run high-tech tools, like Informatica, on Apache Spark, making it an area where improvements are required to make the process easier for users."
"The graphical user interface (UI) could be a bit more clear. It's very hard to figure out the execution logs and understand how long it takes to send everything. If an execution is lost, it's not so easy to understand why or where it went. I have to manually drill down on the data processes which takes a lot of time. Maybe there could be like a metrics monitor, or maybe the whole log analysis could be improved to make it easier to understand and navigate."
"The setup I worked on was really complex."
"Apache Spark is very difficult to use. It would require a data engineer. It is not available for every engineer today because they need to understand the different concepts of Spark, which is very, very difficult and it is not easy to learn."
"At times during the deployment process, the tool goes down, making it look less robust. To take care of the issues in the deployment process, users need to do manual interventions occasionally."
"The UI was not interactive: Responses used to be very slow and hang up at times."
Apache Spark is ranked 1st in Hadoop with 60 reviews while IBM InfoSphere BigInsights [EOL] doesn't meet the minimum requirements to be ranked in Hadoop. Apache Spark is rated 8.4, while IBM InfoSphere BigInsights [EOL] is rated 7.6. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of IBM InfoSphere BigInsights [EOL] writes "The BIQSQL implementation is fully SQL ANSI compliant, but I have found a lot of issues in Fluid Query". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Cloudera Distribution for Hadoop.
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.