We performed a comparison between Apache Spark and QueryIO based on real PeerSpot user reviews.
"There's a lot of functionality."
"I found the solution stable. We haven't had any problems with it."
"The fault tolerant feature is provided."
"The processing time is very much improved over the data warehouse solution that we were using."
"One of Apache Spark's most valuable features is that it supports in-memory processing; jobs execute very fast compared to traditional tools."
"The tool's most valuable feature is its speed and efficiency. It's much faster than other tools and excels in parallel data processing. Unlike tools like Python or JavaScript, which may struggle with parallel processing, it allows us to handle large volumes of data with more power easily."
"ETL and streaming capabilities."
"The good performance. The nice graphical management console. The long list of ML algorithms."
"Anyone who has even a little bit of knowledge of the solution can begin to create things. You don't have to be technical to use the solution."
"Its UI can be better. Maintaining the history server is a little cumbersome, and it should be improved. I had issues while looking at the historical tags, which sometimes created problems. You have to separately create a history server and run it. Such things can be made easier. Instead of separately installing the history server, it can be made a part of the whole setup so that whenever you set it up, it becomes available."
"The solution must improve its performance."
"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
"The migration of data between different versions could be improved."
"When using Spark, users may need to write their own parallelization logic, which requires additional effort and expertise."
"Apart from the restrictions that come with its in-memory implementation, it has been improved significantly up to version 3.0, which is currently in use."
"Dynamic DataFrame options are not yet available."
"The solution needs to optimize shuffling between workers."
"There needs to be some simplification of the user interface."
Apache Spark is ranked 1st in Hadoop with 60 reviews while QueryIO is ranked 16th in Hadoop. Apache Spark is rated 8.4, while QueryIO is rated 8.0. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of QueryIO writes "Stable with good connectivity and good integration capabilities". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Cloudera Distribution for Hadoop, whereas QueryIO is most compared with Splice Machine.
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.