Compare AWS Batch vs. Apache Spark

Find out what your peers are saying about Apache, Amazon, StackStorm and others in Compute Service. Updated: November 2020.
448,542 professionals have used our research since 2012.
Questions from the Community
Top Answer: SQreamDB is a GPU database and is not suitable for real-time OLTP. Cassandra is best suited to OLTP use cases where you need a scalable database (instead of SQL Server or Postgres)…
Top Answer: I love every core functionality of Apache Spark. Initially it provided only the basic RDD interface for processing data across a distributed cluster. It then evolved to the DataFrame and Dataset interfaces…
Top Answer: Apache Spark is available in cloud services such as AWS and Azure. We have to use the specific service for our use case. For example, we can use AWS Glue, which runs Spark for ETL processes, AWS EMR…

Ranking

Apache Spark
Ranking: 1st out of 13 in Compute Service
Views: 11,251
Comparisons: 9,221
Reviews: 12
Average Words per Review: 388
Avg. Rating: 8.3

AWS Batch
Ranking: 6th out of 13 in Compute Service
Views: 2,010
Comparisons: 1,950
Reviews: 0
Average Words per Review: 0
Avg. Rating: N/A
Also Known As
AWS Batch: Amazon Batch
Overview

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store the reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
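The linear dataflow described above (read from disk, map, reduce, store on disk) can be illustrated with a small single-machine sketch. This is plain Python rather than Spark or Hadoop code; the function names and the word-count task are hypothetical and chosen only to show the shape of one pass.

```python
import json
import tempfile
from collections import Counter
from functools import reduce
from pathlib import Path


def map_phase(line):
    """Map step: turn one input line into a partial word count."""
    return Counter(line.lower().split())


def reduce_phase(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def run_mapreduce_job(input_path, output_path):
    """One linear pass: read from disk -> map -> reduce -> write to disk."""
    lines = Path(input_path).read_text().splitlines()
    partials = map(map_phase, lines)
    totals = reduce(reduce_phase, partials, Counter())
    Path(output_path).write_text(json.dumps(totals))
    return totals


def demo():
    """Run the job on a tiny temporary file and read the stored result back."""
    with tempfile.TemporaryDirectory() as workdir:
        src = Path(workdir) / "input.txt"
        dst = Path(workdir) / "counts.json"
        src.write_text("spark batch spark\nbatch jobs\n")
        run_mapreduce_job(src, dst)
        return json.loads(dst.read_text())
```

In this sketch, any second pass over the data would have to start again from the file on disk. That round trip is the limitation the paragraph attributes to MapReduce, and it is what an in-memory working set such as an RDD is meant to avoid.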

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. With AWS Batch, there is no need to install and manage batch computing software or server clusters that you use to run your jobs, allowing you to focus on analyzing results and solving problems. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.
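As a rough illustration of how a submitted job can state the resource requirements mentioned above, the sketch below builds the keyword arguments for the `submit_job` call of the boto3 Batch client. It only constructs a dictionary and does not contact AWS. The helper name and the queue, job definition, and command values are hypothetical placeholders, not values from this page.

```python
def build_submit_job_request(job_name, job_queue, job_definition,
                             vcpus, memory_mib, command):
    """Build keyword arguments for the boto3 Batch client's submit_job call.

    The queue and job definition must already exist in the AWS account;
    the names passed in here are assumptions made for illustration.
    """
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            "command": list(command),
            # Resource values are passed as strings; memory is in MiB.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }


# Hypothetical example values, for illustration only.
request = build_submit_job_request(
    job_name="nightly-report",
    job_queue="example-queue",
    job_definition="example-job-definition",
    vcpus=2,
    memory_mib=4096,
    command=["python", "report.py"],
)
```

With credentials configured and the queue and job definition in place, the request could then be sent with `boto3.client("batch").submit_job(**request)`.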

Sample Customers
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Batch: Hess, Expedia, Kelloggs, Philips, HyperTrack
Top Industries

Apache Spark, reviewers:
Financial Services Firm: 44%
Computer Software Company: 22%
Marketing Services Firm: 11%
Non Profit: 11%

Apache Spark, visitors reading reviews:
Computer Software Company: 27%
Comms Service Provider: 18%
Media Company: 12%
Financial Services Firm: 8%

AWS Batch, visitors reading reviews:
Media Company: 39%
Computer Software Company: 19%
Comms Service Provider: 12%
Financial Services Firm: 7%
Company Size

Apache Spark, reviewers:
Small Business: 39%
Midsize Enterprise: 19%
Large Enterprise: 42%

AWS Batch: No Data Available

Apache Spark is ranked 1st in Compute Service with 12 reviews, while AWS Batch is ranked 6th in Compute Service with no reviews. Apache Spark is rated 8.2; AWS Batch has no rating yet. The top reviewer of Apache Spark writes "Good Streaming features enable to enter data and analysis within Spark Stream". Apache Spark is most compared with Spring Boot, Azure Stream Analytics, SAP HANA, AWS Lambda and Apache NiFi, whereas AWS Batch is most compared with AWS Lambda, AWS Fargate, Apache NiFi and Amazon Elastic Inference.


We monitor all Compute Service reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.