Amazon EC2 Auto Scaling vs Apache Spark comparison

Amazon EC2 Auto Scaling: 3,148 views | 2,743 comparisons | 100% willing to recommend
Apache Spark: 3,093 views | 2,345 comparisons | 89% willing to recommend
Comparison Buyer's Guide
Executive Summary

We performed a comparison between Amazon EC2 Auto Scaling and Apache Spark based on real PeerSpot user reviews.

Find out in this report how the two Compute Service solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
To learn more, read our detailed Amazon EC2 Auto Scaling vs. Apache Spark Report (Updated: March 2024).
768,578 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros

Amazon EC2 Auto Scaling:
  • "It has the best auto-scaling features."
  • "We appreciate that this solution allows us to run all of our servers through it, meaning that our workloads are mainly on the EC2 instance only."
  • "We use the solution to increase CPU and memory size."
  • "What we have found most valuable are the purchasing of usage at the time and small storage."
  • "Auto-scaling is a good feature."
  • "Amazon EC2 Auto Scaling has good integration."
  • "The solution removes the need for hardware. We can easily create servers or machines. Just by clicking or specifying our requirements, like memory size or disk space, it's set up for us. The tool eliminates the need for hardware. We can choose what we need and pay as we use it. It is flexible and can integrate with any product."
  • "The initial setup is straightforward."

Apache Spark:
  • "The solution is scalable."
  • "I found the solution stable. We haven't had any problems with it."
  • "The processing time is very much improved over the data warehouse solution that we were using."
  • "The tool's most valuable feature is its speed and efficiency. It's much faster than other tools and excels in parallel data processing. Unlike tools like Python or JavaScript, which may struggle with parallel processing, it allows us to handle large volumes of data with more power easily."
  • "The product's deployment phase is easy."
  • "The main feature that we find valuable is that it is very fast."
  • "There's a lot of functionality."
  • "Spark can handle small to huge data and is suitable for any size of company."

Cons

Amazon EC2 Auto Scaling:
  • "The product's setup is complex for an intermediate user."
  • "There should be an AWS instance in South Africa, where the latency would be even lower. It might happen soon since AWS has recently opened more data centres in Nigeria. AWS may extend its reach to South Africa, and offer hosted CLI servers there. Most of the problems with AWS are not to do with the solution itself but with configuration. It is something on design, more or less."
  • "The licensing cost is expensive."
  • "We have found that the sizing in Amazon EC2 Auto Scaling is far off. For example, we will see some at one terabyte and the other one is two terabytes. There is nothing between one and two terabytes. Sometimes it's a struggle if I need one and a half, I still am supposed to pay for two. There are five terabytes, six terabytes, and 12 terabytes, and if I need something at eight or nine, I'm still paying 30 to 40 percent more by taking the one which is 12 terabytes. Microsoft Azure does similar sizes but the gap can be more, such as six terabytes, and the next one is 12 terabytes."
  • "The tool must provide proper guidelines to troubleshoot connectivity issues."
  • "The product's technical support needs to be better."
  • "The technical support needs to be improved."
  • "The documentation for this solution could be improved. For example, it is difficult to find documentation for integration with applications."

Apache Spark:
  • "It's not easy to install."
  • "I would like to see integration with data science platforms to optimize the processing capability for these tasks."
  • "When you are working with large, complex tasks, the garbage collection process is slow and affects performance."
  • "This solution currently cannot support or distribute neural network related models, or deep learning related algorithms. We would like this functionality to be developed."
  • "The setup I worked on was really complex."
  • "If you have a Spark session in the background, sometimes it's very hard to kill these sessions because of D allocation."
  • "Technical expertise from an engineer is required to deploy and run high-tech tools, like Informatica, on Apache Spark, making it an area where improvements are required to make the process easier for users."
  • "The solution needs to optimize shuffling between workers."

Pricing and Cost Advice

Amazon EC2 Auto Scaling:
  • "Pricing could be a little bit more competitive."
  • "The pricing is not fixed and it is based on usage."
  • "The price of this product could be a little bit lower."
  • "Licensing fees are paid on a yearly basis."
  • "I have not explored the price of the solution extensively, but from what I have seen the price is alright."
  • "When we want to use more services, we need to pay more. It's a monthly subscription, rather than license-based. Pricing or fees for Amazon EC2 Auto Scaling could be improved."
  • "The solution's pricing varies by service region and is mid-range."
  • "Amazon EC2 Auto Scaling uses a pay-as-you-go pricing model."

Apache Spark:
  • "Since we are using the Apache Spark version, not the Databricks version, it is an Apache license version, and the support and resolution of bugs are late or delayed. The Apache license is free."
  • "Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
  • "We are using the free version of the solution."
  • "Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
  • "Apache Spark is an expensive solution."
  • "Spark is an open-source solution, so there are no licensing costs."
  • "The cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
  • "It is an open-source solution and is free of charge."

    Questions from the Community
    Top Answer: The tool must provide proper guidelines to troubleshoot connectivity issues. It must also improve AMI creation.
    Top Answer: We use Spark to process data from different data sources.
    Top Answer: In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, and do the transformation in a subsecond.
    Ranking
                                  Amazon EC2 Auto Scaling    Apache Spark
    Rank in Compute Service       2nd out of 16              5th out of 16
    Views                         3,148                      3,093
    Comparisons                   2,743                      2,345
    Reviews                       31                         25
    Average Words per Review      324                        432
    Rating                        9.0                        8.7
    Also Known As
    AWS RAM
    Overview

    Amazon EC2 Auto Scaling helps you maintain application availability and allows you to automatically add or remove EC2 instances according to conditions you define. ... Dynamic scaling responds to changing demand and predictive scaling automatically schedules the right number of EC2 instances based on predicted demand.
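
As a rough illustration of the dynamic scaling idea described above, the following Python sketch estimates how many instances a group might want so that a metric returns to a target value. It is a simplified proportional rule written for this comparison, not the service's actual algorithm or API; the function name `desired_capacity` and all parameters are hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Proportional estimate of the instance count, clamped to group bounds.

    Hypothetical helper for illustration only; not an AWS API.
    """
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    # Scale the fleet in proportion to how far the metric is from the target.
    raw = math.ceil(current * metric / target)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, raw))


# Average CPU at 75% against a 50% target: grow the group.
scale_out = desired_capacity(current=4, metric=75.0, target=50.0, min_size=2, max_size=10)
# Average CPU at 20% against a 50% target: shrink, but respect the minimum.
scale_in = desired_capacity(current=4, metric=20.0, target=50.0, min_size=2, max_size=10)
print(scale_out, scale_in)
```

The clamp to `min_size` and `max_size` reflects the "conditions you define" part of the description: the rule reacts to demand only within the limits set for the group.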

    Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines, that is maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
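
To make the RDD description more concrete, here is a toy sketch in plain Python. It is not Spark's API: `ToyRDD` is a hypothetical class invented for this illustration, and it runs on one machine. It only mimics three ideas from the paragraph above: data held in read-only partitions, transformations that return a new dataset instead of mutating the old one, and an in-memory working set that several reductions can reuse.

```python
from functools import reduce
from typing import Callable, List, Tuple


class ToyRDD:
    """Hypothetical stand-in for an RDD: immutable data split into partitions."""

    def __init__(self, partitions: List[Tuple[int, ...]]):
        # Tuples keep each partition read-only, mirroring RDD immutability.
        self._partitions = tuple(tuple(p) for p in partitions)

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # A transformation never mutates; it returns a new dataset.
        return ToyRDD([tuple(fn(x) for x in p) for p in self._partitions])

    def reduce(self, fn: Callable[[int, int], int]) -> int:
        # Reduce each non-empty partition locally, then combine the partials.
        partials = [reduce(fn, p) for p in self._partitions if p]
        return reduce(fn, partials)


numbers = ToyRDD([(1, 2), (3, 4), (5,)])
squares = numbers.map(lambda x: x * x)        # working set kept in memory
total = squares.reduce(lambda a, b: a + b)    # first pass over the working set
largest = squares.reduce(max)                 # second pass reuses the same data
print(total, largest)
```

In a MapReduce-style pipeline, each of the two reductions would typically be a separate job that rereads its input from disk; the sketch reuses `squares` directly, which is the "working set" behavior the overview attributes to RDDs.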

    Sample Customers
    Amazon EC2 Auto Scaling: Expedia, Intuit, Royal Dutch Shell, Brooks Brothers
    Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
    Top Industries

    Amazon EC2 Auto Scaling, reviewers:
      • Computer Software Company: 44%
      • Financial Services Firm: 16%
      • Comms Service Provider: 8%
      • Media Company: 4%
    Amazon EC2 Auto Scaling, visitors reading reviews:
      • Financial Services Firm: 22%
      • Computer Software Company: 13%
      • University: 8%
      • Government: 7%

    Apache Spark, reviewers:
      • Computer Software Company: 30%
      • Financial Services Firm: 15%
      • University: 9%
      • Marketing Services Firm: 6%
    Apache Spark, visitors reading reviews:
      • Financial Services Firm: 24%
      • Computer Software Company: 13%
      • Manufacturing Company: 7%
      • Comms Service Provider: 6%
    Company Size

    Amazon EC2 Auto Scaling, reviewers:
      • Small Business: 33%
      • Midsize Enterprise: 15%
      • Large Enterprise: 53%
    Amazon EC2 Auto Scaling, visitors reading reviews:
      • Small Business: 25%
      • Midsize Enterprise: 10%
      • Large Enterprise: 65%

    Apache Spark, reviewers:
      • Small Business: 40%
      • Midsize Enterprise: 19%
      • Large Enterprise: 40%
    Apache Spark, visitors reading reviews:
      • Small Business: 17%
      • Midsize Enterprise: 12%
      • Large Enterprise: 71%

    Amazon EC2 Auto Scaling is ranked 2nd in Compute Service with 37 reviews, while Apache Spark is ranked 5th in Compute Service with 60 reviews. Amazon EC2 Auto Scaling is rated 8.8, while Apache Spark is rated 8.4. The top reviewer of Amazon EC2 Auto Scaling writes "Well-documented setup process and highly stable solution". On the other hand, the top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". Amazon EC2 Auto Scaling is most compared with AWS Fargate, AWS Lambda, AWS Batch, Amazon Elastic Inference and Oracle Compute Cloud Service, whereas Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Cloudera Distribution for Hadoop.


    We monitor all Compute Service reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.