We performed a comparison between Apache Spark and AWS Lambda based on real PeerSpot user reviews.
Find out in this report how the two Compute Service solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"With Spark, we parallelize our operations, efficiently accessing both historical and real-time data."
"It is highly scalable, allowing you to efficiently work with extensive datasets that might be problematic to handle using traditional tools that are memory-constrained."
"Spark can handle small to huge data and is suitable for any size of company."
"Apache Spark provides a very high-quality implementation of distributed data processing."
"The product’s most valuable feature is the SQL tool. It enables us to create a database and publish it."
"We use it for ETL purposes as well as for implementing the full transformation pipelines."
"The solution is scalable."
"One of Apache Spark's most valuable features is that it supports in-memory processing, so jobs execute very fast compared to traditional tools."
"The automation feature is valuable."
"Lambda has improved our organization by making it possible to transform data."
"Amazon takes care of the scalability. That's the right way. It's automatic and it's fully managed. That's one benefit of Lambda."
"AWS Lambda is interlinked with CloudWatch. When we have any errors we can directly go there and check the CloudWatch logs. Additionally, we can run it very fast and we can increase the RAM size and other components."
"The basic feature that I like is that there is no server installation. It also has good support for various languages, such as Java, .NET, C#, and Python."
"It is my preferred product, as it provides me with source code within the solution."
"The most valuable feature of this solution is the API Gateway."
"The programming language and the integration with other AWS services are the most valuable features."
"We use big data manager but we cannot use it as conditional data so whenever we're trying to fetch the data, it takes a bit of time."
"If you have a Spark session in the background, sometimes it's very hard to kill these sessions because of D allocation."
"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
"The setup I worked on was really complex."
"It requires overcoming a significant learning curve due to its robust and feature-rich nature."
"The product could improve the user interface and make it easier for new users."
"Technical expertise from an engineer is required to deploy and run high-tech tools, like Informatica, on Apache Spark, making it an area where improvements are required to make the process easier for users."
"Stability in terms of API (things were difficult when transitioning from RDD to DataFrames, then to DataSet)."
"The overall performance of this solution could be improved. We would also like to have better integration with other AWS features."
"There's room for improvement in the solution's warm start, which refers to the minimum time it takes to start up a Lambda function if you haven't been running it."
"AWS Lambda could improve by having no-code or low-code options because currently, you need to be able to write code well to use it."
"I have seen some drawbacks with certain integrations."
"We need to better understand Lambda for different scenarios. We need some joint effort between Amazon and the users to have the users identify how they can really leverage Lambda. It's not about Lambda itself; it's about the practice, the guidance. There needs to be very good documentation. From the user perspective, what exists now is not always enough."
"The support team does not know how to implement and build the solution."
"The metrics and reporting for this solution could be improved."
"The feature to attach external storage, such as an S3 or elastic storage, must be added to AWS Lambda. This is its area for improvement."
Apache Spark is ranked 5th in Compute Service with 60 reviews, while AWS Lambda is ranked 1st in Compute Service with 70 reviews. Apache Spark is rated 8.4, while AWS Lambda is rated 8.6. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of AWS Lambda writes "An easily scalable solution with a variety of use cases and valuable event-based triggers". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Azure Stream Analytics, whereas AWS Lambda is most compared with AWS Batch, Amazon EC2 Auto Scaling, Apache NiFi, AWS Fargate and Google Cloud Dataflow.
We monitor all Compute Service reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.