Apache Spark vs SAP HANA Comparison 2024

Apache Spark

Read 60 Apache Spark reviews

2,430 views|1,869 comparisons

SAP HANA

Read 81 SAP HANA reviews

714 views|458 comparisons

Comparison Buyer's Guide

Buyer's Guide

Apache Spark vs. SAP HANA

May 2024

Executive Summary

We performed a comparison between Apache Spark and SAP HANA based on real PeerSpot user reviews.

Find out in this report how the two Hadoop solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.

To learn more, read our detailed Apache Spark vs. SAP HANA Report (Updated: May 2024).

772,649 professionals have used our research since 2012.

Featured Review

Vineeth Marar

Cloud solution architect at 0

Offers seamless integration with Azure services and on-premises servers

We've set up a Spark cluster running in Azure to process real-time data. This setup involves connecting Azure applications to the Spark cluster via... Read more →

Khalil AbdulrahmanAlasbahi

Commercial Manager at Natco Information technology

Excellent compatibility between modules and the control

The solution saves us in operational costs. The controls are very professional for managing orders with suppliers and partners.

Quotes From Members

We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:

Pros

"Its scalability and speed are very valuable. You can scale it a lot. It is a great technology for big data. It is definitely better than a lot of earlier warehouse or pipeline solutions, such as Informatica. Spark SQL is very compliant with normal SQL that we have been using over the years. This makes it easy to code in Spark. It is just like using normal SQL. You can use the APIs of Spark or you can directly write SQL code and run it. This is something that I feel is useful in Spark."

"It's easy to prepare parallelism in Spark, run the solution with specific parameters, and get good performance."

"The features we find most valuable are the machine learning, data learning, and Spark Analytics."

"DataFrame: Spark SQL gives the leverage to create applications more easily and with less coding effort."

"Spark can handle small to huge data and is suitable for any size of company."

"The most valuable feature of Apache Spark is its ease of use."

"I appreciate everything about the solution, not just one or two specific features. The solution is highly stable. I rate it a perfect ten. The solution is highly scalable. I rate it a perfect ten. The initial setup was straightforward. I recommend using the solution. Overall, I rate the solution a perfect ten."

"We use it for ETL purposes as well as for implementing the full transformation pipelines."

More Apache Spark Pros →

"The solution operates well."

"The main feature is that the processes are very flexible, they are able to be adapted to the business and their departments."

"It is very flexible to integrate with SaaS components."

"It has a very huge bandwidth and data transfer."

"The most valuable features I have found are speed, dashboard, and reporting."

"The in-memory computing and the efficient response time are very good features."

"In comparison with other DMS solutions, it offers good performance."

"It's sufficed all of our requirements. We primarily needed it to run SAP applications, like NetWeaver or S/4HANA, and it has been really good at that."

More SAP HANA Pros →

Cons

"I would like to see integration with data science platforms to optimize the processing capability for these tasks."

"Needs to provide an internal schedule to schedule spark jobs with monitoring capability."

"The logging for the observability platform could be better."

"Technical expertise from an engineer is required to deploy and run high-tech tools, like Informatica, on Apache Spark, making it an area where improvements are required to make the process easier for users."

"Apache Spark can improve the use case scenarios from the website. There is not any information on how you can use the solution across the relational databases toward multiple databases."

"We are building our own queries on Spark, and it can be improved in terms of query handling."

"Apache Spark is very difficult to use. It would require a data engineer. It is not available for every engineer today because they need to understand the different concepts of Spark, which is very, very difficult and it is not easy to learn."

"Apache Spark should add some resource management improvements to the algorithms."

More Apache Spark Cons →

"I would like to see improvement on the feedback from the road-map; it is currently extremely hard to get insight in this area."

"The inclusion of a well-performing Time Machine is vital."

"The surface side or Attack dashboard needs improvement because there are some gaps after sales services."

"In my limited experience using SAP, the process of granting access to different modules is difficult. Specifically, the requirement to assign roles and key codes to users rather than being able to assign them individually made the process more complex. It would be beneficial if there was a way to assign key codes separately, rather than having to create multiple roles. This would make managing access easier."

"It could be a bit more scalable."

"I would like to see improvements in the connectivity of the solution with other BI software. Not every software can connect to it natively."

"I give the scalability of SAP HANA a six out of ten."

"In terms of improvement, the speed is not as good as we thought it would be. That is why we are trying different solutions that will be built with different technologies."

More SAP HANA Cons →

Pricing and Cost Advice

"Since we are using the Apache Spark version, not the data bricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."

"Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."

"We are using the free version of the solution."

"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."

"Apache Spark is an expensive solution."

"Spark is an open-source solution, so there are no licensing costs."

"On the cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."

"It is an open-source solution, it is free of charge."

More Apache Spark Pricing and Cost Advice →

"Set up a consortium of consulting partners and hardware vendors to define your tech. Landscape TCO (total cost of ownership) and then approach the OEM for pricing (on-premise or on cloud or a hybrid model). Check if you can bring your own licenses for some of the existing application licenses on the new platform, to reduce TCO."

"People who are technical will accept the cost, but financially they will assess whether this solution will bring them revenue or not. People often ask, how will I profit when the cost is so high?"

"It is expensive, which isn't a problem for us because SAP HANA is processing the data so fast."

"SAP HANA is an expensive product."

"It is expensive."

"Setup and licensing require planning and proper budgeting, as it is not cheap."

"The price of the solution could be reduced, it is expensive."

"The price of this product is good."

More SAP HANA Pricing and Cost Advice →

Questions from the Community

What do you like most about Apache Spark?

Top Answer: We use Spark to process data from different data sources.

Read all 30 answers →

What is your experience regarding pricing and costs for Apache Spark?

Top Answer: The solution is moderately priced.

Read all 19 answers →

What needs improvement with Apache Spark?

Top Answer: In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, and do the transformation in a subsecond.

Read all 32 answers →

What are the biggest benefits of using SAP HANA?

Top Answer: Based on my work with SAP HANA, the biggest benefit that it can bring to your business is total data management. This product is by SAP - a company that serves almost all needs a client may have… more »

Read all 2 answers →

Is SAP HANA’s customer and technical support reliable?

Top Answer: We have been using SAP HANA for a fairly short period of time and have only taken advantage of their customer support. So far, we have not had issues that required specialized help from technical… more »

Read all 2 answers →

Is SAP HANA difficult to set up and start using?

Top Answer: SAP HANA is fairly easy to set up, however, I do not think a complete beginner can do it. You certainly need some preparation - either you need to have experience with similar solutions, or with other… more »

Read all 2 answers →

Ranking

Apache Spark

Ranking: 1st out of 22 in Hadoop
Views: 2,430
Comparisons: 1,869
Reviews: 60
Average Words per Review: 444
Rating: 8.7

SAP HANA

Ranking: 1st out of 14 in Embedded Database
Views: 714
Comparisons: 458
Reviews: 81
Average Words per Review: 411
Rating: 8.5

Comparisons

Spring Boot vs. Apache Spark

Compared 31% of the time.

AWS Batch vs. Apache Spark

Compared 10% of the time.

Spark SQL vs. Apache Spark

Compared 9% of the time.

Cloudera Distribution for Hadoop vs. Apache Spark

Compared 6% of the time.

AWS Lambda vs. Apache Spark

Compared 5% of the time.

More Apache Spark Competitors →

Oracle Database vs. SAP HANA

Compared 31% of the time.

SQL Server vs. SAP HANA

Compared 28% of the time.

MySQL vs. SAP HANA

Compared 8% of the time.

IBM Db2 Database vs. SAP HANA

Compared 7% of the time.

SAP Adaptive Server Enterprise vs. SAP HANA

Compared 3% of the time.

More SAP HANA Competitors →

Also Known As

SAP High-Performance Analytic Appliance, HANA

Overview

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
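The idea of a read-only, partitioned dataset that can be rebuilt after a failure can be sketched in a few lines of plain Python. This is only a toy illustration under loud assumptions: ToyRDD and its methods are hypothetical names invented here, everything runs in a single process, and none of it is Spark's actual API or implementation.

```python
from functools import reduce


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical; not Spark's API)."""

    def __init__(self, partitions=None, parent=None, fn=None):
        # A dataset either holds immutable source partitions, or only
        # remembers its lineage: the parent it derives from and the function.
        self._partitions = (
            None if partitions is None else tuple(tuple(p) for p in partitions)
        )
        self._parent = parent
        self._fn = fn

    def map(self, fn):
        # Transformations never mutate data; they return a new dataset
        # that records how to derive itself from this one.
        return ToyRDD(parent=self, fn=fn)

    def num_partitions(self):
        # Walk the lineage back to the source to find the partition count.
        source = self
        while source._parent is not None:
            source = source._parent
        return len(source._partitions)

    def compute_partition(self, index):
        # Any single partition can be rebuilt by replaying the lineage
        # from the source data, which is the basis of fault tolerance here.
        if self._parent is None:
            return self._partitions[index]
        return tuple(self._fn(x) for x in self._parent.compute_partition(index))

    def reduce(self, op):
        # Combine all items across partitions with a binary operator.
        items = [
            x
            for i in range(self.num_partitions())
            for x in self.compute_partition(i)
        ]
        return reduce(op, items)


numbers = ToyRDD(partitions=[[1, 2, 3], [4, 5, 6]])
squares = numbers.map(lambda x: x * x)       # new dataset; numbers is unchanged
total = squares.reduce(lambda a, b: a + b)   # 1 + 4 + 9 + 16 + 25 + 36 = 91
```

Because squares keeps a reference to numbers and the mapping function, a "lost" partition such as squares.compute_partition(1) can be recomputed on demand without re-reading or recomputing the other partitions.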

SAP HANA, also known as SAP High-Performance Analytic Appliance, is a multi-model database that stores data in its memory, allowing users to avoid disk storage. The product combines its robust database with services for creating applications. SAP HANA is faster than other database management systems (DBMS) because it stores data in column-based tables in main memory and brings online analytical processing (OLAP) and online transaction processing (OLTP) together.

The column-oriented in-memory database design allows users to run high-speed transactions alongside advanced analytics, all in a single system. This provides companies with the ability to process very large amounts of data with low latency and query data in an instant. By combining multiple data management capabilities, the solution simplifies IT, helps businesses with innovations, and facilitates digital transformation.
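The difference between a row layout and a column layout can be illustrated with a small in-memory sketch. The table contents and the helper names (total_amount_rows, total_amount_columns, fetch_order) are hypothetical and chosen only for illustration; the sketch does not reflect SAP HANA's internal storage format.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [record[name] for record in rows] for name in rows[0]}


def total_amount_rows(table):
    # Analytical query on rows: every record is visited, although only
    # one field per record is needed.
    return sum(record["amount"] for record in table)


def total_amount_columns(table):
    # Analytical query on columns: only the "amount" column is scanned.
    return sum(table["amount"])


def fetch_order(table, position):
    # Transactional-style lookup on columns: reassemble one record by
    # reading the same position from every column.
    return {name: values[position] for name, values in table.items()}


print(total_amount_rows(rows))     # 250.0
print(total_amount_columns(columns))  # 250.0
print(fetch_order(columns, 1))
```

Both layouts return the same answers. The column layout touches less data for aggregates, while single-record lookups need one read per column, which is the trade-off a combined OLAP and OLTP system has to manage.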

The solution is structured into five groups of capabilities, categorized as:

Database design
Database management
Application development
Advanced analytics
Data virtualization

Three more SAP products work alongside SAP HANA. SAP S/4HANA Cloud is a ready-to-run cloud enterprise resource planning (ERP) system. SAP BW/4HANA is a packaged data warehouse, based on SAP HANA, which allows users to consolidate data across the enterprise to get a consistent view of their data. Finally, SAP HANA Cloud is a single database as a service (DBaaS) foundation for modern applications and analytics across all enterprise data. All three products can be combined with SAP HANA to give users an optimized experience with their data.

SAP HANA Features

Each architectural group of capabilities of SAP HANA has various features that users can benefit from. These include:

Parallel processing database: SAP HANA utilizes a single platform to run transactional and analytical workloads.
ACID compliance: This feature ensures compliance with requirements for Atomicity, Consistency, Isolation, and Durability (ACID) standards.
Multi-tenancy: This feature allows multiple tenant databases to run in one system while sharing the same memory and processors.
Multi-tier storage and persistent memory support: SAP HANA's native storage extension is a built-in capability to manage data between memory and persistent storage, including SAP HANA Cloud Data Lake.
Scaling: The scaling feature supports terabytes of data in a single server and distributes large tables across multiple servers in a cluster to scale further.
Data modeling: This feature consists of graphical modeling tools that enable collaboration between stakeholders and the creation of models to execute complex business logic and data transformation in real time.
Stored procedures: The product has a native language to build stored procedures and uses advanced capabilities to create complex logic.
Administration: This feature consists of administration tools for various platform lifecycle, performance, and management operations and automations.
Security: SAP HANA provides its users with real-time data anonymization features to extract value from data while protecting privacy.
Availability and recovery: The tool supports high availability and disaster recovery through an array of techniques, including backup, storage mirroring, and synchronous, asynchronous, and multitarget system replication.
Extended application services: Through its built-in application server, users can develop services such as REST and OData, as well as web applications that can run in multiple locations.
Client access: The product offers clients the ability to access it via other application platforms and languages, including Java, JavaScript, R, and Go.
Application lifecycle management: This set of features facilitates the building and packaging of applications, transporting them from development to test to production, and then deploying them.
Application development: This feature consists of a set of tools that offer application development on premises and in the cloud. The programming language ABAP includes additional optimized features to build extensions to SAP applications.
Search: The search feature uses SQL to locate text promptly across multiple columns and textual content.
Spatial processing: This product feature provides native support for spatial data types and spatial functions.
Graph: Through this feature, users of the product can store and process highly connected data using a property graph.
Streaming analytics: This feature combines various data sources that users can utilize to discover trends over a set period.
Data integration and replication: The solution offers comprehensive features to handle all data integration scenarios.
Data federation: This feature allows users to perform queries on remote data sources in real time with data federation.
Caching: The capacity to cache data provides users with the ability to optimize federated queries against remote sources of data.
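As a rough illustration of the atomicity part of the ACID compliance item in the list above, the following sketch uses Python's built-in sqlite3 module as a stand-in database. It is an assumption-laden example: the accounts table, the transfer function, and the sample balances are hypothetical, and SQLite is used only because it ships with Python, not because it behaves like SAP HANA.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)]
)
conn.commit()


def transfer(connection, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with connection:
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so the credit
        # that was already applied to dst has been rolled back as well.
        return False


def balances(connection):
    query = "SELECT name, balance FROM accounts ORDER BY name"
    return dict(connection.execute(query))


ok = transfer(conn, "alice", "bob", 30)        # succeeds
failed = transfer(conn, "alice", "bob", 500)   # violates the CHECK constraint
print(ok, failed, balances(conn))
```

The second transfer credits bob before the debit on alice fails, yet the final balances show no trace of that credit, which is the all-or-nothing behavior atomicity refers to.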

SAP HANA Benefits

SAP HANA provides many benefits for its users. These include:

This solution offers a high level of data and application security, beginning from a secure setup and providing continuous support.
SAP HANA offers augmentation for applications and analytics with built-in machine learning (ML).
The solution works in a timely manner, as it provides a response to queries within seconds in large production applications.
SAP HANA simplifies work, as it provides a single gateway to all user data with advanced data virtualization.
The product is very flexible, as it allows users to deploy applications in a public or private cloud, in multiple clouds, on premises, or hybrid.
SAP HANA scales easily for data volume and concurrent users across a distributed environment.
This is a powerful solution in terms of querying large datasets with a massively parallel processing (MPP) database.
SAP HANA is a versatile product that supports hybrid transactional and analytical processing as well as many data types.
The product provides a smaller data footprint through advanced compression and no data duplication, and reduces data silos.

Reviews from Real Users

According to a database consultant at a pharma/biotech company, SAP HANA is a very robust solution with good data access.

Bruno V., owner at LAVORO AUTOM INF E COM LTDA, likes SAP HANA because the product offers advanced features, helps reduce hours, and makes it easy to find what you need.

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions

SAP HANA: Unilever, NHS 24, adidas Group, CHIO Aachen, Hamburg Port Authority (HPA), Bangkok Airways Public Company Limited

Top Industries

Apache Spark

REVIEWERS
Computer Software Company: 33%
Financial Services Firm: 12%
University: 9%
Marketing Services Firm: 6%

VISITORS READING REVIEWS
Financial Services Firm: 25%
Computer Software Company: 13%
Manufacturing Company: 7%
Comms Service Provider: 5%

SAP HANA

REVIEWERS
Manufacturing Company: 16%
Computer Software Company: 14%
Energy/Utilities Company: 10%
Retailer: 8%

VISITORS READING REVIEWS
Manufacturing Company: 14%
Computer Software Company: 13%
Financial Services Firm: 8%
Government: 6%

Company Size

Apache Spark

REVIEWERS
Small Business: 42%
Midsize Enterprise: 16%
Large Enterprise: 42%

VISITORS READING REVIEWS
Small Business: 17%
Midsize Enterprise: 12%
Large Enterprise: 71%

SAP HANA

REVIEWERS
Small Business: 25%
Midsize Enterprise: 15%
Large Enterprise: 60%

VISITORS READING REVIEWS
Small Business: 20%
Midsize Enterprise: 13%
Large Enterprise: 67%

Apache Spark is ranked 1st in Hadoop with 60 reviews while SAP HANA is ranked 1st in Embedded Database with 81 reviews. Apache Spark is rated 8.4, while SAP HANA is rated 8.4. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of SAP HANA writes "Excellent compatibility between modules and the control". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, Cloudera Distribution for Hadoop and AWS Lambda, whereas SAP HANA is most compared with Oracle Database, SQL Server, MySQL, IBM Db2 Database and SAP Adaptive Server Enterprise. See our Apache Spark vs. SAP HANA report.

See our list of best Hadoop vendors.

We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.
