Apache Spark Other Advice
I can recommend the product. It's a nice system for batch processing huge data.
I'd rate the solution eight out of ten.
View full review »
Overall, I would rate the solution a nine out of ten.
I would recommend this tool to someone considering it for scalable data processing.
Nowadays, Apache Spark is on the market, and most organizations are using it. There are people with more experience and knowledge than me, and they're confident about this tool.
That's why it's become a solution for organizations. It's not a one-man decision but rather a group or community effort.
View full review »
Sachin Shukre
Sr Manager at a transportation company with 10,001+ employees
If your use case involves real-time applications with frequently changing columns or data frames, then Spark is a fantastic option for you.
However, if you have a batch process and don't need structured data analysis, I would suggest avoiding it. The high cost of cloud infrastructure combined with Apache Spark can be a significant burden in such scenarios.
Overall, I would rate the solution a nine out of ten.
View full review »
Buyer's Guide
Apache Spark
April 2024
Learn what your peers think about Apache Spark. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
769,236 professionals have used our research since 2012.
Apache Spark is a good product for processing large volumes of data compared to other distributed systems. It provides efficient integration with Hadoop and other platforms.
I rate it a ten out of ten.
View full review »
Spark was written in Scala. Scala is a programming language that runs on the Java platform and is useful for data lakes.
We thought about using Flink instead, but it wasn't useful for us and wouldn't have added any value. Besides, Spark's community is much wider, so information is easier to find and of better quality than Flink's.
I rate Apache Spark an eight out of ten.
If you plan to implement Apache Spark on a large-scale system, you should learn to use parallelism, partitioning, and everything from the physical level to get the best performance from Spark. And it will be good to know Python, especially for data scientists using PySpark for analysis. Likewise, it's good to know Scala because you can be very efficient in preparing some datasets since it is Spark's native language.
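To make the point about partitioning and parallelism concrete, here is a minimal sketch in plain Python. It is not Spark code and does not use the Spark API. The function names (`partition_by_key`, `sum_partition`, `parallel_sum_by_key`) and the sample records are hypothetical, chosen only to illustrate the idea that records with the same key land in the same partition, so each partition can be processed independently and the partial results merged without conflicts.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_key(records, num_partitions):
    """Assign each (key, value) record to a partition by hashing its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        # crc32 is stable across runs, unlike Python's salted hash() for strings.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return partitions


def sum_partition(partition):
    """Aggregate one partition on its own, without looking at the others."""
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


def parallel_sum_by_key(records, num_partitions=4):
    """Partition the records, process the partitions concurrently, then merge."""
    partitions = partition_by_key(records, num_partitions)
    # Threads stand in for executors here; a real engine would spread
    # partitions across processes or machines.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_results = list(pool.map(sum_partition, partitions))
    merged = {}
    for partial in partial_results:
        # Safe to merge blindly: every key lives in exactly one partition.
        merged.update(partial)
    return merged


# Hypothetical sample data.
records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
result = parallel_sum_by_key(records)
print(result)
```

The practical takeaway is the same one the reviewer points at: how keys are spread over partitions decides how evenly the work is split, so a skewed key distribution leaves some workers idle while others are overloaded.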
View full review »
The tool is used for real-time data analytics as it is very powerful and reliable. The code that you write with Apache Spark provides stability. There are many bugs that can appear depending on the code that you write, which could be in Java or Scala. So this is amazing. Apache Spark is very reliable, powerful, and fast as an engine. Compared with a competitor like MapReduce, Apache Spark can perform up to 100 times better.
The monitoring part of the product is good.
The product offers clusters that are resilient and can run across multiple nodes.
The tool can run with multiple clusters.
The integration capabilities of the product with other platforms to improve our company's workflow are good.
In terms of the improvements in the product in the data analysis area, new libraries have been launched to support AI and machine learning.
My company is able to process huge datasets with Apache Spark. There is a huge value added to the organization because of the tool's ability to process huge datasets.
I rate the overall solution a nine out of ten.
Additional skill requirements are crucial to use the solution and its related features effectively. Training costs and efforts may be necessary to ensure individuals are proficient in using these technologies. Overall, I would rate it nine out of ten.
View full review »
Given our extensive experience with it and its ability to meet all our requirements over time, I highly recommend it. Overall, I would rate it nine out of ten.
View full review »
The tool offers functionality that helps my company deal with data processing in projects on a near real-time basis.
The impact of in-memory processing capabilities on the improvement of computational efficiency is one of the reasons why my company chose Apache Spark.
At the moment, my company plans to explore data analysis with Apache Spark. My company primarily used the product for data processing and not for data analysis.
If you buy the product with the capabilities of Azure DevOps and use the tool's dashboard, you find the solution to be good. The tool has an in-built UI and other good capabilities.
I feel that the product is fine and easy to use for those who plan to use it in the future. I recommended the tool to others based on the performance and scalability features it offers.
I managed data partitioning and distribution with Apache Spark once in my company.
The main benefit of the product is that it was easy to get data processing done as quickly as possible, thanks to its in-memory processing and performance.
I rate the solution an eight and a half to nine out of ten.
Vineeth Marar
Cloud solution architect at 0
My advice is to thoroughly understand your own needs and environment before making a decision. Recommendations should be based on product features, quality, accuracy, and stability.
Cost is also a factor, but it should not be the only consideration. Depending on whether the priority is performance and scalability or cost-effectiveness, I would suggest a solution that best meets those needs, whether it's a managed service or a more cost-conscious option.
I would rate Spark as ten out of ten. I haven't had any issues with Spark in my experience.
View full review »
I rate Apache Spark an eight out of ten.
View full review »
I would recommend Apache Spark to users doing analytics, data computation, or pipelines.
Overall, I rate Apache Spark ten out of ten.
If you're new to Apache Spark, the best way to learn is by using the Databricks Community Edition. It provides a cluster for Apache Spark where you can learn and test. I rate the product an eight out of ten.
View full review »
reviewer1759647
Information Technology Business Analyst at an aerospace/defense firm with 10,001+ employees
I would recommend the product. I think it's a good solution for analytics. Overall, I rate the product an eight out of ten.
View full review »
I have the solution installed on my computer and on our servers. You can use it on-premises or as a SaaS.
I'd rate the solution at a nine out of ten. I've been very pleased with its capabilities.
I would recommend the solution for the people who need to deploy projects with streaming. If you have many different sources or different types of data, and you need to put everything in the same place - like a data lake - Spark, at this moment, has the right tools. It's an important solution for data science, for data detectors. You can put all of the information in one place with Spark.
View full review »
This is a good solution for big data use cases and I rate it eight out of 10.
View full review »
Kürşat Kurt
Software Architect at Akbank
I would advise planning well before implementing this solution. In enterprise corporations like ours, there are a lot of policies. You should first find out your needs, and after that, you or your team should set it up based on your needs. If your needs change during development because of the business requirements, it will be very difficult.
If you are clear about your needs, it is easier to set it up. If you know how Spark is used in your project, you have to define firewall rules and cluster needs. When you set up Spark, it should be ready for people's usage, especially for remote job execution.
I would rate Apache Spark a nine out of ten.
View full review »
I rate the overall solution a ten out of ten.
View full review »
SlavenBatnozic
CTO at Hammerknife
I recommend Apache Spark for batch analytics features.
View full review »
Marco Amhof
PLC Programmer at Alzero
I advise others to analyze data and understand your business requirements before purchasing the product. I rate it an eight out of ten.
View full review »
I would recommend Apache Spark to other users.
Overall, I rate Apache Spark an eight out of ten.
Farzam Khodaei
Data Engineer at Berief Food GmbH
Overall, I rate the product more than eight out of ten.
View full review »
reviewer2208003
Quantitative Developer at a marketing services firm with 11-50 employees
I would recommend understanding the use case better. Only if it fits your use case, then go for it. But it is a great tool.
Overall, I would rate Apache Spark an eight out of ten.
View full review »
I would rate this solution a nine out of ten.
View full review »
We are well versed in Spark, the version, the internal structure of Spark, and we know what exactly Spark is doing.
The solution cannot be easier. Everything cannot be made simpler because it involves core data, computer science, pro-engineering, and not many people are actually aware of it.
I rate Apache Spark a six out of ten.
View full review »
I would rate it a nine out of ten.
View full review »
reviewer1283880
CEO International Business at a tech services company with 1,001-5,000 employees
I would give it a rating of seven out of ten, which, by my standards, is quite high.
View full review »
My advice to others would be to use Apache Spark for large-scale data processing, as it provides good performance at low cost, unlike Ab-Initio or Informatica. The main problem is that there are currently not many people in the market certified in Apache Spark.
View full review »
I would rate Apache Spark eight out of ten.
View full review »
Spark can handle small to huge data and is suitable for any size of company. I would rate Spark as eight out of ten.
View full review »
Rajendran Veerappan
Director at Nihil Solutions
We're customers and also partners with Apache.
While we are on version 2.6, we are considering upgrading to version 3.0.
I'd rate the solution nine out of ten. It works very well for us and suits our purposes almost perfectly.
View full review »
reviewer879201
Technical Consultant at a tech services company with 1-10 employees
On a scale of 1 to 10, I'd put it at an eight.
To make it a perfect 10 I'd like to see an improved configuration bot. Sometimes it is a nightmare on Linux trying to figure out what happened on the configuration and back-end. So I think installation and configuration with some other tools. We are technical people, we could figure it out, but if aspects like that were improved then other people who are less technical would use it and it would be more adaptable to the end-user.
View full review »
NitinKumar
Director of Engineering at Sigmoid
I would definitely recommend Spark. It is a great product. I like Spark a lot, and most of the features have been quite good. Its initial learning curve is a bit high, but as you learn it, it becomes very easy.
I would rate Apache Spark an eight out of ten.
View full review »
reviewer1792824
Senior Test Automation Consultant / Architect at a tech services company with 11-50 employees
I would advise not using it if you don't have experienced users inside your organization. If you have to figure it all out on your own, then you shouldn't start with it.
Overall, I would rate it a six out of 10. For a commercial use case, it is a six out of 10. For scientific purposes, it is an eight out of 10.
View full review »
reviewer1535340
Senior Solutions Architect at a retailer with 10,001+ employees
I would recommend Apache Spark to new users, but it depends on the use case. Sometimes, it's not the best solution.
On a scale from one to ten, I would give Apache Spark a ten.
View full review »
I would recommend the solution. I would rate it an eight or nine out of 10.
For some areas, I would give it ten but I cannot use some parts. If you are going to use it for a consumer then I would be able to recommend it and you should go ahead. It doesn't work for me as I have different clients and different engagements.
reviewer1185906
Manager - Data Science Competency at a tech services company with 201-500 employees
We are not using the current version of this platform, Spark 3. However, we do know that it is used in the market and it has new features. We will eventually move to it.
My advice for anybody who wants to use Apache Spark is that they have two options. The first is to go with Databricks, the company founded by the creators of Apache Spark, and use their proprietary version. If you choose this option then you will have to pay for the product.
If instead, you use Apache Spark, then you can rely on your own expert in-house team for support, maintenance, and deployment. In this option, you don't have to pay anything to anybody outside of your company.
I would rate this solution an eight out of ten.
View full review »
I love Spark over other solutions.
View full review »
Spark gives the flexibility for developing custom applications.
View full review »
Get to know how Spark works, including what a job, a stage, a task, and the DAG are. It will help you write Spark applications.
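As a rough illustration of how those terms relate, the sketch below models a job as a linear chain of operations in plain Python. It is a deliberately simplified toy, not Spark's scheduler: the `Op` class, the `split_into_stages` helper, and the partition count are all hypothetical. It relies only on two facts about Spark: a new stage begins wherever an operation requires a shuffle, and each stage runs one task per partition. Real jobs form a general DAG rather than a single chain, and the partition count can change after a shuffle.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    wide: bool  # True if the operation needs a shuffle (e.g. a group-by)


def split_into_stages(ops):
    """Cut a linear chain of operations into stages at each shuffle boundary."""
    stages = [[]]
    for op in ops:
        if op.wide and stages[-1]:
            # A shuffle ends the current stage; the wide operation opens a new one.
            stages.append([])
        stages[-1].append(op.name)
    return stages


# Hypothetical job: narrow operations are pipelined together inside a stage.
job = [
    Op("read", wide=False),
    Op("filter", wide=False),
    Op("map", wide=False),
    Op("groupByKey", wide=True),
    Op("mapValues", wide=False),
    Op("repartition", wide=True),
]

stages = split_into_stages(job)

# Simplifying assumption: the partition count stays constant across stages.
num_partitions = 8
task_count = len(stages) * num_partitions

for number, stage in enumerate(stages):
    print(f"stage {number}: {stage} -> {num_partitions} tasks")
```

Reading a job this way helps explain why shuffle-heavy code is slow: every extra wide operation adds a stage boundary where data has to be redistributed before the next set of tasks can start.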
View full review »
Be sure to use the Apache versions and avoid vendor-specific extensions.
View full review »
reviewer1046250
Senior Consultant & Training at a tech services company with 51-200 employees
The work that we are doing with this solution is quite common and is very easy to do.
My advice for anybody who is implementing this solution is to look at their needs and then look at the community. Normally, there are a lot of people who have already done what you need. So, even without experience, it is quite simple to do a lot of things.
I would rate this solution a nine out of ten.
View full review »
Go for it.
View full review »
I also suggest having a Chief Technologist who has extensive experience in architecting several Big Data solutions. They should be able to communicate in business as well as technology language. Their expertise should range from infrastructure to application development and have command of Hadoop technologies.
View full review »
reviewer1904019
Chief Technology Officer at a tech services company with 11-50 employees
I rate Apache Spark an eight out of ten.
View full review »
It's easy to use and has a learning curve.
View full review »
KamleshKhollam
Managing Consultant at a computer software company with 501-1,000 employees
I would rate this solution an eight out of ten.
View full review »
Mohamed Ghorbel
Director of BigData Offer at IVIDATA
We use both on-premises and public and private cloud deployment models. We're partners with Databricks.
I'm a consultant. Our company works for large enterprises such as banks and energy companies. 17 of our workers use Apache Spark.
With the cloud, there are many companies that integrate Spark. Most projects in big data around the world use Spark, indirectly or directly.
I'd rate the solution eight out of ten.
View full review »
This is a very good product for big data analytics, and it integrates well with other components such as machine learning and graph analytics.
View full review »
Learn Scala as this will greatly reduce the pain in starting off with Spark.
View full review »
Have Scala developers at hand. Base Java competency will not be enough during optimization rounds.
View full review »
Snrsecengin567
Snr Security Engineer at a tech vendor with 201-500 employees
I would rate this solution eight out of 10.
View full review »
The advice that I would give to someone considering this solution is that the quality of data has key streaming capabilities like velocity. This means how quickly you are going to refer to the data. These things matter by designing the solution. We need to take these things out.
I would rate Apache Spark an eight out of ten.
To make it a ten they should improve the speed. The data storage capacity means we can inject somewhere in the user database in more efficient ways.
View full review »