Apache Spark Previous Solutions
We also use Flink.
Before Spark, I worked at another company where we used different technologies, including Kafka, Radius, PostgreSQL, S3, and Spring.
I have also used Hadoop.
The main reason for choosing Apache Spark was its fit for big data solutions. Hadoop was introduced earlier, and most organizations were using Hadoop or cloud data platforms.
Then Apache Spark came into the picture, and it was much faster. It has largely taken Hadoop's place, and organizations that ran Hadoop now focus primarily on Apache Spark.
For big data computing tasks, Hadoop provides the base layer and Spark runs as a layer on top of it, so organizations using Hadoop and other big data technologies have generally adopted Spark.
There aren't really other comparable tools for big data computing tasks, but resource managers like Kubernetes and YARN are used with Spark. YARN was the resource manager in the Hadoop stack; now Kubernetes is more commonly used for resource management.
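As a rough illustration of the YARN-versus-Kubernetes point above, submitting the same Spark application under the two resource managers differs mainly in the `--master` setting; the cluster URL, image name, and job file below are placeholders, not values from any reviewer's environment:

```shell
# Submit to a YARN cluster (Hadoop stack); HADOOP_CONF_DIR must point
# at the cluster's Hadoop configuration for this to resolve the cluster.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  my_job.py

# Submit to a Kubernetes cluster; the API server URL and container image
# are hypothetical placeholders.
spark-submit \
  --master k8s://https://example-apiserver:6443 \
  --deploy-mode cluster \
  --conf spark.executor.instances=4 \
  --conf spark.kubernetes.container.image=example.com/spark:3.5.0 \
  my_job.py
```

The application code itself is unchanged between the two; only the resource-manager configuration differs, which is why teams could move from YARN to Kubernetes without rewriting jobs.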
Before choosing Apache Spark for processing big data, we evaluated another option, Hadoop. However, Spark emerged as the superior choice.
Buyer's Guide
Apache Spark
April 2024
Learn what your peers think about Apache Spark. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,415 professionals have used our research since 2012.
In the past, my company has used certain ETL tools, like Informatica, based on the performance levels offered.
We used pandas DataFrames and SQL-type queries for smaller datasets, but we haven't worked with anything on the scale of Spark SQL.
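The small-dataset workflow described above, SQL-style queries over in-memory data, can be sketched with Python's standard library; the table and values here are made up for illustration. Spark SQL exposes essentially the same SQL surface but distributes the query across a cluster:

```python
import sqlite3

# Build a tiny in-memory table. This is fine at this scale, but it all
# runs on one machine, which is where Spark SQL takes over.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# In Spark, a comparable aggregation would be spark.sql(...) over a
# DataFrame registered as a temporary view.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 150.0), ('west', 75.0)]
```

The appeal of Spark SQL for teams coming from this style of work is that the query language carries over almost unchanged while the execution scales out.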
Vineeth Marar
Cloud solution architect
I've been exploring its capabilities in the OpenAI context, rather than dealing with external databases.
I've also started using Apache Kafka for messaging and event streaming, which is essential since our solutions often integrate with applications running in Azure, including event hubs and service bus for messaging. This experience includes interfacing with various technologies, not just within Microsoft's ecosystem but also with Amazon Web Services.
Learning new technologies is a continuous process, and I've never found it difficult to adapt, especially with something as foundational as Apache Kafka.
The main reason our company opted for the product is its capability to process large volumes of data. While other options like Snowflake offer some advantages, they may have limitations regarding custom logic or modifications.
I also use Databricks in the cloud.
Before using this solution, we used Apache Storm.
reviewer1283880
CEO International Business at a tech services company with 1,001-5,000 employees
Opting for Apache Spark, an open-source solution, provides a distinct advantage by offering control over the code. This means you can identify issues, make necessary fixes, and determine what aspects to accept as they are. In contrast, dealing with a vendor may limit control, requiring you to submit requests and advocate for changes based on your business volume with them. This dependency on volume can potentially compromise control. To safeguard both your customers and your business, the choice of an open-source solution like Apache Spark allows for more autonomy and control over the technology stack.
Yes, for this job we used a MySQL database. We switched because MySQL is not a scalable solution and we had reached its limits.
Rajendran Veerappan
Director at Nihil Solutions
We did previously use a lot of different mechanisms, however, we needed something that was good at processing data for analytical purposes, and this solution fit the bill. It's a very powerful tool. I haven't seen other tools that could do precisely what this one does.
reviewer879201
Technical Consultant at a tech services company with 1-10 employees
I have used MapReduce from Hadoop previously. Otherwise, I haven't used any other big data infrastructure.
In my previous work, not at this company, I worked with some big data, but I was extracting it using a single core of my PC. Over time I realized my system had eight cores, so I used all of them with multi-core programming. Then I realized that Hadoop and Spark do the same thing, but across different machines. That was when multi-core programming led me to look into Spark, Hadoop, and related tools.
reviewer1535340
Senior Solutions Architect at a retailer with 10,001+ employees
Because my area is data analytics and analytics solutions, I use BigQuery, SQL, and ETL. I also use Dataproc and DataFlow.
I was using some other systems, and we moved to Spark later because we faced performance and other issues with the other solution.
reviewer1185906
Manager - Data Science Competency at a tech services company with 201-500 employees
I work with several open-source frameworks, including Python, scikit-learn, TensorFlow, PyTorch, H2O.ai, and R. We don't endorse proprietary tools, so we aren't working with them.
I evaluated a Hadoop-based solution and chose Spark due to its fast processing and ease of use.
Yes, we previously used Oracle, from which we ported our data.
Yes, we used Hive, Pig, and Storm. Having everything in the same framework has helped us a lot.
I previously used Python and R, but neither of these scaled particularly well.