Apache Spark Previous Solutions

Ilya Afanasyev - PeerSpot reviewer
Senior Software Development Engineer at Yahoo!

We also use Flink.

Before Spark, I worked at another company where we used different technologies, including Kafka, Radius, PostgreSQL, S3, and Spring.

SurjitChoudhury - PeerSpot reviewer
Data engineer at Cocos pt

I have also used Hadoop.

The main reason for choosing Apache Spark was for big data solutions. Hadoop was introduced earlier, and most organizations were using Hadoop or cloud data platforms. 

Then Apache Spark came into the picture, and it was much faster. It is gradually taking the place of Hadoop. Organizations using Hadoop are now focusing primarily on Apache Spark for support.

For big data computing tasks, Hadoop acts as one layer, and Spark is another layer on top of it. Organizations using Hadoop and big data technologies in general have adopted Spark.

There aren't really other comparable tools for big data computing tasks. However, resource managers like Kubernetes and YARN are used with Spark. YARN was the resource manager in Hadoop-based deployments, but Kubernetes is now more commonly used for resource management.
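As a rough illustration of the point about resource managers, the sketch below builds a spark-submit command line for either YARN or Kubernetes. The helper name `spark_submit_args`, the application path, the API server URL, and the container image are all hypothetical placeholders. Only the `--master`, `--deploy-mode`, and `spark.kubernetes.container.image` options come from Spark itself.

```python
def spark_submit_args(app, manager, k8s_api_url=None, image=None):
    """Build a spark-submit argument list for YARN or Kubernetes.

    Hypothetical helper for illustration; it only assembles strings
    and does not launch anything.
    """
    if manager == "yarn":
        # On YARN the cluster location comes from the Hadoop configuration,
        # so the master is simply "yarn".
        master = ["--master", "yarn"]
        extra = []
    elif manager == "kubernetes":
        if not k8s_api_url or not image:
            raise ValueError("Kubernetes needs an API server URL and an image")
        # On Kubernetes the master is the API server URL with a k8s:// prefix,
        # and the executors run from a container image.
        master = ["--master", f"k8s://{k8s_api_url}"]
        extra = ["--conf", f"spark.kubernetes.container.image={image}"]
    else:
        raise ValueError(f"unknown resource manager: {manager}")
    return ["spark-submit", *master, "--deploy-mode", "cluster", *extra, app]


# Example values below are placeholders, not real endpoints.
yarn_cmd = spark_submit_args("job.py", "yarn")
k8s_cmd = spark_submit_args(
    "job.py",
    "kubernetes",
    k8s_api_url="https://k8s.example.com:6443",
    image="example/spark:latest",
)
```

The application itself stays the same in both cases. Only the master URL and a few settings change, which is part of why moving from YARN to Kubernetes is feasible for many teams.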

Suriya Senthilkumar - PeerSpot reviewer
Analyst at Deloitte

Before choosing Apache Spark for processing big data, we evaluated another option, Hadoop. However, Spark emerged as the superior choice.

Buyer's Guide
Apache Spark
April 2024
Learn what your peers think about Apache Spark. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,415 professionals have used our research since 2012.
Hamid M. Hamid - PeerSpot reviewer
Data architect at Banking Sector

In the past, my company has used certain ETL tools, like Informatica, based on the performance levels offered.

Lucas Dreyer - PeerSpot reviewer
Data Engineer at BBD

We used Pandas data frames and SQL-type queries for smaller datasets, but we haven't worked with anything on the scale of Spark SQL.

VM
Cloud solution architect at 0

I've been exploring its capabilities in the OpenAI context, rather than dealing with external databases. 

I've also started using Apache Kafka for messaging and event streaming, which is essential since our solutions often integrate with applications running in Azure, including Event Hubs and Service Bus for messaging. This experience includes interfacing with various technologies, not just within Microsoft's ecosystem but also with Amazon Web Services.

Learning new technologies is a continuous process, and I've never found it difficult to adapt, especially with something as foundational as Apache Kafka.

UjjwalGupta - PeerSpot reviewer
Module Lead at Mphasis

The main reason our company opted for the product is its capability to process large volumes of data. While other options like Snowflake offer some advantages, they may have limitations regarding custom logic or modifications.

Oscar Estorach - PeerSpot reviewer
Chief Data-strategist and Director at Theworkshop.es

I also use Databricks, which I use in the cloud.

Suresh_Srinivasan - PeerSpot reviewer
Co-Founder at FORMCEPT Technologies

Before using this solution, we used Apache Storm.

NB
CEO International Business at a tech services company with 1,001-5,000 employees

Opting for Apache Spark, an open-source solution, provides a distinct advantage by offering control over the code. This means you can identify issues, make necessary fixes, and determine what aspects to accept as they are. In contrast, dealing with a vendor may limit control, requiring you to submit requests and advocate for changes based on your business volume with them. This dependency on volume can potentially compromise control. To safeguard both your customers and your business, the choice of an open-source solution like Apache Spark allows for more autonomy and control over the technology stack.

it_user371832 - PeerSpot reviewer
Chief System Architect at a marketing services firm with 501-1,000 employees

Yes, for this job we used a MySQL database. We switched because MySQL is not a scalable solution and we had reached its limits.

RV
Director at Nihil Solutions

We previously used a lot of different mechanisms. However, we needed something that was good at processing data for analytical purposes, and this solution fit the bill. It's a very powerful tool. I haven't seen other tools that could do precisely what this one does.

SA
Technical Consultant at a tech services company with 1-10 employees

I have used MapReduce from Hadoop previously. Otherwise, I haven't used any other big data infrastructure.

In a previous job, not at this company, I was working with big data but extracting it using a single core of my PC. Over time I realized that my system had eight cores, so I used all of them for multi-core programming. Then I realized that Hadoop and Spark do the same thing, but across different PCs. That was when I was using multi-core programming, and that's the point: Spark needs to go and search Hadoop and other things.
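The step from one core to eight can be sketched in plain Python. This is a minimal illustration, not the reviewer's actual code. The function names and the sum-of-squares workload are assumptions chosen for brevity. The data is split into chunks, each chunk is processed independently, and the partial results are combined, which is the same split/process/combine pattern that Hadoop and Spark apply across machines instead of cores.

```python
from multiprocessing import Pool


def split(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def chunk_sum_of_squares(chunk):
    """Work done independently on one chunk (the "map" step)."""
    return sum(x * x for x in chunk)


def sum_of_squares(numbers, n_parts, map_fn=map):
    """Process each chunk with map_fn, then combine the partial sums."""
    partials = map_fn(chunk_sum_of_squares, split(numbers, n_parts))
    return sum(partials)


def sum_of_squares_multicore(numbers, workers=8):
    """Same computation, with one worker process per chunk."""
    with Pool(processes=workers) as pool:
        return sum_of_squares(numbers, workers, map_fn=pool.map)
```

With the default `map_fn`, everything runs on a single core. Calling `sum_of_squares_multicore` (from under an `if __name__ == "__main__":` guard, as `multiprocessing` requires on some platforms) spreads the chunks over worker processes without changing the logic.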

GA
Senior Solutions Architect at a retailer with 10,001+ employees

Because my area is data analytics and analytics solutions, I use BigQuery, SQL, and ETL. I also use Dataproc and DataFlow.

it_user946074 - PeerSpot reviewer
Principal Architect at a financial services firm with 1,001-5,000 employees

I was using some other systems and we moved to Spark later. We faced performance and other issues with the other solution.

AR
Manager - Data Science Competency at a tech services company with 201-500 employees

I work on several open-source frameworks including Python, Scikit-learn, TensorFlow, PyTorch, H2O.ai, and R. We don't endorse proprietary tools so we aren't working with them.

it_user373173 - PeerSpot reviewer
Lead Big Data Engineer at a non-profit with 51-200 employees

I evaluated a Hadoop-based solution and chose Spark due to its fast processing and ease of use.

it_user371334 - PeerSpot reviewer
CEO at a tech consulting company with 51-200 employees

Yes, we previously used Oracle, from which we ported our data.

it_user326142 - PeerSpot reviewer
Architect at a healthcare company with 51-200 employees

Yes, we used Hive, Pig, and Storm. Having everything in the same framework has helped us out a lot.

it_user371325 - PeerSpot reviewer
Data Scientist at a tech vendor with 10,001+ employees

I previously used Python and R, but neither of these scaled particularly well.

LC
Snr Security Engineer at a tech vendor with 201-500 employees

In previous companies, we used the MySQL platform and solutions like ArcSight and Splunk. We switched for scalability. MySQL wasn't going to scale, and we don't use Splunk at this company.
