2018-06-27T19:19:00Z

What is your primary use case for Apache Spark?


How do you or your organization use this solution?

Please share with us so that your peers can learn from your experiences.

Thank you!

Guest
11 Answers

Top 5, Real User

Apache Spark suits many use cases in big data and data engineering. We use Apache Spark for ETL, integration with streaming data, real-time prediction (such as anomaly detection and price prediction), and data exploration on large volumes of data.

2020-06-10T05:14:07Z
Top 5 Leaderboard, Consultant

Our use case for Apache Spark was a retail price prediction project. We were using retail pricing data to build predictive models. To start, the prices were analyzed and we created the dataset to be visualized using Tableau. We then used a visualization tool to create dashboards and graphical reports to showcase the predictive modeling data. Apache Spark was used to host this entire project.

2020-02-02T10:42:14Z
Top 10 Leaderboard, Real User

We have built a product called "NetBot." We take any form of data (large email data, images, videos, or transactional data), transform the unstructured text and video data into a structured, transactional form, and create an enterprise-wide smart data grid. That smart data grid is used by downstream analytics tools. We also provide machine-building for people to get faster insight into their data.

2020-01-29T11:22:00Z
Top 10 Leaderboard, Consultant

We are working with a client that has a wide variety of data, some of it residing in other structured databases. The idea is to first build a database in Hadoop, which we are in the process of doing right now, so that there is one place for all kinds of data. Then we are going to use Spark.

2019-12-23T07:05:00Z
Top 5 Leaderboard, Real User

We primarily use the solution to integrate very large data sets from other environments, such as our SQL environment, and to draw purposeful data before checking it. We also use the solution for streaming from very large servers.

2019-12-09T10:58:00Z
Top 10 Leaderboard, Consultant

We use this solution for information gathering and processing. I use it myself when I am developing on my laptop. I am currently using an on-premises deployment model. However, in a few weeks, I will be using the EMR version on the cloud.

2019-10-13T05:48:00Z
Real User

We primarily use the solution for security analytics.

2019-07-14T10:21:00Z
Top 20, Real User

We use the solution for analytics.

2019-07-10T12:01:00Z
Real User

Streaming telematics data.

2019-04-08T13:04:00Z
Real User

Ingesting billions of rows of data all day.

2019-03-17T03:12:00Z
User

Used for building big data platforms for processing huge volumes of data. Additionally, streaming data is critical.

2018-06-27T19:19:00Z