Apache Spark Review

I use it to process large amounts of data in the energy industry.


Valuable Features

Spark is relatively easy to deploy and has rich features for handling big data. Our applications mostly use Spark Core, Spark SQL, and Spark MLlib.

Improvements to My Organization

I use Spark to process large amounts of data in the energy industry.

Room for Improvement

A good tool for analysing Spark application performance is needed. Right now there are still many parameters to tune to get good performance from a Spark application, and I would like to see automatic tuning of those parameters.
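
To give a sense of the manual tuning involved, here is a minimal Python sketch that turns a few commonly tuned Spark properties into `spark-submit` arguments. The property names are standard Spark configuration keys, but the values, the helper `build_submit_args`, and the script name `my_job.py` are hypothetical placeholders, not settings taken from this review.

```python
# Hypothetical values: tune these for your own cluster and workload.
TUNING = {
    "spark.executor.memory": "4g",
    "spark.executor.cores": "2",
    "spark.sql.shuffle.partitions": "200",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
}


def build_submit_args(app, conf):
    """Return a spark-submit argument list with one --conf flag per property."""
    args = ["spark-submit"]
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app)  # the application script goes last
    return args


if __name__ == "__main__":
    print(" ".join(build_submit_args("my_job.py", TUNING)))
```

Keeping the properties in one dictionary makes it easier to compare runs while experimenting with different values.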

Use of Solution

I've been using Spark for seven months.

Deployment Issues

There were no issues with the deployment.

Stability Issues

I ran into Spark application performance issues. For instance, Spark JDBC write performance needs to be improved.
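
For readers who hit a similar JDBC write bottleneck, two settings often examined first are the `batchsize` and `numPartitions` options of Spark's JDBC data source. The Python sketch below is an illustration under assumptions, not the code used in this review: the helper names, the URL, the table name, and the values are all hypothetical.

```python
def jdbc_write_options(url, table, batch_size=10000, num_partitions=8):
    """Build an options dict for Spark's JDBC data source."""
    return {
        "url": url,
        "dbtable": table,
        # Rows sent per JDBC batch insert; Spark's default is 1000.
        "batchsize": str(batch_size),
        # Upper bound on partitions, and so on concurrent connections, for the write.
        "numPartitions": str(num_partitions),
    }


def write_jdbc(df, options, mode="append"):
    """Write a Spark DataFrame through the JDBC data source."""
    df.write.format("jdbc").options(**options).mode(mode).save()


# Hypothetical connection details, for illustration only.
opts = jdbc_write_options("jdbc:postgresql://db-host:5432/energy", "meter_readings")
```

Whether larger batches or more partitions help depends on the target database, so these values are starting points to measure rather than recommendations.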

Scalability Issues

There were no issues with the scalability.

Customer Service and Technical Support

Customer Service:

I use the open-source Apache release, so we handle everything on our own.

Technical Support:

I use the open-source Apache release, so we handle everything on our own.

Previous Solutions

I evaluated a Hadoop-based solution and chose Spark for its fast processing and ease of use.

Initial Setup

The initial setup is not complex. The online documentation is pretty good.

Implementation Team

I implemented it in-house.

Other Advice

Get to know how Spark works: what a job, a stage, a task, and the DAG are, and so on. It will help you write Spark applications.
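
As a rough mental model of those terms: an action triggers a job, the job's DAG of transformations is cut into stages at shuffle boundaries, and each stage runs one task per partition. The toy Python sketch below only mimics that grouping for a linear chain of operations. It does not call Spark, the function names are made up for illustration, and it simplifies by placing the shuffling operation at the start of the next stage.

```python
# Toy model only: imitates how stages are formed, without using Spark itself.
# Transformations that typically require a shuffle (wide dependencies).
WIDE = {"reduceByKey", "groupByKey", "join", "repartition"}


def split_into_stages(lineage):
    """Cut a linear chain of transformations into stages at each shuffle."""
    stages, current = [], []
    for op in lineage:
        if op in WIDE and current:
            stages.append(current)  # the shuffle closes the previous stage
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages


def count_tasks(stages, partitions_per_stage):
    """Each stage runs one task per partition."""
    return sum(partitions_per_stage[: len(stages)])


# One action (for example collect) on this lineage triggers one job.
lineage = ["textFile", "flatMap", "map", "reduceByKey", "filter"]
stages = split_into_stages(lineage)
print(stages)
print(count_tasks(stages, [4, 2]))
```

Here the word-count style lineage splits into two stages, and with 4 and 2 partitions the job runs 6 tasks in total.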

Disclosure: I am a real user, and this review is based on my own experience and opinions.