Apache Spark Review

We can now harness richer data sets and benefit from use cases that add value to our business functions


How has it helped my organization?

Organisations can now harness richer data sets and benefit from use cases that add value to their business functions.

What is most valuable?

Distributed in-memory processing. Some of the algorithms are resource-heavy, and executing them requires a lot of RAM and CPU. With Hadoop-related technologies, we can distribute the workload across multiple commodity machines.
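
As a rough illustration of what distributing the workload means, here is a minimal sketch of the partition-and-combine pattern in plain Python. It is not Spark code: threads stand in for cluster machines, and the function names (split_into_partitions, sum_of_squares, distributed_sum_of_squares) are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Deal the records round-robin into num_partitions smaller lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def sum_of_squares(partition):
    """Work done independently on one partition (stand-in for a heavy algorithm)."""
    return sum(x * x for x in partition)


def distributed_sum_of_squares(records, num_workers=4):
    """Process each partition on its own worker, then combine the partial results."""
    partitions = split_into_partitions(records, num_workers)
    # Assumption: threads play the role of worker machines here. On a real
    # cluster, each partition would be handled by a process on another node.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(sum_of_squares, partitions))
    # Combining the partial results is cheap compared with the per-partition work.
    return sum(partials)


result = distributed_sum_of_squares(list(range(1, 101)))
print(result)
```

The point of the pattern is that no single worker has to hold or process the whole data set, which is why adding more commodity machines helps with RAM-heavy and CPU-heavy algorithms.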

What needs improvement?

It should include more machine learning algorithms and the ability to handle continuous streaming of data rather than micro-batch processing.
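
To make the difference between streaming and micro-batch processing concrete, here is a toy model in plain Python. It is only a sketch under simplifying assumptions: processing time is ignored, arrival times are integers in milliseconds, and the function names are hypothetical rather than part of any Spark API.

```python
def record_at_a_time_wait_ms(arrival_times_ms):
    """Each event is handled as soon as it arrives, so it never waits."""
    return [0 for _ in arrival_times_ms]


def micro_batch_wait_ms(arrival_times_ms, batch_interval_ms):
    """Each event waits until the end of the batch window it arrived in."""
    waits = []
    for arrival in arrival_times_ms:
        # Windows are [0, interval), [interval, 2 * interval), and so on.
        window_end = (arrival // batch_interval_ms + 1) * batch_interval_ms
        waits.append(window_end - arrival)
    return waits


arrivals = [100, 450, 990, 1200]
streaming_waits = record_at_a_time_wait_ms(arrivals)
batch_waits = micro_batch_wait_ms(arrivals, batch_interval_ms=1000)
print(streaming_waits, batch_waits)
```

In this model an event that arrives early in a window waits almost the full batch interval before it is processed, which is the latency cost I would like to see removed.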

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

When users do not know how to use Spark and request a lot of resources, the underlying JVMs can crash, which is a big cause for worry.

What do I think about the scalability of the solution?

No issues.

Disclosure: I am a real user, and this review is based on my own experience and opinions.