2016-11-28T04:28:00Z

Apache Spark without Hadoop -- Is this recommended?


Hi community, 

I'm aware that we can use Apache Spark with or without Hadoop.

However, I believe most people run Apache Spark with Hadoop, and I read an article claiming that Spark without Hadoop is not suitable for production deployment and is only usable in a development environment.

Is that true? 

I'd greatly appreciate it if anyone could elaborate on this.

Thanks.

ITCS user
Guest
33 Answers

Top 5 Leaderboard, Real User

I don't think using Apache Spark without Hadoop has any major drawbacks or issues. I have used Apache Spark quite successfully with AWS S3 on many batch-based projects. That said, for very high-performance systems, HDFS is a better option.


The main problem with running Apache Spark against object storage like S3 has been the consistency behavior of these storage systems. The post linked below should help you understand the issue and how to avoid it. Hope this helps.



https://arnon.me/2015/08/spark...
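For anyone wondering what the S3-backed setup looks like in practice, here is a minimal sketch, not a definitive configuration. It assumes PySpark is installed and that the `hadoop-aws` connector (which provides the `s3a://` filesystem) is on the classpath, so the Hadoop client libraries are still involved even though no HDFS cluster is running. The master URL, bucket path, column name, and credentials provider class are placeholders I made up for illustration; adjust them to your environment and library versions.

```python
from typing import Dict


def s3a_spark_conf(master_url: str, credentials_provider: str) -> Dict[str, str]:
    """Build Spark settings for a job that reads from S3 instead of HDFS.

    Keys prefixed with "spark.hadoop." are copied by Spark into the
    Hadoop configuration used by the s3a connector.
    """
    return {
        "spark.master": master_url,
        "spark.hadoop.fs.s3a.aws.credentials.provider": credentials_provider,
    }


def build_session(conf: Dict[str, str], app_name: str = "spark-without-hdfs"):
    """Create a SparkSession from a plain dict of settings."""
    # Imported lazily so the config helper works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def main() -> None:
    """Hypothetical batch job: count rows per event type in an S3 dataset."""
    conf = s3a_spark_conf(
        # Placeholder standalone master; "local[*]" also works for development.
        master_url="spark://spark-master.example.internal:7077",
        # Assumed provider class; the right one depends on your AWS SDK version.
        credentials_provider="com.amazonaws.auth.DefaultAWSCredentialsProviderChain",
    )
    spark = build_session(conf)
    # Hypothetical bucket, path, and column name.
    df = spark.read.parquet("s3a://example-bucket/events/")
    df.groupBy("event_type").count().show()
    spark.stop()
```

Calling `main()` runs the job against Spark's standalone cluster manager, with S3 taking the place of HDFS as the storage layer. No YARN or HDFS services are started anywhere in this setup.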


2021-09-03T05:02:26Z
Top 5, Real User

I mean that we can also configure Spark without Hadoop, for example by using WinUtils.exe. Is that recommended for deployment? Alternatively, I would like to understand the difference between a Spark environment with Hadoop and one without it.

2017-01-04T13:58:25Z
Consultant

Can you elaborate on what you've been told about why using Apache Spark without Hadoop isn't good for deployment?

This insight would help many of our users.

2016-12-06T14:25:42Z