Apache Spark Room for Improvement

reviewer879201
Technical Consultant at a tech services company with 1-10 employees
I think for IT people it is good. The whole idea is that Spark works pretty easily, but a lot of people, including me, struggle to set things up properly. I like contributions, and if you want to connect Spark with Hadoop it's not a big thing, but for other things, such as using Sqoop with Spark, you need to do the configuration by hand. I wish there were a solution that does all these configurations, like in Windows, where you have the whole solution and it handles the back end. I think that kind of solution would help. But still, it can do everything for a data scientist. Spark's main objective is to manipulate and calculate. It is playing with the data. So it has to keep doing what it does best and let the visualization tool do what it does best. Overall, it offers everything that I can imagine right now.
Karthikeyan R
Principal Architect at a financial services firm with 1,001-5,000 employees
The search could be improved. Usually, we are using other tools to search for specific stuff. We'll be using it how I use other tools, to get the details, but if there is any way to search for little things, that would be better. It needs a new interface and a better way to get some data. In terms of writing our scripts, some processes could be faster. In the next release, if they can add more analytics, that would be useful. For example, for data, built data, if there was one port where you put the high one then you can pull any other close to you, and then maybe a log for the right script.
reviewer1046250
Senior Consultant & Training at a tech services company with 51-200 employees
When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data. Once you are experienced, it is easier and more stable. When you are trying to do something outside of the normal requirements in a typical project, it is difficult to find somebody with experience.
Find out what your peers are saying about Apache, Informatica, VMware and others in Hadoop. Updated: March 2020.
408,459 professionals have used our research since 2012.
reviewer1221765
Co-Founder at a tech vendor with 11-50 employees
We've had problems using a Python process to try to access something in a large volume of data. It crashes if somebody gives me the wrong code because it cannot handle a large volume of data.
reviewer1223676
Lead Consultant at a tech services company with 51-200 employees
We use big data manager but we cannot use it as conditional data, so whenever we're trying to fetch the data, it takes a bit of time. There is some latency in the system and latency in the data caching. The main issue is that we need to design it in a way that data will be available to us very quickly. It takes a long time, and the latest data should be available to us much more quickly.
Snrsecengin567
Snr Security Engineer at a tech vendor with 201-500 employees
The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive.
Mohamed Ghorbel
Director of BigData Offer at IVIDATA
The solution needs to optimize shuffling between workers.
KamleshKhollam
Consultant at Exusia
I would like to see integration with data science platforms to optimize the processing capability for these tasks.
reviewer894894
User
I would suggest that it support more programming languages and also provide an internal scheduler to schedule Spark jobs with monitoring capability.
Sumanth Punyamurthula
Director - Data Management, Governance and Quality at Hilton
It is like going back to the '80s for the complicated coding that is required to write efficient programs.
Rosemary Walsh
Portfolio Manager, Enterprise Solutions Architect at Capgemini
Better data lineage support.