Spring Cloud Data Flow Review

Good integration with Kafka and rich community support, but the monitoring tools are not yet mature


What is our primary use case?

In my last project, I worked on Spring Cloud Data Flow (SCDF). We created a stream using this product and we had a Spring Kafka Binder as well. The project included creating a data lake for our clients.

The platform that we created maintained a data lake for an internet banking user and provided an out-of-the-box solution for integration with it. We used SCDF to gather the data, as well as our ETL (extract, transform, and load) pipelines.
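
For illustration only, a transform step in a pipeline like this could look roughly like the sketch below. It is plain Java, not the actual project code: the class name `RecordTransform` and the normalization logic are hypothetical. In a Spring Cloud Stream application, a `java.util.function.Function` like this one would typically be exposed as a bean so that the Kafka binder can connect it to an input and an output topic.

```java
import java.util.Locale;
import java.util.function.Function;

public class RecordTransform {

    // Hypothetical "transform" stage of an ETL pipeline:
    // trims whitespace and upper-cases the incoming payload.
    // A null payload is mapped to an empty string.
    static final Function<String, String> NORMALIZE =
            payload -> payload == null
                    ? ""
                    : payload.trim().toUpperCase(Locale.ROOT);

    public static void main(String[] args) {
        // Apply the transform to a sample record and print the result.
        System.out.println(NORMALIZE.apply("  account-opened  "));
    }
}
```

Keeping the step as a plain function makes it easy to test without Kafka or SCDF running.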

What is most valuable?

The most valuable feature is real-time streaming.

It integrates very well with Kafka. The integration of Elasticsearch and Appian was also very good because we simply attached Appian to a pipeline. We had an Elasticsearch cloud running on-premises, so we were able to connect to the data.

It is open-source and has rich community support.

What needs improvement?

Some of the features, like the monitoring tools, are not very mature and are still evolving. Some of the products we used did not integrate well and hung frequently. One of the advantages of using open-source is that if you don't like a particular tool, you can use another one.

If you want to use Kubernetes, you have to optimize a lot in terms of resources. I had a 15 GB MacBook Pro, but initially it wouldn't work because it would hang. There were also some weird shutdowns. We weren't able to figure out exactly why they happened, but it was clearly due to insufficient system resources. We then needed to optimize and increase our heap memory.
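
When tuning heap memory like this, it can help to confirm how much heap the JVM actually received inside the container. The following is a minimal sketch, assuming a standard JVM. The class name `HeapCheck` is hypothetical, and `Runtime.maxMemory()` reports the maximum heap the JVM will attempt to use, which may differ from the container's memory limit.

```java
public class HeapCheck {

    // Returns the maximum heap size the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Print the effective heap ceiling so it can be compared
        // against the memory assigned to the pod or the local machine.
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it with different `-Xmx` values shows whether a heap setting is being picked up.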

For how long have I used the solution?

We used this product for almost six months in my previous company.

How are customer service and technical support?

This product has a rich support community.

What's my experience with pricing, setup cost, and licensing?

This is an open-source product that can be used free of charge.

What other advice do I have?

We used this product with Kubernetes, which had been recently introduced, and we liked it. It was very good compared to Maven. We did try it with Maven; however, the server took 15 or 16 minutes to start. That is when we switched to Kubernetes, and it was very good. They provide a lot of different configurations and environment types. We use Kafka on Kubernetes as well. The configuration was provided by SCDF.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.