How do you or your organization use this solution?
Most of our use cases involve building data pipelines. Multiple microservices run on the Spring Cloud Data Flow infrastructure, and each pipeline is mostly a step-by-step process that moves data through Kafka. Most of the processors, sinks, and sources are developed based on the customers' business requirements or use cases. For the bank we work with, for example, we are building a document analysis pipeline. There are some defined sources where we get the documents. Later on, we extract some information from the summary and export the data to multiple destinations. We may export it to the POGI database and/or to a Kafka topic. For CoreLogic, we imported data into Elasticsearch. We had a BigQuery data source, and from there we transformed the data and then imported it into the Elasticsearch clusters. That was the ETL solution.
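To make the source, processor, and sink roles described above concrete, here is a minimal sketch in plain Java. It leans on the fact that Spring Cloud Stream's functional model expresses these roles as `java.util.function` `Supplier`, `Function`, and `Consumer` beans. Everything else is an assumption for illustration: the class name, the sample documents, the "extract the title" rule, and the two in-memory lists standing in for a database and a Kafka topic. In a real Spring Cloud Data Flow stream, each step would be its own Spring Boot application, connected to the next one through Kafka topics rather than chained in one process.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical in-process stand-in for three separate SCDF stream apps.
class DocumentPipelineSketch {

    // Source: emits raw documents (sample data, invented for this sketch).
    static Supplier<List<String>> source() {
        return () -> List.of(
                "Invoice 42: total due 100 EUR",
                "Statement 7: balance 250 EUR");
    }

    // Processor: extracts one piece of information from each document.
    // Here the "information" is simply the text before the first colon.
    static Function<String, String> extractTitle() {
        return doc -> doc.substring(0, doc.indexOf(':')).trim();
    }

    // Sink: writes each record to a destination (an in-memory list here).
    static Consumer<String> sink(List<String> destination) {
        return destination::add;
    }

    // Wires source -> processor -> two sinks, mirroring an export
    // to multiple destinations.
    static void run(List<String> database, List<String> topic) {
        Consumer<String> sinks = sink(database).andThen(sink(topic));
        source().get().stream()
                .map(extractTitle())
                .forEach(sinks);
    }

    public static void main(String[] args) {
        List<String> database = new java.util.ArrayList<>();
        List<String> topic = new java.util.ArrayList<>();
        run(database, topic);
        System.out.println(database); // prints [Invoice 42, Statement 7]
    }
}
```

Keeping each step a small, single-purpose function is what lets the same processor be reused across pipelines with different sources and destinations.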
In my last project, I worked on Spring Cloud Data Flow (SCDF). We created a stream using this product, and we used a Spring Kafka binder as well. The project involved creating a data lake for our clients. The platform we built maintained a data lake for an internet banking user and provided an out-of-the-box solution for integrating with it. We used SCDF to gather the data and to run our ETL (extract, transform, and load) pipelines.
The organization I’m currently consulting for is performing a lift-and-shift, moving its existing software from an on-prem platform and infrastructure to the cloud. They have chosen Azure as their cloud provider. As part of this process, they have orders to move away from expensive, monolithic, proprietary software platforms and to replace them with open-source, publicly available software technologies. One area we’ll be replacing is their current middleware, which consists of IBM WebSphere Message Broker. While it is a fine tool for the most part, it’s also bulky and expensive to operate. The final solution we’re working towards will be much more cloud-native, will support scalability, will process messages much faster, and will consist of several different technologies and vendors (not just a single vendor, as is the case with the current IBM solution). This new middleware platform will consist of Apache Kafka for the delivery of messages, Spring Cloud Data Flow, and a handful of RESTful APIs.
What do you like most about Spring Cloud Data Flow?