Apache NiFi Review

It provides a useful GUI for configuring the system and monitoring the data flows.


What is most valuable?

We are a research institution and use NiFi for its easy Java extensibility, built-in provenance capturing, and graphical web interface.

How has it helped my organization?

We are replacing a custom-built Java data ingestion system that had become brittle and difficult to maintain over time.

NiFi allows us to organize our ingestion as directed graphs and provides a useful GUI that can be used to configure the system and monitor data flows.

NiFi’s provenance capturing is also a big plus, as our legacy system did not do this sufficiently.

What needs improvement?

Most of our data is binary, and we frequently must write our own processors. Also, there is no support for stateful operations that require information from other data flows or look-up tables.

For how long have I used the solution?

I have used this solution for more than one year.

What do I think about the stability of the solution?

There were no stability issues.

What do I think about the scalability of the solution?

There were no scalability issues.

How is customer service and technical support?

It is open-source software, but there is an active and rapidly growing contributor and user base.

Which solutions did we use previously?

We previously used custom code and switched to simplify maintenance and improve our functionality.

How was the initial setup?

The initial setup was very straightforward. NiFi is very easy to install and get running.

What's my experience with pricing, setup cost, and licensing?

It’s free!

Which other solutions did I evaluate?

We looked at some proprietary solutions and also evaluated StreamSets. The proprietary solutions were expensive and often didn’t suit our use cases. StreamSets didn’t have the same level of adoption.

What other advice do I have?

Think about your data flows as directed graphs between low-level processing modules, so you can re-use as much of the path as possible for different data streams. If you can avoid it, don’t create entirely separate flows for new data sources.
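
To make the advice above concrete, here is a minimal plain-Java sketch of the idea. It does not use the NiFi API at all; the `Node` class, the `run` helper, the node names, and the one-byte header are all hypothetical and exist only for illustration. Two sources each get a small source-specific entry step, and both feed the same shared "validate" and "store" steps rather than duplicating them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Plain-Java illustration only: none of these types come from the NiFi API.
final class SharedFlowSketch {

    // One node in the directed graph: a low-level processing step plus its outgoing edges.
    static final class Node {
        final String name;
        final UnaryOperator<byte[]> step;
        final List<Node> downstream = new ArrayList<>();

        Node(String name, UnaryOperator<byte[]> step) {
            this.name = name;
            this.step = step;
        }

        // Adds an edge from this node to `next` and returns `next` so calls can be chained.
        Node to(Node next) {
            downstream.add(next);
            return next;
        }
    }

    // Pushes `data` through the graph starting at `entry` and returns the names of the nodes visited.
    static List<String> run(Node entry, byte[] data) {
        List<String> path = new ArrayList<>();
        walk(entry, data, path);
        return path;
    }

    private static void walk(Node node, byte[] data, List<String> path) {
        path.add(node.name);
        byte[] result = node.step.apply(data);
        for (Node next : node.downstream) {
            walk(next, result, path);
        }
    }

    public static void main(String[] args) {
        // Shared tail of the flow, built once and re-used by every source.
        Node validate = new Node("validate", d -> d);
        Node store = new Node("store", d -> d);
        validate.to(store);

        // Source-specific entry steps handle only what is unique to each source.
        // Assumption: source A prefixes each record with a one-byte header that must be dropped.
        Node decodeA = new Node("decode-source-a", d -> Arrays.copyOfRange(d, 1, d.length));
        Node decodeB = new Node("decode-source-b", d -> d);
        decodeA.to(validate);
        decodeB.to(validate);

        System.out.println(run(decodeA, new byte[] {9, 1, 2, 3}));
        System.out.println(run(decodeB, new byte[] {4, 5, 6}));
    }
}
```

Adding a third source in this sketch means writing one new entry node and one edge into "validate"; the shared steps stay untouched, which is the property worth aiming for when laying out real flows.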

Disclosure: I am a real user, and this review is based on my own experience and opinions.