What is our primary use case?
We use it for micro-batching of Kafka topics, mostly small bits of clickstream data. For almost all of our use cases, the data lands in our data warehousing solution, Snowflake. We also take large XML files from multiple parties, transform them, and load them into Snowflake.
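To make the micro-batching idea concrete, here is a minimal, hypothetical sketch of the pattern (not Equalum's actual implementation): pull a bounded chunk of clickstream events at a time, then hand each chunk to the warehouse loader in one shot.

```python
from itertools import islice

def micro_batches(records, batch_size=500):
    """Group an iterable of records into fixed-size micro-batches.

    A stand-in for what a Kafka consumer loop does: read a bounded
    chunk of events, load it, then go back for the next chunk.
    """
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Example: 1,050 click events become three micro-batches.
events = [{"event_id": i} for i in range(1050)]
batches = list(micro_batches(events, batch_size=500))
print([len(b) for b in batches])  # → [500, 500, 50]
```

The batch size is the knob: small batches keep latency low, large batches keep the per-load overhead on the warehouse side down.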
How has it helped my organization?
I am part of the data engineering/data science group and governance. I am a senior software engineer on the team. We ingest data from dozens of systems. We ingest that data through Equalum and orchestrate it. We then deploy it to our data warehousing solutions. Then, we work with the data science team to come up with more metrics/measurements.
Equalum has resulted in system performance improvements in our organization. Now, I am ingesting data from multiple S3 sources, processing it, and formatting a schema. This would usually take me a couple of days, but now it takes me hours.
It has redefined how we architect our data system. Before, once a day, we would go and grab data. Now, the vast majority of the time, we are actively scanning for new data to come in. As much as possible, we try not to wait for a person to tell us the data is there; we actively go out and get it whenever we can. That is a big change for us. If somebody says a file is supposed to be there at 8:00 PM and it doesn't show up, that can muck up your data flow. Now, if the data shows up at 8:01, you are already actively checking the directory, so you won't miss the file if it is late.
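The "actively check the directory" pattern described above can be sketched as a simple polling loop (a hypothetical illustration, not how Equalum does it internally): keep looking for the file until it arrives or a deadline passes.

```python
import os
import time

def wait_for_file(path, timeout_s=600, poll_interval_s=5):
    """Poll for a file instead of assuming it lands on schedule.

    Returns True as soon as the file exists, False if the
    deadline passes without it appearing.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval_s)
    return os.path.exists(path)

# A file that is already present is picked up immediately,
# even if it arrived after its scheduled drop time.
import tempfile
with tempfile.NamedTemporaryFile() as f:
    print(wait_for_file(f.name, timeout_s=1, poll_interval_s=0.1))  # → True
```

A late file at 8:01 is caught on the next poll, instead of falling through a once-a-day 8:00 pickup.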
What is most valuable?
Its most valuable feature is the change data capture (CDC). This is usually a bit more of a pain if I were using open source or other tools, but I find their change data capture and data queries pretty intuitive.
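For readers new to CDC, the core idea is replaying a stream of row-level change events to keep a target copy in sync with a source database. A minimal sketch, using a hypothetical event shape (not Equalum's actual wire format):

```python
def apply_cdc_event(table, event):
    """Apply one change-data-capture event to an in-memory 'table'.

    Assumed event shape (illustrative only):
    {"op": "insert" | "update" | "delete", "key": ..., "row": {...}}
    Replaying these events in order is what keeps a warehouse
    copy consistent with the source database.
    """
    op = event["op"]
    if op in ("insert", "update"):
        table[event["key"]] = event["row"]
    elif op == "delete":
        table.pop(event["key"], None)
    return table

events = [
    {"op": "insert", "key": 1, "row": {"name": "alice", "clicks": 3}},
    {"op": "update", "key": 1, "row": {"name": "alice", "clicks": 7}},
    {"op": "insert", "key": 2, "row": {"name": "bob", "clicks": 1}},
    {"op": "delete", "key": 2, "row": None},
]
table = {}
for e in events:
    apply_cdc_event(table, e)
print(table)  # → {1: {'name': 'alice', 'clicks': 7}}
```

The appeal of a managed CDC tool is that it captures these events from the database log and handles ordering and delivery for you, rather than you writing and operating this replay logic yourself.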
Equalum provides a single platform for the following core architectural use cases: CDC replication, streaming ETL, and batch ETL. This is core to our company; I would score it a nine out of 10. It is pretty much how we move all of our data through their system.
The no-code part is useful; it is about a seven out of 10 for us. We are all software engineers, so it mostly helps speed up the data mapping. For example, I just did five documents, and it probably saved me 50 percent of my time.
What needs improvement?
Their UI could use some work. They could make it a little faster to get around the user interface, and it could be more intuitive with things like keyboard shortcuts. They already know this from me, because I have already complained to them about it.
For how long have I used the solution?
I have been using it for a year and a half.
What do I think about the stability of the solution?
The stability has been okay. There have been some rough patches at times. It was on and off in the beginning.
It basically runs itself. We check in on it every day, because we use it all the time, but our team is pretty small.
What do I think about the scalability of the solution?
The scalability has been pretty good. We have had to work with them on memory management, but that has been more of a learning curve for us than a problem with the system itself.
There are three or four people who use it. Most people just care about the end results, not necessarily the pipes.
Because the data goes into Snowflake, any of our data analysts, data scientists, and anybody who touches a Tableau dashboard is an end user.
More than 90 percent of our data flows through Equalum. It is a core piece of our data platform.
How are customer service and technical support?
I would rate their technical support as 10 out of 10. They are quite lovely. I can reach them pretty much 24/7, and they are very responsive. We have a Slack channel with them; if we post something in it, they will respond within the hour and usually open a ticket. We sit with them once a week and go over the backlog. They are very hands-on and willing to talk about improvements to the system, and they take feedback very well. If they can't figure out the problem themselves, they will log onto our production or back-end systems, with our permission, to help us resolve problems faster.
They are pretty quick. They patch the system quickly when we find bugs and will hotfix stuff for us. We are finding fewer issues as we go along, but we are pretty picky.
Which solution did I use previously and why did I switch?
Prior to using this solution, things were quite difficult, especially for real-time streaming. Most solutions we had used before were batch-oriented. Moving over to Equalum, we shifted to more data streaming.
How was the initial setup?
The initial setup is moderately complex. It is semi-hosted, not a full-stack platform, so there is still hands-on work. For example, we wrote Terraform scripts to deploy their whole system in the particular way we wanted to do it.
The deployment didn't take long. Once we got the training wheels going, it was about a week or two. It was mostly just figuring out where and how we were going to run it.
Our original plan involved a lot of SQL work. Now, we have moved mostly to Kafka and smaller micro-batching, so we had to reconfigure the system as we evolved it. They worked with us hand-in-hand to do that.
Deploying throughout the entire organization was pretty smooth. We have needed to reconfigure a few things here and there because our use cases have changed.
What about the implementation team?
We had one or two people for the deployment. It was probably a week's worth of coding work to get it the way we wanted to operate it.
We worked with them pretty extensively and talked to them probably every week when doing our upgrade.
What was our ROI?
We have most definitely seen ROI with the ability to onboard data, i.e., the speed of business.
This solution has enabled us to consolidate the use of other tools, and we are actively phasing some of them out. We are probably saving an hour or two every day. We are slowly replacing Stitch Data, Airflow, and Luigi with Equalum.
Which other solutions did I evaluate?
We assessed several solutions, but Equalum beat most of them. We could actually see what was going on inside Equalum's systems. They have their low-code interface, but we can also access a vast majority of the back-end. It runs on Kafka and Spark, which allows us to use open source technology with it rather than being completely closed off from everything.
There are not a lot of other systems out there that I find as intuitive as Equalum, and it can do some pretty complex stuff. We haven't used a ton of what we could do. We are converting the whole company over to data streams, so we haven't been able to take advantage of many of their more advanced features yet.
What other advice do I have?
Know your use cases, e.g., will you be doing a lot of micro-batching, database work, or pulling data straight off of Kafka topics?
The user interfaces are pretty good for data products. There is nothing amazing about them, but nothing really detracts from them either.
We don't do any data testing inside of Equalum. It doesn't mean that we couldn't, but we don't at the moment.
Eventually, when new data features using Jupyter Notebooks come out, we will start incorporating those into our data science work.
The biggest lesson learnt: how to operate a Kafka cluster with Spark and do it well.
I would rate it as a nine out of 10. Their customer support is phenomenal. Most companies usually sell it to you, then they disappear. Equalum is very interested in customer feedback.
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)