What is our primary use case?
We use Fivetran for data integration and for pipelines into data warehouses for analytics. Ours is the typical use case: bringing in data from our internal systems through data syncing and change data capture, as well as from third-party systems such as Salesforce, NetSuite, and even Google Analytics and web platforms. We use the solution primarily in an ELT framework.
How has it helped my organization?
The main benefit is being able to onboard new sources of data quickly. One of the things that we did was create a staging database in FDLC. We set up a new connection with a new source and destination, and we were able to set up and sync a 100 GB database from PostgreSQL to Redshift within two days. This was a completely new implementation, subsequent to the initial one. In other words, we were able to replicate an entire staging environment within a two-day timeframe.
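To put that figure in perspective, here is a rough back-of-the-envelope calculation of the average throughput that moving 100 GB in two days implies. It assumes decimal units (1 GB = 1000 MB) and that the whole two days went to transferring data, which overstates the transfer time because the two days also covered setup. The function name is just illustrative.

```python
def average_throughput_mb_per_s(size_gb: float, days: float) -> float:
    """Average sustained rate needed to move size_gb within the given number of days."""
    seconds = days * 24 * 60 * 60
    # Assumption: decimal units, 1 GB = 1000 MB.
    return size_gb * 1000 / seconds


# 100 GB replicated within a two-day window, as described above.
rate = average_throughput_mb_per_s(100, 2)
print(f"{rate:.2f} MB/s")  # prints "0.58 MB/s"
```

So the sync needed to sustain only a little over half a megabyte per second on average. The value here was in how little effort the setup took, not in raw speed.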
What is most valuable?
In general, the platform makes it very easy to integrate different systems, but I would say a specific differentiator is their integration of dbt, which lets the transformation components be driven by Fivetran.
What needs improvement?
One of the traditional issues with the platform has been logging. They do have logging, but it is not particularly verbose, so when there are issues it becomes hard to troubleshoot them. They also have internal logs versus customer-facing logs. We've asked Fivetran to provide more exposure to those logs, or to let us subscribe to them via an API or Datadog or something like that so we can pull them from their system.
Another area is the breadth of systems they can pull from. They have some of this already, but they're pushing to do some integrations with Excel Online. Some of the pain points we're looking at involve integrating items in the Microsoft stack, such as SharePoint and Excel, and then some of the newer Azure services.
What do I think about the stability of the solution?
The solution is mostly stable. For the breadth and number of connections, it's okay. The thing with Fivetran is that they'll sometimes have random outages for some functions. I have had a couple of cases where critical errors took too long to fix. One issue was that Stripe data was not syncing correctly, and it took over two months to get a resolution on that.
When the solution is working, it works well. When it breaks, it is very difficult to troubleshoot and fix, because it becomes almost like having an in-house ETL process that an outsourced party is trying to fix. Plus, they're trying to fix it for multiple customers at the same time, and sometimes those priorities compete.
I would say the solution is stable about 98% of the time, but in the 2% when it isn't, the issues are usually critical.
What do I think about the scalability of the solution?
The solution is very scalable. In terms of the breadth of connections and things like that, it's definitely there. In terms of volumes, they're not necessarily in charge of the platforms themselves. For instance, Fivetran doesn't control the speed of our databases, but as long as it's working in concert with customer systems, it can work well. I think there's just some work that needs to be done in terms of tuning those capabilities so that it remains consistently scalable. For instance, when we're syncing from PostgreSQL and similar databases, there are certain features and flags that you can use to make the process faster, so there's some coordination there. Other than that, once it's set up, it's usually pretty good.
On a daily basis, we have four or five people using it in the business intelligence and analytics area: our business intelligence engineers, data engineers, and a site reliability engineer. I think software engineering sometimes uses it as well, if they want to ingest data from other systems.
The solution does require maintenance right now. Sometimes there will be alerts that come up in the system if you have schema drift or something like that. Usually, the business intelligence engineers manage that.
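For context, schema drift means that the columns or types at the source no longer match what the destination expects. The sketch below only illustrates the idea and is not how Fivetran detects drift internally. The function name, column names, and type strings are all hypothetical.

```python
from __future__ import annotations


def detect_schema_drift(expected: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {column: type} mappings and report what differs between them."""
    shared = set(expected) & set(actual)
    return {
        "added": sorted(set(actual) - set(expected)),
        "removed": sorted(set(expected) - set(actual)),
        "type_changed": sorted(c for c in shared if expected[c] != actual[c]),
    }


# Hypothetical example: what the warehouse expects vs. what the source now exposes.
expected = {"id": "integer", "email": "text", "created_at": "timestamp"}
actual = {"id": "bigint", "email": "text", "signup_source": "text"}

drift = detect_schema_drift(expected, actual)
print(drift)
```

An alert of the kind described above is essentially a non-empty result from a check like this, which someone then has to review and resolve.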
We use the solution as our primary ingest for all the data into our warehouse. We're looking to expand it. We're on Redshift, but we have another company that uses Azure, SQL Server, and Synapse, so we're planning on expanding use there as well.
How are customer service and support?
Early on, there were growing pains with the tech support. When we had to engage with tech support, it was usually for more critical issues, and for that period I would rate them about a four out of ten. Business-breaking issues in particular, such as a severity one, probably didn't get as much attention as we needed them to get.
It has gotten a little bit better. I've heard they've reorganized some of their processes and handoffs. Because they try to provide 24/7 coverage, they have handoffs between different regions, and they are trying to do better with them. I think it's improving, but I haven't had to use them recently.
How was the initial setup?
Step one was connecting to them, then opening up ports to our cloud, verifying connections, connecting to our different databases as sources and destinations, testing, and implementing. Obviously, with the initial onboarding there's also security and things like that.
The initial deployment was fairly small, so it didn't take a particularly long time, maybe a week on and off, in terms of just working with the team and opening ports and connecting. We're on AWS, so some of the work was on our side, setting up IAM roles and whitelisting.
From a day-to-day perspective, onboarding new connections is really just pointing to the sources, then the destinations, and then doing verifications. Day-to-day is pretty easy.
What about the implementation team?
Our deployment was handled in-house by two people. One is a site reliability engineer and the other works in analytics and BI.
What was our ROI?
There are two things, but they haven't been fully quantified. One is the reduced time required for onboarding new data sources, and the other is that we don't have to stand up a data engineering department or function. I would say Fivetran could potentially replace at least one full-time engineer. As far as ROI, we could say maybe one FTE worth of time, though, obviously, there's the contract expense that goes with that.
What's my experience with pricing, setup cost, and licensing?
We started off just paying by credit card. Now we're on a separately negotiated contract.
The pricing can generally be very expensive and a little bit opaque, but it can be negotiated down because it is a SaaS solution. They've changed the pricing model. They do it by monthly rows now, I think. Also, their pricing practices, when we experienced them, were not very good. They would automatically renew a contract without negotiation, which is not good practice from a client perspective. I would say they're a little bit on the expensive side, and their contract process is not particularly good, but there is a lot of potential flexibility.
What other advice do I have?
My advice is to be very clear about how many rows or what volume of data you will be syncing, because that is the main driver of the cost. Then, be wary of the contract terms if they include a yearly escalator. Also, catalog all the different sources you need and, if some of them aren't available, see if they're on the roadmap or if custom connectors are an option.
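To see why a yearly escalator deserves attention, here is a small sketch of how it compounds over a multi-year term. The base cost and the 7% rate are made-up numbers for illustration only, not Fivetran's actual pricing or terms, and the function name is hypothetical.

```python
from __future__ import annotations


def projected_annual_costs(base_cost: float, escalator: float, years: int) -> list[float]:
    """Cost for each contract year when the price rises by `escalator` every year."""
    return [round(base_cost * (1 + escalator) ** year, 2) for year in range(years)]


# Hypothetical contract: 50,000 per year with a 7% annual escalator over three years.
costs = projected_annual_costs(50_000, 0.07, 3)
print(costs)       # prints [50000.0, 53500.0, 57245.0]
print(sum(costs))  # prints 160745.0
```

In this made-up case the third year costs about 14.5% more than the first, and that is before any growth in row volume is taken into account.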
If I'm comparing it to other solutions in the market, I'd give this solution an eight out of ten.
I think it does a very good job of letting you quickly stand up and connect to sources. It's even workable from a startup perspective. If you only have one person, you can connect three, four, five, or ten different systems and be up and running in a very short timeframe without having to do custom work. The stability is good, the pricing is okay, and the service is okay, and I think there is significant value in the product. There are more competitors coming about that might offer more customization, but I think that out of the box, Fivetran is probably the easiest to use.
Disclosure: I am a real user, and this review is based on my own experience and opinions.