
Zaloni Data Platform Overview

Zaloni Data Platform is the #4 ranked solution in our list of top Data Preparation Tools. It is most often compared to Collibra Governance.

What is Zaloni Data Platform?

Zaloni simplifies big data for transformative business insights. We work with pioneering enterprises to modernize their data architecture and operationalize their data to accelerate insights for everyday business practices.

Zaloni Data Platform is also known as ZDP.

Buyer's Guide

Download the Data Governance Buyer's Guide including reviews and more. Updated: October 2021

Zaloni Data Platform Customers
CDS Global, AIG

Zaloni Data Platform Reviews

SL
Solution Architect at a financial services firm with 10,001+ employees
Real User
Top 10 Leaderboard
Solid multi-ingestion tool but with poor exception handling

Pros and Cons

  • "You can create a lot of ingestions based on the file levels or based on the time."
  • "The major pain point with Zaloni is that their exception handling is not good. If any event happens, it doesn't tell you at which point it failed and it doesn't tell the operations team how they should take corrective actions unless you call Zaloni and then identify the issues. That is one issue."

What is our primary use case?

Zaloni is actually a big data platform management tool. It is extensively used. They have different connectors. You can start ingesting your batch and you can also use real-time streaming, but we haven't used that component. We have used it mostly for batch ingestions.

It's not a single product. It has multiple pieces within itself. I think most of them are plug-and-play.

We are using the enterprise version of Zaloni.

How has it helped my organization?

The benefit of Zaloni is that it is readily deployable, so the solution is available within the tool itself. It's a matter of how you integrate it and establish your connectors, because it has a lot of connectors built in. From there, you run your data pipelines.

It's commercial, off the shelf, with minimal configuration, and within three months you can go to production if you are handling data at a small scale. If the volumes are huge, it takes more time.

Another good advantage of Zaloni is its schema evolution process. The process is tedious to run on the tool, but once you get through that difficult phase, it works very well. For example, if you want to ingest data, it can automatically detect the right schema at that time and place, then pick up that schema and process the data accordingly. Historical data integration is another very good feature: you can ingest data from the historical store whenever you want.
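The automatic schema detection described above can be sketched in a few lines. This is a minimal illustration of the general technique (sample the incoming file, infer a type per column), not Zaloni's actual algorithm; the function names are hypothetical.

```python
import csv
import io

def infer_field_type(values):
    """Guess a column type from sample values (illustrative heuristic)."""
    def is_int(v):
        try:
            int(v)
            return True
        except ValueError:
            return False
    def is_float(v):
        try:
            float(v)
            return True
        except ValueError:
            return False
    if all(is_int(v) for v in values):
        return "int"
    if all(is_float(v) for v in values):
        return "float"
    return "string"

def infer_schema(csv_text, sample_rows=100):
    """Read the header plus a sample of rows and return a column -> type map."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = {name: [] for name in header}
    for i, row in enumerate(reader):
        if i >= sample_rows:
            break
        for name, value in zip(header, row):
            columns[name].append(value)
    return {name: infer_field_type(vals) for name, vals in columns.items()}

sample = "id,amount,city\n1,10.5,Raleigh\n2,7,Durham\n"
print(infer_schema(sample))  # {'id': 'int', 'amount': 'float', 'city': 'string'}
```

A real platform would also handle nested types, nulls, and drift between batches; the point here is only the sample-then-infer shape of the feature.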

What is most valuable?

In terms of most valuable features, Zaloni has different components. One is the ingestion aspect, where you can create a lot of ingestions based on file levels or based on time. You have listeners continuously polling certain locations, and as the data arrives, they start picking it up. That is one use case we have used.

Another one is batch ingestion, where you can set up a timer. At the set time, it checks whether the file is present, then starts the data pipeline and triggers the jobs.

Another feature is time-based triggering. The advantage with Zaloni is that you can have both set up: either data-arrival-based or schedule-based, or you can mix and match the two.
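The arrival-based, schedule-based, and mixed trigger modes described above amount to a small decision rule. This sketch shows that rule in isolation; the mode names and function are illustrative assumptions, not Zaloni's scheduler API.

```python
from datetime import time

def should_trigger(mode, file_present, now, scheduled_at):
    """Decide whether an ingestion pipeline should fire.
    mode: 'arrival' fires when the file lands, 'schedule' fires at a set
    time, 'both' fires only when the file is present at or after the set
    time. Illustrative logic only, not Zaloni's actual implementation."""
    on_time = now >= scheduled_at
    if mode == "arrival":
        return file_present
    if mode == "schedule":
        return on_time
    if mode == "both":
        return file_present and on_time
    raise ValueError(f"unknown mode: {mode!r}")

# A 'both' listener set for 06:00: a file that landed early does not fire until 06:00.
print(should_trigger("both", True, time(5, 59), time(6, 0)))  # False
print(should_trigger("both", True, time(6, 0), time(6, 0)))   # True
```

In practice the "mix and match" setup is exactly the `both` branch: the file listener and the timer each contribute one condition, and the pipeline fires on their conjunction.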

Once ingestion starts, it has connectivity to the rest of the big data components. You can store the data on S3 or wherever you want; it has different connectors for on-premises and cloud. We were on AWS, so our primary storage was S3 buckets. It uses Hive as well. It cannot work on its own without big data components: you need to have the big data stack installed first, and Zaloni works on top of it.
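When landing batches in S3 as described above, files are typically written under date-partitioned keys so Hive and downstream jobs can prune by day. This is a sketch of that common lake convention; the bucket, layout, and function name are assumptions for illustration, not Zaloni's documented structure.

```python
from datetime import date

def s3_landing_key(bucket, source, table, batch_date, filename):
    """Build a date-partitioned S3 object path for a landed batch file.
    The source/table/year/month/day layout is a common data-lake
    convention (Hive-style partitions), used here purely as an example."""
    return (
        f"s3://{bucket}/{source}/{table}/"
        f"year={batch_date:%Y}/month={batch_date:%m}/day={batch_date:%d}/{filename}"
    )

print(s3_landing_key("corp-lake", "crm", "accounts", date(2021, 10, 5), "accounts.csv"))
# s3://corp-lake/crm/accounts/year=2021/month=10/day=05/accounts.csv
```

Keeping the partition columns (`year=`, `month=`, `day=`) in the key is what lets Hive register the data as external partitions without moving it.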

It has the Bedrock architecture, a component that manages these schedules and other activities. Bedrock is another tool within Zaloni itself.

There is one more component for metadata management. It's called EMDM, I guess, but I don't remember exactly what it is called. It handles business metadata: you can write whatever fields describe the data, and it manages them.

What needs improvement?

The major pain point with Zaloni is that their exception handling is not good. If any event happens (an event is when the job stops in the middle of the process), it doesn't tell you at which point it failed and it doesn't tell the operations team how they should take corrective actions unless you call Zaloni and then identify the issues. That is one issue.

Another issue is that sometimes your jobs fail, but if you run them a second time, they go through.

A third area for improvement is the deployment process. When you want to deploy anything, there are a lot of manual steps. For example, you have to create your password in an encrypted format and then follow a long manual procedure. They should be building something like Jenkins jobs to automate the process. I have suggested that they improve deployment, because currently everyone has to run it manually. A single deployment takes a lot of time, about half a day. Then you test it, it might not work, and reverting is not easy because of the manual process.

I think in recent versions they added a lot of upgrades and additional features, including many integrations. Before, it was just AWS; later they extended it to Azure. I'm not sure whether they have extended it to GCP.

There are a lot of improvements now, because previously the UI and features were not easy to navigate. Regarding the metrics it shows, it's okay; I would say it's neither easy nor hard, and it has whatever is required.

Lastly, on the governance side, it's not very good. We faced some issues with the Ranger version. Ranger is an authorization tool in the big data ecosystem, and Zaloni had some compatibility issues with it at that time. Later they said those would all be resolved, but at the time there were issues. I'm not sure whether it worked with Sentry or not.

For how long have I used the solution?

I have been using Zaloni Data Platform within the last 18 months.

What do I think about the stability of the solution?

Stability-wise, it's good. I don't see many issues, except one thing: about once a week a job fails due to an event. But after that, if you resubmit the job, it goes through. So I don't see an issue once you have it established. It's a good product to continue with.

When you say maintenance, daily operations is one part of it. The other is version upgrades and applying security patches. At my employer, we relied on the Zaloni team for maintenance activities like version updates.

Our team handles daily operations. I was part of the team managing them: making sure the cluster is up, jobs are running properly, and the data is updated and available to business users by the next business day.

What do I think about the scalability of the solution?

In terms of scalability, we were the first to implement it at scale. We tried and tested scaling on AWS, and it scales well. The only thing with auto-scaling Zaloni is that the underlying cluster must itself be scalable.

We implemented it in AWS, and once we defined the threshold, I think we were able to run on five different instances. Auto-scaling was enabled on the AWS side.
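The threshold-driven scaling described above (scale out under load, capped at five instances) reduces to a small formula. This is an illustrative sketch of that policy, not AWS's or Zaloni's actual auto-scaling logic; the job-count metric and function name are assumptions.

```python
import math

def desired_instances(queued_jobs, jobs_per_instance, min_nodes=1, max_nodes=5):
    """Compute how many worker instances to run, given a per-instance job
    threshold. The cap of five mirrors the setup described above; the
    formula itself is only an example of threshold-based auto-scaling."""
    needed = math.ceil(queued_jobs / jobs_per_instance) if queued_jobs else min_nodes
    return max(min_nodes, min(max_nodes, needed))

# 23 queued jobs at 10 jobs per instance -> 3 instances; demand above the
# cap is clamped to 5, and an idle queue falls back to the minimum of 1.
print(desired_instances(23, 10))   # 3
print(desired_instances(100, 10))  # 5
print(desired_instances(0, 10))    # 1
```

The important property is the clamp: whatever the demand metric says, the result stays between the configured floor and the AWS-side instance cap.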

That was the platform with Zaloni itself. Initially we implemented a big data solution with massive data, almost one petabyte. The entire need was to ingest it into the big data platform, and Zaloni was the fastest product to test and take into production.

Before that, there were some attempts to build our own clusters without any tools, but those were not successful. Then they had to buy Zaloni and implement the solution.

Because of the complexity involved, my employer was trying to switch to other platforms. They wanted to try different tools rather than stick with Zaloni, because of the difficulty of managing it and of version upgrades: every time, you need somebody from Zaloni to look into the issues.

They were identifying different tools and experimenting with it.

How are customer service and technical support?

Their support is very good because there is a dedicated support team for our employer. They were able to respond; if you call anytime, 24/7, people are available, and they made sure things were taken care of.

They are quite responsive in that.

Which solution did I use previously and why did I switch?

I had experience with similar solutions, but Zaloni is something different. They market it as a big data management tool, but it isn't really, because it handles only part of the data work.

If you take the example of Cloudera, it makes your job easier by managing your clusters, because it has a lot of built-in features and UI capabilities: you can add a node at any time, decommission a node, or rebalance the cluster. Zaloni doesn't have many of those capabilities. It's more like an ingestion tool; actually, a little more than ingestion, because it has some metadata management and you don't have to procure another license for that.

On the governance side it doesn't help, but with additional features you can manage data confidentiality with plug-and-play solutions. You can encrypt all the data or only certain fields in the database, and you can do it while ingesting or after the data is ingested. It has the ability to encrypt data throughout your pipeline.
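The field-level protection step described above can be pictured as a per-record transform in the pipeline. Real field-level encryption is reversible; as a stdlib-only stand-in, this sketch pseudonymizes selected fields with a keyed HMAC to show the shape of the step. It is not Zaloni's implementation, and the names are illustrative.

```python
import hashlib
import hmac

def protect_fields(record, sensitive_fields, key):
    """Return a copy of a record with selected fields masked.
    HMAC-SHA256 pseudonymization stands in for the reversible field-level
    encryption a platform like Zaloni would apply; the key keeps the
    masking deterministic so joins on the masked value still work."""
    out = dict(record)
    for field in sensitive_fields:
        if field in out:
            digest = hmac.new(key, str(out[field]).encode(), hashlib.sha256)
            out[field] = digest.hexdigest()[:16]
    return out

row = {"account_id": "A-1001", "ssn": "123-45-6789", "balance": 250.0}
masked = protect_fields(row, ["ssn"], b"demo-key")
print(masked["balance"])  # 250.0 (non-sensitive fields pass through unchanged)
```

Whether this runs during ingestion or afterwards is just a question of where the transform sits in the pipeline, which matches the "while ingesting or once ingested" choice above.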

How was the initial setup?

In terms of initial setup, you actually need Zaloni support to do it. We had evaluated different tools like Talend and other ingestion tools. Even AWS has one of these tools; I forget its name. We evaluated how we could use these tools to simplify the process.

Whatever workflows you're creating, you have to create them within Zaloni. If you create them outside of Zaloni, it doesn't know anything about them, so you have to use the scheduler built into the platform itself. It has the ability to integrate your data pipelines, and you can deploy your code within that. You also have the ability to manage your metadata.

What was our ROI?

I would say there was a return on investment, within two years, but it depends on how well you can sell your data and how the organization looks at it. It changes with everyone's perspective.

If your organization is really interested in selling the data and making money by exposing it through APIs, then you can definitely realize the revenue much faster.

But in my organization, that's not the model. The model is to give data to business users so that they can work with it with less effort.

What's my experience with pricing, setup cost, and licensing?

I don't know how the licensing works, to be honest, but it's quite expensive. We paid around 150 to 200 grand per year. There are no special pricing charges; it's just the license, regardless of the number of users and how much data you're processing. They don't have different licensing structures.

Which other solutions did I evaluate?

We are thinking about Talend, which is a similar product. There are a lot of different products that can be used. In a different organization, Apache tools can also be used; those are open source. And if you are really rich, you can go with Pentaho Data Integration, a similar tool that is much easier. Informatica's big data component is another one.

What other advice do I have?

I would recommend Zaloni, but before choosing, they should evaluate different options so they know which one is better. Even though these are similar products, some are good in one aspect and not in others.

Take the example of Informatica. Informatica is very good for data warehousing platforms, but when it comes to loading data into big data, it has a lot of difficulties. So I had to use another component that was better at handling the data.

So it's based on the need: what is your objective, and where are you pulling the data from? If it's simple and you're pulling data only from an RDBMS, then you can rely on Zaloni or any other product and use it right away, out of the box; you don't need any other tools. But maybe you are specialized in one of them and have a lot of restrictions, as in financial institutions, where you're not allowed to get the data from just any data source. What they do is offload the data from the database and put it on a server; then you access the data from that server. There are lots of layers built in to manage security. If that is the case, then you have to look at which one suits you best.

There are areas for improvement. One is the manual deployment process that was in place; that's one of the biggest challenges. Also, creating entities could have been made much easier than it was.

On a scale of 1 to 10, 1 being the worst and 10 being the best, I would rate Zaloni a seven.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer:
HJ
Manager Programmes - Analytics & Data Science at a transportation company with 10,001+ employees
Real User
Top 20
Good workflows and data mapping capability, but technical support is in need of improvement

Pros and Cons

  • "The most valuable feature is the ability to map data and write workflows with logic inside them."
  • "Technical support is in need of improvement."

What is our primary use case?

We are using this solution to work with Big Data.

What is most valuable?

The most valuable feature is the ability to map data and write workflows with logic inside them.

It is possible to see the lineage and metadata information, which is useful even though we have not yet used it extensively in my organization.

What needs improvement?

Technical support is in need of improvement.

For how long have I used the solution?

I have been using the Zaloni Data Platform for about a year.

How are customer service and technical support?

We have been in contact with technical support and I would rate them a five or six out of ten. When we have gone to the support team because something is wrong with the product, I would say that it has not been easy for them to sort things out and quickly give us a patch.

If it is a normal problem that doesn't require going to the product support team then it's fine, but if it does go to them then it is complex. For a few of the problems that we have had, we never got the patches.

Which solution did I use previously and why did I switch?

Prior to this solution, we relied on using shell scripts. Not all of the engineers were capable of doing it this way, and for those that were, it took longer.

We also use other solutions at the same time as the Zaloni Data Platform including DataIQ and Trifacta. However, the adoption of Trifacta was not very good.

The big difference between ZDP and DataIQ is that we use ZDP for analytics and BI, whereas DataIQ is used for predictive analysis. You can connect it to libraries to perform data science functions.

How was the initial setup?

The initial setup is complex and not very straightforward. There were a lot of things that we had to consider such as our cloud, our CVS, the Spark system, and the Linux environment. Overall, I would say that it is not an easy deployment and setting up the entire platform is difficult.

What other advice do I have?

The suitability of this product depends on the organization. I was not part of the decision to implement it here. 

I would rate this solution a six out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.