I am a manager of a team that uses this solution.
Azure Data Factory is primarily used for data integration, which involves moving data from sources into a data lake house called Delta Lake.
It's fairly simple to use; the most valuable feature of this solution is its ease of use.
It does not appear to be as rich as other ETL tools; its capabilities are quite limited. It simply moves data around. It is not as strong at the next step, which is taking that data and modeling it.
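To make "moving data around" concrete, here is a minimal, illustrative Copy activity pipeline definition of the kind Data Factory runs; the pipeline and dataset names are hypothetical, and a real definition would also reference linked services for the source and sink:

```json
{
  "name": "CopySalesToDeltaLake",
  "properties": {
    "activities": [
      {
        "name": "CopyFromSqlToLake",
        "type": "Copy",
        "inputs": [ { "referenceName": "SourceSqlTable", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "LakeParquetFiles", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": { "type": "ParquetSink" }
        }
      }
    ]
  }
}
```

A pipeline like this copies rows from a SQL source into Parquet files in the lake; any transformation beyond that typically happens downstream.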
I have been working with Azure Data Factory for less than a year.
I would say that we are working with the latest version.
The stability of Azure Data Factory is good. The performance is good.
I haven't had to scale this solution as of yet.
We have six people in our company who use this solution.
Increasing the usage is not on our strategy pathway.
I have had very little contact with Microsoft technical support because I have not required much yet, but what contact I have had has been good.
I have also worked with Talend. I didn't switch products, but rather companies.
Talend is a more robust enterprise-wide solution that can handle everything from start to finish, whereas Azure Data Factory is more of an ingestion tool.
I was not involved with the initial setup.
We are an enterprise that uses an integrator.
It does not require any maintenance, it's simple.
I don't see a cost; it appears to be included in our general support agreement. However, I have been warned to be very careful because costs can blow out: as Azure ingestion increases, the costs can rise. I have not experienced that yet.
In my opinion, the price is competitive.
It's a good tool, a good product that does what it's supposed to do well, which is ingesting data from a source to your target, to another cloud, to another source. Just be conscious to monitor your costs.
I would rate Azure Data Factory an eight out of ten.
Our customers use it for data analytics on a large volume of data. So, they're basically bringing data in from multiple sources, and they are doing ETL extraction, transformation, and loading. Then they do initial analytics, populate a data lake, and after that, they take the data from the data lake into more on-premise complex analytics.
Its version depends on a customer's environment. Sometimes, we use the latest version, and sometimes, we use the previous versions.
It is very modular. It works well. We've used Data Factory and then made calls to libraries outside of Data Factory to do things that it wasn't optimized to do, and it worked really well. It is obviously proprietary, in that Microsoft created it, but it is pretty easy and direct to bring outside capabilities into Data Factory.
It is very flexible. You can build any features you want.
There is always room to improve. There should be good examples of use, which, of course, customers aren't always willing to share. It is a Catch-22: it would help the user base if everybody had really good examples of deployments that worked, but when you ask people, myself included, to publish their good deployments, the usual answer is, "No, I'm not going to do that." There aren't enough good examples. Microsoft should probably just pay one of their partners to build 20 or 30 examples of functional Data Factories and share them with the user base.
I have been using this solution for the last five years, but probably, the last three years have been significant.
It has been stable. I have not experienced any issues.
It is decent for most things. I'm not sure if it is necessarily intended for large volume and high-speed streams of data. By large, I mean really big, but for pretty much anything that most users would want to do, including ourselves, it is fine. Our clients are large government organizations.
It scales fine within its environment. You can literally throw in another Data Factory, or replicate one, and do things pretty quickly. So, it is not at all hard to increase your processing footprint, but you have to pay for it, and it can end up being quite expensive. Although I haven't really done the comparison, I would suspect that if I did the equivalent in AWS, Azure would be more expensive than AWS because of the way they price data.
They're all right. I would rate them a seven out of 10. They do fine, but there is a lot that they don't do.
I'm not sure if even Microsoft has enough SMEs from a user point of view. They are helpful for getting it set up, making it work, and helping you figure out why it doesn't work. If you want to ask them about something that you are trying to do, they'll try to direct you to a partner, which is fine, but the partners also don't necessarily have the experience. It is a Catch-22. There aren't a lot of people out there with Azure experience because Azure started to be in demand only over the last two years.
The customer used a lot of homebrew stuff. They were doing a lot of internal work and some Oracle work. They had made a workaround and said, "Okay, we'll bring it into Oracle Database, and then we'll do all these things to it." We were like, "Okay, that works, but then you're taking it out of that database and putting it into the data lake. I don't understand why you're doing that." That's what they were doing.
It is pretty straightforward. Devil is in the details, but you can easily get up and running in a day with Data Factory. Anybody who is comfortable in Azure can set up Data Factory, but it takes experience to know what it can and can't do or should and shouldn't do.
It is proven, and it works. Make sure you have a well-defined use case and build a quick prototype to ensure that it, in fact, does what you need. Give yourself some benchmarks. That's exactly what we did. We defined the use case, and then we set up Data Factory. We found a couple of things that it didn't do. We figured out a way to work around those things and have it do those things. After that, we confirmed it. It is operational, and it is doing its job. It has been pretty much error-free since then.
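A prototype benchmark can be as simple as timing the candidate transformation over representative data. The sketch below is a generic Python illustration, not ADF-specific code; the sample "transform" and row shape are assumptions for demonstration:

```python
import time

def benchmark(fn, *args, runs=5):
    """Time a callable over several runs and return the best wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)

# Benchmark a stand-in transformation over synthetic rows.
rows = [{"id": i, "value": i * 2} for i in range(100_000)]
best = benchmark(lambda rs: [r["value"] + 1 for r in rs], rows)
print(f"best of {5} runs: {best:.4f}s")
```

Recording numbers like this for each candidate step gives you the benchmarks to confirm the tool does what you need before committing.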
It would become easier to use as more people become Azure-capable. If I want to find an AWS SME, I can get tons. They're expensive, but I have them. If I want to find an Azure SME, I usually have to create them. Azure was later to market than AWS. So, there are fewer people who are experts in Azure, and they are in high demand.
I would rate Azure Data Factory a nine out of 10. They just don't have enough good examples out there of things.
I primarily use Data Factory for data ingestion and B2B transformation.
Data Factory's best features are its connectivity with different tools and its focus on data ingestion using the pipeline Copy Data activity.
Data Factory's monitoring capabilities could be better. In the next release, Data Factory should include integrations with open-source tools like Airflow.
I've been working with Data Factory for about a year.
Data Factory is stable.
Data Factory is scalable.
Microsoft's technical support is good, so long as your company has a good relationship with them.
I previously worked with Talend, Matillion, and Fivetran.
Data Factory is expensive.
I would rate Data Factory seven out of ten.
We mainly use this solution to carry out data movement and transformation.
This solution has provided us with an easier and more efficient way to carry out data migration tasks.
This solution is currently only useful for basic data movement and file extractions, which we would like to see developed to handle more complex data transformations.
We have been using this solution for a year.
We have found this to be a stable solution.
This is an easily scalable product, due to it being cloud-based.
The customer support for this solution is very good.
The initial setup of this product is straightforward if you deploy the solution using a template, rather than implementing the solution first and configuring the features afterwards.
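As an illustration of the template-first approach, a minimal ARM template that deploys an empty Data Factory instance might look like the following; the factory name is a placeholder, and a real template would typically add pipelines, datasets, and linked services as child resources:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.DataFactory/factories",
      "apiVersion": "2018-06-01",
      "name": "example-data-factory",
      "location": "[resourceGroup().location]",
      "identity": { "type": "SystemAssigned" }
    }
  ]
}
```

Deploying from a template like this means the configuration is captured up front and is repeatable across environments.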
We implemented the product using both in-house staff and members of a vendor team. The vendor team was very helpful and gave good advice while we were deploying the solution.
We would recommend this solution as it is very solid and has good security features.
I would rate this solution an eight out of 10.
Our company uses the solution for data ingestion.
It is beneficial that the solution is written with Spark as the back end.
The solution is cloud-based and integrates well with other Azure products such as Synapse Analytics.
There are limitations when processing files larger than one GB.
Data ingestion pipelines sometimes fail because of transient issues that have to do with the cloud network. It takes more than six hours to process or ingest 300,000 records and that is a long time.
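Transient network failures like these are commonly handled with retry logic. The sketch below is a generic Python illustration of exponential backoff with jitter, not ADF-specific code; the flaky_ingest function is a hypothetical stand-in that simulates an ingestion call failing twice before succeeding:

```python
import random
import time

def with_retries(op, max_attempts=4, base_delay=0.01):
    """Retry an operation on transient connection errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Backoff doubles each attempt; jitter avoids synchronized retries.
            time.sleep(base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5))

# Simulated ingestion call: fails on the first two attempts, then succeeds.
calls = {"n": 0}
def flaky_ingest():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient network error")
    return "300000 records ingested"

result = with_retries(flaky_ingest)
print(result)
```

ADF has its own retry settings on activities, but the same backoff principle applies wherever transient cloud errors occur.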
I have been using the solution for two years.
The solution is new in the market and pretty stable because ADF is a little more codified than AWS. Synapse Analytics adds another tool for data.
Stability is not quite at the level of Informatica or DataStage.
The solution is scalable.
For multi-tenant applications connected to multiple databases, Microsoft recommends a self-hosted integration runtime. But a runtime connecting to multiple sources has limitations, and ingesting from on-premises requires multiple self-hosted runtimes connecting to your data.
Technical support is okay. Support is contracted or partnered with various companies but is fine as a first level.
Most of the time, technical support has to connect with product engineers who troubleshoot issues.
The setup is not very complex but requires intake, setting up integration services, and connecting to databases like Oracle before you push it to service.
We implemented the solution in-house.
The pricing is pay-as-you-go or reserved instance. Of the two options, reserved instance is much cheaper.
In the cloud, everything is service based and expensive. Users should be knowledgeable enough to maximize the solution.
For example, it makes no sense to run integration services all day if you are not ingesting data because you pay for that usage. It is important to understand how the product works to manage it accordingly and keep costs down.
I rate the solution a six out of ten.
It's a PaaS service. It's a hybrid solution. The cloud provider is Microsoft.
We are not using Azure Data Factory for end users. Rather, we're using it as a process layer: just for orchestration, not for any kind of ETL work.
We have plans to increase usage. It's going to take a major role in any kind of traditional data warehousing. It has big potential, especially as a PaaS offering.
There has been improvement in data resilience, in the way that we're moving the data from on-prem to cloud and vice versa.
The most important feature is that it supports multi-threading concepts. Informatica has this too, but here the resourcing is quite robust: you can scale up and scale down as per your needs.
There should be a way to do switches, so that if at any point I want a hybrid mode of data collection or ingestion, I can just click a button, flip a switch, and turn a batch process into a streaming process.
I've been using Azure Data Factory for more than two years.
The stability of Azure as a PaaS could be improved.
It's scalable.
I would rate their technical support 3 out of 5. It's not great, but it isn't bad.
The setup is complex. It has nothing to do with the technology but with the design. We were wondering how to leverage the orchestration layer, where we have Azure Data Factory, and how to integrate it with Databricks. That's where we had some challenges in terms of choosing the right product.
You can do deployment in-house.
I would rate this solution 8 out of 10.
For someone who is looking to use this solution, my advice is to do proper due diligence on your current application, know where your application fits, and look at the requirements. It all depends on the current use case in your system.
The solution is primarily used for data integration. We are using it for the data pipelines to get data out of the legacy systems and provide it to the Azure SQL Database. We are using the SQL data source providers mainly.
The data pipeline and the orchestration functionality are the most valuable aspects of the solution.
The interface is very good. It seems to be very responsive and intuitive.
The initial setup is very quick and easy.
I'm more of a general manager. I don't have any insights in terms of missing features or items of that nature.
Integration of data lineage would be a nice feature in terms of DevOps integration. It would make implementation for a company much easier. I'm not sure if that's already available or not. However, that would be a great feature to add if it isn't already there.
We've used the solution for the last 12 months or so.
From what I have witnessed, the solution is quite stable. It doesn't crash or freeze. There are no bugs or glitches. It's reliable.
We work with medium to enterprise-level organizations. Customers have anywhere from 300 employees up to 160,000 employees.
Microsoft offers a great community. There's a lot of support available. We're quite satisfied with the level of assistance on offer.
Since the solution is a service, it's basically just a click and run setup. It's very simple. There's very little implementation necessary. A company should be able to easily arrange it. The deployment doesn't take very long at all.
We do provide the implementation for our clients. We're able to provide templates as well. We have predefined implementation space in Data Factory and provide it to the customer.
Clients might individually evaluate other options; however, we're not aware of that information. I can't say what other solutions clients might consider before ultimately choosing Microsoft, but it is likely Talend and maybe SQL Server Integration Services.
We act as an integrator. We are a data warehouse and BI consulting company, and we use Data Factory to pull data from different legacy systems and do all the transformations that are necessary in order to provide analytical models.
Our normal scenario is that we provide Azure SQL Databases together with Azure Data Factory and Power BI. 80% of our customers have adopted such a scenario.
On a scale from one to ten, I'd rate the solution at an eight. We've been largely happy with the capabilities of the product.
It's an integration platform; we migrate data across hybrid environments. We have data in our cloud environment or on-prem systems, so we use it when we want to integrate data across different environments. Getting data from different hybrid environments used to be a problem for us.
From my experience so far, the best feature is the ability to copy data to any environment. There are around 100 connectors we can use to connect to a system and copy the data from it to any environment. That is the best feature.
The user interface could use improvement. It's not a major issue but it's something that can be improved.
It has the ability to create separate folders to organize Data Factory objects. But any time we created a folder, we were not able to create objects directly inside it; we had to drag and drop them into the folder. There was no default option; it was manual work. We offered their team our feedback, and they accepted my request.
I have been using Azure Data Factory for around one year.
Based on my experience with other products on the market, the stability is good.
I haven't had much experience with scalability. I know we do have scalability options though. It's used daily.
There are around 1,000 plus users using this solution in my company.
It requires two people for maintenance. The administrators are the ones who maintain it and give access to the engineers. They regulate who has privileges.
We have needed to contact their technical support. We get most of the answers we need from the blogs, but if we can't find them ourselves, we can speak directly to the Microsoft team from the Data Factory interface itself, which is really helpful.
I have only used Data Factory for the cloud. For on-prem we have used SSIS.
The initial setup was a bit complex but once you understand its setup, it's less complex. There are certain processes that need to be followed. Once you understand the process, it becomes easier to implement.
The implementation took a little less than one day. The planning for the deployment takes around one or two days.
We had a discussion with the Microsoft team about the data. We discussed how we were going to implement it, and based on that discussion, we were able to deploy. A Microsoft partner helped us with some parts.
We also evaluated AWS.
The advice that I would give to someone considering this solution is to have some background in data warehousing and ETL concepts: extract, transform, and load. If you have that background, you will be successful. If not, my advice would be to learn a little more before using Data Factory.
I would rate Data Factory as an eight out of ten.