Solution Architect at Giant Eagle
Real User
Easy to use and can be used for data integration
Pros and Cons
  • "The most valuable features of the solution are its ease of use and the readily available adapters for connecting with various sources."
  • "Some known bugs and issues with Azure Data Factory could be rectified."

What is our primary use case?

We use Azure Data Factory for data integration.

What is most valuable?

The most valuable features of the solution are its ease of use and the readily available adapters for connecting with various sources.

What needs improvement?

Some known bugs and issues with Azure Data Factory could be rectified.

For how long have I used the solution?

I have been using Azure Data Factory for about two years.

Buyer's Guide
Azure Data Factory
April 2024
Learn what your peers think about Azure Data Factory. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,578 professionals have used our research since 2012.

What do I think about the stability of the solution?

I rate the solution an eight out of ten for stability.

What do I think about the scalability of the solution?

Azure Data Factory is a scalable solution. A team of 16 people from the data analytics team use the solution in our organization.

I rate the solution an eight out of ten for scalability.

How was the initial setup?

On a scale from one to ten, where one is difficult and ten is easy, I rate the solution's initial setup a seven out of ten.

What about the implementation team?

A team of three people deployed Azure Data Factory in three to four days.

What's my experience with pricing, setup cost, and licensing?

The solution's pricing is competitive.

What other advice do I have?

We build data pipelines primarily for integration. A few of them are real-time data transfers, and a few are batch file transfers. These move the data from various sources to our data warehouse. Azure Data Factory helps build the data pipelines and adapters.

The solution has built-in features and a control center for us to monitor the status of the pipelines. The solution's email notification also helps us in monitoring. We didn't face any challenges to set up the data pipelines. We know there are some controls, but governance is customized for the organization's requirements. We have our own policies.

Azure Data Factory is deployed on the cloud in our organization. I would recommend Azure Data Factory to other users.

Overall, I rate the solution a nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Anil Jha - PeerSpot reviewer
Director D&A at Iris Software Inc.
Real User
Easy to set up and integrates well, but it needs support for custom data delimiters
Pros and Cons
  • "It is easy to integrate."
  • "You cannot use a custom data delimiter, which means that you have problems receiving data in certain formats."

What is our primary use case?

The primary use case is integrating data from different ERP systems and loading it into Azure Synapse for reporting. We use Power BI for the reporting side of it.

We also have customers who are migrating to Azure Data Factory and we are assisting them with making the transition.

What is most valuable?

It is easy to integrate.

I do not foresee any issues with security.

What needs improvement?

I find that Azure Data Factory is still maturing, so there are issues. For example, there are many features missing that you can find in other products.

You cannot use a custom data delimiter, which means that you have problems receiving data in certain formats. For example, there are problems dealing with data that is comma-delimited.
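
One possible workaround for the delimiter problem described above is to re-encode the file before it reaches the pipeline. The following is a minimal sketch that uses only Python's standard `csv` module. It is not an Azure Data Factory feature, and the function name `convert_delimiter` and the sample data are assumptions made for illustration.

```python
import csv
import io


def convert_delimiter(text: str, source_delimiter: str, target_delimiter: str = ",") -> str:
    """Re-encode delimited text from one single-character delimiter to another.

    Hypothetical helper for illustration only; it runs outside Azure Data Factory.
    """
    reader = csv.reader(io.StringIO(text), delimiter=source_delimiter)
    out = io.StringIO()
    # The writer quotes any field that contains the target delimiter,
    # so embedded commas survive the conversion.
    writer = csv.writer(out, delimiter=target_delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample: pipe-delimited input with a comma inside one field.
sample = "id|name|note\n1|Acme|ships to US, CA\n"
converted = convert_delimiter(sample, "|")
print(converted)
```

Quoting the field that contains a comma is what keeps the converted file readable by a parser that expects commas as separators.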

For how long have I used the solution?

I have been using Azure Data Factory for almost one year.

What do I think about the stability of the solution?

The stability is dependent on how you set up your cloud infrastructure, and how you authorize people to make use of it.

What do I think about the scalability of the solution?

I have not seen any issues with respect to scalability, as it is all hosted within the cloud. We have approximately 20 users.

How are customer service and technical support?

I have been in contact with technical support, although most of the time I was told that the feature I was interested in was not yet available. In these cases, they will be implementing the missing features in the future.

Which solution did I use previously and why did I switch?

I use several similar products by different vendors including Talend, Informatica, and Microsoft SSIS. The biggest advantage that Azure has is deployment. However, in others, it is possible to specify custom data delimiters.

How was the initial setup?

The initial setup is pretty simple and it can be deployed in a couple of hours.

What about the implementation team?

I deployed it myself and am also responsible for maintenance.

What other advice do I have?

I would rate this solution a five out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Anirban Bhattacharya - PeerSpot reviewer
Practice Head, Data & Analytics at a tech vendor with 10,001+ employees
Real User
Top 10
Beneficial guides, scales well, and helpful support
Pros and Cons
  • "The most valuable feature of Azure Data Factory is the core features that help you through the whole Azure pipeline or value chain."
  • "Azure Data Factory can improve the transformation features. You have to do a lot of transformation activities. This is something that is just not fully covered. Additionally, the integration could improve for other tools, such as Azure Data Catalog."

What is our primary use case?

Azure Data Factory can be deployed on the cloud and hybrid cloud. There have been very few deployments on private clouds.

What is most valuable?

The most valuable feature of Azure Data Factory is the core features that help you through the whole Azure pipeline or value chain.

This applies across the whole field of use, from standard ingestion to real-time SaaS ingestion, for which we often use other components. These areas have been instrumental across the board.

What needs improvement?

Azure Data Factory can improve the transformation features. You have to do a lot of transformation activities. This is something that is just not fully covered. Additionally, the integration could improve for other tools, such as Azure Data Catalog.

For how long have I used the solution?

I have been using Azure Data Factory for approximately four years.

What do I think about the stability of the solution?

The stability of Azure Data Factory is good.

I rate the scalability of Azure Data Factory a seven out of ten.

What do I think about the scalability of the solution?

Azure Data Factory is scalable. The solution can move up and be aligned to resources or scaled down.

We have a lot of customers using the solution, approximately 100.

How are customer service and support?

The support from Azure Data Factory is very good. There are some improvements needed.

I rate the support from Azure Data Factory a four out of five.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I have previously used Informatica. When comparing Informatica to Azure Data Factory, Informatica is a bit behind.

How was the initial setup?

The initial setup of Azure Data Factory is not complex if you know what you are doing. If you do not know the technology you will have a problem.

What's my experience with pricing, setup cost, and licensing?

Azure Data Factory gives better value for the price than other solutions such as Informatica.

What other advice do I have?

I recommend this solution to others.

I rate Azure Data Factory an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Arpita-Mishra - PeerSpot reviewer
Specialist Software Engineer at a financial services firm with 10,001+ employees
Real User
Top 10
Faster than other solutions, has multiple connectors, and is easy to set up
Pros and Cons
  • "One advantage of Azure Data Factory is that it's fast, unlike SSIS and other on-premise tools. It's also very convenient because it has multiple connectors. The availability of native connectors allows you to connect to several resources to analyze data streams."
  • "There's no Oracle connector if you want to do transformation using data flow activity, so Azure Data Factory needs more connectors for data flow transformation."

What is our primary use case?

I use Azure Data Factory for architecture creation, for example, loading data from Oracle DB to Azure Synapse Analytics, creating facts and dimensions using Azure Data Pipeline, and creating Azure Synapse notebooks for data transformation. 

Another use case for Azure Data Factory is dashboard creation to help customers make informed decisions.

How has it helped my organization?

Compared to the on-premise SSIS, Azure Data Factory has better infrastructure. It also benefits my company because you can scale the solution up or down with different resources.

Azure Data Factory is also on a pay-as-you-go or pay-as-you-use model, which is suitable for the company because my company only pays for its usage or requirement.

The solution is also very user-friendly, and the Azure Data Factory support team responds quickly whenever my team has a loading issue.

What is most valuable?

One advantage of Azure Data Factory is that it's fast, unlike SSIS and other on-premise tools.

It's also very convenient because Azure Data Factory has multiple connectors. It has sixty connectors which you can't find in SSIS. The availability of native connectors allows you to connect to several resources to analyze data streams.

I also like that you can set up your own VM and infrastructure on Azure Data Factory without any help from the IT team because it only requires a single click.

What needs improvement?

What's missing in Azure Data Factory is an Oracle connector. If you want to connect directly to the Oracle database, you must copy and transform the data. There's no Oracle connector if you want to do transformation using data flow activity, so Azure Data Factory needs more connectors for data flow transformation.

Sending out emails after a job is completed is another area for improvement in the tool.

For how long have I used the solution?

I've been using Azure Data Factory for three years.

What do I think about the scalability of the solution?

Azure Data Factory is a scalable tool.

Which solution did I use previously and why did I switch?

We used SSIS, but its on-premise version is slower than Azure Data Factory, and Azure Data Factory, infrastructure-wise, is better, so we went with Azure Data Factory.

How was the initial setup?

The initial setup for Azure Data Factory is an eight out of ten.

Manually deploying Azure Data Factory is easy and doesn't take much time, but I'm not sure how long it takes for an automated approach to deployment.

What's my experience with pricing, setup cost, and licensing?

The licensing model for Azure Data Factory is good because you won't have to overpay. Pricing-wise, the solution is a five out of ten. It was not expensive, and it was not cheap. It's in the middle.

What other advice do I have?

I have experience with both Azure Data Factory and SSIS.

I'm using the latest version of Azure Data Factory.

My rating for Azure Data Factory is eight out of ten.

My company is an Azure Data Factory user.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
PeerSpot user
Data Architect at World Vision
Real User
Top 5 Leaderboard
The good, the bad and the lots of ugly
Pros and Cons
  • "The trigger scheduling options are decently robust."
  • "There is no built-in pipeline exit activity when encountering an error."

What is our primary use case?

The current use is for extracting data from Google Analytics into Azure SQL Database as a source for our EDW. Extracting from GA was problematic with SSIS.

The larger use case is to assess the viability of the tool for larger use in our organization as a replacement for SSIS for our EDW and also as an orchestration agent to replace SQL Agent for firing SSIS packages using Azure SSIS-IR.

The initial rollout was to solve the immediate problem while assessing its ability to be used for other purposes within the organization, and also to establish the development and administration pipeline process.

How has it helped my organization?

ADF allowed us to extract Google Analytics data (via BigQuery) without purchasing an adapter.  

It has also helped with establishing how our team can operate within Azure using both PaaS and IaaS resources and how those can interact. Rolling out a small data factory has forced us to understand more about all of Azure and how ADF needs to rely upon and interact with other Azure resources.

It provides a learning ground for use of DevOps Git along with managing ARM templates as well as driving the need to establish best practices for CI.  

What is most valuable?

The most valuable aspect has been a large list of no-cost source and target adapters.

It is also providing a PaaS ELT solution that integrates with other Azure resources. 

Its graphical UI is very good and is even now improving significantly with the latest preview feature of displaying inner activities within other activities such as forEach and If conditions.   

Its built-in monitoring and ability to see each activity's JSON inputs/outputs provide an excellent audit trail.

The trigger scheduling options are decently robust.

The fact that it's continually evolving gives hope that even if some feature is missing today, it may soon be added. For example, it lacked support for simple SQL activity until earlier this year, when that was resolved. They have now added a "debug until" option for all activities. The Copy Activity Upsert option did not perform well at all when I first started using the tool but now seems to have acceptable performance.

The tool is designed to be metadata driven for large numbers of patterned ETL processes, similar to what BIML is commonly used for in SSIS but much simpler to use than BIML. BIML now supports generating ADF code although with ADF's capabilities I'm not sure BIML still holds its same value as it did for SSIS.
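
To illustrate what "metadata driven" means for patterned ETL, here is a minimal sketch in Python. It is not ADF or BIML code: the `TableCopy` type, the table names, and the generated SQL are all hypothetical. The idea is only that one generic template is driven by a list of table descriptions instead of hand-writing one process per table.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TableCopy:
    """One row of metadata describing a patterned copy (hypothetical shape)."""
    source: str
    target: str
    columns: Tuple[str, ...]


def build_copy_statement(item: TableCopy) -> str:
    """Render the single generic template for one metadata row."""
    cols = ", ".join(item.columns)
    return f"INSERT INTO {item.target} ({cols}) SELECT {cols} FROM {item.source};"


# Made-up metadata; in practice this would live in a control table or config file.
METADATA: List[TableCopy] = [
    TableCopy("staging.customers", "dw.dim_customer", ("customer_id", "name")),
    TableCopy("staging.orders", "dw.fact_orders", ("order_id", "customer_id", "amount")),
]

statements = [build_copy_statement(item) for item in METADATA]
for statement in statements:
    print(statement)
```

Adding a new table to the process then means adding one metadata row, not building a new pipeline.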

What needs improvement?

The list of issues and gaps in this tool is extensive, although as time goes on, it gets shorter. It currently includes:

1) Missing email/SMTP activity

2) Mapping data flows incur significant lag time while Spark clusters spin up

3) Performance compared to SSIS. Expect copy activity to take ten times that of what SSIS takes for simple data flow between tables in the same database

4) It is missing the debug of a single activity. The workaround is setting a breakpoint on the task and doing a "rerun from activity" or setting debug on activity and running up to that point

5) OAuth 2.0 adapters lack automated support for refresh tokens

6) Copy activity errors provide no guidance as to which column is causing a failure

7) There's no built-in pipeline exit activity when encountering an error

8) Auto Resolve Integration runtime should never pick a region that you're not using (should be your default for your tenant)

9) IR (integration runtime) queue time lag. For example, a small table copy activity I just ran took 95 seconds of queuing and 12 seconds to actually copy the data. Often the queuing time greatly exceeds the actual runtime

10) Activity dependencies are always AND (OR not supported). This is a significant missing capability that forces unnecessary complex workarounds just to handle OR situations when they could just enhance the dependency to support OR like SSIS does. Did I just ask when ADF will be as good as SSIS?  
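
The difference between AND and OR dependencies raised in item 10 can be shown with a small model. This is a sketch in plain Python, not ADF's API: the function `is_ready` and the activity names are invented for illustration.

```python
from typing import Dict, List


def is_ready(upstream: List[str], succeeded: Dict[str, bool], mode: str = "AND") -> bool:
    """Decide whether a downstream activity may start (hypothetical model).

    AND: every upstream activity must have succeeded.
    OR:  at least one upstream activity must have succeeded.
    """
    results = [succeeded.get(name, False) for name in upstream]
    if mode == "AND":
        return all(results)
    if mode == "OR":
        return any(results)
    raise ValueError(f"unknown mode: {mode}")


# Made-up run state: the API load failed, the file load succeeded.
run_state = {"load_from_api": False, "load_from_file": True}
upstream_activities = ["load_from_api", "load_from_file"]

and_ready = is_ready(upstream_activities, run_state, mode="AND")
or_ready = is_ready(upstream_activities, run_state, mode="OR")
print(and_ready, or_ready)
```

With AND-only semantics the downstream step never starts in this scenario, which is why an "either source is enough" pattern needs a workaround.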

They need to fix bugs. For example:

1) The debug sometimes stops picking up saved changes for a period of time, rendering this essential tool useless during that time

2) Enable interactive authoring (a critical tool for development) often doesn't turn on when enabled without going into another part of the tool to enable it. Then, you have to wait several minutes before it's ready, and during that time you're blocked from development. And then it only stays active for up to 120 minutes before you have to go through this all over again. I think Microsoft is trying to torture developers

3) Exiting the inside of an activity that contains other activities always causes the screen to jump to the beginning of a pipeline requiring re-navigating where you were at (greatly slowing development productivity)

4) Auto Resolve Integration runtime (using default settings) often picks remote regions (not necessarily even paired regions!) to operate, which causes either an unnecessary slowdown or an error message saying it's unable to transfer the volume of data across regions

5) Copy activity often gets the error "mapping source is empty" for no apparent reason. If you play with the activity such as importing new metadata then it's happy again. This sort of thing makes you want to just change careers. Or tools. 

For how long have I used the solution?

I have been using this product for six months.

What do I think about the stability of the solution?

Production operation seems to run reliably so far; however, the development environment seems very buggy, where something works one day and not the next.

What do I think about the scalability of the solution?

So far, the performance of this solution is abysmal compared to SSIS, especially with small tasks such as a copy activity from one table to another within the same database.
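
The figures quoted earlier in this review (95 seconds of queuing for 12 seconds of actual copying) show why small tasks suffer most. A quick calculation, using only those two numbers:

```python
# Figures taken from the small table copy described earlier in this review.
queue_seconds = 95
copy_seconds = 12

total_seconds = queue_seconds + copy_seconds
queue_share = queue_seconds / total_seconds      # fraction of elapsed time spent queued
overhead_ratio = queue_seconds / copy_seconds    # queue time per second of real work

print(f"total elapsed: {total_seconds} s")
print(f"time spent queued: {queue_share:.1%}")
print(f"queue time per second of copying: {overhead_ratio:.1f} s")
```

For that run, roughly 89 percent of the elapsed time was spent waiting for the integration runtime, not moving data.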

How are customer service and support?

Customer support is non-existent. I logged multiple issues only to hear back from 1st level support weeks later asking questions and providing no help other than wasting my time. In one situation it was a bug where the debug function stopped working for a couple of days. By the time they got back to me, the problem went away. 

How would you rate customer service and support?

Negative

Which solution did I use previously and why did I switch?

We have been and still rely on SSIS for our ETL. ADF seems to do ELT well but I would not consider it for use in ETL at this time.  Its mapping data flows are too slow (which is a large understatement) to be of practical use to us. Also, the ARM template situation is impractical for hundreds of pipelines like we would have if we converted all our SSIS packages into pipelines as a single ADF couldn't take on all our pipelines. 

How was the initial setup?

Initial setup is the largest caveat for this tool. Once you've organized your Azure environment and set up DevOps pipelines, the rest is a breeze. But this is NOT a trivial step if you're the first one to establish the use of ADF at your organization or within your subscription(s). Instead of learning just an ETL tool, you have to get familiar with and establish best practices for the entire Azure and DevOps technologies. That's a lot to take on just to get some data movements operational. 

What about the implementation team?

I did this in-house with the assistance of another team who uses DevOps with Azure for other purposes (non-ADF use). 

What's my experience with pricing, setup cost, and licensing?

The setup cost is only the time it takes to organize Azure resources so you can operate effectively and figure out how to manage different environments (dev/test/sit/UAT/prod, etc.). Also, how to enable multiple developers to work on a single data factory without losing changes or conflicting with other changes.

Which other solutions did I evaluate?

We operate only with SSIS today, and it works very well for us. However, looking toward the future, we will need to eventually find a PaaS solution that will have longer sustainability.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Emad Afaq Khan - PeerSpot reviewer
Lead Architect & Scrum Master at a energy/utilities company with 10,001+ employees
Real User
Top 10
A good and constantly improving solution but the Flowlets could be reconfigured
Pros and Cons
  • "Azure Data Factory became more user-friendly when data-flows were introduced."
  • "Azure Data Factory uses many resources and has issues with parallel workflows."

What is our primary use case?

We use this solution to ingest data from one of the source systems from SAP. From the SAP HANA view, we push data to our data pond and ingest it into our data warehouse.

How has it helped my organization?

Azure Data Factory didn't bring a lot of good when we were also using Alteryx. Alteryx is user-friendly, while Azure Data Factory uses many resources and has issues with parallel workflows. Alteryx helps you diagnose issues quicker than Azure Data Factory because Azure Data Factory is on the cloud and its debugger has a cold start.

Azure Data Factory has to wake up whenever you are trying to do testing, and it takes about four to five minutes. It's not always online to do a quick test. For example, if we want to test an Excel file to see if the formatting is correct or why the data-flow or pipeline is failing, we need to wait four to five minutes to get the cold start debugger to run. Compared to Alteryx, Azure Data Factory could be better. Nevertheless, we are using it because we have to.

What is most valuable?

Initially, when we started using it, we didn't like it because it was not mature and had no data-flows, so we used the traditional pipeline. After that, Azure Data Factory introduced the concept of data-flows, and it started to become more mature and look more like Alteryx. Azure Data Factory became more user-friendly when data-flows were introduced.

What needs improvement?

They introduced the concept of Flowlets, but it has bugs. Flowlets are a reusable component that allows you to create data-flows. We can configure a Flowlet as a reusable pipeline and plug it inside different data-flows, so we don't have to rewrite our code or visual transformation.

If we make any changes in our data-flow, it reverts all our changes to the original state of the Flowlet. It does not retain changes, and we must reconfigure the Flowlets repeatedly. We had these issues three months ago so things might have changed. It works fine whenever we plug it in and configure it in our data-flow, but if we make minor changes to it, the Flowlet needs to be reconfigured again and loses the configuration.

For how long have I used the solution?

We have used this solution for about a month and a half. It is a cloud-based tool, so there are no versions. It is all deployed on Azure Cloud.

What do I think about the stability of the solution?

Everything is computed inside the SQL server if we're working with pipelines, so we have to be very careful when designing our solution in Azure Data Factory. Alteryx spoiled us: we never cared how things looked in the backend because all the operations were happening on the Alteryx server. But in Azure Data Factory, they run on the capacity of our data warehouse. Azure Data Factory does not run the queries itself; it sends each query directly to the SQL server instance or data warehouse. So we have to be very careful about how we perform certain operations.

We need to have knowledge of SQL and how to optimize our queries. If we want to join tables in Alteryx, it is pretty easy: we just add a join tool. If we want to do the same with a stored procedure in Azure Data Factory, we have to be very careful about how we write our code. That was a challenge for our team because we had not been looking into how to optimize our SQL queries when firing queries from Azure Data Factory to the data warehouse.

In addition, the workflows were running very slow, the performance was bad, and some queries were getting timed out because we have a threshold. So we faced many challenges and had to reeducate ourselves on SQL and query optimization.
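
Because the work runs on the database engine, the join itself has to be written, and tuned, in SQL. The following is a minimal sketch of that idea. It uses Python's built-in `sqlite3` as a stand-in for the data warehouse (an assumption for illustration; the reviewer's environment is a SQL server or data warehouse), and the table names, index name, and data are made up.

```python
import sqlite3

# In-memory SQLite database standing in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    -- An index on the join key is one common way to help the engine.
    CREATE INDEX idx_orders_customer ON orders (customer_id);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.5);
    """
)

JOIN_QUERY = """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
"""

# The join and aggregation are executed by the database engine, not the caller.
rows = conn.execute(JOIN_QUERY).fetchall()
print(rows)

# Ask the engine how it intends to run the query; output varies by SQLite version.
plan = conn.execute("EXPLAIN QUERY PLAN " + JOIN_QUERY).fetchall()
for step in plan:
    print(step)
```

Inspecting the query plan is the kind of check that becomes necessary once the database, not the integration tool, carries the load.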

What do I think about the scalability of the solution?

In regard to scaling, when Azure Data Factory is used with Azure Databricks, it works similarly to Hadoop or Spark: it has Spark clusters in the back end that can scale out as much as needed and speed up performance. So it is scalable, especially with Databricks, because a lot of data-related transformations can be performed.

On my team, there are approximately 20 people who work with Azure Data Factory.

How are customer service and support?

We do not have experience with customer service and support.

How was the initial setup?

It does not require any installation and is more like software as a service. You need to create an instance of Azure Data Factory in Azure and configure some of the connections to your databases. You can connect to your blob storage, and some authentication is necessary for Azure Data Factory.

The setup is straightforward. It doesn't take much time, and it's on cloud. It requires a few clicks, and you can quickly set it up and grant access to the developer. Then the developer can go to the link and start developing within their browser.

We have a team that takes care of the cloud infrastructure, so we raise a ticket and request infrastructure, and they just execute it based on the naming convention with the project name.

What about the implementation team?

We have an entire team that takes care of the cloud infrastructure. So we raise a ticket when we need infrastructure, which is executed based on the naming convention for the project name.

What was our ROI?

The nature of our solution is not based on ROI because we are building solutions for other functions within the same organization. In addition, due to the large size of our organization and the services we provide, the ROI is not something we consistently track. It's something discussed with the management, so I can't comment on it.

What's my experience with pricing, setup cost, and licensing?

The cost is based on usage and the computing resources consumed. However, since Azure Data Factory pipelines connect with so many other services that Azure provides, such as Azure Functions, Logic Apps, and others, additional costs can be incurred by using those tools.

Which other solutions did I evaluate?

We did not evaluate other options because this solution was aligned with our current work environment.

What other advice do I have?

I rate the solution a seven out of ten. The solution is good and constantly improving, but Flowlets should be fixed so that they retain the changes we make. I advise users considering this solution to thoroughly understand what Azure Data Factory is and evaluate what's available in the market. They should also assess the nature of their use cases and the kind of products they will be building before choosing a solution.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Dan_McCormick - PeerSpot reviewer
Chief Strategist & CTO at a consultancy with 11-50 employees
Real User
Top 10
Secure and reasonably priced, but documentation could be improved and visibility is lacking
Pros and Cons
  • "The most valuable feature of Azure Data Factory is that it has a good combination of flexibility, fine-tuning, automation, and good monitoring."
  • "They require more detailed error reporting, data normalization tools, easier connectivity to other services, more data services, and greater compatibility with other commonly used schemas."

What is our primary use case?

We use Azure Data Factory for data transformation, normalization, bulk uploads, data stores, and other ETL-related tasks.

How has it helped my organization?

Azure Data Factory allows us to create data analytic stores in a secure manner, run machine learning on our data, and easily adapt to changing schema.

What is most valuable?

The most valuable feature of Azure Data Factory is that it has a good combination of flexibility, fine-tuning, automation, and good monitoring.

What needs improvement?

The documentation could be improved. They require more detailed error reporting, data normalization tools, easier connectivity to other services, more data services, and greater compatibility with other commonly used schemas.

I would like to see a better understanding of other common schemas, as well as a simplification of some of the more complex data normalization and standardization issues.

It would be helpful to have visibility, or better debugging, and see parts of the process as they cycle through, to get a better sense of what is and isn't working.

It's essentially just a black box. There is some monitoring that can be done, but when something goes wrong, even simple fixes are difficult to troubleshoot.

For how long have I used the solution?

I have been working with Azure Data Factory for a couple of years.

There is only one version.

What do I think about the stability of the solution?

Overall, I believe the stability has been good, but there have been a couple of occasions when the Microsoft resources that needed to be allocated were overburdened, and we had to wait for unacceptable amounts of time to get our slot. It has now happened twice, which is not ideal.

What do I think about the scalability of the solution?

There is no limit to scalability.

We only have a few users. One is a data scientist, and the other is a data analyst.

We use it to push up various dashboards and reports. It's a transitional product for transferring, transforming, and transitioning data.

It is extensively used, and we intend to expand our use.

How are customer service and support?

You don't really get that kind of support; it's more about documentation and the community support that is available. I would rate it a three out of five compared to others.

You could call them, and pay for their consulting hours directly, but for the most part, we try to figure it out or look through documentation. 

I think their documentation is lagging because it's not as popular a tool, so there's just not as much to fall back on.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We had only our own tools, and we switched because you get to leverage all of the work done in a SaaS or platform as a service, or however they classify it. As a result, you get more functionality, faster, for less money.

How was the initial setup?

The initial setup is straightforward.

It is a working tool. You can start using it within an hour and then make changes as needed.

We only need one person to maintain the solution; it doesn't take much to keep it running.

It's not a problem; it's a platform.

What about the implementation team?

We completed the deployment ourselves.

What was our ROI?

We have seen a return on investment. I can't really share many details, but for us, this becomes something that we sell back to our clients.

What's my experience with pricing, setup cost, and licensing?

You pay based on your workload. Depending on how much data you process through it, the cost could range from a few hundred dollars to tens of thousands of dollars.

Pricing is comparable to other tools; it's somewhere in the middle.

There are no additional fees to the standard licensing fee.

Which other solutions did I evaluate?

We looked at some other tools, such as Databricks, AWS Glue, and MuleSoft.

We already had most of our infrastructure connected to Azure in some way. So the integration of where our data resided appeared to be simpler and safer.

What other advice do I have?

I believe it would be beneficial to find someone experienced in some of the tools that are part of this, such as Spark. It doesn't have to be Data Factory specifically; someone who is already familiar with those other tools will become productive very quickly. If you're used to doing things in a different way, it may take some time because there isn't as much documentation and community support as there is for some more popular tools.

I would rate Azure Data Factory a seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Biswajith Gopinathan - PeerSpot reviewer
Data Analytics Specialist at GlaxoSmithKline
Real User
Top 10
Quick delivery due to drag-and-drop interface
Pros and Cons
  • "One of the most valuable features of Azure Data Factory is the drag-and-drop interface. This helps with workflow management because we can just drag any tables or data sources we need. Because of how easy it is to drag and drop, we can deliver things very quickly. It's more customizable through visual effect."
  • "Data Factory could be improved by eliminating the need for a physical data area. We have to extract data using Data Factory, then create a staging database for it with Azure SQL, which is very, very expensive. Another improvement would be lowering the licensing cost."

What is our primary use case?

My primary use case of Azure Data Factory is supporting the data migration for advanced analytics projects. 

What is most valuable?

One of the most valuable features of Azure Data Factory is the drag-and-drop interface. This helps with workflow management because we can just drag any tables or data sources we need. Because of how easy it is to drag and drop, we can deliver things very quickly. It's more customizable through visual effect. 

What needs improvement?

Data Factory could be improved by eliminating the need for a physical data area. We have to extract data using Data Factory, then create a staging database for it with Azure SQL, which is very, very expensive. Another improvement would be lowering the licensing cost. 

For how long have I used the solution?

I have been using this solution for the past year. 

What do I think about the stability of the solution?

This solution is stable. We are using an Azure subscription, so there is no maintenance and there are no direct updates; it's just always the latest version.

What do I think about the scalability of the solution?

This solution is automatically scalable, since it's in the cloud. At my company, there were more than one thousand people using this solution because we were a big, media-based company. If there are many user requests in the front-end application and the system is slow to respond, it will automatically scale up the hardware resources.

How are customer service and support?

I have contacted technical support. I have never faced an issue like that with Denodo. Fortunately, we got some kind of a tutorial PDF, which helps us to deploy everything quickly. 

Which solution did I use previously and why did I switch?

Before working with Azure, I worked with Python. In the culture I was working in, there was no integration. We were using pure Python scripting and Python data manipulation tools. For example, we used Python's pandas library, which we coded to transform and orchestrate the data needed for the endpoint. It was not at all a visual tool. It took more time than Denodo.

How was the initial setup?

There is no installation because it's on the cloud. You just log on to the cloud with your subscription credentials, then you can use Data Factory directly. 

What about the implementation team?

I implemented through an in-house team. 

What's my experience with pricing, setup cost, and licensing?

Data Factory is very expensive. We are using an Azure subscription, so Data Factory has no direct updates; it's just always the latest version. Compared to Denodo, Azure is very costly. The Azure framework has multiple services, not only Data Factory. So with a cloud-based solution, if you select a particular service, like Data Factory, you need to pay for each request.

Which other solutions did I evaluate?

I also use Denodo. Data Factory is like a transformation layer, but we need an additional staging database or a data storage facility, which is very expensive compared to implementing Denodo. So we extracted the data using Data Factory, then created a staging database with Azure SQL, which cost a huge amount since it's a physical data area. In Denodo, we just implement a layer, which is all handled in Denodo, and not a physical storage mechanism. I prefer customizable data solutions because they improve performance and creativity and are helpful for front-end people.

In comparison to Data Factory's drag-and-drop interface, Denodo developers need to create all the unified views by coding, so we have to create SQL queries to execute. With Data Factory, you can quickly drag and drop data or tables, but in Denodo, it takes more time because you need to code and test and all that.

What other advice do I have?

I rate Data Factory an eight out of ten, mainly because you need a staging database. I recommend Azure to others, but it depends on architecture. In Data Factory, there is no virtualization environment, no layer of virtualization to help with integration and caching mechanisms. Though Data Factory is there, Denodo is going further.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Azure Data Factory Report and get advice and tips from experienced pros sharing their opinions.
Updated: April 2024