Datadog Overview

Datadog is the #2 ranked solution in our list of APM tools. It is most often compared to Dynatrace.

What is Datadog?
Datadog is a monitoring service for IT, Dev and Ops teams who write and run applications at scale, and want to turn the massive amounts of data produced by their apps, tools and services into actionable insight.
Datadog Buyer's Guide

Download the Datadog Buyer's Guide including reviews and more. Updated: October 2021

Datadog Customers
Adobe, Samsung, Facebook, HP Cloud Services, Electronic Arts, Salesforce, Stanford University, Citrix, Chef, Zendesk, Hearst Magazines, Spotify, Mercado Libre, Slashdot, Ziff Davis, PBS, MLS, The Motley Fool, Politico, Barneby's
Archived Datadog Reviews (more than two years old)

AS
DevOps Engineer at Spark New Zealand
Real User
It has enhanced the performance of my team

Pros and Cons

  • "It has enhanced the performance of my team."
  • "The product could do better with its notifications."

What is our primary use case?

We use it for notifications, alerting, and capturing most of the information from Amazon, such as EC2 instances.

How has it helped my organization?

It has enhanced the performance of my team.

What needs improvement?

The product could do better with its notifications. 

I would like more technical support rather than conferences, because technical support makes setting up the product much easier.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

So far, it has been pretty stable. After we stand up and configure it, it works well.

What do I think about the scalability of the solution?

We have managed to get up to 350 hosts in one of the clusters, and it works fine.

How is customer service and technical support?

Datadog's support is pretty good.

How was the initial setup?

The integration and configuration of the product in our AWS environment was easy. This was one of the many things that I liked about Datadog.

What was our ROI?

I have not seen ROI yet.

Which other solutions did I evaluate?

We chose Datadog over the other products that we evaluated because it had better features: notifications, alerting, and metric capture. Also, Datadog had the skill sets that we wanted at the time.

What other advice do I have?

Try out some of the other products in comparison. This is a good product if you are looking for notifications and custom metrics.

We have always used the cloud version of this product.

This product also integrates with Slack and PagerDuty.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
DD
Senior Solutions Architect at a tech services company with 11-50 employees
MSP
It lacks consistency in the APIs. However, it has saved us a lot of trouble in implementation.

Pros and Cons

  • "It provides more cloud data. They tend to just get the way a service would be designed on the cloud."
  • "It has saved us a lot of trouble in implementation."
  • "The ease with which we can filter, use metrics, and give accounts to customers, then let the customer filter, set up metrics, and alerts. This has been a big win for us."
  • "It does not have the best interface."
  • "Stability of the product has been a concern for us outside of the primary monitoring agents."
  • "It lacks consistency in the APIs."

What is our primary use case?

We are using the infrastructure and app monitoring side, such as process monitoring. We are using it in a very traditional way. We are not using the APM capabilities. When it comes to something like containers, we will generally use it on the host but not inside the container itself. 

We are using it with our customers and in-house day-to-day.

How has it helped my organization?

It provides more cloud data. They tend to just get the way a service would be designed on the cloud. Datadog can handle a server disappearing and account for it, but they will kick somebody out. 

The ease with which we can filter, use metrics, and give accounts to customers, then let the customers filter and set up metrics and alerts themselves, has been a big win for us. This can't be done with a lot of the other platforms, and it has made things considerably easier. Where we used to get asked "What's my performance?", we can now say, "Here, have access. Go nuts. Tell us if you need anything." Our customers no longer ask us for all that, as they want to go do it themselves. This has made our lives infinitely easier.

What needs improvement?

The only thing that they were missing that has thrown us from the beginning (they are still missing it) is consistency in the APIs. There are a couple of guys on the automation side who complain, rightfully, about how hard it is, because every new feature that comes out has a new way of interfacing with the API. This was our big red flag in the beginning, but given the price and other features, it wasn't enough for us to discount the product. We said that we would live with this one red flag, but it is still a red flag.

Stability of the product has been a concern for us outside of the primary monitoring agents.

It does not have the best interface.

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

We haven't noticed any issues in the primary use case for which we are using it. 

The reason we're not using or looking at the APM space right now is due to platform availability. Datadog doesn't support enough platforms, which they know. Every customer that we have is running PHP, and we cannot use APM with any of our customers because of that. Even if they are 95 percent running Java, if Datadog doesn't have PHP, we can't use it because it won't integrate.

What do I think about the scalability of the solution?

Scalability has not been a concern at all. We have had customers with steady state loads: low and high. Our smallest customer is a friends and family startup which has about three instances. We have steady state loads which are more than 500. Then, we have customers with two instances all summer, but do seasonal work in the winter and can scale to more than 1000 instances. 

We have never noticed a hiccup on Datadog with any of our scaling. It has always grown to meet our program.

How are customer service and technical support?

We have used technical support for certain integrations. We use a lot of Ansible and Chef, and we have had a lot of problems with both of these automating components. Technical support was helpful within their limitations.

Which solution did I use previously and why did I switch?

We switched when we started getting heavy into the cloud. We used to use ScienceLogic, New Relic, AppDynamics, Zabbix, etc. It was hodgepodge. 

We were very strong in the APM space. We had all of our APMs going through AppDynamics, which suited a lot of our customer use cases in the cloud. However, when our customers started to get more specific, they wanted traditional core monitoring, and the other on-premise traditional vendors, like ScienceLogic, weren't cutting it. That is when we started to look at Datadog. We went back and forth for a while between Zabbix and Datadog. In the end, Datadog won out based on features, price, and everything together.

How was the initial setup?

The integration with the AWS environment has been pretty seamless. There have been a few services that we don't use that they don't have book support for. However, usually that happens when it is a new service which is really unpopular. Most of the time, our customers shouldn't have been using that service to begin with, since it's a legacy thing that we inherited. I can't think of a single case where we haven't told the customer "You have to get off of that." 

What was our ROI?

It has saved us a lot of trouble in implementation.

What's my experience with pricing, setup cost, and licensing?

The pricing came up a bit compared to their competitors. It is not that the price has risen, but that the competitors have gone down. They keep adding more features that I would have expected to be baked in at a more nominal price. I have been increasingly dissatisfied with the pricing, but not enough to jump ship. It is still pretty good.

What other advice do I have?

Check the APIs very carefully. Without fail, this is the single biggest complaint for automation and operations. It is not that it can't be done. Just make sure that you have the technical expertise to work around it.

We use a mixture of both AWS and on-premise. There are actually three scenarios: 

  1. Some of our customers purchase it for AWS. 
  2. Some of them were accounts that we set up directly on Datadog for our customers. 
  3. In some cases, customers already have a relationship with Datadog. 

Those are the three scenarios. Some have a mixture of scenarios due to regulatory reasons.

Disclosure: My company has a business relationship with this vendor other than being a customer: Reseller.
DT
Director of Engineering at a tech vendor with 201-500 employees
Real User
The ingestion points are unlimited and support customization. We would like the averages-of-averages issue to be fixed.

Pros and Cons

  • "The integration and configuration are incredibly simple. The SaaS offering is remarkably easy to set up, especially if you're coming from a Graphite environment or anything that uses a StatsD."
  • "The ingestion points are unlimited and support customization. We haven't had anything yet that we haven't been able to integrate with it."
  • "There are things about it that we would like to be fixed, such as the way it takes averages of averages. This results in data that we don't expect."

What is our primary use case?

  • Monitoring
  • Analytics
  • Tracing
  • APM

What is most valuable?

It's hosted. We don't have to do it, and they handle a large amount of data with backups and all of the other things that we no longer have to manage. 

What needs improvement?

There are things about it that we would like to be fixed, such as the way it takes averages of averages. This results in data that we don't expect, but overall we are happy with it.
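To illustrate why an average of averages can produce unexpected numbers, here is a minimal sketch. It is not Datadog's actual aggregation logic; the host names and latency values are hypothetical, chosen only to show how groups of unequal size skew the result.

```python
# Hypothetical per-host request latencies in milliseconds (illustrative data only).
latencies_by_host = {
    "host-a": [100, 100, 100, 100],  # busy host, four requests
    "host-b": [500],                 # quiet host, one slow request
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def average_of_averages(groups):
    """Average each group first, then average those per-group results."""
    return mean([mean(v) for v in groups.values()])


def overall_average(groups):
    """Average every raw data point, regardless of which group it came from."""
    all_points = [x for v in groups.values() for x in v]
    return mean(all_points)


print(average_of_averages(latencies_by_host))  # 300.0
print(overall_average(latencies_by_host))      # 180.0
```

The two results differ because the single slow request on the quiet host carries as much weight as all four requests on the busy host once each host has been averaged separately.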

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

It is incredibly stable.

What do I think about the scalability of the solution?

We have had no issues with scalability.

How is customer service and technical support?

We have needed technical support because we were dealing with averages of averages.

How was the initial setup?

The integration and configuration are incredibly simple. The SaaS offering is remarkably easy to set up, especially if you're coming from a Graphite environment or anything that uses StatsD. Datadog is a custom StatsD client, and it adds additional functionality, like tags. However, it should work out of the box with native StatsD, so it is incredibly easy to use as a drop-in replacement if you are using StatsD for metrics.
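As a rough sketch of what the StatsD compatibility and the tag extension look like on the wire, the snippet below builds plain StatsD datagrams and the tagged variant. The metric name, tag values, and the helper names `format_metric` and `send_metric` are hypothetical; the sketch assumes an agent listening on the conventional StatsD UDP port 8125 on localhost.

```python
import socket


def format_metric(name, value, metric_type, tags=None):
    """Build a StatsD-style datagram; tags use the DogStatsD '|#' suffix."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (fire-and-forget, no reply is expected)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# "c" marks a counter in the StatsD line protocol.
plain = format_metric("page.views", 1, "c")
tagged = format_metric("page.views", 1, "c", tags=["env:prod", "service:web"])

print(plain)   # page.views:1|c
print(tagged)  # page.views:1|c|#env:prod,service:web
```

Calling `send_metric(tagged)` would emit the datagram. The untagged form is what a native StatsD client already sends, which is why existing StatsD instrumentation can usually be pointed at the agent unchanged.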

What was our ROI?

In terms of employee time: While the instructor costs were transferred to Datadog, it freed up our engineers to work on things which were valuable to our business rather than maintaining a service that we don't make money on.

What's my experience with pricing, setup cost, and licensing?

It costs the same amount it would if we were hosting it ourselves, so we are incredibly happy with the cost.

Which other solutions did I evaluate?

We did look at several vendors. What it came down to is we did not want to manage the metric services ourselves anymore, and Datadog matched what it cost for us to host it ourselves.

What other advice do I have?

Check out Datadog. It is awesome.

The ingestion points are unlimited and support customization. We haven't had anything yet that we haven't been able to integrate with it.

We have only used the SaaS offering, not the AWS or on-premise versions.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Enrique Yanez
Software Engineer at Sony Corporation of America
Real User
It is very easy to use and configure. It has a nice UI.

Pros and Cons

  • "If we have a large load for users using our basic Datadog, it will immediately fire off an alert notifying us either something's wrong or not."
  • "It has a nice UI."
  • "We have asked technical support questions, and sometimes they don't get back to us right away. Or when they do, it is not the right answer."

What is our primary use case?

If our app is up and running, we use it to monitor how many credits the app is using up on each node. We also monitor services by how long each call is taking with the help of EC2s off of application.

How has it helped my organization?

If we have a large load for users using our basic Datadog, it will immediately fire off an alert notifying us either something's wrong or not. It provides us insights on our calls to other services, such as how long each call is taking and what is the whole stack trace.

What is most valuable?

  • It is very easy to use.
  • It is easy to configure.
  • It has a nice UI.
  • Datadog provides everything that we need.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

Stability is great. It has not come down. It is always up.

We do not put a lot of stress on it. We use it for monitoring our app, and it's a pretty great product.

What do I think about the scalability of the solution?

We have an application in AWS running four nodes. It is not too large. Our user base is about 2000 users.

How are customer service and technical support?

We have asked technical support questions, and sometimes they don't get back to us right away. Or when they do, it is not the right answer. 

Which solution did I use previously and why did I switch?

Before Datadog, we had APM monitoring, which is something similar, but it wasn't as nice to use or as easy to configure.

How was the initial setup?

It is easy to configure. You load the Datadog agent into the EC2 instance, then you just follow it. 

Which other solutions did I evaluate?

I did not participate in the evaluation of the other products.

What other advice do I have?

If you are monitoring the metrics and insights in your application, and need help monitoring, then this is a great application to look into. The app is always available. It has a clean UI and provides the metrics that you will need. It is a good product.

Right now, we are only using it on this one application.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Brendan Buono
Software Engineer at Lovepop
Real User
It lets us react more quickly to things going wrong, then we can get back up and running faster for our customers

Pros and Cons

  • "It has scaled great. I haven't run into any problems anywhere that I've used it. They have handled everything that we have needed them to."
  • "It lets us react more quickly to things going wrong. Whereas before, it might have been 30 minutes to an hour before we noticed something going on, we will know within a minute or two if something is off, which will let us essentially get something back up and running faster for our customers, which is revenue."
  • "I would love to see support for front-end and mobile applications. Right now, it is mostly all back-end stuff. Being able to do some integration with our front-end products would be awesome."

What is our primary use case?

The primary use case is application monitoring. We also use it to set custom metrics and watch our AWS metrics, as well as data.

At my current job, I have only used it for a couple of months. However, I used it for a few years at a previous company.

How has it helped my organization?

It lets us react more quickly to things going wrong. Whereas before, it might have been 30 minutes to an hour before we noticed something going on, we will know within a minute or two if something is off, which will let us essentially get something back up and running faster for our customers, which is revenue.

What is most valuable?

Its most valuable feature is the monitoring, such as all the custom metrics that Datadog imports from AWS. In addition, the specific monitoring where you can set up an alert to a bunch of different services. 

What needs improvement?

Some of their newer solutions are interesting, like their logging, but they are not fleshed out. They could use more metrics or synthetics, which would be really helpful.

I would love to see support for front-end and mobile applications. Right now, it is mostly all back-end stuff. Being able to do some integration with our front-end products would be awesome.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

It is very stable. Both times that I have worked with Datadog, we haven't had any issues with them going down. Or, if they did, we didn't know, which is good.

At the previous company that I worked at, we threw a lot at them all at once.

Because this is a newer integration, we are putting less stress on the tool. We are still working on integrating it into our platform.

What do I think about the scalability of the solution?

It has scaled great. I haven't run into any problems anywhere that I've used it. They have handled everything that we have needed them to.

We are a 100 person company with 20 engineers.

How is customer service and technical support?

The technical support is great. They respond quickly. They know what they are talking about and dig right in. If they don't know the answer, they can get it to us very quickly.

How was the initial setup?

The integration and configuration through AWS was pretty smooth. It was easy to set up and start using. The documentation was clear. So, it worked really well.

What about the implementation team?

We did the integration and configuration through AWS ourselves.

What was our ROI?

We haven't seen ROI at my current company. The solution is too new. 

At my last company, we did see ROI, specifically around response time. We could immediately get to mission-critical things that were down and losing revenue. So, the product paid for itself.

What's my experience with pricing, setup cost, and licensing?

The pricing and licensing through AWS Marketplace has been good. It would be nice if it was cheaper, but their pricing is reasonable for what it is. Sometimes, for their newer features, they charge as if it's fully fleshed out, even though it is a newer feature and it may have less stuff than their other items. So, if they would scale the pricing appropriately as they add more stuff to it, that would make sense. The pricing should reflect the abilities of the features.

Which other solutions did I evaluate?

We looked into self-hosting something, like Prometheus. We also evaluated New Relic.

We chose Datadog for its ease of use in getting set up and what they offered us.

What other advice do I have?

Take the time to explore it and see all the metrics which are available. The metrics make the reporting better. Spend the time and learn the metrics. The things that they can send and give you are good. Learn how to aggregate them and how to write more complex queries, which they do a good job of showing how to do, but I found that newer people don't do this. They just try to use the baseline set of features. Doing the more complex stuff adds significant value.

We have PagerDuty integrated with it, as well as all of AWS. Those are the big ones we have running through it. It integrates well. It essentially replaces CloudWatch, so we can just use Datadog, which is nice. The biggest thing that they provide is putting everything in one spot.

I have just used the AWS version.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
AA
Principal Engineer at a comms service provider with 51-200 employees
Real User
It has the ability to create dashboards and matrices with graphs

Pros and Cons

  • "Using the data, our operations team works with the dashboards to get their statistics, analytics, etc."
  • "I would like testing for data in the future."

What is our primary use case?

We mainly use it to send metrics about CPU and memory usage, in addition to the number of file descriptors on a socket.

How has it helped my organization?

We work as an SMS aggregator. Therefore, we send a lot of SMS messages to customers. This product holds one of the most important dashboards for our traffic from each server or cluster on our Gateway. It gives us very good information, mainly for the operations team and other sales guys, about what each account is sending, how often, etc.

Using the data, our operations team works with the dashboards to get their statistics, analytics, etc.

What is most valuable?

The ability to create dashboards and matrices with graphs. This information is useful to us.

What needs improvement?

I would like testing for data in the future. That would be really nice.

Also, I would like some additional enhancement in the visuals.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

The stability is good. For every message that we send, we have a corresponding metric. We send one to two million messages per server a day.

What do I think about the scalability of the solution?

The scalability is good.

We have over 1,000 accounts on three servers. There are other servers, which work as helper servers. However, the servers which carry the traffic or do the heavy lifting are the three main servers, currently. They are hosted on Amazon: large machines with large instances.

How is customer service and technical support?

I have not used Datadog's technical support.

How was the initial setup?

The AWS integration and configuration is pretty good. It has multiple languages and platforms.

What other advice do I have?

Give it a try. It is a good tool for creating statistics and analytics with data. 

It suits anyone who uses a large amount of data and wants insights from analytics on that data. They can just dump it into the tool, and it will do all the heavy lifting.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
JC
System Ninja at a philanthropy with 51-200 employees
Real User
We have a better grasp of what is occurring during the deployment cycle

Pros and Cons

  • "We have a better grasp of what is occurring during the deployment cycle. If something fails, we have an idea what has failed, where it has failed, and how it failed to better mitigate the situation."
  • "It is a good one stop location where we keep all our data for our infrastructure, and it's also easier to navigate between different things."
  • "We want to reduce having to go to different screens to obtain all the information."
  • "At the beginning, when we started throwing logs at it, there was a bit of hiccup. However, this was during their beta period, so hiccups were expected."

What is our primary use case?

We use it to monitor our infrastructure, particularly our different EC2 instances, and our containers. We also use it to capture our logs.

How has it helped my organization?

We have a better grasp of what is occurring during the deployment cycle. If something fails, we have an idea what has failed, where it has failed, and how it failed to better mitigate the situation.

What is most valuable?

It is a good one stop location where we keep all our data for our infrastructure, and it's also easier to navigate between different things.

What needs improvement?

We want to reduce having to go to different screens to obtain all the information. However, they are moving in the right direction from what we have noticed.

For how long have I used the solution?

Less than one year.

What do I think about the stability of the solution?

Stability has never been an issue. We throw all of our servers and containers at it. We have now started to throw our on-premise logs at it too.

At the beginning, when we started throwing logs at it, there was a bit of hiccup. However, this was during their beta period, so hiccups were expected.

What do I think about the scalability of the solution?

It pretty much vacuums up any information that we throw at it. So, stability hasn't been an issue.

It scales depending on the time of year. Right now, we have about 25 to 50 instances, and in each instance there are probably five different containers, not including logging for all those containers.

How is customer service and technical support?

We used their technical support, especially during rollout. They were really good. We worked hand in hand to try to figure out how to configure everything.

How was the initial setup?

For the monitoring of different EC2 instances, you install Datadog's package on them.

We use Chef to install Datadog's package, then that calls out all the information from the instance.

Which other solutions did I evaluate?

We did evaluate other vendors.

We chose Datadog because we were looking for an all-in-one package. They also do log caching and integrate with other systems well.

What other advice do I have?

Take advantage of Datadog's trial period, and really beat it up, then give them a call.

We use the web service for this product.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
MI
Site Reliability Engineer at a financial services firm with 201-500 employees
Real User
They have a good ecosystem for their integrations

Pros and Cons

  • "Their interface is probably one of the easiest things to use because it lets non-developers and non-engineers quickly get access to metrics and pull business value out of them. We could put together dashboards and give it to people who are non-technical, then they can see the state of the world."
  • "We have been able to set very specific CPU and memory alerts, at the very base level, then we started to pull real business value, like 99th percentile response rates for our API calls."
  • "It has turned into an operational dashboard. If you felt something is going wrong, you can immediately open up Datadog. It has been our go to application because we know the answer will be there."
  • "The way data is represented can be limiting. When I first tried it out a long time ago, you could graph a metric and another metric, and they'd overlay, but you couldn't take the ratio between the two."
  • "When I started using it years ago, it had stability problems. I remember, specifically, we ran everything in Docker containers. There were some problems getting it into a Docker container with very specific memory limits."

What is our primary use case?

We use it for custom metrics of our applications and monitoring of our systems.

How has it helped my organization?

My current company didn't have very good monitoring in the past. We had been using basic CPU monitoring. We have been able to set very specific CPU and memory alerts, at the very base level, then we started to pull real business value, like 99th percentile response rates for our API calls. 
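To show why a 99th percentile tells you more than a plain average for API response times, here is a small sketch using the nearest-rank method. The response times are hypothetical, and this is one common percentile definition, not necessarily the one Datadog applies.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of points at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical response times in milliseconds: 98 fast calls and 2 slow ones.
response_times_ms = [20] * 98 + [900, 1500]

print(sum(response_times_ms) / len(response_times_ms))  # 43.6
print(percentile(response_times_ms, 50))                # 20
print(percentile(response_times_ms, 99))                # 900
```

The mean and the median both look healthy here, while the 99th percentile exposes the slow calls that a small share of users actually experienced.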

It has turned into an operational dashboard. If you felt something is going wrong, you can immediately open up Datadog. It has been our go to application because we know the answer will be there.

What is most valuable?

Their interface is probably one of the easiest things to use because it lets non-developers and non-engineers quickly get access to metrics and pull business value out of them. We could put together dashboards and give it to people who are non-technical, then they can see the state of the world. 

They have a very good ecosystem for their integrations. They have a lot of different integrations, and we use a lot of them. We have integrations with Amazon for ECS, RDS, and all of the subsystems of Amazon. We also have Docker and Splunk integrations. The integrations are great because they're definitely vetted and not third-party integrations. They're part of the Datadog ecosystem and seamless.

What needs improvement?

The way data is represented can be limiting. They have added their own little query language that you can use to manipulate things, so you can graph and relate two different metrics together. This is relatively new this year. When I first tried it out a long time ago, you could graph a metric and another metric, and they'd overlay, but you couldn't take the ratio between the two. However, it looks like this is the direction that they're going, and that's a good direction. I think they should continue adding things that way.

I like being able to put the formulas in myself. I don't want the average. I want a rolling average over three minutes, not five minutes. They're getting better at letting the user customize this.
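The difference between a three-minute and a five-minute rolling average can be sketched as follows. This is a generic trailing-window mean over hypothetical one-per-minute readings, not Datadog's query language; the function name `rolling_average` and the sample data are illustrative assumptions.

```python
from collections import deque


def rolling_average(samples, window_seconds):
    """Yield (timestamp, mean of values inside the trailing window) for each sample.

    samples: iterable of (timestamp_seconds, value) pairs, sorted by timestamp.
    """
    window = deque()
    total = 0.0
    for ts, value in samples:
        window.append((ts, value))
        total += value
        # Drop points that have fallen out of the trailing window.
        while window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total / len(window)


# Hypothetical CPU readings, one per minute, with a single spike at minute 5.
readings = [10, 10, 10, 10, 10, 100, 10, 10, 10, 10, 10]
samples = [(minute * 60, value) for minute, value in enumerate(readings)]

three_min = list(rolling_average(samples, 180))
five_min = list(rolling_average(samples, 300))

print(max(v for _, v in three_min))  # 40.0
print(max(v for _, v in five_min))   # 28.0
```

The shorter window reacts more sharply to the spike, which is the kind of trade-off a user may want to control rather than accept a fixed default.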

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

When I started using it years ago, it had stability problems. I remember, specifically, we ran everything in Docker containers. There were some problems getting it into a Docker container with very specific memory limits. We couldn't nail down exactly what the limits and the application needed. Once we did that, we were good. However, it was tricky to get the limit in the first place.

What do I think about the scalability of the solution?

It has always scaled for us. Cost scales up too, but that is not necessarily a bad thing. It's reasonable for what they're providing. I haven't had any concerns about scaling.

We use between 100 and 500 servers at any given point in time.

How is customer service and technical support?

For the most part, the technical support is pretty good. Every now and again, you will get stuck with a support rep who could have better training, but in general, they are very good and responsive. They're willing to talk about new features, etc.

How was the initial setup?

The integration and configuration processes have been very smooth because everything is very well-documented. The documentation is phenomenal. 

What was our ROI?

We can see trends a lot more easily than if we didn't have the solution. The management can see the changes which are being made, whether in performance or in the number of hosts that went down. We recently made improvements to some of our internal APIs, so we reduced the number of servers that we needed. You could see that the load on the system went down and the number of servers went down. Thus, it was easy to visualize.

What's my experience with pricing, setup cost, and licensing?

Pricing and licensing are reasonable for what they give you. You get the first five hosts free, which is fun to play around with. Then it's about four dollars a month per host, which is very affordable for what you get out of it. We have a lot of hosts that we put a lot of custom metrics into, and every host gives you an allowance for the number of custom metrics. We have not had a problem with it.

Which other solutions did I evaluate?

My company now is pretty good at looking at alternatives. Also, I evaluated alternative solutions at my last company. 

There are some other competitors. For example, I know one of them started doing metrics and their licensing is very cheap because the metric size is very small and it's per megabyte. They charge you per storage, and it's very small. However, the interface and integrations aren't there.

The other thing is granularity. Datadog gives you one-second granularity for a year, whereas some of the competitors roll up: after about a week you don't have one second, you have five seconds. Then, after a month, you don't have five seconds, you have a minute. Whether the roll-up averages the values or takes the maximum, you start to lose granularity, and with it the ability to see incidents historically, which is super valuable. If we have an incident that we think we've seen before and want to look back historically, we can zoom right in and see in the database where it peaked.
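To make the roll-up effect concrete, here is a minimal sketch in Python. The data, the 60-second bucket size, and the `rollup` helper are hypothetical and chosen only for illustration; they do not reproduce any vendor's actual retention logic.

```python
from statistics import mean


def rollup(samples, bucket_size, aggregate):
    """Collapse consecutive samples into buckets of bucket_size using aggregate."""
    return [
        aggregate(samples[i:i + bucket_size])
        for i in range(0, len(samples), bucket_size)
    ]


# Hypothetical data: five minutes of one-second readings at 20.0,
# with a three-second spike to 95.0 starting at second 130.
samples = [20.0] * 300
for second in range(130, 133):
    samples[second] = 95.0

per_minute_avg = rollup(samples, 60, mean)
per_minute_max = rollup(samples, 60, max)

print(max(samples))     # peak visible in the raw one-second data
print(per_minute_avg)   # averaging nearly hides the spike
print(per_minute_max)   # max keeps the height but not the exact second
```

With averaging, the minute that contains the spike barely stands out from its neighbors. With max, the height survives, but you can no longer tell which second it happened in or how long it lasted.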

What other advice do I have?

Give Datadog a try. It's the leader in this space. 

I have only used the AWS version of the product.

They have a thing for the color purple, but it is all good.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
YS
Software Developer at AhnLab, Inc.
Vendor
The AWS version is not difficult to upgrade, but the on-premise version is

What is our primary use case?

We use it to store editorial content. We started out on the on-premise version, then moved to the AWS version.

What is most valuable?

I don't have to worry about upgrades with the AWS version.

What needs improvement?

The on-premise version is very difficult to upgrade.

For how long have I used the solution?

More than five years.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
DusanJovanovic
Software Engineer at a media company with 51-200 employees
Real User
Excellent autocomplete for everything in the UI

Pros and Cons

  • "Excellent autocomplete for everything in the UI."
  • "It has empowered all our platform engineers with a very powerful and easy to use monitoring system."
  • "Going from viewing a metric to creating a monitor alerting on a metric is very easy."
  • "The web app has a real-time support chat window in which a support engineer is chatting with you within a minute."
  • "It would be nice to be able to graph metrics by excluding certain tags (like you can do in monitors)."
  • "It would also be nice if we had more insight into our own usage of Datadog (agents and custom metrics). They provide a usage page which does help, but it is not in real-time."
  • "It would be great if usage metrics were automatically created and we could create custom metrics. Instead, we ended up building some of our own stuff to track and alert on our own usage."

What is our primary use case?

We run the agent in AWS. 

How has it helped my organization?

It has empowered all our platform engineers with a very powerful and easy to use monitoring system. Most of our platform organization is now involved in monitoring. Previously, only a handful of platform engineers were involved, because Graphite and Sensu were so cumbersome to use.

What is most valuable?

It is incredibly easy to do common monitoring actions:

  • Excellent autocomplete for everything in the UI.
  • Using tags is very intuitive (in contrast to the cumbersome regex-like querying in Graphite).
  • Going from viewing a metric to creating a monitor alerting on a metric is very easy. This is very important as the easier it is to create monitors, the more monitors will be created by people. With Graphite and Sensu, the effort required to create and test a monitor was so great that we had only a handful of monitors. We now have over 300 monitors.

What needs improvement?

  • It would be nice to be able to graph metrics by excluding certain tags (like you can do in monitors). 
  • It would also be nice if we had more insight into our own usage of Datadog (agents and custom metrics). They provide a usage page which does help, but it is not in real-time. 
  • It would be great if usage metrics were automatically created and we could create custom metrics. Instead, we ended up building some of our own stuff to track and alert on our own usage.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

We have very rarely had stability issues, maybe only once or twice that we noticed. It is very reliable.

What do I think about the scalability of the solution?

We have not had any issues with scalability.

How are customer service and technical support?

It is excellent. The web app has a real-time support chat window in which a support engineer is chatting with you within a minute. That is the "right" way to do support. 

Which solution did I use previously and why did I switch?

We previously ran Graphite and Sensu ourselves. By moving to Datadog, we did not need to manage our own monitoring infrastructure anymore. Graphite was somewhat complex to run.

How was the initial setup?

Initial setup is easy. Install the agent and send it metrics. There are StatsD/Datadog libraries available for most languages.
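As a rough illustration of what "send it metrics" can look like without any client library, here is a minimal Python sketch that builds a StatsD-style datagram and sends it over UDP. It assumes an agent is listening locally on port 8125 (the usual StatsD port). The metric name, tags, and helper functions are hypothetical. The `|#tag` suffix is the DogStatsD tagging extension; plain StatsD servers may not accept it.

```python
import socket


def format_metric(name, value, metric_type="c", tags=None):
    """Build a StatsD-style datagram; tags use the DogStatsD '|#k:v,...' extension."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; assumes an agent is listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))


# Hypothetical metric name and tags, for illustration only.
datagram = format_metric("checkout.requests", 1, "c", ["env:staging", "service:web"])

if __name__ == "__main__":
    send_metric(datagram)
```

In practice, one of the client libraries the reviewer mentions would handle formatting, buffering, and sampling for you. The sketch only shows how little is needed to get a first metric flowing.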

What's my experience with pricing, setup cost, and licensing?

Pricing seems reasonable. It depends on the size of your organization, the size of your infrastructure, and what portion of your overall business costs go toward infrastructure. It is hard to say without looking at all of this.

Which other solutions did I evaluate?

We looked at several competitors at the time (Summer 2016). There did not seem to be any compelling alternatives. Once we did the PoC with Datadog, we loved it and decided to move forward.

What other advice do I have?

Try it out and see if you like it.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user147573
CTO with 51-200 employees
Vendor
We can build dashboards as fast as we roll out new systems, which can be fast.

Pros and Cons

  • "The most valuable features have been: Sharable dashboards, TimeBoards, dogstatsd API, Slack Integration, Event logging API, CloudTrail Events, Tags, alerts, anomaly detection, and EBS Volume Snapshot Age, which they added upon request."
  • "More granular control over dashboard sharing. Timeboard sharing."

How has it helped my organization?

We can build dashboards as fast as we roll out new systems, which can be fast.

We use standard and custom metrics for every new system we roll out for 360-degree visibility into our systems.

What is most valuable?

The most valuable features have been: Sharable dashboards, TimeBoards, dogstatsd API, Slack Integration, Event logging API, CloudTrail Events, Tags, alerts, anomaly detection, and EBS Volume Snapshot Age, which they added upon request. We used PagerDuty integration for a while as well.

What needs improvement?

More granular control over dashboard sharing. Timeboard sharing.


What do I think about the stability of the solution?

There are infrequent hiccups, which have been decreasing over the time we have used it.

What do I think about the scalability of the solution?

We have not had any issues with scalability.

How are customer service and technical support?

Customer Service:

I have never seen better. Questions are usually answered almost immediately, even on weekends. An in-stream with your event stream.

Technical Support:

High.

Overall they have always had an amazing team, and quality has been maintained as the company has grown.

Which solution did I use previously and why did I switch?

Complementary to other tools we used.

How was the initial setup?

Setup is generally easy. They provide a large number of integrations; some are more complex than others, which is to be expected.

What about the implementation team?

In-house implementation.

What was our ROI?

We didn’t calculate explicitly, but as we used the product to track down underutilized instances, it more than paid for itself in the first month.

What's my experience with pricing, setup cost, and licensing?

Pricing overall in this segment has standardized in the last several years.

Which other solutions did I evaluate?

A few, including Zabbix and Icinga.

What other advice do I have?

One of the fastest and most flexible tools we have used in this area.

Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
it_user147570
Programmer with 51-200 employees
Vendor
The fact that it's hosted, so you don't have to rely on infrastructure to do the work, is a very valuable feature.

What is most valuable?

The fact that it's hosted, so you don't have to rely on infrastructure to do the work, is a very valuable feature.

How has it helped my organization?

At other places I've worked, we had to rely on different types of infrastructure -- Datadog does all that. It's a big time saver. Datadog has alerting, events, and metrics all in one place. This was a huge plus: other solutions were trying to treat monitoring as a multi-faceted problem, while Datadog treated it as one problem. Also, we no longer use Nagios for alerting. We use Datadog’s alarms, and then we push the data into PagerDuty.

What needs improvement?

The performance, especially when we're drilling into metrics we've been running for a year and a half. When you're launching old data, it can get slow.

For how long have I used the solution?

We have been using the solution for the last year and a half.

What was my experience with deployment of the solution?

No issues with deployment.

What do I think about the stability of the solution?

No serious issues with stability -- has been pretty great.

What do I think about the scalability of the solution?

We pushed them sometimes, meaning when we would do something their platform didn't support. But they were always very cooperative and proactive.

How are customer service and technical support?

Customer Service:

Very good customer service.

Technical Support:

Hit or miss. They are always very proactive and friendly, and they almost always have the right answer, but they could be quicker sometimes.

Which solution did I use previously and why did I switch?

At this job, I've only used Datadog. At past jobs, I've used New Relic and an assortment of others.

How was the initial setup?

Very straightforward. We got Datadog running in a couple of hours and deployed it across our production environment. We just installed StatsD, started sending metrics, and it just worked. Deployment was super easy, and their setup and integration made us feel like it wasn't a vendor lock-in.

What's my experience with pricing, setup cost, and licensing?

The initial cost was a couple of days of work. Day to day, there is nothing really.

Which other solutions did I evaluate?

Yes, we evaluated a bunch of options, including New Relic and a couple of others, as well as toying with the idea of hosting our own. We didn't want to manage our own; our corporate culture is to let other companies be the experts when possible. In the case of Datadog, the product exists -- it made sense to let them be the experts.

What other advice do I have?

I'm an advocate of Datadog -- go for it.
Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
it_user147213
CEO & Co-Founder with 51-200 employees
Vendor
I always have Datadog up to debug issues with integrations or to ensure that live campaigns are running smoothly.

We were building a real-time bidding exchange for digital out-of-home ads by providing the analytics and the infrastructure for the ecosystem. We not only facilitate the buys, but we also act as an ad server for the network and advertisers who put in their server requirements. It is very similar to the large online ad servers. In order to provide this real-time service, we had to be able to monitor and analyze a wide range of data points for the many media companies in their ad exchange. 

We wanted to find an “out-of-the-box” metrics solution that would integrate easily with current systems. We found that Datadog included the integrations that we needed to get our monitoring solution up quickly. Additionally, we liked Datadog’s ability to perform customized metric monitoring, log critical events, and scale easily. I had scaled up open-source monitoring solutions once before, and it wasn’t fun. So when I had the opportunity to do it again, I said ‘No.’ 

On a daily basis, Datadog eliminates a lot of the back and forth with the media owners to find out what is going on. It is a good visual tool for seeing the activity from each of the content management systems that we work with, and an easy way to go in and get a feel for what is going on. Without Datadog, we would have to repeatedly spend time reaching out to each media owner directly to see if they are sending any requests that day.

As we moved toward a real-time system, it was important to understand whether our partner networks had reported all of their ads in a short period of time. We now use Datadog as a daily monitoring tool to get a feel for what networks are actively sending requests to their servers, what live campaigns are running smoothly and whether the right networks are requesting ads. I always have Datadog up on a daily basis to debug issues with integrations or just to make sure that live campaigns are running smoothly. 

Each network that we work with has different requirements. In terms of connectivity of their networks, we are always dealing with different frequencies and rates of requests each day. The wide diversity in partners has made customization key, since the monitoring needs for each partner vary drastically. There are days where we see weird trends of requests coming in. We are able to use Datadog’s custom graphs to catch the anomalies or areas of concern in their system on a real-time basis.

Going forward, we are looking to create separate dashboards for each ad network that we work with. This will enable us to gain more detailed metrics for each individual media owner. We will be able to take these detailed metrics and use them to quickly identify potential problem areas, improving our ability to solve problems as soon as they are detected.

Disclosure: IT Central Station has made contact with the reviewer to validate that the person is a real user. The information in the posting is based upon a vendor-supplied case study, but the reviewer has confirmed the content's accuracy.