Please share with the community what you think needs improvement with Google Stackdriver.
What are its weaknesses? What would you like to see changed in a future version?
While we are satisfied with the overall performance, in certain cases we have to add extra metrics and bring in additional tools such as Grafana and Dynatrace. We would like to see the application logs improved. Most of our application logs currently come from Grafana rather than from Google Stackdriver. Application logging should be addressed.
Given that it's a new solution on the block, I think the operations documentation could be a bit better. It took us some time to get up and running with it. It would have been helpful if they had provided some migration tools. For example, you're never going to get away from Genius. We have to do most of the work manually. Such tools would have helped us migrate scripts from other well-known software products and management tools to Stackdriver. It would also help if they improved cost transparency.
The APM functionality needs to be improved. It is difficult to estimate in advance how much something is going to cost. Often, you don't know how much it costs until you are finished.
As for what can be improved in Stackdriver: on the application side, if I want to track the round trip or a breakdown of my response times, I am not able to get it. My request goes through various levels of the Google Cloud Platform (GCP) and comes back to my client machine. Suppose my request has taken 10 seconds overall. If I want to break that down to see where the delay is happening within my architecture, I cannot find that out using Stackdriver. It can give me utilization reports and other network traffic information, but it cannot give me detailed answers at the request level. That is why I have to rely on other APM tools. It would be better if they could provide the round-trip response time at the request level, because at present it only gives overall figures such as traffic and transactions per second. And if any drops happen, what could be the reason for them? Suppose I have Kong, Kubernetes, NAT, and a VPN. With a setup like this, if a drop happens, Stackdriver will show me that it happened, but it will not tell me why.
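To make the request-level breakdown described above concrete, here is a minimal sketch of how a 10-second request could be split into time spent at each hop. It is not Stackdriver output or any product's API. The `Span` type, the hop names, and the timings are all hypothetical, and the sketch assumes each hop has at most one parent and that sibling hops do not overlap in time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """One timed hop of a request (hypothetical structure for illustration)."""
    name: str
    start_s: float
    end_s: float
    parent: Optional[str] = None


def self_times(spans: List[Span]) -> Dict[str, float]:
    """Return time spent in each hop itself, excluding time spent in its children."""
    duration = {s.name: s.end_s - s.start_s for s in spans}
    child_total = {s.name: 0.0 for s in spans}
    for s in spans:
        if s.parent is not None:
            child_total[s.parent] += duration[s.name]
    return {name: duration[name] - child_total[name] for name in duration}


# Hypothetical timings for one request that took 10 seconds end to end.
spans = [
    Span("client_request", 0.0, 10.0),
    Span("api_gateway", 0.2, 9.8, parent="client_request"),
    Span("kubernetes_service", 0.5, 9.5, parent="api_gateway"),
    Span("database_query", 1.0, 8.5, parent="kubernetes_service"),
]

breakdown = self_times(spans)
slowest = max(breakdown, key=breakdown.get)

for name, seconds in breakdown.items():
    print(f"{name}: {seconds:.1f}s")
print(f"slowest hop: {slowest}")
```

With a breakdown like this, the per-hop times add up to the overall response time, so the hop responsible for most of the delay stands out immediately.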
How do you or your organization use this solution?
Please share with us so that your peers can learn from your experiences.