End User Monitoring (EUM), sometimes called End User Experience Management, falls within the area of Application Performance Management (APM). When EUM is discussed, most people think of synthetic monitoring – using robots that run synthetic transactions from several fixed locations. Synthetic transactions are just one of many options – and certainly not the best solution. This article discusses the main ways of performing end user monitoring and the pros and cons of each method.
Figure 1: Real User Monitoring - Showing URL Performance
The use of synthetic transactions is perhaps the best known of the EUM methods – but arguably the least effective – and definitely the method with the highest administrative overhead and Total Cost of Ownership (TCO). There are many issues with synthetic transactions:
Incomplete Location Coverage. Synthetic transactions do not monitor real users; they monitor transactions executed by robots from a fixed set of locations. Monitoring all user locations is very expensive because of the number of robots required.
Not Monitoring Real Users. Synthetic transactions cannot measure the experience of real users – especially highly distributed users of external, customer-facing applications such as internet banking. The synthetic transaction may work fine, while real users experience issues caused by their location, browser type or local workstation.
Page Abandonment. Synthetic transaction monitoring does not report on user behaviour such as page abandonment rates.
High Administrative Cost. Synthetic transactions are costly to maintain. First, the scripts must be developed: it usually takes 3-5 days to develop a robust, production-ready script for each synthetic transaction. Most products have a “record” capability for capturing the transaction, but the script will always need editing to make it robust and production ready. Scripts need to be updated every time the application is updated. Robots occasionally fail and need restarting. For a mid-sized organization with 6,000 employees (such as a small bank) you could easily require one FTE support person to maintain the solution.
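To make the maintenance burden concrete, the sketch below shows roughly what a single synthetic-transaction robot boils down to: fetch a page, time it, and verify that the expected content appeared. This is a minimal illustration, not any vendor's product; the URL, expected text and SLA threshold are placeholders. Real robot scripts must also handle logins, multi-step flows and session state – which is exactly why they take days to build and break whenever the application changes.

```python
# Minimal "robot" for one synthetic transaction: fetch a page, time it,
# and verify the expected content. All names/values are illustrative.
import time
import urllib.request

def evaluate(status, elapsed, body, expected_text, threshold_secs):
    """Decide whether the synthetic transaction passed."""
    if status != 200:
        return "FAIL: HTTP %d" % status
    if expected_text not in body:
        return "FAIL: expected text missing"   # page loaded but wrong content
    if elapsed > threshold_secs:
        return "SLOW: %.2fs" % elapsed         # available, but breaching the SLA
    return "OK: %.2fs" % elapsed

def run_check(url, expected_text, threshold_secs=3.0):
    """Execute the check against a live URL (placeholder endpoint)."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        status = resp.status
    return evaluate(status, time.monotonic() - start, body,
                    expected_text, threshold_secs)
```

A scheduler would call `run_check` every few minutes from each robot location and raise an alert on any non-OK result.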
The second method for performing End User Monitoring (EUM) is to install an appliance in the data centre which collects network packets and then performs deep packet inspection. The level of statistics that can be gathered using this approach is very good and exceeds the detail gathered by synthetic monitoring. These types of products can report on individual web page load times for any user in any location. User satisfaction measures such as abandonment rates can be measured and reported. The total transaction latency can be divided into client latency, network latency and server latency to quickly determine which tier is responsible for a performance issue. Depending on where the probe is located, the issue can be isolated to a specific tier in the datacentre. It should be emphasized that the probe is entirely passive; it is attached to a span port on a main switch and just passively captures packets. Once installed, the solution requires almost no maintenance. There are no scripts to update and no robots to keep running.
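The latency breakdown described above can be sketched in a few lines. A passive probe near the server sees only mirrored packets, so it typically infers network latency from the TCP handshake (the gap between the server's SYN-ACK and the client's ACK is one client round trip) and server latency from the gap between the request arriving and the first response byte leaving. The function and field names below are an illustration of the idea, not any vendor's implementation:

```python
# Sketch: splitting total latency from packet timestamps (in seconds)
# observed at a span port next to the server. Field names are illustrative.
def split_latency(syn_ts, synack_ts, ack_ts,
                  request_ts, first_byte_ts, last_byte_ts):
    # SYN-ACK -> ACK is one full round trip to the client: a network estimate.
    network_rtt = ack_ts - synack_ts
    # Request arrival -> first response byte is almost pure server time.
    server_time = first_byte_ts - request_ts
    # Remaining delivery time is dominated by the network and the client.
    delivery_time = last_byte_ts - first_byte_ts
    return {"network_rtt": network_rtt,
            "server": server_time,
            "delivery": delivery_time,
            "total": last_byte_ts - syn_ts}
```

Given these components, the probe can report per-transaction whether the client, the network or the server tier is responsible for a slow response.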
The main disadvantages of this method are the following:
Does not work without users. If there are no users (e.g. at night), then the passive probe cannot detect a problem with the application. Problems may go undetected until the first user logs on in the morning.
Protocol Support. The passive probes perform deep packet inspection, and the range of protocols supported is limited. The technique works best for protocols that have a defined start and end, such as HTTP (or HTTPS). The probes can decode SSL traffic if provided with the decryption key. The probes do not work so well for custom in-house applications that use a custom protocol. Protocols that are generally supported include HTTP, HTTP over SSL, SQL, Tuxedo/Jolt, Citrix and MQ. Depending on the product, support for AJAX may be limited. The solution works best for HTTP.
One of the best solutions in this category is Gomez Data Center Real User Monitoring (formerly Vantage Agentless or Adlex). I have deployed this product at two customer sites and can recommend it highly.
Web Page Instrumentation
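This method (sometimes called browser RUM, e.g. Gomez User Experience Management) works by injecting a small JavaScript beacon into each HTML page; the script runs in the real user's browser and reports timings back to a collector, so measurements reflect the user's actual location, browser and workstation. Below is a minimal, hypothetical sketch of server-side beacon injection; the beacon URL and the timing fields are placeholders, not any product's actual tag:

```python
# Hypothetical browser-RUM beacon: a script injected into each page that
# posts the page load time (from the browser's Navigation Timing data)
# back to a collector endpoint. "/rum/beacon" is a placeholder URL.
BEACON = (
    '<script>window.addEventListener("load",function(){'
    'var t=performance.timing;'
    'var img=new Image();'
    'img.src="/rum/beacon?load="+(t.loadEventStart-t.navigationStart);'
    '});</script>'
)

def inject_beacon(html):
    """Insert the beacon just before </body>, if present."""
    marker = "</body>"
    idx = html.lower().rfind(marker)
    if idx == -1:
        return html + BEACON       # no closing body tag: append as a fallback
    return html[:idx] + BEACON + html[idx:]
```

In practice the injection is done by the web server, a load balancer, or a tag added by the development team; the collector then aggregates the reported timings per page and per user population.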
Java (or .NET) Profiler
Many of the vendors sell Java (or .NET) profilers that are able to track transactions inside the JVM. These tools do not monitor end user experience, but they can help diagnose performance issues related to the code, which usually account for the majority of performance issues. These profilers can trace transactions and determine exactly how much time is spent in each method invoked for each transaction. However, these products are more than simple profilers; if the agents are loaded into all backend JVMs and linked to one management server, then it is possible to track transactions through the various tiers and draw maps of how the backend tiers fit together. Products such as AppDynamics and Compuware's Dynatrace work in this fashion.
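These agents instrument application code automatically (typically via bytecode instrumentation), with no code changes required. The decorator below is only a hand-rolled analogy, written in Python rather than Java, to show the underlying idea: record how long each method takes per call so the slowest methods can be ranked. The function names are illustrative:

```python
# Hand-rolled analogy of what a profiling agent records: cumulative time
# and call count per method, so slow methods can be ranked.
import time
from functools import wraps

TIMINGS = {}   # method name -> (cumulative seconds, call count)

def traced(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            total, calls = TIMINGS.get(fn.__name__, (0.0, 0))
            TIMINGS[fn.__name__] = (total + time.perf_counter() - start,
                                    calls + 1)
    return wrapper

@traced
def lookup_account(account_id):
    time.sleep(0.01)          # stand-in for a slow database call
    return {"id": account_id}
```

A real agent does this for every method on the transaction path and stitches the per-method timings into an end-to-end trace across tiers.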
I have implemented synthetic transactions (both Compuware and BMC solutions) and am not convinced by this technology. Due to the high maintenance cost, these solutions generally fall into disrepair and stop working after a few years.
As noted above, a very good solution for End User Monitoring is Gomez Data Center Real User Monitoring (formerly Vantage Agentless or Adlex). For HTTP-based applications such as internet banking, this product is a great fit and invaluable. Just make sure you investigate support for AJAX (if this is a requirement). The product is relatively easy to integrate into the event management layer (integration can be performed using SNMP), so don’t be concerned about integration. The initial up-front cost may be high, but the TCO will be low; the product requires almost no maintenance. Compuware can perform a virtual POC (they just capture some network traffic), so purchasing the product can be relatively painless too.
Most performance and availability issues originate in the datacentre, so it is important to be able to break down server time into measurements for each tier. Tracing performance issues into the JVM using a Java profiler is standard practice nowadays in DEV environments. However, tracing production transactions through the back-end messaging layer or into the database is a more complex task and requires the capabilities of products such as Compuware’s Dynatrace and AppDynamics.
Most customers implement component-level monitoring (bottom-up monitoring) first, and consider Real User Monitoring (RUM) to be a luxury. Is this view correct? I have been raving about RUM – but would I implement RUM before component-level monitoring?
The answer is definitely Yes. For HTTP applications, I would implement Data Center RUM before any other type of monitoring. Top down monitoring gives you immediate information and alerts about overall application availability and performance.
RUM will tell you about the state of your application now, at the current point in time. However, RUM will not tell you about potential issues that might occur in the future – in one hour or tomorrow. RUM does not monitor things that fill up, such as storage, and it is useless for capacity issues. So customers must implement component-level monitoring as well: both top-down and bottom-up monitoring are essential.
| Vendor | Products |
| --- | --- |
| Compuware | Gomez Synthetic Monitoring; Gomez Data Center RUM; Gomez User Experience Management (formerly Gomez Browser RUM); DynaTrace |
| AppDynamics | AppDynamics Pro (RUM); AppDynamics Pro |
| New Relic | New Relic RUM |
| IBM | Tivoli Composite Application Manager for Transactions (Rational); Tealeaf |
| CA | CA Wily Customer Experience Manager; APM Cloud Monitor |
| BMC | ProactiveNet TMART (licensed from Borland SilkTest/SilkPerformer); Coradiant |
| HP | LoadRunner; HP BAC - RUM |