What is our primary use case?
We use it to check latency for different markets, clients, and brokers across different sites and core locations. Our team uses it internally to measure latency at different points across the whole flow within our applications.
How has it helped my organization?
We can measure the latency. Usually, we measure the client round trip and venue round trip latency for different markets and clients. We also track the internal latency of our applications. So, it has helped us understand how much time is spent within the applications to process any particular order, and how much time it takes for an acknowledgement to come back from the exchange and then back to the client. It shows us where we could reduce time within the applications, e.g., where extra time is spent processing an order. We are currently working on improving this. This is how it has helped my team.
We use the data to analyze how much time we spend within the applications. Based on that, we run multiple analyses and investigations aimed at reducing the latency within our applications.
Corvil helps to correlate individual client or trade desk transactions to infrastructure and venue latency. We use it to track client round trip times as well as venue round trip times: the time from when the order comes in from the client to when it goes out to the exchange, plus the time it takes for an acknowledgement to come back from the exchange. These are the statistics that we are using at the moment.
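To make the round trip figures above concrete, here is a minimal sketch of how they relate to each other. It is not Corvil's API or configuration. The four capture points and all names (`OrderTimestamps`, `latency_breakdown`, the `*_us` fields) are assumptions made for illustration, with timestamps in microseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderTimestamps:
    """Hypothetical capture points for one order, in microseconds."""
    client_in_us: int       # order arrives from the client
    venue_out_us: int       # order leaves the application for the exchange
    venue_ack_in_us: int    # acknowledgement arrives from the exchange
    client_ack_out_us: int  # acknowledgement is sent back to the client


def latency_breakdown(ts: OrderTimestamps) -> dict:
    """Split the client round trip into venue time and internal time."""
    client_rtt = ts.client_ack_out_us - ts.client_in_us
    venue_rtt = ts.venue_ack_in_us - ts.venue_out_us
    return {
        "client_round_trip_us": client_rtt,
        "venue_round_trip_us": venue_rtt,
        # Whatever is not spent at the venue is spent inside our applications.
        "internal_us": client_rtt - venue_rtt,
    }


example = latency_breakdown(OrderTimestamps(1000, 1040, 1540, 1575))
```

In this sketch, the internal figure is the part of the client round trip that the applications themselves are responsible for, which is the part we can work on reducing.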
What is most valuable?
The functionality is its most valuable feature. We can use CLI with the UI for configuring the new monitoring system, which is good.
With the Corvil Stored Data Analyzer module, we can use test data or a set of production data to set up the latency configuration, using fields to correlate messages.
The analytics feature is good. Sometimes, we use the data search functionality to analyze data over a day or so to find out if we are seeing increased latency over a certain period of time. We use the data analysis or data search to find outliers for anything specific, trying to identify:
- Whether we are seeing an increase in latency for orders that have been canceled or replaced?
- What period of time during the day do we see an increase or decrease?
- For any issue, how much was the latency, and whether it increased or decreased?
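The kind of outlier search described in the questions above can be sketched offline as follows. This is only an illustration on exported samples, not how Corvil's data search works internally. The sample layout `(hour, msg_type, latency_us)`, the function name `find_outliers`, and the "three times the group median" threshold are all assumptions.

```python
from collections import defaultdict
from statistics import median


def find_outliers(samples, factor=3.0):
    """Flag samples whose latency exceeds factor x the median of their group.

    samples: iterable of (hour, msg_type, latency_us) tuples, where msg_type
    is a label such as "cancel" or "replace" (hypothetical export format).
    Returns a list of (hour, msg_type, latency_us, group_median).
    """
    groups = defaultdict(list)
    for hour, msg_type, latency_us in samples:
        groups[(hour, msg_type)].append(latency_us)

    outliers = []
    for (hour, msg_type), values in sorted(groups.items()):
        baseline = median(values)
        for value in values:
            if value > factor * baseline:
                outliers.append((hour, msg_type, value, baseline))
    return outliers


samples = [
    (9, "cancel", 100), (9, "cancel", 110), (9, "cancel", 105), (9, "cancel", 900),
    (10, "replace", 200), (10, "replace", 210),
]
flagged = find_outliers(samples)
```

Grouping by hour and message type answers the first two questions (which order type, what time of day), and the returned baseline shows how far the latency moved.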
What needs improvement?
Sometimes, when you are making changes and saving a configuration, something is missing. An error may come up, or sometimes there is no error at all. The errors that do appear are not very specific about the actual problem. You see the error and have to figure out what it might be related to, scanning the whole session up and down to find what is missing, and then adding it. More detail in the error, such as, "This is missing...," or "You need to add this particular...," for the session to work would be helpful. We would like more specific errors when a particular configuration is missing.
I have seen errors where the CNE and the CMC haven't synced because something that was in the CNE was missing in the CMC. We got an error along the lines of, "There was some type of error because the CMC is not in sync with CNE," but it wasn't clear what was missing. I had to go to the session discovery site, where I found that certain channels had been discovered in the CNE but were not in the CMC. Then, we had to sync it. However, the error wasn't explicit about what was missing in the CMC that was present in the CNE.
For the FIX protocol, maybe there could be built-in configurations for signatures and decoders. Also, for certain newer protocols, we would like to be able to add the signatures within the decoders themselves.
For how long have I used the solution?
Less than one year.
What do I think about the stability of the solution?
The stability has been good. We haven't had many updates on the platform version that we are on (9.2). It has been pretty stable. There haven't been any outages.
What do I think about the scalability of the solution?
The scalability is good. We can use it across multiple applications for analyzing the different data. We have it working with eight to ten applications.
The whole of the desk relies on it (around ten people) along with ten people from our support team and a few from the dev team. Everyone within the team is using it (25 to 30 people).
How are customer service and technical support?
The technical support is good. Sometimes, we can get an answer within the same day. If more analysis is required, they provide updates along the way. When an issue is more complex, it takes some time. Usually, if there are any questions or issues, we send them to the support team, and they immediately start looking into it and provide updates.
If you previously used a different solution, which one did you use and why did you switch?
To my knowledge, we were not using a prior solution. Before Corvil, there were manual scripts which were running on the servers. This is not the best solution. So, we went to Corvil.
Previously, it was more of a manual thing, just scripting and getting the stats.
We had so many applications, and the flow was getting larger every day. Therefore, it made sense to have a better tool which was more automated, like Corvil.
How was the initial setup?
The initial setup was a little complex, such as understanding how the whole flow has to be set up within Corvil and what kind of measurements have to be added. It takes time to set up the latency configuration. Whenever there is a new protocol, we have to configure the latency setup for it. For example, if you want to set up correlation for the FIX protocol or any other protocol, you have to add a new configuration.
If this could be built into the decoders, it would help. This is the most complex part of setting up Corvil. Otherwise, setting up new sessions and everything else is fairly simple.
It took around six months to set up monitoring from the time the order comes in from the client, to when it reaches the application, to when it goes out to the exchange. Now, we have incorporated a more complex flow. Since we now have more understanding of and experience with the tool, it takes less time to set up a more complex flow: around a month or so. So, the time frame depends on the types of environments for the different flows and on our experience with and understanding of these flows.
Our implementation strategy was to understand the requirements, the different points in the flow, and the subnet groups for capturing the traffic, along with spanning the correct traffic to the CNEs, adding the class maps, etc. We first put together all of the requirements:
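For readers unfamiliar with what "correlation" means here, the idea can be sketched as matching a request to its response on a shared field. The example below is an assumption-laden illustration, not a Corvil configuration: it pairs FIX New Order Single messages (tag 35=D) with Execution Reports (35=8) by ClOrdID (tag 11). The function names and the `(capture_time_us, raw_message)` input format are hypothetical.

```python
SOH = "\x01"  # standard FIX field delimiter


def parse_fix(raw: str) -> dict:
    """Parse a tag=value FIX message into a {tag: value} dict."""
    fields = {}
    for pair in raw.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[tag] = value
    return fields


def correlate(captured):
    """Match orders to their first execution report by ClOrdID (tag 11).

    captured: iterable of (capture_time_us, raw_fix_message) in time order.
    Returns {ClOrdID: latency_us} for every matched pair.
    """
    pending = {}    # ClOrdID -> time the order was seen
    latencies = {}
    for capture_time_us, raw in captured:
        msg = parse_fix(raw)
        cl_ord_id = msg.get("11")
        if cl_ord_id is None:
            continue
        if msg.get("35") == "D":
            pending[cl_ord_id] = capture_time_us
        elif msg.get("35") == "8" and cl_ord_id in pending:
            latencies[cl_ord_id] = capture_time_us - pending.pop(cl_ord_id)
    return latencies


captured = [
    (100, "8=FIX.4.2\x0135=D\x0111=A1\x01"),
    (150, "8=FIX.4.2\x0135=D\x0111=A2\x01"),
    (400, "8=FIX.4.2\x0135=8\x0111=A1\x01"),
    (900, "8=FIX.4.2\x0135=8\x0111=A2\x01"),
]
result = correlate(captured)
```

Each new protocol needs its own choice of message types and correlation field, which is why this step takes time whenever a protocol is added.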
- What kind of subnet groups do we require?
- What kind of class maps would be required?
- What are the different protocols that need to be used for a particular flow?
Once that was all in place, we worked on putting together the configuration and sessions, laying down all the requirements, such as what kind of features would be required to set up the whole flow. This makes things easier. If you have everything in place and have written down what may be required, then you can just fill in the blanks.
When the process was complex, then it took more time. When it was less complex, it took a day or two. This took two people from our front office support team.
After it is deployed, there will be four to five people who will be monitoring Corvil. It is a very high-level tool. There are hourly reports which are generated. We just check whether we are getting the right stats. If something is missing, then we just go back and see what is required to be added, or what needs to be further investigated. Then, we raise a backlog ticket for it.
What was our ROI?
As I am working more with Corvil, it looks like it is improving diagnostic times.
When we go live with a new client or a new exchange flow, it helps to know the latency on day one. Then, we can compare that latency against any other flow or other statistics, e.g., to see whether latency has increased or performance has improved. It definitely helps in doing this.
What other advice do I have?
Corvil is really useful. If you want to produce statistics for your application across different platforms, then I would definitely recommend it. It is easy to maintain. You can do a proper analysis around it. The support services are good. You can reach out to people, and they're pretty helpful. Once you start working on it and getting the experience, then it becomes easier to configure new sessions or configurations around different flows.
If more time is spent on the venue round trip, there is very little control that we have, because there might be an increase in latency at the venue's site which we don't have visibility of. Therefore, we can discuss with the exchange or market why there was an increase in latency at a particular time, and whether there were any changes at their site or if something was different. If we have those statistics, then we can go to the market or the client, provide them with the statistics, and talk about them in more detail. For example, why was there an increase or decrease in a particular latency on a certain day?
If we have the venue round trip time from the time the order leaves the application, we can go back to the exchange, discuss this, and say, "Why has it taken so much time?" There have been scenarios where the exchange or market came back saying they made some type of configuration change at their site during that particular time, and that's why there was an increase in latency. Or, they needed to make changes at their site to improve the latency. This helps in our venue performance analysis.
For different venues, depending upon the application, we have different requirements. For example, for certain applications, we target the time that it takes for the acknowledgement to come back or for the request to go to the exchange; it should be seamless. So, we use different statistics for different markets. Based on that, we can work with different markets or exchanges to match the timings. Or, we use a different routing logic within our application to be able to process the order at the same time. Based on this analysis and the statistics that we have, we can match or change the routing logic that we have. There have been a few scenarios where we have done this.
We are on version 9.2 of Corvil and plan to upgrade to 9.4 over the coming weeks. We are building the roadmap for this year:
- What is the plan to use it across different applications?
- How many more devices will we need?
There are plans to expand it across different applications. There are certain applications which we haven't yet moved onto Corvil, but there are plans to onboard them in the future. We are in the process of building those up. Going forward, we definitely see our usage increasing. Every day, we have something which we need to analyze using Corvil. There is a higher dependency on it for a lot of things. As a future roadmap item, we want to look at tracking a smarter routing flow for different flows in different applications.
Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.