For the last several months I have been designing and implementing a statistics/analytics system for a game on a social network. The game has many concurrent users, and together they produce a huge number of events. One standard requirement for an analytics system is fast queries over the collected data. Logically, I used OLAP cubes to collect all the kinds of events our team needs to analyze. Technically, the best fit in our case is column-based storage, and I use Infobright. A regular RDBMS (SQL) storage or a document-based DB like MongoDB is not enough here because of performance; those are better suited to OLTP than to MGD OLAP. On the other hand, a big gun like a Hadoop-based solution would be overkill. So Infobright fits our case exactly. It was one of the best decisions I have made as a software architect in the last several months:
- As it’s a pure OLAP solution, I’m able to implement any ETL/Storage/Query scheme;
- As Infobright is column-based storage, all my queries, even very sophisticated ones over huge recordsets, have extremely short execution times;
- As all the heavy functionality like aggregation/filtering is hidden in Infobright’s internals, I can concentrate on my business task and design/implement/add a new module/scheme/query very quickly.
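
To give a rough feel for why a column layout suits this kind of query, here is a minimal Python sketch. It is not how Infobright works internally and it is not our real schema: the event fields (`user_id`, `event`, `day`, `coins`), the sample values, and the helper names `to_columns` and `coins_per_day` are all made up for illustration. The only point is that an aggregation touches just the columns it needs instead of every full record.

```python
from collections import defaultdict

# Hypothetical game events; field names and values are invented for illustration.
ROWS = [
    {"user_id": 1, "event": "login", "day": "2012-05-01", "coins": 0},
    {"user_id": 2, "event": "purchase", "day": "2012-05-01", "coins": 50},
    {"user_id": 1, "event": "purchase", "day": "2012-05-02", "coins": 20},
    {"user_id": 3, "event": "login", "day": "2012-05-02", "coins": 0},
    {"user_id": 2, "event": "purchase", "day": "2012-05-02", "coins": 30},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per column."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def coins_per_day(columns, event_type):
    """Sum 'coins' grouped by 'day' for one event type.

    Only the 'event', 'day' and 'coins' columns are read;
    'user_id' is never touched.
    """
    totals = defaultdict(int)
    for event, day, coins in zip(columns["event"], columns["day"], columns["coins"]):
        if event == event_type:
            totals[day] += coins
    return dict(totals)


COLUMNS = to_columns(ROWS)
PURCHASES = coins_per_day(COLUMNS, "purchase")
```

In a real column store the same idea is combined with compression and metadata about each column, which is the part I am happy to leave to the storage engine.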