Infobright DB Review

If you need a real big data solution, look for a distributed solution that actually has a proven track record.

What is our primary use case?

A big data columnar database with steady, frequent writes and sporadic reads using heavily aggregated queries. Queries would span a few billion rows.

What is most valuable?

A valuable feature was the use of a columnar database for large, ever-growing datasets. It also has an impressive smart grid query feature that makes aggregate queries across millions of rows very fast.

How has it helped my organization?

When working properly, the ability to continually insert large datasets, millions of records per minute, while simultaneously querying the same data tables, was very impressive. But it was almost never able to run continuously without errors.

What needs improvement?

This version of Infobright has zero support for distributed scalability. The internal smart grid employed for each table has a major flaw: disk space cannot be reclaimed until 2GB of data is reached at the column level.

This makes usage in a big-data scenario impossible. You can delete as many records from a database table as you want, but unless the 2GB aggregate size threshold has been reached for some of the columns in the table, no reduction in disk space usage will occur.

Only the data from the columns that reached 2GB will actually shrink. Columns below 2GB in size keep occupying the same space on disk.
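To make the effect concrete, here is a toy model of the behavior as I described it above. It is not Infobright's actual reclamation algorithm; the function name, the column names, the 2 GiB reading of "2GB", and the assumption that reclaimed space is proportional to the deleted fraction are all mine, for illustration only.

```python
# Toy model of the per-column reclamation threshold described in this review.
# Assumption: "2GB" is treated as 2 GiB, and a column that qualifies gives
# back space in proportion to the fraction of rows deleted.
THRESHOLD_BYTES = 2 * 1024**3


def reclaimed_bytes(column_sizes, deleted_fraction):
    """Return the bytes freed per column after deleting a fraction of rows.

    column_sizes maps a (hypothetical) column name to its on-disk size in bytes.
    Columns below the threshold free nothing, no matter how much is deleted.
    """
    if not 0.0 <= deleted_fraction <= 1.0:
        raise ValueError("deleted_fraction must be between 0 and 1")
    reclaimed = {}
    for name, size in column_sizes.items():
        if size >= THRESHOLD_BYTES:
            reclaimed[name] = int(size * deleted_fraction)
        else:
            reclaimed[name] = 0
    return reclaimed


GIB = 1024**3
columns = {"payload": 5 * GIB, "status_flag": 100 * 1024**2}
freed = reclaimed_bytes(columns, 0.5)
```

In this sketch, deleting half the rows frees space only for the wide `payload` column; the small `status_flag` column stays on disk at full size, which is the situation our field engineers kept running into.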

I spent countless hours trying to find some workaround for this. I have nightmares of my e-mail inbox full of unsolvable questions about data size reduction from our field engineers.

What do I think about the stability of the solution?

We experienced major issues with stability. Looking back, this may be because we chose to go with the PostgreSQL version, as opposed to the more tried and true MySQL flavor.

The database frequently hit error conditions and simply shut itself down. We actually had calls as often as three times a week with Infobright personnel, helping them debug their product as they tried to provide hot-fix patches for us.

What do I think about the scalability of the solution?

Scalability was nonexistent with this version of Infobright. It existed on one big database server, single-point-of-failure style. We ended up implementing our own sharding client to hash-shard our inbound data to multiple instances of Infobright.
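For readers unfamiliar with the approach, a minimal sketch of client-side hash sharding looks roughly like the following. This is not our production client; the instance names, the `device_id` key field, and the choice of MD5 as the hash are assumptions made for illustration.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    A cryptographic digest is used instead of Python's built-in hash(),
    which is randomized per process and would route the same key differently
    after a restart.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardRouter:
    """Routes inbound rows to one of several independent database instances."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self.instances = list(instances)

    def route(self, key: str) -> str:
        """Return the instance name responsible for the given key."""
        return self.instances[shard_for(key, len(self.instances))]

    def partition(self, rows, key_field):
        """Group rows into per-instance batches, ready for bulk loading."""
        batches = {name: [] for name in self.instances}
        for row in rows:
            batches[self.route(str(row[key_field]))].append(row)
        return batches


# Hypothetical instance names and rows.
router = ShardRouter(["infobright-0", "infobright-1", "infobright-2"])
rows = [{"device_id": f"dev-{i}", "value": i} for i in range(100)]
batches = router.partition(rows, "device_id")
```

The trade-off with plain modulo hashing is that adding or removing an instance remaps most keys, and any query spanning shards has to be fanned out and merged by the client.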

How is customer service and technical support?

The level of technical support was probably about 2/10. While the field rep at Infobright was very enthusiastic, their off-shore developer team was never reachable. I'm not sure they even had any real technical or developer-level staff on the payroll in 2015 and 2016.

Which solutions did we use previously?

The Infobright database was employed as part of a new greenfield product we were building. We tried several times to migrate to a different solution.

We were successful in moving a portion of the geographically searchable data into Elasticsearch, and then used Infobright only for storage of the fine-grained data.

How was the initial setup?

The initial setup was always a pain because of Infobright's special FTP server, from which we had to pull the RPM bundles. We then had to apply a license file.

Eventually, I got it down to about an hour of time that one of my guys would have to burn in order to install a newly released version.

What about the implementation team?

This was an entirely in-house implementation.

What was our ROI?

After all the rework to our product to remove as much reliance on Infobright as possible, and the extra hardware costs we had to absorb, there was definitely a negative return on investment.

What's my experience with pricing, setup cost, and licensing?

Our pricing was based on server instances and it was actually very cheap compared to Oracle. I guess you get what you pay for.

Which other solutions did I evaluate?

I inherited this product when I came on board. I was told by a well-respected "database architect" in the company that this product could handle everything and we were safe to build on top of it.

What other advice do I have?

Do not use the Infobright IEE database. It is a fast, standalone columnar database masquerading as a big data solution. If you need a real big data solution, look for a distributed solution that actually has a proven track record.

Disclosure: I am a real user, and this review is based on my own experience and opinions.