Monday, July 7, 2008

Column stores seem more and more popular

Last year I did a small survey of column stores, mainly Vertica. However, the Vertica trial requires human verification, and it's hard to get approved with a graduate student profile. The C-Store paper is charming, and Vertica goes further. With a column store rather than a row store, compression can be applied more widely and more aggressively, since values in the same column are usually similar or follow regular patterns, which is a good property for compression. Retrieval gets faster too: because each column is stored together, only the columns a query needs are read, and the other columns stay untouched. Most OLAP apps are read-dominant, so column techniques suit this case well, though insertion, modification and deletion take more time in a column store, since one row is spread across every column. Sybase IQ used this technology long ago, but it had no big influence on the industry.
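To make the compression and retrieval points concrete, here is a minimal sketch in Python. It is not how Vertica, C-Store or Sybase IQ actually work. The sample table, the column names and the helpers `to_columns`, `rle_encode`, `rle_decode` and `append_row` are all made up for illustration, and run-length encoding is just one simple compression scheme picked as an example.

```python
from itertools import groupby

# Hypothetical sample table, stored row by row (made-up data).
names = ("day", "country", "amount")
rows = [
    ("2008-07-01", "US", 10),
    ("2008-07-01", "US", 12),
    ("2008-07-01", "CN", 7),
    ("2008-07-02", "CN", 7),
    ("2008-07-02", "CN", 9),
]

def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}

def rle_encode(values):
    """Run-length encode a list as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]

def append_row(columns, names, row):
    """Inserting one row means one write per column."""
    for name, value in zip(names, row):
        columns[name].append(value)

columns = to_columns(rows, names)

# Similar neighbouring values collapse into short runs.
compressed = {name: rle_encode(col) for name, col in columns.items()}

# A query such as "sum of amount" reads only the amount column
# and can even work on the compressed runs directly.
total = sum(value * count for value, count in compressed["amount"])

print(compressed["day"])
print(total)
```

In this toy data the `day` column shrinks from five values to two runs, while `append_row` shows why writes are the weak side: a single new row has to touch every column.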

"sofa", one size fits all, is a question, or a marketing strategy, raised by Mike Stonebraker at VLDB last year to advertise his Vertica, and it made the industry take the question seriously again. The key idea of sofa is that modern relational databases have become huge, and the SQL specification has also absorbed lots of rarely used, vendor-specific features that not every application needs. The DB research field is a big fan of storing every new data model in a relational database: XML in a relational database, objects in a relational database, even RDF and so on. The main reason is to reuse all the techniques already built for relational databases. However, how much do we pay for these one-size-fits-all solutions? You could say sofa is just a marketing strategy by Vertica, since businesses need stable and mature software.

At last year's VLDB, the best paper used a column store to store RDF data. It was by dna from MIT, a key contributor to both Vertica and MIT's C-Store. The idea is simple, but pretty charming. The most influential ideas are always simple. Behind facts like these, maybe we can learn something: how to choose the right time, and how to use old ideas to develop new techniques when the environment changes. Nod.
