Wednesday, June 18, 2008

Using iconv & libcharguess to solve international encoding problems

The last post said that Ruby on Ubuntu was frustrating. Still, there's an old saying that if it is gold, it will shine. Using iconv and libcharguess really solved the problem. The steps are as follows:

1. Get libcharguess from SourceForge.
2. Go to its directory and run ./configure
3. Run make (the actual install happens in step 5).
4. Run make check; if it outputs nothing, you're done.
5. Run sudo make install
6. Get charguess from ruby-lang.
7. Go to its directory and copy libcharguess-dir/cpp/charguess.h into the unpacked charguess directory.
8. Run ruby -r mkmf extconf.rb; if mkmf is missing, refer to the last post.
9. Run ruby sample.rb; if something ending with -JP is printed, we're done.

With both libraries installed, 'charguess' can be required and used directly (iconv is required too, since it is needed for the conversion):
require 'charguess'
require 'iconv'

somestring_encoding = CharGuess::guess somestring

If somestring is probably Latin-1, somestring_encoding will be nil; otherwise it will be the guessed encoding. Pass the target encoding and the guessed encoding, in that order, to the Iconv constructor:

converted_string = Iconv.new(target_encoding, somestring_encoding).iconv(somestring)

The string then converts cleanly. Cool.
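
One thing the snippet above skips is the nil case: Iconv needs a source encoding, so a nil guess has to be replaced with something. Here is a minimal sketch of how that could be handled. The helper name convert_with_guess and its parameters are my own invention, not part of charguess or iconv, and falling back to ISO-8859-1 is just an assumption based on the Latin-1 behaviour described above. The guesser and the converter are passed in as callables, so the helper itself does not depend on either library.

```ruby
# Hypothetical helper (not part of charguess or iconv).
#   str      - the string to convert
#   target   - name of the encoding we want, e.g. "UTF-8"
#   fallback - encoding to assume when the guesser returns nil (an assumption)
#   guess    - callable: takes a string, returns an encoding name or nil
#   convert  - callable: takes (target, source, string), returns the converted string
def convert_with_guess(str, target, fallback, guess, convert)
  # Use the guessed encoding, or the fallback when the guess is nil.
  source = guess.call(str) || fallback

  # Nothing to do when the string is already in the target encoding
  # (encoding names are compared case-insensitively).
  return str if source.casecmp(target) == 0

  convert.call(target, source, str)
end
```

With the two libraries from this post, the wiring would presumably look like this: pass CharGuess.method(:guess) as guess, and lambda { |to, from, s| Iconv.new(to, from).iconv(s) } as convert.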
