Thursday, July 17, 2008

Another way to handle international encoding detection and conversion

From this thread: http://www.ruby-lang.org.cn/forums/viewthread.php?tid=601
there's another solution besides the libcharguess one.

Use rchardet instead. As the original post says, it is a Ruby port of Python's character encoding detector (chardet).

The libcharguess project does not provide any binaries on SourceForge, and its build files assume a Linux environment.
If you work on Windows, installing Cygwin just for a quick demo program takes too long, and without compiling libcharguess you cannot use the Ruby binding for it.

From this point of view, rchardet is a good solution. Of course, it is still used in combination with iconv.

1. install rchardet, very trivial
gem install rchardet

that's it

2. use it
require 'rchardet'
str = ...
# detect returns a hash with 'encoding' and 'confidence' keys
puts CharDet.detect(str)['encoding']

The detected encoding can then be passed to iconv; see the post http://swimminginbits.blogspot.com/2008/06/use-iconv-libcharguess-solving.html
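
To make the detect-then-convert step concrete, here is a minimal sketch. It swaps iconv for Ruby's built-in String#encode, since Iconv was removed from the standard library in Ruby 2.0. The helper name to_utf8 and the min_confidence threshold are hypothetical, not part of rchardet, and the detection hash is hard-coded here as a stand-in for what CharDet.detect(raw) would return.

```ruby
# Hypothetical helper (not part of rchardet): convert `raw` to UTF-8 using a
# detection result shaped like {'encoding' => 'ISO-8859-1', 'confidence' => 0.9}.
def to_utf8(raw, detection, min_confidence = 0.5)
  name = detection && detection['encoding']
  confidence = detection ? detection['confidence'].to_f : 0.0
  # Refuse to guess when nothing was detected or the detector is unsure.
  return nil if name.nil? || confidence < min_confidence

  # Label the bytes with the detected encoding, then transcode to UTF-8.
  raw.dup.force_encoding(name).encode('UTF-8', invalid: :replace, undef: :replace)
rescue ArgumentError, EncodingError
  # Ruby does not know this encoding name, or has no converter for it.
  nil
end

# With the gem installed, the hash would come from:
#   require 'rchardet'
#   detection = CharDet.detect(raw)
raw = "caf\xE9".b  # "café" as ISO-8859-1 bytes
detection = { 'encoding' => 'ISO-8859-1', 'confidence' => 0.9 }  # assumed stand-in
puts to_utf8(raw, detection)
```

Returning nil on low confidence or an unknown encoding name leaves it to the caller to decide on a fallback instead of silently producing garbage.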
