recode.pl - Converts a database from one encoding (or multiple encodings) to UTF-8.
contrib/recode.pl [--guess [--show-failures]] [--charset=iso-8859-2] [--overrides=file_name]

  --dry-run        Don't modify the database.
  --charset        Primary charset your data is currently in. This can be
                   omitted if you use --guess.
  --guess          Try to guess the charset of the data.
  --show-failures  If we fail to guess, show where we failed.
  --overrides      Specify a file containing overrides. See --help for more info.
  --help           Display detailed help.

If you aren't sure what to do, try:

  contrib/recode.pl --guess --charset=cp1252
With --dry-run, recode.pl does not modify the database; it only prints what the conversions would be.
In this mode recode.pl prints a Key for each item. You can use that Key in the overrides file, described below.
If your database is in multiple different encodings, specify --guess and recode.pl will do its best to determine the original charset of each piece of data. The detection is usually very reliable.
If recode.pl cannot guess the charset, it will leave the data alone, unless you've specified --charset.
If you do not specify --guess, your database is converted from the character set given by --charset into UTF-8.
If you have specified --guess, recode.pl uses the --charset value as a fallback: when it cannot guess the charset of a particular piece of data, it assumes that the data is in this charset and converts it from this charset to UTF-8.
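The interaction between --guess and --charset can be sketched as follows. This is a minimal Python illustration, not recode.pl's own Perl code: the function names `recode` and `guess_utf8_only` are hypothetical, and the toy detector only recognises valid UTF-8, whereas the real script's detection is more capable.

```python
from typing import Callable, Optional, Tuple

Guesser = Callable[[bytes], Optional[str]]


def guess_utf8_only(raw: bytes) -> Optional[str]:
    """Toy detector (hypothetical): recognises valid UTF-8 and nothing else."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "utf-8"


def recode(raw: bytes,
           guesser: Optional[Guesser] = None,
           fallback: Optional[str] = None) -> Tuple[bytes, Optional[str]]:
    """Return (data, charset_used); charset_used is None if data was left alone."""
    # Without a guesser (no --guess), the fallback (--charset) is used directly.
    charset = guesser(raw) if guesser is not None else None
    if charset is None:
        # Guessing failed or was not requested: fall back to --charset, if any.
        charset = fallback
    if charset is None:
        # No guess and no --charset: leave the data untouched.
        return raw, None
    return raw.decode(charset).encode("utf-8"), charset
```

For example, `recode(b"caf\xe9", guess_utf8_only, "cp1252")` falls back to cp1252 because the bytes are not valid UTF-8, while the same call without a fallback returns the bytes unchanged.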
The charset name given to --charset must be a charset that is known to Perl's Encode module. To see a list of available charsets, run:
perl -MEncode -e 'print join("\n", Encode->encodings(":all"))'
With --show-failures, if --guess fails to guess a charset, recode.pl prints out the data it failed on.
The --overrides file lets you specify encodings that take precedence over what --guess would choose. The file is a series of lines. Each line should start with the Key from --dry-run, followed by a space, followed by the encoding you'd like to use.
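To make the file format concrete, here is a small Python sketch of how such a file could be read. It is an illustration under assumptions, not recode.pl's actual parser: the function name `parse_overrides` is hypothetical, the sample keys are invented placeholders (real keys must be copied from --dry-run output), blank lines are skipped, and the encoding is taken to be the last space-separated token on the line.

```python
from typing import Dict, Iterable


def parse_overrides(lines: Iterable[str]) -> Dict[str, str]:
    """Map each Key to the encoding named on the same line."""
    overrides: Dict[str, str] = {}
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split at the last space: everything before it is the Key,
        # everything after it is the encoding name.
        key, sep, encoding = line.rpartition(" ")
        if not sep or not key.strip():
            raise ValueError(f"line {number}: expected '<Key> <encoding>'")
        overrides[key.strip()] = encoding
    return overrides


# Hypothetical keys, for illustration only.
sample = [
    "bugs.short_desc:42 iso-8859-2",
    "",
    "longdescs.thetext:7 cp1252",
]
print(parse_overrides(sample))
```

Splitting at the last space keeps the sketch working even if a Key were to contain spaces, at the cost of assuming that encoding names never do.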