Character encoding conversion on Linux -- the iconv command


I. Getting help and the option list for iconv

[root@CentOS-78 covers]# iconv --help
Usage: iconv [OPTION...] [FILE...]
Convert encoding of given files from one encoding to another.

 Input/Output format specification:
  -f, --from-code=NAME       encoding of original text
  -t, --to-code=NAME         encoding for output

 Information:
  -l, --list                 list all known coded character sets

 Output control:
  -c                         omit invalid characters from output
  -o, --output=FILE          output file
  -s, --silent               suppress warnings
      --verbose              print progress information

  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.

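As a small illustration of the -c option listed above, the sketch below converts UTF-8 text to ASCII and drops whatever has no ASCII equivalent. The helper name to_ascii_lossy is made up for this example, and the behaviour described assumes the glibc iconv whose help text is shown here.

```shell
# Hypothetical helper: convert UTF-8 on stdin to ASCII, dropping
# characters that cannot be represented in the target encoding.
to_ascii_lossy() {
    # -c omits unconvertible characters from the output. glibc iconv may
    # still exit non-zero to signal the loss, so the status is ignored here.
    iconv -c -f UTF-8 -t ASCII || true
}

# \303\251 is the UTF-8 encoding of "é", which ASCII cannot represent
printf 'caf\303\251\n' | to_ascii_lossy
```

Without -c, the same conversion stops with an error at the first character it cannot convert, so -c is best kept for cases where losing characters is acceptable.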
II. Listing the character encodings iconv supports

[root@CentOS-78 covers]# iconv -l
The following list contain all the coded character sets known.  This does
not necessarily mean that all combinations of these names can be used for
the FROM and TO command line parameters.  One coded character set can be
listed with several different names (aliases).

  437, 500, 500V1, 850, 851, 852, 855, 856, 857, 860, 861, 862, 863, 864, 865,
  866, 866NAV, 869, 874, 904, 1026, 1046, 1047, 8859_1, 8859_2, 8859_3, 8859_4,
  8859_5, 8859_6, 8859_7, 8859_8, 8859_9, 10646-1:1993, 10646-1:1993/UCS4,
  ANSI_X3.4-1968, ANSI_X3.4-1986, ANSI_X3.4, ANSI_X3.110-1983, ANSI_X3.110,
  ARABIC, ARABIC7, ARMSCII-8, ASCII, ASMO-708, ASMO_449, BALTIC, BIG-5,
  BIG-FIVE, BIG5-HKSCS, BIG5, BIG5HKSCS, BIGFIVE, BRF, BS_4730, CA, CN-BIG5,
  CN-GB, CN, CP-AR, CP-GR, CP-HU, CP037, CP038, CP273, CP274, CP275, CP278,
  CP280, CP281, CP282, CP284, CP285, CP290, CP297, CP367, CP420, CP423, CP424,
  CP437, CP500, CP737, CP775, CP803, CP813, CP819, CP850, CP851, CP852, CP855,
  CP856, CP857, CP860, CP861, CP862, CP863, CP864, CP865, CP866, CP866NAV,
  CP868, CP869, CP870, CP871, CP874, CP875, CP880, CP891, CP901, CP902, CP903,
  CP904, CP905, CP912, CP915, CP916, CP918, CP920, CP921, CP922, CP930, CP932,
  CP933, CP935, CP936, CP937, CP939, CP949, CP950, CP1004, CP1008, CP1025,
  CP1026, CP1046, CP1047, CP1070, CP1079, CP1081, CP1084, CP1089, CP1097,
  CP1112, CP1122, CP1123, CP1124, CP1125, CP1129, CP1130, CP1132, CP1133,
  CP1137, CP1140, CP1141, CP1142, CP1143, CP1144, CP1145, CP1146, CP1147,
  CP1148, CP1149, CP1153, CP1154, CP1155, CP1156, CP1157, CP1158, CP1160,
  CP1161, CP1162, CP1163, CP1164, CP1166, CP1167, CP1250, CP1251, CP1252,
  CP1253, CP1254, CP1255, CP1256, CP1257, CP1258, CP1282, CP1361, CP1364,
  CP1371, CP1388, CP1390, CP1399, CP4517, CP4899, CP4909, CP4971, CP5347,
  CP9030, CP9066, CP9448, CP10007, CP12712, CP16804, CPIBM861, CSA7-1, CSA7-2,
  CSASCII, CSA_T500-1983, CSA_T500, CSA_Z243.4-1985-1, CSA_Z243.4-1985-2,
  CSA_Z243.419851, CSA_Z243.419852, CSDECMCS, CSEBCDICATDE, CSEBCDICATDEA,
  CSEBCDICCAFR, CSEBCDICDKNO, CSEBCDICDKNOA, CSEBCDICES, CSEBCDICESA, ... ...

To look up one particular family of supported encodings, filter the list with grep.
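The grep filter mentioned above might look like the following sketch. The helper name find_charset is invented for this example, and which names actually match depends on the iconv build installed on the system.

```shell
# Hypothetical helper: print the lines of `iconv -l` that match a pattern.
find_charset() {
    # -i makes the match case-insensitive; "--" protects patterns that start with "-"
    iconv -l | grep -i -- "$1"
}

find_charset gb      # encodings whose name contains "GB"
find_charset utf     # the UTF family
```

Because one encoding can be listed under several aliases, a single match may print a whole group of equivalent names.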

III. Usage examples

1. File conversion

iconv -f ISO88592 -t UTF8 < input.txt > output.txt

ISO88592: the source encoding, i.e. the encoding input.txt is currently in

UTF8: the target encoding to convert to
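To check that a conversion like the one above did what was intended, it can help to run it on a tiny sample and look at the resulting bytes. The sketch below is only an illustration: the scratch directory and the sample word are made up, and it uses the -o option from the help output instead of shell redirection.

```shell
# Work in a scratch directory so no real file is overwritten
workdir=$(mktemp -d)

# Sample input: "złoty" in ISO-8859-2, where byte 0xB3 (octal 263) is "ł"
printf 'z\263oty\n' > "$workdir/input.txt"

# Same conversion as above, written with canonical encoding names and -o
iconv -f ISO-8859-2 -t UTF-8 -o "$workdir/output.txt" "$workdir/input.txt"

# Hex dump of the result: "ł" should now be the two UTF-8 bytes c5 82
od -An -tx1 "$workdir/output.txt"
```

Writing to a separate output file matters: redirecting with "> input.txt" onto the file being read would make the shell truncate it before iconv reads anything.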

2. Converting web pages

curl -s http://www.dreamdu.com/ | iconv -f utf8 -t gbk
curl -s http://www.google.com.hk/ | iconv -f big5 -t gbk
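These pipelines work because iconv reads standard input when no file is given, and the -f value has to match the encoding the page is really served in. The offline sketch below uses printf as a stand-in for curl, so the bytes and the helper name gbk_to_utf8 are assumptions made for illustration only.

```shell
# Hypothetical helper: convert a GBK byte stream on stdin to UTF-8 on stdout
gbk_to_utf8() {
    iconv -f GBK -t UTF-8
}

# printf stands in for `curl -s URL`: it emits "中文" as GBK bytes (d6 d0 ce c4)
printf '\326\320\316\304\n' | gbk_to_utf8

# The same stream as hex: the UTF-8 form is e4 b8 ad e6 96 87, plus the newline
printf '\326\320\316\304\n' | gbk_to_utf8 | od -An -tx1
```

If the declared source encoding is wrong, iconv either fails on the first invalid byte sequence or silently produces the wrong characters, so it is worth checking the page's charset before picking -f.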

============================================

References:

        http://linux.die.net/man/1/iconv

        http://codingstandards.iteye.com/blog/807077