Character Sets
Summary
This is the list of character sets (type=java.nio.charset.Charset) that are available here.
Also check the list by code page number.
For help figuring out which character set a file is using, try the Reverse Charset Mapping Tool.
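A list of this kind can be produced from the standard `java.nio.charset.Charset` API. The following is a minimal sketch, not necessarily how this page was generated: the class name `CharsetTable` and the helper `row` are hypothetical, and mapping `canEncode()` returning false to the "(no encoder)" note is an assumption.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

final class CharsetTable {
    /** Formats one table row: canonical name, a note if the charset cannot encode, then its aliases. */
    static String row(Charset cs) {
        // Assumption: "(no encoder)" corresponds to canEncode() == false (decode-only charsets).
        String note = cs.canEncode() ? "" : "(no encoder)";
        String aliases = String.join(", ", cs.aliases());
        return "| " + cs.name() + " | " + note + " | " + aliases + " |";
    }

    public static void main(String[] args) {
        // availableCharsets() maps canonical names to charsets, sorted case-insensitively by name.
        SortedMap<String, Charset> all = Charset.availableCharsets();
        System.out.println("| Name | Notes | Aliases |");
        System.out.println("|---|---|---|");
        for (Charset cs : all.values()) {
            System.out.println(row(cs));
        }
        System.out.println("Count: " + all.size());
    }
}
```

The set of charsets and aliases printed depends on the JVM and any installed charset providers, so the output on another machine will likely differ from the list on this page.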
Detail
| Name | Notes | Aliases |
|---|---|---|
| Big5 | | csBig5 |
| Big5-HKSCS | | big5-hkscs, big5hk, big5-hkscs:unicode3.0, big5hkscs, Big5_HKSCS |
| codepage437 | (no encoder) | codepage437, ibmpc_437, pcbios, 437, ibmpc437, pc, ibmpc |
| EUC-JP | | eucjis, x-eucjp, csEUCPkdFmtjapanese, eucjp, Extended_UNIX_Code_Packed_Format_for_Japanese, x-euc-jp, euc_jp |
| EUC-KR | | ksc5601, 5601, ksc5601_1987, ksc_5601, ksc5601-1987, euc_kr, ks_c_5601-1987, euckr, csEUCKR |
| GB18030 | | gb18030-2000 |
| GB2312 | | gb2312-1980, gb2312, EUC_CN, gb2312-80, euc-cn, euccn, x-EUC-CN |
| GBK | | windows-936, CP936 |
| GSM-DEFAULT-ALPHABET | | GSM7, GSM-7BIT, GSM_DEFAULT, GSM_0338 |
| hp-roman8 | | csHPRoman8, r8, roman8 |
| IBM-Thai | | ibm-838, ibm838, 838, cp838 |
| IBM00858 | | cp858, ccsid00858, cp00858, 858 |
| IBM01140 | | 1140, ccsid01140, cp01140, cp1140 |
| IBM01141 | | cp01141, cp1141, ccsid01141, 1141 |
| IBM01142 | | cp01142, cp1142, ccsid01142, 1142 |
| IBM01143 | | 1143, cp01143, cp1143, ccsid01143 |
| IBM01144 | | cp01144, cp1144, 1144, ccsid01144 |
| IBM01145 | | ccsid01145, cp01145, 1145, cp1145 |
| IBM01146 | | ccsid01146, cp1146, 1146, cp01146 |
| IBM01147 | | cp1147, 1147, ccsid01147, cp01147 |
| IBM01148 | | cp01148, cp1148, ccsid01148, 1148 |
| IBM01149 | | cp1149, ccsid01149, 1149, cp01149 |
| IBM037 | | csIBM037, cpibm37, cp037, cs-ebcdic-cp-us, ibm-037, ibm-37, cs-ebcdic-cp-ca, cs-ebcdic-cp-wt, cs-ebcdic-cp-nl, ibm037, 037 |
| IBM1026 | | 1026, ibm1026, cp1026, ibm-1026 |
| IBM1047 | | 1047, ibm-1047, cp1047 |
| IBM273 | | ibm273, 273, cp273, ibm-273 |
| IBM277 | | ibm277, cp277, ibm-277, 277 |
| IBM278 | | csIBM278, ibm278, cp278, ebcdic-cp-se, 278, ibm-278, ebcdic-sv |
| IBM280 | | ibm280, cp280, 280, ibm-280 |
| IBM284 | | cpibm284, csIBM284, ibm-284, ibm284, 284, cp284 |
| IBM285 | | 285, ebcdic-cp-gb, ibm-285, csIBM285, cp285, ibm285, cpibm285, ebcdic-gb |
| IBM297 | | csIBM297, ebcdic-cp-fr, cp297, ibm297, ibm-297, 297, cpibm297 |
| IBM420 | | ibm420, 420, ebcdic-cp-ar1, csIBM420, ibm-420, cp420 |
| IBM424 | | cp424, 424, ebcdic-cp-he, ibm424, csIBM424, ibm-424 |
| IBM437 | | windows-437, cspc8codepage437, ibm437, cp437, 437, ibm-437 |
| IBM500 | | 500, ebcdic-cp-ch, ebcdic-cp-bh, ibm-500, csIBM500, cp500, ibm500 |
| IBM775 | | ibm775, cp775, ibm-775, 775 |
| IBM850 | | ibm-850, 850, ibm850, cspc850multilingual, cp850 |
| IBM852 | | 852, ibm-852, csPCp852, cp852, ibm852 |
| IBM855 | | 855, ibm855, cp855, cspcp855, ibm-855 |
| IBM857 | | cp857, ibm857, csIBM857, 857, ibm-857 |
| IBM860 | | ibm860, ibm-860, csIBM860, cp860, 860 |
| IBM861 | | csIBM861, ibm861, 861, cp861, ibm-861 |
| IBM862 | | cp862, ibm862, 862, ibm-862, csIBM862 |
| IBM863 | | cp863, csIBM863, ibm863, 863, ibm-863 |
| IBM864 | | csIBM864, ibm-864, 864, ibm864, cp864 |
| IBM865 | | ibm-865, csIBM865, 865, ibm865, cp865 |
| IBM866 | | 866, ibm-866, csIBM866, ibm866, cp866 |
| IBM868 | | cp-ar, 868, ibm868, csIBM868, ibm-868, cp868 |
| IBM869 | | ibm869, ibm-869, 869, cp869, csIBM869, cp-gr |
| IBM870 | | ebcdic-cp-yu, ibm870, ibm-870, 870, csIBM870, cp870, ebcdic-cp-roece |
| IBM871 | | csIBM871, ibm-871, cp871, ebcdic-cp-is, 871, ibm871 |
| IBM918 | | ibm-918, 918, cp918, ebcdic-cp-ar2 |
| ISO-2022-CN | (no encoder) | csISO2022CN, ISO2022CN |
| ISO-2022-JP | | jis, jis_encoding, csjisencoding, csISO2022JP, iso2022jp |
| ISO-2022-KR | | ISO2022KR, csISO2022KR |
| ISO-8859-1 | | iso-ir-100, 8859_1, ISO_8859-1, ISO8859_1, 819, csISOLatin1, IBM-819, ISO_8859-1:1987, latin1, cp819, ISO8859-1, IBM819, ISO_8859_1, l1 |
| ISO-8859-13 | | ISO8859-13, 8859_13, iso8859_13, iso_8859-13 |
| ISO-8859-15 | | 8859_15, csISOlatin9, IBM923, cp923, 923, L9, IBM-923, ISO8859-15, LATIN9, ISO_8859-15, LATIN0, csISOlatin0, ISO8859_15_FDIS, ISO-8859-15, ISO8859_15 |
| ISO-8859-2 | | ibm912, l2, ibm-912, cp912, ISO_8859-2:1987, ISO_8859-2, latin2, csISOLatin2, iso8859_2, 912, 8859_2, ISO8859-2, iso-ir-101 |
| ISO-8859-3 | | iso8859_3, cp913, csISOLatin3, ibm-913, ISO_8859-3, 913, ISO8859-3, 8859_3, ibm913, iso-ir-109, ISO_8859-3:1988, latin3, l3 |
| ISO-8859-4 | | iso-ir-110, l4, 8859_4, ibm914, latin4, ibm-914, csISOLatin4, iso8859_4, iso8859-4, cp914, 914, ISO_8859-4:1988, ISO_8859-4 |
| ISO-8859-5 | | 915, ISO_8859-5:1988, iso8859_5, cp915, ibm915, ISO_8859-5, ISO8859-5, csISOLatinCyrillic, cyrillic, 8859_5, iso-ir-144, ibm-915 |
| ISO-8859-6 | | 8859_6, arabic, ibm-1089, iso8859_6, ISO_8859-6, iso-ir-127, ibm1089, ISO_8859-6:1987, ECMA-114, 1089, csISOLatinArabic, ISO8859-6, ASMO-708, cp1089 |
| ISO-8859-6-BIDI | | ISO_8859-6-E, ISO_8859-6-I, csISO88596I, ISO-8859-6-I, csISO88596E, ISO-8859-6-E |
| ISO-8859-7 | | sun_eu_greek, 8859_7, iso-ir-126, ISO_8859-7:1987, ibm-813, iso8859_7, ISO_8859-7, csISOLatinGreek, greek8, ECMA-118, ibm813, ELOT_928, iso8859-7, cp813, greek, 813 |
| ISO-8859-8 | | iso-ir-138, ibm-916, iso8859_8, cp916, ISO8859-8, ISO_8859-8:1988, hebrew, 8859_8, csISOLatinHebrew, ibm916, 916, ISO_8859-8 |
| ISO-8859-8-BIDI | | csISO88598I, csISO88598E, ISO_8859-8-I, ISO-8859-8-E, ISO_8859-8-E, ISO-8859-8-I |
| ISO-8859-9 | | cp920, l5, ISO_8859-9, ibm-920, csISOLatin5, 8859_9, iso-ir-148, latin5, 920, ISO8859-9, ibm920, ISO_8859-9:1989, iso8859_9 |
| JIS_X0201 | | JIS_X0201, X0201, JIS0201, csHalfWidthKatakana |
| JIS_X0212-1990 | | jis_x0212-1990, iso-ir-159, x0212, JIS0212, csISO159JISX02121990 |
| KOI8-R | | koi8, koi8_r, cskoi8r |
| KOI8-U | | KOI8-RU |
| Shift_JIS | | shift-jis, shift_jis, x-sjis, ms_kanji, csShiftJIS, sjis |
| TIS-620 | | tis620.2533, tis620 |
| US-ASCII | | ISO646-US, IBM367, ASCII, cp367, default, ascii7, ANSI_X3.4-1986, iso-ir-6, us, 646, iso_646.irv:1983, csASCII, ANSI_X3.4-1968, ISO_646.irv:1991 |
| UTF-16 | | utf16, UTF_16 |
| UTF-16BE | | X-UTF-16BE, UnicodeBigUnmarked, UTF_16BE, ISO-10646-UCS-2 |
| UTF-16LE | | UnicodeLittleUnmarked, X-UTF-16LE, UTF_16LE |
| UTF-7 | | csUnicode11UTF7, UNICODE-2-0-UTF-7, UNICODE-1-1-UTF-7, UTF7 |
| UTF-7-OPTIONAL | | UTF-7-O, UTF7O, UTF-7O |
| UTF-8 | | UTF8, unicode-1-1-utf-8 |
| windows-1250 | | cp1250, cp5346 |
| windows-1251 | | ansi-1251, cp1251, cp5347 |
| windows-1252 | | cp1252, cp5348 |
| windows-1253 | | cp1253, cp5349 |
| windows-1254 | | cp5350, cp1254 |
| windows-1255 | | cp1255 |
| windows-1256 | | cp1256 |
| windows-1257 | | cp1257, cp5353 |
| windows-1258 | | cp1258 |
| windows-31j | | csWindows31J, windows-932, MS932 |
| x-Big5-Solaris | | Big5_Solaris |
| x-euc-jp-linux | | euc_jp_linux, euc-jp-linux |
| x-EUC-TW | | cns11643, euc_tw, EUC-TW, euctw |
| x-eucJP-Open | | EUC_JP_Solaris, eucJP-open |
| x-IBM1006 | | cp1006, ibm1006, 1006, ibm-1006 |
| x-IBM1025 | | ibm1025, 1025, cp1025, ibm-1025 |
| x-IBM1046 | | ibm1046, 1046, cp1046, ibm-1046 |
| x-IBM1097 | | ibm1097, 1097, cp1097, ibm-1097 |
| x-IBM1098 | | cp1098, ibm-1098, ibm1098, 1098 |
| x-IBM1112 | | cp1112, 1112, ibm1112, ibm-1112 |
| x-IBM1122 | | ibm-1122, 1122, cp1122, ibm1122 |
| x-IBM1123 | | cp1123, ibm1123, ibm-1123, 1123 |
| x-IBM1124 | | cp1124, ibm1124, ibm-1124, 1124 |
| x-IBM1381 | | 1381, cp1381, ibm1381, ibm-1381 |
| x-IBM1383 | | ibm1383, ibm-1383, cp1383, 1383 |
| x-IBM33722 | | ibm-33722, cp33722, ibm-33722_vascii_vpua, ibm-5050, ibm33722, 33722 |
| x-IBM737 | | ibm-737, ibm737, cp737, 737 |
| x-IBM834 | | cp834, ibm-834, ibm834 |
| x-IBM856 | | ibm-856, 856, ibm856, cp856 |
| x-IBM874 | | cp874, ibm874, ibm-874, 874 |
| x-IBM875 | | ibm875, ibm-875, 875, cp875 |
| x-IBM921 | | 921, cp921, ibm921, ibm-921 |
| x-IBM922 | | cp922, ibm922, ibm-922, 922 |
| x-IBM930 | | cp930, 930, ibm930, ibm-930 |
| x-IBM933 | | ibm933, cp933, 933, ibm-933 |
| x-IBM935 | | 935, cp935, ibm935, ibm-935 |
| x-IBM937 | | cp937, ibm-937, ibm937, 937 |
| x-IBM939 | | ibm-939, ibm939, cp939, 939 |
| x-IBM942 | | cp942, ibm942, ibm-942, 942 |
| x-IBM942C | | ibm942C, cp942C, ibm-942C, 942C |
| x-IBM943 | | ibm943, ibm-943, cp943, 943 |
| x-IBM943C | | ibm-943C, ibm943C, 943C, cp943C |
| x-IBM948 | | 948, ibm-948, cp948, ibm948 |
| x-IBM949 | | ibm-949, cp949, 949, ibm949 |
| x-IBM949C | | cp949C, 949C, ibm949C, ibm-949C |
| x-IBM950 | | 950, cp950, ibm-950, ibm950 |
| x-IBM964 | | 964, cp964, ibm-964, ibm964 |
| x-IBM970 | | ibm970, 970, cp970, ibm-eucKR, ibm-970 |
| x-ISCII91 | | iscii, ST_SEV_358-88, iso-ir-153, csISO153GOST1976874, ISCII91 |
| x-ISO-2022-CN-CNS | | ISO2022CN_CNS, ISO-2022-CN-CNS |
| x-ISO-2022-CN-GB | | ISO-2022-CN-GB, ISO2022CN_GB |
| x-iso-8859-11 | | iso-8859-11, iso8859_11 |
| x-JIS0208 | | JIS0208, csISO87JISX0208, x0208, JIS_C6226-1983, JIS_X0208-1983, iso-ir-87 |
| x-JISAutoDetect | (no encoder) | JISAutoDetect |
| x-Johab | | johab, ms1361, ksc5601-1992, ksc5601_1992 |
| x-MacArabic | | MacArabic |
| x-MacCentralEurope | | MacCentralEurope |
| x-MacCroatian | | MacCroatian |
| x-MacCyrillic | | MacCyrillic |
| x-MacDingbat | | MacDingbat |
| x-MacGreek | | MacGreek |
| x-MacHebrew | | MacHebrew |
| x-MacIceland | | MacIceland |
| x-MacRoman | | MacRoman |
| x-MacRomania | | MacRomania |
| x-MacSymbol | | MacSymbol |
| x-MacThai | | MacThai |
| x-MacTurkish | | MacTurkish |
| x-MacUkraine | | MacUkraine |
| x-MS950-HKSCS | | MS950_HKSCS |
| x-mswin-936 | | ms936, ms_936 |
| x-PCK | | pck |
| x-windows-50220 | | cp50220, ms50220 |
| x-windows-50221 | | ms50221, cp50221 |
| x-windows-874 | | windows-874, ms874, ms-874 |
| x-windows-949 | | windows949, ms_949, ms949 |
| x-windows-950 | | windows-950, ms950 |
| x-windows-iso2022jp | | windows-iso2022jp |
Count: 160
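Any alias from the table can be passed to `Charset.forName`, which returns the charset under its canonical name. The sketch below illustrates that lookup; the class name `CharsetLookup` and the helper `canonicalName` are hypothetical, and which aliases resolve depends on the JVM in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Optional;

final class CharsetLookup {
    /** Resolves a name or alias to the canonical charset name, or empty if it is not available. */
    static Optional<String> canonicalName(String nameOrAlias) {
        try {
            // forName accepts canonical names and aliases, case-insensitively.
            return Optional.of(Charset.forName(nameOrAlias).name());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Syntactically illegal name, or a name this JVM does not support.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] samples = {"latin1", "UTF8", "no-such-charset"};
        for (String alias : samples) {
            System.out.println(alias + " -> " + canonicalName(alias).orElse("(not available)"));
        }
    }
}
```

On a JVM matching the table above, `latin1` would resolve to ISO-8859-1 and `UTF8` to UTF-8, while an unknown name yields an empty result instead of an exception.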
Resources
- Official IANA Character Set registry.
- An IBM developerWorks article on charsets with excellent links at the bottom.
- A list of ISO-8859-1 characters to check against.
- Good Characters and Encodings Reference by Jukka "Yucca" Korpela.
- Excellent history of character set standardization by Dik T. Winter.