================================== Recode NEWS - User visible changes ================================== .. contents:: .. sectnum:: :Copyright: © 1993-2023 Free Software Foundation, Inc. Version 3.7.14 ============== :Author: Reuben Thomas, 2023-01. + No user-visible changes; minor clean-ups as part of the Debian packaging process. Version 3.7.13 ============== :Author: Reuben Thomas, 2023-01. + Fix request diagnostics with --verbose: avoid output of confusing and possibly incorrect diagnostics. + Fix a file descriptor leak. Version 3.7.12 ============== :Author: Reuben Thomas, 2022-02. + Simplify support for ignoring invalid input with iconv, integrated with the --strict and --force mechanisms. + Various documentation improvements. Version 3.7.11 ============== :Author: Reuben Thomas, 2022-01. + Re-add support for transliteration with iconv (removed in 3.7). + Add support for ignoring invalid input with iconv. + Fix a bug introduced in 3.7.10 that prevented building the code. Version 3.7.10 ============== :Author: Reuben Thomas, 2022-01. + In recode program, only use iconv if needed; add --prefer-iconv option to allow its use in all cases. + Remove support for pre-3.5 request syntax (colon as charset separator). + PO files have been updated from the Translation Project. + Build system fixes and improvements. + Update gnulib to fix a problem building on Android. + Fix confusingly wrong NEWS entry for 3.7.4. Version 3.7.9 ============= :Author: Reuben Thomas, 2021-06. + A fix to the CP1252 encoding: U+017E LATIN SMALL LETTER Z WITH CARON is at byte 0x9e, not byte 0x8f. + Minor documentation fixes. Version 3.7.8 ============= :Author: Reuben Thomas, 2020-09. + Improvements to the build system. + Fix man page generation, and document that help2man must be built with gettext. + Updates to nl, pt, sv translations (thanks, translators!). Version 3.7.7 ============= :Author: Reuben Thomas, 2020-07. + Improvements to the build system. + Updates to nl, pt, sv translations (thanks, translators!). Version 3.7.6 ============= :Author: Reuben Thomas, 2019-09. + Improvements to the build system. Version 3.7.5 ============= :Author: Reuben Thomas, 2019-09. + Port tests to Python 3. Version 3.7.4 ============= :Author: Reuben Thomas, 2019-09. + Fixes to file handling in recode program. + Fix tests on Windows. Version 3.7.3 ============= :Author: Reuben Thomas, 2019-08. + No code changes to recode itself; this release features a properly versioned shared library. Version 3.7.2 ============= :Author: Reuben Thomas, 2019-08. + No code changes to recode itself; this release includes updates to license headers to guide users to the GPL online, corrects the version of COPYING-LIB shipped with the sources, and updates the message files for various languages. Version 3.7.1 ============= :Author: Reuben Thomas, 2018-09. + No code changes to recode itself; this release just updates the version of gnulib to fix a bug in glibc 2.28: (GitHub issue #11 https://github.com/rrthomas/recode/issues/11 Version 3.7 =========== :Author: François Pinard, 2008-03; Reuben Thomas, 2018-01. + Converters for BibTeX (from Vincent Danjean) and the ANSEL and ISO 5426 character sets (from Wolfram Schneider) have been added. + The conversion strategies (whether to use pipes, memory or files) are no longer available. Now it is reasonable to assume virtual memory, so files and memory have similar performance characteristics (in particular, the memory method is not limited by physical memory.) Further, tests showed that even for runs on little data, the pipes method has minimal performance impact (none was measured). This is not a surprise, as for one-step recodings, the commonest case, no forking is needed. The command-line options -i, -p and --sequence=STRATEGY are ignored for backwards compatibility. + Recode does not include libiconv anymore, but uses an external iconv library if one was available at installation time. The -x: option to the program, or a new flag to the library recode_new_outer function, inhibits the initialisation and usage of iconv. + The experimental ``tree`` surface is removed. Structured data needs a proper parser, and that doesn't fit the framework of Recode. + Many bug fixes. + Long ago, I renamed GNU recode to Free recode: the permission for using the GNU prefix mandated a level of obedience to the FSF that once went overboard, in my opinion. After that change, I realized that some people read Free as a four letter word! To be peaceful, this version changes the name again, to merely Recode. recode (no capital) still names the executable program specifically, or the distribution archive itself. + make check accepts a LIMIT= option, for limiting tests to one or a few cases. See tests/Makefile.am. + PO files have been updated from the Translation Project. + The test system has been overhauled. Tests now run much faster, and require Python and Cython. + Overhauled build system, now using gnulib for portability. This reduces the amount of code in the Recode tree considerably. Version 3.6 =========== :Author: François Pinard, Bruno Haible, 2001-01. General changes --------------- + The recode manual is now indexed, by charset, by concept, etc. + Program messages are also available in Greek, Gallicean and Italian. + Bruno Haible's nice portable iconv library has been integrated. + RFC 1345 tables and French character names have been updated. + The Texinfo charset has been refreshed, and made reversible. New charsets ------------ (most from libiconv) + Japanese + EUC-JP (csEUCPkdFmtJapanese, EUC_JP, Extended_UNIX_Code_Packed_Format_for_Japanese); + ISO-2022-JP (csISO2022JP); ISO-2022-JP-1; ISO-2022-JP-2 (csISO2022JP2); + JIS_C6220-1969-ro (csISO14JISC6220ro, ISO646-JP, iso-ir-14, jp); + JIS_X0201 (csHalfWidthKatakana, JIS0201, JISX0201-1976, JISX0201.1976-0, X0201); + JIS_X0208 (csISO87JISX0208, ISO-IR-87, JIS0208, JIS_X0208.1983-0, JIS_X0208.1983-1, JIS_X0208-1990-0, JIS_X0208.1983-1, X0208); + JIS_X0212 (csISO159JISX02121990, ISO-IR-159, JIS0212, JIS_X0212.1990-0, JIS_X0212-1990, X0212); + SJIS (csShiftJIS, MS_KANJI, SHIFT-JIS). + Chinese + BIG5 (BIG-5, BIG-FIVE, BIGFIVE, CN-BIG5 csBig5); BIG5HKSCS; + EUC-CN (CN-GB, csGB2312, EUC_CN, GB2312); EUC-TW (csEUCTW, EUC_TW); + GB18030; HZ (HZ-GB-2312); ISO-2022-CN (csISO2022CN); ISO-2022-CN-EXT; + GB_1988-80 (cn, csISO57GB1988, ISO646-CN, iso-ir-57); + GB_2312-80 (CHINESE, csISO58GB231280, GB2312.1980-0, ISO-IR-58); + ISO-IR-165 (CN-GB-ISOIR165). + Korean + JOHAB (CP1361); EUC-KR (csEUCKR, EUC_KR); GBK (CP936); + ISO-2022-KR (csISO2022KR); + KSC_5601 (CP949, csKSC56011987, ISO-IR-149, KOREAN, KSC5601.1987-0, KS_C_5601-1987, KS_C_5601-1989, KSX1001:1992). + Vietnamese (independently of libiconv) + TCVN; VIQR; VISCII; VNI; VPS. + Other languages + ARMSCII-8; Georgian-Academy; Georgian-PS; WINDOWS-874 (CP874); + MuleLao-1; CP1133 (IBM-CP1133); CP1258 (WINDOWS-1258); + TIS-620 (ISO-IR-166, TIS620, TIS620.2529-1, TIS620-0, TIS620.2533-0, TIS620.2533-1). + Apple specifics + MacArabic; MacCentralEurope; MacCroatian; MacCyrillic; MacGreek; + MacHebrew; MacIceland; MacRomania; MacThai; MacTurkish; MacUkraine + Unicode + JAVA; UCS-2-INTERNAL; UCS-2LE (UnicodeLITTLE); UCS-2-SWAPPED; UCS-4BE; + UCS-4-INTERNAL; UCS-4LE; UCS-4-SWAPPED; UTF-16BE; UTF-16LE. + Others + CP932; CP949 (UHC); CP950; CP866 (866, csIBM866, IBM866). + ISO-8859-16 (ISO-IR-226, ISO_8859-16:2000). + Recode internal + :libiconv: (:) [so option -x: avoids going through libiconv] New aliases ----------- (from libiconv) [list to be revised] + csASCII (for ANSI_X3.4-1968); csHPRoman8 (for hp-roman8); + csISOLatin1 (for ISO-8859-1); csISOLatin2 (for ISO-8859-2); + csISOLatin3 (for ISO-8859-3); csISOLatin4 (for ISO-8859-4); + csISOLatin5 (for ISO-8859-9); + csISOLatin6 and ISO_8859-10:1992 (for ISO-8859-10); + csISOLatinArabic (for ISO-8859-6); csISOLatinCyrillic (for ISO-8859-5); + csISOLatinGreek (for ISO-8859-7); csISOLatinHebrew (for ISO-8859-8); + csKOI8R (for KOI8-R); csPC850Multilingual (for IBM850); + csUCS4 (for ISO-10646-UCS-4); + csUnicode, csUnicode11, UCS-2BE, UnicodeBIG (for ISO-10646-UCS-2); + csUnicode11UTF7 (for UNICODE-1-1-UTF-7); + csVISCII and VISCII1.1-1 (for VISCII); + ISO-IR-179 (for ISO-8859-13); csMacintosh and MacRoman (for macintosh); + TCVN5712-1, TCVN5712-1:1993 and TCVN-5712 (for TCVN). New surfaces ------------ + tree (experimental). Version 3.5 =========== :Author: François Pinard, 1999-05. Incompatible changes -------------------- + A double dot ``..`` should now be used instead of a colon ``:``. + Option --force (-f) is needed to pursue recoding despite errors. + There is no more quoting for special characters within charsets names. + Auto check (``-a``) and popen (``-o``) options have been withdrawn. + Some charsets and aliases were deleted, see `Charsets & aliases`_ below. Extended features ----------------- + Program messages are available in localised form for many languages. + Long character names are available in French, if LANGUAGE is set to ``fr``. + A new request syntax allows for recode chaining, and for surfaces. + Option --header-file (-h) accepts a language parameter, and Perl is new. + Full charset listings now show the UCS-2 value for characters. + Option --known=PAIRS (-k) also accepts octal and hexadecimal numbers. + Option --list (-l) better sorts charsets and aliases, also fully written. + Charset ``RFC1345`` implements mnemonic+ascii+38, and is now reversible. + HTML is not limited anymore to Latin-1, HTML 4.0 entities are supported. New features ------------ + Euro support. + Updated RFC 1345 set of tables, from Keld Simonsen. + Some African charsets and transliterated forms. + Conversions for ISO 10646 and Unicode. + Combining or explosion of UCS-2 diacriticized characters and ligatures. + Implementation of surfaces, see `Surfaces & aliases`_ below. + Mixed mode for recoding only comments and strings in C sources or PO files. + A stand-alone recoding library gets installed, often as a shared library. + Option --find-subsets (-T) lists charsets which are subsets of another. + The library may generate testing data, and study character frequencies. Charsets & aliases ------------------ + New ISO 10646 and Unicode charsets + combined-UCS-2: pseudo-charset. + count-characters: pseudo-charset. + dump-with-names: pseudo-charset. + ISO-10646-UCS-2 (UNICODE-1-1, BMP, rune, u2). + ISO-10646-UCS-4 (10646, ISO-10646, UCS-4, u4). + UNICODE-1-1-UTF-7 (TF-7, u7). + UTF-8 (UTF-2, UTF-FSS, FSS_UTF, TF-8, u8). + UTF-16 (Unicode, TF-16, u6). + RFC 1345.bis matters + Deleted charsets + dk-us, us-dk (because of &duplicate which Recode does not handle yet). + New charsets + baltic (alias is iso-ir-179); CP1250 (1250, ms-ee, windows-1250); + CP1251 (1251, ms-cyrl, windows-1251); + CP1252 (1252, ms-ansi, windows-1252); + CP1253 (1253, ms-greek, windows-1253); + CP1254 (1254, ms-turk, windows-1254); + CP1255 (1255, ms-hebr, windows-1255); + CP1256 (1256, ms-arab, windows-1256); + CP1257 (1257, WinBaltRim, windows-1257); + CWI (CWI-2, cp-hu); EBCDIC-IS-FRISS (friss); + GOST_19768-87 with aliases of previous GOST_19768-74; + IBM256 (256, CP256, EBCDIC-INT1); IBM875 (875, CP875, EBCDIC-Greek); + IBM1004 (1004, CP1004, os2latin1); IBM1047 (1047, CP1047); + ISO-8859-13 (ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7, latin7); + ISO-8859-14 (ISO_8859-14:1998, iso-celtic, iso-ir-199, l8, latin8); + ISO-8859-15 (ISO_8859-15:1998, iso-ir-203, l9, latin9); + KOI-7; KOI-8 (GOST_19768-74); KOI8-R; KOI8-RU; KOI8-U; + macintosh_ce (macce); mac-is; + NeXTSTEP (next) yet previous Recode had it outside RFC 1345. + Alias promoted to charset (with previous charset becoming alias) + ISO-646.basic (with ISO-646.basic:1983); ISO-646.irv (ISO-646.irv:1983); + ISO_5427-ext (ISO_5427:1981); ISO_5428 (ISO_5428:1980); + ISO-8859-1 (ISO_8859-1:1987); ISO-8859-2 (ISO_8859-2:1987); + ISO-8859-3 (ISO_8859-3:1988); ISO-8859-4 (ISO_8859-4:1988); + ISO-8859-5 (ISO_8859-5:1988); ISO-8859-6 (ISO_8859-6:1987); + ISO-8859-7 (ISO_8859-7:1987); ISO-8859-8 (ISO_8859-8:1988); + ISO-8859-9 (ISO_8859-9:1989); ISO-8859-10 (latin6); + NC_NC00-10 (NC_NC00-10:81); sami (latin-lap). + New aliases + 037 (for charset IBM037); 038 (IBM038); 273 (IBM273); 274 (IBM274); + 275 (IBM275); 278 (IBM278); 280 (IBM280); 281 (IBM281); 284 (IBM284); + 285 (IBM285); 290 (IBM290); 297 (IBM297); 367 (ANSI_X3.4-1968); + 420 (IBM420); 423 (IBM423); 424 (IBM424); 500, 500V1 (IBM500); + 819 (ISO-8859-1); 864 (IBM864); 868 (IBM868); 870 (IBM870); + 871 (IBM871); 880 (IBM880); 891 (IBM891); 903 (IBM903); 905 (IBM905); + 912, CP912, IBM912 (ISO-8859-2); 918 (IBM918); 1026 (IBM1026); + ECMA-113, ECMA-113:1986 (ECMA-Cyrillic); GOST_19768-74 (KOI8); + ISO_8859-N (ISO-8859-N) for N = 1 through 10 and 13 through 15; + ISO_8859-10:1993 (ISO-8869-10); iso-ir-170 (INVARIANT); + KOI8_L2 (CSN_369103); pclatin2, pcl2 (IBM852); SS636127 (SEN_850200_B). + New African charsets + AFRL1-101-BPI_OCIL (t-francais, t-fra); + AFRFUL-102-BPI_OCIL (bambara, bra, ewondo, fulfulde); + AFRFUL-103-BPI_OCIL (t-bambara, t-bra, t-ewondo, t-fulfulde); + AFRLIN-104-BPI_OCIL (lingala, lin, sango, wolof); + AFRLIN-105-BPI_OCIL (t-lingala, t-lin, t-sango, t-wolof). + Extra miscellaneous charsets + KEYBCS2 (Kamenicky); CORK (T1); KOI-8_CS2. + New HTML pseudo-charsets + HTML_1.1 (h1); HTML_2.0 (RFC 1866, 1866, h2); HTML-i18n (RFC 2070); + HTML_3.2 (h3) reimplemented; HTML_4.0 (h4, HTML, h); + deleted aliases HTF, 8859, ISO 8859, Entities, SGML, WWW, w3. Surfaces & aliases ------------------ + Base64 (64, b64); Quoted-Printable (qp, Quote-Printable); + 21-Permutation (swabytes); 4321-Permutation; CR; CR-LF (cl); + Decimal-1 (d, d1); Decimal-2 (d2), Decimal-4 (d4); + Hexadecimal-1 (x, x1); Hexadecimal-2 (x2); Hexadecimal-4 (x4); + Octal-1 (o, o1); Octal-2 (o2); Octal-4 (o4). + data; test7; test8; test15; test16. Version 3.4 =========== :Author: François Pinard, 1994-11. + Charset HTML is new, it handles ``&...;`` sequences for Latin-1. + Charset AtariST handling is more general, --list may be used with it. + Charset ASCII-BS overstriking has been extended, mainly for German. + Charset RFC1345 may be a goal, to debug or study RFC 1345 short names. + Charset names have been revised. Note that nextstep is now NeXT. + Option --force (-f) is accepted, but does not yet protect reversibility. + Option --quiet or --silent (-q) silences irreversible recoding messages. + Option --known=PAIRS (-k) helps searching through recodings. + Option --sequence=pipe (-p) does not fall back on -o anymore. + Option --auto-check may narrow its study around one particular charset. + An MSDOS port is available, check ftp.iro.umontreal.ca in pub/gnuish. + Compilation should now succeed on OS/2 EMX. Thanks to Kai Uwe Rommel. + Program initialization is almost three times faster on average. + Corrected reported bugs, added small improvements, some aesthetic. Version 3.3 =========== :Author: François Pinard, 1993-12. + Charsets atarist, ebcdic-ccc, ebcdic-ibm and nextstep have been added. + Also, most RFC 1345 charsets and aliases are handled. That's a bunch! + Old ascii disappears because of RFC 1345's ascii, use ascii-bs instead. + Old maci disappears because of RFC 1345's macintosh, use applemac instead. + Charsets cccascii and cdcascii disappear, use ebcdic-ccc and ebcdic instead. + Recoding between latin1, ibmpc and applemac is (almost) reversible. + The texinfo documentation has been reorganized, this to be continued. + Long options are accepted, charset names may be abbreviated. + Option --list (-l) displays charsets, aliases and contents in many formats. + Option --strict (-s) asks for stricter, non-reversible recodings. + Option --graphics (-g) approximates ibmpc rulers with ASCII graphics. + Option --header (-h) produces C source for many recoding tables. + Option --auto-check (-a) reports about all possible recodings. + Option --ignore (-x) prevents a charset from being selected. + Execution has been sped up through step merging, hashing for charset names. + Many various buglets have been eradicated, portability increased. + Charsets may be edited out by modifying the Makefile only. + Configuration is made through the use of an external config.h file. + New -d ``diacritics_only`` option for LaTeX. + A few bugs have been corrected. + Documentation reorganization and improvements. + Increased portability, now uses Autoconf. + A few bugs solved. Version 3.2 =========== :Author: François Pinard, 1991-10. + MSDOS port redone. + New check goal at installation time. + Add -v option for verbose processing, remove old -q. + Add -i, -o and -p for letting the user control the strategy. + A few bugs corrected. + Embedded NULs should now be transmitted. Version 3.1 =========== :Author: François Pinard, 1990-03. + Rename -V to -C for showing Copyright. + Calling sequence changed, said files now recoded on themselves. + Add -t option for touching files. + Better on-line help. + Add -q option for quiet processing. + Executable file now considerably smaller, also speedier. + A few bugs corrected. Version 3.0 =========== :Author: François Pinard, 1989-10. + New Text to Latin1 processing, should be faster. + A few bugs corrected. For prior history down to 1980, see at the end of the ChangeLog.