/*====================================================================*
 -  Copyright (C) 2001 Leptonica.  All rights reserved.
 -
 -  Redistribution and use in source and binary forms, with or without
 -  modification, are permitted provided that the following conditions
 -  are met:
 -  1. Redistributions of source code must retain the above copyright
 -     notice, this list of conditions and the following disclaimer.
 -  2. Redistributions in binary form must reproduce the above
 -     copyright notice, this list of conditions and the following
 -     disclaimer in the documentation and/or other materials
 -     provided with the distribution.
 -
 -  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 -  ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 -  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 -  A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL ANY
 -  CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 -  EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
 -  PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
 -  PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
 -  OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 -  NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 -  SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *====================================================================*/

Note: The following are highlights of the changes in each version. They are not a complete listing of the modifications.

1.83.1  Jan 26, 2023
    * Cherry-pick two bug fixes from 1.84.0.

1.83.0  Dec 20, 2022
    * Simplify setting the title of pdf files.
    * Catch tiff failure to open stream in fopenTiffMemStream()
    * Check for POSIX functions fstatat() and dirfd() before use.
    * In prog/cleanpdf: do not allow threshold to exceed 190.
      Make all parameters required; do not use default values for
      invalid parameters.
    * In prog/concatpdf: add input param for title; add jpeg quality factor
    * Fix flaky hash_reg test on i686: sets generated from SelectRange()
      can depend slightly on platform.
    * Convenience function for adding multiple black and white borders.
    * Fix oss-fuzz issue 42202: undefined shift in l_convertCharstrToInt().
    * Fix oss-fuzz issue 43841: made pixCountPixels() more efficient.
    * Fix oss-fuzz issue 44008: pixCountArbInRect() used wrong depth.
    * Always return tiff resolution of 0 (unknown) if not set.
    * Simplify operations on pix memory to help avoid mem leaks
    * Make bmp non-support of 32-bit bmp (rgba) files explicit.
    * Improve tiff read resolution conversion by rounding.
    * Use stdatomic.h to make cloning string safe.
      Remove all *GetRefcount() and *ChangeRefcount() accessors.
    * Replace procName and mainName strings by __func__.
    * Remove information about fields in many structs from the public
      interface allheaders.h, instead putting them in internal files
      pix_internal.h, array_internal.h and ccbord_internal.h.
    * Increase the .so number from 5.4.0 to 6.0.0.
    * Rename the autotools generated libraries from liblept to libleptonica
    * Fix potential memory leaks from recogAverageSamples() and
      recogDebugAverages() by not destroying a recog.

1.82.0  Sept 22, 2021
    * Fix issue-585: reading tiff rgb with tiffbpl = 1.5 * packedbpl.
    * Fix issue-586: failure to properly wrap tiff-g4 in pdf without
      transcoding.  The fix is to do transcoding for tiff-g4, as was
      done before April 2021.

1.81.1  June 11, 2021
    * Added choice of codec (JP2 or J2K) when writing jp2k files.
    * Fix use of hashmap with key based on dna.

1.81.0  June 6, 2021
    * Fixed problems with tiff pdf wrapping photometry.
    * Fixed scaling issues in prog/cleanpdf for printing.
    * New progs: tiffpdftest, hashtest
    * Fixed uninitialized data error in pixAddBorderGeneral() and
      pixRemoveBorderGeneral()
    * Rewrote Numa functions that discretize into bins.  Have binning
      by both sorting and histogram.
    * Rewrote and simplified pixGetRankColorArray() and pixGetBinnedColor().
    * Added tests to prog/rankbin_reg.c.
    * Simplified fpixCopy() and dpixCopy(), and functions that use them.
    * Check input for bilateral transforms.
    * Add function for splitting a file evenly by lines.
    * Check input for getFilenamesInDirectory()
    * Many new fuzzers.
    * Use size_t for all size inputs to ascii85 encoding/decoding
    * New regression tests: encoding_reg.c, binmorph6_reg.c, flipdetect_reg.c
    * Reworked concatpdf for generality, using the Poppler package.
    * Removed dwa flipdetection from the library.  All the dwa code is
      now in flipdetectdwa.c.notused.  Likewise prog/flipselgen.c is
      retained for completeness, renamed flipselgen.c.notused, and is
      not compiled.
    * Added hashmap utility (hashmap.c, hashmap.h).
    * Removed functions using dnahash on strings, pts and doubles.
    * Improved the speed of hashing for strings and doubles.
    * Added function for tiling images in pairs for comparison.
    * Added null terminations to serialized strings written to memory,
      preventing buffer overrun by strlen() [fixed by Stefan Weil].
    * No longer use "NoInit" versions of pix creation in leptonica,
      because they risk reading uninitialized data.  These functions
      remain in the library because they are in use in applications.
    * Add two composite binarization functions, from prog/binarize_set.
    * Giulio Lunati fixed pnm reading to work with stdin input.
    * Removed several of the boxa compare and modify functions.
    * Implemented reading jpeg2000 data encoded in j2k "codestream" format.
      Can now read data in both jp2 and j2k.

1.80.0  28 July 2020
    * Improve bmp handling of 1 bpp images and sanity checking of params.
    * Add function to display all rgb gamut colors
    * in Makefile.am, use option serial-tests to avoid races in testing
    * Make m4 subdirectory and add ax_split_version.m4 there
    * Simple function for hue-invariant mapping (pixMapWithInvariantHue)
    * Fixed bug in limit of ptra size when used for sorting by bins.
    * Use hashmap to count pixel colors in RGB(A) images.
    * Convert hashtest program to regression test hash_reg.
    * Convert croptest program to regression test crop_reg.
    * New color segmentation by region growing (colorfill.c)
    * New regression tests: colorfill_reg, circle_reg, ccbord_reg.
    * Set maxima for all allocations for common leptonica data structures.
    * Don't fail when downscaling 2, 4, 8, and 32 bpp images, even to
      one pixel, invoking pixScaleSmooth().
    * New functions that select 1 bpp components based on their area.
    * Incremental addition to sorted array of numbers.
    * new prog/fuzzing directory for oss-fuzz based fuzzing programs
    * use of pixcmapIsValid() with extra argument to determine validity
      with the pix it is attached to.
    * Use lept_stderr() in all programs in the prog directory.
    * New program rasteroptest() for thorough testing of rasterop functions.
    * Removed the pixSaveTiled*() functions
    * Stubbed pixDisplayWrite().  Last used in tesseract 3.04.01 (2/2016).

1.79.0  1 Jan 2020
    * Clean up auto-generation of files; removed 'register'
    * Some fixes for issues identified by fuzzer
    * New source files: checkerboard.c
    * New programs: replacebytes.c, webpanimio_reg.c, partifytest.c,
      rectangle_reg.c, lowsat_reg.c, rotate_it.c, scale_it.c,
      dewarp_it.c, pdfio1_reg.c, pdfio2_reg.c, checkerboard_reg.c,
      underlinetest.c.
    * Convert to standard reg test: heap_reg.c, pixa1_reg.c, smallpix_reg.c
    * Improve data checking when reading image file headers (pnm, png,
      jpeg, tiff)
    * Fix some bugs in pnm reading
    * Fix inconsistencies with the encoding type flags in pdf writing
    * Allow tiff to write images with colormaps
    * Fix errors in PS code; made some functions static
    * Add code for animated webp (requires webp mux and demux libraries)
    * Add "partify" application for separating parts in a musical score
    * Enable tif read/write of gray+alpha and rgba; filter out tiff
      pixels that are not uint and compression by tile
    * Apply consistent formatting of static const variables
    * Add programs for scaling, rotation and deskew, named deskew_it,
      rotate_it and scale_it, for useful operations on arbitrary images.
    * Convert pdfiotest program to two regression tests: pdfio1_reg
      and pdfio2_reg.
    * Remove all use of strncat; use stringCat().
    * New functions for removing outliers in sequences of boxes.
    * Generalize pixAverageInRect(): mask, region and range filters,
      and subsampling.  New pixAverageInRectRGB().
    * Fix int overflow bug in pixMedianCut(); required new heap accessor.
    * New pixMultiplyGray() allows pix to be multiplied by an array
      (or another pix)
    * Better routines for counting color.
    * Lossless conversion for RGB to cmap with not more than 256 colors.
    * New histo based global thresholding: pixThresholdByHisto().
    * Allow most reg tests to run even if external libraries are
      not available.
    * New one-line gplot functions that return a pix.
    * New application to find where corners meet in a checkerboard.
    * Add utility functions for painting through mask in cmap pix,
      creating a hit-miss sela from a color pix, equality of two pta.
    * Proper handling of 1 bpp colormap tiffs: remove when reading,
      preserve when writing.
    * Deprecate three pixSaveTile*() functions; removed all calls to
      these from the library and progs.
    * Include auto_config.h explicitly in all src and prog files.
    * Improve input data checking for bmp files.
    * Added pixAutoPhotoinvert() for tesseract, to automatically
      photo-invert text regions where the background is black and
      text is white.

1.78.0  21 Mar 19
    * Various improvements in handling boxa sequences and transforms.
    * New regression tests: boxa4_reg, string_reg
    * New function for copying a pix, filtered by a boxa.
    * Modify histogram method for image comparison.
    * More careful attention to invalid boxes in box geometry functions.
    * Better string and array functions for search and replace.
    * Convenience functions for generating simple masks.
    * Allow pdf writing of jp2k images, in full generality.
    * Allow writing compressed ps images for printing.
    * Modified enum comments to include a suggested enum name.
    * New program: imagetops

1.77.0  14 Dec 18
    Here is the current status of CVE issues with leptonica; see
    https://security-tracker.debian.org/tracker/source-package/leptonlib
    * CVE-2018-7442: potential injection attack because '/' is allowed
      in gplot rootdir.  Functions using this command have been disabled
      by default in the distribution, starting with 1.76.0.  As for the
      specific issue, it is impossible to specify a general path without
      using the standard directory subdivider '/'.
    * CVE-2018-7186: number of characters not limited in fscanf or sscanf,
      allowing possible attack with buffer overflow.  This has been fixed
      in 1.75.3.
    * CVE-2018-3836: command injection vulnerability in gplotMakeOutput().
      This has been fixed in 1.75.3, using stringCheckForChars() to block
      rootnames containing any of: ;&|>"?*$()/<
    * CVE-2017-18196: duplicated path components.  This was fixed in 1.75.3.
    * CVE-2018-7441: hardcoded /tmp pathnames.  These are all wrapped in
      special debug functions that are not enabled by default in the
      distribution, starting with 1.76.0.
    * CVE-2018-7247: input 'rootname' can overflow a buffer.  This was
      fixed in 1.76.0, using snprintf().
    * CVE-2018-7440: command injection in gplotMakeOutput using $(command).
Fixed in 1.75.3, which blocks '$' as well as 11 other characters. Wrapped the few 'system' calls in an extra layer of debug code. More coverity scan fixes; defects are about 1 per 10,000 source lines. New regression tests: numa1_reg, numa2_reg, lowaccess_reg, pixmem_reg. New non-regression test programs: histoduptest Juergen Buchmueller is working on Lua bindings. He typedef'd l_ok and used it in 1100 functions that return a success/failure status. He also helped clean up remaining issues in the doxygen-generated documentation. Using a packed struct for bmp headers to avoid crash on some big-endians. Fixed a bug in the prototype parser for xtractprotos that was surfaced by a typedef declaration for the bmp headers. Cleaned up IOS guards to avoid compiling a system(3) call on IOS. Renamed autobuild --> autogen.sh Added some basic pixa functions for rotation and translation. Added an iterative method to find rectangular coverings for arbitrary connected components. Converted two tests to reg tests running in alltests_reg: ptra1_reg, ptra2_reg Enabled read/write for standard jpeg compressed tiff images. Enabled reading for the old (deprecated) jpeg-encoded tiffs. Fix range selectors for pixa, pixaa, boxa, boxaa, pta: Now, last = -1 goes to the end. When reading tiff --> pix, insert IMAGEDESCRIPTION into text field. Converted iotest to reg test iomisc_reg; added to alltests_reg Converted rasterop_reg into a standard regression test; added to alltests_reg. Converted boxa2_reg and fhmtauto_reg into standard regression tests; added to alltests_reg. Split boxa sequence functions out of boxfunc4.c, into a new boxfunc5.c. Simplified bmp header and made reading more clearly endian agnostic (Juergen Buchmueller) New boxa3_reg regression test. This tests sequences of boxes by two new boxfunctions in boxfunc5.c. New bootnumgen4.c for more digit templates. 
Rename prog/recog_bootnum.c --> prog/recog_bootname1.c New in prog: recog_bootnum2.c, recog_bootnum3.c, recogtest7.c Fixed uninitialized data in pixCentroid() on 1 bpp pix. New reg test: bytea_reg.c. (removed byteatest.c) Fixed bug in non-transcoding pdf generation from 1 bpp png. Added LGTM to static analyzers that run over the library. 1.76.0 1 May 18 Modify infrastructure to fix outstanding security issues. By default, you can no longer create temp directories and temp files whose names are known to the compiler. Also, prevent "system" calls, which were used for image display and gnuplot. Replaced remaining sprintf() with snprintf() in prog tests. Added non-transcoding functions for generating pdf from jpeg pixacomp Add control of jpeg quality from pixWriteMem() and pixWriteStream() Fixed getFilenamesInDirectory() to properly identify directories Prevent size overflow in calloc for kernel; cleaned it up fpix and dpix bmp reading now accepts negative height Simplified splitimage2pdf; it no longer uses ps2pdf Remove name-mangling WRITE_AS_NAMED compile option. Removed 2 deprecated write functions. Added these regression tests: locminmax_reg, speckle_reg, watershed_reg, 1.75.3 15 Feb 18 Fixed some coverity scan issues. Autotools fix to check for png if enabling gnuplot (James Le Cuirot). 1.75.2 11 Feb 18 Converted several progs to standard regression tests. Added these tests to the alltests_reg suite: adaptnorm_reg, binmorph1_reg, binmorph3_reg, equal_reg, extrema_reg, grayfill_reg, falsecolor_reg, grayquant_reg. Autotools fix for restricting giflib to 5.1+, and allowing openjpeg 2.3 (James Le Cuirot). 1.75.1 31 Jan 18 Simpler and more accurate function for finding word masks from text image; better debugging and more thorough testing. Added to regression test set: prog/italic_reg Fix for potential injection attack using gplot rootdir. Bug fix for bmp reading to set opacity. 1.75.0 24 Jan 18 This is a new version, for major Ubuntu release 18.04. 
$TMPDIR path rewriting turned off on Unix; only used for Windows. Added pix conversion to depth 2 and 4. We now have general converters to 1, 2, 4, 8, 16 and 32 bpp. Modified giflib to use read/write from/to memory; no temp files; no longer support versions before 5.1. Move most low-level code from separate files to their callers; about 30 of them became static. Improved table detection on scanned page images (tests: pageseg_reg.c) Added support for write/compare regression tests for files. Modified printimage for more flexibility. Enable lookup by key on comma-separated key/value text file. Update README.html for building with Visual Studio. Improved functions for getting pixel averages in RGB images Simplified and speedup of unsharp masking. New function for detecting and correcting text orientation. Remove slow sharpening operation when not appropriate during scaling. Better handling of gplots with 0 or 1 data point. Coverity scan fixes. Modified jpeg2000 header to use openjpeg 2.3. Improved depth accessors for pixa and pixaa; added size accessors for pixa and pixaa. Bug fix in webp interface on read error. New function that finds the closest boxes in a boxa to any particular box, in each of 4 directions. New regression tests in automated sequence: blend5_reg, quadtree_reg, wordboxes_reg. New program: textorient Removed programs: snapcolortest 1.74.4 11 Jun 17 Converted two progs to reg tests New version because 1.74.3 had some spurious files (xtractprotos, endianness.h) 1.74.3 9 Jun 17 Coverity scan fixes. Several fixes for running on Windows, including subtle one with tiff encoding depending on pad bits. Utility and test if a page image likely has a table. Remove use of pixCreateTemplateNoInit() where it may cause problems. Make release 'configure-make ready' 1.74.2 19 May 17 Many simplifications and improvements to recognizer. 
Cleanup of doxygen comments Encoded pdf title in escape 4-byte hex (for safety) Fixed several hundred coverity scan possible leaks Added about 20 regression tests to the automated set. Improved vertical alignment of dewarp. Implemented preliminary method for correcting dewarp foreshortening due to page curvature. Improved multipage tiff reading and writing. Added a new version of textline finding. Fixed bug in fast 2D sharpening code (used in some scaling ops) BMP i/o rewritten to implement memory version directly. PNG i/o functions added for encoding and decoding directly to memory. Method for finding light color regions on scanned page images. 1.74.1 3 Jan 17 Configuration changes to support the patch number in the version (major.minor.patch). Removed all remaining pixDisplayWrite() calls in prog/. Cleaned up and/or promoted about 15 programs to full regression tests. There are now 95 tests in the regression set. Over half the initial coverity scan warnings have been removed. 1.74.0 10 Dec 16 Leptonica development was moved to github. The master is at: github.com/danbloomberg/leptonica Egor Pugin is the maintainer of the site. A very large number of changes have been made. Some of them follow; details can be found in the git commit messages. Static makefiles modified to work with gnu*9 and c*9. Modify SET macros to work on windows. New modes for RGB --> gray conversion. New functions added for displaying a pix from a pixa. Split out sort/hash/set/map functions for dna, sarray and pta. More robust horizontal deskew on multi-column page images. Improve webpio_reg test. Remove X11 display for gplot; it is no longer supported. Remove most sleep calls, which were put in for gplot; no longer needed. Removed use of gthumb in library. Removed use of pixDisplayWrite() in the library; still in some progs. Improved test for endianness in makefile.static; no longer requires any local files or building and running a program. 
Modified all files for doxygen output (spearheaded by Juergen Buchmueller) Improved plotting of the boxes in a boxa. Replaced the slow point hash function with a simple fast one. Added pam (4 component) format writing to pnmio.c (Juergen Buchmueller) Improved rendering of pixa in side/by/sides. Better utilities for pixa and pixacomp. Add read/write serialization functions from/to memory for all the major data structures that do not already have them. More serialized boot recognizers stored as self-generating code. Cleaned up generating an adapted recognizer from the boot recognizer. Simplified temp file naming; removed most instances of named temp files from non-debug code; use tmpfile() and a wrapper l_makeTempFilename(). Simplify and streamline multipage tiff reading (Jeff Breidenbach). Improvement of Otsu thresholding. Recognize outstanding contributors to leptonica over the years. New gif mem read/write interface that avoids writing a temp file, contributed by Tobias Peirick. Use double arrays (dna) instead of float (numa) for set ops. Enrolled in coverity scan to find potential bugs (Stefan Weil managed it). Fixed about 200 of them, mostly potential memory leaks from early function exit. Cleanup of gray quantization functions and tests. Refactored connectivity-conserving operations, to make them more useful. Provided methods for measuring and regularizing the width of strokes. Removed viewfiles.c from library; code is now in prog htmlviewer.c. Better debugging in page segmentation (pageseg.c) Deprecated the pixDisplayWrite*() debugging methods. Added about 15 regression tests to the framework in alltests_reg.c Final mods for compatibility with tesseract 4.00. 1.73 25 Jan 16 All lept_* functions have been rewritten to avoid path rewrites for output to temp files, which were introduced in 1.72. 
Now, (1) files are written to the directory specified and (2) we are careful to write to subdirectories of /tmp/lept/ for all test programs, starting with the reg tests and prog/dewarp* and prog/recog*. This also required re-writing stringcode.c and stringtemplate1.txt to write temp files to subdirectories. Goal is to write to the specified path while not spamming the /tmp and /tmp/lept directories. This is particularly important on windows because files in the temp directory are not cleared on reboot. Naming changes (to avoid collisions): #defines MALLOC --> LEPT_MALLOC, CALLOC --> LEPT_CALLOC, etc. ByteBuffer --> L_ByteBuffer Added grayscale histogram functions that can be used to compare images. Added functions to determine if an image region has horizontal text lines. Added functions to compare photo regions of images to determine if they're essentially the same. Added red-black tree utility functions to implement maps and sets. The keys for maps and sets can be 64-bit entities (signed and unsigned integers and doubles). Implemented hashsets and hashmaps, using 64 bit keys. Replaced the numaHash by l_dnaHash; removed numa2d Improved security of tiff and gif reading, to prevent memory corruption when reading bad data. Removed src files: bootnumgen.c Added src files: rbtree.c, rbtree.h, map.c, bootnumgen1.c, bootnumgen2.c Added prog files: rbtreetest.c, maptest.c, settest.c, hashtest.c, recog_bootnum.c, percolatetest.c Added files for building using cmake (Egor Pugin) 1.72 5 Apr 15 Better handling of 1 bpp colormap read/write with png so that they are lossless. The colormap is always removed on read and the conversion is to the simplest non-cmapped pix that can fully represent the input -- both with and without alpha. Fixed overflow bug in pixCorrelationBinary(). Fixed orientation flags and handling of 16 bit RGB in tiff. Also new wrappers to TIFFClientOpen(), so we no longer go through the file descriptor for memory operations. Improvements in the dewarp functions. 
New box sequence smoothings. New antialiased painting through mask; previously it was only implemented for connected components in a mask. Better error handling and debug output with jpeg2000 read/write. Implemented base64 encoding. This allows binary data to be represented as a C string that can be compiled. Used this in bmf utility. Implemented automatic code generation for deserialization from compiled strings (stringcode.*) Regression tests write to leptonica subdir of the temp directory in windows; in unix it is optional. This avoids spamming the temp directory. Added new colorspace conversions (XYZ, LAB). New source files: encoding.c, bmfdata.h, stringcode.c, stringcode.h, bootnumgen.c. Removed source files: convolvelow.c, graymorphlow.c New programs: genfonts_reg, colorize_reg, texturefill_reg, autogentest1, autogentest2. alltests_reg now has 66 tests. 1.71 18 Jun 14 This version supports tesseract 3.04. In particular, 3.04 has automatic conversion of a set of scanned images, either in a directory or coming directly from a scanner, into pdf with injected text. This is something we've wanted to do for several years! Improved jp2k header reading, including resolution. Removed src files: rotateorthlow.c, pdfio.c, pdfiostub.c Renamed jp2kio.c, jp2kiostub.c ==> jp2kheader.c, jp2kheaderstub.c. These header reading functions parse the jp2k files, and don't require a jpeg2000 library. New jp2kio.c, jp2kiostub.c, that uses openjpeg-2.X to read and write jp2k. We now support I/O from these formats: png, tiff, jpeg, bmp, pnm, webp, gif and jp2k as well as writing to PostScript and pdf. New pdfio1.c, pdfio1stub.c, pdfio2.c pdfio2stub.c, where we've split functions into high and low level. Fixed memory bug in bilateral.c Improved reading/write of binary data from file. For example, l_binaryReadStream() can now be used to capture data piped in via stdin. 
Font directory now arg passed in everywhere (not hardcoded) Don't write temporary files to /tmp; only to a small number of subdirectories, to avoid spamming the /tmp directory. E.g., for regression tests, the current output is now to /tmp/regout/. For jpeg reading modify pixReadJpeg() to take as a hint a bit flag that allows extraction of only the luminance channel. Allow wrapping of pdf objects from png images without transcoding (thanks to Jeff Breidenbach) Better support for alpha on read/write with png, including 1 bpp with colormap, alpha (supported in png with transparency array) 1.70 3 Feb 14 (distribution to debian; ubuntu 14-04; 4.1.0) New bilateral filtering. New simple character recognition utility. Improved dewarping functionality, in model building and rendering. More flexible use of ref models. Better and more consistent handling of alpha layer in RGBA, through use of the spp field. Ability to handle more png files with alpha, including palette with alpha. New fast converters from jpeg and jpeg2000 to pdf, without transcoding. Made bmp reader (and pix reading in general) more robust; avoid size overflow errors. New text labelling operations; depth conversion of a set of images New license (essentially BSD 2-clause), to specify conditions for both source and binary distribution. Improved auto make: make all progs, install just 11, test 61. 
New src files: bilateral.{c,h}, dewarp1.c, dewarp2.c, dewarp3.c, dewarp4.c, jp2kio.c, jp2kiostub.c, pixlabel.c, recogbasic.c, recogdid.c, recogident.c, recogtrain.c, recog.h New prog files: adaptmap_dark.c, alphaxform_reg.c, bilateral_reg.c, binarize_reg.c, binarize_set.c, blackwhite_reg.c, blend1_reg.c, blend3_reg.c, blend4_reg.c, boxa1_reg.c, colorcontent_reg.c, coloring_reg.c, colorspace_reg.c, compare_reg.c, converttopdf.c, croptest.c, dewarprules.c, dewarptest1.c, dewarptest2.c, dewarptest3.c, dewarptest4.c, displayboxa.c, displaypix.c, displaypixa.c, findcorners_reg.c, fpix1_reg.c, fpix2_reg.c, fpixcontours.c, insert_reg.c, italictest.c, jpegio_reg.c, label_reg.c, multitype_reg.c, nearline_reg.c, newspaper_reg.c, numa1_reg.c, numa2_reg.c, recogsort.c, recogtest1.c, shear1_reg.c, webpio_reg.c, wordboxes_reg.c Removed src files: arithlow.c, binexpandlow.c, binreducelow.c, dewarp.c Removed prog files: blend_reg.c, blendtest1.c, dewarptest.c, fpix_reg.c, inserttest.c, numa_reg.c, rotatetest2.c, shear_reg.c, xvdisp.c 1.69 16 Jan 12 (distribution to debian; ubuntu 12-04; 3.0.0) Fixed bug in pdf generation for large files, using a new double array (dnabasic.c). Added several new modes for pdf generation from sets of images. Dewarp based on image content now aligns to left and right margins; works at book level; is more robust to bad disparity models; version 2 serialization. Fixed regutils to return the actual number of errors. Improved sorting efficiency of numas in cases where binning, which is order N, makes sense. Fixed fpix serialization (now version 2). New version (5) of xtractprotos, allows putting prototypes in-line in allheaders.h. Having them separately in leptprotos.h is still an option. New copyright (BSD, 2 clause) on src files. Removed all trailing whitespace in src files. 
New src files: boxfunc4.c coloring.c, dnabasic.c New prog files: dna_reg.c, alphaops_reg.c Removed prog file: alphaclean_reg.c 1.68 10 Mar 11 Fixed windows issues with passing pointers across C-runtime boundaries when using dlls, by providing special functions (e.g., lept_fopen()). Proper version numbers are now set with automake. New utility (quadtree.c) for generating quadtree statistics. New utility (in colorspace.c) for conversions to and from YUV. Refactored functions for assembling image data for generating either PS or PDF images using g4, jpeg or flate encoding. Better tempfile names, using current time in microseconds. Functions for getting resolution from jpeg and png files. Use size_t throughout for reading and writing binary data. Deprecate arrayRead*() and arrayWrite() functions; replace in the library with l_binaryRead*() and l_binaryWrite(). Better handling of colormap images for in-place rasterop and shear. New utility (bytearray.c) for parsing and handling binary data; used for generating PDF files. New utility (pdfio.c) for generating PDF files. Refactored regutils functions to make them simpler to use. Top-level deskew now works on any image. Added functions in utils.c for cross-platform development, mostly for functions that make and remove directories, copy, move and delete files, etc. It should now be straightforward to write programs that will compile and run on windows. Reg tests have better printout; all give timings. New utility program: convertfilestopdf 1.67 9 Nov 10 Autoconf: now built with James Cuirot's config files that build the library and all 200 progs. New sudoku solver. Just a game, but there are interesting aspects. Modified parseprotos.c to reject a type of "extern" decl. Add faster implementation for very small gray morphology operations (3x1, 1x3, 3x3). Eliminate warnings on recent gcc if you don't check return values from fread, fscanf, fgets, system, etc. 
Convolution: new functions for windowed variance and stdev; allow non-square kernel for windowed mean square. Put stdio.h and stdlib.h in alltypes.h, so they're not required in any .c files. Replace numaConvolve(), which is just a windowed mean, by windowed statistics functions (mean, mean square, variance). Generalize pixExtractOnLine() for arbitrary lines. Add pix interface to webp (webpio.c, webpiostub.c). This is a new open source codec, based on the video codec vpx (webm). Serialization of FPix and DPix Interconversion between FPix and DPix Integer scaling of FPix and DPix; includes the last row and column. New convertfiles.c: depth conversion on files in a directory. Testing programs in prog: convolve_reg.c, numa_reg.c: expanded test set projection_reg.c (tests pixRowStats(), pixColumnStats()) dewarptest.c: output ps and pdf files writemtiff.c: simple driver to write images to a single file 1.66 3 Aug 10 More tweaks for including (or not) bounding box hints for PS wrapping. Default is to write b.b., but not in functions that wrap images as full pages (psio1.c, psio2.c) pix4.c split in two files, and added function to identify c.c. that are sufficiently similar in shape to rectangles (pix5.c) Modify 2 and 4 bit setters to clip the input value so that it can only affect the pixel requested (arrayaccess.c, arrayaccess.h) New pseudorandom sequence functions (numafunc1.c) Dewarping camera-based images using textlines (dewarp.c, prog/dewarp*) Geometrical function for aggregating overlapping bounding boxes. Programs to generate figures for book chapter "Document Image Applications" in "Mathematical Morphology: theory and applications" (see: http://www.leptonica.org/najman-talbot-book-chapter.html) (prog/livre*.c) Functions that do affine and other operations in images with alpha blending at edges: pix*WithAlpha(). Also do this with a gamma/inverse-gamma wrapper to further reduce edge aliasing. 
(rotate.c, scale.c, projective.c, affine.c, bilinear.c, prog/alphaxform_reg.c) Improved color segmentation (fixed bugs; made faster) Higher order least square fits: quadratic, cubic, quartic. (pts.c) Various mods for otsu binarization and the *SplitDistribution*() functions (numafunc2.c, prog/otsutest2.c) Control sampling in convolution output (convolve.c, prog/fpix_reg.c) Morphological operations on numas (numafunc1.c, numafunc2.c, prog/numa_reg.c) Pix serialization wrapped so we can use pixRead(), etc on these files (spixio.c, readfile.c, writefile.c) Gif read/write to memory fixed (and cheated -- using files) (gifio.c) New fpix and dpix accessors; contour rendering on fpix (fpix1.c, fpix2.c) Various functions for linearly mapping colors and displaying arrays of colors (pix4.c, blend.c, prog/rankhisto_reg.c) Functions for getting approximate ranges of colors and color components in an image (pix4.c, colormap.c) Cleaned up windows platform and compiler defines and macros. 1.65 5 Apr 10 Added regression test utility functions for standardizing and automating construction and running of regression tests. Makes the golden files when the 2nd arg to the reg test is 'generate'. (regutils.{c,h}) Converted 22 reg tests in prog to use this; invoked with alltests_reg. Goal is to put all prog/*_reg.c into this format and put a set of golden files on the web. Small fixes in gifio for handling streams properly. New functions for shifting colors, hue invariant transforms, etc (blend.c) prog/dwamorph*.c: rename *1_reg.c to dwalineargen.c; others converted to standard reg tests. New rgb convolution functions. For PS output, write all images with a bounding box hint and with page numbers, which works for both embedded (e.g., in tex) and full page generated PS. Once converted to pdf, this is fine in all situations. New functions for initialization and random insertion with pixcomp. 
For color quantization, make the lightest color white if sufficiently close; ditto for black (colorquant1.c, colorquant2.c). Rank binning of 8 bpp and rgb images (numafunc2.c, pix4.c) A function to rank colors by the intensity of the minimum comp (pix4.c) New pixRotateBinaryNice(), rotates 1 bpp pix in such a way that the shear lines aren't visible. (rotate.c) New pixSaveTiledWithText(), a convenience function to append text to images that are being tiled. (writefile.c) Stereoscopic warping functions and stereo pair functions (warper.c) Linear interpolated shear -- better than rasterop shear (shear.c) Option to use higher quality chroma (U,V) sampling in jpeg (jpegio.c) Rename Bmf --> L_Bmf. New tests in prog: alltests_reg.c alphaclean_reg.c, psio_reg.c, rankbin_reg.c, rankhisto_reg.c, warpertest.c 1.64 3 Jan 10 Easy setup for standard byte processing on 8 bpp pix (pix2.c) Evaluation of difference between similar images; test for similar images and (compare.c) Subpixel scaling, with color input and separate scale factors (pixconv.c) Fix tiff header reader to get correct format (tiffio.c) Enable pixDisplay() to work with i_view (windows) and with xzgv and xli as well as xv; allow application to choose which to use (writefile.c). Use a mask to specify regions that are changed by a morphological operation (morphapp.c). Improve the default sharpening for scaling (scale.c) Function to test for equivalence of file data (utils.c) Select and read image files with embedded index (readfile.c) Fix box size calculation in pixEmbedForRotation(); solution provided by Brent Sundheimer. New pixDisplayMultiple(), instead of calling gthumb directly; this is now set up to use i_view for windows. Changed criteria for determining if an image is color (colorcontent.c, colorquant{1,2}.c). Optional mode where the filename extension is automatically written to output image files; particularly useful for windows. Initialize boxa and pixa as full, with minimal placeholders. 
Get rank valued numbers and boxes in numa and boxa. Cute implementation for finding largest solid rectangle (maze.c) New median cut quantization for mixed (color/gray) images (colorquant2.c) Many changes to allow the library and applications to be built easily in windows. There is now a thorough windows readme, written by Tom Powers, for doing this. The windows build information and project files are now in a new vs2008 directory. 1.63 8 Nov 09 Added pixScaleToGrayFast(), a faster version with very similar quality. Fixed scaleGrayLILow() to handle edge pixels more accurately Text processing: new text application (finditalic.c, prog/finditalic.c) for locating words in italic type style. Easier to add text to a pix using the bitmap font stored in the font directory; see, e.g., prog/writetxt_reg.c. Blending of 2 images with an alpha channel: pixBlendWithGrayMask() Fixed bug in color segmentation; it now (again) works properly. New utility (pixcomp.c) for handling compressed pix arrays in memory; new PixComp and PixaComp structs. Fast serialization of pix without compressing (pixSerializeToMemory and pixDeserializeFromMemory); required serialized colormaps FileI/O: new functions for reading file headers. PostScript generation modernized; split psio.c into psio1.c and psio2.c; added level 3 (flate) encoding. new functions for reading and writing multipage tiffs, for arbitrary input images. For writing, compression is lossless (either g4 or zip) update all I/O stub files Miscellaneous: new pixaAddBorderGeneral(); new functions in pix3.c for counting fg pixels and summing 8 bpp pixels by column and row; new numaUniformSampling() for resampling with interpolation; subpixel scaling. New or improved regression tests in prog: extrema_reg, pixalloc_reg, blend2_reg, rotateorth_reg, ioformats_reg, colorseg_reg, pixcomp_reg, pixserial_reg, writetext_reg, psioseg_reg, subpixel_reg. Interface changes: findFileFormat() and findFileFormatBuffer(): now returns format using input ptr. 
The function return value is 0 if OK; 1 on error rename: pixThresholdPixels() --> pixThresholdPixelSum() 1.62 26 Jul 09 Expanded composite Dwa implementation as a sequence of operations, so that it now works beyond a size of 63. It's typically about 2x faster than the composite rasterop implementation (with help from Ankur Jain). Also use data transfer instead of data copy whenever possible. Thorough tests with binmorph4_reg and binmorph5_reg. New functions in colorseg.c for masking and histogramming in HSV color space. Treat string constants rigorously as const char*, initializing to char[] if to be used as non-const, or in some cases casting to char*. This avoids compiler warnings. Improved color quantization using existing colormap for octcubes and a new version for grayscale. This will rigorously map most black and most white octcubes (rsp) to black and white if they exist in the colormap. Fast quantization to an existing colormap for color and grayscale. Fixed some bugs; e.g., in pixAffineSampled() for 1 bpp with L_BRING_IN_BLACK; reading and writing pnm for 2 and 4 bpp. In pngio.c, enable compile time control over these settings: converting 16 bpp --> 8 bpp on read removing alpha channel on read setting zlib compression on write For general scaling, allow sharpening to be optional, and provide faster sharpening operations. Improve support for 16 bpp grayscale. For scaleToGray* functions, reduce the width truncation. In psio.c, new functions for converting segmented page images (text and image) into level 2 PostScript. Removed all implicit casting to const char*. New custom pix memory allocator, designed for large pix whose memory needs to be reused many times. In xtractprotos, we now allow prepending an arbitrary string to each prototype. In environ.h, additions for MSVC to work with VC++6, including prototype strings for dll import and export (thanks to Ray Smith). 
In colorseg.c, new functions for building HSV histograms, finding peaks, and generating masks based on the peaks. New or improved regression tests: pixalloc_reg, binmorph4_reg, binmorph5_reg, conversion_reg, scale_reg, cmapquant_reg, 1.61 26 Apr 09 New histo-based grayscale quantization: pixGrayQuantizeFromHisto(), that is used in new pixQuanitzeIfFewColorsMixed(). Made final fix in pixBlockconv(). No underflows; no more overflows! More efficient rgb write with pnm. Add proto to jpegiostub.c, allowing proper use of the stubber. Fix several filter functions to use proper test on filter size; viz., pixMinMaxTiles(), several functions in convolve.c. Redo shear implementation to handle arbitrary angles, to handle colormapped images, and to avoid the singularity at pi/2. Removed both static vars from pixSaveTiled(). Generalized pixRotate() to handle colormapped images, and to use pixRotateBySampling() in place of the removed pixRotateEuclidean(). New skew finder for full angle range, pixFindSkewOrthogonalRange(). For skew detection, now allow shear about image center as well as about the UL corner. New rotation reg tests: rotate1_reg.c and rotate2_reg.c. Better serialization format for boxaa; introduce new version numbers for boxaa, pixa, and boxa, as required. Proper init in boxGetGeometry(), boxaGetBoxGeometry(), and the accessors in sel1.c and kernel.c. Improved Numa functions in numafunc1.c and numafunc2.c; in particular, numaMakeHistogramAuto() and numaGetStatsUsingHistogram(). With all histo generators, make sure the start and binsize params are properly set and are used. Interface change: Use these parameters implicitly in numaHistogramGetRankFromVal() and numaHistogramGetValFromRank(). Interface change to ptaGetLinearLSF(): add 1 optional parameter. In several pixaDisplay*() functions, handle colormaps properly. pixafunc.c split to pixafunc1.c and pixafunc2.c. New connected component selections and options in pixaSort. 
Patch from Tony Dovgal for reading tiff rgba files. Added new logical operation options for numas. New pixConvertRGBToGrayMinMax() that chooses min or max of 3 components. Computation of pixelwise aligned stats between multiple images of the same size (e.g., video), in pix4.c. Very fast binsort implemented for boxa and pixa. Cleanup and rename stack, queue, heap and ptra functions: all structs and typedefs start with "L_" all functions start with "l" Sel creation for crosses and T junctions. New thresholding operations to binary; split out from adaptmap.c into binarize.c. Implementation of sauvola binarization, including use of pixtiling. Added composite parallel union and intersection morphological operations. Small changes to scaling and rotation to improve accuracy; only visible on very tiny, symmetric images. Implemented DPix (double precision data); useful for the mean square accumulator for sauvola binarization. New fast hybrid grayscale seedfill, in addition to the iterative version (contributed by Ankur Jain). New or improved regression tests: rotate1_reg, rotate2_reg, shear_reg, numa_reg, skew_reg, ptra1_reg, ptra2_reg, paint_reg, smallpix_reg, pta_reg, pixmem_reg, binarize_reg, grayfill_reg. 1.60 19 Jan 09 Fixed bug in pixBlockconv(), introduced in 1.59, that causes overflow when convolving with an image that has white (255) at the edges. [quickly found by Dave Bryan] Include function to display freetype fonts in a pix. The files freetype.c and freetype.h are in the distribution, but are not yet linked into the library. This is contributed by Tony Dovgal, and this version works only for MSVC. Found that the problems with binary compression in giflib are fixed with giflib 4.1.6. 1.59 11 Jan 09 Lots of changes since 1.58. New files: affinecompose.c, ptra.c, warper.c, watershed.{h,c}. 
Split: boxfunc.c --> (boxfunc1.c, boxfunc2.c, boxfunc3.c) Improved connected component filtering, with logical functions applied to indicator arrays (pix4.c, pixafunc.c, numafunc1.c). Function to determine if an image can be quantized nicely with only a few colors (colorcontent.c, pixconv.c). New gray seed-filling functions (seedfill.c, seedfilllow.c). Fixed bugs in tophats and hdome, due to misuse of pixSubtractGray() (morphapp.c). New function for improving contrast (adaptmap.c) Watershed transform (still slightly buggy) (watershed.c,h). Fast random access into a pix using line pointers (pix1.c, arrayaccess.*) Conversions of colormaps from gray to color and v.v. (colormap.c) Seedfill function that applies an upper limit to the fill distance (seedfill.c) New function for warping images with random harmonic distortion (with help from Tony Dovgal). New generic ptr array utility: all O(1) functions of a stack plus random replace, insert and delete (ptra.c). Simple functions for colorizing a grayscale image with an arbitrary color (pixconv.c, colormap.c) Flexible affine transforms (translation, scale, rotation) on pta and boxa (affinecompose.c). Clipping of foreground (both exact and approximate) starting from within a rectangular region of the image (pix4.c) Blending a colored rectangle over an image (pix2.c, boxfunc3.c) Generation of rectangle covering of mask components (boxfunc3.c). Block convolution using tiles (for very large images) (convolve.c) New or improved regression tests in prog: locminmax_reg, lowaccess_reg, grayfill_reg, adaptnorm_reg, xformbox_reg, warper_reg, cmapquant_reg, compfilter_reg, splitcomp_reg, affine_reg, bilinear_reg, projective_reg Acknowledgments: (1) Big thanks to Tony Dovgal for helping with the warping (e.g. for captcha). Tony also provided an implementation that allows rendering truetype fonts into a Pix on windows. 
This is not yet incorporated, because it opens a huge "can of worms," which is OK if you're going fishing but maybe not if you're trying to support leptonica on many platforms. TBD. (2) David Shao provided a libtools build system that includes building the prog directory! I believe this will work, but it is not yet included because of problems I continue to have with macros in version 2.61 of gnu libtools. (3) Steve Rogers is working on a MSVC build for the prog directory. I hope to have this available for 1.60. 1.58 27 Sept 08 Added serialization for numaa. New octree quantizer pixOctreeQuantByPopulation(), that uses a mixture of level2 and level4 octcubes. Renamed many functions in colorquant1.c, and arranged/documented them more carefully. Revised documentation in leptonica.org/papers/colorquant.pdf. Simplified customization for I/O libraries and fmemopen() in environ.h. Fixed bugs in colormap.c, viewfiles.c, pixarith.c. Verified Adam Langley's jbig2enc (encoding jbig2 and generating pdf from these encoded files) works properly with the current version -- see Section 24 of README.html for usage and build hints. New separable convolution; let pixConvolve() take 8, 16 and 32 bpp input. New floating pt pix (FPix) utility, which allows convolution and arithmetic operations on FPix; also interconversion to Pix. Ability to read headers on multipage tiff. More robust updown text detection in flipdetect.c. Use of sharpening to improve scaling when the scale factor is near 1.0. See prog/fpix_reg.c for regression test and usage. See prog/blend_reg.c for blending regression test, with new functions. 1.57 13 Jul 08 New Debian distribution for 1.57 (thanks to Jeff Breidenbach). Improved the Otsu-type approach for finding a binarization threshold, by choosing the min in the histogram subject to constraints (numafunc2.c, adaptmap.c) New function pixSeedspread() in seedfill.c, similar to a voronoi tiling, that is used for adaptive thresholding in pixThresholdSpreadNorm(). 
In the process, fixed a small bug in pixDistanceFunction(). (The approach was suggested by Ray Smith, and uses the fast Vincent distance function method to expand each seed.) Generalized the functions in kernel.c to use float weights for general convolution (Version 2 for kernel), and added gaussian kernel generation. Put all jpeg header functions into jpegio.c, where they belong. Fixed bugs in pixaInsertPix() and pixaRemovePix(). Added read/write serialization for Numaa. New functions for comparing two images using bounding boxes (classapp.c). 1.56 12 May 08 Added several new 1d barcode decoders. The functional interface is still in flux. Autoconf! To get this working, it was necessary to: determine and set the endian flag; select which libraries are to be linked; determine if stream-based memory I/O is enabled. This required a major cleanup of the include files, minimizing dependencies on external library header files, and getting everything to work with both autoconf (HAVE_CONFIG_H) and the old customized makefile. Customization is now all in environ.h. pixSaveTiled(): a new way to display tiled images. pixtiling.c: interface for splitting an image into a set of overlapping tiles, using mirrored borders for tiles touching the image boundary. pixBlendHardLight(): new blending mode with nice visual effects. pixColorFraction(): determines extent of color in image Both octree and median-cut color quantization check first if image is essentially grayscale; improvements to both algorithms. box*TransformOrdered(): general sequence of linear transforms. colorquant_reg.c, xformbox_reg.c, hardlight_reg.c: new regr tests. 1.55 16 Mar 08 New functions for combining two images arbitrarily through a mask, including mirrored tiling (pix3.c) Modify pixSetMasked*() to work on all images (pix3.c) New functions for extracting masked regions such as pixClipMasked() (pix3.c) and pixMaskConnComp() and pixMaskBoxa() (boxfunc.c). 
New functions to separate fg from bg (pix3.c), one of which is supported by numaSplitDistribution (numafunc.c). Modify sobel edge detector to take another parameter (edge.c) Support for 4 bpp cmyk color space in jpeg (jpegio.c) Modified median cut color quantization (colorquant2.c) Renamed colorquant.c (for octree quant) --> colorquant1.c. Absorbed conncomp.h and colorquant.h into specific files that depend on them (colorquant1.c, conncomp.c, pix.h) General convolution with utility for building kernels (convolve.c, kernel.c) Initial implementation of 1D barcode reader. So far, we just have the signal processing to locate barcodes on a page, deskew them, and find the bar widths, along with decoders for two formats. (readbarcode.*, prog/barcodetest.c) Made the default to stub out read/write for non-tiff image formats to memory; it doesn't work on Macs & they were complaining (*io.c) Include MSVC project files for building leptonlib under windows (leptonlib.*) 1.54 21 Jan 08 Histogram equalization (enhance.c). New functions for pixaa: serialization (r/w), creation from pixa, and a tiled/scaled display into a pixa (pixabasic.c, pixafunc.c). Read/write of tiff to memory (instead of a file, using the TIFFClientOpen() callback interface), contributed by Adam Langley (tiffio.c, testing in prog/ioformats_reg). Improved image statistics functions, both over tiles and through a mask over the entire image. Added standard deviation and variance; enable statistics for rgb and colormapped images, in addition to 8 bpp grayscale (pix3.c). New function to extract rgb components from a colormapped image (pix2.c). Fix pixWriteStringPS() to work with all depths and colormap (psio.c) Enable all non-tiff formats to also write and read to/from memory (*io.c) Added support for read/write to gif, contributed by Tony Dovgal (gifio.c, gifiostub.c, imageio.h). See Makefile for instructions on enabling this. 
1.53 29 Dec 07 Add 4th arg to pixDistanceFunction() to specify b.c., and fixed output to 16 bpp grayscale pix. (seedfill*.c) New un-normalized block grayscale convolution (convolve.c) Fixed bug in getLogBase2(), so that pixMaxDynamicRange() works properly. This is useful for displaying a 16 bpp pix as 8 bpp (pixarith.c). New function for getting rank val for rgb over a region specified by a mask (pix3.c). New function for extreme values of rgb colormap (colormap.c). New function pixGlobalNormNoSatRGB(), a variant of pixGlobalNormRGB() that prevents saturation for any component above a specified rank value (adaptmap.c). Added mechanism for memory management of pix (pix1.c). Added selective morphology by region given by a mask (morphapp.c). Fixed prototype extraction to work properly with function prototypes as args; released version 1.2 of xtractprotos (parseprotos.c, xtractprotos.c). Add a boxa field for pixaa, along with serialization (pixabasic.c), and modified display of pixaa to include this (pixafunc.c). Coalesced the version numbers for pixa, pixaa, boxa, and boxaa serialization (pix.h). New progs: modifyhuesat displays modified versions on a grid; textlinemask shows simple methods for extracting textline masks. 1.52 25 Nov 07 Implemented Breuel's whitespace partitioning algorithm (partition.c). Generalized pixColorMagnitude() to allow different methods for computing the color amount of a pixel (colorcontent.c). New methods for computing overlap of boxes (boxfunc.c). New methods for painting (solid) and drawing (outline) of boxes, replacing boxaDisplay() with pixDrawBoxa*() and pixPaintBoxa*() (pix2.c, boxfunc.c). Ray Smith fixed bug in the distance function (seedfilllow.c). For pixConvertTo1() and pixConvertTo8(), treat input pixs as a const and never return a clone or altered cmap (pixconv.c). Make pixGlobalNormalRGB() crash-proof (adaptmap.c). Tony Dovgal added ability to read jpeg comment (jpegio.c). 
1.51 21 Oct 07 Improved histogramming of gray and color images (pix3.c) Histogram statistics (numafunc.c). Better handling of tiff formats, testing rle and packbits output and improving level 2 postscript conversion efficiency (readfile.c, psio.c). Test program for r/w and display of Sels (prog/seliotest.c). Use endiantest to determine automatically which flags to set when compiling for big- or little-endians (endiantest.c) Compute a color magnitude for each rgb pixel (colorcontent.c). Allow separate modification of hue and saturation (enhance.c). Global transform of color image for arbitrary white point (adaptmap.c). 1.50 07 Oct 07 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| NOTE CAREFULLY: The image format enum in imageio.h has changed. This is an ABI change, and it requires recompilation of the library. ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Suggestions by David Bryan again resulted in several changes, including improvements to the dwa generating functions and interfaces. Major improvements for dwa code generation, including an optional filename for the output code, adding function prototypes to the code so it can easily be linked outside the library. Addition of 2-way composable dwa functions for bricks, with code addition to the library, and a new interpreter for dwa composable brick sequences (fmorphauto.c, fhmtauto.c, morphtemplate1.c, hmttemplate1.c, morphdwa.c, dwacomb*.2.c, morphseq.c) Exhaustively tested in six programs (prog/binmorph*_reg, prog/dwamorph*_reg). New input modes for Sels, from both color bitmap editors and a simple file format (sel1.c). Better Sel generation functions in sel2.c, including combs for composable brick operations and linear bricks for comparison. Removed unnecessary copies for more efficient border add'n & removal. Added RLE baseline enc/dec for tiff. Binary morphology documentation on the web page updated for these changes/additions. 
William Rucklidge unrolled inner loops and added LUTs to speed up several more functions, including correlation (correlscore.c), centroid calculation (morphapp.c), 2x linear interp grayscale scanning (scalelow.c), thresholding to binary (grayquantlow.c), and removal of colormaps to gray (pixconv.c). 1.49 23 Sep 07 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| NOTE CAREFULLY: The image format enum in imageio.h has changed. This is an ABI change, and it requires recompilation of the library. ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Suggestions by David Bryan resulted in several changes. pixUnpackBinary() unpacks to all depths. Can now write and read tiff in LZW and ZIP (gzip) formats. These, like uncompressed tiff, work on all bit depths. Also enabled pnm 16 bpp r/w, both non-ascii and ascii. ioFormatTest() now has better coverage and clarity; this is used in prog/ioformats_reg.c. Rewrite of morphautogen code to implement opening and closing atomically. Cleaner interaction with new text templates (fmorphauto.c, fhmtauto.c, sarray.c, *template*.txt,). More regression testing (e.g., binmorph1_reg.c, binmorph3_reg.c). 1.48 30 Aug 07 William Rucklidge sped up pixCorrelationScore() by in-lining all bit operations (jbclass.c). Generalized rank filtering from 8 bpp to color (rank.c). Fixed many functions that take a dest pix so that they don't fail if the dimensions or depth are not consistent with the src pix. The underlying change for this is to pixCopy() (pix1.c). Improved display of Sel as a pix; added selaDisplayInPix() to display all Sels in a Sela, orthogonal rotations of Sels (sel1.c). New functions for thinning and thickening while preserving connectivity and avoiding both free end erosion and dendritic cruft (ccthin.c, prog/ccthin1_reg.c, prog/ccthin2_reg.c). New function pixaDisplayTiledInRows() for compactly tiling pix in a pixa, plus documentation of different existing methods. 
(pixafunc.c) 1.47 22 Jul 07 New brick rank order filter (rank.c, prog/ranktest.c, prog/rank_reg.c). Use mirror reflection b.c. to avoid special processing at boundaries (pix2.c). Simple sobel edge detector (edge.c). Utility for assembling level 2 compressed images in PostScript (psio.c, prog/convertfilestops.c). Enable read/write of 16 bpp, grayscale tiff (tiffio.c, pix2.c). New function for finding the number of c.c., which is a bit faster than finding the b.b. or the component images (conncomp.c) New functions for finding local extrema in grayscale image (seedfill.c) 1.46 28 Jun 07 Added interpreted mode for color morphology (morphseq.c). In functions, make effort to consistently do early initialization of ptrs to objects returned on the heap. This is to try to avoid letting functions return uninitialized objects, even if they return early because of bad input. Split pixa.c into 2 files; revised the component filtering in both pixafunc.c and boxfunc.c. Added component filtering for "thin" components. Added subsampling functions for numa and pta. Word segmentation now works at both full and half resolution. Better methods for displaying and tiling (for debugging), using pixDisplayWrite(), pixaReadFiles() and pixaDisplayTiledAndScaled(). 1.45 27 May 07 Further improvements of orientation and mirror flip detection (flipdetect.c). Added 2x rank downscaling and general integer replicative expansion (scale.c). Simplified interface for averaging, and included tiled averaging, which is yet another integer reduction scaling function (pix3.c). 1.44 1 May 07 Split pix2.c into (pix2.c, pix3.c), with basic housekeeping functions (e.g., ops on borders, padding) in pix2.c. Split numarray.c into (numabasic.c, numafunc.c), with constructors and accessors in numabasic.c. Added a number of histogram, rank value and interpolation functions to numafunc.c. Add rms and rank difference comparison functions (compare.c). 
Separated orientation and mirror flip detection; fixed the latter (flipdetect.c). 1.43 24 Mar 07 New and fixed functions for handling word boxes (classapp.c) More consistent use of L_* flags (e.g., sarray.h, morph.h) Morphology on color images (gray ops on each component) (colormorph.c) New methods for generating sels; we now have five methods in sel1.c and 3 others in selgen.c. Also a function that displays Sels as an image, for use in documentation (sel1.c) New high-level converters, such as pixConvertTo8(), pixConvertTo32(), pixConvertLossless() (pixconv.c) Identify regression tests, and rename them as prog/*_reg.c. Complete revision of plotting package (gplot.c) New functions for comparing pix (compare.c) New morph application functions, such as the ability to run a morph sequence separately on selected c.c. in an image, and a fast, quasi-tophat function (morphapp.c) Cleanup and new interfaces to border representations of c.c. (ccbord.c) Page segmentation application (pageseg.c) Better serialization with version control for all major structs. Morphological brick operations with 2-way composite sels (morph.c) 1.42 26 Dec 06 New sorting functions, including 2-d sorting, for boxa and pixa, and functions that sort by index (e.g., pixa --> pixa and for 2d, pixa --> pixaa; ditto for boxa). New accessors for pix dimensions. A new strtokSafe() to substitute for strtok_r (utils.c). Page flip detection, using both rasterop and dwa morphology (flipdetect.c), with dwa generation (fliphmtgen.c) and testing (prog/fliptest.c). Increased basic sels from 42 to 52 (sel2.c). Better high-level interfaces for binary morphology with brick (separable) sels, both for rasterop (morph.c) and for dwa (morphdwa.c); fully tested for both asymmetric and symmetric b.c. (prog/morphtest3.c). Faster area mapping reduction for power-of-2 scaling. 1.41 5 Nov 06 Simplified morph enums, removing all unused ones (morph.h). Added new high-level interfaces for adaptive mapping (adaptmap.c). 
New method to extract color content of images (colorcontent.c). New method to generate sels from text strings, and to identify roman text that is not properly oriented (thanks to Adam Langley). Fast grayscale min/max (rank) scale reduction by integer factors. New accessors for box and sel, that should be used when possible. Thresholding grayscale mask by bandpass (grayquant.c). Use of strtok_r() for thread safety. 1.40 15 Oct 06 Fixed xtractprotos for cygwin. Minor fixes and improved documentation (baseline.c, conncomp.c, pix2.c, morphseq.c, pts.c, numarray.c, utils.c, skew.c). Add ability to quantize an rgb image to a specified colormap (colorquant.c); tested in prog/cmapquanttest.c. Modifications to allow conditional compilation on MS VC++, and to allow I/O calls to be stubbed out (new files: *iostub.c, zlibmemstub.c, pstdint.h, arrayaccess.h.ms60) 1.39 31 Aug 06 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| NOTE CAREFULLY: There has been an interface change to make affine, bilinear and projective transforms more general. The implementation has been changed to allow them to handle all image types and to make them faster (esp. with both sampled and interpolated mapping on color images). ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Added prog/Makefile.mingw to build executables. This is still in a relatively raw state. It is necessary to download gnuwin32 packages for 4 libraries (jpeg, png, zlib, tiff) to link with leptonlib and the main, and I still have not been able to build static executables (they require jpeg2b.dll, etc.). 1.38 8 Aug 06 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| NOTE CAREFULLY: There has been an interface change to both simplify and generalize the grayscale morphology operations: pixErodeGray(), pixDilateGray(), pixOpenGray(), pixCloseGray(), pixTophat() and pixMorphGradient(). The prototypes are not changed; old code will compile, but it will be wrong! 
The old interface had a size and a type (horizontal, vertical, square). The new interface takes horizontal and vertical Sel dimensions. ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| For cross-compilation to make windows programs, you can use src/Makefile.mingw to make a windows version of the library. 6x scale-to-gray function donated by Alberto Accomazzi. Interpreter added for sequence of grayscale morphological operations, including the tophat (morphseq.c). Pixacc container added to simplify the interface for accumulator arithmetic using Pix. Removed fmorph.c and fmorphlow.c from the library. These are very limited and were deprecated in favor of fmorphauto(), which autogens the code from (nearly) any Sel. Fixed some of the gray morphology operations, which had errors on the boundary. All gray morph ops should now be rigorously OK (graymorph.c). For testing of graymorph dualities, the graymorph interpreter, etc., see prog/morphgraytest.c. 1.37 10 Jul 06 [After v.36 was released, Jeff Breidenbach built a Debian distribution of Leptonica based on v.36, and you can now get Leptonica as a Debian package. Use "apt-cache search leptonica" to see what is available.] The libraries are now combined into a single library (liblept.a, liblept.so) and the function prototypes are also in a single file (leptprotos.h). cextract was found not to work on recent versions of linux that support 64 bit data types, and it is no longer distributed with leptonica. Instead, I wrote a prototype extractor in leptonica (xtractprotos). When you 'make allprotos', it now uses this program. The shared libraries now have major and minor numbers corresponding to the version. 1.36 17 Jun 06 Line graphics generation (graphics.c) reorganized; separated out pta generation from rendering. Can now render with alpha blending. Examples of use are given in prog/graphicstest.c. 
Sort functions for basic geometrical objects now have the option of returning a numa giving the sort order on the original array. The pixa sort can sort with either clones or copies of the pix. 1.35 21 May 06 The efficiency of the multipage jbig unsupervised classifier is significantly improved due to a NumaHash structure implemented by Adam Langley. Functions for computing runlength in 1 bpp images have been added. 1.34 7 May 06 Completely rewrote the jbig unsupervised classifier. It now works on multiple pages, and is more accurate in performing visually lossless substitutions. You can classify by connected components, characters, or words. The old data structures and interfaces have been removed. New unpackers from 1 to 2 and 1 to 4 bpp, with and without colormaps in the dest. 1.33 18 Mar 06 Generalized color snap to have different src and target colors, and to include colormaps (blend.c). Distribute into root directory that specifies the version number (e.g., 1.33). Add color space conversion between rgb and hsv. Re-bundle thresholding code from (binarize*.c, dibitize*.c) to grayquant*.c. pixThreshold8() now also quantizes 8 bpp --> 8 bpp. High-level pixRotate() that optionally expands image sufficiently so that no pixels are lost in any sequence of rotations (rotate.c). Generalize shear to specify color of pixels brought in, including for in-place operation (shear.c, rotateshear.c). Faster version of color rotation by area mapping, both about center and about UL corner. You can now use the standard color rotator (pixRotateAM) and get nearly the same speed as with the "Fast" one. 1.32 4 Feb 06 Ability to specify a sequence of binary morphological (& binary reduction/expansion) operations in a single function (morphseq.c). Fast downscaling combined with conversion from rgb to gray and to binary (scale.c). Utility for segmenting images by color (colorseg.c). 1.31 7 Jan 06 Remove more complicated functions that threshold to 2 bpp, retaining the simplest interface. 
Retest all thresholding and dithering. Add "ascii" write of PNM. Improve graphics writing of lines; generalize to colormaps. New colorization functions (paintcmap.c, blend.c).
1.30 22 Dec 05 Remove most instances of fprintf(stderr, ...), except within DEBUG or encapsulated in error, warning or info macros. As a result, there is no output to stderr if NO_CONSOLE_IO is defined. Adaptive mapping to make bg uniform (adaptmap.c). A few bug fixes. New PostScript output functions for embedding PS files (prog/converttops). Generalized some image enhancement functions. New functions for generating hit-miss sels.
1.29 12 Nov 05 More flexible blending of two images, with and without colormaps (see blend.c). Painting colormapped images through masks, etc. (see paintcmap.c). More flexible interface for gamma and contrast enhancement (see enhance.c).
1.28 8 Oct 05 Removed all pix colormaps for 1 bpp. Allow programmatic resetting of binary morphology boundary conditions. Added (yet) another simple octcube color quantizer. New colormap operations.
1.27 24 Sep 05 Renamed many of the enums and typedefs to avoid namespace collisions. This includes structs and typedefs for BMP. Interface change to pixClipRectangle(); apologies to everyone whose code is broken by these changes -- I hope it's worth it. Removed colormap.h; simplified all colormap usage, hiding details from all but a few colormap functions. The file format is now saved in the pix when an image is read, and by default the image can be written out in this format. Resolution info added for jpeg and png. Added L_INFO* macros and l_info* functions for printing (e.g., debugging) info. Suggestions and code kindly supplied by Dave Bryan, who helped solve compatibility issues with MINGW32 (e.g., in timing and directory functions). Added some blending and linear TRC functions. Generalized pixEqual() to include all cases with and without colormaps. New regression tests in prog: ioformats, equaltest.
1.26 24 Jul 05 Generalized affine pointwise to do interpolation as well as sampling. For both projective and bilinear transforms, implemented using both sampling and interpolation. Added function to remove keystoning by computing the necessary projective transform and applying it. Also find baselines in text images. Added downscaling using accurate area-mapping over subpixels.
1.25 25 Jun 05 Better endian conversion functions for 2 and 4 byte words. Remove colormaps before converting by thresholds. Added functions to read header parameters for png and tiff.
1.24 5 Jun 05 Added image splitting to allow printing in tiles (as several pages). Added new octree quantization function to generate 4 and 8 bpp colormapped output (not dithered). Fixed bmp resolution. Added new flag for colormap removal (using dest depth based on src colormap). Added I/O tests (prog/ioformats.c).
1.23 10 Apr 05 Added thresholding from 8 bpp to 2 and 4 bpp, allowing specification of both the number of output levels and whether or not a colormap is made.
1.22 27 Mar 05 Add pointer queue facility. To demonstrate it, you can now generate a binary maze using a cellular automaton and find the shortest path between two points in the maze. Add heap of pointers (keyed on the first field), which is used to implement a priority queue. This is applied to search for a "least cost" path on a grayscale image (a generalization of a binary maze).
1.21 28 Feb 05 Read/write of colormaps to file. For gplot, add a new latex output terminal. Bring pointers into the 21st century by including stdint.h, and using uintptr_t for the pointer address arithmetic in arrayaccess.*. This seems to be OK back to RH 7.0, but if you run into trouble with an earlier C compiler, let me know. Also, use enums for global constants whenever possible, and qualify named constants (e.g., ADD --> ARITH_ADD, HORIZ --> MORPH_HORIZ) to avoid possible interactions with other libraries.
1.20 31 Jan 05 Speed up of tiffio and pngio with byte swap generating new pix.
In textops.c, ability to split string into paragraphs, in preparation for more general typesetting. Automatic hit-miss Sel generation for pattern matching. Fast downscaling using a lowpass filter and subsampling. Generalization of several grayscale and color operations to work on colormapped images. Improved scale-to-gray and scaling reduction operations to be antialiased for best results.
1.19 30 Nov 04 Additions to fileIO: (1) new jpeg reading options, such as returning warnings and scaled raster; (2) ability to write custom tiff flags. Better tiling functions. Edge extraction, both with grayscale morphology and clipped convolution filters. More general painting through a binary mask: pixSetMaskedGeneral(). Unpacking from binary to 8, 16 and 32 bpp. Thresholding and dithering from 8 bpp to 2 bpp ("dibitization"). New bitmap font facility, using a single rendered font in a variety of sizes: allows painting the text on an image (binary, gray, RGB). (People have asked for the ability to write text on images.)
1.18 25 Aug 04 Changed typedefs of built-in types to avoid possible conflicts. Cleaned up and tested all programs in the prog directory. Simplified and fixed the pixSetMasked() and pixCombineMasked() functions.
1.17 31 May 04 Implemented distance function for 16 bpp. We can now write out 16 bpp PNG. Simple programs for generating PS from a directory of g4tiff or jpeg images. Changed implementation of erosion to allow either asymmetric or symmetric boundary conditions. The distinction is described on the binary morphology web page. Allow read/write of multipage TIFF files. Implemented read/write of PNM files.
1.16 31 Mar 04 New depth conversion functions, improved conversion to false color, new contour rendering (onto 1 bpp or onto the src grayscale image), new orthogonal rotations, better interface for doing arithmetic on 2-d arrays using a pix, improved distance function.
1.15 31 Jan 04 Fast interpolated color rotation with 4x4 subpixels; has nearly the accuracy of the slower method using 16x16 subpixels. Demonstration of line removal from grayscale sketch in prog/lineremoval.c. Conversion of grayscale to false color. Fixed shear and rotation functions to handle angle = 0.0 properly. Other small fixes and interface improvements.
1.14 30 Nov 03 Small implementation changes to list.c. Better sorting routines for number arrays (numa), plus sorting for box arrays (boxa) and pix arrays (pixa). PostScript wrapper for jpeg. Better handling of colormaps, and a simple function to convert an RGB pix with not more than 256 colors to the smallest colormapped pix. PS output wrappers for JFIF JPEG and TIFF G4 files. Comments compatible with doxygen for automatic documentation.
1.13 31 Oct 03 Cleaned up documentation in src. Made libraries and test programs ANSI C++ compliant. Added special cases to rasterops for alignment to word boundaries. Fixed pngio.c to work with most recent libpng (1.2.5).
1.12 30 Jun 03 Implemented border chain representation of a binary image; it writes/reads a compressed version and renders the original image back from the borders. Also writes the outline file out in svg format. Number arrays (numa) and point arrays (pta) are also extended to 2nd level arrays (numaa, ptaa). Serialized I/O for boxa, pta, and pixa.
1.11 31 May 03 Implemented generic list handling, for doubly-linked list cons cells and arbitrary objects.
1.10 14 Apr 03 Implemented simple image enhancements in gray and color: gamma correction, contrast enhancement, unsharp masking. Extended smoothing via block convolution to color. Implemented auto-gen'd DWA version of hit-miss transform; the code for generating these hmt routines is very similar to that for DWA auto-gen'd erosion and dilation.
1.9 28 Feb 03 Implemented a safe, expandable byte queue. As an example of its use, implemented memory-to-memory compression and decompression using zlib.
Generalized PS write to include RGB color. Implemented a method to find image skew.
1.8 31 Jan 03 Implemented a simple 1-pass color quantization with dithering, and improved the 2-pass octree color quantization. Documented with an application page that includes jbig2. Added new general sampling operations and made a table that summarizes the available scaling operations.
1.7 31 Dec 02 Added pixHtmlViewer(), a formatter that allows portable viewing of a set of images (like a slide show) in a browser. Implemented better octree color quantization, with variable number of colors, pruning the octree for good color clusters, and fast traversal for pixel assignment to colormap.
1.6 30 Nov 02 Generalized shear and shear rotation to arbitrary locations about which the operation is performed. Implemented in-place translation using pixRasteropIP(). Implemented arbitrary affine transform of image two ways: pointwise and sequential. Added binarization by error diffusion. Added simple color quantization by octree.
1.5 31 Oct 02 Put jpeglib.h in local directory. This, along with the jmorecfg.h file there, prevents compiler warnings about redefined typedefs. Compiled everything with g++ to make strictly ansi C compatible. Added interface gplotFromFile() for simple file-based plotting with gnuplot 3.7.2. Added functions to convert 2, 4 and 8 bpp color-mapped (i.e., paletted) images to 24 bpp color or 8 bpp grayscale. Added several jbig2 application cores that only require a simple wrapper to make into programs.
1.4 30 Sep 02 Added interface to gnuplot 3.7.2 and to x11 display of images. Added new functions with arrays of images for use in applications such as jbig2 encoders, along with a simple jbig2 implementation using either hausdorff or correlation scoring. Added centroid finder for images. For accessing image arrays from arrays of image arrays, added a "new reference" (NEW_REF) flag, with a ref count attached to the array. Added power-of-2 binary expansion and reduction.
1.3 30 Jun 02 Extended connected components to 8-connectivity. Added morphological operations tophat and hdome, along with clipped arithmetic operators on grayscale images. Fixed memory error in rasteropGeneralLow() that was found using valgrind. Tested most operations with valgrind for memory errors. Replaced integer arrays with number arrays, to include floats. Added arithmetic functions on grayscale images.
1.2 30 May 02 Added connected component utility, stack utility, pix arrays, line drawing and seed filling. Binary reconstruction, both morphological and raster-oriented, is now supported for 4 and 8 connected fills. Added the distance function on binary images, grayscale reconstruction, and grayscale morphology using the Gil-Werman method.
1.1 30 Apr 02 Added orthogonal rotations, binary scaling, PS output, binary reconstruction, integer arrays, structuring element input/output.
1.0 25 Feb 02 Initial distribution, with rasterops, binary morphology (two implementations: rasterops and dwa), affine transforms (translation, shear, scaling, rotation), fast convolution, and basic i/o (BMP, PNG and JPEG).