--- /n/sources/plan9/sys/man/1/gzip Sun Dec 2 23:42:15 2007
+++ /sys/man/1/gzip Sat May 1 00:00:00 2021
@@ -1,12 +1,12 @@
.TH GZIP 1
.SH NAME
-gzip, gunzip, bzip2, bunzip2, compress, uncompress, zip, unzip \- compress and expand data
+gzip, gunzip, bzip2, bunzip2, lzip, lunzip, compress, uncompress, zip, unzip \- compress and expand data
.SH SYNOPSIS
.B gzip
.RB [ -cvD [ 1-9 ]]
.RI [ file
.BR ... ]
-.PP
+.br
.B gunzip
.RB [ -ctTvD ]
.RI [ file
@@ -16,12 +16,22 @@
.RB [ -cvD [ 1-9 ]]
.RI [ file
.BR ... ]
-.PP
+.br
.B bunzip2
.RB [ -cvD ]
.RI [ file
.BR ... ]
.PP
+.B lzip
+.RB [ -cvD [ 1-9 ]]
+.RI [ file
+.BR ... ]
+.br
+.B lunzip
+.RB [ -cvD ]
+.RI [ file
+.BR ... ]
+.PP
.B compress
[
.B -cv
@@ -29,7 +39,7 @@
.I file
.B ...
]
-.PP
+.br
.B uncompress
[
.B -cv
@@ -44,7 +54,7 @@
.IR zipfile ]
.I file
.RB [ ... ]
-.PP
+.br
.B unzip
.RB [ -cistTvD ]
.RB [ -f
@@ -86,7 +96,9 @@
and
.IR gunzip ,
but use a modified Burrows-Wheeler block sorting
-compression algorithm.
+compression algorithm,
+which often produces smaller compressed files than
+.IR gzip .
The default suffix for output files is
.BR .bz2 ,
with
@@ -99,6 +111,32 @@
as a synonym for
.BR .tbz .
.PP
+.I Lzip
+and
+.I lunzip
+are also similar in interface to
+.I gzip
+and
+.IR gunzip ,
+but use a specific LZMA (Lempel-Ziv-Markov) compression algorithm,
+which often produces smaller compressed files than
+.IR bzip2 .
+The default suffix for output files is
+.BR .lz ,
+with
+.B .tar.lz
+becoming
+.BR .tlz .
+Note that the popular
+.I xz
+compression program uses a different file format (with LZMA2 encoding)
+and so files compressed by it will not be understood by
+.I lunzip
+and vice versa
+(and may not even be understood by other
+.I xz
+implementations).
+.PP
.I Compress
and
.I uncompress
@@ -130,7 +168,8 @@
If the process fails, the faulty output files are removed.
.PP
The options are:
-.TP 0.6i
+.\" .TP 0.6i
+.TP 0.3i
.B -a
Automaticialy creates directories as needed, needed for zip files
created by broken implementations which omit directories.
@@ -183,9 +222,7 @@
.B -D
Produce debugging output.
.SH SOURCE
-.B /sys/src/cmd/gzip
-.br
-.B /sys/src/cmd/bzip2
+.B /sys/src/cmd/*zip*
.br
.B /sys/src/cmd/compress
.SH SEE ALSO
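Note on the suffix rule documented in the manual page change above: the default output suffix is .lz, with .tar.lz becoming .tlz. The following is a minimal sketch of that mapping in portable C (not Plan 9 C). The function name lzoutname is hypothetical and is not taken from /sys/src/cmd/lzip; it only illustrates the documented rule, assuming the caller supplies the output buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: derive the compressed file name from the
 * input name, following the rule stated in the manual page:
 * "name" becomes "name.lz", and "name.tar" becomes "name.tlz"
 * (the short form of "name.tar.lz").
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int
lzoutname(const char *in, char *out, size_t outsz)
{
	size_t n = strlen(in);
	int r;

	/* Assumption: a bare ".tar" with no base name is not shortened. */
	if(n > 4 && strcmp(in + n - 4, ".tar") == 0)
		r = snprintf(out, outsz, "%.*s.tlz", (int)(n - 4), in);
	else
		r = snprintf(out, outsz, "%s.lz", in);
	return (r < 0 || (size_t)r >= outsz) ? -1 : 0;
}
```

For example, lzoutname("src.tar", buf, sizeof buf) leaves "src.tlz" in buf, and lzoutname("notes", buf, sizeof buf) leaves "notes.lz".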
diff -Nru /sys/src/cmd/lzip/AUTHORS /sys/src/cmd/lzip/AUTHORS
--- /sys/src/cmd/lzip/AUTHORS Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/AUTHORS Sat May 1 00:00:00 2021
@@ -0,0 +1,7 @@
+Clzip was written by Antonio Diaz Diaz.
+
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
+the definition of Markov chains), G.N.N. Martin (for the definition of
+range encoding), Igor Pavlov (for putting all the above together in
+LZMA), and Julian Seward (for bzip2's CLI).
diff -Nru /sys/src/cmd/lzip/COPYING /sys/src/cmd/lzip/COPYING
--- /sys/src/cmd/lzip/COPYING Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/COPYING Sat May 1 00:00:00 2021
@@ -0,0 +1,338 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) <year> <name of author>
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff -Nru /sys/src/cmd/lzip/ChangeLog /sys/src/cmd/lzip/ChangeLog
--- /sys/src/cmd/lzip/ChangeLog Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/ChangeLog Sat May 1 00:00:00 2021
@@ -0,0 +1,115 @@
+2017-04-13 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.9 released.
+ * The option '-l, --list' has been ported from lziprecover.
+ * Don't allow mixing different operations (-d, -l or -t).
+ * Compression time of option '-0' has been reduced by 6%.
+ * Compression time of options -1 to -9 has been reduced by 1%.
+ * Decompression time has been reduced by 7%.
+ * main.c: Continue testing if any input file is a terminal.
+ * main.c: Show trailing data in both hexadecimal and ASCII.
+ * file_index.c: Improve detection of bad dict and trailing data.
+ * lzip.h: Unified messages for bad magic, trailing data, etc.
+ * clzip.texi: Added missing chapters from lzip.texi.
+
+2016-05-13 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.8 released.
+ * main.c: Added new option '-a, --trailing-error'.
+ * main.c (decompress): Print up to 6 bytes of trailing data
+ when '-vvvv' is specified.
+ * decoder.c (LZd_verify_trailer): Removed test of final code.
+ * main.c (main): Delete '--output' file if infd is a terminal.
+ * main.c (main): Don't use stdin more than once.
+ * clzip.texi: Added chapter 'Trailing data'.
+ * configure: Avoid warning on some shells when testing for gcc.
+ * Makefile.in: Detect the existence of install-info.
+ * testsuite/check.sh: A POSIX shell is required to run the tests.
+ * testsuite/check.sh: Don't check error messages.
+
+2015-07-07 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.7 released.
+ * Ported fast encoder and option '-0' from lzip.
+ * Makefile.in: Added new targets 'install*-compress'.
+
+2014-08-28 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.6 released.
+ * Compression ratio of option '-9' has been slightly increased.
+ * main.c (close_and_set_permissions): Behave like 'cp -p'.
+ * clzip.texinfo: Renamed to clzip.texi.
+ * License changed to GPL version 2 or later.
+
+2013-09-17 Antonio Diaz Diaz <antonio@gnu.org>
+
+ * Version 1.5 released.
+ * Show progress of compression at verbosity level 2 (-vv).
+ * main.c (show_header): Don't show header version.
+ * Ignore option '-n, --threads' for compatibility with plzip.
+ * configure: Options now accept a separate argument.
+
+2013-02-18 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.4 released.
+ * Multi-step trials have been implemented.
+ * Compression ratio has been slightly increased.
+ * Compression time has been reduced by 10%.
+ * Decompression time has been reduced by 8%.
+ * Makefile.in: Added new target 'install-as-lzip'.
+ * Makefile.in: Added new target 'install-bin'.
+ * main.c: Use 'setmode' instead of '_setmode' on Windows and OS/2.
+ * main.c: Define 'strtoull' to 'strtoul' on Windows.
+
+2012-02-25 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.3 released.
+ * main.c (close_and_set_permissions): Inability to change output
+ file attributes has been downgraded from error to warning.
+ * encoder.c (Mf_init): Return false if out of memory instead of
+ calling cleanup_and_fail.
+ * Small change in '--help' output and man page.
+ * Changed quote characters in messages as advised by GNU Standards.
+ * configure: 'datadir' renamed to 'datarootdir'.
+
+2011-05-18 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.2 released.
+ * main.c: Added new option '-F, --recompress'.
+ * main.c (decompress): Print only one status line for each
+ multimember file when only one '-v' is specified.
+ * encoder.h (Lee_update_prices): Update high length symbol prices
+ independently of the value of 'pos_state'. This gives better
+ compression for large values of '--match-length' without being
+ slower.
+ * encoder.h encoder.c: Optimize pair price calculations. This
+ reduces compression time for large values of '--match-length'
+ by up to 6%.
+
+2011-01-11 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.1 released.
+ * Code has been converted to 'C89 + long long' from C99.
+ * main.c: Fixed warning about fchown return value being ignored.
+ * decoder.c: '-tvvvv' now shows compression ratio.
+ * main.c: Match length limit set by options -1 to -8 has been
+ reduced to extend range of use towards gzip. Lower numbers now
+ compress less but faster. (-1 now takes 43% less time for only
+ 20% larger compressed size).
+ * Compression ratio of option '-9' has been slightly increased.
+ * main.c (open_instream): Don't show the message
+ " and '--stdout' was not specified" for directories, etc.
+ * New examples have been added to the manual.
+
+2010-04-05 Antonio Diaz Diaz <ant_diaz@teleline.es>
+
+ * Version 1.0 released.
+ * Initial release.
+ * Translated to C from the C++ source of lzip 1.10.
+
+
+Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+This file is a collection of facts, and thus it is not copyrightable,
+but just in case, you have unlimited permission to copy, distribute and
+modify it.
diff -Nru /sys/src/cmd/lzip/NEWS /sys/src/cmd/lzip/NEWS
--- /sys/src/cmd/lzip/NEWS Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/NEWS Sat May 1 00:00:00 2021
@@ -0,0 +1,21 @@
+Changes in version 1.9:
+
+The option '-l, --list' has been ported from lziprecover.
+
+It is now an error to specify two or more different operations in the
+command line (--decompress, --list or --test).
+
+Compression time of option '-0' has been reduced by 6%.
+
+Compression time of options '-1' to '-9' has been reduced by 1%.
+
+Decompression time has been reduced by 7%.
+
+In test mode, clzip now continues checking the rest of the files if any
+input file is a terminal.
+
+Trailing data are now shown both in hexadecimal and as a string of
+printable ASCII characters.
+
+Three missing chapters have been added to the manual, which now contains
+all the chapters of the lzip manual.
diff -Nru /sys/src/cmd/lzip/README /sys/src/cmd/lzip/README
--- /sys/src/cmd/lzip/README Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/README Sat May 1 00:00:00 2021
@@ -0,0 +1,126 @@
+Description
+
+Clzip is a C language version of lzip, fully compatible with lzip-1.4 or
+newer. As clzip is written in C, it may be easier to integrate in
+applications like package managers, embedded devices, or systems lacking
+a C++ compiler.
+
+Lzip is a lossless data compressor with a user interface similar to that
+of gzip or bzip2. Lzip can compress about as fast as gzip (lzip -0),
+or compress most files more than bzip2 (lzip -9). Decompression speed is
+intermediate between gzip and bzip2. Lzip is better than gzip and bzip2
+from a data recovery perspective.
+
+The lzip file format is designed for data sharing and long-term
+archiving, taking into account both data integrity and decoder
+availability:
+
+ * The lzip format provides very safe integrity checking and some data
+ recovery means. The lziprecover program can repair bit-flip errors
+ (one of the most common forms of data corruption) in lzip files,
+ and provides data recovery capabilities, including error-checked
+ merging of damaged copies of a file.
+
+ * The lzip format is as simple as possible (but not simpler). The
+ lzip manual provides the source code of a simple decompressor along
+ with a detailed explanation of how it works, so that with the only
+ help of the lzip manual it would be possible for a digital
+ archaeologist to extract the data from a lzip file long after
+ quantum computers eventually render LZMA obsolete.
+
+ * Additionally the lzip reference implementation is copylefted, which
+ guarantees that it will remain free forever.
+
+A nice feature of the lzip format is that a corrupt byte is easier to
+repair the nearer it is to the beginning of the file. Therefore, with
+the help of lziprecover, losing an entire archive just because of a
+corrupt byte near the beginning is a thing of the past.
+
+Clzip uses the same well-defined exit status values used by lzip and
+bzip2, which makes it safer than compressors returning ambiguous warning
+values (like gzip) when it is used as a back end for other programs like
+tar or zutils.
+
+Clzip will automatically use the smallest possible dictionary size for
+each file without exceeding the given limit. Keep in mind that the
+decompression memory requirement is affected at compression time by the
+choice of dictionary size limit.
+
+The amount of memory required for compression is about 1 or 2 times the
+dictionary size limit (1 if input file size is less than dictionary size
+limit, else 2) plus 9 times the dictionary size really used. The option
+'-0' is special and only requires about 1.5 MiB at most. The amount of
+memory required for decompression is about 46 kB larger than the
+dictionary size really used.
+
+When compressing, clzip replaces every file given in the command line
+with a compressed version of itself, with the name "original_name.lz".
+When decompressing, clzip attempts to guess the name for the decompressed
+file from that of the compressed file as follows:
+
+filename.lz becomes filename
+filename.tlz becomes filename.tar
+anyothername becomes anyothername.out
+
+(De)compressing a file is much like copying or moving it; therefore clzip
+preserves the access and modification dates, permissions, and, when
+possible, ownership of the file just as "cp -p" does. (If the user ID or
+the group ID can't be duplicated, the file permission bits S_ISUID and
+S_ISGID are cleared).
+
+Clzip is able to read from some types of non-regular files if the
+"--stdout" option is specified.
+
+If no file names are specified, clzip compresses (or decompresses) from
+standard input to standard output. In this case, clzip will decline to
+write compressed output to a terminal, as this would be entirely
+incomprehensible and therefore pointless.
+
+Clzip will correctly decompress a file which is the concatenation of two
+or more compressed files. The result is the concatenation of the
+corresponding uncompressed files. Integrity testing of concatenated
+compressed files is also supported.
+
+Clzip can produce multimember files, and lziprecover can safely recover
+the undamaged members in case of file damage. Clzip can also split the
+compressed output in volumes of a given size, even when reading from
+standard input. This allows the direct creation of multivolume
+compressed tar archives.
+
+Clzip is able to compress and decompress streams of unlimited size by
+automatically creating multimember output. The members so created are
+large, about 2 PiB each.
+
+In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a
+concrete algorithm; it is more like "any algorithm using the LZMA coding
+scheme". For example, the option '-0' of lzip uses the scheme in almost
+the simplest way possible; issuing the longest match it can find, or a
+literal byte if it can't find a match. Conversely, a much more elaborate
+way of finding coding sequences of minimum size than the one currently
+used by lzip could be developed, and the resulting sequence could also
+be coded using the LZMA coding scheme.
+
+Clzip currently implements two variants of the LZMA algorithm: fast
+(used by option '-0') and normal (used by all other compression levels).
+
+The high compression of LZMA comes from combining two basic, well-proven
+compression ideas: sliding dictionaries (LZ77/78) and Markov models (the
+thing used by every compression algorithm that uses a range encoder or
+similar order-0 entropy coder as its last stage) with segregation of
+contexts according to what the bits are used for.
+
+The ideas embodied in clzip are due to (at least) the following people:
+Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for
+the definition of Markov chains), G.N.N. Martin (for the definition of
+range encoding), Igor Pavlov (for putting all the above together in
+LZMA), and Julian Seward (for bzip2's CLI).
+
+
+Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+This file is free documentation: you have unlimited permission to copy,
+distribute and modify it.
+
+The file Makefile.in is a data file used by configure to produce the
+Makefile. It has the same copyright owner and permissions that configure
+itself.
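Note on the README above: the "longest match it can find, or a literal
byte" strategy it attributes to option '-0' can be illustrated with a
brute-force greedy parser. This is only a sketch under stated
assumptions, not clzip's code: the names Token, MinMatch, longest_match
and greedy_parse are hypothetical, the minimum match length of 2 is
assumed, distances are stored as plain byte offsets, and the quadratic
search stands in for the hash tables a real encoder uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token: a literal byte or a (distance, length) match. */
typedef struct {
	int is_match;
	size_t dis;	/* bytes back from the current position (matches only) */
	size_t len;	/* match length, or 1 for a literal */
} Token;

enum { MinMatch = 2 };	/* assumed minimum useful match length */

/*
 * Brute-force search for the longest earlier match of data[pos..].
 * Candidates are scanned from nearest to farthest, so ties keep the
 * smallest distance.  The match may overlap the current position.
 */
size_t
longest_match(const unsigned char *data, size_t size, size_t pos, size_t *disp)
{
	size_t best = 0, start, len;

	for (start = pos; start-- > 0;) {
		len = 0;
		while (pos + len < size && data[start + len] == data[pos + len])
			len++;
		if (len > best) {
			best = len;
			*disp = pos - start;
		}
	}
	return best;
}

/* Greedy parse: emit the longest match if it is long enough, else a literal. */
size_t
greedy_parse(const unsigned char *data, size_t size, Token *out, size_t max_tokens)
{
	size_t pos = 0, n = 0;

	while (pos < size) {
		size_t dis = 0, len;

		assert(n < max_tokens);
		len = longest_match(data, size, pos, &dis);
		if (len >= MinMatch) {
			out[n].is_match = 1;
			out[n].dis = dis;
			out[n].len = len;
		} else {
			out[n].is_match = 0;
			out[n].dis = 0;
			out[n].len = len = 1;
		}
		pos += len;
		n++;
	}
	return n;
}
```

Parsing "abcabcabc" this way gives three literals followed by one
overlapping match at distance 3. A more elaborate encoder would instead
compare the coded cost of alternative token sequences before choosing.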
diff -Nru /sys/src/cmd/lzip/README.plan9 /sys/src/cmd/lzip/README.plan9
--- /sys/src/cmd/lzip/README.plan9 Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/README.plan9 Sat May 1 00:00:00 2021
@@ -0,0 +1,2 @@
+This is clzip 1.9, tuned and somewhat beautified.
+It's still not pretty but it's legible.
diff -Nru /sys/src/cmd/lzip/decoder.c /sys/src/cmd/lzip/decoder.c
--- /sys/src/cmd/lzip/decoder.c Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/decoder.c Sat May 1 00:00:00 2021
@@ -0,0 +1,294 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include "lzip.h"
+#include "decoder.h"
+
+void
+Pp_show_msg(Pretty_print *pp, char *msg)
+{
+ if (verbosity >= 0) {
+ if (pp->first_post) {
+ unsigned i;
+
+ pp->first_post = false;
+ fprintf(stderr, "%s: ", pp->name);
+ for (i = strlen(pp->name); i < pp->longest_name; ++i)
+ fputc(' ', stderr);
+ if (!msg)
+ fflush(stderr);
+ }
+ if (msg)
+ fprintf(stderr, "%s\n", msg);
+ }
+}
+
+/* Returns the number of bytes really read.
+ If the returned value < size and there was no read error, EOF was reached.
+ */
+int
+readblock(int fd, uchar *buf, int size)
+{
+ int n, sz;
+
+ for (sz = 0; sz < size; sz += n) {
+ n = read(fd, buf + sz, size - sz);
+ if (n <= 0)
+ break;
+ }
+ return sz;
+}
+
+/* Returns the number of bytes really written.
+ If (returned value < size), it is always an error.
+ */
+int
+writeblock(int fd, uchar *buf, int size)
+{
+ int n, sz;
+
+ for (sz = 0; sz < size; sz += n) {
+ n = write(fd, buf + sz, size - sz);
+ if (n != size - sz)
+ break;
+ }
+ return sz;
+}
+
+bool
+Rd_read_block(Range_decoder *rdec)
+{
+ if (!rdec->at_stream_end) {
+ rdec->stream_pos = readblock(rdec->infd, rdec->buffer, rd_buffer_size);
+ if (rdec->stream_pos != rd_buffer_size && errno) {
+ show_error( "Read error", errno, false );
+ cleanup_and_fail(1);
+ }
+ rdec->at_stream_end = (rdec->stream_pos < rd_buffer_size);
+ rdec->partial_member_pos += rdec->pos;
+ rdec->pos = 0;
+ }
+ return rdec->pos < rdec->stream_pos;
+}
+
+void
+LZd_flush_data(LZ_decoder *d)
+{
+ if (d->pos > d->stream_pos) {
+ int size = d->pos - d->stream_pos;
+ CRC32_update_buf(&d->crc, d->buffer + d->stream_pos, size);
+ if (d->outfd >= 0 &&
+ writeblock(d->outfd, d->buffer + d->stream_pos, size) != size) {
+ show_error( "Write error", errno, false );
+ cleanup_and_fail(1);
+ }
+ if (d->pos >= d->dict_size) {
+ d->partial_data_pos += d->pos;
+ d->pos = 0;
+ d->pos_wrapped = true;
+ }
+ d->stream_pos = d->pos;
+ }
+}
+
+static bool
+LZd_verify_trailer(LZ_decoder *d, Pretty_print *pp)
+{
+ File_trailer trailer;
+ int size = Rd_read_data(d->rdec, trailer, Ft_size);
+ uvlong data_size = LZd_data_position(d);
+ uvlong member_size = Rd_member_position(d->rdec);
+ bool error = false;
+
+ if (size < Ft_size) {
+ error = true;
+ if (verbosity >= 0) {
+ Pp_show_msg(pp, 0);
+ fprintf( stderr, "Trailer truncated at trailer position %d;"
+ " some checks may fail.\n", size );
+ }
+ while (size < Ft_size)
+ trailer[size++] = 0;
+ }
+
+ if (Ft_get_data_crc(trailer) != LZd_crc(d)) {
+ error = true;
+ if (verbosity >= 0) {
+ Pp_show_msg(pp, 0);
+ fprintf( stderr, "CRC mismatch; trailer says %08X, data CRC is %08X\n",
+ Ft_get_data_crc(trailer), LZd_crc(d));
+ }
+ }
+ if (Ft_get_data_size(trailer) != data_size) {
+ error = true;
+ if (verbosity >= 0) {
+ Pp_show_msg(pp, 0);
+ fprintf( stderr, "Data size mismatch; trailer says %llud, data size is %llud (0x%lluX)\n",
+ Ft_get_data_size(trailer), data_size, data_size);
+ }
+ }
+ if (Ft_get_member_size(trailer) != member_size) {
+ error = true;
+ if (verbosity >= 0) {
+ Pp_show_msg(pp, 0);
+ fprintf(stderr, "Member size mismatch; trailer says %llud, member size is %llud (0x%lluX)\n",
+ Ft_get_member_size(trailer), member_size, member_size);
+ }
+ }
+ if (0 && !error && verbosity >= 2 && data_size > 0 && member_size > 0)
+ fprintf(stderr, "%6.3f:1, %6.3f bits/byte, %5.2f%% saved. ",
+ (double)data_size / member_size,
+ (8.0 * member_size) / data_size,
+ 100.0 * (1.0 - (double)member_size / data_size));
+ if (!error && verbosity >= 4)
+ fprintf( stderr, "CRC %08X, decompressed %9llud, compressed %8llud. ",
+ LZd_crc(d), data_size, member_size);
+ return !error;
+}
+
+/* Return value: 0 = OK, 1 = decoder error, 2 = unexpected EOF,
+ 3 = trailer error, 4 = unknown marker found. */
+int
+LZd_decode_member(LZ_decoder *d, Pretty_print *pp)
+{
+ Range_decoder *rdec = d->rdec;
+ Bit_model bm_literal[1<<literal_context_bits][0x300];
+ Bit_model bm_match[states][pos_states];
+ Bit_model bm_rep[states];
+ Bit_model bm_rep0[states];
+ Bit_model bm_rep1[states];
+ Bit_model bm_rep2[states];
+ Bit_model bm_len[states][pos_states];
+ Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+ Bit_model bm_dis[modeled_distances-end_dis_model+1];
+ Bit_model bm_align[dis_align_size];
+ Len_model match_len_model;
+ Len_model rep_len_model;
+ unsigned rep0 = 0; /* rep[0-3] latest four distances */
+ unsigned rep1 = 0; /* used for efficient coding of */
+ unsigned rep2 = 0; /* repeated distances */
+ unsigned rep3 = 0;
+ State state = 0;
+
+ Bm_array_init(bm_literal[0], (1 << literal_context_bits) * 0x300);
+ Bm_array_init(bm_match[0], states * pos_states);
+ Bm_array_init(bm_rep, states);
+ Bm_array_init(bm_rep0, states);
+ Bm_array_init(bm_rep1, states);
+ Bm_array_init(bm_rep2, states);
+ Bm_array_init(bm_len[0], states * pos_states);
+ Bm_array_init(bm_dis_slot[0], len_states * (1 << dis_slot_bits));
+ Bm_array_init(bm_dis, modeled_distances - end_dis_model + 1);
+ Bm_array_init(bm_align, dis_align_size);
+ Lm_init(&match_len_model);
+ Lm_init(&rep_len_model);
+
+ Rd_load(rdec);
+ while (!Rd_finished(rdec)) {
+ int pos_state = LZd_data_position(d) & pos_state_mask;
+ if (Rd_decode_bit(rdec, &bm_match[state][pos_state]) == 0) /* 1st bit */ {
+ Bit_model * bm = bm_literal[get_lit_state(LZd_peek_prev(d))];
+ if (St_is_char(state)) {
+ state -= (state < 4) ? state : 3;
+ LZd_put_byte(d, Rd_decode_tree8(rdec, bm));
+ } else {
+ state -= (state < 10) ? 3 : 6;
+ LZd_put_byte(d, Rd_decode_matched(rdec, bm, LZd_peek(d, rep0)));
+ }
+ } else /* match or repeated match */ {
+ int len;
+ if (Rd_decode_bit(rdec, &bm_rep[state]) != 0) /* 2nd bit */ {
+ if (Rd_decode_bit(rdec, &bm_rep0[state]) == 0) /* 3rd bit */ {
+ if (Rd_decode_bit(rdec, &bm_len[state][pos_state]) == 0) /* 4th bit */ {
+ state = St_set_short_rep(state);
+ LZd_put_byte(d, LZd_peek(d, rep0));
+ continue;
+ }
+ } else {
+ unsigned distance;
+ if (Rd_decode_bit(rdec, &bm_rep1[state]) == 0) /* 4th bit */
+ distance = rep1;
+ else {
+ if (Rd_decode_bit(rdec, &bm_rep2[state]) == 0) /* 5th bit */
+ distance = rep2;
+ else {
+ distance = rep3;
+ rep3 = rep2;
+ }
+ rep2 = rep1;
+ }
+ rep1 = rep0;
+ rep0 = distance;
+ }
+ state = St_set_rep(state);
+ len = min_match_len + Rd_decode_len(rdec, &rep_len_model, pos_state);
+ } else /* match */ {
+ unsigned distance;
+ len = min_match_len + Rd_decode_len(rdec, &match_len_model, pos_state);
+ distance = Rd_decode_tree6(rdec, bm_dis_slot[get_len_state(len)]);
+ if (distance >= start_dis_model) {
+ unsigned dis_slot = distance;
+ int direct_bits = (dis_slot >> 1) - 1;
+ distance = (2 | (dis_slot & 1)) << direct_bits;
+ if (dis_slot < end_dis_model)
+ distance += Rd_decode_tree_reversed(rdec,
+ bm_dis + (distance - dis_slot), direct_bits);
+ else {
+ distance +=
+ Rd_decode(rdec, direct_bits - dis_align_bits) << dis_align_bits;
+ distance += Rd_decode_tree_reversed4(rdec, bm_align);
+ if (distance == 0xFFFFFFFFU) /* marker found */ {
+ Rd_normalize(rdec);
+ LZd_flush_data(d);
+ if (len == min_match_len) /* End Of Stream marker */ {
+ if (LZd_verify_trailer(d, pp))
+/* code folded from here */
+ return 0;
+/* unfolding */
+ else
+/* code folded from here */
+ return 3;
+/* unfolding */
+ }
+ if (len == min_match_len + 1) /* Sync Flush marker */ {
+ Rd_load(rdec);
+ continue;
+ }
+ if (verbosity >= 0) {
+ Pp_show_msg(pp, 0);
+ fprintf( stderr, "Unsupported marker code '%d'\n", len );
+ }
+ return 4;
+ }
+ }
+ }
+ rep3 = rep2;
+ rep2 = rep1;
+ rep1 = rep0;
+ rep0 = distance;
+ state = St_set_match(state);
+ if (rep0 >= d->dict_size || (rep0 >= d->pos && !d->pos_wrapped)) {
+ LZd_flush_data(d);
+ return 1;
+ }
+ }
+ LZd_copy_block(d, rep0, len);
+ }
+ }
+ LZd_flush_data(d);
+ return 2;
+}
+
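Note on decoder.c above: LZd_decode_member turns a 6-bit distance slot
into a distance by taking a base value from the slot and adding extra
bits read from the stream. The arithmetic can be isolated as below. This
is a sketch, not part of the patch: the function names are hypothetical,
StartDisModel = 4 is the assumed value of start_dis_model, and the extra
bits are passed in as one number rather than decoded.

```c
#include <assert.h>
#include <stdint.h>

enum { StartDisModel = 4 };	/* assumed value of start_dis_model */

/* Number of extra bits that follow a slot >= StartDisModel. */
int
slot_direct_bits(unsigned slot)
{
	assert(slot >= StartDisModel && slot < 64);
	return (int)(slot >> 1) - 1;
}

/* Smallest distance in the slot; small slots are the distance itself. */
uint32_t
slot_base(unsigned slot)
{
	if (slot < StartDisModel)
		return slot;
	return (uint32_t)(2 | (slot & 1)) << slot_direct_bits(slot);
}

/* Combine a slot with its already-decoded extra bits. */
uint32_t
slot_to_distance(unsigned slot, uint32_t extra)
{
	if (slot < StartDisModel)
		assert(extra == 0);
	else
		assert(extra < ((uint32_t)1 << slot_direct_bits(slot)));
	return slot_base(slot) + extra;
}
```

With these definitions, slot 63 with all 30 extra bits set yields
0xFFFFFFFF, the value the decoder compares against when it checks for a
marker.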
diff -Nru /sys/src/cmd/lzip/decoder.h /sys/src/cmd/lzip/decoder.h
--- /sys/src/cmd/lzip/decoder.h Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/decoder.h Sat May 1 00:00:00 2021
@@ -0,0 +1,354 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+enum { rd_buffer_size = 16384 };
+
+typedef struct LZ_decoder LZ_decoder;
+typedef struct Range_decoder Range_decoder;
+
+struct Range_decoder {
+ uvlong partial_member_pos;
+ uchar * buffer; /* input buffer */
+ int pos; /* current pos in buffer */
+ int stream_pos; /* when reached, a new block must be read */
+ uint32_t code;
+ uint32_t range;
+ int infd; /* input file descriptor */
+ bool at_stream_end;
+};
+
+bool Rd_read_block(Range_decoder *rdec);
+
+static bool
+Rd_init(Range_decoder *rdec, int ifd)
+{
+ rdec->partial_member_pos = 0;
+ rdec->buffer = (uchar *)malloc(rd_buffer_size);
+ if (!rdec->buffer)
+ return false;
+ rdec->pos = 0;
+ rdec->stream_pos = 0;
+ rdec->code = 0;
+ rdec->range = 0xFFFFFFFFU;
+ rdec->infd = ifd;
+ rdec->at_stream_end = false;
+ return true;
+}
+
+static void
+Rd_free(Range_decoder *rdec)
+{
+ free(rdec->buffer);
+}
+
+static bool
+Rd_finished(Range_decoder *rdec)
+{
+ return rdec->pos >= rdec->stream_pos && !Rd_read_block(rdec);
+}
+
+static uvlong
+Rd_member_position(Range_decoder *rdec)
+{
+ return rdec->partial_member_pos + rdec->pos;
+}
+
+static void
+Rd_reset_member_position(Range_decoder *rdec)
+{
+ rdec->partial_member_pos = 0;
+ rdec->partial_member_pos -= rdec->pos;
+}
+
+static uchar
+Rd_get_byte(Range_decoder *rdec)
+{
+ /* 0xFF avoids decoder error if member is truncated at EOS marker */
+ if (Rd_finished(rdec))
+ return 0xFF;
+ return rdec->buffer[rdec->pos++];
+}
+
+static int Rd_read_data(Range_decoder *rdec, uchar *outbuf, int size)
+{
+ int sz = 0;
+
+ while (sz < size && !Rd_finished(rdec)) {
+ int rd, rsz = size - sz;
+ int rpos = rdec->stream_pos - rdec->pos;
+
+ if (rsz < rpos)
+ rd = rsz;
+ else
+ rd = rpos;
+ memcpy(outbuf + sz, rdec->buffer + rdec->pos, rd);
+ rdec->pos += rd;
+ sz += rd;
+ }
+ return sz;
+}
+
+static void
+Rd_load(Range_decoder *rdec)
+{
+ int i;
+ rdec->code = 0;
+ for (i = 0; i < 5; ++i)
+ rdec->code = (rdec->code << 8) | Rd_get_byte(rdec);
+ rdec->range = 0xFFFFFFFFU;
+ rdec->code &= rdec->range; /* make sure that first byte is discarded */
+}
+
+static void
+Rd_normalize(Range_decoder *rdec)
+{
+ if (rdec->range <= 0x00FFFFFFU) {
+ rdec->range <<= 8;
+ rdec->code = (rdec->code << 8) | Rd_get_byte(rdec);
+ }
+}
+
+static unsigned
+Rd_decode(Range_decoder *rdec, int num_bits)
+{
+ unsigned symbol = 0;
+ int i;
+ for (i = num_bits; i > 0; --i) {
+ bool bit;
+ Rd_normalize(rdec);
+ rdec->range >>= 1;
+ /* symbol <<= 1; */
+ /* if(rdec->code >= rdec->range) { rdec->code -= rdec->range; symbol |= 1; } */
+ bit = (rdec->code >= rdec->range);
+ symbol = (symbol << 1) + bit;
+ rdec->code -= rdec->range & (0U - bit);
+ }
+ return symbol;
+}
+
+static unsigned
+Rd_decode_bit(Range_decoder *rdec, Bit_model *probability)
+{
+ uint32_t bound;
+ Rd_normalize(rdec);
+ bound = (rdec->range >> bit_model_total_bits) * *probability;
+ if (rdec->code < bound) {
+ rdec->range = bound;
+ *probability += (bit_model_total - *probability) >> bit_model_move_bits;
+ return 0;
+ } else {
+ rdec->range -= bound;
+ rdec->code -= bound;
+ *probability -= *probability >> bit_model_move_bits;
+ return 1;
+ }
+}
+
+static unsigned
+Rd_decode_tree3(Range_decoder *rdec, Bit_model bm[])
+{
+ unsigned symbol = 1;
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ return symbol & 7;
+}
+
+static unsigned
+Rd_decode_tree6(Range_decoder *rdec, Bit_model bm[])
+{
+ unsigned symbol = 1;
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ return symbol & 0x3F;
+}
+
+static unsigned
+Rd_decode_tree8(Range_decoder *rdec, Bit_model bm[])
+{
+ unsigned symbol = 1;
+ int i;
+ for (i = 0; i < 8; ++i)
+ symbol = (symbol << 1) | Rd_decode_bit(rdec, &bm[symbol]);
+ return symbol & 0xFF;
+}
+
+static unsigned
+Rd_decode_tree_reversed(Range_decoder *rdec, Bit_model bm[], int num_bits)
+{
+ unsigned model = 1;
+ unsigned symbol = 0;
+ int i;
+ for (i = 0; i < num_bits; ++i) {
+ unsigned bit = Rd_decode_bit(rdec, &bm[model]);
+ model = (model << 1) + bit;
+ symbol |= (bit << i);
+ }
+ return symbol;
+}
+
+static unsigned
+Rd_decode_tree_reversed4(Range_decoder *rdec, Bit_model bm[])
+{
+ unsigned symbol = Rd_decode_bit(rdec, &bm[1]);
+ unsigned model = 2 + symbol;
+ unsigned bit = Rd_decode_bit(rdec, &bm[model]);
+ model = (model << 1) + bit;
+ symbol |= (bit << 1);
+ bit = Rd_decode_bit(rdec, &bm[model]);
+ model = (model << 1) + bit;
+ symbol |= (bit << 2);
+ symbol |= (Rd_decode_bit(rdec, &bm[model]) << 3);
+ return symbol;
+}
+
+static unsigned
+Rd_decode_matched(Range_decoder *rdec, Bit_model bm[], unsigned match_byte)
+{
+ unsigned symbol = 1;
+ unsigned mask = 0x100;
+ while (true) {
+ unsigned match_bit = (match_byte <<= 1) & mask;
+ unsigned bit = Rd_decode_bit(rdec, &bm[symbol+match_bit+mask]);
+ symbol = (symbol << 1) + bit;
+ if (symbol > 0xFF)
+ return symbol & 0xFF;
+ mask &= ~(match_bit ^ (bit << 8)); /* if(match_bit != bit) mask = 0; */
+ }
+}
+
+static unsigned
+Rd_decode_len(struct Range_decoder *rdec, Len_model *lm, int pos_state)
+{
+ if (Rd_decode_bit(rdec, &lm->choice1) == 0)
+ return Rd_decode_tree3(rdec, lm->bm_low[pos_state]);
+ if (Rd_decode_bit(rdec, &lm->choice2) == 0)
+ return len_low_syms + Rd_decode_tree3(rdec, lm->bm_mid[pos_state]);
+ return len_low_syms + len_mid_syms + Rd_decode_tree8(rdec, lm->bm_high);
+}
+
+struct LZ_decoder {
+ uvlong partial_data_pos;
+ struct Range_decoder *rdec;
+ unsigned dict_size;
+ uchar * buffer; /* output buffer */
+ unsigned pos; /* current pos in buffer */
+ unsigned stream_pos; /* first byte not yet written to file */
+ uint32_t crc;
+ int outfd; /* output file descriptor */
+ bool pos_wrapped;
+};
+
+void LZd_flush_data(LZ_decoder *d);
+
+static uchar
+LZd_peek_prev(LZ_decoder *d)
+{
+ if (d->pos > 0)
+ return d->buffer[d->pos-1];
+ if (d->pos_wrapped)
+ return d->buffer[d->dict_size-1];
+ return 0; /* prev_byte of first byte */
+}
+
+static uchar
+LZd_peek(LZ_decoder *d,
+unsigned distance)
+{
+ unsigned i = ((d->pos > distance) ? 0 : d->dict_size) +
+ d->pos - distance - 1;
+ return d->buffer[i];
+}
+
+static void
+LZd_put_byte(LZ_decoder *d, uchar b)
+{
+ d->buffer[d->pos] = b;
+ if (++d->pos >= d->dict_size)
+ LZd_flush_data(d);
+}
+
+static void
+LZd_copy_block(LZ_decoder *d, unsigned distance, unsigned len)
+{
+ unsigned lpos = d->pos, i = lpos -distance -1;
+ bool fast, fast2;
+
+ if (lpos > distance) {
+ fast = (len < d->dict_size - lpos);
+ fast2 = (fast && len <= lpos - i);
+ } else {
+ i += d->dict_size;
+ fast = (len < d->dict_size - i); /* (i == pos) may happen */
+ fast2 = (fast && len <= i - lpos);
+ }
+ if (fast) /* no wrap */ {
+ d->pos += len;
+ if (fast2) /* no wrap, no overlap */
+ memcpy(d->buffer + lpos, d->buffer + i, len);
+ else
+ for (; len > 0; --len)
+ d->buffer[lpos++] = d->buffer[i++];
+ } else
+ for (; len > 0; --len) {
+ d->buffer[d->pos] = d->buffer[i];
+ if (++d->pos >= d->dict_size)
+ LZd_flush_data(d);
+ if (++i >= d->dict_size)
+ i = 0;
+ }
+}
+
+static bool
+LZd_init(struct LZ_decoder *d, Range_decoder *rde, unsigned dict_size, int ofd)
+{
+ d->partial_data_pos = 0;
+ d->rdec = rde;
+ d->dict_size = dict_size;
+ d->buffer = (uchar *)malloc(d->dict_size);
+ if (!d->buffer)
+ return false;
+ d->pos = 0;
+ d->stream_pos = 0;
+ d->crc = 0xFFFFFFFFU;
+ d->outfd = ofd;
+ d->pos_wrapped = false;
+ return true;
+}
+
+static void
+LZd_free(LZ_decoder *d)
+{
+ free(d->buffer);
+}
+
+static unsigned
+LZd_crc(LZ_decoder *d)
+{
+ return d->crc ^ 0xFFFFFFFFU;
+}
+
+static uvlong
+LZd_data_position(LZ_decoder *d)
+{
+ return d->partial_data_pos + d->pos;
+}
+
+int LZd_decode_member(struct LZ_decoder *d, Pretty_print *pp);
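Note on decoder.h above: Rd_decode_bit adapts its probability estimate
after every decoded bit. The update rule can be looked at in isolation
as below. This is a sketch, not part of the patch: bm_update and the
Model* constants are hypothetical names, and the values 11 and 5 are the
assumed settings of bit_model_total_bits and bit_model_move_bits, which
are defined in a header not shown in this section.

```c
#include <assert.h>

enum {
	ModelTotalBits = 11,			/* assumed bit_model_total_bits */
	ModelTotal = 1 << ModelTotalBits,
	ModelMoveBits = 5,			/* assumed bit_model_move_bits */
	ModelInit = ModelTotal / 2		/* start at "0 and 1 equally likely" */
};

/*
 * Return the new estimate of the probability that the next bit is 0,
 * scaled to ModelTotal, after observing `bit`.  Each step moves the
 * estimate 1/32 of the remaining way toward the observed value.
 */
int
bm_update(int probability, int bit)
{
	assert(probability > 0 && probability < ModelTotal);
	assert(bit == 0 || bit == 1);
	if (bit == 0)
		return probability + ((ModelTotal - probability) >> ModelMoveBits);
	return probability - (probability >> ModelMoveBits);
}
```

Starting from ModelInit, one 0 bit raises the estimate by 32 and one
1 bit lowers it by 32. Because the step is a truncated shift, the
estimate never reaches 0 or ModelTotal, so both bit values always stay
decodable.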
diff -Nru /sys/src/cmd/lzip/encoder.c /sys/src/cmd/lzip/encoder.c
--- /sys/src/cmd/lzip/encoder.c Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/encoder.c Sat May 1 00:00:00 2021
@@ -0,0 +1,735 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include "lzip.h"
+#include "encoder_base.h"
+#include "encoder.h"
+
+CRC32 crc32;
+
+/*
+ * starting at data[len] and data[len-delta], what's the longest match
+ * up to len_limit?
+ */
+int
+maxmatch(uchar *data, int delta, int len, int len_limit)
+{
+ uchar *pdel, *p;
+
+ p = &data[len];
+ pdel = p - delta;
+ while (len < len_limit && *pdel++ == *p++)
+ ++len;
+ return len;
+}
+
+static int
+findpairmaxlen(LZ_encoder *e, Pair **pairsp, int *npairsp, int maxlen, int pos1,
+ int len_limit, int min_pos, uchar *data, int np2, int np3)
+{
+ int num_pairs;
+ Pair *pairs;
+
+ pairs = *pairsp;
+ num_pairs = *npairsp;
+ if (np2 > min_pos && e->eb.mb.buffer[np2-1] == data[0]) {
+ pairs[0].dis = e->eb.mb.pos - np2;
+ pairs[0].len = maxlen = 2;
+ num_pairs = 1;
+ }
+ if (np2 != np3 && np3 > min_pos && e->eb.mb.buffer[np3-1] == data[0]) {
+ maxlen = 3;
+ np2 = np3;
+ pairs[num_pairs].dis = e->eb.mb.pos - np2;
+ ++num_pairs;
+ }
+ if (num_pairs > 0) {
+ maxlen = maxmatch(data, pos1 - np2, maxlen, len_limit);
+ pairs[num_pairs-1].len = maxlen;
+ if (maxlen >= len_limit)
+ *pairsp = nil; /* done. now just skip */
+ }
+ if (maxlen < 3)
+ maxlen = 3;
+ *npairsp = num_pairs;
+ return maxlen;
+}
+
+int
+LZe_get_match_pairs(LZ_encoder *e, Pair *pairs)
+{
+ int len = 0, len0, len1, maxlen, num_pairs, len_limit, avail;
+ int pos1, min_pos, cyclic_pos, delta, count, key2, key3, key4, newpos1;
+ int32_t *ptr0, *ptr1, *newptr, *prevpos;
+ uchar *data;
+ uchar *p;
+ unsigned tmp;
+
+ len_limit = e->match_len_limit;
+ avail = Mb_avail_bytes(&e->eb.mb);
+ if (len_limit > avail) {
+ len_limit = avail;
+ if (len_limit < 4)
+ return 0;
+ }
+
+ data = Mb_ptr_to_current_pos(&e->eb.mb);
+ tmp = crc32[data[0]] ^ data[1];
+ key2 = tmp & (Nprevpos2 - 1);
+ tmp ^= (unsigned)data[2] << 8;
+ key3 = Nprevpos2 + (tmp & (Nprevpos3 - 1));
+ key4 = Nprevpos2 + Nprevpos3 +
+ ((tmp ^ (crc32[data[3]] << 5)) & e->eb.mb.key4_mask);
+
+ min_pos = (e->eb.mb.pos > e->eb.mb.dict_size) ?
+ e->eb.mb.pos - e->eb.mb.dict_size : 0;
+ pos1 = e->eb.mb.pos + 1;
+ prevpos = e->eb.mb.prev_positions;
+ maxlen = 0;
+ num_pairs = 0;
+ if (pairs)
+ maxlen = findpairmaxlen(e, &pairs, &num_pairs, maxlen, pos1,
+ len_limit, min_pos, data, prevpos[key2], prevpos[key3]);
+ newpos1 = prevpos[key4];
+ prevpos[key2] = prevpos[key3] = prevpos[key4] = pos1;
+
+ cyclic_pos = e->eb.mb.cyclic_pos;
+ ptr0 = e->eb.mb.pos_array + (cyclic_pos << 1);
+ ptr1 = ptr0 + 1;
+ len0 = len1 = 0;
+ for (count = e->cycles; ;) {
+ if (newpos1 <= min_pos || --count < 0) {
+ *ptr0 = *ptr1 = 0;
+ break;
+ }
+
+ delta = pos1 - newpos1;
+ newptr = e->eb.mb.pos_array + ((cyclic_pos - delta +
+ (cyclic_pos >= delta? 0: e->eb.mb.dict_size + 1)) << 1);
+ p = &data[len];
+ if (p[-delta] == *p) {
+ len = maxmatch(data, delta, len + 1, len_limit);
+ if (pairs && maxlen < len) {
+ pairs[num_pairs].dis = delta - 1;
+ pairs[num_pairs].len = maxlen = len;
+ ++num_pairs;
+ }
+ if (len >= len_limit) {
+ *ptr0 = newptr[0];
+ *ptr1 = newptr[1];
+ break;
+ }
+ p = &data[len];
+ }
+ if (p[-delta] < *p) {
+ *ptr0 = newpos1;
+ ptr0 = newptr + 1;
+ newpos1 = *ptr0;
+ len0 = len;
+ if (len1 < len)
+ len = len1;
+ } else {
+ *ptr1 = newpos1;
+ ptr1 = newptr;
+ newpos1 = *ptr1;
+ len1 = len;
+ if (len0 < len)
+ len = len0;
+ }
+ }
+ return num_pairs;
+}
+
+static void
+LZe_update_distance_prices(LZ_encoder *e)
+{
+ int dis, len_state;
+
+ for (dis = start_dis_model; dis < modeled_distances; ++dis) {
+ int dis_slot = dis_slots[dis];
+ int direct_bits = (dis_slot >> 1) - 1;
+ int base = (2 | (dis_slot & 1)) << direct_bits;
+ int price = price_symbol_reversed(e->eb.bm_dis + (base - dis_slot),
+ dis - base, direct_bits);
+
+ for (len_state = 0; len_state < len_states; ++len_state)
+ e->dis_prices[len_state][dis] = price;
+ }
+
+ for (len_state = 0; len_state < len_states; ++len_state) {
+ int *dsp = e->dis_slot_prices[len_state];
+ int *dp = e->dis_prices[len_state];
+ Bit_model * bmds = e->eb.bm_dis_slot[len_state];
+ int slot = 0;
+
+ for (; slot < end_dis_model; ++slot)
+ dsp[slot] = price_symbol6(bmds, slot);
+ for (; slot < e->num_dis_slots; ++slot)
+ dsp[slot] = price_symbol6(bmds, slot) +
+ ((((slot >> 1) - 1) - dis_align_bits) << price_shift_bits);
+
+ for (dis = 0; dis < start_dis_model; ++dis)
+ dp[dis] = dsp[dis];
+ for (; dis < modeled_distances; ++dis)
+ dp[dis] += dsp[dis_slots[dis]];
+ }
+}
+
+static int
+pricestate2(LZ_encoder *e, int price, int *ps2p, State *st2p, int len2)
+{
+ int pos_state2;
+ State state2;
+
+ state2 = *st2p;
+ pos_state2 = *ps2p;
+
+ pos_state2 = (pos_state2 + 1) & pos_state_mask;
+ state2 = St_set_char(state2);
+ price += price1(e->eb.bm_match[state2][pos_state2]) +
+ price1(e->eb.bm_rep[state2]) +
+ LZe_price_rep0_len(e, len2, state2, pos_state2);
+
+ *ps2p = pos_state2;
+ *st2p = state2;
+ return price;
+}
+
+static int
+encinit(LZ_encoder *e, int reps[num_rep_distances],
+ int replens[num_rep_distances], State state, int main_len,
+ int num_pairs, int rep_index, int *ntrialsp)
+{
+ int i, rep, num_trials, len;
+ int pos_state = Mb_data_position(&e->eb.mb) & pos_state_mask;
+ int match_price = price1(e->eb.bm_match[state][pos_state]);
+ int rep_match_price = match_price + price1(e->eb.bm_rep[state]);
+ uchar prev_byte = Mb_peek(&e->eb.mb, 1);
+ uchar cur_byte = Mb_peek(&e->eb.mb, 0);
+ uchar match_byte = Mb_peek(&e->eb.mb, reps[0] + 1);
+
+ e->trials[1].price = price0(e->eb.bm_match[state][pos_state]);
+ if (St_is_char(state))
+ e->trials[1].price += LZeb_price_literal(&e->eb,
+ prev_byte, cur_byte);
+ else
+ e->trials[1].price += LZeb_price_matched(&e->eb,
+ prev_byte, cur_byte, match_byte);
+ e->trials[1].dis4 = -1; /* literal */
+
+ if (match_byte == cur_byte)
+ Tr_update(&e->trials[1], rep_match_price +
+ LZeb_price_shortrep(&e->eb, state, pos_state), 0, 0);
+ num_trials = replens[rep_index];
+ if (num_trials < main_len)
+ num_trials = main_len;
+ *ntrialsp = num_trials;
+ if (num_trials < min_match_len) {
+ e->trials[0].price = 1;
+ e->trials[0].dis4 = e->trials[1].dis4;
+ Mb_move_pos(&e->eb.mb);
+ return 1;
+ }
+
+ e->trials[0].state = state;
+ for (i = 0; i < num_rep_distances; ++i)
+ e->trials[0].reps[i] = reps[i];
+
+ for (len = min_match_len; len <= num_trials; ++len)
+ e->trials[len].price = infinite_price;
+
+ for (rep = 0; rep < num_rep_distances; ++rep) {
+ int price, replen;
+
+ if (replens[rep] < min_match_len)
+ continue;
+ price = rep_match_price + LZeb_price_rep(&e->eb, rep,
+ state, pos_state);
+ replen = replens[rep];
+ for (len = min_match_len; len <= replen; ++len)
+ Tr_update(&e->trials[len], price +
+ Lp_price(&e->rep_len_prices, len, pos_state), rep, 0);
+ }
+
+ if (main_len > replens[0]) {
+ int dis, normal_match_price = match_price +
+ price0(e->eb.bm_rep[state]);
+ int replp1 = replens[0] + 1;
+ int i = 0, len = max(replp1, min_match_len);
+
+ while (len > e->pairs[i].len)
+ ++i;
+ for (;;) {
+ dis = e->pairs[i].dis;
+ Tr_update(&e->trials[len], normal_match_price +
+ LZe_price_pair(e, dis, len, pos_state),
+ dis + num_rep_distances, 0);
+ if (++len > e->pairs[i].len && ++i >= num_pairs)
+ break;
+ }
+ }
+ return 0;
+}
+
+static void
+finalvalues(LZ_encoder *e, int cur, Trial *cur_trial, State *cstatep)
+{
+ int i;
+ int dis4 = cur_trial->dis4;
+ int prev_index = cur_trial->prev_index;
+ int prev_index2 = cur_trial->prev_index2;
+ State cur_state;
+
+ if (prev_index2 == single_step_trial) {
+ cur_state = e->trials[prev_index].state;
+ if (prev_index + 1 == cur) { /* len == 1 */
+ if (dis4 == 0)
+ cur_state = St_set_short_rep(cur_state);
+ else
+ cur_state = St_set_char(cur_state); /* literal */
+ } else if (dis4 < num_rep_distances)
+ cur_state = St_set_rep(cur_state);
+ else
+ cur_state = St_set_match(cur_state);
+ } else {
+ if (prev_index2 == dual_step_trial) /* dis4 == 0 (rep0) */
+ --prev_index;
+ else /* prev_index2 >= 0 */
+ prev_index = prev_index2;
+ cur_state = 8; /* St_set_char_rep(); */
+ }
+ cur_trial->state = cur_state;
+ for (i = 0; i < num_rep_distances; ++i)
+ cur_trial->reps[i] = e->trials[prev_index].reps[i];
+ mtf_reps(dis4, cur_trial->reps); /* literal is ignored */
+ *cstatep = cur_state;
+}
+
+static int
+litrep0(LZ_encoder *e, State cur_state, int cur, Trial *cur_trial,
+ int num_trials, int triable_bytes, int pos_state, int next_price)
+{
+ int len = 1, endtrials, limit, mlpl1, dis;
+ uchar *data = Mb_ptr_to_current_pos(&e->eb.mb);
+
+ dis = cur_trial->reps[0] + 1;
+ mlpl1 = e->match_len_limit + 1;
+ limit = min(mlpl1, triable_bytes);
+ len = maxmatch(data, dis, len, limit);
+ if (--len >= min_match_len) {
+ int pos_state2, price;
+ State state2;
+
+ pos_state2 = (pos_state + 1) & pos_state_mask;
+ state2 = St_set_char(cur_state);
+ price = next_price + price1(e->eb.bm_match[state2][pos_state2])+
+ price1(e->eb.bm_rep[state2]) +
+ LZe_price_rep0_len(e, len, state2, pos_state2);
+ endtrials = cur + 1 + len;
+ while (num_trials < endtrials)
+ e->trials[++num_trials].price = infinite_price;
+ Tr_update2(&e->trials[endtrials], price, cur + 1);
+ }
+ return num_trials;
+}
+
+static int
+repdists(LZ_encoder *e, State cur_state, int cur, Trial *cur_trial,
+ int num_trials, int triable_bytes, int pos_state,
+ int rep_match_price, int len_limit, int *stlenp)
+{
+ int i, rep, len, price, dis, start_len;
+
+ start_len = *stlenp;
+ for (rep = 0; rep < num_rep_distances; ++rep) {
+ uchar *data = Mb_ptr_to_current_pos(&e->eb.mb);
+
+ dis = cur_trial->reps[rep] + 1;
+ if (data[0-dis] != data[0] || data[1-dis] != data[1])
+ continue;
+ len = maxmatch(data, dis, min_match_len, len_limit);
+ while (num_trials < cur + len)
+ e->trials[++num_trials].price = infinite_price;
+ price = rep_match_price + LZeb_price_rep(&e->eb, rep,
+ cur_state, pos_state);
+ for (i = min_match_len; i <= len; ++i)
+ Tr_update(&e->trials[cur+i], price +
+ Lp_price(&e->rep_len_prices, i, pos_state), rep, cur);
+
+ if (rep == 0)
+ start_len = len + 1; /* discard shorter matches */
+
+ /* try rep + literal + rep0 */
+ {
+ int pos_state2, endtrials, limit, mlpl2, len2;
+ State state2;
+
+ len2 = len + 1;
+ mlpl2 = e->match_len_limit + len2;
+ limit = min(mlpl2, triable_bytes);
+ len2 = maxmatch(data, dis, len2, limit);
+ len2 -= len + 1;
+ if (len2 < min_match_len)
+ continue;
+
+ pos_state2 = (pos_state + len) & pos_state_mask;
+ state2 = St_set_rep(cur_state);
+ price += Lp_price(&e->rep_len_prices, len, pos_state) +
+ price0(e->eb.bm_match[state2][pos_state2]) +
+ LZeb_price_matched(&e->eb, data[len-1],
+ data[len], data[len-dis]);
+ price = pricestate2(e, price, &pos_state2,
+ &state2, len2);
+ endtrials = cur + len + 1 + len2;
+ while (num_trials < endtrials)
+ e->trials[++num_trials].price = infinite_price;
+ Tr_update3(&e->trials[endtrials], price, rep,
+ endtrials - len2, cur);
+ }
+ }
+ *stlenp = start_len;
+ return num_trials;
+}
+
+static int
+trymatches(LZ_encoder *e, State cur_state, int cur, int num_trials,
+ int triable_bytes, int pos_state, int num_pairs,
+ int normal_match_price, int start_len)
+{
+ int i, dis, len, price;
+
+ i = 0;
+ while (e->pairs[i].len < start_len)
+ ++i;
+ dis = e->pairs[i].dis;
+ for (len = start_len; ; ++len) {
+ price = normal_match_price + LZe_price_pair(e, dis, len, pos_state);
+ Tr_update(&e->trials[cur+len], price, dis + num_rep_distances, cur);
+
+ /* try match + literal + rep0 */
+ if (len == e->pairs[i].len) {
+ uchar *data = Mb_ptr_to_current_pos(&e->eb.mb);
+ int endtrials, mlpl2, limit;
+ int dis2 = dis + 1, len2 = len + 1;
+
+ mlpl2 = e->match_len_limit + len2;
+ limit = min(mlpl2, triable_bytes);
+ len2 = maxmatch(data, dis2, len2, limit);
+ len2 -= len + 1;
+ if (len2 >= min_match_len) {
+ int pos_state2 = (pos_state + len) &pos_state_mask;
+ State state2 = St_set_match(cur_state);
+
+ price += price0(e->eb.bm_match[state2][pos_state2]) +
+ LZeb_price_matched(&e->eb, data[len-1], data[len], data[len-dis2]);
+ price = pricestate2(e, price,
+ &pos_state2, &state2, len2);
+ endtrials = cur + len + 1 + len2;
+ while (num_trials < endtrials)
+ e->trials[++num_trials].price = infinite_price;
+ Tr_update3(&e->trials[endtrials],
+ price, dis +
+ num_rep_distances,
+ endtrials - len2, cur);
+ }
+ if (++i >= num_pairs)
+ break;
+ dis = e->pairs[i].dis;
+ }
+ }
+ return num_trials;
+}
+
+/*
+ * Returns the number of bytes advanced (ahead).
+ trials[0]..trials[ahead-1] contain the steps to encode.
+ (trials[0].dis4 == -1) means literal.
+ A match/rep longer than or equal to match_len_limit finishes the sequence.
+ */
+static int
+LZe_sequence_optimizer(LZ_encoder *e, int reps[num_rep_distances], State state)
+{
+ int main_len, num_pairs, i, num_trials;
+ int rep_index = 0, cur = 0;
+ int replens[num_rep_distances];
+
+ if (e->pending_num_pairs > 0) { /* from previous call */
+ num_pairs = e->pending_num_pairs;
+ e->pending_num_pairs = 0;
+ } else
+ num_pairs = LZe_read_match_distances(e);
+ main_len = (num_pairs > 0) ? e->pairs[num_pairs-1].len : 0;
+
+ for (i = 0; i < num_rep_distances; ++i) {
+ replens[i] = Mb_true_match_len(&e->eb.mb, 0, reps[i] + 1);
+ if (replens[i] > replens[rep_index])
+ rep_index = i;
+ }
+ if (replens[rep_index] >= e->match_len_limit) {
+ e->trials[0].price = replens[rep_index];
+ e->trials[0].dis4 = rep_index;
+ LZe_move_and_update(e, replens[rep_index]);
+ return replens[rep_index];
+ }
+
+ if (main_len >= e->match_len_limit) {
+ e->trials[0].price = main_len;
+ e->trials[0].dis4 = e->pairs[num_pairs-1].dis + num_rep_distances;
+ LZe_move_and_update(e, main_len);
+ return main_len;
+ }
+
+ if (encinit(e, reps, replens, state, main_len, num_pairs, rep_index,
+ &num_trials) > 0)
+ return 1;
+
+ /*
+ * Optimize price.
+ */
+ for (;;) {
+ Trial *cur_trial, *next_trial;
+ int newlen, pos_state, triable_bytes, len_limit;
+ int next_price, match_price, rep_match_price;
+ int start_len = min_match_len;
+ State cur_state;
+ uchar prev_byte, cur_byte, match_byte;
+
+ Mb_move_pos(&e->eb.mb);
+ if (++cur >= num_trials) { /* no more initialized trials */
+ LZe_backward(e, cur);
+ return cur;
+ }
+
+ num_pairs = LZe_read_match_distances(e);
+ newlen = num_pairs > 0? e->pairs[num_pairs-1].len: 0;
+ if (newlen >= e->match_len_limit) {
+ e->pending_num_pairs = num_pairs;
+ LZe_backward(e, cur);
+ return cur;
+ }
+
+ /* give final values to current trial */
+ cur_trial = &e->trials[cur];
+ finalvalues(e, cur, cur_trial, &cur_state);
+
+ pos_state = Mb_data_position(&e->eb.mb) & pos_state_mask;
+ prev_byte = Mb_peek(&e->eb.mb, 1);
+ cur_byte = Mb_peek(&e->eb.mb, 0);
+ match_byte = Mb_peek(&e->eb.mb, cur_trial->reps[0] + 1);
+
+ next_price = cur_trial->price +
+ price0(e->eb.bm_match[cur_state][pos_state]);
+ if (St_is_char(cur_state))
+ next_price += LZeb_price_literal(&e->eb, prev_byte, cur_byte);
+ else
+ next_price += LZeb_price_matched(&e->eb, prev_byte,
+ cur_byte, match_byte);
+
+ /* try last updates to next trial */
+ next_trial = &e->trials[cur+1];
+
+ Tr_update(next_trial, next_price, -1, cur); /* literal */
+
+ match_price = cur_trial->price +
+ price1(e->eb.bm_match[cur_state][pos_state]);
+ rep_match_price = match_price + price1(e->eb.bm_rep[cur_state]);
+
+ if (match_byte == cur_byte && next_trial->dis4 != 0 &&
+ next_trial->prev_index2 == single_step_trial) {
+ int price = rep_match_price +
+ LZeb_price_shortrep(&e->eb, cur_state, pos_state);
+ if (price <= next_trial->price) {
+ next_trial->price = price;
+ next_trial->dis4 = 0; /* rep0 */
+ next_trial->prev_index = cur;
+ }
+ }
+
+ int trm1mcur = max_num_trials - 1 - cur;
+
+ triable_bytes = Mb_avail_bytes(&e->eb.mb);
+ if (triable_bytes > trm1mcur)
+ triable_bytes = trm1mcur;
+ if (triable_bytes < min_match_len)
+ continue;
+
+ len_limit = min(e->match_len_limit, triable_bytes);
+
+ /* try literal + rep0 */
+ if (match_byte != cur_byte && next_trial->prev_index != cur)
+ num_trials = litrep0(e, cur_state, cur, cur_trial,
+ num_trials, triable_bytes, pos_state, next_price);
+
+ /* try rep distances */
+ num_trials = repdists(e, cur_state, cur, cur_trial,
+ num_trials, triable_bytes, pos_state,
+ rep_match_price, len_limit, &start_len);
+
+ /* try matches */
+ if (newlen >= start_len && newlen <= len_limit) {
+ int normal_match_price = match_price +
+ price0(e->eb.bm_rep[cur_state]);
+
+ while (num_trials < cur + newlen)
+ e->trials[++num_trials].price = infinite_price;
+
+ num_trials = trymatches(e, cur_state, cur, num_trials,
+ triable_bytes, pos_state, num_pairs,
+ normal_match_price, start_len);
+ }
+ }
+}
+
+static int
+encrepmatch(LZ_encoder *e, State state, int len, int dis, int pos_state)
+{
+ int bit = (dis == 0);
+
+ Re_encode_bit(&e->eb.renc, &e->eb.bm_rep0[state], !bit);
+ if (bit)
+ Re_encode_bit(&e->eb.renc, &e->eb.bm_len[state][pos_state],
+ len > 1);
+ else {
+ Re_encode_bit(&e->eb.renc, &e->eb.bm_rep1[state], dis > 1);
+ if (dis > 1)
+ Re_encode_bit(&e->eb.renc, &e->eb.bm_rep2[state],
+ dis > 2);
+ }
+ if (len == 1)
+ state = St_set_short_rep(state);
+ else {
+ Re_encode_len(&e->eb.renc, &e->eb.rep_len_model, len, pos_state);
+ Lp_decr_counter(&e->rep_len_prices, pos_state);
+ state = St_set_rep(state);
+ }
+ return state;
+}
+
+bool
+LZe_encode_member(LZ_encoder *e, uvlong member_size)
+{
+ uvlong member_size_limit = member_size - Ft_size - max_marker_size;
+ bool best = (e->match_len_limit > 12);
+ int dis_price_count = best? 1: 512;
+ int align_price_count = best? 1: dis_align_size;
+ int price_count = (e->match_len_limit > 36? 1013 : 4093);
+ int price_counter = 0; /* counters may decrement below 0 */
+ int dis_price_counter = 0;
+ int align_price_counter = 0;
+ int ahead, i;
+ int reps[num_rep_distances];
+ State state = 0;
+
+ for (i = 0; i < num_rep_distances; ++i)
+ reps[i] = 0;
+
+ if (Mb_data_position(&e->eb.mb) != 0 ||
+ Re_member_position(&e->eb.renc) != Fh_size)
+ return false; /* can be called only once */
+
+ if (!Mb_data_finished(&e->eb.mb)) { /* encode first byte */
+ uchar prev_byte = 0;
+ uchar cur_byte = Mb_peek(&e->eb.mb, 0);
+
+ Re_encode_bit(&e->eb.renc, &e->eb.bm_match[state][0], 0);
+ LZeb_encode_literal(&e->eb, prev_byte, cur_byte);
+ CRC32_update_byte(&e->eb.crc, cur_byte);
+ LZe_get_match_pairs(e, 0);
+ Mb_move_pos(&e->eb.mb);
+ }
+
+ while (!Mb_data_finished(&e->eb.mb)) {
+ if (price_counter <= 0 && e->pending_num_pairs == 0) {
+ /* recalculate prices every price_count bytes */
+ price_counter = price_count;
+ if (dis_price_counter <= 0) {
+ dis_price_counter = dis_price_count;
+ LZe_update_distance_prices(e);
+ }
+ if (align_price_counter <= 0) {
+ align_price_counter = align_price_count;
+ for (i = 0; i < dis_align_size; ++i)
+ e->align_prices[i] = price_symbol_reversed(
+ e->eb.bm_align, i, dis_align_bits);
+ }
+ Lp_update_prices(&e->match_len_prices);
+ Lp_update_prices(&e->rep_len_prices);
+ }
+
+ ahead = LZe_sequence_optimizer(e, reps, state);
+ price_counter -= ahead;
+
+ for (i = 0; ahead > 0;) {
+ int pos_state = (Mb_data_position(&e->eb.mb) - ahead) &
+ pos_state_mask;
+ int len = e->trials[i].price;
+ int dis = e->trials[i].dis4;
+ bool bit = (dis < 0);
+
+ Re_encode_bit(&e->eb.renc, &e->eb.bm_match[state][pos_state],
+ !bit);
+ if (bit) { /* literal byte */
+ uchar prev_byte = Mb_peek(&e->eb.mb, ahead+1);
+ uchar cur_byte = Mb_peek(&e->eb.mb, ahead);
+
+ CRC32_update_byte(&e->eb.crc, cur_byte);
+ if (St_is_char(state))
+ LZeb_encode_literal(&e->eb, prev_byte,
+ cur_byte);
+ else {
+ uchar match_byte = Mb_peek(&e->eb.mb,
+ ahead + reps[0] + 1);
+
+ LZeb_encode_matched(&e->eb, prev_byte,
+ cur_byte, match_byte);
+ }
+ state = St_set_char(state);
+ } else { /* match or repeated match */
+ CRC32_update_buf(&e->eb.crc,
+ Mb_ptr_to_current_pos(&e->eb.mb) - ahead,
+ len);
+ mtf_reps(dis, reps);
+ bit = (dis < num_rep_distances);
+ Re_encode_bit(&e->eb.renc, &e->eb.bm_rep[state],
+ bit);
+ if (bit) /* repeated match */
+ state = encrepmatch(e, state, len, dis,
+ pos_state);
+ else { /* match */
+ dis -= num_rep_distances;
+ LZeb_encode_pair(&e->eb, dis, len,
+ pos_state);
+ if (dis >= modeled_distances)
+ --align_price_counter;
+ --dis_price_counter;
+ Lp_decr_counter(
+ &e->match_len_prices, pos_state);
+ state = St_set_match(state);
+ }
+ }
+ ahead -= len;
+ i += len;
+ if (Re_member_position(&e->eb.renc) >= member_size_limit) {
+ if (!Mb_dec_pos(&e->eb.mb, ahead))
+ return false;
+ LZeb_full_flush(&e->eb, state);
+ return true;
+ }
+ }
+ }
+ LZeb_full_flush(&e->eb, state);
+ return true;
+}
diff -Nru /sys/src/cmd/lzip/encoder.h /sys/src/cmd/lzip/encoder.h
--- /sys/src/cmd/lzip/encoder.h Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/encoder.h Sat May 1 00:00:00 2021
@@ -0,0 +1,323 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+typedef struct Len_prices Len_prices;
+struct Len_prices {
+ struct Len_model *lm;
+ int len_syms;
+ int count;
+ int prices[pos_states][max_len_syms];
+ int counters[pos_states]; /* may decrement below 0 */
+};
+
+static void
+Lp_update_low_mid_prices(Len_prices *lp, int pos_state)
+{
+ int *pps = lp->prices[pos_state];
+ int tmp = price0(lp->lm->choice1);
+ int len = 0;
+ for (; len < len_low_syms && len < lp->len_syms; ++len)
+ pps[len] = tmp + price_symbol3(lp->lm->bm_low[pos_state], len);
+ if (len >= lp->len_syms)
+ return;
+ tmp = price1(lp->lm->choice1) + price0(lp->lm->choice2);
+ for (; len < len_low_syms + len_mid_syms && len < lp->len_syms; ++len)
+ pps[len] = tmp +
+ price_symbol3(lp->lm->bm_mid[pos_state], len - len_low_syms);
+}
+
+static void
+Lp_update_high_prices(Len_prices *lp)
+{
+ int tmp = price1(lp->lm->choice1) + price1(lp->lm->choice2);
+ int len;
+ for (len = len_low_syms + len_mid_syms; len < lp->len_syms; ++len)
+ /* using 4 slots per value makes "Lp_price" faster */
+ lp->prices[3][len] = lp->prices[2][len] =
+ lp->prices[1][len] = lp->prices[0][len] = tmp +
+ price_symbol8(lp->lm->bm_high, len - len_low_syms - len_mid_syms);
+}
+
+static void
+Lp_reset(Len_prices *lp)
+{
+ int i;
+ for (i = 0; i < pos_states; ++i)
+ lp->counters[i] = 0;
+}
+
+static void
+Lp_init(Len_prices *lp, Len_model *lm, int match_len_limit)
+{
+ lp->lm = lm;
+ lp->len_syms = match_len_limit + 1 - min_match_len;
+ lp->count = (match_len_limit > 12) ? 1 : lp->len_syms;
+ Lp_reset(lp);
+}
+
+static void
+Lp_decr_counter(Len_prices *lp, int pos_state)
+{
+ --lp->counters[pos_state];
+}
+
+static void
+Lp_update_prices(Len_prices *lp)
+{
+ int pos_state;
+ bool high_pending = false;
+
+ for (pos_state = 0; pos_state < pos_states; ++pos_state)
+ if (lp->counters[pos_state] <= 0) {
+ lp->counters[pos_state] = lp->count;
+ Lp_update_low_mid_prices(lp, pos_state);
+ high_pending = true;
+ }
+ if (high_pending && lp->len_syms > len_low_syms + len_mid_syms)
+ Lp_update_high_prices(lp);
+}
+
+typedef struct Pair Pair;
+struct Pair { /* distance-length pair */
+ int dis;
+ int len;
+};
+
+enum {
+ infinite_price = 0x0FFFFFFF,
+ max_num_trials = 1 << 13,
+ single_step_trial = -2,
+ dual_step_trial = -1
+};
+
+typedef struct Trial Trial;
+struct Trial {
+ State state;
+ int price; /* dual use var; cumulative price, match length */
+ int dis4; /* -1 for literal, or rep, or match distance + 4 */
+ int prev_index; /* index of prev trial in trials[] */
+ int prev_index2; /* -2 trial is single step */
+ /* -1 literal + rep0 */
+ /* >= 0 (rep or match) + literal + rep0 */
+ int reps[num_rep_distances];
+};
+
+static void
+Tr_update2(Trial *trial, int pr, int p_i)
+{
+ if (pr < trial->price) {
+ trial->price = pr;
+ trial->dis4 = 0;
+ trial->prev_index = p_i;
+ trial->prev_index2 = dual_step_trial;
+ }
+}
+
+static void
+Tr_update3(Trial *trial, int pr, int distance4, int p_i, int p_i2)
+{
+ if (pr < trial->price) {
+ trial->price = pr;
+ trial->dis4 = distance4;
+ trial->prev_index = p_i;
+ trial->prev_index2 = p_i2;
+ }
+}
+
+typedef struct LZ_encoder LZ_encoder;
+struct LZ_encoder {
+ LZ_encoder_base eb;
+ int cycles;
+ int match_len_limit;
+ Len_prices match_len_prices;
+ Len_prices rep_len_prices;
+ int pending_num_pairs;
+ Pair pairs[max_match_len+1];
+ Trial trials[max_num_trials];
+
+ int dis_slot_prices[len_states][2*max_dict_bits];
+ int dis_prices[len_states][modeled_distances];
+ int align_prices[dis_align_size];
+ int num_dis_slots;
+};
+
+static bool
+Mb_dec_pos(struct Matchfinder_base *mb, int ahead)
+{
+ if (ahead < 0 || mb->pos < ahead)
+ return false;
+ mb->pos -= ahead;
+ if (mb->cyclic_pos < ahead)
+ mb->cyclic_pos += mb->dict_size + 1;
+ mb->cyclic_pos -= ahead;
+ return true;
+}
+
+int LZe_get_match_pairs(struct LZ_encoder *e, struct Pair *pairs);
+
+/* move dis4 to the front of reps; do nothing if (dis4 <= 0) */
+static void
+mtf_reps(int dis4, int reps[num_rep_distances])
+{
+ if (dis4 >= num_rep_distances) /* match */ {
+ reps[3] = reps[2];
+ reps[2] = reps[1];
+ reps[1] = reps[0];
+ reps[0] = dis4 - num_rep_distances;
+ } else if (dis4 > 0) /* repeated match */ {
+ int distance = reps[dis4];
+ int i;
+ for (i = dis4; i > 0; --i)
+ reps[i] = reps[i-1];
+ reps[0] = distance;
+ }
+}
+
+static int
+LZeb_price_shortrep(struct LZ_encoder_base *eb, State state, int pos_state)
+{
+ return price0(eb->bm_rep0[state]) + price0(eb->bm_len[state][pos_state]);
+}
+
+static int
+LZeb_price_rep(struct LZ_encoder_base *eb, int rep, State state, int pos_state)
+{
+ int price;
+ if (rep == 0)
+ return price0(eb->bm_rep0[state]) +
+ price1(eb->bm_len[state][pos_state]);
+ price = price1(eb->bm_rep0[state]);
+ if (rep == 1)
+ price += price0(eb->bm_rep1[state]);
+ else {
+ price += price1(eb->bm_rep1[state]);
+ price += price_bit(eb->bm_rep2[state], rep - 2);
+ }
+ return price;
+}
+
+static int
+LZe_price_rep0_len(struct LZ_encoder *e, int len, State state, int pos_state)
+{
+ return LZeb_price_rep(&e->eb, 0, state, pos_state) +
+ Lp_price(&e->rep_len_prices, len, pos_state);
+}
+
+static int
+LZe_price_pair(struct LZ_encoder *e, int dis, int len, int pos_state)
+{
+ int price = Lp_price(&e->match_len_prices, len, pos_state);
+ int len_state = get_len_state(len);
+ if (dis < modeled_distances)
+ return price + e->dis_prices[len_state][dis];
+ else
+ return price + e->dis_slot_prices[len_state][get_slot(dis)] +
+ e->align_prices[dis & (dis_align_size - 1)];
+}
+
+static int
+LZe_read_match_distances(struct LZ_encoder *e)
+{
+ int num_pairs = LZe_get_match_pairs(e, e->pairs);
+ if (num_pairs > 0) {
+ int len = e->pairs[num_pairs-1].len;
+ if (len == e->match_len_limit && len < max_match_len)
+ e->pairs[num_pairs-1].len =
+ Mb_true_match_len(&e->eb.mb, len, e->pairs[num_pairs-1].dis + 1);
+ }
+ return num_pairs;
+}
+
+static void
+LZe_move_and_update(struct LZ_encoder *e, int n)
+{
+ while (true) {
+ Mb_move_pos(&e->eb.mb);
+ if (--n <= 0)
+ break;
+ LZe_get_match_pairs(e, 0);
+ }
+}
+
+static void
+LZe_backward(struct LZ_encoder *e, int cur)
+{
+ int dis4 = e->trials[cur].dis4;
+ while (cur > 0) {
+ int prev_index = e->trials[cur].prev_index;
+ struct Trial *prev_trial = &e->trials[prev_index];
+
+ if (e->trials[cur].prev_index2 != single_step_trial) {
+ prev_trial->dis4 = -1; /* literal */
+ prev_trial->prev_index = prev_index - 1;
+ prev_trial->prev_index2 = single_step_trial;
+ if (e->trials[cur].prev_index2 >= 0) {
+ struct Trial *prev_trial2 = &e->trials[prev_index-1];
+ prev_trial2->dis4 = dis4;
+ dis4 = 0; /* rep0 */
+ prev_trial2->prev_index = e->trials[cur].prev_index2;
+ prev_trial2->prev_index2 = single_step_trial;
+ }
+ }
+ prev_trial->price = cur - prev_index; /* len */
+ cur = dis4;
+ dis4 = prev_trial->dis4;
+ prev_trial->dis4 = cur;
+ cur = prev_index;
+ }
+}
+
+enum {
+ Nprevpos3 = 1 << 16,
+ Nprevpos2 = 1 << 10
+};
+
+static bool
+LZe_init(struct LZ_encoder *e, int dict_size, int len_limit, int ifd, int outfd)
+{
+ enum {
+ before = max_num_trials,
+ /* bytes to keep in buffer after pos */
+ after_size = (2 * max_match_len) + 1,
+ dict_factor = 2,
+ Nprevpos23 = Nprevpos2 + Nprevpos3,
+ pos_array_factor = 2
+ };
+
+ if (!LZeb_init(&e->eb, before, dict_size, after_size, dict_factor,
+ Nprevpos23, pos_array_factor, ifd, outfd))
+ return false;
+ e->cycles = (len_limit < max_match_len) ? 16 + (len_limit / 2) : 256;
+ e->match_len_limit = len_limit;
+ Lp_init(&e->match_len_prices, &e->eb.match_len_model, e->match_len_limit);
+ Lp_init(&e->rep_len_prices, &e->eb.rep_len_model, e->match_len_limit);
+ e->pending_num_pairs = 0;
+ e->num_dis_slots = 2 * real_bits(e->eb.mb.dict_size - 1);
+ e->trials[1].prev_index = 0;
+ e->trials[1].prev_index2 = single_step_trial;
+ return true;
+}
+
+static void
+LZe_reset(struct LZ_encoder *e)
+{
+ LZeb_reset(&e->eb);
+ Lp_reset(&e->match_len_prices);
+ Lp_reset(&e->rep_len_prices);
+ e->pending_num_pairs = 0;
+}
+
+bool LZe_encode_member(struct LZ_encoder *e, uvlong member_size);
diff -Nru /sys/src/cmd/lzip/encoder_base.c /sys/src/cmd/lzip/encoder_base.c
--- /sys/src/cmd/lzip/encoder_base.c Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/encoder_base.c Sat May 1 00:00:00 2021
@@ -0,0 +1,203 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include "lzip.h"
+#include "encoder_base.h"
+
+Dis_slots dis_slots;
+Prob_prices prob_prices;
+
+bool
+Mb_read_block(Matchfinder_base *mb)
+{
+ if (!mb->at_stream_end && mb->stream_pos < mb->buffer_size) {
+ int size = mb->buffer_size - mb->stream_pos;
+ int rd = readblock(mb->infd, mb->buffer + mb->stream_pos, size);
+
+ mb->stream_pos += rd;
+ if (rd != size && errno) {
+ show_error( "Read error", errno, false );
+ cleanup_and_fail(1);
+ }
+ if (rd < size) {
+ mb->at_stream_end = true;
+ mb->pos_limit = mb->buffer_size;
+ }
+ }
+ return mb->pos < mb->stream_pos;
+}
+
+void
+Mb_normalize_pos(Matchfinder_base *mb)
+{
+ if (mb->pos > mb->stream_pos)
+ internal_error( "pos > stream_pos in Mb_normalize_pos." );
+ if (!mb->at_stream_end) {
+ int i, offset = mb->pos - mb->before_size - mb->dict_size;
+ int size = mb->stream_pos - offset;
+
+ memmove(mb->buffer, mb->buffer + offset, size);
+ mb->partial_data_pos += offset;
+ mb->pos -= offset; /* pos = before_size + dict_size */
+ mb->stream_pos -= offset;
+ for (i = 0; i < mb->num_prev_positions; ++i)
+ if (mb->prev_positions[i] < offset)
+ mb->prev_positions[i] = 0;
+ else
+ mb->prev_positions[i] -= offset;
+ for (i = 0; i < mb->pos_array_size; ++i)
+ if (mb->pos_array[i] < offset)
+ mb->pos_array[i] = 0;
+ else
+ mb->pos_array[i] -= offset;
+ Mb_read_block(mb);
+ }
+}
+
+bool
+Mb_init(Matchfinder_base *mb, int before, int dict_size, int after_size, int dict_factor, int num_prev_positions23, int pos_array_factor, int ifd)
+{
+ int buffer_size_limit = (dict_factor * dict_size) + before + after_size;
+ unsigned size;
+ int i;
+
+ mb->partial_data_pos = 0;
+ mb->before_size = before;
+ mb->pos = 0;
+ mb->cyclic_pos = 0;
+ mb->stream_pos = 0;
+ mb->infd = ifd;
+ mb->at_stream_end = false;
+
+ mb->buffer_size = max(65536, dict_size);
+ mb->buffer = (uchar *)malloc(mb->buffer_size);
+ if (!mb->buffer)
+ return false;
+ if (Mb_read_block(mb) && !mb->at_stream_end &&
+ mb->buffer_size < buffer_size_limit) {
+ uchar * tmp;
+ mb->buffer_size = buffer_size_limit;
+ tmp = (uchar *)realloc(mb->buffer, mb->buffer_size);
+ if (!tmp) {
+ free(mb->buffer);
+ return false;
+ }
+ mb->buffer = tmp;
+ Mb_read_block(mb);
+ }
+ if (mb->at_stream_end && mb->stream_pos < dict_size)
+ mb->dict_size = max(min_dict_size, mb->stream_pos);
+ else
+ mb->dict_size = dict_size;
+ mb->pos_limit = mb->buffer_size;
+ if (!mb->at_stream_end)
+ mb->pos_limit -= after_size;
+ size = real_bits(mb->dict_size - 1) - 2;
+ if (size < 16)
+ size = 16;
+ size = 1 << size;
+// if (mb->dict_size > (1 << 26)) /* 64 MiB */
+// size >>= 1;
+ mb->key4_mask = size - 1;
+ size += num_prev_positions23;
+
+ mb->num_prev_positions = size;
+ mb->pos_array_size = pos_array_factor * (mb->dict_size + 1);
+ size += mb->pos_array_size;
+ if (size * sizeof mb->prev_positions[0] <= size)
+ mb->prev_positions = 0;
+ else
+ mb->prev_positions =
+ (int32_t *)malloc(size * sizeof mb->prev_positions[0]);
+ if (!mb->prev_positions) {
+ free(mb->buffer);
+ return false;
+ }
+ mb->pos_array = mb->prev_positions + mb->num_prev_positions;
+ for (i = 0; i < mb->num_prev_positions; ++i)
+ mb->prev_positions[i] = 0;
+ return true;
+}
+
+void
+Mb_reset(Matchfinder_base *mb)
+{
+ int i;
+
+ if (mb->stream_pos > mb->pos)
+ memmove(mb->buffer, mb->buffer + mb->pos, mb->stream_pos - mb->pos);
+ mb->partial_data_pos = 0;
+ mb->stream_pos -= mb->pos;
+ mb->pos = 0;
+ mb->cyclic_pos = 0;
+ for (i = 0; i < mb->num_prev_positions; ++i)
+ mb->prev_positions[i] = 0;
+ Mb_read_block(mb);
+}
+
+void
+Re_flush_data(Range_encoder *renc)
+{
+ if (renc->pos > 0) {
+ if (renc->outfd >= 0 &&
+ writeblock(renc->outfd, renc->buffer, renc->pos) != renc->pos) {
+ show_error( "Write error", errno, false );
+ cleanup_and_fail(1);
+ }
+ renc->partial_member_pos += renc->pos;
+ renc->pos = 0;
+ show_progress(0, 0, 0, 0);
+ }
+}
+
+/* End Of Stream mark => (dis == 0xFFFFFFFFU, len == min_match_len) */
+void
+LZeb_full_flush(LZ_encoder_base *eb, State state)
+{
+ int i;
+ int pos_state = Mb_data_position(&eb->mb) & pos_state_mask;
+ File_trailer trailer;
+ Re_encode_bit(&eb->renc, &eb->bm_match[state][pos_state], 1);
+ Re_encode_bit(&eb->renc, &eb->bm_rep[state], 0);
+ LZeb_encode_pair(eb, 0xFFFFFFFFU, min_match_len, pos_state);
+ Re_flush(&eb->renc);
+ Ft_set_data_crc(trailer, LZeb_crc(eb));
+ Ft_set_data_size(trailer, Mb_data_position(&eb->mb));
+ Ft_set_member_size(trailer, Re_member_position(&eb->renc) + Ft_size);
+ for (i = 0; i < Ft_size; ++i)
+ Re_put_byte(&eb->renc, trailer[i]);
+ Re_flush_data(&eb->renc);
+}
+
+void
+LZeb_reset(LZ_encoder_base *eb)
+{
+ Mb_reset(&eb->mb);
+ eb->crc = 0xFFFFFFFFU;
+ Bm_array_init(eb->bm_literal[0], (1 << literal_context_bits) * 0x300);
+ Bm_array_init(eb->bm_match[0], states * pos_states);
+ Bm_array_init(eb->bm_rep, states);
+ Bm_array_init(eb->bm_rep0, states);
+ Bm_array_init(eb->bm_rep1, states);
+ Bm_array_init(eb->bm_rep2, states);
+ Bm_array_init(eb->bm_len[0], states * pos_states);
+ Bm_array_init(eb->bm_dis_slot[0], len_states * (1 << dis_slot_bits));
+ Bm_array_init(eb->bm_dis, modeled_distances - end_dis_model + 1);
+ Bm_array_init(eb->bm_align, dis_align_size);
+ Lm_init(&eb->match_len_model);
+ Lm_init(&eb->rep_len_model);
+ Re_reset(&eb->renc);
+}
diff -Nru /sys/src/cmd/lzip/encoder_base.h /sys/src/cmd/lzip/encoder_base.h
--- /sys/src/cmd/lzip/encoder_base.h Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/encoder_base.h Sat May 1 00:00:00 2021
@@ -0,0 +1,559 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "lzip.h"
+
+static void
+Dis_slots_init(void)
+{
+ int i, size, slot;
+ for (slot = 0; slot < 4; ++slot)
+ dis_slots[slot] = slot;
+ for (i = 4, size = 2, slot = 4; slot < 20; slot += 2) {
+ memset(&dis_slots[i], slot, size);
+ memset(&dis_slots[i+size], slot + 1, size);
+ size <<= 1;
+ i += size;
+ }
+}
+
+static uchar
+get_slot(unsigned dis)
+{
+ if (dis < (1 << 10))
+ return dis_slots[dis];
+ if (dis < (1 << 19))
+ return dis_slots[dis>> 9] + 18;
+ if (dis < (1 << 28))
+ return dis_slots[dis>>18] + 36;
+ return dis_slots[dis>>27] + 54;
+}
+
+static void
+Prob_prices_init(void)
+{
+ int i, j;
+ for (i = 0; i < bit_model_total >> price_step_bits; ++i) {
+ unsigned val = (i * price_step) + (price_step / 2);
+ int bits = 0; /* base 2 logarithm of val */
+
+ for (j = 0; j < price_shift_bits; ++j) {
+ val = val * val;
+ bits <<= 1;
+ while (val >= (1 << 16)) {
+ val >>= 1;
+ ++bits;
+ }
+ }
+ bits += 15; /* remaining bits in val */
+ prob_prices[i] = (bit_model_total_bits << price_shift_bits) - bits;
+ }
+}
+
+static int
+price_symbol3(Bit_model bm[], int symbol)
+{
+ int price;
+ bool bit = symbol & 1;
+
+ symbol |= 8;
+ symbol >>= 1;
+ price = price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ return price + price_bit(bm[1], symbol & 1);
+}
+
+static int
+price_symbol6(Bit_model bm[], unsigned symbol)
+{
+ int price;
+ bool bit = symbol & 1;
+
+ symbol |= 64;
+ symbol >>= 1;
+ price = price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ return price + price_bit(bm[1], symbol & 1);
+}
+
+static int
+price_symbol8(Bit_model bm[], int symbol)
+{
+ int price;
+ bool bit = symbol & 1;
+ symbol |= 0x100;
+ symbol >>= 1;
+ price = price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[symbol], bit);
+ return price + price_bit(bm[1], symbol & 1);
+}
+
+static int
+price_symbol_reversed(Bit_model bm[], int symbol, int num_bits)
+{
+ int price = 0;
+ int model = 1;
+ int i;
+
+ for (i = num_bits; i > 0; --i) {
+ bool bit = symbol & 1;
+ symbol >>= 1;
+ price += price_bit(bm[model], bit);
+ model = (model << 1) | bit;
+ }
+ return price;
+}
+
+static int
+price_matched(Bit_model bm[], unsigned symbol, unsigned match_byte)
+{
+ int price = 0;
+ unsigned mask = 0x100;
+
+ symbol |= mask;
+ for (;;) {
+ unsigned match_bit = (match_byte <<= 1) & mask;
+ bool bit = (symbol <<= 1) & 0x100;
+
+ price += price_bit(bm[(symbol>>9) + match_bit + mask], bit);
+ if (symbol >= 0x10000)
+ return price;
+ mask &= ~(match_bit ^ symbol);
+ /* if(match_bit != bit) mask = 0; */
+ }
+}
+
+struct Matchfinder_base {
+ uvlong partial_data_pos;
+ uchar * buffer; /* input buffer */
+ int32_t * prev_positions; /* 1 + last seen position of key, else 0 */
+ int32_t * pos_array; /* may be tree or chain */
+ int before_size; /* bytes to keep in buffer before dictionary */
+ int buffer_size;
+ int dict_size; /* bytes to keep in buffer before pos */
+ int pos; /* current pos in buffer */
+ int cyclic_pos; /* cycles through [0, dict_size] */
+ int stream_pos; /* first byte not yet read from file */
+ int pos_limit; /* when reached, a new block must be read */
+ int key4_mask;
+ int num_prev_positions; /* size of prev_positions */
+ int pos_array_size;
+ int infd; /* input file descriptor */
+ bool at_stream_end; /* stream_pos shows real end of file */
+};
+
+bool Mb_read_block(Matchfinder_base *mb);
+void Mb_normalize_pos(Matchfinder_base *mb);
+bool Mb_init(Matchfinder_base *mb, int before, int dict_size, int after_size, int dict_factor, int num_prev_positions23, int pos_array_factor, int ifd);
+
+static void
+Mb_free(Matchfinder_base *mb)
+{
+ free(mb->prev_positions);
+ free(mb->buffer);
+}
+
+static int
+Mb_avail_bytes(Matchfinder_base *mb)
+{
+ return mb->stream_pos - mb->pos;
+}
+
+static uvlong
+Mb_data_position(Matchfinder_base *mb)
+{
+ return mb->partial_data_pos + mb->pos;
+}
+
+static bool
+Mb_data_finished(Matchfinder_base *mb)
+{
+ return mb->at_stream_end && mb->pos >= mb->stream_pos;
+}
+
+static int
+Mb_true_match_len(Matchfinder_base *mb, int index, int distance)
+{
+ uchar * data = mb->buffer + mb->pos;
+ int i = index;
+ int len_limit = min(Mb_avail_bytes(mb), max_match_len);
+ while (i < len_limit && data[i-distance] == data[i])
+ ++i;
+ return i;
+}
+
+static void
+Mb_move_pos(Matchfinder_base *mb)
+{
+ if (++mb->cyclic_pos > mb->dict_size)
+ mb->cyclic_pos = 0;
+ if (++mb->pos >= mb->pos_limit)
+ Mb_normalize_pos(mb);
+}
+
+void Mb_reset(Matchfinder_base *mb);
+
+enum { re_buffer_size = 65536 };
+
+typedef struct LZ_encoder_base LZ_encoder_base;
+typedef struct Matchfinder_base Matchfinder_base;
+typedef struct Range_encoder Range_encoder;
+
+struct Range_encoder {
+ uvlong low;
+ uvlong partial_member_pos;
+ uchar * buffer; /* output buffer */
+ int pos; /* current pos in buffer */
+ uint32_t range;
+ unsigned ff_count;
+ int outfd; /* output file descriptor */
+ uchar cache;
+ File_header header;
+};
+
+void Re_flush_data(Range_encoder *renc);
+
+static void
+Re_put_byte(Range_encoder *renc, uchar b)
+{
+ renc->buffer[renc->pos] = b;
+ if (++renc->pos >= re_buffer_size)
+ Re_flush_data(renc);
+}
+
+static void
+Re_shift_low(Range_encoder *renc)
+{
+ if (renc->low >> 24 != 0xFF) {
+ bool carry = (renc->low > 0xFFFFFFFFU);
+ Re_put_byte(renc, renc->cache + carry);
+ for (; renc->ff_count > 0; --renc->ff_count)
+ Re_put_byte(renc, 0xFF + carry);
+ renc->cache = renc->low >> 24;
+ } else
+ ++renc->ff_count;
+ renc->low = (renc->low & 0x00FFFFFFU) << 8;
+}
+
+static void
+Re_reset(Range_encoder *renc)
+{
+ int i;
+ renc->low = 0;
+ renc->partial_member_pos = 0;
+ renc->pos = 0;
+ renc->range = 0xFFFFFFFFU;
+ renc->ff_count = 0;
+ renc->cache = 0;
+ for (i = 0; i < Fh_size; ++i)
+ Re_put_byte(renc, renc->header[i]);
+}
+
+static bool
+Re_init(Range_encoder *renc, unsigned dict_size, int ofd)
+{
+ renc->buffer = (uchar *)malloc(re_buffer_size);
+ if (!renc->buffer)
+ return false;
+ renc->outfd = ofd;
+ Fh_set_magic(renc->header);
+ Fh_set_dict_size(renc->header, dict_size);
+ Re_reset(renc);
+ return true;
+}
+
+static void
+Re_free(Range_encoder *renc)
+{
+ free(renc->buffer);
+}
+
+static uvlong
+Re_member_position(Range_encoder *renc)
+{
+ return renc->partial_member_pos + renc->pos + renc->ff_count;
+}
+
+static void
+Re_flush(Range_encoder *renc)
+{
+ int i;
+ for (i = 0; i < 5; ++i)
+ Re_shift_low(renc);
+}
+
+static void
+Re_encode(Range_encoder *renc, int symbol, int num_bits)
+{
+ unsigned mask;
+ for (mask = 1 << (num_bits - 1); mask > 0; mask >>= 1) {
+ renc->range >>= 1;
+ if (symbol & mask)
+ renc->low += renc->range;
+ if (renc->range <= 0x00FFFFFFU) {
+ renc->range <<= 8;
+ Re_shift_low(renc);
+ }
+ }
+}
+
+static void
+Re_encode_bit(Range_encoder *renc, Bit_model *probability, bool bit)
+{
+ Bit_model prob = *probability;
+ uint32_t bound = (renc->range >> bit_model_total_bits) * prob;
+
+ if (!bit) {
+ renc->range = bound;
+ *probability += (bit_model_total - prob) >> bit_model_move_bits;
+ } else {
+ renc->low += bound;
+ renc->range -= bound;
+ *probability -= prob >> bit_model_move_bits;
+ }
+ if (renc->range <= 0x00FFFFFFU) {
+ renc->range <<= 8;
+ Re_shift_low(renc);
+ }
+}
+
+static void
+Re_encode_tree3(Range_encoder *renc, Bit_model bm[], int symbol)
+{
+ int model = 1;
+ bool bit = (symbol >> 2) & 1;
+
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ bit = (symbol >> 1) & 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ Re_encode_bit(renc, &bm[model], symbol & 1);
+}
+
+static void
+Re_encode_tree6(Range_encoder *renc, Bit_model bm[], unsigned symbol)
+{
+ int model = 1;
+ bool bit = (symbol >> 5) & 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ bit = (symbol >> 4) & 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ bit = (symbol >> 3) & 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ bit = (symbol >> 2) & 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ bit = (symbol >> 1) & 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ Re_encode_bit(renc, &bm[model], symbol & 1);
+}
+
+static void
+Re_encode_tree8(Range_encoder *renc, Bit_model bm[], int symbol)
+{
+ int model = 1;
+ int i;
+ for (i = 7; i >= 0; --i) {
+ bool bit = (symbol >> i) & 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ }
+}
+
+static void
+Re_encode_tree_reversed(Range_encoder *renc, Bit_model bm[], int symbol, int num_bits)
+{
+ int model = 1;
+ int i;
+ for (i = num_bits; i > 0; --i) {
+ bool bit = symbol & 1;
+ symbol >>= 1;
+ Re_encode_bit(renc, &bm[model], bit);
+ model = (model << 1) | bit;
+ }
+}
+
+static void
+Re_encode_matched(Range_encoder *renc, Bit_model bm[], unsigned symbol, unsigned match_byte)
+{
+ unsigned mask = 0x100;
+ symbol |= mask;
+ while (true) {
+ unsigned match_bit = (match_byte <<= 1) & mask;
+ bool bit = (symbol <<= 1) & 0x100;
+ Re_encode_bit(renc, &bm[(symbol>>9)+match_bit+mask], bit);
+ if (symbol >= 0x10000)
+ break;
+ mask &= ~(match_bit ^ symbol);
+ /* if(match_bit != bit) mask = 0; */
+ }
+}
+
+static void
+Re_encode_len(Range_encoder *renc, Len_model *lm, int symbol, int pos_state)
+{
+ bool bit = ((symbol -= min_match_len) >= len_low_syms);
+ Re_encode_bit(renc, &lm->choice1, bit);
+ if (!bit)
+ Re_encode_tree3(renc, lm->bm_low[pos_state], symbol);
+ else {
+ bit = ((symbol -= len_low_syms) >= len_mid_syms);
+ Re_encode_bit(renc, &lm->choice2, bit);
+ if (!bit)
+ Re_encode_tree3(renc, lm->bm_mid[pos_state], symbol);
+ else
+ Re_encode_tree8(renc, lm->bm_high, symbol - len_mid_syms);
+ }
+}
+
+enum {
+ max_marker_size = 16,
+ num_rep_distances = 4 /* must be 4 */
+};
+
+struct LZ_encoder_base {
+ struct Matchfinder_base mb;
+ uint32_t crc;
+
+ Bit_model bm_literal[1<<literal_context_bits][0x300];
+ Bit_model bm_match[states][pos_states];
+ Bit_model bm_rep[states];
+ Bit_model bm_rep0[states];
+ Bit_model bm_rep1[states];
+ Bit_model bm_rep2[states];
+ Bit_model bm_len[states][pos_states];
+ Bit_model bm_dis_slot[len_states][1<<dis_slot_bits];
+ Bit_model bm_dis[modeled_distances-end_dis_model+1];
+ Bit_model bm_align[dis_align_size];
+ struct Len_model match_len_model;
+ struct Len_model rep_len_model;
+ struct Range_encoder renc;
+};
+
+void LZeb_reset(LZ_encoder_base *eb);
+
+static bool
+LZeb_init(LZ_encoder_base *eb, int before, int dict_size, int after_size, int dict_factor, int num_prev_positions23, int pos_array_factor, int ifd, int outfd)
+{
+ if (!Mb_init(&eb->mb, before, dict_size, after_size, dict_factor,
+ num_prev_positions23, pos_array_factor, ifd))
+ return false;
+ if (!Re_init(&eb->renc, eb->mb.dict_size, outfd))
+ return false;
+ LZeb_reset(eb);
+ return true;
+}
+
+static void
+LZeb_free(LZ_encoder_base *eb)
+{
+ Re_free(&eb->renc);
+ Mb_free(&eb->mb);
+}
+
+static unsigned
+LZeb_crc(LZ_encoder_base *eb)
+{
+ return eb->crc ^ 0xFFFFFFFFU;
+}
+
+static int
+LZeb_price_literal(LZ_encoder_base *eb, uchar prev_byte, uchar symbol)
+{
+ return price_symbol8(eb->bm_literal[get_lit_state(prev_byte)], symbol);
+}
+
+static int
+LZeb_price_matched(LZ_encoder_base *eb, uchar prev_byte, uchar symbol, uchar match_byte)
+{
+ return price_matched(eb->bm_literal[get_lit_state(prev_byte)], symbol,
+ match_byte);
+}
+
+static void
+LZeb_encode_literal(LZ_encoder_base *eb, uchar prev_byte, uchar symbol)
+{
+ Re_encode_tree8(&eb->renc, eb->bm_literal[get_lit_state(prev_byte)],
+ symbol);
+}
+
+static void
+LZeb_encode_matched(LZ_encoder_base *eb, uchar prev_byte, uchar symbol, uchar match_byte)
+{
+ Re_encode_matched(&eb->renc, eb->bm_literal[get_lit_state(prev_byte)],
+ symbol, match_byte);
+}
+
+static void
+LZeb_encode_pair(LZ_encoder_base *eb, unsigned dis, int len, int pos_state)
+{
+ unsigned dis_slot = get_slot(dis);
+ Re_encode_len(&eb->renc, &eb->match_len_model, len, pos_state);
+ Re_encode_tree6(&eb->renc, eb->bm_dis_slot[get_len_state(len)], dis_slot);
+
+ if (dis_slot >= start_dis_model) {
+ int direct_bits = (dis_slot >> 1) - 1;
+ unsigned base = (2 | (dis_slot & 1)) << direct_bits;
+ unsigned direct_dis = dis - base;
+
+ if (dis_slot < end_dis_model)
+ Re_encode_tree_reversed(&eb->renc, eb->bm_dis + (base - dis_slot),
+ direct_dis, direct_bits);
+ else {
+ Re_encode(&eb->renc, direct_dis >> dis_align_bits,
+ direct_bits - dis_align_bits);
+ Re_encode_tree_reversed(&eb->renc, eb->bm_align, direct_dis, dis_align_bits);
+ }
+ }
+}
+
+void LZeb_full_flush(LZ_encoder_base *eb, State state);
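Illustration only, not part of the patch: the adaptive probability update
inside Re_encode_bit (encoder_base.h, above) can be tried in isolation.
The sketch copies the bit_model_* constants from lzip.h; the helper names
sketch_bm_update and sketch_bound are hypothetical and do not exist in
clzip.

```c
#include <assert.h>

enum {
	bit_model_move_bits = 5,
	bit_model_total_bits = 11,
	bit_model_total = 1 << bit_model_total_bits
};

/*
 * Hypothetical helper: return the new probability-of-zero after one
 * bit has been coded, using the same update rule as Re_encode_bit.
 */
static int
sketch_bm_update(int prob, int bit)
{
	assert(prob > 0 && prob < bit_model_total);
	if (!bit)	/* a 0 was seen: move the estimate towards bit_model_total */
		return prob + ((bit_model_total - prob) >> bit_model_move_bits);
	/* a 1 was seen: move the estimate towards 0 */
	return prob - (prob >> bit_model_move_bits);
}

/*
 * Hypothetical helper: the part of a 32-bit range assigned to a 0 bit,
 * computed as in Re_encode_bit.  The product always fits in 32 bits
 * because prob < 2^11 and (range >> 11) < 2^21.
 */
static unsigned long
sketch_bound(unsigned long range, int prob)
{
	return (range >> bit_model_total_bits) * (unsigned long)prob;
}
```

Starting from the initial probability bit_model_total / 2 = 1024, coding
a 0 moves the estimate to 1056 and coding a 1 moves it to 992.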
diff -Nru /sys/src/cmd/lzip/fast_encoder.c /sys/src/cmd/lzip/fast_encoder.c
--- /sys/src/cmd/lzip/fast_encoder.c Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/fast_encoder.c Sat May 1 00:00:00 2021
@@ -0,0 +1,188 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include "lzip.h"
+#include "encoder_base.h"
+#include "fast_encoder.h"
+
+int
+FLZe_longest_match_len(FLZ_encoder *fe, int *distance)
+{
+ enum { len_limit = 16 };
+ uchar *data = Mb_ptr_to_current_pos(&fe->eb.mb);
+ int32_t * ptr0 = fe->eb.mb.pos_array + fe->eb.mb.cyclic_pos;
+ int pos1 = fe->eb.mb.pos + 1;
+ int maxlen = 0, newpos1, count;
+ int available = min(Mb_avail_bytes(&fe->eb.mb), max_match_len);
+
+ if (available < len_limit)
+ return 0;
+
+ fe->key4 = ((fe->key4 << 4) ^ data[3]) & fe->eb.mb.key4_mask;
+ newpos1 = fe->eb.mb.prev_positions[fe->key4];
+ fe->eb.mb.prev_positions[fe->key4] = pos1;
+
+ for (count = 4; ;) {
+ int32_t * newptr;
+ int delta;
+
+ if (newpos1 <= 0 || --count < 0 ||
+ (delta = pos1 - newpos1) > fe->eb.mb.dict_size) {
+ *ptr0 = 0;
+ break;
+ }
+ newptr = fe->eb.mb.pos_array +
+ (fe->eb.mb.cyclic_pos - delta +
+ ((fe->eb.mb.cyclic_pos >= delta) ? 0 : fe->eb.mb.dict_size + 1));
+
+ if (data[maxlen-delta] == data[maxlen]) {
+ int len = 0;
+ while (len < available && data[len-delta] == data[len])
+ ++len;
+ if (maxlen < len) {
+ maxlen = len;
+ *distance = delta - 1;
+ if (maxlen >= len_limit) {
+ *ptr0 = *newptr;
+ break;
+ }
+ }
+ }
+
+ *ptr0 = newpos1;
+ ptr0 = newptr;
+ newpos1 = *ptr0;
+ }
+ return maxlen;
+}
+
+bool
+FLZe_encode_member(FLZ_encoder *fe, uvlong member_size)
+{
+ uvlong member_size_limit = member_size - Ft_size - max_marker_size;
+ int rep = 0, i;
+ int reps[num_rep_distances];
+ State state = 0;
+
+ for (i = 0; i < num_rep_distances; ++i)
+ reps[i] = 0;
+
+ if (Mb_data_position(&fe->eb.mb) != 0 ||
+ Re_member_position(&fe->eb.renc) != Fh_size)
+ return false; /* can be called only once */
+
+ if (!Mb_data_finished(&fe->eb.mb)) { /* encode first byte */
+ uchar prev_byte = 0;
+ uchar cur_byte = Mb_peek(&fe->eb.mb, 0);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_match[state][0], 0);
+ LZeb_encode_literal(&fe->eb, prev_byte, cur_byte);
+ CRC32_update_byte(&fe->eb.crc, cur_byte);
+ FLZe_reset_key4(fe);
+ FLZe_update_and_move(fe, 1);
+ }
+
+ while (!Mb_data_finished(&fe->eb.mb) &&
+ Re_member_position(&fe->eb.renc) < member_size_limit) {
+ int match_distance;
+ int main_len = FLZe_longest_match_len(fe, &match_distance);
+ int pos_state = Mb_data_position(&fe->eb.mb) & pos_state_mask;
+ int len = 0;
+
+ for (i = 0; i < num_rep_distances; ++i) {
+ int tlen = Mb_true_match_len(&fe->eb.mb, 0, reps[i] + 1);
+ if (tlen > len) {
+ len = tlen;
+ rep = i;
+ }
+ }
+ if (len > min_match_len && len + 3 > main_len) {
+ CRC32_update_buf(&fe->eb.crc, Mb_ptr_to_current_pos(&fe->eb.mb), len);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_match[state][pos_state], 1);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_rep[state], 1);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_rep0[state], rep != 0);
+ if (rep == 0)
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_len[state][pos_state], 1);
+ else {
+ int distance;
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_rep1[state], rep > 1);
+ if (rep > 1)
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_rep2[state], rep > 2);
+ distance = reps[rep];
+ for (i = rep; i > 0; --i)
+ reps[i] = reps[i-1];
+ reps[0] = distance;
+ }
+ state = St_set_rep(state);
+ Re_encode_len(&fe->eb.renc, &fe->eb.rep_len_model, len, pos_state);
+ Mb_move_pos(&fe->eb.mb);
+ FLZe_update_and_move(fe, len - 1);
+ continue;
+ }
+
+ if (main_len > min_match_len) {
+ CRC32_update_buf(&fe->eb.crc, Mb_ptr_to_current_pos(&fe->eb.mb), main_len);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_match[state][pos_state], 1);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_rep[state], 0);
+ state = St_set_match(state);
+ for (i = num_rep_distances - 1; i > 0; --i)
+ reps[i] = reps[i-1];
+ reps[0] = match_distance;
+ LZeb_encode_pair(&fe->eb, match_distance, main_len, pos_state);
+ Mb_move_pos(&fe->eb.mb);
+ FLZe_update_and_move(fe, main_len - 1);
+ continue;
+ }
+
+ {
+ uchar prev_byte = Mb_peek(&fe->eb.mb, 1);
+ uchar cur_byte = Mb_peek(&fe->eb.mb, 0);
+ uchar match_byte = Mb_peek(&fe->eb.mb, reps[0] + 1);
+ Mb_move_pos(&fe->eb.mb);
+ CRC32_update_byte(&fe->eb.crc, cur_byte);
+
+ if (match_byte == cur_byte) {
+ int short_rep_price = price1(fe->eb.bm_match[state][pos_state]) +
+ price1(fe->eb.bm_rep[state]) +
+ price0(fe->eb.bm_rep0[state]) +
+ price0(fe->eb.bm_len[state][pos_state]);
+ int price = price0(fe->eb.bm_match[state][pos_state]);
+ if (St_is_char(state))
+ price += LZeb_price_literal(&fe->eb, prev_byte, cur_byte);
+ else
+ price += LZeb_price_matched(&fe->eb, prev_byte, cur_byte, match_byte);
+ if (short_rep_price < price) {
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_match[state][pos_state], 1);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_rep[state], 1);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_rep0[state], 0);
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_len[state][pos_state], 0);
+ state = St_set_short_rep(state);
+ continue;
+ }
+ }
+
+ /* literal byte */
+ Re_encode_bit(&fe->eb.renc, &fe->eb.bm_match[state][pos_state], 0);
+ if (St_is_char(state))
+ LZeb_encode_literal(&fe->eb, prev_byte, cur_byte);
+ else
+ LZeb_encode_matched(&fe->eb, prev_byte, cur_byte, match_byte);
+ state = St_set_char(state);
+ }
+ }
+
+ LZeb_full_flush(&fe->eb, state);
+ return true;
+}
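Illustration only, not part of the patch: FLZe_encode_member keeps a
running CRC of the input through CRC32_update_byte and CRC32_update_buf,
and LZeb_crc inverts the result.  The standalone sketch below follows the
same table-driven scheme.  It assumes the running value starts at
0xFFFFFFFF (the reset code is not shown in this part of the diff), and
every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t sketch_crc_table[256];
static int sketch_crc_ready = 0;

/* Build the table of CRCs of all 8-bit messages, as CRC32_init does. */
static void
sketch_crc32_init(void)
{
	uint32_t n;

	for (n = 0; n < 256; ++n) {
		uint32_t c = n;
		int k;

		for (k = 0; k < 8; ++k)
			c = (c & 1) ? 0xEDB88320U ^ (c >> 1) : c >> 1;
		sketch_crc_table[n] = c;
	}
	sketch_crc_ready = 1;
}

/* CRC of a whole buffer: assumed initial value, update loop, final inversion. */
static uint32_t
sketch_crc32(const unsigned char *buf, int size)
{
	uint32_t c = 0xFFFFFFFFU;	/* assumed initial value */
	int i;

	assert(size >= 0);
	if (!sketch_crc_ready)
		sketch_crc32_init();
	for (i = 0; i < size; ++i)	/* same step as CRC32_update_buf */
		c = sketch_crc_table[(c ^ buf[i]) & 0xFF] ^ (c >> 8);
	return c ^ 0xFFFFFFFFU;		/* same inversion as LZeb_crc */
}
```

With these parameters the nine ASCII bytes "123456789" give 0xCBF43926,
the usual check value for this polynomial.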
diff -Nru /sys/src/cmd/lzip/fast_encoder.h /sys/src/cmd/lzip/fast_encoder.h
--- /sys/src/cmd/lzip/fast_encoder.h Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/fast_encoder.h Sat May 1 00:00:00 2021
@@ -0,0 +1,71 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+typedef struct FLZ_encoder FLZ_encoder;
+struct FLZ_encoder {
+ struct LZ_encoder_base eb;
+ unsigned key4; /* key made from latest 4 bytes */
+};
+
+static void
+FLZe_reset_key4(FLZ_encoder *fe)
+{
+ int i;
+ fe->key4 = 0;
+ for (i = 0; i < 3 && i < Mb_avail_bytes(&fe->eb.mb); ++i)
+ fe->key4 = (fe->key4 << 4) ^ fe->eb.mb.buffer[i];
+}
+
+int FLZe_longest_match_len(FLZ_encoder *fe, int *distance);
+
+static void
+FLZe_update_and_move(FLZ_encoder *fe, int n)
+{
+ while (--n >= 0) {
+ if (Mb_avail_bytes(&fe->eb.mb) >= 4) {
+ fe->key4 = ((fe->key4 << 4) ^ fe->eb.mb.buffer[fe->eb.mb.pos+3]) &
+ fe->eb.mb.key4_mask;
+ fe->eb.mb.pos_array[fe->eb.mb.cyclic_pos] = fe->eb.mb.prev_positions[fe->key4];
+ fe->eb.mb.prev_positions[fe->key4] = fe->eb.mb.pos + 1;
+ }
+ Mb_move_pos(&fe->eb.mb);
+ }
+}
+
+static bool
+FLZe_init(FLZ_encoder *fe, int ifd, int outfd)
+{
+ enum {
+ before = 0,
+ dict_size = 65536,
+ /* bytes to keep in buffer after pos */
+ after_size = max_match_len,
+ dict_factor = 16,
+ num_prev_positions23 = 0,
+ pos_array_factor = 1
+ };
+
+ return LZeb_init(&fe->eb, before, dict_size, after_size, dict_factor,
+ num_prev_positions23, pos_array_factor, ifd, outfd);
+}
+
+static void
+FLZe_reset(FLZ_encoder *fe)
+{
+ LZeb_reset(&fe->eb);
+}
+
+bool FLZe_encode_member(FLZ_encoder *fe, uvlong member_size);
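Illustration only, not part of the patch: FLZe_reset_key4 and
FLZe_update_and_move maintain key4 as a rolling hash, shifting the old key
left by 4 bits and XORing in the next byte.  The sketch below isolates
that step.  The 16-bit mask is an assumption made for the example (the
real key4_mask is set by the matchfinder, which is not shown here), and
the sketch_* names are hypothetical.

```c
#include <assert.h>

/* One rolling step, as in FLZe_update_and_move. */
static unsigned
sketch_key4_step(unsigned key4, unsigned char next, unsigned mask)
{
	return ((key4 << 4) ^ next) & mask;
}

/* Key obtained by feeding n bytes, oldest first, starting from 0. */
static unsigned
sketch_key4_of(const unsigned char *p, int n, unsigned mask)
{
	unsigned key4 = 0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; ++i)
		key4 = sketch_key4_step(key4, p[i], mask);
	return key4;
}
```

A byte fed d steps ago sits at bits 4d to 4d+7 of the key, so with a
16-bit mask only the latest 4 bytes can influence the result: rolling
over "abcdefgh" gives the same key as hashing "efgh" directly.  A wider
mask would let older bytes leak into the key.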
diff -Nru /sys/src/cmd/lzip/lzip.h /sys/src/cmd/lzip/lzip.h
--- /sys/src/cmd/lzip/lzip.h Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/lzip.h Sat May 1 00:00:00 2021
@@ -0,0 +1,497 @@
+/* Clzip - LZMA lossless data compressor
+ Copyright (C) 2010-2017 Antonio Diaz Diaz.
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _LZIP_H
+#define _LZIP_H
+
+#include <u.h>
+#include <libc.h>
+#include <stdio.h>
+#include <ctype.h>
+
+#define exit(n) exits((n) == 0? 0: "err")
+#define isatty(fd) 0
+#define lseek seek
+
+#ifndef max
+#define max(x,y) ((x) >= (y) ? (x) : (y))
+#endif
+#ifndef min
+#define min(x,y) ((x) <= (y) ? (x) : (y))
+#endif
+
+typedef int State;
+typedef long int32_t;
+typedef ulong uint32_t;
+typedef int bool;
+
+enum { false, true };
+
+enum { states = 12 };
+enum {
+ min_dict_bits = 12,
+ min_dict_size = 1 << min_dict_bits, /* >= modeled_distances */
+ max_dict_bits = 29,
+ max_dict_size = 1 << max_dict_bits,
+ min_member_size = 36,
+ literal_context_bits = 3,
+ literal_pos_state_bits = 0, /* not used */
+ pos_state_bits = 2,
+ pos_states = 1 << pos_state_bits,
+ pos_state_mask = pos_states -1,
+
+ len_states = 4,
+ dis_slot_bits = 6,
+ start_dis_model = 4,
+ end_dis_model = 14,
+ modeled_distances = 1 << (end_dis_model / 2), /* 128 */
+ dis_align_bits = 4,
+ dis_align_size = 1 << dis_align_bits,
+
+ len_low_bits = 3,
+ len_mid_bits = 3,
+ len_high_bits = 8,
+ len_low_syms = 1 << len_low_bits,
+ len_mid_syms = 1 << len_mid_bits,
+ len_high_syms = 1 << len_high_bits,
+ max_len_syms = len_low_syms + len_mid_syms + len_high_syms,
+
+ min_match_len = 2, /* must be 2 */
+ max_match_len = min_match_len + max_len_syms - 1, /* 273 */
+ min_match_len_limit = 5,
+
+ bit_model_move_bits = 5,
+ bit_model_total_bits = 11,
+ bit_model_total = 1 << bit_model_total_bits,
+};
+
+typedef struct Len_model Len_model;
+typedef struct Pretty_print Pretty_print;
+typedef struct Matchfinder_base Matchfinder_base;
+typedef int Bit_model;
+
+struct Len_model {
+ Bit_model choice1;
+ Bit_model choice2;
+ Bit_model bm_low[pos_states][len_low_syms];
+ Bit_model bm_mid[pos_states][len_mid_syms];
+ Bit_model bm_high[len_high_syms];
+};
+struct Pretty_print {
+ char *name;
+ char *stdin_name;
+ ulong longest_name;
+ bool first_post;
+};
+
+typedef ulong CRC32[256]; /* Table of CRCs of all 8-bit messages. */
+
+extern CRC32 crc32;
+
+#define errno 0
+
+static uchar magic_string[4] = { "LZIP" };
+
+typedef uchar File_header[6]; /* 0-3 magic bytes */
+/* 4 version */
+/* 5 coded_dict_size */
+enum { Fh_size = 6 };
+
+typedef uchar File_trailer[20];
+/* 0-3 CRC32 of the uncompressed data */
+/* 4-11 size of the uncompressed data */
+/* 12-19 member size including header and trailer */
+
+enum { Ft_size = 20 };
+
+enum {
+ price_shift_bits = 6,
+ price_step_bits = 2,
+ price_step = 1 << price_step_bits,
+};
+
+typedef uchar Dis_slots[1<<10];
+typedef short Prob_prices[bit_model_total >> price_step_bits];
+
+extern Dis_slots dis_slots;
+extern Prob_prices prob_prices;
+
+#define get_price(prob) prob_prices[(prob) >> price_step_bits]
+#define price0(prob) get_price(prob)
+#define price1(prob) get_price(bit_model_total - (prob))
+#define price_bit(bm, bit) ((bit)? price1(bm): price0(bm))
+
+#define Mb_ptr_to_current_pos(mb) ((mb)->buffer + (mb)->pos)
+#define Mb_peek(mb, distance) (mb)->buffer[(mb)->pos - (distance)]
+
+#define Lp_price(lp, len, pos_state) \
+ (lp)->prices[pos_state][(len) - min_match_len]
+
+#define Tr_update(trial, pr, distance4, p_i) \
+{ \
+ if ((pr) < (trial)->price) { \
+ (trial)->price = (pr); \
+ (trial)->dis4 = (distance4); \
+ (trial)->prev_index = (p_i); \
+ (trial)->prev_index2 = single_step_trial; \
+ } else { \
+ } \
+}
+
+/* these functions are now extern and must be defined exactly once */
+#ifdef _DEFINE_INLINES
+#define _INLINES_DEFINED
+
+int
+get_len_state(int len)
+{
+ int lenstm1, lenmmm;
+
+ lenmmm = len - min_match_len;
+ lenstm1 = len_states - 1;
+ if (lenmmm < lenstm1)
+ return lenmmm;
+ else
+ return lenstm1;
+}
+
+State
+St_set_char(State st)
+{
+ static State next[states] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5 };
+
+ assert((unsigned)st < nelem(next));
+ return next[st];
+}
+
+int
+get_lit_state(uchar prev_byte)
+{
+ return prev_byte >> (8 - literal_context_bits);
+}
+
+void
+Bm_init(Bit_model *probability)
+{
+ *probability = bit_model_total / 2;
+}
+
+void
+Bm_array_init(Bit_model bm[], int size)
+{
+ int i;
+
+ for (i = 0; i < size; ++i)
+ Bm_init(&bm[i]);
+}
+
+void
+Lm_init(Len_model *lm)
+{
+ Bm_init(&lm->choice1);
+ Bm_init(&lm->choice2);
+ Bm_array_init(lm->bm_low[0], pos_states * len_low_syms);
+ Bm_array_init(lm->bm_mid[0], pos_states * len_mid_syms);
+ Bm_array_init(lm->bm_high, len_high_syms);
+}
+
+void
+Pp_init(Pretty_print *pp, char *filenames[], int num_filenames, int verbosity)
+{
+ unsigned stdin_name_len;
+ int i;
+
+ pp->name = 0;
+ pp->stdin_name = "(stdin)";
+ pp->longest_name = 0;
+ pp->first_post = false;
+
+ if (verbosity <= 0)
+ return;
+ stdin_name_len = strlen(pp->stdin_name);
+ for (i = 0; i < num_filenames; ++i) {
+ char *s = filenames[i];
+ unsigned len = strcmp(s, "-") == 0? stdin_name_len: strlen(s);
+
+ if (len > pp->longest_name)
+ pp->longest_name = len;
+ }
+ if (pp->longest_name == 0)
+ pp->longest_name = stdin_name_len;
+}
+
+void
+Pp_set_name(Pretty_print *pp, char *filename)
+{
+ if ( filename && filename[0] && strcmp( filename, "-" ) != 0 )
+ pp->name = filename;
+ else
+ pp->name = pp->stdin_name;
+ pp->first_post = true;
+}
+
+void
+Pp_reset(Pretty_print *pp)
+{
+ if (pp->name && pp->name[0])
+ pp->first_post = true;
+}
+
+void
+Pp_show_msg(Pretty_print *pp, char *msg);
+
+void
+CRC32_init(void)
+{
+ unsigned n;
+
+ for (n = 0; n < 256; ++n) {
+ unsigned c = n;
+ int k;
+ for (k = 0; k < 8; ++k) {
+ if (c & 1)
+ c = 0xEDB88320U ^ (c >> 1);
+ else
+ c >>= 1;
+ }
+ crc32[n] = c;
+ }
+}
+
+void
+CRC32_update_byte(uint32_t *crc, uchar byte)
+{
+ *crc = crc32[(*crc^byte)&0xFF] ^ (*crc >> 8);
+}
+
+void
+CRC32_update_buf(uint32_t *crc, uchar *buffer, int size)
+{
+ int i;
+ uint32_t c = *crc;
+ for (i = 0; i < size; ++i)
+ c = crc32[(c^buffer[i])&0xFF] ^ (c >> 8);
+ *crc = c;
+}
+
+bool
+isvalid_ds(unsigned dict_size)
+{
+ return (dict_size >= min_dict_size &&
+ dict_size <= max_dict_size);
+}
+
+int
+real_bits(unsigned value)
+{
+ int bits = 0;
+
+ while (value > 0) {
+ value >>= 1;
+ ++bits;
+ }
+ return bits;
+}
+
+void
+Fh_set_magic(File_header data)
+{
+ memcpy(data, magic_string, 4);
+ data[4] = 1;
+}
+
+bool
+Fh_verify_magic(File_header data)
+{
+ return (memcmp(data, magic_string, 4) == 0);
+}
+
+/* detect truncated header */
+bool
+Fh_verify_prefix(File_header data, int size)
+{
+ int i;
+ for (i = 0; i < size && i < 4; ++i)
+ if (data[i] != magic_string[i])
+ return false;
+ return (size > 0);
+}
+
+uchar
+Fh_version(File_header data)
+{
+ return data[4];
+}
+
+bool
+Fh_verify_version(File_header data)
+{
+ return (data[4] == 1);
+}
+
+unsigned
+Fh_get_dict_size(File_header data)
+{
+ unsigned sz = (1 << (data[5] &0x1F));
+ if (sz > min_dict_size)
+ sz -= (sz / 16) * ((data[5] >> 5) & 7);
+ return sz;
+}
+
+bool
+Fh_set_dict_size(File_header data, unsigned sz)
+{
+ if (!isvalid_ds(sz))
+ return false;
+ data[5] = real_bits(sz - 1);
+ if (sz > min_dict_size) {
+ unsigned base_size = 1 << data[5];
+ unsigned fraction = base_size / 16;
+ unsigned i;
+ for (i = 7; i >= 1; --i)
+ if (base_size - (i * fraction) >= sz) {
+ data[5] |= (i << 5);
+ break;
+ }
+ }
+ return true;
+}
+
+unsigned
+Ft_get_data_crc(File_trailer data)
+{
+ unsigned tmp = 0;
+ int i;
+ for (i = 3; i >= 0; --i) {
+ tmp <<= 8;
+ tmp += data[i];
+ }
+ return tmp;
+}
+
+void
+Ft_set_data_crc(File_trailer data, unsigned crc)
+{
+ int i;
+ for (i = 0; i <= 3; ++i) {
+ data[i] = (uchar)crc;
+ crc >>= 8;
+ }
+}
+
+uvlong
+Ft_get_data_size(File_trailer data)
+{
+ uvlong tmp = 0;
+ int i;
+ for (i = 11; i >= 4; --i) {
+ tmp <<= 8;
+ tmp += data[i];
+ }
+ return tmp;
+}
+
+void
+Ft_set_data_size(File_trailer data, uvlong sz)
+{
+ int i;
+ for (i = 4; i <= 11; ++i) {
+ data[i] = (uchar)sz;
+ sz >>= 8;
+ }
+}
+
+uvlong
+Ft_get_member_size(File_trailer data)
+{
+ uvlong tmp = 0;
+ int i;
+ for (i = 19; i >= 12; --i) {
+ tmp <<= 8;
+ tmp += data[i];
+ }
+ return tmp;
+}
+
+void
+Ft_set_member_size(File_trailer data, uvlong sz)
+{
+ int i;
+ for (i = 12; i <= 19; ++i) {
+ data[i] = (uchar)sz;
+ sz >>= 8;
+ }
+}
+#else /* _DEFINE_INLINES */
+void Bm_array_init(Bit_model bm[], int size);
+void Bm_init(Bit_model *probability);
+void CRC32_init(void);
+void CRC32_update_buf(uint32_t *crc, uchar *buffer, int size);
+void CRC32_update_byte(uint32_t *crc, uchar byte);
+unsigned Fh_get_dict_size(File_header data);
+bool Fh_set_dict_size(File_header data, unsigned sz);
+void Fh_set_magic(File_header data);
+bool Fh_verify_magic(File_header data);
+bool Fh_verify_prefix(File_header data, int size);
+bool Fh_verify_version(File_header data);
+uchar Fh_version(File_header data);
+unsigned Ft_get_data_crc(File_trailer data);
+uvlong Ft_get_data_size(File_trailer data);
+uvlong Ft_get_member_size(File_trailer data);
+void Ft_set_data_crc(File_trailer data, unsigned crc);
+void Ft_set_data_size(File_trailer data, uvlong sz);
+void Ft_set_member_size(File_trailer data, uvlong sz);
+void Lm_init(Len_model *lm);
+void Pp_init(Pretty_print *pp, char *filenames[], int num_filenames, int verbosity);
+void Pp_reset(Pretty_print *pp);
+void Pp_set_name(Pretty_print *pp, char *filename);
+void Pp_show_msg(Pretty_print *pp, char *msg);
+State St_set_char(State st);
+int get_lit_state(uchar prev_byte);
+int get_len_state(int len);
+bool isvalid_ds(unsigned dict_size);
+int real_bits(unsigned value);
+#endif /* _DEFINE_INLINES */
+
+#define St_is_char(state) ((state) < 7)
+#define St_set_match(state) ((state) < 7? 7: 10)
+#define St_set_rep(state) ((state) < 7? 8: 11)
+#define St_set_short_rep(state) ((state) < 7? 9: 11)
+
+static char *bad_magic_msg = "Bad magic number (file not in lzip format).";
+static char *bad_dict_msg = "Invalid dictionary size in member header.";
+static char *trailing_msg = "Trailing data not allowed.";
+
+/* defined in decoder.c */
+int readblock(int fd, uchar *buf, int size);
+int writeblock(int fd, uchar *buf, int size);
+
+/* defined in main.c */
+extern int verbosity;
+Dir;
+char *bad_version(unsigned version);
+char *format_ds(unsigned dict_size);
+int open_instream(char *name, Dir *in_statsp, bool no_ofile, bool reg_only);
+void *resize_buffer(void *buf, unsigned min_size);
+void cleanup_and_fail(int retval);
+void show_error(char *msg, int errcode, bool help);
+void show_file_error(char *filename, char *msg, int errcode);
+void internal_error(char *msg);
+struct Matchfinder_base;
+void show_progress(uvlong partial_size, Matchfinder_base *m, Pretty_print *p,
+ uvlong cfile_size);
+#endif
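Illustration only, not part of the patch: Fh_set_dict_size and
Fh_get_dict_size pack the dictionary size into header byte 5 as a power
of two (low 5 bits) minus a number of sixteenths of it (high 3 bits).
The sketch below reproduces that coding on a bare byte so the round trip
can be checked.  It assumes unsigned is at least 32 bits wide, and the
sketch_* names are hypothetical.

```c
#include <assert.h>

enum {
	min_dict_bits = 12,
	min_dict_size = 1 << min_dict_bits,
	max_dict_bits = 29,
	max_dict_size = 1 << max_dict_bits
};

/* Number of significant bits in value, as real_bits in lzip.h. */
static int
sketch_real_bits(unsigned value)
{
	int bits = 0;

	while (value > 0) {
		value >>= 1;
		++bits;
	}
	return bits;
}

/* Encode sz the way Fh_set_dict_size fills data[5]. */
static unsigned char
sketch_code_ds(unsigned sz)
{
	unsigned char coded;

	assert(sz >= min_dict_size && sz <= max_dict_size);
	coded = (unsigned char)sketch_real_bits(sz - 1);
	if (sz > min_dict_size) {
		unsigned base_size = 1u << coded;
		unsigned fraction = base_size / 16;
		unsigned i;

		/* largest number of sixteenths that can be removed */
		for (i = 7; i >= 1; --i)
			if (base_size - (i * fraction) >= sz) {
				coded |= (unsigned char)(i << 5);
				break;
			}
	}
	return coded;
}

/* Decode the byte the way Fh_get_dict_size reads data[5]. */
static unsigned
sketch_decode_ds(unsigned char coded)
{
	unsigned sz = 1u << (coded & 0x1F);

	if (sz > min_dict_size)
		sz -= (sz / 16) * ((coded >> 5) & 7);
	return sz;
}
```

Sizes that are not exactly representable are rounded up by the coding:
a request for 3000000 bytes is stored as 0x96 and decodes to 3145728
(3 MiB), the same byte an exact 3 MiB request produces.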
diff -Nru /sys/src/cmd/lzip/main.c /sys/src/cmd/lzip/main.c
--- /sys/src/cmd/lzip/main.c Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/main.c Sat May 1 00:00:00 2021
@@ -0,0 +1,883 @@
+/*
+ * Clzip - LZMA lossless data compressor
+ * Copyright (C) 2010-2017 Antonio Diaz Diaz.
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+/*
+ * Exit status: 0 for a normal exit, 1 for environmental problems
+ * (file not found, invalid flags, I/O errors, etc), 2 to indicate a
+ * corrupt or invalid input file, 3 for an internal consistency error
+ * (e.g., a bug) which caused lzip to panic.
+ */
+
+#define _DEFINE_INLINES
+#include "lzip.h"
+#include "decoder.h"
+#include "encoder_base.h"
+#include "encoder.h"
+#include "fast_encoder.h"
+
+int verbosity = 0;
+
+char *argv0 = "lzip";
+
+struct {
+ char * from;
+ char * to;
+} known_extensions[] = {
+ { ".lz", "" },
+ { ".tlz", ".tar" },
+ { 0, 0 }
+};
+
+typedef struct Lzma_options Lzma_options;
+struct Lzma_options {
+ int dict_size; /* 4 KiB .. 512 MiB */
+ int match_len_limit; /* 5 .. 273 */
+};
+
+enum Mode { m_compress, m_decompress, };
+
+char *output_filename = nil;
+int outfd = -1;
+bool delete_output_on_interrupt = false;
+
+static void
+usage(void)
+{
+ fprintf(stderr, "Usage: %s [-[0-9]cdv] [file...]\n", argv0);
+ exit(2);
+}
+
+char *
+bad_version(unsigned version)
+{
+ static char buf[80];
+
+ snprintf(buf, sizeof buf, "Version %ud member format not supported.",
+ version);
+ return buf;
+}
+
+char *
+format_ds(unsigned dict_size)
+{
+ enum { bufsize = 16, factor = 1024 };
+ char *prefix[8] = { "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi" };
+ char *p = "";
+ char *np = " ";
+ unsigned num = dict_size, i;
+ bool exact = (num % factor == 0);
+ static char buf[bufsize];
+
+ for (i = 0; i < 8 && (num > 9999 || (exact && num >= factor)); ++i) {
+ num /= factor;
+ if (num % factor != 0)
+ exact = false;
+ p = prefix[i];
+ np = "";
+ }
+ snprintf( buf, bufsize, "%s%4ud %sB", np, num, p );
+ return buf;
+}
+
+static void
+show_header(unsigned dict_size)
+{
+ if (verbosity >= 3)
+ fprintf(stderr, "dictionary %s. ", format_ds( dict_size) );
+}
+
+static uvlong
+getnum(char *ptr, uvlong llimit, uvlong ulimit)
+{
+ int bad;
+ uvlong result;
+ char *tail;
+
+ bad = 0;
+ result = strtoull(ptr, &tail, 0);
+ if (tail == ptr) {
+ show_error( "Bad or missing numerical argument.", 0, true );
+ exit(1);
+ }
+
+ if (!errno && tail[0]) {
+ unsigned factor = (tail[1] == 'i') ? 1024 : 1000;
+ int i, exponent = 0; /* 0 = bad multiplier */
+
+ switch (tail[0]) {
+ case 'Y':
+ exponent = 8;
+ break;
+ case 'Z':
+ exponent = 7;
+ break;
+ case 'E':
+ exponent = 6;
+ break;
+ case 'P':
+ exponent = 5;
+ break;
+ case 'T':
+ exponent = 4;
+ break;
+ case 'G':
+ exponent = 3;
+ break;
+ case 'M':
+ exponent = 2;
+ break;
+ case 'K':
+ if (factor == 1024)
+ exponent = 1;
+ break;
+ case 'k':
+ if (factor == 1000)
+ exponent = 1;
+ break;
+ }
+ if (exponent <= 0) {
+ show_error( "Bad multiplier in numerical argument.", 0, true );
+ exit(1);
+ }
+ for (i = 0; i < exponent; ++i) {
+ if (ulimit / factor >= result)
+ result *= factor;
+ else {
+ bad++;
+ break;
+ }
+ }
+ }
+ if (bad || result < llimit || result > ulimit) {
+ show_error( "Numerical argument out of limits.", 0, false );
+ exit(1);
+ }
+ return result;
+}
+
+static int
+get_dict_size(char *arg)
+{
+ char *tail;
+ long bits = strtol(arg, &tail, 0);
+
+ if (bits >= min_dict_bits &&
+ bits <= max_dict_bits && *tail == 0)
+ return (1 << bits);
+ return getnum(arg, min_dict_size, max_dict_size);
+}
+
+void
+set_mode(enum Mode *program_modep, enum Mode new_mode)
+{
+ if (*program_modep != m_compress && *program_modep != new_mode) {
+ show_error( "Only one operation can be specified.", 0, true );
+ exit(1);
+ }
+ *program_modep = new_mode;
+}
+
+static int
+extension_index(char *name)
+{
+ int eindex;
+
+ for (eindex = 0; known_extensions[eindex].from; ++eindex) {
+ char * ext = known_extensions[eindex].from;
+ unsigned name_len = strlen(name);
+ unsigned ext_len = strlen(ext);
+
+ if (name_len > ext_len &&
+ strncmp(name + name_len - ext_len, ext, ext_len) == 0)
+ return eindex;
+ }
+ return -1;
+}
+
+int
+open_instream(char *name, Dir *, bool, bool)
+{
+ int infd = open(name, OREAD);
+
+ if (infd < 0)
+ show_file_error( name, "Can't open input file", errno );
+ return infd;
+}
+
+static int
+open_instream2(char *name, Dir *in_statsp, enum Mode program_mode,
+ int eindex, bool recompress, bool to_stdout)
+{
+ bool no_ofile = to_stdout;
+
+ if (program_mode == m_compress && !recompress && eindex >= 0) {
+ if (verbosity >= 0)
+ fprintf( stderr, "%s: Input file '%s' already has '%s' suffix.\n",
+ argv0, name, known_extensions[eindex].from);
+ return -1;
+ }
+ return open_instream(name, in_statsp, no_ofile, false);
+}
+
+/* ensure at least a minimum size for buffer 'buf' */
+void *
+resize_buffer(void *buf, unsigned min_size)
+{
+ buf = realloc(buf, min_size);
+ if (!buf) {
+ show_error("Not enough memory.", 0, false);
+ cleanup_and_fail(1);
+ }
+ return buf;
+}
+
+static void
+set_c_outname(char *name, bool multifile)
+{
+ output_filename = resize_buffer(output_filename, strlen(name) + 5 +
+ strlen(known_extensions[0].from) + 1);
+ strcpy(output_filename, name);
+ if (multifile)
+ strcat( output_filename, "00001" );
+ strcat(output_filename, known_extensions[0].from);
+}
+
+static void
+set_d_outname(char *name, int eindex)
+{
+ unsigned name_len = strlen(name);
+ if (eindex >= 0) {
+ char * from = known_extensions[eindex].from;
+ unsigned from_len = strlen(from);
+
+ if (name_len > from_len) {
+ output_filename = resize_buffer(output_filename, name_len +
+ strlen(known_extensions[eindex].to) + 1);
+ strcpy(output_filename, name);
+ strcpy(output_filename + name_len - from_len, known_extensions[eindex].to);
+ return;
+ }
+ }
+ output_filename = resize_buffer(output_filename, name_len + 4 + 1);
+ strcpy(output_filename, name);
+ strcat(output_filename, ".out");
+ if (verbosity >= 1)
+ fprintf( stderr, "%s: Can't guess original name for '%s' -- using '%s'\n",
+ argv0, name, output_filename);
+}
+
+static bool
+open_outstream(bool force, bool)
+{
+ int flags = OWRITE;
+
+ if (force)
+ flags |= OTRUNC;
+ else
+ flags |= OEXCL;
+
+ outfd = create(output_filename, flags, 0666);
+ if (outfd >= 0)
+ delete_output_on_interrupt = true;
+ else if (verbosity >= 0)
+ fprintf(stderr, "%s: Can't create output file '%s': %r\n",
+ argv0, output_filename);
+ return outfd >= 0;
+}
+
+static bool
+check_tty(int, enum Mode program_mode)
+{
+ if ((program_mode == m_compress && isatty(outfd)) ||
+ (program_mode == m_decompress && isatty(infd))) {
+ usage();
+ return false;
+ }
+ return true;
+}
+
+void
+cleanup_and_fail(int retval)
+{
+ if (delete_output_on_interrupt) {
+ delete_output_on_interrupt = false;
+ if (verbosity >= 0)
+ fprintf(stderr, "%s: Deleting output file '%s', if it exists.\n",
+ argv0, output_filename);
+ if (outfd >= 0) {
+ close(outfd);
+ outfd = -1;
+ }
+ if (remove(output_filename) != 0)
+ fprintf(stderr, "%s: can't remove output file %s: %r\n",
+ argv0, output_filename);
+ }
+ exit(retval);
+}
+
+/* Close the output file; this port does not copy permissions, owner or times. */
+static void
+close_and_set_permissions(Dir *)
+{
+ if (close(outfd) != 0) {
+ show_error( "Error closing output file", errno, false );
+ cleanup_and_fail(1);
+ }
+ outfd = -1;
+ delete_output_on_interrupt = false;
+}
+
+static bool
+next_filename(void)
+{
+ int i, j;
+ unsigned name_len = strlen(output_filename);
+ unsigned ext_len = strlen(known_extensions[0].from);
+
+ if ( name_len >= ext_len + 5 ) /* "*00001.lz" */
+ for (i = name_len - ext_len - 1, j = 0; j < 5; --i, ++j) {
+ if (output_filename[i] < '9') {
+ ++output_filename[i];
+ return true;
+ } else
+ output_filename[i] = '0';
+ }
+ return false;
+}
+
+typedef struct Poly_encoder Poly_encoder;
+struct Poly_encoder {
+ LZ_encoder_base *eb;
+ LZ_encoder *e;
+ FLZ_encoder *fe;
+};
+
+static int
+compress(uvlong member_size, uvlong volume_size,
+ int infd, Lzma_options *encoder_options, Pretty_print *pp,
+ Dir *in_statsp, bool zero)
+{
+ int retval = 0;
+ uvlong in_size = 0, out_size = 0, partial_volume_size = 0;
+ uvlong cfile_size = in_statsp? in_statsp->length / 100: 0;
+ Poly_encoder encoder = { 0, 0, 0 }; /* polymorphic encoder */
+ bool error = false;
+
+ if (verbosity >= 1)
+ Pp_show_msg(pp, 0);
+
+ if (zero) {
+ encoder.fe = (FLZ_encoder *)malloc(sizeof * encoder.fe);
+ if (!encoder.fe || !FLZe_init(encoder.fe, infd, outfd))
+ error = true;
+ else
+ encoder.eb = &encoder.fe->eb;
+ } else {
+ File_header header;
+
+ if (Fh_set_dict_size(header, encoder_options->dict_size) &&
+ encoder_options->match_len_limit >= min_match_len_limit &&
+ encoder_options->match_len_limit <= max_match_len)
+ encoder.e = (LZ_encoder *)malloc(sizeof * encoder.e);
+ else
+ internal_error( "invalid argument to encoder." );
+ if (!encoder.e || !LZe_init(encoder.e, Fh_get_dict_size(header),
+ encoder_options->match_len_limit, infd, outfd))
+ error = true;
+ else
+ encoder.eb = &encoder.e->eb;
+ }
+ if (error) {
+ Pp_show_msg( pp, "Not enough memory. Try a smaller dictionary size." );
+ return 1;
+ }
+
+ for(;;) { /* encode one member per iteration */
+ uvlong size;
+ vlong freevolsz;
+
+ size = member_size;
+ if (volume_size > 0) {
+ freevolsz = volume_size - partial_volume_size;
+ if (size > freevolsz)
+ size = freevolsz; /* limit size */
+ }
+ show_progress(in_size, &encoder.eb->mb, pp, cfile_size); /* init */
+ if ((zero && !FLZe_encode_member(encoder.fe, size)) ||
+ (!zero && !LZe_encode_member(encoder.e, size))) {
+ Pp_show_msg( pp, "Encoder error." );
+ retval = 1;
+ break;
+ }
+ in_size += Mb_data_position(&encoder.eb->mb);
+ out_size += Re_member_position(&encoder.eb->renc);
+ if (Mb_data_finished(&encoder.eb->mb))
+ break;
+ if (volume_size > 0) {
+ partial_volume_size += Re_member_position(&encoder.eb->renc);
+ if (partial_volume_size >= volume_size - min_dict_size) {
+ partial_volume_size = 0;
+ if (delete_output_on_interrupt) {
+ close_and_set_permissions(in_statsp);
+ if (!next_filename()) {
+ Pp_show_msg( pp, "Too many volume files." );
+ retval = 1;
+ break;
+ }
+ if (!open_outstream(true, !in_statsp)) {
+ retval = 1;
+ break;
+ }
+ }
+ }
+ }
+ if (zero)
+ FLZe_reset(encoder.fe);
+ else
+ LZe_reset(encoder.e);
+ }
+
+ if (retval == 0 && verbosity >= 1)
+ if (in_size == 0 || out_size == 0)
+ fputs( " no data compressed.\n", stderr );
+ else {
+ if (0)
+ fprintf(stderr,
+ "%6.3f:1, %6.3f bits/byte, %5.2f%% saved, ",
+ (double)in_size / out_size,
+ (8.0 * out_size) / in_size,
+ 100.0 * (1.0 - (double)out_size/in_size));
+ fprintf(stderr, "%llud in, %llud out.\n",
+ in_size, out_size);
+ }
+ LZeb_free(encoder.eb);
+ if (zero)
+ free(encoder.fe);
+ else
+ free(encoder.e);
+ return retval;
+}
+
+static uchar
+xdigit(unsigned value)
+{
+ if (value <= 9)
+ return '0' + value;
+ if (value <= 15)
+ return 'A' + value - 10;
+ return 0;
+}
+
+static bool
+show_trailing_data(uchar *data, int size, Pretty_print *pp, bool all,
+ bool ignore_trailing)
+{
+ if (verbosity >= 4 || !ignore_trailing) {
+ char buf[128];
+ int i, len = snprintf(buf, sizeof buf, "%strailing data = ",
+ all? "": "first bytes of ");
+
+ if (len < 0)
+ len = 0;
+ for (i = 0; i < size && len + 2 < sizeof buf; ++i) {
+ buf[len++] = xdigit(data[i] >> 4);
+ buf[len++] = xdigit(data[i] & 0x0F);
+ buf[len++] = ' ';
+ }
+ if (len < sizeof buf)
+ buf[len++] = '\'';
+ for (i = 0; i < size && len < sizeof buf; ++i) {
+ if (isprint(data[i]))
+ buf[len++] = data[i];
+ else
+ buf[len++] = '.';
+ }
+ if (len < sizeof buf)
+ buf[len++] = '\'';
+ if (len < sizeof buf)
+ buf[len] = 0;
+ else
+ buf[sizeof buf - 1] = 0;
+ Pp_show_msg(pp, buf);
+ if (!ignore_trailing)
+ show_file_error(pp->name, trailing_msg, 0);
+ }
+ return ignore_trailing;
+}
+
+static int
+decompress(int infd, Pretty_print *pp, bool ignore_trailing)
+{
+ uvlong partial_file_pos = 0;
+ Range_decoder rdec;
+ int retval = 0;
+ bool first_member;
+
+ if (!Rd_init(&rdec, infd)) {
+ show_error( "Not enough memory.", 0, false );
+ cleanup_and_fail(1);
+ }
+
+ for (first_member = true; ; first_member = false) {
+ int result, size;
+ unsigned dict_size;
+ File_header header;
+ LZ_decoder decoder;
+
+ Rd_reset_member_position(&rdec);
+ size = Rd_read_data(&rdec, header, Fh_size);
+ if (Rd_finished(&rdec)) /* End Of File */ {
+ if (first_member || Fh_verify_prefix(header, size)) {
+ Pp_show_msg( pp, "File ends unexpectedly at member header." );
+ retval = 2;
+ } else if (size > 0 && !show_trailing_data(header, size, pp,
+ true, ignore_trailing))
+ retval = 2;
+ break;
+ }
+ if (!Fh_verify_magic(header)) {
+ if (first_member) {
+ show_file_error(pp->name, bad_magic_msg, 0);
+ retval = 2;
+ } else if (!show_trailing_data(header, size, pp,
+ false, ignore_trailing))
+ retval = 2;
+ break;
+ }
+ if (!Fh_verify_version(header)) {
+ Pp_show_msg(pp, bad_version(Fh_version(header)));
+ retval = 2;
+ break;
+ }
+ dict_size = Fh_get_dict_size(header);
+ if (!isvalid_ds(dict_size)) {
+ Pp_show_msg(pp, bad_dict_msg);
+ retval = 2;
+ break;
+ }
+
+ if (verbosity >= 2 || (verbosity == 1 && first_member)) {
+ Pp_show_msg(pp, 0);
+ show_header(dict_size);
+ }
+
+ if (!LZd_init(&decoder, &rdec, dict_size, outfd)) {
+ Pp_show_msg( pp, "Not enough memory." );
+ retval = 1;
+ break;
+ }
+ result = LZd_decode_member(&decoder, pp);
+ partial_file_pos += Rd_member_position(&rdec);
+ LZd_free(&decoder);
+ if (result != 0) {
+ if (verbosity >= 0 && result <= 2) {
+ Pp_show_msg(pp, 0);
+ fprintf(stderr, "%s: %s at pos %llud\n",
+ argv0, (result == 2?
+ "file ends unexpectedly":
+ "decoder error"), partial_file_pos);
+ }
+ retval = 2;
+ break;
+ }
+ if (verbosity >= 2) {
+ fputs("done\n", stderr);
+ Pp_reset(pp);
+ }
+ }
+ Rd_free(&rdec);
+ if (verbosity == 1 && retval == 0)
+ fputs("done\n", stderr);
+ return retval;
+}
+
+void
+signal_handler(int sig)
+{
+ USED(sig);
+ show_error("interrupt caught, quitting.", 0, false);
+ cleanup_and_fail(1);
+}
+
+static void
+set_signals(void)
+{
+}
+
+void
+show_error(char *msg, int, bool help)
+{
+ if (verbosity < 0)
+ return;
+ if (msg && msg[0])
+ fprintf(stderr, "%s: %s: %r\n", argv0, msg);
+ if (help)
+ fprintf(stderr, "Try '%s --help' for more information.\n",
+ argv0);
+}
+
+void
+show_file_error(char *filename, char *msg, int errcode)
+{
+ if (verbosity < 0)
+ return;
+ fprintf(stderr, "%s: %s: %s", argv0, filename, msg);
+ if (errcode > 0)
+ fprintf(stderr, ": %r");
+ fputc('\n', stderr);
+}
+
+void
+internal_error(char *msg)
+{
+ if (verbosity >= 0)
+ fprintf( stderr, "%s: internal error: %s\n", argv0, msg );
+ exit(3);
+}
+
+void
+show_progress(uvlong partial_size, Matchfinder_base *m,
+ Pretty_print *p, uvlong cfile_size)
+{
+ static uvlong psize = 0, csize = 0; /* csize=file_size/100 */
+ static Matchfinder_base *mb = 0;
+ static Pretty_print *pp = 0;
+
+ if (verbosity < 2)
+ return;
+ if (m) { /* initialize static vars */
+ csize = cfile_size;
+ psize = partial_size;
+ mb = m;
+ pp = p;
+ }
+ if (mb && pp) {
+ uvlong pos = psize + Mb_data_position(mb);
+
+ if (csize > 0)
+ fprintf( stderr, "%4llud%%", pos / csize );
+ fprintf( stderr, " %.1f MB\r", pos / 1000000.0 );
+ Pp_reset(pp);
+ Pp_show_msg(pp, 0); /* restore cursor position */
+ }
+}
+
+/*
+ * Mapping from gzip/bzip2 style 0..9 compression levels to the corresponding
+ * LZMA encoder options { dict_size, match_len_limit }, indexed by level.
+ */
+static Lzma_options option_mapping[] = {
+ { 1 << 16, 16 }, /* -0 */
+ { 1 << 20, 5 }, /* -1 */
+ { 3 << 19, 6 }, /* -2 */
+ { 1 << 21, 8 }, /* -3 */
+ { 3 << 20, 12 }, /* -4 */
+ { 1 << 22, 20 }, /* -5 */
+ { 1 << 23, 36 }, /* -6 */
+ { 1 << 24, 68 }, /* -7 */
+ { 3 << 23, 132 }, /* -8 */
+// { 1 << 25, max_match_len }, // TODO
+ { 1 << 26, max_match_len }, /* -9 */
+};
+
+void
+main(int argc, char *argv[])
+{
+ int num_filenames, infd, i, retval = 0;
+ bool filenames_given = false, force = false, ignore_trailing = true,
+ recompress = false,
+ stdin_used = false, to_stdout = false, zero = false;
+ uvlong max_member_size = 0x0008000000000000ULL;
+ uvlong max_volume_size = 0x4000000000000000ULL;
+ uvlong member_size = max_member_size;
+ uvlong volume_size = 0;
+ char *default_output_filename = "";
+ char **filenames = nil;
+ enum Mode program_mode = m_compress;
+ Lzma_options encoder_options = option_mapping[6]; /* default = "-6" */
+ Pretty_print pp;
+
+ CRC32_init();
+
+ ARGBEGIN {
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ zero = (ARGC() == '0');
+ encoder_options = option_mapping[ARGC() - '0'];
+ break;
+ case 'a':
+ ignore_trailing = false;
+ break;
+ case 'b':
+ member_size = getnum(EARGF(usage()), 100000, max_member_size);
+ break;
+ case 'c':
+ to_stdout = true;
+ break;
+ case 'd':
+ set_mode(&program_mode, m_decompress);
+ break;
+ case 'f':
+ force = true;
+ break;
+ case 'F':
+ recompress = true;
+ break;
+ case 'm':
+ encoder_options.match_len_limit =
+ getnum(EARGF(usage()), min_match_len_limit, max_match_len);
+ zero = false;
+ break;
+ case 'o':
+ default_output_filename = EARGF(usage());
+ break;
+ case 'q':
+ verbosity = -1;
+ break;
+ case 's':
+ encoder_options.dict_size = get_dict_size(EARGF(usage()));
+ zero = false;
+ break;
+ case 'S':
+ volume_size = getnum(EARGF(usage()), 100000, max_volume_size);
+ break;
+ case 'v':
+ if (verbosity < 4)
+ ++verbosity;
+ break;
+ default:
+ usage();
+ } ARGEND
+
+ num_filenames = max(1, argc);
+ filenames = resize_buffer(filenames, num_filenames * sizeof filenames[0]);
+ filenames[0] = "-";
+ for (i = 0; i < argc; ++i) {
+ filenames[i] = argv[i];
+ if (strcmp(filenames[i], "-") != 0)
+ filenames_given = true;
+ }
+
+ if (program_mode == m_compress) {
+ Dis_slots_init();
+ Prob_prices_init();
+ }
+
+ if (!to_stdout && (filenames_given || default_output_filename[0]))
+ set_signals();
+
+ Pp_init(&pp, filenames, num_filenames, verbosity);
+
+ output_filename = resize_buffer(output_filename, 1);
+ for (i = 0; i < num_filenames; ++i) {
+ char *input_filename = "";
+ int tmp, eindex;
+ Dir in_stats;
+ Dir *in_statsp;
+
+ output_filename[0] = 0;
+ if ( !filenames[i][0] || strcmp( filenames[i], "-" ) == 0 ) {
+ if (stdin_used)
+ continue;
+ else
+ stdin_used = true;
+ infd = 0;
+ if (to_stdout || !default_output_filename[0])
+ outfd = 1;
+ else {
+ if (program_mode == m_compress)
+ set_c_outname(default_output_filename,
+ volume_size > 0);
+ else {
+ output_filename = resize_buffer(output_filename,
+ strlen(default_output_filename)+1);
+ strcpy(output_filename,
+ default_output_filename);
+ }
+ if (!open_outstream(force, true)) {
+ if (retval < 1)
+ retval = 1;
+ close(infd);
+ continue;
+ }
+ }
+ } else {
+ eindex = extension_index(input_filename = filenames[i]);
+ infd = open_instream2(input_filename, &in_stats,
+ program_mode, eindex, recompress, to_stdout);
+ if (infd < 0) {
+ if (retval < 1)
+ retval = 1;
+ continue;
+ }
+ if (to_stdout)
+ outfd = 1;
+ else {
+ if (program_mode == m_compress)
+ set_c_outname(input_filename,
+ volume_size > 0);
+ else
+ set_d_outname(input_filename, eindex);
+ if (!open_outstream(force, false)) {
+ if (retval < 1)
+ retval = 1;
+ close(infd);
+ continue;
+ }
+ }
+ }
+
+ Pp_set_name(&pp, input_filename);
+ if (!check_tty(infd, program_mode)) {
+ if (retval < 1)
+ retval = 1;
+ cleanup_and_fail(retval);
+ }
+
+ in_statsp = input_filename[0]? &in_stats: nil;
+ if (program_mode == m_compress)
+ tmp = compress(member_size, volume_size, infd,
+ &encoder_options, &pp, in_statsp, zero);
+ else
+ tmp = decompress(infd, &pp, ignore_trailing);
+ if (tmp > retval)
+ retval = tmp;
+ if (tmp)
+ cleanup_and_fail(retval);
+
+ if (delete_output_on_interrupt)
+ close_and_set_permissions(in_statsp);
+ if (input_filename[0])
+ close(infd);
+ }
+ if (outfd >= 0 && close(outfd) != 0) {
+ show_error("Can't close stdout", errno, false);
+ if (retval < 1)
+ retval = 1;
+ }
+ free(output_filename);
+ free(filenames);
+ exit(retval);
+}
diff -Nru /sys/src/cmd/lzip/mkfile /sys/src/cmd/lzip/mkfile
--- /sys/src/cmd/lzip/mkfile Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/mkfile Sat May 1 00:00:00 2021
@@ -0,0 +1,21 @@
+# mkfile for lzip - LZMA lossless data compressor
+</$objtype/mkfile
+
+TARG=lzip
+OFILES=decoder.$O \
+ encoder.$O \
+ encoder_base.$O \
+ fast_encoder.$O \
+ main.$O
+
+HFILES=decoder.h \
+ encoder.h \
+ encoder_base.h \
+ fast_encoder.h \
+ lzip.h
+
+BIN=/$objtype/bin
+
+</sys/src/cmd/mkone
+
+# CFLAGS=$CFLAGS -DUSE_BSTDIO
diff -Nru /sys/src/cmd/lzip/testsuite/check.sh /sys/src/cmd/lzip/testsuite/check.sh
--- /sys/src/cmd/lzip/testsuite/check.sh Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/testsuite/check.sh Sat May 1 00:00:00 2021
@@ -0,0 +1,265 @@
+#! /bin/sh
+# check script for Clzip - LZMA lossless data compressor
+# Copyright (C) 2010-2017 Antonio Diaz Diaz.
+#
+# This script is free software: you have unlimited permission
+# to copy, distribute and modify it.
+
+LC_ALL=C
+export LC_ALL
+objdir=`pwd`
+testdir=`cd "$1" ; pwd`
+LZIP="${objdir}"/clzip
+framework_failure() { echo "failure in testing framework" ; exit 1 ; }
+
+if [ ! -f "${LZIP}" ] || [ ! -x "${LZIP}" ] ; then
+ echo "${LZIP}: cannot execute"
+ exit 1
+fi
+
+[ -e "${LZIP}" ] 2> /dev/null ||
+ {
+ echo "$0: a POSIX shell is required to run the tests"
+ echo "Try bash -c \"$0 $1 $2\""
+ exit 1
+ }
+
+if [ -d tmp ] ; then rm -rf tmp ; fi
+mkdir tmp
+cd "${objdir}"/tmp || framework_failure
+
+cat "${testdir}"/test.txt > in || framework_failure
+in_lz="${testdir}"/test.txt.lz
+fail=0
+test_failed() { fail=1 ; printf " $1" ; [ -z "$2" ] || printf "($2)" ; }
+
+printf "testing clzip-%s..." "$2"
+
+"${LZIP}" -fkqm4 in
+{ [ $? = 1 ] && [ ! -e in.lz ] ; } || test_failed $LINENO
+"${LZIP}" -fkqm274 in
+{ [ $? = 1 ] && [ ! -e in.lz ] ; } || test_failed $LINENO
+for i in bad_size -1 0 4095 513MiB 1G 1T 1P 1E 1Z 1Y 10KB ; do
+ "${LZIP}" -fkqs $i in
+ { [ $? = 1 ] && [ ! -e in.lz ] ; } || test_failed $LINENO $i
+done
+"${LZIP}" -lq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq < in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -cdq in
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -cdq < in
+[ $? = 2 ] || test_failed $LINENO
+# these are for code coverage
+"${LZIP}" -lt "${in_lz}" 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cdl "${in_lz}" > out 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cdt "${in_lz}" > out 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -t -- nx_file 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --help > /dev/null || test_failed $LINENO
+"${LZIP}" -n1 -V > /dev/null || test_failed $LINENO
+"${LZIP}" -m 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -z 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --bad_option 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --t 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --test=2 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --output= 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" --output 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+printf "LZIP\001-.............................." | "${LZIP}" -t 2> /dev/null
+printf "LZIP\002-.............................." | "${LZIP}" -t 2> /dev/null
+printf "LZIP\001+.............................." | "${LZIP}" -t 2> /dev/null
+
+printf "\ntesting decompression..."
+
+"${LZIP}" -lq "${in_lz}" || test_failed $LINENO
+"${LZIP}" -t "${in_lz}" || test_failed $LINENO
+"${LZIP}" -cd "${in_lz}" > copy || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+
+rm -f copy
+cat "${in_lz}" > copy.lz || framework_failure
+"${LZIP}" -dk copy.lz || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+printf "to be overwritten" > copy || framework_failure
+"${LZIP}" -d copy.lz 2> /dev/null
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -df copy.lz
+{ [ $? = 0 ] && [ ! -e copy.lz ] && cmp in copy ; } || test_failed $LINENO
+
+printf "to be overwritten" > copy || framework_failure
+"${LZIP}" -df -o copy < "${in_lz}" || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+
+rm -f copy
+"${LZIP}" < in > anyothername || test_failed $LINENO
+"${LZIP}" -dv --output copy - anyothername - < "${in_lz}" 2> /dev/null
+{ [ $? = 0 ] && cmp in copy && cmp in anyothername.out ; } ||
+ test_failed $LINENO
+rm -f copy anyothername.out
+
+"${LZIP}" -lq in "${in_lz}"
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -lq nx_file.lz "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -tq in "${in_lz}"
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -tq nx_file.lz "${in_lz}"
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cdq in "${in_lz}" > copy
+{ [ $? = 2 ] && cat copy in | cmp in - ; } || test_failed $LINENO
+"${LZIP}" -cdq nx_file.lz "${in_lz}" > copy
+{ [ $? = 1 ] && cmp in copy ; } || test_failed $LINENO
+rm -f copy
+cat "${in_lz}" > copy.lz || framework_failure
+for i in 1 2 3 4 5 6 7 ; do
+ printf "g" >> copy.lz || framework_failure
+ "${LZIP}" -alvv copy.lz "${in_lz}" > /dev/null 2>&1
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -atvvvv copy.lz "${in_lz}" 2> /dev/null
+ [ $? = 2 ] || test_failed $LINENO $i
+done
+"${LZIP}" -dq in copy.lz
+{ [ $? = 2 ] && [ -e copy.lz ] && [ ! -e copy ] && [ ! -e in.out ] ; } ||
+ test_failed $LINENO
+"${LZIP}" -dq nx_file.lz copy.lz
+{ [ $? = 1 ] && [ ! -e copy.lz ] && [ ! -e nx_file ] && cmp in copy ; } ||
+ test_failed $LINENO
+
+cat in in > in2 || framework_failure
+cat "${in_lz}" "${in_lz}" > in2.lz || framework_failure
+"${LZIP}" -lq in2.lz || test_failed $LINENO
+"${LZIP}" -t in2.lz || test_failed $LINENO
+"${LZIP}" -cd in2.lz > copy2 || test_failed $LINENO
+cmp in2 copy2 || test_failed $LINENO
+
+"${LZIP}" --output=copy2 < in2 || test_failed $LINENO
+"${LZIP}" -lq copy2.lz || test_failed $LINENO
+"${LZIP}" -t copy2.lz || test_failed $LINENO
+"${LZIP}" -cd copy2.lz > copy2 || test_failed $LINENO
+cmp in2 copy2 || test_failed $LINENO
+
+printf "\ngarbage" >> copy2.lz || framework_failure
+"${LZIP}" -tvvvv copy2.lz 2> /dev/null || test_failed $LINENO
+rm -f copy2
+"${LZIP}" -alq copy2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq copy2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -atq < copy2.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -adkq copy2.lz
+{ [ $? = 2 ] && [ ! -e copy2 ] ; } || test_failed $LINENO
+"${LZIP}" -adkq -o copy2 < copy2.lz
+{ [ $? = 2 ] && [ ! -e copy2 ] ; } || test_failed $LINENO
+printf "to be overwritten" > copy2 || framework_failure
+"${LZIP}" -df copy2.lz || test_failed $LINENO
+cmp in2 copy2 || test_failed $LINENO
+
+printf "\ntesting compression..."
+
+"${LZIP}" -cf "${in_lz}" > out 2> /dev/null # /dev/null is a tty on OS/2
+[ $? = 1 ] || test_failed $LINENO
+"${LZIP}" -cFvvm36 "${in_lz}" > out 2> /dev/null || test_failed $LINENO
+"${LZIP}" -cd out | "${LZIP}" -d > copy || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -k -$i in || test_failed $LINENO $i
+ mv -f in.lz copy.lz || test_failed $LINENO $i
+ printf "garbage" >> copy.lz || framework_failure
+ "${LZIP}" -df copy.lz || test_failed $LINENO $i
+ cmp in copy || test_failed $LINENO $i
+done
+
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -c -$i in > out || test_failed $LINENO $i
+ printf "g" >> out || framework_failure
+ "${LZIP}" -cd out > copy || test_failed $LINENO $i
+ cmp in copy || test_failed $LINENO $i
+done
+
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -$i < in > out || test_failed $LINENO $i
+ "${LZIP}" -d < out > copy || test_failed $LINENO $i
+ cmp in copy || test_failed $LINENO $i
+done
+
+for i in s4Ki 0 1 2 3 4 5 6 7 8 9 ; do
+ "${LZIP}" -f -$i -o out < in || test_failed $LINENO $i
+ "${LZIP}" -df -o copy < out.lz || test_failed $LINENO $i
+ cmp in copy || test_failed $LINENO $i
+done
+
+cat in in in in in in in in > in8 || framework_failure
+"${LZIP}" -1s12 -S100k -o out < in8 || test_failed $LINENO
+"${LZIP}" -t out00001.lz out00002.lz || test_failed $LINENO
+"${LZIP}" -cd out00001.lz out00002.lz | cmp in8 - || test_failed $LINENO
+rm -f out00001.lz
+"${LZIP}" -1ks4Ki -b100000 in8 || test_failed $LINENO
+"${LZIP}" -t in8.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz | cmp in8 - || test_failed $LINENO
+rm -f in8
+"${LZIP}" -0 -S100k -o out < in8.lz || test_failed $LINENO
+"${LZIP}" -t out00001.lz out00002.lz || test_failed $LINENO
+"${LZIP}" -cd out00001.lz out00002.lz | cmp in8.lz - || test_failed $LINENO
+rm -f out00001.lz out00002.lz
+"${LZIP}" -0kF -b100k in8.lz || test_failed $LINENO
+"${LZIP}" -t in8.lz.lz || test_failed $LINENO
+"${LZIP}" -cd in8.lz.lz | cmp in8.lz - || test_failed $LINENO
+rm -f in8.lz in8.lz.lz
+
+printf "\ntesting bad input..."
+
+cat "${in_lz}" "${in_lz}" "${in_lz}" > in3.lz || framework_failure
+if dd if=in3.lz of=trunc.lz bs=14752 count=1 2> /dev/null &&
+ [ -e trunc.lz ] && cmp in2.lz trunc.lz > /dev/null 2>&1 ; then
+ for i in 6 20 14734 14753 14754 14755 14756 14757 14758 ; do
+ dd if=in3.lz of=trunc.lz bs=$i count=1 2> /dev/null
+ "${LZIP}" -lq trunc.lz
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -t trunc.lz 2> /dev/null
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -tq < trunc.lz
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -cdq trunc.lz > out
+ [ $? = 2 ] || test_failed $LINENO $i
+ "${LZIP}" -dq < trunc.lz > out
+ [ $? = 2 ] || test_failed $LINENO $i
+ done
+else
+ printf "\nwarning: skipping truncation test: 'dd' does not work on your system."
+fi
+
+cat "${in_lz}" > ingin.lz || framework_failure
+printf "g" >> ingin.lz || framework_failure
+cat "${in_lz}" >> ingin.lz || framework_failure
+"${LZIP}" -lq ingin.lz
+[ $? = 2 ] || test_failed $LINENO
+"${LZIP}" -t ingin.lz || test_failed $LINENO
+"${LZIP}" -cd ingin.lz > copy || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+"${LZIP}" -t < ingin.lz || test_failed $LINENO
+"${LZIP}" -d < ingin.lz > copy || test_failed $LINENO
+cmp in copy || test_failed $LINENO
+
+echo
+if [ ${fail} = 0 ] ; then
+ echo "tests completed successfully."
+ cd "${objdir}" && rm -r tmp
+else
+ echo "tests failed."
+fi
+exit ${fail}
diff -Nru /sys/src/cmd/lzip/testsuite/test.txt /sys/src/cmd/lzip/testsuite/test.txt
--- /sys/src/cmd/lzip/testsuite/test.txt Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/lzip/testsuite/test.txt Sat May 1 00:00:00 2021
@@ -0,0 +1,676 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) <year> <name of author>
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) <year> <name of author>
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
Binary files /sys/src/cmd/lzip/testsuite/test.txt.lz and /sys/src/cmd/lzip/testsuite/test.txt.lz differ