README for ziproxy (CVS)

Ziproxy - a compression http proxy
Copyright (C)2002-2004 Juraj Variny <variny@naex.sk>
Copyright (C)2005-2009 Daniel Mealha Cabrita <dancab@gmx.net>

	This program is free software; you can redistribute it and/or modify
	it under the terms of the GNU General Public License as published by
	the Free Software Foundation; either version 2 of the License, or
	(at your option) any later version.

	This program is distributed in the hope that it will be useful,
	but WITHOUT ANY WARRANTY; without even the implied warranty of
	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
	GNU General Public License for more details.

	You should have received a copy of the GNU General Public License
	along with this program; if not, write to the Free Software
	Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111 USA


Ziproxy is a forwarding (non-caching) proxy that gzips text and HTML files,
and reduces the size of images by converting them to lower quality JPEGs.
It is intended to increase the speed for low-speed Internet connections
and it's suitable for both home and professional usage.
Ziproxy is fully configurable and also features transparent proxy mode,
preemptive name resolution, operation in either daemon mode or (x)inetd mode,
a detailed access log with compression statistics, basic authentication, and more.

 Why?

HTML is plain text, so it is verbose and compresses very well. Most
web browsers can receive content compressed with gzip and display it
as normal, but few people know about this ability, so it is rarely
used. With this proxy, text files that a web server sends
uncompressed are compressed before being sent over a slower Internet
connection (like dial-up), which speeds up web access. As an example
of the gain, a 100 KB HTML page can be compressed down to around
7 KB by this proxy. Admittedly, this is a shameless advert: it does
not account for modem hardware compression, but that is considerably
less efficient. Even for browsers that don't support gzip there is a
workaround using SSH port forwarding, and it can yield even better
compression and response times.

Moreover, images on most pages are of unnecessarily high quality
and/or saved in an unsuitable format. On average, ziproxy compresses
images to one third of their original size, with only a marginal
visible decrease in quality. Animated GIFs are stopped, too.

The idea is that you install this at your ISP, or on a fast server 
on the internet ("remote host"). Then use this proxy for your 
dial-up connections to the web from "local host".

 Requirements on remote host

* libungif 

* libpng 

* libjasper (if JPEG2000 support is desired, optional)

* libjpeg-6b 

* zlib

* GCC and GNU make. BSD make may work. Sun Make/CC doesn't.

 Installation

To see your options, run: 

$ ./configure --help 

Then, running:

$ ./configure 
$ make 
$ make install 

should compile and install the 'ziproxy' binary. There are
optional test programs that can be compiled and installed with
the --enable-testprogs option to configure:

  modifytest for testing HTML modification - reads the file from 
  stdin and outputs to stdout. 

  imgtest does the same for images, specify input and output file 
  names as command line parameters. 

  cfgtest can be used to check configuration file parser and default 
  values. 

 Command line

ziproxy <-d|-i> [-c </path/to/ziproxy.conf>] [-f <IP.address or hostname>] [-h]

-d runs ziproxy in daemon mode.
Either this or '-i' is mandatory, both are mutually exclusive.

-i runs ziproxy in [x]inetd mode.
Either this or '-d' is mandatory, both are mutually exclusive.
Use this when invoking ziproxy from either inetd or xinetd.

-c configuration file, full path (typically /etc/ziproxy/ziproxy.conf)
Optional, if unspecified use the internal default.

-f same as the "OnlyFrom=" option in configuration file,
but with higher precedence.
Optional, if unspecified this option won't be used.

-h display available command line options



 Configuration file

The default location for the configuration file is the current directory.

 daemon mode-only options

  WhereZiproxy
  This option is obsolete.

  Port=8080 Port number Ziproxy uses to listen for connections.

  Address
  Local address to listen for proxy connections.
  If you have more than one network interface,
  it's useful for restricting to which interface you want to bind to.
  Default: binds to all interfaces

  OnlyFrom="an.IP.address" Accept requests only from the specified
  hostname/IP address. You can also specify a range of IP addresses
  by OnlyFrom="begin.IP.address-end.IP.address". Default is empty
  (connections are accepted from everywhere).
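
  A minimal sketch of how such a check could work, written in Python
  rather than Ziproxy's own C code. The function name only_from_allows
  is hypothetical; the sketch assumes IP literals only (no hostnames)
  and treats both ends of a range as inclusive, which the text above
  does not state.

```python
import ipaddress

def only_from_allows(only_from, client_ip):
    """Return True if client_ip is permitted by an OnlyFrom-style value."""
    if not only_from:
        return True  # empty value: connections are accepted from everywhere
    client = ipaddress.ip_address(client_ip)
    if "-" in only_from:
        # "begin.IP.address-end.IP.address" form (assumed inclusive)
        begin_text, end_text = only_from.split("-", 1)
        begin = ipaddress.ip_address(begin_text.strip())
        end = ipaddress.ip_address(end_text.strip())
        return begin <= client <= end
    return client == ipaddress.ip_address(only_from.strip())
```

  For instance, only_from_allows("192.168.1.10-192.168.1.20",
  "192.168.1.15") would accept the client under these assumptions.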

  NetdTimeout=240 If no connection appears, Ziproxy will exit after 
  specified time in seconds. Set to 0 to disable this.

  MSIETest=true/false If both inetd (or xinetd) and MSIE run under
  win2000/XP, MSIE will complain about a broken connection. This can
  be avoided by setting this option to TRUE. But then inetd will start
  3 processes instead of one for every request, which is not
  convenient for everyday use. It uses the system() function instead
  of exec().

 general options

  Gzip=true/false Whether ziproxy should compress data itself. 
  Browser must accept compressed data. Default: true. If you're 
  using ssh with compression, turn off to prevent unnecessary double 
  compression.
  This optimization is not limited by MaxSize.
  (it used to be like that up to ziproxy 1.3.0)

  Compressible={"shockwave","msword","java"} Specifies MIME data 
  types under application/, which ziproxy should compress too. 
  Default: empty. Type given in response from server is treated 
  following way (example: "application/x-javascript"):

* leading "application/" is discarded (result: "x-javascript")

* if result begins with "x-", that is discarded too (result: "javascript")

* beginning of result is compared with all strings specified in 
  Compressible option. If matches, ziproxy will compress it. (third 
  string above matches leading "java")
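
The three matching steps above can be sketched as follows. This is an
illustration in Python, not Ziproxy's actual code; the function name
is_compressible is hypothetical, and stripping MIME parameters and
lowercasing the type are assumptions not stated in the text.

```python
def is_compressible(content_type, compressible):
    """Apply the Compressible matching steps to a Content-Type value."""
    # Assumption: ignore parameters such as "; charset=..." and letter case.
    mime = content_type.split(";", 1)[0].strip().lower()
    prefix = "application/"
    if not mime.startswith(prefix):
        return False  # the option only concerns types under application/
    subtype = mime[len(prefix):]      # step 1: discard "application/"
    if subtype.startswith("x-"):
        subtype = subtype[2:]         # step 2: discard a leading "x-"
    # step 3: compare the beginning of the result with each configured string
    return any(subtype.startswith(entry) for entry in compressible)
```

With Compressible={"shockwave","msword","java"}, the type
"application/x-javascript" matches through the leading "java".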

  ImageQuality={17,20,23,25} This option must either have 4 values
  or not be present at all. The numbers give the requested quality
  of outgoing JPEG images, based on the size of the image (width*height
  in pixels), respectively:

1. less than 5000 pixels

2. between 5000 and 50000 pixels or one dimension is smaller than 
  150 pixels

3. between 50000 and 250000 pixels

4. more than 250000 pixels

Each number has the following meaning:

* between -100 and -1: convert image to grayscale JPEG with given quality

* 0: do nothing with image

* between 1 and 100: convert image to color JPEG with given quality. 
  If the source image is grayscale, the resulting JPEG may be 
  grayscale too. But ziproxy isn't always able to detect grayscale 
  source images.

For example, ImageQuality={-15,20,25,0} means: Images less than 5000 
pixels will be converted to grayscale JPEG with quality of 15. 
Images between 5000 and 50000 pixels will be converted to color JPEG 
with quality of 20. Images between 50000 and 250000 pixels will be 
converted to color JPEG with quality of 25. Images larger than 
250000 pixels will be unchanged.
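
A small sketch of how the four values could be interpreted, in Python
for illustration only. The names size_class and jpeg_plan are
hypothetical. The order in which the "one dimension is smaller than
150 pixels" rule is tested, and the handling of the exact boundary
values, are assumptions; the text above leaves both open.

```python
def size_class(width, height):
    """Return the index (0..3) into ImageQuality for an image size."""
    pixels = width * height
    if pixels < 5000:
        return 0
    # Assumption: the narrow-dimension rule applies to any image that did
    # not already fall into the smallest class.
    if pixels < 50000 or width < 150 or height < 150:
        return 1
    if pixels < 250000:
        return 2
    return 3

def jpeg_plan(image_quality, width, height):
    """Return (action, quality) for one image."""
    value = image_quality[size_class(width, height)]
    if value == 0:
        return ("unchanged", None)   # 0: do nothing with the image
    if value < 0:
        return ("grayscale", -value) # -100..-1: grayscale JPEG
    return ("color", value)          # 1..100: color JPEG
```

With ImageQuality={-15,20,25,0}, a 50x50 image yields
("grayscale", 15) and a 1000x1000 image yields ("unchanged", None).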

  JP2Rate={0.15,0.1,0.04,0.03}
  This option is obsolete.
  See: JP2ImageQuality

  JP2ImageQuality={20,15,15,15}
  Image quality for JP2 (JPEG 2000) compression.
  Image quality is specified in integers between 100 (best) and 0 (worst).
  This option is similar to "ImageQuality" except it applies to JP2K files, instead.
  JP2K, internally, works differently and has a "rate" setting instead of "quality".
  Within Ziproxy's context we want to use a fixed quality, not a fixed bitrate.
  Thus, prior to compression, the image is analysed in order to know which rate
  (loosely) reflects the quality had this picture been compressed using JPEG.
  This option obsoletes "JP2Rate".
  * This option requires Ziproxy to be compiled with libjasper.

  ZiproxyTimeout=seconds
  If processing of request exceeds specified time in seconds,
  or connection is idle beyond that time (stalled) it will abort.
  This avoids processes staying forever (or for a very long time)
  in case of a stalled connection or software bug.
  This will NOT necessarily abort the streaming of very big files,
  it will ONLY if the connection stalls or there's a software bug.
  If "0", no timeout.
  Default: 90 (seconds)

  UseContentLength=true/false By default, if ziproxy is
  modifying/compressing, it begins sending data only after their
  length can be determined (UseContentLength=true). If you turn this
  option off, ziproxy will start sending data sooner, which will make
  browsing feel more responsive. But, because the browser doesn't know
  the data length, it will be unable to distinguish a broken connection
  from a properly closed one. If you use SSH compression instead and
  your browser identifies itself as HTTP/1.1, you need not unset
  this (ziproxy will then send "chunked" content to the browser).

  MaxSize=bytes
  Max file size to try to (re)compress, in bytes;
  If "0", means that this limitation won't apply.
  This refers to the file size as received from the remote HTTP server
  (which may arrive gzipped or not -- it doesn't matter).
  If a file is bigger than this limit, Ziproxy will simply stream it unmodified,
  unless the user also requested gzip compression (see below).
  Attention: If setting a very big size, the request answer latency will
    increase since Ziproxy needs to fetch the whole file before
    attempting to (re)compress it.
    A too low value will prevent data bigger than that from being processed
    (jpg/png/gif recompression, htmlopt, preemptdns..).
  Note that if:
    - Only gzipping is to be applied *OR*
    - Gzipping and other is to be applied, but data is > MaxSize
    Gzip compression (and only that) will be applied while streaming.
  Default: 1048576 (bytes)
    (default used to be "0" in ziproxy 2.3.0 and earlier)
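
  The interaction between MaxSize and gzip described above can be
  summarised by the following sketch (Python, illustration only). The
  function name plan_response and the three result labels are
  hypothetical, and treating a file of exactly MaxSize bytes as within
  the limit is an assumption.

```python
def plan_response(body_size, max_size, wants_gzip, other_processing):
    """Decide how a response body would be handled.

    other_processing: True if image recompression, HTML optimization,
    preemptive DNS or similar is to be applied to this data.
    """
    within_limit = max_size == 0 or body_size <= max_size  # 0: no limit
    if other_processing and within_limit:
        # The whole file is fetched first, then (re)compressed.
        return "buffer_and_process"
    if wants_gzip:
        # Only gzipping, or data above MaxSize: gzip while streaming.
        return "stream_gzip"
    return "stream_unmodified"
```

  For example, a 2 MB HTML page with the default MaxSize of 1048576
  and gzip requested would be gzipped while streaming, without the
  other optimizations.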

  MinTextStream=bytes
  This option is obsolete.

  ViaServer="something" If specified, ziproxy will send and check 
  Via: header with given string as host identification. It is 
  sometimes useful to avoid request loops.
  Default: not specified

  ModifySuffixes=true/false
  This option is obsolete.
 
  AllowLookChange=true/false If ziproxy is compressing transparent or
  animated images, the resulting change of page look is sometimes too
  drastic.
  Setting this option to false makes ziproxy avoid compressing these 
  images.
  Default: false (true in pre-2.0.0 versions).

  ProcessJPG=true/false If false, ziproxy will not try to recompress
  JPEG format files.
  Default: true.

  ProcessPNG=true/false If false, ziproxy will not try to recompress
  PNG format files.
  Default: true.

  ProcessGIF=true/false If false, ziproxy will not try to recompress
  GIF format files.
  Default: true.

  ProcessJP2=true/false If false, ziproxy will not try to recompress
  JP2 (JPEG 2000) files.
  * This option requires Ziproxy to be compiled with libjasper.
  Default: false.

  ProcessToJP2=true/false
  Whether to try to compress an image to JP2K (JPEG 2000).
  Even when enabled, other formats may still be tried.
  Web browsers' support varies and an external plugin may be required
  in order to display JP2K pictures.
  If "ForceOutputNoJP2 = true", this option will be overridden
  and stay disabled.
  * This option requires Ziproxy to be compiled with libjasper.
  Default: false

  ForceOutputNoJP2=true/false
  When enabled, this option forces the conversion of all incoming
  JP2K images to another format (usually JPEG).
  JP2K images with unsupported internal data will be forwarded unmodified.
  One may use this option to create "JP2K-compressed tunnels" between
  two Ziproxies with narrow bandwidth in between and serve clients
  which otherwise do not support JP2K while still taking advantage of that
  format. In such scenario, if the clients and their Ziproxy share a LAN,
  for best image quality it is recommended to set a very low (highest quality)
  _local_ output compression.
  This option requires "ProcessJP2 = true" in order to work.
  * This option requires Ziproxy to be compiled with libjasper.
  Default: false

  AnnounceJP2Capability=true/false
  When enabled, every request as a client will include an extra header "X-Ziproxy-Flags"
  announcing it as a Ziproxy with JP2 support enabled.
  This option makes sense when chaining to another Ziproxy.
  Note: when the request is intercepted by another Ziproxy,
        the extra header won't be sent further.
  See also: JP2OutRequiresExpCap
  Default: false

  JP2OutRequiresExpCap=true/false
  "JP2 Output Requires Explicit Capability"
  When enabled (and when JP2 output is enabled), Ziproxy will compress to
  JP2 only for clients which explicitly announce support for it -- that
  means a Ziproxy with AnnounceJP2Capability = true.
  This option is useful when you want to compress to JP2 only for clients
  behind a local Ziproxy with ForceOutputNoJP2 = true, but at the same time
  you have clients connecting directly and those do not support JP2.
  Default: false (does not make such discrimination for JP2 output)

  JP2Colorspace=VALUE
  Color model to be used while compressing images to JP2K.
  Accepted values:
    0 - RGB
    1 - YUV
  If different than RGB, it adds extra processing due to conversion.
  By itself it doesn't change the output data size much, and the
  conversion is not 100.0% lossless.
  If you plan on using the JP2CSampling* or JP2BitRes* options, a non-RGB
  color model is highly preferable.
  Default: 1 (YUV)
  Note: certain jp2-aware software does NOT support a color model
        other than RGB and will either fail or display a distorted image.

  JP2Upsampler=VALUE
  Upsampler to be used while resampling each component of a JP2K picture.
  This is used ONLY when decompressing JP2K pictures, it does not affect
  JP2K compression at all (that uses a downsampler, which is linear-only).
  Accepted values:
    0 - Linear
    1 - Lanczos (Lanczos3)
  For modest scaling such as 2:1, linear is usually better,
  resulting in an overall clear component.
  Lanczos may be interesting when scaling 4:1 or more, though
  it tends to sharpen the JP2K artifacts and add harmonic
  interference to the component.
  Default: 0 (Linear)

  JP2BitResYA={Y1,A1, Y2,A2, Y3,A3, Y4,A4}
  This applies to B&W pictures compressed to JP2K.
  Defines the channel resolution for each component:
  Y (luma) and A (alpha, if present)
  in number of bits (min: 1, max: 8).
  Defined for each image size (see JP2ImageQuality).
  The smallest image size comes first in the array.
  Sequence is YAYAYAYA.
  Default: all to eight bits

  JP2BitResRGBA={R1,G1,B1,A1, R2,G2,B2,A2, R3,G3,B3,A3, R4,G4,B4,A4}
  This applies to color pictures compressed to JP2K
  using the RGB model (see JP2Colorspace).
  Defines the channel resolution for each component:
  R (red), G (green), B (blue) and A (alpha, if present)
  in number of bits (min: 1, max: 8).
  Defined for each image size (see JP2ImageQuality).
  The smallest image size comes first in the array.
  Sequence is RGBARGBARGBARGBA.
  Default: all to eight bits

  JP2BitResYUVA={Y1,U1,V1,A1, Y2,U2,V2,A2, Y3,U3,V3,A3, Y4,U4,V4,A4}
  This applies to color pictures compressed to JP2K
  using the YUV color model (see JP2Colorspace).
  Defines the channel resolution for each component:
  Y (luma), U (chroma, Cb), V (chroma, Cr), and A (alpha, if present)
  in number of bits (min: 1, max: 8).
  Defined for each image size (see JP2ImageQuality).
  The smallest image size comes first in the array.
  Sequence is YUVAYUVAYUVAYUVA.
  Default: sensible values for best quality/compression
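
  To make the array layout of the JP2BitRes* options concrete, the
  sketch below (Python, illustration only) splits a flat list into one
  group per image size, smallest first. The helper name split_per_size
  is hypothetical, and the numbers used in the example are made up;
  they are not Ziproxy's defaults.

```python
def split_per_size(values, components):
    """Group a flat JP2BitRes*-style list by image size class.

    components is the per-size sequence, e.g. "YA", "RGBA" or "YUVA".
    Returns four dicts, ordered from the smallest to the largest size.
    """
    n = len(components)
    if len(values) != 4 * n:
        raise ValueError("expected %d values, got %d" % (4 * n, len(values)))
    return [dict(zip(components, values[i * n:(i + 1) * n]))
            for i in range(4)]

# Hypothetical example values for JP2BitResYUVA (sequence YUVAYUVAYUVAYUVA)
groups = split_per_size([8, 6, 6, 8,  8, 5, 5, 8,  7, 5, 5, 8,  6, 4, 4, 8],
                        "YUVA")
```

  Here groups[0] holds the bit resolutions for the smallest images and
  groups[3] those for the largest.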

  JP2CSamplingYA={ (see below) }
  This applies to B&W pictures compressed to JP2K.
  Here you may define the sampling rate for each component,
  for each picture size.
  The sequence is:
  Y_xpos, Y_ypos, Y_xstep, Y_ystep,  A_xpos, A_ypos, A_xstep, A_ystep, (smallest picture)
  ... ... ... (medium-sized picture)
  etc.
  Default: all x/ypos=0 x/ystep=1 (no components suffer subsampling)
  Note: certain jp2-aware software does NOT support component subsampling and will fail.

  JP2CSamplingRGBA={ (see below) }
  This applies to color pictures compressed to JP2K
  using the RGB model (see JP2Colorspace).
  Here you may define the sampling rate for each component,
  for each picture size.
  The sequence is:
  R_xpos, R_ypos, R_xstep, R_ystep,  G_xpos, G_ypos, G_xstep, G_ystep,  B...  A... (smallest picture)
  ... ... ... (medium-sized picture)
  etc.
  Default: all x/ypos=0 x/ystep=1 (no components suffer subsampling)
  Note: certain jp2-aware software does NOT support component subsampling and will fail.

  JP2CSamplingYUVA={ (see below) }
  This applies to color pictures compressed to JP2K
  using the YUV color model (see JP2Colorspace).
  Here you may define the sampling rate for each component,
  for each picture size.
  The sequence is:
  Y_xpos, Y_ypos, Y_xstep, Y_ystep,  U_xpos, U_ypos, U_xstep, U_ystep,  V...  A... (smallest picture)
  ... ... ... (medium-sized picture)
  etc.
  Default: sensible values for a good image quality.
  Note: certain jp2-aware software does NOT support component subsampling and will fail.

  ProcessHTML=true/false If true, ziproxy will optimize the HTML code
  before attempting (if enabled) to gzip it. This option also covers
  javascript and CSS data embedded into HTML code.
  This option is affected by the other options: ProcessHTML_*.
  Default: false.
  *** THIS OPTION IS EXPERIMENTAL ***

  ProcessCSS=true/false If true, ziproxy will optimize the CSS code
  before attempting (if enabled) to gzip it. This option is affected
  by the other options ProcessHTML_*.
  This ONLY affects stand-alone CSS files, not CSS embedded into
  HTML code.
  Default: false.
  *** THIS OPTION IS EXPERIMENTAL ***

  ProcessJS=true/false If true, ziproxy will optimize the javascript code
  before attempting (if enabled) to gzip it. This option is affected
  by the other options ProcessHTML_*.
  This ONLY affects stand-alone javascript files, not javascript
  embedded into HTML code.
  Default: false.
  *** THIS OPTION IS EXPERIMENTAL ***

  ProcessHTML_CSS=true/false If true, CSS data embedded into HTML
  code will be optimized.
  In order to take effect, this option depends on the ProcessHTML
  option to be enabled as well.
  Default: true.

  ProcessHTML_JS=true/false If true, javascript code embedded
  into HTML code will be optimized.
  In order to take effect, this option depends on the ProcessHTML
  option to be enabled as well.
  Default: true.

  ProcessHTML_tags=true/false If true, HTML tags themselves will
  be optimized. This means quote char suppression (when possible),
  redundant spacing suppression, conversion to lowercase, and
  suppression of the slash from "/>".
  In order to take effect, this option depends on the ProcessHTML
  option to be enabled as well.
  Default: true.

  ProcessHTML_text=true/false If true, the text itself will be
  optimized, in practice meaning all the redundant spaces to
  be removed.
  In order to take effect, this option depends on the ProcessHTML
  option to be enabled as well.
  Default: true.

  ProcessHTML_PRE=true/false If true, PREformatted HTML text
  will be optimized.
  In order to take effect, this option depends on the ProcessHTML
  option to be enabled as well.
  Default: true.

  ProcessHTML_TEXTAREA=true/false If true, data within TEXTAREA HTML
  tags context will be optimized.
  In order to take effect, this option depends on the ProcessHTML
  option to be enabled as well.
  Default: true.

  ProcessHTML_NoComments=true/false If true, all the irrelevant
  HTML-level comments will be suppressed. This will not remove
  javascript code embedded into an HTML-comment block, nor certain
  other cases (which are not really comments).
  In order to take effect, this option depends on the ProcessHTML
  option to be enabled as well.
  Default: true.

  PreemptNameRes=true/false Preemptive name resolution. If true and
  the processed file is an HTML file, Ziproxy will try to resolve all
  the hostnames present in it (in the hope the resolved
  name will be cached by the DNS or name cache, external to Ziproxy).
  Ziproxy will _not_ cache any hostname by itself! (Try PDNSD, etc)
  If the user clicks a link from a page previously processed by
  Ziproxy, there will be no delay due to name resolution.
  Warning: This option will increase the DNS traffic by many times.
  Default: false (true in pre-2.0.0 versions).

  PreemptNameResMax=50 Maximum hostnames Ziproxy will try to resolve
  in a preemptive manner (see PreemptNameRes).
  Default: 50

  PreemptNameResBC=true/false Bogus check for hostnames Ziproxy
  will try to resolve in a preemptive manner (see PreemptNameRes).
  Currently, if enabled, ignore hostnames other than the ones
  ending with .nnnn, .nnn or .nn (eg. .info, .com, .br...)
  Default: false
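
  One way to read the check above is as a test on the last label of
  the hostname. The sketch below (Python, illustration only) assumes
  that "n" stands for a letter, so only hostnames ending in a dot
  followed by 2 to 4 letters pass; the function name
  passes_bogus_check is hypothetical.

```python
import re

# A dot followed by 2, 3 or 4 letters at the very end of the hostname.
_TLD_SHAPE = re.compile(r"\.[A-Za-z]{2,4}$")

def passes_bogus_check(hostname):
    """Return True if the hostname would still be resolved preemptively."""
    return bool(_TLD_SHAPE.search(hostname))
```

  Under this reading "www.example.com" and "example.info" pass, while
  "localhost" and a bare IP address such as "192.168.0.1" are ignored.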

  TransparentProxy=true/false Allow processing of requests as
  transparent proxy (will still accept normal proxy requests)
  In order to use Ziproxy as transparent proxy it's also needed
  to reroute the connections from x.x.x.x:80 to ziproxy.host:PROXY_PORT
  Default: false
  See also: RestrictOutPortHTTP

  ConventionalProxy=true/false
  Whether to process normal proxy requests or not
  Only makes sense when TransparentProxy is enabled.
  If transparent proxy is enabled, it's usually a good idea to disable
  conventional proxying since, depending on the layout of your network,
  it can be abused by ill-meant users to circumvent restrictions
  presented by another proxy placed between Ziproxy and the users.
  Default: true

  AllowMethodCONNECT=true/false
  Whether to allow the CONNECT method.
  This method is used by HTTPS, but may be used for other
  types of service (like instant messaging) which allow tunneling through http proxy.
  If you plan on serving only HTTP requests (no HTTPS nor anything else)
  you may want to disable this, in order to prevent potential
  abuse of the service.
  Default: true
  See also: RestrictOutPortCONNECT

  RestrictOutPortHTTP = { <list of ports> }
  If defined, restricts the outgoing connections (except CONNECT methods - used by HTTPS)
  to the listed destination ports.
  If TransparentProxy is used, for security reasons it's recommended to restrict
  to the ports (typically port 80) which are being intercepted.
  Default: all ports are allowed.
  See also: RestrictOutPortCONNECT

  RestrictOutPortCONNECT = { <list of ports> }
  If defined, restricts the outgoing connections using the CONNECT method (used by HTTPS)
  to the listed destination ports.
  If AllowMethodCONNECT=false, then no ports are allowed at all regardless of this list.
  Default: all ports are allowed.
  See also: AllowMethodCONNECT, RestrictOutPortHTTP
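
  The combined effect of AllowMethodCONNECT, RestrictOutPortHTTP and
  RestrictOutPortCONNECT can be sketched as below (Python, illustration
  only). The function name port_allowed is hypothetical; None is used
  here to represent an option that is not defined.

```python
def port_allowed(method, port, allow_connect=True,
                 http_ports=None, connect_ports=None):
    """Return True if an outgoing connection to `port` is permitted.

    http_ports / connect_ports: collections of allowed ports, or None
    when the corresponding RestrictOutPort* option is undefined.
    """
    if method.upper() == "CONNECT":
        if not allow_connect:
            return False  # no ports at all, regardless of the list
        allowed = connect_ports
    else:
        allowed = http_ports
    # Undefined list: all ports are allowed.
    return allowed is None or port in allowed
```

  For a transparent proxy intercepting port 80 only, something like
  port_allowed("GET", 8080, http_ports={80}) would be refused.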

  OverrideAcceptEncoding=true/false
  Whether to override the Accept-Encoding more to Ziproxy's liking.
  If disabled, Ziproxy will just forward Accept-Encoding received from the client
  (thus the data may or may not come gzipped, depending on your HTTP client).
  Currently, this option is used to always advertise Gzip capability to
  the remote HTTP server.
  This has _no_ relation to Gzip support between Ziproxy and the client,
  Ziproxy will compress/decompress the data according to the client.
  Default: true

  DecompressIncomingGzipData=true/false
  Enable/disable the internal gzip decompression by Ziproxy.
  This decompression is needed when the remote server sends data already gzipped,
  but further processing is desired (like HTMLopt, PreemptDNS etc).
  Disabling this will save some processing load, and reduce some latency since
  Ziproxy will directly stream that data to the client.
  - But processing features WILL NOT work with such data.
  Attention:
  If you disable this, but configure Ziproxy to advertise as a gzip-supporting
  client to the remote server: While using a non-gzip-supporting client, the client
  may receive gzip-encoded data and it won't know how to deal with that
  (== it will receive useless garbage).
  Default: true (enabled)

  RedefineUserAgent="SuperBrowser/2.07 (blah blah blah blah)"
  Replaces the User-Agent data sent by the client with a custom string,
  OR defines User-Agent with that string if that entry was not defined.
  If disabled, Ziproxy will just forward the User-Agent sent by the client.
  Normally you will want to leave this option DISABLED (commented).
  It's useful if you, for some reason, want to identify all the clients as
  some specific browser/version/OS.
  Certain websites may appear broken if the client uses a different browser than
  the one specified here.
  Default: <undefined> (just forwards User-Agent as defined by the client)

  MaxUncompressedGzipRatio=2000
  When Ziproxy receives Gzip data it will try to decompress in order to do
  further processing (HTMLopt, PreemptDNS etc).
  This makes Ziproxy vulnerable to 'gzip-bombs' (eg. like 10 GB of zeroes, compressed)
  which could be used to slow down or even crash the server.
  In order to avoid/minimise such problems, you can limit the max
  decompression proportion, related to the original file.
  If a Gzipped file exceeds that proportion while decompressing, its
  decompression is aborted.
  The user will receive an error page instead or (if already transferring)
  transfer will simply be aborted.
  You may disable this feature defining its value to '0'.
  Default: 2000 (that's 2000% == 20 times the compressed size)

  MinUncompressedGzipStreamEval=250000
  When limiting decompression rate with MaxUncompressedGzipRatio
  _and_ gunzipping while streaming it's not possible to know the
  file size until the transfer is finished. So Ziproxy verifies this while
  decompressing.
  The problem with doing this is the possibility of false positives:
  certain files compress a lot at their beginning, but much less
  shortly after.
  In order to prevent/minimize such problems, we define the minimum
  output (the decompressed data) generated before starting to
  check the decompression rate.
  If defined as '0', it will check the rate immediately.
  A too large value will increase the rate-limit precision, at the cost of less
  protection.
  Streams with output less than this value won't have decompression
  rate checking at all.
  This feature is only active if MaxUncompressedGzipRatio is defined.
  This does not affect data wholly loaded to memory (for further processing).
  Default: 250000 (bytes)
  See also: MaxUncompressedGzipRatio
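
  The way MaxUncompressedGzipRatio and MinUncompressedGzipStreamEval
  work together while streaming can be sketched with Python's zlib
  module. This is an illustration, not Ziproxy's implementation; the
  function name gunzip_checked and the use of ValueError to signal an
  abort are assumptions. Note that the sketch only checks the ratio
  after each input chunk has been decompressed, so chunks should be
  kept small.

```python
import zlib

def gunzip_checked(chunks, max_ratio_percent=2000, min_eval=250000):
    """Decompress gzip data chunk by chunk, aborting on a suspicious ratio.

    max_ratio_percent: decompressed/compressed limit in percent, 0 disables.
    min_eval: decompressed bytes that must be produced before checking.
    """
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)  # expect a gzip header
    consumed = produced = 0
    out = []
    for chunk in chunks:
        consumed += len(chunk)
        data = decomp.decompress(chunk)
        produced += len(data)
        if (max_ratio_percent
                and produced >= min_eval
                and produced * 100 > consumed * max_ratio_percent):
            raise ValueError("decompression ratio limit exceeded")
        out.append(data)
    out.append(decomp.flush())
    return b"".join(out)
```

  With the defaults, a small but highly repetitive page passes because
  its output stays below min_eval, while a stream of compressed zeroes
  is aborted as soon as more than 250000 bytes have been produced at a
  ratio above 20:1.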

  MaxUncompressedImageRatio = 500
  This is the maximum compression rate allowable for an incoming
  (before recompression) image file.
  If an image has a higher compression rate than this, it will not
  be unpacked and it will be forwarded to the client as is.
  This feature protects against (or mitigates) the problem with
  "image bombs" (gif bombs, etc) done with huge bitmaps with the same
  pixel color (thus very small once compressed).
  Since Ziproxy may try to recompress the image, if several of this
  kind are requested, the server may run out of memory, so this
  may be used as a DoS attack against Ziproxy.
  This feature will not protect the client, since it will receive
  the unmodified picture.
  There are rare legitimate cases matching such high compression rate,
  including poor website design. But in such cases it is not really worth
  recompressing anyway (the processing costs are not worth the savings).
  Usually "image bomb" pictures have a >1000:1 compression ratio.
  Setting this to less than 100 risks not processing legitimate pictures.
  Setting 0 disables this feature.
  Default: 500 (500:1 ratio)
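
  A rough sketch of such a check follows (Python, illustration only).
  How Ziproxy actually estimates the unpacked size is not described
  here; the sketch assumes width * height * bytes per pixel compared
  against the file size, and the function name should_recompress is
  hypothetical.

```python
def should_recompress(file_size, width, height,
                      bytes_per_pixel=3, max_ratio=500):
    """Return False if the image looks like an "image bomb".

    file_size: size in bytes of the image as received.
    max_ratio: maximum unpacked/packed ratio, 0 disables the check.
    """
    if max_ratio == 0:
        return True
    unpacked = width * height * bytes_per_pixel  # assumed estimate
    return unpacked <= file_size * max_ratio
```

  A 20000x20000 single-color image stored in 10000 bytes would be
  forwarded as is, while an ordinary 800x600 photo of 60000 bytes
  would still be processed.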

  CustomError400="/full/path/error_file.html"
  Custom error message for "Bad request"
  (malformed URL, or unknown URL type)
  Default: <undefined> (internal error message)

  CustomError403="/full/path/error_file.html"
  Custom error message for "Forbidden"
  (includes access to forbidden ports set in Ziproxy)
  Default: <undefined> (internal error message)

  CustomError404="/full/path/error_file.html"
  Custom error message for "Unknown host"
  (Ziproxy will not issue 'page not found' errors itself)
  Default: <undefined> (internal error message)

  CustomError407="/full/path/error_file.html"
  Custom error message for "Proxy Authentication Required"
  Default: <undefined> (internal error message)

  CustomError408="/full/path/error_file.html"
  Custom error message for "Request timed out"
  Default: <undefined> (internal error message)

  CustomError409="/full/path/error_file.html"
  Custom error message for "Conflict"
  Default: <undefined> (internal error message)

  CustomError500="/full/path/error_file.html"
  Custom error message for "Internal error"
  (or empty response from server)
  Default: <undefined> (internal error message)

  CustomError503="/full/path/error_file.html"
  Custom error message for "Connection refused"
  (or service unavailable)
  Default: <undefined> (internal error message)

  PasswdFile="/full/path/ziproxy.passwd"
  If enabled, requires authentication from clients willing to connect
  to the proxy.
  The specified file should contain:
    user:pass pairs,
    lines no longer than 128 chars
  Note: The password is unencrypted
  Default: No file specified (thus no authentication required)
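
  The file format above, combined with HTTP basic authentication, can
  be sketched as follows (Python, illustration only). The function
  names load_passwd and check_proxy_auth are hypothetical; the sketch
  assumes the credentials arrive in a standard "Proxy-Authorization:
  Basic ..." header and that everything after the first colon on a
  line is the password.

```python
import base64

def load_passwd(lines):
    """Parse user:pass lines into a dict (passwords are unencrypted)."""
    users = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if ":" not in line:
            continue  # skip empty or malformed lines
        user, password = line.split(":", 1)
        users[user] = password
    return users

def check_proxy_auth(header_value, users):
    """Validate the value of a Proxy-Authorization header."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(token.strip()).decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8
        return False
    user, sep, password = decoded.partition(":")
    return bool(sep) and users.get(user) == password
```

  A client failing this check would receive the "Proxy Authentication
  Required" (407) error.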

  Nameservers={"1.2.3.4", "11.22.33.44", ...}
  If enabled, uses the specified nameservers instead of the default ones
  (usually at /etc/resolv.conf)
  Default: Disabled. Ziproxy uses the OS-level DNS configuration.

  BindOutgoing = {"234.22.33.44", "4.3.2.1", "44.200.34.11", ...}
  Bind outgoing connections (to remote HTTP server) to the following (local) IPs
  It applies to the _outgoing_ connections, it has _no_ relation to the listener socket.
  When 2 or more IPs are specified, Ziproxy will rotate to each of those at each
  outgoing connection. All IPs have the same priority.
  You may use this option for either of the following reasons:
  1. - To use only a specific IP when connecting to remote HTTP servers.
  2. - Use 2 or more IPs for load balancing (a rather primitive one, since it's
       connection-based and does not take into account the bytes transferred).
  3. - You have a huge intranet and certain sites (google.com, for example)
       are blocking your requests because there are so many coming from the same IP.
       So you may use 2 or more IPs here and make it appear that your requests
       come from several different machines.
  This option does _not_ spoof packets, it merely uses the host's local IPs.
  Note: While in (x)inetd mode, output may be bound to only one IP.
  See also: BindOutgoingExList
  Default: Disabled. Binds to the default IP, the OS decides which one.

  BindOutgoingExList="/etc/ziproxy/bo_exception.list"
  Specifies a file containing a list of hosts which should not suffer
  IP rotation as specified by the option "BindOutgoing".
  The reason for this option is that certain services do not like
  the client IP changing in the same session.
  Certain webmail services fail or return authentication failure in this case.
  This option has no effect if BindOutgoing is not used.
  See also: BindOutgoingExAddr
  Default: empty, no hosts are exempted.

  BindOutgoingExAddr="98.7.65.43"
  Defines a specific IP to be bound to for hosts specified in BindOutgoingExList.
  As with BindOutgoing, this IP must be a local IP from the server running Ziproxy.
  This IP may be one of those specified in BindOutgoing, but that's _not_
  a requirement and may be a different IP.
  This option has no effect if BindOutgoingExList is not being used.
  Default: empty, uses the first IP specified in BindOutgoing.
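
  The rotation and its exceptions can be sketched as below (Python,
  illustration only). The class name OutgoingBinder is hypothetical,
  and whether a request to an exempted host advances the rotation is
  an assumption (here it does not).

```python
import itertools

class OutgoingBinder:
    """Pick the local source IP for each outgoing connection."""

    def __init__(self, bind_ips, exempt_hosts=(), exempt_addr=None):
        self._rotation = itertools.cycle(bind_ips)   # BindOutgoing
        self._exempt_hosts = set(exempt_hosts)       # BindOutgoingExList
        # BindOutgoingExAddr; defaults to the first IP in BindOutgoing
        self._exempt_addr = exempt_addr or bind_ips[0]

    def source_ip(self, host):
        if host in self._exempt_hosts:
            return self._exempt_addr  # fixed IP, no rotation for this host
        return next(self._rotation)   # all IPs have the same priority
```

  The returned address would then be passed to the socket's bind()
  call (with port 0) before connecting to the remote HTTP server.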

  WA_MSIE_FriendlyErrMsgs=true/false
  Workaround for MSIE's pseudo-feature "Show friendly HTTP error messages."
  If User-Agent=MSIE, don't change/compress the body of error messages in any way.
  If compressed, it could shrink below 256 or 512 bytes and be replaced with
  a local error message instead.
  In certain cases the body has crucial data, like HTML redirection or so, and
  that would be broken if a "friendly error" replaces it.
  If you are sure that none of your users run MSIE with "friendly error messages"
  enabled, or you don't support users with such a configuration, you may
  disable this and have error data compressed for MSIE users.
  This workaround does not affect other clients at all, and error messages
  will be sent compressed if the client supports it.
  Default: true (enabled)

  URLNoProcessing = "/etc/ziproxy/noprocess.list"
  This option specifies a file containing a list of URLs that should be tunneled
  by Ziproxy with no kind of processing whatsoever.
  The list contains fully formatted URLs (http://xxx/xxx), one URL per line.
  The URLs may also contain pattern-matching asterisks.
  Comments may be present if prefixed by '#' (shell-alike).
  In order to exempt a whole site from processing: "http://www.exemptedhost.xyz/*"
  This option exists for pages known to stop working under Ziproxy processing
  when no specific workaround/bugfix is available yet.
  It is thus a temporary solution for when you depend on the page working in a
  production environment.
  ****** REMEMBER TO REPORT BUGS/INCOMPATIBILITIES SO THEY MAY BE FIXED *******
  *** THIS IS NOT SUPPOSED TO BE A DEFINITIVE SOLUTION TO INCOMPATIBILITIES ***
  Default: empty (no file specified, inactive)
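
  The list format described above (one URL per line, '*' wildcards, '#'
  comments) can be sketched in Python as follows. This is not Ziproxy's
  parser: the function names are hypothetical, and it assumes that comments
  occupy a whole line and that '*' matches any run of characters.

```python
import re

def load_url_patterns(text):
    """Parse list-file text: one URL pattern per line, '#' lines are comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Escape every literal part, then let each '*' match any characters.
        regex = ".*".join(re.escape(part) for part in line.split("*"))
        patterns.append(re.compile(regex))
    return patterns

def url_matches(url, patterns):
    """True if the whole URL matches at least one pattern."""
    return any(p.fullmatch(url) for p in patterns)

# Hypothetical list contents.
sample_list = """\
# pages known to break under processing
http://www.exemptedhost.xyz/*
http://other.example/app/*.cgi
"""
patterns = load_url_patterns(sample_list)
```

  With this list, url_matches("http://www.exemptedhost.xyz/index.html", patterns)
  is true, while a URL on any other host is left to normal processing.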

  URLReplaceData = "/etc/ziproxy/replace.list"
  This option specifies a file containing a list of URLs whose
  data should be intercepted and replaced by other data.
  Header data such as cookies is maintained.
  Currently the only replacing data available is an empty image
  (1x1 transparent pixel GIF).
  The list contains fully formatted URLs (http://xxx/xxx), one URL per line.
  The URLs may also contain pattern-matching asterisks.
  Comments may be present if prefixed by '#' (shell-alike).
  In order to replace the data from a whole site: "http://ad.somehost.xyz/*"
  As it stands, this option may be used as an AD-BLOCKER which is
  transparent to the remote host (data is downloaded from the remote server
  and cookies are transported) -- a stealthy ad-blocker, if you like.
  Default: empty (no file specified, inactive)
  See also: URLReplaceDataCT, URLReplaceDataCTList

  URLReplaceDataCT = "/etc/ziproxy/replace_ct.list"
  Same as URLReplaceData, except it will only replace the data
  from matching URLs if the content-type matches
  the list in URLReplaceDataCTList (mandatory parameter) as well.
  URLReplaceDataCT may be useful as a more compatible AD-BLOCKER
  if only visual files are replaced. Certain websites rely on
  external javascript from advertisement hosts and break when
  that data is missing; this is a way to block advertisements
  in such cases.
  Default: empty (no file specified, inactive)
  See also: URLReplaceDataCTList, URLReplaceData

  URLReplaceDataCTList = {"image/jpeg", ...}
  List of content-types to use with the URLReplaceDataCT option.
  Default: empty (no content-type specified, inactive)
  See also: URLReplaceDataCT
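
  The difference between URLReplaceData and URLReplaceDataCT can be sketched
  as a small decision function. This is an illustration in Python, not
  Ziproxy's code: the function name is hypothetical, and the handling of
  content-type parameters and letter case is an assumption.

```python
def should_replace(url_in_replace_list, url_in_replace_ct_list,
                   content_type, ct_list):
    """Decide whether a response body is swapped for the empty image."""
    if url_in_replace_list:
        return True  # URLReplaceData: the content-type is not consulted
    if url_in_replace_ct_list:
        # URLReplaceDataCT: the content-type must also appear in
        # URLReplaceDataCTList.
        # Assumption: parameters after ';' are ignored and the comparison
        # is case-insensitive.
        base_type = content_type.split(";")[0].strip().lower()
        return base_type in {ct.lower() for ct in ct_list}
    return False

# Hypothetical URLReplaceDataCTList holding only visual content-types.
ct_list = ["image/jpeg", "image/gif", "image/png"]
```

  With such a list, a script served by a URL in the URLReplaceDataCT file
  passes through untouched, while an image from the same URL pattern is
  replaced.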

  URLDeny = "/etc/ziproxy/deny.list"
  This option specifies a file containing a list of URLs which
  should be blocked.
  An "access denied" 403 error will be returned when trying to access
  one of those URLs.
  Default: empty (no file specified, inactive)



 Logging options

Logging output is intended mainly for debugging. If neither LogFile 
nor LogPipe option is found, logging is turned off (this is the default).

  LogFile="file_name" Append log output to file_name. The specified 
  string is first passed to the strftime() function with the current date/time.
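
  Because the name goes through strftime(), date conversions in it are
  expanded. The Python sketch below shows the effect using the equivalent
  time.strftime() call; the path and the helper name are hypothetical, and a
  fixed time is used only to make the result repeatable.

```python
import time

def expand_log_name(template, when):
    """Expand strftime() conversions in a LogFile value for a given time."""
    return time.strftime(template, when)

# Hypothetical LogFile value that yields one file per day.
template = "/var/log/ziproxy/debug-%Y-%m-%d.log"
name = expand_log_name(template, time.gmtime(0))  # fixed time: 1970-01-01 UTC
```

  Here name is "/var/log/ziproxy/debug-1970-01-01.log". Since '%' introduces a
  conversion, a literal percent sign in a file name has to be written as '%%'.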

  AccessLogFileName="/something_like/var/log/ziproxy/access.log"
  File to be used as access log.
  Log format (columns):
    TIME (unix time as seconds.msecs),
    PROCESS_TIME (ms, from receiving request to last byte sent to client),
    [USER@]ADDRESS (address with daemon mode only, with [x]inetd it displays a '?'),
    FLAGS,
    ORIGINAL_SIZE,
    SIZE_AFTER_(RE)COMPRESSION,
    METHOD,
    URL.
  Where FLAGS may be:
    P (a request as proxy)
    T (a request as transparent proxy)
    S (CONNECT method, usually HTTPS data)
    Z (transfer timed out - see ZiproxyTimeout)
    B (interrupted transfer - either by user or by remote http host)
    W (content type was supposed to load into memory, but it had no content-size
       and, in the end, it was bigger than MaxSize, so it was streamed instead)
    N (URL not processed. See: URLNoProcessing config option)
    R (data was replaced. See: URLReplaceData config option)
    K (image too expansive. See: MaxUncompressedImageRatio config option)
    1 (SIGSEGV received. See: InterceptCrashes config option)
    2 (SIGFPE received. See: InterceptCrashes config option)
    3 (SIGILL received. See: InterceptCrashes config option)
    4 (SIGBUS received. See: InterceptCrashes config option)
    5 (SIGSYS received. See: InterceptCrashes config option)
  Default: No file specified (thus no access logging)
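
  A log line with the columns listed above could be read with a short Python
  sketch like the one below. It is not part of Ziproxy: the names are
  hypothetical, the sample line is made up to follow the column list, and the
  sketch assumes whitespace-separated columns with a non-empty FLAGS field.

```python
from collections import namedtuple

Entry = namedtuple(
    "Entry",
    "time process_ms user address flags original_size final_size method url")

def parse_access_line(line):
    """Split one access-log line into its eight documented columns.

    Assumption: columns are separated by whitespace and FLAGS is never empty.
    """
    time_s, proc, who, flags, orig, final, method, url = line.split(None, 7)
    user, _, address = who.rpartition("@")  # user is '' without a USER@ prefix
    return Entry(float(time_s), int(proc), user, address, flags,
                 int(orig), int(final), method, url.strip())

# Hypothetical line, written to follow the column list above.
sample = ("1236873912.731 412 alice@192.168.0.7 P 102400 7168 "
          "GET http://www.example.com/")
entry = parse_access_line(sample)

# Share of the original size that was saved, in percent.
saved_percent = 100.0 * (1 - entry.final_size / entry.original_size)
```

  Individual flags can then be tested with expressions such as
  "Z" in entry.flags (timed-out transfer) or "R" in entry.flags (data replaced).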

  AccessLogUserPOV=true/false
  By default Ziproxy reports the real incoming (from remote HTTP server) data size when
  writing access logs.
  So if it was gzipped by the HTTP server itself, the incoming data will be (naturally) smaller.
  When the client does not support gzipped data, and the data came originally compressed,
  the access log will show a data size increase (because it had to be decompressed
  by Ziproxy, due to client limitations).
  This is misleading in certain log analysis since, if the client accessed the server
  directly instead, the amount of data transferred into this client would be the same anyway
  -- there's no loss from the client's point-of-view.
  To avoid such distortion in statistics, you may want to log - instead of the real incoming size -
  the size which would be transferred had the client connected directly to the server.
  Only applicable when 'AccessLogFileName' is enabled.
  Default: false (reports the real incoming size).
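
  One reading of the description above, expressed as a Python sketch. This is
  an interpretation of the text, not Ziproxy's code; the function and
  parameter names are hypothetical.

```python
def logged_incoming_size(wire_size, decoded_size, server_gzipped,
                         client_accepts_gzip, user_pov):
    """Return the incoming size that would be written to the access log.

    wire_size:    bytes actually received from the remote HTTP server
    decoded_size: size of the same data once decompressed
    """
    if user_pov and server_gzipped and not client_accepts_gzip:
        # A direct connection would have delivered the data uncompressed,
        # so that is the size the client would have seen.
        return decoded_size
    return wire_size  # real size received from the remote server
```

  For a 10000-byte page that the server sent as 2000 gzipped bytes to a client
  without gzip support, this logs 2000 with AccessLogUserPOV=false and 10000
  with AccessLogUserPOV=true.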

  InterceptCrashes = true/false
  When enabled, Ziproxy will intercept signals indicative of
  software crash, flag the offending request in access log
  accordingly, then stop the offending process.
  This is useful for debugging purposes and it's not recommended
  to leave it enabled in normal use due to the risk of garbage
  being written to access log (due to a more severe crash).
  Once enabled, the intercepted signals are:
  SIGSEGV (segmentation fault)
  SIGFPE (FPU exception)
  SIGILL (illegal instruction)
  SIGBUS (bus error, alignment issues)
  SIGSYS (bad system call)
  Default: disabled (those signals not intercepted by Ziproxy)

  LogPipe={"command","-arg1","-arg2"} This is incompatible with 
  xinetd, use with netd only! It pipes all logging output through 
  command. If the LogFile option is present too, the standard output of 
  command is redirected to that file. For example, if you don't have 
  enough space (low quota) on the remote host, you can compress logging 
  output on the fly. The disadvantage is that the log file is usable only 
  after Ziproxy exits (or after you end it with ^C). 

  NextProxy="host.name" 

  NextPort=8080 Forward everything to another proxy server. 
  Modifications/compression is still applied.

 Compiling under cygwin -- TODO

It compiles almost the same way. However, you may want to avoid 
libungif's dependence on X11. To do so, add the -static option to the 
LDFLAGS variable in the Makefile:

LDFLAGS = -g $(SYSV_LIBS) -static -lgif -lpng -ljpeg -lm -lz

Another advantage is that the statically linked executable 
ziproxy.exe can be transferred, together with cygwin1.dll, 
to another machine where cygwin need not be installed. 
An xinetd server is also available as a cygwin package.

 Usage

 With inetd

In /etc/inetd.conf add the following line (as a single line in the file), 
where <location> is the directory where you put the executable:

ziproxy stream tcp nowait.500 root /usr/sbin/tcpd <location>/ziproxy -i -c <location>/ziproxy.conf

In /etc/services add the following line, where <port> is the port you want 
the proxy to listen on:

ziproxy <port>/tcp

Then restart inetd.

 With xinetd

See the example config file included in this tarball.

 Daemon mode (standalone operation)

It is intended as a simple inetd replacement if you want to run 
ziproxy under an unprivileged user account. Every time you connect to the 
Internet, log in to the remote machine and start it with one of these commands:

- for daemon mode:

./ziproxy -d -c 'somewhere/ziproxy.conf' -f your.IP.address

- for port forwarding:

./ziproxy -d -c 'somewhere/ziproxy.conf' -f 127.0.0.1 

Or set OnlyFrom=127.0.0.1 in ziproxy.conf instead of using the -f switch. 
Then it will accept requests only from your machine. If you forget to 
kill Ziproxy before hanging up, it times out (according to the NetdTimeout 
option). 

 Automated SSH logins

Use SSH public key authentication for logging in without password -- 
see ssh-keygen manpage. 

 Direct connection

You can tell your browser to connect directly to the remote host 
running ziproxy. Compression can then be done by ziproxy (Gzip=true 
option), but your browser must support it. MS Internet Explorer 
needs setting up (see "MSIE setup for ziproxy" below); Opera, Mozilla and 
Konqueror are fine. There is also HTTP protocol overhead that can't be 
compressed this way. Moreover, this may not work if the remote machine is 
behind a firewall.
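
Whether gzip can be used depends on what the browser advertises in its 
Accept-Encoding request header. The Python sketch below illustrates the idea 
only; it is not Ziproxy's code. The function name is hypothetical, and for 
brevity it ignores quality values such as "gzip;q=0".

```python
import gzip

def maybe_gzip(body, accept_encoding):
    """Compress body only if the client advertised gzip support."""
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return gzip.compress(body), "gzip"
    return body, None  # client gave no sign of gzip support: send as-is

# Repetitive markup, standing in for a typical HTML page.
html = b"<tr><td>row</td><td>data</td></tr>\n" * 2000
compressed, encoding = maybe_gzip(html, "gzip, deflate")
plain, no_encoding = maybe_gzip(html, "")
```

The second element of the result is the value to send in a Content-Encoding 
response header, or None when the body went out unchanged.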

 Port forwarding

Use SSH port forwarding by running a command like

ssh yourlogin@remote.machine -C -L 8090:127.0.0.1:8080 -N

Then set up your browser to use proxy "localhost" port 8090, while 
ziproxy is using port 8080 on remote host. All connections between 
them are carried and compressed by ssh. Remarks about automating 
logins apply as above. 

This capability is present only in OpenSSH. To use it under 
Windows, OpenSSH compiled with the cygwin toolkit is available at 
http://www.networksimplicity.com/openssh/ . That's all. If you want 
to tweak it further, there is the CompressionLevel option for ssh.

 Transparent proxy

In order to use Ziproxy as transparent proxy:
1. - In ziproxy.conf: TransparentProxy = true
2. - You also need to reroute the connections from
     x.x.x.x:80 to ziproxy.host:PROXY_PORT

Examples of traffic rerouting (Linux kernel >= 2.4 OSes):
THESE ARE INCOMPLETE SCRIPTS AND DO NOT PROVIDE ANY SECURITY !!!

### Requests from a local machine --> remote Ziproxy host
$ /sbin/modprobe ip_tables
$ /sbin/modprobe iptable_nat
$ IPTABLES=/usr/sbin/iptables
$ ZIPROXY_HOST=200.56.78.90
$ ZIPROXY_PORT=8080
$ $IPTABLES -t nat -A OUTPUT -s 0/0 -p tcp --dport 80 -j DNAT --to ${ZIPROXY_HOST}:${ZIPROXY_PORT}

### A transparent machine routing HTTP traffic AND running Ziproxy
$ /sbin/modprobe ip_tables
$ /sbin/modprobe iptable_nat
$ IPTABLES=/usr/sbin/iptables
$ NET_INTERFACE=eth0
$ ZIPROXY_HOST=200.56.78.90
$ ZIPROXY_PORT=8080
$ $IPTABLES -t nat -A PREROUTING -i $NET_INTERFACE -d ! $ZIPROXY_HOST -p tcp --dport 80 -j REDIRECT --to-port $ZIPROXY_PORT

 FAQ/Known bugs 

 MSIE setup for ziproxy 

To get IE to accept gzipped data, check the option "Use HTTP/1.1 extensions 
when using proxy" under the Internet Options/Advanced tab.

 How and why ziproxy changes HTML

Obsolete -- most browsers can now detect the image type from the 
Content-type header properly. HTML changing is still needed 
in JPEG2000 mode -- see JPEG2000.txt.

When a browser displays an image, it looks at the suffix of its 
name. When ziproxy changed the image type from GIF/PNG to JPEG, it could 
not change the suffix afterwards, so the browser treated the image as 
broken. Ziproxy therefore has to change all image suffixes in the 
HTML beforehand. When you extract an image from a page, you have to rename 
it back to .gif/.png/.jpg according to what the image type really is. On 
un*x systems the "file" tool comes in handy. If you save an entire page 
(HTML+pictures), check whether your browser saves pictures with usable 
filename suffixes (most browsers do). If not, temporarily turn the 
proxy off, refresh the page and save it.

 Some pictures have inappropriate background color

Some transparent GIFs/PNGs are displayed with an incorrect background. 
This is because JPEG can't store transparency information, and the 
background color information is "out there" in the HTML. It is on the 
to-do list, but it will be quite a big change ;).

 Older WWWOffle versions don't like gzip compression

Using gzip (not ssh) compression seems to trigger a bug in wwwoffle 
- pages are incorrectly uncompressed. Upgrade to wwwoffle 2.7g or 
newer. 

 ziproxy seems running, but I can't login/run other programs on that 
  remote host!

For every HTTP request, a new ziproxy process is started. In case of 
intensive parallel downloading/mirroring (for example, wwwoffle 
-fetch or using httrack), the number of processes may temporarily reach 
the maximum user process limit set by the administrator. To avoid the 
problem, set a suitable limit for Ziproxy using the limit (csh) or 
ulimit (bash) shell command.
