Python WSGI Middleware for automatic Gzipping | Evan Fosmark

Python WSGI Middleware for automatic Gzipping

The complete code

""" A WSGI middleware application that automatically gzips output
    to the client.
    Before doing any gzipping, it first checks the environ to see if
    the client can even support gzipped output. If not, it immediately
    drops out.
    It automatically modifies the headers to include the proper values
    for the 'Content-Encoding' and 'Vary' headers.
 
    Example of use:
 
        from wsgiref.simple_server import WSGIServer, WSGIRequestHandler
 
        def test_app(environ, start_response):
            status = '200 OK'
            headers = [('content-type', 'text/html')]
            start_response(status, headers)
 
            return ['Hello gzipped world!']
 
        app = Gzipper(test_app, compresslevel=8)
        httpd = WSGIServer(('', 8080), WSGIRequestHandler)
        httpd.set_app(app)
        httpd.serve_forever()
 
"""
from gzip import GzipFile
import StringIO
 
__version__ = (1,0)
__author__ = "Evan Fosmark "
 
def gzip_string(string, compression_level):
    """ The `gzip` module didn't provide a way to gzip just a string.
        Had to hack together this. I know, it isn't pretty.
    """
    fake_file = StringIO.StringIO()
    gz_file = GzipFile(None, 'wb', compression_level, fake_file)
    gz_file.write(string)
    gz_file.close()
    return fake_file.getvalue()
 
def parse_encoding_header(header):
    """ Break up the `HTTP_ACCEPT_ENCODING` header into a dict of
        the form, {'encoding-name':qvalue}.
    """
    encodings = {'identity':1.0}
 
    for encoding in header.split(","):
        if(encoding.find(";") > -1):
            encoding, qvalue = encoding.split(";")
            encoding = encoding.strip()
            qvalue = qvalue.split('=', 1)[1]
            if(qvalue != ""):
                encodings[encoding] = float(qvalue)
            else:
                encodings[encoding] = 1
        else:
            # Strip whitespace so "deflate, gzip" yields 'gzip', not ' gzip'.
            encodings[encoding.strip()] = 1
    return encodings
 
def client_wants_gzip(accept_encoding_header):
    """ Check to see if the client can accept gzipped output, and whether
        or not it is even the preferred method. If `identity` is higher, then
        no gzipping should occur.
    """
    encodings = parse_encoding_header(accept_encoding_header)
 
    # Do the actual comparisons
    if('gzip' in encodings):
        return encodings['gzip'] >= encodings['identity']
 
    elif('*' in encodings):
        return encodings['*'] >= encodings['identity']
 
    else:
        return False
 
class Gzipper(object):
    """ WSGI middleware to wrap around and gzip all output.
        This automatically adds the content-encoding header.
    """
    def __init__(self, app, compresslevel=6):
        self.app = app
        self.compresslevel = compresslevel
 
    def __call__(self, environ, start_response):
        """ Do the actual work. If the client doesn't accept gzip as an encoding,
            then simply pass over to the next app on the wsgi stack.
        """
        accept_encoding_header = environ.get("HTTP_ACCEPT_ENCODING", "")
        if(not client_wants_gzip(accept_encoding_header)):
            return self.app(environ, start_response)
 
        def _start_response(status, headers, *args, **kwargs):
            """ Wrapper around the original `start_response` function.
                The sole purpose being to add the proper headers automatically.
            """
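            # The compressed body has a different size, so drop any
            # Content-Length set by the wrapped app.
            headers[:] = [(name, value) for name, value in headers
                          if name.lower() != "content-length"]
            headers.append(("Content-Encoding", "gzip"))
            headers.append(("Vary", "Accept-Encoding"))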
            return start_response(status, headers, *args, **kwargs)
 
        data = "".join(self.app(environ, _start_response))
        return [gzip_string(data, self.compresslevel)]
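
The listing above targets Python 2 (`StringIO`, byte strings as `str`). As a rough sketch of how the `gzip_string` helper might look on Python 3, assuming the body is already `bytes`, something like the following should work. The name `gzip_bytes` is made up for this illustration and is not part of the original module.

```python
import gzip
import io


def gzip_bytes(data, compression_level=6):
    """Gzip a bytes object in memory, mirroring gzip_string above."""
    buffer = io.BytesIO()
    # GzipFile writes the gzip header, the deflate stream and the trailer
    # into the in-memory buffer when the context manager closes it.
    with gzip.GzipFile(fileobj=buffer, mode="wb",
                       compresslevel=compression_level) as gz_file:
        gz_file.write(data)
    return buffer.getvalue()


if __name__ == "__main__":
    body = b"Hello gzipped world!" * 10
    packed = gzip_bytes(body, 8)
    # Round-trip check: decompressing gives back the original body.
    assert gzip.decompress(packed) == body
```

On Python 3 the standard library also offers `gzip.compress(data, compresslevel)`, which does the same job in one call.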

8 Comments

  1. japherwocky wrote,

    I think that’s exactly the proper usage of StringIO. Not a hack at all. :)

  2. Evan wrote,

    Heh, yeah I guess you’re right. It just looks messy to me. :-P

  3. Jim wrote,

    Two bugs:

    > if(environ.get("HTTP_ACCEPT_ENCODING", "").find("gzip") < 0)

    This fails for cases like Accept-Encoding: identity, compress;q=0.5, gzip;q=0

    You need to actually split up the header value properly in order to handle it correctly, a simple search fails for corner cases.

    You also need to transmit a Vary header.

  4. Evan wrote,

    Jim, thank you for your input. I have updated the code accordingly.

  5. Sergey wrote,

    In the first for loop, you could do

    encoding, sep, qvalue = encoding.partition(';')
    qvalue = qvalue.partition('=')[2]
    qvalue = 1 if not qvalue else float(qvalue)
    encodings[encoding] = qvalue

    The advantage here is that the code is linear. The disadvantage is that Python 2.5 or later is required.
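
For reference, here is a sketch of how the `partition`-based loop could be folded into a complete parser, together with the comparison from `client_wants_gzip`. It is written for Python 3. The function name `parse_accept_encoding`, the lower-casing of encoding names and the skipping of empty items are assumptions added for this illustration, not something taken from the post.

```python
def parse_accept_encoding(header):
    """Return a dict of the form {'encoding-name': qvalue}."""
    encodings = {"identity": 1.0}
    for item in header.split(","):
        # "gzip;q=0.5" -> name "gzip", params "q=0.5"; no ";" -> params "".
        name, _, params = item.partition(";")
        name = name.strip().lower()
        if not name:
            continue  # skip empty items such as an empty header
        qvalue = params.partition("=")[2].strip()
        encodings[name] = float(qvalue) if qvalue else 1.0
    return encodings


def client_wants_gzip(header):
    """Same comparison as in the post: gzip (or *) must not rank below identity."""
    encodings = parse_accept_encoding(header)
    for name in ("gzip", "*"):
        if name in encodings:
            return encodings[name] >= encodings["identity"]
    return False
```

With the header from Jim's comment, `identity, compress;q=0.5, gzip;q=0`, this parser records `gzip` with a q-value of 0, so `client_wants_gzip` returns `False`, whereas a plain substring search for "gzip" would have matched.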

  6. gthomas wrote,

    def iterstream(stream, chunksize=65536):
        '''stream iterator'''
        read = True
        if not hasattr(stream, 'read'):
            if hasattr(stream, 'next') and not isinstance(stream, basestring):
                read = False
                for chunk in stream:
                    yield chunk
            else:
                from cStringIO import StringIO
                stream = StringIO(stream)
        while read:
            buf = stream.read(chunksize)
            if not buf: break
            yield buf
     
    import zlib, struct
    def _zip_stream(input, compress_level=6, zlib_format=True):
        '''compress chunked input with zlib'''
        #
        size = 0
        if zlib_format:
            crc = zlib.crc32('')
            _crc32 = zlib.crc32
            # magic header, compression method, no flags
            header = '\037\213\010\000'
            # timestamp
            header += struct.pack('<L', 0)
            # extra flags (2 = maximum compression), OS (255 = unknown)
            header += '\002\377'
            yield header
     
        compress = zlib.compressobj(compress_level, zlib.DEFLATED,
            -zlib.MAX_WBITS if zlib_format else zlib.MAX_WBITS, zlib.DEF_MEM_LEVEL, 0)
        _compress = compress.compress
     
        #print '$ input: %s (%s) %s' % (repr(input), type(input), dir(input))
        #
        for buf in iterstream(input):
            #print '$ %d' % len(buf)
            if len(buf) != 0:
                if zlib_format:
                    crc = _crc32(buf, crc)
                size += len(buf)
                yield _compress(buf)
     
        yield compress.flush()
        if zlib_format:
            yield struct.pack('<LL', crc & 0xFFFFFFFFL, size & 0xFFFFFFFFL)
     
    def gzip_stream(input, compress_level=6):
        return _zip_stream(input, compress_level=compress_level, zlib_format=True)
     
    def deflate_stream(input, compress_level=6):
        # NOTE: this produces RFC-conformant but some-browser-incompatible output.
        # The RFC says that you're supposed to output zlib-format data, but many
        # browsers expect raw deflate output. Luckily all those browsers support
        # gzip, also, so they won't even see deflate output.
        return _zip_stream(input, compress_level=compress_level, zlib_format=False)
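
The code in this comment builds the gzip header and trailer by hand on Python 2. As a hedged alternative sketch for Python 3, `zlib` can be asked to emit the gzip framing itself by passing `16 + zlib.MAX_WBITS` as the `wbits` argument. The generator name `gzip_chunks` is hypothetical, and the input is assumed to be an iterable of `bytes` chunks.

```python
import gzip
import zlib


def gzip_chunks(chunks, compress_level=6):
    """Yield gzip-format output for an iterable of bytes chunks."""
    # wbits = 16 + MAX_WBITS tells zlib to write the gzip header and the
    # CRC32/size trailer itself, so nothing has to be packed by hand.
    compressor = zlib.compressobj(compress_level, zlib.DEFLATED,
                                  16 + zlib.MAX_WBITS)
    for chunk in chunks:
        if chunk:
            out = compressor.compress(chunk)
            if out:  # the compressor may buffer input and return b""
                yield out
    # flush() emits any buffered data plus the gzip trailer.
    yield compressor.flush()


if __name__ == "__main__":
    parts = [b"a" * 100, b"", b"b" * 100]
    assert gzip.decompress(b"".join(gzip_chunks(parts))) == b"a" * 100 + b"b" * 100
```

Because the output is produced chunk by chunk, the total compressed length is not known up front, which is why a streaming response like this is normally sent without a Content-Length header.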

  7. gthomas wrote,

    Another phenomenon I’ve run into recently: you should either use chunked Transfer-Encoding, or Content-Length must be the compressed length, NOT the original!
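
One way to act on this, sketched for Python 3, is to hold back the call to `start_response` until the body has been compressed, and then send a Content-Length that matches the compressed bytes. The class name `GzipWithLength` is made up for this illustration. The sketch leaves out the Accept-Encoding check for brevity and assumes the wrapped app does not use the legacy `write()` callable.

```python
import gzip


class GzipWithLength:
    """Hypothetical variant of Gzipper that reports the compressed length."""

    def __init__(self, app, compresslevel=6):
        self.app = app
        self.compresslevel = compresslevel

    def __call__(self, environ, start_response):
        captured = {}

        def _capture(status, headers, exc_info=None):
            # Remember status and headers; they are sent after compression.
            captured["status"] = status
            captured["headers"] = list(headers)
            captured["exc_info"] = exc_info

        result = self.app(environ, _capture)
        try:
            body = b"".join(result)
        finally:
            if hasattr(result, "close"):
                result.close()
        compressed = gzip.compress(body, self.compresslevel)

        # Replace any original Content-Length with the compressed size.
        headers = [(name, value) for name, value in captured["headers"]
                   if name.lower() != "content-length"]
        headers.append(("Content-Encoding", "gzip"))
        headers.append(("Vary", "Accept-Encoding"))
        headers.append(("Content-Length", str(len(compressed))))
        start_response(captured["status"], headers, captured["exc_info"])
        return [compressed]
```

The trade-off is that the whole response is buffered in memory before anything is sent, which is the same behaviour as the `"".join(...)` in the original `Gzipper`.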

  8. more wrote,

    Powerful post. I’ve been checking this blog numerous times and I am impressed! Ridiculously helpful info, especially the last part that I liked so much. I will pay attention to your blog. Thank you and have a wonderful day.
