Simple Output Buffering in Python
Recently, I needed a quick and simple way to buffer the output of a few Python scripts. When I went looking for a module, I was astonished to find that one didn't exist, so I decided to change that. In this article, we'll discuss simple ways to buffer output using StringIO, and I'll introduce the output buffering module I wrote.
Why Buffer Output?
Good question. It's different for everybody. For me, I needed it so that I could allow plugins to be built that could seamlessly print output and have my plugin manager be able to capture and parse the data rather than immediately printing it to stdout. Other cases include situations where one would want to gzip all output or implement a "just-in-time" word filter.
Possible Solutions
The way that I most often see others implement output buffering is by replacing sys.stdout with an instance of a class that has a write method. This works in some cases, but what many forget is that stdout is a file object, and other code expects it to have the same methods a file does. Implementing only the write method will work in a few cases, but it can and will break code elsewhere, for example anything that calls sys.stdout.flush(). It's best to never even consider this as a solution. I mention it only so that people using it realize they're relying on a weak solution.
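To make the failure mode concrete, here is a minimal sketch, written for Python 3 (where print is a function). The WriteOnly class and the demo_write_only function are hypothetical names made up for this illustration; they are not taken from any library.

```python
import sys


class WriteOnly:
    """Hypothetical stand-in for stdout that implements only write()."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


def demo_write_only():
    """Return (captured_text, error_name) after using the write-only stand-in."""
    original = sys.stdout
    fake = WriteOnly()
    sys.stdout = fake
    error_name = None
    try:
        print("Hello, world!")   # works: a plain print() only needs write()
        sys.stdout.flush()       # fails: the stand-in has no flush() method
    except AttributeError as exc:
        error_name = type(exc).__name__
    finally:
        sys.stdout = original    # always put the real stdout back
    return "".join(fake.chunks), error_name


captured, error_name = demo_write_only()
```

The print call succeeds, but the very next file-style call raises AttributeError, which is the kind of breakage that tends to surface in someone else's code rather than your own.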
The quickest solution to get around this is to use an instance of the StringIO class as a replacement for sys.stdout, but even this has its limitations. For instance, automatic function callbacks and automatic stdout replacement are out of the question.
The best solution is to use something like my output buffering module. It provides a very robust OutputBuffer object that can suit most needs.
The Quick Solution (using StringIO)
Using StringIO may not be a very robust choice, but it gets the job done when you need to hack something together really quickly.
    import StringIO
    import sys

    old_stdout = sys.stdout            # Need to save it for later.
    sys.stdout = StringIO.StringIO()   # Replace stdout with a StringIO instance.

    print "Hello, world!"              # Will not print. Is instead saved to the buffer.

    data = sys.stdout.getvalue()       # Retrieve the buffered data.
    sys.stdout = old_stdout            # Set stdout back to the real one (stops buffering).
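There are a number of problems with doing output buffering this way. Most notably, you have to keep a reference to the original stdout object the entire time; lose it, and the rest of the script can no longer print to wherever stdout was pointing. (Python does keep the interpreter's startup stream in sys.__stdout__, but that is not necessarily the stream that was installed when you started buffering.) Yeah, that's a pretty big deal.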
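One way to lower the risk of losing the original stream is to wrap the swap in a context manager, so the restore happens even if the buffered code raises. The sketch below assumes Python 3, where StringIO lives in the io module; captured_stdout is a hypothetical helper name invented for this example.

```python
import sys
from contextlib import contextmanager
from io import StringIO


@contextmanager
def captured_stdout():
    """Hypothetical helper: swap sys.stdout for a StringIO, restore it on exit."""
    saved = sys.stdout           # the reference we must not lose
    buffer = StringIO()
    sys.stdout = buffer
    try:
        yield buffer
    finally:
        sys.stdout = saved       # runs even if the body raises


with captured_stdout() as buf:
    print("Hello, world!")       # goes into the buffer, not the terminal

data = buf.getvalue()
```

On Python 3.4 and later, the standard library's contextlib.redirect_stdout does much the same job. Either way, this only tidies up the StringIO approach; it does not add callbacks or nested buffers.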
The Robust Solution (using the ob module)
The following solution uses a module that I built myself. (Available here)
    import ob

    buf = ob.OutputBuffer()
    buf.start()              # Begin output buffering.

    print "Hello, world!"    # Will not print. Is instead saved to the buffer.

    buf.stop()               # Stop output buffering.
    data = buf.getvalue()    # Retrieve the buffered data.
Using the ob module offers a number of improvements over simply using StringIO. Most notably, you never have to manipulate sys.stdout directly; the module does everything for you. Additionally, if you lose the reference to your OutputBuffer object, not to worry: the ob module provides a way of getting the buffer object back:
    import ob

    buf = ob.get_top()   # Retrieve the top-most buffer.
At that point, you can again work with the OutputBuffer instance without ever having to directly touch sys.stdout or perform any tests. I'd suggest going to the output buffering module page and reading up on what else it can provide you.
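To illustrate the idea of a "top-most" buffer, here is a rough sketch of how a stack of buffers could be tracked. This is not the ob module's implementation, which this article doesn't show. SketchBuffer, _stack and this get_top are hypothetical stand-ins written for Python 3, and the sketch assumes buffers are stopped in the reverse order they were started.

```python
import sys
from io import StringIO

_stack = []  # hypothetical module-level stack of active buffers


class SketchBuffer:
    """Illustrative buffer that remembers the stream it replaced."""

    def __init__(self):
        self._io = StringIO()
        self._previous = None

    def start(self):
        self._previous = sys.stdout   # whatever was installed before us
        sys.stdout = self._io
        _stack.append(self)

    def stop(self):
        if self not in _stack:
            return                    # not active, nothing to restore
        sys.stdout = self._previous
        _stack.remove(self)

    def getvalue(self):
        return self._io.getvalue()


def get_top():
    """Return the most recently started active buffer, or None."""
    return _stack[-1] if _stack else None


outer = SketchBuffer()
outer.start()
print("Hello, world!")    # captured by the buffer
same = get_top()          # recover the buffer without holding a reference
same.stop()
data = same.getvalue()
```

Because each buffer records the stream it replaced, the caller never needs to touch sys.stdout, which is the convenience described above.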
Final Thoughts
Output buffering is a powerful tool. I'm incredibly surprised that Python didn't already provide a solution to it, but at the same time I'm a bit glad because it allowed me to get creative and build my own solution. If you're going to do output buffering, don't be lazy. At the very least use the StringIO solution, but if you want to do it right, then use the ob module or something equivalent.
