python for graphics

(December 2010)

Python is a good tool for prototyping computer graphics experiments. Some handy modules are the Python Imaging Library (PIL) for reading and writing image files, NumPy (the array package that SciPy builds on) for fast math on arrays, and PyCairo for rasterizing 2D shapes.

The following is a cookbook for some common things you might want to do.

Read an image into a NumPy array:

from PIL import Image # very old PIL installs use a bare "import Image"
import numpy
a = numpy.asarray( Image.open(fn) )
print(a.shape) # (300,400,3) means 400x300 with RGB channels
print(a.dtype) # 'uint8'

Convert from RGB to grayscale by taking just the first (red) channel, which is crude but cheap:

a = a[:, :, 0]

Convert the array to a wider datatype if you need to perform math that would overflow an unsigned 8-bit integer:

b = a.astype(int) # numpy.int was only an alias for the builtin int
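
To see why the wider type matters, here is a small sketch with made-up pixel values (the names `wrapped` and `wide` are only for illustration). uint8 arithmetic wraps around modulo 256 instead of raising an error:

```python
import numpy

a = numpy.array([200, 250], dtype=numpy.uint8)  # made-up pixel values

wrapped = a + a                   # stays uint8, so 400 wraps to 400 - 256
wide = a.astype(numpy.int32) * 2  # widened first, so nothing wraps

print(wrapped)  # [144 244]
print(wide)     # [400 500]
```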

Data-parallel operations with NumPy are much more efficient than loops in pure Python. Convert RGB to grayscale with weights:

rgb = rgb.astype(int) # widen first: 255*7067 does not fit in a uint8
r = rgb[:, :, 0] # slices are views, not copies, so they cost little memory
g = rgb[:, :, 1]
b = rgb[:, :, 2]

# numpy makes this fast:
gray = (r*2220 + g*7067 + b*713) // 10000 # result is a 2D array
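
As a rough check that the array expression matches what a per-pixel loop would compute, the sketch below runs both on a tiny random image. The helper name `gray_loop` and the 4x5 test image are assumptions made for illustration; the weights are the ones used above.

```python
import numpy

def gray_loop(rgb):
    """Weighted grayscale, one pixel at a time (slow, for comparison only)."""
    height, width, _ = rgb.shape
    out = numpy.zeros((height, width), dtype=int)
    for y in range(height):
        for x in range(width):
            r, g, b = (int(v) for v in rgb[y, x])
            out[y, x] = (r*2220 + g*7067 + b*713) // 10000
    return out

rng = numpy.random.RandomState(0)
rgb = rng.randint(0, 256, size=(4, 5, 3)).astype(numpy.uint8)

wide = rgb.astype(int)  # widen before multiplying
gray = (wide[:, :, 0]*2220 + wide[:, :, 1]*7067 + wide[:, :, 2]*713) // 10000

print(numpy.array_equal(gray, gray_loop(rgb)))  # True
```

On real image sizes the loop version is the one that becomes painfully slow, which is the reason to prefer the whole-array form.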

Or sum along an axis and divide by the number of channels:

gray = numpy.sum(rgb.astype(int), axis=2) // 3

Terse (the result is a float array):

gray = rgb.mean(axis=2)

Find the max per-channel values across the image:

maxs = numpy.max(numpy.max(rgb, axis=0), axis=0)
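
Reasonably recent NumPy versions also accept a tuple of axes, which collapses both image dimensions in one call. A small sketch on a made-up 2x2 image, with the nested form kept for comparison:

```python
import numpy

# made-up 2x2 RGB image
rgb = numpy.array([[[10, 200, 30], [40, 50, 60]],
                   [[70, 80, 250], [100, 20, 5]]], dtype=numpy.uint8)

maxs = rgb.max(axis=(0, 1))  # reduce over rows and columns together
nested = numpy.max(numpy.max(rgb, axis=0), axis=0)

print(maxs)  # [100 200 250]
```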

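Bilinear resample to a quarter of the image area (half the width and half the height) by averaging each 2x2 block, using strided slices and addition of entire arrays at a time:

# b must be a wide (non-uint8) array with even height and width
# halving horizontal
h1 = b[:, 0::2, :] # even columns
h2 = b[:, 1::2, :] # odd columns
b = h1 + h2

# halving vertical
v1 = b[0::2, :, :] # even rows
v2 = b[1::2, :, :] # odd rows
b = v1 + v2

# back to uint8
a = (b // 4).astype(numpy.uint8)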
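
The same steps can be wrapped in a function. This is a sketch, not a general resampler: the name `halve` is made up, and cropping a trailing odd row or column is just one possible way to handle odd sizes.

```python
import numpy

def halve(img):
    """Average each 2x2 block of an (H, W, C) uint8 image."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    b = img[:h, :w, :].astype(int)     # crop to even size, widen
    b = b[:, 0::2, :] + b[:, 1::2, :]  # sum column pairs
    b = b[0::2, :, :] + b[1::2, :, :]  # sum row pairs
    return (b // 4).astype(numpy.uint8)

# made-up 3x4 single-channel image; the third row is cropped away
img = numpy.zeros((3, 4, 1), dtype=numpy.uint8)
img[0, :, 0] = [0, 4, 8, 12]
img[1, :, 0] = [4, 8, 12, 16]

small = halve(img)
print(small.shape)     # (1, 2, 1)
print(small[0, :, 0])  # [ 4 12]
```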

Writing a NumPy array to an image file:

Image.fromarray(a).save(fn)

Allocate an empty array to use as a framebuffer:

gray = numpy.zeros((height, width), dtype=numpy.uint8)
rgb = numpy.zeros((height, width, num_channels), dtype=numpy.uint8)
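
Pixels can then be written with slice assignment. A small sketch with a made-up size and colors; note that the row (y) index comes first:

```python
import numpy

height, width = 4, 6  # made-up framebuffer size
rgb = numpy.zeros((height, width, 3), dtype=numpy.uint8)

rgb[1:3, 2:5] = (255, 0, 0)  # red rectangle: rows 1-2, columns 2-4
rgb[0, 0] = (0, 255, 0)      # a single green pixel at x=0, y=0

print(rgb[:, :, 0].sum() // 255)  # 6 red pixels
```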

Create an image using Cairo and draw to it:

import cairo

img = cairo.ImageSurface(cairo.FORMAT_ARGB32, WIDTH, HEIGHT)
ctx = cairo.Context(img)

# background fill
ctx.set_source_rgb(1,1,1)
ctx.rectangle(0,0,WIDTH,HEIGHT)
ctx.fill()

# draw lines
ctx.set_source_rgb(0,0,0)
ctx.set_line_width(4)
ctx.move_to(50,50)
ctx.line_to(200,50)
ctx.line_to(100,100)
ctx.stroke()

Write a Cairo image to disk:

img.write_to_png(fn)

Convert a Cairo image into a NumPy array:

a = numpy.frombuffer(img.get_data(), numpy.uint8)
a.shape = (HEIGHT, WIDTH, 4)
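
One thing to watch for: cairo stores FORMAT_ARGB32 pixels as native-endian 32-bit words, so on a little-endian machine the bytes of the array above come out in B, G, R, A order, with color premultiplied by alpha. The sketch below reorders the channels with a slice. It uses a hand-made array in place of a real surface so that it runs without cairo, and it assumes a little-endian machine and an opaque image.

```python
import numpy

# stand-in for the (HEIGHT, WIDTH, 4) array: one opaque orange pixel
# with R=255, G=128, B=0, stored in B, G, R, A byte order
a = numpy.array([[[0, 128, 255, 255]]], dtype=numpy.uint8)

rgb = a[:, :, 2::-1]  # channels 2, 1, 0 -> R, G, B (a view, no copy)
alpha = a[:, :, 3]

print(rgb[0, 0])  # [255 128   0]
```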

And back again:

height, width, num_channels = a.shape
img = cairo.ImageSurface.create_for_data(a, cairo.FORMAT_ARGB32,
  width, height, width*num_channels)

Calculate the absolute difference of two images, as an image:

a = a.astype(int) # we don't want to subtract uints
b = b.astype(int)
diff = abs(a - b) # same shape as a and b
diff = diff.astype(numpy.uint8)
assert diff.shape == a.shape == b.shape

Calculate the root-mean-square difference (the Euclidean distance divided by the square root of the pixel count) as a scalar:

a = a.astype(int) # we don't want to subtract uints
b = b.astype(int)
diff = a - b # still a 2D array
dist = numpy.sqrt( (diff * diff).sum() / float(WIDTH*HEIGHT) )
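
A worked example with made-up 2x2 images: the squared differences sum to 9 + 16 = 25, the mean over 4 pixels is 6.25, and the square root is 2.5. Using `diff.size` in place of `WIDTH*HEIGHT` is a substitution made here so the sketch does not need the dimensions separately.

```python
import numpy

a = numpy.array([[3, 4], [0, 0]], dtype=numpy.uint8)
b = numpy.zeros((2, 2), dtype=numpy.uint8)

diff = a.astype(int) - b.astype(int)  # widen before subtracting
dist = numpy.sqrt((diff * diff).sum() / float(diff.size))

print(dist)  # 2.5
```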

Normalize brightness:

# floats in the range [0.0, 1.0]; assumes a.max() > a.min()
f = (a - a.min()) / float(a.max() - a.min())

# and back to bytes
a = (f * 255).astype(numpy.uint8)
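
If the image is a single flat color, `a.max() - a.min()` is zero and the division produces NaNs instead of a usable image. A sketch of one way to guard against that; the function name `normalize` and the choice to return all zeros for a flat image are assumptions:

```python
import numpy

def normalize(a):
    """Stretch values to 0..255; a flat image maps to all zeros."""
    lo, hi = int(a.min()), int(a.max())
    if hi == lo:
        return numpy.zeros(a.shape, dtype=numpy.uint8)
    f = (a.astype(float) - lo) / (hi - lo)  # floats in [0.0, 1.0]
    return (f * 255).astype(numpy.uint8)    # truncates, as above

a = numpy.array([[10, 20], [20, 30]], dtype=numpy.uint8)
print(normalize(a))  # values 0, 127, 127, 255
```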