The Haar Transform

Saturday, June 24, 2006

9:52 AM

In my posts "15 Exercises to Know a Programming Language," I suggested some exercises investigating the Haar wavelet, the simplest wavelet. I intended to leave it as is so that people could discover this fascinating subject on their own, but a couple of questions I’ve received have made it impossible for me to hold my tongue.

Okay, to review: The Haar transform works by transforming an array of values into an array of average and differences-from-the-average. Thus, [8, 4] becomes [6, 2], since (8 + 4) / 2 = 6 and 8 – (8 + 4) / 2 = 2. Similarly, [8, 4, 3, 9] becomes, first, [6, 6, 2, -3] (1st and 3rd positions as per previous, 2nd and 4th same transform applied to 3 and 9).

What happens when you apply this transform to a real-world signal? For instance, a photo. The left of Figure 1 shows "Lena" in lovely 256×256 grayscale. The right of the figure is a visualization of the very first step of the Haar transform. Cool, huh?

>

The upper-left-most 64×64 block is now a quarter-resolution image of the original. That makes sense, since it’s the average of the previously-created average. Now consider the 64×64 blocks as we move across (or down). First, we have the average of the edge-map: a low-resolution version of the 2-pixel scale edge map. The third block is edge-map of the average and the fourth block is an edge-map of the edge-map: hey! We’re gathering data at different resolutions!

Okay, let’s quickly revisit the fact that by scaling and clipping we’re losing data, but if we just view this is a visualization of an underlying 256 x 256 array, notice how the "significant" data seems to be moving towards the edges: the upper-left (low-resolution version of the whole) is still recognizable, and in the lower-right (the both-directions edge-map of the edge-map) the data is becoming more chaotic: more "static-y." But in the middle we’re creating a large area with large swaths of gray (I’m even stretching the grayscale so that there’s at least one 0 and one 255 pixel). In other words, large areas where the real data are tending towards 0. If we fully recurse, what do we have?

The upper-left pixel is now the average grayscale value of the entire picture: the ultimate low-resolution version. The rest of the data can’t be parsed by our eyes, but is easily compressed. Now, if you’re really building a wavelet compressor, you don’t scale and clip (as we did for this), but you clip when detail coefficients go below some epsilon. By experimenting with epsilon (making it dynamic?) you can get some very, very impressive compression results.

Some other interesting things to do with wavelets are scale the detail coefficients up and reconstruct the signal: this amplifies low-contrast aspects of the original. Implications of that are left as an exercise to those interested in steganography and forensics.

Wavelets can also be used to aide pattern recognition in signals, since minor variations "move" towards the detail coefficients.

Oh, and you should try wavelet processing on sound. Equally fascinating stuff.

Created with Microsoft Office OneNote 2007 (Beta)

One place for all your information

Download: The Haar Transform.one