The Haar Transform

The Haar Transform

Saturday, June 24, 2006

9:52 AM

In my posts "15 Exercises to Know a Programming Language," I suggested some exercises investigating the Haar wavelet, the simplest wavelet. I intended to leave it as is so that people could discover this fascinating subject on their own, but a couple of questions I’ve received have made it impossible for me to hold my tongue.


Okay, to review: The Haar transform works by transforming an array of values into an array of average and differences-from-the-average. Thus, [8, 4] becomes [6, 2], since (8 + 4) / 2 = 6 and 8 – (8 + 4) / 2 = 2. Similarly, [8, 4, 3, 9] becomes, first, [6, 6, 2, -3] (1st and 3rd positions as per previous, 2nd and 4th same transform applied to 3 and 9).


What happens when you apply this transform to a real-world signal? For instance, a photo. The left of Figure 1 shows "Lena" in lovely 256×256 grayscale. The right of the figure is a visualization of the very first step of the Haar transform. Cool, huh?


The upper-left-most 64×64 block is now a quarter-resolution image of the original. That makes sense, since it’s the average of the previously-created average. Now consider the 64×64 blocks as we move across (or down). First, we have the average of the edge-map: a low-resolution version of the 2-pixel scale edge map. The third block is edge-map of the average and the fourth block is an edge-map of the edge-map: hey! We’re gathering data at different resolutions!


Okay, let’s quickly revisit the fact that by scaling and clipping we’re losing data, but if we just view this is a visualization of an underlying 256 x 256 array, notice how the "significant" data seems to be moving towards the edges: the upper-left (low-resolution version of the whole) is still recognizable, and in the lower-right (the both-directions edge-map of the edge-map) the data is becoming more chaotic: more "static-y." But in the middle we’re creating a large area with large swaths of gray (I’m even stretching the grayscale so that there’s at least one 0 and one 255 pixel). In other words, large areas where the real data are tending towards 0. If we fully recurse, what do we have?

The upper-left pixel is now the average grayscale value of the entire picture: the ultimate low-resolution version. The rest of the data can’t be parsed by our eyes, but is easily compressed. Now, if you’re really building a wavelet compressor, you don’t scale and clip (as we did for this), but you clip when detail coefficients go below some epsilon. By experimenting with epsilon (making it dynamic?) you can get some very, very impressive compression results.


Some other interesting things to do with wavelets are scale the detail coefficients up and reconstruct the signal: this amplifies low-contrast aspects of the original. Implications of that are left as an exercise to those interested in steganography and forensics.


Wavelets can also be used to aide pattern recognition in signals, since minor variations "move" towards the detail coefficients.


Oh, and you should try wavelet processing on sound. Equally fascinating stuff.


Created with Microsoft Office OneNote 2007 (Beta)
One place for all your information

Download: The Haar

This is a review, tagged with hReview microformat

Reviewing the hReview microformat

Jun 23, 2006 by Larry O’Brien hReview

???&#x2606&#x2606 Reviews are one of the most valuable pieces of information one could hope for, guiding purchases as they do, and one of the types of information where the Web is, I would argue, noticeably inferior to print (an exception would be the work of Websites like that essentially apply print-style rigor but happen to disseminate their work via the Web). hReview is a microformat for reviews. Interestingly, hReview is implemented as a series of CSS “class” attributes applied to standard HTML <span>s. For instance, if you were to look at the source of this post, you should see the “stars” above are encapsulated in an element <ABBR class=”rating” title=”3″ worst=”0″ best=”5″>.

It appears that this CSS “class” style for encoding is common, at least if my perusal of is any indication. Although this is in contrast with RSS, the most successful microformat to date, it seems to me a good design trade-off: instead of creating a world of side-by-side files (as we have with RSS and FOAF) you end up with, at worst, “just” a perfectly readable bit of HTML. On the other hand, spidering is harder (to discover all the hReviews on a site, you have to spider all the HTML on the site, rather than just retrieving the, say, .hreview files). On the other, other hand, perhaps spidering is something best left to others (could you not use the Alexa index?).

I give the hReview format 3 stars simply because I have nothing to compare it to. I will be interested in trying to find this review via Technorati microformat search.

WS-* vs. REST: Jigsaw vs. Tangram puzzle

Harry Pierson quotes Nick Gall on the value of using a small set common modular operations (i.e. the REST / WS-Transfer approach):

Modularity can be open or closed. Closed modularity is like a jigsaw puzzle. There are lots of individual pieces, but they can only be put together one way. Open modularity is like a tangram puzzle. There are only seven pieces, but they can be put together in hundreds of different combinations.

Novell cans CEO, CFO

Novell today terminated CEO Jack Messman and CFO Joseph Tibbetts. Messman is replaced immediately by Ronald Hovsepian, Dana Russell is interim CFO.

Novell has long been the hardest-to-parse of the major OS vendors. To say they “owned” the network market in the early 90s is to understate things. Their fumbles have been extraordinary. Lately, I’ve liked their plan to become a major Linux vendor, but execution is everything, especially in the tricky world of making money by being a corporate supporter of OSS.

Solstice Dive: I Love Living in Hawai’i!

Yesterday, Tina and I went for a dive off Honokohau Harbor just north of Kailua Kona on the Big Island of Hawai’i, a 10-minute drive from our house. Tina and I mostly freedive; this was the first time we had tanks on our backs in five months. Two eagle rays cruised the margin at 70′, where the coral slope changed to sand. We went out onto the sand flats to watch a flashing school of scads and stalk a plain of garden eels. Coming back, we saw our friends the eagle rays again, this time at a cleaning station, where we could watch them easily.

Then, in unusually shallow water I saw a school of Heller’s barracuda coming towards me. The usually shy species swirled around me. Wow! I thought, not realizing that I was missing the main show. Suddenly, further away, I saw a giant trevally thrashing a barracuda like a terrier with a rat. It was big: probably a 50# fish. I hooted in my regulator and pointed so Tina would see it.

When we surfaced, Tina told me that the reason the barracuda had swirled around me was because the trevally was herding them, cutting back and forth at lightning speed, only a few feet away from me. She had seen the whole thing.

Freediving’s nice, but we’ve decided to go back for another dive at that spot this weekend!

Rise & Fall of CORBA: Very Good Article

Michi Henning’s article on the history of CORBA in the latest Queue is very good indeed. Henning has biases (he’s got his own middleware solution) but the article never strays too far. It’s opinionated, but accurate. Some of the timeline discussion is a little overblown but not too much (CORBA’s popularity didn’t have quite the triumphant ballooning and deflating that he describes; there were critics and hesitation all along). Very worthwhile read for those interested in Web Services.

Jolt Award: Considering Dynamic Categorization via Tagging

The Jolt Awards are the major industry award for software development tools (compilers, libraries, etc.). One problem we face every year is proper classification of tools. Traditionally, we try to refine / fine tune the previous year’s categories (Development Environments; Libraries, Frameworks, and Components; etc.). Problems arise frequently balancing the number of products in a category (20 entries in one category, 3 in another), when clearly competitive products end up in different categories (happens all the time with categories “Web Development Tools” versus “Development Environments”), and when a product cuts across categories.

Brainstorming yesterday, we wondered if it would not be better to generate the categories dynamically. One idea was to use checkboxes for predefined activities (“defect tracking,” “code generation,” “GIS mapping”) and use some form of entropy measure to divvy them up into our 12-or-so categories. Easy enough mathematically. Another, more dramatic idea, was to create a tagging system for software tools and see if we could come up with a more dynamic view. The main challenge we see is that it seems like a small world: there are only a few hundred tool releases every year and it’s difficult to imagine many people becoming engaged in the task of tagging them.

Do the Web 2.0 dynamics of distributed collaboration apply to small numbers? A for software development tools?

Vista Voice Recognition: Very Nice

Vista has built-in voice recognition capabilities. One of the things that really jumped out in the Tablet PC was that the correction interface makes all the difference when it comes to using alternative input techniques: a service pack released for the Tablet a year or so after the initial launch was a landmark in the usability of handwriting for text entry. I’ve just begun using voice recognition in Vista and am very impressed with the correction interface. It may have reached the tipping point for usability (at least with a sound-cancelling headset).