Archive for September 2007

Writing A Technical Article, Part 2: Gathering Tools

Probably the biggest difference between academic and commercial writing is this: academic writing is almost always concerned with algorithms and process, commercial writing is almost always driven by a specific set of technologies. In this case, I’m being paid by AMD to explain integration of CodeAnalyst and Eclipse CDT. So that dictates that I’ll be using C++, which leads to a couple questions:

  • Linux or Windows?
  • What C++?

The Linux or Windows question is fairly easy for me to answer: I boot into Windows and run Linux in Virtual Machines. For an article on a profiler, the VM is a complication — CodeAnalyst works within a VM, of course, but since I expect this article to get down to the hardware, I don’t want any confusion about how toasty-warm my code is making my Opterons. So I’ll be experiencing the code in Windows.

But when writing about Eclipse, surely I should imagine that my target audience is not strictly Windows-based. So it seems logical to set up a compiler-chain based on GNU C++: I know that will be portable. (Note: I’ll probably cross-compile with at least Microsoft C++ and compare the experience. That might give me enough information for, if not an article, at least a column.)

Since the article is supposed to emphasize multicore/multichip development and in C++, it’s a no-brainer that I’ll be using OpenMP.

My next thought is unit-testing. For one thing, I just don’t develop anything without unit-testing anymore. For another, repeatability is critical to the article. For a third, I expect to hit some hard problems along the way and unit-testing is the best way to maintain momentum in the face of brain-twisters. I have not compared C++ unit-testing frameworks, but Boost has a test component, and while there may well be better, I know that Boost won’t be poor.

Version control is not the same issue in an article that it is in professional development, but you can never over-estimate the benefit of a backup: I’ll be using Subversion against a local server.

Since I know that I’ll be doing image-processing, I’d just as well use a library or two to handle file-loading and basic pixel manipulation. If I were building an application, I’d definitely tend towards GTK+. However, I want to keep things as simple as possible around the algorithm(s) that I’ll be developing, so I’m going to see how far I can get with this project, CImg, which seems nice and minimal.

So my initial tool list is:

  • CodeAnalyst : Profiler
  • Eclipse CDT : IDE
  • GCC : Compiler-chain
  • OpenMP : Shared-memory parallel programming
  • Boost.Test : Unit-testing
  • Subversion : VCS
  • CImg : Image-processing library

If I need or change libraries, I’ll make mention of it in forthcoming posts.

Next: Part 3, Research Begins…

I .NET Rock!

I’m on .NET Rocks this week talking about concurrency — a sure sign that Carl and Richard have run out of interesting people to talk to. Nonetheless, I try to not make too many mistakes when explaining Shared Memory Parallel, Software Transactional Memory, Message Passing, etc.

Good Post on the Byzantine Generals Problem

If you are interested in concurrent programming, you’ll eventually come across the Byzantine Generals problem. Mark Nelson has a good post explaining it.

Writing A Technical Article, Pt. 1: Background

I thought I’d try blogging the development of a technical article; it might prove interesting to, I dunno’, 3 people in the world.

I’ve been contracted to write a 2,500-words-plus-listings article on using AMD’s CodeAnalyst profiler with Eclipse, especially relating to multithreaded / multicore development. So that’s a pretty beefy article, about twice the length of a typical technical article in a print magazine. CodeAnalyst works with native code, so my sample program(s) pretty much have to be written in C++ — anything else would limit the audience.

AMD targets an experienced audience, so the educational goals for the article are “The reader will be able to…”

  • … enable and gather data using CodeAnalyst from within Eclipse CDT
  • … navigate within the main profiles provided
  • … recognize the profile most clearly associated with their performance limits
  • … determine the source-code lines associated with a performance issue
  • … after making code changes, compare the performance of subsequent runs”

Which is a heck of a lot of ground to cover. So the end result will be something like an advanced tutorial that you might get towards the end of a piece of documentation.

My professional goals are to deliver a technically sound article that satisfies my client’s goals within a profitable range of hours. I can control that primarily by manipulating my sample program(s); the amount of time I need to research the actual use of Eclipse or CodeAnalyst is minimal, since I know both products. The research aspect of the article — always the most time-consuming in a technical article — will be the development of sample program(s) that:

  • have a performance-limit that can be ameliorated,
  • are comprehensible, and
  • can be distributed (this limits my use of libraries)

My personal goals are to make sample programs that are interesting in and of themselves. This is a weakness on my part, but what can you do? In this particular case, I hope to implement some significant part of the algorithms used for content-aware image resizing. I have no intention, or interest, in recreating the GUI shown in the video. I have neither the time nor, quite honestly, the mindset — once I see the algorithm work, I can guarantee you that my interest in this will virtually disappear. That’s one reason why I’m not rich.

Next up: Part 2, Gathering Tools

I’ve Issued a DMCA Takedown of the NBA

Following the lead of the SFWA, which issued a takedown to a fan site based on nothing but the existence of the strings “Asimov” or “Silverberg” in a text file, I’ve demanded that NBA.com remove all pages that refer to “Larry O’Brien.” Those bastards have been distributing my source code long enough.

Seam Generation Using A*

The key to content-aware image resizing (and other recent image algorithms) is the discovery of “seams” — paths of pixels that don’t differ much from each other. Such seams are   places where the picture can be “pulled apart” gracefully. In image resizing, the seam is discarded and the two remaining pieces glued together — when done time and time again, the result is the astonishing resizing behavior.

imageIn the image in this post, a blown-up anti-aliased letter “R”, an obvious seam exists straight down the left side. A not-quite-as-good seam might be along the right side, but cutting diagonally back across the tail of the “R.”

Other people have used seams to automate collaging and masking. All very interesting, but the point of this post is to talk very briefly about the algorithms used to generate the seams.

Optimal seam generation is NP-hard. So it boils down to what works efficiently. Graph-cuts are the preferred method, but I just can’t get out of my head that the simple path-finding method called A* might be effective.

The thing that is nagging at me is that A* uses a cost approximation to figure out the next step forward; it seems to me that when manipulating many seams, using approximations that might last for dozens of seams would have a big benefit. The other obvious thing about A* is that the approximations are easy to build: recursively develop lower-and-lower resolution images.

For instance, when the above image is converted into brightness values, you can average neighbors to create the values shown in yellow, and average those to create the values in red. (Did you think “Haar wavelet“?)

image

Initially, this looks promising (the left-most yellow column is a clear winner), but at coarse levels, it’s deceptive (For the red, lowest-resolution “image,” you would clearly be drawn towards the right side for initial processing.

Intuitively, you would think that most images would have several side-by-side seams (i.e., that would show up at  coarser levels). Also, after seam removal, you would be able to “mark dirty” the approximations (i.e., the coarser calculations) and trigger recalculation as the amount of removed seams became significant (and, obviously, not recalculate areas through which the seam did not pass).

image image

This is kind of killing me, because I don’t have time to explore an algorithm that I only suspect will be efficient.

Grrr….