Shapewriter On The Way To Productization

Shapewriter, formerly known as SHARK, is absolutely the fastest text input method short of typing (mmm… possibly voice input in certain domains). For years Shapewriter has languished at IBM’s Alphaworks site. I can’t wait for them to ship for the Tablet PC, although right now the downloads link at the site is inactive.

See Doubleplus

“AT&T Invents Programming Language for Mass Surveillance” is Wired’s absurdly bad headline describing how Hancock, “a C-based domain-specific language designed to make it easy to read, write, and maintain programs that manipulate large amounts of relatively uniform data,” was used by AT&T to aid the NSA (allegedly, but c’mon).

In fact, Hancock looks like a good language for Wide Finder, Tim Bray’s logfile analysis program that exemplifies the culture of hashtables and regular expressions, and which is bouncing around the blogosphere as a benchmark.
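For anyone who hasn’t seen it, here is a minimal sketch of the regex+hashtable style that Wide Finder exemplifies. It is not Bray’s code. The URL pattern, the sample log lines, and the name `top_articles` are assumptions made up for illustration.

```python
import re
from collections import Counter

# Assumed pattern: article fetches shaped like
# "GET /ongoing/When/200x/2007/09/20/Some-Title ".
# Excluding '.' from the last path segment skips images and stylesheets.
ARTICLE = re.compile(r"GET /ongoing/When/\d{3}x/(\d{4}/\d{2}/\d{2}/[^ .]+) ")


def top_articles(lines, n=10):
    """Count article fetches and return the n most-requested paths."""
    counts = Counter()  # the "hashtable" half of the idiom
    for line in lines:
        match = ARTICLE.search(line)  # the "regex" half
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(n)


# Hypothetical log lines, invented for this example.
SAMPLE_LOG = [
    '1.2.3.4 - - [01/Oct/2007:00:00:01 -0700] "GET /ongoing/When/200x/2007/09/20/Wide-Finder HTTP/1.1" 200 1234',
    '1.2.3.5 - - [01/Oct/2007:00:00:02 -0700] "GET /ongoing/When/200x/2007/09/20/Wide-Finder HTTP/1.1" 200 1234',
    '1.2.3.6 - - [01/Oct/2007:00:00:03 -0700] "GET /ongoing/When/200x/2007/09/21/Other HTTP/1.1" 200 987',
    '1.2.3.7 - - [01/Oct/2007:00:00:04 -0700] "GET /ongoing/ongoing.css HTTP/1.1" 200 99',
    '1.2.3.8 - - [01/Oct/2007:00:00:05 -0700] "GET /ongoing/When/200x/2007/09/20/pic.png HTTP/1.1" 200 4096',
]

if __name__ == "__main__":
    for path, hits in top_articles(SAMPLE_LOG, 2):
        print(hits, path)
```

The whole program is one pattern and one table, which is exactly why the style is so attractive for small jobs.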

As a benchmark, Wide Finder is as problematic as previous small benchmarks like Fib and Tak. I could write a lot more about that, but instead I wanted to highlight Bray’s observation that there is an entire culture of programming relating to regex and hashtables. This culture is the remnant of the antediluvian Little Languages culture which, in turn, traced its heritage to the cyclopean Ye Olde Compiler Scribes.

Today, the regex+hashtable crowd is basking in the limelight. Although I’m wholly in favor of promoting language-like techniques, one ought not to believe an emerging meme, which is that “you need a dynamic language to write a language-like program.” As a matter of fact, it doesn’t take long for a regex+hashtable program to grow to the point where a custom parser (generated by a tool, of course, from a regex-like grammar) becomes more efficient. Once you’ve gotten to that point, the utility of dynamism is greatly lessened and you’re well into a situation where many people might prefer the clarity of explicit types (when walking a tree of many thousand nodes, it can be very helpful to know exactly what type of node is at hand).
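To make the point about node types concrete, here is a rough sketch of a tree walk over explicitly typed nodes. The node classes `Num`, `Add`, and `Mul` and the `evaluate` function are hypothetical, not taken from any real grammar or tool. Python only enforces these annotations through an external type checker, so treat this as an illustration of the shape of the code; a statically typed language would check the same thing at compile time.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# Every node in the tree is exactly one of these three types.
Expr = Union[Num, Add, Mul]


def evaluate(node: Expr) -> float:
    """Walk the tree, dispatching on the concrete node type."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Add):
        return evaluate(node.left) + evaluate(node.right)
    if isinstance(node, Mul):
        return evaluate(node.left) * evaluate(node.right)
    # Anything else is a bug in whatever built the tree.
    raise TypeError(f"unexpected node type: {type(node).__name__}")


if __name__ == "__main__":
    tree = Add(Num(1), Mul(Num(2), Num(3)))  # represents 1 + 2 * 3
    print(evaluate(tree))
```

Each branch of the walk knows which fields it can touch, and an unexpected node fails loudly instead of silently falling through.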

Tools like ANTLR make language-recognizing programs easier to write than ever. And, as I wrote recently, writing a unit-tested compiler is the most fun you can have programming.