OOPSLA Day 2: Explicit Use-Case Representation in Programming Languages

One of the emerging themes at this conference is the need to move “examples” (and their older siblings, scenarios and use-cases) “into the code,” so that examples/stories/scenarios/use-cases, which are tremendously meaningful to the subject-matter experts, are actually traceable directly into the code, which is tremendously meaningful to, you know, the machine.

I very much enjoyed a talk on “Use-case Representation in Programming Languages,” which described a system called UseCasePy that added a @usecase annotation to Python methods. So you would have:
[sourcecode][/sourcecode]
@usecase(DrawARectangle,DrawALine)
def drawLine(ptA, ptB) … etc …

Now, even if you go no further, you’re doing better than something in a documentation comment, since you can easily write a tool that iterates over all source-code, queries the metadata and builds a database of what classes and methods participate in every use-case: very useful.

Even better, if you have a runtime with a decent interception hook, you can run the program in a particular use-case (perhaps from your BDD test suite, perhaps from an interactive tool, acquire the set of methods involved, and determine, by exercising a large suite of use-cases, metrics that relate the codes “popularity” to user-meaningful use-cases, which could be very helpful in, for instance, prioritizing bug-fixes.

Oh, by the way, apparently we no longer call them “users” or even “domain experts,” they are now “Subject Matter Experts” or even SMEs (“Smees”).

OOPSLA Day 2: David Ungar — Everything You Know (About Parallel Programming) Is Wrong

I should hope so.

This was the afternoon’s first major talk. David Ungar from IBM Research first demonstrated that the tragedy of Romeo & Juliet comes from a race condition (if only he had waited for news from the Friar).

That was excellent, but the real premise of his talk was that there is a fundamental tension between correctness and synchronization in manycore and the scalable solution (he asserts) is to eliminate synchronization. He proposed a few names for this type of programming model : anti-lock or “race and repair.”

This reminds me (and at least one questioner in the audience) of the application- or web-level concept of “eventually consistent.”

The bulk of the talk was a discussion of his experiments with a programming problem (a slightly-more-complicated version of hash table insert) with various techniques that trade off correctness with performance. What he showed (at least in this one experiment) was that he could get better performance from a “race and repair” technique than he could get from compare-and-swap.

He tentatively proposes that probabilistic data structures and algorithms, in which correctness is a spectrum that can be traded with performance, is a new field of study.

OOPSLA Day 2: More on Dart

I think when people saw that Dart was from Gilad Bracha and Lars Bak there was an expectation that Dart was going to be a grand synthesis: a blazingly-fast NewSpeak-with-curly-brackets. It’s very much not such a language. It doesn’t seem, academically, vastly innovative because it doesn’t add much. But, in truth, optional types are a radical design decision in that they take away runtime aspects that a lot of mainstream programmers expect. (Of course, this raises the question of how to define the “mainstream”…)

Pros and Cons of Mandatory Typing In Descending Order of Importance (per Gilad Bracha):

Pros:

  • machine-checkable documentation
  • types provide conceptual framework
  • early error detection
  • performance advantages

Cons:

  • expressiveness curtailed
  • imposes workflow
  • brittleness

Having said that, I attended a lecture in which someone, perhaps from Adobe, measured the performance impact of optional typing. Their conclusion, although admittedly done on the troublesome-ly small and artificial SunSpider benchmarks, was that the performance penalty of implicit types amounts to 40% (with a very large standard of deviation). That “feels” about right to me — definitely significant but not the overwhelming performance benefit you might get from either parallelization or an algorithmic change.

Knowing Scala: Exercise 1

Taking my own 15 Exercises to Know A Programming Language as a starting point for my exploration of Scala…

The first set of exercises are right up the alley of a language with pattern-matching and list-processing:

Write a program that takes as its first argument one of the words ‘sum,’ ‘product,’ ‘mean,’ or ‘sqrt’ and for further arguments a series of numbers. The program applies the appropriate function to the series.

The first thing I did was write the functions:

[sourcecode lang=”scala”]
val args = List(1, 2, 3, 4, 5)
args.foldLeft(0)(_+_) //Sum
args.foldLeft(1)(_*_) //Product
args.sort(_ < _)(if(args.length % 2 == 0) args.length / 2 – 1 else args.length / 2) //Median
args.map(Math.sqrt(_)) //Sqrt
[/sourcecode]

Here the list-processing functions (highlighted in red) shine. The foldLeft() family of functions, also known as reduce, inject, or aggregate, applies a predicate to the elements of a data structure, accumulating a value. It’s hard to imagine cleaner code than what we have for the ‘sum’ and ‘product’ challenges. Similarly, the map() function is similarly clean-as-a-whistle for the ‘sqrt’ challenge. The ‘median’ challenge I’m less thrilled about: Passing in a custom comparator predicate to sort() is lovely, but my fingers typed in the C-language ternary operator :? and I was sorry to find that Scala doesn’t have it.

A big win for Scala is that I typed these in the Read-Evaluate-Print-Loop aka interactive console rather than having to deal with the overhead of a ‘real’ program and a edit-compile-test loop.

Moving on, the complete program follows:

[sourcecode lang=”scala”]
object Main
{
def main(args : Array[String])
{
val argList = List.fromArray(args)
val op = opMatch(argList.head)
val numList = argList.tail.map(java.lang.Double.parseDouble(_))
val result = op(numList)
print(result)
}

def sum(args : List[Double]) : Double = { args.foldLeft(0.0)(_+_) }

def product(args : List[Double]) : Double = { args.foldLeft(1.0)(_*_) }

def median(args : List[Double]) : Double =
{
args.sort(_ < _)(if(args.length % 2 == 0) args.length / 2 – 1 else args.length / 2)
}

def sqrt(args : List[Double]) : List[Double] = { args.map(Math.sqrt(_)) }

def opMatch(arg : String) : (List[Double]) => Any =
{
arg match
{
case "sum" => return sum
case "product" => return product
case "median" => return median
case "sqrt" => return sqrt
_ => throw new java.lang.IllegalArgumentException("Unrecognized operator " + arg)
}
}
}
[/sourcecode]

Let’s see… The translation of the REPL code into functions was very straightforward, although I have to admit to being somewhat disappointed that I could not use a type more abstract than Double. I don’t know if I could define a trait and then apply it retroactively to types in the java.lang package.

opMatch() shows some functional goodness — it’s declared as a function that takes a String and return a function that takes a List of Doubles and returns Any kind of object.

I’m not sure if I could do something more precise for my needs, like “returns a Double or List[Double]”. The pattern-matching in opMatch() is nice but not advanced.

If you have any observations to share, please comment below…

FunLoft: Reactive Concurrent Programming Language

It sounds like someone designed a programming language with the express intention of intriguing me:

FunLoft is an experimental language for concurrent programming, designed with the following objectives:

  • make concurrent programming simpler by providing a framework with a clear and sound semantics.
  • provide a safe language, in which, for example, data-races are impossible.
  • control the use of resources (CPU and memory); for example, memory leaks cannot occur in FunLoft programs, which always react in finite time.
  • have an efficient implementation which can deal with large numbers of concurrent components.
  • benefit from the real parallelism offered by multicore machines.

… FunLoft is based on a programming model which is a variant of reactive programming

MS Concurrency Guru Speaks of “new operating system”

If you are interested in high-performance programming on Windows, you know the name Joe Duffy, whose book Concurrent Programming On Windows is absolutely top-notch.

Today he posted an intriguing notice on his blog “We are hiring.” Check out some of the things he says:

My team’s responsibility spans multiple aspects of a new operating system’s programming model…. When I say languages, I mean type systems, mostly-functional programming, verified safe concurrency, and both front- and back-end compilation….All of these components are new and built from the ground up.

Huh. I’ve argued before that the manycore era requires a fundamental break in OS evolution. Every aspect of the machine has to be rethought; the fundamental metaphor of a computer as a von Neumann machine with, perhaps, a phone line to the outside world has been strained to the breaking point. Forget “the cloud,” we need to think about “the fog” — a computing system where every resource (including resources outside the box at which you happen to be typing) can be accessed concurrently, securely, and virtually.

I don’t think that the OS for the manycore era can evolve from any existing desktop OS. That’s why I think that the “Windows 7 vs. OS X vs. Linux” debates are short-sighted and even the “Windows vs. iOS vs. Android” debates are only skirmishes to determine who has the money, mindshare, and power to eventually win the real battle.

It needs to be said that Microsoft has lots of incubation and research projects whose results either are left to wither or are watered-down and incorporated into mainstream products. But the involvement of a top non-academic thought-leader makes me hopeful that Duffy’s project may have a bright future.

The Traveling Astronomer Problem

Apropos of something I’m not quite ready to talk about, here is an interesting challenge:

How do you optimize your time at the telescope if you have a set of objects that you’d like to observe?

For instance, if you want to see as many Messier objects as you can in a single night, a portion of your night might use this sequence, suggested in the book “Messier Marathon Observer’s Guide

screen-shot-2010-09-16-at-104738-am

On the other hand, it looks like there’s a wasteful jog near Serpens Cauda — near the label “18h30m” in the image. That jog is the recommended sequence “M24-M25-M23” but if minimizing the path were the only criterion, it looks like it would be quicker to visit M25 “on the way” between M7 in Scorpius and M11 in Scutum.

Now, by no means do I want to be so presumptuous to suggest that Machholz “made a mistake” in his recommended order. Minimizing the path is not the only or even overwhelmingly-dominant criterion — if you’re really doing the Messier marathon, it’s customary to do it without the help of a computerized “goto” system and using easy-to-find objects and straight line “star hops” is a big deal.

Similarly, in the real world you’re going to be battling light pollution, clouds, and terrestrial obstructions.

But this visualization that I made using Google Earth and some Ruby code does suggest that it might be worth using the power of a computer to help you plan your evening’s viewing.

The bad news is that there are lots of possible paths one can take between all 110 Messier objects — 1588245541522742940425370312709077287172441023447356320758174831844456\
7162948183030959960131517678520479243672638179990208521148623422266876\
757623911219200000000000000000000000000 paths. Most of those paths are impossible for an Earth-based scope (as a matter of fact, there are only brief windows during the year when all the Messier objects are visible at night from a given location). And most paths could be rejected very quickly. But still, no matter how quickly you evaluate a path, there’s no way to use a computer to find the absolute shortest path between this many cities… er … graph nodes … er … sky targets.

The good news is that there are all sorts of wicked cool ways to find “pretty good” paths.