Archive for the ‘SD Futures’ Category.

LINQ + Reflection: Querying the Object Graph

Yuriy Solodkyy demonstrates the combination of LINQ and the Reflection APIs, a technique that could prove tremendously powerful and that strikes me as giving LINQ-enabled languages a level of “dynamism” that puts duck-typing to shame.

Could this simply replace the Visitor pattern with an approach that needs no cooperation from the data structure?

Would this allow an Abstract Factory that allowed you to dynamically find all products of one Factory and replace them with those of another?
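As a rough illustration of the idea (in Python rather than C#, so this is an analogy and not Solodkyy's code), reflection can flatten an object graph into a sequence, and an ordinary query expression can then filter it. The `walk` helper and the `Node` / `Leaf` classes are hypothetical names made up for this sketch; the point is only that the data structure does nothing to cooperate.

```python
from dataclasses import dataclass, field


def walk(root):
    """Yield every object reachable from root via attributes and containers."""
    seen = set()
    stack = [root]
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        yield obj
        if isinstance(obj, dict):
            stack.extend(obj.values())
        elif isinstance(obj, (list, tuple, set)):
            stack.extend(obj)
        elif hasattr(obj, "__dict__"):
            # Reflection step: discover fields without any help from the class.
            stack.extend(vars(obj).values())


# Hypothetical data structure that knows nothing about visitors.
@dataclass
class Leaf:
    value: int


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


tree = Node("root", [Leaf(1), Node("inner", [Leaf(2), Leaf(3)])])

# The "query": a comprehension plays the role a LINQ expression would.
leaf_values = sorted(o.value for o in walk(tree) if isinstance(o, Leaf))
node_names = sorted(o.name for o in walk(tree) if isinstance(o, Node))
```

Compared with a Visitor, nothing here required `Node` or `Leaf` to expose an `accept` method; the cost is that the traversal sees every field, including ones the class author might have preferred to keep private.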

 [via Steve Pietrek]

C++0x to Incorporate Standard Threading Model

The working groups of the C++0x committee are working hard to complete a major new standard for C++ (there’s a big meeting here in Kona in October). If you’re not intimate with C++, you may be surprised that such an important language has not had a standard threading model and that such a model is a major part of the C++0x version. This is actually part-and-parcel of the design philosophy that made C and C++ so important: the number of libraries dictated by the standard for C and C++ is much smaller than the BCL or Java’s SE libraries. This allows standard C and C++ to be available for hundreds of processors.

I recently read the public C++0x papers on threading (links below). The proposed threading model is non-radical and is based on Boost.Thread. The reasonable perspective is that this is a conservative decision thoroughly in keeping with C/C++’s long tradition of minimal hardware/OS assumptions.

The emotional perspective is that they’ve let slip by a golden opportunity to incorporate the best thinking about memory models. “Multi-threading and locking” is, I would argue, demonstrably broken for business programming. It just doesn’t work in a world of systems built from a combination of libraries and user-code; while you can create large applications based on this model, large multithreaded applications based on locking require not just care, but sophistication, at every level of coding. By standardizing an established systems-level model, C++0x forgoes an opportunity for leadership, albeit radical leadership.
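One way to see why locking strains under library-plus-user-code composition is that two individually thread-safe operations do not add up to a thread-safe whole. The following is a minimal sketch in Python, not C++, and the `Account` class and `transfer` function are hypothetical. To keep the outcome reproducible, the "other thread" is simulated by a callback that runs in the gap between the two locked calls; real threads would hit the same window nondeterministically.

```python
import threading


class Account:
    """Hypothetical account whose individual operations are thread-safe."""

    def __init__(self, balance):
        self._lock = threading.Lock()
        self._balance = balance

    def withdraw(self, amount):
        with self._lock:
            self._balance -= amount

    def deposit(self, amount):
        with self._lock:
            self._balance += amount

    def balance(self):
        with self._lock:
            return self._balance


def transfer(src, dst, amount, between=lambda: None):
    # Each call is atomic, but the pair is not: "between" stands in for
    # whatever another thread might do in the gap.
    src.withdraw(amount)
    between()
    dst.deposit(amount)


a, b = Account(100), Account(0)
observed = []
transfer(a, b, 30, between=lambda: observed.append(a.balance() + b.balance()))
total_after = a.balance() + b.balance()
```

The observer in the gap sees a combined balance of 70 even though the total is 100 before and after. Fixing that means the caller has to know about, and correctly order, locks that live inside someone else's class, which is the "sophistication at every level" problem.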

One of the real thought leaders when it comes to more sophisticated concurrency semantics is Herb Sutter. His Concur model (here’s a talk on Concur from PDC ’05) is, I think, a substantial step forward and I’ve really hoped to see it influence language design. Is Sutter, though, just an academic with flighty thoughts and little understanding of the difficulties of implementation? It seems unlikely, given that he’s the Chair of the ISO C++ standards committee. So you can see why there might have been an opportunity.

Multithreading proposals for C++0x:

Tilera 64-core CPU: The Future Cometh

Looks like the only programming tool for Tilera’s 64-core CPU is a C compiler, but the day is fast approaching when we’re going to start seeing more and more of these types of tools in the mainstream.

Data Volumes Trumping Core Multiplication? Interesting Thought

Bill de hÓra makes an intriguing pitch that programming will be impacted by increasing data volumes more than by the transition to multi-/many-core. His basis is anecdotal (we don’t have the same metaphysical certainty that all of us will be dealing with much-larger datasets as we have the certainty that we will all be dealing with multiple and then many cores) but is logical. The speed of a single stream of in-cache instructions is blazing: short of chaotic functions, it’s hard to imagine perceptibly-slow scenarios that don’t involve large amounts of data.

What I find especially thought-provoking about this argument is that it stands in opposition to another post I was going to make about YAGNI infrastructure. Not long ago, Alan Zeichick ranked databases and Ian Griffiths questioned whether he took price-performance into account. Even allowing that there are costs for OSS (training, tools, administration, etc.), I’ve noticed that few real-world CEOs understand where their companies stand in relation to scaling. In my experience, they often over-buy software and hardware capacity and under-buy contingency capacity.

It seems to me that nowadays we work more and more with data streams and not data sets. On a transaction-to-transaction basis, I think it’s an uncommon application that uses more data than can fit into several gigabytes of RAM (obvious exception: multimedia data). Never mind multi-node Map-Reduce; I’m saying that it seems to me that many “real” business systems could have a single-node non-relational data access layer.

It seems that what I’m saying is in direct contrast to what de hÓra is describing, and yet points to the same “maybe we ought not to start from the assumption of a relational DB” heresy. No conclusion… food for thought …

Reflection: I think I let my attention wander. The world de hÓra is describing is that of high-performance computing, and I wandered into general business computing. The two intersect, of course, but are not generally the same. So the thought then becomes that powerful relational databases are being squeezed from both the low end (“eh, just put it in memory”) and the high end (“ok, so this is our distributed tuple-space…”).

Moving Beyond The Typing Debate?

Maybe the readers of my blog are more astute (and better looking!) than average, but I was happy that several comments to my recent post on type inference were properly dismissive of what one called “the static vs. dynamic holy war.” As I said when writing about the myth of better programming languages last year, different programming languages engage your mind in different ways and that is what is worth pursuing. There was a time when I was programming professionally in two languages: C and Prolog. They engaged my mind in such profoundly different ways that shifting between them felt like the clutch on the ’77 Ford van I was driving at the time (three-on-the-tree, baby), but in terms of problem solving, I felt like Superman.

Now, first-class functions have entered the mainstream (primarily via C#) and that, in combination with an influential paper about Google’s MapReduce programming model, has led people to begin to see what functional programming advocates have been talking about lo these many years.
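To make the connection concrete, here is a minimal single-process sketch of the MapReduce shape in Python. It is not Google's implementation and does no distribution at all; `map_reduce` and `count_words` are hypothetical names. What it shows is that the framework is just a function that takes two other functions as arguments, which is exactly what first-class functions buy you.

```python
from collections import defaultdict
from functools import reduce


def map_reduce(records, mapper, reducer):
    """Map each record to (key, value) pairs, group by key,
    then fold each group with the reducer."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reduce(reducer, values) for key, values in groups.items()}


def count_words(line):
    # Mapper: emit (word, 1) for every word in the line.
    return [(word.lower(), 1) for word in line.split()]


lines = ["the quick brown fox", "the lazy dog", "The end"]
counts = map_reduce(lines, count_words, lambda a, b: a + b)
```

Because the mapper sees one record at a time and the reducer sees one key at a time, the grouping step is the only place the pieces have to meet, and that is the part a real system would spread across machines.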

Similarly, people are beginning to realize that concurrency models just might be important in the coming years and are beginning to pay attention to languages like Erlang. (Incidentally, O’Reilly & Associates seems to be betting that “shared nothing” is the way to go, a conclusion that I think is certainly too sweeping and premature. ORA is the most influential publishing house in software development right now, so the biases of their editors in this area will have a noticeable impact on the debate in the years to come.)
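For readers who have not seen the “shared nothing” style, here is a small sketch of its flavor using Python's standard `queue` and `threading` modules. This is only an approximation: Python threads can share memory, so the isolation here is a convention rather than something the runtime enforces as Erlang does. The `counter_actor` function is a hypothetical name.

```python
import queue
import threading


def counter_actor(inbox, outbox):
    """Owns its state privately; the only way to touch it is a message."""
    total = 0
    while True:
        message = inbox.get()
        if message is None:  # sentinel: report the total and stop
            outbox.put(total)
            return
        total += message


inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=counter_actor, args=(inbox, outbox))
worker.start()

for n in range(1, 101):
    inbox.put(n)
inbox.put(None)

worker.join()
result = outbox.get()
```

There is no lock in user code because no state is visible to more than one thread of control; the queue is the only shared object, and its synchronization is somebody else's problem.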

Update: No sooner had I written this post than I saw in my inbox that Pragmatic Bookshelf has published Programming Erlang. Look for a review in the coming weeks…

IBM’s Telelogic Acquisition: Buying Marketshare, Not Expanding Market

I agree with Alan Zeichick’s analysis of IBM’s acquisition of modeling tool vendor Telelogic: the overlap with IBM’s Rational product line is high, the acquisition “is a bid to buy market share … we’ve taken a powerful innovator and strong IBM competitor out of the market.”

The software development industry typically pendulums on modeling tools: excess, backlash, abandonment, code is king, frustration, some modeling helps, we can model everything, excess …

Right now, modeling is not popular. But I think it’s actually passed its nadir and, if history holds, we should see modeling increase in popularity. The problem for IBM and Rational is that part of the pendulum is the embrace of new modeling graphics/languages.

How Much of the Industry Will Go Parallel?

Michael Suess ponders one of my favorite questions: How much of the software industry will have to deal with the concurrent computing [opportunity]? He hits the vital points:

  • 2, 4, and maybe 8 cores may be usefully exploited by system services (anti-virus, disk indexing and searching, etc.), but when you get beyond that, any program for which performance is any kind of issue simply cannot ignore the capacity (this is why I distinguish between our current “multicore” transitional phase and the coming “manycore” era).
  • Media programming (games, A/V processing) has an essentially infinite appetite for processing.
  • The manycore era provides an opportunity for new types of functionality. He mentions concurrent semantic analysis of your input, both typing and spoken, and the accumulation of context documents. For instance, as I type this, my computer might be gathering all my blog posts, OneNote notes, source code, etc. relating to concurrency. (And then wouldn’t it be cool if it offered them for my perusal, maybe with, I dunno, a goggle-eyed paperclip?).

But I think the $64 question is whether such services will be provided in a service-oriented, cross-application manner, or whether it will be the case that we find broad opportunities for them within applications. For instance, mail programs and word processors have had search functionality for a long time, but if you were designing such a program from scratch, you would probably be better advised to say “Hey, I won’t implement a complete search subsystem, I’ll just make sure I can be indexed by Windows and Google Desktop Search. If I want to add value, I’ll layer on top of those systems if at all possible.”

Conversely, if you had some powerful new value proposition (semantic analysis, task recognition, visual input), wouldn’t it be vastly better for you and your customers if you could provide it to applications other than those that you happen to have written? In other words, of course value in the manycore era will derive from increased parallelism but maybe that parallelism will still be very coarse-grained. Maybe software organizations will face a choice: “Either develop client-oriented value with the best practices of “traditional” non-parallel development or develop broader, system-oriented value using whatever is the emerging set of best practices for system-level parallel development.” Maybe that choice will become increasingly orthogonal.

Now, the final part of the thought experiment is this: if that scenario is reasonable, what kind of platform services / APIs would one desire?

Microsoft’s Popfly: Getting Their Ducks In A Row

Popfly is the name (and URL) of Microsoft’s new non-professional developer community, a Windows Live site whose flashiest feature is a Silverlight-based “mashup editor” that facilitates pipes-and-filters development. Before reviewing the gratuitous 3-D spinning cubes, though, pay attention to the context:

  • Visual Studio Express has had 14,000,000 downloads (source: Dan Fernandez, personal communication). Of course that translates to something far less than 14M users, but it ain’t hay;
  • The Popfly mashups run inside Silverlight, so anyone wishing to view their friend’s / child’s / grandkid’s project is going to have to install the Silverlight runtime;
  • Silverlight is going to rapidly evolve to incorporate the Dynamic Language Runtime. Silverlight + CoreCLR + DLR == Microsoft’s platform play for dynamic languages, which have crossed the chasm and, whatever their other strengths or weaknesses, are easier to learn than explicitly typed or Pascal-like highly-structured languages.

Microsoft is on the verge of restoring the bridge between power users and programmers.

The collapse of that bridge — the disappearance of macro-based automation during the DOS-Windows transition and the removal of Hypercard from the Mac — was the greatest setback the professional programming community has ever suffered (#insert COBOL or C++ joke here#).

Pipes-and-filters mashups are the UNIX shell-commands of the Web. The next step is automation: after you start figuring out how to pipe commands, you start writing shell scripts, at which point you’re programming the platform at a higher abstraction level. That’s a crucial point: we’re not talking about flow-control and manipulation within the pipes-and-filters components, but at the platform level. That’s why it’s huge that the Popfly mashups are executed on the client (within Silverlight) and not on the CPUs of the host. Mainframes->Minis->PCs: empowered users require and embrace personal resources. This is the salient distinction between Popfly and Yahoo Pipes (Popfly also works with more types of data, but Yahoo could address that). It’s not just that there’s a resource-consumption scaling problem that might be solvable by the host absorbing hardware costs; it’s that there’s a Big O scaling problem: to the extent that mashups are used to program the Web, as soon as people start looping/recursing, you’re talking about non-linear increases in resource consumption.
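A small sketch may help separate the two levels. The first half composes filters the way a shell pipeline would; the second half is the “shell script” step, where a loop runs a whole pipeline per item and the work multiplies. This is Python, not anything Popfly or Yahoo Pipes actually exposes, and every name in it (`pipe`, `only_tagged`, `titles`, `take`, `counting`, the sample `feed`) is hypothetical.

```python
from functools import reduce
from itertools import islice


def pipe(source, *filters):
    """Feed source through each filter in turn, like `a | b | c` in a shell."""
    return reduce(lambda stream, f: f(stream), filters, source)


# Hypothetical filters: each takes an iterable and returns an iterable.
def only_tagged(tag):
    return lambda items: (i for i in items if tag in i["tags"])


def titles(items):
    return (i["title"] for i in items)


def take(n):
    return lambda items: islice(items, n)


feed = [
    {"title": "Popfly launch", "tags": ["silverlight", "mashup"]},
    {"title": "C++0x threads", "tags": ["cpp"]},
    {"title": "DLR update", "tags": ["silverlight", "dlr"]},
    {"title": "More mashups", "tags": ["mashup"]},
]

# Level 1: a single pipeline, cost proportional to the size of the feed.
result = list(pipe(feed, only_tagged("silverlight"), titles, take(5)))

# Level 2: scripting the platform. One pipeline run per item in a feed.
stats = {"visits": 0}


def counting(items):
    for item in items:
        stats["visits"] += 1
        yield item


for _ in feed:
    list(pipe(feed, counting, titles))
```

With four items the nested version touches sixteen, and the gap widens quadratically from there. That growth is cheap when it lands on the user's own CPU and expensive when it lands on a shared host.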

To be clear, I don’t think Popfly is the Bourne Shell of the Web — that hasn’t been written yet. But I think Popfly’s the | and Silverlight’s the $

IronPython, IronRuby Discussion with Jim Hugunin and John Lam

I’m dying because I’ve just had a long talk with two of Microsoft’s heavy hitters on the Dynamic Language Runtime (DLR) team and have much to discuss, yet I am in a frenzy preparing for a business trip and cannot yet take the time to do the discussion any kind of justice.

The single most important quote, I think, was the statement that “no one will take [our implementations] seriously until we can run– / We aren’t done until we can run–” [Django | Rails]. That was contrasted with important libraries that were heavily dependent themselves upon C-based libraries (Zope, in particular). It was also contrasted with libraries that rely on unusual language quirks or implementation details; the touchpoint on that was Ruby’s … shoot, I thought Lam said “objectspaces” but I don’t see that in the standard library … maybe he said “ObjectSpaces-like ability to traverse the entire in-memory object graph” (Anyone know what lib that would be?) … Anyway, the point was that this was an example of something that would be very difficult to implement within the constraints of the CLR.
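The kind of capability being described, asking the runtime for every live object, can be sketched in Python with the standard `gc` module (Ruby's core `ObjectSpace.each_object` is the analogous call there, though whether that is what was meant in the conversation is a guess). The `Session` class and `live_instances` helper are hypothetical. Note that this leans on the runtime's own garbage collector bookkeeping, which is the sort of implementation detail a language hosted on another VM does not get for free.

```python
import gc


class Session:
    """Hypothetical class whose live instances we want to find."""

    def __init__(self, user):
        self.user = user


def live_instances(cls):
    # gc.get_objects() lists the objects CPython's collector is tracking,
    # which covers ordinary class instances but not every object in memory.
    return [obj for obj in gc.get_objects() if issubclass(type(obj), cls)]


sessions = [Session("ann"), Session("bob")]
found = sorted(s.user for s in live_instances(Session))
```

Nothing registered the two sessions anywhere; the query found them purely because the runtime knows what it has allocated.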

I’ll update this entry when I can report in more detail…

Sun’s Fortress Language: Looks Very Well Designed

This is a rather daunting (124-slide) PDF on Sun’s “Fortress” programming language, which was designed in large part by Guy Steele and is aimed at scientific / mathematical programming. It looks really good: lots of good decisions (take advantage of Unicode, traits and objects, implicit and explicit parallelism… well, actually, making parallelism the default for loops is a mistake…).

I do sometimes second-guess myself about whether concurrency is going to be a mainstream concern or whether taking advantage of the 90%-plus of your computer’s power that single-threaded code leaves idle (once you get to more than 10 cores) is going to be a niche problem. My gut tells me that mainstream programmers cannot ignore that much of a discrepancy in performance; performance is always an issue and, even though the majority of performance problems are not CPU-bound, I just feel that no one will want to say “Yeah, it’s single-threaded” when the pointy-haired boss is looking for someone to blame for performance woes on a 16-core machine.
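The arithmetic behind that 90% figure is simple enough to write down. This is a back-of-the-envelope sketch that assumes identical cores and a purely CPU-bound workload, which real machines and real programs rarely match; `idle_fraction` is a hypothetical helper name.

```python
def idle_fraction(cores, busy_threads=1):
    """Fraction of CPU capacity left unused when only busy_threads cores
    are doing work. Assumes identical cores and a CPU-bound load."""
    used = min(busy_threads, cores)
    return 1 - used / cores


for cores in (2, 4, 10, 16, 64):
    print(f"{cores:>2} cores: {idle_fraction(cores):.1%} idle")
```

A single-threaded program leaves half the machine idle on two cores, 90% on ten, and nearly 94% on the 16-core box in the pointy-haired-boss scenario.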

Found by way of James Governor