Archive for February 2007

More on OpenID, FOAF, and Trackback

Dmitry Shecthman, who knows more about OpenID than I do, doesn’t get why OpenID is important to making FOAF the validation route for Trackback. Here’s my thinking, which has a 90% chance of being wrong (based on historical averages):

FOAF looks like this:

<foaf:Person>
 <foaf:name>Leigh Dodds</foaf:name>
 <foaf:firstName>Leigh</foaf:firstName>
 <foaf:surname>Dodds</foaf:surname>
 <foaf:mbox_sha1sum>71b88e951cb5f07518d69e5bb49a45100fbc3ca5</foaf:mbox_sha1sum>
 <foaf:knows>
  <foaf:Person>
   <foaf:name>Dan Brickley</foaf:name>
   <foaf:mbox_sha1sum>241021fb0e6289f92815fc210f9e9137262c252e</foaf:mbox_sha1sum>
   <rdfs:seeAlso rdf:resource="http://rdfweb.org/people/danbri/foaf.rdf"/>
  </foaf:Person>
 </foaf:knows>
</foaf:Person>

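Which essentially says that Leigh Dodds knows Dan Brickley, whose own FOAF file lives at http://rdfweb.org/people/danbri/foaf.rdf.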

And one would expect this to be part of a file created by Leigh Dodds and sitting on his website (perhaps at www.leighdodds.com/foaf.rdf). Given that Leigh created that file, one would think that Leigh would be willing to have his Trackback server automatically create links to Dan’s comments regarding Leigh’s blog posts (i.e., Dan is trusted by Leigh).
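As a rough sketch of how a Foafback server might read that file, the snippet below pulls the foaf:knows entries out of a FOAF document using Python's standard XML parser. The SAMPLE document wraps the fragment above in an rdf:RDF element with the usual namespace declarations, and the function name known_people is made up for illustration. It assumes FOAF files shaped exactly like this one; a real implementation would want a proper RDF parser, since RDF/XML allows many equivalent serializations.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF, RDFS, and FOAF.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# The fragment from the post, wrapped in an rdf:RDF root (an assumption).
SAMPLE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">
 <foaf:Person>
  <foaf:name>Leigh Dodds</foaf:name>
  <foaf:knows>
   <foaf:Person>
    <foaf:name>Dan Brickley</foaf:name>
    <foaf:mbox_sha1sum>241021fb0e6289f92815fc210f9e9137262c252e</foaf:mbox_sha1sum>
    <rdfs:seeAlso rdf:resource="http://rdfweb.org/people/danbri/foaf.rdf"/>
   </foaf:Person>
  </foaf:knows>
 </foaf:Person>
</rdf:RDF>"""


def known_people(foaf_xml):
    """Return one dict per person the file's owner asserts foaf:knows for."""
    root = ET.fromstring(foaf_xml)
    owner = root.find("foaf:Person", NS)  # first top-level person = owner
    people = []
    for person in owner.findall("foaf:knows/foaf:Person", NS):
        see_also = person.find("rdfs:seeAlso", NS)
        people.append({
            "name": person.findtext("foaf:name", namespaces=NS),
            "mbox_sha1sum": person.findtext("foaf:mbox_sha1sum", namespaces=NS),
            # rdf:resource is a namespaced attribute, hence the {uri}name key.
            "see_also": None if see_also is None
            else see_also.get("{%s}resource" % NS["rdf"]),
        })
    return people
```

Calling known_people(SAMPLE) yields a single entry for Dan Brickley, including the seeAlso link that lets a crawler hop to the next FOAF file in the graph.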

So, a Web of FOAF files (n.b. <rdfs:seeAlso>) defines a social network graph and part of my premise is that anyone within a few degrees of separation from me could be trusted to (oh, I can’t resist) “Foafback.”

So my first cut at a new Foafback software would be one that receives a Trackback post of this form:

POST http://www.example.com/foafback/5
Content-Type: application/x-www-form-urlencoded; charset=utf-8

title=Foo+Bar&url=http://www.bar.com/&excerpt=My+Excerpt&blog_name=Foo&postedBy=Dan+Brickley

And looks up Dan Brickley in Leigh’s FOAF file and says “Oh yeah, Dan! Swell!” Except, of course, the value of postedBy can’t possibly be “Dan Brickley” or dbrickley@rdfweb.org or even Dan’s mbox_sha1sum because spammers are going to figure out the Websites of those doing Foafbacks to your site and they will easily guess any publicly available identifier of those wishing to perform Foafbacks.
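To make the weakness concrete, here is a minimal sketch of that first-cut check. The function name naive_foafback_ok and the trusted_names set are hypothetical; the form fields follow the example post, with postedBy as the proposed extra field.

```python
from urllib.parse import parse_qs


def naive_foafback_ok(body, trusted_names):
    """First-cut check: accept the ping if postedBy names someone we know.

    body is the form-encoded request body; trusted_names is the set of
    names taken from the blog owner's FOAF file.
    """
    fields = parse_qs(body)
    posted_by = fields.get("postedBy", [""])[0]
    return posted_by in trusted_names


TRUSTED = {"Dan Brickley"}

# A genuine ping from Dan.
genuine = ("title=Foo+Bar&url=http://www.bar.com/&excerpt=My+Excerpt"
           "&blog_name=Foo&postedBy=Dan+Brickley")

# A spammer who read the same public FOAF file sends the same identifier.
spoofed = ("title=Cheap+Pills&url=http://spam.example/&excerpt=Buy+now"
           "&blog_name=Spam&postedBy=Dan+Brickley")
```

Both naive_foafback_ok(genuine, TRUSTED) and naive_foafback_ok(spoofed, TRUSTED) come back True, because the server has no way to tell who actually sent the field.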

Therefore, I think you need an arbiter of identity; you need a service that Leigh’s Foafback server and Dan’s Foafback pinger can use to silently-after-the-first-time validate the identity of the person doing the posting.

The second cut at a new Foafback server works like this: The first time that Dan tries to Foafback to Leigh’s site, he is redirected to Dan’s OpenID provider (I think that’s the term), logs in, and is told “www.leighdodds.com is requesting your email address” and Dan clicks “Okay, now and forever.”

Leigh’s Foafback server then receives OpenID credentials and a Foafback post (sans postedBy because the email of the person who’s logged in is actually coming from the OpenID provider, not from the person performing the Trackback). Leigh’s Foafback server validates that the OpenID identity (i.e., Dan’s persona) is in the trust zone (i.e., can be reached via FOAF) and automatically generates a link.

So that’s why I think you need OpenID.

Now, since I went to the bother of showing what a Trackback post actually looks like, I guess I should state the obvious, which is that the onus of calculating the FOAF graph ought not to be on Leigh (the original blogger) but on Dan, the Foafbacker. The Foafback pinger needs to include the route by which the poster is asserting a relationship (a list of FOAF URIs ought to suffice). The Foafback server needs to verify that route (at least once, but I can well imagine the admin software saying “These people have tracked back to you; include them in your FOAF?”).
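A sketch of what verifying a proposed route could look like, under some loud assumptions: the KNOWS dictionary stands in for fetching and parsing each FOAF file along the route, verified_identity is whatever FOAF URI the OpenID step vouched for, and the function names, the CAROL URI, and the default limit of three hops are all invented for illustration.

```python
LEIGH = "http://www.leighdodds.com/foaf.rdf"        # "perhaps at", per the post
DAN = "http://rdfweb.org/people/danbri/foaf.rdf"
CAROL = "http://example.org/carol/foaf.rdf"         # hypothetical third party

# Stand-in for the live Web of FOAF files: URI -> URIs it says it knows.
KNOWS = {
    LEIGH: {DAN},
    DAN: {CAROL},
}


def verify_route(route, knows, max_hops=3):
    """Check that every hop in the proposed route is a real foaf:knows edge."""
    hops = len(route) - 1
    if hops < 1 or hops > max_hops:
        return False
    return all(b in knows.get(a, set()) for a, b in zip(route, route[1:]))


def accept_foafback(owner, verified_identity, route, knows, max_hops=3):
    """Accept only if the route starts at the blog owner, ends at the
    identity the OpenID step vouched for, and every hop checks out."""
    return (bool(route)
            and route[0] == owner
            and route[-1] == verified_identity
            and verify_route(route, knows, max_hops))
```

With this data, accept_foafback(LEIGH, DAN, [LEIGH, DAN], KNOWS) succeeds, as does the two-hop route to CAROL, while a route whose hops are not backed by the FOAF files is rejected. The server only checks the hops it is handed, so the search cost stays with the pinger.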

Spammers will subvert Overly Trusting Ted with second-order attacks (“Hey, love your site!” from “new friend” cutegirl15, whose FOAF is 10,000 phentermine sites) and there’s little that can be done about that. But the list of targets for the spammers’ real purpose (which is to get links to their phentermine sites posted on high-traffic blogs) is limited to those in Ted’s FOAF file. But of course Ted asserts that he knows Dave Winer, Robert Scoble, and Cory Doctorow, so the spammers have a target. But if the spammers link indiscriminately to outbound links, they’re already at 3 degrees of separation (Ted-cutegirl15-phentermine) and, of course, Ted won’t validate (since Winer, Scoble, and Doctorow don’t have Ted in their FOAF files). So the spammers wise up and validate the route to the potential target by checking the potential target FOAFs. But by validating along the directed graph, this severely limits the speed by which spammers can propagate “out” from Ted’s trust zone (assuming that those in the top 1/2 of 1% of the blogging power curve don’t become superpropagators by allowing six-degree-of-separation Foafbacks).
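The directedness is the important part, and a depth-limited breadth-first search shows it. This is only a sketch: the graph is a toy dictionary using the names from the scenario above, the function name is hypothetical, and it counts hops (edges), so Ted-cutegirl15-phentermine comes out as 2 hops across 3 nodes.

```python
from collections import deque


def degrees_of_separation(knows, start, target, max_degrees):
    """Breadth-first search along directed foaf:knows edges.

    Returns the number of hops from start to target, or None if target
    cannot be reached within max_degrees hops.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_degrees:
            continue  # do not expand beyond the allowed radius
        for friend in knows.get(node, ()):
            if friend == target:
                return depth + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    return None


# Toy graph: Ted claims to know Winer, but Winer does not list Ted.
TOY = {
    "ted": {"winer", "cutegirl15"},
    "cutegirl15": {"phentermine-site"},
    "winer": {"scoble"},
}
```

From Ted, the spam site is reachable; from Winer, neither Ted nor the spam site is reachable at any depth, because the edges that matter are the ones the target publishes, not the ones pointing at the target.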

FOAF, OpenID, and Trackback

Is a limited recursion through a FOAF graph based on OpenID the solution to Trackback? If that sentence isn’t understandable, don’t worry about it, but if it parses, continue…

The big problem, of course, is the initial trackback from those outside the limits of the graph. In such a case, the attempted trackback raises the bar that a bot must clear: you must have an OpenID and you must propose a path through the graph. Such trackbacks are submitted for moderation (who doesn’t check out those commenting on their posts? The A-Listers? Who gives a frack if this doesn’t work for them? As a person well up the power-curve of blogging (99.9th percentile), I can assure you that it’s not hard to read every mention that Technorati can find).

OK, so the obvious failure mode is that Trusting Ted, who’s in my trustzone, allows into his zone a mole, who becomes a conduit for spammers. Several things occur to me about this. First, yeah, I have a blacklist in my trackback mechanism and it, too, is FOAF. Second, Trusting Ted’s FOAF probably has a distinctively low inbound:outbound ratio (again, the A-List bloggers love being supernodes, so they haven’t noticed that supernodes have downsides). Third, it seems to me that the graphs of spammers’ OpenID-based FOAFs would have characteristics: lots of transience, low connectivity to “real” FOAFs, non-power-law distributions (even if they developed mock supernodes, those would necessarily be transient), etc.
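The inbound:outbound ratio is cheap to compute once the graph has been crawled. A minimal sketch, assuming the graph is already in memory as a dictionary of who lists whom; the names and the function link_ratios are made up, and what counts as "distinctively low" is left open because the post does not say.

```python
def link_ratios(knows):
    """Map each node to (people who list it) / (people it lists).

    Nodes that list nobody get float('inf') rather than a division error.
    """
    inbound = {}
    for source, friends in knows.items():
        inbound.setdefault(source, 0)
        for friend in friends:
            inbound[friend] = inbound.get(friend, 0) + 1
    ratios = {}
    for node, n_in in inbound.items():
        n_out = len(knows.get(node, ()))
        ratios[node] = n_in / n_out if n_out else float("inf")
    return ratios


# Toy graph: Ted lists five people, but only one person lists Ted.
GRAPH = {
    "me": {"ted"},
    "ted": {"me", "mole", "a", "b", "c"},
    "mole": {"spam1", "spam2"},
}
```

Here Ted comes out at 0.2 (one inbound link against five outbound), against 1.0 for "me". A blacklist or moderation queue could use a figure like this as one signal among several.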

Given that the costs of any automated assault on such a system will approach zero, how is such a system vulnerable?

John Lam (ex-RubyCLR, now Microsoftian) Hints At Forthcoming Announcement

John Lam, whose RubyCLR bridge led to a position in Microsoft’s CLR team, hints that an announcement on his project (my guess, X:Ruby::C#:Java) will be forthcoming. Sadly, he hedges as to whether it will be MIX or PDC. Of course, I’ll be going to PDC, but if there is a major dynamic language announcement at MIX, maybe I’ll have to go to that as well…

If OOXML is Relevant, Why Is MS Unable to Provide Macintosh Converters?

Alan Zeichick relates what happened when the first .docx file was sent to BZ Media, a company that runs primarily on Macs. Microsoft says that everything’s just swell:

We are running on target and expect to release a free public beta version of the file format converters in Spring 2007, with final converters available six to eight weeks after we launch our next version of Office for Mac (which, as previously reported, will be available 6-8 months after general availability of Win Office.)

But Alan wonders “How can Microsoft expect outside developers to be able to implement these new Office Open XML formats, when they can’t even do that themselves in a timely matter? A six to eight month delay, after having more than a year to prepare?”

Detailed Terrain Map By Walking Around With a GPS?

My house is built on a bit of a ridge between two gullies (well, two collapsed lava tubes — I do live on the side of an active volcano, after all). Grand plans include decks and terraces, but I can’t envision them without a plan. Of course, I could hire surveyors to come in and map the place, but I have a GPS (actually, and I wonder if this is significant, I have two GPSes). What I want / wondering if it exists / wondering if I could code it is an application that takes a GPS track and creates a surface. Anyone know?
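One simple way to turn scattered track points into a surface is inverse-distance weighting: estimate the height at any spot as a weighted average of the recorded heights, with nearer points counting for more. The sketch below assumes the track has already been converted to (x, y, z) triples in metres on a local flat grid; the function names are hypothetical and this is only one of several possible interpolation schemes. Bear in mind that consumer GPS altitude is generally less accurate than horizontal position, so the result may be rough.

```python
def idw_elevation(x, y, samples, power=2.0):
    """Estimate elevation at (x, y) from (sx, sy, sz) samples by
    inverse-distance weighting. power=2 is a common default."""
    num = 0.0
    den = 0.0
    for sx, sy, sz in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        if d2 == 0.0:
            return sz  # exactly on a sample: use it directly
        weight = 1.0 / d2 ** (power / 2.0)
        num += weight * sz
        den += weight
    return num / den


def surface_grid(samples, nx, ny):
    """Sample the interpolated surface on an nx-by-ny grid (nx, ny >= 2)
    spanning the bounding box of the track. Returns rows of elevations."""
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    grid = []
    for j in range(ny):
        y = y0 + (y1 - y0) * j / (ny - 1)
        row = [idw_elevation(x0 + (x1 - x0) * i / (nx - 1), y, samples)
               for i in range(nx)]
        grid.append(row)
    return grid
```

For example, surface_grid([(0, 0, 10), (10, 0, 20), (0, 10, 30), (10, 10, 40)], 3, 3) returns a 3-by-3 grid of heights that could be handed to any plotting or CAD tool.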

The reason I wonder whether having two GPSes is significant is that I would be very happy to leave one GPS sitting, perfectly motionless, while walking around with the other. While the GPS signal is salted with random imprecision, if that imprecision is the same for two different GPS devices, I would think that I could subtract the “jitter” of the stationary GPS from the track of the mobile GPS, perhaps providing me with the vaunted precision of “differential GPS.” … 5 minutes later … Hmm, not an encouraging experiment. Perhaps the GPSes have to lock on to the same satellites?
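The subtraction idea can be sketched in a few lines. The assumptions are loud ones: both units log fixes at matching timestamps, positions are already (x, y, z) in metres on a local grid, and the two receivers really do share the same error at each instant, which the experiment above suggests may not hold. As far as I know, real differential GPS corrects the per-satellite range measurements rather than the final positions, which would fit the same-satellites hunch. The function name is hypothetical.

```python
def differential_correct(base_fixes, rover_fixes):
    """Subtract the stationary unit's jitter from the walking unit's track.

    Both arguments map timestamp -> (x, y, z). The base unit's mean
    position is treated as its true location, so its deviation from that
    mean at time t is taken as the shared error at time t.
    """
    n = len(base_fixes)
    mean = tuple(sum(fix[i] for fix in base_fixes.values()) / n
                 for i in range(3))
    corrected = {}
    for t, rover in rover_fixes.items():
        base = base_fixes.get(t)
        if base is None:
            continue  # no simultaneous base fix, so nothing to subtract
        corrected[t] = tuple(rover[i] - (base[i] - mean[i])
                             for i in range(3))
    return corrected
```

If the corrected track is no smoother than the raw one, that is evidence the two units' errors are not correlated enough for this trick to work.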

First Look: Komodo 4 for Ruby Programming

It’s been a good couple weeks for Ruby IDEs. First, Ruby In Steel was released. Pretty much simultaneously, ActiveState released Komodo 4 with support for Ruby.

Komodo is a significantly “weightier” IDE and Ruby is just one of the many languages it supports. It is, I suppose, more akin to Visual Studio itself than to Ruby In Steel, which adds Ruby support to Visual Studio.

I still have much more head-to-head comparison to do, but I wanted to point out a clear “win” for Komodo: the Ruby shell shown in the bottom pane here is graphical, allowing for a significantly easier cut-and-paste experience than the IRB-in-a-DOS-Box approach:

 

P.S. What the heck is “IDE_GeneticAlgorithm”? Well, a while back there was a flurry of posts about “the best” customized color schemes for programmers. I thought it would be funny to write a distributed genetic algorithm that “bred” color schemes and evolved them on the Web. The problem is the age-old challenge of creating a decent traversal through colorspace (that isn’t along the gray axis). What’s a way to encode color in a single number such that like values have like colors?
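One partial answer is to treat the number as a hue angle and hold saturation and brightness fixed, so that stepping the number walks around the color wheel instead of along the gray axis. This is a sketch, not a full solution: it only covers one dimension of colorspace, and the function name, the 360 steps, and the default saturation and value are arbitrary choices.

```python
import colorsys


def number_to_rgb(n, steps=360, saturation=0.8, value=0.9):
    """Map an integer to an (r, g, b) triple of 0-255 ints by treating it
    as a position on the hue wheel. Nearby n give nearby hues, and the
    mapping wraps around every `steps` values."""
    hue = (n % steps) / steps
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
    return round(r * 255), round(g * 255), round(b * 255)
```

With full saturation and value, number_to_rgb(0, saturation=1, value=1) is (255, 0, 0), pure red. A genetic algorithm could then mutate a color by nudging n up or down a little, which changes the hue slightly rather than jumping to an unrelated color.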

Gunnar Peterson on Message-Level Security

Gunnar Peterson, responding to my posts on REST, says we cannot punt on message-level security. He cites 3 security breaches as evidence that “the 1995 security model” of “firewall, SSL, and a prayer” won’t cut it. However, I don’t believe that any of these breaches would have been thwarted by message-level security. In the first, “an intruder hacked into a TJC Companies’ database,” the 2nd was a stolen file (whether physical or due to a login, I don’t know), and the 3rd was a phishing attack. I don’t see how encryption at the message-level would help in these scenarios. I’m not a computer security expert, but it seems to me that bad logins, physical loss (i.e., stolen laptops), and phishing account for the vast majority of security breaches. At the targeted assault level you have SQL injection and buffer overflows and rootkits. I’ve never heard of an actual man-in-the-middle security breach at the SSL/HTTPS level (feel free to enlighten me).

I’ll reiterate my main point: KISS approaches work well enough for companies like Google, Amazon, and Apple/iTunes to transact billions of dollars in commerce. WS-Security, with its encryption-scheme-independent tokens and trust relationships, etc.: I just don’t see the utility. I certainly see the complexity. Of course, the complexity is generally mitigated within a single vendor’s stack, but interop is actually the “big promise” that started this whole Web Services thing and is much more a real-world issue than the supposed flaws of Internet protocols.

The only scenario that I can think of where I would not trust SSL/HTTPS at the message-level is actual wire transfers. And I think the people who program bank transfers have already figured out a way that works. (Very rapidly, but one penny at a time, as numerous people pointed out in response to my “Top 10 Things I’ve Learned About Computers From The Movies” post.)