Saturday, June 07, 2008 |
|
|
The July 08 (Vol. 30, #7) IEEE Transactions on Pattern Analysis and Machine Intelligence has an incredible paper by Raykar, Duraiswami, and Krishnapuram. A Fast Algorithm for Learning a Ranking Function from Large-Scale Data Sets appears to be a game-changer for an incredibly important problem in machine learning. Basically, they use a "fast multipole method" developed for computational physics to rapidly estimate (to arbitrary precision) the conjugate gradient of an error function. (In other words, they tweak the parameters and "get a little better" the next time through the training data.) The precise calculation of the conjugate gradient is O(m^2). This estimation algorithm is O(m)! (That's an exclamation point, not a factorial!) On a first reading, I don't grok how the crucial transform necessarily moves towards an error minimum, but the algorithm looks (surprisingly) easy to implement and their benchmark results are jaw-dropping. Of course, others will have to implement it and analyze it for applicability across different types of data sets, but this is one of the most impressive algorithmic claims I've seen in years. Once upon a time, I had the great fortune to write a column for a magazine on artificial intelligence and could justify spending huge amounts of time implementing AI algorithms (well, I think I was paid $450 per month for my column, so I'm not really sure that "justified" 80 hours of programming, but I was young). Man, would I love to see how this algorithm works for training a neural network... |
|
|
|
|
Sunday, December 31, 2006 |
|
|
According to InformationWeek, Umakant Sharma, seeded 2nd in a tournament in New Delhi, was caught with a Bluetooth headset stitched into a cap that he wore "pulled down over his ears" during competition. According to the All India Chess Federation, accomplices fed him moves from a chess program. He's been banned by FIDE for 10 years. This reminds me of something I've discussed before -- during Gary Kasparov's famous 1997 battle with Deep Blue, he demanded that the program's code be escrowed because Kasparov was of the belief that no computer could generate such play and that a human or humans must be feeding the machine moves. That response -- an expert in his domain asserting that computer behavior "must be from a human" -- always struck me as more important than the ability of the computer to ultimately grind down the world's best chess player. The response was the first, and to date, closest thing to a triumph in the Turing test. I have to admit it also reminds me of my own scandalous behavior in 3rd grade, when as a Cub Scout I made a pinewood derby car into which I could slip a fishing weight after the official weighing. I was caught because my car didn't just win the race, it ran down the ramp about twice as fast as anything else (objects of different mass might fall in a vacuum at equal speed; objects with wire axles running through a wood block, not so much). Needless to say, that was the end of my time in scouting. (Although, to be fair, they actually wanted to have some kind of disciplinary thing and then let me continue. Somehow I never made it over to the Brennan's house for that meeting.) |
|
|
|
|
Tuesday, October 03, 2006 |
|
|
The Netflix optimization challenge exemplifies a situation for which there should be a solution, but which I've not seen a good answer. Namely: for-profit but initially ad-hoc cooperation. For instance, let's just say that I made the case that a Pandora-like "Movie Genome Project" was the key to winning the prize. And let's say that you are sitting on top of a data-mining algorithm that you think will work great in conjunction with such a database. And let's say that there are 5 people who, reading this, think "Well, I might not be able to contribute an algorithm, but for a piece of $1M, I'd be willing to 'genotype' some movies." The problem is: how do we go about working with each other? In the Open Source world, one can prototype a project by throwing it against the wall and seeing if it sticks: if people contribute or show interest, one can make a judgment about continuing or discontinuing the project. This works because all work, whether prototype or production, is given the same (free) value. However, if there's money at stake, one cannot begin prototyping until "what if we win?" is sorted out. Even more importantly, the payoff percentages seem to necessarily be predetermined even though the relative contributions of all parties to the task won't be known until after-the-fact. I wonder if some derivative of a fairness-ensuring "cake cutting" algorithm can be applied to the problem. |
Tuesday, October 03, 2006 8:50:50 AM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI
|
|
|
|
|
Scott Swigart points to this article that gives a non-technical overview of the use of genetic algorithms to determine the optimal tuning characteristics of one's Linux Kernel. This ought to work: many years ago I wrote a genetic algorithm that tuned the optimization parameters of one's C++ compiler and it worked perfectly (well, who knows if it worked perfectly, but it did create better runtime performance than one would generally get from naive optimization options). |
Tuesday, October 03, 2006 7:22:47 AM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI
|
|
|
|
Monday, October 02, 2006 |
|
|
Netflix is offering $1M to the first person who can achieve 10% better movie recommendations than their current system. Sweet. I have all sorts of ideas on this. Thank heavens I have copious spare time. |
Monday, October 02, 2006 12:50:49 PM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI | Knowing
|
|
|
|
Friday, September 08, 2006 |
|
|
Infants exposed to a Mandarin-speaking adult for less than 5 hours (25-minute sessions over 4 weeks) were "able to distinguish phonetic elements of that language." Very impressive. But infants exposed to a similar amount of speech delivered over DVD could not. Fascinating. General-audience article here (via The Old New Thing) My guess is that there's a difference-in-kind to the type of attention that infants pay humans to the type of attention they pay brightly flickering screens. It would be interesting to see the effect of exposure to, say, a guy in a big purple talking dinosaur suit. Or is the key ingredient perhaps, non-verbal facial communication (eyes and so forth)? I guess I can avoid "Offtopic" by slotting this under "AI"... |
Friday, September 08, 2006 8:48:00 AM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI
|
|
|
|
Thursday, August 10, 2006 |
|
|
Drools.NET is a port of the Drools library to .NET. I have a bg, big architectural decision coming up for a client and I am debating about whether to tackle the issue with an inference engine or a scripting language. So I've been looking at Drools pretty closely. It's okay. I wouldn't put it in the same league as ILog JRules, but the price is right and it seems to have momentum.
In the .NET front, I've been told there's a Rete-based inference engine inside BizTalk (?), but I never followed up on that. Another tool is mTuitive's xPert Authoring Environment. More to check out.... |
Thursday, August 10, 2006 10:00:00 PM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI | DotNet
|
|
|
|
Wednesday, August 09, 2006 |
|
|
Brilliant! The Hutter Prize for Lossless Compression (http://www.hutter1.net/prize.htm) takes as its challenge the task of compressing 100MB of Wikipedia text into the pre-competition best of ~17MB. The idea is that a chunk of Wikipedia text that big has characteristics relevant to compression that go beyond statistical analysis (i.e., "meaning"). The deliverable must be entirely self-contained, but it can be near 17MB in size in order to get in the money, so that's a lot of space for generative code (there are no restrictions on runtime speed or memory consumption). |
Wednesday, August 09, 2006 10:00:00 PM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI
|
|
|
|
Wednesday, July 19, 2006 |
|
|
NaturallySpeaking 9, coming out in August, claims to dramatically reduce the time it takes to model your voice, achieving the best-possible recognition soon after opening the box.
For some people, that best-possible recognition is said to be 99%. Maybe. I've probably gone throught the "voice training" process a dozen or more times over the years. Not only have I never achieved 99%, I've never achieved anything usable.
There are several factors: one is that "tethered to your computer, wearing a noise-cancelling headset, and watching the dictation in realtime," is not appealing to me. The second is that when you make a typo you are off by a coupe letters and then you get back on track. When a voice-reco system fails, the error mode is a parlor-game chain of semi-homonyms "wrecks a beach" == "recognize speech".
I'm ever optimistic, though. As a writer, I'd love to be able to do significant amounts of work using a digital recorder (PDA, smartphone, what-have-you) on the beach. I've even thought of trying out those lost-cost (human) transcription services. Maybe I'll give that a shot this National Novel Writing Month. |
Wednesday, July 19, 2006 8:13:23 AM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI | Knowing
|
|
|
|
Monday, July 10, 2006 |
|
|
With all my AI posts lately, I'm sorry I hadn't realized that the May 2006 issue of the CACM had a theme on the language-action perspective, a critique by Terry Winograd and Fernando Flores that dates from 1986 whose essential point the CACM summarizes neatly:
[S]killful action always occurs in a context set by conversations, and in the conversations people perform speech acts by they commit to and generate the action. Expert behavior requires an extensive sensitivity to context and an ability to know what to commit to. Computing machines, which are purposely designed to process symbols independent of their context, have no hopes of becoming experts.
It's a cutting insight and goes, I think, to why expert systems, for instance, initially seem very exciting but, in the real world, generally fail to provide a lot of value. (They're great for training operators, though!)
|
|
|
|
|
Sunday, July 09, 2006 |
|
|
Exploring Artificial Intelligence is an exciting prospect for non-professional programming (it's a quite rare part of professional programming). Rather than criticize others for being over-reaching in their 3-month long projects, I asked myself what project would I tackle if I had 3 months, a copy of Visual Studio Express, and wanted to produce (with at least 50% chance of success) a "cool" application that helped build enthusiasm and explain some concept in AI?
It's a hard question.
It's difficult to think of something that is both hard enough to take 3 months to do in a prototyping, non-deliverable manner (assuming that you don't have to learn the technology and assuming that you don't have to care about deployment and hardware and graphics and graceful exception handling, etc.) and simultaneously reasonably achievable (that is, not requiring some "a miracle happens here" serendipitous breakthrough in the course of development). I think that this "achievability chasm" is actually worth reflecting upon as one of the reasons that AI is languishing: you can learn the algorithm and do the canonical textbook sample but convincing yourself (and others) that the technique will yield real-world value requires a commitment (more than 3 months) that very, very few people have the resources to attempt.
Some things that I wouldn't do:
- Recreate a classic experiment. This wouldn't take 3 months.
- A workbench. This was a tougher one: a workbench for developing fuzzy logic or exploring evolutionary computation is something that you could make some serious headway on in 3 months. But, in terms of exciting others, workbenches don't cut it. There have long been, for instance, workbenches for neural networks and, as valuable as they might be to experienced practitioners, I think that newcomers don't respond to them in the way that one might hope.
- Hard-coded game AI. For instance: "Write the best damned squad tactics mod for your favorite FPS." The thing is, the goal is to build enthusiasm and excitement for some legitimately non-mainstream technique, not to write a couple thousand lines of imperative code that momentarily impress. Not that game AI is off-limits, only that simply slogging away with top-down implementation does not meet the goals.
I thought about the problem both in terms of algorithms / technologies we consider "AI" (e.g., neural nets, evolutionary computation) and functions (machine sensing, learning, manipulating). I eventually mind-mapped several ideas:
Test Jeff Hawkins' (On Intelligence) Architecture on Tic-Tac-Toe
Hawkins' theories have appealing characteristics, but it needs to be tested on a humble task. He details his proposed cortical architecture well enough that if it works as he says, you should see unique pattern recognition even if you implement it with a classical training algorithm. The challenge I propose is the rotation-invariance of tic-tac-toe: any learning algorithm that can learn that the rule is "if the first play is in a corner, play to the opposite corner," without being exposed to that in the training set or having rotation hard-coded would powerfully validate the algorithm.
Evolvable Artificial Life in the CLR
This probably breaks my "no recreation of classic experiments" rule (ref. Tierra), but it would be interesting to write a genotypic-phenotypic system that generates valid IL. This would allow for experiments in artificial life within the safe confines of the CLR.
That is, the odds of a random bunch of bytes being interpretable as a valid computer function is extremely low. That makes it very difficult to explore machine-generated functions. However, if what you generate are "known valid" terminals and branches (essentially, a tree), you can then see if various evolutionary models generate complexity.
Fuzzy Logic Workbench
Definitely breaks my "no workbench" rule, but fuzzy logic just screams for the type of 3-D visualization of response manifolds that would be relatively easy to do in WPF.
Webcam-Based Weather Recognition/Prediction
A program that can look at many outside-pointing Webcams and determine whether it's sunny or cloudy, raining or snowing (good challenge for neural nets). Use geotagging to correlate Webcams. Use weather station information to gather learning data. Can a Webcam in Montreal predict tomorrow's weather in Boston?
What Makes a Flickr Photo a Favorite?
A good challenge for machine learning techniques. Search for correlations in the tags. Do basic image processing (color harmony? contrast? edges? skin tones?).
Offline Voice Recognition in Noisy Environments
I submit that the premise that voice recognition should be focused on realtime transcription is flawed. For one thing, composing by continuous dictation is very hard (see Recording in Thoughts). For another, when I'm sitting in front of my computer, I'm willing -- nay, eager -- to use the keyboard and mouse; it's when I'm in a mobile context that I want to use my voice. And, in the mobile context, what I'm interacting with is a small device: a cellphone, PDA, or digital recorder.
Even though some programs allow .WAV file input (Naturally Speaking 8 ), the underlying processing is still realtime (and, last time I checked, single threaded). This is foolish! I would love to take a crack at processor- and database/Internet-intensive voice recognition. For instance, rank every word-pair alternate via their Google distance (the # of Google returns for the word pair); use WordNet to create alternate parse trees from alternates; apply multiple noise filters to the input to see if the recognition changes; use the database of prior recognition for a dictionary, etc.
Note that I'm not talking about the actual transformation of a sound file into a text alternate -- leave that to the existing, pretty easy-to-use APIs. I'm talking about primarily about intense post-processing (and, for the application of filters to the .WAV file, pre-).
For a short project, the focus would have to be very tight. There are two holy grails for this type of voice recognition: voicemail and in-car dictation. My idea would be the recognition of one- to two-sentence task-oriented utterances: "Pick up bacon at the store," and "Call Bob back."
Generating Narratives
My hunch on consciousness is that it is a semi-continuous narrative whose form is generated by relatively hard-coded rules that interact with a "grammar organ" and whose subject is focused by subliminal processes controlling attention and intention. Obviously, I'm hand-waving huge problems relating to these subliminal aspects, but I think that between WordNet, Wikipedia, and Google, there's a real potential for generating complex narratives, even while punting on the underlying intention. In other words, I think that you could at least win the Loebner Prize...
An Evolvable DSL For Poker AI
The program GS2 is favored to win the AAAI Computer Poker Competition in a few days. GS2 apparently dynamically develops its strategy based on game theory. An evolvable DSL that described poker strategies would be fascinating.
Evolving Teams for Fantasy/Rotisserrie Leagues
One of the nice things about fantasy baseball/football/etc. is that you have a task that's driven essentually by statistics and chance, so you have a good chance at creating a system that could reliably beat poor human players. The task here would be to create a Learning Classifier System from which you could extract "good" rules.
Autonomous Blimpbot
Robotics are the new personal computers. If I had the soldering skillz, I'd love to create a self-directing robotic blimp: start with a remote-controlled blimp with a mounted camera, hack a digital controller ("miracle happens here" for me, but for other people, I'm sure it's do-able), and go forth. |
|
|
|
|
Friday, July 07, 2006 |
|
|
Rick Strom has a nice page showing how he used genetic programming to create a path-finding algorithm. This is real genetic programming, which differs from a genetic algorithm in that GP evolves an actual behavior tree of (potentially) arbitrary size, while a GA evolves the optimal parameters for a complex but pre-existing function.
GP is generally less accessible than GA programming and virtually all GP code is implemented in LISP. Strom's code is in C++ and thus may be appealing to a broader audience. |
|
|
|
|
Thursday, July 06, 2006 |
|
|
Snap! Ayende Rahein has shown how Active Record can be used to implement a rules engine. I'd had some thoughts about backward-chaining and LINQ lately as part of a forthcoming post, it's nice to see some groundwork laid.
BTW, this is via Sam Gentile, whose blog is very high quality and who's looking to increase his subscriber base. Definitely subscribe if you're looking for high-end advice on agile techniques for advanced .NET development. |
Thursday, July 06, 2006 10:00:00 PM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI | DotNet
|
|
|
|
Monday, July 03, 2006 |
|
|
In an earlier post, I was thoughtlessly harsh about the finalists in Microsoft's "Made in Express" contest. It's come to my attention that at least several of the contestants felt insulted by the post. To them I apologize: I wish them nothing but the best and envy them their enthusiasm and involvement with such ambitious projects.
|
|
|
|
|
Monday, May 29, 2006 |
|
|
An Old Neural Net Programmer on Jeff Hawkins' Model of Intelligence
Monday, May 29, 2006
7:07 AM




There's one big issue with the simplest possible 3-layer artificial neural network:
Created with Microsoft Office OneNote 2007 (Beta) One place for all your information
Download: An Old Neural Net Programmer on.one
|
|
|
|
|
Sunday, May 28, 2006 |
|
|
Sunday, May 28, 2006
9:28 PM
Daniel Crenna, one of the finalists in the "Made in Express" contest, felt I went too far in dismissing the entrants when I said their projects were unrealistically ambitious. One of his co-finalists is a professor of robotics, Daniel is confident of his approach, etc. Okay. Why someone capable of writing, in a month or two, a realtime 3-D vision system in C# from scratch is looking to win a $10,000 prize is beyond me, but bully for them for doing it.
Crenna is developing a domain-specific (visual) language for poker robots. He says that he doesn't intend to "advance the state of the art (not in this competition, anyway), but I will do my best to make what is currently available more accessible," with a drag-and-drop interface. This is a worthy goal and not in the same realm of ambition that the 3-D vision system is. I think that it still lies in the realm of "if you can do this, you shouldn't be giving it up for a $10K prize," but that's his business, not mine.
Modeling poker is a fascinating problem. I have just subscribed to Crenna's blog and look forward to reading on his progress: I hope he'll forgive me kibbutzing.
The thing about Poker, and Texas Hold-Em in particular, and Tournament Texas Hold-Em in double-particular, is that it brings the forefront the problem of modeling intentionality. First-order intentionality is when you look at your cards and say "I believe I have a strong hand," (and therefore, I will play). That's easy. The great thing about Texas Hold-Em is that while there's variability in what cards will come up, the variability in what cards you have is very small and the importance of first-order intentionality is minimal.
A "poker intelligence" based on first-order intentionality would have a table of starting hands that are "likely to win" and bet on, say, A-7 or better, fold anything else. After that, it would be driven by pot odds. It would ignore other players betting patterns and would be very easy to beat.
Second-order intentionality is (he played and, therefore,) "I think he has a strong hand," and would be necessary for any non-trivial poker intelligence. So, for instance, if the other player opened, the poker intelligence might "put him" on an A-7 or better and compare the pot odds against various predictions of what the other player might do. Obviously, people play differently, but you should get some results if you had a parameterized model ("Aggressive player," "Passive player," etc.).
Poker betting signals third-order intentionality: (I bet aggressively out of position so) "I think he thinks I have a strong hand." And even the lamest poker player understands bluffing "I think (if I bluff) he will think that I think I have a strong hand." To call a bluff requires a decision about fourth-order intentionality: "He thinks I think he thinks I have a good hand," and, just to take things to what is generally considered the human limit, tournament texas hold 'em happens so fast that you have to model your opponent's model for dealing with bluffing: fifth-order intentionality.
By the time you get to fifth-order intentionality, you're verging on comic territory -- "Only a fool would put poison in the cup in front of him!" I don't think that fifth-order intentionality is necessary for a non-trivial poker intelligence, but I do assert that third-order intentionality is necessary, since that's the level at which bluffing takes place.
Another possibility is to collapse the model into statistics: model your opponent as "10% of the time, he's betting over his card's true odds (aka bluffing)," but putting such a parameter to an opponent's play is very difficult since it is difficult to get enough data about your opponent's real situation versus the evolving pot odds. Again, especially in tournament hold 'em.
Created with Microsoft Office OneNote 2007 (Beta) One place for all your information
Download: AI for Poker.one
|
Sunday, May 28, 2006 10:19:17 PM (Hawaiian Standard Time, UTC-10:00) | Disqus link | AI | DotNet
|
|
|
|
|
Unpredictability and recognition systems
Sunday, May 28, 2006
12:18 PM
In reading Jeff Hawkins book On Intelligence I came upon this great anecdote about developing Graffiti:
"I recognized that people were willing to learn a difficult task (typing) because it was a reliable and fast way to enter text into a machine. Therefore if we could create a new method of entering text with a stylus that was fast and reliable, people would use it even though it required learning. So I designed an alphabet that would reliably translate what you write into computer text; we called it Graffiti. With traditional handwriting recognition systems, when the computer misinterprets your writing you don't know why. But the Graffiti system always produces a correct letter unless you make a mistake in writing. Our brains hate unpredictability, which is why people hate traditional handwriting recognition systems." (Emphasis added)
To this day, I prefer Graffiti for PDA input, although I would love Shark/Shapewriter (which bolsters Hawkins' point even further). On the other hand, I prefer the TabletPC's TIP and correction UI to Graffiti; I'm not sure it's faster, but the correction UI is good enough that using it is predictable. Voice recognition systems, though, definitely produce the "unpredictable == hateful" reaction in me.
Created with Microsoft Office OneNote 2007 (Beta) One place for all your information
Download: Unpredictability and recognition.one
|
| |
| | |