Scott Swigart points to this article that gives a non-technical overview of the use of genetic algorithms to determine the optimal tuning characteristics of one’s Linux Kernel. This ought to work: many years ago I wrote a genetic algorithm that tuned the optimization parameters of one’s C++ compiler and it worked perfectly (well, who knows if it worked perfectly, but it did create better runtime performance than one would generally get from naive optimization options).
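For the curious, the general shape of such a tuner can be sketched in a few lines of Python. This is a toy under stated assumptions, not the code I wrote back then: the flag names are made up, and `fitness` is a stand-in for "build with these settings, run a benchmark, return a score."

```python
import random

# Hypothetical tunable settings; names are illustrative, not real compiler flags.
FLAGS = ["inline", "unroll", "vectorize", "omit_frame_pointer", "fast_math", "lto"]

def fitness(genome):
    """Stand-in for 'build with these flags, run a benchmark, return a score'.

    This toy scoring is an assumption made for the example: some flags help,
    one hurts, and two only pay off in combination.
    """
    on = dict(zip(FLAGS, genome))
    score = 2.0 * on["inline"] + 1.5 * on["vectorize"] + 1.0 * on["lto"]
    score -= 1.0 * on["fast_math"]
    if on["unroll"] and on["vectorize"]:
        score += 2.0
    return score

def evolve(pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Each genome is a fixed-length list of 0/1 values, one per flag.
    pop = [[rng.randint(0, 1) for _ in FLAGS] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = [parents[0][:]]              # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(FLAGS))  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = evolve()
print(dict(zip(FLAGS, best)), fitness(best))
```

In a real tuner the expensive part is the fitness call, since every evaluation means a build and a benchmark run, so population size and generation count would be chosen with that cost in mind.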
Netflix is offering $1M to the first person who can achieve 10% better movie recommendations than their current system. Sweet.
I have all sorts of ideas on this. Thank heavens I have copious spare time.
Infants exposed to a Mandarin-speaking adult for less than 5 hours (25-minute sessions over 4 weeks) were “able to distinguish phonetic elements of that language.” Very impressive. But infants exposed to a similar amount of speech delivered over DVD could not. Fascinating. General-audience article here (via The Old New Thing)
My guess is that there’s a difference in kind between the attention infants pay to humans and the attention they pay to brightly flickering screens. It would be interesting to see the effect of exposure to, say, a guy in a big purple talking dinosaur suit. Or is the key ingredient, perhaps, non-verbal facial communication (eyes and so forth)?
I guess I can avoid “Offtopic” by slotting this under “AI”…
Drools.NET is a port of the Drools library to .NET. I have a big, big architectural decision coming up for a client and I am debating whether to tackle the issue with an inference engine or a scripting language. So I’ve been looking at Drools pretty closely. It’s okay. I wouldn’t put it in the same league as ILog JRules, but the price is right and it seems to have momentum.
On the .NET front, I’ve been told there’s a Rete-based inference engine inside BizTalk (?), but I never followed up on that. Another tool is mTuitive’s xPert Authoring Environment. More to check out….
Brilliant! The Hutter Prize for Lossless Compression (http://www.hutter1.net/prize.htm) takes as its challenge the task of compressing 100MB of Wikipedia text into the pre-competition best of ~17MB. The idea is that a chunk of Wikipedia text that big has characteristics relevant to compression that go beyond statistical analysis (i.e., “meaning”). The deliverable must be entirely self-contained, but it can be near 17MB in size in order to get in the money, so that’s a lot of space for generative code (there are no restrictions on runtime speed or memory consumption).
NaturallySpeaking 9, coming out in August, claims to dramatically reduce the time it takes to model your voice, achieving the best-possible recognition soon after opening the box.
For some people, that best-possible recognition is said to be 99%. Maybe. I’ve probably gone through the “voice training” process a dozen or more times over the years. Not only have I never achieved 99%, I’ve never achieved anything usable.
There are several factors: one is that “tethered to your computer, wearing a noise-cancelling headset, and watching the dictation in realtime,” is not appealing to me. The second is that when you make a typo you are off by a couple of letters and then you get back on track. When a voice-reco system fails, the error mode is a parlor-game chain of semi-homonyms: “wrecks a beach” == “recognize speech”.
I’m ever optimistic, though. As a writer, I’d love to be able to do significant amounts of work using a digital recorder (PDA, smartphone, what-have-you) on the beach. I’ve even thought of trying out those low-cost (human) transcription services. Maybe I’ll give that a shot this National Novel Writing Month.
The Catalogue of Variable Frequency and Single-Resistance-Controlled Oscillators Employing A Single Differential Difference Complementary Current Conveyor, which I imagine is self-explanatory to electrical engineers. Silver went to Multiobjective Genetic Algorithms for Multiscaling Excited-State Direct Dynamics in Photochemistry. Bronze prizes went to two things that I could actually understand: A multi-population genetic algorithm for robust and fast ellipse detection and Using Evolution to Learn How to Perform Interest Point Detection.
With all my AI posts lately, I’m sorry I hadn’t realized that the May 2006 issue of the CACM had a theme on the language-action perspective, a critique by Terry Winograd and Fernando Flores that dates from 1986 whose essential point the CACM summarizes neatly:
[S]killful action always occurs in a context set by conversations, and in the conversations people perform speech acts by which they commit to and generate the action. Expert behavior requires an extensive sensitivity to context and an ability to know what to commit to. Computing machines, which are purposely designed to process symbols independent of their context, have no hopes of becoming experts.
It’s a cutting insight and goes, I think, to why expert systems, for instance, initially seem very exciting but, in the real world, generally fail to provide a lot of value. (They’re great for training operators, though!)
Exploring Artificial Intelligence is an exciting prospect for non-professional programming (it’s quite a rare part of professional programming). Rather than criticize others for being On Intelligence) Architecture on Tic-Tac-Toe
Naturally Speaking 8 ), the underlying processing is still realtime (and, last time I checked, single threaded). This is foolish! I would love to take a crack at processor- and database/Internet-intensive voice recognition. For instance, rank every word-pair alternate via their Google distance (the # of Google returns for the word pair); use WordNet to create alternate parse trees from alternates; apply multiple noise filters to the input to see if the recognition changes; use the database of prior recognition for a dictionary, etc.
Note that I’m not talking about the actual transformation of a sound file into a text alternate; leave that to the existing, pretty easy-to-use APIs. I’m talking primarily about intense post-processing (and, for the application of filters to the .WAV file, pre-processing).
For a short project, the focus would have to be very tight. There are two holy grails for this type of voice recognition: voicemail and in-car dictation. My idea would be the recognition of one- to two-sentence task-oriented utterances: “Pick up bacon at the store,” and “Call Bob back.”
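The word-pair ranking idea above could be sketched roughly like this. Everything here is an assumption made for illustration: `PAIR_COUNTS` is a hand-made table standing in for search-engine hit counts, and the list of alternates is what I imagine a recognition API might hand back for a short utterance.

```python
import math
from itertools import product

# Hypothetical co-occurrence counts standing in for search-engine hit counts.
PAIR_COUNTS = {
    ("pick", "up"): 900, ("up", "bacon"): 40, ("bacon", "at"): 60,
    ("at", "the"): 5000, ("the", "store"): 800,
    ("up", "beacon"): 2, ("beacon", "at"): 3, ("the", "stork"): 1,
}

def pair_score(a, b):
    # Add-one smoothing so unseen pairs are penalized rather than fatal.
    return math.log(PAIR_COUNTS.get((a, b), 0) + 1)

def sentence_score(words):
    # Sum the scores of every adjacent word pair in the candidate sentence.
    return sum(pair_score(a, b) for a, b in zip(words, words[1:]))

def best_sentence(alternates):
    """alternates: one list of candidate words per position in the utterance."""
    return max(product(*alternates), key=sentence_score)

# Assumed recognizer output: two positions have competing alternates.
alternates = [["pick"], ["up"], ["bacon", "beacon"], ["at"], ["the"], ["store", "stork"]]
best = " ".join(best_sentence(alternates))
print(best)
```

Trying every combination is only reasonable because the utterances are one or two sentences long; anything longer would want a dynamic-programming search over the alternates instead of brute force.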
My hunch on consciousness is that it is a semi-continuous narrative whose form is generated by relatively hard-coded rules that interact with a “grammar organ” and whose subject is focused by subliminal processes controlling attention and intention. Obviously, I’m hand-waving huge problems relating to these subliminal aspects, but I think that between WordNet, Wikipedia, and Google, there’s a real potential for generating complex narratives, even while punting on the underlying intention. In other words, I think that you could at least win the Loebner Prize…
An Evolvable DSL For Poker AI
The program GS2 is favored to win the AAAI Computer Poker Competition in a few days. GS2 apparently dynamically develops its strategy based on game theory. An evolvable DSL that described poker strategies would be fascinating.
Evolving Teams for Fantasy/Rotisserie Leagues
One of the nice things about fantasy baseball/football/etc. is that you have a task that’s driven essentially by statistics and chance, so you have a good chance at creating a system that could reliably beat poor human players. The task here would be to create a Learning Classifier System from which you could extract “good” rules.
Robotics are the new personal computers. If I had the soldering skillz, I’d love to create a self-directing robotic blimp: start with a remote-controlled blimp with a mounted camera, hack a digital controller (“miracle happens here” for me, but for other people, I’m sure it’s do-able), and go forth.
Rick Strom has a nice page showing how he used genetic programming to create a path-finding algorithm. This is real genetic programming, which differs from a genetic algorithm in that GP evolves an actual behavior tree of (potentially) arbitrary size, while a GA evolves the optimal parameters for a complex but pre-existing function.
GP is generally less accessible than GA programming and virtually all GP code is implemented in LISP. Strom’s code is in C++ and thus may be appealing to a broader audience.
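To make the GA/GP distinction concrete, here is a minimal sketch in Python. It is not Strom’s code and it uses a plain arithmetic expression tree rather than a path-finding behavior; the representation and the function names are my own assumptions for the example.

```python
import random
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# GA individual: a fixed-length parameter vector for a function whose shape
# is decided in advance (here, a*x*x + b*x + c).
def run_ga_individual(params, x):
    a, b, c = params
    return a * x * x + b * x + c

# GP individual: a tree whose shape and size are themselves evolved.
# A node is either "x", an integer constant, or a tuple (op, left, right).
def run_gp_individual(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](run_gp_individual(left, x), run_gp_individual(right, x))
    return tree  # numeric constant

def random_tree(rng, depth):
    # Leaves are "x" or a small constant; inner nodes pick a random operator.
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else rng.randint(-3, 3)
    return (rng.choice(sorted(OPS)),
            random_tree(rng, depth - 1),
            random_tree(rng, depth - 1))

def size(tree):
    # Count the nodes in a tree.
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1

def mutate(rng, tree, depth=2):
    """Subtree mutation: walk down at random and swap in a fresh subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, depth), right)
    return (op, left, mutate(rng, right, depth))

tree = ("+", ("*", "x", "x"), 3)            # x*x + 3, written as a tree
mutated = mutate(random.Random(0), tree)    # same kind of object, possibly a new shape
print(run_gp_individual(tree, 2), size(tree), size(mutated))
```

The GA individual can only ever be three numbers, while the GP individual can grow or shrink under mutation, which is the "potentially arbitrary size" part.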