Fascinating: Language Exposure Must Be Live, Not Recorded

Infants exposed to a Mandarin-speaking adult for less than 5 hours (25-minute sessions over 4 weeks) were “able to distinguish phonetic elements of that language.” Very impressive. But infants exposed to a similar amount of speech delivered over DVD could not. Fascinating. General-audience article here (via The Old New Thing).

My guess is that there’s a difference in kind between the type of attention infants pay humans and the type they pay brightly flickering screens. It would be interesting to see the effect of exposure to, say, a guy in a big purple talking-dinosaur suit. Or is the key ingredient perhaps non-verbal facial communication (eyes and so forth)?

I guess I can avoid “Offtopic” by slotting this under “AI”…

Posted in AI

Drools (Java-based inference engine) ported to .NET

Drools.NET is a port of the Drools library to .NET. I have a big, big architectural decision coming up for a client and I am debating whether to tackle the issue with an inference engine or a scripting language. So I’ve been looking at Drools pretty closely. It’s okay. I wouldn’t put it in the same league as ILog JRules, but the price is right and it seems to have momentum.

On the .NET front, I’ve been told there’s a Rete-based inference engine inside BizTalk (?), but I never followed up on that. Another tool is mTuitive’s xPert Authoring Environment. More to check out….
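For concreteness, here is what “inference engine” boils down to, stripped to the studs: a match-fire loop over a working memory of facts. (A sketch only, with made-up rule and fact names; this is not the Drools API, and a real Rete network exists precisely to avoid this naive re-matching.)

```java
import java.util.*;
import java.util.function.Predicate;

public class ForwardChainDemo {
    // A rule: if working memory satisfies the condition, assert the conclusion.
    record Rule(String name, Predicate<Set<String>> condition, String conclusion) {}

    public static void main(String[] args) {
        Set<String> facts = new HashSet<>(List.of("order.total>1000", "customer.isNew"));

        List<Rule> rules = List.of(
            new Rule("big-order",
                f -> f.contains("order.total>1000"),
                "order.needsApproval"),
            new Rule("new-customer-approval",
                f -> f.contains("order.needsApproval") && f.contains("customer.isNew"),
                "route.to.seniorRep"));

        // Naive match-fire loop, repeated until no rule adds a new fact.
        // Rete-based engines (Drools, JRules) avoid re-matching every
        // rule against every fact on every cycle; that is their whole point.
        boolean fired = true;
        while (fired) {
            fired = false;
            for (Rule r : rules)
                if (r.condition().test(facts) && facts.add(r.conclusion()))
                    fired = true;
        }
        System.out.println(facts); // includes the two derived facts
    }
}
```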

Compress Wikipedia, Win 20,000 Euros

Brilliant! The Hutter Prize for Lossless Compression (http://www.hutter1.net/prize.htm) takes as its challenge the task of compressing 100MB of Wikipedia text beyond the pre-competition best of ~17MB. The idea is that a chunk of Wikipedia text that big has characteristics relevant to compression that go beyond statistical analysis (i.e., “meaning”). The deliverable must be entirely self-contained, but it can be near 17MB in size and still get in the money, so that’s a lot of space for generative code (there are no restrictions on runtime speed or memory consumption).
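To see why ~17MB is such an interesting number, here’s a back-of-the-envelope sketch: an order-0 (context-free) entropy estimate of a corpus. Pure character statistics bottom out far above 17MB, which is why the prize is betting further gains require something like “meaning.” (The code is mine and illustrative; pass it any large text file.)

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class Order0Entropy {
    public static void main(String[] args) throws IOException {
        // Pass the corpus path on the command line, e.g. the contest's 100MB dump.
        byte[] data = Files.readAllBytes(Path.of(args[0]));

        long[] counts = new long[256];
        for (byte b : data) counts[b & 0xFF]++;

        // Shannon entropy over single bytes, ignoring all context.
        double bits = 0;
        for (long c : counts) {
            if (c == 0) continue;
            double p = (double) c / data.length;
            bits += -c * (Math.log(p) / Math.log(2));
        }
        System.out.printf("order-0 entropy: %.2f bits/byte%n", bits / data.length);
        System.out.printf("implied order-0 floor: ~%.1f MB%n", bits / 8 / 1e6);
        // Typical English text lands somewhere around 4.5-5 bits/byte here,
        // i.e. the no-context statistical floor for 100MB sits way above 17MB.
    }
}
```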

Posted in AI

Makers of NaturallySpeaking Raising Expectations for Voice Recognition

NaturallySpeaking 9, coming out in August, claims to dramatically reduce the time it takes to model your voice, achieving the best-possible recognition soon after opening the box.

For some people, that best-possible recognition is said to be 99%. Maybe. I’ve probably gone through the “voice training” process a dozen or more times over the years. Not only have I never achieved 99%, I’ve never achieved anything usable.

There are several factors: one is that being “tethered to your computer, wearing a noise-cancelling headset, and watching the dictation in realtime” is not appealing to me. The second is the error mode: when you make a typo, you’re off by a couple letters and then you get back on track; when a voice-reco system fails, you get a parlor-game chain of semi-homonyms (“wrecks a beach” == “recognize speech”).

I’m ever optimistic, though. As a writer, I’d love to be able to do significant amounts of work using a digital recorder (PDA, smartphone, what-have-you) on the beach. I’ve even thought of trying out those low-cost (human) transcription services. Maybe I’ll give that a shot this National Novel Writing Month.

Posted in AI

Genetic Algorithms Outperform Humans In…

The Catalogue of Variable Frequency and Single-Resistance-Controlled Oscillators Employing A Single Differential Difference Complementary Current Conveyor, which I imagine is self-explanatory to electrical engineers. Silver went to Multiobjective Genetic Algorithms for Multiscaling Excited-State Direct Dynamics in Photochemistry. Bronze prizes went to two things that I could actually understand: A multi-population genetic algorithm for robust and fast ellipse detection and Using Evolution to Learn How to Perform Interest Point Detection.

Posted in AI

The Language-Action Perspective: AI is Impossible?

With all my AI posts lately, I’m sorry I hadn’t realized that the May 2006 issue of the CACM had a theme on the language-action perspective, a critique by Terry Winograd and Fernando Flores dating from 1986, whose essential point the CACM summarizes neatly:

[S]killful action always occurs in a context set by conversations, and in the conversations people perform speech acts by which they commit to and generate the action. Expert behavior requires an extensive sensitivity to context and an ability to know what to commit to. Computing machines, which are purposely designed to process symbols independent of their context, have no hopes of becoming experts.

It’s a cutting insight and goes, I think, to why expert systems, for instance, initially seem very exciting but, in the real world, generally fail to provide a lot of value. (They’re great for training operators, though!)

Posted in AI

AI in 3 Months

Exploring Artificial Intelligence is an exciting prospect for non-professional programming (it’s quite a rare part of professional programming). Rather than criticize others for being overly ambitious, here are some AI projects that I think could be tackled in three months:

A Memory-Prediction (On Intelligence) Architecture on Tic-Tac-Toe

Voice Recognition Post-Processing

Even in recent versions (Naturally Speaking 8), the underlying processing is still realtime (and, last time I checked, single-threaded). This is foolish! I would love to take a crack at processor- and database/Internet-intensive voice recognition. For instance: rank every word-pair alternate via its Google distance (the # of Google returns for the word pair); use WordNet to create alternate parse trees from alternates; apply multiple noise filters to the input to see if the recognition changes; use the database of prior recognitions as a dictionary; etc.

Note that I’m not talking about the actual transformation of a sound file into a text alternate — leave that to the existing, pretty easy-to-use APIs. I’m talking primarily about intense post-processing (and, for the application of filters to the .WAV file, pre-).
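A sketch of the word-pair idea (the hit-count table stands in for a real search-API call, and every number in it is invented):

```java
import java.util.*;

public class AlternateReRanker {
    // Stand-in for "# of Google returns for the word pair": a real version
    // would hit a search API; these counts are invented for illustration.
    static final Map<String, Long> HITS = Map.of(
        "recognize speech", 2_400_000L,
        "wrecks a", 90_000L,
        "a beach", 5_100_000L,
        "wrecks beach", 1_200L);

    static long hitCount(String bigram) {
        return HITS.getOrDefault(bigram, 1L); // smooth unseen pairs to 1
    }

    // Score a candidate transcription by the average log hit count of its
    // adjacent word pairs -- better-attested word sequences score higher.
    static double score(String candidate) {
        String[] w = candidate.toLowerCase().split("\\s+");
        double sum = 0;
        for (int i = 0; i + 1 < w.length; i++)
            sum += Math.log(hitCount(w[i] + " " + w[i + 1]));
        return sum / Math.max(1, w.length - 1);
    }

    public static void main(String[] args) {
        // Pretend these came back from the recognizer as competing alternates.
        List<String> alternates = List.of("wrecks a beach", "recognize speech");
        alternates.stream()
            .sorted(Comparator.comparingDouble((String c) -> -score(c)))
            .forEach(c -> System.out.printf("%.2f  %s%n", score(c), c));
    }
}
```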

For a short project, the focus would have to be very tight. There are two holy grails for this type of voice recognition: voicemail and in-car dictation. My idea would be the recognition of one- to two-sentence task-oriented utterances: “Pick up bacon at the store,” and “Call Bob back.” 

Generating Narratives

My hunch on consciousness is that it is a semi-continuous narrative whose form is generated by relatively hard-coded rules that interact with a “grammar organ” and whose subject is focused by subliminal processes controlling attention and intention. Obviously, I’m hand-waving huge problems relating to these subliminal aspects, but I think that between WordNet, Wikipedia, and Google, there’s a real potential for generating complex narratives, even while punting on the underlying intention. In other words, I think that you could at least win the Loebner Prize…

An Evolvable DSL For Poker AI

The program GS2 is favored to win the AAAI Computer Poker Competition in a few days. GS2 apparently dynamically develops its strategy based on game theory. An evolvable DSL that described poker strategies would be fascinating.
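Sketching what I mean (the strategy language, the names, and the mutation operator are all my invention; I have no idea what GS2’s internals actually look like):

```java
import java.util.*;

public class PokerDsl {
    // A tiny strategy language: conditions over hand features, actions as leaves.
    sealed interface Strategy permits If, Action {}
    record If(String test, Strategy then, Strategy otherwise) implements Strategy {}
    record Action(String name) implements Strategy {}

    // Interpret a strategy against a simple feature map describing the hand.
    static String decide(Strategy s, Map<String, Boolean> hand) {
        return switch (s) {
            case Action a -> a.name();
            case If i -> decide(hand.getOrDefault(i.test(), false)
                                    ? i.then() : i.otherwise(), hand);
        };
    }

    // Point mutation: randomly swap an action leaf. This hook is what makes
    // the DSL "evolvable" (crossover would splice subtrees similarly).
    static Strategy mutate(Strategy s, Random rnd) {
        return switch (s) {
            case Action a -> rnd.nextBoolean() ? new Action("raise") : a;
            case If i -> new If(i.test(), mutate(i.then(), rnd), mutate(i.otherwise(), rnd));
        };
    }

    public static void main(String[] args) {
        Strategy s = new If("pairOrBetter",
            new If("opponentRaised", new Action("call"), new Action("raise")),
            new Action("fold"));
        Map<String, Boolean> hand = Map.of("pairOrBetter", true, "opponentRaised", false);
        System.out.println(decide(s, hand));                      // raise
        System.out.println(decide(mutate(s, new Random(42)), hand));
    }
}
```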

Evolving Teams for Fantasy/Rotisserie Leagues

One of the nice things about fantasy baseball/football/etc. is that the task is driven essentially by statistics and chance, so you have a good chance of creating a system that could reliably beat poor human players. The task here would be to create a Learning Classifier System from which you could extract “good” rules.
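Roughly the representation I have in mind (the conditions, actions, and the credit-assignment step below are stand-ins; a real LCS does bucket-brigade-style credit assignment and runs a GA over the rule population):

```java
import java.util.*;

public class FantasyLcs {
    // One classifier: a condition over player stats, an action, and a fitness
    // that gets credited or penalized as weekly results come in.
    static class Rule {
        final String condition;   // e.g. "last3avg > seasonAvg"
        final String action;      // e.g. "start", "bench", "trade"
        double fitness = 1.0;

        Rule(String condition, String action) {
            this.condition = condition;
            this.action = action;
        }
    }

    public static void main(String[] args) {
        List<Rule> population = new ArrayList<>(List.of(
            new Rule("last3avg > seasonAvg", "start"),
            new Rule("facingTopDefense", "bench")));

        // Credit assignment: reward rules whose advice matched a good outcome.
        for (Rule r : population) {
            boolean ruleHelped = r.action.equals("start"); // stand-in for real scoring
            r.fitness += ruleHelped ? 0.1 : -0.1;
        }

        // Extracting the "good" rules is then just reading off the fittest.
        population.sort(Comparator.comparingDouble((Rule r) -> -r.fitness));
        population.forEach(r ->
            System.out.printf("%.2f  IF %s THEN %s%n", r.fitness, r.condition, r.action));
    }
}
```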

Autonomous Blimpbot

Robots are the new personal computers. If I had the soldering skillz, I’d love to create a self-directing robotic blimp: start with a remote-controlled blimp with a mounted camera, hack a digital controller (“miracle happens here” for me, but for other people, I’m sure it’s do-able), and go forth.

Posted in AI

Evolving a Path-Finding Algorithm

Rick Strom has a nice page showing how he used genetic programming to create a path-finding algorithm. This is real genetic programming, which differs from a genetic algorithm in that GP evolves an actual behavior tree of (potentially) arbitrary size, while a GA evolves the optimal parameters for a complex but pre-existing function.
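In code, the difference looks something like this (toy types, names mine):

```java
import java.util.*;

public class GpVsGa {
    // GA genome: a fixed-length parameter vector for a *pre-existing* function,
    // e.g. weights for an A*-style cost f(n) = w0*g(n) + w1*h(n).
    record GaGenome(double[] weights) {}

    // GP genome: a tree that *is* the function, of any shape or size.
    sealed interface Expr permits Num, Var, Add, Mul {}
    record Num(double v) implements Expr {}
    record Var(String name) implements Expr {}
    record Add(Expr l, Expr r) implements Expr {}
    record Mul(Expr l, Expr r) implements Expr {}

    static double eval(Expr e, Map<String, Double> env) {
        return switch (e) {
            case Num n -> n.v();
            case Var v -> env.get(v.name());
            case Add a -> eval(a.l(), env) + eval(a.r(), env);
            case Mul m -> eval(m.l(), env) * eval(m.r(), env);
        };
    }

    public static void main(String[] args) {
        // GA evolution can only tune these two numbers.
        GaGenome ga = new GaGenome(new double[] {1.0, 1.4});
        System.out.println("GA genome length: " + ga.weights().length);

        // GP evolution can grow e.g. cost = g + 1.4*h, or something
        // structurally new that no one parameterized in advance.
        Expr gp = new Add(new Var("g"), new Mul(new Num(1.4), new Var("h")));
        System.out.println(eval(gp, Map.of("g", 3.0, "h", 2.0))); // 5.8
    }
}
```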

GP is generally less accessible than GA programming and virtually all GP code is implemented in LISP. Strom’s code is in C++ and thus may be appealing to a broader audience.  

Posted in AI

Active Record as a Rule Engine

Snap! Ayende Rahien has shown how Active Record can be used to implement a rules engine. I’d had some thoughts about backward-chaining and LINQ lately as part of a forthcoming post, so it’s nice to see some groundwork laid.
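For reference, backward chaining stripped to its essence is goal-directed recursion over rules: to prove a goal, find a rule that concludes it and prove that rule’s premises. (A sketch of the general technique, not of Ayende’s Active Record implementation.)

```java
import java.util.*;

public class BackwardChainDemo {
    record Rule(String conclusion, List<String> premises) {}

    // A goal holds if it is a known fact, or if some rule concludes it and
    // all of that rule's premises can themselves be proved.
    // (No cycle detection; a real engine needs it.)
    static boolean prove(String goal, Set<String> facts, List<Rule> rules) {
        if (facts.contains(goal)) return true;
        for (Rule r : rules)
            if (r.conclusion().equals(goal)
                    && r.premises().stream().allMatch(p -> prove(p, facts, rules)))
                return true;
        return false;
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
            new Rule("discount", List.of("preferredCustomer", "largeOrder")),
            new Rule("preferredCustomer", List.of("fiveOrdersOrMore")));
        Set<String> facts = Set.of("fiveOrdersOrMore", "largeOrder");

        System.out.println(prove("discount", facts, rules)); // true
    }
}
```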

BTW, this is via Sam Gentile, whose blog is very high quality and who’s looking to increase his subscriber base. Definitely subscribe if you’re looking for high-end advice on agile techniques for advanced .NET development.

Apologies to the “Made in Express” Finalists

In an earlier post, I was thoughtlessly harsh about the finalists in Microsoft’s “Made in Express” contest. It’s come to my attention that at least several of the contestants felt insulted by the post. To them I apologize: I wish them nothing but the best and envy them their enthusiasm and involvement with such ambitious projects.

Posted in AI