Archive for 28th May 2006

AI for Poker

Sunday, May 28, 2006

9:28 PM

Daniel Crenna, one of the finalists in the “Made in Express” contest, felt I went too far in dismissing the entrants when I said their projects were unrealistically ambitious. One of his co-finalists is a professor of robotics; Daniel is confident of his approach, etc. Okay. Why someone capable of writing, in a month or two, a realtime 3-D vision system in C# from scratch is looking to win a $10,000 prize is beyond me, but bully for them for doing it.

 

Crenna is developing a domain-specific (visual) language for poker robots. He says that he doesn’t intend to “advance the state of the art (not in this competition, anyway), but I will do my best to make what is currently available more accessible,” with a drag-and-drop interface. This is a worthy goal and not in the same realm of ambition as the 3-D vision system. I think that it still lies in the realm of “if you can do this, you shouldn’t be giving it up for a $10K prize,” but that’s his business, not mine.

 

Modeling poker is a fascinating problem. I have just subscribed to Crenna’s blog and look forward to reading about his progress: I hope he’ll forgive my kibitzing.

 

The thing about Poker, and Texas Hold-Em in particular, and Tournament Texas Hold-Em in double-particular, is that it brings to the forefront the problem of modeling intentionality. First-order intentionality is when you look at your cards and say “I believe I have a strong hand” (and therefore, I will play). That’s easy. The great thing about Texas Hold-Em is that while there’s variability in what cards will come up, the variability in what cards you have is very small and the importance of first-order intentionality is minimal.

 

A “poker intelligence” based on first-order intentionality would have a table of starting hands that are “likely to win” and bet on, say, A-7 or better, and fold anything else. After that, it would be driven by pot odds. It would ignore other players’ betting patterns and would be very easy to beat.
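
To make that concrete, here is a minimal Python sketch of such a first-order player. Everything in it is an assumption for illustration: the function names are hypothetical, “A-7 or better” is read crudely as “an ace with a 7-or-better kicker” (so it would even fold pocket kings), suits are ignored, and the win probability is simply handed in rather than computed.

```python
RANKS = "23456789TJQKA"  # low to high; hole cards are two rank characters, e.g. "A7"


def is_playable(hole: str, min_kicker: str = "7") -> bool:
    """Crude starting-hand table: an ace plus a kicker of min_kicker or better.

    This is only one assumed reading of "A-7 or better"; a real table
    would also include pairs and other strong holdings.
    """
    hi, lo = sorted((RANKS.index(hole[0]), RANKS.index(hole[1])), reverse=True)
    return hi == RANKS.index("A") and lo >= RANKS.index(min_kicker)


def pot_odds(pot: float, to_call: float) -> float:
    """Break-even win probability: the share of the final pot we must put in."""
    return to_call / (pot + to_call)


def first_order_action(hole: str, pot: float, to_call: float, win_prob: float) -> str:
    """Decide using only our own cards and the pot; opponents are ignored."""
    if not is_playable(hole):
        return "fold"
    return "call" if win_prob >= pot_odds(pot, to_call) else "fold"
```

Note that nothing in `first_order_action` looks at who bet or how, which is exactly why a player like this would be so easy to beat.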

 

Second-order intentionality is (he played and, therefore,) “I think he has a strong hand,” and would be necessary for any non-trivial poker intelligence. So, for instance, if the other player opened, the poker intelligence might “put him” on an A-7 or better and compare the pot odds against various predictions of what the other player might do. Obviously, people play differently, but you should get some results if you had a parameterized model (“Aggressive player,” “Passive player,” etc.).
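
A parameterized model of this kind might be sketched as below, using Bayes’ rule to turn “he opened” into a belief about his hand. The profile names come from the paragraph above, but every number here is invented for illustration, as are the `Profile` class and the function name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical opponent parameters; the values below are made up."""
    open_if_strong: float  # P(opens | holds a strong hand)
    open_if_weak: float    # P(opens | holds a weak hand)


PROFILES = {
    "aggressive": Profile(open_if_strong=0.95, open_if_weak=0.40),
    "passive": Profile(open_if_strong=0.70, open_if_weak=0.05),
}


def p_strong_given_open(profile: Profile, prior_strong: float = 0.15) -> float:
    """Bayes' rule: P(strong | opened), given an assumed prior for strong hands."""
    strong_and_open = profile.open_if_strong * prior_strong
    weak_and_open = profile.open_if_weak * (1.0 - prior_strong)
    return strong_and_open / (strong_and_open + weak_and_open)
```

With these invented numbers, an open from the “passive” profile implies a strong hand roughly 71% of the time, against roughly 30% for the “aggressive” profile; the same bet means different things coming from different players.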

 

Poker betting signals third-order intentionality: (I bet aggressively out of position so) “I think he thinks I have a strong hand.” And even the lamest poker player understands bluffing: “I think (if I bluff) he will think that I think I have a strong hand.” To call a bluff requires a decision about fourth-order intentionality: “He thinks I think he thinks I have a good hand.” And, just to take things to what is generally considered the human limit, tournament Texas Hold-Em happens so fast that you have to model your opponent’s model for dealing with bluffing: fifth-order intentionality.

 

By the time you get to fifth-order intentionality, you’re verging on comic territory — “Only a fool would put poison in the cup in front of him!” I don’t think that fifth-order intentionality is necessary for a non-trivial poker intelligence, but I do assert that third-order intentionality is necessary, since that’s the level at which bluffing takes place.

 

Another possibility is to collapse the model into statistics: model your opponent as “10% of the time, he’s betting over his cards’ true odds (aka bluffing).” But fitting such a parameter to an opponent’s play is very difficult, since it is hard to get enough data about your opponent’s real situation versus the evolving pot odds. Again, especially in tournament Hold-Em.
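
One way to see the data problem is to treat the bluff rate as a Beta-distributed unknown and look at how wide the estimate stays when only a handful of hands reach a showdown. This is a sketch under my own assumptions (a uniform prior, and a hypothetical function name), not a claim about how any real poker bot does it.

```python
from math import sqrt


def bluff_rate_posterior(bluffs: int, showdowns: int,
                         prior_a: float = 1.0, prior_b: float = 1.0):
    """Return (mean, std) of a Beta posterior over the opponent's bluff rate.

    Only hands that reach a showdown reveal whether a bet was a bluff,
    so `showdowns` is usually a small number. The uniform Beta(1, 1)
    prior is an assumption.
    """
    a = prior_a + bluffs
    b = prior_b + (showdowns - bluffs)
    mean = a / (a + b)
    std = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, std
```

After 1 revealed bluff in 10 showdowns, the estimate is roughly 0.17 with a standard deviation of about 0.10, which is nearly useless; it takes something like 100 bluffs in 1,000 showdowns to bring the spread down to about 0.01, and a tournament rarely gives you that many looks at one opponent.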

 

 

Created with Microsoft Office OneNote 2007 (Beta)
One place for all your information

Download: AI for Poker.one

Unpredictability and recognition systems

Sunday, May 28, 2006

12:18 PM

In reading Jeff Hawkins’ book On Intelligence, I came upon this great anecdote about developing Graffiti:

 

“I recognized that people were willing to learn a difficult task (typing) because it was a reliable and fast way to enter text into a machine. Therefore if we could create a new method of entering text with a stylus that was fast and reliable, people would use it even though it required learning. So I designed an alphabet that would reliably translate what you write into computer text; we called it Graffiti. With traditional handwriting recognition systems, when the computer misinterprets your writing you don’t know why. But the Graffiti system always produces a correct letter unless you make a mistake in writing. Our brains hate unpredictability, which is why people hate traditional handwriting recognition systems.” (Emphasis added)

 

To this day, I prefer Graffiti for PDA input, although I would love Shark/Shapewriter (which bolsters Hawkins’ point even further). On the other hand, I prefer the TabletPC’s TIP and correction UI to Graffiti; I’m not sure it’s faster, but the correction UI is good enough that using it is predictable. Voice recognition systems, though, definitely produce the “unpredictable == hateful” reaction in me.

 

Download: Unpredictability and recognition.one

On “On Intelligence”

Sunday, May 28, 2006

9:11 AM

On the suggestion of John Lam, I bought Jeff Hawkins’ book On Intelligence and pretty much read it through in one sitting. That was possible because it’s a very accessible book and also because, to a large extent, I was already exposed and sympathetic to the grand themes: minds are what brains do; pattern recognition, association, and prediction are close to, if not indistinguishable from, “intelligence”; and the brain’s mechanisms for doing such things are based on highly interconnected hierarchies of localized neuronal structures that exhibit “fire together, wire together” reinforcement / learning. (For those interested in such things, it should be noted early that Hawkins punts on the problems of qualia, so as far as I am aware, that debate still ends with Dennett and Searle.)

 

Hawkins’ biggest hypothesis, though, is one that I find intriguing but far from self-evident: that a single algorithm that produces “predictive ability [from] a hierarchical memory” is sufficient to achieve intelligence. Hawkins was a founder of Palm and Handspring, and I was surprised that On Intelligence rests on the structure of the neocortex and not on computational structures. Of course, a premise is that the neural structures he talks about are computational, and he presents his theory in terms of a few block diagrams, but the book is several steps away from presenting the source code, as it were.

 

This is a little disappointing, because Hawkins’ hypothesis could be tested with relatively easy computational experiments. That the human brain has a gazillion neurons and a bazillion interconnections is largely irrelevant if the phenomena that Hawkins claims arise from relatively small collections of neurons. A testbed that maps well onto Hawkins’ theories is the world of “complete information, zero-sum, binary placement gridded games.” Tic-Tac-Toe, Reversi/Othello, and Go are all this type of game: they all take place on a grid and involve sequential placement of opposing symbols. Unlike Poker, there’s no hidden knowledge or probability issues; winning is binary; and unlike Chess or Checkers, you don’t have to understand movement. A “play” is the changing of a position from an indeterminate state to a determinate one and the consequences of that play. Helpfully, Tic-Tac-Toe is close to trivial and Go is computationally more difficult than Chess. Any model that could solve Tic-Tac-Toe and simply be scaled to Go would be an incredible triumph.

 

And although we probably don’t have much insight into what Hawkins’ theories predict for the form of a Go-solving intelligence, they are testable with Tic-Tac-Toe. It should be straightforward to create a “cortexlike memory system” whose “sensory inputs” are Tic-Tac-Toe placement sequences. So, for instance, a blank grid followed by an X in the upper-left and then an O in the lower-right might be encoded as: “??? ??? ???, X?? ??? ???, X?? ??? ??O”. If Hawkins’ model is correct, then one would expect certain things to be emergent in the self-organizing higher levels of his hierarchies. Note that we are not testing the ability of a connectionist system to play Tic-Tac-Toe (Tic-Tac-Toe can be “solved” with a traditional neural net approach; Go almost certainly cannot). Specifically, it follows from Hawkins’ theories that in Tic-Tac-Toe, if the first play is to a corner, then a single high-level component should be responsible for “expecting” play to the opposite corner, no matter which corner is initially played and even if the system has never been exposed to play from that particular corner. Optimal play in Tic-Tac-Toe is very straightforward and recognizable, and rotational equivalence is trivial compared to the types of sensory interpolation the book asserts are explainable; and yet it is not self-evident that Hawkins’ model will pass such a test.
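
The input side of such an experiment is easy to sketch. The Python below produces the encoding described above and, as a hand-coded baseline, shows what “rotational equivalence” means for corner openings. It is emphatically not an implementation of Hawkins’ memory system: the function names are my own, and the whole point of the test is whether his hierarchy would discover this equivalence without being given `rotate`.

```python
def render(board) -> str:
    """Render a 3x3 grid as three space-separated rows, e.g. 'X?? ??? ??O'."""
    return " ".join("".join(row) for row in board)


def encode(moves) -> str:
    """Encode a placement sequence as comma-separated board states.

    `moves` is a list of (mark, row, col) tuples; '?' marks an
    undetermined position, as in the encoding described in the text.
    """
    board = [["?"] * 3 for _ in range(3)]
    states = [render(board)]
    for mark, row, col in moves:
        board[row][col] = mark
        states.append(render(board))
    return ", ".join(states)


def rotate(board_str: str) -> str:
    """Rotate an encoded board 90 degrees clockwise."""
    rows = board_str.split()
    return " ".join(
        "".join(rows[2 - j][i] for j in range(3)) for i in range(3)
    )


def canonical(board_str: str) -> str:
    """Pick one representative among the four rotations of a board."""
    forms = [board_str]
    for _ in range(3):
        forms.append(rotate(forms[-1]))
    return min(forms)
```

For example, `encode([("X", 0, 0), ("O", 2, 2)])` yields the sequence string given above, and all four corner openings map to the same `canonical` form. A system that passes the test would have to behave as if it had learned `canonical` on its own.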

 

The book has a companion site at OnIntelligence.org. I just posted a comment there similar to the paragraph above; we’ll have to see if anything comes of it.

 

 

 

 

 

Download: On On Intelligence.one