Grand Master Eliminates Wrong Moves

The superficial mathematical essence of chess is a min max game, but that's only the trivial part. Just to gather the latest number I've had a funny experience with DeepFritz? 8. I just made a draw with the beast (I have to admit that it rarely happens, these days ) on a bi-processors Xeon 2.4 Ghz, 5 min + 2s blitz. DeepFritz? 8 explores the search tree in parallel on as many processors as are available. The engine reported to analyze a staggering
 1567000 tree nodes/second
I really cannot tell exactly how many nodes my poor old neuronal hardware does it, but I'd be very much surprised if I consider as many as 10 -- and that's because it's a blitz game. On a normal game I might do as little as 1 per second. Ok, I am not GrandMaster?, my title is FIDE master, although I play(ed) chess at a decent professional level in international tournaments. I even taught chess to kids with some success, and still do a little in my spare time. The world of human chess and computer chess and how its lessons are applicable to other intellectual activities, and particularly to SoftwareEngineering.

So the main difference between a chess engine and a GrandMaster? or a good human player is that GrandMasterEliminatesWrongMoves. In the min/max tree the chess engines will try to analyze as many as nodes as possible trying to explore the search space exhaustively. In principle they do a breadth first exploration, with a little twist on it: there's a heuristic function that tells the engine to explore deeper on some paths because there is "tension" in the position. This avoids the situation where an engine stops the exploration of a path just one or two move of being mated or before losing material big time. Because of the time limit complete search is not possible so the engines take the decision to limit the depth of breadth first to what is appropriate on the particular CPU that they are running -- (if you want to beat a chess engine, run a little overload on the OS to take it into swapping, and the engine will go nuts because it will not be able to complete the search space it plans initially).

As opposed to a chess engine which tries very hard to explore as many paths as possible, a grand master tries very hard not to explore that many, so his intelligence is in selecting the paths that are really worth exploring, a process otherwise known in CS as pruning the search tree. And human mind is absolutely fabulous at pruning.

How does this pruning happen? Some anonymous contributor claimed that grand masters have a "database" of wrong moves stored in memory, and having good associative memory, recognize the wrong moves from the database. This is completely wrong. Chess players do use their memory quite a lot, but this is only in opening (where it's exactly the other way around: we already know the good moves, we might even play as much as 20 moves without blinking an eye, because it's a well beaten path).

What is at work in pruning the wrong moves and selecting the promising moves worth investigating is what I call (I might be using the wrong terminology), HigherOrderPatterns. First order patterns helps us recognize characters, faces, etc. This is an area where AI has made quite promising progress and it is well understood mathematically. But higher order patterns is recognizing patterns about patterns, or patterns about second order effect.

For example you may program a chess engine to recognize a typical mate configuration. However, a good chess player will recognize, long before the mating configuration can even occur, that a certain plan will create the conditions for the mate configuration to occur. A chess player will recognize also similarities that go way beyond geometrical/mechanical transformations from one position to another.

For example I can teach a kid that when two defender pieces cross paths in their line of defence, sometimes you win by sacrificing a piece where those defenders cross their lines (a trick called interference. I can show him how it works in one particular positions with 2 rooks being the defenders. Not only he will be able later to recognize this when a rook and a bishop are the defenders, but even more amazingly, having learnt 20 or so patterns of attacks, he will be able to sense in a chess puzzle that that particular configuration is riped [?] for interference of all the combinational patterns he learnt. And this talent is innate to humans, it is absolutely amazing. With a neural network you give it a ton of data, and it crunches it and it reaches a configuration that helps it do OCR or voice recognition. All a good 6 yr old kid needs is usually a handful of positions and the logical explanation, and voila, your higher order pattern recognition is ready to fly. Truly talented kids (I've seen a few) may need as little as 1 position. They will look at it exhaustively, not being aware of the pattern, until they get the eureka moment, and then on their own come back with both the logical explanation and the capacity to apply it in the future.

And one of the most important higher order pattern recognition faculty is the capacity of applying chess principles. Chess has a handful of simple principles that if you really follow you are already 3 quarters of the road to being a grand master. There are a few exceptions to those principles and rarely principles may enter in a conflict with each other in particular positions, so the chess player has to decide the best trade-off. When a less skilled player plays with a grand master or somebody with experience, one would think that the weaker player would try to stick as close to the principles as possible in order to be safe, while the grand master will try to be more "creative" in order to exploit his extra capacity. What is amazing is that in practice the opposite happens: the weak player is afraid of grand master's extra knowledge, and tries to steer the game on unbeaten paths, therefore breaking the principles of chess, while the grandmaster just sticks to principles and plays as simply as possible, and more often than not, he is not using even half of his brain in such a game. When players of comparable skills meet, then the most important part is the creativity in applying those principles and recognizing the relative importance of one versus the other. Still a higher order pattern recognition, it goes like this: intuition tells me that in such a configuration the pattern "bad pawn structure" is less important than the pattern "maximize the activity of your pieces". Please note that to maximize the activity of one's pieces is higher order still and depends on more detailed patterns to be applied.
So how does relate to software development (programming):

There are differences however:

-- CostinCozianu (waiting for other contributors)

Management is often open to the idea of doing such things at lunch time, since people might otherwise go out for lunch anyway. I've seen multiple places that call it "the brown bag", meaning you bring your own lunch (in the classic brown paper bag), so that it isn't a budget issue, and people give talks about technology, as you said. Digimarc had this type of weekly seminar. I would think this would be more prevalent at research-based companies and institutions. See LunchnLearn for another example.

Here's an example of higher order pattern recognition: where a park player succeeds, yet the best chess engine to my knowledge fails:

What should black play here? [If you're right, I'm disappointed. In endgames, it can easily be the case that material advantage doesn't lead to a win. Moreover, a blockade should be just another pattern to consider, rather than a "higher order" pattern.] To those who estimated that chess engines should solve this position correctly by exhausting the branches, I think you underestimate the combinatorial explosion in this position. Please note that not only there's a handful of configurations when the pawn structure does not change, but there's also the possibilities that white will sacrifice the rook at some point in time and black will open up the position. Yes, it is easy to create a chess program that will solve this kind of position. Put a special routine, and decide that when the opponent cannot ever create threats (capture a piece and give check), then the position is blocked and the program should not open it up. However that would be a special case designed by you, because you are humans, and you already solved it by your higher abstraction capacity, and now you encode your knowledge of a special case. However, do that in a commercial, general purpose chess engine, and it will simply be a terrible waste of time. I checked this position with 2 different top level engines that I have license for (Fritz 8, Deep Fritz 8, Schredder 7 ), all set at official think time for chess (40 moves in 2 hour). Again Deep Fritz is the latest version of Fritz and was run on a real bi-processor at 2.4Ghz each. Please note that the human also decides (generally much better than a computer, when to invest time budget into thinking more on a position and when to abstain from wasting time calculating -- another higher order pattern. Chess programs have some heuristic for time budgeting, but humans seem to have much more than such a heuristic). Again, if you put the special case for blocked position inside your chess engine, it will have no practical consequence as chances of reaching such positions in practical play are infinitesimal, and checking for such positions is simply a waste of Processor time and instruction cache in the course of regular chess. And if this is a special case, just think how many more special cases there are in chess.It is simply unfeasible to program special cases in chess programs, unless the programmers decide that such special cases have a high occurrence rate and there's a good heuristic for recognizing it. There are special cases in modern chess engines for extending the depth of a generally breadth first search, on certain branches - most of those have to do with tension (pieces attacking each other), danger to the king, possibility of pawn promotion, etc. If you think this position is easy, try to put yourself in the shoes of a chess engine programmer. How would you even think that your engine will recognize positions like this and play them correctly, while for most positions will go the min/max route.

This position reinforces a few memorable sayings of JeanYvesGirard and EwDijkstra, among which: So where's the higher order pattern at work in humans. Like JeanYvesGirard said in an essay, human mathematicians get a lot of mileage from throwing a few on voit bien que in their proofs (approx. transl: we can see easily that). Somebody claimed above that this is simple deduction, not higher order pattern recognition. Well, that deduction starts necessarily with: on voit bien que (white cannot make progress, it cannot attack black pieces). That is, the human sees a pattern of proof and does not bother to formalize the proof (try to to actually prove formally that this position is drawn, and you'll get lost in all kinds of annoying details). Humans not only are content with the sketch of a proof instead of the actual proof, they also are able to select the proof schemes out of the many proof schemes available in approaching a chess position. This warrants in my opinion the name for a new concept: HigherOrderPatterns. While the computer is nothing but a cyber-idiot and that's why it simply cannot skip formal proofs, it cannot recognize when a program (proof heuristic) is more apt to attack one kind of problem then another program. It cannot do anything but crunch numbers, at least for the time being and the foreseeable future. That's why it still fails to address positions such as this. And that's nothing, you may say that the position is completely artificial, but in the most recent matches (Kasparov and Kramnik against two different computers) the game lost by computers were abysmal failures, very much resembling the position above.

It may turn out that chess is the wrong game to prove the limited condition of the cyber-idiot, and enough computing power with min/max and a bit of heuristic will solve any chess problems. It was thought that for all intents and purposes the game of chess was practically infinite, though theoretically finite. Moore's law seems to prove us wrong on this one. But on the other hand try to make computers do Math, and there goes the relevancy of Moore's law out the door. I'm no Go expert, but friends told me that Go is such a wider search space that Go engines are in the minor league compared with Go masters. -- CostinCozianu

Completely correct. The GameOfGo is the new drosophila of ArtificialIntelligence. The GameOfChess is now mere engineering. -- IanOsgood

There's nothing "mere" about it. It has been known for decades that Go is much harder for computers than chess; the status quo there didn't change just because chess programs have improved a lot. There is still a vast amount of potential research that can be done just on chess programs. It is a partially solved problem, not a completely solved problem.

One interesting question is whether algorithmic approaches that work well for Go (hypothetical, as-yet undiscovered ones) could also be used to make chess programs better. -- DougMerritt

I meant it tongue-in-cheek. In ages past, when computers were slow and strong-AI was all the rage, chess was held up as a hard problem that would feed AI research for decades. It turned out that alpha-beta search + faster hardware and plentiful memory was all it took to create world-class players. Most of the additions have been more well-tuned positional evaluators and search extensions, which I would class as engineering rather than basic AI research. DeepBlue and the research leading up to it were interesting exercises in VLSI and parallelism, but not really AI. By contrast, the field of Go programming is pretty wide open, with no two programs even agreeing to have a global search, much less share a search algorithm. Lots of room for basic pattern recognition research and alternative search strategies. -- IanOsgood [Let's see now - black is way down in material, and even taking the rook leaves him still down in material, with enough moves left to explore that a win might not be found in time even if it were available. Therefore, black should consider playing to draw, and that turns out to be very easy, even without recognizing a blockade. All that needs to be noticed is that if black doesn't take the rook, white can't check, even if black's king moves unwisely. If white can't check, he can't win. Hence black can do anything except take the rook and draw. If Fritz doesn't take the draw, that simply shows either that Fritz never takes a draw unless it knows it will be mated otherwise, or that Fritz's methods of determining whether a draw is available are inadequate. Either possibility is disappointing. Of course, Fritz should attempt to find a win for black, but if it fails to do that, it needs a better fallback position than taking the rook, still being down in material, and risking losing.]

[I've just tried Fritz (an early demo version) on a position where the draw can't be missed, and it chose the draw almost immediately. -- Anon] [What may be significant is that recognizing the draw requires specific code, as the rules say the draw occurs after 50 moves (iirc) without a capture or pawn move. The rule may have been varied, as certain positions provide a theoretical win which requires more than fifty moves. I suspect most chess programs would take the draw if they found it was an option in the first place. A good chess program is likely to be used for analysing arbitrary positions as well as playing games, so its end-game analysis ought to cover the rather obscure possibilities as well as more typical situations. -- Anon] [I don't understand the "save a lost game" remark. If a draw occurs, the "lost" game is saved isn't it? Also, I don't see how a special (negative) value for a draw would work - surely common sense implies it's value must lie between the values used for a loss and a win. -- Anon] Update. To its credit Schredder 9 plays the position correctly with black, on my home computer. Deep Fritz is still in trouble. --Costin ''However, switch sides, make the blunder (in a similar position) and let Schredder beat me:

Still Schredder 9 does not quite "understand" the position. A 9 year old with 2 years of chess training should be able to win this position without blinking an eye, but Schredder 9 simply cannot. White has to sacrifice 2 pawns (say on e5, and then on d6 or f6, in order to create space for the white king to penetrate into black's defence. However, this is a plan that Schredder 9 apparently doesn't "see". Recognizing plans is, in my opinion, a form of higher order pattern recognition.

A claim that is apparently wrong, as it has been contested but not properly defended.

In chess, a grand master is said, according to ChunkingTheory?, to make their next move from a "database" of moves stored in long term memory in which bad moves have been eliminated. This his been shown by real brain research to be true. I'm sure there are more scientific references available, but I remember first reading about this in GoedelEscherBach by DouglasHofstadter.

They say it takes about 10 years of experience to make an expert. Grand Masters make the correct move the first time and do not go through an elaborate test refactoring cycle.

What's interesting is that amateurs access short term memory and do not build a move database in long term memory. It seem that most people, even after many years of experience, can not or do not make this leap to the strategy used by grand masters.

There is a direct parallel to software development, see WhyDoPeopleMakeSoManyMistakes.

An experienced developer does not have the same fears that XP says are out in the world because through their experience they know what works and doesn't work and they know mistakes will be detected in their tests and when found are normally very correctable. This is an experienced based observation and is as supportable as the XP assertion that humans can not be trusted to do the right thing because they are inherently fallible.

By design, XP enforces a least common denominator approach that keeps everyone a perpetual amateur by not allowing and not even recognizing the capabilities of a grand master level experience in particular domains or across several domains.
The central premise of this page, that there are developers who develop like GMs play chess is laughable. It may go a long way to explaining why people with experience (in any field) don't repeat the same mistakes much. It may even go some way to explaining why some people are much better at this than others (just like most people with 10-20 years experience at chess will never be GMs). However, there is a fatal flaw in the connection - there is no database of techniques that really work. If there were, you would see wildly successful large scale software systems popping up all over; after all, we have been doing this for 50 years now. Instead, we are awash in a sea of nearly-right, mostly-ok, but starting to become unmaintainable systems, punctuated by the odd spectacular failure, and the even more rare roaring success that nobody seems to be able to duplicate. -- AnonymousDonor

Yeah, I wish there were something that represented reflexive solutions to complicated problems that seem to be natural solutions. Since they're bound to be repetitive, I think I'll call them DesignPatterns. But you're right. The analogy isn't true. The difference between developers and grandmasters is too great, but the point is that experienced people use intuition more than rationality. Similarly, there have been studies that have shown that the most successful businessmen use intuition more than rational analysis. But intuition is a double-edged banana. Not only can you end up in crazy windfalls based on hunches, you can also slip on the peel if your intuition is blinding you. One should only have probable faith in inductive logic, after all. -- SunirShah

Since they're bound to be repetitive, I think I'll call them DesignPatterns

Sunir, that was a particularly humorous comment! And the banana thing too. Are you trying to dethrone PeterMerel?

Oh, and to make this a relevant addition - my opinion, stated elsewhere, is in agreement with Sunir and AnonymousDonor: ProgrammingIsHarderThanChess.

I wish I could help the original author on this page back on WhyDoPeopleMakeSoManyMistakes where I share his point of views, but unfortunately that discussion is outside my area of competency. I'm one of those insects in SpecializationIsForInsects. However chess is well within my area of competency, and I can stress that the pattern is a truism that has been known for more than a century in the world of professional chess.

It ain't need for brain studies to prove it, and while I'm not a grand master, I played professional chess at an international level, I've been beaten by grand masters, I have won with some other grand masters, I used this principle to train a champion, and I can hardly stress enough the importance of this principle.

The database of moves exists only for the openings ( beginning of the game, first 20-30 moves), and for simple endgames (only a handful of pieces left on the table), and these things are usually known even to beginners, memorizing moves is the easiest part of the job. Intuition plays undoubtedly an important role in a few occasions, but most of the time a grand master relies a lot more on reasoning than on intuition. Intuition is important only in very complicated positions and/or when the player is in a time crisis (it has to move in a hurry, otherwise he will lose on time).

But most of the time a grand master will observe the principles of chess, and will apply them creatively. Those are also the ones that help humans beat the computer. After all, the game of chess is mathematically a min/max game. If there will ever be a computer powerful enough to analyze chess moves exhaustively (I very much doubt that) humans will stand no chance. However the big advantage of a grand master is that by observing the principles of chess he cuts the analysis tree by several orders of magnitudes as opposed to what the computer can do. A computer might explore on the order of 100.000s nodes per second while for a human 1 node per second is extraordinary good.

The principles of chess cannot be formalized (as yet, although a great deal of research has been spent) for the computer, but can easily be explained to a talented 10 year old kid.

Now the problem to use the same technique in computer programming is that we might not have yet a clear set (or several alternative sets) of principles to apply.

The great quality of chess principles is that when you lose you'll be able to tell immediately what principles you have broken, and learn from your lost games. I don't know if we can say the same about failed programming efforts.

-- CostinCozianu

I believe that this notion of a grandmaster's database of wrong moves is merely a misunderstanding of research in this area, which, as I have understood it for ages, does indeed mean to imply second-order/higher-order patterns, not merely a good move/bad move database. A classic observation is that grandmasters can practically instantaneously memorize a chess position never seen before, but only so long as it is an excerpt from a real game, and that, when asked to memorize a board position with truly randomized piece locations, they don't do better than non-grandmasters. That seems to say a fair amount on the topic. -- DougMerritt

Perhaps we do have a few principles, don't we? OnceAndOnlyOnce, PrincipleOfLeastSurprise, maybe LawOfDemeter and the OneResponsibilityRule ... Obviously not all of these principles are uncontroversial, but they (purport to) offer more general principles about how good code should work.

Interesting page! Thanks for the revived interest. I tried a rewrite.
See also CollectWhatWorks.


View edit of May 27, 2006 or FindPage with title or text search