The domain in which I did my research in cognitive science is the Bongard Problems. These are problems on visual pattern recognition that appeared first in the appendix of a book published by the Russian scientist M. M. Bongard in 1967, in what was then the USSR. They became more widely known to the western world when D. R. Hofstadter mentioned them in his book, “Goedel, Escher, Bach: an Eternal Golden Braid”, and speculated on how an automated system could be built to solve such problems. Rather than tiring the reader with words, I prefer to show what Bongard Problems (BP’s) are by presenting a — rather trivial — BP, below. There are six boxes on the left, and another six on the right. The ones on the left conform to a pattern, or rule, that describes them, and the task of the problem-solver is to find this pattern, or rule. As an aid, the six boxes on the right do not conform to the same pattern. Sometimes they conform to a different pattern (as is the case in the above example), while in some cases the rule that describes them is simply the negation of the rule on the left. (Thus, the six boxes on the left are the ones that, somehow, have the official status of defining the pattern.) There is nothing magic about the number six — it’s just that Bongard decided to provide six examples and six counter-examples for each of his 100 problems in the original publication (see link for references at bottom) and so BP’s are normally designed with 12 boxes thereafter. As mentioned, the above example is trivial. Not
all BP’s are so easy, though. Let us see a problem of medium
difficulty: Did it take you more than a few seconds to see what is going on? If your answer is “yes”, you are probably a “regular” BP-solver, like me. If it took you less than five seconds, though, you probably belong to a small percentage of people who have a knack at solving such problems. But don’t hurry to celebrate, yet! I don’t mean you are the most intelligent person in general that ever walked on Earth! Maybe you are — I don’t know — but clearly, solving BP’s fast is evidence for only some specific aspects of human intelligence. Also, you don’t need to have a degree in higher education to solve such problems; my own research has shown that even otherwise average people can sometimes show a surprisingly excellent ability for solving BP’s. (See the link to my dissertation, near the end of this page.) Note, please, that if you want to see more BP’s that I (arbitrarily) characterize as belonging to a medium level of difficulty, you may click on the example above. The same remark holds for all the other BP’s shown on this page. You can also check the link to the BP-index, at the bottom of this page. Now, not all problems can be handled so easily. When M. M. Bongard published his book he included 100 BP’s in the appendix, some of which were rather hard. Later, D. R. Hofstadter engaged himself in a brief period of BP-design mania, during which he created 56 new problems (still called “BP’s”, but available in print only through CRCC — his research center; even better, check my index to all BP’s, or my own appendix to my dissertation, below), some of which I never managed to solve. (I mean I have spent more than half an hour, or over one hour in some cases, and still did not come up with a solution.) Want to try your teeth at some mind-bogglers? Watch this: Click on the picture to see more. Although Bongard included the answers at the end of his appendix, Hofstadter did not, so the ones I could not handle belong to Hofstadter’s collection. Now, those few readers who happen to know that I did my work under Hofstadter’s research group might wonder: “Why didn’t you ask him, then?!” Well, I found out he never wrote the answers down, and when he looked at one of his problems (this one) he managed to solve it as if it was somebody else’s, apparently having forgotten all about it after he created them, more than two decades earlier. So I adopted a better strategy: after having this page on the web since 12/1999, I received an avalanche of answers to all problems! Thanks to everyone for the solutions!
My research task was of course not to solve such problems (!!), but to write a program that would be able to do so. Readers not familiar with programming might think this must be easy, probably misled by the apparent effortlessness with which we handle at least the easiest BP’s. Other, computer-savvy readers might think it is possible to write such a program, but then, what do we gain out of it? Here is my basic premise:
Meanwhile, I gave a name to my program. I called it Phaeaco. Click on this link to see where this weird name comes from. (Warning: for your entertainment only; no deep cognitive science issues here.) Before proceeding, let me mention that Phaeaco’s input consists of raw bit-maps, i.e., a Bongard problem is a collection of 12 square black-and-white boxes, each of which has a resolution of 100 pixels per side (just as you see them on this page). There is no other high level information included in the input, such as lines, intersection points, etc. — just pixels. I imposed this form on the input for several reasons:
Starting at the level of individual pixels, however, is the least important of the cognitive issues involved. Some other issues of major importance include:
A program that could handle all of the above cognitive issues would be much more powerful and cognitively interesting than a program that “merely” solves BP’s. At an earlier stage in its development, my project included also a linguistic component, one that could allow Phaeaco to learn words and phrases for its visual patterns, and use them for expressing the solutions. Later, I had to abandon this direction, after receiving advice from my research supervisors, else my project would never end. I have now postponed the linguistic component (which I regard as extremely important, not for Bongard problems, but for human cognition in general) to a future time of pursuing my research interests. Download my dissertation: In May, 2006, I completed my dissertation, and in the following month I graduated from Indiana University. For those of you who would like to take a look at my dissertation, here it is (click on the PDF icon): The above PDF file is 14 MB long. My dissertation is written in a down-to-earth way: there is no “academese”, technical, or other obscure language in it. I hope that, if you are interested in cognition, it makes for a pleasant reading. If you reached this point and wish to learn more on my doctoral research, you may select one of the following links: Interface of Phaeaco and
program information.
Important last note: In January, 2008, I wrote and uploaded a web page with the title: Why I stopped working on the Bongard problems The short answer is, I got scared of some ethical issues involved in this line of research. Click on the above link to learn what those ethical issues are and read the long answer. Footnotes (clicking on the caret (^) on the left of the footnote brings back to the text): (^) At a first glance, one might think there is no noise in the input at all, since the pixels can be either black or white, with no other colors or gray scale values allowed. At the level of individual pixels it is true that there is no noise. However, at every other level there is only noise to be found: Straight lines are never quite straight (e.g., if drawn by hand, or due to low scanning resolution), so the program has to figure out when a line should be seen as straight; lines can have varying width, which should normally be ignored, unless it gets too thick, so we have a specially-featured line; small filled circles are often mere irregular blobs of collections of pixels; bigger circles might look like polygons, or like no circles at all; lines may not meet perfectly at vertex points, or end perfectly at end-points (little serifs may appear), or they may have interruptions; erratic pixels may appear here and there — once again due to scanning noise; and so on. Yet, in all these cases our human visual system has no trouble recognizing the pattern, or shape, that “should be” there, because we are usually guided by top-down forces. My program should be able to do that, too. (^) What is a representational scheme? Well, when we see a triangle, and remember it later, what sort of information is stored in our minds? Pixel-based image? A mathematical-propositional abstraction? Nothing that can be described at a higher level, just neurons that fire in synchrony, responding to similar inputs? I believe, none of the three ideas is exactly right. My representational scheme holds a middle ground, between symbolic and neuronal computing. (^) Although some researchers in the field might disagree, I believe we do possess a representational scheme for visual information in our minds, and in fact we have extended this scheme so that it forms the basis for all of abstract thought and non-visual cognition. Although neurobiologists see only neurons firing in synchrony, I believe this is just the hardware (neural) correlate of what in mental “software” would be expressed through representations, akin to those I describe in my work. |
Back to my other research topics