Wednesday, May 21, 2008

Response to comment...

Roger asked "How do you intend to verify whether you have understood a text?"

Good question!

The short answer is to look at the logical representation of the text in memory. It would be even better if I had an interface for ASKING questions of this logical representation. For instance, if the system reads the sentence "The man slowly climbed the stairs.", then we should be able to ask if the man climbed and the answer should be "yes". If we ask what the man climbed then the answer should be "the stairs". Additionally, I would like to see the system learn that men can climb stairs and that this can be done at varying speeds (or at least slowly).
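To make the idea concrete, here is a minimal sketch in Python of what asking questions of a representation in memory might look like. Everything in it is an assumption for illustration: the `Event` and `Memory` names, the agent/action/object/manner fields, and the fact that the event is entered by hand instead of being produced by a parser (which is exactly the part I haven't solved yet).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """One clause reduced to who did what, to what, and how (hypothetical layout)."""
    agent: str
    action: str
    obj: Optional[str] = None
    manner: List[str] = field(default_factory=list)


class Memory:
    """A flat list of events with a few question-style lookups."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def add(self, event: Event) -> None:
        self.events.append(event)

    def did(self, agent: str, action: str) -> bool:
        # "Did the man climb?"
        return any(e.agent == agent and e.action == action for e in self.events)

    def what(self, agent: str, action: str) -> List[str]:
        # "What did the man climb?"
        return [e.obj for e in self.events
                if e.agent == agent and e.action == action and e.obj is not None]

    def manners(self, action: str) -> List[str]:
        # Every manner ever seen with this action, e.g. that climbing can be slow.
        seen: List[str] = []
        for e in self.events:
            if e.action == action:
                for m in e.manner:
                    if m not in seen:
                        seen.append(m)
        return seen


# Hand-built stand-in for "The man slowly climbed the stairs."
memory = Memory()
memory.add(Event(agent="man", action="climb", obj="stairs", manner=["slowly"]))

print(memory.did("man", "climb"))    # True
print(memory.what("man", "climb"))   # ['stairs']
print(memory.manners("climb"))       # ['slowly']
```

A flat list like this only remembers what it was told. Generalizing from "the man climbed the stairs" to "men can climb stairs" would need something more, and this sketch doesn't attempt it.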

Of course I have to learn to walk before I can fly...so there are more unanswered questions than answered ones right now. How will I query those facts? How will I represent the fact internally? How will I interpret the text into this internal representation? How will I parse the sentences before interpreting them? How will I recognize parts of speech? How will I learn those rules for recognition of parts of speech?

This is where I think I am now. Using the Porter Stemmer, I found a way to recognize some words as verbs. Next I have to figure out how to recognize other parts of speech and what to do with the words I DON'T recognize. Maybe if the application can get the list of words it doesn't understand down to a manageable size, it could ask a user for some information. Hopefully, by asking a user a few careful questions it will be able to learn rules that will allow it to categorize large quantities of unknown words. One thing I DON'T want to do is use some sort of preexisting knowledge of what parts of speech words are. I also don't want to train my application on test data and then have it stuck at only that level of understanding. I would prefer to make an application that, programmed with a core set of rules, would be able to build up its own dictionary and continuously refine its understanding with every bit of text it encounters.
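As a rough sketch of that workflow, the Python below splits a text into words it guesses are verbs and words it can't place, and counts the unknown ones so the most frequent could be put to a user first. The verb test is NOT the Porter Stemmer itself. It is a crude stand-in that assumes a word ending in "-ed" or "-ing" with a vowel left in the remainder is a verb, loosely borrowed from the condition the stemmer checks before stripping those suffixes. The names `looks_like_verb` and `triage` are made up for this example.

```python
import re
from collections import Counter

VOWELS = set("aeiou")


def looks_like_verb(word: str) -> bool:
    """Guess 'verb' when the word ends in -ing or -ed and the rest has a vowel."""
    for suffix in ("ing", "ed"):
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if any(ch in VOWELS for ch in stem):
                return True
    return False


def triage(text: str):
    """Return (set of guessed verbs, Counter of words with no guess)."""
    words = re.findall(r"[a-z]+", text.lower())
    verbs = set()
    unknown = Counter()
    for word in words:
        if looks_like_verb(word):
            verbs.add(word)
        else:
            unknown[word] += 1
    return verbs, unknown


verbs, unknown = triage("The man slowly climbed the stairs.")
print(verbs)                   # {'climbed'}
print(unknown.most_common(1))  # [('the', 2)]
```

The rule misfires in both directions: "wedding" would be called a verb, and "climb" on its own would land in the unknown pile. Those mistakes are the sort of thing a few careful questions to a user might help correct.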

2 comments:

Unknown said...

At the end of a recent talk, Norvig was asked whether researchers should be working on general language processing or more specific projects. He replied that the general problem is too hard right now, so most should be working on specific areas, but that we can't neglect the vision and insights that come from looking at the whole picture.

I think the task is to find a way to make an interesting problem feasible without degrading it into a toy project.

One idea that comes to my mind is a Second Life car salesman or store clerk who can call in the manager when extra help is needed. I also think that modeling someone with an obsessive disorder or other malfunction (e.g. Parry's paranoia) could simplify things by inserting purposeful misinterpretations.

Dalamar Taurog said...

Good points. I can't help but wonder if we have all the pieces we need already. Perhaps we just have to put them together in the proper order?

I'm also concerned about how much research is going on and how little of it has real value. For every 10 papers I read, it seems that 9 of them were written just to pass a course and don't have anything new to say.