Wednesday, May 21, 2008

Where I am so far...

I became interested in Natural Language Processing while reading Artificial Intelligence: A Modern Approach (2e) by Stuart Russell and Peter Norvig. I read the first 13 chapters in depth but only did a single read-through of the remaining chapters. I intend to go back and finish the rest of it later. After that I started reading Foundations of Statistical Natural Language Processing by Manning and Schütze. Though I read the whole thing, I became very lost in the final chapters. One of the things that made me lose interest is that it was more about how to statistically recognize languages, and my interest is more in getting a computer to understand what it has read and create an internal representation of it. I also felt that some of the concepts would be easier to understand after I read more introductory material.
This led me to start reading Speech and Language Processing by Jurafsky and Martin. I think the book is great, and I just finished chapter five today. Chapter 3 introduced me to stemming. I found a text version of a novel online (a cheap sci-fi) and I've written an application to parse out all the words. I implemented the Porter stemming algorithm and was pleased to find that, of the 8000 distinct words in my file, I could determine that about 2000 of them were stems of other words. I have the system make a "guess" that a word is a verb if it finds the stem, -ing, -ed and -s versions of a word.
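The verb "guess" described above can be sketched roughly as follows. This is only an illustration, not the application itself: it skips the Porter stemmer entirely and simply appends the suffixes to each word, so it will miss verbs whose spelling changes (such as "stop"/"stopped" or "hope"/"hoping"). The function name `guess_verbs` and the sample word list are made up for the example.

```python
def guess_verbs(words):
    """Return the base words that also appear with -s, -ed and -ing endings.

    Assumption: plain suffix concatenation stands in for real stemming,
    so spelling changes (doubling, dropped "e") are not handled.
    """
    vocab = {w.lower() for w in words}
    verbs = set()
    for base in vocab:
        # Guess "verb" only when all three inflected forms were seen.
        if all(base + suffix in vocab for suffix in ("s", "ed", "ing")):
            verbs.add(base)
    return verbs


# Hypothetical sample; the real input would be every word in the novel.
sample = "walk walks walked walking ship ships planet planets talk talked".split()
print(sorted(guess_verbs(sample)))  # prints ['walk']
```

Only "walk" passes here, because "ship" and "talk" are each missing at least one of the three inflected forms in the sample.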
Chapter 5 is where the book became really interesting as far as I am concerned. I've implemented the Levenshtein minimum edit distance algorithm and am working on the forward algorithm. I plan to implement the Viterbi algorithm sometime next week.
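For reference, a minimal sketch of the minimum edit distance computation using the usual dynamic-programming table. The function name `min_edit_distance` and the `sub_cost` parameter are assumptions made for this example. Insertions and deletions cost 1; substitutions cost 1 by default, though some textbook presentations charge 2 for a substitution, which the parameter allows.

```python
def min_edit_distance(source, target, sub_cost=1):
    """Minimum edit distance between two strings via dynamic programming.

    Insertions and deletions cost 1. sub_cost is the price of replacing
    one character with a different one (an assumption of this sketch).
    """
    n, m = len(source), len(target)
    # dist[i][j] = cost of turning source[:i] into target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete every character of source[:i]
    for j in range(1, m + 1):
        dist[0][j] = j  # insert every character of target[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,                            # deletion
                dist[i][j - 1] + 1,                            # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dist[n][m]


print(min_edit_distance("kitten", "sitting"))                   # prints 3
print(min_edit_distance("intention", "execution", sub_cost=2))  # prints 8
```

The full table is kept here for clarity; keeping only the previous row would be enough if memory mattered.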

1 comment:

Unknown said...

How do you intend to verify whether you have understood a text?