Saturday, February 19, 2011

Changing reading material

I started reading "Pattern Recognition in Speech and Language Processing" - Wu Chou, Biing-Hwang Juang http://books.google.com/books?id=M1OgYGlpJn8C a few weeks ago.  There were a couple of passages that struck a cord with me:
"Hidden Markov modeling is a powerful statistical framework for time varying quasi-stationary process..." I found this to be a VERY succinct statement of what Hidden Markov models are best for.
Here is another set of quotes from the same chapter that I found interesting:
"...regardless of the practical effectiveness of HMM...it should not be taken as the true distribution form ..."
"...HMM is not going to achieve the minimum error rate implied in the true Bayes MAP decision.""This motivates effort of searching for other alternative criteria in classification design...MMI (maximum mutual information) and MDI (minimum discriminative information)..."
Though I've heard this before, I felt it was well stated here and important to remember.  Basically, I believe they are saying that HMMs are effective in practice, and this gives us the ILLUSION that the HMM is the true distribution when in fact it is not.  HMMs are not going to achieve the minimum error rate of the MAP decision even if they achieve a good estimate.  Again, this is something that is easy to forget when you use them regularly.
Another quote:
"..without knowledge of the form of the class posterior probabilities required in the classical Bayes decision theory, classifier design by distribution estimation often does not lead to optimal performance."
"This motivates effort of searching for other alternative criteria in classification design...MMI (maximum mutual information) and MDI (minimum discriminative information)..."I especially liked this because it reminded me that HMM is distribution "estimation" and it linked together, for me, the reasoning for exploring MMI and MDI.  I've often wondered these other criteria are used and this passage made it clear to me why they are explored.

I ended up putting down "Pattern Recognition in Speech and Language Processing".  When scanning through the pattern recognition book below, I found myself losing interest in the Chou book and anxious to pick up the Bishop book.  So this week I started reading "Pattern Recognition and Machine Learning" by Bishop http://books.google.com/books?id=kTNoQgAACAAJ.  I am finding it easier to understand, mostly because the amount of new material that I haven't been exposed to isn't as dense.  I'm only about halfway through the first chapter, but the review is good for me.  I'm excited to get to the Neural Network parts because all my study of Neural Networks to date has been about building classifier networks.  I'm also interested in building a network that predicts an actual value.

I came across this paper this week as well: "An Empirical Comparison of Supervised Learning Algorithms" by A. Niculescu-Mizil and R. Caruana http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml06.pdf.  In it they compare the performance of:
- Boosted Trees
- Random Forests
- Bagged Trees
- Support Vector Machines
- Neural Networks
- k-Nearest Neighbors
- Boosted Stumps
- Decision Trees
- Logistic Regression
- Naive Bayes
I'm only familiar with the highlighted ones and am interested in looking into the others when I get a chance.  It was interesting that the paper said that Neural Networks seem to be the best choice for general-purpose machine learning, though many of the other techniques can perform better if you tune them to your problem.
I also stumbled across this article this week: "Natural Language Processing (almost) from Scratch" by Collobert et al. http://leon.bottou.org/morefiles/nlp.pdf.  I've scanned through it quickly and hope to dig into it further when I have more time.

Wednesday, February 9, 2011

Book I recently read

I just finished a book last night that was given to me by my boss this year:
"I married a Travel Junkie - by Samuel Jay Keyser I thought it was a great book. Quick read but had me laughing at several spots and had a good message in it. The message I got out is was that we are all attempting to run from the mundane that creeps into our lives in various ways. The author's wife chose travel as a way to experience the new. I've personally known people do this by meeting as many new people as they possibly can. I've do this intellectually by learning new subjects all the time or more accurately by learning new details about subjects that I am interested in. When I was younger I did this by moving from job to job.

Thursday, February 3, 2011

BioInformatics lecture and other stuff I'm doing

Saw a very interesting article on online education today: http://www.bbc.co.uk/news/business-11735404
I think that online education is the way of the future.  The article mentions how the textbook industry is struggling with this trend toward online and PDF versions.  I think this is a good thing.  In fact, I expect a lot of textbooks to go open source.  For a long time I've been hearing people claim that you don't make money by writing a book anymore; you do it for the reputation or for the joy of writing the book.  I use this website frequently for textbooks: http://www.freebookcentre.net/.  With sites like this, I don't think we are too far from having all textbooks open source.

I've been listening to this online lecture: http://biochem218.stanford.edu/  I've been studying Machine Learning and Natural Language Processing for the past couple of years, so it's good to see another field where machine learning is used.  I'm only on the second lecture so far.  The first one was really good on how many techniques are used in the medical field.  The most interesting thing that I saw was how they use this thing called "Multiple Sequence Alignment".  It seems very similar to semantic processing in natural language: they are looking for where else in a sequence a pattern they have found also matches.  The purpose seems to be that if you find a binding site on DNA for a particular compound, you want to be able to search the string for other binding sites, that is, other sites that have the same sequence patterns.  In Natural Language this could be useful because when you find a syntactic pattern, you might want to find other similar patterns to aid interpretation.  Lecture two is mostly about how to do searches across various literature databases.
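Just to make the idea concrete for myself, here is a toy C# sketch (my own invention, not anything from the lecture; real alignment tools use dynamic programming and scoring matrices) of the basic search: given a pattern you have already found, scan the rest of the sequence for other places it matches, allowing a few mismatches:

    using System;
    using System.Collections.Generic;

    class MotifSearch
    {
        // Return the start indices where 'pattern' matches 'sequence'
        // with at most 'maxMismatches' mismatched characters.
        static List<int> FindApproximateMatches(string sequence, string pattern, int maxMismatches)
        {
            var hits = new List<int>();
            for (int start = 0; start + pattern.Length <= sequence.Length; start++)
            {
                int mismatches = 0;
                for (int i = 0; i < pattern.Length && mismatches <= maxMismatches; i++)
                    if (sequence[start + i] != pattern[i])
                        mismatches++;
                if (mismatches <= maxMismatches)
                    hits.Add(start);
            }
            return hits;
        }

        static void Main()
        {
            // Pretend "GATTACA" is a binding-site pattern we found once and
            // now want to locate elsewhere, tolerating one mutation.
            string dna = "CCGATTACAGGGATTTCATTACAGG";
            foreach (int pos in FindApproximateMatches(dna, "GATTACA", 1))
                Console.WriteLine("Match at position " + pos);
        }
    }

The same shape of search would apply to the natural language case: the "sequence" becomes a string of words or tags and the "pattern" a syntactic fragment you want to find near-matches of.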

I just finished "The World Jones Made" by Philip K. Dick last night. http://books.google.com/books?id=McAgAQAAIAAJ It was pretty good though I had to laugh at the way that he portrayed the surface of Venus as being more hospitable to life than Mars. I'm making a strong effort to read more fiction this year.

Monday I finished work on my implementation of a Multi-Layer Neural Network in C#.  I had gotten the Neural Network going last week with just one hidden layer.  I am testing the network on XOR.  James Matthews was nice enough, at this site: http://www.generation5.org/content/2001/xornet.asp, to publish some trial weights and how they change after the first iteration, so I have something to test against that I am sure doesn't have any mistakes.  Over the weekend I fixed a bug that was caused by an error in my understanding: I wasn't aware that the bias of each node actually has a weight associated with it.  This had the effect that the 1 XOR 1 test never really improved during my training.  I then extended the code to have multiple hidden layers.  Of course I broke the system again.  Monday I realized that the problem was caused by a stupid coding error: when I updated the weights on the bias, I had replaced a += with a =, so the weights were being replaced instead of updated.  Now that I have the Neural Network functioning, I'm starting to manipulate the data I want to test to get it into a form I can play with.
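For my own notes, here is a stripped-down sketch of the two details that bit me (this is not my actual implementation; the names are made up): the bias is just another weight whose input is always 1.0, and the weight updates have to accumulate with += rather than overwrite with =:

    using System;

    class Neuron
    {
        public double[] Weights;   // one weight per input
        public double BiasWeight;  // the bias has a weight too, on a constant input of 1.0

        public Neuron(int inputCount, Random rng)
        {
            Weights = new double[inputCount];
            for (int i = 0; i < inputCount; i++)
                Weights[i] = rng.NextDouble() - 0.5;
            BiasWeight = rng.NextDouble() - 0.5;
        }

        public double Activate(double[] inputs)
        {
            double sum = BiasWeight * 1.0;        // bias acts like an input fixed at 1
            for (int i = 0; i < inputs.Length; i++)
                sum += Weights[i] * inputs[i];
            return 1.0 / (1.0 + Math.Exp(-sum));  // sigmoid activation
        }

        public void Update(double[] inputs, double delta, double learningRate)
        {
            for (int i = 0; i < inputs.Length; i++)
                Weights[i] += learningRate * delta * inputs[i];  // += accumulates the correction
            BiasWeight += learningRate * delta * 1.0;            // my bug: I had "=" here instead of "+="
        }
    }

Treated this way, backpropagation updates the bias exactly like any other weight, which is why the overwrite bug was so easy to miss.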

I'm reading "Pattern Recognition in Speech and Language Processing" right now http://books.google.com/books?id=M1OgYGlpJn8C (yeah, I know it's not open source like I mentioned above).  I'm not sure if this was a good book to pick up right now because it has a lot more speech stuff than I am usually interested in.  However, until I get deep into the book I won't know if it will catch my interest, and at the very least it can reinforce much of the stuff I already know.  I'm only about 25 pages in so far, so it is too early to tell much.

I saw a link on http://metaoptimize.com/qa today to a paper I want to read: "Unsupervised Semantic Parsing" http://alchemy.cs.washington.edu/papers/pdfs/poon-domingos09.pdf.  Hopefully I'll get some time to read it in the next few days.  I also saw a link for "Topical Semantics of Twitter Links" http://rose.cs.ucla.edu/~cho/papers/WSDM11.pdf.