Sunday, November 1, 2009

Markov Logic

I watched a video lecture on Markov Logic at http://videolectures.net/icml08_domingos_ipk/ yesterday and was very impressed by the technology. The idea is that you apply first-order logic to various Natural Language Processing tasks. However, first-order logic is too constraining because things can only be true or false. Specifically, if you get one piece of evidence that doesn't support your premise, the whole formula is considered false. Markov Logic solves this by attaching weights to the first-order formulas. Then, when a contradicting piece of evidence is encountered, instead of making that formula false, it just lowers the probability of it being true.
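To make that concrete, here is a tiny sketch I put together in Python. The "birds fly" rule, the weight, and the two individuals are my own made-up example, not anything from the lecture, but it shows the key behavior: a world that violates one grounding of a weighted formula becomes less probable rather than impossible.

```python
import itertools
import math

# Toy example of my own (not from the lecture): the soft rule
# "Bird(x) => Flies(x)" with weight 2.0, grounded for two individuals.
# In pure first-order logic a single penguin would make the rule false;
# in Markov Logic it only makes worlds containing a penguin less probable.
WEIGHT = 2.0
individuals = ("tweety", "pingu")
atoms = [f"{pred}({ind})" for ind in individuals for pred in ("Bird", "Flies")]

def satisfied(world):
    """Number of groundings of Bird(x) => Flies(x) that hold in this world."""
    return sum(1 for ind in individuals
               if not world[f"Bird({ind})"] or world[f"Flies({ind})"])

def score(world):
    # A world's unnormalized probability is exp(weight * satisfied count).
    return math.exp(WEIGHT * satisfied(world))

worlds = [dict(zip(atoms, vals))
          for vals in itertools.product([False, True], repeat=len(atoms))]
Z = sum(score(w) for w in worlds)  # normalizing constant over all worlds

all_fly = {"Bird(tweety)": True, "Flies(tweety)": True,
           "Bird(pingu)": True, "Flies(pingu)": True}
penguin = dict(all_fly, **{"Flies(pingu)": False})  # one contradicting instance

print(score(all_fly) / Z)   # higher probability
print(score(penguin) / Z)   # lower, but not zero
```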

In the video, Pedro Domingos explains how this can be used to reconcile the two statements "Smoking causes cancer." and "Friends have similar smoking habits." Either might be true in the majority of cases, but a contradictory instance can be found for each. He also discusses belief propagation and how it can be very slow on very large networks, so he presents a Lifted Belief Propagation algorithm that reduces the size of the network being worked on. For learning he recommends a "Voted Perceptron" and gives an application example involving recognizing citations.
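Out of curiosity, here is roughly what those two statements might look like as weighted formulas. The weights are made up and this is just brute-force Python enumeration over a two-person domain rather than real Alchemy syntax, but it shows how the two statements can coexist and how a query is answered by weighing the possible worlds.

```python
import itertools
import math

# Made-up weights (a real system would learn these from data).
W_SMOKING_CANCER = 1.5   # Smokes(x) => Cancer(x)
W_FRIENDS_SMOKE  = 1.1   # Friends(x, y) => (Smokes(x) <=> Smokes(y))

people = ("A", "B")
atoms = ([f"Smokes({p})" for p in people] +
         [f"Cancer({p})" for p in people] +
         [f"Friends({x},{y})" for x in people for y in people if x != y])

def score(w):
    """Unnormalized probability of a world: exp of the weighted counts of
    satisfied groundings of the two formulas."""
    n_cancer = sum(1 for p in people
                   if not w[f"Smokes({p})"] or w[f"Cancer({p})"])
    n_friends = sum(1 for x in people for y in people
                    if x != y and (not w[f"Friends({x},{y})"]
                                   or w[f"Smokes({x})"] == w[f"Smokes({y})"]))
    return math.exp(W_SMOKING_CANCER * n_cancer + W_FRIENDS_SMOKE * n_friends)

worlds = [dict(zip(atoms, vals))
          for vals in itertools.product([False, True], repeat=len(atoms))]

# Evidence: A smokes, and A and B are friends (in both directions).
evidence_worlds = [w for w in worlds
                   if w["Smokes(A)"] and w["Friends(A,B)"] and w["Friends(B,A)"]]
Z = sum(score(w) for w in evidence_worlds)

# Query the remaining atoms by weighing the worlds consistent with the evidence.
p_smokes_b = sum(score(w) for w in evidence_worlds if w["Smokes(B)"]) / Z
p_cancer_b = sum(score(w) for w in evidence_worlds if w["Cancer(B)"]) / Z
print(f"P(Smokes(B) | evidence) = {p_smokes_b:.3f}")
print(f"P(Cancer(B) | evidence) = {p_cancer_b:.3f}")
```

The brute-force enumeration over all possible worlds is exactly the part that blows up on a large network, which is where the lifted belief propagation he describes comes in.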

Domingos mentions how this could also be used in unsupervised co-reference resolution and ontology induction. He includes a link to the Alchemy system (http://alchemy.cs.washington.edu/) they have developed, which incorporates all of this logic.

The most obvious place where I think people would be interested in using this technology is in storing semantic information that is learned from text. My concern is that this might be too large a network to handle even with lifted belief propagation, but it is certainly an area worth researching. My interest is in how this could be used to learn and encode grammar rules. I've been reading a bit about various HMM (http://acl.ldc.upenn.edu/acl2002/MAIN/pdfs/Main036.pdf) and Maximum Entropy (http://acl.ldc.upenn.edu/A/A00/A00-2018.pdf) systems for parsing lately, and what has struck me is that they mix statistics with automatic rule learning. I have always felt that the rule-learning systems of the past were replaced with statistical models that, though they performed better, lost the advantages of the old rule-based systems. These newer systems seem to encode the rules and learn patterns that override the basic statistics.

I think a technique like Markov Logic could take this to the next level. Typical Max Ent systems iterate over the training text and select the rules that result in the highest probability of matching the test data. A Markov Logic system could recognize each rule or contradiction but wouldn't have to throw out rules; it could just lower the probability of them being true. This seems to me like the best way to develop a grammar, as in the sketch below. Now if we can only figure out how to make it work on unsupervised data.
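As a rough sketch of what I mean (entirely my own toy code, not from any of these papers), each grammar rule could carry a weight that evidence nudges up or down, instead of facing a binary keep-or-discard decision:

```python
import math
from collections import defaultdict

# Toy illustration: keep every candidate grammar rule and let evidence
# adjust a weight, rather than deciding once and for all whether the
# rule is "true".
rule_weights = defaultdict(float)   # rule -> log-odds style weight
LEARNING_RATE = 0.5

def observe(rule, supported):
    """Nudge a rule's weight up when an example supports it, down when it
    contradicts it. The rule is never removed from the grammar."""
    rule_weights[rule] += LEARNING_RATE if supported else -LEARNING_RATE

def rule_probability(rule):
    """Squash the weight into a probability that the rule holds."""
    return 1.0 / (1.0 + math.exp(-rule_weights[rule]))

# Hypothetical evidence stream: (rule, does this example support it?)
evidence = [
    ("ADJ precedes NOUN", True),
    ("ADJ precedes NOUN", True),
    ("ADJ precedes NOUN", False),   # e.g. "attorney general": a contradiction
    ("DET precedes NOUN", True),
]
for rule, supported in evidence:
    observe(rule, supported)

for rule in rule_weights:
    print(f"{rule}: P(holds) ~ {rule_probability(rule):.2f}")
```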

This is definitely an area I want to learn more about.
