Research talk:Automated classification of draft quality/Work log/2016-12-01
Thursday, December 1, 2016
Today, I'm analyzing how good my PCFG models are at differentiating sentences from FA, spam, vandalism, and attack articles.
Essentially, I trained the four models and then plotted the log_proba / productions for each of the input sets.
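That per-sentence quantity (log_proba divided by the number of productions) can be sketched roughly like this. This is a minimal illustration under my own assumptions, not the actual code: here a "model" is just a dict mapping production strings to log probabilities, and the smoothing fallback for unseen productions is made up.

```python
import math

def normalized_log_proba(productions, model, oov_log_proba=math.log(1e-6)):
    """Mean log probability of a sentence's productions under a PCFG model.

    `model` maps production strings to log probabilities; unseen
    productions fall back to a small smoothing value (an assumption here).
    """
    total = sum(model.get(p, oov_log_proba) for p in productions)
    return total / len(productions)

# Toy example: two hypothetical models scoring the same parsed sentence.
fa_model = {"S -> NP VP": math.log(0.9), "NP -> DT NN": math.log(0.8)}
spam_model = {"S -> NP VP": math.log(0.5), "NP -> DT NN": math.log(0.1)}
sentence = ["S -> NP VP", "NP -> DT NN"]

# The model trained on content like this sentence should score it higher.
print(normalized_log_proba(sentence, fa_model) >
      normalized_log_proba(sentence, spam_model))  # True
```

Dividing by the production count keeps long sentences from dominating the score, so sentences of different lengths are comparable in the plots.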
OK, so this plot shows that the PCFGs can differentiate, but only to a minor extent; there's certainly a lot of overlap between the different models. --EpochFail (talk) 19:57, 1 December 2016 (UTC)
So in thinking about how there's probably a clear difference between the scores of the various models, I decided to try something different. In the following plot, I subtract the log_proba of the model applied to its own content from that of the given model. So essentially, this plot shows us how well *the other models* differentiate from the own model. We want to see mostly negative values and little overlap on or above zero.
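A minimal sketch of that subtraction, assuming each model's score is already the normalized log_proba value (the stub numbers and the function name `differentiation_scores` are mine, purely illustrative). Negative values mean the other model fits the content worse than the model trained on that content class, which is the hoped-for outcome.

```python
def differentiation_scores(scores_by_model, own_label):
    """For content of class `own_label`, subtract the own model's score
    from each other model's score.  Negative => other model fits worse."""
    own = scores_by_model[own_label]
    return {label: score - own
            for label, score in scores_by_model.items() if label != own_label}

# Hypothetical normalized log probabilities for one FA sentence under
# each of the four models (made-up numbers).
scores = {"fa": -0.2, "spam": -1.5, "vandalism": -1.1, "attack": -1.4}
diffs = differentiation_scores(scores, "fa")
print(diffs)  # all values negative: the other models fit FA content worse
```

In these plots, mass on or above zero for some other model (like the attack model on vandalism content) means that model fits the content at least as well as the model trained on it.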
This looks a lot more promising. It looks like we can differentiate FA and attacks pretty well. We can differentiate spam pretty well too, but it's interesting to see that spam looks a lot like FA content. Vandalism is weird. A large part of the attack model fits vandalism better than the vandalism model does. That shouldn't be possible. --EpochFail (talk) 22:33, 1 December 2016 (UTC)