|
MONK : Sentimentality Status June 2008
This page last changed on Jul 08, 2008 by ssteger@uga.edu.
This is an update to the last status report, posted after the hackfest in Montreal (Sentimentality-status-April2008). That report covered the decision tree results I was getting back from running my training set of 70 unsentimental chapters and 50 sentimental chapters against the testbed set of eighty novels. At that time, I didn't have the "more like these" results, that is, the predicted classification of chapters in the testbed set. Amit got me these results, based on lemma, by running SEASR flows on his own machine. The workbench isn't working, but I do have results.

Amit (who has been very patient with my requests and deserves big thanks) warned me that the confidence levels for this phase were very low because the training set should be around 25% of the test data. (He also warned that this number is really a guess and that we need to experiment to find the optimum percentage.) The test data comprises almost 4,000 chapters across the eighty novels, so with 120 training chapters I wasn't even close to 25%; I was at about 3%.

The experiment returned 827 chapters that it classified as sentimental. Of those, it classified 416 as "perfectly" sentimental, by which I mean that they scored the highest possible "1.0" on the sentimental scale and the lowest possible "0.0" on the unsentimental scale.

I went through these 416 "perfectly sentimental" chapters to check the results. This was rather painstaking since we don't have the workbench working. The list of results that Amit returned to me contains, for each chapter, the workID, a numeric value for the sentimentality of that workID, one for the unsentimentality, and a predicted classification. I copied each workID and did a find in another spreadsheet, which has the workID, the name of the work, and the name of the chapter that corresponds to each workID. This is how I could identify which work and chapter each workID referred to.
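The manual lookup described above amounts to a join of the two spreadsheets on workID, followed by a filter for the "perfectly sentimental" rows. The following is only a minimal sketch of that idea, assuming both spreadsheets can be exported as CSV. The column names (`workID`, `sentimental`, `unsentimental`, `predicted`, `work`, `chapter`) and the sample rows are invented for illustration; they are not the actual SEASR output or real results.

```python
import csv
import io

# Invented sample data standing in for the two spreadsheets (not real results).
RESULTS_CSV = """workID,sentimental,unsentimental,predicted
w001,1.0,0.0,sentimental
w002,0.7,0.3,sentimental
w003,0.0,1.0,unsentimental
"""

LOOKUP_CSV = """workID,work,chapter
w001,Novel A,Chapter 1
w002,Novel A,Chapter 2
w003,Novel B,Chapter 1
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_results(results, lookup):
    """Attach the work and chapter names to each result row via its workID."""
    names_by_id = {row["workID"]: row for row in lookup}
    joined = []
    for row in results:
        names = names_by_id.get(row["workID"])
        if names is None:
            continue  # no matching entry in the lookup spreadsheet
        joined.append({**row, "work": names["work"], "chapter": names["chapter"]})
    return joined


def perfectly_sentimental(rows):
    """Keep rows scoring 1.0 on the sentimental scale and 0.0 on the unsentimental scale."""
    return [
        row for row in rows
        if float(row["sentimental"]) == 1.0 and float(row["unsentimental"]) == 0.0
    ]


joined = join_results(load_rows(RESULTS_CSV), load_rows(LOOKUP_CSV))
perfect = perfectly_sentimental(joined)
for row in perfect:
    print(row["work"], "/", row["chapter"])
```

A join like this would only replace the copy-and-find step; reading each chapter and judging whether the classification is right would still be done by hand.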
I then went back to the results spreadsheet and copied the novel name and the chapter name into it. Then I went either to the Workbench or to WordHoard to look up that chapter. I read it and marked in the results spreadsheet whether I thought the predicted classification was correct, that is, whether that chapter really is sentimental. I then used this information to create a new, improved training set of 324 chapters: 193 sentimental and 131 unsentimental.

Even with the low confidence levels, I was very excited by the results. The system is surprisingly accurate at predicting sentimentality: I disagreed with only 14% of the classifications among the "perfectly sentimental" results. For example, the system identified four chapters from Villette as sentimental. Of those, three are absolutely sentimental, and the other is a judgment call (it could go either way, so I'm not adding it to the training set).

The new training set comprises about 9% of the testbed, and Amit has run the experiment again for me, returning a new set of classifications and also a decision tree based on this improved training set. I'm currently going through these results.

Going forward, it would be extremely useful to have the workbench working. It would eliminate the need to toggle back and forth between spreadsheets. I could just click on the results, see the chapter in question, and re-rank it right there. I have found that using the chunk viewer to review chapters really does work. It's a good system, and I can't wait to use it! |