Some Classification Experiments

Since Clovis has been running some experiments to see how well topic modeling with LDA might predict topics on unseen instances, I thought I would backtrack a bit and write up some experiments I ran last year. They may be useful for future comparative experimentation, or even for starting to think about putting some of our classification work into some level of production. I am presuming that you are basically familiar with some of the classifiers and with the problems of the Encyclopédie ontology. These are described in varying levels of detail in some of our recent papers/talks and on the PhiloMine site.

The first set was a series of experiments classifying a number of 18th-century documents using a stand-alone Bayesian classifier: learning the ontology of the Encyclopédie, then predicting the classes of chapters (divs) of selected documents. I have selected three for discussion here, since they are interesting and are segmented nicely into reasonably sized chunks. I ran these using the English classifications and did not exclude the particularly problematic classes, such as Modern Geography (which tends to consist of biographies of important folks, filed under where they were from) or Literature. Each document shows the Chapter or Article, which is linked to the text of the chapter, followed by one or more classifications assigned by the Multinomial Bayesian classifier. If I rerun these, I will simply pop the classification data right into each segment, for easier consultation. Right now, you will need to juggle between two windows:

Montesquieu, Esprit des Loix
Selected articles from Voltaire, Dictionnaire philosophique
Diderot, Elements de physiologie

PENDING: Discussion of some interesting examples and notable failures.

The second set of experiments compared a K-Nearest Neighbor (KNN) classifier to the Multinomial Bayesian (MNB) classifier in two tests: the first a cross classification of the Encyclopédie, the second multiple classifications, again using the Encyclopédie ontology, to predict classes of knowledge in Montesquieu's Esprit des Loix. The reason for these experiments is to examine the performance of linear (Bayesian) and non-linear (KNN) classifiers in the rather noisy information space that is the Encyclopédie ontology. By "noisy" I mean that it is not at all uniform in the size of categories (which can range from several instances to several thousand), the size of the articles processed, the degree of "abstractness" (some categories are very general and some very specific), and a range of other considerations. We have debated, on and off, whether KNN or Bayesian classifiers (or other linear classifiers such as Support Vector Machines) are better suited to the kinds of noisy information spaces that one encounters in retro-fitting historical resources such as the Encyclopédie. The distinction is not rigid. In fact, in a paper last year, on which Russ was the lead author, we argued that one could reasonably combine KNN and Bayesian classifiers by using a "meta-classifier" to determine which should be used to classify a particular article in cases of a dispute (Cooney et al., "Hidden Roads and Twisted Paths: Intertextual Discovery using Clusters, Classifications, and Similarities", Digital Humanities 2008, University of Oulu, Oulu, Finland, June 25-29, 2008 [link]). We concluded that, for example, "KNN is most accurate when it classifies smaller articles into classes of knowledge with smaller membership".
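For readers who want the linear/non-linear contrast in concrete form, here is a minimal Python sketch of the two kinds of classifier on bag-of-words counts. It is not the code used for these runs: the tokenizer, the add-one smoothing, the cosine similarity, the value of k, and the five toy "articles" are all assumptions made for illustration. Only the class names CoatOfArms and Geography come from the ontology.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial naive Bayes with add-one smoothing (an assumed choice)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        tokens = [w for w in tokenize(doc) if w in self.vocab]
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus summed per-word log likelihoods: linear in the word counts.
            score = math.log(n_docs / self.total_docs)
            score += sum(math.log((counts[w] + 1) / denom) for w in tokens)
            scores[label] = score
        return max(scores, key=scores.get)


def cosine(a, b):
    """Cosine similarity between two Counter word vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KNN:
    """k-nearest-neighbour voting over stored, labelled word vectors."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        # No model is built: the labelled vectors are simply kept for comparison.
        self.vectors = [(Counter(tokenize(d)), l) for d, l in zip(docs, labels)]
        return self

    def predict(self, doc):
        query = Counter(tokenize(doc))
        ranked = sorted(self.vectors, key=lambda vl: cosine(query, vl[0]), reverse=True)
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Toy, invented training texts; only the class names echo the ontology.
docs = [
    "shield azure lion gules crest",
    "lion crest heraldry shield",
    "river town province france",
    "town river city kingdom",
    "city province kingdom france river",
]
labels = ["CoatOfArms", "CoatOfArms", "Geography", "Geography", "Geography"]

nb = MultinomialNB().fit(docs, labels)
knn = KNN(k=3).fit(docs, labels)
print(nb.predict("shield lion"), knn.predict("shield lion"))
```

The Bayesian score is a prior plus a weighted sum over words, so a large class starts with a head start on every article. KNN has no such global term and only consults the few stored vectors closest to the query.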

Cross classification of the classified articles in the Encyclopédie using MNB and KNN. I did a number of runs, varying the size of the training set and of the set to be classified. The result files for each of these runs, on an article-by-article basis, are quite large (and I'm happy to send them along), so I compiled the results into a summary table. I took 16,462 classified articles, excluding Modern Geography, and "trained" the classifiers on between 10% and 50% of the instances. I put "trained" in scare quotes because a KNN classifier is a lazy, instance-based learner with no real training step, so what you are really doing is selecting a subset of comparison vectors with their classes. Depending on the run, the selection process resulted in between 276 and 708 classes of knowledge in the information space. As is shown in the table, KNN significantly outperforms MNB in this task. We know from previous work, and general background, that MNB tends to flatten out distinctions among smaller classes, but has the advantage of being fast.
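The shape of such a run can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the fixed random seed, and the toy corpus (one large class plus many tiny ones) are hypothetical and are not the scripts or data behind the summary table. It does show why the number of classes in play depends on how much of the collection is selected.

```python
import random


def split_articles(articles, fraction, seed=0):
    """Shuffle (text, label) pairs and keep `fraction` of them as the labelled set."""
    rng = random.Random(seed)
    shuffled = list(articles)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


def classes_seen(labelled):
    """Distinct classes of knowledge present in the labelled set."""
    return {label for _, label in labelled}


def accuracy(predict, held_out):
    """Share of held-out articles whose predicted class matches the editors' class."""
    if not held_out:
        return 0.0
    hits = sum(1 for text, label in held_out if predict(text) == label)
    return hits / len(held_out)


# Toy corpus (invented): one large class and twenty tiny ones.
articles = [(f"article {i}", "Geography") for i in range(60)]
articles += [(f"article {i}", f"SmallClass{i % 20}") for i in range(60, 100)]

for fraction in (0.1, 0.3, 0.5):
    labelled, held_out = split_articles(articles, fraction)
    print(fraction, len(labelled), len(classes_seen(labelled)))
```

Because the same seed gives the same shuffle, the 10% selection is a prefix of the 50% selection, so a larger selection can only add classes. Small classes that miss the selection entirely can never be predicted correctly by either classifier.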

The distinctions are at times fairly particular and many times the classifiers come up with quite reasonable predictions, even when they are wrong. A few examples (red shows a mis-classification):

Abaissé, Coat of arms (en terme de Blason)

KNN Best category = CoatOfArms
KNN All categories = CoatOfArms, ModernHistory
MNB Best category = ModernHistory
MNB All categories = ModernHistory, Geography

AGRÉMENS, Rufflemaker (Passement.)

KNN Best category = Ribbonmaker
KNN All categories = Ribbonmaker
MNB Best category = Geography
MNB All categories = Geography

TYPHON, Jaucourt: General physics (Physiq. générale)

KNN Best category = Geography
KNN All categories = Geography, GeneralPhysics, Navy, AncientGeography
MNB Best category = Geography
MNB All categories = Geography, AncientGeography

I applied the comparative classifiers in a number of runs, using different parameters, to Montesquieu's Esprit des Loix. All of the runs tended to give fairly similar results, so here is the last of the result sets. The results are all rather reasonable, within limits, given the significant variations in the size of chapters/sections in the EdL. The entire "section" 1:5:13 is

Idée du despotisme. Quand les sauvages de la Louisiane veulent avoir du fruit, ils coupent l'arbre au pied, et cueillent le fruit. Voilà le gouvernement despotique.

which gets classified as

KNN Best category = NaturalHistoryBotany
KNN All categories = NaturalHistoryBotany
MNB Best category = NaturalHistoryBotany
MNB All categories = NaturalHistoryBotany, Geography, Botany, ModernHistory

In certain other instances, KNN will pick classes like "Natural Law" or "Political Law" while the MNB will return the more general "Jurisprudence". I am particularly entertained by

PARTIE 2 LIVRE 12 CHAPITRE 5:
De certaines accusations qui ont particulièrement besoin de modération et de prudence
KNN Best category = Magic
KNN All categories =
MNB Best category = Jurisprudence
MNB All categories = Jurisprudence

Consulting the chapter, one finds a "Maxime importante: il faut être très circonspect dans la poursuite de la magie et de l'hérésie" and that the rest of the chapter is indeed a discussion of magic. While the differences are fun, and sometimes puzzling, one should also note the degree of agreement between the two classifiers, particularly if one discounts certain hard-to-determine differences between classes, such as Physiology and Medicine. The chapter "Combien les hommes sont différens dans les divers climats" (3:14:2) is classified by KNN as "Physiology" and by MNB as "Medicine". Both clearly distinguish this chapter from others on Jurisprudence or Law.
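Agreement of this kind is easy to quantify. The following Python sketch computes the share of chapters on which the two classifiers agree, optionally treating groups of hard-to-separate classes as equivalent. The function name and the `equivalent` parameter are hypothetical; the three label pairs are the best categories quoted above for 1:5:13, 2:12:5 and 3:14:2, and are far too few to say anything about the full run.

```python
def agreement_rate(knn_labels, mnb_labels, equivalent=()):
    """Fraction of items where both classifiers give the same (or an equivalent) class."""
    canonical = {}
    for group in equivalent:
        representative = sorted(group)[0]
        for label in group:
            canonical[label] = representative
    pairs = list(zip(knn_labels, mnb_labels))
    if not pairs:
        return 0.0
    same = sum(1 for a, b in pairs if canonical.get(a, a) == canonical.get(b, b))
    return same / len(pairs)


# Best categories for three chapters of the Esprit des Loix discussed in this post.
knn_best = ["NaturalHistoryBotany", "Magic", "Physiology"]
mnb_best = ["NaturalHistoryBotany", "Jurisprudence", "Medicine"]

strict = agreement_rate(knn_best, mnb_best)
lenient = agreement_rate(knn_best, mnb_best, equivalent=[{"Physiology", "Medicine"}])
print(strict, lenient)
```

Which classes to merge is an editorial decision, so it is passed in explicitly rather than built into the measure.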

I have tended to find the KNN classifications rather more interesting than the MNB ones. But I think the jury is still out on that, and one can always perform the kinds of tests that Russ described in the Hidden Roads talk.
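One way to move from impressions to something testable is to settle disputes by rule, in the spirit of the meta-classifier mentioned above. The Python sketch below is not the method of the Hidden Roads paper. It simply encodes the quoted finding (KNN does best on smaller articles in smaller classes) as a hand-written rule, and the function name, both thresholds, and the class sizes in the example are invented for illustration.

```python
def resolve(knn_label, mnb_label, article_tokens, class_sizes,
            max_tokens=150, max_class_size=50):
    """Pick one label when the classifiers disagree.

    Prefers KNN for short articles whose KNN class has few members,
    and falls back to MNB otherwise. Thresholds are hypothetical.
    """
    if knn_label == mnb_label:
        return knn_label
    short_article = article_tokens <= max_tokens
    # A class missing from class_sizes is treated as having no members, hence small.
    small_class = class_sizes.get(knn_label, 0) <= max_class_size
    return knn_label if short_article and small_class else mnb_label


# Invented class sizes and article lengths, for illustration only.
sizes = {"Magic": 30, "Jurisprudence": 4000}
print(resolve("Magic", "Jurisprudence", article_tokens=120, class_sizes=sizes))
print(resolve("Magic", "Jurisprudence", article_tokens=2500, class_sizes=sizes))
```

A rule like this can then be scored against held-out articles in the same way as either classifier alone, which is what makes the comparison a test rather than a preference.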

All of these experiments were run using Ken Williams' incredibly handy Perl module AI::Categorizer rather than PhiloMine (which also has a number of Williams' modules), just because it was easier to construct and tinker with the modules. I will post some of these shortly, for future reference.

ARTFL Project Research Blog
