Comité d'agriculture et des arts



Several décades after the fall of Robespierre, the Convention nationale issued a decree reorganizing its committees on 7 fructidor II. The collections in the Hub have at least two sources for this decree: one in the Newberry French Revolution Collection, printed by order of the Convention, and the other reproduced in the Baudouin collection of Revolutionary Laws. One of the 16 committees, with a small but significant charge, was the Comité d'agriculture et des arts:

This was, of course, nothing new. Previous revolutionary assemblies had organized agriculture committees and commissions (Mellah, 2020), and this committee simply replaced the Convention's Comité d'agriculture. Agrarian life and subsistence were pressing concerns during the period. A full-text search of the Revolutionary Laws collection, using the PhiloLogic4 instance, shows several hundred decrees introduced with variations on the expression "après avoir entendu le rapport de son comité d'agriculture".


Starting from PhiloLogic search results, one can follow the links to the individual laws containing the phrase (example link) and, in many cases, navigate from the law to the legislative session in the Archives Parlementaires by clicking on a date, such as Du 11 Octobre. == 13 du même mois.

The activities of the agriculture committees were not limited to producing reports for potential legislative action. Searching for "convention agriculture" in the Hub returns some 15 titles, which reflect the scope of the interests of the Convention's agriculture committee.  The committee published manuals for the cultivation, storage and use of various crops such as potatoes, cabbage, and carrots, all of which link back to the long tradition of agricultural writings.  The list of similar documents found for Instruction sur la conservation et les usages des pommes-de-terre, for example, includes a number of relevant Ancien Régime texts, including a chapter on potatoes from the 1772 translation of Arthur Young's The farmer's guide.

Grain and bread were, of course, critically important.  In 1794, the committee published Moyens propres a rendre plus économique l'emploi des farines : provenant des grains nouvellement récoltés ; et à augmenter la qualité du pain qu'elles doivent donner (link), which is linked by similarity measures to a variety of earlier texts, including the chapter on grains in Duhamel du Monceau's Traité de la culture des terres, suivant les principes de M. Tull, Anglois (1753).  Examination of the titles most closely related to this text by topic and vocabulary shows a mixture of practical and more theoretical works.




Given that bread constituted at least half of the average wage earner's expenditures through the 18th century (link), it is no surprise that the conditions of the grain trade and the price of bread were of capital importance and were widely debated in the two weeks leading up to the loi du Maximum of 1793 (RevLaws AP).  The first document returned by the search "convention agriculture" in the Hub is an anonymous text, Mémoire sur la fixation du maximum du prix des grains dans toute la France : remis au Comité d'agriculture de la Convention nationale, l'an premier de la République [1792], found in the Newberry FRC (link).  The collections contain two other renditions of this document, which are shown as the top two most similar documents reported at the top of link.  The first is an annex to the April 25, 1793 session of the Convention found in the Archives Parlementaires, and the second is in the Goldsmiths-Kress collection.  The anonymous author opens unequivocally:

La subsistance du peuple est le premier objet qui doive occuper les législateurs. Il faut assurer l'existence des hommes avant de songer à régler l'usage de leurs facultés.  (AP version).  

Taking on the elder Mirabeau (the friend of man) and the physiocratic tradition of free trade before the Revolution, he writes:

Des philosophes, amis des hommes, avaient cru voir, dans la liberté indéfinie du commerce et même de l'exportation des grains, un principe de fécondité et d'abondance qu'ils regardaient comme le plus sûr préservatif contre la famine.   Pendant qu'ils se livraient à ces contemplations, un gouvernement populicide opérait la famine par le commerce et par l'exportation des grains; l'absurdité du système de la liberté indéfinie de l'exportation a été démontrée par le fait, et cette exportation a été prohibée par l'Assemblée constituante. [1]

The list of the top 20 most similar documents and the most closely related topics show an interesting mix of opinions for and against free trade in grain, with many echoes of earlier debates reaching as far back as the Turgot ministry.  Beffroy's Rapport fait au nom de la section des subsistances chargée de combattre les économistes (1792) is equally pointed:

Ouvrez maintenant Young, consultez Smith, interrogez Turgot, voyez Beaudot, relisez Ferraud, Roland, Périés & tous les partisans de leur systême; ils ne vous parlent que de l'interest du marchand & du Spéculateur. Or, l'expérience vous a prouvé que cet intérêt mercantille ne s'alimente que des malheurs publics : jugez donc entre lui & celui du peuple que vous représentez. (link)

On the other side, Creuzé-Latouche's Sur les subsistances (1793) makes the case that the free trade in grain implemented in the Turgot administration resulted in low and stable prices across the country in spite of poor harvests in those years.  Further, he writes, Turgot was defending the liberty and sovereignty of the people:

Mais pour vous faire mieux connoître quel étoit ce ministre Turgot, qui avoit voulu établir la liberté entière du commerce des grains, il faut vous dire qu'il supprima les corvées , qu'il donna , le premier, l'idée des assemblées provinciales, qui dévoient bientôt rappeler la nation a sa souveraineté ; et qu'il se fit chasser de la cour , pour avoir voulu défendre la liberté du peuple, et abolir les fiefs. (link)

As a defense of free trade in general, and Turgot in particular, Sur les subsistances links back to many earlier discussions of this vexed subject, including Mirabeau's L'Ami des hommes and Philosophie rurale, Young's Arithmétique politique, and articles from Ephemerides du Citoyen, ou Bibliotheque Raisonee des Sciences Morales et Politiques.



By integrating heterogeneous collections, ranging from Revolutionary laws as enacted, to the debates and publications surrounding these events, to practical handbooks and theoretical treatises, we can direct attention from the specific recommendations of an important committee to a much broader context.  Working with texts in this context opens the reader to multiple crosscurrents of topics and themes that can be unexpected and illuminating.

Returning to where we started: in 1795, the Comité d'agriculture et des arts published Instruction sur l'emploi de la lie de vin, which opens with:

La lie de vin , rejettée dans plusieurs cantons de la France, comme un résidu sans valeur , peut cependant produire une quantité considérable de potasse utile pour les verreries, les savonneries et plusieurs antres arts , particulièrement pour la fabrication du salpêtre , le premier besoin de la liberté contre les efforts de la tyrannie.  

Here, the practical, the theoretical and the political merge.  The twenty most similar documents include discussions of gunpowder, chemistry, and manufacturing, ranging from Lavoisier's ARTICLE XII. De l'usage de la Potasse pour la fabrication du Salpêtre in Instruction sur l'établissement des nitrières et sur la fabrication du salpêtre (1777) to Dulac's L'agonie de tous les tyrans, ou, Les moyens de fabriquer la foudre qui va les exterminer (1793).


Club de la propagande


 


Over three decades ago, while working with the splendid French Revolution Collection (FRC) at the Newberry Library in Chicago, I came across one of those entertaining little finds that stick in your memory and make working in a great research library so worthwhile.  "Came across" is not quite right, since the librarians at the Newberry had begun working on a database catalog of the collection, starting with the anonymous texts from the FRC.  Searching for "Club" in this early database, which I recall was running on a stand-alone IBM-PC/AT of that epoch, generated a list of titles that included a document I probably would not have found using standard printed catalogues such as Tourneux's Bibliographie... .  The Dénonciation a toutes les puissances de l'Europe : d'un plan de conjuration contre sa tranquilité général (link) is a right-wing attack on the Société de 1789, a political club founded by Condorcet and Sieyès in 1790[1].

What stuck in my mind for all these many years is the basis of the attack; that the "Club de la Propagande" was part of an American plan to destabilize the thrones of Europe with the ultimate objective of subjugating the old world to the new:

Elle qu'une légère vapeur qui s'élève du sein de la mer, comme le vestige d'un homme, attire du plus loin tous les nuages étendus dans l'air, se condense, s'obscurcit, & éclate enfin en une furieuse tempête ; tel on a vu le spectre pâle & maigre de l’insurrection, sortant d'une terre ingrate, & du milieu d'enfans rebelles parricides, croître & s'élever en un colonne fastueux, qui, posant un de ses pieds sur l'hémisphere qui l'enfanta, essaya de l'autre de franchir l'Océan, pour porter ses ravages sur celui-ci; & comme si l’Amérique avoir encore plus à se plaindre qu'à se louer de l'Europe, elle a envoyé l'anarchie à celle-ci, pour prix du soin qu’elle a pais de la civiliser.
    C'est elle qui est le berceau des convulsions qui commencent à agiter notre continent; c'est-là qu'est né le projet de soumettre l'ancien au nouveau monde ... (link)

Like all good political invective, the attack contained some grains of truth.  The Société was no doubt rather pro-American in its orientation, naming, for example, Franklin, who is mentioned by name in the Dénonciation, as an honorary member.  The anonymous author identifies the contagion of liberal ideas as the root cause of all of the disorders in France and Europe.

Les monstres! Ils ont égaré le peuple par deux mots l’ont toujours rendu la dupe des fourbes = égalité, & désobéissance = l’un, ils le lui on présenté comme un droit naturel. L’autre, comme,  un moyen légitime d’y rentrer. = II ne connoit  pas, ce malheureux peuple, le pouvoir magique  de ces deux mots, qui ont couvert la terre de crimes & de sang, qui ont rendu son séjour un objet d’horreur pour la vertu[?], & qui lui font, à la fin, désirer à lui-même un remede qu’il abhorre.
  
To ensure that his readers could identify the source of the conspiracy precisely, the author attached a 10-page extract from Sieyès' Ébauche d'un nouveau plan de société patriotique, adopté par le Club de mil sept cent quatre-vingt-neuf (BNF), which includes a discussion of l'art social as well as elements of the club's formal organization.

The good folks at the Newberry produced a photocopy of this little treasure shortly after, which I squirreled away in my files and have kept, along with the charge slips and other notes, to this day.  Yes, I should seriously consider cleaning out the old paper files at some point.

I had occasion to revisit this text several years ago, almost three decades after my first reading, in a completely different context.  In 2016-7, the Newberry Library made the entire collection available in digital format.  The release on GitHub consists of the Library’s exceptional metadata describing each object, the OCR text data, and links to the digital facsimiles accessible from the Internet Archive, encouraging researchers and instructors to incorporate the digital collection in new kinds of scholarship and engagement.  In 2018, the ARTFL Project, in collaboration with the Newberry, released two versions of the collection under PhiloLogic4 (link).  The collection has also been extremely valuable as a corpus for testing various new applications based on sequence alignment and machine learning.  In the course of this work, I was pleased to find that the Dénonciation was indeed included in this collection.


Part of our experimental work in developing the Intertextual Hub is the deployment of various text mining and machine learning algorithms on a number of large heterogeneous collections.  As I was preparing a presentation on some of this work, I looked up the Dénonciation once more and observed that the first topic listed in the citation is topic 34, the top words of which are: "election electeur nomination assemblee scrutin majorite elu choix membre votant" (accents removed).  Closer examination of the topic model for this document reveals pretty much the kinds of subjects that I had recalled:



With the notable exception of the first topic, number 34.  This unexpected topic sent me back to the text itself for the first time in decades, reminding me that significant parts of the Dénonciation contain an almost comically complex description of the process for electing members, taken from Sieyès' Ébauche... . Here is just part of the involved process to elect members, the number of whom would be limited to 660:

Il est d'une bonne vue de donner au plus grand nombre possible des membres, la facilité de prendre part aux scrutins, afin qu'ils soient d'autant mieux le résultat de la volonté générale ; en conséquence on pourroit régler, que chaque scrutin se fera en quatre parties ; savoir, au premier & au deuxieme jours , & au quinze & au seize de chaque mois; de maniéré que le scrutin commence le matin du premier du mois ; par exemple , depuis onze heures jusqu'à midi , le soir pour ceux qui n'auroient pas pu se présenter le matin; le même scrutin continueroit le lendemain matin, ne se terminera que le, soir. Alors seulement on feroit le recensement. Pour prévenir les abus , il suffiroit que les feuilles de papier , remises aux membres fussent signées par un commissaire , qu'en recevant sa feuille , chaque membre s'inscrivit , ou fut inscrit par un commissaire; on connaîtroit par-là le nombre des feuilles données , ceux qui ont reçu la leur. Il faudrait encore que la boëte du scrutin fut fermée à clef, & qu’on ne pût en rien tirer jusqu’au moment du recensement.  (emphasis mine)


Trying to determine the "general will" just might well require such care and management of election procedures, but I have to admit that I wondered if I had missed the joke the first time around.  Was this a spoof of Condorcet's electoral combinatorics?  


Alas, you can't make this stuff up.  Or at least the author of the Dénonciation did not have to. The current version of the Intertextual Hub is based on a number of collections and the system provides two links to the original text by Sieyès. The Topic Model representation of the Dénonciation in the Hub shows two parts of the Ébauche as the top two most similar documents by a measure of vocabulary.

It is followed by Condorcet's constitutional proposal of 1793.  The document read function of the Hub isolates numerous borrowed passages from the Ébauche.


 The system allows the reader to compare two passages side by side to examine just how closely related they are.  


It is important to note that Sieyès' Ébauche is not part of the Newberry French Revolution collection, but is contained in the Goldsmiths-Kress collection of French works related to political economy.  The different techniques employed in our implementation of the Intertextual Hub, lexical density and sequence alignment, gave two different avenues for indicating that the two documents are related.  That the two texts sit in different collections matters in itself. The Dénonciation does not have internal divisions (chapters or sections) while the Ébauche does. Thus the similar documents function, starting from the Dénonciation, does not find the Ébauche as a single document, because the Ébauche is treated as parts of a document.  To find the various potential points of contact between documents, we use several measures which are complementary and necessary, since we are trying to find relationships between items that are not all of the same kind.  Thus, some of the complexity of the Hub is an artifact of treating huge numbers of heterogeneous documents.
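
To make the complementarity concrete, here is a minimal sketch, not the Hub's implementation, contrasting a vocabulary-based measure (cosine similarity over word counts) with a crude stand-in for sequence alignment (shared word n-grams). The function names are invented for illustration and the sample strings are adapted from the passage on the ballot box quoted earlier.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; a deliberately naive tokenizer."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a, b):
    """Vocabulary measure: cosine between two word-count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def shared_ngrams(a, b, n=4):
    """Reuse measure: word n-grams that occur in both texts."""
    def grams(text):
        t = tokens(text)
        return {tuple(t[i:i + n]) for i in range(len(t) - n + 1)}
    return grams(a) & grams(b)


source = "la boëte du scrutin fut fermée à clef"
reuse = "il faudrait que la boëte du scrutin fut fermée à clef"
shuffled = "clef à fermée fut scrutin du boëte la"

# Same vocabulary in a different order: high cosine, no shared sequences.
print(cosine_similarity(source, shuffled), len(shared_ngrams(source, shuffled)))
# A genuine borrowing shares whole word sequences with its source.
print(cosine_similarity(source, reuse), len(shared_ngrams(source, reuse)))
```

The shuffled text scores as very similar by vocabulary yet shares no passage with the source, while the borrowing is caught by both measures. Neither measure alone covers both situations.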

My long, very intermittent relationship with Dénonciation a toutes les puissances de l'Europe..., a minor text if ever there was one, illustrates the progress I believe we have seen over the last three decades in digital humanities. I first found it as part of an experimental bibliographic database in the late 1980s and was able to access it only in person and store it as a photocopy. Decades later, it became a small part of an extraordinary collection, searchable as both excellent metadata and uncorrected OCR text. Our current work, reflected in the Intertextual Hub, is to build an environment which can draw connections between documents across collections, using the power of distant reading tools to help navigate and elucidate closer considerations of even minor texts.

References
1   Mark Olsen, "A Failure of Enlightened Politics in the French Revolution: the Société de 1789" in French History 6 (1992): 303-34. (DOI)








Topic Models in the Intertextual Hub



ARTFL’s NEH-funded Intertextual Bridges project is an effort to facilitate distant and close readings across a large heterogeneous set of collections of 18th century French documents. These range from Revolutionary pamphlets and newspapers to the great works of the Enlightenment in the original French as well as translations of many English texts. This post and the associated slide show (see below) provide an overview of the many ways in which we attempt to use topic models to search and navigate the collections. In two previous blog posts, Tracing Revolutionary Discourses and Modeling Revolutionary Discourse, we provided an overview of some of the development and implementation work and offered some initial observations arising from our use of topic models in this effort.  While the descriptions of procedures and implementation in both posts remain reasonably current, we have made significant progress in the intervening months.  Thus, our discussion of Topic Models in this post builds upon our previous posts.

The Intertextual Hub (https://intertextual-hub.uchicago.edu/) makes extensive use of Topic Models to provide search services, analytics and one form of document navigation[1].  This is an extension of the TopoLogic package, which functions as an add-on to ARTFL's PhiloLogic4 text analysis system.   Topic Models are generated by invoking the ARTFL Text Preprocessing Library (ATPL) to extract metadata and word data from the standard representations generated by PhiloLogic4. This allows us to use PhiloLogic4 services to support navigation back to the text. The ATPL supports the treatment of files as either entire documents or as collections of sub-units depending on the available data markup, and has a variety of NLP, normalization, and other parameters that can be adjusted for tasks such as Topic Modeling.  For Hub Topic Models, we use modernized unigram nouns longer than 2 letters.  These are directed to the TopoLogic generator, which supports another layer of vector parameters, typically using NMF vectors with TF-IDF weightings.  For the primary topic model in the Hub, we chose 150 topics across all of the collections, which seems to give the best balance between reasonably coherent topics and the number of obscure or meaningless topics.  In addition, we generated two Topic Models of 100 topics each, using the same parameters, based on documents from 1700-1788 and 1789-1799, which we believe will facilitate exploration of topics from each period.
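
For readers unfamiliar with the approach, the pipeline described above (normalized unigrams, TF-IDF weighting, NMF factorization) can be sketched in a few lines. This is a toy illustration using only the Python standard library, not the ATPL or TopoLogic code: the function names, the four invented "documents", and the choice of multiplicative updates for NMF are all assumptions, and part-of-speech filtering is skipped because the sample texts already consist of nouns.

```python
import math
import random
import unicodedata
from collections import Counter


def normalize(word):
    """Lowercase and strip accents (e.g. 'électeur' -> 'electeur')."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def tfidf_matrix(docs):
    """Return (vocabulary, matrix); rows are documents, columns are terms."""
    # Keep only unigrams longer than 2 letters.
    bags = [Counter(w for w in map(normalize, d.split()) if len(w) > 2) for d in docs]
    vocab = sorted(set().union(*bags))
    doc_freq = Counter(w for bag in bags for w in bag)
    idf = {w: math.log(len(docs) / doc_freq[w]) + 1.0 for w in vocab}
    return vocab, [[bag[w] * idf[w] for w in vocab] for bag in bags]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor v (docs x terms) into w (docs x topics) and h (topics x terms)
    with multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Reconstruction error between v and the product w * h."""
    wh = matmul(w, h)
    return math.sqrt(sum((a - b) ** 2 for rv, rwh in zip(v, wh) for a, b in zip(rv, rwh)))


docs = [
    "élection scrutin électeur assemblée majorité",
    "scrutin election votant electeur nomination",
    "grain farine pain prix subsistance",
    "le pain grain recolte farine prix",
]
vocab, v = tfidf_matrix(docs)
w, h = nmf(v, k=2)
for number, topic in enumerate(h):
    top_words = [word for _, word in sorted(zip(topic, vocab), reverse=True)[:4]]
    print(number, " ".join(top_words))
```

Each row of h is a topic expressed as term weights, and each row of w gives a document's weight on each topic. Changing k, the token filter, or the weighting changes the topics that come out, which is the sensitivity to parameters discussed next.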

It is important to note that the tuning of Topic Models is based on the selection and application of a large number of parameters, from the number of topics to which words to use, which change the nature of the resulting topics significantly.  These judgements are based to a certain degree on what we expect to observe.  For example, a topic which contains "citoyen patrie petition commune concitoyen secours moyen defenseur arrete magistrat" (accents removed) as its most heavily weighted terms is, quite reasonably, found to be most heavily weighted during the years of the Revolution, as shown in the graph.  This reliance on expected results, even though they may be perfectly reasonable, does point to a significant limitation of the approach.  Topic Models are extremely useful heuristics which can help summarize and navigate the contents of large collections, but they should be used with due care, since the choice of parameters can skew the results.

The Intertextual Hub offers several ways to use Topic Models.  From the top down, as it were, one can navigate the collections starting with topics, or select the top weighted terms from any of the 150 topics, restricted by any available bibliographic data (dates, authors, collections, etc.), to return a list of documents (which may be parts of documents or entire texts depending on available encoding) ordered by relevance to the query.  Just as important, however, is the ability to identify the most important topics for any document and to find other texts that share the same topic distributions, which is another way to measure how similar the documents are.
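
The second, bottom-up use can be illustrated with a small sketch. It assumes each document is represented by a sparse mapping from topic number to weight; the document identifiers, topic numbers, and weights below are invented for illustration, and the cosine measure is one plausible choice rather than a description of what the Hub computes.

```python
import math

# Hypothetical topic distributions: {document id: {topic number: weight}}.
doc_topics = {
    "denonciation": {34: 0.30, 12: 0.25, 88: 0.20},
    "ebauche_part_1": {34: 0.45, 12: 0.10},
    "instruction_pommes_de_terre": {5: 0.60, 71: 0.15},
}


def top_topics(doc_id, n=2):
    """The n most heavily weighted topics for one document."""
    weights = doc_topics[doc_id]
    return sorted(weights, key=weights.get, reverse=True)[:n]


def topic_similarity(a, b):
    """Cosine similarity between two sparse topic distributions."""
    dot = sum(weight * b[topic] for topic, weight in a.items() if topic in b)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(doc_id, n=5):
    """Other documents ranked by how closely their topic weights match."""
    target = doc_topics[doc_id]
    scored = [
        (topic_similarity(target, vector), other)
        for other, vector in doc_topics.items()
        if other != doc_id
    ]
    return sorted(scored, reverse=True)[:n]


print(top_topics("denonciation"))
print(most_similar("denonciation"))
```

Documents that share no topics score zero, so they fall to the bottom of the list regardless of how much raw vocabulary they might have in common.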



As shown in the last few slides above, we have included two 100-topic Models derived using the same parameters from documents predating the Revolution and those from 1789-1799.  These are both full installations of TopoLogic and are not directly linked to the Intertextual Hub.   Users may copy topic words from one Model and apply these to the full set of documents using the Search and Retrieval functions of the Hub. Some topics, such as 77 from the Revolutionary Model (pont, canal, ingenieur, navigation, riviere, chaussee, travail, construction, reparation, devis), are probably not significantly different from ancien régime concerns.  Other topics, however, more clearly reflect Revolutionary concerns, such as Topic 46 of the Revolutionary 100 (election, scrutin, nomination, electeur, suffrage, majorite, liste, membre, votant, pluralite).  Searching for this list of words in documents from 1700-1787 (run search) returns an interesting list of documents, the first six of which are chapters from La Rochefoucauld's Constitutions des treize États-Unis de l'Amérique (1783).


Running one's eye down the list of documents suggests that the discourse regarding elections found its origins in a number of examples from England, the emerging US states, and some other European states.   There is also an interesting mix of well-known names, Rousseau and Voltaire, authors who would become better known during the Revolution such as Brissot, and numerous lesser-known writers.

The Intertextual Hub is designed to offer potentially interesting texts to consider.  We employ Topic Models to provide granular search across the collections as well as to point to similar documents based on the current context.  Finally, we can track topics derived from documents of a later period back to earlier instances, potentially revealing connections that can offer new evaluations of these texts.



Notes

[1] There is an extensive literature on the use of topic models in digital humanities including JDH 2012.  




Reading the Bibliothèque de l'homme public in the Hub


The Intertextual Hub (https://intertextual-hub.org/) is an NEH funded project to develop a reading environment that aims to situate specific documents in their broader context of intertextual relations, whether in the form of direct or indirect borrowings, shared topics with other texts or parts of texts, or other kinds of lexical similarity. Relationships discovered by text mining algorithms among texts in large, heterogeneous collections can fruitfully inform and guide traditional close-reading approaches.  


The document collections in the Intertextual Hub can be approached in several ways. Viewed from the top or most abstract level, one may search the entire set of collections for specific topics or themes (see related discussion). What follows here is an examination of a specific document or a set of documents from, as it were, the bottom up. Using the Bibliothèque de l’homme public (BHP) as a point of departure, we are interested in aspects of reading the document which include:
  • similar passage identification, such as reuses, citations, paraphrasing,
  • identification of similar chapters, parts and selections, and,
  • thematic and semantic relationships between documents. 
All of these relationships are established from wider patterns identified by techniques generally known as distant reading. The slides shown below present a step by step itinerary of how one can navigate in the Hub starting from a single document.

The BHP was published between February 1790 and April 1792 by Condorcet and several others, spanning some 28 tomes.  The full title gives an indication of the nature of the project: Bibliothèque de l'homme public et Analyse raisonnée des principaux ouvrages français et étrangers sur la politique en général, la législation, les finances, la police, l'agriculture et le commerce en particulier, et sur le droit naturel et public.  (BNF Link)
It was one of numerous efforts by Condorcet to contribute to public instruction, and he published a number of pieces, most notably his Cinq Mémoires sur l'instruction publique (1791) and the discussion of Smith referenced below.  As Tourneux notes, however, his role was not clearly defined:
 
Barbier l'attribue à l'abbé Balestrier de Canilhac, dont le nom ne figure ni sur les titres, ni dans les avant-propos. Celui de Peyssonnel disparait au tome VI et Condorcet est seul nommé à partir du tome XI. Ce recueil, qui avait pour but de mettre autant que possible la science du gouvernement et de l'administration à la portée de tout le monde.... (Tourneux, Vol 2 p. 648).

While the BHP was aimed at educating and raising the awareness of newly minted French citizens by publishing the "analysis of well-known works, both ancient and modern" (Faccarello-Steiner 2002, p. 82), it was not always well received, as noted in the Journal des révolutions (1790, VII, p. 9-10, link):

Bibliothèque de l'homme public, par MM. de Condorcet, Chapelier et Peyssonnel ; le premier n'y travaillera point, le second n'y travaillera guère ; le dernier est vieux et cacochyme, il est froid et lent, deux qualités que n'avaient point Bayle, le Clerc et l'abbé Prévost.

It featured extended discussions of and extracts from numerous French, English, and classical authors, including major figures such as Aristotle, Machiavel, Bodin, Hobbes, Locke, Smith, Montesquieu, and Hume, as well as contemporary figures such as Mirabeau and Raynal and lesser-known authors such as Guicciardini.  While generally expository, not all of the discussions were intended to be positive:

La vivacité naturelle à l'esprit françois, l'économie du tems , l'ennui qu'entraîne un long ouvrage sur des matières, aussi sérieuses, le caractère national, tout concourt à nous faire adopter la méthode Analytique. [...]  On fera connoître aussi tous les ouvrages relatifs à ce plan, à mesure qu'ils paroîtront: on se permettra même des réflexions critiques, sans toutefois blesser l'amour-propre des auteurs: la malignité aigrit, & n'éclaire pas mieux qu'elle ne corrige.  (Bib homme public, 1790, vol 1 pp. vi & viii)
Smith's Wealth of Nations, for example, is extensively covered, taking up some 220 pages of the BHP. Diatkine (1993) argues that the summary is "very inaccurate", going on to suggest:
[T]he summary published by Bibliotheque de l'Homme Public is the Wealth of Nations minus the 'Invisible Hand'. This shortcoming is too systematic to be attributed to a casualness of approach or to technical difficulties. We are in the presence [of a] paradox: here is a book which seems to be very important, yet completely misunderstood. (pp 219-220)
The BHP is a highly intertextual collection with a significant number of direct and indirect references to a large number of major authors as well as relatively minor texts. It reflects a distillation and selection of late Enlightenment views on the nature of government and society.  Reading the BHP in the context of the Intertextual Hub allows one to navigate this collection with an eye to the intellectual inheritance of its authors and texts as well as the influence they later had during the Revolution.






There are, of course, a great number of texts in the collections deployed in the Intertextual Hub that have many borrowed, reused, or paraphrased passages that can be identified.  For example, the two-volume Les délassemens d'un homme d'esprit, ou nouveau recueil de pensées amusantes, extraites des meilleurs auteurs (1780) is made up of numerous extracts (link to search) organized by theme or subject, such as chapters on SPECTACLES and JALOUSIE.

This post will be followed by others which we hope will outline the various search and navigation facilities of the Intertextual Hub with a focus on step-by-step itineraries from specific starting points.

Please do post comments below or email us at artfl@artfl.uchicago.edu.  

References

Diatkine D. (1993), "A French Reading of the Wealth of Nations in 1790". In: Mizuta H., Sugiyama C. (eds) Adam Smith: International Perspectives. Palgrave Macmillan, London.  (DOI)

Faccarello, Gilbert and Steiner, Philippe. 2002. The diffusion of the work of Adam Smith in French Language. In Tribe, Keith (ed.), A Critical Bibliography of Adam Smith, London, Pickering and Chatto, pp. 61-119 (link)

Tourneux, M., Bibliographie de l'histoire de Paris pendant la Révolution française, Paris 1890-1913 (BNF)







Tracing Revolutionary Discourses

In our previous blog post in this series, Modeling Revolutionary Discourse, we outlined the integration of various analytic services and entry points to one of the collections -- the French Revolutionary Collection (FRC) -- we are using as part of ARTFL’s NEH-funded Intertextual Bridges project.  This provided three distinct ways to approach the richness of the Newberry Library collection: through PhiloLogic4 search and analysis capabilities, through our new TopoLogic instance, and via a ranked-relevance retrieval model.  We demonstrated the utility of different models of access and analysis and ways that combining these results could be used to pose different kinds of questions.  For example, using lists of topic words as the basis of a ranked-relevance search can reveal unexpected relationships between documents and discourses.

The Intertextual Bridges project is based on building ways to visualize and navigate relationships between disparate sets of collections.  For this project, we have started with seven different collections, representing a wide array of documentary materials concerning the French Revolution.  These include the Newberry FRC, the Archives Parlementaires (AP), the Baudouin Collection of Revolutionary Laws, the Journaux de Marat, as well as 18th century holdings from the ARTFL Frantext Collection, the Goldsmiths-Kress Collection, and French holdings of ECCO.  The collections differ from each other in important ways and require specific search and retrieval schemes to allow for proper handling.  The individual speakers of the AP are searchable as part of particular sessions, whereas the Newberry collection does not have such data identified.  Simply building all of the collections into a single database instance would reduce the analytic capabilities to the lowest common denominator.  Collection integration properly requires initial builds reflecting the specifics of each dataset, followed by abstraction to a top-level interface.


The first stage of database integration is the development of a top level search and retrieval scheme.  For this preliminary work, we built each of the target collections as a separate PhiloLogic4 instance.  We then used the ARTFL Text Preprocessing Library to extract metadata and word data from the standard representations generated by PhiloLogic4.  This allows us to use PhiloLogic4 services to support navigation back to the text.  The data extraction program allows the treatment of files as either entire documents or as collections of sub-units, depending on the available data markup.  The FRC, for example, does not have internal subdivisions, so it is treated as one text element per document.  By contrast, the Revolutionary Laws are tagged with divisions reflecting specific laws and other elements.  The Frantext selections and ECCO selections are typically divided into chapters.  Indexing and accessing text elements significantly improves search and retrieval tasks.  
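The distinction between whole documents and sub-units can be pictured with a small Python sketch.  This is only an illustration under stated assumptions: the `TextUnit` record, the `extract_units` function, and the sample identifiers are all hypothetical and do not reproduce the ARTFL Text Preprocessing Library or PhiloLogic4's data structures.

```python
from dataclasses import dataclass


@dataclass
class TextUnit:
    """One searchable text element (hypothetical record layout)."""
    collection: str
    doc_id: str
    unit_id: str
    title: str
    text: str


def extract_units(collection, doc_id, title, text, divisions=None):
    """Return one unit per tagged division, or a single unit for the whole file.

    `divisions` is an optional list of (heading, body) pairs standing in for
    whatever markup a collection provides (laws, chapters, sessions).
    """
    if not divisions:
        # No internal subdivisions: the document is one text element.
        return [TextUnit(collection, doc_id, f"{doc_id}:0", title, text)]
    return [
        TextUnit(collection, doc_id, f"{doc_id}:{i}", heading, body)
        for i, (heading, body) in enumerate(divisions, start=1)
    ]


# Made-up sample data: an undivided pamphlet and a volume with two tagged laws.
frc_units = extract_units("FRC", "frc001", "Pamphlet", "texte entier du pamphlet")
law_units = extract_units(
    "Laws", "law001", "Volume", "",
    divisions=[("Décret A", "texte du décret a"), ("Décret B", "texte du décret b")],
)
```

Keeping the parent `doc_id` on every unit is what would let a search hit on a single law or chapter link back to its source document.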


For the purposes of our prototype, we are using the Python Whoosh indexing and search library. We expect to move to a more scalable ranked-relevance search engine for the final product. We have released an instance of our Whoosh-based search tool at:

     https://artflsrv03.uchicago.edu/mark/hub/multipledb.whoosh.html
The search form allows the user to input a list of terms to find and to limit results to the specific collections and/or to time periods.  Results are ordered by a standard relevancy calculation and we have appended a simple count of authors and titles at the bottom of the report. Note that we have turned links to the full text off at this time, since the underlying PhiloLogic4 instances are on an internal research machine which we expect to be updating in the future.  A full implementation will have full links to the documents and other functions, such as TopoLogic, as outlined in our previous blog post.  
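To make the idea of a ranked search with collection and date limits concrete, here is a minimal sketch in plain Python.  It uses a textbook BM25 formula as a stand-in for the "standard relevancy calculation"; it does not call Whoosh, and the `bm25_search` function, the record fields, and the sample records are assumptions made up for this example.

```python
import math
from collections import Counter


def bm25_search(units, query, collections=None, years=None,
                k1=1.2, b=0.75, limit=100):
    """Rank units against an OR query, optionally limited by collection and year.

    `units` are dicts with 'collection', 'year' and 'text' keys (hypothetical).
    Returns (score, unit) pairs, best first.
    """
    pool = [
        u for u in units
        if (collections is None or u["collection"] in collections)
        and (years is None or years[0] <= u["year"] <= years[1])
    ]
    if not pool:
        return []
    tokens = [u["text"].lower().split() for u in pool]
    avg_len = (sum(len(t) for t in tokens) / len(tokens)) or 1.0
    n = len(pool)
    terms = query.lower().split()
    # Document frequency of each query term within the filtered pool.
    df = {t: sum(1 for toks in tokens if t in toks) for t in terms}
    scored = []
    for unit, toks in zip(pool, tokens):
        tf = Counter(toks)
        score = 0.0
        for t in terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        if score > 0:
            scored.append((score, unit))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Made-up records, not actual data from the collections.
units = [
    {"id": "law1", "collection": "Laws", "year": 1793,
     "text": "decret prix grains farines maximum"},
    {"id": "frc1", "collection": "FRC", "year": 1790,
     "text": "discours sur la liberte de la presse"},
    {"id": "frc2", "collection": "FRC", "year": 1792,
     "text": "subsistance pain prix grain farine marche"},
]
all_hits = [u["id"] for _, u in bm25_search(units, "grain pain prix farine")]
law_hits = [u["id"] for _, u in
            bm25_search(units, "grain pain prix farine", collections={"Laws"})]
```

Because a unit scores on any matching term, this behaves like an OR query; the filters are applied before scoring, so term statistics reflect only the selected collections and dates.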


For the query "grain subsistance recolte marche farine quantite pain denree prix bled" the search will return many results, displaying the first 100 instances (by default) and showing the relevance score of each document as well as optional snippets, as shown on the left.  The snippets may be omitted from the report, which then generates a list of corresponding documents.  The search will retrieve and score subsections of documents, such as chapters or sessions, in the same way as entire documents.  On the right one finds the continuation of the query for "grain subsistance...".  Limiting the query to the Revolutionary Laws collection will find specific laws on this subject, such as "Décret sur la police du commerce des grains l'approvisionnement des marchés des armées. Du 7 vendémiaire" of Year IV followed by (again in order of relevance to the query words) Décret qui fixe un maximum du prix des grains, farines et fourrages, et prononce des peines contre l'exportation. [11-9-1793].  

Ranked relevance retrieval across multiple collections is a useful way to identify documents and passages of interest.  We are also finding that combining this type of query with word vectors representing Revolutionary topics is a powerful way to trace aspects of Revolutionary discourses to often unexpected sources.  We have included two topic models generated from the 26,000-document Newberry French Revolution Collection.  As described earlier, topic models are unsupervised techniques to identify topics in collections of documents.  Topic models identify the topic mix for every document in a collection as well as lists of weighted words that are associated with each topic.  The TopoLogic instance of the 50 topic model can be found at:
     https://artflsrv03.uchicago.edu/topic-modeling-browser/frc1787_99/



We have included the top ten words in each topic with a link to the ranked relevance search for that topic across all of the collections.  Clicking on Search will query the words in this list against the Whoosh database.  The parameters are set to display the top 200 documents or sections from the entire collection.  We have also included the same data for a 100 topic model instance (click here).  No single topic model can properly capture the complexity of Revolutionary discourses.  Comparing the lists of 50 and 100 topics, you will find some are complementary, while others emerge only in the 100 topic model.  
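The step from a topic's weighted word list to a search query can be sketched in a few lines of Python.  The `topic_query` helper and the weights below are invented for illustration (only the words come from Topic 43 as listed on the page), and joining terms with an explicit OR is an assumption about how one might phrase the query, not a description of the links themselves.

```python
def topic_query(word_weights, n=10, operator="OR"):
    """Build a query string from the n highest-weighted words of a topic."""
    top = sorted(word_weights.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return f" {operator} ".join(word for word, _ in top)


# Hypothetical weights; a real topic model supplies its own values.
topic_43 = {
    "religion": 0.09, "culte": 0.07, "pretre": 0.05,
    "eglise": 0.04, "dieu": 0.03, "fanatisme": 0.02,
}
query = topic_query(topic_43, n=3)
```

Changing `n` is a quick way to test how sensitive the retrieved documents are to the lower-weighted words of a topic.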

While the static searches (clicking on Search with the set parameters) are useful, we recommend that you examine topics in more detail.  You can block copy the words from any of the topics to the search box and set the parameters as you see fit.  We have included on the search form one example.  This is a query for the words of Topic 4 (in the 50 topic model) "constitution pouvoir droit liberte nation peuple autorite homme principe propriete" in documents published before 1789, using the OR operator, and displaying the top 500 instances (click here to run this search).  This will return a list of documents or sections from pre-Revolutionary sources as shown on the right, led off by a translation of David Ramsay's History of the American Revolution and including the state constitution of Massachusetts. Scrolling down to the list of authors, one finds an interesting list of expected and rather unexpected authors including:


  • Du Buat, M. le comte (Louis-Gabriel), : 21
  • Mirabeau, Victor de Riquetti, marquis de, : 20
  • Holbach, Paul Henri Thiry, baron d', : 15
  • De Lolme, Jean Louis, : 14
  • Helvetius, : 13
  • Chamfort, Sébastien Roch Nicholas, : 11
  • Le Trosne, M. (Guillaume François), : 10
  • Mirabeau, Gabriel-Honoré de Riquetti, comte de, : 9
  • Le Mercier de La Rivière, Pierre-Paul, : 8
  • Bodin, Jean, : 8
  • Hume, David, : 7
  • Brissot de Warville, J.-P. (Jacques-Pierre), : 7
  • Franklin, Benjamin, : 6
  • Condorcet, Jean-Antoine-Nicolas de Caritat, Marquis de, : 6
Taking the words from Topic 43: "religion culte pretre eglise dieu fanatisme morale autel clerge divinite" and restricting the results to the 18th century holdings of ARTFL Frantext reveals the strong showing of Holbach (accounting for seven of the top ten most relevant sections) and Helvétius.  The list of top titles, recalling that sections are counted individually, is also suggestive:


  • Lettres juives : 52
  • De l'homme : de ses facultés intellectuelles et de son éducation : 35
  • Essay sur l'hist. génèrale / Voltaire. : 25
  • Le christianisme dévoilé, ou, Examen des principes et des effets de la religion Chrétienne : 17
  • Dictionnaire philosophique : Comprenant les 118 articles parus sous ce titre du vivant de Voltaire, avec leurs suppléments parus dans les Questions sur l'Encyclopédie. : 15
  • Le comte de Valmont : 12
  • Système de la nature, ou, Des loix du monde physique du monde moral : 12
  • Voyage du jeune anacharsis : 11
  • Histoire critique de Jésus-Christ ou analyse raisonnée des Évangiles : 10
  • De la philosophie de la nature : 10
  • Les helviennes : 10
  • Les Incas, ou, La destruction de l'empire du Pérou : 10
  • La contagion sacrée ou Histoire naturelle de la superstition OU Tableau des effets que les opinions religieuses ont produits sur la terre. Tome I : 9
  • Le compère Mathieu : 8
  • Traité sur la tolérance : 8
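Tallies like the author and title lists above are simple counts over the retrieved sections.  A minimal sketch, assuming each hit is a dict carrying `author` and `title` fields (hypothetical field names, with made-up records rather than actual search output):

```python
from collections import Counter


def tally(hits, field):
    """Count hits per value of `field`, most frequent first.

    Every retrieved section counts once, so a work with many relevant
    chapters appears with a correspondingly higher count.
    """
    counts = Counter(hit[field] for hit in hits if hit.get(field))
    return counts.most_common()


# Illustrative records only.
hits = [
    {"author": "Holbach", "title": "Système de la nature"},
    {"author": "Holbach", "title": "Le christianisme dévoilé"},
    {"author": "Holbach", "title": "Système de la nature"},
    {"author": "Voltaire", "title": "Traité sur la tolérance"},
]
by_author = tally(hits, "author")
by_title = tally(hits, "title")
```

Counting sections rather than works favors long, finely divided texts, which is worth keeping in mind when reading such lists.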
Moving this time to the 100 topic model, we can look for traces of topic 80 "convention jugement mort royaute inviolabilite souverainete peine tyran crime depute" in pre-1789 texts. In essence, we are asking whether this topic on the tyrannical nature of the sovereignty of the king, so prevalent in revolutionary discourse, has any echoes in earlier texts. It is interesting to see in the results a mix of theoretical works (such as Bodin's De la république, or Pufendorf's Droit de la nature et des gens), historical accounts (Raynal's Histoire du parlement d'Angleterre, or Boulainvilliers's Etat de la France), and literary sources (Voltaire's Cromwell, or Mercier's L'an deux mille quatre cent quarante), thus providing researchers with a broad and diverse overview of discussions of this topic in the pre-revolutionary period. 

In highlighting the possibility of using word vectors that emerge from topic models of Revolutionary discourses, we might be guilty of teleological readings of these earlier texts.  This one approach is simply meant to demonstrate the possibility of combining mixtures of algorithms to propose unexpected texts of potentially related interest.  As we move forward, we will be including topic models of the 18th century collections, to allow tracing of earlier topics into the Revolutionary era.  This is another level of navigation that we believe will help guide researchers through large collections, providing access to smaller segments of text that are more tightly focused on specific issues and topics.  


-- The ARTFL Team




Modeling Revolutionary Discourse


As part of our lead work on ARTFL’s NEH-funded Intertextual Bridges project, we are pleased to release a prototype build of the Newberry Library’s French Revolution Collection (FRC), which integrates topic model browsing and search, relevancy searching, and full PhiloLogic4 services, in a set of interrelated functions. This post will describe the current state of this work, document some of the functionalities, and provide an outline of our next steps of development.

In 2017, the Newberry Library released digital copies of more than 35,000 pamphlets, totalling approximately 850,000 pages, from its extremely rich holdings related to the French Revolution. Shortly thereafter, the ARTFL Project released versions of this unparalleled resource under PhiloLogic4. In a subsequent post, we described the collection, some of the capabilities of this initial installation and preliminary results using the tools deployed in this build.

We have two builds of the FRC under PhiloLogic. The first is simply a load of the entire collection of 38,377 documents as it was downloaded towards the end of 2017. We applied some error correction functions, which we recently modified slightly and applied again to the installation (search form). The bulk of our work has been aimed at the FRC collection for works from 1787-1799, with the aim of improving the data and metadata as well as removing duplicate documents. The 2017 release of the FRC at ARTFL contained 26,455 documents, where duplicates were identified by metadata comparison. Using data generated by our new sequence alignment package TextPair, which identified both similar passages and possibly duplicated documents, we further reduced the collection to 25,935 documents. 
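For readers curious how text-based duplicate detection can work in principle, here is a minimal sketch using word shingles and Jaccard similarity.  This is a deliberately simple stand-in and not TextPair's sequence alignment method; the function names, the threshold, and the sample texts are all assumptions made for illustration.

```python
def shingles(text, size=5):
    """Return the set of overlapping word n-grams in a text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two texts have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_duplicates(docs, threshold=0.8, size=5):
    """Return (id, id, score) for every pair at or above the threshold."""
    ids = sorted(docs)
    sets = {doc_id: shingles(docs[doc_id], size) for doc_id in ids}
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            score = jaccard(sets[left], sets[right])
            if score >= threshold:
                pairs.append((left, right, score))
    return pairs


# Made-up texts: "a" and "b" are the same pamphlet under two identifiers.
docs = {
    "a": "adresse des citoyens de paris a l assemblee nationale sur les subsistances",
    "b": "adresse des citoyens de paris a l assemblee nationale sur les subsistances",
    "c": "rapport du comite des finances sur la dette publique",
}
duplicates = likely_duplicates(docs, size=3)
```

The pairwise loop is quadratic in the number of documents, so a collection of this size would need a more scalable approach; candidate pairs would also still deserve manual review before any document is removed.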

We currently have three entry points to the collection. The basic component which underlies the whole system is PhiloLogic, our corpus query engine, which houses the word index, the structure and the metadata of the collection:
          https://artflsrv03.uchicago.edu/philologic4/frc1787-99rev2b/
To facilitate the discovery of documents relevant to search queries, we added on a ranked-relevance engine, called Whoosh, which is built on top of the PhiloLogic index:
          https://artflsrv03.uchicago.edu/mark/frc/frc1787-99.whoosh.html
Finally, as an additional way of exploring the topics and discourses that run through the FRC, we built a topic-modeling browser called TopoLogic, which also leverages the PhiloLogic instance:
          https://artflsrv03.uchicago.edu/topic-modeling-browser/frc1787_99/
While all three systems have specific capabilities and reporting features and function as discrete units, because they share a single data feed model (built from the PhiloLogic index), they are designed to be interoperable, and hence provide links across one another. It is our belief that there is no all-encompassing algorithmic approach to text analysis, and that topic-modeling provides one view that may be worth exploring, but no more so than other methods.

TopoLogic is the latest entry in our quest to build value-added services on top of the standard PhiloLogic index, and leverage topic-modeling techniques to offer an alternate way of exploring text collections. Topic-modeling, the algorithmic technique which we use for this new navigational tool, is an unsupervised machine learning approach designed to facilitate the exploration of large collections of texts where no topical information is provided. As such, this computational method can be a truly useful way of gaining a sense of the topical structure of a corpus -- i.e. to find out what's in there -- and how words are clustered together to form meaningful discourses.

TopoLogic builds upon the topics and semantic fields generated by the algorithm to provide a web-based navigation system which lets users explore topics and discourses across time, as well as word usage within different contexts. The interaction of the three different schemes allows the user to navigate between alternative ways of considering topics across the collection. The following slides are designed to give some idea of how users may navigate between topics, word searches and other capabilities provided by these different systems.



In our experience, there are a number of caveats to consider when using this algorithmic approach to text analysis. First, while topic-modeling is able to uncover relationships between words and documents without a training corpus (thus its unsupervised nature), it does require a certain number of priors, such as the number of topics to uncover, in order to function. In other words, the user of such a method needs to determine (through trial and error) what that user deems to be the more meaningful representation of the corpus. Our experience has shown us that slight changes in the underlying texts (such as adding or removing a couple of texts), or in the preprocessing steps (such as removing additional function words), can lead to drastically different results. All in all, we have always taken a very measured approach to our interpretation of topic models, and we strongly discourage relying upon them as the sole source for text analysis.

The systems complement each other by providing checks on the results of particular functions. For example, in slide X above, we present the top 50 documents for topic 19 as measured by topic weight. In using a rank relevancy search for the top 10 tokens for topic 19, we arrive at a rather different list. The differences are due to the interaction of weighting schemes and relevancy measures. Both are useful approaches, but do, by design, deliver somewhat different results.
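One simple way to quantify how far two such lists diverge is to measure their overlap at a fixed cutoff.  The sketch below is only an illustration: `overlap_at_k` is a hypothetical helper, and the document identifiers are placeholders rather than actual results for topic 19.

```python
def overlap_at_k(ranking_a, ranking_b, k=50):
    """Fraction of the top-k items shared by two ranked lists of document ids."""
    top_a, top_b = set(ranking_a[:k]), set(ranking_b[:k])
    if not top_a or not top_b:
        return 0.0
    return len(top_a & top_b) / min(len(top_a), len(top_b))


def only_in_first(ranking_a, ranking_b, k=50):
    """Items in the top k of the first list that are missing from the second."""
    top_b = set(ranking_b[:k])
    return [doc for doc in ranking_a[:k] if doc not in top_b]


# Placeholder identifiers standing in for two rankings of the same topic.
by_topic_weight = ["d1", "d2", "d3", "d4"]
by_relevance = ["d3", "d5", "d1", "d6"]
shared = overlap_at_k(by_topic_weight, by_relevance, k=4)
missing = only_in_first(by_topic_weight, by_relevance, k=4)
```

A low overlap is not an error in either system; the documents found by only one method are often the most interesting ones to inspect.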

It is our pleasure to acknowledge that the Newberry Library has released this extraordinary resource under the Open Data Commons Attribution License, ODC-BY 1.0. We believe that this splendid collection and the Newberry’s release of all of the data will facilitate a generation of ground-breaking work in Revolutionary studies. If you find the collection useful, please do contact the Newberry Library to congratulate them on this wonderful initiative and to let them know how their efforts contribute to your research.

-- Clovis & Mark