Earlier this year Mark built a frequency query for the French texts (affectionately named wordcount.pl). Kristin has now implemented this for our Greek and Latin texts. If you wonder what's new about this: word counts for individual documents have always been available in PhiloLogic loads, but the difference here is that you can see frequencies over the entire corpus, or over a subset of works or authors. You can find the forms here:
http://perseus.uchicago.edu/LatinFrequency.html
http://perseus.uchicago.edu/GreekFrequency.html
Update: Forms moved...
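To make the per-document versus corpus-wide distinction concrete, here is a minimal Python sketch. It is not wordcount.pl and not PhiloLogic code: the `CORPUS` records, the `tokenize` helper, and the `corpus_frequencies` function are all hypothetical names invented for illustration, and the texts are toy strings.

```python
from collections import Counter
import re

# Hypothetical toy corpus; real loads would read documents from the database.
CORPUS = [
    {"author": "Author A", "title": "Work 1", "text": "amor vincit omnia"},
    {"author": "Author B", "title": "Work 2", "text": "omnia mutantur nihil interit"},
    {"author": "Author A", "title": "Work 3", "text": "omnia vincit amor"},
]

def tokenize(text):
    """Lowercase the text and split it on runs of non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]

def corpus_frequencies(docs, author=None):
    """Count word forms over all docs, or only over docs by `author`."""
    counts = Counter()
    for doc in docs:
        if author is None or doc["author"] == author:
            counts.update(tokenize(doc["text"]))
    return counts

whole_corpus = corpus_frequencies(CORPUS)
author_subset = corpus_frequencies(CORPUS, author="Author A")
```

Here `whole_corpus` aggregates counts across every document, while `author_subset` restricts the same count to one author's works, which is the kind of subset the forms are described as supporting.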
Do LDA generated topics match human identified topics?
I've been experimenting lately with how well LDA-generated topics match the Encyclopédie classes of knowledge. The experiment was conducted in the following way:
- I chose 100 classes of knowledge in the Encyclopédie, and picked 50 articles from each.
- I then ran a first LDA topic trainer, choosing 100 topics.
- I then proceeded to identify each generated topic and name it after one of the Encyclopédie classes of knowledge.
- My plan was then to look at the topic proportions per article and see if the top topic would correspond to its class...
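The last step, comparing each article's top topic with its class of knowledge, can be sketched in Python as below. This is only an illustration under stated assumptions: the function names `top_topic` and `match_rate` are hypothetical, the topic proportions are made-up numbers rather than output of the experiment, and the LDA training itself is not shown.

```python
def top_topic(proportions):
    """Return the index of the topic with the largest proportion."""
    return max(range(len(proportions)), key=lambda i: proportions[i])

def match_rate(articles, topic_names):
    """Fraction of articles whose top topic is named after the article's class."""
    hits = sum(
        1
        for article in articles
        if topic_names[top_topic(article["proportions"])] == article["class"]
    )
    return hits / len(articles)

# Hypothetical hand-assigned names for three topics (the experiment used 100).
topic_names = {0: "Géographie", 1: "Médecine", 2: "Jurisprudence"}

# Made-up topic proportions for illustration only; not experimental results.
articles = [
    {"class": "Géographie", "proportions": [0.7, 0.2, 0.1]},
    {"class": "Médecine", "proportions": [0.1, 0.6, 0.3]},
    {"class": "Jurisprudence", "proportions": [0.5, 0.1, 0.4]},
]

rate = match_rate(articles, topic_names)
```

In this toy data the third article's top topic is the one named "Géographie", so it counts as a mismatch against its "Jurisprudence" class.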
Section Highlighting in PhiloLogic
In many of the Perseus texts currently loaded under PhiloLogic, the section labels would overlap and be unreadable. These labels come from the milestone tags in the XML text and are placed along the edge of the text. One particularly problematic text in this regard was the New Testament, as the sections were verses and were thus often small sections of text. In order to fix the overlapping issue, I wrote a little bit of JavaScript to hide the tags which would be placed in the same position as a previous tag. I also added a function...
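The hiding rule can be illustrated with a short sketch. Note that this is Python, not the JavaScript mentioned in the post, and it only models the decision logic, not the DOM work. The function name `visible_labels`, the `(text, y)` representation of a label, and the `min_gap` parameter are all assumptions made for illustration.

```python
def visible_labels(labels, min_gap=1):
    """Pick which labels to show, given (text, y) pairs in document order.

    A label is hidden when it would land within `min_gap` units of the
    last label that was shown. With integer positions and the default
    min_gap of 1, only labels at exactly the same position are hidden.
    """
    shown = []
    last_y = None
    for text, y in labels:
        if last_y is None or abs(y - last_y) >= min_gap:
            shown.append(text)
            last_y = y
    return shown

# Hypothetical verse labels: three verses fall on the same line of text.
verses = [("1", 100), ("2", 100), ("3", 100), ("4", 118)]
kept = visible_labels(verses)
```

Raising `min_gap` would also hide labels that are merely close together rather than exactly coincident, which goes beyond what the post describes.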
Towards PhiloLogic4
Earlier this year I wrote a long discussion paper called "Renovating PhiloLogic" which provided an overview of the system architecture, a frank review of the strengths and (many) failings of the current implementation of the 3 series of PhiloLogic, and proposed a general design model for what would effectively be a complete reimplementation of the system, retaining only selected portions of the existing code base. While we are still discussing this, often in great detail, a few general objectives for any future renovation...