This is a follow-up to my previous blog entry on topic modeling in the Encyclopédie. As the title of this post suggests, I will show here the proportions of topics per article. Rather than post those results without further comment, I would like to focus on 12 randomly chosen articles to give a sense of the results one can get. My impression is that the 300 topic model gives the best results. What do you think? Note that there is still plenty of room for refinement.
Examples from the 42 topic model:
http://docs.google.com/View?id=dgrbcw9z_69gk9w5tgc
Examples from the 100 topic model:
http://docs.google.com/View?id=dgrbcw9z_70c2n79kgv
Examples from the 150 topic model:
http://docs.google.com/View?id=dgrbcw9z_71cx73tsch
Examples from the 200 topic model:
http://docs.google.com/View?id=dgrbcw9z_724t5x9mfm
Examples from the 250 topic model:
http://docs.google.com/View?id=dgrbcw9z_73fvznkb7j
Examples from the 300 topic model:
http://docs.google.com/View?id=dgrbcw9z_74chqfgsct
Examples from the 350 topic model:
http://docs.google.com/View?id=dgrbcw9z_75chsw8gcp
If you would like to look at the full results yourself, here they are. In each file, the first number is the topic, with its proportion in parentheses. The article number is the div number of the article:
http://robespierre.uchicago.edu/topic_modeling/topics_in_articles_42.txt
http://robespierre.uchicago.edu/topic_modeling/topics_in_articles_100.txt
http://robespierre.uchicago.edu/topic_modeling/topics_in_articles_150.txt
http://robespierre.uchicago.edu/topic_modeling/topics_in_articles_200.txt
http://robespierre.uchicago.edu/topic_modeling/topics_in_articles_250.txt
http://robespierre.uchicago.edu/topic_modeling/topics_in_articles_300.txt
http://robespierre.uchicago.edu/topic_modeling/topics_in_articles_350.txt
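For anyone who wants to explore these files programmatically, here is a minimal Python sketch that ranks the topics of one article by proportion. The exact line layout is an assumption on my part: I suppose each line starts with the article's div number, followed by topic numbers each with its proportion in parentheses. The names parse_line and top_topics and the sample line are hypothetical and made up for illustration, so check them against the real files before relying on this.

```python
import re

# ASSUMED line layout (hypothetical, verify against the actual files):
#   <div number> <topic> (<proportion>) <topic> (<proportion>) ...
DIV = re.compile(r"\s*(\d+)")
PAIR = re.compile(r"(\d+)\s*\(\s*([0-9]*\.?[0-9]+)\s*\)")


def parse_line(line):
    """Return (div_number, [(topic, proportion), ...]), or None if the line does not fit."""
    head = DIV.match(line)
    if head is None:
        return None
    # Everything after the leading div number is scanned for topic (proportion) pairs.
    rest = line[head.end():]
    pairs = [(int(topic), float(prop)) for topic, prop in PAIR.findall(rest)]
    if not pairs:
        return None
    return int(head.group(1)), pairs


def top_topics(line, n=3):
    """Return (div_number, the n topics with the highest proportion), or None."""
    parsed = parse_line(line)
    if parsed is None:
        return None
    div_number, pairs = parsed
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return div_number, ranked[:n]


# Made-up sample line, not taken from the real data.
sample = "1234 17 (0.42) 5 (0.13) 88 (0.31)"
print(top_topics(sample, n=2))
```

Sorting by proportion puts the dominant topics first even if the files do not list them in order, which makes it easier to compare the same article across the different models.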