Porteous, Ian, David Newman, Alexander Ihler, Arthur Asuncion, Padhraic Smyth, and Max Welling. "Fast collapsed Gibbs sampling for latent Dirichlet allocation." KDD '08: Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM, 2008, 569-577. (Link)
This paper describes Fast LDA and suggests that it may be helpful in "real time" topic modeling of a few thousand documents returned by a search engine. The introduction to section 3 gives a nice "intuitive" description of LDA, helpful for those who, like me, are significantly math-challenged, and the paper also includes some algorithm descriptions. The paper has links to code, and David Newman has posted links to some earlier code which may be of considerable interest. Newman has also done some interesting work on topic modeling of 18th-century American newspapers (link and link).