In thinking about another project, I ran across Harold Love's Scribal Publication in Seventeenth-Century England (Oxford: Clarendon Press, 1993. Pp. xi+379). [Google Books]
This has an interesting discussion of scribal publication as a "perfect example" of Don Swanson's notion of "Undiscovered Public Knowledge". "By this he [Swanson] means knowledge that exists 'like scattered pieces of a puzzle' in scholarly books and articles, but remains unknown because its 'logically related parts ... have never become known to one person'." The reference is to Don R. Swanson, 'Undiscovered public knowledge', Library Quarterly 56 (1986). Professor Swanson's work is aimed primarily at bio-medical research, using a system that he and his colleagues call Arrowsmith. It is available at http://kiwi.uchicago.edu/ (currently in Charlie's office), which has links to recent papers and more references.
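To make the "scattered pieces of a puzzle" idea a bit more concrete, here is a toy sketch of indirect linking: two terms that never appear together in any one document, but that are each tied to the same intermediate terms, become candidates for a hidden connection. This is only an illustration and not a description of how Arrowsmith actually works. The term names loosely echo Swanson's fish oil example, but the "documents" are invented, and `hidden_links` is a hypothetical helper name.

```python
from collections import defaultdict
from itertools import combinations

# Invented toy "literature": each document is reduced to a set of terms.
docs = [
    {"fish oil", "blood viscosity"},
    {"blood viscosity", "raynaud's syndrome"},
    {"fish oil", "platelet aggregation"},
    {"platelet aggregation", "raynaud's syndrome"},
    {"fish oil", "diet"},
]


def hidden_links(docs):
    """Return {(a, c): {b, ...}} for term pairs that never co-occur in any
    document but share at least one intermediate term b."""
    # Record, for every term, the terms it appears alongside.
    neighbours = defaultdict(set)
    for doc in docs:
        for x, y in combinations(sorted(doc), 2):
            neighbours[x].add(y)
            neighbours[y].add(x)

    links = {}
    for a, c in combinations(sorted(neighbours), 2):
        if c in neighbours[a]:
            continue  # a and c already appear together somewhere
        shared = neighbours[a] & neighbours[c]
        if shared:
            links[(a, c)] = shared
    return links


links = hidden_links(docs)
for (a, c), via in sorted(links.items()):
    print(f"{a} <-> {c} via {sorted(via)}")
```

On real texts the hard part is deciding what counts as a "term" and pruning the flood of trivial connections (here, for instance, everything linked only through "fish oil"), which is presumably where most of the research effort goes.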
It may be interesting to think about how this might be applied to research in the humanities. Other work in the same area suggests that latent semantic indexing, a variation on the general vector space model, may be of use.
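For reference, a minimal sketch of the plain vector space model that latent semantic indexing builds on: each text becomes a vector of term counts, and texts are compared by the cosine of the angle between their vectors. LSI would add a further step (a truncated singular value decomposition of the term-document matrix) that is not shown here. The three sample texts are invented, and `tf_vector` and `cosine` are hypothetical helper names.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for one text."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(n * n for n in u.values()))
    norm_v = math.sqrt(sum(n * n for n in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented sample texts, purely for illustration.
texts = {
    "d1": "scribal copies circulated among readers",
    "d2": "readers circulated scribal copies privately",
    "d3": "printed books sold in shops",
}
vectors = {name: tf_vector(t) for name, t in texts.items()}

sim_12 = cosine(vectors["d1"], vectors["d2"])  # four of five terms shared
sim_13 = cosine(vectors["d1"], vectors["d3"])  # no terms shared
print(sim_12, sim_13)
```

The weakness this exposes is that texts with no words in common score zero even if they are about the same thing, which is the gap LSI is meant to narrow.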
A few more papers to think about:
Xiaohua Hu, et al. "Mining undiscovered public knowledge from complementary and non-interactive biomedical literature through semantic pruning", Proceedings of the 14th ACM international conference on Information and knowledge management (2005) [Link] and "Supercomputing Approach to Undiscovered Public Knowledge" [Link] from UIUC (of course).
I will post more related articles on the ARTFL CiteULike and, if I remember, use the tag UDPK to cluster the papers.