Library Innovation Lab: Difference between revisions

From Berkman Klein Google Summer of Code Wiki

Revision as of 10:33, 9 March 2011

Two potential projects:


1. Syllabus parser. Design, structure and populate an open repository of the information in college syllabi. [Note that this project will be done in conjunction with the Harvard Library Innovation Lab.]

  • Assuming we get permission, figure out how to retrieve syllabi from Google. (If we don't get permission, we have a starter set of 500,000+ syllabi.)
  • Figure out how to parse the many free-form formats in which syllabi are found.
  • Design an appropriate and open data model for the information in syllabi.
  • Build a Web site that provides useful end-user and API access to the syllabus data.
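
As a starting point for discussion, the parsing and data-model steps above might be sketched roughly as follows. This is a minimal illustration, not a proposed design: the Syllabus record, its field names, the parse_syllabus function, and the line heuristics are all assumptions invented for the example, and real syllabi will need far more robust handling.

```python
import json
import re
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Syllabus:
    """Hypothetical record for one syllabus; field names are illustrative."""
    course_title: Optional[str] = None
    instructor: Optional[str] = None
    readings: List[str] = field(default_factory=list)


# Assumed heuristics for labelled lines and list items; not exhaustive.
_INSTRUCTOR = re.compile(r"^(?:instructor|professor|taught by)\s*[:\-]\s*(.+)$", re.I)
_READINGS_HEADING = re.compile(r"^(required\s+)?(readings?|texts?)\s*:?$", re.I)
_BULLET = re.compile(r"^(?:[-*\u2022]|\d+[.)])\s+(.+)$")


def parse_syllabus(text: str) -> Syllabus:
    """Extract a few fields from free-form syllabus text using line heuristics."""
    syllabus = Syllabus()
    in_readings = False
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # Assumption: the first non-empty line is the course title.
        if syllabus.course_title is None:
            syllabus.course_title = stripped
            continue
        match = _INSTRUCTOR.match(stripped)
        if match:
            syllabus.instructor = match.group(1).strip()
            in_readings = False
            continue
        # A heading such as "Required Readings:" opens the reading list.
        if _READINGS_HEADING.match(stripped):
            in_readings = True
            continue
        bullet = _BULLET.match(stripped)
        if in_readings and bullet:
            syllabus.readings.append(bullet.group(1).strip())
        elif in_readings:
            # Any non-list line closes the reading list.
            in_readings = False
    return syllabus


def to_json(syllabus: Syllabus) -> str:
    """Serialize the record to JSON, e.g. for an API response."""
    return json.dumps(asdict(syllabus), indent=2)
```

Calling to_json(parse_syllabus(text)) on a plain-text syllabus yields a JSON object with the three fields above. Keeping the record a plain dataclass makes it easy to serialize to an open format while the real data model is still being worked out.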


2. Scholarly semantic web builder. The aim is to crawl the Google Books corpus looking for useful relationships among scholarly works. Such relationships only begin with citations/footnotes. What other semantic cues can be unearthed to see how scholarly books relate? [Note that this project will be done in conjunction with the Harvard Library Innovation Lab.]

  • Research the sorts of relations between books that would be of high value to scholars and researchers, in addition to footnotes.
  • Crawl the Google Books corpus to discover these relations [if Google gives us permission].
  • Make these relations accessible in an open way, especially in conjunction with the ShelfLife app that provides community-based wayfaring through Harvard Library's holdings for scholars and researchers.
  • Create interesting and understandable analytics based on the discovered relationships.
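
To make the last two steps more concrete, here is a small hedged sketch of how discovered relations might be stored and summarized. The Relation tuple, the relation kinds other than footnotes (for example "quotation"), and the function names are hypothetical and chosen only for illustration; which relations are actually worth capturing is the research question in the first step.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Relation(NamedTuple):
    source: str  # identifier of the book making the reference
    target: str  # identifier of the book being referred to
    kind: str    # e.g. "footnote" or "quotation"; labels are illustrative


def index_by_kind(relations: Iterable[Relation]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (source, target) pairs under their relation kind."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for rel in relations:
        index[rel.kind].append((rel.source, rel.target))
    return dict(index)


def most_referenced(
    relations: Iterable[Relation],
    kind: Optional[str] = None,
    top: int = 3,
) -> List[Tuple[str, int]]:
    """Count how often each book is the target of a relation.

    If kind is given, only relations of that kind are counted.
    """
    counts = Counter(
        rel.target for rel in relations if kind is None or rel.kind == kind
    )
    return counts.most_common(top)
```

For example, given two footnote relations pointing at book "C" and one quotation pointing at book "B", most_referenced(relations, top=1) returns [("C", 2)]. A count like this is one of the simplest analytics that could be surfaced alongside a book's record.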