Digital Public Library of America

From Berkman Klein Google Summer of Code Wiki
Revision as of 09:35, 20 March 2012 by Geeks (talk | contribs)

Library Item Matching Service

The Digital Public Library of America (DPLA) software platform is gathering metadata about items held in libraries, museums, archives, and online cultural collections. Many of these items have identifiers in standard namespaces such as ISBNs, OCLC identifiers, and Open Library IDs. The DPLA platform would like to offer a service through its API that lets developers submit whatever information they have about a particular item and receive any or all of the identifiers known to the DPLA. If the developer already has one of the standard IDs, finding the others is a table lookup, although this might require calling the APIs of other services, such as OCLC.org's. The problem becomes harder when the query does not include an identifier but does include other metadata such as author, title, publisher, or year. In that case the matching is probabilistic, because records often vary in these details or are incomplete, and the query itself may contain errors or variations. This project would consist of building a useful service that takes all of this into account and returns results together with a numeric measure of the system's confidence in each result.
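As a rough illustration of the two cases described above, the following Python sketch does an exact identifier lookup first and falls back to weighted fuzzy matching on descriptive metadata. Everything in it is an assumption made for illustration: the record layout, the placeholder identifier values, the field weights, the threshold, and the function name match_item are all hypothetical and are not part of any DPLA API. String similarity comes from the standard library's difflib, which stands in for whatever matching technique the project would actually choose.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory records with placeholder identifiers.
# A real service would query the DPLA metadata store instead.
RECORDS = [
    {"ids": {"isbn": "ISBN-A", "oclc": "OCLC-A", "olid": "OL-A"},
     "title": "The Odyssey", "author": "Homer", "year": "2003"},
    {"ids": {"isbn": "ISBN-B", "oclc": "OCLC-B", "olid": "OL-B"},
     "title": "Moby-Dick", "author": "Herman Melville", "year": "1992"},
]

# Assumed weights: how much each field contributes to the confidence score.
FIELD_WEIGHTS = {"title": 0.5, "author": 0.3, "year": 0.2}


def normalize(text):
    """Lowercase and collapse whitespace so trivial variations do not matter."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_item(query, records=RECORDS, threshold=0.6):
    """Return a list of (identifiers, confidence) pairs, best match first."""
    # Case 1: the query carries a standard identifier, so this is a lookup.
    for namespace, value in query.get("ids", {}).items():
        for record in records:
            if record["ids"].get(namespace) == value:
                return [(record["ids"], 1.0)]

    # Case 2: no identifier matched; score each record on descriptive fields.
    supplied = [f for f in FIELD_WEIGHTS if query.get(f)]
    weight_sum = sum(FIELD_WEIGHTS[f] for f in supplied)
    if weight_sum == 0:
        return []

    results = []
    for record in records:
        total = 0.0
        for field in supplied:
            # A field missing from the record contributes nothing.
            if record.get(field):
                total += FIELD_WEIGHTS[field] * similarity(query[field], record[field])
        confidence = total / weight_sum
        if confidence >= threshold:
            results.append((record["ids"], round(confidence, 3)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

A call such as match_item({"title": "Odyssey", "author": "Homer"}) would return the identifiers of the first record along with a confidence below 1.0, since the title is only a partial match. The score here is just a weighted average of string similarities relative to the fields the query supplied; it is not a calibrated probability, and a real service would likely need better normalization, more fields, and tuning against known matches.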

Find more information about the Digital Public Library of America at: dp.la


Mentor: mphillips@law.harvard.edu

General Questions: berkmancenterharvard@gmail.com