Amber
Amber (https://amberlink.org) is a plugin for WordPress and Drupal to keep links working on blogs and websites.
Sick of seeing “404 Not Found”? Amber automatically preserves a snapshot of every page linked to on a website, giving visitors a fallback option if links become inaccessible. If one of the pages linked to on this website ever goes down, Amber can provide visitors with access to an alternate version. By default, Amber stores snapshots directly on the host WordPress or Drupal website, but users can choose to store snapshots using a combination of the following third-party storage and archiving systems: the Internet Archive, Perma.cc, and Amazon Simple Storage Service (Amazon S3).
Our goal for this summer is to make Amber even more distributed and resilient. Does this mean sharding cached content for preservation among a number of distributed nodes? Building in greater interoperability with the vast network of other web archiving systems? Or working with the latest Internet Archive technologies to produce better, more standard, cached content? The answer is up to you!
More info: https://amberlink.org
GitHub repos: https://github.com/berkmancenter/amber_wordpress, https://github.com/berkmancenter/amber_drupal, https://github.com/berkmancenter/amber_common
Ideal candidate criteria:
Amber is interested in candidates with experience in PHP, as well as familiarity with developing for either WordPress or Drupal. (It is not necessary to be experienced in both WordPress *and* Drupal, but you must be comfortable with one platform to be considered.) Familiarity with Python and/or the following projects is a definite plus, but not by any means required:
- Memento (http://timetravel.mementoweb.org)
- Python Wayback, also known as pywb (https://github.com/ikreymer/pywb), and Webrecorder (https://github.com/webrecorder/webrecorder)
- IPFS (https://ipfs.io)
We have a robust community of active users, and thus have a list of high-, medium-, and low-priority features outlined for development this summer. However, we encourage the GSoC intern to think up new ideas to make Amber more distributed and resilient.
Example sub-projects include:
- Python Wayback/Webrecorder integration to produce high-quality WARC files as opposed to HTML files.
- IPFS integration (https://github.com/ipfs/ipfs) or other sharding of cached Amber content for preservation, for example using torrents.
- Modifying a Memento TimeGate negotiation server (https://github.com/mementoweb/timegate) to easily enumerate, find, and dispatch content cached by Amber users.
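To give a feel for the TimeGate idea in the last sub-project, here is a minimal Python sketch of Memento datetime negotiation: given the `Accept-Datetime` value a client sends, pick the cached snapshot captured closest to that time and build a redirect to it. The snapshot index, the example URLs, and the function names are all hypothetical and are not part of Amber or the mementoweb TimeGate code. A real integration would need to discover snapshots across many Amber installations.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical index: original URL -> list of (capture time, snapshot location).
# In practice this would be gathered from many Amber installations.
SNAPSHOTS = {
    "http://example.com/page": [
        (datetime(2015, 3, 1, tzinfo=timezone.utc), "https://blog-a.example/amber/cache/abc"),
        (datetime(2016, 1, 15, tzinfo=timezone.utc), "https://blog-b.example/amber/cache/def"),
        (datetime(2016, 6, 30, tzinfo=timezone.utc), "https://blog-c.example/amber/cache/ghi"),
    ],
}


def closest_snapshot(original_url, accept_datetime, index=SNAPSHOTS):
    """Return the (capture time, location) pair nearest to the requested time.

    accept_datetime is the raw Accept-Datetime header value, an HTTP date
    such as "Mon, 01 Feb 2016 00:00:00 GMT". Returns None if the URL has
    no known snapshots.
    """
    captures = index.get(original_url)
    if not captures:
        return None
    wanted = parsedate_to_datetime(accept_datetime)
    if wanted.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption).
        wanted = wanted.replace(tzinfo=timezone.utc)
    return min(captures, key=lambda capture: abs(capture[0] - wanted))


def timegate_response(original_url, accept_datetime, index=SNAPSHOTS):
    """Build a (status, headers) pair in the style of a Memento TimeGate."""
    match = closest_snapshot(original_url, accept_datetime, index)
    if match is None:
        return 404, {}
    _, location = match
    headers = {
        "Location": location,
        "Vary": "accept-datetime",
        "Link": '<%s>; rel="original"' % original_url,
    }
    return 302, headers


if __name__ == "__main__":
    print(timegate_response("http://example.com/page", "Mon, 01 Feb 2016 00:00:00 GMT"))
```

The linear scan with `min` keeps the sketch short. An index spanning many sites would more likely keep captures sorted by time and use a binary search.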
In your application, tell us what you want to do with Amber and why. We want to make following links better for web users everywhere, and we welcome your creativity.