Herdict: Difference between revisions

From Berkman Klein Google Summer of Code Wiki
Line 12:
Project 2:


We are looking to make some changes to our downloadable toolbar.
Herdict currently has a downloadable [http://www.herdict.org/participate/download browser add-on] that enables users to report inaccessible sites directly from their browser. We aim to supplement that functionality with automatic reporting when the user encounters an identifiable inaccessible site. For instance, when a request fails with an identifiable error, such as a DNS resolution failure or an HTTP 504 (Gateway Timeout) response, our add-on should be able to detect that error and use it to generate an inaccessible report for Herdict. Additionally, some ISPs serve known blocking patterns: the site resolves, but the page returned indicates only that the requested site is inaccessible. The add-on should be able to automatically detect these known blocking patterns and generate an inaccessible report. This new functionality will reduce the hurdles that currently exist for Herdict users.
In particular, we want to automate the detection of blocked sites. When
a browser receives a 504 (Gateway Timeout) error, we want the toolbar to
catch it and report it. In addition, we have known blocking patterns from
ONI, and we want the toolbar to be able to identify a match with a known
blocking pattern and report that as an inaccessible page as well.

Revision as of 14:14, 7 March 2012

Project 1:

We are looking for someone who can help us create a program that scours a microblogging site, like Twitter, and identifies when someone might be saying that a site is down or blocked. The program would continuously look for patterns such as a URL within 3 words of "blocked", "censored", "inaccessible", "I can't get to", etc. We are looking for accuracy north of 75%, not perfection. Bonus points if we can do it for Sina Weibo instead of Twitter, but that will require some language skills.
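
The "URL within 3 words of a keyword" idea can be sketched roughly as follows. This is only an illustration in Python, not a proposed implementation: the function name `find_reported_urls`, the phrase list, and the way distance is counted (in whitespace-separated tokens) are all assumptions, and fetching posts from Twitter or Sina Weibo is left out entirely.

```python
import re

# Assumed phrase list, taken from the examples in the project description.
PHRASES = ["blocked", "censored", "inaccessible", "i can't get to"]

# Very rough URL matcher; a real project would need something more robust.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)

# Punctuation stripped from the ends of each token before matching.
PUNCT = '.,!?;:"()'


def find_reported_urls(text, max_distance=3):
    """Return URLs that appear within max_distance tokens of a phrase."""
    tokens = text.split()
    cleaned = [t.strip(PUNCT) for t in tokens]
    # Lowercase and normalize curly apostrophes so "can’t" matches "can't".
    words = [c.lower().replace("\u2019", "'") for c in cleaned]

    # Token positions that look like URLs.
    url_positions = [i for i, c in enumerate(cleaned) if URL_RE.match(c)]

    # (start, end) token spans where one of the phrases occurs.
    phrase_spans = []
    for phrase in PHRASES:
        parts = phrase.split()
        n = len(parts)
        for start in range(len(words) - n + 1):
            if words[start:start + n] == parts:
                phrase_spans.append((start, start + n - 1))

    hits = []
    for i in url_positions:
        for start, end in phrase_spans:
            # Distance from the URL to the nearest edge of the phrase.
            if min(abs(i - start), abs(i - end)) <= max_distance:
                hits.append(cleaned[i])
                break
    return hits
```

For example, `find_reported_urls("Why is http://example.com blocked again?")` returns `["http://example.com"]`, while a post that mentions a URL and the word "blocked" far apart returns an empty list. A plain keyword match like this will produce false positives (for instance, "my blog about blocked drains at www.example.org"), so reaching the 75% accuracy target would likely need additional filtering on top of it.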

Project 2:

Herdict currently has a downloadable browser add-on that enables users to report inaccessible sites directly from their browser. We aim to supplement that functionality with automatic reporting when the user encounters an identifiable inaccessible site. For instance, when a request fails with an identifiable error, such as a DNS resolution failure or an HTTP 504 (Gateway Timeout) response, our add-on should be able to detect that error and use it to generate an inaccessible report for Herdict. Additionally, some ISPs serve known blocking patterns: the site resolves, but the page returned indicates only that the requested site is inaccessible. The add-on should be able to automatically detect these known blocking patterns and generate an inaccessible report. This new functionality will reduce the hurdles that currently exist for Herdict users.
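
The detection logic described above might look something like the sketch below. It is written in Python purely to show the decision steps; an actual browser add-on would be written in JavaScript against the browser's extension APIs. Everything named here is hypothetical: the `PageLoad` record, the `classify` and `build_report` functions, the report fields, and especially the two regular expressions, which are made-up placeholders and not real blocking patterns from any ISP or from ONI. Nothing here reflects Herdict's actual reporting interface.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Placeholder block-page patterns (invented for illustration only).
BLOCK_PAGE_PATTERNS = [
    re.compile(r"access to this (site|page) has been (blocked|denied)", re.I),
    re.compile(r"this (web)?site is not available in your country", re.I),
]


@dataclass
class PageLoad:
    """Hypothetical summary of one page load as seen by the add-on."""
    url: str
    dns_failed: bool = False       # the host name could not be resolved
    status: Optional[int] = None   # HTTP status code, if a response arrived
    body: str = ""                 # response body text, if any


def classify(load: PageLoad) -> Optional[str]:
    """Return a reason string if the load looks inaccessible, else None."""
    if load.dns_failed:
        return "dns_failure"
    if load.status == 504:
        return "gateway_timeout"
    if any(p.search(load.body) for p in BLOCK_PAGE_PATTERNS):
        return "block_page"
    return None


def build_report(load: PageLoad) -> Optional[dict]:
    """Build an (assumed) inaccessible report, or None if nothing was detected."""
    reason = classify(load)
    if reason is None:
        return None
    return {"url": load.url, "accessible": False, "reason": reason}
```

The checks run in order from the cheapest signal (a network-level failure) to the most expensive (scanning the page body), and a page that matches nothing produces no report, so ordinary browsing would not generate traffic to Herdict. Whether a report should be sent without asking the user first is a design question the sketch does not address.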