

Re: [projectvrm] Facial recognition's 'dirty little secret': Millions of online photos scraped without consent


Chronological Thread 
  • From: MXS Insights < >
  • To: Guy Jarvis < >
  • Cc: Benjamin Goering < >, "Dr. Augustine Fou" < >, ProjectVRM list < >
  • Subject: Re: [projectvrm] Facial recognition's 'dirty little secret': Millions of online photos scraped without consent
  • Date: Wed, 13 Mar 2019 19:30:14 +0100

It really speaks to one of the main challenges around data, and the unworkability of the current consent model.  

How can you consent to that which you don’t know exists today and for some infinite time into the future?  
Data and images uploaded/shared for one purpose are then used for a purpose never imagined or conceived of; what risks are individuals potentially subject to in the future?

A share on a photo site to be used as ‘clip art’ is one thing, but to be used as training material for some future algorithm of unknown purpose? Is this even covered by the intent of Creative Commons?



Hmm,

This is a really interesting aspect of the web, the opt-in assumption/assertion that publication automatically means putting that content into the public domain, unless explicitly stated otherwise.

Robots.txt is a classic example of that approach, i.e. a spider-and-index free-for-all unless explicitly opted out.
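That default-open convention is easy to see with Python's standard-library robots.txt parser: everything is crawlable unless a rule explicitly disallows it. A minimal sketch (the domain and paths here are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: crawling is permitted by default,
# and only the listed path is opted out.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Anything not explicitly disallowed is fair game for a spider...
print(rp.can_fetch("*", "https://example.com/photos/cat.jpg"))   # True
# ...only the opted-out path is off limits.
print(rp.can_fetch("*", "https://example.com/private/me.jpg"))   # False
```

Note that the whole scheme is advisory: a crawler that ignores robots.txt faces no technical barrier, which is exactly the opt-out-by-convention model being described.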

Where the clash between the cyber and real worlds occurs, though, is with the reuse of content: FB were happy to scrape and republish images, but object strenuously and litigiously when others copy and reuse/republish whatever FB publish (even though FB isn't even a content creator per se, beyond the functionality of their website).

Guy

On Wed, 13 Mar 2019, 15:05 Benjamin Goering < > wrote:
I think it's a feature that I (or anyone) can download any image on the web.

Sometimes it seems like people want the benefits of publishing things to the world without any of the downsides, but it's a natural tradeoff. Any way of artificially interfering with that is going to be swimming upstream. The only way to win the "private publishing" game is not to publish.

Rather like FB got started by scraping college year books without consent, there's a pattern here...

@OliviaSolon: "Earlier this year IBM released a dataset of 1 million photos of people's faces designed to reduce bias in facial recognition software. I was surprised that the pictures were taken from Flickr & so investigated the origins of facial recognition datasets"



--
Benjamin Goering, Software Producer




Archive powered by MHonArc 2.6.19.