

Re: [projectvrm] Facial recognition's 'dirty little secret': Millions of online photos scraped without consent


  • From: Shannon Clark < >
  • To: Adrian Gropper < >
  • Cc: Guy Higgins < >, Benjamin Goering < >, MXS Insights < >, Guy Jarvis < >, "Dr. Augustine Fou" < >, ProjectVRM list < >
  • Subject: Re: [projectvrm] Facial recognition's 'dirty little secret': Millions of online photos scraped without consent
  • Date: Wed, 13 Mar 2019 17:25:45 -0700

It is important to keep in mind that, for all the discussion about this, most people are missing that IBM started with a dataset released by Flickr of over 100M CC licensed images. Now, some people whose photos were published under a CC license may have changed that license since the dataset was collected, or may not have been aware of all the possible uses of their images (and it is worth checking Flickr's claim that all of the images were indeed CC licensed). But if they were, and the terms of those licenses were adhered to, then research use of those images is very likely covered. There is also a debate to be had about when the line between academic research and commercial use gets crossed - but this is not the same case as someone just accumulating images randomly across the web with complete disregard for license or permission. Works under a CC license that allows commercial use have already been granted permission for reuse (again, as long as the terms of the license are met).

So any outrage should start by talking with Flickr, which originally released the dataset (again, 100M photos, not 1M) for researchers to use - a set that was described as being all CC licensed works (so no copyright issues with such a collection or its use, especially by researchers). IBM then edited the 100M images down to a smaller set for use in facial recognition (none of my Flickr images were in this subset, likely because most of my Flickr photos aren't of people).

There is a lot to discuss here, but IBM doing something shady (in terms of crawling the web/Flickr to "take" images) isn't part of it.

Now, the bigger point about CC licenses covering uses that were unknown at the time the work was licensed is complex - but it is also somewhat the entire point of such licenses: TO FOSTER such uses and the reuse of older works in new and innovative ways. Not just as illustrations for a web article, but as building blocks for something new. For years there has also been a bit of a movement for a way to release works fully into the public domain. However, this is complicated by all kinds of issues (moral rights that you can't give up in some countries, for example - I'm not a lawyer, however, so check with one for a more in-depth discussion of the nuances). CC did try this with its CC0 license, but I don't think it was successful.

Shannon




