

Re: [projectvrm] Can you really close Pandora's box


  • From: Joe Andrieu < >
  • To: Mary Hodder < >
  • Cc: ProjectVRM list < >, , Mark Lizar < >, "j. clark" < >, Dan Miller < >
  • Subject: Re: [projectvrm] Can you really close Pandora's box
  • Date: Sun, 05 Jun 2011 19:56:00 -0700

Mary,

Your analysis is fairly on the nose, but there is a difference between HTTPS by default and storing the data encrypted.

HTTPS by default is good, but at a small cost. It also introduces extra debugging overhead, but in theory most of us would be using standard libraries, so we aren't debugging the crypto anyway.

But all HTTPS does is protect the data in transit.

It's a separate issue for data on the server or in an archive.  And your analysis there is spot on: certain types of transactions simply become either too costly to be reasonable or essentially non-viable, such as search.

I think the biggest problem here is that most of the analytic stuff that's of modern interest needs to operate on unencrypted data. Even simple web analytics to try to find out why people are buying one product versus another requires access to the underlying factors. Any web business worth its salt uses A/B testing to try out new features. Doing that on top of an encrypted data set would be untenable, which means that companies would need to decrypt their entire analytic set whenever they want to run a test like that.  And the larger companies are doing this constantly. So, the more sophisticated the vendor, the less tenable it is to "encrypt everything", even if you adopt an "on-demand" decryption strategy. On top of that, the sheer computational complexity of current analytics is already pushing server-side machines to the limit.  Adding crypto simply makes it worse. When you're busting your chops trying to fit your algorithm into a scalable Hadoop cluster, you consider any additional compute costs very, very carefully.
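That cost structure can be sketched in a few lines of Python. This is a toy, not anyone's real pipeline: base64 stands in for a real cipher purely to make the stored rows opaque to the query logic, and the record fields are made up. The point is that a query engine can't filter ciphertext, so even one A/B report forces a decrypt of every single row:

```python
import base64
import json

def encrypt(record: dict) -> bytes:
    # base64 is NOT encryption -- it stands in for a real cipher here
    # only so the stored rows are opaque to the query logic below.
    return base64.b64encode(json.dumps(record).encode())

def decrypt(blob: bytes) -> dict:
    return json.loads(base64.b64decode(blob))

# 100 fake A/B-test rows, "encrypted" at rest
rows = [encrypt({"variant": "A" if i % 2 else "B", "bought": i % 3 == 0})
        for i in range(100)]

decrypt_calls = 0

def conversion_rate(rows, variant):
    # There is no predicate pushdown on ciphertext: every row must be
    # decrypted before the WHERE clause can even be evaluated.
    global decrypt_calls
    hits = total = 0
    for blob in rows:
        rec = decrypt(blob)
        decrypt_calls += 1
        if rec["variant"] == variant:
            total += 1
            hits += rec["bought"]
    return hits / total

rate_a = conversion_rate(rows, "A")
# one report, and decrypt_calls already equals len(rows)
```

The per-row decrypt is the cost that scales with the data set; swap in a real cipher over terabytes of rows and you get the overhead described above.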

That said, HTTPS by default is, imo, a no-brainer.

-j
Joe Andrieu

 
 
+1 (805) 705-8651

On 6/5/2011 7:17 PM, Mary Hodder wrote:
Hi,

So i checked with a friend who used to write the Linux OS for Google servers and knows databases fairly well and he said that 
to encrypt everything would mean the following:

* encryption would take no more space in the database.. because while the data would no longer be in plain text.. compressing it before encrypting would keep it at about the same size
* compression would not happen at the same time as encryption
* that extra compression/encryption step for each piece of data would take additional CPU time and power.. times trillions of bits of data.. that would mean a big hit to the CPUs (many more servers would be required)
* you could route all larger media (videos and photos) through unencrypted channels to save on encryption and compression power
* encryption would mean packets would not be inspectable by your ISP etc.. (an added benefit for privacy!!)
* because data would be encrypted you wouldn't be able to reuse it (ie, if twitter encrypted everything.. you would not be able to take a link and have all those who pointed to it.. share it in the database.. i don't know whether twitter does this.. but if they wanted to.. an encrypted system wouldn't allow it though you could assign unique items a unique number, match the item in an additional step, on the way into the database.. then grab the number, encrypt it. and make the user's pointer to it unique and encrypted)
* you could not do searches across the database.. but you could build an index, that would show data.. but not totally compromise it.. and that database could be put somewhere else (offline perhaps?) to keep it safer from crackers
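The unique-number trick and the separate index described above amount to what's sometimes called a "blind index": a keyed hash of each item, computed on the way into the database, so equal items get equal tokens and can be matched or deduplicated without ever storing the plaintext. A minimal Python sketch, with made-up names and a made-up key (a real deployment would keep the index key off the web-facing machines, per the point above about storing the index somewhere safer):

```python
import hashlib
import hmac

# Hypothetical key; it would be kept away from the web-facing database.
INDEX_KEY = b"separate-index-key"

def blind_index(value: str) -> str:
    # Deterministic keyed hash: equal plaintexts yield equal tokens,
    # but a token alone reveals nothing without the key.
    return hmac.new(INDEX_KEY, value.encode(), hashlib.sha256).hexdigest()

# Stand-in for the database: token -> users who shared the item.
# (The item itself would be stored only as ciphertext, elided here.)
shares = {}

def record_share(user: str, url: str) -> None:
    shares.setdefault(blind_index(url), []).append(user)

def who_shared(url: str) -> list:
    # Equality lookup without the URL ever sitting in the DB in the clear
    return shares.get(blind_index(url), [])

record_share("alice", "https://example.com/a")
record_share("bob", "https://example.com/a")
record_share("carol", "https://example.com/b")
```

This buys back exact-match lookups (the twitter-link example above), but nothing more: range queries and full-text search over the encrypted column are still off the table.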

So it sounds to me like encrypting a system with HTTPS for a lot of users and data would slow it down.. and require more CPU power.. but it would work.  And be far more secure than what we have now.  Probably not practical for search.. but if it was say, a site that sells things.. a repeat user logs in.. and it's all an encrypted session, and the entire user's profile would have to be unencrypted.. then another sale could happen matching the older profile data.. then the re-encryption would occur as the new data was stored with the old data.

It would be expensive in terms of CPU time.. but the user data would be pretty secure in that scenario.

Other than that, can anyone think of why we wouldn't recommend HTTPS encryption to services to protect user data?

mary

On Jun 5, 2011, at 3:49 PM, Mary Hodder wrote:

maybe.. 

i don't know if we've evaluated what it would mean to encrypt everything.. 
let's take this use case:

a user has a personal data store.. they put as much as they want to into it..
they add some apps.. maybe a VRM app like buyosphere for sharing shopping data.. 
maybe they get an app for allowing news sites to know their advertising preferences and serve back ads (not a VRM app.. because it's about ads and marketing)
maybe they get an app for archiving their bank statements.. and there is no sharing here.. except with their accountant who is an email away..
maybe they get an app for aggregating their search history..
and maybe they get an app for book recommendations where they share what books they've bought.. 

If all the sites the user touches.. their PDS.. their buyosphere account..
their advertising preferences that go to news sites
the bank, all their browsers and histories, and the recommender for books.. and the book purchase sites..

if they all encrypted.. what would that do to the system?

Does every site require an HTTPS login?

I think this is a good question to evaluate..

what does it do to apps makers of any type that want to hook into a PDS?

mary



On Jun 5, 2011, at 3:16 PM, Michael O'Connor Clarke wrote:

So I've just read the original article linked and the entire resulting thread here, and I have what may be a completely dumb question. The point I kept expecting didn't show up, so I'm driven to ask it, even if it simply serves to highlight my own naivete.

Shouldn't part of the answer here be, simply, encrypt everything?

It doesn't stop data "leaking" or prevent breaches, but it sure makes things harder for your workaday cyber-baddy to do naughty things.

The locks - all locks - can be picked. So just make the stuff behind the locks useless to all but the really determined bad guys.

Doesn't seem like this should be all too hard to implement.

Am I completely missing something obvious? Would appreciate a little education here.

Thanks,

/m

Michael O'Connor Clarke
+1.416.893.4941
@michaelocc
Date: Sun, 5 Jun 2011 13:48:26 -0700
Subject: Re: [projectvrm] Can you really close Pandora's box

Mark,
The idea in the personal data ecosystem model is that users do control their own data.

But where? Most likely their personal data store will be hosted.. 

and point to or store personal data.. so those hosting services should likely have some kind of level of 
security we expect from the hosting.

And.. there will be services out in the world we send our data to.. from those PDS.
Those services should also have security standards for keeping personal data.

I don't think a personal data ecosystem model means we will all be hosting our own boxes..
(kind of like today... most of us don't host our own email on our home servers.. if we even have one..
those of us who have our own domains -- like hodder.org -- likely still have someone else manage
that.)

I'm curious.. of the VRM and PDE companies in the space.. have any of them announced 
a level of security for their servers that is better.. kind of like adhering to a Trust Framework..
but for data security?

So that leaks aren't happening?

I don't know of any company or service that talks about it.

mary


On Jun 5, 2011, at 1:40 PM, Mark Lizar wrote:


if the Customer had authentic and official control of their data then all the other data would be second class.  The customer gets to manage access.  

Does the box need to be closed? Could we all one day have our own box?  :-) 

- M 

On 5 Jun 2011, at 17:05, Mary Hodder wrote:

Joe.. 

I think it scares people to talk about personal data security.. yes.. but i think it's healthy to talk about things that scare us.. 

So.. regarding locks.  I think you and i are agreeing here.. in a weird way.

I want to define the locks and what the "highest standard" is.. but that doesn't mean the extreme standards.. like super cryptography.
It means that .. like the Schlage locks example.. we ask for Schlage locks. Note that recently we changed all my house
locks from Kwikset ($13 locks) to Schlage ($52 each) and we feel it's good.. safe.. and frankly the locks don't need to be completely
tightened up every 6 months because they are crap. The new locks do the trick..  they have a longer thicker deadbolt, with more pins inside,
better structure and screws and I think given our neighborhood,  we can consider them "highest standard" for the circumstances.

.. so maybe it's asking that when a site collects personal data.. they partition it across multiple databases.. in ways that make it hard to steal and put back together.. 
unless you know how to do it.. as opposed to keeping the whole of user data in one DB. Or maybe we say.. attach bits of user data to other non-PII data in a data structure.. and make an
obscure way to connect it back to the user.. unless you know the way to do it.  In other words don't store: name, address and CC all together. Or whatever.
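That partitioning idea can be sketched in a few lines, with hypothetical names throughout: contact data and card data live in separate stores, joined only by random tokens held in a third place, so a breach of any single store yields no complete profile:

```python
import secrets

# Three separate stores; compromising any one alone yields no full profile.
contact_db = {}  # token -> name and address
payment_db = {}  # token -> card data
link_db = {}     # user id -> (contact token, payment token), kept elsewhere

def store_user(user_id: str, name: str, address: str, card: str) -> None:
    t_contact = secrets.token_hex(16)
    t_payment = secrets.token_hex(16)
    contact_db[t_contact] = {"name": name, "address": address}
    payment_db[t_payment] = {"card": card}
    link_db[user_id] = (t_contact, t_payment)

def load_user(user_id: str) -> dict:
    # Only someone holding the link table can reassemble the pieces.
    t_contact, t_payment = link_db[user_id]
    return {**contact_db[t_contact], **payment_db[t_payment]}

store_user("u1", "Pat Example", "123 Main St", "4111111111111111")
```

The tokens are the "obscure way to connect it back" -- random, so nothing about the user leaks from the token itself.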

Maybe the whole of user data at a service is only stored in a single DB when it's not attached to the public internet.

Maybe the API data available is fully examined publicly .. or maybe APIs with any PII access require special oversight.. 

Those are a couple of suggestions.. we can talk about asking for the Schlage standard without telling IT people exactly how to do it.. they can figure that out on their own
and in relative secrecy which means the details don't so easily get to the bad guys.  

I agree we need laws against breaking and entering.. i think we have a lot of those now.. but how do you enforce that in Uzbekistan?
We don't have international laws and enforcements at the same levels as we do at the nation state level (nation states in my view are an
anachronism.. i think they are passe but we don't have a lot to take their place.. the real power is in global markets.. for good and for bad..
and it's too scary to talk about the fact that they are kind of passe.)

I don't know that you get the whole world on board with a culture of not breaking and entering.. we have uneasy peace (and wars) across
the world as it is. We can ask for it.. but many won't respect it at all.

What I'm asking for is to create a "highest" standard for services.. put it in writing.. and then show up and ask those guys: 
"hey.. are you following the standard?"  because we'd really like you to....

Give the IT guys something practical to implement instead of just lamenting the fact that our data is leaking all over.

So I'm asking for that.. what does that look like?

mary

ps.. did you see the thing last week that 30M Google users' data leaked out of Google?  I don't think any service is immune here.. 


On Jun 4, 2011, at 11:03 PM, Joe Andrieu wrote:

Absolutely, I think we can, but it's hard.  And it scares people.  Which makes both regular folks and experts avoid it. Same reason locksmiths don't talk about locks. Most locks are crap and subject to trivial attacks.

Most people don't want to hear about that and most experts don't want the techniques leaked to a wider criminal audience. Plus, there's the unfortunate tendency to enjoy being one of the few wizards who understand the secret magic.  But in the end, most people are fine with the $30 Schlage lock, even though it's pretty much useless against anyone with even moderate training or industry.  For most people, it provides the security they care about and, in fact, it keeps out enough potential criminals that people are mostly happy.

Which is to say that what I'm talking about is figuring out the digital equivalent of (1) simple locks, (2) laws and rules against breaking and entering and constitutional protection against unreasonable search and seizure, and (3) a cultural shift that locks are to be expected and respected. I think just getting /that/ in place will do more for our society than the, also important, more detail-oriented work of outright security.

I think of it this way. For most of my information, I don't need the equivalent of Fort Knox.  Locks on my doors are just fine. Today, we not only don't have digital locks on the doors, but it's common practice to grab the pies cooling in my window sill. And too much of the data security conversation ends up sounding like Fort Knox!

To track this back to the FTC paper, it doesn't even address what minimal business practices should be followed, that is, that there should be locks on the doors.  The main reason I push back against too much fixation on data security is because

(1) I think doing that 100% is literally impossible (see Wikileaks) and ultimately is distracting.  The data is out there. It will continue to be out there. It will continue to be created and put out there by people you know, simply because they tweet or blog or check in and mention you.  I don't believe we can contain the data. I do believe we can penalize inappropriate use of that data. To point again at the Do-Not-Call registry, it solved a significant annoyance not by data security--the fact that my phone number is available was never seen as the problem--but by inappropriate use of that data.

And (2) because I believe the world will be a better place with more intimate, more trusting, more valuable relationships, especially compared to the minor cost of the risk of criminal use of my data. To me, security is almost entirely about independence, not engagement. In fact, the approaches I know preclude more engagement by their very nature. But, I want Google to know what I'm looking for. I want Facebook to know the statuses I want to share with my friends. I want FourSquare to know where I am. I want WordPress to know what I write. Information sharing is the essence of digital relationships... and the bane of data security. And, as you know, I've spent a lot of time working through these issues from an information sharing perspective; that's my lens, rose colored or otherwise.

So, yes, I think we should be able to have conversations about data security--even as I explain why that's not my focus. From our previous conversations, I think you and I are aligned on most of these issues. I just think the biggest bang for our buck is figuring out how individuals can contribute (data) to our digital experience without fear of exploitation.  Right now, the vast majority of exploitation is legal and accepted business practice. THAT I think we can change much more rapidly than we can control data through rigorous security.

-j
 

Joe Andrieu

 
 
+1 (805) 705-8651

On 6/4/2011 4:42 PM, Mary Hodder wrote:
Joe.. i agree we should collect less data and have more honest businesses. We don't have as many problems
talking about that stuff.. and we can keep doing it and that part will get dramatically better soon, i think.

but some data will be collected.. 

and i know criminals will do their thing.. but more.. or less is the question...

I'd like less and i'd like to know when we get real about having a way to measure security around data?

Most institutions hide/run from that kind of discussion and i don't think we solve this until
we talk about it.

We have ways to talk about problems with airplanes and safety.. 
food safety
even clear air and water .. 

we have measures and standards for serious things like that..

why can't we have similar talks about personal data security?


On Jun 4, 2011, at 3:56 PM, Joe Andrieu wrote:

I think our biggest problem isn't with those who will break the law and steal identifiers. That's a security issue and one that deserves appropriate secrecy on behalf of those trying to solve it...

What is most broken is that it is *common business practice* to capture and exploit information about and from individuals, without permission. If there were appropriate boundaries for what is and isn't acceptable, companies like Groupon--and those who aspire to IPOs or acquisitions valued in the billions--would be forced to play by the rules. Public markets won't tolerate wholesale illegal behavior. Not indefinitely.

This is the essence of privacy enforcement.  Good people and companies respect privacy.  Bad ones don't. Or as the aphorism puts it: "Locks don't keep criminals from stealing. They keep honest people honest."

What we are trying to figure out is how to tell the difference in a new environment where the boundaries are unclear.

Although many researchers and authors argue that privacy defies definition because it is so complex, I disagree. Privacy is context management.  Information released or created in one context is expected to be dealt with under that context. When it leaks in ways that are inconsistent with the expectations of the originating context, privacy is violated. What we are dealing with is both new online contexts and context collapse due to online interactions. That's the problem: new contextual realities we don't have a social framework for, whether it may be enforced by law, regulation, or etiquette.

To restate my initial premise: criminals will always find ways to violate context. We can legislate consequences and we can build technical barriers, but all laws can be broken and all techno-solutions can be hacked. What we /can/ do is figure out how the mainstream of well-intentioned companies and individuals can handle context management in a mutually satisfactory way. Once we figure that out, we can deal with the technical and legal barriers to violations.

-j

Joe Andrieu

 
 
+1 (805) 705-8651

On 6/4/2011 3:15 PM, Mary Hodder wrote:
I think there is an interesting comparison here to the banking industry.

Obviously they have big security concerns and address them with things like using .NET
and double logins to check your account and making everyone come into the bank to open an 
account or get signing rights.

The FCRA and Congress tell financial institutions they *must* give the highest security to
our data.. and yet they don't. Instead, they give some security.. but have held back
on making credit cards with chips (like in the rest of the world) because it was cheaper
to pay out for fraud on the mag stripe data on the backs of CCs than it is to get the chips.
And it's cheaper to not have restaurants get wireless swipers than wired ones so the
servers walk away with your card (statistically the place you are most likely to get
IDENTIFIER theft around a commercial transaction).

And they don't protect your data all that well. Just enough to not get called on by regulators..
but not so much that they can't offer you $40 a month to protect you from IDENTITY theft
(i love how they use "identity" which scares people into paying the $40.. great marketing..
if they said "identifier theft" i don't think they would sell a lot of that.. )

If we mandated (let's say.. with the Kerry-McCain Rights and Responsibilities legislation
.. which currently leaves out a "highest" standard on data security) that data collectors of
any sort maintain a highest level of security.. would we have a standard to give sites..
would we be able to hold the sites to it?

How do we know when a site collecting data is being negligent?

The bar is always moving due to the script kiddies, Anonymous, and credit card thieves / spammers
from obscure parts of the world, not to mention your average cracker.

I think if we want a standard.. we have to make a standard.. and codify it.

It doesn't have to be codified into law.. but the problem is the cryptographers
and RSA types don't want to tell people out loud and in public how to be secure
because the baddies will get the info.

Or at least the ones I know at Stanford and Berkeley ... and they work for the US Govt
and have lead-lined offices.. seriously.

So how do you make a secure standard for our data when the security people don't want to talk publicly
about it.. Bruce Schneier notwithstanding?



On Jun 4, 2011, at 2:39 PM, j. clark wrote:

Thanks Dan.
Of note from the end of the article:

"A key failure of the FTC report is that it largely ignores the responsibility of websites in safeguarding the privacy of their users," says Wills.

"These sites should play a custodial role in protecting their users and preventing the leakage of their sensitive or identifiable information. Third-party sites have a powerful economic incentive to continue to collect and aggregate user information, so relying on them to protect user privacy will continue to be a losing battle."

Ah, there's a toxic leak in our ecosystem. I'm shocked!  Sony's many sites, practices and breaches are but one example now dangling in the media's hooks. Alas, our attention span is short, our needs continuous, and the practice is SO widespread... what's a person to do? Is this even a valid crisis?

Where's the righteous indignation?

  j.


On Sat, Jun 4, 2011 at 11:15 AM, Dan Miller < > wrote:
Not surprised that Web sites "leak data;" But 3/4 of "popular" sites!!!!!

http://www.tgdaily.com/security-features/56368-popular-websites-leaking-customer-data











Archive powered by MHonArc 2.6.19.