Hi Joe, So we've just been discussing all this.. the ways encryption can go wrong.. and i got the same info in the room.
So the idea is.. use HTTPS for anything in transit.. but the data in databases will be wide open.. and even the user's password will most likely be stored in plain text in the database, because it's too much work to encrypt that.
If the databases are where the most vulnerabilities are for PII .. and encryption isn't an option there...
Then what is the next option for keeping PII data safe from leakage?
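On the plaintext-password point specifically, there is a standard answer that doesn't require encrypting anything: never store the password at all, store a salted, slow hash of it. A minimal sketch in Python using only the standard library (function names are illustrative, not from any system mentioned in the thread):

```python
import hashlib, hmac, os

def hash_password(password, salt=None):
    """Derive a salted, slow PBKDF2 hash; store (salt, digest), never the password."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password, salt, digest):
    """Recompute the hash from the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt)[1], digest)
```

A database breach then leaks hashes that must be brute-forced one guess at a time, rather than ready-to-use passwords.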
mary
On Jun 5, 2011, at 7:56 PM, Joe Andrieu wrote:
Mary,
Your analysis is fairly on the nose, but there is a difference
between https by default and storing the data encrypted.
Https by default is good, but at a small cost. It also introduces extra
debugging overhead, but in theory, most of us would be using standard
libraries so that we aren't debugging the crypto anyway.
But all https does is deal with the data in transit.
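Joe's point about standard libraries holds up in practice: the platform defaults do the transit crypto, so application code rarely touches it. As one small illustration (Python's stdlib, not anything specific to this thread), the stock TLS client context already turns on certificate and hostname verification:

```python
import ssl

# The default client context encodes current TLS best practice,
# so application code never configures the crypto by hand.
ctx = ssl.create_default_context()

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # server certificates are validated
print(ctx.check_hostname)                    # hostnames are checked against the cert
```

The debugging overhead Joe mentions mostly shows up at the edges (certificates, mixed content), not in the crypto itself.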
It's a separate issue for data on the server or in an archive. And
your analysis there is spot on, certain types of transactions simply
become either too costly to be reasonable or essentially non-viable,
such as search.
I think the biggest problem here is that most of the analytic stuff
that's of modern interest needs to operate on unencrypted data. Even
simple web analytics to try to find out why people are buying one
product versus another requires access to the underlying factors. Any
web business worth its salt uses A/B testing to try out new features.
Doing that on top of some encrypted data set would be untenable, which
means that companies would need to decrypt their entire analytic set
when they want to run any tests like that. And the larger companies
are doing this constantly. So, the more sophisticated the vendor, the
less and less tenable it is to "encrypt everything", even if you adopt
an "on-demand" decryption strategy. On top of that, the sheer
computational complexity of current analytics is already pushing
server-side machines to their limits. Adding crypto simply makes it
worse. When you're busting your chops trying to fit your algorithm into
a scalable hadoop cluster, you consider any additional compute costs
very very carefully.
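One middle ground between "encrypt everything" and running analytics on raw PII (a common compromise, not something proposed in the thread) is to tokenize identifiers with a keyed hash before they reach the analytics cluster. Cohorts and joins still work because the mapping is stable, but the tokens are useless without the key. A Python sketch with hypothetical names:

```python
import hmac, hashlib

# Hypothetical tokenization key, held by the app and never shipped to analytics.
TOKEN_KEY = b"analytics-tokenization-key"

def pseudonym(user_id: str) -> str:
    """Stable keyed hash of a user id: the same user always gets the same
    token (so A/B cohorts and joins still work), but the token cannot be
    reversed without the key."""
    return hmac.new(TOKEN_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]

def cohort(user_id: str) -> str:
    """A/B assignment computed from the token alone; no PII needed downstream."""
    return "A" if int(pseudonym(user_id), 16) % 2 == 0 else "B"
```

Log lines then carry `pseudonym(uid)` instead of the raw id, and only the keyholder can re-identify anyone.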
That said, https by default is, imo, a no brainer.
-j
Joe Andrieu
+1 (805) 705-8651
On 6/5/2011 7:17 PM, Mary Hodder wrote:
Hi,
So i checked with a friend who used to write the Linux OS for
google servers and knows databases fairly well and he said that
to encrypt everything would mean the following:
* encryption would take no more space in the database.. because
while the data would no longer be in plain text.. encryption would
likely compress it to about the same size
* compression would not happen at the same time as encryption
* that extra compression/encryption step for each piece of data
would take additional CPU time and power.. times trillions of bits of
data.. that would mean a big hit to the CPUs (many more servers would
be required)
* you could route all larger media (videos and photos) through
unencrypted channels to save on encryption and compression power
* encryption would mean packets would not be inspectable by your
ISP etc.. (an added benefit for privacy!!)
* because data would be encrypted you wouldn't be able to reuse
it (ie, if twitter encrypted everything.. you would not be able to take
a link and have all those who pointed to it.. share it in the
database.. i don't know whether twitter does this.. but if they wanted
to.. an encrypted system wouldn't allow it though you could assign
unique items a unique number, match the item in an additional step, on
the way into the database.. then grab the number, encrypt it. and make
the user's pointer to it unique and encrypted)
* you could not do searches across the database.. but you could
build an index, that would show data.. but not totally compromise it..
and that database could be put somewhere else (offline perhaps?) to
keep it safer from crackers
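The index idea in that last bullet has a concrete modern form, usually called a blind index: store a deterministic keyed hash of the field next to its ciphertext, and query on the hash. This is a sketch of the general technique (key and column names are made up, not from any system in the thread):

```python
import hmac, hashlib

# Hypothetical index key, held by the application, not by the database.
INDEX_KEY = b"blind-index-key"

def blind_index(value: str) -> str:
    """Deterministic keyed hash stored alongside the encrypted column.
    Allows exact-match lookups without exposing the plaintext to the DB."""
    return hmac.new(INDEX_KEY, value.strip().lower().encode(),
                    hashlib.sha256).hexdigest()

# Row at rest:  {"email_bidx": blind_index(email), "email_ct": <ciphertext>}
# Lookup:       ... WHERE email_bidx = blind_index("mary@example.org")
```

This only buys exact-match lookups; range queries and full-text search still need other machinery, which matches the thread's point that generic search over encrypted data stays hard.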
So it sounds to me like encrypting a system with HTTPS for a lot
of users and data would slow it down.. and require more CPU power.. but
it would work. And be far more secure than what we have now. Probably
not practical for search.. but if it was say, a site that sells
things.. a repeat user logs in.. and it's all an encrypted session, and
the entire user's profile would have to be unencrypted.. then another
sale could happen matching the older profile data.. then the
re-encryption would occur as the new data was stored with the old data.
It would be expensive in terms of CPU time.. but the user data
would be pretty secure in that scenario.
Other than that, can anyone think of why we wouldn't recommend
HTTPS encryption to services to protect user data?
mary
On Jun 5, 2011, at 3:49 PM, Mary Hodder wrote:
maybe..
i don't know if we've evaluated what it would mean to encrypt
everything..
let's take this use case:
a user has a personal data store.. they put as much as they
want to into it..
they add some apps.. maybe a VRM app like buyosphere for
sharing shopping data..
maybe
they get an app for allowing news sites to know their advertising
preferences and serve back ads (not a VRM app.. because it's about ads
and marketing)
maybe they get an app for archiving their bank
statements.. and there is no sharing here.. except with their
accountant who is an email away..
maybe they get an app for aggregating their search history..
and maybe they get an app for book recommendations where they
share what books they've bought..
If all the sites the user touches.. their PDS.. their
buysphere account..
their advertising preferences that go to news sites
the bank, all their browsers and histories, and the
recommender for books.. and the book purchase sites..
if they all encrypted.. what would that do to the system?
Does every site require an Https: login?
I think this is a good question to evaluate..
what does it do to apps makers of any type that want to hook
into a PDS?
mary
On Jun 5, 2011, at 3:16 PM, Michael O'Connor Clarke wrote:
So
I've just read the original article linked and the entire resulting
thread here, and I have what may be a completely dumb question. The
point I kept expecting didn't show up, so I'm driven to ask it, even if
it simply serves to highlight my own naivete.
Shouldn't part of the answer here be, simply, encrypt everything?
It
doesn't stop data "leaking" or prevent breaches, but it sure makes
things harder for your workaday cyber-baddy to do naughty things.
The
locks - all locks - can be picked. So just make the stuff behind the
locks useless to all but the really determined bad guys.
Doesn't seem like this should be all too hard to implement.
Am I completely missing something obvious? Would appreciate a little
education here.
Thanks,
/m
Michael O'Connor Clarke
+1.416.893.4941
@michaelocc
Date: Sun, 5 Jun 2011 13:48:26 -0700
Cc: j. clark; Dan Miller; ProjectVRM list
Subject: Re: [projectvrm] Can you really close Pandora's box
Mark,
The idea in the personal data ecosystem model is that users
do control their own data.
But where? Most likely their personal data store will be
hosted..
and point to or store personal data.. so those hosting
services should likely have some kind of level of
security we expect from the hosting.
And.. there will be services out in the world we send our
data to.. from those PDS.
Those services should also have security standards for
keeping personal data.
I don't think a personal data ecosystem model means we will
all be hosting our own boxes..
(kind of like today... most of us don't host our own email
on our home servers.. if we even have one..
those of us who have our own domains -- like hodder.org --
likely still have someone else manage
that.)
I'm curious.. of the VRM and PDE companies in the space..
have any of them announced
a level of security for their servers that is better.. kind
of like adhering to a Trust Framework..
but for data security?
So that leaks aren't happening?
I don't know of any company or service that talks about it.
mary
On Jun 5, 2011, at 1:40 PM, Mark Lizar wrote:
if
the Customer had authentic and official control of their data then all
the other data would be second class. The customer gets to manage
access.
Does the box need to be closed? Could we all one day have
our own box? :-)
- M
On 5 Jun 2011, at 17:05, Mary Hodder wrote:
Joe..
I
think it scares people to talk about personal data security.. yes.. but
i think it's healthy to talk about things that scare us..
So.. regarding locks. I think you and i are agreeing
here.. in a weird way.
I
want to define the locks and what the "highest standard" is.. but that
doesn't mean the extreme standards.. like super cryptography.
It means that .. like the schlage locks example.. we ask
for schlage locks. Note that recently we changed all my house
locks
from Kwikset ($13 locks) to Schlage ($52 each) and we feel it's good..
safe.. and frankly the locks don't need to be completely
tightened
up every 6 months because they are crap. The new locks do the
trick.. they have a longer thicker deadbolt, with more pins inside,
better structure and screws and I think given our
neighborhood, we can consider them "highest standard" for the
circumstances.
..
so maybe it's asking that when a site collects personal data.. they
partition it across multiple data bases.. in ways that make it hard to
steal and put back together..
unless you know how to do
it.. as opposed to keeping the whole of user data in one DB. Or maybe
we say.. attach bits of user data to other non PII data in a data
structure.. and make an
obscure way to connect it back to the
user.. unless you know the way to do it. In other words don't store:
name, address and CC all together. Or whatever.
Maybe the whole of user data at a service is only stored
in a single DB when it's not attached to the public internet.
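Mary's partitioning suggestion can be sketched very simply: keep identity data and payment data in separate stores, linked only by a random token that means nothing on its own. A toy Python illustration (in-memory dicts stand in for the two databases; names are made up):

```python
import secrets

# Two separate stores; a breach of either alone yields incomplete records.
identity_db = {}   # name and address, keyed by a random token
payment_db = {}    # card data, keyed by the same token

def store_customer(name: str, address: str, card: str) -> str:
    """Split one customer record across the two stores."""
    token = secrets.token_hex(16)   # random link, carries no meaning by itself
    identity_db[token] = {"name": name, "address": address}
    payment_db[token] = {"card": card}
    return token

def load_customer(token: str) -> dict:
    """Reassemble the record; requires access to both stores plus the token."""
    return {**identity_db[token], **payment_db[token]}
```

An attacker who steals either database alone gets fragments, which is exactly the "hard to steal and put back together" property being asked for.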
Maybe the api data available is fully examined publicly
.. or maybe APIs with any PII access require special oversight..
Those
are a couple of suggestions.. we can talk about asking for the Schlage
standard without telling IT people exactly how to do it.. they can
figure that out on their own
and in relative secrecy which means the details don't so
easily get to the bad guys.
I
agree we need laws against breaking and entering.. i think we have a
lot of those now.. but how do you enforce that in Uzbekistan?
We
don't have international laws and enforcements at the same levels as we
do at the nation state level (nation states in my view are an
anachronism..
i think they are passe but we don't have a lot to take their place..
the real power is in global markets.. for good and for bad..
and it's too scary to talk about the fact that they are
kind of passe.)
I
don't know that you get the whole world on board with a culture of not
breaking and entering.. we have uneasy peace (and wars) across
the world as it is. We can ask for it.. but many won't
respect it at all.
What I'm asking for is to create a "highest" standard
for services.. put it in writing.. and then show up and ask those guys:
"hey.. are you following the standard?" because we'd
really like you to....
Give the IT guys something practical to implement
instead of just lamenting the fact that our data is leaking all over.
So I'm asking for that.. what does that look like?
mary
ps..
did you see the thing last week that 30m Google users' data leaked out
of Google? I don't think any service is immune here..
On Jun 4, 2011, at 11:03 PM, Joe Andrieu wrote:
Absolutely, I think
we can, but it's hard. And it scares people.
Which makes both regular folks and experts avoid it. Same reason
locksmiths don't talk about locks. Most locks are crap and subject to
trivial attacks.
Most people don't want to hear about that and
most experts don't want the techniques leaked to a wider criminal
audience. Plus, there's the unfortunate tendency to enjoy being one of
the few wizards who understand the secret magic. But in the end, most
people are fine with the $30 Schlage lock, even though it's pretty
much useless for anyone with even moderate training or industry. For
most people, it provides the security they care about and, in fact it
keeps out enough potential criminals that people are mostly happy.
Which is to say that what I'm talking about is figuring out the digital
equivalent of (1) simple locks, (2) laws and rules against breaking and
entering and constitutional protection against unreasonable search and
seizure, and (3) a cultural shift that locks are to be expected and
respected. I think just getting /that/ in place will do more for our
society than the, also important, more detail-oriented work of outright
security.
I think of it this way. For most of my information,
I don't need the equivalent of Fort Knox. Locks on my doors are just
fine. Today, we not only don't have digital locks on the doors, but
it's common practice to grab the pies cooling in my window sill. And
too much of the data security conversation ends up sounding like Fort
Knox!
To track this back to the FTC paper, it doesn't even
address what minimal business practices should be followed, that is,
that there should be locks on the doors. The main reason I push back
against too much fixation on data security is because
(1) I
think doing that 100% is literally impossible (see Wikileaks) and
ultimately is distracting. The data is out there. It will continue to
be out there. It will continue to be created and put out there by
people you know, simply because they tweet or blog or check in and
mention you. I don't believe we can contain the data. I do believe we
can penalize inappropriate use of that data. To point again at the
Do-Not-Call registry, it solved a significant annoyance not by data
security--the fact that my phone number is available was never seen as
the problem--but by inappropriate use of that data.
And (2)
because I believe the world will be a better place with more intimate,
more trusting, more valuable relationships, especially compared to the
minor cost of the risk of criminal use of my data. To me, security is
almost entirely about independence, not engagement. In fact, the
approaches I know preclude more engagement by their very nature. But, I
want Google to know what I'm looking for. I want facebook to know the
statuses I want to share with my friends. I want FourSquare to know
where I am. I want WordPress to know what I write. Information sharing
is the essence of digital relationships... and the bane of data
security. And, as you know, I've spent a lot of time working through
these issues from an information sharing perspective; that's my lens,
rose colored or otherwise.
So, yes, I think we should be able
to have conversations about data security--even as I explain why that's
not my focus. From our previous conversations, I think you and I are
aligned on most of these issues. I just think the biggest bang for our
buck is figuring out how individuals can contribute (data) to our
digital experience without fear of exploitation. Right now, the vast
majority of exploitation is legal and accepted business practice. THAT
I think we can change much more rapidly than we can control data
through rigorous security.
-j
Joe Andrieu
+1 (805) 705-8651
On 6/4/2011 4:42 PM, Mary Hodder wrote:
Joe..
i agree we should collect less data and have more honest businesses. We
don't have as many problems
talking about that stuff.. and we can keep doing it
and that part will get dramatically better soon, i think.
but some data will be collected..
and i know criminals will do their thing.. but
more.. or less is the question...
I'd like less and i'd like to know when we get real
about having a way to measure security around data?
Most institutions hide/run from that kind of
discussion and i don't think we solve this until
we talk about it.
We have ways to talk about problems with airplanes
and safety..
food safety
even clean air and water..
we have measures and standards for serious things
like that..
why can't we have similar talks about personal data
security?
On Jun 4, 2011, at 3:56 PM, Joe Andrieu wrote:
I
think our biggest problem isn't with those who will break the law and
steal identifiers. That's a security issue and one that deserves
appropriate secrecy on behalf of those trying to solve it...
What is most broken is that it is *common business practice* to capture
and exploit information about and from individuals, without permission.
If there were appropriate boundaries for what is and isn't acceptable,
companies like Groupon--and those who aspire to IPOs or acquisitions
valued in the billions--would be forced to play by the rules. Public
markets won't tolerate wholesale illegal behavior. Not indefinitely.
This is the essence of privacy enforcement. Good people and companies
respect privacy. Bad ones don't. Or as the aphorism puts it: "Locks
don't keep criminals from stealing. They keep honest people honest."
What we are trying to figure out is how to tell the difference in a new
environment where the boundaries are unclear.
Although many researchers and authors argue that privacy defies
definition because it is so complex, I disagree. Privacy is context
management. Information released or created in one context is expected
to be dealt with under that context. When it leaks in ways that are
inconsistent with the expectations of the originating context, privacy
is violated. What we are dealing with is both new online contexts and
context collapse due to online interactions. That's the problem: new
contextual realities we don't have a social framework for, whether it
may be enforced by law, regulation, or etiquette.
To
restate my initial premise: criminals will always find ways to violate
context. We can legislate consequences and we can build technical
barriers, but all laws can be broken and all techno-solutions can be
hacked. What we /can/ do is figure out how the mainstream of
well-intentioned companies and individuals can handle context
management in a mutually satisfactory way. Once we figure that out, we
can deal with the technical and legal barriers to violations.
-j
Joe Andrieu
+1 (805) 705-8651
On 6/4/2011 3:15 PM, Mary Hodder wrote:
I
think there is an interesting comparison here to the banking industry.
Obviously they have big security concerns and
address them with things like using .NET
and double logins to check your account and
making everyone come into the bank to open an
account or get signing rights.
The FCRA and congress tell financial
institutions they *must* give the highest security to
our data.. and yet they don't. Instead, they
give some security.. but have held back
on making credit cards with chips (like in the
rest of the world) because it was cheaper
to pay out for fraud on the mag stripe data on
the backs of CCs than it is to get the chips.
And it's cheaper to not have restaurants get
wireless swipers than wired ones so the
servers walk away with your card (statistically
the place you are most likely to get
IDENTIFIER theft around a commercial
transaction).
And they don't protect your data all that well.
Just enough to not get called on by regulators..
but not so much that they can't offer you $40 a
month to protect you from IDENTITY theft
(i love how they use "identity" which scares
people into paying the $40.. great marketing..
if they said "identifier theft" i don't think
they would sell a lot of that.. )
If we mandated (let's say.. with the Kerry-McCain
Rights and Responsibilities legislation
.. which currently leaves out a "highest"
standard on data security) that data collectors of
any sort maintain a highest level of security..
would we have a standard to give sites..
would we be able to hold the sites to it?
How do we know when a site collecting data is
being negligent?
The bar is always moving due to the script
kiddies, Anonymous, and credit card / spammers
from obscure parts of the world, not to mention
your average cracker.
I think if we want a standard.. we have to make
a standard.. and codify it.
It doesn't have to be codified into law.. but
the problem is the cryptographers
and RSA types don't want to tell people out loud
and in public how to be secure
because the baddies will get the info.
Or at least the ones I know at Stanford and
Berkeley ... and they work for the US Govt
and have lead-lined offices.. seriously.
So how do you make a secure standard for our
data when the security people don't want to talk publicly
about it.. Bruce Schneier notwithstanding?
On Jun 4, 2011, at 2:39 PM, j. clark wrote:
Thanks Dan.
Of note from the end of the article:
"A
key failure of the FTC report is that it largely ignores the
responsibility of websites in safeguarding the privacy of their users,"
says Wills. "These sites should
play
a custodial role in protecting their users and preventing the leakage
of their sensitive or identifiable information. Third-party sites have
a powerful economic incentive to continue to collect and aggregate user
information, so relying on them to protect user privacy will continue
to be a losing battle."
Ah, there's a toxic leak in our
ecosystem. I'm shocked! Sony's many sites, practices and breaches are
but one example now dangling in the media's hooks. Alas, our attention
span is short, our needs continuous, and the practice is SO
widespread... what's a person to do? Is this even a valid crisis?
Where's the righteous indignation?
j.