Hi Bill,
Currently in the US, there are very few protections for your data when it is held by a commercial hosting provider. Self-hosting in this case means that you own and operate the servers that hold your data. There is some grey area if those servers are in a co-lo datacenter, because you don't control physical access, and we have seen many cases of collateral damage and over-broad searches that swept up innocent bystanders. If you want strong assurance that your data is safe from government access, you'll need to host some of it in your private residence, where you do in fact have some expectation of privacy and the bar for search and seizure is higher.
Ironically, this probably means compromising on physical security. The term "wiring closet" is supposed to be a metaphor, not an actual closet where a 6U rack is the perfect height for setting a dirty-clothes hamper or shoe rack on top. The worst data loss incident I ever suffered was perpetrated by my then-school-age kids as a result of poor physical and login security.
For what it's worth, I do in fact have a storage array at home. For about $1,300 I have 6TB of redundant RAID storage that can lose up to 2 drives without losing any data and that lets me hot-swap drives. I invested another $200 in a couple of spare drives so that I can in fact hot-swap them when the time comes. The sensitive and high-value portion of this data is also backed up with multiple commercial hosts. However, it is encrypted *before* it is transmitted to the commercial hosts, so the hosts, government, or hackers cannot read it even if those servers are compromised. The keys to the data are backed up offline, on CD-ROM, with multiple, geographically dispersed escrow agents.
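For anyone sizing a similar array, here is a rough sketch of the capacity arithmetic. It assumes a RAID 6 style dual-parity layout, which is one common way to survive 2 failed drives; the five 2TB drives are a hypothetical example, not a description of my actual hardware, and the function name is made up for illustration.

```python
def dual_parity_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a dual-parity (RAID 6 style) array.

    Two drives' worth of space goes to parity, which is what lets the
    array survive the loss of any two drives.
    """
    if drive_count < 4:
        raise ValueError("dual parity needs at least 4 drives")
    if drive_size_tb <= 0:
        raise ValueError("drive size must be positive")
    return (drive_count - 2) * drive_size_tb


# Hypothetical layout: five 2TB drives give 6TB usable.
print(dual_parity_usable_tb(5, 2.0))  # 6.0
```

The same 6TB could also come from four 3TB drives; the trade-off is cost per drive versus how much of the raw capacity goes to parity.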
This means that my personal cloud is throttled by my ISP's upstream bandwidth limit, but for what I need that is good enough. It also means that the fast cloud hosts on the Internet cannot provide any services with that data of mine. They are reduced to mere cloud storage providers. I also don't have good physical security, and even if I did, I couldn't expect guests to take security training and carry badges or submit to iris scans. On the plus side, I have 100% redundant, offsite backup of the important stuff, so if my house burns down I can recover all of that data, and I have some protection from government snooping.
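To put the upstream limit in perspective, here is a back-of-the-envelope sketch of how long an offsite backup takes over a home connection. The 100GB data size, the 5 Mbps upstream rate, and the 80% efficiency factor are all illustrative assumptions, not measurements from my setup.

```python
def upload_hours(data_gb: float, upstream_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to upload data_gb gigabytes.

    upstream_mbps is the advertised upstream rate in megabits per second;
    efficiency is an assumed fraction of that rate actually achieved
    after protocol overhead and contention.
    """
    if upstream_mbps <= 0:
        raise ValueError("upstream rate must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = data_gb * 8 * 1000**3                      # decimal gigabytes to bits
    seconds = bits / (upstream_mbps * 1_000_000 * efficiency)
    return seconds / 3600


# Hypothetical: 100GB of encrypted backups over a 5 Mbps upstream link.
print(round(upload_hours(100, 5), 1))  # 55.6
```

So an initial full backup can take days, while the incremental changes after that are usually small enough to fit in an overnight window.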
This sounds extreme, I know. But for what I want my personal cloud to do (hold transactional data, photos, vital records, business records, etc.), it's actually the minimum necessary if you are concerned about privacy. One premise of this movement is that VRM is an equal peer to CRM, so it stands to reason we'd have things like high availability, disaster recovery through geographically distant secondary sites, a key management infrastructure (public or otherwise), secure archival backup, etc. Our job in the VRM and pClouds communities is to build the infrastructure that provides this level of assurance, but in a way that makes it drop-dead simple for consumers. In fact, we have a slightly higher bar to clear because, unlike the corporate world, we have to assume poor physical security and a hostile network.
So, that's my take on requirements for self-hosting. Love to hear other thoughts on it.
-- T.Rob