I learned a neat trick this week, which I think I can explain to non-technical readers.
The problem: say you want to have data on the cloud, to be shared or synced across multiple machines, like laptop and phone, new laptop and work phone, etc. Data like managed passwords or bookmarks or hell, emails. Data you would like to keep secret. What to do?
Approach 1: log into a cloud service provider with your passphrase and upload your data. This is terrible: they can see your data, and even if they're mostly trustworthy, a bad employee or a hacker could make off with your bank passwords.
Approach 2: log into the provider with your passphrase, and upload your data encrypted with the passphrase. This is barely any better; with standard login mechanisms, the provider sees your passphrase, even if they ideally don't store it in a visible form, and could trivially use it to decrypt your data and make off with your bank passwords.
Approach 3: encrypt your data with a separate passphrase, and upload that encrypted data when you log in. This is solidly secure (assuming you chose strong passphrases). Many geeks like me probably do a manual equivalent, encrypting files with gpg and copying them remotely. Only problem is, you need to manage two passphrases.
Approach 4: This is what I learned, and it is the approach of Firefox Sync. There is a good presentation of it, and a more technically gory one, but I'll give my own description.
The key idea is that you don't log in with your passphrase. Instead, a credential is made *from* the passphrase, via a one-way hash. (One-way meaning the passphrase cannot be recovered from the hash.) That is what is sent to the server and used to log in. Which leaves your passphrase free to encrypt the data you upload, which is now safe because the provider never sees your passphrase. You don't have to trust them; even if they broadcast your files to the world, ideally your data is safely encrypted. The provider only stores [(login credential), (passphrase-encrypted data)]. But any client you log in from can download the data and decrypt it with the passphrase you entered locally.
Put another way, the secret of your passphrase can be used to generate multiple secrets, for login and encryption, that don't generate each other. So you get the security of Approach 3 with the convenience of Approach 2.
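Here is a minimal Python sketch of that idea. It is not Firefox's actual protocol: the function names `stretch` and `derive`, the purpose labels, the use of the email address as a salt, and the iteration count are all assumptions I made up for illustration. It uses only PBKDF2 and HMAC-SHA256 from the standard library.

```python
import hashlib
import hmac


def stretch(passphrase: str, email: str, iterations: int = 100_000) -> bytes:
    """Slow one-way stretch of the passphrase (email as salt is an assumption)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), email.encode("utf-8"), iterations
    )


def derive(master: bytes, label: str) -> bytes:
    """Derive a 32-byte secret for one purpose from the stretched master secret."""
    return hmac.new(master, label.encode("utf-8"), hashlib.sha256).digest()


# Everything below happens on the client.
master = stretch("correct horse battery staple", "me@example.com")
login_credential = derive(master, "login")       # sent to the server
encryption_key = derive(master, "encryption")    # never leaves the client
```

Only `login_credential` would go to the provider. Since both values come out of a one-way function keyed by `master`, seeing one of them should not help anyone compute the other, or the passphrase.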
On the flip side, if you forget and have to reset your passphrase, your encrypted data has to be thrown away; no one can recover it. That's not a big deal for Sync, especially as any device that has a copy of your data can then upload it again.
There's a bunch of complexity to the actual Firefox Sync process, but that's the fundamental insight.
One bit of complexity is straightforward. Approach 4 as I described it means that if you change your passphrase, your data has to be re-encrypted, which could get annoying. So instead have your client, at sign-up, generate a strong random data encryption key. Use that to encrypt your data, and encrypt the key with your passphrase. Now the provider stores [(login credential), (passphrase-encrypted key), (key-encrypted data)]. Your data is still secure since the provider never sees your passphrase or the actual key. But if you change your passphrase, only the small (login credential) and (passphrase-encrypted key) have to change, not the arbitrarily large (key-encrypted data).
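A rough Python sketch of that key-wrapping step, under my own assumptions: the helper names `passphrase_key` and `xor_bytes` are hypothetical, the salt handling and iteration count are invented for the example, and XOR stands in for "encrypt the key", which is only reasonable here because the wrapping key is exactly as long as the data key and is used to wrap just that one value.

```python
import hashlib
import secrets


def passphrase_key(passphrase: str, salt: bytes) -> bytes:
    """Stretch the passphrase into a 32-byte wrapping key (parameters are assumptions)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, 100_000)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings; used both to wrap and to unwrap."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Sign-up: the client makes a random data key and wraps it with the passphrase.
salt = secrets.token_bytes(16)
data_key = secrets.token_bytes(32)
wrapped_key = xor_bytes(data_key, passphrase_key("old passphrase", salt))

# Passphrase change: unwrap with the old passphrase, re-wrap with the new one.
# The data key itself, and everything encrypted with it, stays untouched.
new_salt = secrets.token_bytes(16)
recovered = xor_bytes(wrapped_key, passphrase_key("old passphrase", salt))
rewrapped_key = xor_bytes(recovered, passphrase_key("new passphrase", new_salt))
```

The provider would store the salt and the wrapped key next to the key-encrypted data. After the passphrase change only `new_salt` and `rewrapped_key` need uploading.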
Other bits of complexity are less interesting, or even baffling. Firefox actually encrypts your data key not with the passphrase, but with another secret derived from the passphrase. There are long steps that make brute force attacks less feasible. The key is protected with XOR rather than some fancier encryption mechanism. And weirdest of all, instead of your browser generating a data key and sending the encrypted key to the server, what happens is that the server makes up some random number (wrap(wrap(kB)) in the Firefox writeups) from which the client derives the key. The math works out but it's a reversal of expected flow. My best guess is that they feel they can make better random numbers than the client, which might be true if they have good hardware randomness generators on their servers.
(Though I'm not sure if it really matters; seems like they could assign '0000...' to everyone, and the data keys would still differ based on people's passphrases.)
 Passwords are supposed to be stored salted (add some non-secret extra stuff to defeat various attacks) and digested/hashed. So someone who steals the password file can't see the actual passwords. Login means you present your password, it's salted and digested and compared to what's in the file, then your password is thrown away. Whether any particular Internet site follows that protocol is another matter, which is one reason you're told not to re-use passwords across sites.
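A small Python sketch of that salt-and-digest login check, as an illustration only. The function names `make_record` and `check_password` are hypothetical, and the choice of PBKDF2 with 100,000 iterations is my assumption, not what any particular site does.

```python
import hashlib
import hmac
import secrets


def make_record(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest), which is all the server keeps in its password file."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Salt and digest the presented password, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return hmac.compare_digest(candidate, digest)


salt, digest = make_record("hunter2")
login_ok = check_password("hunter2", salt, digest)
login_bad = check_password("hunter3", salt, digest)
```

Note that the server still sees the password itself for a moment at login, which is exactly the weakness of Approach 2.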
 In theory open-source clients can be examined, so that they can be fully trusted. How many of us do that, though? Not me... My gut feeling is that a single-purpose program like gpg is easier to secure, to make sure it's never doing anything like opening a network connection while decrypting my data. With some client or web browser that does everything, it would be harder to make sure it was only opening the right network connections and not secretly sending your data somewhere.