Wednesday 16 April 2014

Website specific password manager

Following on from my post about what the benefits would be if websites would enforce unique passwords for users, I thought I would try to come up with a better scheme that avoided the dichotomy of the ability of users to remember passwords and password complexity.

My goal was to devise a way a website could allow users to choose a password they could remember and gain the same benefits as if users chose unique passwords.  Additionally, the security of the scheme should ideally be better, but at least no worse than what (a well designed) password authentication scheme is today.

The basic idea is the same as the way a password manager works.  Password managers allow a user to choose a strong local password that is used on their client machine to protect a password for a certain website.  When a user enters their local password in their password manager, it decrypts the website password, and then that is sent to the website to authenticate the user.  Usually the encrypted password(s) are stored online somewhere so the password manager can be used on multiple devices.

The problem (I use this term loosely) with password managers is it requires users to go to the effort of using one, or perhaps understanding what it is and trusting it with their passwords.  It would be preferable to get some of the benefits of a password manager on a website without requiring the user to know or even be aware one was being used.

So what I describe below is effectively a website specific password manager (that manages just 1 password, the password of the website).  The password management will happen behind the scenes and the user will be none the wiser.  The basic idea is that when a user registers with a website and chooses a password, that password will actually be a local password used to encrypt the actual website password.  The actual website password will be chosen by the website, be very strong and be unique (to the website).  When a user tries to log on, the following would happen:
  • The user will be prompted for their username and (local) password,
  • The entered username will be sent to the website and used to recover the encrypted password store (encrypted with the local password and containing the actual website password) for that user, which is then sent to the client
  • On the client the local password is used to decrypt the encrypted password store to recover the website password
  • The website password is sent to the website to authenticate the user.
Ignoring security for a second, we can improve the efficiency of the scheme by caching the encrypted password store on the client.  In fact the only reason the encrypted password store is stored on the server is to accommodate login from multiple devices.  It's an assumed requirement that login from multiple devices is necessary and that registering a device with a website doesn't scale sufficiently for it to be an option.

To examine the security of the scheme we need more details, so I provide those below.  But first I will say what security threats we are NOT looking to mitigate.  This scheme doesn't concern itself with transport security (so it's assumed HTTPS is used for communications), doesn't protect against XSS attacks (attacker script in the client will be able to read passwords when they are entered), doesn't protect against online password guessing attacks (it's assumed the website limits the number of password guesses).

The threats we are concerned about though are:
  • An attacker that gets an encrypted password store for a user should not be able to recover the user's local password in an offline attack.
  • An attacker that gets a DB dump of password hashes should not be able to recover either the website password or the local password.
So let's get into the detail of the scheme.  Here is a description of the algorithms and parameters involved.
  • Hash(), a password hashing algorithm like PBKDF2, bcrypt or scrypt. It's output size is 256 bits.
  • HMAC_K(), this is HMAC-SHA256 using a secret key K.
  • K, a 256 bit secret key used by the website solely for password functionality, it is not per-user, it is the same for all users.
  • xor, this means the binary exclusive-or of two values.
  • PwdW, the website password, a 256 bit strong random value.
  • PwdL, the local password, chosen by the user.
Let's start with the registration process, a new user signs up:
  • A user chooses to register with the website.
  • The website generates a new PwdW and returns this to the client.
  • The user chooses a password (and username if appropriate), this is PwdL.
  • The client sends to the website, PwdW xor Hash(PwdL).
  • The website stores (against the username), PwdW xor Hash(PwdL) and also PwdW xor HMAC_K(PwdW).
I've skipped over some of the implementation details here (like the Hash parameters etc), but hopefully provided enough details for analysis.  One important point is that the website does not store the value PwdW anywhere, it only stores the 2 values in the last step, namely:
  • PwdW xor Hash(PwdL)      (this is the encrypted password store from the description above)
  • PwdW xor HMAC_K(PwdW)  (this is the equivalent of the password hash in most schemes, it is used to verify that any PwdW submitted by a client is correct)
Now let's see what happens during login:
  1. User visits the website and enters their username and PwdL
  2. The client sends the username to the website, which looks up the encrypted password store, PwdW xor Hash(PwdL) and returns it to the client
  3. The client generates Hash(PwdL) and uses this to recover PwdW by calculating (PwdW xor Hash(PwdL)) xor Hash(PwdL) = PwdW
  4. The client sends PwdW to the website
  5. The website recovers HMAC_K(PwdW) by calculating (PwdW xor HMAC_K(PwdW)) xor PwdW = HMAC_K(PwdW) (let's call this Recovered_HMAC).
  6. The website calculates HMAC_K(PwdW) using the received value of PwdW from the client (let's call this Calculated_HMAC).
  7. If Recovered_HMAC = Calculated_HMAC the user is successfully authenticated.
So let's address the threats we are concerned about, the first is "An attacker that gets an encrypted password store for a user should not be able to recover the user's local password in an offline attack."  Let's assume an attacker knows the username they want to target, and that they download the encrypted password store (PwdW xor Hash(PwdL)) for that user.  If we accept that PwdW is a random 256-bit value, then an attacker is unable to determine the value of Hash(PwdL) because there is a value of PwdW that gives every possible hash value of every possible local password that could be chosen.  This is the security property of the one-time pad.  Of course an attacker could instead generate Hash(PwdL) (by guessing PwdL) and calculate PwdW, but they have no idea if they are correct and would have to submit it to the website to determine if it was correct, however we are assuming there are controls in place that limit the number of online password guessing attempts.

The other threat we consider is "An attacker that gets a DB dump of password hashes should not be able to recover either the website password or the local password."  There are 2 scenarios we want to consider, when the attacker does not have access to the secret key K, and when they do.

If the attacker does not have access to K, then in this case the attacker has access to (PwdW xor Hash(PwdL)) and (PwdW xor Enc_K(PwdW)).  If the attacker tries to offline guess PwdW they have the problem of having no way to determine if a guess is correct, because they cannot verify the HMAC without the key K.  They can of course guess PwdW online by submitting it to the website, but we assume controls that mitigate this attack.  If the attack tries to guess PwdL, then they have exactly the same problem of being unable to offline verify if the calculated PwdW is correct.

If the attacker does have access to K then they have the same information that the website has and so can perform an offline (dictionary) attack against the local password PwdL.  How successful this attack is depends on the parameters of the password hashing scheme (Hash()) that were chosen and the complexity of the user chosen password PwdL.  For the purposes of this scheme though the attacker in this scenario has the same chance of success as they would have against the current industry best practice of storing a password hash in the DB (in a scenario where those hashes were extracted by the attacker and attacked offline).

So that's a quick and dirty description and security analysis of a novel password authentication and protection scheme for a website.  The benefits of this scheme are:
  • The website never gets to see the local password chosen by the user.  But this is not an absolute, it assumes a trusted website (and trusted 3rd party resources the website requests).  This is a benefit because:
    • Websites cannot accidentally expose the user's local password, either internally or externally.
    • A malicious insider working for the website would have a much more difficult task in obtaining local passwords with an intent of using those passwords on other websites.
  • A successful SQL Injection attack that dumped out the password hash DB table would not allow an attacker to compromise any accounts.
  • If the encryption key K was stored and used in an HSM, then it would be impractical under any circumstances for an attacker to mount an offline attack (short of physical access to the user!)
But let's also be clear about the problems:
  • You probably shouldn't be implementing any scheme you read about in a blog.  Such things need to be reviewed by experts and their acceptance needs to reach a level of consensus in the security community.  Anyone can create security system that they themselves cannot break.
  • Lack of implementation details.  Even if the theory is sound there are several pitfalls to avoid in implementing something like this.  This means it's going to be difficult to implement securely unless you are very knowledgeable about such things.
  • If you are using a salted password-hash scheme now and went through and encrypted those hashes (as another layer of security), you basically get the same level of security as this scheme proposes. The benefit of this scheme is that the user's password is not sent to the website.
  • Password changing.  It may be more difficult to mandate or enforce user's change their local password because the change must be done on the client side.
  • JavaScript crypto.  The Hash() algorithm would need to be implemented in JavaScript.  The first problem is implementing crypto algorithms in JavaScript in the browser raises several security concerns.  The second is the Hash algorithm is computationally expensive which means performance might negatively affect usability, perhaps more so when implemented in JavaScript, and perhaps even more so on certain devices e.g. mobile.
  • Lastly, I didn't quite achieve my goal of allowing users to choose weaker passwords as in the worst case attack (attacker has access to the secret key), the use of weak passwords would make recovering those passwords relatively simple.
But hey, it's a novel (as far as I know) approach, and it might be useful to someone else designing a solution for website authentication.

No comments:

Post a Comment