Tuesday 16 August 2022

Introducing threatware

I'd like to share a tool that I have been working on that helps security and engineering teams to threat model.  I'd also like to share the threat modelling process it was designed to help with.  Both are meant as contributions to help make threat modelling easier (at least for some people).

I'm not going to go over the process in detail; I've captured that in this overview in the documentation.  I will, however, give some context about the origin of the process and the tooling.  

I have been doing some form of threat modelling for many years.  Originally the process was simply using my experience as a penetration tester to ask questions and focus on the areas where vulnerabilities usually live.  The output was only ever a list of the potential vulnerabilities that were discovered.  As I did more architecture and design work, and needed to create generic frameworks to mitigate threats and vulnerabilities, I found myself more and more modelling systems as a means of designing solutions.  That modelling then changed my approach to threat modelling, as I sought a more systematic approach that could be consistent and lend itself to giving some assurance.  

Of course there were existing threat modelling tools I could try to leverage to provide a more systematic approach.  I was not impressed.  The tooling I discovered needed me to invest a lot of time and effort, and often the result was a plethora of threats that I knew were mostly noise and offered little value.  Many of the tools required complete indoctrination into the tool's methodology, and if your use case wasn't in the sweet spot for the tool, you were out of luck.  These problems are by no means unique to threat modelling tools, or security tools, or really any tools.

So I started exploring my own approach.  I began to define a template of the information that needed to be gathered in order to determine the threats against a system.  That template went through many iterations.  Initially it asked for so much information that it was difficult to populate, but the goal was not to miss any threats!  Through repeated use it became clear that the balance between the burden of data collection and finding all the threats needed refining.  That became the process: refine the template, evaluate the findings, and determine what offered little value and could be stripped out.  Eventually the template required fewer changes, and I gained more confidence that it found an acceptable number of threats for the effort required.  

I would say it was a success!  But with success come different problems.  The teams I worked with acknowledged that the template was doing a good job, but it was still not something they could populate without a lot of hand-holding.  As more teams wanted to benefit from using the template, the problem of scale emerged, as I was having difficulty keeping up with the demand - and that is not even mentioning the repetition of providing the same guidance to each team.  The next step was to produce detailed documentation so teams didn't need to come to me as the first step in populating the template.  This helped, but I was still reviewing the output, and the feedback loop for mistakes was too long.

Could tooling help to solve this?  The problem was that the template was just a document; I wasn't capturing information in an app where it would be simpler to detect incorrect input.  I should write my own app!  Wow, did I discover that was going to take a long time and a lot of work, and likely just result in the kind of inflexible approach that had frustrated me about other tooling I had used.  I didn't want teams to have to learn a new app, and I knew the business didn't (and shouldn't) want to commit to a process tied to an app.  I wanted teams to have the flexibility and familiarity that a document provides.  In fact, one of the key features that many document solutions provide is the ability to add inline comments, and this had proved invaluable in reviewing and communicating with teams asynchronously.  Replicating that in an app seemed well beyond what a part-time tool development effort could achieve.

But I could still do some basic tooling.  Some automation that highlighted the really common errors teams were making - the really time-intensive parts of reviewing threat models.  So scripts were written.  The format of the document was assumed, information was extracted and verification was applied.  With something working, I then moved it to an AWS Lambda function so teams could benefit from just a simple request in the browser.  Now whenever someone created a new threat model and asked me to review it, the first thing I told them was to run the tool and fix the errors.  This was a huge help to me.  What surprised me was how positive the reaction was from teams - they loved that they could use a tool to help validate their threat model!

There was a real 'golden age' for a while as I refined the tool to deliver more value to both me and the teams, and it enabled even more threat models to be created and existing ones to be updated.  Unfortunately, with success, scaling is a problem that won't go away; it was clear more changes were needed to support demand, but the tooling I had was creaking at the seams.

Victim of its own success or not, and forgive me for thinking so, what I had created seemed like an approach to threat modelling that others in the industry might find useful, and the seeds were planted for how I could share it with the world.  In its current state it was not shareable; it needed to be re-written from scratch (and independently of the company I was working for).  There was also the danger of hubris over the template I had created, as it would not suit everyone's situation.  

So I made the decision to build something that could support a flexible format for the template but still be able to provide validation.  I would provide an example template that others can use if they so choose, but if someone wanted to do something completely different they can do that also.  I also made the decision that the tooling needed to support the management of threat models, as I had first hand experience of a successful threat modelling process, and managing all those threat models required processes as well.

And that is how threatware was born.  The documentation hopefully gives all the detail you need, and while I use it daily, I would still only consider it in beta, as there are still occasional issues that need to be resolved.

If you need to do threat modelling, you might find it useful.  I threat model and it helps me a lot.

Thursday 27 December 2018

Establishing identity within a group of services

How can a group of services determine each other's identity?

For the purpose of this discussion, the services are connected (indirectly) to each other over a network, and the services are capable of storing secrets, which can be securely distributed to the services.  A central service can also exist with which all other services can authenticate i.e. have a trusted connection to.

If the central service establishes itself as the registration authority and issues credentials (e.g. a secret key, password, private key, etc.), then we have a group where every service that registers can authenticate itself back to the central service in the future.  That doesn't immediately allow services to identify themselves to each other though.

We usually have additional constraints though, such as
- service-to-service identification should not have to involve the central service every time services communicate, but bootstrapping via trusted centralised information, or occasionally talking to the central service, is acceptable
- the overhead of proving identity should not be a burden
- direct connections between services don't exist e.g. there are intermediate network devices
- replay attacks should not work
- stealing credentials is not in our threat model, but stealing ephemeral credentials should not lead to permanent compromise of an identity

Secret Key pairs
A central service can hand out a key unique to every pair of services; that way a service can sign a message and the receiving service will know it must have originated from the sender, because only the two of them have the key.  For N services that requires managing O(N^2) keys.  Keys would need an identifier, a lifetime and to be rotated, all without causing any downtime or miscommunication.  Creating signatures would be simple and signatures would be small.
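As a rough, illustrative sketch of the idea in Python (an HMAC stands in for the 'signature', and all the names are made up):

import hashlib, hmac, itertools, os

services = ["billing", "orders", "inventory"]

# Central service: issue one shared key per pair of services -> O(N^2) keys to manage.
pair_keys = {frozenset(pair): os.urandom(32)
             for pair in itertools.combinations(services, 2)}

def sign(sender: str, receiver: str, message: bytes) -> bytes:
    # Authenticate the message with the key shared only by this pair.
    key = pair_keys[frozenset((sender, receiver))]
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(sender: str, receiver: str, message: bytes, tag: bytes) -> bool:
    key = pair_keys[frozenset((sender, receiver))]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

tag = sign("billing", "orders", b"charge order 42")
print(verify("billing", "orders", b"charge order 42", tag))   # True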

PKI
Each component can get an asymmetric key pair and then sign a message; the receiving component verifies the signature via the public key, which it obtains from a central trusted location, and then knows the message must have originated from the signer because only they have the private key.  For N services that requires managing O(N) keys.  Keys would need an identifier, a lifetime and to be rotated, all without causing any downtime or miscommunication.  The signatures sent would be somewhat large, ~1-2KB.
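A similarly rough sketch of the signing and verification, using Ed25519 via the Python cryptography package (certificates, key identifiers and rotation are all omitted):

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Each service holds its own private key -> O(N) keys for the central registry.
sender_key = Ed25519PrivateKey.generate()
sender_public = sender_key.public_key()    # published via the central trusted location

message = b"charge order 42"
signature = sender_key.sign(message)       # 64 bytes for Ed25519

# The receiving service fetches the sender's public key and verifies the signature.
try:
    sender_public.verify(signature, message)
    print("message authenticated")
except InvalidSignature:
    print("rejected")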

Mutual-TLS
This implies that either the TLS is terminated at each component, or if terminated at some intermediate network point the client certificate is transmitted as an HTTP header.  The same constraints as for PKI exist. 

OIDC/OAuth2
Each service (client) gets an ID Token for each other service (Relying Party) it needs to talk to (the ID Token audience would be the Relying Party service).  The lifetime of the ID Token would control how often each component needs to talk to the central component.  For N services that requires managing O(N) secrets.  The tokens used would need to be regularly refreshed at the central service.
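The details depend entirely on the implementation, but as a rough sketch of the audience check (using PyJWT and a symmetric HS256 key purely to keep the example short; real OIDC ID Tokens are asymmetrically signed and verified against the issuer's published JWKS):

import time
import jwt   # PyJWT

ISSUER = "https://central.example"
ISSUER_SECRET = "shared-with-relying-parties"   # stand-in for the issuer's signing key

def issue_token(client: str, audience: str, ttl: int = 300) -> str:
    # Central service issues a short-lived token scoped to one receiving service.
    now = int(time.time())
    claims = {"iss": ISSUER, "sub": client, "aud": audience, "iat": now, "exp": now + ttl}
    return jwt.encode(claims, ISSUER_SECRET, algorithm="HS256")

def verify_token(token: str, expected_audience: str) -> dict:
    # Raises a jwt.InvalidTokenError subclass if the signature, audience,
    # issuer or expiry check fails.
    return jwt.decode(token, ISSUER_SECRET, algorithms=["HS256"],
                      audience=expected_audience, issuer=ISSUER)

token = issue_token(client="billing", audience="orders")
print(verify_token(token, expected_audience="orders")["sub"])   # "billing"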

Kerberos
Designed to solve this problem, but difficult to understand and get assurance of security.

There are likely others I am missing.

Friday 30 March 2018

Shall we play a game?

A while ago I watched a very interesting lecture series on Game Theory and I wanted to share some of the insights that I got out of it.

The reason I watched a lecture series on Game Theory is that it shares a lot in common with security.  Generally speaking, Game Theory studies how two (or more) interacting players should behave to get their desired result.  It certainly isn't about studying "games"; games are just tools to enable analysis, and it is perhaps better understood as the study of "strategic decision making".

One of the first things to understand about looking at a game is the idea of a payoff.  It's the reason the players are playing: they want to maximise their payoff.  Which leads to the first insight I got from the course:

In general, the payoffs for different players cannot be directly compared.
What this means is really interesting: the strategy you choose as a player is one that maximises your payoff, and so is every other player's, but all those strategies might be different because each player is maximising a different payoff.  Relating this to security, an example might be a company choosing security controls to minimise the chance of losing their intellectual property, while an attacker is totally focused on gaining control of machines to use for mining cryptocurrency.

To gain more insights it is helpful to know that there are different types of games:

  • finite games vs infinite games
  • zero-sum games vs non-zero-sum games
  • games of pure strategy vs games of mixed strategy
  • sequential games vs simultaneous games
  • games of perfect information vs games of incomplete information
  • cooperative games vs non-cooperative games

From a security perspective if you are trying to choose a strategy to protect an asset against a malicious attacker, then you are playing an infinite, non-zero sum, mixed strategy, simultaneous, incomplete information, non-cooperative game.  And yes, people actually study this type of game to determine what an optimal strategy might be (generally given more constraints).

An important type from the list above is the mixed strategy game.  That means that a player will not choose a single strategy, but will choose from multiple strategies that they will play with certain probabilities.  The payoff from a mixed strategy game for a player is the sum of the payoffs multiplied by the probability of playing the strategy with that payoff (like calculating an expected value in statistics).

Now whereas the payoffs for each strategy are fixed, the flexibility comes from a player assigning different probabilities to each strategy.  Of course every other player is doing the same thing, and will certainly choose the probabilities they play a certain strategy based on the probabilities they think other players will choose.

Relating this back to security, if a strategy has a probability and payoff, then the probability is the likelihood of a certain strategy being chosen, and the payoff is the effect of that strategy, or impact we could say.  Then the expected payoff for any strategy is likelihood multiplied by impact, which in security we recognise as how risk is calculated.
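As a trivial numeric illustration (the strategies and numbers are entirely made up):

# Expected payoff of a mixed strategy: sum of probability x payoff per strategy,
# which is the same arithmetic as likelihood x impact in a risk calculation.
attacker_strategies = {
    "phish employees":    {"probability": 0.6, "payoff": 40_000},
    "exploit web app":    {"probability": 0.3, "payoff": 120_000},
    "physical intrusion": {"probability": 0.1, "payoff": 250_000},
}

expected_payoff = sum(s["probability"] * s["payoff"] for s in attacker_strategies.values())
print(f"expected payoff: {expected_payoff:,.0f}")   # 85,000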

Now for the interesting bit if you are into security.  Game Theory has determined that there is a "stable" set of strategies that all players can reach, something called the Nash Equilibrium:

A Nash equilibrium is a set of strategies where no player benefits by unilaterally changing his or her strategy in the profile


This means there is an approach to security where we can construct a range of controls in such a way as to know exactly what our losses will be, and know that the attacker cannot come up with a better approach (set of strategies), as any other approach will lead to the attacker getting a reduced payoff. It's important to point out that a Nash equilibrium does not necessarily lead to the best payoffs for each player, or all players. What's also interesting is that for the type of game we play in security, at least one Nash equilibrium is guaranteed to exist.

Of course the problem is we don't know how to find the Nash equilibrium. Even if we knew all the possible strategies of all the players and the payoffs of all the strategies, there isn't a way to calculate it. Not to mention that clearly we don't know all the strategies or all the payoffs.

So useless, right? Sure, largely that is right for a quantitative analysis. But there are still insights that can be useful. For instance, in the types of games where we do know how to find the Nash equilibrium (zero-sum or constant-sum games), a player finds it by choosing probabilities for their own strategies such that the other player gets the same payoff regardless of which strategy the opposing player chooses (the minimax strategy). A small worked example is sketched below.
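Here is a small, made-up 2x2 zero-sum example in Python: the defender picks the probability of their first strategy so that the attacker's expected payoff is identical whichever column the attacker chooses.

# Rows are the defender's strategies, columns the attacker's strategies,
# entries are the payoff to the attacker (made-up numbers).
payoff_to_attacker = [
    [4.0, 1.0],   # defender plays "harden servers"
    [2.0, 3.0],   # defender plays "monitor and respond"
]

a, b = payoff_to_attacker[0]
c, d = payoff_to_attacker[1]

# Choose p (probability of the first defender strategy) so that
#   p*a + (1-p)*c == p*b + (1-p)*d
p = (d - c) / ((a - b) + (d - c))

col1 = p * a + (1 - p) * c
col2 = p * b + (1 - p) * d
print(f"play first strategy with p = {p:.2f}")       # 0.25
print(f"attacker payoff: {col1:.2f} vs {col2:.2f}")   # 2.50 vs 2.50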

So what does that mean in practical terms? It means this:

Choose your security controls based on the payoffs to the attacker

Don't choose your security controls based on the impact of the attacker's strategy on you. The goal is to force the attacker into a constant payoff no matter what they do. This approach doesn't mean you won't suffer losses to an attacker; it just means those losses will be fixed.

Naturally this approach has a problem, what if the impact (or payoff) to you is unacceptable from a business perspective? Well that leads to another important insight:

Game Theory can help you determine if you are playing the wrong game.
You can change a game in multiple ways, but ultimately it comes down to changing the payoffs for the other players.  In practical terms this means changing the amount of value an attacker can derive from the assets of interest to them e.g. no longer store the credit cards that an attacker wants access to.

What's interesting is this approach runs counter to the general approach of risk management today, which is to figure out what threats would have the most impact on your business, try to figure how likely those threats are, and try to minimise either the impact or likelihood.  Game Theory isn't saying that isn't sensible, just that it might not be the best strategy.  The simple reason is due to the very first insight, if you optimise to protect what is of most value to you, you might fail to protect the real target of an attacker, which ultimately might cost you more.  Put another way, investing heavily in protecting something that is valuable to the business is a waste of time if no attackers are actually interested in it.  For instance if you have IP that is core to your business, attackers won't care about that unless they can monetise it easily, especially if you have other assets they can monetise more easily.

The last insight that I wanted to mention is this quote:

“The power to constrain an adversary depends on the power to bind oneself” - Thomas Schelling
My interpretation of "bind oneself" is the processes you put in place to ensure you have adequate security controls across your business.  In appsec this would be your Secure SDLC.  The quote is all about the benefits of strictly enforcing controls on your business, so that an adversary is also constrained.

Game Theory isn't going to solve all the problems we have in security, but it seems a very complementary field that has many practical insights to offer.  If you are looking to broaden your understanding of security I would thoroughly recommend taking a closer look.

Wednesday 16 April 2014

Website specific password manager

Following on from my post about what the benefits would be if websites enforced unique passwords for users, I thought I would try to come up with a better scheme that avoided the trade-off between users' ability to remember passwords and password complexity.

My goal was to devise a way a website could allow users to choose a password they could remember and gain the same benefits as if users chose unique passwords.  Additionally, the security of the scheme should ideally be better, but at least no worse than what (a well designed) password authentication scheme is today.

The basic idea is the same as the way a password manager works.  Password managers allow a user to choose a strong local password that is used on their client machine to protect a password for a certain website.  When a user enters their local password in their password manager, it decrypts the website password, and then that is sent to the website to authenticate the user.  Usually the encrypted password(s) are stored online somewhere so the password manager can be used on multiple devices.

The problem (I use this term loosely) with password managers is that they require users to go to the effort of using one, and perhaps of understanding what it is and trusting it with their passwords.  It would be preferable to get some of the benefits of a password manager on a website without requiring the user to know, or even be aware, that one was being used.

So what I describe below is effectively a website specific password manager (that manages just 1 password, the password of the website).  The password management will happen behind the scenes and the user will be none the wiser.  The basic idea is that when a user registers with a website and chooses a password, that password will actually be a local password used to encrypt the actual website password.  The actual website password will be chosen by the website, be very strong and be unique (to the website).  When a user tries to log on, the following would happen:
  • The user will be prompted for their username and (local) password,
  • The entered username will be sent to the website and used to recover the encrypted password store (encrypted with the local password and containing the actual website password) for that user, which is then sent to the client
  • On the client the local password is used to decrypt the encrypted password store to recover the website password
  • The website password is sent to the website to authenticate the user.
Ignoring security for a second, we can improve the efficiency of the scheme by caching the encrypted password store on the client.  In fact the only reason the encrypted password store is stored on the server is to accommodate login from multiple devices.  It's an assumed requirement that login from multiple devices is necessary and that registering a device with a website doesn't scale sufficiently for it to be an option.

To examine the security of the scheme we need more details, so I provide those below.  But first I will say what security threats we are NOT looking to mitigate.  This scheme doesn't concern itself with transport security (so it's assumed HTTPS is used for communications), doesn't protect against XSS attacks (attacker script in the client will be able to read passwords when they are entered), doesn't protect against online password guessing attacks (it's assumed the website limits the number of password guesses).

The threats we are concerned about though are:
  • An attacker that gets an encrypted password store for a user should not be able to recover the user's local password in an offline attack.
  • An attacker that gets a DB dump of password hashes should not be able to recover either the website password or the local password.
So let's get into the detail of the scheme.  Here is a description of the algorithms and parameters involved.
  • Hash(), a password hashing algorithm like PBKDF2, bcrypt or scrypt.  Its output size is 256 bits.
  • HMAC_K(), this is HMAC-SHA256 using a secret key K.
  • K, a 256 bit secret key used by the website solely for password functionality, it is not per-user, it is the same for all users.
  • xor, this means the binary exclusive-or of two values.
  • PwdW, the website password, a 256 bit strong random value.
  • PwdL, the local password, chosen by the user.
Let's start with the registration process, a new user signs up:
  • A user chooses to register with the website.
  • The website generates a new PwdW and returns this to the client.
  • The user chooses a password (and username if appropriate), this is PwdL.
  • The client sends to the website, PwdW xor Hash(PwdL).
  • The website stores (against the username), PwdW xor Hash(PwdL) and also PwdW xor HMAC_K(PwdW).
I've skipped over some of the implementation details here (like the Hash parameters etc), but hopefully provided enough details for analysis.  One important point is that the website does not store the value PwdW anywhere, it only stores the 2 values in the last step, namely:
  • PwdW xor Hash(PwdL)      (this is the encrypted password store from the description above)
  • PwdW xor HMAC_K(PwdW)  (this is the equivalent of the password hash in most schemes; it is used to verify that any PwdW submitted by a client is correct)
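To make the moving parts concrete, here is a minimal, illustrative Python sketch of the registration step.  It assumes PBKDF2-HMAC-SHA256 as Hash(), uses the username as a stand-in for a proper salt, and glosses over parameter choices and error handling:

import hashlib, hmac, os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def hash_local_password(pwd_l: str, username: str) -> bytes:
    # Hash(PwdL); the username stands in for a proper per-user salt.
    return hashlib.pbkdf2_hmac("sha256", pwd_l.encode(), username.encode(), 600_000)

# Website: the secret key K (generated once) and a fresh PwdW for the new user.
K = os.urandom(32)
pwd_w = os.urandom(32)          # PwdW, sent to the client during registration

# Client: the user chooses PwdL, and the client sends back PwdW xor Hash(PwdL).
encrypted_store = xor(pwd_w, hash_local_password("correct horse battery staple", "alice"))

# Website: store the two values against the username; PwdW itself is never stored.
stored = {
    "encrypted_store": encrypted_store,                                   # PwdW xor Hash(PwdL)
    "verifier": xor(pwd_w, hmac.new(K, pwd_w, hashlib.sha256).digest()),  # PwdW xor HMAC_K(PwdW)
}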
Now let's see what happens during login:
  1. User visits the website and enters their username and PwdL
  2. The client sends the username to the website, which looks up the encrypted password store, PwdW xor Hash(PwdL) and returns it to the client
  3. The client generates Hash(PwdL) and uses this to recover PwdW by calculating (PwdW xor Hash(PwdL)) xor Hash(PwdL) = PwdW
  4. The client sends PwdW to the website
  5. The website recovers HMAC_K(PwdW) by calculating (PwdW xor HMAC_K(PwdW)) xor PwdW = HMAC_K(PwdW) (let's call this Recovered_HMAC).
  6. The website calculates HMAC_K(PwdW) using the received value of PwdW from the client (let's call this Calculated_HMAC).
  7. If Recovered_HMAC = Calculated_HMAC the user is successfully authenticated.
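And, in the same spirit, an illustrative sketch of the login-side checks (the xor and hash_local_password helpers are the same as in the registration sketch, repeated here so the snippet stands alone):

import hashlib, hmac

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def hash_local_password(pwd_l: str, username: str) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", pwd_l.encode(), username.encode(), 600_000)

def client_recovers_pwd_w(encrypted_store: bytes, pwd_l: str, username: str) -> bytes:
    # Step 3: (PwdW xor Hash(PwdL)) xor Hash(PwdL) = PwdW
    return xor(encrypted_store, hash_local_password(pwd_l, username))

def server_verifies(verifier: bytes, submitted_pwd_w: bytes, K: bytes) -> bool:
    # Steps 5-7: recover HMAC_K(PwdW) from the stored verifier, recalculate it
    # from the submitted PwdW, and compare the two in constant time.
    recovered_hmac = xor(verifier, submitted_pwd_w)
    calculated_hmac = hmac.new(K, submitted_pwd_w, hashlib.sha256).digest()
    return hmac.compare_digest(recovered_hmac, calculated_hmac)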
So let's address the threats we are concerned about.  The first is "An attacker that gets an encrypted password store for a user should not be able to recover the user's local password in an offline attack."  Let's assume an attacker knows the username they want to target, and that they download the encrypted password store (PwdW xor Hash(PwdL)) for that user.  If we accept that PwdW is a random 256-bit value, then an attacker is unable to determine the value of Hash(PwdL), because there is a value of PwdW that gives every possible hash value of every possible local password that could be chosen.  This is the security property of the one-time pad.  Of course an attacker could instead generate Hash(PwdL) (by guessing PwdL) and calculate PwdW, but they would have no idea if they were correct and would have to submit it to the website to find out; however, we are assuming there are controls in place that limit the number of online password guessing attempts.

The other threat we consider is "An attacker that gets a DB dump of password hashes should not be able to recover either the website password or the local password."  There are 2 scenarios we want to consider, when the attacker does not have access to the secret key K, and when they do.

If the attacker does not have access to K, then they have access to (PwdW xor Hash(PwdL)) and (PwdW xor HMAC_K(PwdW)).  If the attacker tries to guess PwdW offline, they have no way to determine if a guess is correct, because they cannot verify the HMAC without the key K.  They can of course guess PwdW online by submitting it to the website, but we assume controls that mitigate this attack.  If the attacker tries to guess PwdL, they have exactly the same problem of being unable to verify offline whether the calculated PwdW is correct.

If the attacker does have access to K then they have the same information that the website has and so can perform an offline (dictionary) attack against the local password PwdL.  How successful this attack is depends on the parameters of the password hashing scheme (Hash()) that were chosen and the complexity of the user chosen password PwdL.  For the purposes of this scheme though the attacker in this scenario has the same chance of success as they would have against the current industry best practice of storing a password hash in the DB (in a scenario where those hashes were extracted by the attacker and attacked offline).

So that's a quick and dirty description and security analysis of a novel password authentication and protection scheme for a website.  The benefits of this scheme are:
  • The website never gets to see the local password chosen by the user.  But this is not an absolute, it assumes a trusted website (and trusted 3rd party resources the website requests).  This is a benefit because:
    • Websites cannot accidentally expose the user's local password, either internally or externally.
    • A malicious insider working for the website would have a much more difficult task in obtaining local passwords with an intent of using those passwords on other websites.
  • A successful SQL Injection attack that dumped out the password hash DB table would not allow an attacker to compromise any accounts.
  • If the secret key K was stored and used in an HSM, then it would be impractical under any circumstances for an attacker to mount an offline attack (short of physical access to the user!)
But let's also be clear about the problems:
  • You probably shouldn't be implementing any scheme you read about in a blog.  Such things need to be reviewed by experts and their acceptance needs to reach a level of consensus in the security community.  Anyone can create a security system that they themselves cannot break.
  • Lack of implementation details.  Even if the theory is sound there are several pitfalls to avoid in implementing something like this.  This means it's going to be difficult to implement securely unless you are very knowledgeable about such things.
  • If you are using a salted password-hash scheme now and went through and encrypted those hashes (as another layer of security), you basically get the same level of security as this scheme proposes. The benefit of this scheme is that the user's password is not sent to the website.
  • Password changing.  It may be more difficult to mandate or enforce users changing their local password, because the change must be done on the client side.
  • JavaScript crypto.  The Hash() algorithm would need to be implemented in JavaScript.  The first problem is implementing crypto algorithms in JavaScript in the browser raises several security concerns.  The second is the Hash algorithm is computationally expensive which means performance might negatively affect usability, perhaps more so when implemented in JavaScript, and perhaps even more so on certain devices e.g. mobile.
  • Lastly, I didn't quite achieve my goal of allowing users to choose weaker passwords as in the worst case attack (attacker has access to the secret key), the use of weak passwords would make recovering those passwords relatively simple.
But hey, it's a novel (as far as I know) approach, and it might be useful to someone else designing a solution for website authentication.

Friday 4 April 2014

Removing usernames and enforcing unique passwords

I often have "ideas" that, in the haystack of my mind, are usually straws rather than that elusive needle. One recent one was whether the username is necessary to login to a website.  Usually the Internet is excellent at helping identify straws, and this idea is no exception, see http://blog.codinghorror.com/why-do-login-dialogs-have-a-user-field/.  But let's follow through on the idea and see where it takes us.

So the basic idea is that when a user wants to authenticate to a website, all they do is provide their password and the website will use that information to identify them.

The draw of removing usernames for me is really an efficiency one, as it's not immediately clear that usernames are required to authenticate to a website.

The obvious problem though is if someone tries to register with the same password as an existing user.  Since we can't have 2 users with the same password (more on unique passwords later), we must reject the registration and reveal that the chosen password is a valid credential for another user.  We can overcome this problem by forcing the existing user with that password to change their password (using (an assumed pre-existing) out-of-band password reset mechanism e.g. email).  Which highlights one benefit of authenticating with the password only, it creates an incentive for the user to choose a strong password (as otherwise they will continually have to change their password when new users attempt to register with the same password).

From the website's perspective there is a problem though: how do you identify the user?  If passwords are being stored using a salted password-hashing algorithm, then it would be too inefficient for the website to, for each row in the password hash DB table, fetch the salt for that row, generate the password hash (using the password of the user trying to authenticate), and then compare it against the stored password hash in that row.  That approach simply does not scale.  We certainly don't want to use a password hash without a salt or with a fixed salt (as this makes dictionary attacks on the password hash DB table much quicker in the event the table is exposed).

One option is to encrypt the password and use that as the salt for the password hash.  To verify a login attempt the website would encrypt the password to create the salt, calculate the password hash (using the salt) and compare with the list of stored password hashes (potentially sorted for efficient searching).  It's important to point out this encrypted password, which we are using as the salt, is not stored, but calculated during the login attempt.
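As a rough illustration of the idea (with HMAC-SHA256 standing in for the encryption step, PBKDF2 as the password hash, and made-up function names), it might look something like this:

import hashlib, hmac, os

K = os.urandom(32)   # website-wide secret key; ideally held and used inside an HSM

def derive_salt(password: str) -> bytes:
    # The "encrypt the password" step; a keyed HMAC is used here as the keyed primitive.
    return hmac.new(K, password.encode(), hashlib.sha256).digest()

def password_record(password: str) -> bytes:
    # An ordinary password hash, but salted with the derived (never stored) value.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), derive_salt(password), 600_000)

# Registration: store only the record.
stored_records = {password_record("correct horse battery staple")}

# Login: recompute the record for the submitted password and search for it.
print(password_record("correct horse battery staple") in stored_records)   # True  -> user identified
print(password_record("123456") in stored_records)                         # False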

If the password hash store was ever compromised (and we always assume it will be) then it will be impossible to brute-force the passwords without the encryption key as well (as the salt will not be known and will not be practical to guess).  Thus the security of this approach relies on protecting the encryption key.  The key should not be stored in the DB, as the likely extraction method of the password hash DB table is SQL Injection, meaning if the key was in the DB it too could be extracted.  The key should be stored somewhere the DB cannot access (the goal is to require the attacker to find a separate exploit to obtain the key).  It could be stored in configuration files, but the best option would be an HSM, with encryption done on-board.  By the stage an attacker has code executing that can talk to the HSM, it's game over anyway.  If the encryption key was obtained by an attacker then they could perform a dictionary attack on the password hash DB table more efficiently than attacking traditional salted password hashes.

We can make another optimisation for security as well.  We should keep track of the passwords that more than one user has chosen in the past i.e. password collisions discovered during user registration.  After all, we don't want to force a user to change their weak password, only for that password to become available for use again!  This way we will avoid the re-use of weak passwords e.g. 123456.  Now imagine we had a list of passwords that we know more than one person had thought of.  Now imagine we publicly shared that list (so other websites could avoid letting users choose those passwords as well).

So let's imagine we now have an application that doesn't require usernames for authentication and all users have unique passwords.  What are the threats?  Well a distributed dictionary attack against our application is a problem because every password guess is a possible match for every user.  Annoyingly the more users we have the better the chances of the guess being right.  Additionally, limiting the number of guesses is more difficult since authentication attempts are not user specific.  This makes clear the benefit of having usernames; they make online authentication attacks much harder.

So my conclusion was that although usernames might not be strictly necessary, they do offer significant security benefits.  From a user perspective as well there is minimal burden in using usernames as browsers (or other user agents) often remember the username for convenience.

But what about the benefits of unique passwords!  What about the incentive we gained when users were forced to choose strong passwords?  Well what if we keep usernames AND still forced passwords to be unique?  Could this be the best of both worlds?  Might I have just pricked my finger on a needle?

The 'sticking point' for me is the user acceptability of forcing unique passwords.  It may drive the uptake of password managers or strong passwords, or it might annoy the hell out of people.  Perhaps for higher security situations it could be justified.

Tuesday 1 April 2014

Revisiting Fundamentals: Defence in Depth

I think it is a worthwhile exercise to revisit fundamental ideas occasionally.  We are constantly learning new things and this new knowledge can have an effect on the way we understand fundamental concepts or assumptions, sometimes challenging them and sometimes offering new insights or facets to appreciate.  I recently learned about how Defence in Depth was used in some historic military battles and it really challenged my understanding of how the principle can be applied to security.

So what is Defence in Depth (DiD) and what is it good for?  My understanding is that DiD originated as a military tactic, but is more commonly understood in its engineering sense.  It turns out the term has different meanings in these different disciplines.  Largely I am going to focus on the meanings with regard to application security (as this is relevant to me).

Engineering
In engineering DiD is a design tactic to protect a resource by using layers of controls, so if one control fails (bypassed by an attacker) there is another control still protecting an asset.  By incorporating DiD into an application's security design we add (passive) redundant controls that provide a margin of safety for a control being bypassed.  The controls we add may be at different trust boundaries or act as only a partially redundant control for other controls e.g. input validation at the perimeter partially helps protect against XSS, SQLi, etc.

I would say that adding DiD in the engineering sense is essential to any security design of an application.  Security controls should be expected to fail, and so accounting for this by architecting in redundancy to achieve a margin of safety makes sense.  I will say that some disadvantages to redundancy have been identified that likely apply to DiD as well.  Briefly:
  • Redundancy can result in a more complex system.  Complex systems are harder to secure.
  • The redundancy is used to justify weaker security controls in downstream systems.
  • The redundancy is used as an excuse to implement risky functionality which reduces the margin of safety.  Multiple independent functionality relying on the margin can have an accumulated effect that reduces or removes the margin of safety.

Military
DiD in military terms is a layered use of defensive forces and capabilities that are designed to expend the resources of the attacker trying to penetrate them e.g. Battle of Kursk.  What's crucial to appreciate, I think, is that in the physical world the attacker's resources (e.g. troops, guns, bombs, time) are usually limited in supply.  In this way DiD grinds down an enemy and the defender wins if the cost/benefit analysis of the attacker changes so they either stop or focus on another target.  Additionally, defensive forces are also a limited resource and can be ground down by an attacker.  It is possible for an attack to result in stalemate as well, which may be considered a defensive win.

Taking this point of view then, is protecting an application analogous to protecting a physical resource?  We need it to be analogous in some way otherwise using DiD (in the military sense) might not be an appropriate tactic to use for defending applications.

Well one of the key points in the physical arena is the consumption of resources, but in the application arena it seems less accurate (to me) to say that computing resources are consumed in the same way.  If an attacker uses a computer to attack an application, the computer is not "consumed" in the process, or less able to be used again.  The same is true for the application defences.

So the consumption of physical resources isn't analogous between the physical and application security arenas.  There are other, non-physical, types of resources though.  I can think of 2 resources of an attacker that are limited and applicable to application security:
  • time - as a resource, if a defensive position can increase the time an attacker needs to spend performing an attack, the cost/benefit analysis of the attacker may change, the "opportunity cost" of that time may be too high.
  • resolve - in the sense that an attacker will focus on a target for as long as they believe that attacking that target is possible and practical.  If a defensive position can make the attacker believe that attacking it is impractical, then the attacker will focus their efforts elsewhere.
There is an irony in 'resolve' being a required resource.  The folks who have the appropriate skills to attack applications are a funny bunch I reckon, as they are exactly the type of people who are very persistent; after all, if someone can master the art of reverse engineering or blind SQL injection, they are likely one persistent SOB.  In a sense they are drawn to the problem of (application) security because it requires the very resource they have in abundance.

As an aside, this could be why APT attacks are considered so dangerous, it's not so much the advanced bit (I rarely read about advanced attacks), but the 'persistent' bit; the persistent threat is a reflection of the persistence (or resolve) of the attacker.  The risk being that most defences are going to be unlikely to break that resolve.

So if those are the resources of the attacker then that gives us a way to measure how effective our DiD controls are; we need to measure how they increase time or weaken resolve.

Layered controls work well in increasing the time the attacker has to spend.  The more dead ends they go down, the more controls they encounter that they cannot bypass, all that takes time.  Also, attackers tend to have a standard approach, usually a set of tools they use, so any controls that limit the effectiveness of those tools and force them to manually attack the application will cause a larger use of time.  Ironically though, consuming time could have an opposite effect on the resolve of the attacker, since there is likely a strong belief that a weakness exists somewhere, and also the "sunk cost effect",  meaning the attacker may become more resolved to find a way in.

I'm not a psychologist so I can't speak with authority on what would wear down resolve.  I suspect it would involve making the attack frustrating though.  I will suggest that any control that makes the application inconsistent or inefficient to attack, will help to wear down resolve.

I did a bit of brain-storming to think of (mostly pre-existing) controls or designs an application could implement to consume 'time' and 'resolve' resources:
  • Simplicity.  Attackers are drawn to complex functionality because complexity is the enemy of security.  Complexity is where the vulnerabilities will be.  The simpler the design and interface (a.k.a minimal attack surface area) the less opportunity the attacker will perceive.
  • Defensive counter controls.  Any controls that react to block or limit attacks (or tools). 
    • Rapid Incident Response.  Any detection of partial attacks should be met with the tightening of existing controls or attack-specific controls.  If you can fix things faster than the attacker can turn a weakness into a vulnerability, then they may lose hope.  I do not pretend this would be easy.
    • Random CAPTCHAs.  Use a CAPTCHA (a good one that requires a human to solve), make these randomly appear on requests, especially if an attack might be under way against you.
  • Offensive counter controls.  Any control that misleads the attacker (or their tools).
    • Random error response.  Replace real error responses with random (but valid looking) error responses, the goal is to make the attacker think they have found a weakness but in reality they haven't.  This could have a detrimental effect on actual trouble shooting though.
    • Random time delays. Vary the response time of some requests, especially those that hit the database.  Occasionally adding a 1 second delay won't be an issue for most users but could frustrate attacker timing attacks.
    • Hang responses.  If you think you are being attacked, you could hang responses, or deliver very slow responses so the connection doesn't time out.
I'm certainly not the first to suggest this approach; it is known as "active defence".  There are even some tools and a book about it (which I cannot vouch for).  The emphasis there may be more on network defensive controls rather than the application controls that my focus has been on.

TL;DR
Defence in Depth in application security should involve redundant controls but may also include active defences.

Tuesday 25 March 2014

Making Java RIAs MIA - Enterprise Style

This post is all about how to disable Java in the browser, or alternatively, making Java Rich Internet Applications (RIAs) missing-in-action.  There is a lot of information out there on how to do this, but I couldn't find a good source that covered it from an Enterprise point of view, that is, how to disable Java in the browser at scale (rather than just doing it on your own machine using a GUI), across multiple browsers and OSes.

First let's cover the obvious: why do we want to do this?  See my page on Java Vulnerabilities.  Basically, if Java is enabled in your browsers then you are opening yourself up to a world of pain.  If you have Java enabled in your browsers then you have malware somewhere in your Enterprise.

What we want is to understand the options we have for disabling Java in the browser and understand what the settings are that we need to configure.

Browser Agnostic
Uninstall Java
This is obviously a good option as it removes the risk of Java.  However, depending on your environment, users may just re-install Java, so unless you are actively scanning for Java installations across your Enterprise this approach might not be as effective as you'd like it to be.

For that reason I'm not going to focus on this solution too much, but for non-Windows machines see Oracle's advice on uninstalling and this post.

For Windows you can detect the installed versions of Java using
wmic product where "name like 'Java%'" get name
and uninstall these via
wmic product where "name like 'Java%'" call uninstall


Java Deployment Options
From Java version 7 update 10 a deployment option exists to disable Java content in all browsers (these are the same options as available in the Java Control Panel).  This option is a feature of Java and so is available on both Windows and non-Windows OSes.

The option is stored in a deployment.properties file.  The user has their own version of this file stored at (on Windows and non-Windows respectively):
<user_profile>\AppData\LocalLow\Sun\Java\Deployment\deployment.properties
<user home>/.java/deployment/deployment.properties

A system version, containing the default values for all users, can be created and stored at (on Windows and non-Windows respectively):
%WINDIR%\Sun\Java\Deployment\deployment.properties
/etc/.java/deployment/deployment.properties

This system version is only used by Java if a deployment.config file exists at (on Windows and non-Windows respectively):
%WINDIR%\Sun\Java\Deployment\deployment.config
/etc/.java/deployment/deployment.config
and contains an option that specifies the location of the system deployment.properties to use e.g. on Windows:
deployment.system.config=file\:C:/WINDOWS/Sun/Java/Deployment/deployment.properties
deployment.system.config.mandatory=true

The system version contains the default values that the user version will use, but the user can override these.  That is unless the system version indicates that the option should be locked (which means the user cannot change the option).

So to disable the plugin across all browsers and users the system deployment.properties should contain:
deployment.webjava.enabled=false
deployment.webjava.enabled.locked

Both the system deployment.config and deployment.properties should be ACL’ed so that the user cannot edit them (but can read them).

Although this option only applies from Java version 7 update 10, there is no issue in setting this option for earlier versions of Java, as the setting is ignored.  The benefit is that should Java be upgraded then this setting will be adopted.

Browser Specific
Internet Explorer (IE)

In response to the threat posed by Java in the browser Microsoft released a FixIt solution that disables Java in IE - http://support.microsoft.com/kb/2751647

The executable disables Java in the browser and appears as “IE Java Block 32bit Shim” in the list of installed programs.  Presumably the FixIt installs a 64bit equivalent on 64bit versions of IE.

We can detect if the Microsoft FixIt is installed by confirming that the following registry key exists
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{01cf069a-f8a1-4067-adc4-5ef7e922733c}.sdb
If the Windows OS is 64-bit and 32-bit IE is installed then the key will be under:
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall

There is some debate about the effectiveness of this FixIt.  Advice from CERT states that it doesn't completely prevent RIAs from being invoked by IE.  Alternative methods to disable Java exist that may be more reliable, but seem to involve updating a list of blocked ActiveX controls with every release of Java.  You'll have to decide on the risk yourself, but on Windows perhaps uninstalling Java or using the Java deployment options are the safest choices.

Firefox

Firefox has a configuration option that controls whether the plugin is enabled.  The option can be viewed by going to about:config and is called “plugin.state.java”.  It can take one of three values:
  • 0 = disabled
  • 1 = click to play
  • 2 = enabled
The value of this option is stored in a file called “prefs.js” in the user’s Firefox profile directory.  If a user was to set this option via about:config their selection would be saved in this file.

It’s preferable to force the plugin to be disabled for all users (and all user profiles).  Firefox supports a ‘lock’ on the preference so that the user cannot (easily) change the option and enable the plugin (see Locking preferences).  This is done by creating a file called “mozilla.cfg” in the installation directory of Firefox with the entry:
//
lockPref("plugin.state.java", 0);

Firefox needs to be instructed to load this lock file and this is done by creating a “local-settings.js” file in the “defaults\pref” sub-directory of the Firefox installation directory with:
pref("general.config.obscure_value", 0);
pref("general.config.filename", "mozilla.cfg");

With these configuration files and options in place the option plugin.state.java in about:config will be italicized and greyed out.

To prevent a user from changing “mozilla.cfg” and “local-settings.js” these files should be ACL’ed so that the user cannot edit them (but can read them).

One thing I am not sure of is in which version of Firefox the plugin.state.java configuration was introduced, or even the ability to lock configuration options.  If your Enterprise is using a very old version of Firefox then this approach might not be available.



Chrome
This method applies to Google Chrome and Chromium (with minor differences).  Chrome supports configuration via Policy  (see chrome://policy) and the policy value we want to set is called DisabledPlugins (see list of plugins via chrome://plugins/).

The DisabledPlugins policy can be overridden by the EnabledPlugins and DisabledPluginsExceptions policies though, so these policies also need to be set to ensure the Java plugin is disabled.

Windows
Chrome policy can be enforced via Group Policy.  Google provides ADM/ADMX templates.  The GPO should be configured at the "Computer Configuration" level:
  • To set the DisabledPlugins policy, the policy entry “Specify a list of disabled plug-ins” should be set to “Enabled” and an entry of “Java(TM)” should be added.
  • To set the EnabledPlugins policy, the policy entry “Specify a list of enabled plug-ins” should be set to a random value. (1)
  • To set the DisabledPluginsExceptions policy, the policy entry “Specify a list of plug-ins that the user can enable or disable” should be set to a random value. (1)

Chrome historically supported policy configuration through the registry but this ability was removed in version 28.  The current policy is still written to the registry though, so this provides a mechanism to verify the policy exists on a machine.

Non-Windows
Chrome also supports configuration via policy on Non-Windows machines via a local JSON file containing the policy configuration.  The policy files live under /etc/opt/chrome for Google Chrome (and /etc/chromium for Chromium).

Two types of policies exist; "managed", which are mandatory policies, and "recommended" which are not mandatory.  These policies are respectively located under:
/etc/opt/chrome/policies/managed/
/etc/opt/chrome/policies/recommended/

To disable the Java plugin create a file “policy.json” in /etc/opt/chrome/policies/managed/ with the contents:
{
  "DisabledPlugins": ["Java(TM)"],
  "DisabledPluginsExceptions": [" "],
  "EnabledPlugins": [" "]
}

The policy file should have ACLs applied to it so that users cannot edit (but can read) it.

(1) So why do we assign the EnabledPlugins and DisabledPluginsExceptions random values (or a space character)?  Turns out if we set these policies to "Not configured" or "Disabled" then a user can enable Java in the browser by configuring these policies in the "User Configuration" policy section.  If this seems like a security issue to you, well it seemed like one to me as well, that's why I opened a bug with Google about it.  They didn't really agree, but I concede it's complicated (and potentially it's a bug in Microsoft's Group Policy functionality).