Oct 02 2007
 

I’ve now finished my current batch of postings about Sun’s OpenID IdP. Here’s a listing of the relevant postings that the team has made. I’ll add new postings to this list as they’re published, or as I find them.

Purpose and Policies
Architecture
Deployment
Oct 01 2007
 

Part of a series on Sun’s OpenID@Work initiative; see the introduction for more context.

Trust is always an issue on the web. People don’t usually even think about it, but they trust the DNS server to point their browser at the right web site when they click on a link, they trust the web server to serve up the right page, they trust their online bank not to broadcast their credit card numbers to the world, and so on. As end-users we can’t do anything about most of those, but there are some things we can do, such as not giving banking details to sites that don’t look like our bank’s, or only giving out our social insurance numbers when we really have to. Knowing some of the issues and potential problems is important – you want to verify as much as possible whether your trust in the site is justified. So you don’t click on links in emails that don’t quite look right, and you check whether the little “locked” sign is present (assuming your browser hasn’t been hacked). Lots of people don’t trust internet systems with their personal data at all, deciding that the advantages of online interactions are outweighed by the potential damage if something goes wrong (there’s that risk assessment again that I talked about in the Business Purpose posting of this series).

So what’s this got to do with OpenID? Quite a lot, actually.

OpenID is an untrusted protocol, at least in version 1.1, which is the one we deployed on the OpenID IdP, and that’s likely to be true for version 2.0 as well, although that isn’t finished yet. As the OpenID web site says: “This is not a trust system.” Among other things, you don’t know anything about the site you’re logging into: it might be genuine, it might be a phishing site, it might be some other rogue site. And there’s currently no way for the Identity Provider to know. In other words, the fact that you can log into a site with your openid identifier tells you nothing about what that site might do with any data or information you give it. Which is one good reason why Sun’s OpenID IdP does not hand over information from the user’s account to the consuming site (relying party) unless the user agrees to it. You’re the person logging in; you can decide whether to trust that site with any information, whether that’s your openid identifier, your name (possibly fake), or your email address. And Sun’s system doesn’t ask for or store your date of birth, so if some site wants it (“why?” would always be the right question to ask), feel free to answer correctly or with some completely random date (in fact, many privacy advocates say you should never tell any web site your real date of birth if there’s any way of legally avoiding it). Even handing over your openid identifier to some site can cause problems, if the site then uses it for purposes you didn’t expect and don’t agree to. Since this is an opt-in system for personal use, Sun wouldn’t bear any liability if you did fall prey to a phisher or other rogue while using your Sun openid identifier.
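The opt-in attribute release described above can be sketched roughly as follows. The field names (fullname, email, dob) come from the OpenID simple registration extension; the consent logic and data structures are invented for illustration and are not Sun’s actual implementation.

```python
# Hypothetical sketch of opt-in attribute release by an OpenID IdP.
# Field names (fullname, email, dob) are from the OpenID Simple
# Registration (sreg) extension; everything else is illustrative.

def release_sreg_fields(requested, account, user_approved):
    """Return only the sreg fields the user explicitly agreed to share."""
    released = {}
    for field in requested:
        if field == "dob":
            continue  # the IdP described here never stores date of birth
        if field in user_approved and field in account:
            released[field] = account[field]
    return released

account = {"fullname": "A. Pseudonym", "email": "someone@example.com"}
requested = ["fullname", "email", "dob"]   # what the relying party asks for
user_approved = {"fullname"}               # what the user ticked on the consent page

print(release_sreg_fields(requested, account, user_approved))
# {'fullname': 'A. Pseudonym'}
```

The point of the sketch is simply that the relying party’s request and the user’s consent are separate inputs: the IdP releases the intersection, nothing more.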

The upshot of this is that OpenID shouldn’t be used for what are called high-value transactions, at least in its current incarnation. High-value transactions are things such as logging in to your banking system, or releasing sensitive personal information such as your medical history. Typing “openid phishing” or “openid attacks” into your favourite search engine will give you some idea of the sorts of attacks that are currently possible. Some of these will be relatively easy to mitigate, and some aren’t really worth mitigating for the sorts of use cases that OpenID was designed for, as they would make the resulting protocol much harder to implement and deploy. And let’s face it, the idea behind OpenID was to have something easy and lightweight to deploy that meets some, but not all, authentication use cases.

Related articles include Steven Nelson’s So you wannabe an OpenID provider?, Eve Maler’s A Tincture of Trust, and Yvonne Wilson’s Trusted Sources of Information. Simon Willison has a slightly different take in Designing for a security breach. And if you want a more formal definition of trust and some of the issues around it, try Trust Modeling for Security Architecture Development.

Sep 25 2007
 

Part of a series on Sun’s OpenID@Work initiative; see the introduction for more context.

Probably the biggest debate we had in the entire policy discussion was whether to let Sun employees use fake or fictitious names, or whether to force the use of real names in what the OpenID simple registration extension calls the fullname. The discussion has value outside the narrow scope of an OpenID IdP, and the arguments we went through reflect the importance of the issue for any sort of identity management system.

Note on terminology: in this post, I’ll use the term “name” to mean the OpenID “fullname”.

There are two competing principles at work here, and making a decision as to whether to allow fake names and non-identity-revealing openid identifiers depends on which is considered more important. The argument for allowing fictitious names is based on privacy, and the principle that any time you can allow the user to retain privacy, you should. Storing Personally Identifiable Information (PII) should be avoided whenever possible. Since the OpenID service that we’re providing is an opt-in, personal service that Sun employees do not need to use for any Sun business processes, there is no business reason that requires the use of their real names (auditing accesses to certain files, for example, would require knowing the user’s real name, so these processes can’t use these openid identifiers). Even in the case of some store giving a discount to a Sun employee, the store needs to know where to ship the item and which credit card to charge it to, but the OpenID IdP doesn’t need to know any of that information. The IdP verifies only that the user is a Sun employee, nothing more. So the privacy advocates are in favour of allowing fake names, email addresses that aren’t Sun addresses, and storing as little information as possible. Of course, if someone wants to be really private, they shouldn’t use an openid identifier from Sun, as that divulges the piece of information that they are a Sun employee.

The case against allowing the use of fake names is a security and liability one. If someone can use a fake name, that means they can also use someone else’s name or an openid identifier that might lead people to believe the user is someone they’re not. Since Sun is providing the OpenID service, people might think that Sun is also guaranteeing the veracity of information about the user other than the mere fact that they work for Sun (we’re not, Sun verifies only that the user is a Sun employee, nothing else). Such impersonation could cause reputation damage that could take some time to repair, particularly if the user does something stupid or illegal.

The solution we came up with was a compromise. Users can choose a fake name, a non-Sun email address, and an openid identifier that doesn’t say anything about them. The OpenID IdP stores the information about which Sun employee signed up for that openid identifier, so in the event of a problem, we can trace it back. When a Sun employee leaves the company, the openid account is made inactive, and it’s deleted after 6 months. This way there’s a time gap if someone else wishes to use the same openid identifier, and 6 months is a reasonable amount of time to keep such records in case there’s a problem. We also keep the web server logs for 6 months; since these contain records of which openid identifier visited which site (though not where they went or what they did once there), they are only visible for compliance reasons (I’ll talk more about data governance in another post). And finally, the user policy states specifically that impersonation is not allowed, and that information about who signed up for each openid identifier is stored for compliance reasons. Telling users that we know who they are and what their openid identifier is may help prevent problems; at least that’s the hope.

If the policy is abused, then we may have to change it, but so far we don’t know of any problems. Sun’s experience with bloggers has shown that people do take their responsibilities as Sun employees seriously, and are careful what they say and how they say it, and we saw no reason why that should be any different for Sun employees using the OpenID service. Of course, there’s no way of making sure that people really do read the policy, just like there’s no way to make people read the licences for software packages that they install, but at least the information is available for those who care to look. And to sign up for an account they have to agree to a disclaimer that contains the most important parts of the policy as well, so there’s some hope that they will read it.

A related post is Yvonne Wilson’s User-centricity, Trust: Technology or Practice?.

Sep 21 2007
 

Part of a series on Sun’s OpenID@Work initiative; see the introduction for more context.

Data governance is the term used for knowing what happens to the data that is stored, particularly when that data has any PII (personally identifiable information), which the OpenID IdP does. Using OpenID isn’t the reason we keep this information; any registration system keeps at least some information about the people who have accounts on it, even if it’s only a name, email, and password (or openid identifier). I thought it might be useful to others to see some of the basic steps that we went through when discussing how to protect that PII, and some of the decisions we made on what data to keep and what not. If you’re setting up a registration system yourself, you may make completely different decisions, depending on what information you’re keeping and what your registration system is being used for.

Obviously, step 1 is to make someone responsible for figuring it out. In our case, that person was me, with the grand title of “Data Steward” in Sun’s process. Yes, there’s a process to be followed and checklists to be filled out, and people whose job it is to help us figure it all out (the Chief Privacy Office with Michelle Dennedy and her team). What you need to do is:

  1. figure out what data you need to have, whether for technical or policy reasons
  2. figure out who will need access to the data
  3. figure out how to prevent people accessing the data who don’t need access
  4. figure out when you can destroy the data
  5. write the decisions up and make the information available

What data needs to be kept?

In this service, people can use fake names, but often choose to use their real ones. For compliance reasons, in case there needs to be an investigation into an allegation of wrong-doing by a user, we need to keep the employee ID that was used to sign up for the openid identifier. Even after the openid account is closed, the information is kept for a set period of time to allow any problems to surface. Yes, the users are warned about this during the registration process.

The web server logs are in the Common Log Format, which includes a record of the HTTP GET request from the consuming site (relying party) asking for authentication of the openid identifier. This HTTP GET request includes the openid identifier and the site’s URL, thus allowing correlation of who went where (though not what they did after logging in). This happens with every OpenID Identity Provider that has web server logs, which I would guess is basically all of them, so it’s certainly not a problem that is specific to Sun’s service. Every OpenID IdP could perform such correlations about their users. This is not necessarily a problem, and some people would say that allowing people to see that this openid identifier was used in different places allows reputations to be built, but it also has privacy implications. I might not want my employer (or anyone else, for that matter) knowing what sites I visit, how often, and when. So on principle we mask the data, so that we can see how often a site is visited, but not who’s doing the visiting.
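To make the correlation concrete, here is a rough sketch (not Sun’s actual tooling) of how a Common Log Format entry for an OpenID checkid request exposes who went where, and how hashing the identifier masks the “who” while keeping the “where” countable. The log line and endpoint URLs are made up; a real deployment would likely prefer a keyed hash, since a plain hash of a known identifier can be reversed by guessing.

```python
# Illustrative only: why CLF entries allow user/site correlation, and how
# masking removes it. Parameter names follow OpenID 1.1 checkid requests.
import hashlib
from urllib.parse import urlparse, parse_qs

log_line = ('192.0.2.1 - - [21/Sep/2007:10:00:00 +0000] '
            '"GET /openid?openid.mode=checkid_setup'
            '&openid.identity=http://openid.example.com/alice'
            '&openid.return_to=http://consumer.example.org/finish HTTP/1.1" 200 512')

def correlate(line):
    """Extract (who, where) from a CLF entry for an OpenID checkid request."""
    request = line.split('"')[1]   # the quoted request section of the CLF line
    url = request.split()[1]       # the path+query part of the GET
    params = parse_qs(urlparse(url).query)
    return params["openid.identity"][0], params["openid.return_to"][0]

def mask(identity):
    """One-way hash: site statistics survive, the identifier does not."""
    return hashlib.sha1(identity.encode()).hexdigest()[:12]

who, where = correlate(log_line)
print(where, mask(who))   # the site is still countable, the user is masked
```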

Who needs access to the data?

If there is an allegation of wrongdoing on the part of a user, then Corporate Compliance may need access to the information about whose openid identifier it is, and access to the web server logs showing whether the user actually did log in to the web site in question. This data is only passed on after review of the allegations by Sun’s legal team.

Apart from that, support personnel need access to the openid accounts to help people with things like forgotten passwords (if they forgot to set a secret question), or deleting the account on a voluntary basis. The user has to file a support request using Sun’s internal support system, and the employee ID of the person filing the request has to match that of the owner of the account.

Engineering may need access to some of the files for debugging. There is also a script that runs over the web server logs and extracts records of which sites were visited and when, discarding all information about who the user was who visited that site.

Restrict access

Only a few people have access to the accounts: support, engineering, and me as data governance steward. That access is controlled through operating-system access control. The same applies to the logs, and everyone who has access has gone through training to ensure they know the privacy conditions applying to the use of the information (i.e., it is used only for debugging, or for support once the user’s identity has been verified, as above).

As a side-note, to log in to my account on the machines, I have to log in to Sun’s internal network, ssh from there to the machine I want to access and then log in with my standard Sun credentials followed by a one-time password that uses a challenge-response mechanism with a secret passphrase. Then I need to su to the appropriate user account, using yet another password (of course).

Destroying Data

Once an account has been deactivated, either because the employee left Sun, or because they asked for it to be deleted, it remains inactive for 6 months. Once that time has passed, the account is deleted. The web server logs are deleted automatically after 6 months. This time was chosen as it seemed to meet both the privacy principles (delete as soon as possible) and the corporate compliance principles (keep around for a reasonable length of time, just in case it’s needed).
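The retention rule above amounts to a simple periodic purge. Here is a minimal sketch, with invented account fields and roughly 183 days standing in for the six-month window:

```python
# A minimal sketch of the 6-month retention rule described above; the
# account structure and field names are invented for illustration.
from datetime import date, timedelta

RETENTION = timedelta(days=183)   # roughly six months

def purge_inactive(accounts, today):
    """Keep active accounts and those within the retention window."""
    return [a for a in accounts
            if a["deactivated_on"] is None
            or today - a["deactivated_on"] <= RETENTION]

accounts = [
    {"openid": "alice", "deactivated_on": None},              # active: kept
    {"openid": "bob",   "deactivated_on": date(2007, 1, 2)},  # past 6 months: deleted
    {"openid": "carol", "deactivated_on": date(2007, 8, 1)},  # within 6 months: kept
]

print([a["openid"] for a in purge_inactive(accounts, date(2007, 9, 21))])
# ['alice', 'carol']
```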

Documenting

Once it was all figured out, and reviewed by the privacy specialists in Sun, documenting it was the easy part (just like writing standards, really; coming to the consensus is the difficult bit). So we have information in the disclaimer that people need to agree to when they sign up for an account, in the user policy, and in the FAQ, and the more formal checklists etc. are available from the Sun-internal project site. And people can always ask me, or email one of the mailing lists we have, if they have any questions.

Sep 20 2007
 

Part of a series on Sun’s OpenID@Work initiative; see the introduction for more context.

One of the interesting things about security is that you can never make anything 100% secure. You need to figure out what the risks are, how likely they are to occur, and what the damage will be if something bad does happen, and then make your plans accordingly. In most countries I’ve lived in, that means putting locks on the house doors and using them; in Canada we also have a security alarm, but none of the apartments I lived in in Germany had one. Different countries, different risks (houses are often easier to break into than apartments that aren’t on the ground floor), and different plans for minimizing risks.

So it is with computer systems, and with the OpenID IdP we put up. The amount of effort that is worth putting into securing a system depends on how important the system is, and what the expected damage is if something goes wrong. So in the formal security review of the system, one of the first questions was, what’s the purpose of this system? How does that purpose balance the risks of running it? Any time you make information available via the web, there is a risk that the information will be stolen or compromised so you need to know where that might happen, what the probability is of it happening, what the expected damage is, etc.

The business purpose for the OpenID IdP was, and still is, to gain experience in using OpenID, and to make openid identifiers available to Sun employees on an experimental, opt-in basis. Sun employees do not use OpenID for any mission-critical or important business applications within Sun. A couple of the reasons for that are that this is an experimental service, not guaranteed to be available 24/7 and with only limited user support, and that OpenID is an untrusted protocol. It has some well-known susceptibilities to phishing and other attacks, only some of which can be mitigated by good programming (at least in version 1.1, the version we deployed, since 2.0 isn’t finished yet). So this service was expressly made available to Sun employees for their personal, not business, use. The fact that it also guarantees that a person with an authenticated openid.sun.com OpenID is a Sun employee is almost a side-effect. We thought that maybe some consumer sites (or relying parties) might offer special deals or whitelisting advantages for Sun employees, but we haven’t seen any yet. Yes, we’re on the whitelist at AOL, but I’m not sure what advantage that’s going to bring.

So, what are the results of our experiment? If you look at it in terms of what our little project group learned in terms of putting up an experimental test deployment, it was great. I got to play around with OpenSSO code and learn more about load balancing than I did previously. (As a reminder, OpenSSO is open source, as is the OpenID extension we used, so feel free to download them and try them out.) We get a lot of queries from people both within and outside of Sun wanting to know what OpenID is about, how it works, what people use it for, all of which we can answer on the basis of “well, in our deployment it looks like this”.

In terms of how many people actually use the service each week? Well, that number is pretty low. Under 35 accesses of some consumer site (relying party) per week, most weeks. I have my own theories as to why this is the case; the most obvious to me is that it’s harder to use OpenID than the alternative username/password approach. On all the sites I use that are OpenID-enabled, I need to have an account already and then can use my openid identifier as an alternative means to log in. But if I already have a username and password stored in my browser, it’s only one click to use that, whereas to use my openid identifier, I have to click on the icon, fill in the openid identifier, wait until it redirects, sign in at the Sun OpenID IdP, wait until it redirects again… it just takes a lot longer. Being the paranoid type that I am, I have added my openid information to some of these sites so that if I forget my password, or lose it when I reinstall the OS, I have a back-up login method, but that’s not reason enough to use my openid identifier regularly. In the absence of some special deal for Sun employees, or a site enabling login without registration, there just isn’t enough motivation for me to go through those extra steps.
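The round trips behind those redirects can be sketched as below. The URLs are placeholders, and the association and signature handling that a real OpenID 1.1 consumer performs is omitted; this only shows why the flow takes more steps than a stored username/password.

```python
# Sketch of the OpenID 1.1 redirect dance. URLs are placeholders;
# association/signature steps of the real protocol are omitted.
from urllib.parse import urlencode

def begin_login(claimed_id, idp_endpoint, return_to):
    """Build the URL the consumer redirects the browser to at the IdP."""
    params = {
        "openid.mode": "checkid_setup",
        "openid.identity": claimed_id,
        "openid.return_to": return_to,
    }
    return idp_endpoint + "?" + urlencode(params)

# 1. user types the identifier into the consumer site's login form
# 2. consumer discovers the IdP and redirects the browser there:
url = begin_login("http://openid.example.com/me",
                  "https://idp.example.com/openid",
                  "http://consumer.example.org/finish")
print(url)
# 3. user authenticates at the IdP (the sign-in step the post mentions)
# 4. IdP redirects the browser back to return_to with an id_res response
```

Steps 2 and 4 are each a full browser redirect, which is where the extra waiting comes from.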

Getting back to the risk and security issue, we did make the system secure for the things we thought really important. We are using commercial-grade software (OpenSSO is the open source variation of Access Manager) to keep people’s information secure, and users are not allowed to use the same user name or password that they use for Sun’s internal systems, just in case they’re stolen by some rogue site. We use HTTPS for everything except the openid identifier itself and the system has been tested to ensure it responds appropriately to a number of expected exploits. So users don’t have to worry about their information being compromised, as long as they don’t give it away themselves. The one weak spot is that we use password-based authentication, which is more susceptible to phishing than some other systems; more about the reasons for that in a later post.

Sep 19 2007
 

This is the first of a series of posts on Sun Microsystems’ OpenID@Work service, which is an OpenID Identity Provider available for use by Sun employees.

[Update: I was asked what the purpose of these postings is – it’s simply to share our experiences in the hope that they’re helpful to others.]

I was part of the team that put up the OpenID Identity Provider. I wrote a lot of the pages, revamped Sun’s default style sheet to work with the HTML I wanted on the pages, and took part in all the discussions about policies and security. I’m also the “data steward” for the IdP, responsible for ensuring that our policies regarding data privacy are carried out. Given that range of tasks in the project, it’s no surprise that when we divvied up the areas for blogging, I picked the policy questions, and other people on the team will blog about other areas. We’ll be cross-linking to each other’s posts, of course. For example, here’s Gerry’s introduction.

One of the good things about working for Sun is that there are a lot of people with relevant expertise, who also understand the need to be flexible. We spent a lot of time discussing the user policy with the people in the Chief Privacy Office (who also let me write it in language people can understand), we had security experts review not only the deployment but also the OpenID specification (they’ll be blogging more on those aspects themselves), and on the technical side many people went out of their way to help. As an example, I spent most of one weekend trying to figure out a weird MIME type problem with the web server with Murthy Chintalapati (aka cvr), him emailing “try this”, me emailing back “nope, didn’t work” until we eventually solved the problem. In this series I’m going to be talking about a few of the issues we discussed, and how we resolved them. This is not to say we came up with perfect solutions, or that they are necessarily applicable to other companies or circumstances, but at the very least they will give you things to think about if you’re considering a similar project.

We were heavily influenced by Sun’s experience with blogging, to the extent that many of our discussions about “should we do this” were answered by “blogs.sun.com did it successfully and here’s how”. The similarity between the user policy documents is no coincidence, for example.

If you’re looking for technical documentation on Sun’s OpenID system, try Hubert Le Van Gong’s infrastructure description and OpenID @ Work – Architecture.
