Anyway - Page 34 - meandering thoughts from Lauren Wood

WordPress and File Masks

I upgraded to WordPress 2.3 at the weekend. Everything seemed to upgrade properly with no database errors, but I was getting a 500 Internal Server Error when I tried to look at the site pages. The error logs contained the answer – error: file is writable by others, with a pointer to the main index.php file. This seemed a little odd to me, but I looked at the file permissions and sure enough, the index.php file (and a whole lot of others) was group-writable. I changed the permissions on the directories from 775 to 755, and on the files from 664 to 644, and then everything worked just fine.
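
For anyone hitting the same error, here is a minimal sketch of that fix in Python. It assumes a POSIX system and that the only change wanted is to clear the group and other write bits (so 775 becomes 755 and 664 becomes 644). The function names are made up for illustration, and the directory you pass in is up to you.

```python
import os
import stat
from pathlib import Path

# Write bits for group and others; clearing them turns 775 into 755
# and 664 into 644 while leaving every other permission bit alone.
GROUP_OTHER_WRITE = stat.S_IWGRP | stat.S_IWOTH


def tightened_mode(mode: int) -> int:
    """Return the permission bits of `mode` with group/other write removed."""
    return stat.S_IMODE(mode) & ~GROUP_OTHER_WRITE


def tighten_tree(root: str) -> list[str]:
    """Clear group/other write bits on `root` and everything below it.

    Returns the list of paths whose permissions were changed.
    """
    changed = []
    for path in [Path(root), *Path(root).rglob("*")]:
        if path.is_symlink():
            continue  # leave symlinks alone; chmod would follow them
        current = stat.S_IMODE(path.stat().st_mode)
        wanted = tightened_mode(current)
        if wanted != current:
            os.chmod(path, wanted)
            changed.append(str(path))
    return changed
```

Calling tighten_tree on a copy of the site first and checking the returned list is a cheap way to see what would change before touching the live directory.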

I also changed the stylesheet; still tweaking but it’s mostly done. Comments welcome!

Sun’s OpenID IdP: Trust

Part of a series on Sun’s OpenID@Work initiative; see the introduction for more context.

Trust is always an issue on the web. People don’t usually even think about it, but they trust the DNS server to point their browser at the right web site when they click on a link, they trust the web server to serve up the right page, they trust their online bank to not broadcast their credit card numbers to the world, etc. etc. We as end-users can’t do anything about most of those, but there are some things that we can do, such as not giving banking details to sites that don’t look like our bank’s, or only giving out our social insurance numbers when we really have to. Knowing some of the issues and potential problems is important — you want to verify as much as possible whether your trust in the site is justified. So you don’t click on links in emails that don’t quite look right, and you check whether the little “locked” sign is present (assuming your browser hasn’t been hacked). Lots of people don’t trust internet systems with their personal data at all, deciding that the advantages of online interactions are outweighed by the potential damage if something goes wrong (there’s that risk assessment again that I talked about in the Business Purpose posting of this series).

So what’s this got to do with OpenID? Quite a lot, actually.

OpenID is an untrusted protocol, at least for version 1.1, which is the one we deployed on the OpenID IdP, and it’s likely to be true for version 2.0 as well, although that isn’t finished yet. As the OpenID web site says: “This is not a trust system.” Among other things, you don’t know anything about the site you’re logging into: it might be genuine, it might be a phishing site, it might be some other rogue site. And there’s no way currently for the Identity Provider to know. In other words, the fact that you can log into a site with your openid identifier says nothing about what that site might do with any data or information you might give it. This is one good reason why Sun’s OpenID IdP does not hand over information from the user’s account to the consuming site (relying party) unless the user agrees to it. You’re the person logging in, so you can decide whether to trust that site with any information, whether that’s your openid identifier, or your name (possibly fake) or email address. And Sun’s system doesn’t ask for or store your date of birth, so if some site wants it (why would always be the right question to ask), feel free to answer correctly or with some completely random date (in fact, many privacy advocates say you should never tell any web site your real date of birth if there’s any way of legally avoiding it). Even handing over your openid identifier to some site can cause problems, if they then use it for purposes you didn’t expect and don’t agree to. Since this is an opt-in system for personal use, Sun wouldn’t bear any liability if you did fall prey to a phisher or other rogue while using your Sun openid identifier.

The upshot of this is that OpenID shouldn’t be used for what are called high-value transactions, at least in its current incarnation. High-value transactions are things such as logging in to your banking system, or releasing sensitive personal information such as your medical history. Typing “openid phishing” or “openid attacks” into your favourite search engine will give you some idea of the sorts of attacks that are currently possible. Some of these will be relatively easy to mitigate, and some aren’t really worth mitigating for the sorts of use cases that OpenID was designed for, as they would make the resulting protocol much harder to implement and deploy. And let’s face it, the idea behind OpenID was to have something easy and lightweight to deploy that meets some, but not all, authentication use cases.

Related articles include Steven Nelson’s So you wannabe an OpenID provider?, Eve Maler’s A Tincture of Trust, and Yvonne Wilson’s Trusted Sources of Information. Simon Willison has a slightly different take in Designing for a security breach. And if you want a more formal definition of trust and some of the issues around it, try Trust Modeling for Security Architecture Development.

Sun’s OpenID IdP: Real vs Fake

Part of a series on Sun’s OpenID@Work initiative; see the introduction for more context.

Probably the biggest discussion we had in the entire policy discussion was whether to let Sun employees use fake or fictitious names, or whether to force the use of real names in what the OpenID simple registration extension calls the fullname. The policy discussion has value outside of the narrow scope of an OpenID IdP, and the discussions we had reflect the importance of the issue for any sort of identity management system.

Note on terminology: in this post, I’ll use the term “name” to mean the OpenID “fullname”.

There are two competing principles at work here, and making a decision as to whether to allow fake names and non-identity-revealing openid identifiers depends on which is considered more important. The argument for allowing fictitious names is based on privacy, and the principle that any time you can allow the user to retain privacy, you should. Storing Personally Identifiable Information (PII) should be avoided whenever possible. Since the OpenID service that we’re providing is an opt-in, personal service that Sun employees do not need to use for any Sun business processes, there is no business reason that requires the use of their real names (auditing accesses to certain files, for example, would require knowing the user’s real name, so these processes can’t use these openid identifiers). Even in the case of some store giving a discount to a Sun employee, the store needs to know where to ship the item and which credit card to charge it to, but the OpenID IdP doesn’t need to know any of that information. The IdP verifies only that the user is a Sun employee, nothing more. So the privacy advocates are in favour of allowing fake names, email addresses that aren’t Sun addresses, and storing as little information as possible. Of course, if someone wants to be really private, they shouldn’t use an openid identifier from Sun, as that divulges the piece of information that they are a Sun employee.

The case against allowing the use of fake names is a security and liability one. If someone can use a fake name, that means they can also use someone else’s name or an openid identifier that might lead people to believe the user is someone they’re not. Since Sun is providing the OpenID service, people might think that Sun is also guaranteeing the veracity of information about the user other than the mere fact that they work for Sun (we’re not; Sun verifies only that the user is a Sun employee, nothing else). Such impersonation could cause reputation damage that could take some time to repair, particularly if the user does something stupid or illegal.

The solution we came up with was a compromise. Users can choose a fake name, a non-Sun email address, and an openid identifier that doesn’t say anything about them. The OpenID IdP stores the information about which Sun employee signed up for that openid identifier, so in the event of a problem, we can trace it back. When a Sun employee leaves the company, the openid account is made inactive. It’s deleted after 6 months. This way there’s a time gap if someone else wishes to use the same openid identifier, and 6 months is a reasonable amount of time to keep such records in case there’s a problem. We also keep the web server logs for 6 months; since these contain the records of which openid identifier visited which site (though not where they went or what they did once there) these are only visible for compliance reasons (I’ll talk more about the data governance in another post). And finally, the user policy states specifically that impersonation is not allowed, and that information about who signed up for each openid identifier is stored for compliance reasons. Telling the user that we know who they are and what their openid identifier is may help prevent problems, at least that’s the hope.

If the policy is abused, then we may have to change it, but so far we don’t know of any problems. Sun’s experience with bloggers has shown that people do take their responsibilities as Sun employees seriously, and are careful what they say and how they say it, and we saw no reason why that should be any different for Sun employees using the OpenID service. Of course, there’s no way of making sure that people really do read the policy, just like there’s no way to make people read the licences for software packages that they install, but at least the information is available for those who care to look. And to sign up for an account they have to agree to a disclaimer that contains the most important parts of the policy as well, so there’s some hope that they will read it.

A related post is Yvonne Wilson’s User-centricity, Trust: Technology or Practice?.

Navigating Sites

I was chatting with Norm Walsh this morning, and he pointed me at the navigation toolbar he uses for reading specifications. It’s one of those small things that makes the web world more functional. I often miss a couple of days of posts from some blogger on my not-quite-every-day list and this makes starting on one day and working backwards till I’ve caught up much easier. Well, at least for those blogs that implement the rel="prev" and rel="next" values on the <link> elements in the header.

Of course, after installing the Firefox toolbar, I discovered that the list of blogs that implements these useful links didn’t include mine. It isn’t an integral part of WordPress installations, but since there’s a plugin to do most things anyone ever wants to do, the quick solution (as opposed to programming it myself when I have time) lay just a few searches away. The META Relationship Links plugin does just what I needed.
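
To check whether a page exposes these links, something like the following sketch would do. It uses only Python's standard html.parser module. The class and function names are invented for this example, and the example.org URLs in the usage note are placeholders rather than anything the plugin actually emits.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect the href of <link rel="prev"> and <link rel="next"> elements."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # rel may hold several space-separated values, in any letter case.
        for rel in (attr.get("rel") or "").lower().split():
            if rel in ("prev", "next") and href:
                self.links[rel] = href


def find_rel_links(html: str) -> dict:
    """Return a dict such as {"prev": url, "next": url} for the given page."""
    parser = RelLinkParser()
    parser.feed(html)
    return parser.links
```

Feeding it a header containing <link rel="prev" href="https://example.org/older/"> and <link rel="next" href="https://example.org/newer/"> gives back both URLs; a page without such links gives back an empty dict, which is what my own blog would have returned before the plugin.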

Sun’s OpenID IdP: Data Governance

Part of a series on Sun’s OpenID@Work initiative; see the introduction for more context.

Data governance is the term used for knowing what happens to the data that is stored, particularly when that data has any PII (personally identifiable information), which the OpenID IdP does. Using OpenID isn’t the reason we keep this information; any registration system keeps at least some information about the people who have accounts on it, even if it’s only a name, email, and password (or openid identifier). I thought it might be useful to others to see some of the basic steps that we went through when discussing how to protect that PII, and some of the decisions we made on what data to keep and what not. If you’re setting up a registration system yourself, you may make completely different decisions, depending on what information you’re keeping and what your registration system is being used for.

Obviously, step 1 is to make someone responsible for figuring it out. In our case, that person was me, with the grand title of “Data Steward” in Sun’s process. Yes, there’s a process to be followed and checklists to be filled out, and people whose job it is to help us figure it all out (the Chief Privacy Office with Michelle Dennedy and her team). What you need to do is:

figure out what data you need to have, whether for technical or policy reasons
figure out who will need access to the data
figure out how to prevent people accessing the data who don’t need access
figure out when you can destroy the data
write the decisions up and make the information available

What data needs to be kept?

In this service, people can use fake names, but often choose to use their real ones. For compliance reasons, in case there needs to be an investigation into an allegation of wrong-doing by a user, we need to keep the employee ID that was used to sign up for the openid identifier. Even after the openid account is closed, the information is kept for a set period of time to allow any problems to surface. Yes, the users are warned about this during the registration process.

The web server logs are in the Common Log Format, which includes a record of the HTTP GET request from the consuming site (relying party) asking for authentication of the openid identifier. This HTTP GET request includes the openid identifier and the site’s URL, thus allowing correlation of who went where (though not what they did after logging in). This happens with every OpenID Identity Provider that has web server logs, which I would guess is basically all of them, so it’s certainly not a problem that is specific to Sun’s service. Every OpenID IdP could perform such correlations about their users. This is not necessarily a problem, and some people would say that allowing people to see that this openid identifier was used in different places allows reputations to be built, but it also has privacy implications. I might not want my employer (or anyone else, for that matter) knowing what sites I visit, how often, and when. So on principle we mask the data, so that we can see how often a site is visited, but not who’s doing the visiting.

Who needs access to the data?

If there is an allegation of wrongdoing on the part of a user, then Corporate Compliance may need access to the information about whose openid identifier it is, and access to the web server logs showing whether the user actually did log in to the web site in question. This data is only passed on after review of the allegations by Sun’s legal team.

Apart from that, support personnel need access to the openid accounts to help people with things like forgotten passwords (if they forgot to set a secret question), or deleting the account on a voluntary basis. The user has to file a support request using Sun’s internal support system, and the employee ID of the person filing the request has to match that of the owner of the account.

Engineering may need access to some of the files for debugging. There is also a script that runs over the web server logs and extracts records of which sites were visited and when, discarding all information about who the user was who visited that site.
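
The following is a rough sketch of what such a masking script could look like, not the script we actually run. It assumes Common Log Format lines and the OpenID 1.1 request parameters openid.identity, openid.return_to and openid.trust_root. The /server path and the host names in the usage note are made up for illustration.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Common Log Format: host ident authuser [date] "method target protocol" status bytes
CLF = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)')


def masked_visit(line: str):
    """Return (timestamp, site host) for an OpenID request line, else None.

    The client address and the openid.identity parameter are deliberately
    dropped, so the result says which site was visited and when, not by whom.
    """
    match = CLF.match(line)
    if not match:
        return None
    timestamp, target = match.group(4), match.group(6)
    params = parse_qs(urlsplit(target).query)
    # Prefer trust_root if the relying party sent it, else fall back to return_to.
    site_url = (params.get("openid.trust_root")
                or params.get("openid.return_to")
                or [None])[0]
    if site_url is None:
        return None  # not an OpenID authentication request
    site = urlsplit(site_url).hostname
    if site is None:
        return None
    return (timestamp, site)
```

Given a log line whose request is GET /server?openid.mode=checkid_setup&openid.identity=https%3A%2F%2Fidp.example%2Falice&openid.return_to=https%3A%2F%2Fshop.example%2Flogin, the function keeps only the timestamp and shop.example. Counting the results per site then gives visit frequency without any record of the visitor.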

Restrict access

Only a few people have access to the accounts: support, engineering, and me as data governance steward. That access is controlled through operating-system access control. The same applies to the logs, and everyone who has access has gone through training to ensure they know the privacy conditions applying to the use of the information (i.e., used only for debugging or support once the user’s identity is verified, as above).

As a side-note, to log in to my account on the machines, I have to log in to Sun’s internal network, ssh from there to the machine I want to access and then log in with my standard Sun credentials followed by a one-time password that uses a challenge-response mechanism with a secret passphrase. Then I need to su to the appropriate user account, using yet another password (of course).

Destroying Data

Once an account has been deactivated, either because the employee left Sun, or because they asked for it to be deleted, it remains inactive for 6 months. Once that time has passed, the account is deleted. The web server logs are deleted automatically after 6 months. This time was chosen as it seemed to meet both the privacy principles (delete as soon as possible) and the corporate compliance principles (keep around for a reasonable length of time, just in case it’s needed).
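
As an illustration of the retention rule, here is a small sketch of the date check. It assumes that "6 months" means calendar months counted from the deactivation date, clamped to the last day of a shorter month; the policy itself doesn't spell that detail out, and the function names are invented for the example.

```python
import calendar
from datetime import date

RETENTION_MONTHS = 6  # from the policy; how months are counted is an assumption


def add_months(start: date, months: int) -> date:
    """Add calendar months to a date, clamping the day to the month's length."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def due_for_deletion(deactivated_on: date, today: date) -> bool:
    """True once the retention period after deactivation has fully passed."""
    return today >= add_months(deactivated_on, RETENTION_MONTHS)
```

Under these assumptions an account deactivated on 1 October 2007 becomes due for deletion on 1 April 2008, and the same check could be applied to the date of a web server log file.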

Documenting

Once it was all figured out, and reviewed by the privacy specialists in Sun, documenting it was the easy part (just like writing standards, really: coming to the consensus is the difficult bit). So we have information in the disclaimer that people need to agree to when they sign up for an account, in the user policy, and in the FAQ; the more formal checklists etc. are available from the Sun-internal project site. And people can always ask me, or email one of the mailing lists we have, if they have any questions.

Facebook Apps and Profiles

So I’m not really into the Facebook thing, but occasionally I hop on and see what people are up to. I’ve noticed a few of my friends have some interesting looking apps on their profile pages, and figured I might try some of them out. Except that every single one I’ve looked at so far insists on getting access to my complete profile. Why? I can’t imagine why an application that pops up pictures of cute cats needs to know where I live, for example, or why any application needs my birthdate. It should be easy enough to assign some random identifier to my account for most needs that these apps have, such as counting how many subscribers they have. Why do they want more?