Learning Identity

One of the things I’ve found about try­ing to fig­ure out iden­tity man­age­ment con­cepts and tech­no­logy is that there are lots of nuances, lots of things to worry about, and it tends to make you more wary (which I guess is all to the good). I now am more care­ful about wheth­er web­sites have believ­able pri­vacy policies before I sign up for them, I have a num­ber of free email accounts for the sole pur­pose of get­ting news­let­ters or regis­ter­ing at web­sites, and I more often fig­ure the inform­a­tion on these web­sites is unlikely to be worth the effort. 

It’s exciting, though, being part of something that is important and whose importance people are realising more day by day, sort of like XML in the early days, when people started saying, yes, I do have that problem and maybe this technology can help solve it. So part of what I hope to do in the coming year is help the Liberty Alliance figure out how to help people learn what they need to know about some of these concepts, technologies, and specifications. Identity management is starting to expand beyond the “in group” now as more people start to realise the importance of building it (and security) into systems from the beginning rather than trying to bolt it on afterwards. Figuring identity out takes time: identity management is inherently complex (well, more complex than XML, anyway ;-)), and although Einstein’s famous quote says things should be made as simple as possible, it also says “but not simpler”.

One of the things that Liberty does to tell people about Liberty-related aspects of iden­tity is to host web­casts on a reg­u­lar basis. This month’s is on the People Ser­vice:

The Liberty ID-WSF People Ser­vice, a key com­pon­ent in ID-WSF 2.0, is the industry’s first com­pre­hens­ive plat­form for man­aging social inform­a­tion with­in an open fed­er­ated net­work envir­on­ment. People Ser­vice allows con­sumers and enter­prise users to man­age social applic­a­tions such as book­marks, blog­ging, cal­en­dars, photo shar­ing and instant mes­saging from a com­mon lay­er with­in the ID-WSF 2.0 frame­work. Liberty People Ser­vice has been developed to allow indi­vidu­als to eas­ily store, main­tain, and cat­egor­ize online rela­tion­ships so that oth­er socially-aware Web ser­vices applic­a­tions can lever­age inform­a­tion based on the con­sent and pri­vacy con­trols estab­lished by a user in the fed­er­ated social net­work. With Liberty Alli­ance People Ser­vice, con­sumers and enter­prise users can now cent­rally man­age all of their online social rela­tion­ships using a fed­er­ated net­work approach with pri­vacy con­trols built into the sys­tem allow­ing users to lever­age the pri­vacy func­tion­al­ity of Liberty Web Ser­vices to more eas­ily and securely share social and enter­prise inform­a­tion across applic­a­tions, plat­forms and ser­vice pro­viders. In this Web cast, we’ll over­view the func­tion­al­ity of People Ser­vice and provide some use case examples. You won’t want to miss this highly inform­at­ive session.

The web­cast is on this com­ing Wed­nes­day (Janu­ary 11, 2006) at 8 am Pacific; if you’re think­ing of listen­ing in please register soon (prefer­ably by Monday) so there will be enough phone lines booked.

Phishing Sophistication

I’m starting to be impressed by the (almost) sophistication of phishing attempts. The latest one in my inbox today contained a message from someone purporting to have bought an item via eBay that they hadn’t received; unless they heard back, they were going to complain to eBay and then the police. I can quite see some nervous seller who thinks there might be a mistake in the system clicking on the “log in to eBay message center” link (which of course doesn’t go to eBay at all) to try to rectify it.

Mind you, the spam fil­ters are also start­ing to become soph­ist­ic­ated — my ISP adds head­ers to the email mark­ing poten­tial spam and this one tripped a num­ber of meters, adding up to quite a lot of red flags. Some of them are, on their own, quite legit­im­ate of course, but not all:

    1.0 FROM_ENDS_IN_NUMS      
        From: ends in numbers
    1.3 RCVD_NUMERIC_HELO      
        Received: contains a numeric HELO
    1.0 MSGID_SPAM_CAPS        
        Message-ID =~ /^\s*< ?[A-Z]+\@(?!(?:mailcity|whowhere)\.com)/
    0.1 HTML_TAG_EXISTS_TBODY  
        BODY: HTML has "tbody" tag
    0.4 HTML_70_80             
        BODY: Message is 70% to 80% HTML
    0.1 HTML_FONTCOLOR_BLUE    
        BODY: HTML font color is blue
    0.7 MIME_HTML_ONLY         
        BODY: Message only has text/html MIME parts
    0.2 HTML_MESSAGE           
        BODY: HTML included in message
    0.3 HTML_FONT_BIG
        BODY: HTML has a big font
    1.1 MIME_HTML_NO_CHARSET   
        RAW: Message text in HTML without charset
    0.2 MIME_QP_LONG_LINE      
        RAW: Quoted-printable line longer than 76 chars
    0.4 NORMAL_HTTP_TO_IP      
        URI: Uses a dotted-decimal IP address in URL
    0.1 FORGED_HOTMAIL_RCVD2   
        hotmail.com 'From' address, but no 'Received:'
    3.0 FORGED_MUA_OUTLOOK     
        Forged mail pretending to be from MS Outlook
    0.6 MISSING_MIMEOLE        
        Message has X-MSMail-Priority, but no X-MimeOLE
    1.1 FORGED_OUTLOOK_HTML    
        Outlook can't send HTML message only
    1.1 MIME_HTML_ONLY_MULTI   
        Multipart message only has text/html MIME parts
    1.1 FORGED_OUTLOOK_TAGS    
        Outlook can't send HTML in this format
    3.0 SARE_MSGID_YAHOO       
        Message-ID is forged, (yahoo.com)
    1.1 HTML_MIME_NO_HTML_TAG  
        HTML-only message, but there is no HTML tag
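
To see how quickly those small flags add up, here is a minimal sketch that totals the scores from the report above. The rule names look like SpamAssassin's, so the 5.0 cut-off used here is SpamAssassin's usual default and only an assumption about how my ISP has things configured; `parse_scores` and `total_score` are hypothetical helper names.

```python
import re

# Scores and rule names copied from the report above (descriptions omitted).
REPORT = """\
1.0 FROM_ENDS_IN_NUMS
1.3 RCVD_NUMERIC_HELO
1.0 MSGID_SPAM_CAPS
0.1 HTML_TAG_EXISTS_TBODY
0.4 HTML_70_80
0.1 HTML_FONTCOLOR_BLUE
0.7 MIME_HTML_ONLY
0.2 HTML_MESSAGE
0.3 HTML_FONT_BIG
1.1 MIME_HTML_NO_CHARSET
0.2 MIME_QP_LONG_LINE
0.4 NORMAL_HTTP_TO_IP
0.1 FORGED_HOTMAIL_RCVD2
3.0 FORGED_MUA_OUTLOOK
0.6 MISSING_MIMEOLE
1.1 FORGED_OUTLOOK_HTML
1.1 MIME_HTML_ONLY_MULTI
1.1 FORGED_OUTLOOK_TAGS
3.0 SARE_MSGID_YAHOO
1.1 HTML_MIME_NO_HTML_TAG
"""

# A rule line is "<score> <RULE_NAME>"; anything else is ignored.
RULE_LINE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\s*$")

# Assumed threshold: SpamAssassin's common default, not taken from the report.
ASSUMED_THRESHOLD = 5.0


def parse_scores(report):
    """Return a list of (rule_name, score) pairs found in the report text."""
    hits = []
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if match:
            hits.append((match.group(2), float(match.group(1))))
    return hits


def total_score(hits):
    """Sum the scores, rounded to one decimal place like the report itself."""
    return round(sum(score for _, score in hits), 1)


hits = parse_scores(REPORT)
total = total_score(hits)
print(len(hits), "rules fired, total score", total)
print("over the assumed threshold:", total > ASSUMED_THRESHOLD)
```

No single rule here would cross that assumed threshold on its own; it is the accumulation that does it.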

After I saw this I promptly went and got the latest version of Pegasus Mail, which I use for my personal email. Pegasus has always had good anti-virus protection, has had decent spam filtering for some time, and shows the real URL that is being linked to on HTML emails, but it now advertises anti-phishing checks as well. It will be interesting to see how well they work in practice.

Woody to Sarge

I’ve been intending to upgrade my Debian firewall/blog box to the latest version, called ‘sarge’ (a.k.a. 3.1), for some months now. Today was the day I decided to finally bite the bullet. Since I’ve been using backports of unstable versions of software, such as MySQL (see Upgrading MySQL on Debian for that process, and Enabling Thumbnails for the process to upgrade libgd), I figured this could be a little trickier than I really like, and I should be prepared. Here’s the historical record of actually getting it running. YMMV, of course!

First, the documentation on the Debian web site is good. The upgrading instructions are written per hardware platform and seem complete. I started, as recommended in Upgrading your Woody system, by replacing the word “stable” in the /etc/apt/sources.list file with the word “woody” and then checking I had woody’s version of aptitude installed.
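
For anyone who would rather script that first edit than do it by hand, here is a rough sketch of the "stable" to "woody" substitution. The sample sources.list content and the `switch_suite` helper are hypothetical (your mirrors will differ); the only thing taken from the instructions is the idea of swapping the release name. It works on the suite field only, so a URL that happens to contain the word is left alone.

```python
def switch_suite(sources_text, old="stable", new="woody"):
    """Replace the suite name on active deb/deb-src lines, leaving comments alone."""
    out = []
    for line in sources_text.splitlines():
        fields = line.split()
        # An active entry looks like: deb <uri> <suite> <component>...
        if len(fields) >= 3 and fields[0] in ("deb", "deb-src"):
            suite = fields[2]
            # Also covers forms such as "stable/updates".
            if suite == old or suite.startswith(old + "/"):
                fields[2] = new + suite[len(old):]
                line = " ".join(fields)
        out.append(line)
    trailing = "\n" if sources_text.endswith("\n") else ""
    return "\n".join(out) + trailing


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# hypothetical /etc/apt/sources.list
deb http://ftp.debian.org/debian stable main contrib
deb-src http://ftp.debian.org/debian stable main contrib
deb http://security.debian.org/ stable/updates main
"""

print(switch_suite(SAMPLE))
```

Working on a copy of the file first, and comparing the result with the original, seems the safer way to use something like this.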

After copying the recommended files to a safe location (that’s a lot of files!), I deleted the /etc/apt/preferences file after saving a copy; this is the file that says which versions of any software to use. Since to begin with I want to use a clean, standard Debian sarge distribution, I don’t need this file. Then it was on to section 4.2.2, “Checking packages status”. I found that apt-get showed no holds, but aptitude showed that php4 was on hold (I can’t imagine why). So I got rid of the hold.

After that, I just fol­lowed the steps, tak­ing the defaults mostly (since I did­n’t under­stand some of the ques­tions, that was an easy choice! One day I might under­stand what pango and defoma are all about, but in the mean­time I’ve decided not to both­er). There were a couple of mes­sages that mostly seemed ignor­able (note to self: upgrade exim3 to exim4 at some stage in the future) and all in all the pro­cess ran smoothly, if not par­tic­u­larly fast on my old, slow Pen­ti­um box. 

Time to check the res­ults — try my web site and find it’s been replaced by a gen­er­ic “wel­come to an Apache web site” mes­sage. The web serv­er has been magic­ally upgraded to Apache 2.0, which I had­n’t quite expec­ted or planned for. Oh well, time to hit the Apache documentation.

There’s a big dif­fer­ence between Debi­an upgrade doc­u­ment­a­tion and Apache upgrade doc­u­ment­a­tion. Where the Debi­an upgrade instruc­tions are exactly that (“Do this, then this. Run this com­mand and if you get this out­put, do this, oth­er­wise do that”), the Apache doc­u­ment­a­tion on Upgrad­ing to 2.0 from 1.3 is basic­ally a list of fea­ture changes, rather than instruc­tions on how to upgrade or what modi­fic­a­tions need to be made to the con­fig­ur­a­tion files. Look­ing at the con­fig­ur­a­tion files them­selves in the Debi­an Sarge Apache 2 dis­tri­bu­tion you can see, for example, that httpd.conf has changed markedly from being the main con­fig­ur­a­tion file to con­tain­ing simply a com­ment say­ing it exists for back­wards com­pat­ib­il­ity only. The README file does have some clues to the new files, with short descrip­tions of what they’re used for. The most inter­est­ing new dir­ect­ory to me was sites-enabled, which seemed to have some­thing to do with set­ting up vir­tu­al hosts. So I typed sites-enabled into the Apache doc­u­ment­a­tion search engine and found no hits what­so­ever. The Vir­tu­al­Host part of the doc­u­ment­a­tion for Apache 2.0 says “Below is a list of doc­u­ment­a­tion pages which explain all details of vir­tu­al host sup­port in Apache ver­sion 1.3 and later.” Hmmm, things do seem to have changed some­what between Apache 1.3 and Apache 2.0. On the oth­er hand, it’s always pos­sible that this par­tic­u­lar con­fig­ur­a­tion and choice of dir­ect­ory names etc is due to Debi­an rather than Apache; the Debi­an dis­tri­bu­tions do have a repu­ta­tion for put­ting files in places that are unex­pec­ted and maybe this has exten­ded to the names used in the Debi­an fla­vour of the Apache install­a­tions. If this is the case it’s not sur­pris­ing it isn’t doc­u­mented on the Apache web site.

For­tu­nately oth­ers have writ­ten this up; I found Upgrad­ing to Apache 2 which described the pur­pose of the sites-enabled and sites-avail­able dir­ect­or­ies in ways that make sense and worked when I tried them out. The same prin­ciples apply to mak­ing the mod_rewrite mod­ule avail­able, which Word­Press uses for rewrit­ing the URLs for archives and categories.
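
As far as I can tell, the idea is that a site's configuration lives in sites-available and is switched on by a symbolic link from sites-enabled, which the main configuration then includes. The sketch below mimics that in a throwaway temporary directory rather than touching a real /etc/apache2; the `enable_site` helper, the site name "blog", and the virtual host contents are all made up for illustration.

```python
import os
import tempfile


def enable_site(apache_dir, site):
    """Symlink sites-available/<site> into sites-enabled/; safe to call twice."""
    available = os.path.join(apache_dir, "sites-available", site)
    if not os.path.isfile(available):
        raise FileNotFoundError(available)
    enabled_dir = os.path.join(apache_dir, "sites-enabled")
    os.makedirs(enabled_dir, exist_ok=True)
    link = os.path.join(enabled_dir, site)
    if not os.path.islink(link):
        # A relative target keeps the link valid if the tree is moved.
        os.symlink(os.path.join("..", "sites-available", site), link)
    return link


def enabled_sites(apache_dir):
    """List the names currently present in sites-enabled/."""
    enabled_dir = os.path.join(apache_dir, "sites-enabled")
    return sorted(os.listdir(enabled_dir)) if os.path.isdir(enabled_dir) else []


# Demonstration in a temporary directory standing in for the Apache config tree.
with tempfile.TemporaryDirectory() as apache_dir:
    os.makedirs(os.path.join(apache_dir, "sites-available"))
    site_file = os.path.join(apache_dir, "sites-available", "blog")
    with open(site_file, "w") as handle:
        # Hypothetical virtual host, for illustration only.
        handle.write("<VirtualHost *>\n    ServerName blog.example.org\n</VirtualHost>\n")
    link = enable_site(apache_dir, "blog")
    print(enabled_sites(apache_dir), "->", os.readlink(link))
```

Disabling a site would then just be a matter of removing the link, leaving the file in sites-available untouched.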

So far, so good. My web site is avail­able again, just not my blog. The error mes­sage is “Your PHP install­a­tion appears to be miss­ing the MySQL which is required for Word­Press”. When I check, all the neces­sary pack­ages are installed. A quick search through the Word­Press sup­port site turns up that I’ve for­got­ten to uncom­ment the MySQL mod­ule in the php.ini file. I’m so used to Debi­an just doing the right thing that it seems odd to have to make that change, some­how. Now my blog is back as well, everything else seems to be work­ing, no files seem to have been lost, and over­all the upgrade was a lot less pain­ful than I had anticipated.
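
The change itself is tiny: a line in php.ini that starts with a semicolon is a comment, and removing the semicolon switches the module on. Here is a small sketch of that edit; the exact line (`;extension=mysql.so`) and the `enable_extension` helper are assumptions for illustration, so check what your own php.ini actually contains.

```python
import re


def enable_extension(ini_text, name):
    """Uncomment ';extension=<name>' lines; every other line is left untouched."""
    pattern = re.compile(
        r"^[ \t]*;[ \t]*(extension[ \t]*=[ \t]*" + re.escape(name) + r")[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(r"\1", ini_text)


# Hypothetical excerpt, for illustration only.
SAMPLE_INI = """\
; hypothetical excerpt from php.ini
;extension=mysql.so
;extension=gd.so
"""

print(enable_extension(SAMPLE_INI, "mysql.so"))
```

The web server needs a restart afterwards before PHP picks up the change.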

Double Routing

Like prob­ably every oth­er com­puter geek out there, I do a cer­tain amount of help­ing friends set up their home sys­tems. This par­tic­u­lar friend knows noth­ing about net­works and fire­walls and the like, and just wanted some­thing secure that would allow her to have a reas­on­ably safe Win­dows box and the daugh­ter to have a reas­on­ably safe and vir­us-free Win­dows laptop. The easy bits were installing the spy­ware detect­ors (Ad-Aware and Spy­bot S&D) and the vir­us checker/utilities (Norton Sys­tem­Works); the tough bit was get­ting the routers to work.

The sys­tem that made most sense was to feed the DSL into a wired eth­er­net router with a built-in fire­wall (the D‑Link DI-604 has a reas­on­able price point and an integ­rated fire­wall) and then set up a wire­less point for the daugh­ter­’s laptop. So my friend got a Link­sys wire­less router (no fire­wall). We have this sys­tem at home, though with dif­fer­ent hard­ware (Linux fire­wall + Air­port wire­less) and it works just fine. So I was­n’t expect­ing any oddit­ies. I found the sup­port page on the Link­sys site that said to turn off the DHCP serv­er on the wire­less router, and to give it an IP address that fit­ted in with the IP setup of the wired router. That was easy enough to do. But some­how the laptop just nev­er man­aged to sync up.
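
"Fitted in with the IP setup" boils down to a simple check: the wireless router's address has to be in the wired router's subnet without being the wired router's own address. The sketch below adds one extra condition of my own, staying outside the wired router's DHCP pool so nothing else gets handed the same address. All the addresses and the pool range are hypothetical, as is the `usable_static_address` helper.

```python
import ipaddress


def usable_static_address(candidate, router_cidr, dhcp_start, dhcp_end):
    """True if candidate is a free-looking host address on the router's subnet."""
    router = ipaddress.ip_interface(router_cidr)
    network = router.network
    addr = ipaddress.ip_address(candidate)
    if addr not in network:
        return False  # different subnet: the two boxes could not talk directly
    if addr in (network.network_address, network.broadcast_address):
        return False  # reserved addresses on the subnet
    if addr == router.ip:
        return False  # clashes with the wired router itself
    # Assumption: keep clear of the pool the wired router hands out via DHCP.
    in_pool = ipaddress.ip_address(dhcp_start) <= addr <= ipaddress.ip_address(dhcp_end)
    return not in_pool


# Hypothetical setup: wired router at 192.168.0.1/24, DHCP pool .100 to .199.
ROUTER = "192.168.0.1/24"
POOL = ("192.168.0.100", "192.168.0.199")

for candidate in ("192.168.0.2", "192.168.1.1", "192.168.0.150", "192.168.0.1"):
    print(candidate, usable_static_address(candidate, ROUTER, *POOL))
```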

Ah, how good it was that I allowed more time than I expected to need to set it up! My basic idea was that ethernet comes out of the DSL modem, goes into the wired router in the uplink socket, then a cable comes out of the wired router and goes into the uplink socket of the wireless router. Still seems logical to me, but in this case my logic was completely wrong. Fortunately Linksys has live chat to tech support that works on a Saturday (good move, people!) and Melrose didn’t need very long to figure out the problem and tell me to put the cable coming out of the wired router into one of the 4 regular sockets. This worked just fine; the laptop synced up, my friend (and her daughter) are happy and think I know exactly what I’m doing, while I’m still slightly baffled and wondering what’s wrong with my simple hose-pipe analogy of internet connections. Still, I now know empirically what to do, so that’s the important thing.

Bring out Your Votes!

Work­ing in small tech­nic­al com­mit­tees on well-con­strained prob­lems can be really reward­ing; the small group allows for a cer­tain amount of fun in the meet­ings and every­one knows they have a role to play. I chair the OASIS Entity Res­ol­u­tion TC, which is work­ing on XML Catalogs.

The idea of catalogs has been around for a long time; it was one of the first pieces of work to come out of SGML Open, the precursor to OASIS. We’ve updated them for XML and use on the Web, and I think it’s a good piece of work, although we spend a lot of time explaining that entity resolution is not restricted to XML entities: we use the word “entity” in the more general sense, i.e. we really mean “resource” in today’s terminology (see the FAQ for more on this). Mind you, having Norm edit it and write code to implement it does help immensely.

So now it’s time to vote! We need anoth­er 44 OASIS mem­ber com­pan­ies to vote (we need to reach a total of 47 “Yes” votes to pass) — so please pass this on to any vot­ing reps you know (yes, this is a shame­less lob­by­ing act for some­thing I think is worth­while). The bal­lot is at Approve XML Cata­logs v1.1 as an OASIS Stand­ard. Many thanks!

Some sup­port­ing inform­a­tion from the TC:

XML doc­u­ments and data often ref­er­ence oth­er extern­al resources. Often the ref­er­en­cing inform­a­tion is not suf­fi­cient to loc­ate the desired resource unam­bigu­ously, or the resource is not access­ible at the giv­en loc­a­tion at the time it is required, or it is prefer­able that an altern­ate resource be used in place of the ref­er­enced resource. 

For example:

  1. Extern­al iden­ti­fi­ers may require resources that are not always avail­able. For example, a sys­tem iden­ti­fi­er that points to a resource on anoth­er machine may be inac­cess­ible if a net­work con­nec­tion is not available. 
  2. Extern­al iden­ti­fi­ers may require pro­to­cols that are not access­ible to all of the tools on a single com­puter sys­tem. An extern­al iden­ti­fi­er that is addressed with the FTP pro­tocol, for example, is not access­ible to a tool that does not sup­port that protocol. 
  3. It is often con­veni­ent to access resources using sys­tem iden­ti­fi­ers that point to loc­al resources. Exchan­ging doc­u­ments that refer to loc­al resources with oth­er sys­tems is prob­lem­at­ic at best and impossible at worst. 
  4. Incom­ing XML doc­u­ments may ref­er­ence cus­tom­ized ver­sions of stand­ard XML schem­as. To pro­tect your sys­tems, it is neces­sary to remap the schema ref­er­ences so that known, trus­ted cop­ies of the schem­as are used. 

Entity Res­ol­u­tion is the pro­cess by which these resource ref­er­ences can be mapped to anoth­er ver­sion of the ref­er­ence that can be found or that is pre­ferred for oth­er reas­ons. To address these issues, the OASIS XML Cata­log spe­cific­a­tion defines an applic­a­tion-inde­pend­ent entity cata­log that maps extern­al iden­ti­fi­ers and URI ref­er­ences to (oth­er) URI references. 
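
To make the mapping concrete, here is a deliberately tiny sketch of a catalog lookup. The catalog namespace and the `system`, `public`, and `uri` entry types come from the specification; everything else (the example.org identifiers, the local file paths, the helper names) is made up for illustration. It tries system entries before public ones and ignores the `prefer` attribute, rewriting, delegation, and `nextCatalog`, all of which a real resolver has to handle.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog: every identifier and path below is invented.
CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Example//DTD Memo 1.0//EN"
          uri="file:///opt/trusted/memo.dtd"/>
  <system systemId="http://example.org/schemas/order.xsd"
          uri="file:///opt/trusted/order.xsd"/>
  <uri name="http://example.org/styles/report.xsl"
       uri="file:///opt/trusted/report.xsl"/>
</catalog>
"""

# Which attribute holds the lookup key for each entry type.
KEY_ATTRIBUTE = {"system": "systemId", "public": "publicId", "uri": "name"}


def load_catalog(xml_text):
    """Build one lookup table per entry type from the catalog text."""
    root = ET.fromstring(xml_text)
    table = {kind: {} for kind in KEY_ATTRIBUTE}
    for kind, attr in KEY_ATTRIBUTE.items():
        for entry in root.iter("{%s}%s" % (CATALOG_NS, kind)):
            table[kind][entry.get(attr)] = entry.get("uri")
    return table


def resolve_external(table, public_id=None, system_id=None):
    """Map an external identifier; system entries are tried before public ones."""
    if system_id in table["system"]:
        return table["system"][system_id]
    if public_id in table["public"]:
        return table["public"][public_id]
    # No match: this sketch simply falls back to the original system identifier.
    return system_id


def resolve_uri(table, uri):
    """Map a plain URI reference, returning it unchanged when nothing matches."""
    return table["uri"].get(uri, uri)


table = load_catalog(CATALOG)
print(resolve_external(table, system_id="http://example.org/schemas/order.xsd"))
print(resolve_external(table, public_id="-//Example//DTD Memo 1.0//EN",
                       system_id="http://elsewhere.example.org/memo.dtd"))
print(resolve_uri(table, "http://example.org/styles/report.xsl"))
```

The second lookup is the interesting one: the system identifier points somewhere the catalog knows nothing about, but the public identifier still lands on the trusted local copy.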

Entity res­ol­u­tion cata­logs have already been widely imple­men­ted in much deployed soft­ware. Pro­mot­ing the OASIS XML Cata­log spe­cific­a­tion to an OASIS Stand­ard is cru­cial for con­tin­ued inter­op­er­ab­il­ity of XML applications.

Invisible Files

What do you do when you need the answer to a ques­tion and Google does­n’t deliv­er? Ask on the blog of course… I would really like to know the answer to this one, as it would save a large amount of irrit­a­tion and I assume oth­ers have the same prob­lem. I’ve spent hours bur­ied deep in search engine res­ults with no luck. 

As befits a fam­ily with jobs in the com­puter industry, we have a few com­puters spread around the house, all con­nec­ted with a decent home net­work and pro­tec­ted with a good Linux-based fire­wall (which also serves this blog). The com­puters run a num­ber of oper­at­ing sys­tems — Win­dows 2000, Win­dows XP, Mac OS X, Sol­ar­is. The prob­lem only appears with the Win­dows XP boxes — or rather, between them. For some reas­on, one Win­dows XP box can­’t see all the files and folders on the oth­er Win­dows XP box, although they’re quite vis­ible from both Win­dows 2000 and OS X. The odd thing is that some files and folders are vis­ible, often some files in a giv­en folder will be vis­ible but the oth­ers won’t, and to my eye there are no dif­fer­ences in secur­ity set­tings, own­er­ship, or ACLs. Mind you, I’m obvi­ously miss­ing some­thing some­where or I’d be able to see all those files from every machine in the house! I tried copy­ing some of the files to new dir­ect­or­ies; some­times that lets me see them across the net­work, and some­times it does­n’t. I have no idea what set­tings are being put in place to stop me look­ing at such dan­ger­ous files as .css and .html in par­tic­u­lar dir­ect­or­ies; the sys­tem seems capri­cious — as does any sys­tem when you haven’t figured out the rules by which it oper­ates. The innate abil­ity of the human brain to fig­ure out pat­terns has decidedly failed me in this instance.

Help would be much appre­ci­ated, not only for me but for the rest of the fam­ily who have to put up with my imprec­a­tions each time I want to trans­fer files from one box to the oth­er, only to find that they’re not vis­ible from the box I want to trans­fer them to.