HoTMetaL

When I first came to Canada I worked at SoftQuad. SoftQuad was one of the first SGML com­pan­ies, well known (in some circles, any­way) for its Pres­id­ent, Yuri Rub­in­sky. And well known in many oth­er circles for its HTML edit­or, HoT­Met­aL. The Sur­rey office did most of the devel­op­ment work on HoT­Met­aL and it was my main focus for quite some time. So it was with a cer­tain amount of nos­tal­gia that I saw HoT­Met­aL lis­ted on eWeek’s Jim Rapoza Picks the Top Web Tech­no­lo­gies of All Time — gone but not for­got­ten, as they say. Thanks to Kim for send­ing me the link.

Google Cloaking

The aff sites are still framing my site, and the number of hits and the amount of bandwidth are certainly not decreasing with time (details are in previous posts, if you want to catch up on the story). I’m still not entirely sure what they’re doing, but the script they use (you can see it if you fetch the pages with a command-line tool) has specific instructions for Google and other search engines, so there’s obviously some reason for that.

As far as I can tell, this is a clas­sic cloak­ing attack, and, to quote Wiki­pe­dia as of the time I read the art­icle, “major search engines con­sider cloak­ing for decep­tion to be a viol­a­tion of their guidelines, and there­fore, they del­ist sites when decept­ive cloak­ing is repor­ted”. So I figured that was worth a try and filled out the form at Report a Spam Res­ult (for your enter­tain­ment, the search query I put in was “adul riendfinder.com”, which illus­trates the prob­lem nicely).
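One rough way to check for this kind of cloaking is to request the same page twice, once claiming to be a browser and once claiming to be a search engine crawler, and see whether the server hands back different content. The sketch below is only an illustration, not what I actually ran: the user-agent strings and the function names are my own inventions, and a difference between the two responses is a hint rather than proof, since dynamic pages can legitimately vary from one request to the next and some cloaking keys on the requester's IP address instead.

```python
import hashlib
import urllib.request

# Hypothetical user-agent strings, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch(url, user_agent, timeout=10):
    """Return the response body for url, requested with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def bodies_differ(body_a, body_b):
    """Return True if the two response bodies are not byte-for-byte identical."""
    return hashlib.sha256(body_a).digest() != hashlib.sha256(body_b).digest()


def looks_cloaked(url):
    """Fetch url as a browser and as a crawler and report whether the bodies differ."""
    return bodies_differ(fetch(url, BROWSER_UA), fetch(url, CRAWLER_UA))
```

Calling `looks_cloaked()` on one of the framing pages would be the obvious use; if it returns True, the next step is to diff the two bodies by hand and see what the crawler is being shown.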

Has anyone else ever tried this and had it work? Any hints? I submitted the form over a week ago and have seen no results yet. I thought I’d try Google first, since they are generally quick at updating their indices (they were certainly quicker than Yahoo at flushing the hacking results).

Privacy in the Internet Age

Darren Barefoot has some interesting thoughts about privacy in the internet age and the way in which today’s North American teenagers are growing up posting everything about their lives on the internet. Up till now, most of the discussion I’ve read on the subject has revolved around the effects on future careers of posting potentially embarrassing stuff on the web. Derek Miller points out that bosses will also have embarrassing stuff up on the web, although there will still be a generation gap there for some years until those future bosses become bosses (assuming that most bosses will continue to be older than many of the people they employ).

We’re start­ing to dis­cern the out­lines of some likely effects of this now. For example, if I get an inter­est­ing email from someone I haven’t heard of, I’ll look them up in Google or Yahoo search, or Linked­In. I don’t neces­sar­ily ignore the email if I don’t find any inform­a­tion about the per­son, but I can see that hap­pen­ing in the future — if you don’t exist in search engines, is that going to be con­sidered weird?

Find­ing people I’ve lost touch with is get­ting easi­er every year, as long as they haven’t changed their name. I’ve man­aged to track down old friends, and oth­ers (who did change their name after mar­riage) have man­aged to track me down. Mind you, I’m rel­at­ively easy to find. 

One effect I’m won­der­ing about is on politi­cians: cur­rently politi­cians either have to be squeaky clean, or good at hid­ing things the elect­or­ate might not like to hear about. Rudy Giuliani’s per­son­al life includes three mar­riages and gay friends, all well-doc­u­mented; in pre­vi­ous years this would have made a pres­id­en­tial cam­paign basic­ally impossible. Now it just makes it more dif­fi­cult, or maybe it’s just dis­cussed more; in future years when more inform­a­tion is avail­able about every­one on the inter­net, and hid­ing these things is going to be impossible, will voters be more accepting?

One inter­est­ing aspect to this is how little inform­a­tion is avail­able about Google’s founders — and more than a little iron­ic, giv­en how easy they’ve made it to find inform­a­tion about oth­er people. An art­icle on Moth­er Jones, via Bruce Schnei­er, has more.

Hacked!

On top of being framed (and yes, they’re still there), my site was recently hacked. Somehow someone managed to edit two posts, adding a script and a bunch of porn keywords. They thereby managed to elevate their site to the front page of Google searches on those strings, in some cases as the number one hit, so it’s clear why they did it. I found these while browsing through the search engine strings (teen porn keywords are not usually searches that find my site), found the posts and stripped out the offending divs. It’s not obvious to me how they got in, but since the WordPress development blog has been warning of security exploits, I assume it’s one of them. So I upgraded to the latest version, 2.1, and would advise anyone else running WordPress to do the same.

Between the AFF people and these hack­ers, I do some­times won­der wheth­er blog­ging is worth­while for someone like me, who does­n’t blog a lot. Sort of takes the fun out of it.

They’re Back!

The aff people, that is. I am too, after some time spent in Aus­tralia with little access to the inter­net. Enough to notice my stats, but I had no inclin­a­tion to actu­ally do any­thing about them.

Looking at the logs, the aff people went away on December 18 and started sending referrers my way again on December 23. Nothing seems to have changed about what they’re doing, and the explanation put forth by commenters on the original post, that it’s likely an attempt to scam the affiliate program of an adult site, seems the most plausible. I’m still not sure what (if anything) to do about this; if the bandwidth usage becomes excessive I’ll use one of the methods suggested by the commenters on my original post.
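For anyone who wants to do the same kind of digging in their own logs, here is a minimal sketch that counts hits per day from referrers matching a given string. It assumes the common Apache "combined" log format, which may not be what your server writes. The function name, the `aff` match string, the `.example` hostnames and the sample lines are all invented for illustration and are not taken from my real logs.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the start of an Apache "combined" log line, up to and including the referrer.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<when>[^\]]+)\] "[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)


def referrer_hits_per_day(lines, needle):
    """Count hits per calendar day whose referrer contains the string needle."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and needle in match.group("referrer"):
            when = datetime.strptime(match.group("when"), "%d/%b/%Y:%H:%M:%S %z")
            hits[when.date().isoformat()] += 1
    # Return the days in chronological order.
    return dict(sorted(hits.items()))


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [18/Dec/2006:10:00:00 -0800] "GET / HTTP/1.1" 200 5120 "http://aff23.example/" "Mozilla/5.0"',
    '203.0.113.9 - - [20/Dec/2006:09:30:00 -0800] "GET / HTTP/1.1" 200 5120 "http://friendly.example/" "Mozilla/5.0"',
    '203.0.113.7 - - [23/Dec/2006:08:15:00 -0800] "GET / HTTP/1.1" 200 5120 "http://aff07.example/" "Mozilla/5.0"',
    '203.0.113.8 - - [23/Dec/2006:21:45:00 -0800] "GET / HTTP/1.1" 200 5120 "http://aff11.example/" "Mozilla/5.0"',
]
```

Running `referrer_hits_per_day(SAMPLE_LOG, "aff")` groups the matching lines by date, which makes a gap like the one between December 18 and December 23 easy to spot. A plain substring match is crude and will also catch unrelated referrers that happen to contain the same letters, so tighten it for real use.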

In the mean­time, I guess I’d bet­ter post some more inter­est­ing content…

Post results

The light shone on that dark corner of the inter­net (see my Framed! post, as well as Tim’s Fram­ing Lauren linked post) and it brought res­ults. If you haven’t read my post­ing, go ahead; there’s too much inform­a­tion to use­fully sum­mar­ize here and the com­ments are good too. 

After reading all the comments that came in (including some private email), Tim and I chatted a bit with Paul Hoffman, noted IETF/IMC heavy. A likely explanation (as pointed out by many commenters) is that the purveyors of the “aff” sites were trying to run an affiliate scam on the Adult Friend Finder site. They probably chose my site to frame because Tim has a high PageRank and often links to my site, and thus my PageRank is also reasonably high. Paul checked and found that they were doing their own name serving, hiding themselves quite effectively from us (well, given more time and effort I’m sure Paul could have dug up more information, but it didn’t seem worth it).

I posted my piece on Saturday and Tim pointed to it not long afterwards. By Sunday afternoon the “aff” sites were noticeably slower and seemed to be going off the air. The last referrer I had to my site was from site 23 on Monday December 18th at 10 am local time; pinging the address shows it’s still there, but there is no longer an HTTP server attached to it. The site is still in the Google index but I expect that to go away at some stage as well. So they reacted quickly; I expect we will never know the entire story. I’ve learned two things though: keep a closer eye on my access logs, and post about things that look weird.