Dec 16 2006
 

This is a story of some of the dark corners of the internet, with a puzzle at the end and a request for advice…

Our story starts a few weeks ago. I had installed Statcounter on the blog postings to keep an eye on who visits my blog and why, with more information than you get from perusing access logs (I have those too). I also like following links back to referrers to see why they’re linking to my site, when I have time. A few weeks ago I noticed what looked like a spam site linking to my blog — you know the type of URL, it’s some nonsensical combination of letters and digits. So I followed it back, only to find that it was a complete frame of my blog. View source showed only that my site was being framed. No other content was being added as ads, as meta content, or anything else that I could see. Nothing that explained why they’re doing this.

So I looked up the whois for the site, discovered it’s hidden by a company called “Domains by Proxy”, which specializes in hiding registration data for web sites. They have lots of information on their site about how they cooperate with law enforcement if people are doing something illegal, which leads me to suspect that unless you can prove someone’s doing something illegal they won’t do anything or even talk to you. Not that I tried talking to them, since simply framing my site isn’t illegal, or even contravening my Creative Commons license. It is, however, highly suspicious.

A little more investigation was in order; the number of hits on my website from this site was increasing and other versions of the URL were showing up. The URL was of the form “aff” followed by “0000” followed by a number, followed by .com (yes, it’s circuitous, but I don’t want my site linked to theirs in search engines, for reasons that will become obvious). I checked and found that all numbers from 1 to 28 pointed to my site. So someone paid to register 28 domains, host 28 domains, and put in HTML to point to my site? None of the URLs showed up in the common search engines, but somehow they were being clicked on, seemingly by real people (spread of ISPs across the world, different OSes, screen resolutions, and browsers, all staying for approximately zero seconds).

I contemplated putting in some frame busting code but decided to wait a little and see what happened, in case they were just getting ready to do something. In the meantime more of these sites started pointing at mine. And finally one of them showed up in a search engine, and there it points to an adult site. One of those ones that may not be safe at work, at least judging by the front page. In which case frame busting isn’t the answer anyway: the people visiting this site don’t want to see my musings on technology, motherhood, or knitting, they want the adult content they expect.
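For reference, here is a minimal sketch of what referrer-aware frame busting code could look like. It is an illustration only, not code that ran on this blog: the function names are made up, and the hostname pattern simply encodes the “aff” + “0000” + number + .com form described above. It relies on the fact that a framed page sees the framing page as its referrer.

```javascript
// Hostnames of the form described in the post: "aff" + "0000" + a number + ".com".
// The pattern is an assumption built from that description.
const FRAMER_HOST = /^aff0000\d+\.com$/i;

// Returns true when the referrer URL belongs to one of the framing sites.
function isFramerReferrer(referrer) {
  if (!referrer) return false;
  try {
    return FRAMER_HOST.test(new URL(referrer).hostname);
  } catch (e) {
    return false; // referrer was not a parseable absolute URL
  }
}

// Decide whether to break out of the frame.
// isFramed: true when the page is not the top-level window.
function shouldBustFrame(isFramed, referrer) {
  return isFramed && isFramerReferrer(referrer);
}

// Browser wiring; skipped when the file is run outside a browser (e.g. under Node).
if (typeof window !== "undefined" && typeof document !== "undefined") {
  if (shouldBustFrame(window.top !== window.self, document.referrer)) {
    // Replace the framing page with this page.
    window.top.location.href = window.self.location.href;
  }
}
```

Keeping the decision in a plain function makes it easy to check without a browser; only the last few lines touch `window` and `document`.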

Tim had the bright idea at this stage of using a command-line fetch on the “aff” sites and found that the index page returns a list of potential misspellings of the adult site’s name. About 10000 of them. The other sites return similar lists; number 28 only returns about 7000 misspellings. If you search for one of these misspellings in a common search engine, you land on an “aff” page, which then redirects you to the adult site. But only if you come from a search engine. If you type in that site name in the address bar, the redirect sends you to my blog.
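The behaviour described above boils down to a decision based on the referrer. The following is a rough sketch of that decision, not the “aff” sites’ actual code: the list of search engines and the two return labels are assumptions chosen for illustration.

```javascript
// Hypothetical list of search-engine names; the real list used by the
// "aff" pages is not given in this post.
const SEARCH_ENGINE_LABELS = ["google", "yahoo", "msn", "ask"];

// Returns true when the referrer's hostname contains one of the
// search-engine names as a whole dot-separated label.
function cameFromSearchEngine(referrer) {
  if (!referrer) return false; // typed in the address bar: no referrer
  let hostname;
  try {
    hostname = new URL(referrer).hostname.toLowerCase();
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  const labels = hostname.split(".");
  return SEARCH_ENGINE_LABELS.some((name) => labels.includes(name));
}

// Models where a visitor to an "aff" page ends up.
function destinationFor(referrer) {
  return cameFromSearchEngine(referrer) ? "adult-site" : "framed-blog";
}
```

Under these assumptions, `destinationFor("http://www.google.com/search?q=foo")` gives `"adult-site"`, while an empty referrer (a typed-in address) gives `"framed-blog"`.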

So I have a couple of questions, and would appreciate any thoughts or experiences you have.

  1. Why are they not redirecting to the adult site, which is probably what the people who are clicking on an “aff” site want? Why send them to another site?
  2. Related question: why me? Why someone who writes about technology, and not someone on some free hosting site who may not even notice the increase in traffic, let alone get suspicious about it?
  3. What do I do about it? I could block requests that arrive via an “aff” site; receiving a “You’re in timeout.” message (error 403 as seen by Mark Pilgrim) might have some effect. A related question is why people are going to an “aff” site anyway; since the “aff” sites redirect people coming from search engines to the actual adult site itself, one could suppose nobody would ever click on it. Tim suggested people might be curious; they see the URL in the search engine listings and type it in the address bar to see what’s there.
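To make the blocking option concrete, here is a minimal server-side sketch of answering 403 when the referrer is one of the “aff” sites. It is an assumption-laden illustration rather than this blog’s configuration: the function names are invented, the hostname pattern encodes the URL form described earlier, and the handler is written to fit Node’s `http.createServer`, which is not what this site runs on.

```javascript
// Hostnames of the form "aff" + "0000" + a number + ".com" (assumed pattern).
const FRAMER_HOST = /^aff0000\d+\.com$/i;

// Returns the HTTP status to send, given the request headers.
// Node lower-cases header names, so the header arrives as "referer".
function statusForRequest(headers) {
  const referer = headers["referer"] || "";
  try {
    return FRAMER_HOST.test(new URL(referer).hostname) ? 403 : 200;
  } catch (e) {
    return 200; // missing or unparseable referrer: serve normally
  }
}

// Request handler in the shape expected by http.createServer(handler).
function handler(req, res) {
  if (statusForRequest(req.headers) === 403) {
    res.writeHead(403, { "Content-Type": "text/plain" });
    res.end("You're in timeout.");
    return;
  }
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("Regular blog content would be served here.");
}
```

The handler could be passed to `require("http").createServer(handler)` to try it out. Since the visitors arriving this way expected something else entirely, a terse 403 costs them little and saves the bandwidth of serving the full page.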

The adult site itself does have a technical contact in the whois registry but the purveyors of the “aff” sites might not be them. Suggestions welcome… the hits I’m getting have grown from nothing a few weeks ago to now being a substantial part of the direct hits on my site so it’s a problem I want to solve soon.

  14 Responses to “Framed!”

  1. The only thing I can imagine is that it’s a way to masquerade the page as a non-adult URL to some audience.

  2. Google thinks it’s the adult site but everyone else sees your site? I think it’s an attempt to gain PageRank. Because it looks and behaves like an innocent link on most sites, it is less likely to be deleted from blog comments, wikis and so on. Google thinks many highly ranked sites link to the adult site and gives it a high score. Obviously the scammers think you’re a prominent site so you should feel flattered. What to do about it? Hmm… if I’m right the people visiting your site using the link are genuinely interested in your site so maybe you shouldn’t slap them with an error page. How about redirecting to a page that explains what’s going on and has a link to the original page?

  3. I suspect a two-pronged SEO effort is going on. On one hand they are trying to get traffic from mis-spellings of AdultFriendFinder, on the other hand they are trying to increase their google juice so their specific mis-spelling bubbles up ahead of the other scammers trying the same trick.

    They are probably posting comments on tech blogs that do not use rel="nofollow", or on forums, hoping that people will not realize the aff URL is a frame, and start linking to the aff* variant instead of the actual source, thus increasing the Google PageRank of aff*.com. Tech blogs must be a highly labor-effective way of increasing one’s PageRank, as opposed to, say, garden gnome blogging. Your site is probably not the only one affected; they must use some kind of code in the frameset page URL to indicate which relevant content should be loaded into a frame. Since there is only one level of referrer tracking in browsers you wouldn’t see the original blog posting or forum in your logs.

    Since PageRank is not context-specific, a high PageRank acquired from relevance to technical queries also qualifies you to stand out from the din of others trying the same typosquatting scam. Once one of the sites has reached a certain level of PageRank, they probably switch the content away from the tech articles to those typosquatting AdultFriendFinder. The fact you saw the actual search queries is probably an error on their part where they jumped the gun on typosquatting before switching off the framing and replacing it with whatever content they are actually trying to push on would-be daters.

    The only way you can fight back against this is by using frame-buster code, at least if the referrer is one of those scumbags’ domains. That only goes so far – if you do that, they may adopt outright copying of your site content (I have seen my own relatively obscure site’s content stolen by link farms).

    You could also inform FriendFinder. They are a legitimate business, even if some of their properties like Alt.com are somewhat raunchy, and they can probably bring legal firepower to bear on DomainsByProxy to get the scammers’ identity. I am not sure how the scammers actually plan to make a profit. Perhaps by abusing a FriendFinder affiliate program, in which case exposing the scam should make the profit motive disappear.

  4. Hi,

    Analyzing the code itself, this is what is happening.

    With a JS-enabled browser, a hit on an “aff” page will also pull in a little Javascript file: x.js

    x.js basically says:

    * if referrer is a non-blank query from a search engine (specifically Google, MSN, Yahoo, AltaVista, Ask), then load the savoury page in the browser window.

    * if referrer is none of these, then load the file index.htm from the aff site — it is this file that contains the frame reference to your site.

    People are not necessarily typing in the “aff” address in their browser address bars: a search anonymizer (a Firefox extension?) that sends a non-search referrer will get the index.htm page and therefore your site in a FRAME.

    I can’t see anything malicious or underhand in the code itself: the only ‘victims’ beside your bandwidth and server logs will be the search users themselves who get “(your) musings on technology, motherhood, or knitting.”

    What to do? A 403 Forbidden is most appropriate in this case (if your Apache has mod_rewrite loaded):

    RewriteCond %{HTTP_REFERER} ^http://aff0000.*
    RewriteRule ^/.+ - [FL]

    HTH,

    Cliff

  5. Ooops, rewrite rule should read:

    RewriteCond %{HTTP_REFERER} ^http://aff0000.*
    RewriteRule ^/.+ - [F,L]

    Rgds,

    Cliff

  6. I’ve had a run-in or two with some dating spam/scams and it may be useful to remember they tend to be copy [example] and paste coders so they make all kinds of errors. Many at the same time.

    I’m guessing your name and one of these aff’s will show up as blog spam payloads (or as ping/trackback). Looks almost like a legit comment/ping to blog readers, search engines see something else. That’s my first guess without thinking about it the way scammers do. My second thought is it could be a phish being set up against aff (some of their customers don’t always use all the spark plugs in their engine). Just quick speculating on my part.

    –Cecil

  7. For the continuation (and probable end) of the story, see my next post at http://www.laurenwood.org/anyway/archives/2006/12/19/post-results/. Thanks for the comments, they helped me figure out what was going on, and may have helped the purveyors of the aff sites decide to stop whatever they were doing.

  8. And now they’re back again. http://www.laurenwood.org/anyway/archives/2007/01/29/theyre-back/.
    So I may yet have a chance to use some of these ideas, if the bandwidth usage goes up too much.

  9. The link to http://www.laurenwood.org/anyway/archives/2006/12/19/post-results/ doesn’t work for me; can you fix it, please?
    Or I can try to look it up manually.

  10. The link works for me; what are the results when you click on it? 404? Something else?

  11. This simply looks like an effort to give search engines a techie impression of their site.

  12. I had this happen to one of my sites about a year ago. It ended up redirecting to a search engine for puppies (of all things). After several spam emails I eventually purchased a new domain name but that didn’t stop it either. I have read these posts and will implement the post by Cliff. Hopefully your issue gets solved soon.

    Regards

    Garret

  13. Very interesting posts….

    I know a little bit about Adultfriendfinder’s affiliate system, and one can make money there by simply sending a visitor there. I think your spammers were doing that. And I am sure they got to you, and exploited your blog’s high Google ranking to promote themselves higher on the search engines… I actually found you through a blogposting software: I have a dating advice site and wanted to get a comment to get a link… I’ll use my blog instead… I don’t want you to say I am spamming your board 🙂

    Thank you for educating us/me. I appreciate it.
    Sophie

  14. I am also not sure how the scammers actually plan to make a profit.
