This is a quick experiment to add to my list of SERP snippet experiments. I got the idea for this after reading a blog post by Everfluxx, titled: Why I deleted the AddToAny and TweetMeme plugins. In that post, Everfluxx links to a file on my site that is blocked by my robots.txt file.
The file contains nothing but a bunch of social media buttons, which I lazy-load into my blog posts. I do this because it cuts down page load times and prevents the theoretical evaporation of PageRank caused by nofollow attributes.
Here’s a glimpse at the post that links to my iframe file (the one with the anchor text that says “custom solution”):
Today as I was looking at the results of a site: command (keep the bookmarklet; it’s a free gift!), I noticed this amongst the results:
If it surprises you to realize that blocking URLs in your robots.txt file does not prevent them from appearing in Google’s search results, then you need to study the difference between crawling and indexing. The only thing I found interesting about this particular SERP listing is the title. As far as I can remember, Google has always used the URL as the title for blocked pages, but in this case, it’s quite obvious that Google is using Everfluxx’s anchor text + my home page title. Maybe Google has always had the ability to do this, but I’ve just never seen it?
In any case, as soon as I realized where this title was coming from, it hit me: I can spam other websites’ search results! The process is theoretically pretty simple:
- Pick a website.
- Review their robots.txt file (another free gift, bitches!).
- Link to a URL they’ve blocked, using a spammy phrase as anchor text!
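For anyone who'd rather not eyeball a robots.txt file, here's a minimal sketch of step 2 in Python. The function name `disallowed_paths` is my own invention, and the parsing is deliberately simplified: it only collects Disallow lines from groups that name the given user-agent exactly, and it ignores wildcards and Allow overrides.

```python
def disallowed_paths(robots_txt, agent="*"):
    """Return the Disallow paths listed for `agent` in a robots.txt body.

    Simplified on purpose: exact user-agent match only, no wildcard
    patterns, and Allow lines are not used to override anything.
    """
    paths = []
    group_agents = []   # user-agents named by the group being read
    in_rules = False    # True once the current group has listed a rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                group_agents = []
                in_rules = False
            group_agents.append(value.lower())
        elif field in ("disallow", "allow"):
            in_rules = True
            if field == "disallow" and value and agent.lower() in group_agents:
                paths.append(value)
    return paths


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /files/\n"
    print(disallowed_paths(sample))
```

Any path this returns is a candidate for step 3: a URL under it is one that a well-behaved crawler should never request.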
Alright now…it’s time to find some poor bastard to try this out on. Hmm…
Jackpot!!! I just checked out Matt Cutts’ robots.txt file, which contains the following:
User-agent: *
Disallow: /files/
Naturally, I immediately tried accessing that URL, but all I got was this:
It actually doesn’t matter what the page request returns, because well-behaved bots like googlebot will never request a URL that’s blocked by robots.txt. So for the purposes of this experiment, I can use any URL that appears to be in the /files directory, even if it returns a 404 status. Stop and think about this for a second. This means I can theoretically convince Google to add non-existent URLs to Matt’s SERPs…AND…I can choose the title of the snippet?!
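To sanity-check that a made-up URL really falls under the blocked directory, Python's standard `urllib.robotparser` can evaluate the rules above without touching the network. This is only a sketch: the helper name `is_blocked` is hypothetical, the rules are pasted in as a string rather than fetched, and the standard-library parser is not guaranteed to match Googlebot's behavior in every edge case.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted above, pasted in so nothing is fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /files/
"""


def is_blocked(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if `agent` is not allowed to fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://mattcutts.com/files/spam-me-tender.html",
        "http://www.mattcutts.com/blog/",
    ):
        print(url, "blocked" if is_blocked(url) else "crawlable")
```

The check only looks at the URL path, so it says nothing about whether the page exists, which is exactly the point: a crawler that obeys the rule never gets far enough to see the 404.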
Before I kick off this experiment, there’s one more thing I’ll mention about mattcutts.com (one foot deeper I’ll dig my own grave): it appears as though his DNS record is configured to point all subdomains to his canonical name/IP address, so http://[anything].mattcutts.com will return the same thing as http://www.mattcutts.com. So as an added bonus, I’m going to create my own subdomain on Matt’s site. Thanks, Matt!
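If you want to test for this kind of catch-all DNS setup yourself, here's a rough sketch. The function name `looks_like_wildcard` and the probe labels are assumptions of mine, and the check is crude: matching IP addresses only suggest a wildcard record, they don't prove the server returns the same content for every hostname.

```python
import socket


def looks_like_wildcard(domain, labels=("seomofo", "anything"),
                        resolve=socket.gethostbyname):
    """Guess whether arbitrary subdomains of `domain` resolve like www.

    `resolve` can be swapped out (for example in tests) for any callable
    that maps a hostname to an IP string or raises OSError.
    """
    try:
        base_ip = resolve("www." + domain)
    except OSError:
        return False          # can't even resolve www, so no verdict
    for label in labels:
        try:
            if resolve(label + "." + domain) != base_ip:
                return False  # subdomain points somewhere else
        except OSError:
            return False      # subdomain doesn't resolve at all
    return True


if __name__ == "__main__":
    # This call performs live DNS lookups.
    print(looks_like_wildcard("mattcutts.com"))
```

Passing the resolver in as a parameter keeps the function usable offline: hand it a fake lookup and it never touches the network.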
It’s now time for me to go update my global navigation links. One last thing I’ll mention…did you know that Matt Cutts personally endorses my SEO services? It’s true–just the other day I heard him say SEOmofo is the World’s Greatest SEO.
You’re probably wondering if this experiment has succeeded yet. Last I checked…no, it hasn’t. But feel free to check for yourself: non-www subdomains on mattcutts.com
2/16/2011 UPDATE: It’s been over 10 weeks since this post was first published, and none of the major search engines have indexed my fake URL. So I’m making some changes to the experiment. The first thing I’m going to try changing is the fake URL: from http://seomofo.loves.mattcutts.com/files/spam-me-tender.html to http://seomofo.mattcutts.com/files/spam-me-tender.html. This is to rule out the possibility that it’s the double subdomain that’s preventing the URL from being indexed. I don’t think this is the case, since I’ve seen www.ww.mattcutts.com/blog/disclaimer indexed in both Yahoo and Google, but I’m knocking off a subdomain…just to be sure.
3/6/2011 UPDATE: I gave the single subdomain URL about 2.5 weeks, and still nothing, so I’m changing this again. This time, no subdomains. I’m changing the URL: from http://seomofo.mattcutts.com/files/spam-me-tender.html to http://mattcutts.com/files/seomofo-is-worlds-greatest-seo.html. I’m also going to put some paragraph text around the global link.
10/13/2011 UPDATE: It has been almost a year since I published this post, and honestly I gave up on these experiments a long time ago. I removed all the links to seomofo.mattcutts.com and didn’t give it a second thought…until now. A guy named Will left a comment below that didn’t seem to acknowledge how miserably this post failed, so I re-checked the search results and was happy to see that http://seomofo.mattcutts.com/robots.txt has been indexed!
So thank you, Will, for bringing this to my attention. If I ever need to buy laboratory supplies to supply my lab with science lab equipment, I’ll be sure to check out Alkali Scientific’s offerings. ;)
Oh yeah…and thanks for the subdomain, Matt! :D
12/19/2011 UPDATE: HOLY SHIT! WTF?!!!
I just checked the results again and found that every variation of this experiment I ever tried has been indexed! I don’t know why they’re all showing up (only one variation was live at any given time) or why it took a YEAR for them to show up, but in any case…have a look at these epic results.