Sebastian's Pamphlets
Thoughts, findings, notes and opinions on Web development, Web search and SEO - not always serious and sometimes polemic.
Recent Posts Tagged With 'crawler directives'
Debugging robots.txt with Google Webmaster Tools
Although Google’s Webmaster Console is a really neat toolkit, it can mislead the not-that-savvy crowd every once in a while. When you go to “Diagnostics::Crawl Errors::Restricted by robots.txt” and you find URIs that aren’t di...
Vaporize yourself before Google burns your linking power
I couldn’t care less about PageRank™ sculpting, because a well thought out link architecture does the job with all search engines, not just Google. That’s where Google is right on the money. They own PageRank™, hence they can ...
Crawling vs. Indexing
Sigh. I just have to throw in my 2 cents. Crawling means sucking content without processing the results. Crawlers are rather dumb processes that fetch content supplied by Web servers answering (HTTP) requests for URIs, delivering those cont...
@ALL: Give Google your feedback on NOINDEX, but read this pamphlet beforehand!
Matt Cutts asks us “How should Google handle NOINDEX?” That’s a tough question worth thinking about twice before you submit a comment to Matt’s post. Here is Matt’s question, all the background information you need, and my opinion. What is ...
Update your crawler detection: MSN/Live Search announces msnbot/1.1
Fabrice Canel from Live Search announces significant improvements to their crawler today. The very much appreciated changes are: HTTP compression: The revised msnbot supports gzip and deflate as defined by RFC 2616 (sections 14.11 and 14.39). Micros...
Get a grip on the Robots Exclusion Protocol (REP)
Thanks to the very nice folks over at SEOmoz I was able to prevent this site from becoming a kind of REP/robots.txt blog. Please consider reading this REP round-up: Robots Exclusion Protocol 101. My REP 101 links to the various standards (robots...
Getting URLs outta Google - the good, the popular, and the definitive way
There’s more and more robots.txt talk in the SEOsphere lately. That’s a good thing in my opinion, because the good old robots.txt’s power is underestimated. Unfortunately it’s quite often misused or even abused too, usually be...
My plea to Google - Please sanitize your REP revamps
Standardization of REP tags as robots.txt directives: This draft is kind of a request for comments for search engine staff and uber search geeks interested in the progress of Robots Exclusion Protocol (REP) standardization (actually, every search engine m...
Google to change the Robots Exclusion Protocol again
Web crawler directives, partly standardized in the Robots Exclusion Protocol (REP), have evolved since 1994. Nowadays we have to deal with a conglomerate of non-binding de facto standards and microformats, all of them extended by various organizations...
Validate your robots.txt - Googlebot becomes smarter
Last week I reported that Google experiments with new crawler directives for use in robots.txt. Today Google has confirmed that Googlebot understands experimental REP syntax like Noindex:. That means that forgotten –and, until recently, ignored...
Microsoft funding bankrupt Live Search experiment with porn spam
If only this headline were linkbait … of course it’s not sarcastic. Rumors are out that Microsoft will launch a porn affiliate program soon. The top secret code name for this project is “pornbucks”, but analysts say that...
Q&A: An undocumented robots.txt crawler directive from Google
Blogging should be fun every now and then. Today I don’t tell you anything new about Google’s secret experiments with the robots exclusion protocol. I ask you instead, because I’m sure you know your stuff. Unfortunately, the Q&...
Act out your sophisticated affiliate link paranoia
My recent posts on managing affiliate links and nofollow cloaking paid links led to so many reactions from my readers that I thought explaining possible protection levels could make sense. Google’s request to condomize affiliate links is a bit,...
A pragmatic defence against Google’s anti paid links campaign
Google’s recent shot across the bows of a gazillion sites handling paid links, advertising, or internal cross links not compliant with Google’s idea of a natural link is a call for action. Google’s message is clear: “cond...
