
The personal site of Jamie Knight, an autistic web developer, speaker and mountain biker who is never seen far from his plush sidekick Lion.


The plight of the Google bot

I index websites; that’s what I do. It’s sometimes boring, sometimes fun, but most of the time it is hard work. In this short article I am going to tell you what I do, day in, day out, and how you can make my life easier.

Oh, by the way, I am the Google bot.

I don’t look at the website as such; I look at the code. I have to, because I am a bot and cannot see the rendered page. The code is all I get. This is where the first frustration lies: a lot of people don’t know how to code properly.

What I need from the code is really quite simple: a meaningful tag here, an extra attribute there. Really nothing difficult.
A good example is the doctype, or Document Type Declaration to give it its proper name. It is often loosely called the DTD, after the Document Type Definition it points to.

The doctype is the complicated-looking bit at the top of the page. It tells me which set of rules (which DTD) the page was written against, and so how I should interpret its markup. This is very important, but sadly many pages don’t have one. Without a doctype I don’t know the rules of the page, which complicates my indexing and makes the content harder for me to understand.

By adding this declaration you give me a starting point for adding your page to the index.
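To give a rough idea of what "looking for the declaration" means, here is a minimal sketch in Python using the standard library's html.parser module. It is an illustration only and not how I actually work: the DoctypeFinder class, the find_doctype helper and the sample pages are all made up for this example.

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Hypothetical helper: remembers the doctype declaration, if any."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # html.parser calls this for <!DOCTYPE ...> declarations;
        # decl is the text between "<!" and ">".
        if decl.lower().startswith("doctype"):
            self.doctype = decl


def find_doctype(markup):
    """Return the doctype text of a page, or None if it has none."""
    finder = DoctypeFinder()
    finder.feed(markup)
    finder.close()
    return finder.doctype


# Sample pages, invented for illustration.
with_doctype = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
    "<html><head><title>Example</title></head><body></body></html>"
)
without_doctype = "<html><head><title>Example</title></head><body></body></html>"

for page in (with_doctype, without_doctype):
    print(find_doctype(page) or "No doctype found: the rules are a guess")
```

The first sample declares HTML 4.01 Strict, so the rules are known from the very first line. The second leaves the reader of the code to guess.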

Now that I know the rules I am playing by, I can put them into action. For example, the rules tell me that whatever I find inside an h1 tag is the most important heading on the page. This is very useful for indexing the content and for determining how the page should rank for various keywords. The whole page is evaluated this way, but the result is far more accurate when people use the right tags. So be semantic, guys!
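As a hedged illustration of why the right tag matters, the sketch below (Python, standard library html.parser) collects whatever text sits inside h1 elements. The HeadingCollector class, the main_headings helper and both sample snippets are invented for this example; a real indexer weighs far more than this.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Hypothetical helper: collects the text inside h1 elements."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current heading.
        if self.in_h1:
            self.headings[-1] += data


def main_headings(markup):
    """Return the text of every h1 element found in the markup."""
    collector = HeadingCollector()
    collector.feed(markup)
    collector.close()
    return collector.headings


# Same visible words, different markup (both snippets are made up).
semantic = "<h1>Mountain biking in Wales</h1><p>Trail notes.</p>"
unsemantic = '<div class="big">Mountain biking in Wales</div><p>Trail notes.</p>'

print(main_headings(semantic))
print(main_headings(unsemantic))
```

Both snippets might look the same in a browser once styled, but only the first one says in the code which words are the main heading. For the second, the list of headings comes back empty.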

Often while indexing a web page I will come across an img tag. This tag, unsurprisingly, displays an image. My issue is that I cannot see images, because I am reading the code. My best bet is to read the alt attribute to find out what the image is about. Unless an alt attribute is defined or the filename is descriptive, I cannot get anything from this type of content. Not good.
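Here is a small sketch of that fallback, again in Python with the standard library's html.parser, and again only an assumption about how such a check could look. The ImageDescriber class, the describe_images helper and the image paths are hypothetical.

```python
from html.parser import HTMLParser


class ImageDescriber(HTMLParser):
    """Hypothetical helper: gathers the best available text for each img."""

    def __init__(self):
        super().__init__()
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An attribute written without a value is reported as None.
        alt = (attributes.get("alt") or "").strip()
        src = attributes.get("src") or ""
        filename = src.rsplit("/", 1)[-1]
        # Prefer the alt text; otherwise fall back to the file name,
        # which only helps when the name itself is descriptive.
        self.descriptions.append(alt or filename)


def describe_images(markup):
    """Return one description per img element (alt text or file name)."""
    describer = ImageDescriber()
    describer.feed(markup)
    describer.close()
    return describer.descriptions


# Three made-up images: good alt text, a descriptive name, and neither.
page = (
    '<img src="photos/001.jpg" alt="Jamie and Lion on a mountain bike">'
    '<img src="photos/jamie-and-lion.jpg">'
    '<img src="photos/IMG_0042.jpg">'
)

for description in describe_images(page):
    print(description)
```

The first image is described by its alt attribute and the second by its file name. The third yields only IMG_0042.jpg, which says nothing about what the picture shows.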

These are the basic methods and best practice that will enable me to index your page well. There are many other methods that will help too.

Well, that’s my manager deploying me; I need to go. Off to spider another page, maybe yours. I hope I have a productive time.

Disclaimer: this is written from the perspective of the Google bot. It is not approved by Google or in any way associated with them. All methods mentioned are considered best practice; they are from experience!

Published: 1 August 2007
