Sunday, April 27, 2008

Google Webmaster Tools

Google are now providing free Webmaster Tools. As with other free Google tools, Google are clearly getting something in return for providing the service: the information you provide helps Google index your pages better, which is good for them (as well as for you). To sign up for the service you need a Google account (usually a Gmail account). To use some of the tools you have to verify that the web site you want to check is yours, by uploading a code to the home page.

The service provides:
  1. Diagnostics
  2. Statistics
  3. Links
  4. Sitemaps
  5. Tools


Diagnostics

  1. Web crawl: problems Google had accessing pages. My site had no errors in any of the categories reported: HTTP errors, Not found, URLs not followed, URLs restricted by robots.txt, URLs timed out, Unreachable URLs.
  2. Content analysis: problems with site metadata (title and description information). Google found that one of my web pages was missing a title. It also looks for duplicate titles, very long or short titles, and "non-informative" ones. The missing title turned out to be on a web page in the Moodle system.
  3. Mobile crawl: problems with pages designed for mobile phones. Google looks for CHTML and WML/XHTML. CHTML is a variant of HTML mostly used for Japanese mobile phones. Some of my pages have XHTML and CSS specifically designed for mobiles.
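
The title checks described under content analysis can be approximated on your own copy of the pages. What follows is a minimal sketch in Python, not Google's method: the function names (`extract_title`, `audit_titles`) and the 70-character limit for a "long" title are my own assumptions for illustration, and the input is assumed to be a dictionary of page addresses and their HTML text.

```python
from collections import Counter
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> on the page is recorded.
        if tag == "title" and self.title is None:
            self.in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    """Return the stripped title text, or None if missing or empty."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = (parser.title or "").strip()
    return title or None


def audit_titles(pages, max_length=70):
    """pages maps URL -> HTML text. Returns URL -> list of problems found.

    max_length is an assumed threshold, not a figure published by Google.
    """
    titles = {url: extract_title(html) for url, html in pages.items()}
    counts = Counter(t for t in titles.values() if t)
    problems = {}
    for url, title in titles.items():
        found = []
        if title is None:
            found.append("missing title")
        else:
            if len(title) > max_length:
                found.append("long title")
            if counts[title] > 1:
                found.append("duplicate title")
        if found:
            problems[url] = found
    return problems
```

Running `audit_titles` over a saved copy of the site lists only the pages with a problem, which would have pointed straight at the untitled Moodle page.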


Search queries

This shows which Google queries returned pages from the site, and which of those results were most often selected by the person searching. This was an interesting list for my web site, as it differs from the results the statistics package on my web server provides. The difference is essentially that this is how others perceive the web site from the outside, not how I see it from the inside. As an example, the 2020 summit does not figure highly in my web site stats:

Top search queries
# % Query Position
1 45% 2020 summit 23
2 25% 20 20 summit 11
3 4% 2020 summitt 9
4 4% australia 2020 summit 16
5 3% 2020 summit submissions 5
6 2% 2020 39
7 2% 2020 summit summary 9
8 2% australia 2020 28
9 2% "2020 summit" 17
10 1% 2020 summit video 6
11 1% 2020 summit australia 30
12 1% 2020 summit submission 4
13 1% alan smart 10
14 1% australian 2020 summit 11
15 1% what is the 2020 summit 15
16 1% smart 33
17 1% cookies enabled on your browser 4
18 1% forum 2020 4
19 1% 2020 summit governance 7
20 1% 20 20 summitt 10

Top clicked queries

# % Query Position
1 27% konkan railway 5
2 18% 2020 summit official opening speakers 2
3 18% 2020 summit submissions 5
4 18% australia 2020 summit submissions 5
5 18% indian ferry 7

Crawl stats

The crawl stats are a little hard on the ego, as they show what proportion of the pages have a high, medium or low PageRank. Most of mine rated low. My highest-rated page was one on the accessibility of Olympic web sites.

Subscriber stats

This shows how many people have subscribed to RSS feeds using Google services, such as Google Reader. There were none for my site, although I have an RSS feed on it.

What Googlebot sees

This shows words and phrases in the anchor text of links to the site. This is not information from the site itself, but what other people used to describe it when linking to it. So this is what the system which collects links to the site (the "Googlebot") sees.
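
To make the idea of anchor text concrete, here is a rough sketch in Python of how such a report could be built from pages that link to a site. It is only an illustration under my own assumptions and not how Googlebot works: the class and function names are made up, `www.example.org` in the usage note stands in for the real host name, and the normalisation (lower case, punctuation turned into spaces) is a guess based on how the phrases appear in the report.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class AnchorTextParser(HTMLParser):
    """Collects the text of <a> elements whose href points at target_host."""

    def __init__(self, target_host):
        super().__init__()
        self.target_host = target_host
        self.collecting = False
        self.current = []
        self.phrases = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Only links to the site being reported on are of interest.
            self.collecting = urlparse(href).hostname == self.target_host
            self.current = []

    def handle_data(self, data):
        if self.collecting:
            self.current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.collecting:
            # Assumed normalisation: punctuation becomes spaces, lower case.
            text = re.sub(r"[^\w]+", " ", "".join(self.current)).strip().lower()
            if text:
                self.phrases.append(text)
            self.collecting = False


def anchor_text_report(pages_html, target_host):
    """Return (phrase counts, word counts) for links to target_host."""
    phrases = Counter()
    words = Counter()
    for html in pages_html:
        parser = AnchorTextParser(target_host)
        parser.feed(html)
        parser.close()
        phrases.update(parser.phrases)
        for phrase in parser.phrases:
            words.update(phrase.split())
    return phrases, words
```

Calling `anchor_text_report(list_of_html_pages, "www.example.org")` and then `phrases.most_common(10)` gives a ranked list of the same general shape as the report.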

This information is quite confronting, as it does not necessarily match the idealized picture you have of how your carefully crafted web site is viewed. In some cases you ask "who was silly enough to say that?" and then find the phrase came from something you wrote yourself. Here are the top few phrases and words for my site (Google provide a longer list):

Phrases in external links:

1. open 2020 summit moodle
2. all the notes
3. help cookies must be enabled in your browser
4. moodle for local summit details and links
5. create new account
6. http tomw net au moodle course view php
7. new account
8. summit on open source
9. writing for the web
10. aide votre navigateur doit supporter les cookies

Keywords in your site's content

1. australian
2. australia
3. government
4. tom
5. computer
6. worthington
7. system
8. technology
9. post
10. canberra

Keywords in external links to your site

1. tomw
2. stores
3. net
4. other
5. line
6. online
7. html
8. ltd
9. pty
10. communications

Pages with external links

This shows which pages external sites are pointing to. This list did not make a lot of sense at first. As an example, there was one entry with 74 links. On closer inspection, this turned out to be the page for the Open 2020 Summit, to which numerous people had added links. But I still don't quite understand what this report is trying to tell me.


Sitelinks

Sitelinks are a small table of contents which Google generates itself and places in its search results. My site doesn't have one of these, which might suggest the site is not organized clearly enough for Google's algorithm to work it out. Sitelinks have been controversial, as they might supplant the web site's own navigation.

Pages with internal links

This lists pages pointed to from other pages on the site. It was not much more use than the link reports usually provided with web development tools.


Sitemaps

This reports any sitemaps associated with the web site. These are XML files which provide Googlebot (and other web crawlers) with a list of the pages on the web site and make it easier for new pages to be indexed. This can reduce the traffic on the web site from web crawlers and allow them to index the site more frequently. Google provide a list of tools which can be used to generate the sitemap. Ideally this should be built into the web server, so that each time a page is added or changed, the sitemap is updated. But there are also some external web-based tools to try.
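
As an illustration of what such a file contains, here is a small sketch in Python which builds a sitemap using the standard library. It is not one of the tools Google lists. The function name `build_sitemap` and the example addresses are hypothetical; the `urlset`, `url`, `loc` and `lastmod` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs.

    last_modified is a date string such as "2008-04-27", or None to
    leave the optional <lastmod> element out for that page.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if last_modified:
            ET.SubElement(entry, "lastmod").text = last_modified
    # Returned as text; no XML declaration is added here.
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages, for illustration only.
    print(build_sitemap([
        ("http://www.example.org/", "2008-04-27"),
        ("http://www.example.org/about.html", None),
    ]))
```

A script like this could be run each time pages are published, which gets close to the ideal of the sitemap being updated whenever a page is added or changed.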
